Comprehensive annotations of the mutational spectra of SARS-CoV-2 spike protein: a fast and accurate pipeline
Rahman, Mohammed Shaminur1; Islam, Mohammed Rafiul1; Hoque, Mohammed Nazmul1,2; Akther, Masuda1; Puspo, Joynob Akter1; Akter, Salma1,4; Sultana, Munawar1; Hossain, Mohammed Anwar1; Alam, Abu Sayed Mohammad Rubayet Ul3
2020-10
Journal: TRANSBOUNDARY AND EMERGING DISEASES
ISSN: 1865-1674
EISSN: 1865-1682
Abstract: Having infected millions of people, SARS-CoV-2 is evolving at an unprecedented rate, demanding an advanced and specialized analytic pipeline to capture its mutational spectra. To explore mutations and deletions in the spike (S) protein, the most-discussed protein of SARS-CoV-2, we comprehensively analyzed 35,750 complete S protein-coding sequences through a custom Python-based pipeline. This dataset, collected from GISAID up to 24 June 2020, covered six continents and five major climate zones. We identified 27,801 mutated strains (77.77% of sequences) relative to the reference Wuhan-Hu-1, of which 84.40% were mutated at only a single amino acid (aa). An outlier strain (EPI_ISL_463893) from Bosnia and Herzegovina possessed six aa substitutions. We also identified 11 residues with high aa mutation frequency, each showing four types of aa variation. The infamous D614G variant has spread worldwide with ever-rising dominance and across regions with different climatic conditions, alongside the L5F and D936Y mutants, which have been documented throughout all regions and climate zones, respectively. We also found 988 unique aa substitutions spanning 660 residues, which differed significantly among continents (p = .003) and climatic zones (p = .021), as inferred with the Kruskal-Wallis test. In addition, we identified 17 in-frame deletions at four sites adjacent to the receptor-binding domain that may have an impact on attenuation. This study provides a fast and accurate pipeline for identifying mutations and deletions in large datasets of coding as well as non-coding sequences, as evidenced by the representative analysis of existing S protein data. By using separate multi-sequence alignment, removing ambiguous sequences and in-frame stop codons, and utilizing pairwise alignment, this method can derive both synonymous and non-synonymous mutations (strain_ID reference aa:mutation position:strain aa).
We suggest that the pipeline will aid the evolutionary surveillance of any SARS-CoV-2-encoded protein and will prove crucial in tracking the ever-increasing variation of many other divergent RNA viruses in the future. The code is available at https://github.com/SShaminur/Mutation-Analysis.
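The filtering step and the mutation notation described in the abstract (strain_ID reference aa:mutation position:strain aa) can be illustrated with a minimal Python sketch. This is not the authors' code from the linked repository: the function names `is_clean` and `call_mutations` are hypothetical, and the sketch assumes the strain and reference coding sequences are already aligned, gap-free, and of equal length, so it skips the multi-sequence and pairwise alignment steps and does not handle deletions.

```python
from itertools import product

# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def is_clean(cds):
    """Return True if the coding sequence contains only A/C/G/T, has a
    length divisible by 3, and has no stop codon before its last codon."""
    cds = cds.upper()
    if len(cds) % 3 or set(cds) - set("ACGT"):
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return all(CODON_TABLE[c] != "*" for c in codons[:-1])


def call_mutations(strain_id, ref_cds, strain_cds):
    """Compare two equal-length, gap-free coding sequences codon by codon.

    Returns (label, kind) tuples, where label follows the notation
    'strain_ID reference aa:mutation position:strain aa' and kind is
    'synonymous' or 'non-synonymous'. Assumption: inputs are pre-aligned.
    """
    ref_cds, strain_cds = ref_cds.upper(), strain_cds.upper()
    if len(ref_cds) != len(strain_cds):
        raise ValueError("sequences must have equal length in this sketch")
    records = []
    for i in range(0, len(ref_cds), 3):
        ref_codon, alt_codon = ref_cds[i:i + 3], strain_cds[i:i + 3]
        if ref_codon == alt_codon:
            continue  # identical codon: nothing to report
        ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
        kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
        records.append((f"{strain_id} {ref_aa}:{i // 3 + 1}:{alt_aa}", kind))
    return records


if __name__ == "__main__":
    reference = "ATGGATTTT"  # toy sequence encoding M-D-F, not real spike data
    strain = "ATGGGTTTC"     # toy sequence encoding M-G-F
    if is_clean(strain):
        for label, kind in call_mutations("toy_strain", reference, strain):
            print(label, kind)
```

In this toy input, the second codon changes the amino acid (D to G) while the third changes only the codon (TTT to TTC, both F), so the sketch reports one non-synonymous and one synonymous record. Real data with insertions or deletions would need an alignment step before this comparison.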
Keywords: Climate; Geography; Mutations; SARS-CoV-2; Spike (S) protein; COVID-19
DOI: 10.1111/tbed.13834
WOS keywords: 2019-NCOV ; GENETICS
WOS research areas: Infectious Diseases ; Veterinary Sciences
WOS categories: Infectious Diseases ; Veterinary Sciences
Publisher: WILEY
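Document type: Journal article
Collections: COVID-19; Evidence Integration for Evidence-Based Social Science
Author affiliations: 1. Univ Dhaka;
2. Bangabandhu Sheikh Mujibur Rahman Agr Univ;
3. Jashore Univ Sci & Technol;
4. Jahangirnagar Univ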
Recommended citation
GB/T 7714
Rahman, Mohammed Shaminur,Islam, Mohammed Rafiul,Hoque, Mohammed Nazmul,et al. Comprehensive annotations of the mutational spectra of SARS-CoV-2 spike protein: a fast and accurate pipeline[J]. TRANSBOUNDARY AND EMERGING DISEASES,2020.
APA Rahman, Mohammed Shaminur.,Islam, Mohammed Rafiul.,Hoque, Mohammed Nazmul.,Akther, Masuda.,Puspo, Joynob Akter.,...&Alam, Abu Sayed Mohammad Rubayet Ul.(2020).Comprehensive annotations of the mutational spectra of SARS-CoV-2 spike protein: a fast and accurate pipeline.TRANSBOUNDARY AND EMERGING DISEASES.
MLA Rahman, Mohammed Shaminur,et al."Comprehensive annotations of the mutational spectra of SARS-CoV-2 spike protein: a fast and accurate pipeline".TRANSBOUNDARY AND EMERGING DISEASES (2020).
File in this record: Rahman-Comprehensive (2359 KB); journal article; published version; open access; license CC BY-NC-SA