Epidemiological associations with genomic variation in SARS-CoV-2.

Affiliations

Computational Biology Institute, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC, USA.

CIBIO-InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus Agrário de Vairão, Vairão, Portugal.

Publication information

Sci Rep. 2021 Nov 26;11(1):23023. doi: 10.1038/s41598-021-02548-w.

Abstract

SARS-CoV-2 (CoV) is the etiological agent of the COVID-19 pandemic and evolves to evade both host immune systems and intervention strategies. We divided the CoV genome into 29 constituent regions and applied novel analytical approaches to identify associations between CoV genomic features and epidemiological metadata. Our results show that nonstructural protein 3 (nsp3) and Spike protein (S) have the highest variation and greatest correlation with the viral whole-genome variation. S protein variation is correlated with nsp3, nsp6, and 3'-to-5' exonuclease variation. Country of origin and time since the start of the pandemic were the most influential metadata associated with genomic variation, while host sex and age were the least influential. We define a novel statistic, coherence, and show its utility in identifying geographic regions (populations) with unusually high (many new variants) or low (isolated) viral phylogenetic diversity. Interestingly, at both global and regional scales, we identify geographic locations with high coherence neighboring regions of low coherence; this emphasizes the utility of this metric to inform public health measures for disease spread. Our results provide a direction to prioritize genes associated with outcome predictors (e.g., health, therapeutic, and vaccine outcomes) and to improve DNA tests for predicting disease status.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96da/8626494/06568b8d6fac/41598_2021_2548_Fig1_HTML.jpg
