GenomegaMap: Within-Species Genome-Wide dN/dS Estimation from over 10,000 Genomes.

Affiliation

Big Data Institute, Nuffield Department of Population Health, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom.

Publication information

Mol Biol Evol. 2020 Aug 1;37(8):2450-2460. doi: 10.1093/molbev/msaa069.

Abstract

The dN/dS ratio provides evidence of adaptation or functional constraint in protein-coding genes by quantifying the relative excess or deficit of amino acid-replacing versus silent nucleotide variation. Inexpensive sequencing promises a better understanding of parameters, such as dN/dS, but analyzing very large data sets poses a major statistical challenge. Here, I introduce genomegaMap for estimating within-species genome-wide variation in dN/dS, and I apply it to 3,979 genes across 10,209 tuberculosis genomes to characterize the selection pressures shaping this global pathogen. GenomegaMap is a phylogeny-free method that addresses two major problems with existing approaches: 1) It is fast no matter how large the sample size and 2) it is robust to recombination, which causes phylogenetic methods to report artefactual signals of adaptation. GenomegaMap uses population genetics theory to approximate the distribution of allele frequencies under general, parent-dependent mutation models. Coalescent simulations show that substitution parameters are well estimated even when genomegaMap's simplifying assumption of independence among sites is violated. I demonstrate the ability of genomegaMap to detect genuine signatures of selection at antimicrobial resistance-conferring substitutions in Mycobacterium tuberculosis and describe a novel signature of selection in the cold-shock DEAD-box protein A gene deaD/csdA. The genomegaMap approach helps accelerate the exploitation of big data for gaining new insights into evolution within species.
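To make the quantity in the abstract concrete, the following is a minimal sketch of a naive counting-style dN/dS calculation on single-nucleotide codon changes. It is not the genomegaMap method, which is phylogeny-free and based on allele frequency distributions; it only illustrates what "amino acid-replacing versus silent" variation means. The function names (`classify`, `synonymous_sites`, `crude_dnds`) and the example codon pairs are hypothetical, and the sketch assumes the standard genetic code, treats changes to stop codons as nonsynonymous, and applies no multiple-hit correction.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify(ref, alt):
    """Label a single-nucleotide codon change as synonymous or nonsynonymous."""
    if sum(a != b for a, b in zip(ref, alt)) != 1:
        raise ValueError("codons must differ at exactly one position")
    same = CODON_TABLE[ref] == CODON_TABLE[alt]
    return "synonymous" if same else "nonsynonymous"


def synonymous_sites(codon):
    """Expected number of synonymous sites in a codon (between 0 and 3).

    Each of the 9 possible single-base changes contributes 1/3 of a site
    if it leaves the encoded amino acid unchanged.
    """
    syn = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                syn += 1
    return syn / 3.0


def crude_dnds(changes):
    """Naive dN/dS from a list of (ref_codon, alt_codon) pairs.

    Toy estimator: per-site rate of nonsynonymous changes divided by the
    per-site rate of synonymous changes, with site counts taken from the
    reference codons only.
    """
    s_sites = sum(synonymous_sites(ref) for ref, _ in changes)
    n_sites = 3 * len(changes) - s_sites
    kinds = [classify(ref, alt) for ref, alt in changes]
    s_diff = kinds.count("synonymous")
    n_diff = kinds.count("nonsynonymous")
    if s_diff == 0 or s_sites == 0 or n_sites == 0:
        raise ValueError("dN/dS is undefined for these counts")
    return (n_diff / n_sites) / (s_diff / s_sites)


if __name__ == "__main__":
    # Hypothetical changes: two silent, one amino acid-replacing.
    example = [("TTT", "TTC"), ("TCG", "TTG"), ("GGG", "GGA")]
    print(crude_dnds(example))
```

A value below 1 from such a count suggests a deficit of amino acid-replacing changes (constraint), and a value above 1 an excess (possible adaptation). Counting of this kind ignores sampling, mutation bias, and recombination, which are among the issues the abstract says genomegaMap is designed to handle.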

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da9d/7403622/229d9fa6b032/msaa069f1.jpg
