Large-Scale Comparative Analysis of Codon Models Accounting for Protein and Nucleotide Selection.

Affiliations

Department of Computational Biology, Biophore, University of Lausanne, Lausanne, Switzerland.

Department of Ecology and Evolution, Biophore, University of Lausanne, Lausanne, Switzerland.

Publication information

Mol Biol Evol. 2019 Jun 1;36(6):1316-1332. doi: 10.1093/molbev/msz048.

Abstract

There are numerous sources of variation in the rate of synonymous substitutions inside genes, such as direct selection on the nucleotide sequence, or mutation rate variation. Yet scans for positive selection rely on codon models which incorporate an assumption of effectively neutral synonymous substitution rate, constant between sites of each gene. Here we perform a large-scale comparison of approaches which incorporate codon substitution rate variation and propose our own simple yet effective modification of existing models. We find strong effects of substitution rate variation on positive selection inference. More than 70% of the genes detected by the classical branch-site model are presumably false positives caused by the incorrect assumption of uniform synonymous substitution rate. We propose a new model which is strongly favored by the data while remaining computationally tractable. With the new model we can capture signatures of nucleotide level selection acting on translation initiation and on splicing sites within the coding region. Finally, we show that rate variation is highest in the highly recombining regions, and we propose that recombination and mutation rate variation, such as high CpG mutation rate, are the two main sources of nucleotide rate variation. Although we detect fewer genes under positive selection in Drosophila than without rate variation, the genes which we detect contain a stronger signal of adaptation of dynein, which could be associated with Wolbachia infection. We provide software to perform positive selection analysis using the new model.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/279d/6526913/b0b43ed793e5/msz048f1.jpg
