Systematic errors in orthology inference and their effects on evolutionary analyses.

Authors

Natsidis Paschalis, Kapli Paschalia, Schiffer Philipp H, Telford Maximilian J

Affiliation

Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Ecology, University College London, London WC1E 6BT, UK.

Publication

iScience. 2021 Jan 28;24(2):102110. doi: 10.1016/j.isci.2021.102110. eCollection 2021 Feb 19.

Abstract

The availability of complete sets of genes from many organisms makes it possible to identify genes unique to (or lost from) certain clades. This information is used to reconstruct phylogenetic trees, to identify genes involved in the evolution of clade-specific novelties, and for phylostratigraphy (identifying the ages of genes in a given species). These investigations rely on accurately predicted orthologs. Here we use simulation to produce sets of orthologs that experience no gains or losses. We show that errors in identifying orthologs increase with higher rates of evolution. We use the predicted sets of orthologs, with errors, to reconstruct phylogenetic trees, to count gains and losses, and for phylostratigraphy. Our simulated data, containing information only from errors in orthology prediction, closely recapitulate findings from empirical data. We suggest that published downstream analyses must be informed to a large extent by errors in orthology prediction that mimic expected patterns of gene evolution.
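
To make the phylostratigraphy point concrete, the following is a minimal Python sketch of how a missed ortholog can make a gene look younger than it is. It is not the authors' pipeline: the stratum names, the `assign_phylostratum` helper, and the presence/absence dictionaries are all hypothetical and chosen only for illustration. The sketch assumes the common convention that a gene's age is the oldest phylostratum in which a homolog is detected.

```python
# Hypothetical phylostrata for a focal species, ordered oldest to youngest.
STRATA = ["stratum_1_oldest", "stratum_2", "stratum_3", "stratum_4_focal"]


def assign_phylostratum(detected):
    """Return the oldest stratum in which a homolog was detected.

    `detected` maps stratum names to True/False. The focal species always
    contains the gene, so the youngest stratum is used as the fallback.
    """
    for stratum in STRATA:
        if detected.get(stratum, False):
            return stratum
    return STRATA[-1]


# True history: the gene is present in every stratum (no gains, no losses).
truth = {stratum: True for stratum in STRATA}

# Prediction with errors: homologs are missed in the two most distant strata,
# as might happen for a fast-evolving gene.
predicted = dict(truth, stratum_1_oldest=False, stratum_2=False)

print(assign_phylostratum(truth))      # stratum_1_oldest
print(assign_phylostratum(predicted))  # stratum_3
```

In this toy case the gene has existed since the oldest stratum, yet the detection failures alone move its inferred origin to a younger stratum, which is the kind of artifact the abstract describes.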

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14b7/7892920/83305e2fe7bd/fx1.jpg
