Estimates of positive Darwinian selection are inflated by errors in sequencing, annotation, and alignment.

Affiliation

ETH Zürich, Zürich, Switzerland.

Publication information

Genome Biol Evol. 2009 Jun 5;1:114-8. doi: 10.1093/gbe/evp012.

Abstract

Published estimates of the proportion of positively selected genes (PSGs) in human vary over three orders of magnitude. In mammals, estimates of the proportion of PSGs cover an even wider range of values. We used 2,980 orthologous protein-coding genes from human, chimpanzee, macaque, dog, cow, rat, and mouse as well as an established phylogenetic topology to infer the fraction of PSGs in all seven terminal branches. The inferred fraction of PSGs ranged from 0.9% in human through 17.5% in macaque to 23.3% in dog. We found three factors that influence the fraction of genes that exhibit telltale signs of positive selection: the quality of the sequence, the degree of misannotation, and ambiguities in the multiple sequence alignment. The inferred fraction of PSGs in sequences that are deficient in all three criteria of coverage, annotation, and alignment is 7.2 times higher than that in genes with high trace sequencing coverage, "known" annotation status, and perfect alignment scores. We conclude that some estimates on the prevalence of positive Darwinian selection in the literature may be inflated and should be treated with caution.
