Identifying single copy orthologs in Metazoa.

Affiliation

Teagasc, Animal & Grassland Research and Innovation Centre, Grange, Dunsany, County Meath, Ireland.

Publication information

PLoS Comput Biol. 2011 Dec;7(12):e1002269. doi: 10.1371/journal.pcbi.1002269. Epub 2011 Dec 1.

Abstract

The identification of single copy (1-to-1) orthologs in any group of organisms is important for functional classification and phylogenetic studies. The Metazoa are no exception, but only recently has there been a wide enough distribution of taxa with sufficiently high-quality sequenced genomes to gain confidence in the widespread single copy status of a gene.

Here, we present a phylogenetic approach for identifying overlooked single copy orthologs from multigene families and apply it to the Metazoa. Using 18 sequenced metazoan genomes of high quality, we identified a robust set of 1,126 orthologous groups that have been retained in single copy since the last common ancestor of Metazoa. We found that the phylogenetic procedure increased the number of single copy orthologs found by over a third compared with standard taxon-count approaches. The orthologs represented a wide range of functional categories, expression profiles and levels of divergence.

To demonstrate the value of our set of single copy orthologs, we used them to assess the completeness of 24 currently published metazoan genomes and 62 EST datasets. We found that the annotated genes in published genomes vary in coverage from 79% (Ciona intestinalis) to 99.8% (human), with an average of 92%, suggesting a value for the underlying error rate in genome annotation and a strategy for identifying single copy orthologs in larger datasets. In contrast, the vast majority of EST datasets with no corresponding genome sequence available are largely under-sampled and probably do not accurately represent the actual genomic complement of the organisms from which they are derived.
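As a rough illustration of the two ideas the abstract relies on, the Python sketch below shows a simple taxon-count filter (the kind of baseline the abstract contrasts with the phylogenetic procedure) and a coverage percentage of the kind reported for published genomes. It is not the authors' pipeline: the function names, the data layout, the toy identifiers, and the strict "exactly one gene in every taxon" rule are all assumptions made for the example, and the phylogenetic step that recovers single copy orthologs from multigene families is not reproduced here.

```python
from collections import Counter

def single_copy_groups(groups, taxa):
    """Taxon-count filter (hypothetical helper, not the authors' code).

    groups maps a group ID to a list of (taxon, gene_id) pairs.
    A group is kept only if every taxon in `taxa` contributes exactly one gene.
    """
    selected = []
    for group_id, members in groups.items():
        counts = Counter(taxon for taxon, _gene in members)
        if all(counts.get(taxon, 0) == 1 for taxon in taxa):
            selected.append(group_id)
    return selected

def completeness(found_ids, reference_ids):
    """Percentage of reference single copy orthologs found in a dataset."""
    reference = set(reference_ids)
    if not reference:
        raise ValueError("reference set is empty")
    return 100.0 * len(reference & set(found_ids)) / len(reference)

# Toy data, invented purely for illustration.
taxa = ["human", "fly", "sea_squirt"]
groups = {
    "OG1": [("human", "h1"), ("fly", "f1"), ("sea_squirt", "c1")],
    # OG2 has two human genes, so a plain taxon count rejects it.
    "OG2": [("human", "h2"), ("human", "h3"), ("fly", "f2"), ("sea_squirt", "c2")],
    # OG3 is missing from sea_squirt, so it is rejected as well.
    "OG3": [("human", "h4"), ("fly", "f3")],
}

kept = single_copy_groups(groups, taxa)
coverage = completeness({"OG1", "OG3"}, ["OG1", "OG2", "OG3", "OG4"])
```

In this toy case only `OG1` passes the filter. A group such as `OG2` is the kind of multigene family where a tree-based inspection might still recover a single copy ortholog that a plain count overlooks, which is the gap the abstract says the phylogenetic approach addresses.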

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/675c/3228760/a005ef85db82/pcbi.1002269.g001.jpg
