The community-curated Pristionchus pacificus genome facilitates automated gene annotation improvement in related nematodes.

Affiliation

Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany.

Publication information

BMC Genomics. 2021 Mar 25;22(1):216. doi: 10.1186/s12864-021-07529-x.

Abstract

BACKGROUND

The nematode Pristionchus pacificus is an established model organism for comparative studies with Caenorhabditis elegans. Over the past years, it has developed into an independent animal model for elucidating the genetic basis of phenotypic plasticity. Community-based curation was recently employed to improve the quality of P. pacificus gene annotations and to facilitate reverse genetic studies using candidate genes from C. elegans.

RESULTS

Here, I demonstrate that reannotating phylogenomic data from nine related nematode species, using the community-curated P. pacificus gene set as homology data, substantially improves the quality of gene annotations. Benchmarking of universal single-copy orthologs (BUSCO) estimates a median completeness of 84%, which corresponds to a 9% increase over previous annotations. Nevertheless, the ability to infer gene models based on homology already drops beyond the genus level, reflecting the rapid evolution of nematode lineages. This also indicates that the highly curated C. elegans genome is not optimally suited for homology-based annotation of non-Caenorhabditis genomes. Furthermore, comparative genomic analysis of apparently missing BUSCO genes indicates that the BUSCO pipeline fails to detect some orthologs because of the insufficient sample size and phylogenetic breadth of the underlying OrthoDB data set. As a consequence, the quality of multiple divergent nematode genomes might be underestimated.
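As an illustration of the arithmetic behind a statement such as "median completeness of 84%", the following minimal Python sketch computes a BUSCO-style completeness percentage from category counts and compares medians across species. It is a sketch under stated assumptions, not the pipeline used in the study: the function names `completeness` and `median_gain` are hypothetical, completeness is taken as complete single-copy plus complete duplicated genes over all genes searched, and the gain is reported in percentage points. The abstract does not say whether its 9% figure is absolute or relative.

```python
from statistics import median


def completeness(single: int, duplicated: int, fragmented: int, missing: int) -> float:
    """Percent of searched BUSCO genes found complete.

    Assumption: 'complete' means single-copy plus duplicated hits,
    and the denominator is the sum of all four categories.
    """
    total = single + duplicated + fragmented + missing
    if total == 0:
        raise ValueError("no BUSCO genes counted")
    return 100.0 * (single + duplicated) / total


def median_gain(old: dict[str, float], new: dict[str, float]) -> tuple[float, float]:
    """Return (median new completeness, gain over the old median).

    Only species present in both annotation sets are compared.
    The gain is a difference of medians in percentage points (an assumption).
    """
    shared = sorted(old.keys() & new.keys())
    if not shared:
        raise ValueError("no species shared between the two annotation sets")
    old_median = median(old[s] for s in shared)
    new_median = median(new[s] for s in shared)
    return new_median, new_median - old_median


if __name__ == "__main__":
    # Toy values for illustration only; these are NOT results from the paper.
    previous = {"species_a": 70.0, "species_b": 75.0, "species_c": 80.0}
    reannotated = {"species_a": 80.0, "species_b": 84.0, "species_c": 90.0}
    print(median_gain(previous, reannotated))
```

Restricting the comparison to species present in both sets keeps the two medians comparable when an annotation is missing for one species.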

CONCLUSIONS

This study highlights the need to optimize gene annotation protocols and demonstrates the benefit of a high-quality genome for phylogenomic data of related species.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd9e/7992802/992d3891c02f/12864_2021_7529_Fig1_HTML.jpg
