
Quality assessment of gene repertoire annotations with OMArk.

Authors

Nevers Yannis, Warwick Vesztrocy Alex, Rossier Victor, Train Clément-Marie, Altenhoff Adrian, Dessimoz Christophe, Glover Natasha M

Affiliations

Department of Computational Biology, University of Lausanne, Lausanne, Switzerland.

Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Publication information

Nat Biotechnol. 2025 Jan;43(1):124-133. doi: 10.1038/s41587-024-02147-w. Epub 2024 Feb 21.

Abstract

In the era of biodiversity genomics, it is crucial to ensure that annotations of protein-coding gene repertoires are accurate. State-of-the-art tools to assess genome annotations measure the completeness of a gene repertoire but are blind to other errors, such as gene overprediction or contamination. We introduce OMArk, a software package that relies on fast, alignment-free sequence comparisons between a query proteome and precomputed gene families across the tree of life. OMArk assesses not only the completeness but also the consistency of the gene repertoire as a whole relative to closely related species and reports likely contamination events. Analysis of 1,805 UniProt Eukaryotic Reference Proteomes with OMArk demonstrated strong evidence of contamination in 73 proteomes and identified error propagation in avian gene annotation resulting from the use of a fragmented zebra finch proteome as a reference. This study illustrates the importance of comparing and prioritizing proteomes based on their quality measures.
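To make the idea of alignment-free comparison against precomputed gene families concrete, the following is a minimal sketch and not OMArk's actual implementation. It assumes a simple shared k-mer count as the similarity measure, and every name in it (`kmers`, `build_index`, `place`, `summarize`, the toy sequences and family labels) is hypothetical. It places each query protein in the family with which it shares the most k-mers, then reports the fraction of expected families recovered (a rough completeness proxy) and the proteins that match no family (candidates for overprediction or contamination). The taxonomic consistency analysis described in the abstract is not modeled here.

```python
from collections import Counter


def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(families, k=3):
    """Precompute a map from each k-mer to the families containing it."""
    index = {}
    for name, members in families.items():
        for seq in members:
            for kmer in kmers(seq, k):
                index.setdefault(kmer, set()).add(name)
    return index


def place(query, index, k=3, min_shared=2):
    """Assign a query to the family sharing the most k-mers, else None."""
    hits = Counter()
    for kmer in kmers(query, k):
        for name in index.get(kmer, ()):
            hits[name] += 1
    if not hits:
        return None
    name, shared = hits.most_common(1)[0]
    return name if shared >= min_shared else None


def summarize(proteome, index, expected, k=3):
    """Return (fraction of expected families found, unplaced protein IDs)."""
    placements = {pid: place(seq, index, k) for pid, seq in proteome.items()}
    found = {fam for fam in placements.values() if fam is not None}
    completeness = len(found & expected) / len(expected)
    unplaced = [pid for pid, fam in placements.items() if fam is None]
    return completeness, unplaced


# Toy data (hypothetical): three reference families, three query proteins.
families = {
    "famA": ["MKTAYIAKQR", "MKTAYLAKQR"],
    "famB": ["GSHMLEDPVD"],
    "famC": ["WWPPCCNNHH"],
}
proteome = {
    "p1": "MKTAYIAKQK",  # close to famA
    "p2": "GSHMLEDPAA",  # close to famB
    "p3": "QQQQQQQQQQ",  # matches nothing in the index
}

index = build_index(families)
completeness, unplaced = summarize(proteome, index, expected=set(families))
print(f"completeness: {completeness:.2f}, unplaced: {unplaced}")
```

Because the family index is built once and each query needs only dictionary lookups, no pairwise alignment is performed, which is the general reason alignment-free approaches scale to large numbers of proteomes.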

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c21c/11738984/6b837bdfca4d/41587_2024_2147_Fig1_HTML.jpg
