Orthology prediction methods: a quality assessment using curated protein families.

Affiliation

Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany.

Publication information

Bioessays. 2011 Oct;33(10):769-80. doi: 10.1002/bies.201100062. Epub 2011 Aug 19.

DOI: 10.1002/bies.201100062
PMID: 21853451
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3193375/

Abstract

The increasing number of sequenced genomes has prompted the development of several automated orthology prediction methods. Tests to evaluate the accuracy of predictions and to explore biases caused by biological and technical factors are therefore required. We used 70 manually curated families to analyze the performance of five public methods in Metazoa. We analyzed the strengths and weaknesses of the methods and quantified the impact of biological and technical challenges. From the latter part of the analysis, genome annotation emerged as the largest single influencer, affecting up to 30% of the performance. Generally, most methods did well in assigning orthologous groups, but they failed to assign the exact number of genes for half of the groups. The publicly available benchmark set (http://eggnog.embl.de/orthobench/) should facilitate the improvement of current orthology assignment protocols, which is of utmost importance for many fields of biology and should be tackled by a broad scientific community.
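
As a rough illustration of what scoring predictions against curated families can look like, the sketch below compares a predicted orthologous group with a reference group and reports whether the gene count and membership match exactly. It is not the procedure used in the paper: the function names, the gene identifiers, and the assumption that each predicted group is already paired with its reference family by name are all hypothetical.

```python
from typing import Dict, Iterable, List


def compare_group(reference: Iterable[str], predicted: Iterable[str]) -> Dict[str, object]:
    """Compare one predicted group with its curated reference group."""
    ref, pred = set(reference), set(predicted)
    missing: List[str] = sorted(ref - pred)   # reference genes the method left out
    extra: List[str] = sorted(pred - ref)     # genes the method wrongly included
    return {"exact": not missing and not extra, "missing": missing, "extra": extra}


def fraction_exact(reference_groups: Dict[str, List[str]],
                   predicted_groups: Dict[str, List[str]]) -> float:
    """Fraction of reference families reproduced with exactly the right genes."""
    if not reference_groups:
        return 0.0
    exact = sum(
        compare_group(genes, predicted_groups.get(name, []))["exact"]
        for name, genes in reference_groups.items()
    )
    return exact / len(reference_groups)


# Hypothetical toy data: two curated families and one method's predictions.
reference_groups = {
    "famA": ["hsa1", "mmu1", "dme1"],
    "famB": ["hsa2", "mmu2"],
}
predicted_groups = {
    "famA": ["hsa1", "mmu1", "dme1"],   # exact match
    "famB": ["hsa2", "mmu2", "mmu3"],   # one extra gene
}

result_b = compare_group(reference_groups["famB"], predicted_groups["famB"])
score = fraction_exact(reference_groups, predicted_groups)
print(result_b, score)
```

In this toy case the method finds both families but gets the exact gene set for only one of them, which mirrors the kind of gap the abstract describes between assigning a group and assigning its exact membership.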
