Tree reconciliation combined with subsampling improves large scale inference of orthologous group hierarchies.

Affiliations

Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057, Switzerland.

SIB Swiss Institute of Bioinformatics, Quartier Sorge, Batiment Genopode, Lausanne, 1015, Switzerland.

Publication information

BMC Bioinformatics. 2019 May 6;20(1):228. doi: 10.1186/s12859-019-2828-z.

DOI:10.1186/s12859-019-2828-z
PMID:31060495
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6501302/
Abstract

BACKGROUND

An orthologous group (OG) comprises a set of orthologous and paralogous genes that share a last common ancestor (LCA). OGs are defined with respect to a chosen taxonomic level, which delimits the position of the LCA in time to a specified speciation event. A hierarchy of OGs expands on this notion, connecting more general OGs, distant in time, to more recent, fine-grained OGs, thereby spanning multiple levels of the tree of life. Large scale inference of OG hierarchies with independently computed taxonomic levels can suffer from inconsistencies between successive levels, such as the position in time of a duplication event. This can be due to confounding genetic signal or algorithmic limitations. Importantly, inconsistencies limit the potential use of OGs for functional annotation and third-party applications.
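The kind of inconsistency described above can be illustrated with a small check. The sketch below assumes one simple notion of consistency that is not spelled out in the abstract: every OG at a finer taxonomic level should be fully contained in exactly one OG at the parent level, and each protein belongs to at most one OG per level. The function name, OG ids, and protein ids are hypothetical.

```python
from typing import Dict, List, Set


def find_inconsistent_ogs(
    parent_ogs: Dict[str, Set[str]],
    child_ogs: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Return child OGs whose members do not sit inside exactly one parent OG.

    The value lists the parent OG ids that the child's members map to
    (empty if no member is assigned to any parent OG).
    """
    # Index every protein by the parent-level OG that contains it
    # (assumes a protein appears in at most one parent OG).
    protein_to_parent = {
        protein: og_id
        for og_id, members in parent_ogs.items()
        for protein in members
    }

    inconsistent = {}
    for og_id, members in child_ogs.items():
        parents = {protein_to_parent[p] for p in members if p in protein_to_parent}
        unassigned = any(p not in protein_to_parent for p in members)
        if len(parents) != 1 or unassigned:
            inconsistent[og_id] = sorted(parents)
    return inconsistent


# Toy data: protein ids and OG ids are made up for illustration.
parent_level = {"P1": {"a", "b", "c", "d"}, "P2": {"e", "f"}}
child_level = {"C1": {"a", "b"}, "C2": {"c", "e"}, "C3": {"f"}}

print(find_inconsistent_ogs(parent_level, child_level))  # {'C2': ['P1', 'P2']}
```

Here the child OG `C2` straddles two parent OGs, which is the sort of conflict between successive levels that a consistency step would have to resolve.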

RESULTS

Here we present a new methodology to ensure hierarchical consistency of OGs across taxonomic levels. To resolve an inconsistency, we subsample the protein space of the OG members and perform gene tree-species tree reconciliation for each sampling. Unlike previous approaches, subsampling the protein space lets us avoid the notoriously difficult task of accurately building and reconciling very large phylogenies. We implement the method in a high-throughput pipeline and apply it to the eggNOG database. We use independent protein domain definitions to validate its performance.
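The subsample-then-reconcile idea can be sketched as a voting loop. This is a minimal illustration under stated assumptions, not the pipeline's actual code: the `reconcile` argument is a hypothetical caller-supplied function that builds and reconciles a small tree for the sampled proteins and returns a decision label (the labels `"merge"` and `"split"` are made up here), and a simple majority vote stands in for whatever aggregation the authors use.

```python
import random
from collections import Counter
from typing import Callable, Sequence, Tuple


def resolve_by_sampling(
    members: Sequence[str],
    reconcile: Callable[[Sequence[str]], str],
    n_samples: int = 10,
    sample_size: int = 20,
    seed: int = 0,
) -> Tuple[str, float]:
    """Reconcile several small random subsets and return the majority decision.

    Returns the winning decision label and the fraction of samples supporting it.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    pool = sorted(members)
    k = min(sample_size, len(pool))

    votes = Counter()
    for _ in range(n_samples):
        sample = rng.sample(pool, k)   # draw k proteins without replacement
        votes[reconcile(sample)] += 1  # one small reconciliation per sample

    decision, count = votes.most_common(1)[0]
    return decision, count / n_samples


# Toy stand-in for a real reconciliation: proteins are tagged with a made-up
# clade prefix, and a sample containing both clades is called a "split".
def toy_reconcile(sample: Sequence[str]) -> str:
    clades = {protein.split("_")[0] for protein in sample}
    return "split" if len(clades) > 1 else "merge"


proteins = [f"A_{i}" for i in range(4)] + [f"B_{i}" for i in range(4)]
decision, support = resolve_by_sampling(proteins, toy_reconcile, sample_size=4)
print(decision, support)
```

Each reconciliation only ever sees `sample_size` proteins, so no single large tree has to be built, and the support fraction gives a rough measure of how stable the decision is across samples.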

CONCLUSION

The presented consistency pipeline shows that, contrary to previous limitations, tree reconciliation can be a useful instrument for the construction of OG hierarchies. The key lies in the combination of sampling smaller trees and aggregating their reconciliations for robustness. Results show performance comparable to or greater than that of previous pipelines. The code is available on GitHub (https://github.com/meringlab/og_consistency_pipeline).

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9521/6501302/1f3c0ebbfe24/12859_2019_2828_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9521/6501302/9ff5318e2ba0/12859_2019_2828_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9521/6501302/a0a3d91f2d57/12859_2019_2828_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9521/6501302/0f682ffc253d/12859_2019_2828_Fig4_HTML.jpg

Similar articles

1. Tree reconciliation combined with subsampling improves large scale inference of orthologous group hierarchies.
   BMC Bioinformatics. 2019 May 6;20(1):228. doi: 10.1186/s12859-019-2828-z.
2. eggNOG 6.0: enabling comparative genomics across 12 535 organisms.
   Nucleic Acids Res. 2023 Jan 6;51(D1):D389-D394. doi: 10.1093/nar/gkac1022.
3. Structural properties of the reconciliation space and their applications in enumerating nearly-optimal reconciliations between a gene tree and a species tree.
   BMC Bioinformatics. 2011 Oct 5;12 Suppl 9(Suppl 9):S7. doi: 10.1186/1471-2105-12-S9-S7.
4. eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences.
   Nucleic Acids Res. 2016 Jan 4;44(D1):D286-93. doi: 10.1093/nar/gkv1248. Epub 2015 Nov 17.
5. BPhyOG: an interactive server for genome-wide inference of bacterial phylogenies based on overlapping genes.
   BMC Bioinformatics. 2007 Jul 25;8:266. doi: 10.1186/1471-2105-8-266.
6. eggNOG v2.0: extending the evolutionary genealogy of genes with enhanced non-supervised orthologous groups, species and functional annotations.
   Nucleic Acids Res. 2010 Jan;38(Database issue):D190-5. doi: 10.1093/nar/gkp951. Epub 2009 Nov 9.
7. eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses.
   Nucleic Acids Res. 2019 Jan 8;47(D1):D309-D314. doi: 10.1093/nar/gky1085.
8. OGtree: a tool for creating genome trees of prokaryotes based on overlapping genes.
   Nucleic Acids Res. 2008 Jul 1;36(Web Server issue):W475-80. doi: 10.1093/nar/gkn240. Epub 2008 May 2.
9. Rickettsia phylogenomics: unwinding the intricacies of obligate intracellular life.
   PLoS One. 2008;3(4):e2018. doi: 10.1371/journal.pone.0002018. Epub 2008 Apr 16.
10. Reconstructing protein and gene phylogenies using reconciliation and soft-clustering.
    J Bioinform Comput Biol. 2017 Dec;15(6):1740007. doi: 10.1142/S0219720017400078. Epub 2017 Oct 19.

Cited by

1. OrthoDB v11: annotation of orthologs in the widest sampling of organismal diversity.
   Nucleic Acids Res. 2023 Jan 6;51(D1):D445-D451. doi: 10.1093/nar/gkac998.
