Orthonome - a new pipeline for predicting high quality orthologue gene sets applicable to complete and draft genomes.

Authors

Rane Rahul V, Oakeshott John G, Nguyen Thu, Hoffmann Ary A, Lee Siu F

Affiliations

Bio21 Institute, School of Biosciences, The University of Melbourne, Melbourne, Victoria, Australia.

CSIRO, Canberra, Australian Capital Territory, Australia.

Publication information

BMC Genomics. 2017 Aug 31;18(1):673. doi: 10.1186/s12864-017-4079-6.

DOI:10.1186/s12864-017-4079-6
PMID:28859620
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5580312/
Abstract

BACKGROUND

Distinguishing orthologous and paralogous relationships between genes across multiple species is essential for comparative genomic analyses. Various computational approaches have been developed to resolve these evolutionary relationships, but the strong trade-off between precision and recall in orthologue prediction remains an ongoing challenge.
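To make the precision/recall trade-off concrete, here is a minimal sketch of how the two measures can be computed for a set of predicted orthologue pairs against a trusted reference set. The function name `precision_recall` and the gene identifiers are hypothetical and chosen only for illustration; they are not taken from the paper or from any benchmarking tool.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for predicted orthologue pairs.

    Both arguments are iterables of 2-tuples of gene identifiers.
    This is an illustrative helper, not part of Orthonome.
    """
    # Treat each pair as unordered so (a, b) and (b, a) count as the same call.
    pred = {frozenset(pair) for pair in predicted}
    ref = {frozenset(pair) for pair in reference}
    true_pos = len(pred & ref)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    return precision, recall


# Hypothetical toy data: four reference pairs, three predictions, two correct.
reference = [
    ("dmel_A", "dsim_A"),
    ("dmel_B", "dsim_B"),
    ("dmel_C", "dsim_C"),
    ("dmel_D", "dsim_D"),
]
predicted = [
    ("dsim_A", "dmel_A"),  # correct, order reversed
    ("dmel_B", "dsim_B"),  # correct
    ("dmel_C", "dsim_X"),  # wrong partner
]
precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A stricter pipeline would typically drop uncertain calls such as the third one, which raises precision but cannot recover the missed reference pairs; that tension is the trade-off described above.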

RESULTS

Here we present Orthonome, an orthologue prediction pipeline designed to reduce the trade-off between orthologue capture rate (recall) and accuracy in multi-species orthologue prediction. The pipeline compares sequence domains and then forms sequence-similar clusters before using phylogenetic comparisons to identify inparalogues. It then corrects sequence similarity metrics for fragment and gene length bias using a novel scoring metric that captures relationships between full-length as well as fragmented genes. The remaining genes are then brought together for the identification of orthologues within a phylogenetic framework. The orthologue predictions are further calibrated, along with inparalogues and gene births, using synteny to identify novel orthologous relationships. We use 12 high-quality Drosophila genomes to show that, compared to other orthologue prediction pipelines, Orthonome provides orthogroups with minimal error but high recall. Furthermore, Orthonome is resilient to suboptimal assembly/annotation quality: including draft genomes from eight additional Drosophila species still yields >6500 1:1 orthologues across all twenty species while retaining a better combination of accuracy and recall than other pipelines. Orthonome is implemented as a searchable database and query tool along with multiple-sequence alignment browsers for all sets of orthologues. The underlying documentation and database are accessible at http://www.orthonome.com.
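The figure of ">6500 1:1 orthologues across all twenty species" refers to orthogroups that contain exactly one gene from every species. The sketch below shows one way such groups could be picked out of a set of orthogroups. The data layout (a dictionary mapping a group ID to `(species, gene)` tuples), the function name `one_to_one_groups`, and the identifiers are assumptions made for illustration; they do not reflect Orthonome's actual output format.

```python
from collections import Counter


def one_to_one_groups(orthogroups, species):
    """Return IDs of orthogroups with exactly one gene from every species.

    orthogroups: dict mapping group ID -> list of (species, gene) tuples.
    species: iterable of all species that must be represented.
    """
    wanted = set(species)
    result = []
    for group_id, members in orthogroups.items():
        # Count how many genes each species contributes to this group.
        counts = Counter(sp for sp, _gene in members)
        if set(counts) == wanted and all(n == 1 for n in counts.values()):
            result.append(group_id)
    return result


# Hypothetical toy input with three species.
species = ["dmel", "dsim", "dyak"]
orthogroups = {
    "OG1": [("dmel", "g1"), ("dsim", "g7"), ("dyak", "g3")],                  # 1:1:1
    "OG2": [("dmel", "g2"), ("dmel", "g9"), ("dsim", "g8"), ("dyak", "g4")],  # duplicate in dmel
    "OG3": [("dmel", "g5"), ("dsim", "g6")],                                  # dyak missing
}
single_copy = one_to_one_groups(orthogroups, species)
print(single_copy)
```

Under this definition a single missing or fragmented gene model in one draft genome removes the whole group from the 1:1 set, which is why the count is a sensitive indicator of robustness to assembly and annotation quality.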

CONCLUSION

We demonstrate that Orthonome provides a superior combination of orthologue capture rate and accuracy on complete and draft drosophilid genomes when tested alongside previously published pipelines. The study also highlights a greater degree of evolutionary conservation across drosophilid species than previously thought.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e758/5580312/212ca1b92741/12864_2017_4079_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e758/5580312/071f5f0194a4/12864_2017_4079_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e758/5580312/b85948d07a2d/12864_2017_4079_Fig2_HTML.jpg