
Scaling up accurate phylogenetic reconstruction from gene-order data.

Authors

Tang Jijun, Moret Bernard M E

Affiliation

Department of Computer Science, University of New Mexico, Albuquerque, NM 87131, USA.

Publication

Bioinformatics. 2003;19 Suppl 1:i305-12. doi: 10.1093/bioinformatics/btg1042.

DOI:10.1093/bioinformatics/btg1042
PMID:12855474
Abstract

MOTIVATION

Phylogenetic reconstruction from gene-order data has attracted increasing attention from both biologists and computer scientists over the last few years. Methods used in reconstruction include distance-based methods (such as neighbor-joining), parsimony methods using sequence-based encodings, Bayesian approaches, and direct optimization. The latter, pioneered by Sankoff and extended by us with the software suite GRAPPA, is the most accurate approach, but cannot handle more than about 15 genomes of limited size (e.g. organelles).

RESULTS

We report here on our successful efforts to scale up direct optimization through a two-step approach: the first step decomposes the dataset into smaller pieces and runs the direct optimization (GRAPPA) on the smaller pieces, while the second step builds a tree from the results obtained on the smaller pieces. We used the sophisticated disk-covering method (DCM) pioneered by Warnow and her group, suitably modified to take into account the computational limitations of GRAPPA. We find that DCM-GRAPPA scales gracefully to at least 1000 genomes of a few hundred genes each and retains surprisingly high accuracy throughout the range: in our experiments, the topological error rate rarely exceeded a few percent. Thus, reconstruction based on gene-order data can now be accomplished with high accuracy on datasets of significant size.
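The decomposition step described above can be illustrated with a small sketch. This is not the authors' DCM-GRAPPA code and not Warnow's disk-covering method: it is a toy stand-in that assumes signed linear gene orders, uses the breakpoint distance as the pairwise measure, and covers the dataset with overlapping "disks" of genomes lying within a threshold `q` of a center genome. The function names, the threshold `q`, and the example genomes are all hypothetical. Solving each subset (the role GRAPPA plays in the abstract) and merging the resulting subtrees are deliberately left out.

```python
def adjacencies(genome):
    """Signed adjacencies of a linear genome, framed by caps 0 and n+1."""
    capped = [0] + list(genome) + [len(genome) + 1]
    adj = set()
    for a, b in zip(capped, capped[1:]):
        # (a, b) read on the opposite strand is (-b, -a); keep one canonical form.
        adj.add(max((a, b), (-b, -a)))
    return adj


def breakpoint_distance(g1, g2):
    """Number of adjacencies of g1 that are not conserved in g2."""
    return len(adjacencies(g1) - adjacencies(g2))


def threshold_cover(genomes, q):
    """Overlapping subsets: one disk of radius q per genome, keeping maximal disks.

    Assumption: a simplified illustration of decomposing a dataset into
    overlapping pieces, not the published disk-covering method.
    """
    names = sorted(genomes)
    disks = set()
    for center in names:
        disk = frozenset(
            other
            for other in names
            if breakpoint_distance(genomes[center], genomes[other]) <= q
        )
        disks.add(disk)
    # Drop any disk that is strictly contained in another disk.
    maximal = [d for d in disks if not any(d < e for e in disks)]
    return sorted(sorted(d) for d in maximal)


# Hypothetical genomes: each differs from the previous one by a single reversal.
example_genomes = {
    "A": [1, 2, 3, 4, 5],
    "B": [1, -3, -2, 4, 5],
    "C": [1, -3, -2, -5, -4],
    "D": [-1, -3, -2, -5, -4],
}

if __name__ == "__main__":
    for subset in threshold_cover(example_genomes, q=2):
        print(subset)
```

With `q=2` the four example genomes split into two overlapping subsets that share `B` and `C`. In a divide-and-conquer scheme of the kind the abstract describes, such shared genomes are what would allow trees built on the separate pieces to be combined in the second step.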

Similar articles

1
Scaling up accurate phylogenetic reconstruction from gene-order data.
Bioinformatics. 2003;19 Suppl 1:i305-12. doi: 10.1093/bioinformatics/btg1042.
2
Efficient multiple genome alignment.
Bioinformatics. 2002;18 Suppl 1:S312-20. doi: 10.1093/bioinformatics/18.suppl_1.s312.
3
An improved method for identifying functionally linked proteins using phylogenetic profiles.
BMC Bioinformatics. 2007 May 22;8 Suppl 4(Suppl 4):S7. doi: 10.1186/1471-2105-8-S4-S7.
4
Whole-genome prokaryotic phylogeny.
Bioinformatics. 2005 May 15;21(10):2329-35. doi: 10.1093/bioinformatics/bth324. Epub 2004 May 27.
5
QuickJoin--fast neighbour-joining tree reconstruction.
Bioinformatics. 2004 Nov 22;20(17):3261-2. doi: 10.1093/bioinformatics/bth359. Epub 2004 Jun 16.
6
A new sequence distance measure for phylogenetic tree construction.
Bioinformatics. 2003 Nov 1;19(16):2122-30. doi: 10.1093/bioinformatics/btg295.
7
ProfDist: a tool for the construction of large phylogenetic trees based on profile distances.
Bioinformatics. 2005 May 1;21(9):2108-9. doi: 10.1093/bioinformatics/bti289. Epub 2005 Jan 27.
8
Improving reversal median computation using commuting reversals and cycle information.
J Comput Biol. 2008 Oct;15(8):1079-92. doi: 10.1089/cmb.2008.0116.
9
Using median sets for inferring phylogenetic trees.
Bioinformatics. 2007 Jan 15;23(2):e129-35. doi: 10.1093/bioinformatics/btl300.
10
Multiple-sequence functional annotation and the generalized hidden Markov phylogeny.
Bioinformatics. 2004 Aug 12;20(12):1850-60. doi: 10.1093/bioinformatics/bth153. Epub 2004 Feb 26.

Cited by

1
A Guide to Phylogenomic Inference.
Methods Mol Biol. 2024;2802:267-345. doi: 10.1007/978-1-0716-3838-5_11.
2
Phylogenetic Reconstruction Based on Synteny Block and Gene Adjacencies.
Mol Biol Evol. 2020 Sep 1;37(9):2747-2762. doi: 10.1093/molbev/msaa114.
3
GOOGA: A platform to synthesize mapping experiments and identify genomic structural diversity.
PLoS Comput Biol. 2019 Apr 15;15(4):e1006949. doi: 10.1371/journal.pcbi.1006949. eCollection 2019 Apr.
4
Phylogenetic signal from rearrangements in 18 Anopheles species by joint scaffolding extant and ancestral genomes.
BMC Genomics. 2018 May 9;19(Suppl 2):96. doi: 10.1186/s12864-018-4466-7.
5
Phase change for the accuracy of the median value in estimating divergence time.
BMC Bioinformatics. 2013;14 Suppl 15(Suppl 15):S7. doi: 10.1186/1471-2105-14-S15-S7. Epub 2013 Oct 15.
6
Inferring phylogenetic networks from gene order data.
Biomed Res Int. 2013;2013:503193. doi: 10.1155/2013/503193. Epub 2013 Aug 28.
7
Maximum likelihood phylogenetic reconstruction from high-resolution whole-genome data and a tree of 68 eukaryotes.
Pac Symp Biocomput. 2013:285-96.
8
Rec-DCM-Eigen: reconstructing a less parsimonious but more accurate tree in shorter time.
PLoS One. 2011;6(8):e22483. doi: 10.1371/journal.pone.0022483. Epub 2011 Aug 24.
9
progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement.
PLoS One. 2010 Jun 25;5(6):e11147. doi: 10.1371/journal.pone.0011147.
10
Seevolution: visualizing chromosome evolution.
Bioinformatics. 2009 Apr 1;25(7):960-1. doi: 10.1093/bioinformatics/btp096. Epub 2009 Feb 20.