Reconstructing rearrangement phylogenies of natural genomes.

Authors

Bohnenkämper Leonard, Stoye Jens, Doerr Daniel

Affiliations

Faculty of Technology, Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, NRW, Germany.

Center for Biotechnology (CeBiTec), Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, NRW, Germany.

Publication

Algorithms Mol Biol. 2025 Jun 7;20(1):10. doi: 10.1186/s13015-025-00279-5.

DOI:10.1186/s13015-025-00279-5
PMID:40483529
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12144824/
Abstract

BACKGROUND

We study the classical problem of inferring ancestral genomes from a set of extant genomes under a given phylogeny, known as the Small Parsimony Problem (SPP). Genomes are represented as sequences of oriented markers, organized in one or more linear or circular chromosomes. Any marker may appear in several copies, without restriction on orientation or genomic location; this setting is known as the natural genomes model. Evolutionary events along the branches of the phylogeny encompass large-scale rearrangements, including segmental inversions, translocations, gain and loss (DCJ-indel model). Even under simpler rearrangement models, such as the classical breakpoint model without duplicates, the SPP is computationally intractable. Nevertheless, the SPP for natural genomes under the DCJ-indel model has been studied recently, with limited success.
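
For orientation, the small parsimony objective described above can be written as follows. This is a sketch of the standard formulation, not a formula quoted from the paper; the symbols T, V, E, L, G_v and d are notation assumed here for illustration. Given a tree T = (V, E) with leaf set L and a fixed extant genome at every leaf, the task is to label the internal nodes with genomes so that the total distance along the branches is minimal:

```latex
\min_{\{G_v\}_{v \in V \setminus L}} \; \sum_{(u,v) \in E} d_{\mathrm{DCJ\text{-}indel}}(G_u, G_v)
\qquad \text{subject to } G_\ell \text{ being the given extant genome for every leaf } \ell \in L
```

Here d denotes the rearrangement distance under the DCJ-indel model between the genomes at the two ends of a branch.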

METHODS

Building on prior work, we present a highly optimized integer linear program (ILP) that is able to solve the SPP for sufficiently small phylogenies and gene families. A notable improvement with respect to the previous result is an optimized way of handling both circular and linear chromosomes. This is especially relevant to the SPP, since the chromosomal structure of ancestral genomes is unknown and the solution space for this chromosomal structure is typically large.
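
To illustrate why the chromosomal structure of an ancestor is itself part of the solution, the following minimal Python sketch derives chromosomes from a set of adjacencies between marker extremities and classifies each as linear or circular. It is not the authors' ILP; the function name `chromosomes`, the marker ids (`a1`, `b1`, ...) and the `"t"`/`"h"` encoding of tail and head extremities are all assumptions made for this example.

```python
def chromosomes(markers, adjacencies):
    """Derive chromosomes from adjacencies between marker extremities.

    markers: list of marker ids, one id per marker copy (hypothetical naming).
    adjacencies: pairs of extremities; an extremity is (marker_id, "t") for
        the tail or (marker_id, "h") for the head of a marker.
    Returns a list of (kind, path): kind is "linear" or "circular", path lists
    (marker_id, orientation) with +1 for tail-to-head traversal, -1 otherwise.
    """
    partner = {}
    for x, y in adjacencies:
        if x == y or x in partner or y in partner:
            raise ValueError("each extremity may occur in at most one adjacency")
        partner[x], partner[y] = y, x

    seen = set()

    def walk(start):
        # Enter a marker at extremity `start` and follow adjacencies onward.
        path, ext = [], start
        while True:
            marker, side = ext
            path.append((marker, +1 if side == "t" else -1))
            seen.add(marker)
            exit_ext = (marker, "h" if side == "t" else "t")
            nxt = partner.get(exit_ext)
            if nxt is None:        # reached a telomere: linear chromosome
                return "linear", path
            if nxt == start:       # closed the cycle: circular chromosome
                return "circular", path
            ext = nxt

    result = []
    # First pass: start at telomeres (extremities without an adjacency).
    for m in markers:
        for side in ("t", "h"):
            if m not in seen and (m, side) not in partner:
                result.append(walk((m, side)))
    # Second pass: every marker still unvisited lies on a circular chromosome.
    for m in markers:
        if m not in seen:
            result.append(walk((m, "t")))
    return result


# Toy genome: a1 and a2 are two copies of the same marker family.
markers = ["a1", "b1", "a2", "c1"]
adjacencies = [
    (("a1", "h"), ("b1", "h")),  # b1 is inverted relative to a1
    (("b1", "t"), ("a2", "t")),
    (("c1", "h"), ("c1", "t")),  # single-marker circular chromosome
]
genome = chromosomes(markers, adjacencies)

# Joining the two telomeres turns the linear chromosome into a circular one.
circularized = chromosomes(markers, adjacencies + [(("a2", "h"), ("a1", "t"))])
```

In this toy example a single additional adjacency changes a linear chromosome into a circular one while leaving every other adjacency untouched, which hints at how many chromosomal structures an ancestral genome could take.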

RESULTS

We benchmark our method on simulated and real data. On simulated phylogenies we observe a considerable performance improvement on problems that include linear chromosomes. Even when the ground truth contains only one circular chromosome per genome, our method outperforms its predecessor due to its optimized handling of the solution space. The practical advantage is also visible in an analysis of seven Anopheles taxa.
