Recombinations, chains and caps: resolving problems with the DCJ-indel model.

Author information

Bohnenkämper Leonard

Affiliation

Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, NRW, Germany.

Publication information

Algorithms Mol Biol. 2024 Feb 27;19(1):8. doi: 10.1186/s13015-024-00253-7.

DOI: 10.1186/s13015-024-00253-7
PMID: 38414060
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10900646/

Abstract

One of the most fundamental problems in genome rearrangement studies is the (genomic) distance problem. It is typically formulated as finding the minimum number of rearrangements under a given model that are needed to transform one genome into the other. A powerful multi-chromosomal model is the Double Cut and Join (DCJ) model. While the DCJ model cannot deal with some situations that occur in practice, such as duplicated or lost regions, it was extended over time to handle these cases. First, it was extended to the DCJ-indel model, solving the issue of lost markers. Later, ILP solutions for so-called natural genomes, in which each genomic region may occur an arbitrary number of times, were developed, making it possible in theory to solve the distance problem for any pair of genomes. However, some theoretical and practical issues remained unsolved. On the theoretical side, there exist two disparate views of the DCJ-indel model, motivated in the same way but with different conceptualizations that could not be reconciled so far. On the practical side, while ILP solutions for natural genomes typically perform well on telomere-to-telomere resolved genomes, they have been shown in recent years to quickly lose performance on genomes with a large number of contigs or linear chromosomes. This has been linked to a particular technique, namely capping. Simply put, capping circularizes linear chromosomes by concatenating them during solving time, increasing the solution space of the ILP superexponentially. Recently, we introduced a new conceptualization of the DCJ-indel model within the context of another rearrangement problem. In this manuscript, we apply this new conceptualization to the distance problem. In doing so, we uncover the relation between the disparate conceptualizations of the DCJ-indel model. We are also able to derive an ILP solution to the distance problem that does not rely on capping. This solution significantly improves upon the performance of previous solutions on genomes with high numbers of contigs while still solving the problem exactly and being competitive in performance otherwise. We demonstrate the performance advantage on simulated genomes and show its practical usefulness in an analysis of 11 Drosophila genomes.
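
As a concrete illustration of the distance problem, the following Python sketch computes the distance in the plain DCJ model only. It is not the DCJ-indel model and not the capping-free ILP derived in the article. It assumes both genomes contain the same markers, each exactly once, and uses the well-known adjacency-graph formula d = N - (C + I/2), where N is the number of markers, C the number of cycles and I the number of odd paths. The genome encoding (signed integers plus a circularity flag) and the names `adjacencies` and `dcj_distance` are assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

Extremity = Tuple[int, str]           # (marker id, "t" for tail or "h" for head)
Chromosome = Tuple[List[int], bool]   # (signed marker order, is_circular); assumed encoding


def adjacencies(genome: List[Chromosome]) -> Dict[Extremity, Optional[Extremity]]:
    """Map each extremity to its neighbour in the genome, or None at a telomere."""
    partner: Dict[Extremity, Optional[Extremity]] = {}
    for genes, circular in genome:
        # (left extremity, right extremity) of every marker in reading direction
        ends = [((abs(g), "t"), (abs(g), "h")) if g > 0
                else ((abs(g), "h"), (abs(g), "t")) for g in genes]
        for (_, right), (left, _) in zip(ends, ends[1:]):
            partner[right] = left
            partner[left] = right
        first_left, last_right = ends[0][0], ends[-1][1]
        if circular:
            partner[last_right] = first_left
            partner[first_left] = last_right
        else:
            partner[first_left] = None
            partner[last_right] = None
    return partner


def dcj_distance(a: List[Chromosome], b: List[Chromosome]) -> int:
    """Plain DCJ distance N - (C + I/2) for genomes with identical, unique markers."""
    pa, pb = adjacencies(a), adjacencies(b)
    n_a = sum(len(genes) for genes, _ in a)
    n_b = sum(len(genes) for genes, _ in b)
    if pa.keys() != pb.keys() or 2 * n_a != len(pa) or 2 * n_b != len(pb):
        raise ValueError("genomes must contain the same markers, each exactly once")

    seen = set()
    cycles = odd_paths = 0
    for start in pa:
        if start in seen:
            continue
        # Explore one component by following adjacencies of both genomes.
        stack, size, is_path = [start], 0, False
        seen.add(start)
        while stack:
            x = stack.pop()
            size += 1
            for nxt in (pa[x], pb[x]):
                if nxt is None:
                    is_path = True        # a telomere ends the component
                elif nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if not is_path:
            cycles += 1
        elif size % 2 == 1:
            # An odd number of extremities means one telomere from each genome,
            # which is an odd path of the adjacency graph.
            odd_paths += 1
    return len(pa) // 2 - cycles - odd_paths // 2


# One inversion of marker 2 separates these two linear chromosomes.
print(dcj_distance([([1, 2, 3], False)], [([1, -2, 3], False)]))  # 1
```

Markers that are duplicated or missing in one genome are rejected here on purpose: handling them is what the DCJ-indel model and the ILP formulations for natural genomes discussed above are for.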

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/5f26230b405e/13015_2024_253_Figb_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/c3ce653b1efa/13015_2024_253_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/a8bd6bfcccb7/13015_2024_253_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/198064444a17/13015_2024_253_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/cb8efb7992e0/13015_2024_253_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/c301e29aecc7/13015_2024_253_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/e546429a0e10/13015_2024_253_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/4ed4ff30a25d/13015_2024_253_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/7a9946587c7c/13015_2024_253_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/0fa16c936d96/13015_2024_253_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/99fe0639b3f2/13015_2024_253_Figa_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/60a805092c23/13015_2024_253_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/946f3ad94241/13015_2024_253_Fig11_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/998ec3463178/13015_2024_253_Fig12_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/2e11ca4912eb/13015_2024_253_Fig13_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/4cba217f7dd9/13015_2024_253_Fig14_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b601/10900646/50d600c88120/13015_2024_253_Fig15_HTML.jpg

Similar articles

1
Recombinations, chains and caps: resolving problems with the DCJ-indel model.
Algorithms Mol Biol. 2024 Feb 27;19(1):8. doi: 10.1186/s13015-024-00253-7.
2
The Floor Is Lava: Halving Natural Genomes with Viaducts, Piers, and Pontoons.
J Comput Biol. 2024 Apr;31(4):294-311. doi: 10.1089/cmb.2023.0330. Epub 2024 Apr 15.
3
Sorting Linear Genomes with Rearrangements and Indels.
IEEE/ACM Trans Comput Biol Bioinform. 2015 May-Jun;12(3):500-6. doi: 10.1109/TCBB.2014.2329297.
4
Restricted DCJ-indel model: sorting linear genomes with DCJ and indels.
BMC Bioinformatics. 2012;13 Suppl 19(Suppl 19):S14. doi: 10.1186/1471-2105-13-S19-S14. Epub 2012 Dec 19.
5
Computing the Rearrangement Distance of Natural Genomes.
J Comput Biol. 2021 Apr;28(4):410-431. doi: 10.1089/cmb.2020.0434. Epub 2020 Dec 30.
6
Approximating the DCJ distance of balanced genomes in linear time.
Algorithms Mol Biol. 2017 Mar 9;12:3. doi: 10.1186/s13015-017-0095-y. eCollection 2017.
7
DCJ-indel and DCJ-substitution distances with distinct operation costs.
Algorithms Mol Biol. 2013 Jul 23;8(1):21. doi: 10.1186/1748-7188-8-21.
8
DCJ-Indel sorting revisited.
Algorithms Mol Biol. 2013 Mar 1;8(1):6. doi: 10.1186/1748-7188-8-6.
9
Natural family-free genomic distance.
Algorithms Mol Biol. 2021 May 10;16(1):4. doi: 10.1186/s13015-021-00183-8.
10
On the inversion-indel distance.
BMC Bioinformatics. 2013;14 Suppl 15(Suppl 15):S3. doi: 10.1186/1471-2105-14-S15-S3. Epub 2013 Oct 15.

Cited by

1
Reconstructing rearrangement phylogenies of natural genomes.
Algorithms Mol Biol. 2025 Jun 7;20(1):10. doi: 10.1186/s13015-025-00279-5.
2
Structural variation, selection, and diversification of the gene family from the human pangenome.
bioRxiv. 2025 Feb 5:2025.02.04.636496. doi: 10.1101/2025.02.04.636496.
