• 文献检索
  • 文档翻译
  • 深度研究
  • 学术资讯
  • Suppr Zotero 插件Zotero 插件
  • 邀请有礼
  • 套餐&价格
  • 历史记录
应用&插件
Suppr Zotero 插件Zotero 插件浏览器插件Mac 客户端Windows 客户端微信小程序
定价
高级版会员购买积分包购买API积分包
服务
文献检索文档翻译深度研究API 文档MCP 服务
关于我们
关于 Suppr公司介绍联系我们用户协议隐私条款
关注我们

Suppr 超能文献

核心技术专利:CN118964589B侵权必究
粤ICP备2023148730 号-1Suppr @ 2026

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验

具有单基因复制的单切割或连接模型中的距离和中位数问题。

The distance and median problems in the single-cut-or-join model with single-gene duplications.

作者信息

Mane Aniket C, Lafond Manuel, Feijao Pedro C, Chauve Cedric

机构信息

1Department of Mathematics, Simon Fraser University, 8888 University Drive, Burnaby, V5A 1S6 Canada.

2Department of Computer Science, Université de Sherbrooke, Boulevard de l'Université, Sherbrooke, J1K 2R1 Canada.

出版信息

Algorithms Mol Biol. 2020 May 4;15:8. doi: 10.1186/s13015-020-00169-y. eCollection 2020.

DOI:10.1186/s13015-020-00169-y
PMID:32391071
原文链接:https://pmc.ncbi.nlm.nih.gov/articles/PMC7197181/
Abstract

BACKGROUND

In the field of genome rearrangement algorithms, models accounting for gene duplication lead often to hard problems. For example, while computing the pairwise distance is tractable in most duplication-free models, the problem is NP-complete for most extensions of these models accounting for duplicated genes. Moreover, problems involving more than two genomes, such as the genome median and the Small Parsimony problem, are intractable for most duplication-free models, with some exceptions, for example the Single-Cut-or-Join (SCJ) model.

RESULTS

We introduce a variant of the SCJ distance that accounts for duplicated genes, in the context of directed evolution from an ancestral genome to a descendant genome where orthology relations between ancestral genes and their descendant are known. Our model includes two duplication mechanisms: single-gene tandem duplication and the creation of single-gene circular chromosomes. We prove that in this model, computing the directed distance and a parsimonious evolutionary scenario in terms of SCJ and single-gene duplication events can be done in linear time. We also show that the directed median problem is tractable for this distance, while the rooted median problem, where we assume that one of the given genomes is ancestral to the median, is NP-complete. We also describe an Integer Linear Program for solving this problem. We evaluate the directed distance and rooted median algorithms on simulated data.

CONCLUSION

Our results provide a simple genome rearrangement model, extending the SCJ model to account for single-gene duplications, for which we prove a mix of tractability and hardness results. For the NP-complete rooted median problem, we design a simple Integer Linear Program. Our publicly available implementation of these algorithms for the directed distance and median problems allow to solve efficiently these problems on large instances.

摘要

背景

在基因组重排算法领域,考虑基因复制的模型常常会导致难题。例如,虽然在大多数无复制模型中计算成对距离是可处理的,但对于这些考虑了重复基因的模型的大多数扩展而言,该问题是NP完全问题。此外,涉及两个以上基因组的问题,如基因组中位数和小简约问题,对于大多数无复制模型来说是难以处理的,不过也有一些例外,比如单切割或连接(SCJ)模型。

结果

我们引入了一种SCJ距离的变体,该变体在从祖先基因组到后代基因组的定向进化背景下考虑了重复基因,其中祖先基因与其后代之间的直系同源关系是已知的。我们的模型包括两种复制机制:单基因串联复制和单基因环状染色体的产生。我们证明,在这个模型中,根据SCJ和单基因复制事件计算定向距离和简约进化场景可以在线性时间内完成。我们还表明,对于这个距离,定向中位数问题是可处理的,而根中位数问题(即我们假设给定基因组之一是中位数的祖先)是NP完全问题。我们还描述了一个用于解决此问题的整数线性规划。我们在模拟数据上评估了定向距离和根中位数算法。

结论

我们的结果提供了一个简单的基因组重排模型,将SCJ模型扩展到考虑单基因复制,我们证明了其兼具可处理性和难度结果。对于NP完全的根中位数问题,我们设计了一个简单的整数线性规划。我们针对定向距离和中位数问题的这些算法的公开可用实现允许在大型实例上高效地解决这些问题。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f55/7197181/a804ff5deed8/13015_2020_169_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f55/7197181/3da253268d69/13015_2020_169_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f55/7197181/4b027435b258/13015_2020_169_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f55/7197181/809d1e4e04f2/13015_2020_169_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f55/7197181/51d3a6f3a718/13015_2020_169_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f55/7197181/a804ff5deed8/13015_2020_169_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f55/7197181/3da253268d69/13015_2020_169_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f55/7197181/4b027435b258/13015_2020_169_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f55/7197181/809d1e4e04f2/13015_2020_169_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f55/7197181/51d3a6f3a718/13015_2020_169_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f55/7197181/a804ff5deed8/13015_2020_169_Fig5_HTML.jpg

相似文献

1
The distance and median problems in the single-cut-or-join model with single-gene duplications.具有单基因复制的单切割或连接模型中的距离和中位数问题。
Algorithms Mol Biol. 2020 May 4;15:8. doi: 10.1186/s13015-020-00169-y. eCollection 2020.
2
Small parsimony for natural genomes in the DCJ-indel model.在 DCJ 插入-缺失模型中,自然基因组的小简约性。
J Bioinform Comput Biol. 2021 Dec;19(6):2140009. doi: 10.1142/S0219720021400096. Epub 2021 Nov 19.
3
SCJ: a breakpoint-like distance that simplifies several rearrangement problems.SCJ:一种类似于断点的距离,可简化多个重排问题。
IEEE/ACM Trans Comput Biol Bioinform. 2011 Sep-Oct;8(5):1318-29. doi: 10.1109/TCBB.2011.34.
4
Chromosome structures: reduction of certain problems with unequal gene content and gene paralogs to integer linear programming.染色体结构:将某些具有不等基因含量和基因旁系同源物的问题简化为整数线性规划。
BMC Bioinformatics. 2017 Dec 6;18(1):537. doi: 10.1186/s12859-017-1944-x.
5
Fast ancestral gene order reconstruction of genomes with unequal gene content.具有不等基因含量的基因组的快速祖先基因顺序重建
BMC Bioinformatics. 2016 Nov 11;17(Suppl 14):413. doi: 10.1186/s12859-016-1261-9.
6
Sorting by Cuts, Joins, and Whole Chromosome Duplications.
J Comput Biol. 2017 Feb;24(2):127-137. doi: 10.1089/cmb.2016.0045. Epub 2016 Oct 5.
7
Rearrangement-based phylogeny using the Single-Cut-or-Join operation.基于单切或连操作的重排系统发育分析。
IEEE/ACM Trans Comput Biol Bioinform. 2013 Jan-Feb;10(1):122-34. doi: 10.1109/TCBB.2012.168.
8
A parsimony approach to analysis of human segmental duplications.一种用于人类片段重复分析的简约方法。
Pac Symp Biocomput. 2009:126-37.
9
The SCJ Small Parsimony Problem for Weighted Gene Adjacencies.加权基因邻接的 SCJ 简约性问题。
IEEE/ACM Trans Comput Biol Bioinform. 2019 Jul-Aug;16(4):1364-1373. doi: 10.1109/TCBB.2017.2661761. Epub 2017 Jan 31.
10
The median problems on linear multichromosomal genomes: graph representation and fast exact solutions.线性多染色体基因组上的中位数问题:图形表示与快速精确解
J Comput Biol. 2010 Sep;17(9):1195-211. doi: 10.1089/cmb.2010.0106.

本文引用的文献

1
Phylogenetic signal from rearrangements in 18 Anopheles species by joint scaffolding extant and ancestral genomes.18 种按蚊种系重排的系统发育信号,通过联合支架现存和祖先基因组。
BMC Genomics. 2018 May 9;19(Suppl 2):96. doi: 10.1186/s12864-018-4466-7.
2
The gene family-free median of three.三个无基因家族的中位数
Algorithms Mol Biol. 2017 May 26;12:14. doi: 10.1186/s13015-017-0106-z. eCollection 2017.
3
DeCoSTAR: Reconstructing the Ancestral Organization of Genes or Genomes Using Reconciled Phylogenies.DeCoSTAR:使用协调系统发育树重建基因或基因组的祖先组织
Genome Biol Evol. 2017 May 1;9(5):1312-1319. doi: 10.1093/gbe/evx069.
4
Approximating the DCJ distance of balanced genomes in linear time.在线性时间内近似平衡基因组的DCJ距离。
Algorithms Mol Biol. 2017 Mar 9;12:3. doi: 10.1186/s13015-017-0095-y. eCollection 2017.
5
Sorting by Cuts, Joins, and Whole Chromosome Duplications.
J Comput Biol. 2017 Feb;24(2):127-137. doi: 10.1089/cmb.2016.0045. Epub 2016 Oct 5.
6
The pineapple genome and the evolution of CAM photosynthesis.菠萝基因组与景天酸代谢光合作用的进化
Nat Genet. 2015 Dec;47(12):1435-42. doi: 10.1038/ng.3435. Epub 2015 Nov 2.
7
Mosquito genomics. Highly evolvable malaria vectors: the genomes of 16 Anopheles mosquitoes.蚊子基因组学。高度可进化的疟疾传播媒介:16种按蚊的基因组
Science. 2015 Jan 2;347(6217):1258522. doi: 10.1126/science.1258522. Epub 2014 Nov 27.
8
An Exact Algorithm to Compute the Double-Cut-and-Join Distance for Genomes with Duplicate Genes.一种用于计算具有重复基因的基因组的双切割连接距离的精确算法。
J Comput Biol. 2015 May;22(5):425-35. doi: 10.1089/cmb.2014.0096. Epub 2014 Dec 17.
9
Inapproximability of (1,2)-exemplar distance.(1,2)-典范距离的不可近似性。
IEEE/ACM Trans Comput Biol Bioinform. 2013 Nov-Dec;10(6):1384-90. doi: 10.1109/TCBB.2012.144.
10
On the complexity of rearrangement problems under the breakpoint distance.断点距离下重排问题的复杂性
J Comput Biol. 2014 Jan;21(1):1-15. doi: 10.1089/cmb.2013.0004. Epub 2013 Nov 7.