带反转、转位和插入缺失的基因组重排距离

Genome Rearrangement Distance with Reversals, Transpositions, and Indels.

作者信息

Alexandrino Alexsandro Oliveira, Oliveira Andre Rodrigues, Dias Ulisses, Dias Zanoni

机构信息

Institute of Computing, University of Campinas, Campinas, Brazil.

School of Technology, University of Campinas, Limeira, Brazil.

出版信息

J Comput Biol. 2021 Mar;28(3):235-247. doi: 10.1089/cmb.2020.0121. Epub 2020 Oct 20.

DOI:10.1089/cmb.2020.0121

PMID:33085536

Abstract

The rearrangement distance is a well-known problem in the field of comparative genomics. Given two genomes, the rearrangement distance is the minimum number of rearrangements in a set of allowed rearrangements (rearrangement model), which transforms one genome into the other. In rearrangement distance problems, a genome is modeled as a string, where each element represents a conserved region within the two genomes. When the orientation of the genes is known, it is represented by (plus or minus) signs assigned to the elements of the string. Two of the most studied rearrangements are reversals, which invert a segment of the genome, and transpositions, which exchange the relative positions of two adjacent segments of the genome. The first works in genome rearrangements considered that the genomes being compared had the same genetic material and that rearrangement events were restricted to reversals, transpositions, or both. El-Mabrouk extended the reversal model on signed strings to include the operations of insertion and deletion of segments in the genome, which allowed the comparison of genomes with different genetic material. Other studies also addressed this problem and, recently, this problem was proved to be solvable in polynomial time by Willing et al. For unsigned strings, we still observe a lack of results. That said, in this study we prove that computing the rearrangement distance for the following models is NP-Hard: reversals and indels on unsigned strings; transpositions and indels on unsigned strings; and reversals, transpositions, and indels on signed and unsigned strings. Along with the NP-hardness proofs, we present a 2-approximation algorithm for reversals on unsigned strings and 3-approximation algorithms for the other models.

摘要

重排距离是比较基因组学领域一个著名的问题。给定两个基因组，重排距离是一组允许的重排（重排模型）中，将一个基因组转化为另一个基因组所需的最少重排次数。在重排距离问题中，基因组被建模为一个字符串，其中每个元素代表两个基因组中的一个保守区域。当基因的方向已知时，用分配给字符串元素的（正或负）符号表示。研究最多的两种重排是反转，它会反转基因组的一个片段；以及转位，它会交换基因组中两个相邻片段的相对位置。最早关于基因组重排的研究认为，被比较的基因组具有相同的遗传物质，并且重排事件仅限于反转、转位或两者皆有。埃尔 - 马布鲁克将有符号字符串上的反转模型扩展到包括基因组片段的插入和删除操作，这使得能够比较具有不同遗传物质的基因组。其他研究也涉及了这个问题，最近，威林等人证明了这个问题在多项式时间内是可解的。对于无符号字符串，我们仍然缺乏相关结果。也就是说，在本研究中我们证明，计算以下模型的重排距离是NP难的：无符号字符串上的反转和插入缺失；无符号字符串上的转位和插入缺失；以及有符号和无符号字符串上的反转、转位和插入缺失。除了NP难的证明，我们还提出了一种用于无符号字符串上反转的2近似算法以及用于其他模型的3近似算法。

相似文献

Genome Rearrangement Distance with Reversals, Transpositions, and Indels.带反转、转位和插入缺失的基因组重排距离

J Comput Biol. 2021 Mar;28(3):235-247. doi: 10.1089/cmb.2020.0121. Epub 2020 Oct 20.

On the Complexity of Sorting by Reversals and Transpositions Problems.关于通过反转和转置问题进行排序的复杂性

J Comput Biol. 2019 Nov;26(11):1223-1229. doi: 10.1089/cmb.2019.0078. Epub 2019 May 23.

Sorting Signed Permutations by Intergenic Reversals.按基因间倒位对有符号排列进行分类。

IEEE/ACM Trans Comput Biol Bioinform. 2021 Nov-Dec;18(6):2870-2876. doi: 10.1109/TCBB.2020.2993002. Epub 2021 Dec 8.

Reversal and Transposition Distance on Unbalanced Genomes Using Intergenic Information.利用基因间信息计算不平衡基因组的反转和转位距离

J Comput Biol. 2023 Aug;30(8):861-876. doi: 10.1089/cmb.2023.0087. Epub 2023 May 24.

Sorting by Weighted Reversals and Transpositions.

J Comput Biol. 2019 May;26(5):420-431. doi: 10.1089/cmb.2018.0257. Epub 2019 Feb 19.

Incorporating intergenic regions into reversal and transposition distances with indels.将基因间区纳入插入缺失的倒位和转座距离中。

J Bioinform Comput Biol. 2021 Dec;19(6):2140011. doi: 10.1142/S0219720021400114. Epub 2021 Nov 13.

Sorting Linear Genomes with Rearrangements and Indels.通过重排和插入缺失对线性基因组进行排序

IEEE/ACM Trans Comput Biol Bioinform. 2015 May-Jun;12(3):500-6. doi: 10.1109/TCBB.2014.2329297.

Labeled Cycle Graph for Transposition and Indel Distance.用于转置和插入缺失距离的标记循环图。

J Comput Biol. 2022 Mar;29(3):243-256. doi: 10.1089/cmb.2021.0279. Epub 2021 Nov 1.

Rearrangement distance with reversals, indels, and moves in intergenic regions on signed and unsigned permutations.带有倒位、缺失和基因间区域移动的重排距离在有符号和无符号排列上。

J Bioinform Comput Biol. 2023 Apr;21(2):2350009. doi: 10.1142/S0219720023500099. Epub 2023 Apr 27.

Heuristics for Genome Rearrangement Distance With Replicated Genes.含重复基因的基因组重排距离的启发式算法。

IEEE/ACM Trans Comput Biol Bioinform. 2021 Nov-Dec;18(6):2094-2108. doi: 10.1109/TCBB.2021.3095021. Epub 2021 Dec 8.

带反转、转位和插入缺失的基因组重排距离

Genome Rearrangement Distance with Reversals, Transpositions, and Indels.

作者信息

Alexandrino Alexsandro Oliveira, Oliveira Andre Rodrigues, Dias Ulisses, Dias Zanoni

机构信息

Institute of Computing, University of Campinas, Campinas, Brazil.

School of Technology, University of Campinas, Limeira, Brazil.

出版信息

J Comput Biol. 2021 Mar;28(3):235-247. doi: 10.1089/cmb.2020.0121. Epub 2020 Oct 20.

DOI:10.1089/cmb.2020.0121

PMID:33085536

Abstract

摘要

Suppr 超能文献

文献检索

文件翻译

深度研究

Suppr 超能文献

文献检索

文件翻译

深度研究

带反转、转位和插入缺失的基因组重排距离

Genome Rearrangement Distance with Reversals, Transpositions, and Indels.

作者信息

机构信息

出版信息

相似文献

带反转、转位和插入缺失的基因组重排距离

Genome Rearrangement Distance with Reversals, Transpositions, and Indels.

作者信息

机构信息

出版信息

相似文献