Suppr超能文献

通过熵正则化对偶分解实现可扩展凸多序列比对

Scalable Convex Multiple Sequence Alignment via Entropy-Regularized Dual Decomposition.

作者信息

Zhang Jiong, Yen Ian E H, Ravikumar Pradeep, Dhillon Inderjit S

机构信息

University of Texas at Austin.

Carnegie Mellon University.

出版信息

JMLR Workshop Conf Proc. 2017 Apr;54:1514-1522.

Abstract

is one of the fundamental tasks in biological sequence analysis that underlies applications such as phylogenetic trees, profiles, and structure prediction. The task, however, is NP-hard, and the current practice resorts to heuristic and local-search methods. Recently, a convex optimization approach for MSA was proposed based on the concept of atomic norm [23], which demonstrates significant improvement over existing methods in the quality of alignments. However, the convex program is challenging to solve due to the constraint given by the intersection of two atomic-norm balls, for which the existing algorithm can only handle sequences of length up to 50, with an iteration complexity subject to constants of unknown relation to the natural parameters of MSA. In this work, we propose an algorithm that exploits to induce closed-form solutions for each atomic-norm-constrained subproblem, giving a single-loop algorithm of iteration complexity linear to the problem size (total length of all sequences). The proposed algorithm gives significantly better alignments than existing methods on sequences of length up to hundreds, where the existing convex programming method fails to converge in one day.

摘要

是生物序列分析中的基本任务之一,它是诸如系统发育树、轮廓和结构预测等应用的基础。然而,该任务是NP难的,当前的做法是采用启发式和局部搜索方法。最近,基于原子范数的概念提出了一种用于多序列比对(MSA)的凸优化方法[23],该方法在比对质量上比现有方法有显著提高。然而,由于两个原子范数球相交给出的约束,凸规划很难求解,现有算法只能处理长度达50的序列,其迭代复杂度取决于与MSA自然参数关系未知的常数。在这项工作中,我们提出了一种算法,该算法利用[具体内容缺失]为每个原子范数约束子问题导出闭式解,给出了一种迭代复杂度与问题规模(所有序列的总长度)成线性关系的单循环算法。在长度达数百的序列上,所提出的算法比对现有方法得到的比对结果要好得多,而现有凸规划方法在一天内无法收敛。

相似文献

5
Gradient regularization of Newton method with Bregman distances.基于布雷格曼距离的牛顿法梯度正则化
Math Program. 2024;204(1-2):1-25. doi: 10.1007/s10107-023-01943-7. Epub 2023 Mar 24.
10
Hessian Schatten-norm regularization for linear inverse problems.海森暗范数正则化用于线性反问题。
IEEE Trans Image Process. 2013 May;22(5):1873-88. doi: 10.1109/TIP.2013.2237919. Epub 2013 Jan 4.

本文引用的文献

4
Clustal W and Clustal X version 2.0.Clustal W和Clustal X 2.0版本
Bioinformatics. 2007 Nov 1;23(21):2947-8. doi: 10.1093/bioinformatics/btm404. Epub 2007 Sep 10.
5
Recent evolutions of multiple sequence alignment algorithms.多重序列比对算法的最新进展。
PLoS Comput Biol. 2007 Aug;3(8):e123. doi: 10.1371/journal.pcbi.0030123.
6
Settling the intractability of multiple alignment.解决多重比对的棘手问题。
J Comput Biol. 2006 Sep;13(7):1323-39. doi: 10.1089/cmb.2006.13.1323.
10

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验