Component-Based Design and Assembly of Heuristic Multiple Sequence Alignment Algorithms.

Authors

Shi Haihe, Zhang Xuchu

Affiliation

School of Computer and Information Engineering, Jiangxi Normal University, Nanchang, China.

Publication information

Front Genet. 2020 Feb 27;11:105. doi: 10.3389/fgene.2020.00105. eCollection 2020.

DOI: 10.3389/fgene.2020.00105

PMID: 32174970

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7056898/

Abstract

In recent years, there has been an explosive increase in the amount of bioinformatics data produced, but data are not information. The purpose of bioinformatics research is to obtain information with biological significance from large amounts of data. Multiple sequence alignment is widely used in sequence homology detection, protein secondary and tertiary structure prediction, phylogenetic tree analysis, and other fields. Existing research mainly focuses on the specific steps of the algorithm or on specific problems, and there is a lack of high-level abstract domain algorithm frameworks. As a result, multiple sequence alignment algorithms are complex, redundant, and difficult to understand, and it is not easy for users to select the appropriate algorithm, which may lead to computing errors. Here, through in-depth study and analysis of the heuristic multiple sequence alignment algorithm (HMSAA) domain, a domain-feature model and an interactive model of HMSAA components have been established according to the generative programming method. With the support of the PAR (partition and recur) platform, the HMSAA algorithm component library is formalized and a specific alignment algorithm is assembled, thus improving the reliability of algorithm assembly. This work provides a valuable theoretical reference for the applications of other biological sequence analysis algorithms.
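To make the idea of assembling an alignment algorithm from interchangeable components more concrete, the following is a minimal Python sketch. It is not the authors' PAR-based component library, and the abstract does not specify their components; it only illustrates the general pattern using the well-known center-star heuristic. All names here (`needleman_wunsch`, `matches`, `choose_center`, `merge`, `assemble_center_star`) and the scoring values are hypothetical choices made for illustration.

```python
from typing import Callable, List, Tuple

# A pairwise-alignment component takes two sequences and returns them gapped.
Pairwise = Callable[[str, str], Tuple[str, str]]


def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2) -> Tuple[str, str]:
    """Pairwise component: global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def matches(al_a: str, al_b: str) -> int:
    """Similarity component: number of identical, non-gap aligned columns."""
    return sum(x == y and x != "-" for x, y in zip(al_a, al_b))


def choose_center(seqs: List[str], align: Pairwise) -> int:
    """Selection component: index of the sequence most similar to the rest."""
    def total(i: int) -> int:
        return sum(matches(*align(seqs[i], s))
                   for j, s in enumerate(seqs) if j != i)
    return max(range(len(seqs)), key=total)


def merge(msa: List[str], c_al: str, s_al: str) -> List[str]:
    """Merge component: add one sequence to an alignment whose row 0 is the
    gapped center, following the 'once a gap, always a gap' rule."""
    cur = msa[0]
    rows: List[List[str]] = [[] for _ in msa]
    new_row: List[str] = []
    p = q = 0
    while p < len(cur) or q < len(c_al):
        if p < len(cur) and cur[p] == "-":
            # Existing gap column: keep it, the new sequence gets a gap.
            for r, old in zip(rows, msa):
                r.append(old[p])
            new_row.append("-"); p += 1
        elif q < len(c_al) and c_al[q] == "-":
            # Insertion relative to the center: open a gap in all old rows.
            for r in rows:
                r.append("-")
            new_row.append(s_al[q]); q += 1
        else:
            # Both pointers sit on the same center residue.
            for r, old in zip(rows, msa):
                r.append(old[p])
            new_row.append(s_al[q]); p += 1; q += 1
    return ["".join(r) for r in rows] + ["".join(new_row)]


def assemble_center_star(align: Pairwise = needleman_wunsch,
                         pick: Callable[[List[str], Pairwise], int] = choose_center
                         ) -> Callable[[List[str]], List[str]]:
    """Assemble an MSA routine from the chosen components."""
    def run(seqs: List[str]) -> List[str]:
        c = pick(seqs, align)
        msa, order = [seqs[c]], [c]
        for j, s in enumerate(seqs):
            if j != c:
                msa = merge(msa, *align(seqs[c], s))
                order.append(j)
        result = [""] * len(seqs)
        for row, j in zip(msa, order):
            result[j] = row  # restore the input order
        return result
    return run


if __name__ == "__main__":
    for row in assemble_center_star()(["ACGT", "ACT", "AGT", "ACGGT"]):
        print(row)
```

Because `assemble_center_star` receives its pairwise aligner and center-selection rule as arguments, either can be swapped for a different implementation without touching the merge step. This is the kind of component substitution the abstract describes, shown here only in spirit.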

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d9f8/7056898/10369f5df638/fgene-11-00105-g001.jpg
