Predicting the accuracy of multiple sequence alignment algorithms by using computational intelligent techniques.

Affiliation

Department of Computer Architecture and Computer Technology, University of Granada, 18071 Granada, Spain.

Publication information

Nucleic Acids Res. 2013 Jan 7;41(1):e26. doi: 10.1093/nar/gks919. Epub 2012 Oct 11.

DOI:10.1093/nar/gks919

PMID:23066102

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3592395/

Abstract

Multiple sequence alignments (MSAs) have become one of the most studied approaches in bioinformatics, since they underpin other important tasks such as structure prediction, biological function analysis and next-generation sequencing. However, current MSA algorithms do not always provide consistent solutions, because alignment becomes increasingly difficult for low-similarity sequences. It is widely known that these algorithms depend directly on specific features of the sequences, which has a considerable influence on alignment accuracy. Many MSA tools have been designed recently, but it is not possible to know in advance which one is most suitable for a particular set of sequences. In this work, we analyze some of the most widely used algorithms in the literature and their dependence on several features. A novel intelligent algorithm based on least squares support vector machines is then developed to predict how accurate each alignment could be, depending on the analyzed features. The algorithm is applied to a dataset of 2180 MSAs. The proposed system first estimates the accuracy of the possible alignments. The most promising methodology is then selected to align each set of sequences. Since only the selected algorithm is run, the computational time does not increase excessively.
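The predict-then-select idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it uses the standard least squares SVM regression formulation (one linear system per model) with an RBF kernel, and everything else is an assumption made for the example, including the class and function names, the hyperparameters `gamma` and `sigma`, the single input feature (a mean pairwise identity in [0, 1]), the aligner names, and the synthetic training data.

```python
import numpy as np


def rbf_kernel(a, b, sigma):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


class LSSVMRegressor:
    """Least squares SVM regression (hypothetical helper class for this sketch)."""

    def __init__(self, gamma=1000.0, sigma=0.3):
        self.gamma = gamma  # regularization weight (assumed value)
        self.sigma = sigma  # RBF kernel width (assumed value)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        n = X.shape[0]
        # LS-SVM training reduces to one linear system:
        # [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
        A = np.zeros((n + 1, n + 1))
        A[0, 1:] = 1.0
        A[1:, 0] = 1.0
        A[1:, 1:] = rbf_kernel(X, X, self.sigma) + np.eye(n) / self.gamma
        sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
        self.bias_, self.alpha_, self.X_ = sol[0], sol[1:], X
        return self

    def predict(self, X):
        K = rbf_kernel(np.asarray(X, dtype=float), self.X_, self.sigma)
        return K @ self.alpha_ + self.bias_


def select_aligner(models, features):
    """Predict an accuracy per aligner and return (best_name, all_scores)."""
    x = np.asarray(features, dtype=float).reshape(1, -1)
    scores = {name: float(m.predict(x)[0]) for name, m in models.items()}
    return max(scores, key=scores.get), scores


# Synthetic toy data (NOT from the paper): one feature per sequence set.
X_train = np.linspace(0.0, 1.0, 11).reshape(-1, 1)
accuracy = {
    "aligner_A": X_train[:, 0],            # accuracy grows with similarity
    "aligner_B": np.full(len(X_train), 0.5),  # flat accuracy
}
# One regressor per candidate aligner, trained on its observed accuracies.
models = {name: LSSVMRegressor().fit(X_train, y) for name, y in accuracy.items()}

best, scores = select_aligner(models, [0.9])
print(best, scores)
```

Only the aligner returned by `select_aligner` would then be run on the sequence set, which is why the selection step adds little to the total running time.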

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa9d/3592395/d3cfd134ada1/gks919f1p.jpg

Similar articles

Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: structural information, non-gaps percentage and totally conserved columns.

Bioinformatics. 2013 Sep 1;29(17):2112-21. doi: 10.1093/bioinformatics/btt360. Epub 2013 Jun 21.

Protein multiple sequence alignment benchmarking through secondary structure prediction.

Bioinformatics. 2017 May 1;33(9):1331-1337. doi: 10.1093/bioinformatics/btw840.

AlexSys: a knowledge-based expert system for multiple sequence alignment construction and analysis.

Nucleic Acids Res. 2010 Oct;38(19):6338-49. doi: 10.1093/nar/gkq526. Epub 2010 Jun 8.

Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm.

Bioinformatics. 2012 Jul 1;28(13):1684-91. doi: 10.1093/bioinformatics/bts198. Epub 2012 Apr 23.

Iterative refinement of structure-based sequence alignments by Seed Extension.

BMC Bioinformatics. 2009 Jul 9;10:210. doi: 10.1186/1471-2105-10-210.

Multiple sequence alignment with affine gap by using multi-objective genetic algorithm.

Comput Methods Programs Biomed. 2014 Apr;114(1):38-49. doi: 10.1016/j.cmpb.2014.01.013. Epub 2014 Jan 31.

Combining partial order alignment and progressive multiple sequence alignment increases alignment speed and scalability to very large alignment problems.

Bioinformatics. 2004 Jul 10;20(10):1546-56. doi: 10.1093/bioinformatics/bth126. Epub 2004 Feb 12.

PicXAA: greedy probabilistic construction of maximum expected accuracy alignment of multiple sequences.

Nucleic Acids Res. 2010 Aug;38(15):4917-28. doi: 10.1093/nar/gkq255. Epub 2010 Apr 22.

Mind the gaps: evidence of bias in estimates of multiple sequence alignments.

Mol Biol Evol. 2007 Nov;24(11):2433-42. doi: 10.1093/molbev/msm176. Epub 2007 Aug 20.
