Model-based prediction of sequence alignment quality.

Author information

Ahola Virpi, Aittokallio Tero, Vihinen Mauno, Uusipaikka Esa

Affiliation

Biotechnology and Food Research, MTT Agrifood Research Finland, FI-31600 Jokioinen, Finland.

Publication information

Bioinformatics. 2008 Oct 1;24(19):2165-71. doi: 10.1093/bioinformatics/btn414. Epub 2008 Aug 4.

Abstract

MOTIVATION

Multiple sequence alignment (MSA) is an essential prerequisite for many sequence analysis methods and a valuable tool in itself for describing relationships between protein sequences. Since the success of sequence analysis depends heavily on the reliability of the alignments, measures for assessing alignment quality are needed.

RESULTS

We present a statistical model-based alignment quality score. Unlike other quality scores, it requires neither several parallel alignments of the same set of sequences nor additional structural information. Our quality score is based on measuring the conservation level of reference alignments in Homstrad. Reference sequences were realigned with the Mafft, Muscle and Probcons alignment programs, and a sum-of-pairs (SP) score was used to measure the quality of the realignments. Statistical modelling of the SP score as a function of conservation level and other alignment characteristics makes it possible to predict the SP score for any global MSA. The predicted SP scores are highly correlated with the correct SP scores when tested on the Homstrad and SABmark databases. The results are comparable to those of the multiple overlap score (MOS) and better than those of the normalized mean distance (NorMD) and normalized iRMSD (NiRMSD) alignment quality criteria. Furthermore, the predicted SP score is able to detect alignments with badly aligned or unrelated sequences.
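To make the SP score concrete, the following is a minimal Python sketch of the usual reference-based definition: the fraction of residue pairs aligned in the reference alignment that are also aligned in the realignment. The abstract does not give the exact formula the authors used, so this definition, the function names (`aligned_pairs`, `sp_score`) and the toy sequences are illustrative assumptions rather than the published implementation.

```python
from itertools import combinations


def aligned_pairs(alignment, gap="-"):
    """Return the set of residue pairs that share a column.

    Each pair is ((seq_a, res_a), (seq_b, res_b)) with seq_a < seq_b,
    where res_* is the residue's index in the ungapped sequence.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows of an alignment must have equal length")
    pairs = set()
    counters = [0] * len(alignment)  # next residue index per sequence
    for col in range(len(alignment[0])):
        present = []
        for seq_idx, row in enumerate(alignment):
            if row[col] != gap:
                present.append((seq_idx, counters[seq_idx]))
                counters[seq_idx] += 1
        pairs.update(combinations(present, 2))
    return pairs


def sp_score(test, reference, gap="-"):
    """Fraction of reference residue pairs reproduced by the test alignment.

    Assumes both alignments list the same sequences in the same order.
    """
    ref_pairs = aligned_pairs(reference, gap)
    if not ref_pairs:
        raise ValueError("reference alignment contains no aligned pairs")
    return len(ref_pairs & aligned_pairs(test, gap)) / len(ref_pairs)


# Hypothetical toy data: the second sequence is shifted in the realignment.
reference = ["ACGT", "AC-T", "A-GT"]
realigned = ["ACGT", "ACT-", "A-GT"]
print(sp_score(realigned, reference))  # 0.75
```

In this toy case the reference contains 8 aligned residue pairs and the realignment reproduces 6 of them. A score of this kind needs a trusted reference alignment, which is why the abstract proposes predicting it from conservation level and other alignment characteristics when no reference is available.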

AVAILABILITY

The method is freely available at http://www.mtt.fi/AlignmentQuality/.
