Discriminative modelling of context-specific amino acid substitution probabilities.

Author information

Gene Center Munich and Department of Biochemistry, Ludwig-Maximilians-Universität München, 81377 Munich, Germany.

Publication information

Bioinformatics. 2012 Dec 15;28(24):3240-7. doi: 10.1093/bioinformatics/bts622. Epub 2012 Oct 17.

Abstract

MOTIVATION

Protein sequence searching and alignment are fundamental tools of modern biology. Alignments are assessed using their similarity scores, essentially the sum of substitution matrix scores over all pairs of aligned amino acids. We previously proposed a generative probabilistic method that yields scores that take the sequence context around each aligned residue into account. This method showed drastically improved sensitivity and alignment quality compared with standard substitution matrix-based alignment.
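As a minimal sketch of the standard scoring described above (the similarity score as a sum of substitution matrix scores over aligned residue pairs), the following assumes a gap-free alignment and a tiny toy score table. The values in `TOY_SCORES` and the helper names are hypothetical and chosen only for illustration; they are not taken from BLOSUM or from the paper.

```python
# Toy substitution scores for a handful of residue pairs.
# ASSUMPTION: these values are invented for illustration, not a real matrix.
TOY_SCORES = {
    ("A", "A"): 4, ("A", "S"): 1, ("S", "S"): 4,
    ("L", "L"): 4, ("L", "I"): 2, ("I", "I"): 4,
    ("A", "L"): -1, ("A", "I"): -1, ("S", "L"): -2, ("S", "I"): -2,
}


def pair_score(a, b, scores=TOY_SCORES):
    """Look up the score for residues a and b, treating the table as symmetric."""
    if (a, b) in scores:
        return scores[(a, b)]
    return scores[(b, a)]


def alignment_score(seq_x, seq_y, scores=TOY_SCORES):
    """Sum substitution scores over all aligned residue pairs (no gaps)."""
    if len(seq_x) != len(seq_y):
        raise ValueError("aligned sequences must have equal length")
    return sum(pair_score(a, b, scores) for a, b in zip(seq_x, seq_y))


# A-S scores 1, S-S scores 4, L-I scores 2.
print(alignment_score("ASL", "SSI"))  # prints 7
```

In this scheme every A-S pair receives the same score wherever it occurs. A context-specific method instead lets the score at a position depend on the neighbouring residues.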

RESULTS

Here, we develop an alternative discriminative approach to predict sequence context-specific substitution scores. We applied our approach to compute context-specific sequence profiles for Basic Local Alignment Search Tool (BLAST) and compared the new tool (CS-BLASTdis) to BLAST and the previous context-specific version (CS-BLASTgen). On a dataset filtered to 20% maximum sequence identity, CS-BLASTdis was 51% more sensitive than BLAST and 17% more sensitive than CS-BLASTgen in detecting remote homologues at 10% false discovery rate. At 30% maximum sequence identity, its alignments contain 21% and 12% more correct residue pairs than those of BLAST and CS-BLASTgen, respectively. Clear improvements are also seen when the approach is combined with PSI-BLAST and HHblits. We believe the context-specific approach should replace substitution matrices wherever sensitivity and alignment quality are critical.
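To make the sensitivity figures concrete, here is a minimal sketch of one common way to count detected homologues at a fixed false discovery rate from a ranked hit list. The abstract does not describe the benchmark protocol, so the function name `true_positives_at_fdr`, the input format and the handling of ties are all assumptions made for illustration.

```python
def true_positives_at_fdr(hits, max_fdr=0.10):
    """Return the largest true-positive count at any score cutoff with FDR <= max_fdr.

    hits: iterable of (score, is_true_homologue) pairs; higher score = better hit.
    ASSUMPTION: FDR is computed as FP / (TP + FP) over all hits above the cutoff,
    and tied scores are not treated specially.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    tp = fp = best = 0
    for _, is_true in ranked:
        if is_true:
            tp += 1
        else:
            fp += 1
        # Record the TP count whenever the running FDR is within the limit.
        if fp / (tp + fp) <= max_fdr:
            best = tp
    return best


# Nine true hits, then one false hit, one more true hit, and a second false hit.
example_hits = [(10 - i, True) for i in range(9)]
example_hits += [(0.5, False), (0.4, True), (0.3, False)]
print(true_positives_at_fdr(example_hits))  # prints 10
```

Under this reading, a statement such as "51% more sensitive" would correspond to the relative increase in this count between two tools evaluated on the same benchmark.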
