Protein database searches using compositionally adjusted substitution matrices.

Authors

Altschul Stephen F, Wootton John C, Gertz E Michael, Agarwala Richa, Morgulis Aleksandr, Schäffer Alejandro A, Yu Yi-Kuo

Affiliation

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.

Publication details

FEBS J. 2005 Oct;272(20):5101-9. doi: 10.1111/j.1742-4658.2005.04945.x.

Abstract

Almost all protein database search methods use amino acid substitution matrices for scoring, optimizing, and assessing the statistical significance of sequence alignments. Much care and effort has therefore gone into constructing substitution matrices, and the quality of search results can depend strongly upon the choice of the proper matrix. A long-standing problem has been the comparison of sequences with biased amino acid compositions, for which standard substitution matrices are not optimal. To address this problem, we have recently developed a general procedure for transforming a standard matrix into one appropriate for the comparison of two sequences with arbitrary, and possibly differing, compositions. Such adjusted matrices yield, on average, improved alignments and alignment scores when applied to the comparison of proteins with markedly biased compositions. Here we review the application of compositionally adjusted matrices and consider whether they may also be applied fruitfully to general-purpose protein sequence database searches, in which related sequence pairs do not necessarily have strong compositional biases. Although it is not advisable to apply compositional adjustment indiscriminately, we describe several simple criteria under which invoking such adjustment is on average beneficial. In a typical database search, at least one of these criteria is satisfied by over half the related sequence pairs. Compositional substitution matrix adjustment is now available in NCBI's protein-protein version of BLAST.
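
The idea of transforming a matrix to fit two given compositions can be illustrated with a small sketch. It is not the procedure from the paper or the code used in BLAST. It rests on these assumptions: a toy three-letter alphabet with made-up frequencies, log-odds scores of the standard form s_ij = scale * log2(Q_ij / (P_i * P'_j)), and iterative proportional fitting as a stand-in technique for finding target frequencies Q that stay close to the original ones while matching the new compositions. The function names are hypothetical.

```python
import math

def adjust_target_frequencies(q, row_comp, col_comp, iterations=200):
    """Rescale joint target frequencies q[i][j] by iterative proportional
    fitting so that row sums approach row_comp and column sums match col_comp.
    Assumes every entry of q is strictly positive."""
    n = len(q)
    Q = [row[:] for row in q]
    for _ in range(iterations):
        # Scale each row to the first sequence's composition.
        for i in range(n):
            row_sum = sum(Q[i])
            Q[i] = [x * row_comp[i] / row_sum for x in Q[i]]
        # Scale each column to the second sequence's composition.
        for j in range(n):
            col_sum = sum(Q[i][j] for i in range(n))
            for i in range(n):
                Q[i][j] *= col_comp[j] / col_sum
    return Q

def log_odds_scores(Q, row_comp, col_comp, scale=2.0):
    """Log-odds scores relative to the two compositions.
    scale=2.0 expresses the scores in half-bit units (an assumption)."""
    n = len(Q)
    return [[scale * math.log2(Q[i][j] / (row_comp[i] * col_comp[j]))
             for j in range(n)]
            for i in range(n)]

# Toy data (illustrative only, not derived from any real matrix).
q = [[0.30, 0.05, 0.05],
     [0.05, 0.20, 0.05],
     [0.05, 0.05, 0.20]]
query_comp = [0.6, 0.2, 0.2]     # biased composition of the first sequence
subject_comp = [0.5, 0.3, 0.2]   # composition of the second sequence

Q = adjust_target_frequencies(q, query_comp, subject_comp)
scores = log_odds_scores(Q, query_comp, subject_comp)

# Expected score of aligning two unrelated letters drawn from the compositions.
expected_score = sum(query_comp[i] * subject_comp[j] * scores[i][j]
                     for i in range(3) for j in range(3))
```

In this sketch the adjusted frequencies have marginals that agree with the two compositions, and the expected score for unrelated letters is negative, which is the usual requirement for a local-alignment scoring matrix.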
