Improved pairwise alignments of proteins in the Twilight Zone using local structure predictions.

Author information

Huang Yao-Ming, Bystroff Christopher

Affiliation

Center for Bioinformatics, Department of Biology, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.

Publication information

Bioinformatics. 2006 Feb 15;22(4):413-22. doi: 10.1093/bioinformatics/bti828. Epub 2005 Dec 13.

DOI:10.1093/bioinformatics/bti828

PMID:16352653

Abstract

MOTIVATION

In recent years, advances have been made in the ability of computational methods to discriminate between homologous and non-homologous proteins in the 'twilight zone' of sequence similarity, where the percent sequence identity is a poor indicator of homology. To make these predictions more valuable to the protein modeler, they must be accompanied by accurate alignments. Pairwise sequence alignments are inferences of orthologous relationships between sequence positions. Evolutionary distance is traditionally modeled using global amino acid substitution matrices. But real differences in the likelihood of substitutions may exist for different structural contexts within proteins, since structural context contributes to the selective pressure.

RESULTS

HMMSUM (HMMSTR-based substitution matrices) is a new model for structural context-based amino acid substitution probabilities consisting of a set of 281 matrices, each for a different sequence-structure context. HMMSUM does not require the structure of the protein to be known. Instead, predictions of local structure are made using HMMSTR, a hidden Markov model for local structure. Alignments using the HMMSUM matrices compare favorably to alignments carried out using the BLOSUM matrices or structure-based substitution matrices SDM and HSDM when validated against remote homolog alignments from BAliBASE. HMMSUM has been implemented using local Dynamic Programming and with the Bayesian Adaptive alignment method.
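The idea of context-dependent scoring combined with local dynamic programming can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each position of the first sequence carries a weight vector over a small set of structural contexts (a stand-in for local structure predictions), that the substitution score is the weighted sum over context-specific matrices, and that gaps use a simple linear penalty. The function names, the two toy matrices, and the gap value are all hypothetical.

```python
from typing import Dict, Sequence, Tuple

Matrix = Dict[Tuple[str, str], float]


def toy_matrix(alphabet: str, match: float, mismatch: float) -> Matrix:
    """Hypothetical match/mismatch matrix; real matrices would hold log-odds scores."""
    return {(a, b): (match if a == b else mismatch) for a in alphabet for b in alphabet}


def context_score(weights: Sequence[float], matrices: Sequence[Matrix], a: str, b: str) -> float:
    """Expected score for aligning a with b, averaged over context weights (assumed scheme)."""
    return sum(w * m[(a, b)] for w, m in zip(weights, matrices))


def local_align_score(
    seq_a: str,
    seq_b: str,
    ctx_weights: Sequence[Sequence[float]],
    matrices: Sequence[Matrix],
    gap: float = -4.0,
) -> float:
    """Smith-Waterman local alignment score with a linear gap penalty.

    ctx_weights[i] is the context weight vector for position i of seq_a.
    """
    prev = [0.0] * (len(seq_b) + 1)
    best = 0.0
    for i in range(1, len(seq_a) + 1):
        cur = [0.0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            s = context_score(ctx_weights[i - 1], matrices, seq_a[i - 1], seq_b[j - 1])
            # Local alignment: a cell never drops below zero.
            cur[j] = max(0.0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


# Two toy contexts over a two-letter alphabet, with equal weights at each position.
matrices = [toy_matrix("AG", 2.0, -1.0), toy_matrix("AG", 4.0, -2.0)]
weights = [[0.5, 0.5]] * 3
print(local_align_score("AGA", "AGA", weights, matrices))
```

In this sketch the only change relative to alignment with a single global matrix is the call to `context_score`, which selects or blends matrices per position. With all weight placed on one context, it reduces to ordinary local alignment with that matrix.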
