Fold-specific substitution matrices for protein classification.

Authors

Vilim R B, Cunningham R M, Lu B, Kheradpour P, Stevens F J

Affiliation

Argonne National Laboratory, 9700 S. Cass Avenue, Argonne, IL 60439, USA.

Publication information

Bioinformatics. 2004 Apr 12;20(6):847-53. doi: 10.1093/bioinformatics/btg492. Epub 2004 Feb 5.

DOI: 10.1093/bioinformatics/btg492
PMID: 14764567

Abstract

MOTIVATION

Methods that focus on secondary structures, such as Position Specific Scoring Matrices and Hidden Markov Models, have proved useful for assigning proteins to families. However, for assigning proteins to an attribute class within a family these methods may introduce more free parameters than are needed. There are fewer members and there is less variability among sequences within a family. We describe a method for organizing proteins in a family that exhibits up to an order of magnitude reduction in the number of parameters. The basis is the log odds ratio commonly used to measure similarity. We adapt this to characterize the sequence dissimilarities that give rise to attribute differentiation. This leads to the definition of Class Attribute Substitution Matrices (CLASSUM), a dual of the BLOSUM.
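
The log odds ratio mentioned above can be illustrated with a minimal sketch of the standard BLOSUM-style calculation. The abstract does not give the exact CLASSUM formulation, so this shows only the generic similarity score that the authors say they adapt; the function name `log_odds_matrix` and the input layout (one string of residues per aligned column) are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def log_odds_matrix(columns):
    """Return BLOSUM-style log-odds scores (in bits) for observed residue pairs.

    columns: list of strings, each holding the residues seen at one aligned
    position (hypothetical input layout, not taken from the paper).
    """
    # Count every unordered residue pair within each column.
    pair_counts = Counter()
    for col in columns:
        for a, b in combinations(col, 2):
            pair_counts[tuple(sorted((a, b)))] += 1

    total = sum(pair_counts.values())
    q = {pair: n / total for pair, n in pair_counts.items()}

    # Background frequency of each residue, derived from the pair frequencies.
    p = Counter()
    for (a, b), q_ab in q.items():
        if a == b:
            p[a] += q_ab
        else:
            p[a] += q_ab / 2
            p[b] += q_ab / 2

    # Score = log2(observed pair frequency / frequency expected by chance).
    scores = {}
    for (a, b), q_ab in q.items():
        expected = p[a] ** 2 if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = math.log2(q_ab / expected)
    return scores
```

A positive score means the pair co-occurs in aligned columns more often than chance predicts, and a negative score means less often. A matrix aimed at class differentiation would presumably derive its counts from differences between classes rather than from similarity within blocks, but that step is not specified in the abstract and is not implemented here.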

RESULTS

The method was applied to classify sequences hierarchically in the lambda and kappa subgroups of the immunoglobulin superfamily. Positions conferring class were identified based on the degree of amino acid variability at a position. The CLASSUM computed for these positions classified better than 90% of test data correctly compared with 35-50% for BLOSUM-62. The expected value for a random matrix is 14%. The results suggest that family-specific data-derived substitution matrices can improve the resolution of automated methods that use generic substitution matrices for searching for and classifying proteins.
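
The position-selection step can be sketched as follows. The abstract says only that class-conferring positions were identified from "the degree of amino acid variability", so the use of Shannon entropy as the variability measure is an assumption here, as are the helper names `column_entropy` and `most_variable_positions` and the cutoff parameter `k`.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def most_variable_positions(sequences, k):
    """Return the indices of the k most variable columns, in ascending order.

    sequences: equal-length aligned sequences (hypothetical input layout).
    Entropy is an assumed stand-in for the paper's variability criterion.
    """
    columns = list(zip(*sequences))
    entropies = [column_entropy(col) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda i: entropies[i], reverse=True)
    return sorted(ranked[:k])
```

For the toy alignment `["ACDA", "ACEA", "AGFA", "AGHA"]`, columns 0 and 3 are invariant, so `most_variable_positions(..., 2)` selects columns 1 and 2. A substitution matrix would then be computed over the selected columns only.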

Similar articles

1
Fold-specific substitution matrices for protein classification.
Bioinformatics. 2004 Apr 12;20(6):847-53. doi: 10.1093/bioinformatics/btg492. Epub 2004 Feb 5.
2
Improved pairwise alignments of proteins in the Twilight Zone using local structure predictions.
Bioinformatics. 2006 Feb 15;22(4):413-22. doi: 10.1093/bioinformatics/bti828. Epub 2005 Dec 13.
3
A metric model of amino acid substitution.
Bioinformatics. 2004 May 22;20(8):1214-21. doi: 10.1093/bioinformatics/bth065. Epub 2004 Feb 10.
4
The construction of amino acid substitution matrices for the comparison of proteins with non-standard compositions.
Bioinformatics. 2005 Apr 1;21(7):902-11. doi: 10.1093/bioinformatics/bti070. Epub 2004 Oct 27.
5
Fast model-based protein homology detection without alignment.
Bioinformatics. 2007 Jul 15;23(14):1728-36. doi: 10.1093/bioinformatics/btm247. Epub 2007 May 8.
6
Prediction of functional specificity determinants from protein sequences using log-likelihood ratios.
Bioinformatics. 2006 Jan 15;22(2):164-71. doi: 10.1093/bioinformatics/bti766. Epub 2005 Nov 8.
7
HMM-ModE--improved classification using profile hidden Markov models by optimising the discrimination threshold and modifying emission probabilities with negative training sequences.
BMC Bioinformatics. 2007 Mar 27;8:104. doi: 10.1186/1471-2105-8-104.
8
A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence.
J Mol Biol. 1997 Apr 11;267(4):1026-38. doi: 10.1006/jmbi.1997.0924.
9
Use of multiple profiles corresponding to a sequence alignment enables effective detection of remote homologues.
Bioinformatics. 2005 Jun 15;21(12):2821-6. doi: 10.1093/bioinformatics/bti432. Epub 2005 Apr 7.
10
Enriching the sequence substitution matrix by structural information.
Proteins. 2004 Jan 1;54(1):41-8. doi: 10.1002/prot.10474.

Cited by

1
New alignment method for remote protein sequences by the direct use of pairwise sequence correlations and substitutions.
Front Bioinform. 2023 Oct 12;3:1227193. doi: 10.3389/fbinf.2023.1227193. eCollection 2023.
2
Developing similarity matrices for antibody-protein binding interactions.
PLoS One. 2023 Oct 26;18(10):e0293606. doi: 10.1371/journal.pone.0293606. eCollection 2023.
3
Mutation Space of Spatially Conserved Amino Acid Sites in Proteins.
ACS Omega. 2023 Jun 28;8(27):24302-24310. doi: 10.1021/acsomega.3c01473. eCollection 2023 Jul 11.
4
New amino acid substitution matrix brings sequence alignments into agreement with structure matches.
Proteins. 2021 Jun;89(6):671-682. doi: 10.1002/prot.26050. Epub 2021 Feb 2.
5
Substitution scoring matrices for proteins - An overview.
Protein Sci. 2020 Nov;29(11):2150-2163. doi: 10.1002/pro.3954. Epub 2020 Oct 12.
6
SubVis: an interactive R package for exploring the effects of multiple substitution matrices on pairwise sequence alignment.
PeerJ. 2017 Jun 27;5:e3492. doi: 10.7717/peerj.3492. eCollection 2017.
7
Fold-specific sequence scoring improves protein sequence matching.
BMC Bioinformatics. 2016 Aug 30;17(1):328. doi: 10.1186/s12859-016-1198-z.
8
PR2ALIGN: a stand-alone software program and a web-server for protein sequence alignment using weighted biochemical properties of amino acids.
BMC Res Notes. 2015 May 7;8:187. doi: 10.1186/s13104-015-1152-6.
9
Protein sequence alignment with family-specific amino acid similarity matrices.
BMC Res Notes. 2011 Aug 16;4:296. doi: 10.1186/1756-0500-4-296.
10
Aligning protein sequence and analysing substitution pattern using a class-specific matrix.
J Biosci. 2010 Jun;35(2):295-314. doi: 10.1007/s12038-010-0033-3.