Analysis of substructural variation in families of enzymatic proteins with applications to protein function prediction.

Affiliation

Department of Computer Science, Rice University, Houston, TX, USA.

Publication information

BMC Bioinformatics. 2010 May 11;11:242. doi: 10.1186/1471-2105-11-242.

DOI:10.1186/1471-2105-11-242

PMID:20459833

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC2885373/

Abstract

BACKGROUND

Structural variations caused by a wide range of physico-chemical and biological sources directly influence the function of a protein. For enzymatic proteins, the structure and chemistry of the catalytic binding site residues can be loosely defined as a substructure of the protein. Comparative analysis of drug-receptor substructures across and within species has been used for lead evaluation. Substructure-level similarity between the binding sites of functionally similar proteins has also been used to identify instances of convergent evolution among proteins. In functionally homologous protein families, shared chemistry and geometry at catalytic sites provide a common, local point of comparison among proteins that may differ significantly at the sequence, fold, or domain topology levels.

RESULTS

This paper describes two key results that can be used separately or in combination for protein function analysis. The Family-wise Analysis of SubStructural Templates (FASST) method uses all-against-all substructure comparison to determine Substructural Clusters (SCs). SCs characterize the binding site substructural variation within a protein family. In this paper we focus on examples of automatically determined SCs that can be linked to phylogenetic distance between family members, segregation by conformation, and organization by homology among convergent protein lineages. The Motif Ensemble Statistical Hypothesis (MESH) framework constructs a representative motif for each protein cluster among the SCs determined by FASST to build motif ensembles that are shown through a series of function prediction experiments to improve the function prediction power of existing motifs.
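The two steps described above, all-against-all substructure comparison followed by one representative per cluster, can be illustrated with a minimal sketch. The abstract does not state which distance measure, clustering algorithm, or representative-selection rule FASST and MESH use, so everything below is an assumption made for illustration: substructures are taken to be pre-superposed lists of matched 3D points, distance is plain RMSD, clusters come from single-linkage grouping under a hypothetical `cutoff`, and each cluster is represented by its medoid. The function names (`rmsd`, `all_against_all`, `cluster`, `medoid`, `motif_ensemble`) and the toy data are likewise hypothetical.

```python
from itertools import combinations
from math import sqrt

Point = tuple[float, float, float]
Substructure = list[Point]


def rmsd(a: Substructure, b: Substructure) -> float:
    """RMSD between two substructures (assumed pre-superposed, points matched by index)."""
    if len(a) != len(b):
        raise ValueError("substructures must have the same number of points")
    sq = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return sqrt(sq / len(a))


def all_against_all(subs: dict[str, Substructure]) -> dict[tuple[str, str], float]:
    """Symmetric table of pairwise distances between every pair of family members."""
    dist = {(n, n): 0.0 for n in subs}
    for m, n in combinations(sorted(subs), 2):
        dist[(m, n)] = dist[(n, m)] = rmsd(subs[m], subs[n])
    return dist


def cluster(names: list[str], dist: dict[tuple[str, str], float], cutoff: float) -> list[list[str]]:
    """Single-linkage clustering (an assumed stand-in): join any pair within `cutoff`."""
    parent = {n: n for n in names}

    def find(n: str) -> str:
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for m, n in combinations(names, 2):
        if dist[(m, n)] <= cutoff:
            parent[find(m)] = find(n)

    groups: dict[str, list[str]] = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


def medoid(members: list[str], dist: dict[tuple[str, str], float]) -> str:
    """Member with the smallest total distance to its cluster; ties broken by name."""
    return min(members, key=lambda m: (sum(dist[(m, o)] for o in members), m))


def motif_ensemble(subs: dict[str, Substructure], cutoff: float) -> list[str]:
    """One representative per cluster, collected into an ensemble."""
    dist = all_against_all(subs)
    return [medoid(c, dist) for c in cluster(sorted(subs), dist, cutoff)]


# Toy family: protA and protB are near-identical, protC is far from both.
toy = {
    "protA": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "protB": [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
    "protC": [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)],
}
ensemble = motif_ensemble(toy, cutoff=0.5)  # ["protA", "protC"]
```

In this toy case the two similar members fall into one cluster and the outlier forms its own, so the ensemble holds two representatives instead of a single motif for the whole family. A real pipeline would need structural superposition and a statistically justified threshold, neither of which this sketch attempts.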

CONCLUSIONS

FASST contributes a critical feedback and assessment step to existing binding site substructure identification methods and can be used for the thorough investigation of structure-function relationships. The application of MESH allows for an automated, statistically rigorous procedure for incorporating structural variation data into protein function prediction pipelines. Our work provides an unbiased, automated assessment of the structural variability of identified binding site substructures among protein structure families and a technique for exploring the relation of substructural variation to protein function. As available proteomic data continues to expand, the techniques proposed will be indispensable for the large-scale analysis and interpretation of structural data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3dfe/2885373/311b599486b6/1471-2105-11-242-1.jpg
