
Identification of recurring protein structure microenvironments and discovery of novel functional sites around CYS residues.

Authors

Wu Shirley, Liu Tianyun, Altman Russ B

Affiliation

23andMe, 1390 Shorebird Way, Mountain View, CA, USA.

Publication

BMC Struct Biol. 2010 Feb 2;10:4. doi: 10.1186/1472-6807-10-4.

DOI: 10.1186/1472-6807-10-4
PMID: 20122268
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2833161/

Abstract

BACKGROUND

The emergence of structural genomics presents significant challenges in the annotation of biologically uncharacterized proteins. Unfortunately, our ability to analyze these proteins is restricted by the limited catalog of known molecular functions and their associated 3D motifs.

RESULTS

In order to identify novel 3D motifs that may be associated with molecular functions, we employ an unsupervised, two-phase clustering approach that combines k-means and hierarchical clustering with knowledge-informed cluster selection and annotation methods. We applied the approach to approximately 20,000 cysteine-based protein microenvironments (3D regions 7.5 Å in radius) and identified 70 interesting clusters, some of which represent known motifs (e.g. metal binding and phosphatase activity), and some of which are novel, including several zinc binding sites. Detailed annotation results are available online for all 70 clusters at http://feature.stanford.edu/clustering/cys.
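
As a rough illustration of what a two-phase scheme of this kind can look like, the sketch below first over-partitions feature vectors with k-means and then merges the resulting centroids by average-linkage hierarchical clustering. It is not the authors' implementation: the toy 2D vectors, the deterministic initialization, the value of `k`, the `cutoff` threshold and the choice of average linkage are all assumptions made for the example, and the knowledge-informed cluster selection and annotation steps are not modeled.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def kmeans(points, k, iters=100):
    """Phase 1: coarse partition. Initial centroids are evenly spaced
    input points (an assumption chosen to keep the example deterministic)."""
    step = len(points) // k
    centroids = points[::step][:k]
    for _ in range(iters):
        labels = [min(range(k), key=lambda i: dist(p, centroids[i])) for p in points]
        updated = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            # Keep the old centroid if a group ends up empty.
            updated.append(mean(members) if members else centroids[i])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

def agglomerate(centroids, cutoff):
    """Phase 2: average-linkage merging of centroids until the closest
    pair of groups is farther apart than `cutoff`."""
    clusters = [[i] for i in range(len(centroids))]

    def linkage(a, b):
        total = sum(dist(centroids[i], centroids[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Toy data (hypothetical): three tight groups of 2D feature vectors.
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5)]
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

centroids, labels = kmeans(points, k=6)      # deliberately over-partition
merged = agglomerate(centroids, cutoff=3.0)  # merge nearby centroids
group_of = {c: g for g, members in enumerate(merged) for c in members}
final = [group_of[lab] for lab in labels]
print(merged)  # [[0, 1], [2, 3], [4, 5]]
print(final)   # [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
```

Over-partitioning first keeps the pairwise linkage computation in the second phase small, since it runs over centroids rather than over every input vector.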

CONCLUSIONS

The use of microenvironments instead of backbone geometric criteria enables flexible exploration of protein function space, and detection of recurring motifs that are discontinuous in sequence and diverse in structure. Clustering microenvironments may thus help to functionally characterize novel proteins and better understand the protein structure-function relationship.
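
To make the idea of a microenvironment concrete, here is a minimal sketch that collects every atom within 7.5 Å of a chosen center, regardless of where its residue sits in the sequence. The atom records, coordinates and the `microenvironment` helper are hypothetical and exist only for this example; real use would read coordinates from a structure file.

```python
import math

RADIUS = 7.5  # Å, the microenvironment radius quoted in the abstract

def microenvironment(center, atoms, radius=RADIUS):
    """Return the atoms whose coordinates lie within `radius` of `center`."""
    return [a for a in atoms if math.dist(center, a[2]) <= radius]

# Hypothetical toy atoms: (residue_number, atom_name, (x, y, z) in Å).
atoms = [
    (12, "SG", (0.0, 0.0, 0.0)),    # the cysteine sulfur used as center
    (45, "CA", (9.0, 0.0, 0.0)),    # 9.0 Å away, outside the sphere
    (87, "NE2", (3.0, 4.0, 0.0)),   # 5.0 Å away, inside
    (140, "SG", (0.0, 6.0, 4.0)),   # about 7.2 Å away, inside
    (141, "CB", (6.0, 6.0, 0.0)),   # about 8.5 Å away, outside
]

center = atoms[0][2]
env = microenvironment(center, atoms)
residues = sorted({a[0] for a in env})
print(residues)  # [12, 87, 140]
```

In this toy case the neighborhood gathers residues 12, 87 and 140, which are far apart in sequence but close in space, the kind of discontinuous grouping that a purely backbone-based criterion would not capture directly.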



Similar articles

1
Identification of recurring protein structure microenvironments and discovery of novel functional sites around CYS residues.
BMC Struct Biol. 2010 Feb 2;10:4. doi: 10.1186/1472-6807-10-4.
2
Clustering protein environments for function prediction: finding PROSITE motifs in 3D.
BMC Bioinformatics. 2007 May 22;8 Suppl 4(Suppl 4):S10. doi: 10.1186/1471-2105-8-S4-S10.
3
Identification of subfamily-specific sites based on active sites modeling and clustering.
Bioinformatics. 2010 Dec 15;26(24):3075-82. doi: 10.1093/bioinformatics/btq595. Epub 2010 Oct 26.
4
Structural fragment clustering reveals novel structural and functional motifs in alpha-helical transmembrane proteins.
BMC Bioinformatics. 2010 Apr 26;11:204. doi: 10.1186/1471-2105-11-204.
5
Target selection and determination of function in structural genomics.
IUBMB Life. 2003 Apr-May;55(4-5):249-55. doi: 10.1080/1521654031000123385.
6
The MASH pipeline for protein function prediction and an algorithm for the geometric refinement of 3D motifs.
J Comput Biol. 2007 Jul-Aug;14(6):791-816. doi: 10.1089/cmb.2007.R017.
7
Comparison of topological clustering within protein networks using edge metrics that evaluate full sequence, full structure, and active site microenvironment similarity.
Protein Sci. 2015 Sep;24(9):1423-39. doi: 10.1002/pro.2724. Epub 2015 Aug 18.
8
Protein cavity clustering based on community structure of pocket similarity network.
Int J Bioinform Res Appl. 2008;4(4):445-60. doi: 10.1504/IJBRA.2008.021179.
9
Structural clusters of evolutionary trace residues are statistically significant and common in proteins.
J Mol Biol. 2002 Feb 8;316(1):139-54. doi: 10.1006/jmbi.2001.5327.
10
Improved K-means clustering algorithm for exploring local protein sequence motifs representing common structural property.
IEEE Trans Nanobioscience. 2005 Sep;4(3):255-65. doi: 10.1109/tnb.2005.853667.

Cited by

1
Unsupervised learning reveals landscape of local structural motifs across protein classes.
Bioinformatics. 2025 Jul 1;41(7). doi: 10.1093/bioinformatics/btaf377.
2
Prospector Heads: Generalized Feature Attribution for Large Models & Data.
ArXiv. 2024 Jun 20:arXiv:2402.11729v2.
3
A deep learning framework to predict binding preference of RNA constituents on protein surface.
Nat Commun. 2019 Oct 30;10(1):4941. doi: 10.1038/s41467-019-12920-0.
4
Functional and Structural Diversity of Acyl-coA Binding Proteins in Oil Crops.
Front Genet. 2018 May 22;9:182. doi: 10.3389/fgene.2018.00182. eCollection 2018.
5
An integrative computational framework based on a two-step random forest algorithm improves prediction of zinc-binding sites in proteins.
PLoS One. 2012;7(11):e49716. doi: 10.1371/journal.pone.0049716. Epub 2012 Nov 14.
6
Prediction of functionally important residues in globular proteins from unusual central distances of amino acids.
BMC Struct Biol. 2011 Sep 18;11:34. doi: 10.1186/1472-6807-11-34.
7
Amino Acid Features of P1B-ATPase Heavy Metal Transporters Enabling Small Numbers of Organisms to Cope with Heavy Metal Pollution.
Bioinform Biol Insights. 2011 Apr 17;5:59-82. doi: 10.4137/BBI.S6206.
8
Mining the TRAF6/p62 interactome for a selective ubiquitination motif.
BMC Proc. 2011 May 28;5 Suppl 2(Suppl 2):S4. doi: 10.1186/1753-6561-5-S2-S4.
9
Remote thioredoxin recognition using evolutionary conservation and structural dynamics.
Structure. 2011 Apr 13;19(4):461-70. doi: 10.1016/j.str.2011.02.007.

References

1
Detection of functionally important regions in "hypothetical proteins" of known structure.
Structure. 2008 Dec 10;16(12):1755-63. doi: 10.1016/j.str.2008.10.017.
2
The Universal Protein Resource (UniProt) 2009.
Nucleic Acids Res. 2009 Jan;37(Database issue):D169-74. doi: 10.1093/nar/gkn664. Epub 2008 Oct 4.
3
The FEATURE framework for protein function annotation: modeling new functions, improving performance, and extending to novel applications.
BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S2. doi: 10.1186/1471-2164-9-S2-S2.
4
Target selection for structural genomics: an overview.
Methods Mol Biol. 2008;426:3-25. doi: 10.1007/978-1-60327-058-8_1.
5
Proteases.
Curr Protoc Protein Sci. 2001 May;Chapter 21:Unit 21.1. doi: 10.1002/0471140864.ps2101s21.
6
Functionally important segments in proteins dissected using Gene Ontology and geometric clustering of peptide fragments.
Genome Biol. 2008;9(3):R52. doi: 10.1186/gb-2008-9-3-r52. Epub 2008 Mar 10.
7
The SeqFEATURE library of 3D functional site models: comparison to existing methods and applications to protein function annotation.
Genome Biol. 2008 Jan 16;9(1):R8. doi: 10.1186/gb-2008-9-1-r8.
8
Impact of structures from the protein structure initiative.
Structure. 2007 Dec;15(12):1528-9. doi: 10.1016/j.str.2007.11.006.
9
Robust recognition of zinc binding sites in proteins.
Protein Sci. 2008 Jan;17(1):54-65. doi: 10.1110/ps.073138508. Epub 2007 Nov 27.
10
PocketPicker: analysis of ligand binding-sites with shape descriptors.
Chem Cent J. 2007 Mar 13;1:7. doi: 10.1186/1752-153X-1-7.