Bayesian data mining of protein domains gives an efficient predictive algorithm and new insight.

Authors

Joshi Rajani R, Samant Vivekanand V

Affiliation

Department of Mathematics, Indian Institute of Technology Bombay, Powai, Mumbai, 400 076, India.

Publication information

J Mol Model. 2007 Jan;13(1):275-82. doi: 10.1007/s00894-006-0141-z. Epub 2006 Oct 7.

DOI: 10.1007/s00894-006-0141-z
PMID: 17028865

Abstract

Identification of structural domains in uncharacterized protein sequences is important in the prediction of protein tertiary folds and functional sites, and hence in designing biologically active molecules. We present a new predictive computational method that uses Bayesian data mining to classify a protein as having a single domain, two continuous domains, or two discontinuous domains. The algorithm requires only the primary sequence and computer-predicted secondary structure. It incorporates correlation patterns between certain three-dimensional motifs and some local helical folds that are found, with high statistical confidence, to be conserved in the vicinity of protein domains. This computationally simple and fast method predicts the domain class with good accuracy: average accuracies are 83.3% for single-domain, 60% for two-continuous-domain, and 65.7% for two-discontinuous-domain proteins. Experiments on a large validation sample show its performance to be significantly better than that of DGS and DomSSEA. Computations of Bayesian probabilities reveal important features in the correlation of certain conserved patterns of secondary folds and tertiary motifs, and give new insight. Applications to improving the accuracy of predicting domain boundary points, relevant to protein structural and functional modeling, are also highlighted.
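To make the idea of a Bayesian domain-class prediction concrete, here is a minimal sketch of a posterior computation over the three classes named in the abstract. It is not the authors' algorithm: the feature names, priors, and likelihood values are all hypothetical, and the conditional-independence (naive Bayes) assumption is a simplification chosen only for illustration.

```python
from math import exp, log

# The three domain classes named in the abstract.
CLASSES = ("single", "two_continuous", "two_discontinuous")

# Hypothetical class priors (illustrative values, not from the paper).
PRIOR = {"single": 0.5, "two_continuous": 0.3, "two_discontinuous": 0.2}

# Hypothetical P(pattern present | class) for two made-up pattern features.
LIKELIHOOD = {
    "helix_near_boundary": {
        "single": 0.1, "two_continuous": 0.6, "two_discontinuous": 0.5,
    },
    "motif_pair_correlated": {
        "single": 0.2, "two_continuous": 0.3, "two_discontinuous": 0.7,
    },
}


def posterior(observed):
    """Return P(class | observed patterns) via Bayes' rule.

    `observed` maps each feature name to True/False. Features are
    assumed conditionally independent given the class (an assumption).
    """
    log_scores = {}
    for c in CLASSES:
        score = log(PRIOR[c])
        for name, present in observed.items():
            p = LIKELIHOOD[name][c]
            score += log(p if present else 1.0 - p)
        log_scores[c] = score
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores.values())
    unnorm = {c: exp(s - top) for c, s in log_scores.items()}
    total = sum(unnorm.values())
    return {c: v / total for c, v in unnorm.items()}


def predict(observed):
    """Return the class with the highest posterior probability."""
    post = posterior(observed)
    return max(post, key=post.get)


if __name__ == "__main__":
    example = {"helix_near_boundary": True, "motif_pair_correlated": True}
    print(predict(example), posterior(example))
```

In a real setting the likelihoods would be estimated from a training set of proteins with known domain assignments, and the features would be derived from the primary sequence and predicted secondary structure rather than set by hand.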


Similar articles

1. Bayesian data mining of protein domains gives an efficient predictive algorithm and new insight.
J Mol Model. 2007 Jan;13(1):275-82. doi: 10.1007/s00894-006-0141-z. Epub 2006 Oct 7.
2. Fast prediction of protein domain boundaries using conserved local patterns.
J Mol Model. 2006 Sep;12(6):943-52. doi: 10.1007/s00894-006-0116-0. Epub 2006 Apr 29.
3. Computing motif correlations in proteins.
J Comput Chem. 2003 Dec;24(16):2032-43. doi: 10.1002/jcc.10332.
4. Prediction of novel and analogous folds using fragment assembly and fold recognition.
Proteins. 2005;61 Suppl 7:143-151. doi: 10.1002/prot.20731.
5. MOTIPS: automated motif analysis for predicting targets of modular protein domains.
BMC Bioinformatics. 2010 May 11;11:243. doi: 10.1186/1471-2105-11-243.
6. SnapDRAGON: a method to delineate protein structural domains from sequence data.
J Mol Biol. 2002 Feb 22;316(3):839-51. doi: 10.1006/jmbi.2001.5387.
7. Proteomic tools for the analysis of cytoskeleton proteins.
Methods Mol Biol. 2009;586:375-88. doi: 10.1007/978-1-60761-376-3_22.
8. DomHR: accurately identifying domain boundaries in proteins using a hinge region strategy.
PLoS One. 2013 Apr 11;8(4):e60559. doi: 10.1371/journal.pone.0060559. Print 2013.
9. Automated prediction of domain boundaries in CASP6 targets using Ginzu and RosettaDOM.
Proteins. 2005;61 Suppl 7:193-200. doi: 10.1002/prot.20737.
10. Prediction of distant residue contacts with the use of evolutionary information.
Proteins. 2005 Mar 1;58(4):935-49. doi: 10.1002/prot.20370.

Cited by

1. Review of Personalized Medicine and Pharmacogenomics of Anti-Cancer Compounds and Natural Products.
Genes (Basel). 2024 Apr 8;15(4):468. doi: 10.3390/genes15040468.
2. Quantitative characterization of protein tertiary motifs.
J Mol Model. 2014 Jan;20(1):2077. doi: 10.1007/s00894-014-2077-z. Epub 2014 Jan 26.

References

1. Structure prediction of a multi-domain EF-hand Ca2+ binding protein by PROPAINOR.
J Mol Model. 2005 Nov;11(6):481-8. doi: 10.1007/s00894-005-0256-7. Epub 2005 Aug 11.
2. PPRODO: prediction of protein domain boundaries using neural networks.
Proteins. 2005 May 15;59(3):627-32. doi: 10.1002/prot.20442.
3. Characteristics and prediction of domain linker sequences in multi-domain proteins.
J Struct Funct Genomics. 2003;4(2-3):79-85. doi: 10.1023/a:1026163008203.
4. Ab-initio prediction and reliability of protein structural genomics by PROPAINOR algorithm.
Comput Biol Chem. 2003 Jul;27(3):241-52. doi: 10.1016/s0097-8485(02)00074-8.
5. DomCut: prediction of inter-domain linker regions in amino acid sequences.
Bioinformatics. 2003 Mar 22;19(5):673-4. doi: 10.1093/bioinformatics/btg031.
6. Rapid protein domain assignment from amino acid sequence using predicted secondary structure.
Protein Sci. 2002 Dec;11(12):2814-24. doi: 10.1110/ps.0209902.
7. SnapDRAGON: a method to delineate protein structural domains from sequence data.
J Mol Biol. 2002 Feb 22;316(3):839-51. doi: 10.1006/jmbi.2001.5387.
8. Universal similarity measure for comparing protein structures.
Biopolymers. 2001 Oct 15;59(5):305-9. doi: 10.1002/1097-0282(20011015)59:5<305::AID-BIP1027>3.0.CO;2-6.
9. Domain size distributions can predict domain boundaries.
Bioinformatics. 2000 Jul;16(7):613-8. doi: 10.1093/bioinformatics/16.7.613.
10. The PSIPRED protein structure prediction server.
Bioinformatics. 2000 Apr;16(4):404-5. doi: 10.1093/bioinformatics/16.4.404.