Combining features in a graphical model to predict protein binding sites.

Authors

Wierschin Torsten, Wang Keyu, Welter Marlon, Waack Stephan, Stanke Mario

Affiliation

Institute of Mathematics and Computer Science, University of Greifswald, 17487, Greifswald, Germany.

Publication information

Proteins. 2015 May;83(5):844-52. doi: 10.1002/prot.24775. Epub 2015 Mar 14.

DOI: 10.1002/prot.24775
PMID: 25663045

Abstract

Large efforts have been made in classifying residues as binding sites in proteins using machine learning methods. The prediction task can be translated into the computational challenge of assigning each residue the label binding site or non-binding site. Observational data comes from various possibly highly correlated sources. It includes the structure of the protein but not the structure of the complex. The model class of conditional random fields (CRFs) has previously successfully been used for protein binding site prediction. Here, a new CRF-approach is presented that models the dependencies of residues using a general graphical structure defined as a neighborhood graph and thus our model makes fewer independence assumptions on the labels than sequential labeling approaches. A novel node feature "change in free energy" is introduced into the model, which is then denoted by ΔF-CRF. Parameters are trained with an online large-margin algorithm. Using the standard feature class relative accessible surface area alone, the general graph-structure CRF already achieves higher prediction accuracy than the linear chain CRF of Li et al. ΔF-CRF performs significantly better on a large range of false positive rates than the support-vector-machine-based program PresCont of Zellner et al. on a homodimer set containing 128 chains. ΔF-CRF has a broader scope than PresCont since it is not constrained to protein subgroups and requires no multiple sequence alignment. The improvement is attributed to the advantageous combination of the novel node feature with the standard feature and to the adopted parameter training method.
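
The idea of labeling residues jointly over a neighborhood graph, rather than independently or along the sequence, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the graph is built from a plain Euclidean distance cutoff on toy coordinates, the two node features stand in for relative accessible surface area and a "change in free energy" value, the weights are set by hand rather than trained with an online large-margin algorithm, and the best labeling is found by brute force, which only works for toy-sized inputs.

```python
from itertools import combinations, product
from math import dist

def neighborhood_graph(coords, cutoff):
    """Edges (i, j), i < j, between residues at most `cutoff` apart (assumed rule)."""
    return [(i, j) for i, j in combinations(range(len(coords)), 2)
            if dist(coords[i], coords[j]) <= cutoff]

def score(labels, node_features, edges, node_w, edge_w):
    """Unnormalized score of a labeling: sum of node terms plus sum of edge terms."""
    total = 0.0
    for i, y in enumerate(labels):
        total += sum(w * f for w, f in zip(node_w[y], node_features[i]))
    for i, j in edges:
        total += edge_w[(labels[i], labels[j])]
    return total

def best_labeling(node_features, edges, node_w, edge_w):
    """Brute-force search over all 0/1 labelings (toy inputs only)."""
    n = len(node_features)
    return max(product((0, 1), repeat=n),
               key=lambda ys: score(ys, node_features, edges, node_w, edge_w))

# Hypothetical toy data: four residues, one of them far from the others.
coords = [(0, 0, 0), (3, 0, 0), (6, 0, 0), (20, 0, 0)]
edges = neighborhood_graph(coords, cutoff=4.0)

# Per-residue features: [bias, surface-accessibility stand-in, free-energy stand-in].
node_features = [[1.0, 0.8, 1.0], [1.0, 0.4, 0.2], [1.0, 0.7, 0.9], [1.0, 0.1, 0.0]]

# Hand-set weights (label 1 = binding site, label 0 = non-binding site).
node_w = {0: [0.0, 0.0, 0.0], 1: [-1.0, 1.0, 1.0]}
edge_w = {(0, 0): 0.3, (1, 1): 0.3, (0, 1): 0.0, (1, 0): 0.0}

independent = best_labeling(node_features, [], node_w, edge_w)
joint = best_labeling(node_features, edges, node_w, edge_w)
print(independent, joint)
```

With these toy numbers, the second residue is labeled non-binding when every residue is scored on its own, but switches to binding once the edge terms reward agreement with its two binding neighbors. That is the kind of label dependency a graph-structured model can express and an independent per-residue classifier cannot.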

Similar articles

1
Combining features in a graphical model to predict protein binding sites.
Proteins. 2015 May;83(5):844-52. doi: 10.1002/prot.24775. Epub 2015 Mar 14.
2
PresCont: predicting protein-protein interfaces utilizing four residue properties.
Proteins. 2012 Jan;80(1):154-68. doi: 10.1002/prot.23172. Epub 2011 Oct 31.
3
CRF-based models of protein surfaces improve protein-protein interaction site predictions.
BMC Bioinformatics. 2014 Aug 13;15(1):277. doi: 10.1186/1471-2105-15-277.
4
Protein-protein interaction site prediction based on conditional random fields.
Bioinformatics. 2007 Mar 1;23(5):597-604. doi: 10.1093/bioinformatics/btl660. Epub 2007 Jan 18.
5
Prediction of protein binding sites in protein structures using hidden Markov support vector machine.
BMC Bioinformatics. 2009 Nov 20;10:381. doi: 10.1186/1471-2105-10-381.
6
Prediction of protein-RNA residue-base contacts using two-dimensional conditional random field with the lasso.
BMC Syst Biol. 2013;7 Suppl 2(Suppl 2):S15. doi: 10.1186/1752-0509-7-S2-S15. Epub 2013 Dec 17.
7
A novel method of protein secondary structure prediction with high segment overlap measure: support vector machine approach.
J Mol Biol. 2001 Apr 27;308(2):397-407. doi: 10.1006/jmbi.2001.4580.
8
PSSM-based prediction of DNA binding sites in proteins.
BMC Bioinformatics. 2005 Feb 19;6:33. doi: 10.1186/1471-2105-6-33.
9
TargetATPsite: a template-free method for ATP-binding sites prediction with residue evolution image sparse representation and classifier ensemble.
J Comput Chem. 2013 Apr 30;34(11):974-85. doi: 10.1002/jcc.23219. Epub 2013 Jan 3.
10
Prediction of protein-RNA binding sites by a random forest method with combined features.
Bioinformatics. 2010 Jul 1;26(13):1616-22. doi: 10.1093/bioinformatics/btq253. Epub 2010 May 18.

Cited by

1
Utilizing knowledge base of amino acids structural neighborhoods to predict protein-protein interaction sites.
BMC Bioinformatics. 2017 Dec 6;18(Suppl 15):492. doi: 10.1186/s12859-017-1921-4.
2
Proteins and Their Interacting Partners: An Introduction to Protein-Ligand Binding Site Prediction Methods.
Int J Mol Sci. 2015 Dec 15;16(12):29829-42. doi: 10.3390/ijms161226202.