Predicting protein secondary structure using neural net and statistical methods.

Authors

Stolorz P, Lapedes A, Xia Y

Affiliation

Theoretical Division, Los Alamos National Laboratory, NM 87545.

Publication details

J Mol Biol. 1992 May 20;225(2):363-77. doi: 10.1016/0022-2836(92)90927-c.

DOI: 10.1016/0022-2836(92)90927-c
PMID: 1593625

Abstract

A comparison of neural network methods and Bayesian statistical methods is presented for prediction of the secondary structure of proteins given their primary sequence. The Bayesian method makes the unphysical assumption that the probability of an amino acid occurring in each position in the protein is independent of the amino acids occurring elsewhere. However, we find the predictive accuracy of the Bayesian method to be only minimally less than the accuracy of the most sophisticated methods used to date. We present the relationship of neural network methods to Bayesian statistical methods and show that, in principle, neural methods offer considerable power, although apparently they are not particularly useful for this problem. In the process, we derive a neural formalism in which the output neurons directly represent the conditional probabilities of structure class. The probabilistic formalism allows introduction of a new objective function, the mutual information, which translates the notion of correlation as a measure of predictive accuracy into a useful training measure. Although a similar accuracy to other approaches (utilizing a mean-square error) is achieved using this new measure, the accuracy on the training set is significantly and tantalizingly higher, even though the number of adjustable parameters remains the same. The mutual information measure predicts a greater fraction of helix and sheet structures correctly than the mean-square error measure, at the expense of coil accuracy, precisely as it was designed to do. By combining the two objective functions, we obtain a marginally improved accuracy of 64.4%, with Matthews coefficients C_α, C_β and C_coil of 0.40, 0.32 and 0.42, respectively. However, since all methods to date perform only slightly better than the Bayes algorithm, which entails the drastic assumption of independence of amino acids, one is forced to conclude that little progress has been made on this problem, despite the application of a variety of sophisticated algorithms such as neural networks, and that further advances will require a better understanding of the relevant biophysics.
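
The abstract reports two kinds of score: an overall three-state accuracy and a per-class Matthews coefficient for helix, sheet and coil. The sketch below shows one common way to compute both from per-residue labels. It is an illustration only, not the authors' code: the function names, the single-letter labels `H`/`E`/`C`, and the toy sequences are assumptions made for this example, and the coefficient is taken in its standard one-vs-rest form.

```python
import math


def matthews_coefficient(true_labels, predicted_labels, target):
    """One-vs-rest Matthews correlation coefficient for class `target`.

    Hypothetical helper: labels are any equal-length sequences of
    comparable items, e.g. strings over 'H' (helix), 'E' (sheet), 'C' (coil).
    """
    tp = tn = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == target and truth == target:
            tp += 1  # predicted target, and it is target
        elif pred == target:
            fp += 1  # predicted target, but it is another class
        elif truth == target:
            fn += 1  # missed a residue of the target class
        else:
            tn += 1  # correctly left out of the target class
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: return 0.0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def three_state_accuracy(true_labels, predicted_labels):
    """Fraction of residues whose predicted class matches the true class."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Made-up toy data, not taken from the paper.
true_ss = "HHHHCCEEEECC"
pred_ss = "HHHCCCEEECCC"

print("accuracy:", three_state_accuracy(true_ss, pred_ss))
for ss_class in "HEC":
    print(ss_class, matthews_coefficient(true_ss, pred_ss, ss_class))
```

Unlike raw accuracy, the Matthews coefficient penalizes a predictor that scores well simply by over-predicting the most common class, which is why the abstract quotes it separately for each structure class.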


Similar articles

1
Predicting protein secondary structure using neural net and statistical methods.
J Mol Biol. 1992 May 20;225(2):363-77. doi: 10.1016/0022-2836(92)90927-c.
2
Predicting the secondary structure of globular proteins using neural network models.
J Mol Biol. 1988 Aug 20;202(4):865-84. doi: 10.1016/0022-2836(88)90564-5.
3
Protein secondary structure prediction with SPARROW.
J Chem Inf Model. 2012 Feb 27;52(2):545-56. doi: 10.1021/ci200321u. Epub 2012 Jan 23.
4
Predicting solvent accessibility: higher accuracy using Bayesian statistics and optimized residue substitution classes.
Proteins. 1996 May;25(1):38-47. doi: 10.1002/(SICI)1097-0134(199605)25:1<38::AID-PROT4>3.0.CO;2-G.
5
Improving prediction of protein secondary structure using structured neural networks and multiple sequence alignments.
J Comput Biol. 1996 Spring;3(1):163-83. doi: 10.1089/cmb.1996.3.163.
6
Predicting protein secondary structure with probabilistic schemata of evolutionarily derived information.
Protein Sci. 1997 Sep;6(9):1963-75. doi: 10.1002/pro.5560060917.
7
Predicting protein secondary structure content. A tandem neural network approach.
J Mol Biol. 1992 Jun 5;225(3):713-27. doi: 10.1016/0022-2836(92)90396-2.
8
Determination of eukaryotic protein coding regions using neural networks and information theory.
J Mol Biol. 1992 Jul 20;226(2):471-9. doi: 10.1016/0022-2836(92)90961-i.
9
Neural networks for secondary structure and structural class predictions.
Protein Sci. 1995 Feb;4(2):275-85. doi: 10.1002/pro.5560040214.
10
LGANN: a parallel system combining a local genetic algorithm and neural networks for the prediction of secondary structure of proteins.
Comput Appl Biosci. 1995 Jun;11(3):253-60. doi: 10.1093/bioinformatics/11.3.253.

Cited by

1
DNSS2: Improved ab initio protein secondary structure prediction using advanced deep learning architectures.
Proteins. 2021 Feb;89(2):207-217. doi: 10.1002/prot.26007. Epub 2020 Sep 16.
2
PROSHIFT: protein chemical shift prediction using artificial neural networks.
J Biomol NMR. 2003 May;26(1):25-37. doi: 10.1023/a:1023060720156.
3
Environmental features are important in determining protein secondary structure.
Protein Sci. 2001 Jun;10(6):1172-7. doi: 10.1110/ps.420101.
4
Predicting protein secondary structure with probabilistic schemata of evolutionarily derived information.
Protein Sci. 1997 Sep;6(9):1963-75. doi: 10.1002/pro.5560060917.
5
Improving protein secondary structure prediction with aligned homologous sequences.
Protein Sci. 1996 Jan;5(1):106-13. doi: 10.1002/pro.5560050113.
6
A preference-based free-energy parameterization of enzyme-inhibitor binding. Applications to HIV-1-protease inhibitor design.
Protein Sci. 1995 Sep;4(9):1881-903. doi: 10.1002/pro.5560040923.
7
Predicting secondary structures of membrane proteins with neural networks.
Eur Biophys J. 1993;22(1):41-51. doi: 10.1007/BF00205811.
8
Improved prediction of protein secondary structure by use of sequence profiles and neural networks.
Proc Natl Acad Sci U S A. 1993 Aug 15;90(16):7558-62. doi: 10.1073/pnas.90.16.7558.
9
Development of artificial neural filters for pattern recognition in protein sequences.
J Mol Evol. 1993 Jun;36(6):586-95. doi: 10.1007/BF00556363.
10
Self-organized neural maps of human protein sequences.
Protein Sci. 1994 Mar;3(3):507-21. doi: 10.1002/pro.5560030316.