Predicting secondary structures, contact numbers, and residue-wise contact orders of native protein structures from amino acid sequences using critical random networks.

Author information

Kinjo Akira R, Nishikawa Ken

Affiliations

Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Mishima 411-8540, Japan; Department of Genetics, The Graduate University for Advanced Studies (SOKENDAI), Mishima 411-8540, Japan.

Publication information

Biophysics (Nagoya-shi). 2005 Nov 22;1:67-74. doi: 10.2142/biophysics.1.67. eCollection 2005.

DOI:10.2142/biophysics.1.67

PMID:27857554

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5036631/

Abstract

Predictions of one-dimensional protein structures such as secondary structures and contact numbers are useful for predicting three-dimensional structure and important for understanding the sequence-structure relationship. Here we present a new machine-learning method, critical random networks (CRNs), for predicting one-dimensional structures, and apply it, with position-specific scoring matrices, to the prediction of secondary structures (SS), contact numbers (CN), and residue-wise contact orders (RWCO). The present method achieves, on average, accuracy of 77.8% for SS, and correlation coefficients of 0.726 and 0.601 for CN and RWCO, respectively. The accuracy of the SS prediction is comparable to that obtained with other state-of-the-art methods, and accuracy of the CN prediction is a significant improvement over that with previous methods. We give a detailed formulation of the critical random networks-based prediction scheme, and examine the context-dependence of prediction accuracies. In order to study the nonlinear and multi-body effects, we compare the CRNs-based method with a purely linear method based on position-specific scoring matrices. Although not superior to the CRNs-based method, the surprisingly good accuracy achieved by the linear method highlights the difficulty in extracting structural features of higher order from an amino acid sequence beyond the information provided by the position-specific scoring matrices.
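The two kinds of figures quoted in the abstract (a percentage accuracy for SS and correlation coefficients for CN and RWCO) can be illustrated with a minimal sketch. It assumes that the SS accuracy is the fraction of residues whose predicted state matches the observed state and that the correlation is the Pearson coefficient between predicted and observed per-residue values; the abstract does not spell out either definition. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from math import sqrt


def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Fraction of positions where the predicted state equals the observed state.

    Assumption: states are single letters such as H (helix), E (strand), C (coil).
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy values for illustration only; they are not results from the paper.
    print(per_residue_accuracy("HHHEEC", "HHHECC"))
    print(pearson([4.0, 7.0, 9.0, 5.0], [5.0, 6.0, 10.0, 4.0]))
```

In practice such scores would be computed per protein chain and then averaged over a test set, which matches the "on average" wording in the abstract; the exact averaging scheme used by the authors is not stated here.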





