Structural protein descriptors in 1-dimension and their sequence-based predictions.

Affiliation

Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada.

Publication information

Curr Protein Pept Sci. 2011 Sep;12(6):470-89. doi: 10.2174/138920311796957711.

DOI: 10.2174/138920311796957711
PMID: 21787299
Abstract

The last few decades have seen increasing interest in the development and application of 1-dimensional (1D) descriptors of protein structure. These descriptors project 3D structural features onto 1D strings of residue-wise structural assignments. They cover a wide range of structural aspects, including conformation of the backbone, burying depth/solvent exposure and flexibility of residues, and inter-chain residue-residue contacts. We perform a first-of-its-kind comprehensive comparative review of the existing 1D structural descriptors. We define, review and categorize ten structural descriptors, and we describe, summarize and contrast over eighty computational models that are used to predict these descriptors from protein sequences. We show that the majority of the recent sequence-based predictors utilize machine learning models, with the most popular being neural networks, support vector machines, hidden Markov models, and support vector and linear regressions. These methods provide high-throughput predictions, and most of them are accessible to a non-expert user via web servers and/or stand-alone software packages. We empirically evaluate several recent sequence-based predictors of secondary structure, disorder, and solvent accessibility descriptors using a benchmark set based on CASP8 targets. Our analysis shows that secondary structure can be predicted with over 80% accuracy and segment overlap (SOV); disorder with over 0.9 area under the ROC curve (AUC), 0.6 Matthews Correlation Coefficient (MCC), and 75% SOV; and relative solvent accessibility with a Pearson correlation coefficient (PCC) of 0.7 and MCC of 0.6 (0.86 when homology is used). We demonstrate that the secondary structure predicted from sequence without the use of homology modeling is as good as the structure extracted from the 3D folds predicted by top-performing template-based methods.
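To make the residue-wise metrics named in the abstract concrete, the following is a minimal Python sketch of three of them: per-residue accuracy for a 1D secondary-structure string, MCC for a binary disorder assignment, and PCC for real-valued relative solvent accessibility. It is not the authors' benchmark code. The function names, the three-state H/E/C alphabet, the D/O disorder labels, and all example strings and values are assumptions made up for illustration. SOV and AUC are omitted because they need more machinery.

```python
import math


def per_residue_accuracy(observed: str, predicted: str) -> float:
    """Fraction of residues whose predicted state matches the observed state."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


def matthews_cc(observed: str, predicted: str, positive: str = "D") -> float:
    """Matthews correlation coefficient for a binary per-residue assignment."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    tp = fp = tn = fn = 0
    for o, p in zip(observed, predicted):
        if o == positive and p == positive:
            tp += 1
        elif o != positive and p == positive:
            fp += 1
        elif o == positive and p != positive:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def pearson_cc(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two real-valued profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("PCC is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical 1D strings: H = helix, E = strand, C = coil.
observed_ss = "CCHHHHHHCCEEEEECC"
predicted_ss = "CCHHHHHCCCEEEECCC"
accuracy = per_residue_accuracy(observed_ss, predicted_ss)

# Hypothetical disorder assignment: D = disordered, O = ordered.
observed_dis = "DDDOOOOOODD"
predicted_dis = "DDOOOOOOODD"
mcc = matthews_cc(observed_dis, predicted_dis)

# Hypothetical relative solvent accessibility values in [0, 1].
pcc = pearson_cc([0.10, 0.40, 0.80, 0.30], [0.15, 0.35, 0.90, 0.20])
```

Each metric compares two equal-length per-residue sequences, which is what makes 1D descriptors convenient to benchmark: no structural superposition is needed, only position-by-position comparison.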

Similar articles

1
Structural protein descriptors in 1-dimension and their sequence-based predictions.
Curr Protein Pept Sci. 2011 Sep;12(6):470-89. doi: 10.2174/138920311796957711.
2
iFC²: an integrated web-server for improved prediction of protein structural class, fold type, and secondary structure content.
Amino Acids. 2011 Mar;40(3):963-73. doi: 10.1007/s00726-010-0721-1. Epub 2010 Aug 21.
3
Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments.
BMC Bioinformatics. 2008 Oct 10;9:430. doi: 10.1186/1471-2105-9-430.
4
Evaluation of methods for predicting the topology of beta-barrel outer membrane proteins and a consensus prediction method.
BMC Bioinformatics. 2005 Jan 12;6:7. doi: 10.1186/1471-2105-6-7.
5
A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence.
J Mol Biol. 1997 Apr 11;267(4):1026-38. doi: 10.1006/jmbi.1997.0924.
6
CONFOLD: Residue-residue contact-guided ab initio protein folding.
Proteins. 2015 Aug;83(8):1436-49. doi: 10.1002/prot.24829. Epub 2015 Jun 6.
7
Using multi-data hidden Markov models trained on local neighborhoods of protein structure to predict residue-residue contacts.
Bioinformatics. 2009 May 15;25(10):1264-70. doi: 10.1093/bioinformatics/btp149. Epub 2009 Mar 16.
8
SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues.
PLoS One. 2015 Jul 15;10(7):e0133260. doi: 10.1371/journal.pone.0133260. eCollection 2015.
9
Structural classification of proteins using texture descriptors extracted from the cellular automata image.
Amino Acids. 2017 Feb;49(2):261-271. doi: 10.1007/s00726-016-2354-5. Epub 2016 Oct 24.
10
Protein contact prediction by integrating deep multiple sequence alignments, coevolution and machine learning.
Proteins. 2018 Mar;86 Suppl 1(Suppl 1):84-96. doi: 10.1002/prot.25405. Epub 2017 Oct 31.

Cited by

1
Comprehensive assessment of AlphaFold's predictions of secondary structure and solvent accessibility at the amino acid-level in eukaryotic, bacterial and archaeal proteins.
Comput Struct Biotechnol J. 2025 May 29;27:2443-2449. doi: 10.1016/j.csbj.2025.05.047. eCollection 2025.
2
DescribePROT Database of Residue-Level Protein Structure and Function Annotations.
Methods Mol Biol. 2025;2867:169-184. doi: 10.1007/978-1-0716-4196-5_10.
3
TEMPRO: nanobody melting temperature estimation model using protein embeddings.
Sci Rep. 2024 Aug 17;14(1):19074. doi: 10.1038/s41598-024-70101-6.
4
Taxonomy-specific assessment of intrinsic disorder predictions at residue and region levels in higher eukaryotes, protists, archaea, bacteria and viruses.
Comput Struct Biotechnol J. 2024 Apr 27;23:1968-1977. doi: 10.1016/j.csbj.2024.04.059. eCollection 2024 Dec.
5
DescribePROT in 2023: more, higher-quality and experimental annotations and improved data download options.
Nucleic Acids Res. 2024 Jan 5;52(D1):D426-D433. doi: 10.1093/nar/gkad985.
6
Complementarity of the residue-level protein function and structure predictions in human proteins.
Comput Struct Biotechnol J. 2022 May 6;20:2223-2234. doi: 10.1016/j.csbj.2022.05.003. eCollection 2022.
7
DescribePROT: database of amino acid-level protein structure and function predictions.
Nucleic Acids Res. 2021 Jan 8;49(D1):D298-D308. doi: 10.1093/nar/gkaa931.
8
Analyzing IDPs in Interactomes.
Methods Mol Biol. 2020;2141:895-945. doi: 10.1007/978-1-0716-0524-0_46.
9
Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases.
Brief Bioinform. 2019 Sep 27;20(5):1878-1912. doi: 10.1093/bib/bby061.
10
RaptorX-Angle: real-value prediction of protein backbone dihedral angles through a hybrid method of clustering and deep learning.
BMC Bioinformatics. 2018 May 8;19(Suppl 4):100. doi: 10.1186/s12859-018-2065-x.