SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines.

Affiliation

Department of Computer Science, Informatics Institute, Christopher S. Bond Life Science Center, University of Missouri, Columbia, MO 65211, USA.

Publication information

BMC Bioinformatics. 2014 Apr 28;15:120. doi: 10.1186/1471-2105-15-120.

DOI: 10.1186/1471-2105-15-120
PMID: 24776231
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4013430/
Abstract

BACKGROUND

It is important to predict the quality of a protein structural model before its native structure is known. Methods that can predict the absolute local quality of individual residues in a single protein model are rare, yet they are particularly needed for using, ranking and refining protein models.

RESULTS

We developed a machine learning tool (SMOQ) that can predict the distance deviation of each residue in a single protein model. To make predictions, SMOQ uses support vector machines (SVM) with protein sequence and structural features (the basic feature set), including amino acid sequence, secondary structures, solvent accessibilities, and residue-residue contacts. We also trained an SVM model with two additional features (profiles and SOV scores) on 20 CASP8 targets and found that including them improves performance only when the real deviation between the native structure and the model exceeds 5 Å. The released version of SMOQ uses the basic feature set and was trained on 85 CASP8 targets. Moreover, SMOQ implements a way to convert predicted local quality scores into a global quality score. SMOQ was tested on the 84 CASP9 single-domain targets. The average difference between the residue-specific distance deviation predicted by our method and the actual distance deviation on the test data is 2.637 Å. The global quality prediction accuracy of the tool is comparable to that of other well-performing tools on the same benchmark.
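The two quantities mentioned above, the average difference between predicted and actual per-residue deviations and the conversion of local scores into a global score, can be illustrated with a short Python sketch. The abstract does not state the formula SMOQ uses for the conversion, so the S-score-style transform `1 / (1 + (d / d0)^2)` averaged over residues, the threshold `d0 = 5.0` Å, and the function names `mean_absolute_error` and `global_score` are all assumptions made for illustration, not the tool's confirmed implementation.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and actual
    per-residue distance deviations (both in angstroms)."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def global_score(deviations: Sequence[float], d0: float = 5.0) -> float:
    """Collapse per-residue deviations (angstroms) into one score in (0, 1].

    ASSUMPTION: an S-score-style transform, 1 / (1 + (d / d0)^2), averaged
    over all residues. Both the functional form and d0 = 5.0 are
    illustrative choices, not taken from the abstract.
    """
    if len(deviations) == 0:
        raise ValueError("deviations must be non-empty")
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in deviations) / len(deviations)


if __name__ == "__main__":
    # Hypothetical per-residue values, made up purely for this example.
    predicted = [1.0, 2.5, 6.0, 10.0]
    actual = [0.5, 3.0, 4.0, 12.0]
    print(mean_absolute_error(predicted, actual))
    print(global_score(predicted))
```

Under this assumed transform, a residue with zero predicted deviation contributes 1 to the average and a residue at the threshold contributes 0.5, so a model with uniformly small deviations scores close to 1.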

CONCLUSION

SMOQ is a useful tool for protein single model quality assessment. Its source code and executable are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/77a2/4013430/f5c965cd402f/1471-2105-15-120-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/77a2/4013430/982d73ed0e8b/1471-2105-15-120-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/77a2/4013430/5fd83f50d7d8/1471-2105-15-120-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/77a2/4013430/4eec89aabeb4/1471-2105-15-120-4.jpg
