A study of quality measures for protein threading models.

Author information

Cristobal S, Zemla A, Fischer D, Rychlewski L, Elofsson A

Affiliation

Cell and Molecular Biology Department, Box 596. BMC Uppsala University, SE-751 24 Uppsala, Sweden.

Publication information

BMC Bioinformatics. 2001;2:5. doi: 10.1186/1471-2105-2-5. Epub 2001 Aug 1.

DOI:10.1186/1471-2105-2-5

PMID:11545673

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC55330/

Abstract

BACKGROUND

Prediction of protein structures is one of the fundamental challenges in biology today. To fully understand how well different prediction methods perform, it is necessary to use measures that evaluate their performance. Every two years, starting in 1994, the CASP (Critical Assessment of protein Structure Prediction) process has been organized to evaluate the ability of different predictors to blindly predict the structure of proteins. To capture different features of the models, several measures have been developed during the CASP processes. However, these measures have not been examined in detail before. In an attempt to develop fully automatic measures that can be used in CASP, as well as in other types of benchmarking experiments, we have compared twenty-one measures. These include the measures used in CASP3 and CASP2 as well as measures introduced later. We have studied their ability to distinguish between the better and worse models submitted to CASP3, and the correlation between them.

RESULTS

Using a small set of 1340 models for 23 different targets, we show that most measures correlate with each other. Most pairs of measures show a correlation coefficient of about 0.5. The correlation is slightly higher for measures of similar types. We found that a significant problem when developing automatic measures is how to deal with proteins of different lengths. Comparison between different measures is also complicated, as many measures depend on the size of the target. We show that the manual assessment can be reproduced to about 70% using automatic measures. Alignment-independent measures detect slightly more of the models with the correct fold, while alignment-dependent measures agree better when selecting the best models for each target. Finally, we show that using automatic measures would, to a large extent, reproduce the assessors' ranking of the predictors at CASP3.
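The pairwise comparison described above can be illustrated with a minimal sketch. It assumes each measure assigns one numeric score per model and uses the Pearson coefficient; the abstract does not state which correlation coefficient was used, and the measure names and score values below are hypothetical placeholders, not data from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(scores):
    """Map each pair of measure names to the correlation of their scores."""
    return {
        (a, b): pearson(scores[a], scores[b])
        for a, b in combinations(sorted(scores), 2)
    }


# Hypothetical scores for five models under three made-up measures.
scores = {
    "measure_a": [0.10, 0.35, 0.50, 0.70, 0.90],
    "measure_b": [0.20, 0.30, 0.55, 0.60, 0.95],
    "measure_c": [0.80, 0.10, 0.60, 0.30, 0.50],
}

for pair, r in pairwise_correlations(scores).items():
    print(pair, round(r, 2))
```

Because many measures depend on target size, a comparison like this is probably more meaningful when computed per target, or after normalizing each score by target length, than when models for targets of very different lengths are pooled.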

CONCLUSIONS

We show that, given a sufficient number of targets, the manual and automatic measures would have given almost identical results at CASP3. If the intent is to reproduce the type of scoring done by the manual assessor in CASP3, the best approach might be to use a combination of alignment-independent and alignment-dependent measures, as used in several recent studies.
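One simple way to combine the two kinds of measure is sketched below. This is an illustrative assumption, not the scheme used in the paper or in the studies it mentions: each measure is standardized to z-scores over the models for a single target, and the two z-scores are averaged. The function names and example values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) deviation."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        # All models scored identically, so the measure carries no ranking signal.
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


def combined_score(dependent, independent):
    """Average the z-scores of an alignment-dependent and an
    alignment-independent measure, model by model."""
    zd = z_scores(dependent)
    zi = z_scores(independent)
    return [(a + b) / 2 for a, b in zip(zd, zi)]


# Hypothetical scores for three models of one target.
dependent = [0.2, 0.8, 0.5]      # made-up alignment-dependent measure
independent = [0.3, 0.7, 0.8]    # made-up alignment-independent measure

combined = combined_score(dependent, independent)
best = max(range(len(combined)), key=combined.__getitem__)
print("best model index:", best)
```

Standardizing within a target puts both measures on a common scale before averaging, which sidesteps some of the target-size dependence noted in the results.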
