CSmetaPred: a consensus method for prediction of catalytic residues.

Author information

Department of Biological Sciences, Indian Institute of Science Education and Research, Mohali, Knowledge City, Sector 81, SAS Nagar, Manauli PO 140306, India.

Laboratory of Biochemistry and Genetics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, 20892, USA.

Publication information

BMC Bioinformatics. 2017 Dec 22;18(1):583. doi: 10.1186/s12859-017-1987-z.

Abstract

BACKGROUND

Knowledge of catalytic residues can play an essential role in elucidating the mechanistic details of an enzyme. However, experimental identification of catalytic residues is a tedious and time-consuming task, which can be expedited by computational predictions. Despite significant development in active-site prediction methods, one remaining issue is where putative catalytic residues fall among all ranked residues. To improve the ranking of catalytic residues and their prediction accuracy, we have developed a meta-approach-based method, CSmetaPred. In this approach, residues are ranked by the mean of normalized residue scores derived from four well-known catalytic residue predictors. In a second meta-predictor, CSmetaPred_poc, the mean residue score of CSmetaPred is combined with predicted pocket information to improve prediction performance.
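The ranking step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which normalization is used, so min-max scaling is an assumption here, and the function names and predictor labels are hypothetical. The pocket-based adjustment of CSmetaPred_poc is left out because the abstract does not describe how it is applied.

```python
from statistics import mean


def min_max_normalize(scores):
    """Rescale raw scores to [0, 1] (assumed normalization; constant input maps to 0.0)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def consensus_rank(predictor_scores):
    """Rank residues by the mean of normalized per-predictor scores.

    predictor_scores: dict mapping a (hypothetical) predictor name to a list of
    raw scores, one per residue, all lists of equal length and with higher
    scores meaning "more likely catalytic".
    Returns (mean_scores, ranks), where rank 1 is the top-ranked residue.
    """
    normalized = [min_max_normalize(s) for s in predictor_scores.values()]
    n_residues = len(normalized[0])
    mean_scores = [mean(col[i] for col in normalized) for i in range(n_residues)]
    # Sort residue indices by descending mean score, then assign 1-based ranks.
    order = sorted(range(n_residues), key=lambda i: mean_scores[i], reverse=True)
    ranks = [0] * n_residues
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return mean_scores, ranks


if __name__ == "__main__":
    # Toy scores for three residues from two placeholder predictors.
    toy = {"predictor_a": [0.0, 5.0, 10.0], "predictor_b": [10.0, 20.0, 40.0]}
    scores, ranks = consensus_rank(toy)
    print(scores, ranks)
```

Normalizing before averaging keeps a predictor with a wide raw score range from dominating the mean, which is the usual motivation for this kind of consensus score.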

RESULTS

Both meta-predictors are evaluated on two comprehensive benchmark datasets and three legacy datasets using Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves. Visual and quantitative analysis of the ROC and PR curves shows that the meta-predictors outperform their constituent methods and that CSmetaPred_poc is the best of the evaluated methods. For instance, on the CSAMAC dataset, CSmetaPred_poc (CSmetaPred) achieves the highest Mean Average Specificity (MAS), a scalar measure of the ROC curve, of 0.97 (0.96). Importantly, the median predicted rank of catalytic residues is lowest (best) for CSmetaPred_poc. Treating residues ranked ≤20 as true positives in a binary classification, CSmetaPred_poc achieves a prediction accuracy of 0.94 on the CSAMAC dataset. Moreover, on the same dataset, CSmetaPred_poc places all catalytic residues within the top 20 ranks for ~73% of enzymes. Furthermore, benchmarking on comparative modelled structures showed that predictions from models are better than predictions based on sequence alone. These analyses suggest that CSmetaPred_poc can rank putative catalytic residues at lower (better) positions, which can facilitate and expedite their experimental characterization.
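The rank-cutoff evaluation mentioned above can be sketched for a single enzyme as follows. This is an illustrative reading of the abstract, not the authors' evaluation code: the function name and return keys are hypothetical, residues ranked at or below the cutoff are assumed to be the predicted positives, and accuracy is assumed to be (TP + TN) / N. The abstract does not state whether accuracy is pooled over residues or averaged per enzyme.

```python
def top_k_metrics(ranks, catalytic, k=20):
    """Binary classification at a rank cutoff for one enzyme.

    ranks: list of 1-based ranks, one per residue (rank 1 = top prediction).
    catalytic: set of residue indices (0-based) annotated as catalytic.
    k: rank cutoff; residues with rank <= k are treated as predicted positives.
    """
    predicted = {i for i, r in enumerate(ranks) if r <= k}
    tp = len(predicted & catalytic)   # catalytic residues inside the top k
    fp = len(predicted - catalytic)   # non-catalytic residues inside the top k
    fn = len(catalytic - predicted)   # catalytic residues ranked below the cutoff
    tn = len(ranks) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(ranks),
        "all_catalytic_in_top_k": fn == 0,
    }


if __name__ == "__main__":
    # Toy enzyme with five residues; residues 0 and 3 are catalytic.
    print(top_k_metrics([1, 2, 3, 4, 5], {0, 3}, k=2))
    print(top_k_metrics([1, 2, 3, 4, 5], {0, 3}, k=4))
```

Under these assumptions, the fraction of enzymes for which `all_catalytic_in_top_k` is true corresponds to the kind of per-enzyme statistic reported above for the top 20 ranks.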

CONCLUSIONS

The benchmarking studies showed that a meta-approach combining residue-level scores derived from well-known catalytic residue predictors can improve prediction accuracy and the ranked positions of known catalytic residues. Hence, such predictions can help experimentalists prioritize residues for mutational studies in their efforts to characterize catalytic residues. Both meta-predictors are available as a web server at: http://14.139.227.206/csmetapred/ .
