

Using feature optimization-based support vector machine method to recognize the β-hairpin motifs in enzymes.

Author information

Li Dongmei, Hu Xiuzhen, Liu Xingxing, Feng Zhenxing, Ding Changjiang

Affiliation

College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China.

Publication information

Saudi J Biol Sci. 2017 Sep;24(6):1361-1369. doi: 10.1016/j.sjbs.2016.11.014. Epub 2016 Nov 28.

Abstract

β-Hairpins in enzymes, a special class of proteins with catalytic functions, contain many binding sites that are essential for enzyme function. With the growing number of known enzyme sequences, it is especially important to use bioinformatics techniques to identify β-hairpins in enzymes quickly and accurately, in support of further annotation of enzyme structure and function. In this work, the proposed method was trained and tested on a non-redundant enzyme β-hairpin database containing 2818 β-hairpins and 1098 non-β-hairpins. With 5-fold cross-validation on the training dataset, an overall accuracy of 90.08% and a Matthews correlation coefficient (Mcc) of 0.74 were obtained; on the independent test dataset, the overall accuracy was 88.93% and the Mcc was 0.76. The method was further validated on 845 β-hairpins with ligand binding sites: 5-fold cross-validation on the training dataset gave an overall accuracy of 85.82% (Mcc of 0.71), and the independent test on the test dataset gave 84.78% (Mcc of 0.70). By combining mRMR (minimum redundancy maximum relevance) feature selection with the SVM algorithm, the method achieved reasonably high accuracy, indicating that it is an effective tool for further studies of β-hairpins in enzyme structures. Additionally, as a novel contribution to enzyme function prediction, β-hairpins with ligand binding sites were predicted. Based on this work, a web server was constructed to predict β-hairpin motifs in enzymes (http://202.207.29.251:8080/).
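The abstract reports two metrics, overall accuracy and Mcc. Because the two classes differ in size (2818 β-hairpins versus 1098 non-β-hairpins), the Mcc is a useful complement to accuracy. The following is a minimal sketch of how both are conventionally computed from confusion-matrix counts. The function names and the example counts are hypothetical and are not taken from the paper, which does not list its confusion matrices here.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion counts.

    Returns 0.0 when any marginal sum is zero, a common convention
    for the otherwise undefined 0/0 case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not results from the paper).
    tp, tn, fp, fn = 90, 80, 20, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")
    print(f"Mcc      = {matthews_corrcoef(tp, tn, fp, fn):.4f}")
```

In a 5-fold cross-validation setting, one common choice is to sum the counts over the five held-out folds before applying these formulas; whether the authors pooled counts or averaged per-fold scores is not stated in the abstract.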


