HIV-1 protease cleavage site prediction based on two-stage feature selection method.

Authors

Niu Bing, Yuan Xiao-Cheng, Roeper Preston, Su Qiang, Peng Chun-Rong, Yin Jing-Yuan, Ding Juan, Li HaiPeng, Lu Wen-Cong

Affiliation

College of Life Science, Shanghai University, Shanghai, People's Republic of China.

Publication information

Protein Pept Lett. 2013 Mar;20(3):290-8. doi: 10.2174/0929866511320030007.

Abstract

Understanding the mechanism of HIV protease cleavage specificity is critical to the design of specific and effective HIV inhibitors, and the search for such inhibitors depends on an accurate, robust, and rapid method for predicting cleavage sites in proteins. In this article, HIV-1 protease specificity was studied using correlation-based feature subset (CfsSubset) selection combined with a genetic algorithm. From an original data set containing 4,248 features, thirty important biochemical features were identified based on a jackknife test. Using the AdaBoost method with the thirty selected features, the prediction model yields an accuracy of 96.7% on the jackknife test and 92.1% on an independent test set, an increase in accuracy of 6.7% and 77.4%, respectively, over the original data set. Our feature selection scheme could be a useful technique for finding effective competitive inhibitors of HIV protease.
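As a rough illustration of two ideas named in the abstract, the sketch below scores a feature subset with the standard correlation-based feature selection (CFS) merit, k * r_cf / sqrt(k + k(k-1) * r_ff), and estimates accuracy with a jackknife (leave-one-out) test. It is not the authors' pipeline. The abstract does not say which correlation measure was used, so absolute Pearson correlation is an assumption here; the genetic-algorithm search over subsets is omitted; and a 1-nearest-neighbor classifier stands in for AdaBoost to keep the example dependency-free. All function names and the toy data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cfs_merit(columns, labels):
    """CFS merit of a feature subset, given one value list per feature.

    r_cf is the mean |feature-class| correlation and r_ff the mean
    |feature-feature| correlation; Pearson is an assumed stand-in measure.
    """
    k = len(columns)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(col, labels)) for col in columns) / k
    if k == 1:
        return r_cf
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def nearest_neighbor(train_rows, train_labels, query):
    """Stand-in classifier (NOT AdaBoost): label of the closest training row."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row, query))
    best = min(range(len(train_rows)), key=lambda i: sq_dist(train_rows[i]))
    return train_labels[best]


def jackknife_accuracy(rows, labels, fit_predict):
    """Leave-one-out accuracy: hold out each sample once and train on the rest."""
    correct = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        if fit_predict(train_rows, train_labels, rows[i]) == labels[i]:
            correct += 1
    return correct / len(rows)


# Toy data (hypothetical): one informative feature and one uninformative feature.
labels = [0, 0, 1, 1]
informative = [0.0, 0.1, 0.9, 1.0]
noise = [1.0, 0.0, 1.0, 0.0]

merit_single = cfs_merit([informative], labels)
merit_with_noise = cfs_merit([informative, noise], labels)
merit_with_copy = cfs_merit([informative, informative], labels)
loo_accuracy = jackknife_accuracy([[v] for v in informative], labels, nearest_neighbor)
```

On this toy data, adding the uninformative feature lowers the merit, while adding an exact copy of the informative feature leaves it essentially unchanged, which is the redundancy penalty that lets CFS favor small subsets. A search procedure such as a genetic algorithm would use a score of this kind as its fitness function.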

