An MLP-based feature subset selection for HIV-1 protease cleavage site analysis.

Affiliation

Department of Computer Science Education, Korea University, Seoul 136-701, Republic of Korea.

Publication information

Artif Intell Med. 2010 Feb-Mar;48(2-3):83-9. doi: 10.1016/j.artmed.2009.07.010. Epub 2009 Nov 27.

DOI: 10.1016/j.artmed.2009.07.010
PMID: 19945261

Abstract

OBJECTIVE

In recent years, several machine learning approaches have been applied to modeling the specificity of the human immunodeficiency virus type 1 (HIV-1) protease cleavage domain. However, the high dimensional domain dataset contains a small number of samples, which could misguide classification modeling and its interpretation. Appropriate feature selection can alleviate the problem by eliminating irrelevant and redundant features, and thus improve prediction performance.

METHODS

We introduce a new feature subset selection method, FS-MLP, that selects relevant features using multi-layered perceptron (MLP) learning. The method first trains an MLP on a training dataset and then selects a feature subset by analyzing the trained MLP with a decompositional approach. Our method is able to select a subset of relevant features in high-dimensional, multi-variate and non-linear domains.
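
The abstract does not state how the trained MLP is decomposed, so the following is only a minimal sketch of the second stage under stated assumptions. It assumes a single hidden layer and one output unit, and it scores each input feature by summing the products of absolute input-to-hidden and hidden-to-output weights. That scoring rule is a simple weight-magnitude heuristic chosen for illustration, not the FS-MLP rule itself. The function names `weight_relevance` and `select_top_k` and the weight values are hypothetical.

```python
from typing import List, Sequence


def weight_relevance(w_in_hidden: Sequence[Sequence[float]],
                     w_hidden_out: Sequence[float]) -> List[float]:
    """Score each input feature from the weights of an already trained MLP.

    w_in_hidden[i][j] is the weight from input feature i to hidden unit j.
    w_hidden_out[j] is the weight from hidden unit j to the single output.
    ASSUMPTION: relevance = sum_j |w_in_hidden[i][j]| * |w_hidden_out[j]|,
    normalized so that all scores sum to 1. This is an illustrative
    heuristic, not the decomposition used by FS-MLP.
    """
    raw = [
        sum(abs(w) * abs(v) for w, v in zip(row, w_hidden_out))
        for row in w_in_hidden
    ]
    total = sum(raw)
    if total == 0:
        return [0.0] * len(raw)
    return [r / total for r in raw]


def select_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring features, in index order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical weights for 4 input features and 2 hidden units.
W1 = [
    [2.0, -1.0],   # feature 0
    [0.1, 0.0],    # feature 1
    [-1.5, 2.5],   # feature 2
    [0.0, 0.2],    # feature 3
]
W2 = [1.0, -2.0]

scores = weight_relevance(W1, W2)
selected = select_top_k(scores, k=2)
print(selected)
```

With these made-up weights, features 0 and 2 carry most of the weight mass and are the two that get selected. In practice the weights would come from an MLP trained on the training dataset, and the subset size would need to be chosen by some validation criterion.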

RESULTS

Using five artificial datasets that represent four data types, we compared the performance of FS-MLP against seven other feature selection methods. Experimental results showed that FS-MLP is superior in high-dimensional, multi-variate and non-linear domains. In experiments with the HIV-1 protease cleavage dataset, FS-MLP selected a set of 14 highly relevant features from the 160 original features. On a validation set of 131 test instances, classifiers that used the 14 features achieved about 95% accuracy, outperforming the other seven methods in both accuracy and number of features.
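
The abstract does not say how the 160 original features are derived. One encoding that yields exactly 160 inputs, assumed here purely for illustration, is a one-hot (orthogonal) encoding of an 8-residue window over the 20 standard amino acids (8 × 20 = 160). The helper name `one_hot_octamer` is hypothetical.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def one_hot_octamer(peptide: str) -> List[int]:
    """Encode an 8-residue peptide as 8 blocks of 20 binary indicators.

    ASSUMPTION: this 8 x 20 layout is one plausible source of 160 features;
    the abstract does not confirm the encoding that was actually used.
    """
    if len(peptide) != 8:
        raise ValueError("expected a peptide of exactly 8 residues")
    vector: List[int] = []
    for residue in peptide:
        index = AMINO_ACIDS.index(residue)  # raises ValueError if non-standard
        block = [0] * len(AMINO_ACIDS)
        block[index] = 1
        vector.extend(block)
    return vector


# Example octamer, used here only as sample input.
features = one_hot_octamer("SQNYPIVQ")
print(len(features), sum(features))
```

Under this assumed encoding, each selected feature would correspond to a specific residue at a specific window position, which is one way a 14-feature subset could be read as statements about the cleavage site.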

CONCLUSIONS

Our experimental results indicate that FS-MLP is effective in analyzing multi-variate, non-linear and high-dimensional datasets such as the HIV-1 protease cleavage dataset. The 14 relevant features selected by FS-MLP also provide useful insights into the HIV-1 cleavage site domain. FS-MLP is a useful method for computational sequence analysis in general.

Similar articles

1. An MLP-based feature subset selection for HIV-1 protease cleavage site analysis.
   Artif Intell Med. 2010 Feb-Mar;48(2-3):83-9. doi: 10.1016/j.artmed.2009.07.010. Epub 2009 Nov 27.
2. Why neural networks should not be used for HIV-1 protease cleavage site prediction.
   Bioinformatics. 2004 Jul 22;20(11):1702-9. doi: 10.1093/bioinformatics/bth144. Epub 2004 Feb 26.
3. Prediction of HIV-1 protease cleavage site using a combination of sequence, structural, and physicochemical features.
   BMC Bioinformatics. 2016 Dec 23;17(Suppl 17):478. doi: 10.1186/s12859-016-1337-6.
4. Seminal quality prediction using data mining methods.
   Technol Health Care. 2014;22(4):531-45. doi: 10.3233/THC-140816.
5. Specificity rule discovery in HIV-1 protease cleavage site analysis.
   Comput Biol Chem. 2008 Feb;32(1):71-8. doi: 10.1016/j.compbiolchem.2007.09.006. Epub 2007 Sep 29.
6. HIV-1 protease cleavage site prediction based on amino acid property.
   J Comput Chem. 2009 Jan 15;30(1):33-9. doi: 10.1002/jcc.21024.
7. A consistency-based feature selection method allied with linear SVMs for HIV-1 protease cleavage site prediction.
   PLoS One. 2013 Aug 23;8(8):e63145. doi: 10.1371/journal.pone.0063145. eCollection 2013.
8. Predicting human immunodeficiency virus protease cleavage sites in nonlinear projection space.
   Mol Cell Biochem. 2010 Jun;339(1-2):127-33. doi: 10.1007/s11010-009-0376-y. Epub 2010 Jan 7.
9. Feature Selection Combined with Neural Network Structure Optimization for HIV-1 Protease Cleavage Site Prediction.
   Biomed Res Int. 2015;2015:263586. doi: 10.1155/2015/263586. Epub 2015 Apr 15.
10. HIV-1 protease cleavage site prediction based on two-stage feature selection method.
    Protein Pept Lett. 2013 Mar;20(3):290-8. doi: 10.2174/0929866511320030007.

Cited by

1. Surgical Methods and Social Factors Are Associated With Long-Term Survival in Follicular Thyroid Carcinoma: Construction and Validation of a Prognostic Model Based on Machine Learning Algorithms.
   Front Oncol. 2022 Jun 21;12:816427. doi: 10.3389/fonc.2022.816427. eCollection 2022.
2. RGIFE: a ranked guided iterative feature elimination heuristic for the identification of biomarkers.
   BMC Bioinformatics. 2017 Jun 30;18(1):322. doi: 10.1186/s12859-017-1729-2.
3. The importance of physicochemical characteristics and nonlinear classifiers in determining HIV-1 protease specificity.
   Bioengineered. 2016 Apr 2;7(2):65-78. doi: 10.1080/21655979.2016.1149271.
4. A consistency-based feature selection method allied with linear SVMs for HIV-1 protease cleavage site prediction.
   PLoS One. 2013 Aug 23;8(8):e63145. doi: 10.1371/journal.pone.0063145. eCollection 2013.