MLACP: machine-learning-based prediction of anticancer peptides.

Authors

Manavalan Balachandran, Basith Shaherin, Shin Tae Hwan, Choi Sun, Kim Myeong Ok, Lee Gwang

Affiliations

Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea.

College of Pharmacy, Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul, Republic of Korea.

Publication information

Oncotarget. 2017 Aug 19;8(44):77121-77136. doi: 10.18632/oncotarget.20365. eCollection 2017 Sep 29.

Abstract

Cancer is the second leading cause of death globally, and the use of therapeutic peptides to target and kill cancer cells has received considerable attention in recent years. Identifying anticancer peptides (ACPs) through wet-lab experimentation is expensive and often time-consuming; an efficient computational method is therefore essential for identifying potential ACP candidates prior to experimentation. In this study, we developed support vector machine- and random forest-based machine-learning methods for predicting ACPs using features calculated from the amino acid sequence, including amino acid composition, dipeptide composition, atomic composition, and physicochemical properties. We trained our methods on the Tyagi-B dataset and determined the machine-learning parameters by 10-fold cross-validation. We then evaluated the methods on two benchmarking datasets; the random forest-based method outperformed existing methods, with an average accuracy of 88.7% and an average Matthews correlation coefficient of 0.78. To assist the scientific community, we also developed a publicly accessible web server at www.thegleelab.org/MLACP.html.
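
To make two of the sequence-derived features and the two reported metrics concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the normalization by sequence length, and the handling of non-standard residues are assumptions made for illustration. Atomic composition, physicochemical properties, and the classifiers themselves are left out.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# All 400 ordered residue pairs, in a fixed order so feature vectors align.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return 20 fractions: occurrences of each standard residue / sequence length.

    Assumption: non-standard letters count toward the length but get no feature.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 fractions: occurrences of each ordered pair / number of adjacent pairs."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence needs at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    n_pairs = len(seq) - 1
    for i in range(n_pairs):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard letters
            counts[pair] += 1
    return [counts[d] / n_pairs for d in DIPEPTIDES]


def accuracy_and_mcc(y_true, y_pred):
    """Accuracy and Matthews correlation coefficient for binary labels (1 = ACP, 0 = non-ACP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


if __name__ == "__main__":
    peptide = "AKKA"  # made-up toy sequence, not a known ACP
    features = amino_acid_composition(peptide) + dipeptide_composition(peptide)
    print(len(features))  # 20 + 400 features per peptide
    print(accuracy_and_mcc([1, 1, 0, 0], [1, 0, 0, 0]))
```

Concatenating the two vectors gives a fixed-length representation of a peptide regardless of its length, which is what lets classifiers such as a support vector machine or a random forest be trained on variable-length sequences.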
