EnACP: An Ensemble Learning Model for Identification of Anticancer Peptides.

Author information

Ge Ruiquan, Feng Guanwen, Jing Xiaoyang, Zhang Renfeng, Wang Pu, Wu Qing

Affiliations

Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China.

Xi'an Key Laboratory of Big Data and Intelligent Vision, School of Computer Science and Technology, Xidian University, Xi'an, China.

Publication information

Front Genet. 2020 Jul 30;11:760. doi: 10.3389/fgene.2020.00760. eCollection 2020.

Abstract

Cancer remains one of the main threats to human life, so developing efficient cancer treatments is urgent. Anticancer peptides, which could overcome the significant side effects and poor outcomes of traditional cancer treatments, have emerged in recent years as a potential alternative. However, identifying anticancer peptides experimentally is time-consuming and resource-intensive, so it is important to develop effective computational tools that quickly and accurately identify potential anticancer peptides from amino acid sequences. For most current computational methods, feature representation plays a key role in final performance. This study proposes a novel, fast, and accurate approach to identifying anticancer peptides that combines diversified feature representations with ensemble learning. The feature representations encode information from multiple feature spaces, including sequence composition, sequence order, and physicochemical properties, among others. To better model the potential relationships among peptides, multiple LightGBM ensemble classifiers are first applied to the different feature sets. Their outputs are then used as inputs to a support vector machine classifier, which identifies anticancer peptides. Experimental results on cross-validation and independent test sets show that the method achieves better or comparable performance relative to other state-of-the-art methods.
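
To make the idea of separate feature sets concrete, here is a minimal Python sketch of two sequence-derived encodings: amino acid composition and dipeptide composition. It is an illustration only. The abstract does not list the exact descriptors EnACP uses, so the function names, the choice of these two encodings, and the example peptide are all assumptions. In the scheme the abstract describes, each such feature set would go to its own LightGBM classifier, and the classifier outputs would then feed a support vector machine; that stage is not shown here.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in a fixed order so that vector positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Fraction of each standard residue in the peptide (20 values).

    Hypothetical helper for illustration; non-standard residues are ignored
    in the output vector but still count toward the peptide length.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("peptide sequence must not be empty")
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each ordered pair of adjacent residues (400 values).

    Unlike plain composition, this keeps some local sequence-order information.
    """
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("need at least two residues to form a dipeptide")
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = len(seq) - 1
    return [pairs[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


def feature_views(seq):
    """Return the separate feature sets for one peptide, keyed by name."""
    return {
        "aac": amino_acid_composition(seq),
        "dpc": dipeptide_composition(seq),
    }


if __name__ == "__main__":
    views = feature_views("AAKK")  # made-up peptide, not from the paper
    for name, vector in views.items():
        print(name, len(vector))
```

Keeping the feature sets in separate vectors, rather than concatenating them, mirrors the abstract's description of first-stage classifiers that each work on a different feature set.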
