Li Qingwen, Zhou Wenyang, Wang Donghua, Wang Sui, Li Qingyuan
College of Animal Science and Technology, Northeast Agricultural University, Harbin, China.
Center for Bioinformatics, School of Life Sciences and Technology, Harbin Institute of Technology, Harbin, China.
Front Bioeng Biotechnol. 2020 Aug 12;8:892. doi: 10.3389/fbioe.2020.00892. eCollection 2020.
Cancer remains a severe health problem worldwide. Cancer therapy traditionally relies on radiotherapy or anticancer drugs to kill cancer cells, but these methods are expensive and have side effects that can seriously harm patients. The discovery of anticancer peptides (ACPs) has led to significant progress in tumor therapy, so accurately identifying anticancer peptides is valuable. Biochemical experiments can identify them, but this approach is expensive and time-consuming. To promote the application of anticancer peptides in cancer therapy, machine learning can be used to recognize them from feature vectors extracted from their sequences. In practice, however, machine learning models trained on high-dimensional features often perform poorly. To address this problem, this paper proposes a 19-dimensional feature model based on anticancer peptide sequences, which has lower dimensionality and better performance than some existing methods. In addition, this paper derives a model with even fewer dimensions and acceptable performance. The few features identified in this study may represent the important features of anticancer peptides.
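As an illustration of what extracting a feature vector from a peptide sequence can look like, the sketch below computes amino acid composition, a common sequence descriptor. This is an assumption chosen for illustration only: the abstract does not state which 19 features the proposed model uses, and the function name `amino_acid_composition` and the example sequence are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every
# peptide maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the fraction of each standard amino acid in a peptide.

    Illustrative descriptor only; not the 19-dimensional feature set
    described in the abstract, which is not specified there.
    """
    seq = sequence.upper()
    # Count only standard residues; ambiguous symbols such as X are ignored.
    counts = Counter(residue for residue in seq if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical peptide sequence, used only to show the call.
features = amino_acid_composition("AKKLAKLAK")
```

Each peptide becomes a fixed-length numeric vector of this kind, which a classifier can then be trained on. A low-dimensional model such as the one proposed would use a small, selected set of descriptors rather than every available one.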