Lin Weizhong, Xu Dong
Information Engineering School, Jingdezhen Ceramic Institute, Jingdezhen 333406, China.
Department of Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.
Bioinformatics. 2016 Dec 15;32(24):3745-3752. doi: 10.1093/bioinformatics/btw560. Epub 2016 Aug 26.
With resistance to antibiotics increasing rapidly, novel infection therapeutics are urgently needed. In recent years, antimicrobial peptides (AMPs) have been explored as potential alternatives for infection therapy. AMPs are key components of the innate immune system and can protect the host from various pathogenic bacteria. Identifying AMPs and their functional types has been the subject of many studies, and various machine-learning predictors have been developed. However, there is room for improvement; in particular, no existing predictor accounts for the imbalance among the different functional types of AMPs.
In this paper, a new synthetic minority over-sampling technique for imbalanced, multi-label datasets, referred to as ML-SMOTE, was designed for processing and identifying AMPs' functional families. A novel multi-label classifier, MLAMP, was also developed using ML-SMOTE and grey pseudo amino acid composition. The classifier obtained a subset accuracy of 0.4846 and a Hamming loss of 0.16.
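The two reported metrics can be illustrated with a minimal sketch under their common multi-label definitions. Subset accuracy is the fraction of samples whose predicted label set matches the true set exactly, and Hamming loss is the fraction of individual label decisions that are wrong. The abstract does not state the exact formulas used for MLAMP, so these definitions, the function names, and the toy labels below are assumptions for illustration only.

```python
from typing import Sequence, Set


def subset_accuracy(y_true: Sequence[Set[str]], y_pred: Sequence[Set[str]]) -> float:
    """Fraction of samples whose predicted label set equals the true set exactly."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return exact / len(y_true)


def hamming_loss(y_true: Sequence[Set[str]], y_pred: Sequence[Set[str]], n_labels: int) -> float:
    """Fraction of wrong label decisions, averaged over samples and labels."""
    # The symmetric difference counts labels that are missed or wrongly added.
    wrong = sum(len(t ^ p) for t, p in zip(y_true, y_pred))
    return wrong / (len(y_true) * n_labels)


# Hypothetical toy data with three functional labels (not taken from the paper).
y_true = [{"antibacterial"}, {"antibacterial", "antifungal"}, {"antiviral"}, {"antifungal"}]
y_pred = [{"antibacterial"}, {"antibacterial"}, {"antiviral", "antifungal"}, {"antifungal"}]

acc = subset_accuracy(y_true, y_pred)          # 2 of 4 samples match exactly
loss = hamming_loss(y_true, y_pred, n_labels=3)  # 2 wrong decisions out of 12
print(f"subset accuracy = {acc:.4f}, Hamming loss = {loss:.4f}")
```

Subset accuracy is strict, because a prediction with a single wrong label counts as a complete miss, which is why it can be low even when the Hamming loss is small.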
A user-friendly web-server for MLAMP was established at http://www.jci-bioinfo.cn/MLAMP.
Contacts: linweizhong@jci.edu.cn or xudong@missouri.edu.