Charoenkwan Phasit, Schaduangrat Nalini, Lio Pietro, Moni Mohammad Ali, Chumnanpuen Pramote, Shoombuatong Watshara
Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand.
Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.
ACS Omega. 2022 Nov 2;7(45):41082-41095. doi: 10.1021/acsomega.2c04465. eCollection 2022 Nov 15.
Antimalarial peptides (AMAPs) vary in length, amino acid composition, charge, conformational structure, hydrophobicity, and amphipathicity, reflecting the diversity of their antimalarial mechanisms. Because antimicrobial resistance is a major worldwide health problem, these peptides hold great therapeutic value owing to their low incidence of drug resistance compared with conventional antibiotics. Although well-established experimental methods can precisely determine the antimalarial activity of peptides, they remain time-consuming and costly. Machine learning (ML)-based methods that can rapidly identify AMAPs from sequence information alone would therefore benefit the high-throughput identification of AMAPs. In this study, we propose the first computational model (termed iAMAP-SCM) for the large-scale identification and characterization of peptides with antimalarial activity using only sequence information. Specifically, we employed an interpretable scoring card method (SCM) to develop iAMAP-SCM and to estimate, in a supervised manner, the propensities of the 20 amino acids and 400 dipeptides to be AMAPs. Experimental results showed that iAMAP-SCM achieved a maximum accuracy of 0.957 and a Matthews correlation coefficient (MCC) of 0.834 on the independent test dataset. In addition, the SCM-derived propensities of the 20 amino acids and selected physicochemical properties were used to provide an understanding of the functional mechanisms of AMAPs. Finally, a user-friendly online computational platform for iAMAP-SCM is publicly available at http://pmlabstack.pythonanywhere.com/iAMAP-SCM. The iAMAP-SCM predictor is anticipated to assist experimental scientists in the high-throughput identification of potential AMAP candidates for the treatment of malaria and other clinical applications.
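To illustrate how a scoring card method can classify a peptide from sequence alone, the following is a minimal sketch, not the authors' implementation. It assumes the common SCM formulation in which a peptide's score is the sum of its dipeptide composition weighted by per-dipeptide propensity scores, and the peptide is called an AMAP when the score reaches a threshold. The function names, the toy propensity table, and the threshold are all hypothetical; the propensities in iAMAP-SCM are estimated from training data and are not reproduced here. The MCC helper uses the standard confusion-matrix definition.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard amino acids.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(seq):
    """Return the fraction of each of the 400 dipeptides in seq."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return dict.fromkeys(DIPEPTIDES, 0.0)
    return {dp: c / total for dp, c in counts.items()}


def scm_score(seq, propensity):
    """Composition-weighted sum of dipeptide propensity scores.

    `propensity` maps dipeptide -> score; missing entries count as 0
    (an assumption of this sketch).
    """
    comp = dipeptide_composition(seq)
    return sum(comp[dp] * propensity.get(dp, 0.0) for dp in DIPEPTIDES)


def predict_amap(seq, propensity, threshold):
    """Call a peptide an AMAP when its score reaches the threshold."""
    return scm_score(seq, propensity) >= threshold


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy, made-up propensities purely for demonstration.
    toy_propensity = {"KK": 1000.0, "AA": 0.0}
    for peptide in ("KKKK", "AKKA", "AAAA"):
        score = scm_score(peptide, toy_propensity)
        print(peptide, round(score, 1), predict_amap(peptide, toy_propensity, 300.0))
```

Because the score is a plain weighted sum, each dipeptide's contribution can be read off directly, which is what makes this style of model interpretable.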