Charoenkwan Phasit, Chumnanpuen Pramote, Schaduangrat Nalini, Shoombuatong Watshara
Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand.
Department of Zoology, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand; Kasetsart University International College (KUIC), Kasetsart University, Bangkok 10900, Thailand.
J Mol Biol. 2025 Mar 15;437(6):168853. doi: 10.1016/j.jmb.2024.168853. Epub 2024 Nov 6.
Antiviral peptides (AVPs) are short chains of amino acids that can inhibit viral replication, prevent viral entry, or disrupt viral membranes. They are a promising basis for new antiviral therapies because they can target a broad spectrum of viruses, including those resistant to traditional antiviral drugs. However, traditional experimental methods for identifying AVPs are often costly and labour-intensive. Multiple computational methods have been introduced for the in silico identification of AVPs, but these methods still have certain shortcomings. In this study, we propose a novel stacked ensemble learning framework, termed Stack-AVP, for fast and accurate AVP identification. In Stack-AVP, we investigated heterogeneous prediction models, which were trained with 12 commonly used machine learning algorithms coupled with a wide range of feature encoding schemes. Subsequently, these prediction models were used to generate multi-view features providing class information and probability information. Finally, we applied our feature selection method to determine the best feature subset for constructing the final stacked model. Comparative assessments on the independent test dataset showed that Stack-AVP outperformed current state-of-the-art methods, with an accuracy of 0.930, an MCC of 0.860, and an AUC of 0.975. Furthermore, our multi-view features played a crucial role in improving AVP prediction performance. To help experimental scientists perform high-throughput identification of AVPs, the Stack-AVP prediction server is publicly accessible at https://pmlabqsar.pythonanywhere.com/Stack-AVP.
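The stacking idea described above (base models emit probability and class information, and a final model is trained on those outputs) can be illustrated with a minimal, standard-library-only sketch. It is not the Stack-AVP implementation. The two encodings, the residue groups, the centroid-based base learner, the logistic meta-learner, and the toy sequences are all assumptions chosen for brevity, and the feature selection step is omitted. In practice, base-model outputs would normally come from cross-validated predictions rather than from the training data itself.

```python
from collections import Counter
from math import exp, sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Hypothetical coarse residue groups: hydrophobic, basic, acidic, polar/other.
GROUPS = ("AVLIMFWC", "KRH", "DE", "STNQGPY")

def aac(seq):
    """Amino acid composition: relative frequency of each of the 20 residues."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]

def group_comp(seq):
    """Fraction of residues that fall in each coarse group."""
    return [sum(seq.count(a) for a in g) / len(seq) for g in GROUPS]

class CentroidModel:
    """Toy base learner (an assumption): scores by distance to class centroids."""
    def __init__(self, encode):
        self.encode = encode

    def fit(self, seqs, labels):
        self.centroids = {}
        for c in (0, 1):
            rows = [self.encode(s) for s, y in zip(seqs, labels) if y == c]
            self.centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, seq):
        x = self.encode(seq)
        d0, d1 = (sqrt(sum((a - b) ** 2 for a, b in zip(x, self.centroids[c])))
                  for c in (0, 1))
        # Closer to the class-1 centroid gives a score nearer 1.
        return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)

def multi_view_features(models, seq):
    """Concatenate probability and class information from every base model."""
    feats = []
    for m in models:
        p = m.predict_proba(seq)
        feats += [p, 1.0 if p >= 0.5 else 0.0]
    return feats

def predict_logistic(w, x):
    """Sigmoid of the linear score; the last weight is the bias."""
    z = w[-1] + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Plain stochastic-gradient logistic regression used as the meta-learner."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict_logistic(w, xi) - yi
            for j, xj in enumerate(xi):
                w[j] -= lr * err * xj
            w[-1] -= lr * err
    return w

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Made-up toy sequences for illustration only; these are NOT real AVP data.
train_seqs = ["KKLLKWLKKL", "RRWWRRFKKL", "KLAKLAKKLL",
              "DEDSGSEDTS", "SGSGDEDEGS", "TDESGSDEEG"]
train_labels = [1, 1, 1, 0, 0, 0]

# One base model per encoding; the paper combines 12 algorithms with many encodings.
models = [CentroidModel(enc).fit(train_seqs, train_labels)
          for enc in (aac, group_comp)]
X_meta = [multi_view_features(models, s) for s in train_seqs]
w = fit_logistic(X_meta, train_labels)
```

Each peptide is thus represented by two values per base model (a probability-like score and a hard class call), and the meta-learner only ever sees that compact multi-view vector. The `mcc` helper shows how the reported metric is computed from confusion-matrix counts; for example, hypothetical balanced counts of `mcc(93, 93, 7, 7)` evaluate to 0.86.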