Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand.
Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand.
Curr Pharm Des. 2021;27(18):2180-2188. doi: 10.2174/1381612826666201102105827.
In light of the growing resistance to current antiviral drugs, the discovery of novel and effective antiviral therapeutic agents remains a pressing scientific challenge. Antiviral peptides (AVPs) are promising therapeutic agents owing to their advantages in potency, efficacy and pharmacokinetic properties. The growing volume of newly discovered peptide sequences in the post-genomic era calls for computational approaches that can identify AVPs in a timely and accurate manner. Machine learning (ML) methods such as random forest and support vector machine are robust learning algorithms that have been instrumental in successful peptide-based drug discovery. This review therefore summarizes the current state-of-the-art applications of ML methods for identifying AVPs directly from sequence information. We compare these methods in terms of the underlying characteristics of the datasets used, feature encoding methods, ML algorithms, cross-validation methods and prediction performance. Finally, guidelines for the development of robust AVP models are discussed. It is anticipated that this review will serve as a useful guide for the design and development of robust predictors of AVPs and related therapeutic peptides.
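As a rough illustration of what "feature encoding" from sequence information can mean, the sketch below computes amino acid composition, one commonly used encoding for peptide predictors. The abstract does not say which encodings the reviewed methods use, so this choice is an assumption, as are the function name `amino_acid_composition` and the toy peptides, which are hypothetical and not taken from any AVP dataset.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every
# peptide maps to a feature vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the fraction of each standard residue in `sequence`.

    Hypothetical helper for illustration only; non-standard
    characters are ignored rather than counted.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Toy peptides (assumed examples, not real antiviral peptides).
peptides = ["ACDA", "KKLLKK"]
features = [amino_acid_composition(p) for p in peptides]
```

Each peptide becomes a fixed-length vector of 20 fractions that sum to 1, which could then be passed to a classifier such as a random forest or support vector machine and evaluated with cross-validation.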