Ramasundaram Maduravani, Sohn Honglae, Madhavan Thirumurthy
Department of Genetic Engineering, Computational Biology Lab, School of Bioengineering, SRM Institute of Science and Technology, SRM Nagar, Chennai, India.
Department of Chemistry and Department of Carbon Materials, Chosun University, Gwangju, Republic of Korea.
Front Artif Intell. 2025 Jan 7;7:1497307. doi: 10.3389/frai.2024.1497307. eCollection 2024.
Cell-penetrating peptides (CPPs) can cross eukaryotic membranes efficiently while carrying diverse cargo molecules, such as drugs, proteins, nucleic acids, and nanoparticles, without causing significant harm. Owing to their unique chemical properties, CPP-based drug delivery systems are being explored for cancer, genetic disorders, and diabetes. Wet-lab experiments in drug discovery are time-consuming and expensive. Machine learning (ML) techniques can accelerate the drug discovery process when trained on accurate, high-quality data. ML classifiers such as support vector machines (SVM), random forests (RF), gradient-boosted decision trees (GBDT), and various types of artificial neural networks (ANN) are commonly used for CPP prediction, with performance assessed by cross-validation. Applying these ML strategies to CPP datasets produced by high-throughput sequencing and computational methods improves the prediction of functional CPPs. This review focuses on several ML-based CPP prediction tools. We discuss CPP mechanisms to explain how CPPs pass through cells. We compare diverse CPP prediction methods by algorithm, dataset size, feature encoding, software utilities, assessment metrics, and prediction scores. Prediction performance is evaluated on independent datasets in terms of accuracy, sensitivity, specificity, and Matthews correlation coefficient (MCC). In conclusion, this review aims to encourage the use of ML algorithms for finding effective CPPs, which should benefit future research on drug delivery and therapeutics.
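The four evaluation metrics named in the abstract can all be derived from the confusion matrix of a binary CPP/non-CPP classifier. The following is a minimal sketch, not taken from the review itself: the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
from math import sqrt


def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, sensitivity, specificity and MCC from confusion-matrix counts.

    tp: CPPs predicted as CPPs          fn: CPPs predicted as non-CPPs
    tn: non-CPPs predicted as non-CPPs  fp: non-CPPs predicted as CPPs
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity (recall): fraction of true CPPs that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true non-CPPs that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC ranges from -1 to +1; by convention it is set to 0 when the
    # denominator vanishes (a row or column of the matrix is empty).
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for an independent test set (illustration only).
    for name, value in binary_metrics(tp=45, tn=40, fp=10, fn=5).items():
        print(f"{name}: {value:.3f}")
```

MCC is often reported alongside accuracy because it uses all four cells of the confusion matrix and is therefore less misleading when the CPP and non-CPP classes are imbalanced.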