Bonifacio-Velez de Villa Eliezer I, Montoya-Alfaro María E, Negrón-Ballarte Luisa P, Solis-Calero Christian
Faculty of Pharmacy and Biochemistry, Universidad Nacional Mayor de San Marcos, Lima 15001, Peru.
Pharmaceutics. 2025 Jul 31;17(8):993. doi: 10.3390/pharmaceutics17080993.
Background: Peptides are a promising class of antimicrobials whose mechanisms of action can circumvent resistance, but designing peptides with good activity can be complex and laborious. Studying their quantitative structure-activity relationships with machine learning algorithms can guide rational and effective design. Methods: Information on the antimicrobial activity of peptides was collected, and their structures were characterized by generating molecular descriptors, which were then used to build regression and classification models based on machine learning algorithms. The contribution of each descriptor to the generated models was evaluated by determining its relative importance, and, finally, the antimicrobial activity of new peptides was estimated. Results: A structured database of antimicrobial peptides and their descriptors was obtained, from which 56 machine learning models were generated. Random Forest-based models performed best; among these, regression models showed variable performance (R = 0.339-0.574), while classification models performed well (MCC = 0.662-0.755 and ACC = 0.831-0.877). Models based on bacterial groups performed better than those based on the entire dataset. The properties of the newly generated peptides are related to important descriptors that encode physicochemical properties such as lower molecular weight, higher charge, propensity to form alpha-helical structures, lower hydrophobicity, and higher frequency of amino acids such as lysine and serine. Conclusions: Machine learning models made it possible to establish the structure-activity relationships of antimicrobial peptides. Classification models performed better than regression models. The models were used to make predictions, and new peptides with high antimicrobial potential were proposed.
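For readers unfamiliar with the classification metrics reported above, the following minimal sketch shows how the Matthews correlation coefficient (MCC) and accuracy (ACC) are computed from binary predictions. It is an illustration only, not the authors' code: the function names and the toy labels (1 = active, 0 = inactive) are assumptions made for this example.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = active, 0 = inactive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """ACC = fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical toy labels, not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print("ACC =", accuracy(*counts), "MCC =", mcc(*counts))
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, so it is less flattering than accuracy when classes are imbalanced. This is one reason the two metrics are often reported together.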