Xiao Wei, Chen Liu-Zhen, Chang Jun, Xiao Yi-Wen
School of Life Science, Jiangxi Science & Technology Normal University, Nanchang, Jiangxi, People's Republic of China.
Drug Des Devel Ther. 2025 Jun 14;19:5085-5098. doi: 10.2147/DDDT.S523769. eCollection 2025.
Alzheimer's disease poses a significant threat to human health. Current therapeutic medicines alleviate symptoms but fail to reverse disease progression or reduce its harmful effects, and they exhibit toxicity and side effects such as gastrointestinal discomfort and cardiovascular disorders. The major challenge in developing machine learning models for discovering anti-acetylcholinesterase peptides is the limited availability of active peptide data in public databases. This study aims primarily to address this challenge and secondarily to discover novel, safer, and less toxic anti-acetylcholinesterase peptides for better treatment of Alzheimer's disease.
A Random Forest Classifier model was constructed from a hybrid dataset of non-peptide small molecules and peptides and was applied to screen a custom peptide library. The binding affinities of the predicted peptides to acetylcholinesterase were assessed via molecular docking, and the top-ranked peptides were selected for experimental assays.
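The training and screening steps described above can be sketched roughly as follows. This is a minimal illustration only, not the authors' pipeline: the descriptor vectors are synthetic placeholders (the abstract does not state which descriptors were used), the dataset sizes, the labeling rule, and all variable names are assumptions, and the molecular docking step is not represented.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_features = 16  # assumed size of a descriptor vector shared by both molecule types

# Synthetic stand-ins for descriptor vectors and activity labels (1 = active).
# Real work would compute the same descriptors for small molecules and peptides.
X_small = rng.normal(size=(200, n_features))
y_small = (X_small[:, 0] + X_small[:, 1] > 0).astype(int)
X_pep = rng.normal(size=(40, n_features))
y_pep = (X_pep[:, 0] + X_pep[:, 1] > 0).astype(int)

# Hybrid dataset: non-peptide small molecules and peptides stacked together.
X_train = np.vstack([X_small, X_pep])
y_train = np.concatenate([y_small, y_pep])

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Screen a (synthetic) peptide library and keep the highest-scoring candidates.
X_library = rng.normal(size=(1000, n_features))
p_active = model.predict_proba(X_library)[:, 1]  # predicted probability of "active"
top_idx = np.argsort(p_active)[::-1][:6]         # indices of the six best-scoring peptides
```

In a workflow like the one described, the candidates indexed by `top_idx` would then be passed to docking before any experimental assay.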
The top six peptides (IFLSMC, WCWIYN, WIGCWD, LHTMELL, WHLCVLF, and VWIIGFEHM) were selected for experimental validation. Their inhibitory effects on acetylcholinesterase were determined to be 0.007, 3.4, 1.9, 10.6, 1.5, and 3.9 μmol/L, respectively.
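For a quick comparison, the reported values can be ranked as in the short sketch below. It assumes that the values are inhibitory concentrations, so that a lower value means stronger inhibition; the abstract does not name the metric, and the variable names are illustrative.

```python
# Peptides and reported values (μmol/L), in the order given in the text.
peptides = ["IFLSMC", "WCWIYN", "WIGCWD", "LHTMELL", "WHLCVLF", "VWIIGFEHM"]
values_umol_per_l = [0.007, 3.4, 1.9, 10.6, 1.5, 3.9]

# Assumption: lower concentration = more potent, so sort in ascending order.
ranked = sorted(zip(peptides, values_umol_per_l), key=lambda pair: pair[1])

for name, value in ranked:
    print(f"{name:<10} {value:g} umol/L")

# Ratio between the largest and smallest reported values.
fold_range = ranked[-1][1] / ranked[0][1]
print(f"fold range: {fold_range:.0f}")
```

Under that assumption, IFLSMC is the most potent of the six and LHTMELL the least, with the two values differing by roughly three orders of magnitude.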
Predicting anti-acetylcholinesterase peptides is challenging due to the absence of a comprehensive, publicly accessible peptide database. Traditional approaches that use only non-peptide small molecules for model construction often perform poorly at predicting active peptides. Here, we developed a machine-learning model from a hybrid dataset of non-peptide small molecules and peptides, which identified six potent peptides. The model's accuracy was equal or superior to that of previously reported small-molecule-only models, and its capability to discriminate active peptides was significantly higher. Our work shows that hybrid datasets can improve machine-learning model prediction in peptide drug discovery.