
Machine Learning-Based Prediction of Rule Violations for Drug-Likeness Assessment in Peptide Molecules Using Random Forest Models.

Authors

Lambev Momchil, Dimitrova Dimana, Mihaylova Silviya

Affiliation

Medical College, Medical University of Varna, 84 Tzar Osvoboditel Str., 9002 Varna, Bulgaria.

Publication

Int J Mol Sci. 2025 Aug 29;26(17):8407. doi: 10.3390/ijms26178407.

DOI:10.3390/ijms26178407
PMID:40943329
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12428439/
Abstract

Peptide therapeutics often fall outside classical small-molecule heuristics, such as Lipinski's Rule of Five (Ro5), motivating the development of adapted filters and data-driven approaches to early drug-likeness assessment. We curated >300 k drug (small and peptide) and non-drug molecules from PubChem, extracted key molecular descriptors with RDKit, and generated three rule-violation counters for Ro5, the peptide-oriented beyond-Ro5 (bRo5) extension, and Muegge's criteria. Random Forest (RF) classifier and regressor models (with 10, 20, and 30 trees) were trained and evaluated. Predictions for 26 peptide test molecules were compared with those from SwissADME, Molinspiration, and manual calculations. Model metrics were uniformly high (Ro5 accuracy/precision/recall = 1.0; Muegge ≈ 0.99), indicating effective learning. Ro5 violation counts matched reference values for 23/26 peptides; the remaining cases differed by +1 violation, reflecting larger structures and platform limits. bRo5 predictions showed near-complete agreement with manual values; minor discrepancies occurred in isolated peptides. Muegge's predictions were internally consistent but tended to underestimate SwissADME by ~1 violation in several molecules. Four peptides (ML13-16) satisfied bRo5 boundaries; three also fully met Ro5. RF models thus provide fast and reliable in silico filters for peptide drug-likeness and can support the prioritisation of orally developable candidates.
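
The rule-violation counters mentioned in the abstract can be illustrated with a minimal sketch for Ro5 alone. It uses the standard Lipinski thresholds (molecular weight over 500 Da, logP over 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors). The `Descriptors` class, the `ro5_violations` function, and both example molecules are hypothetical and are not taken from the paper, which derived its descriptors with RDKit and then trained Random Forest models on the resulting counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Hypothetical container for precomputed molecular descriptors."""
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def ro5_violations(d: Descriptors) -> int:
    """Count how many of Lipinski's four Ro5 criteria are violated."""
    checks = (
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    )
    # Each True counts as 1, so the sum is the violation count (0 to 4).
    return sum(checks)


# Illustrative descriptor values only, not data from the study.
small = Descriptors(mol_weight=180.2, logp=1.2, h_donors=1, h_acceptors=4)
large = Descriptors(mol_weight=1200.0, logp=-3.0, h_donors=15, h_acceptors=18)

print(ro5_violations(small))  # 0
print(ro5_violations(large))  # 3
```

A counter of this kind produces the target label for each molecule. The bRo5 and Muegge counters would follow the same pattern with their own descriptor sets and thresholds, which the abstract does not list.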


