Prospective Validation of Machine Learning Algorithms for Absorption, Distribution, Metabolism, and Excretion Prediction: An Industrial Perspective.

Affiliations

Medicinal Chemistry, Biogen, Cambridge, Massachusetts 02142, United States.

DMPK, Biogen, Cambridge, Massachusetts 02142, United States.

Publication information

J Chem Inf Model. 2023 Jun 12;63(11):3263-3274. doi: 10.1021/acs.jcim.3c00160. Epub 2023 May 22.

DOI:10.1021/acs.jcim.3c00160

PMID:37216672

Abstract

Absorption, distribution, metabolism, and excretion (ADME), which collectively define the concentration profile of a drug at the site of action, are of critical importance to the success of a drug candidate. Recent advances in machine learning algorithms and the availability of larger proprietary as well as public ADME data sets have generated renewed interest within the academic and pharmaceutical science communities in predicting pharmacokinetic and physicochemical endpoints in early drug discovery. In this study, we collected 120 internal prospective data sets over 20 months across six ADME in vitro endpoints: human and rat liver microsomal stability, MDR1-MDCK efflux ratio, solubility, and human and rat plasma protein binding. A variety of machine learning algorithms in combination with different molecular representations were evaluated. Our results suggest that gradient boosting decision tree and deep learning models consistently outperformed random forest over time. We also observed better performance when models were retrained on a fixed schedule; more frequent retraining generally increased accuracy, whereas hyperparameter tuning improved the prospective predictions only marginally.
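The prospective evaluation with scheduled retraining described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the monthly batching is an assumption, a one-variable least-squares line stands in for the random forest, gradient boosting, and deep learning models, and all names (`fit_line`, `make_monthly_batches`, `prospective_mae`, `warmup`, `retrain_every`) are hypothetical. Each month's batch is scored with the model that existed before that batch was measured, and the model is refit on all earlier data every `retrain_every` months.

```python
from statistics import mean

def fit_line(points):
    """Ordinary least squares for y = a*x + b; a toy stand-in for RF/GBDT/DNN models."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    a = sxy / sxx
    b = y_bar - a * x_bar
    return lambda x: a * x + b

def make_monthly_batches(n_months=20, drift=0.1):
    """Synthetic data (assumption): one batch per month, labels shift upward over time."""
    return [[(float(x), 2.0 * x + drift * t) for x in range(5)]
            for t in range(n_months)]

def prospective_mae(batches, warmup=6, retrain_every=1):
    """Mean absolute error when each batch is predicted before its labels are known."""
    model = None
    monthly_mae = []
    for t in range(warmup, len(batches)):
        if (t - warmup) % retrain_every == 0:
            # Retrain on everything measured strictly before month t.
            history = [p for batch in batches[:t] for p in batch]
            model = fit_line(history)
        errors = [abs(model(x) - y) for x, y in batches[t]]
        monthly_mae.append(mean(errors))
    return mean(monthly_mae)

batches = make_monthly_batches()
mae_monthly = prospective_mae(batches, retrain_every=1)
mae_semiannual = prospective_mae(batches, retrain_every=6)
# A retraining interval as long as the study means the model is fit only once.
mae_never = prospective_mae(batches, retrain_every=len(batches))
print(mae_monthly, mae_semiannual, mae_never)
```

Because the synthetic labels drift steadily, the error grows as the retraining interval lengthens. This mirrors only the direction of the reported trend, not its magnitude in the real data.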

Similar articles

Prospective Validation of Machine Learning Algorithms for Absorption, Distribution, Metabolism, and Excretion Prediction: An Industrial Perspective.

J Chem Inf Model. 2023 Jun 12;63(11):3263-3274. doi: 10.1021/acs.jcim.3c00160. Epub 2023 May 22.

Comparison of Deep Learning With Multiple Machine Learning Methods and Metrics Using Diverse Drug Discovery Data Sets.

Mol Pharm. 2017 Dec 4;14(12):4462-4475. doi: 10.1021/acs.molpharmaceut.7b00578. Epub 2017 Nov 13.

Systematic Evaluation of Local and Global Machine Learning Models for the Prediction of ADME Properties.

Mol Pharm. 2023 Mar 6;20(3):1758-1767. doi: 10.1021/acs.molpharmaceut.2c00962. Epub 2023 Feb 6.

Exploring Tunable Hyperparameters for Deep Neural Networks with Industrial ADME Data Sets.

J Chem Inf Model. 2019 Mar 25;59(3):1005-1016. doi: 10.1021/acs.jcim.8b00671. Epub 2019 Jan 11.

Application of Machine Learning Technology in the Prediction of ADME-Related Pharmacokinetic Parameters.

Curr Med Chem. 2023;30(17):1945-1962. doi: 10.2174/0929867329666220819122205.

The Next Era: Deep Learning in Pharmaceutical Research.

Pharm Res. 2016 Nov;33(11):2594-603. doi: 10.1007/s11095-016-2029-7. Epub 2016 Sep 6.

Bioactivity Comparison across Multiple Machine Learning Algorithms Using over 5000 Datasets for Drug Discovery.

Mol Pharm. 2021 Jan 4;18(1):403-415. doi: 10.1021/acs.molpharmaceut.0c01013. Epub 2020 Dec 16.

Prediction of Oral Bioavailability in Rats: Transferring Insights from in Vitro Correlations to (Deep) Machine Learning Models Using in Silico Model Outputs and Chemical Structure Parameters.

J Chem Inf Model. 2019 Nov 25;59(11):4893-4905. doi: 10.1021/acs.jcim.9b00460. Epub 2019 Nov 12.

Prediction of Oral Pharmacokinetics Using a Combination of In Silico Descriptors and In Vitro ADME Properties.

Mol Pharm. 2021 Mar 1;18(3):1071-1079. doi: 10.1021/acs.molpharmaceut.0c01009. Epub 2021 Jan 29.

Machine Learning Hybrid Model for the Prediction of Chronic Kidney Disease.

Comput Intell Neurosci. 2023 Mar 14;2023:9266889. doi: 10.1155/2023/9266889. eCollection 2023.

Cited by

Chemical Language Model Linker: Blending Text and Molecules with Modular Adapters.

J Chem Inf Model. 2025 Sep 8;65(17):8944-8956. doi: 10.1021/acs.jcim.5c00853. Epub 2025 Aug 21.

The topology of molecular representations and its influence on machine learning performance.

J Cheminform. 2025 Jul 21;17(1):109. doi: 10.1186/s13321-025-01045-w.

Benchmarking ML in ADMET predictions: the practical impact of feature representations in ligand-based models.

J Cheminform. 2025 Jul 21;17(1):108. doi: 10.1186/s13321-025-01041-0.

Chemical Language Model Linker: blending text and molecules with modular adapters.

ArXiv. 2025 Jun 13:arXiv:2410.20182v3.

Machine Learning for Toxicity Prediction Using Chemical Structures: Pillars for Success in the Real World.

Chem Res Toxicol. 2025 May 19;38(5):759-807. doi: 10.1021/acs.chemrestox.5c00033. Epub 2025 May 2.

Evaluating the Diversity and Target Addressability of DELs using Scaffold Analysis and Machine Learning.

ACS Med Chem Lett. 2025 Jan 25;16(2):263-270. doi: 10.1021/acsmedchemlett.4c00505. eCollection 2025 Feb 13.

A review of large language models and autonomous agents in chemistry.

Chem Sci. 2024 Dec 9;16(6):2514-2572. doi: 10.1039/d4sc03921a. eCollection 2025 Feb 5.

ADMET evaluation in drug discovery: 21. Application and industrial validation of machine learning algorithms for Caco-2 permeability prediction.

J Cheminform. 2025 Jan 10;17(1):3. doi: 10.1186/s13321-025-00947-z.

Accelerated hit identification with target evaluation, deep learning and automated labs: prospective validation in IRAK1.

J Cheminform. 2024 Nov 14;16(1):127. doi: 10.1186/s13321-024-00914-0.

Effect of Data Quality and Data Quantity on the Estimation of Intrinsic Solubility: Analysis Based on a Single-Source Data Set.

Mol Pharm. 2024 Oct 7;21(10):5261-5271. doi: 10.1021/acs.molpharmaceut.4c00685. Epub 2024 Sep 13.
