

FormulationBCS: A Machine Learning Platform Based on Diverse Molecular Representations for Biopharmaceutical Classification System (BCS) Class Prediction.

Authors

Wu Zheng, Wang Nannan, Ye Zhuyifan, Xu Huanle, Chan Ging, Ouyang Defang

Affiliations

Institute of Chinese Medical Sciences (ICMS), State Key Laboratory of Quality Research in Chinese Medicine, University of Macau, Macau 999078, China.

Faculty of Applied Sciences, Macao Polytechnic University, Macau 999078, China.

Publication information

Mol Pharm. 2025 Jan 6;22(1):330-342. doi: 10.1021/acs.molpharmaceut.4c00946. Epub 2024 Dec 8.

Abstract

The Biopharmaceutics Classification System (BCS) has facilitated biowaivers and played a significant role in enhancing drug regulation and development efficiency. However, the productivity of measuring the key discriminative properties of BCS, solubility and permeability, still requires improvement. This limits high-throughput applications of BCS, which are essential for evaluating drug candidate developability and guiding formulation decisions in the early stages of drug development. In recent years, advancements in machine learning (ML) and molecular characterization have revealed the potential of quantitative structure-performance relationships (QSPR) for rapid and accurate BCS classification. The present study aims to develop a web platform for high-throughput BCS classification based on high-performance ML models. Initially, four data sets of BCS-related molecular properties were curated: log S, log P, log D, and log Papp. Subsequently, six ML algorithms or deep learning frameworks were employed to construct models, with diverse molecular representations ranging from one-dimensional molecular fingerprints, descriptors, and molecular graphs to three-dimensional molecular spatial coordinates. By comparing different combinations of molecular representations and learning algorithms, LightGBM exhibited excellent performance in solubility prediction, with an R² of 0.84; AttentiveFP outperformed others in permeability prediction, with R² values of 0.96 and 0.76 for log P and log D, respectively; and XGBoost was the most accurate for log Papp prediction, with an R² of 0.71. When externally validated on a marketed drug BCS category data set, the best-performing models achieved classification accuracies of over 77% and 73% for solubility and permeability, respectively.
Finally, the well-trained models were embedded into FormulationBCS, the first ML-based BCS class prediction web platform, enabling pharmaceutical scientists to quickly determine the BCS category of candidate drugs. This will aid high-throughput BCS assessment of candidate drugs during the preformulation stage, thereby reducing risk and improving efficiency in drug development and regulation.
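
The final step described above, combining a solubility call and a permeability call into a BCS category, can be sketched as follows. This is only an illustration of the standard four-class BCS scheme, not the platform's actual code: the names `BCSClass` and `bcs_class` are hypothetical, and the sketch assumes that the regression outputs have already been thresholded into high/low labels (the cutoffs used by the authors are not stated in the abstract).

```python
from enum import Enum


class BCSClass(Enum):
    """The four BCS categories (hypothetical helper type for this sketch)."""
    I = "high solubility, high permeability"
    II = "low solubility, high permeability"
    III = "high solubility, low permeability"
    IV = "low solubility, low permeability"


def bcs_class(high_solubility: bool, high_permeability: bool) -> BCSClass:
    """Map binary solubility/permeability labels to a BCS category.

    Assumption: the caller has already converted model predictions
    into high/low labels using thresholds of their own choosing.
    """
    if high_solubility and high_permeability:
        return BCSClass.I
    if high_permeability:
        return BCSClass.II
    if high_solubility:
        return BCSClass.III
    return BCSClass.IV


if __name__ == "__main__":
    # Example: a poorly soluble but highly permeable candidate.
    print(bcs_class(high_solubility=False, high_permeability=True).name)
```

Keeping the two properties as separate inputs mirrors the abstract's setup, in which solubility and permeability are predicted by different models and validated separately.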


