Owasit Anna, Tripathi Siddharth, Davé Rajesh, Young Joshua
Otto H. York Department of Chemical and Materials Engineering, New Jersey Institute of Technology, 138 Warren St, Newark, NJ, 07103, USA.
Pharm Res. 2025 Apr;42(4):665-683. doi: 10.1007/s11095-025-03855-x. Epub 2025 Apr 17.
Predicting powder blend flowability is necessary for pharmaceutical manufacturing but is challenging and resource-intensive. This study aimed to develop machine learning (ML) models that predict flowability across multiple flow categories, identify key predictive features, and guide the design of formulations with improved flow properties.
A dataset of 410 blends, composed of 9 active pharmaceutical ingredients (APIs) and 18 excipients with varying silica dry-coating parameters, was analyzed. Supervised ML models were trained to predict various flowability categories (very cohesive, cohesive, semi-cohesive, well-flowing, and free-flowing). Particle size, morphology, surface properties, and coating parameters were used as features. Classification algorithms, including Random Forest (RF) and Extreme Gradient Boosting (XGBoost), were evaluated. Unsupervised clustering identified natural groupings within flowability data.
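The supervised classification step described above can be illustrated with a minimal sketch. It assumes scikit-learn's Random Forest and a purely synthetic dataset. The feature names (`d50_um`, `aspect_ratio`, `surface_energy`, `silica_wt_pct`), the scoring formula, and the quantile-based class boundaries are hypothetical stand-ins, not the study's features, data, or labelling rule; only the five class names and the sample count of 410 come from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Flow categories named in the abstract, ordered from worst to best flow.
FLOW_CLASSES = ["very cohesive", "cohesive", "semi-cohesive",
                "well-flowing", "free-flowing"]


def make_synthetic_blends(n_samples=410, seed=0):
    """Build a synthetic stand-in dataset (NOT the study's data)."""
    rng = np.random.default_rng(seed)
    # Hypothetical features, chosen only to mirror the feature families
    # listed in the abstract (size, morphology, surface, coating).
    d50_um = rng.uniform(5.0, 200.0, n_samples)
    aspect_ratio = rng.uniform(0.4, 1.0, n_samples)
    surface_energy = rng.uniform(30.0, 60.0, n_samples)
    silica_wt_pct = rng.uniform(0.0, 2.0, n_samples)
    X = np.column_stack([d50_um, aspect_ratio, surface_energy, silica_wt_pct])

    # Made-up flow score with noise; the coefficients are assumptions.
    score = (0.01 * d50_um + 1.0 * aspect_ratio
             - 0.02 * surface_energy + 1.0 * silica_wt_pct
             + rng.normal(0.0, 0.2, n_samples))

    # Bin the score into five equally populated classes (labels 0..4).
    edges = np.quantile(score, [0.2, 0.4, 0.6, 0.8])
    y = np.digitize(score, edges)
    return X, y


def train_flow_classifier(seed=0):
    """Fit a Random Forest and return it with its held-out accuracy."""
    X, y = make_synthetic_blends(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


if __name__ == "__main__":
    model, accuracy = train_flow_classifier()
    print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
    print("predicted class for first sample:",
          FLOW_CLASSES[model.predict(make_synthetic_blends()[0][:1])[0]])
```

The accuracy printed here reflects only the synthetic data and says nothing about the accuracies reported in the study. An XGBoost classifier could be substituted for the Random Forest in the same train/test scaffold.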
The best-performing models achieved up to 85% accuracy for predicting flowability regimes of individual components and 87% for blends. Individual components generally showed higher accuracy than blends, except in the uncoated scenario with 2 flow regimes, where blends reached 94.67% accuracy and outperformed individual components. SHapley Additive exPlanations (SHAP) and feature importance analysis indicated that dry coating parameters were the most influential factors, followed by particle size and morphology. ML models effectively identified category transitions between flow regimes, offering insights into blend optimization.
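To make the idea of ranking features by influence concrete, the sketch below uses the impurity-based `feature_importances_` attribute of a scikit-learn Random Forest on synthetic two-regime data. This is a simpler substitute for SHAP, which needs the separate `shap` package and is not used here. The feature names and the weights that generate the labels are assumptions, deliberately set so that the coating variable dominates; the output illustrates the mechanics only and does not reproduce the study's ranking.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names; "noise" carries no signal by construction.
FEATURE_NAMES = ["silica_wt_pct", "d50_um", "aspect_ratio", "noise"]


def rank_features(seed=0, n_samples=400):
    """Return (name, importance) pairs sorted from most to least important."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n_samples, len(FEATURE_NAMES)))

    # Synthetic two-regime label: the weights are assumptions chosen so the
    # first column (coating) matters most and the last column not at all.
    score = 3.0 * X[:, 0] + 1.0 * X[:, 1] + 0.5 * X[:, 2]
    y = (score > np.median(score)).astype(int)

    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X, y)

    # Impurity-based importances; they are non-negative and sum to 1.
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1]
    return [(FEATURE_NAMES[i], float(importances[i])) for i in order]


if __name__ == "__main__":
    for name, value in rank_features():
        print(f"{name:>14s}  {value:.3f}")
```

Impurity-based importances can favor high-cardinality features, which is one reason a study might pair them with SHAP values as a cross-check.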
Integrating ML with mechanistic approaches effectively predicted powder blend flowability across diverse categories and elucidated feature-property relationships. These outcomes can facilitate the rational design of blends with enhanced flow properties at reduced experimental effort through judiciously selected dry coating of a blend constituent, making this approach promising for advancing pharmaceutical process and product development.