Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou, China.
Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, China.
Front Endocrinol (Lausanne). 2022 Oct 13;13:1011492. doi: 10.3389/fendo.2022.1011492. eCollection 2022.
Available evidence suggests elevated serum prolactin (PRL) levels in olanzapine (OLZ)-treated patients with schizophrenia. However, machine learning (ML)-based comprehensive evaluations of the influence of pathophysiological and pharmacological factors on PRL levels in OLZ-treated patients are rare. We aimed to forecast the PRL level in OLZ-treated patients and mine pharmacovigilance information on PRL-related adverse events by integrating ML and electronic health record (EHR) data.
Data were extracted from an EHR system and, after preprocessing, assembled into an ML dataset in 672×384 matrix format. The dataset was then randomly divided 8:2 into a derivation cohort for model development and a validation cohort for model validation. The eXtreme gradient boosting (XGBoost) algorithm was used to build the ML models, and SHapley Additive exPlanations (SHAP)-based analyses were used to illustrate feature importance and the models' predictive behavior. A sequential forward feature selection approach was used to generate the optimal feature subset. The co-administered drugs that SHAP analyses identified as possibly influencing PRL levels during OLZ treatment were then compared with evidence from disproportionality analyses performed with OpenVigil FDA.
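As an illustration of what a disproportionality analysis computes, the following is a minimal sketch of two standard measures, the reporting odds ratio (ROR) with a 95% confidence interval and the proportional reporting ratio (PRR), derived from a 2×2 table of spontaneous reports. The function name `disproportionality` and the example counts are hypothetical; they are not taken from the study, and the abstract does not state which measures or thresholds the authors applied.

```python
import math


def disproportionality(a, b, c, d):
    """Compute ROR (with 95% CI) and PRR from a 2x2 report table.

    a: reports with the drug of interest and the event of interest
    b: reports with the drug of interest and any other event
    c: reports with any other drug and the event of interest
    d: reports with any other drug and any other event
    """
    if min(a, b, c, d) <= 0:
        # The formulas below divide by each cell, so empty cells are rejected.
        raise ValueError("all four cell counts must be positive")

    # Reporting odds ratio: odds of the event with the drug vs. without it.
    ror = (a * d) / (b * c)

    # Standard error of ln(ROR) and the resulting 95% confidence interval.
    se_log_ror = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(ror) - 1.96 * se_log_ror)
    ci_high = math.exp(math.log(ror) + 1.96 * se_log_ror)

    # Proportional reporting ratio: event share with the drug vs. without it.
    prr = (a / (a + b)) / (c / (c + d))

    return {"ror": ror, "ror_ci95": (ci_low, ci_high), "prr": prr}


# Hypothetical counts, for illustration only (not data from the study).
example = disproportionality(a=20, b=80, c=10, d=890)
```

A lower confidence bound above 1 is commonly read as a signal that the event is reported disproportionately often for the drug. Such a signal reflects reporting patterns and does not by itself establish causation.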
The 15 features that made the greatest contributions, as ranked by mean(|SHAP value|), were identified as the optimal feature subset: gender_male, co-administration of risperidone, age, co-administration of aripiprazole, concentration of aripiprazole, concentration of OLZ, progesterone, co-administration of sulpiride, creatine kinase, serum sodium, serum phosphorus, testosterone, platelet distribution width, α-L-fucosidase, and lipoprotein (a). After feature selection, the XGBoost model performed well on the validation cohort, with a mean absolute error of 0.046, a mean squared error of 0.0036, a root-mean-squared error of 0.060, and a mean relative error of 11%. According to the disproportionality analyses, risperidone and aripiprazole exhibited the strongest associations with hyperprolactinemia and decreased blood PRL, respectively, and SHAP analyses identified both as co-administered drugs that influenced PRL levels during OLZ treatment.
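The four error metrics reported above can be computed as in the following minimal sketch. The function name `regression_errors` and the sample values are hypothetical. The abstract does not define mean relative error, so the sketch assumes the common definition: the mean of |predicted − observed| / |observed|.

```python
import math


def regression_errors(y_true, y_pred):
    """Return MAE, MSE, RMSE and mean relative error for paired values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    residuals = [pred - true for true, pred in zip(y_true, y_pred)]

    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    # Assumed definition of mean relative error; requires non-zero observations.
    mre = sum(abs(r) / abs(true) for r, true in zip(residuals, y_true)) / n

    return {"mae": mae, "mse": mse, "rmse": rmse, "mre": mre}


# Hypothetical observed and predicted values, for illustration only.
errors = regression_errors([1.0, 2.0, 4.0], [1.5, 2.0, 3.0])
```

By definition, the root-mean-squared error is the square root of the mean squared error, which agrees with the reported pair of values (√0.0036 = 0.060).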
Multiple pathophysiological and pharmacological confounders influence PRL levels, which are associated with both effective treatment and PRL-related side effects in OLZ-treated patients. Our study highlights the feasibility of integrating ML and EHR data to facilitate the detection of PRL levels and pharmacovigilance signals in OLZ-treated patients.