Predictive models of medication non-adherence risks of patients with T2D based on multiple machine learning algorithms.

Author information

Personalized Drug Therapy Key Laboratory of Sichuan Province, School of Medicine, University of Electronic Science and Technology of China, Chengdu, China.

Department of Pharmacy, Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, Chengdu, China.

Publication information

BMJ Open Diabetes Res Care. 2020 Mar;8(1). doi: 10.1136/bmjdrc-2019-001055.

DOI:10.1136/bmjdrc-2019-001055

PMID:32156739

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7064141/

Abstract

OBJECTIVE

Medication adherence plays a key role in type 2 diabetes (T2D) care. Identifying patients at high risk of non-adherence supports individualized management, especially in China, where medical resources are relatively insufficient. However, models with good predictive capability have not yet been studied. This study aims to assess multiple machine learning algorithms and select a model that can be used to predict patients' non-adherence risk.

METHODS

A real-world registration study was conducted at Sichuan Provincial People's Hospital from 1 April 2018 to 30 March 2019. Data on demographics, disease and treatment, diet and exercise, mental status, and treatment adherence were collected from patients with T2D through face-to-face questionnaires. The medication possession ratio was used to evaluate patients' medication adherence status. Thirty machine learning algorithms were applied for modeling, including Bayesian network, neural network, and support vector machine. The effects of balanced sampling, data imputation, binning, and feature selection methods were evaluated by the area under the receiver operating characteristic curve (AUC). We used two-way cross-validation to ensure the accuracy of model evaluation, and we performed a post hoc test of the sample size based on the trend in AUC as the sample size increased.
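The medication possession ratio (MPR) mentioned above is commonly computed as the total days' supply obtained divided by the number of days in the observation period. The following is a minimal Python sketch of that calculation, not the study's own procedure. The `Refill` record, the function names, and the 0.8 cut-off for poor adherence are assumptions made for illustration; the abstract does not state the cut-off used.

```python
from dataclasses import dataclass


@dataclass
class Refill:
    """One dispensing record (hypothetical structure, not taken from the study)."""
    day: int          # day of dispensing, counted from the start of observation
    days_supply: int  # number of days the dispensed quantity covers


def medication_possession_ratio(refills, period_days):
    """Return total days' supply divided by the observation period, capped at 1.0."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    total_supply = sum(r.days_supply for r in refills)
    return min(total_supply / period_days, 1.0)


def is_poor_adherence(mpr, threshold=0.8):
    """Flag poor adherence; the 0.8 threshold is an assumed, commonly used cut-off."""
    return mpr < threshold


# Toy example: three 30-day refills over a 120-day observation period.
refills = [Refill(0, 30), Refill(35, 30), Refill(80, 30)]
mpr = medication_possession_ratio(refills, period_days=120)
print(round(mpr, 2), is_poor_adherence(mpr))
```

In this toy example the patient holds 90 days of medication over 120 days, so the ratio falls below the assumed cut-off and the patient would be labeled as poorly adherent.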

RESULTS

A total of 401 patients out of 630 candidates were investigated, of whom 85 (21.20%) were classified as having poor adherence. Sixteen variables were selected as potential variables for modeling, and 300 models were built based on 30 machine learning algorithms. The best-performing model achieved an AUC of 0.866±0.082. Imputation, oversampling, and a larger sample size helped improve predictive ability.
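As a rough illustration of the quantities reported above, the sketch below computes an AUC from its rank-based definition, balances a toy dataset by random oversampling, and summarizes per-fold AUCs as mean±SD. It uses only the Python standard library. The function names, the toy data, and the fold values are made up for illustration and are not study data; reading the reported "±" as a standard deviation across folds is also an assumption.

```python
import random
import statistics


def auc_score(labels, scores):
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until both classes (0 and 1) are equal in size."""
    rng = random.Random(seed)
    groups = {0: [], 1: []}
    for row, y in zip(rows, labels):
        groups[y].append(row)
    minority, majority = sorted(groups, key=lambda c: len(groups[c]))
    if not groups[minority]:
        raise ValueError("minority class has no rows to duplicate")
    deficit = len(groups[majority]) - len(groups[minority])
    extra = [rng.choice(groups[minority]) for _ in range(deficit)]
    return list(rows) + extra, list(labels) + [minority] * deficit


# Toy labels (1 = poor adherence) and model scores; not study data.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.3, 0.4, 0.5, 0.1]
auc = auc_score(labels, scores)

rows = ["a", "b", "c", "d", "e"]
new_rows, new_labels = random_oversample(rows, labels)

# Made-up per-fold AUCs, summarized in the mean±SD style of the abstract.
fold_aucs = [0.80, 0.90, 0.85]
mean_auc = statistics.mean(fold_aucs)
sd_auc = statistics.stdev(fold_aucs)
print(f"AUC on toy data: {auc:.3f}")
print(f"Cross-validated AUC: {mean_auc:.3f}±{sd_auc:.3f}")
```

When oversampling is combined with cross-validation, it is usually applied to the training folds only, so that duplicated minority-class rows do not leak into the held-out fold and inflate the AUC.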

CONCLUSIONS

An accurate and sensitive adherence prediction model based on real-world registration data was established after evaluating data imputation, balanced sampling, and related preprocessing methods. The model may provide a technical tool for individualized diabetes care.
