SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark.
Unit for Clinical Alcohol Research, Clinical Institute, University of Southern Denmark, Odense, Denmark.
BMC Bioinformatics. 2023 Sep 2;24(1):329. doi: 10.1186/s12859-023-05450-6.
Alcohol use disorder (AUD) causes significant morbidity, mortality, and injuries. According to reports, approximately 5% of all registered deaths in Denmark could be due to AUD. The problem is compounded by the late identification of patients with AUD, which can lead to serious psychological, physical, and economic consequences. Many individuals with AUD never receive specialist treatment during their addiction because of obstacles such as taboo and the poor performance of current screening tools. As a result, intervention is often delayed. Earlier detection of patients with AUD can mitigate this. A clinical decision support system (DSS) powered by machine learning (ML) methods can be used to diagnose patients' AUD status earlier.
This study proposes an effective AUD prediction model (AUDPM) that can be used in a DSS. The proposed model consists of four distinct components: (1) imputation of missing values using the k-nearest neighbours approach, (2) recursive feature elimination with cross-validation to select the most relevant subset of features, (3) a hybrid synthetic minority oversampling technique-edited nearest neighbour approach to remove noise and balance the distribution of the training data, and (4) an ML model for the early detection of patients with AUD. Two data sources, a questionnaire and electronic health records of 2571 patients, were collected from Odense University Hospital in the Region of Southern Denmark to form the AUD-Dataset, which was then used to build ML models. The results of different ML models (support vector machine, k-nearest neighbour, decision tree, random forest, and extreme gradient boosting) were compared. Finally, a combination of all these models in an ensemble learning approach was selected for the AUDPM.
The results revealed that the proposed ensemble AUDPM outperformed the other single models and our previous study results, achieving precision, recall, F1-score, and accuracy of 0.96, 0.94, 0.95, and 0.97, respectively. In addition, we designed and developed an AUD-DSS prototype.
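For readers less familiar with the four reported metrics, the following sketch shows how each is derived from confusion-matrix counts. The counts used here are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the study, and the function name `classification_metrics` is an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the four metrics from binary confusion-matrix counts."""
    precision = tp / (tp + fp)          # share of predicted positives that are correct
    recall = tp / (tp + fn)             # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    accuracy = (tp + tn) / (tp + fp + fn + tn)           # share of all cases correct
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for illustration only (not the study's confusion matrix).
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=140)
```

With imbalanced classes, accuracy can remain high even when many positive cases are missed, which is why precision, recall, and F1-score are reported alongside it.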
The proposed AUDPM achieved high classification performance. In addition, we identified clinical factors related to the early detection of patients with AUD. The AUD-DSS is intended to be integrated into the existing Danish health care system to provide novel information to clinical staff when a patient shows signs of harmful alcohol use; in other words, it gives staff a sound reason to start a conversation with the patients for whom such a conversation is relevant.