Bernardini Michele, Morettini Micaela, Romeo Luca, Frontoni Emanuele, Burattini Laura
Department of Information Engineering (DII), Università Politecnica delle Marche, Ancona, Italy.
Department of Information Engineering (DII), Università Politecnica delle Marche, Ancona, Italy; Cognition, Motion and Neuroscience and Computational Statistics and Machine Learning, Istituto Italiano di Tecnologia, Genova, Italy.
Artif Intell Med. 2020 May;105:101847. doi: 10.1016/j.artmed.2020.101847. Epub 2020 May 6.
Early prediction of target patients at high risk of developing Type 2 diabetes (T2D) plays a significant role in preventing the onset of overt disease and its associated comorbidities. Although insulin resistance is fundamental in the early phases of T2D natural history, it is not usually quantified by General Practitioners (GPs). The triglyceride-glucose (TyG) index has proven useful in clinical studies for quantifying insulin resistance and for the early identification of individuals at T2D risk, but it is still not applied by GPs for diagnostic purposes. The aim of this study is to propose a multiple instance learning boosting algorithm (MIL-Boost) for creating a predictive model capable of early prediction of worsening insulin resistance (low vs high T2D risk) in terms of the TyG index. MIL-Boost is applied to patients' past electronic health record (EHR) information stored by a single GP. The proposed MIL-Boost algorithm proved effective for this task, performing better than the other state-of-the-art ML competitors (Recall from 0.70 up to 0.83). The proposed MIL-based approach is able to extract hidden patterns from past EHR temporal data, even without directly exploiting triglyceride and glucose measurements. The major advantage of our method lies in its ability to model the temporal evolution of longitudinal EHR data while dealing with small sample size and variability in the observations (e.g., a small and variable number of prescriptions for non-hospitalized patients). The proposed algorithm may represent the core of a clinical decision support system.
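As an illustration of the two quantities the abstract relies on, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used definition of the TyG index, ln(fasting triglycerides [mg/dL] × fasting glucose [mg/dL] / 2), and it assumes a noisy-OR rule for combining per-visit (instance) probabilities into a per-patient (bag) probability, which is one standard choice in MIL boosting; the abstract does not state which aggregation rule was used. The function names tyg_index and noisy_or_bag_probability are hypothetical.

```python
import math


def tyg_index(triglycerides_mg_dl, glucose_mg_dl):
    """TyG index under the commonly used definition (an assumption here):
    ln(fasting triglycerides [mg/dL] * fasting glucose [mg/dL] / 2)."""
    if triglycerides_mg_dl <= 0 or glucose_mg_dl <= 0:
        raise ValueError("measurements must be positive")
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2.0)


def noisy_or_bag_probability(instance_probs):
    """Combine instance-level probabilities into one bag-level probability.

    In a MIL setting a patient is a bag holding a variable number of
    instances (e.g. past EHR records). Noisy-OR is assumed for illustration:
    the bag is positive if at least one instance is positive.
    """
    prob_all_negative = 1.0
    for p in instance_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        prob_all_negative *= 1.0 - p
    return 1.0 - prob_all_negative


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(tyg_index(150.0, 100.0))
    print(noisy_or_bag_probability([0.1, 0.2]))
```

Because the bag-level rule accepts any number of instances, patients with different numbers of past records can be scored without padding or truncating their histories.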