Koloi Angela, Loukas Vasileios S, Hourican Cillian, Sakellarios Antonis I, Quax Rick, Mishra Pashupati P, Lehtimäki Terho, Raitakari Olli T, Papaloukas Costas, Bosch Jos A, März Winfried, Fotiadis Dimitrios I
Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina, Greece.
Department of Biological Applications and Technology, University of Ioannina, Ioannina, Greece.
Eur Heart J Digit Health. 2024 Aug 9;5(5):542-550. doi: 10.1093/ehjdh/ztae049. eCollection 2024 Sep.
Coronary artery disease (CAD) is a highly prevalent disease with modifiable risk factors. In patients with suspected obstructive CAD, evaluating the pre-test probability model is crucial for diagnosis, although its accuracy remains controversial. Machine learning (ML) predictive models can help clinicians detect CAD early and improve outcomes. This study aimed to identify early-stage CAD using ML in conjunction with a panel of clinical and laboratory tests.
The study sample included 3316 patients enrolled in the Ludwigshafen Risk and Cardiovascular Health (LURIC) study. A comprehensive array of attributes was considered, and an ML pipeline was developed. Subsequently, we used five approaches to generate high-quality virtual patient data to improve the performance of the artificial intelligence models. An extension study was carried out using data from the Young Finns Study (YFS) to assess the generalizability of the results. With virtual augmented data, accuracy increased by approximately 5%, from 0.75 to 0.79 for random forests (RFs) and from 0.76 to 0.80 for gradient boosting (GB). Sensitivity showed a significant boost for RFs, rising by about 9.4% (from 0.81 to 0.89), while GB exhibited a 4.8% increase (from 0.83 to 0.87). Specificity also showed a significant boost for RFs, rising by ∼24% (from 0.55 to 0.70), while GB exhibited a 37% increase (from 0.51 to 0.74). The extension analysis aligned with the initial study.
Accurate predictions of angiographic CAD can be obtained using a set of routine laboratory markers, age, sex, and smoking status, holding the potential to limit the need for invasive diagnostic techniques. The extension analysis in the YFS demonstrated the potential of these findings in a younger population, and it confirmed applicability to atherosclerotic vascular disease.
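For readers less familiar with the metrics reported above, the following is a minimal sketch of how accuracy, sensitivity, and specificity are derived from confusion-matrix counts. The function name `classification_metrics` and the counts in the example are hypothetical and chosen only for illustration; they are not taken from LURIC or YFS, and this is not the study's pipeline.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp: diseased patients correctly flagged    fn: diseased patients missed
    tn: healthy patients correctly cleared     fp: healthy patients wrongly flagged
    """
    positives = tp + fn  # all patients who truly have the disease
    negatives = tn + fp  # all patients who truly do not
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one true positive-class and one negative-class case")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,   # true positive rate
        "specificity": tn / negatives,   # true negative rate
    }


# Hypothetical counts, for illustration only (not study data).
metrics = classification_metrics(tp=80, fn=20, tn=60, fp=40)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because sensitivity and specificity are computed on different subgroups (patients with and without disease), a model can improve substantially on one while accuracy moves only slightly, which is consistent with the pattern of changes reported in the abstract.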