Zanfardino Mario, Punzo Bruna, Maffei Erica, Saba Luca, Bossone Eduardo, Nistri Stefano, La Grutta Ludovico, Franzese Monica, Cavaliere Carlo, Cademartiri Filippo
IRCCS Synlab SDN, Naples, 80143, Italy.
Department of Imaging, Fondazione Monasterio/CNR, Pisa, 56124, Italy.
Comput Struct Biotechnol J. 2023 Nov 29;23:287-294. doi: 10.1016/j.csbj.2023.11.021. eCollection 2024 Dec.
The potential of precision population health lies in its capacity to use robust patient data for prevention and care tailored to specific groups. Machine learning has the potential to automatically identify clinically relevant subgroups of individuals from heterogeneous data sources. This study assessed whether unsupervised machine learning (UML) techniques could interpret diverse clinical data to uncover clinically significant subgroups of patients with suspected coronary artery disease, and to identify the ranges of aortic dimensions within each subgroup. We employed a random forest-based cluster analysis using 14 variables from 1170 participants (717 men, 453 women). The unsupervised clustering approach identified four distinct subgroups of individuals with specific clinical characteristics, which allowed us to interpret and assess the range of aortic dimensions in each cluster. Flexible UML algorithms can thus process heterogeneous patient data effectively and provide deeper insight into clinical interpretation and risk assessment.
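The abstract does not specify how the random forest-based cluster analysis was implemented. As an illustration only, one common way to do unsupervised clustering with a random forest is to train the forest to separate the real observations from a column-permuted synthetic copy, derive a proximity matrix from shared leaf nodes, and then cluster on the resulting dissimilarity. The sketch below makes several assumptions: the function names `rf_proximity` and `rf_cluster` are hypothetical, the input is taken to be already numerically encoded, the demo data are random numbers rather than patient data, and average-linkage hierarchical clustering is an arbitrary choice. Only the 14 variables and four clusters echo the abstract.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.ensemble import RandomForestClassifier


def rf_proximity(X, n_trees=100, seed=0):
    """Proximity matrix from a forest trained on real vs. permuted data."""
    rng = np.random.default_rng(seed)
    # Synthetic contrast set: permute each column independently, which keeps
    # the marginal distributions but breaks dependencies between variables.
    X_synth = np.column_stack(
        [rng.permutation(X[:, j]) for j in range(X.shape[1])]
    )
    X_all = np.vstack([X, X_synth])
    y_all = np.concatenate([np.ones(len(X)), np.zeros(len(X_synth))])

    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_all, y_all)

    # Leaf index of every real observation in every tree: (n_samples, n_trees)
    leaves = forest.apply(X)
    prox = np.zeros((len(X), len(X)))
    for t in range(leaves.shape[1]):
        col = leaves[:, t]
        # Two observations are "close" in a tree if they share a leaf.
        prox += col[:, None] == col[None, :]
    return prox / leaves.shape[1]


def rf_cluster(X, n_clusters=4, n_trees=100, seed=0):
    """Cluster observations on the dissimilarity 1 - proximity."""
    dist = 1.0 - rf_proximity(X, n_trees=n_trees, seed=seed)
    np.fill_diagonal(dist, 0.0)
    # Average linkage is an assumption; other linkage or partitioning
    # methods could be applied to the same dissimilarity matrix.
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Demo on random numbers (NOT patient data): 120 rows, 14 numeric variables.
demo_rng = np.random.default_rng(42)
X_demo = demo_rng.normal(size=(120, 14))
labels = rf_cluster(X_demo, n_clusters=4)
print(np.bincount(labels)[1:])  # cluster sizes
```

With real clinical data, categorical variables would need to be encoded first, and the number of clusters would normally be chosen with a validity criterion rather than fixed in advance.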