Bağcı Mehmet Furkan, Do Toan, Spierling Bagsic Samantha R, Gomez Rahul F, Jun Judy H, Ritko Anna L, Wenzel Sally E, Nguyen Truong, Öztürk Yusuf, Modena Brian D
Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, Calif.
Department of Electrical and Computer Engineering, San Diego State University, San Diego, Calif.
J Allergy Clin Immunol Glob. 2025 Apr 17;4(3):100473. doi: 10.1016/j.jacig.2025.100473. eCollection 2025 Aug.
Asthma is a heterogeneous disease with a diverse array of phenotypes that differ in inflammatory characteristics and severity. Identifying and classifying phenotypes in the real world could provide a foundation to improve and personalize asthma management. Leveraging machine learning in analyzing electronic health records (EHRs) provides an opportunity to identify real-world asthma phenotypes.
We applied machine-learning techniques to EHRs to detect and predict real-world severe asthma (SA) phenotypes and to improve the precision of asthma severity diagnoses.
Data from 31,795 asthma patients were extracted from a health care system's EHR; 1,112 of these patients met the inclusion criteria for analysis. Principal component analysis (PCA) and a Gaussian mixture model classified patients into subject clusters (SCs). Asthma severity was assessed using 2 predictive models: one based on the American Thoracic Society (ATS) definition, and a supervised model trained on 50 randomly selected patients whose disease severity had been predetermined by 2 independent physicians.
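The PCA and Gaussian mixture step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, standardizes the features before PCA (an assumption; the abstract does not state the preprocessing), and runs on a synthetic random matrix rather than EHR data. The function name `cluster_patients` and the 8 feature columns are hypothetical; the defaults of 3 components and 5 clusters mirror the numbers reported in the abstract.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def cluster_patients(features, n_components=3, n_clusters=5, seed=0):
    """Standardize features, project onto principal components, then
    assign each row to a Gaussian mixture component (hypothetical helper)."""
    # Put all variables on a comparable scale (assumed preprocessing step).
    scaled = StandardScaler().fit_transform(features)
    # Reduce the scaled features to a few principal components.
    pca = PCA(n_components=n_components, random_state=seed)
    scores = pca.fit_transform(scaled)
    # Fit a Gaussian mixture on the component scores and label each row.
    gmm = GaussianMixture(n_components=n_clusters, random_state=seed)
    labels = gmm.fit_predict(scores)
    return scores, labels, pca.explained_variance_ratio_


# Synthetic stand-in for a patient-by-feature matrix (NOT study data).
rng = np.random.default_rng(0)
features = rng.normal(size=(1112, 8))
scores, labels, explained = cluster_patients(features)
```

In practice the number of mixture components would typically be chosen by a model-selection criterion rather than fixed in advance; the abstract does not say how the authors chose it.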
Three principal components (PCs) emerged, reflecting lung function (PC1), blood inflammatory markers (PC2), and systemic corticosteroid receipt (PC3). PCA identified 5 distinct asthma phenotypes with significant clinical, physiologic, and inflammatory differences. The supervised model predicted SA with 92% precision and 85% accuracy. SC3 was classified as an inflammatory SA phenotype, making it highly suitable for biologic therapy.
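Precision and accuracy measure different things, which is why the two reported figures differ. Precision is the share of patients predicted to have SA who truly have it, whereas accuracy is the share of all patients classified correctly. The sketch below computes both from paired labels. The function name `precision_and_accuracy` and the example labels are made up for illustration and do not reproduce the study's results.

```python
def precision_and_accuracy(y_true, y_pred):
    """Return (precision, accuracy) for binary labels, where 1 = severe asthma."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    # Precision is undefined when nothing was predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs)
    return precision, accuracy


# Illustrative labels only (not study data): 10 patients, 4 with SA.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
precision, accuracy = precision_and_accuracy(y_true, y_pred)
# Here precision is 3/4 and accuracy is 8/10.
```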
Integrating machine learning with EHRs successfully classified and identified real-world asthma phenotypes, demonstrating the potential of this approach to identify SA for appropriate management and/or clinical studies.