Gryak Jonathan, Georgievska Aleksandra, Zhang Justin, Najarian Kayvan, Ravikumar Rajan, Sanders Georgiana, Schuler Charles F
Department of Computer Science, Queens College, City University of New York, New York, NY.
Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Mich.
J Allergy Clin Immunol Glob. 2024 Apr 7;3(3):100252. doi: 10.1016/j.jacig.2024.100252. eCollection 2024 Aug.
Clinical testing, including food-specific skin and serum IgE level tests, offers limited accuracy in predicting food allergy. Confirmatory oral food challenges (OFCs) are often required, but the associated risks, cost, and logistical difficulties constitute a barrier to proper diagnosis.
We sought to use advanced machine learning methods to integrate clinical variables associated with peanut allergy into a predictive model for OFC outcomes, with the aim of improving predictive performance over that of purely statistical methods.
Machine learning was applied to the Learning Early about Peanut Allergy (LEAP) study of 463 peanut OFCs and associated clinical variables. Patient-wise cross-validation was used to create ensemble models that were evaluated on holdout test sets. These models were further evaluated by using 2 additional peanut allergy OFC cohorts: the IMPACT study cohort and a local University of Michigan cohort.
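The abstract does not describe how the patient-wise folds were constructed, so the following is only a minimal sketch of the general idea: all challenges from one patient are kept in the same fold, so that no patient contributes to both training and evaluation. The function name `patient_wise_folds`, the round-robin assignment, and the example identifiers are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict


def patient_wise_folds(patient_ids, n_folds=5, seed=0):
    """Return (train_idx, test_idx) pairs in which no patient spans both sides.

    patient_ids[i] is the (hypothetical) patient identifier for row i,
    where each row is one oral food challenge.
    """
    # Group row indices by patient.
    rows_by_patient = defaultdict(list)
    for row, pid in enumerate(patient_ids):
        rows_by_patient[pid].append(row)

    # Shuffle patients reproducibly, then deal them round-robin into folds.
    patients = sorted(rows_by_patient)
    random.Random(seed).shuffle(patients)
    fold_patients = [patients[i::n_folds] for i in range(n_folds)]

    splits = []
    for held_out in fold_patients:
        held = set(held_out)
        test_idx = sorted(r for p in held_out for r in rows_by_patient[p])
        train_idx = sorted(
            r for p in patients if p not in held for r in rows_by_patient[p]
        )
        splits.append((train_idx, test_idx))
    return splits


# Hypothetical example: six challenges drawn from four patients.
example_ids = ["p1", "p1", "p2", "p3", "p3", "p4"]
example_splits = patient_wise_folds(example_ids, n_folds=2)
```

Splitting by patient rather than by row matters when a patient has more than one challenge, because row-level splitting would let closely related observations leak from the training set into the test set and inflate the measured performance.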
In the LEAP data set, the ensemble models achieved a maximum mean area under the curve of 0.997, with a sensitivity and specificity of 0.994 and 1.00, respectively. In the combined validation data sets, the top ensemble model achieved a maximum area under the curve of 0.871, with a sensitivity and specificity of 0.763 and 0.980, respectively.
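For readers less familiar with the three reported metrics, the sketch below shows one standard way to compute them from binary OFC outcomes (1 = reaction, 0 = no reaction). It is not the authors' evaluation code; the function names are hypothetical, and the area under the curve is computed with the pairwise-ranking (Mann-Whitney) formulation, which assumes both classes are present.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """Probability that a random positive case outscores a random negative case.

    Ties count as half a win. Requires at least one positive and one negative.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Sensitivity and specificity depend on the decision threshold applied to the model's scores, whereas the area under the curve summarizes ranking quality across all thresholds, which is why the abstract reports all three.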
Machine learning models have the potential to predict peanut OFC outcomes accurately, which could reduce the need for OFCs while increasing confidence in food allergy diagnoses.