IBM Research Europe.
IBM GBS Germany.
AMIA Annu Symp Proc. 2022 Feb 21;2021:526-535. eCollection 2021.
We develop several AI models to predict hospitalization in a large cohort of more than 110k US patients who tested positive for COVID-19, with data collected from March 2020 to February 2021. The models range from Random Forest to Neural Network (NN) and Time Convolutional NN, and the two data modalities (tabular and time-dependent) are combined at different stages (early fusion vs. model fusion). Despite strong class imbalance, the models reach an average precision of 0.96-0.98 (0.75-0.85), recall of 0.96-0.98 (0.74-0.85), and F-score of 0.97-0.98 (0.79-0.83) on the non-hospitalized (hospitalized) class. Performance does not drop significantly even when selected lists of features are removed to study how the models adapt to different scenarios. However, a systematic study of the SHAP feature importance values for the developed models in the different scenarios shows large variability across models and use cases. This calls for more complete studies of several explainability methods before they are adopted in high-stakes scenarios.
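The abstract reports precision, recall, and F-score separately for each class, which matters under class imbalance because a single aggregate score can hide weak performance on the minority (hospitalized) class. A minimal sketch of that per-class computation follows. The function name `per_class_metrics` and the toy labels are illustrative assumptions, not the study's code or data.

```python
from typing import Dict, Hashable, Sequence


def per_class_metrics(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> Dict[Hashable, Dict[str, float]]:
    """Return precision, recall and F1 for every label, one class at a time."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)

        # Guard against empty denominators (class never predicted or never present).
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Hypothetical imbalanced toy labels: 0 = non-hospitalized, 1 = hospitalized.
toy_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
toy_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
toy_metrics = per_class_metrics(toy_true, toy_pred)
```

On these toy labels the overall accuracy is 0.8, yet the minority class scores only 0.5 on all three metrics, which is why the two classes are best reported side by side.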