Providence St. Joseph Health, 1801 Lind Avenue S.W. Valley Office Park, Morin Bldg, 1st Floor, Renton, WA, 98057-9016, USA.
Providence Medical Research Center, 105 W 8th Ave, Suite 250E, Spokane, WA, 99204, USA.
Med Biol Eng Comput. 2022 Jul;60(7):2039-2049. doi: 10.1007/s11517-022-02549-5. Epub 2022 May 11.
Notable discrepancies in vulnerability to COVID-19 infection have been identified between specific population groups and regions in the USA. The purpose of this study was to estimate the likelihood of COVID-19 infection using a machine-learning algorithm that can be updated continuously based on health care data. Patient records were extracted for all COVID-19 nasal swab PCR tests performed within the Providence St. Joseph Health system from February to October of 2020. A total of 316,599 participants were included in this study, and approximately 7.7% (n = 24,358) tested positive for COVID-19. A gradient boosting model, LightGBM (LGBM), predicted risk of initial infection with an area under the receiver operating characteristic curve of 0.819. Factors that predicted infection were cough, fever, being a member of the Hispanic or Latino community, being Spanish speaking, having a history of diabetes or dementia, and living in a neighborhood with housing insecurity. A model trained on sociodemographic, environmental, and medical history data performed well in predicting risk of a positive COVID-19 test. This model could be used to tailor education, public health policy, and resources for communities that are at the greatest risk of infection.
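To make the reported metrics concrete, the following minimal sketch shows how an area under the receiver operating characteristic curve can be computed from predicted risks, and checks the positivity rate against the counts quoted in the abstract. It is not the study's code: the function name `roc_auc` and the toy labels and scores are hypothetical, and the pairwise formulation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case) is only practical for small samples.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive case is scored
    higher than a random negative case; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = positive test, 0 = negative.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.05, 0.10, 0.35, 0.40, 0.60, 0.20, 0.15, 0.90]
toy_auc = roc_auc(toy_labels, toy_scores)

# Positivity rate from the counts given in the abstract.
positivity_rate = 24_358 / 316_599

print(f"toy AUC: {toy_auc:.3f}")
print(f"positivity rate: {positivity_rate:.1%}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported 0.819 means that in roughly 82% of positive-negative pairs the model ranks the positive case higher.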