Predictive models of long COVID.

Affiliations

Department of Computer Science, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, VA, 24061, USA.

The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.

Publication information

EBioMedicine. 2023 Oct;96:104777. doi: 10.1016/j.ebiom.2023.104777. Epub 2023 Sep 4.

DOI:10.1016/j.ebiom.2023.104777

PMID:37672869

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10494314/

Abstract

BACKGROUND

The cause and symptoms of long COVID are poorly understood. It is challenging to predict whether a given COVID-19 patient will develop long COVID in the future.

METHODS

We used electronic health record (EHR) data from the National COVID Cohort Collaborative (N3C) to predict the incidence of long COVID. We trained two machine learning (ML) models: logistic regression (LR) and random forest (RF). Features used to train the predictors included symptoms and drugs ordered during acute infection, measures of COVID-19 treatment, pre-COVID comorbidities, and demographic information. We assigned the 'long COVID' label to patients diagnosed with the ICD-10-CM code U09.9. The cohorts included patients with (a) EHRs reported by data partners using the U09.9 ICD-10-CM code and (b) at least one EHR in each feature category. We analysed three cohorts: all patients (n = 2,190,579; 17,036 diagnosed with long COVID), inpatients (n = 149,319; 3,295), and outpatients (n = 2,041,260; 13,741).
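The cohort counts above imply a strongly imbalanced prediction task, which matters when reading any accuracy-style metric. The short sketch below only re-does the arithmetic on the reported numbers; the dictionary layout and the `prevalence` helper are illustrative names, not part of the study. That the inpatient and outpatient cohorts partition the full cohort is inferred from the sums, which the code checks.

```python
# Cohort sizes and long COVID (U09.9) counts as reported in the Methods text.
cohorts = {
    "all": {"patients": 2_190_579, "long_covid": 17_036},
    "inpatient": {"patients": 149_319, "long_covid": 3_295},
    "outpatient": {"patients": 2_041_260, "long_covid": 13_741},
}


def prevalence(cohort):
    """Fraction of the cohort carrying the long COVID label."""
    return cohort["long_covid"] / cohort["patients"]


# Check that inpatients and outpatients add up to the full cohort.
parts_patients = cohorts["inpatient"]["patients"] + cohorts["outpatient"]["patients"]
parts_cases = cohorts["inpatient"]["long_covid"] + cohorts["outpatient"]["long_covid"]
assert parts_patients == cohorts["all"]["patients"]
assert parts_cases == cohorts["all"]["long_covid"]

for name, cohort in cohorts.items():
    print(f"{name:>10}: {prevalence(cohort):.2%} labelled long COVID")
```

The labelled fraction is under 1% overall and a little over 2% among inpatients, so a threshold-free ranking metric such as AUROC is a more informative summary here than raw accuracy.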

FINDINGS

The LR and RF models yielded median areas under the receiver operating characteristic curve (AUROC) of 0.76 and 0.75, respectively. An ablation study revealed that drugs had the highest influence on the prediction task. The SHAP method identified age, gender, cough, fatigue, albuterol, obesity, diabetes, and chronic lung disease as explanatory features. Models trained on data from one N3C partner and tested on data from the other partners had an average AUROC of 0.75.
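To make the AUROC figures concrete: an AUROC of 0.76 can be read as a 76% chance that a randomly chosen long COVID patient receives a higher model score than a randomly chosen patient without the diagnosis. The sketch below computes AUROC directly from that pairwise definition. The labels and scores are hypothetical, and the function is a teaching aid rather than the authors' evaluation code; in practice a library routine such as scikit-learn's `roc_auc_score` would normally be used.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form costs O(n_pos * n_neg), so it
    is meant for illustration, not for cohorts with millions of patients.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = long COVID) and model scores for four patients.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# Three of the four positive-negative pairs are ranked correctly.
print(auroc(y_true, y_score))
```

Because only the ordering of scores matters, AUROC is unaffected by how rare the positive label is, which suits a cohort where fewer than 1 in 100 patients carry the U09.9 code.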

INTERPRETATION

ML-based classification using EHR information from the acute infection period is effective in predicting long COVID. SHAP methods identified important features for prediction. Cross-site analysis demonstrated the generalizability of the proposed methodology.
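The cross-site analysis can be pictured as a loop: fit a model on one data partner, score every other partner, and average the resulting AUROCs. The sketch below shows that loop under loud assumptions: the three "sites" are synthetic, the features are made up, and a small gradient-descent logistic regression stands in for the study's LR and RF models. All function names (`fit_logistic`, `cross_site_auroc`, `make_site`) are hypothetical.

```python
import math
import random
from statistics import mean


def auroc(labels, scores):
    """Pairwise AUROC; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, X):
    """Linear scores; AUROC depends only on their ranking."""
    w, b = model
    return [b + sum(wj * xj for wj, xj in zip(w, xi)) for xi in X]


def cross_site_auroc(sites):
    """Train on each site in turn, test on every other site, average the AUROCs."""
    results = {}
    for train_name, (X_train, y_train) in sites.items():
        model = fit_logistic(X_train, y_train)
        results[train_name] = mean(
            auroc(y, predict(model, X))
            for name, (X, y) in sites.items()
            if name != train_name
        )
    return results


def make_site(rng, shift, n=200):
    """Synthetic site: feature 0 is informative, feature 1 is pure noise."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        X.append([rng.gauss(1.5 * label + shift, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y


rng = random.Random(0)
sites = {f"site_{i}": make_site(rng, shift=0.2 * i) for i in range(3)}
site_scores = cross_site_auroc(sites)
for name, score in site_scores.items():
    print(f"trained on {name}: mean AUROC on other sites = {score:.2f}")
```

The per-site `shift` mimics a systematic offset between partners. A constant offset does not change rankings within a test site, so this toy setup probes only the simplest kind of site difference; real partners can also differ in coding practice and case mix.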

FUNDING

NCATS U24 TR002306, NCATS UL1 TR003015, Axle Informatics Subcontract: NCATS-P00438-B, NIH/NIDDK/OD, PSR2015-1720GVALE_01, G43C22001320007, and Director, Office of Science, Office of Basic Energy Sciences of the U.S. Department of Energy Contract No. DE-AC02-05CH11231.
