Multi-scale Data Improves Performance of Machine Learning Model for Long COVID Prediction.

Authors

Wei Wei-Qi, Guardo Christopher, Zhang Xinmeng, Gandireddy Srushti, Yan Chao, Kerchberger Vern, Dickson Alyson, Pfaff Emily, Master Hiral, Basford Melissa, Chute Christopher, Tran Nguyen, Manusco Salvatore, Syed Toufeeq, Zhao Zhongming, Feng QiPing, Haendel Melissa, Lunt Christopher, Harris Paul, Li Lang, Ginsburg Geoffrey, Denny Joshua, Roden Dan

Affiliations

Vanderbilt University Medical Center.

University of North Carolina, USA.

Publication information

Res Sq. 2025 Aug 31:rs.3.rs-7234976. doi: 10.21203/rs.3.rs-7234976/v1.

DOI:10.21203/rs.3.rs-7234976/v1
PMID:40909786
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12408029/
Abstract

Long COVID affects a substantial proportion of the over 778 million individuals infected with SARS-CoV-2, yet predictive models remain limited in scope. While existing efforts, such as the National COVID Cohort Collaborative (N3C), have leveraged electronic health record (EHR) data for risk prediction, accumulating evidence points to additional contributions from social, behavioral, and genetic factors. Using a diverse cohort of SARS-CoV-2-infected individuals (n>17,200) from the NIH All of Us Research Program, we investigated whether integrating EHR data with survey-based and genomic information improves model performance. Our multi-scale approach outperformed EHR-only models (original AUROC 0.736; 95% CI: 0.730, 0.741), achieving an AUROC of 0.748 (0.741, 0.755). Active-duty service status, self-reported fatigue, and chr19:4719431:G:A_A were among the most informative survey and genetic features. These findings highlight the importance of incorporating multi-scale data to improve risk stratification and inform personalized interventions for long COVID.
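The abstract reports each AUROC with an interval but does not say how the intervals were computed. As an illustration only, the sketch below shows one common way to get such a pair of numbers: a rank-based AUROC with a percentile bootstrap interval. The function names `auroc` and `bootstrap_ci`, the resampling scheme, and all parameter defaults are assumptions made for this example, not the authors' method.

```python
import random

def auroc(labels, scores):
    """AUROC via the Mann-Whitney rank-sum identity; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied scores starting at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (assumed scheme, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data: 0/1 outcome labels and model scores (hypothetical values).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)
toy_ci = bootstrap_ci(toy_labels, toy_scores, n_boot=200)
```

Comparing two models in this way means running both on the same held-out individuals and reporting each model's point estimate with its interval. A paired bootstrap of the AUROC difference would be a stricter comparison.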

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf3d/12408029/024270943063/nihpp-rs7234976v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf3d/12408029/8446000bb7a6/nihpp-rs7234976v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf3d/12408029/49a8ceacca57/nihpp-rs7234976v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf3d/12408029/3bdb74c1b438/nihpp-rs7234976v1-f0004.jpg
