A machine learning-based phenotype for long COVID in children: An EHR-based study from the RECOVER program.

Affiliations

Applied Clinical Research Center, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America.

Department of Health Management and Informatics, University of Missouri School of Medicine, Columbia, Missouri, United States of America.

Publication information

PLoS One. 2023 Aug 10;18(8):e0289774. doi: 10.1371/journal.pone.0289774. eCollection 2023.

Abstract

As clinical understanding of pediatric post-acute sequelae of SARS-CoV-2 (PASC) develops, and the clinical definition evolves with it, a method is needed to reliably identify patients who are likely to have PASC in health systems data. In this study, we developed and validated a machine learning algorithm to classify which patients have PASC (distinguishing between Multisystem Inflammatory Syndrome in Children (MIS-C) and non-MIS-C variants) from a cohort of patients with positive SARS-CoV-2 test results in pediatric health systems within the PEDSnet EHR network. Patient features included in the model were selected from conditions, procedures, performance of diagnostic testing, and medications using a tree-based scan statistic approach. We used an XGBoost model, with hyperparameters selected through cross-validated grid search, and assessed model performance using 5-fold cross-validation. Model predictions and feature importance were evaluated using SHapley Additive exPlanations (SHAP) values. The model provides a tool for identifying patients with PASC and an approach to characterizing PASC using diagnosis, medication, laboratory, and procedure features in health systems data. With appropriate threshold settings, the model can identify PASC patients in health systems data at higher precision for inclusion in studies, or at higher recall when screening for clinical trials, especially in settings where PASC diagnosis codes are used less frequently or less reliably. Analyzing how specific features contribute to the classification may help improve understanding of the features associated with PASC diagnoses.
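The precision/recall trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the scores and labels below are made-up toy values standing in for classifier output, and the function names are hypothetical. It only shows how moving the decision threshold on a model's predicted probabilities favors either precision (cohort inclusion) or recall (screening).

```python
from typing import List, Optional, Tuple

def precision_recall(scores: List[float], labels: List[int],
                     threshold: float) -> Tuple[float, float]:
    """Precision and recall when every score >= threshold is called positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def lowest_threshold_with_precision(scores: List[float], labels: List[int],
                                    min_precision: float) -> Optional[float]:
    """Smallest observed score that meets the precision target.

    Recall never increases as the threshold rises, so the smallest
    qualifying threshold keeps recall as high as possible.
    """
    for t in sorted(set(scores)):
        if precision_recall(scores, labels, t)[0] >= min_precision:
            return t
    return None

def highest_threshold_with_recall(scores: List[float], labels: List[int],
                                  min_recall: float) -> Optional[float]:
    """Largest observed score that still meets the recall target."""
    for t in sorted(set(scores), reverse=True):
        if precision_recall(scores, labels, t)[1] >= min_recall:
            return t
    return None

# Toy data (assumed for illustration only): predicted probabilities
# and true labels (1 = has the condition, 0 = does not).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

# Higher-precision setting, e.g. selecting patients for a study cohort.
study_threshold = lowest_threshold_with_precision(scores, labels, 0.8)

# Higher-recall setting, e.g. screening candidates for a clinical trial.
screening_threshold = highest_threshold_with_recall(scores, labels, 1.0)
```

On this toy data the study setting selects a stricter cutoff than the screening setting, so it flags fewer patients but a larger share of those flagged are true cases.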

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/91b4/10414557/e70403154ef5/pone.0289774.g001.jpg
