Feasibility of Machine Learning Analysis for the Identification of Patients with Possible Primary Ciliary Dyskinesia.

Author information

Burns Gully, Kauffman Carey, Manion Michele, Pai Ruth-Anne, Milla Carlos, O'Connor Michael G, Shapiro Adam J, Bjornson-Pennell Heidi

Affiliations

Chan Zuckerberg Initiative, PO BOX 8040, Redwood City, CA 94063.

The Primary Ciliary Dyskinesia Foundation, Minneapolis, MN, USA.

Publication information

medRxiv. 2025 Apr 20:2025.04.18.25326065. doi: 10.1101/2025.04.18.25326065.

DOI:10.1101/2025.04.18.25326065

PMID:40321264

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC12047924/

Abstract

BACKGROUND

Diagnostic delays are common in primary ciliary dyskinesia (PCD), a rare and substantially underdiagnosed disease. Scalable screening methods could improve early identification and health outcomes.

RESEARCH QUESTION

Can machine learning (ML) be used to screen for PCD in pediatric patients?

STUDY DESIGN AND METHODS

We evaluated the feasibility of a random forest model to screen for PCD using data from the PCD Foundation Registry and a national claims database. We identified a cohort of pediatric patients with diagnostic codes indicative of conditions potentially associated with PCD, and studied diagnostic, procedural, and pharmaceutical codes associated with PCD to develop ML features. Models were trained on composite claims data from three groups: confirmed patients with PCD; patients with Q34.8 (other specified congenital malformations of the respiratory system) diagnosed within six months of an electron microscopy procedure (Q34.8+EM); and a randomly selected, matched control group. Model performance was tested through 5-fold cross-validation.
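The evaluation protocol described above (a random forest scored by 5-fold cross-validation) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the feature matrix is synthetic stand-in data, and the stratified splitting, the hyperparameters (`n_estimators=200`, `class_weight="balanced"`), and the helper name `cross_validate_screen` are all assumptions that the abstract does not specify.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import StratifiedKFold


def cross_validate_screen(X, y, n_splits=5, seed=0):
    """Return per-fold (PPV, sensitivity) for a random forest classifier.

    Hypothetical helper: stratified folds and the hyperparameters below
    are illustrative assumptions, not values taken from the study.
    """
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    results = []
    for train_idx, test_idx in folds.split(X, y):
        model = RandomForestClassifier(
            n_estimators=200, class_weight="balanced", random_state=seed
        )
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        # PPV is precision; sensitivity is recall on the positive class.
        ppv = precision_score(y[test_idx], pred, zero_division=0)
        sens = recall_score(y[test_idx], pred, zero_division=0)
        results.append((ppv, sens))
    return results


# Synthetic stand-in for claims-derived features (e.g. per-patient code counts).
# Cases are drawn with higher code counts than controls purely for illustration.
rng = np.random.default_rng(0)
n_cases, n_controls, n_features = 80, 800, 12
X_cases = rng.poisson(3.0, size=(n_cases, n_features))
X_controls = rng.poisson(1.0, size=(n_controls, n_features))
X = np.vstack([X_cases, X_controls]).astype(float)
y = np.concatenate([np.ones(n_cases, dtype=int), np.zeros(n_controls, dtype=int)])

fold_metrics = cross_validate_screen(X, y)
for i, (ppv, sens) in enumerate(fold_metrics, start=1):
    print(f"fold {i}: PPV={ppv:.2f} sensitivity={sens:.2f}")
```

Reporting the minimum and maximum across folds gives ranges of the kind quoted in the results; with few positive cases per fold, those ranges can be wide.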

RESULTS

Using 82 confirmed PCD cases and 4,161 matched controls, the model demonstrated variable performance (positive predictive value 0.45-0.73, sensitivity 0.75-0.94). Synthetic data augmentation did not improve results (positive predictive value 0.45-0.67, sensitivity 0.71-1.00). Expanding the dataset to include 319 Q34.8+EM patients and 8,214 controls improved performance (positive predictive value 0.51-0.54, sensitivity 0.82-0.90), suitable for screening. In a cohort of 1.32 million pediatric patients, 7,705 were classified as positive, consistent with the estimated prevalence of PCD (1:7,554).
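For readers less familiar with the two metrics reported above, the sketch below shows how positive predictive value and sensitivity follow from confusion-matrix counts. The counts are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the study, and `screening_metrics` is an illustrative helper name.

```python
def screening_metrics(tp, fp, fn):
    """Compute (PPV, sensitivity) from confusion-matrix counts.

    PPV         = TP / (TP + FP): share of flagged patients who are true cases.
    Sensitivity = TP / (TP + FN): share of true cases that are flagged.
    """
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("PPV or sensitivity is undefined for these counts")
    return tp / (tp + fp), tp / (tp + fn)


# Hypothetical counts for one held-out fold (not from the study).
ppv, sens = screening_metrics(tp=14, fp=12, fn=2)
print(f"PPV={ppv:.2f} sensitivity={sens:.2f}")
```

In a screening setting, high sensitivity is typically prioritized so that few true cases are missed, and a moderate PPV is tolerated because flagged patients go on to confirmatory diagnostic testing.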

INTERPRETATION

This study demonstrates the feasibility of using ML to screen for PCD with claims data, even in the absence of a specific International Classification of Diseases (ICD) code. Such screening approaches may help identify individuals who could benefit from timely diagnostic testing and targeted interventions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8682/12047924/b84fb250e969/nihpp-2025.04.18.25326065v1-f0001.jpg
