Chen Fangyi, Ahimaz Priyanka, Nguyen Quan M, Lewis Rachel, Chung Wendy K, Ta Casey N, Szigety Katherine M, Sheppard Sarah E, Campbell Ian M, Wang Kai, Weng Chunhua, Liu Cong
Department of Biomedical Informatics, Columbia University, New York, NY, USA.
Department of Pediatrics, Columbia University, New York, NY, USA.
NPJ Digit Med. 2024 Nov 21;7(1):333. doi: 10.1038/s41746-024-01331-1.
Patients with rare diseases often experience prolonged diagnostic delays. Ordering appropriate genetic tests is crucial yet challenging, especially for general pediatricians without genetic expertise. Recent American College of Medical Genetics (ACMG) guidelines embrace early use of exome sequencing (ES) or genome sequencing (GS) for conditions such as congenital anomalies or developmental delays, while still recommending gene panels for patients exhibiting strong manifestations of a specific disease. Recognizing the difficulty of navigating these options, we developed a machine learning model trained on 1005 patient records from Columbia University Irving Medical Center to recommend appropriate genetic tests based on phenotype information. The model achieved an AUROC of 0.823 and an AUPRC of 0.918, aligning closely with decisions made by genetic specialists, and generalized to an external cohort (AUROC: 0.77, AUPRC: 0.816), indicating its potential value in helping general pediatricians expedite rare disease diagnosis through better genetic test ordering.
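For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how AUROC and AUPRC (computed here as average precision) can be derived from binary labels and model scores. It is not the authors' evaluation code. The function names, the toy data, and the convention that label 1 means "specialist ordered ES/GS" and label 0 means "specialist ordered a gene panel" are all assumptions made for illustration.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve: the mean of the
    precision values measured at the rank of each positive case.
    Tied scores are kept in input order (a simplification)."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Invented toy data, NOT taken from the study.
# Assumed convention: 1 = ES/GS ordered, 0 = gene panel ordered.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(f"AUROC: {auroc(toy_labels, toy_scores):.3f}")
print(f"AUPRC: {average_precision(toy_labels, toy_scores):.3f}")
```

AUPRC depends on the proportion of positive cases in the cohort, so values from cohorts with different class balance (such as an internal and an external cohort) are not directly comparable, whereas AUROC is insensitive to class balance.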