Department of Computer Science, University of California Los Angeles, Los Angeles, CA 90095, USA.
Department of Pathology and Laboratory Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA.
Sci Transl Med. 2024 May;16(745):eade4510. doi: 10.1126/scitranslmed.ade4510. Epub 2024 May 1.
Human inborn errors of immunity include rare disorders in which impaired B cells cause functional and quantitative antibody deficiencies, collectively called the common variable immunodeficiency (CVID) phenotype. Patients with CVID face delays of 5 to 15 years between symptom onset and diagnosis and treatment, because the disorders are rare (prevalence of ~1/25,000) and CVID phenotypes are highly heterogeneous, ranging from infections to autoimmunity to inflammatory conditions, and overlap with other, more common disorders. The prolonged diagnostic odyssey drives excessive system-wide costs before diagnosis. Because there is no single causal mechanism, there are no genetic tests to definitively diagnose CVID. Here, we present PheNet, a machine learning algorithm that identifies patients with CVID from their electronic health records (EHRs). PheNet learns phenotypic patterns from verified CVID cases and uses this knowledge to rank patients by likelihood of having CVID. PheNet could have diagnosed more than half of our patients with CVID at least 1 year earlier than they were actually diagnosed. When PheNet was applied to a large EHR dataset, blinded chart review of its top 100 ranked patients found that 74% were highly probable to have CVID. We externally validated PheNet using >6 million records from disparate medical systems in California and Tennessee. As artificial intelligence and machine learning make their way into health care, we show that algorithms such as PheNet can offer clinical benefits by expediting the diagnosis of rare diseases.
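The idea of learning phenotypic patterns from verified cases and then ranking patients can be illustrated with a minimal sketch. This is not the published PheNet model: it assumes a simple smoothed log-odds weight per phenotype code, and every function name, phenotype label, and patient record below is hypothetical toy data. The last helper shows how a figure such as "74% of the top 100" corresponds to precision among the top-ranked patients.

```python
import math
from collections import Counter


def learn_weights(cases, controls, smoothing=1.0):
    """Smoothed log-odds per phenotype code (an assumed scoring scheme).

    A positive weight means the code is seen more often in verified
    cases than in controls.
    """
    case_counts = Counter(code for record in cases for code in set(record))
    ctrl_counts = Counter(code for record in controls for code in set(record))
    weights = {}
    for code in set(case_counts) | set(ctrl_counts):
        p_case = (case_counts[code] + smoothing) / (len(cases) + 2 * smoothing)
        p_ctrl = (ctrl_counts[code] + smoothing) / (len(controls) + 2 * smoothing)
        weights[code] = math.log(p_case / p_ctrl)
    return weights


def risk_score(patient_codes, weights):
    """Sum the weights of the codes present in one patient's record."""
    return sum(weights.get(code, 0.0) for code in set(patient_codes))


def rank_patients(patients, weights):
    """Return patient IDs ordered from highest to lowest score."""
    return sorted(
        patients,
        key=lambda pid: risk_score(patients[pid], weights),
        reverse=True,
    )


def precision_at_k(ranked_ids, confirmed_ids, k):
    """Fraction of the top-k ranked patients confirmed on chart review."""
    top = ranked_ids[:k]
    return sum(1 for pid in top if pid in confirmed_ids) / len(top)


# Hypothetical toy data: phenotype labels are illustrative only.
cases = [{"pneumonia", "sinusitis", "low_igg"}, {"sinusitis", "low_igg", "itp"}]
controls = [{"hypertension"}, {"sinusitis"}, {"hypertension", "diabetes"}]
patients = {
    "A": {"low_igg", "sinusitis"},
    "B": {"hypertension"},
    "C": {"sinusitis"},
}

weights = learn_weights(cases, controls)
ranking = rank_patients(patients, weights)
print(ranking)
print(precision_at_k(ranking, confirmed_ids={"A"}, k=2))
```

In practice, the top of such a ranking would go to clinicians for chart review, as the abstract describes; the score only prioritizes whom to review and does not itself establish a diagnosis.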