Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium.
Data Mining and Modeling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium.
Front Immunol. 2019 Aug 30;10:2009. doi: 10.3389/fimmu.2019.02009. eCollection 2019.
Common variable immunodeficiency (CVID) is one of the most frequently diagnosed primary antibody deficiencies (PADs), a group of disorders characterized by a decrease in one or more immunoglobulin (sub)classes and/or impaired antibody responses caused by inborn defects in B cells in the absence of other major immune defects. CVID patients suffer from recurrent infections and disease-related, non-infectious complications such as autoimmune manifestations, lymphoproliferation, and malignancies. A timely diagnosis is essential for optimal follow-up and treatment. However, CVID is by definition a diagnosis of exclusion; it therefore covers a heterogeneous patient population, which makes a definite diagnosis difficult to establish. To aid the diagnosis of CVID and distinguish it from other PADs, we developed a machine learning pipeline that performs automated diagnosis based on flow cytometric immunophenotyping. Using this pipeline, we analyzed the immunophenotypic profiles of a pediatric and adult cohort of 28 patients with CVID, 23 patients with idiopathic primary hypogammaglobulinemia, 21 patients with IgG subclass deficiency, six patients with isolated IgA deficiency, one patient with isolated IgM deficiency, and 100 unrelated healthy controls. Flow cytometry analysis is traditionally done by manual identification of the cell populations of interest. Yet this approach has severe limitations, including the subjectivity of manual gating and a bias toward known populations. To overcome these limitations, we propose an automated computational flow cytometry pipeline that successfully distinguishes CVID phenotypes from other PADs and healthy controls.
Compared to the traditional manual analysis, our pipeline is fully automated: it performs quality control and data pre-processing, identifies cell populations (gating), and derives features from these populations to build a machine learning classifier that distinguishes CVID from other PADs and healthy controls. This results in a more reproducible flow cytometry analysis and improves the diagnosis compared to manual analysis: our pipeline achieves an average balanced accuracy of 0.93 (±0.07), whereas the manually extracted populations yield an average balanced accuracy of 0.72 (±0.23).
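The abstract does not state how the balanced accuracy score was computed. A minimal sketch, assuming the usual definition (the mean of the per-class recalls), shows why this metric suits a cohort in which healthy controls outnumber each patient group. The function name `balanced_accuracy` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally regardless of size.

    Assumption: this is the standard definition of balanced accuracy; the
    abstract does not say which implementation the authors used.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correctly predicted samples per true class
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (not study data): 4 CVID samples and 16 others.
y_true = ["CVID"] * 4 + ["other"] * 16
# The classifier misses half of the CVID samples and gets every other sample right.
y_pred = ["CVID"] * 2 + ["other"] * 2 + ["other"] * 16

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = balanced_accuracy(y_true, y_pred)
print(f"plain accuracy: {plain_accuracy:.2f}, balanced accuracy: {score:.2f}")
```

In this toy case the plain accuracy looks high because the majority class dominates, while the balanced accuracy exposes the missed minority-class samples.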