Division of Rheumatology, Department of Paediatrics, The Hospital for Sick Children (SickKids), Toronto, Ontario, Canada.
Department of Immunology, University of Toronto, Toronto, Ontario, Canada.
PLoS Med. 2019 Feb 26;16(2):e1002750. doi: 10.1371/journal.pmed.1002750. eCollection 2019 Feb.
Joint inflammation is the common feature underlying juvenile idiopathic arthritis (JIA). Clinicians recognize patterns of joint involvement that are not currently part of the International League of Associations for Rheumatology (ILAR) classification. Using unsupervised machine learning, we sought to uncover data-driven joint patterns that predict clinical phenotype and disease trajectories.
We analyzed prospectively collected clinical data, including joint involvement recorded on a standard 71-joint homunculus, for 640 discovery patients with newly diagnosed JIA enrolled in a Canada-wide study. Patients were followed serially for five years, were treatment-naïve except for nonsteroidal anti-inflammatory drugs (NSAIDs), and were diagnosed within one year of symptom onset. Twenty-one patients had systemic arthritis, 300 oligoarthritis, 125 rheumatoid factor (RF)-negative polyarthritis, 16 RF-positive polyarthritis, 37 psoriatic arthritis, 78 enthesitis-related arthritis (ERA), and 63 undifferentiated arthritis. At diagnosis, we observed global hierarchical groups of co-involved joints. To characterize these patterns, we developed sparse multilayer non-negative matrix factorization (NMF). Model selection by internal bi-cross-validation identified seven joint patterns at presentation, to which all 640 discovery patients were assigned: pelvic girdle (57 patients), fingers (25), wrists (114), toes (48), ankles (106), knees (283), and indistinct (7). Patterns were distinct from clinical subtypes (P < 0.001 by χ² test) and were reproduced in an independent, prospectively collected validation cohort of 119 patients (reconstruction accuracy Q² = 0.55 for patterns; 0.35 for groups). Some patients matched multiple patterns. To determine whether their disease outcomes differed, we further subdivided the 640 discovery patients into three subgroups by degree of localization, defined as the percentage of a patient's active joints aligning with the assigned pattern: localized (≥90%; 359 patients), partially localized (60%-90%; 124), or extended (<60%; 157). Over the five-year follow-up period, localized patients more often maintained their baseline patterns (P < 0.05 for five groups by permutation test) than nonlocalized patients (P < 0.05 for three groups by permutation test).
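The localization thresholds can be illustrated with a minimal Python sketch. It assumes each pattern can be represented as a plain set of joint names, which is a simplification for illustration only; the abstract does not describe how alignment with a pattern was computed from the NMF output. The function names, the joint names, and the example `KNEES` set are hypothetical, and treating exactly 90% as localized and exactly 60% as partially localized is one reading of the stated ranges.

```python
def localization_fraction(active_joints, pattern_joints):
    """Fraction of a patient's active joints that belong to the assigned pattern."""
    active = set(active_joints)
    if not active:
        raise ValueError("patient has no active joints")
    return len(active & set(pattern_joints)) / len(active)


def localization_group(fraction):
    """Map a fraction in [0, 1] to the three subgroups named in the abstract."""
    if fraction >= 0.90:
        return "localized"
    if fraction >= 0.60:
        return "partially localized"
    return "extended"


# Hypothetical pattern definition: a set of joint names (an assumption).
KNEES = {"left_knee", "right_knee"}

# Two of three active joints lie in the pattern, so the fraction is 2/3.
patient = ["left_knee", "right_knee", "left_ankle"]
group = localization_group(localization_fraction(patient, KNEES))
```

Here `group` is `"partially localized"`, because about 67% of the patient's active joints fall inside the assigned pattern.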
We modelled time to zero active joints in the discovery cohort using a multivariate Cox proportional hazards model considering joint pattern, degree of localization, and ILAR subtype. Despite receiving more intensive treatment, nonlocalized patients took one year for 50% to reach zero active joints, compared with six months for localized patients. Overall, localized patients required less time to reach zero active joints than partially localized patients (P = 0.0018 by log-rank test) or extended patients (P = 0.0057). Potential limitations include the requirement for patients to be treatment-naïve (except for NSAIDs), which may skew the patient cohorts towards milder disease, and the size of the validation cohort, which precluded multivariate analyses of disease trajectories.
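A statement such as "50% reached zero joints at one year" is a median time-to-event. The abstract does not say how these medians were estimated; the sketch below assumes a Kaplan-Meier estimate, which is a common choice alongside Cox models and log-rank tests. The function names and the follow-up data are hypothetical and are not taken from the study.

```python
def kaplan_meier(durations, events):
    """Return [(time, probability of not yet having the event)] at each event time.

    durations: follow-up time per patient (e.g. months).
    events: 1 if the patient reached zero joints at that time, 0 if censored.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    surviving = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_at_t = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_at_t += 1
            i += 1
        if n_events > 0:
            surviving *= 1 - n_events / at_risk
            curve.append((t, surviving))
        at_risk -= n_at_t
    return curve


def median_time(curve):
    """First time at which at most half of patients have not yet had the event."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Hypothetical follow-up data in months (illustration only).
months = [3, 6, 6, 9, 12]
reached_zero = [1, 1, 1, 0, 1]
curve = kaplan_meier(months, reached_zero)
median = median_time(curve)
```

With these made-up data the estimated median is 6 months: after the two events at month 6, only 40% of patients have not yet reached zero joints.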
Multilayer NMF identified patterns of joint involvement that predicted disease trajectory in children with arthritis. Our hierarchical unsupervised approach identified a new clinical feature, degree of localization, which predicted outcomes in both cohorts. Detailed assessment of every joint is already part of every musculoskeletal exam for children with arthritis. Our study supports both the continued collection of detailed joint involvement and the inclusion of patterns and degrees of localization to stratify patients and inform treatment decisions. This will advance pediatric rheumatology from counting joints to realizing the potential of the data contained in patterns of joint involvement.