Department of Pediatrics, University of Saskatchewan, Saskatoon, Canada.
Department of Computer Sciences, University of Saskatchewan.
Rheumatology (Oxford). 2020 May 1;59(5):1066-1075. doi: 10.1093/rheumatology/kez382.
To identify discrete clusters comprising clinical features and inflammatory biomarkers in children with JIA and to determine cluster alignment with JIA categories.
A Canadian prospective inception cohort comprising 150 children with JIA was evaluated at baseline (visit 1) and after six months (visit 2). Data included clinical manifestations and inflammation-related biomarkers. Probabilistic principal component analysis identified sets of composite variables, or principal components, from 191 original variables. To discern new clinical-biomarker clusters (hereafter, clusters), Gaussian mixture models were fit to the data. Newly defined clusters and JIA categories were then compared, and agreement between the two was assessed using Kruskal-Wallis analyses and contingency plots.
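The dimensionality-reduction and clustering steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's standard `PCA` as a stand-in for probabilistic principal component analysis, assumes the number of mixture components is chosen by the Bayesian information criterion (the abstract does not say how cluster counts were selected), and runs on synthetic random data rather than the study cohort. The function name `cluster_profiles` and its parameters are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def cluster_profiles(X, n_components=3, max_clusters=6, seed=0):
    """Reduce patient profiles to principal components, then fit Gaussian mixtures.

    Returns (cluster labels, fraction of variance retained, number of clusters).
    The cluster count is picked by lowest BIC, which is an assumption of this
    sketch and not a detail taken from the abstract.
    """
    # Put all variables on a common scale before extracting components.
    Z = StandardScaler().fit_transform(X)

    # Standard PCA used here as a stand-in for probabilistic PCA (assumption).
    pca = PCA(n_components=n_components, random_state=seed)
    scores = pca.fit_transform(Z)

    # Fit one mixture per candidate cluster count and keep the lowest-BIC model.
    best_bic, best_model = None, None
    for k in range(1, max_clusters + 1):
        gmm = GaussianMixture(
            n_components=k, covariance_type="full", n_init=5, random_state=seed
        ).fit(scores)
        bic = gmm.bic(scores)
        if best_bic is None or bic < best_bic:
            best_bic, best_model = bic, gmm

    labels = best_model.predict(scores)
    explained = float(pca.explained_variance_ratio_.sum())
    return labels, explained, best_model.n_components


# Synthetic stand-in with the abstract's dimensions: 150 patients x 191 variables.
# These values are random noise, NOT the study data.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 191))
labels, explained, n_clusters = cluster_profiles(X)
```

Here `explained` corresponds to the share of variance recovered by the retained components, and `labels` assigns each patient to one mixture component.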
Three principal components recovered 35% (three clusters) and 40% (five clusters) of the variance in patient profiles at visits 1 and 2, respectively. None of the clusters aligned precisely with any of the seven JIA categories; instead, each spanned multiple categories. Results demonstrated that the newly defined clinical-biomarker clusters are more homogeneous than JIA categories.
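To make the idea of a cluster spanning multiple categories concrete, the cross-tabulation of cluster labels against JIA categories might be sketched as follows. The helper names `contingency` and `categories_spanned` are hypothetical, and the eight patient records are toy values invented for illustration, not results from the study.

```python
from collections import Counter


def contingency(clusters, categories):
    """Count patients per (cluster, category) pair as a dict of Counters."""
    table = {}
    for cluster, category in zip(clusters, categories):
        table.setdefault(cluster, Counter())[category] += 1
    return table


def categories_spanned(table):
    """Number of distinct categories represented within each cluster."""
    return {cluster: len(counts) for cluster, counts in table.items()}


# Toy labels for eight hypothetical patients (illustrative only).
clusters = [0, 0, 0, 1, 1, 1, 2, 2]
categories = [
    "oligoarthritis",
    "polyarthritis RF-negative",
    "oligoarthritis",
    "systemic",
    "enthesitis-related",
    "systemic",
    "psoriatic",
    "oligoarthritis",
]

table = contingency(clusters, categories)
spanned = categories_spanned(table)
```

A cluster whose entry in `spanned` is greater than one contains patients from more than one category, which is the pattern the results describe.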
Applying unsupervised data mining to clinical and inflammatory biomarker data discerns discrete clusters that intersect multiple JIA categories. Results suggest that certain groups of patients within different JIA categories are more aligned pathobiologically than their separate clinical categorizations suggest. Applying data mining analyses to complex datasets can generate insights into JIA pathogenesis and could contribute to biologically based refinements in JIA classification.