Division of Informatics, Imaging and Data Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK.
Department of Pediatric Allergy, Gulhane School of Medicine, Ankara, Turkey.
Clin Exp Allergy. 2018 Jan;48(1):39-47. doi: 10.1111/cea.13014. Epub 2017 Sep 15.
Data-driven methods such as hierarchical clustering (HC) and principal component analysis (PCA) have been used to identify asthma subtypes, with inconsistent results.
To develop a framework for the discovery of stable and clinically meaningful asthma subtypes.
We performed HC in a rich data set from 613 asthmatic children, using 45 clinical variables (Model 1), and after PCA dimensionality reduction (Model 2). Clinical experts then identified a set of asthma features/domains which informed clusters in the two analyses. In Model 3, we reclustered the data using these features to ascertain whether this improved the discovery process.
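The three models can be sketched roughly as follows. This is an illustration on synthetic data, not the authors' code. The abstract does not state the linkage method, the distance metric, the number of principal components retained, or how the expert-selected features were encoded, so Ward linkage on standardized variables, 10 components, and the column indices below are all assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def standardize(X):
    """Scale every column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def pca_scores(X, n_components):
    """Project centred data onto its leading principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def hc_labels(X, n_clusters):
    """Hierarchical clustering (Ward linkage, an assumption) cut into at most n_clusters."""
    Z = linkage(X, method="ward")
    return fcluster(Z, t=n_clusters, criterion="maxclust")


# Synthetic stand-in for the 613 children x 45 clinical variables.
rng = np.random.default_rng(0)
X = standardize(rng.normal(size=(613, 45)))

# Model 1: HC on all 45 variables.
model1 = hc_labels(X, n_clusters=5)

# Model 2: HC after PCA dimensionality reduction (10 components is an assumption).
scores = pca_scores(X, n_components=10)
model2 = hc_labels(scores, n_clusters=5)

# Model 3: HC on the expert-selected features only.
# The column indices are hypothetical placeholders for age of onset,
# allergic sensitization, severity, and recent exacerbations.
expert_cols = [0, 1, 2, 3]
model3 = hc_labels(X[:, expert_cols], n_clusters=5)
```

Real clinical data usually mixes continuous and categorical variables, so a mixed-type dissimilarity measure might be more appropriate there than the Euclidean distance that Ward linkage implies.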
Cluster stability was poor in Models 1 and 2. Clinical experts highlighted four asthma features/domains which differentiated the clusters in the two models: age of onset, allergic sensitization, severity, and recent exacerbations. In Model 3 (HC using these four features), cluster stability improved substantially. The cluster assignment changed, providing more clinically interpretable results. In a 5-cluster model, we labelled the clusters as: "Difficult asthma" (n = 132); "Early-onset mild atopic" (n = 210); "Early-onset mild non-atopic" (n = 153); "Late-onset" (n = 105); and "Exacerbation-prone asthma" (n = 13). Multinomial regression demonstrated that lung function was significantly diminished among children with "Difficult asthma"; blood eosinophilia was a significant feature of "Difficult," "Early-onset mild atopic," and "Late-onset asthma." Children with moderate-to-severe asthma were present in each cluster.
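The abstract does not say how cluster stability was measured. One common approach, shown here purely as an assumption, is to recluster random subsamples and compare each result with the full-data assignment using the adjusted Rand index (1 means identical partitions, values near 0 mean chance-level agreement). The data, function names, and parameters below are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def hc_labels(X, n_clusters):
    """Hierarchical clustering (Ward linkage, an assumption) cut into at most n_clusters."""
    return fcluster(linkage(X, method="ward"), t=n_clusters, criterion="maxclust")


def adjusted_rand_index(a, b):
    """Chance-corrected agreement between two labelings of the same items."""
    a, b = np.asarray(a), np.asarray(b)
    _, ai = np.unique(a, return_inverse=True)
    _, bi = np.unique(b, return_inverse=True)
    table = np.zeros((ai.max() + 1, bi.max() + 1), dtype=np.int64)
    np.add.at(table, (ai, bi), 1)  # contingency table of label co-occurrences

    def pairs(x):
        return x * (x - 1) / 2.0  # number of unordered pairs among x items

    sum_cells = pairs(table).sum()
    sum_rows = pairs(table.sum(axis=1)).sum()
    sum_cols = pairs(table.sum(axis=0)).sum()
    expected = sum_rows * sum_cols / pairs(a.size)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:  # degenerate case, e.g. one cluster in both labelings
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def subsample_stability(X, n_clusters, n_rounds=20, frac=0.8, seed=0):
    """Mean ARI between the full-data clustering and reclustered subsamples."""
    rng = np.random.default_rng(seed)
    reference = hc_labels(X, n_clusters)
    n_keep = int(frac * X.shape[0])
    aris = []
    for _ in range(n_rounds):
        idx = rng.choice(X.shape[0], size=n_keep, replace=False)
        aris.append(adjusted_rand_index(reference[idx], hc_labels(X[idx], n_clusters)))
    return float(np.mean(aris))


# Synthetic example: five well-separated groups described by four features.
rng = np.random.default_rng(1)
centers = 10.0 * np.eye(5, 4)  # hypothetical group centres
X_demo = np.vstack([c + rng.normal(size=(40, 4)) for c in centers])
stability = subsample_stability(X_demo, n_clusters=5)
```

Under this measure, a stable solution keeps its mean ARI close to 1 across subsamples, whereas an unstable one drifts toward 0 as cluster membership reshuffles between runs.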
An integrative approach of blending the data with clinical expert domain knowledge identified four features, which may be informative for ascertaining asthma endotypes. These findings suggest that variables which are key determinants of asthma presence, severity, or control may not be the most informative for determining asthma subtypes. Our results indicate that exacerbation-prone asthma may be a separate asthma endotype and that severe asthma is not a single entity, but an extreme end of the spectrum of several different asthma endotypes.