Ghorvei Mohammadreza, Karhu Tuomas, Hietakoste Salla, Ferreira-Santos Daniela, Hrubos-Strøm Harald, Islind Anna Sigridur, Biedebach Luka, Nikkonen Sami, Leppänen Timo, Rusanen Matias
Department of Technical Physics, University of Eastern Finland, Kuopio, Finland.
Diagnostic Imaging Center, Kuopio University Hospital, Kuopio, Finland.
J Sleep Res. 2025 Jun;34(3):e14349. doi: 10.1111/jsr.14349. Epub 2024 Oct 24.
Obstructive sleep apnea is a heterogeneous sleep disorder with varying phenotypes. Several studies have used cluster analysis to identify obstructive sleep apnea phenotypic clusters. However, the choice of clustering method might affect the results, so it is unclear whether similar obstructive sleep apnea clusters can be reproduced with different methods. In this study, we applied four well-known clustering methods (Agglomerative Hierarchical Clustering, K-means, Fuzzy C-means, and Gaussian Mixture Model) to a population of 865 patients with suspected obstructive sleep apnea. By creating five clusters with each method, we examined how the clustering method affects the formation of obstructive sleep apnea clusters and the differences in their physiological characteristics. We used a visualization technique to depict cluster formation, Cohen's kappa statistics to quantify agreement between clustering methods, and performance evaluation criteria to compare clustering performance. Two of the five clusters were distinctly different with all four methods, while the other three clusters exhibited overlapping features across all methods. Agreement was strongest between Fuzzy C-means and K-means (κ = 0.87) and weakest between Agglomerative Hierarchical Clustering and Gaussian Mixture Model (κ = 0.51). K-means showed the best clustering performance, followed by Fuzzy C-means, in most evaluation criteria. Moreover, Fuzzy C-means showed the greatest potential for handling overlapping clusters compared with the other methods. In conclusion, the choice of clustering method directly affects the formation and physiological characteristics of obstructive sleep apnea clusters. In addition, we highlight the capability of soft clustering methods, particularly Fuzzy C-means, for obstructive sleep apnea phenotyping.
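The agreement analysis described in the abstract can be illustrated with a minimal sketch. Cluster numbers produced by different methods are arbitrary, so two labelings have to be matched before Cohen's kappa is meaningful. The abstract does not say how the labels were matched, so the brute-force matching below is an assumption, and the function names `cohen_kappa` and `align_labels` and the toy labels are hypothetical, not taken from the study.

```python
from collections import Counter
from itertools import permutations


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of class labels."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelings assign every item to the same single class
    return (observed - expected) / (1.0 - expected)


def align_labels(reference, other, n_clusters):
    """Relabel `other` so it matches `reference` on as many items as possible.

    Assumption for illustration: labels are integers 0..n_clusters-1, and an
    exhaustive search over permutations is acceptable (5 clusters = 120 cases).
    """
    overlap = Counter(zip(other, reference))
    best = max(
        permutations(range(n_clusters)),
        key=lambda perm: sum(overlap[(j, perm[j])] for j in range(n_clusters)),
    )
    return [best[label] for label in other]


# Toy labelings from two hypothetical clustering methods (not study data).
method_a = [0, 0, 1, 1, 2, 2]
method_b = [2, 2, 0, 0, 1, 0]
aligned_b = align_labels(method_a, method_b, n_clusters=3)
kappa = cohen_kappa(method_a, aligned_b)
```

With real data, the two label lists would come from fitting two clustering methods to the same patients. For larger numbers of clusters, an assignment solver would be a more practical choice than exhaustive search.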