Weaver Caleb, Xiao Luo, Wen Qiuting, Wu Yu-Chien, Harezlak Jaroslaw
Department of Statistics, North Carolina State University.
Department of Radiology and Imaging Sciences, Indiana University School of Medicine.
Data Sci Sci. 2024;3(1). doi: 10.1080/26941899.2024.2376535. Epub 2024 Jul 16.
Biclustering is the task of simultaneously clustering the samples and features of a data set. Doing so identifies subsets of samples that behave similarly across subsets of features. Motivated by a longitudinal diffusion tensor imaging study of sport-related concussion (SRC), we present the problem of biclustering multivariate longitudinal data, in which subjects and features are grouped simultaneously based on longitudinal patterns rather than magnitude. We propose a penalized-regression-based method for solving this problem by exploiting the heterogeneity in the longitudinal patterns within subjects and features. We evaluate the performance of the proposed methods via a simulation study and apply them to the motivating dataset, revealing distinctive patterns of white-matter abnormalities within subgroups of SRC cases.