Heinzl Felix, Tutz Gerhard
Department of Statistics, Ludwig-Maximilians-University Munich, Akademiestr. 1, 80799, Munich, Germany.
Biom J. 2014 Jan;56(1):44-68. doi: 10.1002/bimj.201200111. Epub 2013 Nov 18.
A method is proposed that aims at identifying clusters of individuals that show similar patterns when observed repeatedly. We consider linear mixed models, which are widely used for modeling longitudinal data. In contrast to the classical assumption of a normal distribution for the random effects, a finite mixture of normal distributions is assumed. Typically, the number of mixture components is unknown and has to be chosen, ideally by data-driven tools. For this purpose, an EM algorithm-based approach is considered that uses a penalized normal mixture as the random effects distribution. The penalty term shrinks the pairwise distances between cluster centers, based on the group lasso and the fused lasso methods. The effect is that individuals with similar time trends are merged into the same cluster. The strength of regularization is determined by a single penalization parameter. To find the optimal penalization parameter, a new model choice criterion is proposed.
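The shrinkage of pairwise distances between cluster centers can be illustrated with a minimal sketch. The abstract does not give the exact form of the penalty, so the code below assumes an unweighted sum of Euclidean norms over all pairs of centers, scaled by one penalization parameter; the function names `pairwise_center_penalty` and `count_distinct_centers`, the parameter `lam`, and the tolerance `tol` are hypothetical and not taken from the paper.

```python
import math
from itertools import combinations


def pairwise_center_penalty(centers, lam):
    """Assumed penalty: lam * sum over pairs h < k of ||mu_h - mu_k||_2.

    Taking the Euclidean norm of each whole difference vector is the
    group-lasso ingredient; penalizing differences between centers is the
    fused-lasso ingredient. Weights used in the paper, if any, are omitted.
    """
    total = 0.0
    for mu_h, mu_k in combinations(centers, 2):
        total += math.dist(mu_h, mu_k)
    return lam * total


def count_distinct_centers(centers, tol=1e-8):
    """Count the clusters that remain once coinciding centers are merged."""
    distinct = []
    for mu in centers:
        # Keep mu only if it differs from every center kept so far.
        if all(math.dist(mu, nu) > tol for nu in distinct):
            distinct.append(mu)
    return len(distinct)


# Three centers (intercept, slope); the last two have already been fused.
centers = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0)]
penalty = pairwise_center_penalty(centers, lam=2.0)
n_clusters = count_distinct_centers(centers)
```

Under this assumed form, a pair of fused centers contributes nothing to the penalty, so a larger penalization parameter favors solutions with fewer distinct centers. This is how a single parameter can control the effective number of clusters.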