Zhou Xiaoxiao, Kang Kai, Song Xinyuan
Department of Statistics, Chinese University of Hong Kong, Hong Kong.
Stat Med. 2020 Feb 26. doi: 10.1002/sim.8513.
This study develops a two-part hidden Markov model (HMM) for analyzing semicontinuous longitudinal data in the presence of missing covariates. The proposed model manages a semicontinuous variable by splitting it into two random variables: a binary indicator for determining the occurrence of excess zeros at all occasions and a continuous random variable for examining its actual level. For the continuous longitudinal response, an HMM is proposed to describe the relationship between the observation and unobservable finite-state transition processes. The HMM consists of two major components. The first component is a transition model for investigating how potential covariates influence the probabilities of transitioning from one hidden state to another. The second component is a conditional regression model for examining the state-specific effects of covariates on the response. A shared random effect is introduced to each part of the model to accommodate possible unobservable heterogeneity among observation processes and the nonignorability of missing covariates. A Bayesian adaptive least absolute shrinkage and selection operator (lasso) procedure is developed to conduct simultaneous variable selection and estimation. The proposed methodology is applied to a study on the Alzheimer's Disease Neuroimaging Initiative dataset. New insights into the pathology of Alzheimer's disease and its potential risk factors are obtained.
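The model structure described above can be illustrated with a small simulation. The following is a minimal sketch and not the authors' specification: it assumes two hidden states, a single covariate, a logit link for the zero indicator, a multinomial-logit transition model, and a log-normal positive part. All parameter values and names (`ALPHA`, `BETA`, `GAMMA`, `simulate_subject`) are hypothetical. The missing-covariate mechanism and the Bayesian adaptive lasso step are omitted.

```python
import math
import random

N_STATES = 2  # assumed number of hidden states (illustrative only)

# Hypothetical parameter values, chosen only for illustration.
ALPHA = (-0.5, 0.8)                      # part one: intercept and slope on the logit scale
BETA = {0: (1.0, 0.3), 1: (2.5, -0.4)}   # part two: state-specific intercept and slope
# GAMMA[current][next] = (intercept, slope) of the multinomial-logit transition model
GAMMA = [
    [(0.0, 0.0), (-1.0, 0.5)],
    [(-1.0, -0.5), (0.0, 0.0)],
]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Subtract the maximum score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def simulate_subject(n_occasions, rng):
    """Simulate one subject's semicontinuous longitudinal response."""
    w = rng.gauss(0.0, 0.5)          # random effect shared by both parts
    state = rng.randrange(N_STATES)  # initial hidden state
    records = []
    for t in range(n_occasions):
        x = rng.gauss(0.0, 1.0)      # time-varying covariate
        if t > 0:
            # Transition model: the covariate shifts the transition probabilities.
            probs = softmax([g0 + g1 * x for g0, g1 in GAMMA[state]])
            state = rng.choices(range(N_STATES), weights=probs)[0]
        # Part one: binary indicator for the occurrence of a zero.
        p_zero = sigmoid(ALPHA[0] + ALPHA[1] * x + w)
        if rng.random() < p_zero:
            y = 0.0
        else:
            # Part two: state-specific regression for the positive level
            # (log-normal here, which is an assumption of this sketch).
            b0, b1 = BETA[state]
            y = math.exp(b0 + b1 * x + w + rng.gauss(0.0, 0.3))
        records.append((t, state, x, y))
    return records
```

Calling `simulate_subject(10, random.Random(1))` returns ten `(occasion, hidden state, covariate, response)` tuples in which the response is either exactly zero or strictly positive. Sharing `w` across the two parts is what links the zero process to the level process in this sketch.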