Ahn Jaeil, Mukherjee Bhramar, Gruber Stephen B, Sinha Samiran
Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, USA.
Biometrics. 2011 Jun;67(2):546-58. doi: 10.1111/j.1541-0420.2010.01453.x. Epub 2010 Jun 16.
With advances in modern medicine and clinical diagnosis, case-control data with characterization of finer subtypes of cases are often available. In matched case-control studies, missingness in exposure values often leads to deletion of the entire stratum and thus entails a significant loss of information. When subtypes of cases are treated as categorical outcomes, the data are further stratified, and deleting observations becomes even more costly in terms of the precision of the category-specific odds-ratio parameters, especially under the multinomial logit model. The stereotype regression model for categorical responses lies between the proportional odds model and the multinomial (baseline category) logit model. Use of this class of models has been limited because the structure of the model poses inferential challenges, namely nonidentifiability and nonlinearity in the parameters. We illustrate how to handle missing data in matched case-control studies with finer disease subclassification within the cases under a stereotype regression model. We present both a Monte Carlo based full Bayesian approach and an expectation/conditional maximization algorithm for estimating the model parameters in the presence of a completely general missingness mechanism. We illustrate our methods using data from an ongoing matched case-control study of colorectal cancer. Simulation results are presented under various missing data mechanisms and departures from modeling assumptions.
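To make the place of the stereotype model between the proportional odds and multinomial logit models more concrete, the following is a minimal sketch of its category probabilities. It assumes the commonly used parametrization P(Y = k | x) proportional to exp(alpha_k + phi_k * beta'x), with category 0 (controls) as the baseline, phi_0 = 0, and the last phi fixed at 1 for identifiability. The abstract does not state the parametrization, so this form, the function name `stereotype_probs`, and all numeric values are illustrative assumptions. The sketch is unconditional: it ignores the matching and the missing exposure data that the paper addresses.

```python
import numpy as np

def stereotype_probs(x, alpha, phi, beta):
    """Category probabilities under an (assumed) stereotype logit model.

    P(Y = k | x) is proportional to exp(alpha[k] + phi[k] * (beta @ x)),
    with category 0 as the baseline (alpha[0] = 0, phi[0] = 0).
    """
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    phi = np.asarray(phi, dtype=float)
    beta = np.asarray(beta, dtype=float)
    eta = alpha + phi * (beta @ x)  # one linear predictor per category
    eta = eta - eta.max()           # stabilise the softmax numerically
    w = np.exp(eta)
    return w / w.sum()

# Hypothetical values for illustration only (not taken from the study).
alpha = [0.0, -0.5, -1.0]  # baseline (controls), subtype 1, subtype 2
phi = [0.0, 0.4, 1.0]      # phi[0] = 0 and phi[-1] = 1 fix the scale
beta = [0.8, -0.3]         # a single shared exposure effect vector
x = [1.0, 2.0]

probs = stereotype_probs(x, alpha, phi, beta)

# Log odds ratio of category k versus baseline per unit change in x[j]
# is phi[k] * beta[j]: one beta vector rescaled by a category score.
log_or = np.outer(phi, beta)
```

Under this form, every category shares one exposure effect vector `beta`, scaled by a category-specific score `phi[k]`. That uses fewer parameters than a multinomial logit model with a separate coefficient vector per category, and the product `phi[k] * beta` is the source of the nonlinearity and identifiability constraints mentioned in the abstract.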