Salim Agus, Ma Xiangmei, Fall Katja, Andrén Ove, Reilly Marie
La Trobe University, Melbourne, Victoria 3086, Australia.
Stat Med. 2014 Dec 30;33(30):5388-98. doi: 10.1002/sim.6245. Epub 2014 Jul 1.
The significant investment in measuring biomarkers has prompted investigators to improve cost-efficiency by sub-sampling in non-standard study designs. For example, investigators studying prognosis may assume that any differences in biomarkers are likely to be most apparent in an extreme sample of the earliest deaths and the longest-surviving controls. Simple logistic regression analysis of such data does not exploit the information available in the survival time, and statistical methods that model the sampling scheme may be more efficient. We derived likelihood equations that reflect the complex sampling scheme in unmatched and matched 'extreme' case-control designs. We investigated the performance and power of the method in simulation experiments, with a range of underlying hazard ratios and study sizes. Our proposed method resulted in hazard ratio estimates close to those obtained from the full cohort. The standard error estimates also performed well when compared with the empirical variance. In an application to a study investigating markers for lethal prostate cancer, an extreme case-control sample of lethal cases and the longest-surviving controls provided estimates of the effect of Gleason score in close agreement with analysis of all the data. By using the information in the sampling design, our method enables efficient and valid estimation of the underlying hazard ratio from a study design that is intuitive and easily implemented.
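The 'extreme' sampling design described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' likelihood method. It only draws an unmatched extreme sample (the earliest deaths and the longest-followed survivors) from a simulated cohort and computes the naive odds ratio that the abstract says fails to use the survival-time information. The exponential survival model, the binary biomarker, the uniform censoring window, and all function names and parameter values are assumptions made for illustration.

```python
import numpy as np

def simulate_cohort(n, hazard_ratio, base_rate=0.1, seed=0):
    """Simulate a cohort under proportional hazards (assumed exponential model)."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)                  # hypothetical binary biomarker
    rate = base_rate * hazard_ratio ** x            # hazard is multiplied by HR when x == 1
    t_event = rng.exponential(1.0 / rate)           # latent time to death
    t_censor = rng.uniform(5.0, 15.0, size=n)       # assumed staggered end of follow-up
    time = np.minimum(t_event, t_censor)            # observed follow-up time
    died = t_event <= t_censor                      # True if death was observed
    return x, time, died

def extreme_sample(time, died, n_cases, n_controls):
    """Pick the earliest deaths as cases and the longest-followed survivors as controls."""
    case_idx = np.flatnonzero(died)
    ctrl_idx = np.flatnonzero(~died)
    cases = case_idx[np.argsort(time[case_idx])[:n_cases]]
    controls = ctrl_idx[np.argsort(time[ctrl_idx])[-n_controls:]]
    return cases, controls

def naive_odds_ratio(x, cases, controls):
    """Odds ratio from the 2x2 table, with a 0.5 continuity correction in each cell."""
    a = x[cases].sum() + 0.5                        # exposed cases
    b = len(cases) - x[cases].sum() + 0.5           # unexposed cases
    c = x[controls].sum() + 0.5                     # exposed controls
    d = len(controls) - x[controls].sum() + 0.5     # unexposed controls
    return (a * d) / (b * c)

x, time, died = simulate_cohort(n=5000, hazard_ratio=2.0)
cases, controls = extreme_sample(time, died, n_cases=200, n_controls=200)
print("naive odds ratio in the extreme sample:", naive_odds_ratio(x, cases, controls))
```

Because selection depends on survival time, the odds ratio from such a sample is not an estimate of the underlying hazard ratio. That gap is the motivation the abstract gives for modelling the sampling scheme in the likelihood.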