Proportional hazards regression in the presence of missing study eligibility information.

Author information

Pan Qing, Schaubel Douglas E

Affiliation

Department of Statistics, George Washington University, Washington, DC, 20052, USA.

Publication information

Lifetime Data Anal. 2014 Jul;20(3):424-43. doi: 10.1007/s10985-013-9273-5. Epub 2013 Jun 22.

Abstract

We consider the study of censored survival times in the situation where the available data consist of both eligible and ineligible subjects, and information distinguishing the two groups is sometimes missing. A complete-case analysis in this context would use only subjects known to be eligible, resulting in inefficient and potentially biased estimators. We propose a two-step procedure which resembles the EM algorithm but is computationally much faster. In the first step, one estimates the conditional expectation of the missing eligibility indicators given the observed data using a logistic regression based on the complete cases (i.e., subjects with non-missing eligibility indicator). In the second step, maximum likelihood estimators are obtained from a weighted Cox proportional hazards model, with the weights being either observed eligibility indicators or estimated conditional expectations thereof. Under ignorable missingness, the estimators from the second step are proven to be consistent and asymptotically normal, with explicit variance estimators. We demonstrate through simulation that the proposed methods perform well for moderate sized samples and are robust in the presence of eligibility indicators that are missing not at random. The proposed procedure is more efficient and more robust than the complete case analysis and, unlike the EM algorithm, does not require time-consuming iteration. Although the proposed methods are applicable generally, they would be most useful for large data sets (e.g., administrative data), for which the computational savings outweigh the price one has to pay for making various approximations in avoiding iteration. We apply the proposed methods to national kidney transplant registry data.
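The two-step procedure described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the function names (`fit_logistic`, `fit_weighted_cox`, `two_step`, `simulate`), the use of a single covariate `z`, the simulated data, and the fixed number of Newton-Raphson iterations are all hypothetical. In particular, the abstract conditions the eligibility model on the observed data, whereas this sketch uses only `z` as the predictor to keep the code short, so it shows the mechanics of the weighting rather than the estimator's statistical properties. Variance estimation is omitted.

```python
import math
import random


def expit(u):
    return 1.0 / (1.0 + math.exp(-u))


def fit_logistic(x, y, iters=25):
    """Newton-Raphson fit of P(y=1 | x) = expit(a + b*x); returns (a, b)."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = expit(a + b * xi)
            v = p * (1.0 - p)
            g0 += yi - p                 # score for the intercept
            g1 += (yi - p) * xi          # score for the slope
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def fit_weighted_cox(time, event, z, w, iters=25):
    """Newton-Raphson on a weighted Cox partial likelihood, one covariate.

    Assumes no tied event times (true for the continuous simulated data).
    """
    order = sorted(range(len(time)), key=lambda i: -time[i])
    beta = 0.0
    for _ in range(iters):
        s0 = s1 = s2 = score = info = 0.0
        for i in order:  # latest time first, so the risk set only grows
            r = w[i] * math.exp(beta * z[i])
            s0 += r
            s1 += r * z[i]
            s2 += r * z[i] * z[i]
            if event[i] and w[i] > 0:
                zbar = s1 / s0
                score += w[i] * (z[i] - zbar)
                info += w[i] * (s2 / s0 - zbar * zbar)
        beta += score / info
    return beta


def two_step(time, event, z, elig):
    """elig[i] is 1, 0, or None when the eligibility indicator is missing."""
    # Step 1: logistic regression on complete cases only.
    cc = [i for i, e in enumerate(elig) if e is not None]
    a, b = fit_logistic([z[i] for i in cc], [elig[i] for i in cc])
    # Weights: observed indicator where available, fitted probability otherwise.
    w = [e if e is not None else expit(a + b * zi) for e, zi in zip(elig, z)]
    # Step 2: weighted Cox model, no iteration between the two steps.
    return fit_weighted_cox(time, event, z, w), w


def simulate(n, seed, beta=0.7):
    """Hypothetical data: eligible subjects follow a Cox model in z."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    elig = [1 if rng.random() < expit(1.0 + 0.5 * zi) else 0 for zi in z]
    t = [rng.expovariate(math.exp(beta * zi) if e else 1.0)
         for zi, e in zip(z, elig)]
    c = [rng.expovariate(0.3) for _ in range(n)]
    time = [min(ti, ci) for ti, ci in zip(t, c)]
    event = [1 if ti <= ci else 0 for ti, ci in zip(t, c)]
    # Roughly 30% of eligibility indicators are set to missing at random.
    observed = [e if rng.random() < 0.7 else None for e in elig]
    return time, event, z, elig, observed


time, event, z, true_elig, elig = simulate(2000, seed=0)
beta_hat, weights = two_step(time, event, z, elig)
print(round(beta_hat, 3))
```

Subjects known to be ineligible receive weight 0 and drop out of both the risk sets and the score, subjects known to be eligible receive weight 1, and subjects with a missing indicator contribute fractionally. Because the weights are computed once and then held fixed, the second step is a single weighted Cox fit, which is where the computational saving over an EM-style iteration comes from.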
