Huang Chiung-Yu, Qin Jing
Department of Epidemiology and Biostatistics, University of California at San Francisco, San Francisco, California.
Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland.
Stat Med. 2020 May 15;39(10):1573-1590. doi: 10.1002/sim.8499. Epub 2020 Feb 19.
There has been a growing interest in developing methodologies to combine information from public domains to improve efficiency in the analysis of relatively small-scale studies that collect more detailed patient-level information. The auxiliary information is usually given in the form of summary statistics or regression coefficients. Thus, the question arises as to how to incorporate the summary information in the model estimation procedure. In this article, we consider statistical analysis of right-censored survival data when additional information about the covariate effects evaluated in a reduced Cox model is available. Recognizing that such external information can be summarized using population moments, we present a unified framework by employing the generalized method of moments to combine information from different sources for the analysis of survival data. The proposed estimator can be shown to be consistent and asymptotically normal; moreover, it is more efficient than the maximum partial likelihood estimator. We also consider incorporating uncertainty of the external information in the inference procedure. Simulation studies show that, by incorporating the additional summary information, the proposed estimators enjoy a substantial gain in efficiency over the conventional approach. A data analysis of a pancreatic cancer cohort study is presented to illustrate the methods and theory.
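As a rough illustration of the generalized method of moments idea described above, the following sketch uses generic notation that is assumed here and not taken from the article. The symbols $g_i$, $U_i$, $h_i$, $\tilde\gamma$, $\hat W$, $G$, and $\Sigma$ are illustrative. $U_i$ stands for the internal-study estimating function, such as a partial likelihood score contribution. $h_i$ stands for a moment function encoding the external reduced-model coefficients $\tilde\gamma$. The exact form of $h_i$ is not given in the abstract and is left unspecified. The contributions are treated as approximately independent for simplicity.

```latex
% Stacked moment function: internal estimating function plus external-information moment
g_i(\theta) =
\begin{pmatrix}
  U_i(\theta) \\
  h_i(\theta; \tilde\gamma)
\end{pmatrix},
\qquad
\bar g_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} g_i(\theta),
\qquad
E\{ g_i(\theta_0) \} = 0 .

% GMM estimator with a positive-definite weight matrix \hat W
\hat\theta = \arg\min_{\theta} \; \bar g_n(\theta)^{\top} \, \hat W \, \bar g_n(\theta) .

% With the optimal weight \hat W \to \Sigma^{-1}, standard GMM theory gives
\sqrt{n}\,(\hat\theta - \theta_0) \;\to\; N\!\left( 0, \; (G^{\top} \Sigma^{-1} G)^{-1} \right),
\qquad
G = E\!\left\{ \frac{\partial g_i(\theta_0)}{\partial \theta^{\top}} \right\},
\quad
\Sigma = \operatorname{Var}\{ g_i(\theta_0) \} .
```

Because the stacked system has more moment conditions than parameters, it is over-identified. Under standard GMM theory, adding valid moment conditions cannot increase the asymptotic variance under optimal weighting, which is consistent with the efficiency gain over the maximum partial likelihood estimator reported in the abstract.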