Taylor Jeremy M G, Choi Kyuseong, Han Peisong
Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, Michigan 48109, U.S.A.
Department of Statistics and Data Science, Cornell University, 1198 Comstock Hall, 129 Garden Ave., Ithaca, New York 14853, U.S.A.
Biometrika. 2022 Apr 12;110(1):119-134. doi: 10.1093/biomet/asac022. eCollection 2023 Mar.
We consider the situation of estimating the parameters in a generalized linear prediction model, from an internal dataset, where the outcome variable [Formula: see text] is binary and there are two sets of covariates, [Formula: see text] and [Formula: see text]. We have information from an external study that provides parameter estimates for a generalized linear model of [Formula: see text] on [Formula: see text]. We propose a method that makes limited assumptions about the similarity of the distributions in the two study populations. The method involves orthogonalizing the [Formula: see text] variables and then borrowing information about the ratio of the coefficients from the external model. The method is justified based on a new result relating the parameters in a generalized linear model to the parameters in a generalized linear model with omitted covariates. The method is applicable if the regression coefficients in the [Formula: see text] given [Formula: see text] model are similar in the two populations, up to an unknown scalar constant. This type of transportability between populations is something that can be checked from the available data. The asymptotic variance of the proposed method is derived. The method is evaluated in a simulation study and shown to gain efficiency compared to simple analysis of the internal dataset, and is robust compared to an alternative method of incorporating external information.
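The abstract's central idea, that coefficient ratios survive when covariates are omitted from a generalized linear model, can be illustrated with a small simulation. The sketch below is not the authors' estimator. It is a minimal illustration under strong assumptions chosen here: a logistic link, Gaussian covariates, and an omitted covariate that is independent of the retained ones (standing in for the orthogonalization step). All names (`fit_logistic`, `x_kept`, `z_omitted`, the coefficient values) are hypothetical.

```python
import numpy as np


def fit_logistic(design, y, n_iter=25):
    """Logistic regression with an intercept, fitted by Newton-Raphson."""
    z = np.column_stack([np.ones(len(y)), design])
    beta = np.zeros(z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(z @ beta)))
        w = p * (1.0 - p)
        grad = z.T @ (y - p)               # score vector
        hess = z.T @ (z * w[:, None])      # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


rng = np.random.default_rng(0)
n = 200_000

# Assumption: retained and omitted covariates are independent Gaussians.
x_kept = rng.normal(size=(n, 2))
z_omitted = rng.normal(size=n)

# Hypothetical "true" full-model coefficients (intercept, two kept, one omitted).
beta_kept = np.array([1.0, 0.5])
eta = -0.5 + x_kept @ beta_kept + 1.5 * z_omitted
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

full = fit_logistic(np.column_stack([x_kept, z_omitted]), y)
reduced = fit_logistic(x_kept, y)  # model with the covariate omitted

# Index 0 is the intercept; indices 1 and 2 are the kept covariates.
ratio_full = full[1] / full[2]
ratio_reduced = reduced[1] / reduced[2]

print("full-model coefficients:   ", full[1:3])
print("reduced-model coefficients:", reduced[1:3])
print("ratio (full, reduced):     ", ratio_full, ratio_reduced)
```

In this setup the reduced-model coefficients are attenuated toward zero relative to the full model, while their ratio stays close to the full-model ratio. This is the kind of information the abstract describes borrowing from an external reduced model. With correlated or non-Gaussian covariates the proportionality need not hold, which is why the abstract's method involves an orthogonalization step and a transportability check.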