Data integration: exploiting ratios of parameter estimates from a reduced external model.

Authors

Taylor Jeremy M G, Choi Kyuseong, Han Peisong

Affiliations

Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, Michigan 48109, U.S.A.

Department of Statistics and Data Science, Cornell University, 1198 Comstock Hall, 129 Garden Ave., Ithaca, New York 14853, U.S.A.

Publication information

Biometrika. 2022 Apr 12;110(1):119-134. doi: 10.1093/biomet/asac022. eCollection 2023 Mar.

Abstract

We consider the situation of estimating the parameters in a generalized linear prediction model, from an internal dataset, where the outcome variable [Formula: see text] is binary and there are two sets of covariates, [Formula: see text] and [Formula: see text]. We have information from an external study that provides parameter estimates for a generalized linear model of [Formula: see text] on [Formula: see text]. We propose a method that makes limited assumptions about the similarity of the distributions in the two study populations. The method involves orthogonalizing the [Formula: see text] variables and then borrowing information about the ratio of the coefficients from the external model. The method is justified based on a new result relating the parameters in a generalized linear model to the parameters in a generalized linear model with omitted covariates. The method is applicable if the regression coefficients in the [Formula: see text] given [Formula: see text] model are similar in the two populations, up to an unknown scalar constant. This type of transportability between populations is something that can be checked from the available data. The asymptotic variance of the proposed method is derived. The method is evaluated in a simulation study and shown to gain efficiency compared to simple analysis of the internal dataset, and is robust compared to an alternative method of incorporating external information.
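
The idea of borrowing only the ratios of the external coefficients can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' estimator: the names `X` (covariates used by the external model), `B` (covariates available only internally), `beta_ext`, `fit_logistic`, and `ratio_constrained_fit` are hypothetical, the link is assumed to be logistic, and the orthogonalization step is assumed to be a least-squares residualization of `B` on `X`. The external coefficients enter only through the single index `X @ beta_ext`, whose scalar multiplier is estimated from the internal data.

```python
import numpy as np


def fit_logistic(Z, y, n_iter=50, tol=1e-10):
    """Plain Newton-Raphson logistic regression; Z must contain an intercept column."""
    theta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ theta)))
        grad = Z.T @ (y - p)
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z
        step = np.linalg.solve(hess, grad)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta


def ratio_constrained_fit(X, B, y, beta_ext):
    """Fit Y on X and B with the X-coefficients forced to be scale * beta_ext.

    Assumption of this sketch: B is first residualized on (1, X) by least
    squares, so the X-coefficients absorb the part of B explained by X.
    """
    n = len(y)
    ones = np.ones((n, 1))
    design_x = np.hstack([ones, X])
    proj, *_ = np.linalg.lstsq(design_x, B, rcond=None)
    B_orth = B - design_x @ proj

    # The external information enters only through this one-dimensional index,
    # so only the ratios among the entries of beta_ext matter.
    index = (X @ beta_ext)[:, None]
    theta = fit_logistic(np.hstack([ones, index, B_orth]), y)
    return {
        "intercept": theta[0],
        "scale": theta[1],
        "x_coef": theta[1] * beta_ext,
        "b_coef": theta[2:],
    }


# Simulated internal data (all numbers below are made up for illustration).
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))
B = (0.5 * X[:, 0] + rng.normal(size=n))[:, None]
lin = -0.5 + X @ np.array([1.0, -1.0, 0.5]) + 0.8 * B[:, 0]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin))).astype(float)

# Hypothetical external estimates: proportional to the internal X-coefficients
# after orthogonalization, (1.4, -1.0, 0.5), up to an unknown scalar.
beta_ext = 0.8 * np.array([1.4, -1.0, 0.5])

fit = ratio_constrained_fit(X, B, y, beta_ext)
print(fit["scale"], fit["x_coef"], fit["b_coef"])
```

In this sketch the internal fit estimates one scalar for the `X` block instead of three free coefficients, which is where an efficiency gain would come from when the proportionality assumption holds. If the external ratios were wrong, the constrained fit would be biased, so a comparison against an unconstrained internal fit is a natural check.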
