A cross-validation statistical framework for asymmetric data integration.

Author information

Department of Biostatistics, University of Michigan, Ann Arbor, Michigan.

Publication information

Biometrics. 2023 Jun;79(2):1280-1292. doi: 10.1111/biom.13685. Epub 2022 May 23.

Abstract

The proliferation of biobanks and large public clinical data sets enables their integration with a smaller amount of locally gathered data for the purposes of parameter estimation and model prediction. However, public data sets may be subject to context-dependent confounders and the protocols behind their generation are often opaque; naively integrating all external data sets equally can bias estimates and lead to spurious conclusions. Weighted data integration is a potential solution, but current methods still require subjective specifications of weights and can become computationally intractable. Under the assumption that local data are generated from the set of unknown true parameters, we propose a novel weighted integration method based upon using the external data to minimize the local data leave-one-out cross validation (LOOCV) error. We demonstrate how the optimization of LOOCV errors for linear and Cox proportional hazards models can be rewritten as functions of external data set integration weights. Significant reductions in estimation error and prediction error are shown using simulation studies mimicking the heterogeneity of clinical data as well as a real-world example using kidney transplant patients from the Scientific Registry of Transplant Recipients.
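The core idea for the linear model can be illustrated with a small sketch. This is not the authors' implementation: the abstract describes rewriting the LOOCV error as a function of the integration weights, whereas the code below simply recomputes the local leave-one-out error by brute force over a grid of candidate weights. It assumes a single external data set, squared-error loss, and a weighted least-squares fit in which local rows have weight 1 and external rows share one weight `w`. The function names (`weighted_fit`, `local_loocv_error`, `choose_weight`) and the simulated data are hypothetical.

```python
import numpy as np


def weighted_fit(X_loc, y_loc, X_ext, y_ext, w):
    """Weighted least squares: local rows get weight 1, external rows weight w."""
    A = X_loc.T @ X_loc + w * (X_ext.T @ X_ext)
    b = X_loc.T @ y_loc + w * (X_ext.T @ y_ext)
    return np.linalg.solve(A, b)


def local_loocv_error(X_loc, y_loc, X_ext, y_ext, w):
    """Mean squared leave-one-out error, evaluated on the local data only."""
    n = len(y_loc)
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # drop local observation i
        beta = weighted_fit(X_loc[keep], y_loc[keep], X_ext, y_ext, w)
        errs[i] = (y_loc[i] - X_loc[i] @ beta) ** 2
    return errs.mean()


def choose_weight(X_loc, y_loc, X_ext, y_ext, grid):
    """Return the grid weight with the smallest local LOOCV error."""
    errors = [local_loocv_error(X_loc, y_loc, X_ext, y_ext, w) for w in grid]
    return float(grid[int(np.argmin(errors))]), errors


# Simulated illustration (assumed data, not from the paper): a small local
# sample and a larger external sample whose coefficients are shifted.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0, 0.5])
X_local = rng.normal(size=(30, 3))
y_local = X_local @ beta_true + rng.normal(size=30)
X_external = rng.normal(size=(200, 3))
y_external = X_external @ (beta_true + 0.3) + rng.normal(size=200)

weight_grid = np.linspace(0.0, 1.0, 11)
best_w, loocv_errors = choose_weight(
    X_local, y_local, X_external, y_external, weight_grid
)
print("selected external weight:", best_w)
```

Setting `w = 0` recovers the local-only fit and `w = 1` recovers naive pooling, so the selected weight can never do worse on local LOOCV error than either extreme within the grid. With several external data sets, each would carry its own weight, and a grid search would scale poorly, which is where a direct optimization over the weights becomes useful.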
