Robust angle-based transfer learning in high dimensions.

Authors

Gu Tian, Han Yi, Duan Rui

Affiliations

Department of Biostatistics, Columbia University Mailman School of Public Health, New York, NY 10032, USA.

Department of Statistics, Columbia University, New York, NY 10027, USA.

Publication information

J R Stat Soc Series B Stat Methodol. 2024 Dec 3;87(3):723-745. doi: 10.1093/jrsssb/qkae111. eCollection 2025 Jul.

Abstract

Transfer learning improves target model performance by leveraging data from related source populations, especially when target data are scarce. This study addresses the challenge of training high-dimensional regression models with limited target data in the presence of heterogeneous source populations. We focus on a practical setting where only parameter estimates of pretrained source models are available, rather than individual-level source data. For a single source model, we propose a novel angle-based transfer learning (angleTL) method that leverages concordance between source and target model parameters. AngleTL adapts to the signal strength of the target model, unifies several benchmark methods, and mitigates negative transfer when between-population heterogeneity is large. We extend angleTL to incorporate multiple source models, accounting for varying levels of relevance among them. Our high-dimensional asymptotic analysis provides insights into when a source model benefits the target model and demonstrates the superiority of angleTL over other methods. Extensive simulations validate these findings and highlight the feasibility of applying angleTL to transfer genetic risk prediction models across multiple biobanks.
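As a rough illustration of how an angle-style penalty can unify benchmark methods, the sketch below assumes a ridge-type objective with an added inner-product reward toward a source estimate: minimise (1/2n)||y - Xb||^2 + (lam/2)||b||^2 - eta * w'b. This objective, the function name `angle_tl_sketch`, the simulated data, and the tuning values are all assumptions made for illustration; the abstract does not state the authors' exact estimator. Under this assumed form, `eta = 0` reduces to ordinary ridge regression and `eta = lam` reduces to ridge shrinkage toward the source estimate `w` (a distance-based transfer approach).

```python
import numpy as np


def angle_tl_sketch(X, y, w, lam, eta):
    """Closed-form minimiser of the ASSUMED objective
    (1/2n)||y - X b||^2 + (lam/2)||b||^2 - eta * (w @ b).
    Illustrative only; not confirmed as the paper's estimator."""
    n, p = X.shape
    A = X.T @ X / n + lam * np.eye(p)   # ridge-regularised Gram matrix
    rhs = X.T @ y / n + eta * w         # source estimate enters linearly
    return np.linalg.solve(A, rhs)


# Hypothetical simulated data: few target samples, many features.
rng = np.random.default_rng(0)
n, p = 50, 200
beta = rng.normal(size=p) / np.sqrt(p)                  # target coefficients
w = 0.8 * beta + 0.6 * rng.normal(size=p) / np.sqrt(p)  # correlated source estimate
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)

b_ridge = angle_tl_sketch(X, y, w, lam=1.0, eta=0.0)  # ignores the source model
b_dist = angle_tl_sketch(X, y, w, lam=1.0, eta=1.0)   # shrinks toward w
b_mid = angle_tl_sketch(X, y, w, lam=1.0, eta=0.5)    # intermediate use of the source
```

In practice `lam` and `eta` would need to be chosen from the data, for example by cross-validation on the target sample; setting `eta` near zero is one way such a scheme could limit negative transfer when the source and target parameters are poorly aligned.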
