
Enhancing Genetic Risk Prediction through Federated Semi-Supervised Transfer Learning with Inaccurate Electronic Health Record Data.

Authors

Lu Yuying, Gu Tian, Duan Rui

Affiliations

Department of Biostatistics, Columbia Mailman School of Public Health, New York, NY 10032, USA.

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA.

Publication information

Stat Biosci. 2024 Aug 13. doi: 10.1007/s12561-024-09449-2.

DOI: 10.1007/s12561-024-09449-2
PMID: 40917581
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12409711/
Abstract

Large-scale genomics data combined with Electronic Health Records (EHRs) illuminate the path towards personalized disease management and enhanced medical interventions. However, the absence of "gold standard" disease labels makes the development of machine learning models a challenging task. Additionally, imbalances in demographic representation within datasets compromise the development of unbiased healthcare solutions. In response to these challenges, we introduce FEderated Semi-Supervised Transfer Learning (FEST) for improving disease risk predictions in underrepresented populations. FEST facilitates the collaborative training of models across various institutions by leveraging both labeled and unlabeled data from diverse subpopulations. It addresses distributional variations across different populations and healthcare institutions by combining density ratio reweighting and model calibration techniques. Federated learning algorithms are developed for training models using only summary-level statistics. We perform simulation studies to assess the efficacy of FEST in comparisons with a few alternative methods. Subsequently, we apply FEST to training a genetic risk prediction model for type 2 diabetes that targets the African-Ancestry population using data from the Massachusetts General Brigham (MGB) Biobank. Both our computational experiments and real-world data application underline the superior performance of FEST over competing methods.
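
The abstract names two ingredients, density ratio reweighting and training from summary-level statistics, without giving details. The following is a minimal sketch of how those two ideas can fit together for a logistic model; it is not the FEST algorithm from the paper. Everything in it is an assumption made for illustration: the function names (`density_ratio_weights`, `local_summary`, `federated_fit`), the use of a source-versus-target classifier to estimate the density ratio, the ridge penalty, and the Newton updates. The model calibration step mentioned in the abstract is not shown.

```python
import numpy as np

def sigmoid(z):
    # Clip the linear predictor to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def add_intercept(X):
    # Prepend a column of ones so the first coefficient is an intercept.
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_logistic(X, y, weights=None, n_iter=50, ridge=1e-3):
    # Weighted, ridge-penalised logistic regression fitted by Newton's method.
    n, p = X.shape
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = sigmoid(X @ beta)
        grad = X.T @ (w * (mu - y)) / n + ridge * beta
        hess = (X * (w * mu * (1.0 - mu))[:, None]).T @ X / n + ridge * np.eye(p)
        beta = beta - np.linalg.solve(hess, grad)
    return beta

def density_ratio_weights(X_source, X_target):
    # Estimate p_target(x) / p_source(x) for each source row (assumed approach):
    # fit a classifier for "row comes from the target sample", then convert its
    # odds into a density ratio by correcting for the two sample sizes.
    X = np.vstack([X_source, X_target])
    z = np.concatenate([np.zeros(len(X_source)), np.ones(len(X_target))])
    gamma = fit_logistic(X, z)
    odds = np.exp(X_source @ gamma)
    return odds * len(X_source) / len(X_target)

def local_summary(X, y, weights, beta):
    # What a site would share in this sketch: summed weighted gradient,
    # summed weighted Hessian, and its sample size. No individual rows leave.
    mu = sigmoid(X @ beta)
    grad = X.T @ (weights * (mu - y))
    hess = (X * (weights * mu * (1.0 - mu))[:, None]).T @ X
    return grad, hess, len(y)

def federated_fit(sites, p, n_iter=50, ridge=1e-3):
    # Coordinator loop: aggregate per-site summaries, then take a Newton step.
    # `sites` is a list of (X, y, weights) tuples, one per hypothetical site.
    beta = np.zeros(p)
    for _ in range(n_iter):
        stats = [local_summary(X, y, w, beta) for X, y, w in sites]
        n = sum(s[2] for s in stats)
        grad = sum(s[0] for s in stats) / n + ridge * beta
        hess = sum(s[1] for s in stats) / n + ridge * np.eye(p)
        beta = beta - np.linalg.solve(hess, grad)
    return beta
```

Because the weighted gradient and Hessian are sums over individuals, adding up the per-site summaries reproduces the pooled Newton update, so in this sketch `federated_fit` matches `fit_logistic` on the stacked data up to floating-point error. In each round every site sends p + p² numbers plus its sample size, and a real deployment would have to check that this is acceptable to share.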

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8604/12409711/b0d0e2aee0c1/nihms-2024766-f0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8604/12409711/63904ef73efe/nihms-2024766-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8604/12409711/7aab56f07916/nihms-2024766-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8604/12409711/624cf1151f25/nihms-2024766-f0007.jpg

Similar articles

1
Enhancing Genetic Risk Prediction through Federated Semi-Supervised Transfer Learning with Inaccurate Electronic Health Record Data.
Stat Biosci. 2024 Aug 13. doi: 10.1007/s12561-024-09449-2.
2
Semi-supervised Double Deep Learning Temporal Risk Prediction (SeDDLeR) with Electronic Health Records.
J Biomed Inform. 2024 Sep;157:104685. doi: 10.1016/j.jbi.2024.104685. Epub 2024 Jul 14.
3
Prescription of Controlled Substances: Benefits and Risks
4
Radiomics-Based Model Using Tumor and Peritumoral Features with Semi-Supervised and Privileged Learning for Metastatic Risk Prediction in Lung Cancer: A Multi-Site Study.
Comput Methods Programs Biomed. 2025 Aug 20;271:109029. doi: 10.1016/j.cmpb.2025.109029.
5
Personalized federated learning with hierarchical reweighting for multi-center clinical prediction.
Comput Methods Programs Biomed. 2025 Nov;271:109015. doi: 10.1016/j.cmpb.2025.109015. Epub 2025 Aug 22.
6
Trajectory-Ordered Objectives for Self-Supervised Representation Learning of Temporal Healthcare Data Using Transformers: Model Development and Evaluation Study.
JMIR Med Inform. 2025 Jun 4;13:e68138. doi: 10.2196/68138.
7
Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.
Clin Orthop Relat Res. 2024 Dec 1;482(12):2193-2208. doi: 10.1097/CORR.0000000000003185. Epub 2024 Jul 23.
8
Development of Machine Learning-based Algorithms to Predict the 2- and 5-year Risk of TKA After Tibial Plateau Fracture Treatment.
Clin Orthop Relat Res. 2025 Mar 12. doi: 10.1097/CORR.0000000000003442.
9
Semi-supervised semantic segmentation of cell nuclei with diffusion model and collaborative learning.
J Med Imaging (Bellingham). 2025 Nov;12(6):061403. doi: 10.1117/1.JMI.12.6.061403. Epub 2025 Mar 20.
10
Stabilizing machine learning for reproducible and explainable results: A novel validation approach to subject-specific insights.
Comput Methods Programs Biomed. 2025 Jun 21;269:108899. doi: 10.1016/j.cmpb.2025.108899.
