FedGMMAT: Federated generalized linear mixed model association tests.

Affiliations

McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, United States of America.

School of Public Health, University of Texas Health Science Center at Houston, Houston, Texas, United States of America.

Publication information

PLoS Comput Biol. 2024 Jul 24;20(7):e1012142. doi: 10.1371/journal.pcbi.1012142. eCollection 2024 Jul.

DOI: 10.1371/journal.pcbi.1012142
PMID: 39047024
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11299833/

Abstract

Increasing the size of genetic and phenotypic datasets is critical for understanding the genetic determinants of diseases. Establishing practical means for collaboration and data sharing among institutions is therefore a fundamental methodological barrier to performing high-powered studies. As samples become more heterogeneous, complex statistical approaches, such as generalized linear mixed effects models, must be used to correct for the confounders that may bias results. At the same time, due to privacy concerns around Protected Health Information (PHI), the sharing of genetic information is tightly restricted under regulations such as the Health Insurance Portability and Accountability Act (HIPAA). This limits data sharing among institutions and hampers efforts to execute high-powered collaborative studies. Federated approaches are promising for alleviating the issues around privacy and performance, since sensitive data never leaves the local sites. Motivated by these considerations, we developed FedGMMAT, a federated genetic association testing tool that uses a federated statistical testing approach for efficient association tests that can correct for confounding fixed effects and additive polygenic random effects among different collaborating sites. Genetic data is never shared among collaborating sites, and the intermediate statistics are protected by encryption. Using simulated and real datasets, we demonstrate that FedGMMAT can achieve virtually the same results as pooled analysis under a privacy-preserving framework with practical resource requirements.
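
To make the federated idea in the abstract concrete, here is a minimal Python sketch under strong simplifying assumptions. It is not FedGMMAT's code or API. It fits a plain logistic regression (fixed effects only, no polygenic random effect) and runs a single-variant score test using only per-site summary matrices that a coordinator adds together. The encryption of intermediate statistics mentioned in the abstract is omitted entirely, and every function name below is hypothetical. The point it illustrates is that summing site-level summaries reproduces the pooled fit without moving individual-level rows.

```python
import numpy as np


def _mu_and_weights(X, beta):
    """Fitted probabilities and IRLS weights for logistic regression."""
    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    return mu, mu * (1.0 - mu)


def local_newton_stats(X, y, beta):
    """Summaries one site would share for a Newton step (hypothetical helper)."""
    mu, w = _mu_and_weights(X, beta)
    grad = X.T @ (y - mu)            # p-vector
    hess = X.T @ (X * w[:, None])    # p x p matrix
    return grad, hess


def fit_null_federated(sites, n_iter=25):
    """Fit the covariate-only (null) model from summed site summaries."""
    beta = np.zeros(sites[0][0].shape[1])
    for _ in range(n_iter):
        stats = [local_newton_stats(X, y, beta) for X, y, _ in sites]
        grad = sum(s[0] for s in stats)
        hess = sum(s[1] for s in stats)
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def local_score_stats(X, y, g, beta):
    """Summaries one site would share for the variant score test."""
    mu, w = _mu_and_weights(X, beta)
    return g @ (y - mu), g @ (w * g), X.T @ (w * g), X.T @ (X * w[:, None])


def score_test_federated(sites, beta):
    """Score statistic for one variant, adjusted for the covariates in X."""
    parts = [local_score_stats(X, y, g, beta) for X, y, g in sites]
    u = sum(p[0] for p in parts)
    gwg = sum(p[1] for p in parts)
    xwg = sum(p[2] for p in parts)
    xwx = sum(p[3] for p in parts)
    var = gwg - xwg @ np.linalg.solve(xwx, xwg)
    return u * u / var


def simulate_sites(n_sites=3, n=200, seed=0):
    """Toy data: intercept + one covariate, one variant, site-specific shift."""
    rng = np.random.default_rng(seed)
    sites = []
    for k in range(n_sites):
        X = np.column_stack([np.ones(n), rng.normal(size=n)])
        g = rng.binomial(2, 0.3, size=n).astype(float)
        eta = -0.5 + 0.8 * X[:, 1] + 0.1 * k
        y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta))).astype(float)
        sites.append((X, y, g))
    return sites


if __name__ == "__main__":
    sites = simulate_sites()
    beta = fit_null_federated(sites)
    print("null-model coefficients:", beta)
    print("score statistic:", score_test_federated(sites, beta))
```

Passing a single stacked dataset as the only "site" gives the pooled analysis, so the same two functions can be used to check that the federated and pooled answers agree up to floating-point error. A real deployment would additionally need the mixed-model variance components and a secure aggregation or encryption layer, neither of which this sketch attempts.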

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4591/11299833/7d42160f3f6f/pcbi.1012142.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4591/11299833/2a85f3f3a7da/pcbi.1012142.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4591/11299833/61d6c343e71a/pcbi.1012142.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4591/11299833/91eac209385b/pcbi.1012142.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4591/11299833/cbcd198c9ad2/pcbi.1012142.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4591/11299833/da9943becce8/pcbi.1012142.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4591/11299833/48084e1c3537/pcbi.1012142.g006.jpg
