SEAGLE: A Scalable Exact Algorithm for Large-Scale Set-Based Gene-Environment Interaction Tests in Biobank Data.

Authors

Chi Jocelyn T, Ipsen Ilse C F, Hsiao Tzu-Hung, Lin Ching-Heng, Wang Li-San, Lee Wan-Ping, Lu Tzu-Pin, Tzeng Jung-Ying

Affiliations

Department of Statistics, North Carolina State University, Raleigh, NC, United States.

Department of Mathematics, North Carolina State University, Raleigh, NC, United States.

Publication information

Front Genet. 2021 Nov 2;12:710055. doi: 10.3389/fgene.2021.710055. eCollection 2021.

DOI:10.3389/fgene.2021.710055
PMID:34795690
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8593472/
Abstract

The explosion of biobank data offers unprecedented opportunities for gene-environment interaction (G×E) studies of complex diseases because of the large sample sizes and the rich collection of genetic and non-genetic information. However, the extremely large sample size also introduces new computational challenges in G×E assessment, especially for set-based G×E variance component (VC) tests, which are a widely used strategy to boost overall G×E signals and to evaluate the joint G×E effect of multiple variants from a biologically meaningful unit (e.g., a gene). In this work, we focus on continuous traits and present SEAGLE, a Scalable Exact AlGorithm for Large-scale set-based G×E tests, to permit G×E VC tests for biobank-scale data. SEAGLE employs modern matrix computations to calculate the test statistic and p-value of the G×E VC test in a computationally efficient fashion, without imposing additional assumptions or relying on approximations. SEAGLE can easily accommodate sample sizes on the order of 10^5, is implementable on standard laptops, and does not require specialized computing equipment. We demonstrate the performance of SEAGLE using extensive simulations. We illustrate its utility by conducting genome-wide gene-based G×E analysis on the Taiwan Biobank data to explore the interaction of gene and physical activity status on body mass index.
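To make the idea of a set-based G×E variance component test concrete, the following is a minimal Python sketch on simulated data. It is not the SEAGLE algorithm: the abstract does not spell out the model, so this sketch assumes a simplified setup in which genetic main effects are fitted as fixed effects by least squares, and it uses a Satterthwaite moment-matching approximation for the p-value rather than an exact computation. The function name `gxe_score_test` and all simulation parameters are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.stats import chi2


def gxe_score_test(y, X, E, G):
    """Toy set-based GxE score test (illustrative only, not SEAGLE).

    y: (n,) continuous trait; X: (n, p) covariates including an intercept;
    E: (n,) environmental factor; G: (n, L) genotypes for one variant set.
    """
    n = y.shape[0]
    # Null design: covariates, environment, and genetic main effects.
    W = np.column_stack([X, E, G])
    # Thin QR gives an orthonormal basis for the null design. The projection
    # I - Q Q^T is applied implicitly, so no n x n matrix is ever formed.
    Q, _ = np.linalg.qr(W)

    def project(A):
        return A - Q @ (Q.T @ A)

    r = project(y)                       # null-model residuals
    sigma2 = r @ r / (n - Q.shape[1])    # residual variance estimate
    S = project(E[:, None] * G)          # projected interaction terms diag(E) G
    u = S.T @ r
    stat = (u @ u) / sigma2              # score-type statistic
    # Under the null, stat is approximately a weighted sum of chi-square(1)
    # variables whose weights are the eigenvalues of the small L x L matrix S^T S.
    lam = np.linalg.eigvalsh(S.T @ S)
    lam = lam[lam > 1e-10 * lam.max()]
    # Satterthwaite moment matching: stat ~ scale * chi-square(df).
    scale = np.sum(lam**2) / np.sum(lam)
    df = np.sum(lam) ** 2 / np.sum(lam**2)
    p_value = chi2.sf(stat / scale, df)
    return stat, p_value


# Simulated data under the null of no GxE effect (all values are made up).
rng = np.random.default_rng(0)
n, L = 2000, 10
X = np.column_stack([np.ones(n), rng.normal(size=n)])
E = rng.binomial(1, 0.5, size=n).astype(float)
G = rng.binomial(2, 0.3, size=(n, L)).astype(float)
y = (X @ np.array([1.0, 0.5]) + 0.3 * E
     + G @ rng.normal(0, 0.1, L) + rng.normal(size=n))

stat, p_value = gxe_score_test(y, X, E, G)
print(f"statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

The design choice worth noting is that every product involves either an n × L or an L × L matrix, which is what keeps the cost manageable as n grows. A method that is exact in the sense described in the abstract would replace the moment-matching step with an exact evaluation of the weighted chi-square mixture.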

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a96/8593472/059d842f6b76/fgene-12-710055-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a96/8593472/e5130fb076f3/fgene-12-710055-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a96/8593472/9d26a2710f3b/fgene-12-710055-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a96/8593472/2d3aab8b77eb/fgene-12-710055-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a96/8593472/78bbf25278b8/fgene-12-710055-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a96/8593472/d3136dc6e03d/fgene-12-710055-g009.jpg

Similar articles

1
SEAGLE: A Scalable Exact Algorithm for Large-Scale Set-Based Gene-Environment Interaction Tests in Biobank Data.
Front Genet. 2021 Nov 2;12:710055. doi: 10.3389/fgene.2021.710055. eCollection 2021.
2
GxEsum: a novel approach to estimate the phenotypic variance explained by genome-wide GxE interaction based on GWAS summary statistics for biobank-scale data.
Genome Biol. 2021 Jun 21;22(1):183. doi: 10.1186/s13059-021-02403-1.
3
Using Genetic Marginal Effects to Study Gene-Environment Interactions with GWAS Data.
Behav Genet. 2021 May;51(3):358-373. doi: 10.1007/s10519-021-10058-8. Epub 2021 Apr 26.
4
A versatile, fast and unbiased method for estimation of gene-by-environment interaction effects on biobank-scale datasets.
Nat Commun. 2023 Aug 25;14(1):5196. doi: 10.1038/s41467-023-40913-7.
5
Detecting gene-environment interactions from multiple continuous traits.
Bioinformatics. 2024 Jul 1;40(7). doi: 10.1093/bioinformatics/btae419.
6
A unified powerful set-based test for sequencing data analysis of GxE interactions.
Biostatistics. 2017 Jan;18(1):119-131. doi: 10.1093/biostatistics/kxw034. Epub 2016 Jul 28.
7
Efficient gene-environment interaction tests for large biobank-scale sequencing studies.
Genet Epidemiol. 2020 Nov;44(8):908-923. doi: 10.1002/gepi.22351. Epub 2020 Aug 30.
8
A fast and powerful linear mixed model approach for genotype-environment interaction tests in large-scale GWAS.
Brief Bioinform. 2023 Jan 19;24(1). doi: 10.1093/bib/bbac547.
9
Assessing gene-environment interactions for common and rare variants with binary traits using gene-trait similarity regression.
Genetics. 2015 Mar;199(3):695-710. doi: 10.1534/genetics.114.171686. Epub 2015 Jan 12.
10
A scalable and robust variance components method reveals insights into the architecture of gene-environment interactions underlying complex traits.
bioRxiv. 2023 Dec 13:2023.12.12.571316. doi: 10.1101/2023.12.12.571316.

Cited by

1
Marginal interaction test for detecting interactions between genetic marker sets and environment in genome-wide studies.
G3 (Bethesda). 2025 Jan 8;15(1). doi: 10.1093/g3journal/jkae263.
2
Genotype × environment interactions in gene regulation and complex traits.
Nat Genet. 2024 Jun;56(6):1057-1068. doi: 10.1038/s41588-024-01776-w. Epub 2024 Jun 10.
3
Gene-environment interactions in human health.
Nat Rev Genet. 2024 Nov;25(11):768-784. doi: 10.1038/s41576-024-00731-z. Epub 2024 May 28.
4
Editorial: Current Status and Future Challenges of Biobank Data Analysis.
Front Genet. 2022 Apr 14;13:882611. doi: 10.3389/fgene.2022.882611. eCollection 2022.

References

1
Integrating comprehensive functional annotations to boost power and accuracy in gene-based association analysis.
PLoS Genet. 2020 Dec 15;16(12):e1009060. doi: 10.1371/journal.pgen.1009060. eCollection 2020 Dec.
2
Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures.
J Am Stat Assoc. 2020;115(529):393-402. doi: 10.1080/01621459.2018.1554485. Epub 2019 Apr 25.
3
Efficient gene-environment interaction tests for large biobank-scale sequencing studies.
Genet Epidemiol. 2020 Nov;44(8):908-923. doi: 10.1002/gepi.22351. Epub 2020 Aug 30.
4
Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale.
Nat Genet. 2020 Sep;52(9):969-983. doi: 10.1038/s41588-020-0676-4. Epub 2020 Aug 24.
5
Heterogeneity in Obesity: Genetic Basis and Metabolic Consequences.
Curr Diab Rep. 2020 Jan 22;20(1):1. doi: 10.1007/s11892-020-1285-4.
6
The harmonic mean p-value for combining dependent tests.
Proc Natl Acad Sci U S A. 2019 Jan 22;116(4):1195-1200. doi: 10.1073/pnas.1814092116. Epub 2019 Jan 4.
7
A scalable estimator of SNP heritability for biobank-scale data.
Bioinformatics. 2018 Jul 1;34(13):i187-i194. doi: 10.1093/bioinformatics/bty253.
8
FastSKAT: Sequence kernel association tests for very large sets of markers.
Genet Epidemiol. 2018 Sep;42(6):516-527. doi: 10.1002/gepi.22136. Epub 2018 Jun 22.
9
Gene-by-environment interactions in urban populations modulate risk phenotypes.
Nat Commun. 2018 Mar 6;9(1):827. doi: 10.1038/s41467-018-03202-2.
10
Current Challenges and New Opportunities for Gene-Environment Interaction Studies of Complex Diseases.
Am J Epidemiol. 2017 Oct 1;186(7):753-761. doi: 10.1093/aje/kwx227.