Statistical power of transcriptome-wide association studies.

Affiliations

School of Statistics, University of Minnesota, Minneapolis, Minnesota, USA.

University of Minnesota, Division of Biostatistics, School of Public Health, Minneapolis, Minnesota, USA.

Publication information

Genet Epidemiol. 2022 Dec;46(8):572-588. doi: 10.1002/gepi.22491. Epub 2022 Jun 29.

DOI:10.1002/gepi.22491
PMID:35766062
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9669108/

Abstract

Transcriptome-wide association studies (TWASs) have become increasingly popular for identifying genes (or other endophenotypes or exposures) associated with complex traits. In TWAS, one first builds a predictive model for gene expression using an expression quantitative trait loci (eQTL) data set in stage 1, then tests the association between the predicted gene expression and a trait in a large, independent genome-wide association study (GWAS) data set in stage 2. However, since the sample size of the eQTL data set is usually small and the coefficient of multiple determination (i.e., R²) of the model is also small for many genes, a question of interest is to what extent these factors affect the statistical power of TWAS. In addition, in contrast to a standard (univariate) TWAS (UV-TWAS), which considers only a single gene at a time, multivariate TWAS (MV-TWAS) methods have recently emerged to account for the effects of multiple genes, or a gene's nonlinear effects, simultaneously. In the absence of power analyses for these MV-TWAS methods, it is of interest to investigate whether one gains or loses power by using the newly proposed MV-TWAS instead of UV-TWAS. In this paper, we first outline a general method for sample size/power calculations for two-sample TWAS, then use real data to address these questions empirically: the Alzheimer's Disease Neuroimaging Initiative (ADNI) eQTL data and the Genotype-Tissue Expression (GTEx) eQTL data for stage 1, and the International Genomics of Alzheimer's Project Alzheimer's disease (AD) GWAS summary data and UK Biobank (UKB) individual-level data for stage 2. Our most important conclusions are the following. First, a sample size of a few thousand (~8000) would suffice in stage 1, where the power of TWAS would be determined more by the cis-heritability of gene expression. Second, as in the general case of simple regression versus multiple regression, the power of MV-TWAS may be higher or lower than that of UV-TWAS, depending on the specific relationships among the GWAS trait and multiple genes (or linear and nonlinear terms of the same gene's expression levels), such as their correlations and effect sizes. Interestingly, several top genes with large power gains in MV-TWAS (over UV-TWAS) were known to be associated with AD, and were more significantly associated in our data. We also reached similar conclusions in an application to the GTEx whole blood gene expression data and UKB GWAS data of high-density lipoprotein cholesterol. The proposed method and the conclusions are expected to be useful in planning and designing future TWAS and other related studies (e.g., proteome- or metabolome-wide association studies) when determining the sample sizes for the two stages.
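
The simple-versus-multiple-regression point in the abstract can be illustrated with a small power calculation. The sketch below is not the paper's method. It is a textbook large-sample approximation under assumptions made here for illustration: the trait and two genetically predicted expression levels are standardized to unit variance, `beta1` and `beta2` are standardized effects, `rho` is the correlation between the two predicted expressions, and stage-1 prediction error is ignored. The function names (`power_1df`, `power_2df`, `compare_uv_mv`) are hypothetical. The univariate test of gene 1 is treated as a 1-df chi-square test and the joint test of both genes as a 2-df chi-square test, each with noncentrality n·R²/(1 − R²).

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_1df(ncp, alpha):
    """Power of a two-sided z-test (1-df chi-square) with noncentrality ncp."""
    z_crit = -_STD_NORMAL.inv_cdf(alpha / 2)
    shift = math.sqrt(ncp)
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def power_2df(ncp, alpha):
    """Power of a 2-df chi-square test with noncentrality ncp.

    Writes the noncentral chi-square as a Poisson mixture of central
    chi-squares with 2, 4, 6, ... df, whose upper tails have a closed form.
    Assumes a moderate ncp (below roughly 1400) so exp(-ncp/2) stays positive.
    """
    half_ncp = ncp / 2
    half_crit = -math.log(alpha)          # 2-df critical value is -2*ln(alpha)
    n_terms = int(half_ncp + 12 * math.sqrt(half_ncp) + 50)
    weight = math.exp(-half_ncp)          # Poisson weight for j = 0
    term = math.exp(-half_crit)           # k = 0 term of the central tail sum
    tail = term                           # upper tail of central chi-square, 2 df
    power = 0.0
    for j in range(n_terms):
        power += weight * tail
        weight *= half_ncp / (j + 1)      # Poisson weight for j + 1
        term *= half_crit / (j + 1)
        tail += term                      # upper tail for 2 * (j + 2) df
    return power


def compare_uv_mv(n, beta1, beta2, rho, alpha):
    """Return (power of 1-df test of gene 1 alone, power of 2-df joint test)."""
    r1 = beta1 + rho * beta2              # marginal correlation of trait with gene 1
    r2_joint = beta1**2 + beta2**2 + 2 * rho * beta1 * beta2
    ncp_uv = n * r1**2 / (1 - r1**2)
    ncp_mv = n * r2_joint / (1 - r2_joint)
    return power_1df(ncp_uv, alpha), power_2df(ncp_mv, alpha)


if __name__ == "__main__":
    alpha = 2.5e-6                        # illustrative Bonferroni-style threshold
    # Scenario A: only gene 1 matters and the genes are uncorrelated.
    print(compare_uv_mv(50_000, 0.03, 0.0, 0.0, alpha))
    # Scenario B: opposite-sign effects on positively correlated genes.
    print(compare_uv_mv(50_000, 0.03, -0.03, 0.5, alpha))
```

In scenario A the joint test carries the same signal as the univariate test but spends an extra degree of freedom, so its power is somewhat lower. In scenario B the second gene masks the marginal signal of the first, so the joint test is the more powerful of the two. All numeric inputs are illustrative and are not taken from the paper.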

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bb52/9796297/23ba12364943/GEPI-46-572-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bb52/9796297/f694b0c1263e/GEPI-46-572-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bb52/9796297/cfd83a8e4e85/GEPI-46-572-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bb52/9796297/87f87a0ed398/GEPI-46-572-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bb52/9796297/303017dea4bf/GEPI-46-572-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bb52/9796297/47563a1509ba/GEPI-46-572-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bb52/9796297/8490e0877ac3/GEPI-46-572-g005.jpg
