Integration of multidimensional splicing data and GWAS summary statistics for risk gene discovery.

Author affiliations

Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America.

Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America.

Publication information

PLoS Genet. 2022 Jun 30;18(6):e1009814. doi: 10.1371/journal.pgen.1009814. eCollection 2022 Jun.

DOI:10.1371/journal.pgen.1009814

PMID:35771864

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9278751/

Abstract

A common strategy for the functional interpretation of genome-wide association study (GWAS) findings has been the integrative analysis of GWAS and expression data. Using this strategy, many association methods (e.g., PrediXcan and FUSION) have been successful in identifying trait-associated genes via mediating effects on RNA expression. However, these approaches often ignore the effects of splicing, which can carry as much disease risk as expression. Compared to expression data, one challenge to detect associations using splicing data is the large multiple testing burden due to multidimensional splicing events within genes. Here, we introduce a multidimensional splicing gene (MSG) approach, which consists of two stages: 1) we use sparse canonical correlation analysis (sCCA) to construct latent canonical vectors (CVs) by identifying sparse linear combinations of genetic variants and splicing events that are maximally correlated with each other; and 2) we test for the association between the genetically regulated splicing CVs and the trait of interest using GWAS summary statistics. Simulations show that MSG has proper type I error control and substantial power gains over existing multidimensional expression analysis methods (i.e., S-MultiXcan, UTMOST, and sCCA+ACAT) under diverse scenarios. When applied to the Genotype-Tissue Expression Project data and GWAS summary statistics of 14 complex human traits, MSG identified on average 83%, 115%, and 223% more significant genes than sCCA+ACAT, S-MultiXcan, and UTMOST, respectively. We highlight MSG's applications to Alzheimer's disease, low-density lipoprotein cholesterol, and schizophrenia, and found that the majority of MSG-identified genes would have been missed from expression-based analyses. Our results demonstrate that aggregating splicing data through MSG can improve power in identifying gene-trait associations and help better understand the genetic risk of complex traits.
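
The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' MSG implementation. Stage 1 is approximated here by a rank-one sparse CCA that alternates soft-thresholded updates on the cross-product matrix, with fixed thresholds `lam_u` and `lam_v` standing in for a properly tuned sparsity constraint. Stage 2 uses the single-predictor summary-statistic formula common in transcriptome-wide association studies, z = w'z / sqrt(w'Rw), and assumes standardized genotypes and an LD matrix R taken from a reference panel. All function and parameter names are hypothetical, and how MSG combines several CVs into one gene-level test is not stated in the abstract, so it is not shown.

```python
import math


def soft_threshold(vec, lam):
    """Shrink every entry toward zero by lam; entries within lam of zero become 0."""
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in vec]


def normalize(vec):
    """Scale to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cross_product(X, Y):
    """Return K = X^T Y for X (n x p genotypes) and Y (n x q splicing events)."""
    p, q = len(X[0]), len(Y[0])
    return [[sum(xr[i] * yr[j] for xr, yr in zip(X, Y)) for j in range(q)]
            for i in range(p)]


def sparse_cca_rank1(X, Y, lam_u, lam_v, n_iter=100):
    """Stage 1 (simplified): one pair of sparse canonical weight vectors.

    u weights the genetic variants and v weights the splicing events.
    The fixed thresholds lam_u and lam_v are an assumption of this sketch.
    """
    K = cross_product(X, Y)
    p, q = len(K), len(K[0])
    u = [0.0] * p
    v = normalize([1.0] * q)  # simple deterministic starting point
    for _ in range(n_iter):
        Kv = [sum(K[i][j] * v[j] for j in range(q)) for i in range(p)]
        u = normalize(soft_threshold(Kv, lam_u))
        Ktu = [sum(K[i][j] * u[i] for i in range(p)) for j in range(q)]
        v = normalize(soft_threshold(Ktu, lam_v))
    return u, v


def summary_stat_z(weights, gwas_z, ld):
    """Stage 2 (single CV): z = w'z / sqrt(w' R w) from GWAS z-scores and LD."""
    m = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(weights[i] * ld[i][j] * weights[j]
                   for i in range(m) for j in range(m))
    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Toy data: 4 samples, 3 variants, 2 splicing events (made up for illustration).
    X = [[1.0, 1.0, 1.0], [-1.0, 1.0, -1.0], [1.0, -1.0, -1.0], [-1.0, -1.0, 1.0]]
    Y = [[2.0, 0.1], [-2.0, 0.1], [2.0, -0.1], [-2.0, -0.1]]
    u, v = sparse_cca_rank1(X, Y, lam_u=1.0, lam_v=1.0)
    identity_ld = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    z = summary_stat_z(u, [3.0, 0.5, -0.2], identity_ld)
    print(u, v, z, two_sided_p(z))
```

In this sketch the variant weights `u` from stage 1 play the role of the prediction weights in stage 2, so the genetically regulated CV is tested without individual-level GWAS data. If the thresholds are set too high, every weight is shrunk to zero and the stage 2 variance is zero, so a real analysis would need to tune them and guard against that case.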

Figure (pgen.1009814.g001): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/039f/9278751/3e3e67c859ff/pgen.1009814.g001.jpg

Similar articles

Leveraging expression from multiple tissues using sparse canonical correlation analysis and aggregate tests improves the power of transcriptome-wide association studies.

PLoS Genet. 2021 Apr 8;17(4):e1008973. doi: 10.1371/journal.pgen.1008973. eCollection 2021 Apr.

Investigation of multi-trait associations using pathway-based analysis of GWAS summary statistics.

BMC Genomics. 2019 Feb 4;20(Suppl 1):79. doi: 10.1186/s12864-018-5373-7.

Integrating eQTL data with GWAS summary statistics in pathway-based analysis with application to schizophrenia.

Genet Epidemiol. 2018 Apr;42(3):303-316. doi: 10.1002/gepi.22110. Epub 2018 Feb 7.

An analysis of genetically regulated gene expression across multiple tissues implicates novel gene candidates in Alzheimer's disease.

Alzheimers Res Ther. 2020 Apr 16;12(1):43. doi: 10.1186/s13195-020-00611-8.

A general framework for functionally informed set-based analysis: Application to a large-scale colorectal cancer study.

PLoS Genet. 2020 Aug 24;16(8):e1008947. doi: 10.1371/journal.pgen.1008947. eCollection 2020 Aug.

How powerful are summary-based methods for identifying expression-trait associations under different genetic architectures?

Pac Symp Biocomput. 2018;23:228-239.

Integrate multiple traits to detect novel trait-gene association using GWAS summary data with an adaptive test approach.

Bioinformatics. 2019 Jul 1;35(13):2251-2257. doi: 10.1093/bioinformatics/bty961.

Influence of tissue context on gene prioritization for predicted transcriptome-wide association studies.

Pac Symp Biocomput. 2019;24:296-307.

Methods for meta-analysis of multiple traits using GWAS summary statistics.

Genet Epidemiol. 2018 Mar;42(2):134-145. doi: 10.1002/gepi.22105. Epub 2017 Dec 10.

Cited by

Tensor decomposition of multi-dimensional splicing events across multiple tissues to identify splicing-mediated risk genes associated with complex traits.

PLoS Comput Biol. 2025 Jul 21;21(7):e1013303. doi: 10.1371/journal.pcbi.1013303. eCollection 2025 Jul.

Endophenotype 2.0: updated definitions and criteria for endophenotypes of psychiatric disorders, incorporating new technologies and findings.

Transl Psychiatry. 2024 Dec 24;14(1):502. doi: 10.1038/s41398-024-03195-1.

The goldmine of GWAS summary statistics: a systematic review of methods and tools.

BioData Min. 2024 Sep 5;17(1):31. doi: 10.1186/s13040-024-00385-x.

From bugs to bedside: functional annotation of human genetic variation for neurological disorders using invertebrate models.

Hum Mol Genet. 2022 Oct 20;31(R1):R37-R46. doi: 10.1093/hmg/ddac203.
