Annotation Regression for Genome-Wide Association Studies with an Application to Psychiatric Genomic Consortium Data.

Author information

Shin Sunyoung, Keleş Sündüz

Affiliations

Department of Statistics, Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, USA.

Publication information

Stat Biosci. 2017 Jun;9(1):50-72. doi: 10.1007/s12561-016-9154-z. Epub 2016 Aug 12.

DOI:10.1007/s12561-016-9154-z

PMID:28781711

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC5542423/

Abstract

Although genome-wide association studies (GWAS) have been successful at finding thousands of disease-associated genetic variants (GVs), identifying causal variants and elucidating the mechanisms by which genotypes influence phenotypes are critical open questions. A key challenge is that a large percentage of disease-associated GVs are potential regulatory variants located in noncoding regions, making them difficult to interpret. Recent research efforts focus on going beyond annotating GVs by integrating functional annotation data with GWAS to prioritize GVs. However, applicability of these approaches is challenged by high dimensionality and heterogeneity of functional annotation data. Furthermore, existing methods often assume global associations of GVs with annotation data. This strong assumption is susceptible to violations for GVs involved in many complex diseases. To address these issues, we develop a general regression framework, named Annotation Regression for GWAS (ARoG). ARoG is based on a finite mixture of linear regressions model where GWAS association measures are viewed as responses and functional annotations as predictors. This mixture framework addresses heterogeneity of effects of GVs by grouping them into clusters and high dimensionality of the functional annotations by enabling annotation selection within each cluster. ARoG further employs permutation testing to evaluate the significance of selected annotations. Computational experiments indicate that ARoG can discover distinct associations between disease risk and functional annotations. Application of ARoG to autism and schizophrenia data from Psychiatric Genomics Consortium led to identification of GVs that significantly affect interactions of several transcription factors with DNA as potential mechanisms contributing to these disorders.
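
To make the modeling idea in the abstract concrete, the following is a minimal sketch of a finite mixture of linear regressions fitted by expectation-maximization, with association measures as the response `y` and annotations as the columns of `X`. It is an illustration only and not the authors' ARoG implementation: the function name `fit_mixture_regression`, its parameters, and the small `ridge` term (used purely for numerical stability) are assumptions, and the within-cluster annotation selection and permutation testing described in the abstract are not reproduced here.

```python
import numpy as np

def fit_mixture_regression(X, y, n_clusters=2, n_iter=200, ridge=1e-6, seed=0):
    """EM for a Gaussian mixture of linear regressions (hypothetical helper).

    X: (n, p) annotation matrix; y: (n,) association measures.
    Returns mixing weights, per-cluster coefficients (intercept first),
    per-cluster noise variances, and posterior cluster probabilities.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])               # prepend intercept column
    resp = rng.dirichlet(np.ones(n_clusters), size=n)   # random soft assignments

    for _ in range(n_iter):
        # M-step: weighted least squares within each cluster.
        pi = np.clip(resp.mean(axis=0), 1e-12, None)
        beta = np.empty((n_clusters, p + 1))
        sigma2 = np.empty(n_clusters)
        for k in range(n_clusters):
            w = resp[:, k]
            A = Xd.T @ (Xd * w[:, None]) + ridge * np.eye(p + 1)
            beta[k] = np.linalg.solve(A, Xd.T @ (w * y))
            r = y - Xd @ beta[k]
            sigma2[k] = max((w * r**2).sum() / (w.sum() + 1e-12), 1e-8)

        # E-step: posterior probability of each cluster, computed in log space.
        resid = y[:, None] - Xd @ beta.T                 # (n, n_clusters)
        logp = np.log(pi) - 0.5 * np.log(2 * np.pi * sigma2) - 0.5 * resid**2 / sigma2
        logp -= logp.max(axis=1, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)

    return pi, beta, sigma2, resp
```

Each row of the returned `resp` gives the estimated probability that a variant belongs to each cluster, and each row of `beta` gives that cluster's annotation effects. EM of this kind can converge to a local optimum, so in practice several random seeds would typically be compared.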
