An approach to gene-based testing accounting for dependence of tests among nearby genes.

Author information

Department of Statistics & Data Science, Carnegie Mellon University, Pittsburgh, PA, USA.

Department of Computational Biology, Carnegie Mellon University, USA.

Publication information

Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab329.

Abstract

In genome-wide association studies (GWAS), it has become commonplace to test millions of single-nucleotide polymorphisms (SNPs) for phenotypic association. Gene-based testing can improve power to detect weak signals by reducing the multiple-testing burden and pooling signal strength. While such tests account for the linkage disequilibrium (LD) structure of SNP alleles within each gene, current approaches do not capture LD between SNPs falling in different nearby genes, which can induce correlation among gene-based test statistics. We introduce an algorithm to account for this correlation. When a gene's test statistic is independent of others, it is assessed separately; when test statistics for nearby genes are strongly correlated, their SNPs are agglomerated and tested as a locus. To provide insight into the SNPs and genes driving association within loci, we develop an interactive visualization tool to explore localized signal. We demonstrate our approach in the context of a weakly powered GWAS for autism spectrum disorder, which we contrast with more highly powered GWAS for schizophrenia and educational attainment. To increase power for these analyses, especially those for autism, we use adaptive $P$-value thresholding, guided by high-dimensional metadata modeled with gradient boosted trees, highlighting when and how it can be most useful. Notably, our workflow is based on summary statistics.
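The agglomeration step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `agglomerate_genes`, the greedy left-to-right merging rule, the precomputed correlation matrix `corr`, and the threshold of 0.5 are all assumptions made for illustration. It only shows the general idea of keeping a weakly correlated gene on its own and pooling the SNPs of strongly correlated neighbors into one locus.

```python
from typing import Dict, List, Sequence, Set


def agglomerate_genes(
    genes: Sequence[str],
    snps_by_gene: Dict[str, Set[str]],
    corr: Sequence[Sequence[float]],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[dict]:
    """Greedily merge position-ordered genes into loci.

    `genes` is assumed to be sorted by genomic position, and `corr[i][j]`
    is assumed to hold the correlation between the gene-based test
    statistics of genes i and j (for example, induced by LD).
    """
    loci: List[dict] = []
    for i, gene in enumerate(genes):
        # Merge into the current locus if this gene is strongly correlated
        # with any gene already in that locus.
        if loci and any(abs(corr[i][j]) >= threshold for j in loci[-1]["idx"]):
            loci[-1]["idx"].append(i)
            loci[-1]["genes"].append(gene)
            loci[-1]["snps"] |= snps_by_gene[gene]
        else:
            # Otherwise the gene starts a new locus and is assessed separately.
            loci.append(
                {"idx": [i], "genes": [gene], "snps": set(snps_by_gene[gene])}
            )
    return loci


# Toy example with made-up gene names, SNP IDs and correlations.
example_genes = ["GENE_A", "GENE_B", "GENE_C"]
example_snps = {
    "GENE_A": {"rs1", "rs2"},
    "GENE_B": {"rs2", "rs3"},
    "GENE_C": {"rs9"},
}
example_corr = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.1],
    [0.0, 0.1, 1.0],
]
loci = agglomerate_genes(example_genes, example_snps, example_corr)
for locus in loci:
    print(locus["genes"], sorted(locus["snps"]))
```

In this toy input, GENE_A and GENE_B exceed the assumed threshold and are pooled into one locus, while GENE_C stays on its own; each resulting SNP set would then be passed to whichever gene-based or locus-based test is in use.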
