Second-order group knockoffs with applications to genome-wide association studies.

Author information

Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA.

Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, 94305, USA.

Publication information

Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae580.

Abstract

MOTIVATION

Conditional testing via the knockoff framework allows one to identify, among a large number of possible explanatory variables, those that carry unique information about an outcome of interest, and it provides a false discovery rate guarantee on the selection. This approach is particularly well suited to the analysis of genome-wide association studies (GWAS), which aim to identify genetic variants that influence traits of medical relevance.
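To make the false discovery rate guarantee concrete, the following is a minimal Python sketch of the standard knockoff+ selection rule, which picks the smallest threshold t such that (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q. It is an illustration only, not code from Knockoffs.jl: the function names `knockoff_plus_threshold` and `select` and the example statistics are hypothetical, and the feature statistics W are assumed to have been computed already, with large positive values favoring the original variable over its knockoff.

```python
def knockoff_plus_threshold(W, q):
    """Return the knockoff+ threshold for feature statistics W at target FDR q.

    Hypothetical helper for illustration. Returns float('inf') when no
    candidate threshold meets the target, in which case nothing is selected.
    """
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        negatives = sum(1 for w in W if w <= -t)  # proxy for false discoveries
        positives = sum(1 for w in W if w >= t)   # discoveries at threshold t
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def select(W, q):
    """Return the indices of variables selected at target FDR q."""
    t = knockoff_plus_threshold(W, q)
    return [j for j, w in enumerate(W) if w >= t]


if __name__ == "__main__":
    # Made-up statistics, for demonstration only.
    W = [5.0, 4.2, 3.9, 3.1, 2.8, 2.5, -0.4, 0.3, -0.2, 1.9]
    print(select(W, q=0.2))
```

The "1 +" in the numerator is what distinguishes knockoff+ from the plain knockoff filter. It makes the rule slightly more conservative, which is why very small inputs often yield no selections at all.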

RESULTS

While conditional testing can be both more powerful and precise than traditional GWAS analysis methods, its vanilla implementation encounters a difficulty common to all multivariate analysis methods: it is challenging to distinguish among multiple, highly correlated regressors. This impasse can be overcome by shifting the object of inference from single variables to groups of correlated variables. To achieve this, it is necessary to construct "group knockoffs." While successful examples are already documented in the literature, this paper substantially expands the set of algorithms and software for group knockoffs. We focus in particular on second-order knockoffs, for which we describe correlation matrix approximations that are appropriate for GWAS data and that result in considerable computational savings. We illustrate the effectiveness of the proposed methods with simulations and with the analysis of albuminuria data from the UK Biobank.
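As a rough illustration of shifting inference from single variables to groups, the Python sketch below aggregates per-variable effect sizes into one statistic per group: the sum of absolute coefficients of the original variables in the group, minus the same sum for their knockoffs. This is one commonly used choice and is assumed here purely for illustration; it is not necessarily the statistic used in the paper or in Knockoffs.jl. The function name `group_statistics`, its arguments, and the example values are hypothetical.

```python
from collections import defaultdict


def group_statistics(beta, beta_knockoff, groups):
    """Compute W_g = sum_{j in g} |beta_j| - sum_{j in g} |beta_knockoff_j|.

    beta and beta_knockoff hold the fitted coefficients of the original
    variables and of their knockoffs; groups[j] is the group label of
    variable j. All three names are illustrative assumptions.
    """
    if not (len(beta) == len(beta_knockoff) == len(groups)):
        raise ValueError("beta, beta_knockoff and groups must have equal length")
    W = defaultdict(float)
    for b, bk, g in zip(beta, beta_knockoff, groups):
        W[g] += abs(b) - abs(bk)
    return dict(W)


if __name__ == "__main__":
    # Made-up coefficients: two correlated variants in group "A", two in "B".
    beta = [0.5, -0.25, 0.0, 0.0]
    beta_knockoff = [0.0, 0.125, 0.0, 0.25]
    print(group_statistics(beta, beta_knockoff, ["A", "A", "B", "B"]))
```

A large positive value indicates that the group as a whole carries more signal than its knockoff copy, without requiring the analysis to decide which of the highly correlated members is responsible.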

AVAILABILITY AND IMPLEMENTATION

The described algorithms are implemented in the open-source Julia package Knockoffs.jl. R and Python wrappers are available as the knockoffsr and knockoffspy packages.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/60cf/11639161/f4f706f0f1c5/btae580f1.jpg
