Controlled Variable Selection from Summary Statistics Only? A Solution via GhostKnockoffs and Penalized Regression.

Authors

Chen Zhaomeng, He Zihuai, Chu Benjamin B, Gu Jiaqi, Morrison Tim, Sabatti Chiara, Candès Emmanuel

Affiliations

Department of Statistics, Stanford University.

Department of Neurology and Neurological Sciences, Stanford University.

Publication

ArXiv. 2024 Feb 20:arXiv:2402.12724v1.

Abstract

Identifying which variables influence a response while controlling false positives is a pervasive problem in statistics and data science. In this paper, we consider a scenario in which we only have access to summary statistics, such as the marginal empirical correlations between each variable of potential interest and the response. This situation may arise due to privacy concerns, e.g., to avoid the release of sensitive genetic information. We extend GhostKnockoffs [He et al., 2022] and introduce variable selection methods based on penalized regression that achieve false discovery rate (FDR) control. We report empirical results from extensive simulation studies, demonstrating enhanced performance over previous work. We also apply our methods to genome-wide association studies of Alzheimer's disease and observe a significant improvement in power.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0931/10925382/19dc03e3e007/nihpp-2402.12724v1-f0007.jpg
