High-dimensional variable selection for ordinal outcomes with error control.

Affiliation

Ohio State University.

Publication information

Brief Bioinform. 2021 Jan 18;22(1):334-345. doi: 10.1093/bib/bbaa007.

Abstract

Many high-throughput genomic applications involve a large set of potential covariates and a response which is frequently measured on an ordinal scale, and it is crucial to identify which variables are truly associated with the response. Effectively controlling the false discovery rate (FDR) without sacrificing power has been a major challenge in variable selection research. This study reviews two existing variable selection frameworks, model-X knockoffs and a modified version of reference distribution variable selection (RDVS), both of which utilize artificial variables as benchmarks for decision making. Model-X knockoffs constructs a 'knockoff' variable for each covariate to mimic the covariance structure, while RDVS generates only one null variable and forms a reference distribution by performing multiple runs of model fitting. Herein, we describe how different importance measures for ordinal responses can be constructed that fit into these two selection frameworks, using either penalized regression or machine learning techniques. We compared these measures in terms of the FDR and power using simulated data. Moreover, we applied these two frameworks to high-throughput methylation data for identifying features associated with the progression from normal liver tissue to hepatocellular carcinoma to further compare and contrast their performances.
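To make the selection step of the knockoff framework concrete, the following is a minimal sketch in Python. It assumes that an importance statistic W_j has already been computed for each covariate by contrasting the importance of the original variable with that of its knockoff, so that large positive values suggest a real association. The threshold rule shown is the standard "knockoff+" rule from the knockoff filter literature; the abstract does not spell it out, and the function names `knockoff_plus_threshold` and `knockoff_select` and the toy values are hypothetical, chosen only for illustration.

```python
import numpy as np


def knockoff_plus_threshold(W, q=0.1):
    """Return the smallest t > 0 whose estimated false discovery
    proportion (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) is at most q.
    Returns infinity when no such t exists (nothing is selected)."""
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return float(t)
    return float("inf")


def knockoff_select(W, q=0.1):
    """Indices of covariates whose statistic reaches the threshold."""
    W = np.asarray(W, dtype=float)
    tau = knockoff_plus_threshold(W, q)
    return np.flatnonzero(W >= tau)


# Toy statistics (made up): eleven clearly positive values followed by
# three small values of mixed sign that behave like nulls.
W_demo = np.array([5.0, 4.0, 3.5, 3.0, 2.8, 2.5, 2.2, 2.0,
                   1.8, 1.5, 1.2, -0.5, 0.3, -0.2])
tau_demo = knockoff_plus_threshold(W_demo, q=0.1)
selected_demo = knockoff_select(W_demo, q=0.1)
```

The negative statistics act as the benchmark: because a null variable is equally likely to produce a positive or a negative W_j, the count of values below -t estimates how many of the values above t are false discoveries. How W_j is built for an ordinal response (for example from penalized ordinal regression coefficients or a machine learning importance score) is the part the study compares and is not reproduced here.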



