Model-free prediction test with application to genomics data.

Affiliations

Department of Statistics, Iowa State University, Ames, IA 50011.

Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213.

Publication information

Proc Natl Acad Sci U S A. 2022 Aug 23;119(34):e2205518119. doi: 10.1073/pnas.2205518119. Epub 2022 Aug 15.

Abstract

Testing the significance of predictors in a regression model is one of the most important topics in statistics. This problem is especially difficult without any parametric assumptions on the data. This paper aims to test the null hypothesis that, given confounding variables $Z$, $X$ does not significantly contribute to the prediction of $Y$ under the model-free setting, where $X$ and $Z$ are possibly high dimensional. We propose a general framework that first fits nonparametric machine learning regression algorithms on $Y \mid Z$ and $Y \mid (X, Z)$, then compares the prediction power of the two models. The proposed method allows us to leverage the strength of the most powerful regression algorithms developed in the modern machine learning community. The $p$-value for the test can be easily obtained by permutation. In simulations, we find that the proposed method is more powerful than existing methods. The proposed method allows us to draw biologically meaningful conclusions from two gene expression data analyses without strong distributional assumptions: 1) testing the prediction power of sequencing RNA for the proteins in cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) data and 2) identification of spatially variable genes in spatially resolved transcriptomics data.
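The framework described in the abstract (fit one regression on the confounders alone, fit another on the predictors plus confounders, compare their prediction power, and obtain a p-value by permutation) can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not specify the learner, the data split, or the permutation scheme, so the k-nearest-neighbour regressor, the 50/50 hold-out split, the sign-flip permutation of paired squared-error differences, and the names `knn_predict` and `prediction_test` are all assumptions made for illustration. The data are synthetic.

```python
import random


def knn_predict(train_rows, train_y, query, k):
    """k-nearest-neighbour regression: average Y over the k closest training rows."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), y)
        for row, y in zip(train_rows, train_y)
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def prediction_test(X, Z, Y, k=5, n_perm=2000, seed=0):
    """Hypothetical helper: p-value for 'X adds no predictive power for Y given Z'."""
    rng = random.Random(seed)
    idx = list(range(len(Y)))
    rng.shuffle(idx)
    train, test = idx[: len(idx) // 2], idx[len(idx) // 2:]

    reduced = [list(z) for z in Z]                    # features for Y | Z
    full = [list(x) + list(z) for x, z in zip(X, Z)]  # features for Y | (X, Z)
    reduced_train = [reduced[j] for j in train]
    full_train = [full[j] for j in train]
    y_train = [Y[j] for j in train]

    # Per-sample gain in squared error on held-out data (positive = X helps).
    gains = []
    for i in test:
        pred_reduced = knn_predict(reduced_train, y_train, reduced[i], k)
        pred_full = knn_predict(full_train, y_train, full[i], k)
        gains.append((Y[i] - pred_reduced) ** 2 - (Y[i] - pred_full) ** 2)

    observed = sum(gains) / len(gains)
    # Assumed permutation scheme: random sign flips of the paired differences.
    exceed = 0
    for _ in range(n_perm):
        flipped = sum(g * rng.choice((-1, 1)) for g in gains) / len(gains)
        exceed += flipped >= observed
    return (1 + exceed) / (1 + n_perm)


# Synthetic example: Y depends on both X and Z, so the test should reject.
data_rng = random.Random(1)
Z = [[data_rng.gauss(0, 1)] for _ in range(200)]
X = [[data_rng.gauss(0, 1)] for _ in range(200)]
Y = [2 * x[0] + z[0] + data_rng.gauss(0, 0.1) for x, z in zip(X, Z)]
p_signal = prediction_test(X, Z, Y)
print(p_signal)
```

Evaluating both models on held-out data keeps the comparison about prediction rather than in-sample fit, and any regression algorithm could stand in for the nearest-neighbour learner. The conditions under which such a test is valid are given in the paper, not in this sketch.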
