

Uniform approximation is more appropriate for Wilcoxon Rank-Sum Test in gene set analysis.

Affiliation

Biostatistics Program, School of Public Health, Louisiana State University Health Sciences Center, New Orleans, Louisiana, United States of America.

Publication information

PLoS One. 2012;7(2):e31505. doi: 10.1371/journal.pone.0031505. Epub 2012 Feb 7.

Abstract

Gene set analysis is widely used to facilitate biological interpretation in analyses of differential expression from high-throughput profiling data. The Wilcoxon Rank-Sum (WRS) test is one of the commonly used methods in gene set enrichment analysis. It compares the ranks of genes in a gene set against those of genes outside the gene set. This method is easy to implement, and it eliminates the dichotomization of genes into significant and non-significant in competitive hypothesis testing. Because of the large number of genes being examined, it is impractical to calculate the exact null distribution for the WRS test. Therefore, the normal distribution is commonly used as an approximation. However, as we demonstrate in this paper, the normal approximation is problematic when a gene set with a relatively small number of genes is tested against the large number of genes in the complementary set. In this situation, a uniform approximation is substantially more powerful, more accurate, and less computationally intensive. We demonstrate the advantage of the uniform approximation in Gene Ontology (GO) term analysis using simulations and real data sets.
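The contrast drawn in the abstract can be illustrated at the most extreme possible rank sum, where the exact upper-tail probability is known in closed form as 1 / C(N, m). The sketch below is a minimal illustration, not the authors' implementation. The abstract does not spell out how the uniform approximation is constructed, so `uniform_upper_p` rests on an assumption: each rank divided by N + 1 is treated as an independent Uniform(0, 1) draw, which makes the scaled rank sum follow the Irwin-Hall distribution. All function names and the values N = 20000 and m = 3 are hypothetical.

```python
import math
from fractions import Fraction


def normal_upper_p(w, m, n_total):
    """Upper-tail p-value of rank sum w under the usual normal approximation."""
    mean = m * (n_total + 1) / 2
    var = m * (n_total - m) * (n_total + 1) / 12
    z = (w - 0.5 - mean) / math.sqrt(var)  # 0.5 is a continuity correction
    return 0.5 * math.erfc(z / math.sqrt(2))


def irwin_hall_cdf(x, m):
    """Exact CDF at x of the sum of m independent Uniform(0, 1) variables."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= m:
        return Fraction(1)
    # Exact rational arithmetic avoids cancellation in the alternating sum.
    total = sum(
        (-1) ** k * math.comb(m, k) * (x - k) ** m
        for k in range(math.floor(x) + 1)
    )
    return total / math.factorial(m)


def uniform_upper_p(w, m, n_total):
    """ASSUMPTION (not from the abstract): rank / (N + 1) ~ Uniform(0, 1)."""
    return float(1 - irwin_hall_cdf(Fraction(w, n_total + 1), m))


# Hypothetical sizes: a 3-gene set tested against 20000 genes in total.
N, m = 20000, 3
w_max = sum(range(N - m + 1, N + 1))  # all set members hold the top ranks
exact_p = 1 / math.comb(N, m)         # only one arrangement is this extreme
uniform_p = uniform_upper_p(w_max, m, N)
normal_p = normal_upper_p(w_max, m, N)
print(f"exact={exact_p:.3g} uniform={uniform_p:.3g} normal={normal_p:.3g}")
```

With m = 3 the standardized rank sum cannot exceed roughly 3, so the normal tail stays near 10^-3 however extreme the ranks are, while the exact tail is many orders of magnitude smaller. Under the stated assumption the Irwin-Hall tail lands far closer to the exact value, which is consistent with the abstract's claim about small gene sets.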


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/123f/3274536/5ce4a39eefa3/pone.0031505.g001.jpg
