Parallelized calculation of permutation tests.

Affiliations

Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH - Royal Institute of Technology, 171 21 Solna, Sweden.

Department of Mathematics, Stockholm University, 106 91 Stockholm, Sweden.

Publication information

Bioinformatics. 2021 Apr 1;36(22-23):5392-5397. doi: 10.1093/bioinformatics/btaa1007.

DOI:10.1093/bioinformatics/btaa1007
PMID:33289531
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8016463/
Abstract

MOTIVATION

Permutation tests offer a straightforward framework to assess the significance of differences in sample statistics. A significant advantage of permutation tests is that relatively few assumptions about the distribution of the test statistic are needed, as they rely only on the assumption of exchangeability of the group labels. They have great value, as they allow a sensitivity analysis to determine the extent to which the assumed broad sample distribution of the test statistic applies. However, in this situation, permutation tests are rarely applied because naïve implementations are too slow, with a running time that grows exponentially with the sample size. Continued development in the 1980s nevertheless introduced dynamic programming algorithms that compute exact permutation tests in polynomial time. Despite this significant reduction in running time, the exact test has not yet become one of the predominant statistical tests for medium sample sizes. Here, we propose a computational parallelization of one such dynamic programming-based permutation test, the Green algorithm, which makes the permutation test more attractive.
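To make the exponential cost concrete, the following is a minimal sketch of a naïve exact permutation test. It is not taken from the paper or its repository; the function name `exact_permutation_pvalue` and the choice of the group sum as the test statistic (equivalent to the difference in means when group sizes are fixed) are assumptions made for illustration. The loop visits every one of the C(n, m) relabellings, which is why this approach becomes impractical as the sample size grows.

```python
from itertools import combinations
from math import comb


def exact_permutation_pvalue(a, b):
    """One-sided exact p-value by brute-force enumeration (illustrative only).

    Returns the fraction of relabellings whose group-A sum is at least
    as large as the observed group-A sum.
    """
    pooled = list(a) + list(b)
    m = len(a)
    observed = sum(a)
    # Enumerate every way of choosing which m pooled values get label "A".
    hits = sum(
        1
        for idx in combinations(range(len(pooled)), m)
        if sum(pooled[i] for i in idx) >= observed
    )
    return hits / comb(len(pooled), m)
```

For two groups of 3 samples there are only 20 relabellings, but for two groups of 50 there are about 10^29, so enumeration is already out of reach at medium sample sizes.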

RESULTS

Parallelization of the Green algorithm was found to be possible through a non-trivial rearrangement of the structure of the algorithm. A speed-up by orders of magnitude is achievable by executing the parallelized algorithm on a GPU. We demonstrate that the execution time essentially becomes a non-issue, even for sample sizes as high as hundreds of samples. This improvement makes our method an attractive alternative to, e.g. the widely used asymptotic Mann-Whitney U-test.
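The idea of replacing enumeration with dynamic programming can be sketched as follows. This is not the authors' implementation and is not claimed to reproduce the Green algorithm or its rearrangement; it is a generic subset-sum counting recurrence, assuming non-negative integer data, with the hypothetical name `dp_permutation_pvalue`. Each pooled value updates a table `counts[j, s]` holding the number of size-`j` subsets with sum `s`. Within one update every cell depends only on the previous table, so the whole step is written as a single array operation, which is the kind of data parallelism a GPU could exploit.

```python
import numpy as np
from math import comb


def dp_permutation_pvalue(a, b):
    """One-sided exact p-value via subset-sum counting (illustrative sketch).

    Assumes non-negative integer samples. int64 counts overflow for
    large samples; a real implementation would need wider arithmetic.
    """
    pooled = np.asarray(list(a) + list(b), dtype=np.int64)
    m = len(a)
    observed = int(np.sum(a))
    total = int(pooled.sum())

    # counts[j, s] = number of size-j subsets of the values seen so far with sum s.
    counts = np.zeros((m + 1, total + 1), dtype=np.int64)
    counts[0, 0] = 1
    for z in pooled:
        z = int(z)
        # Copy the previous state so every cell is updated from old values only.
        shifted = counts[:-1, : total + 1 - z].copy()
        counts[1:, z:] += shifted  # all (j, s) cells update independently

    hits = int(counts[m, observed:].sum())
    return hits / comb(len(pooled), m)
```

The table has (m + 1) x (total + 1) cells and is updated once per pooled value, so the cost is polynomial in the sample size and the data range rather than exponential in the sample size.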

AVAILABILITY AND IMPLEMENTATION

The method is implemented in Python 3 and is available from the GitHub repository https://github.com/statisticalbiotechnology/parallelPermutationTest under an Apache 2.0 license.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5f9/8016463/ee474b055d0e/btaa1007f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5f9/8016463/9494ea0da3f5/btaa1007f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5f9/8016463/4247e80ebd81/btaa1007f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5f9/8016463/c9fcf9d00f23/btaa1007f3.jpg
