Permutation-based statistical tests for multiple hypotheses.

Author information

Camargo Anyela, Azuaje Francisco, Wang Haiying, Zheng Huiru

Affiliation

University of Ulster at Jordanstown, School of Computing and Mathematics, Shore Road, Newtownabbey, Co. Antrim, BT37 0QB, Northern Ireland, UK.

Publication information

Source Code Biol Med. 2008 Oct 21;3:15. doi: 10.1186/1751-0473-3-15.

Abstract

BACKGROUND

Genomics and proteomics analyses regularly involve testing hundreds of hypotheses simultaneously, on either numerical or categorical data. To correct for the occurrence of false positives, validation tests based on multiple testing correction, such as the Bonferroni and the Benjamini and Hochberg procedures, and on re-sampling, such as permutation tests, are frequently used. Despite the known power of permutation-based tests, most available tools offer such tests only for the t-test or ANOVA. Less attention has been given to tests for categorical data, such as the Chi-square test. This project takes a first step toward filling that gap by developing Ptest, an open-source software tool that incorporates these and other statistical tests with options for multiple-hypothesis correction.
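As an illustration of the two multiple testing corrections named above, the following is a minimal Python sketch of Bonferroni and Benjamini and Hochberg adjusted p-values. It is not taken from Ptest; the function names `bonferroni` and `benjamini_hochberg` and the example p-values are assumptions made here for illustration only.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini and Hochberg adjusted p-values (step-up FDR procedure)."""
    m = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values from four simultaneous tests.
    raw = [0.01, 0.04, 0.03, 0.005]
    print("Bonferroni:", bonferroni(raw))
    print("Benjamini-Hochberg:", benjamini_hochberg(raw))
```

A hypothesis would then be called significant when its adjusted p-value falls below the chosen threshold. Bonferroni controls the family-wise error rate and is the more conservative of the two; the Benjamini and Hochberg procedure controls the false discovery rate.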

RESULTS

This study developed a public-domain, user-friendly software tool with a twofold purpose: first, to estimate test statistics for categorical and numerical data; and second, to validate the significance of those test statistics via the Bonferroni correction, the Benjamini and Hochberg correction, and a permutation test, for both numerical and categorical data. The tool calculates the Chi-square test for categorical data, and the ANOVA test, Bartlett's test, and the t-test for paired and unpaired data. Once a test statistic is calculated, the Bonferroni, Benjamini and Hochberg, and permutation tests are applied independently to control for Type I errors. An evaluation of the software on different public data sets is reported, which illustrates the power of permutation tests for assessing multiple hypotheses and for controlling the rate of Type I errors.
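To make the idea of a permutation test on categorical data concrete, here is a minimal Python sketch that estimates a p-value for the Pearson Chi-square statistic by shuffling one set of labels. It is an illustration under stated assumptions, not the Ptest implementation: the function names, the default of 2000 permutations, the fixed seed, and the add-one correction are all choices made here.

```python
import random
from collections import Counter


def chi_square_statistic(groups, outcomes):
    """Pearson Chi-square statistic for the contingency table of two label lists."""
    n = len(groups)
    observed = Counter(zip(groups, outcomes))
    row_totals = Counter(groups)
    col_totals = Counter(outcomes)
    stat = 0.0
    for g, r in row_totals.items():
        for o, c in col_totals.items():
            expected = r * c / n  # expected count under independence
            stat += (observed[(g, o)] - expected) ** 2 / expected
    return stat


def permutation_pvalue(groups, outcomes, n_permutations=2000, seed=0):
    """Estimate a p-value by shuffling the outcome labels (assumed defaults)."""
    rng = random.Random(seed)
    observed_stat = chi_square_statistic(groups, outcomes)
    shuffled = list(outcomes)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point summation order.
        if chi_square_statistic(groups, shuffled) >= observed_stat - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical data: group membership perfectly predicts the outcome.
    groups = ["a"] * 10 + ["b"] * 10
    outcomes = ["x"] * 10 + ["y"] * 10
    print("statistic:", chi_square_statistic(groups, outcomes))
    print("permutation p-value:", permutation_pvalue(groups, outcomes))
```

Shuffling the outcome labels breaks any association with the groups while preserving both marginal totals, so the shuffled statistics approximate the null distribution without relying on the asymptotic Chi-square approximation. In a multiple-hypothesis setting, one such p-value would be computed per hypothesis.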

CONCLUSION

The analytical options offered by the software can support a broad range of hypothesis-testing tasks in functional genomics, using both numerical and categorical data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6faa/2611984/8a7ade541b3b/1751-0473-3-15-1.jpg
