
Why, When and How to Adjust Your P Values?

Authors

Jafari Mohieddin, Ansari-Pour Naser

Affiliations

Drug Design and Bioinformatics Unit, Medical Biotechnology Department, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran.

Faculty of New Sciences and Technologies, University of Tehran, Tehran, Iran.

Publication information

Cell J. 2019 Jan;20(4):604-607. doi: 10.22074/cellj.2019.5992. Epub 2018 Aug 1.

Abstract

Numerous papers now report analyses of biological data at different omics levels based on statistical inference. Many studies, such as those published in this Journal, report associations of gene(s) at the genomic and transcriptomic levels using appropriate statistical tests. For instance, genotype, allele or haplotype frequencies at the genomic level are compared between case and control groups using the Chi-square or Fisher's exact test, and normalized expression levels at the transcriptomic level are compared using the independent (i.e. two-sample) t-test. Each comparison culminates in a single number, the P value (or the degree of the false positive rate), which is used to make or break the outcome of the association test. This approach has flaws but nevertheless remains a standard and convenient approach in association studies. However, a critical issue arises when the same cut-off is used while 'multiple' tests are undertaken on the same case-control (or any pairwise) comparison. Here, we briefly present what the P value represents, and why and when it should be adjusted. We also show, with worked examples, how to adjust P values for multiple testing in the R environment for statistical computing (http://www.R-project.org).
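As a rough illustration of what adjusting P values for multiple testing involves, the sketch below applies two widely used corrections, Bonferroni and Benjamini-Hochberg, to a small set of P values. It is written in Python and is not the article's worked example; R users would normally call the built-in `p.adjust` function instead. The input values and the function names `bonferroni` and `benjamini_hochberg` are hypothetical, chosen only for this example.

```python
# Minimal sketch of two common multiple-testing corrections.
# The raw P values below are made up for illustration; they are not from the article.

def bonferroni(pvals):
    """Multiply each P value by the number of tests, capping at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values (controls the false discovery rate)."""
    m = len(pvals)
    # Visit the P values from largest to smallest.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, i in enumerate(order):
        rank = m - pos  # 1-based rank of this P value in ascending order
        # Scale by m / rank, then enforce monotonicity with a running minimum.
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.001, 0.008, 0.039, 0.041, 0.20]  # hypothetical results of 5 tests
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)

alpha = 0.05  # assumed significance cut-off
for name, values in [("raw", raw), ("Bonferroni", bonf), ("BH", bh)]:
    n_sig = sum(p < alpha for p in values)
    print(f"{name}: {n_sig} of {len(values)} tests below {alpha}")
```

With these made-up inputs, four of the five raw P values fall below 0.05, but only two remain below it after either correction, which is the kind of change in conclusions that using one cut-off across multiple tests can hide.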

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e760/6099145/2f720450f49f/Cell-J-20-604-g01.jpg
