Statistical Significance

Author information

Tenny Steven, Abdelgawad Ibrahim

Affiliations

University of Nebraska Medical Center

Ain Shams University, Cairo, Egypt

Abstract

In research, statistical significance is assessed by comparing the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true, with the level of uncertainty the researcher is willing to accept. We can better understand statistical significance by breaking apart a study design. When creating a study, the researcher starts with a hypothesis; that is, they must have some idea of what they think the outcome may be. For example, suppose a study is investigating a new medication to lower blood pressure. The researcher hypothesizes that the new medication lowers systolic blood pressure by at least 10 mm Hg compared to not taking the new medication. The hypothesis can be stated: "Taking the new medication will lower systolic blood pressure by at least 10 mm Hg compared to not taking the medication."
In science, researchers can never prove a statement, because there are infinitely many alternative explanations for why an outcome may have occurred. They can only try to disprove a specific hypothesis. The researcher must therefore formulate a statement that can be disproven and whose rejection supports the conclusion that the new medication lowers systolic blood pressure. The hypothesis to be disproven is the null hypothesis, which is typically the inverse of the research hypothesis. Thus, the null hypothesis for our researcher would be, "Taking the new medication will not lower systolic blood pressure by at least 10 mm Hg compared to not taking the new medication."
The researcher now has the null hypothesis for the research and must specify the significance level, or level of acceptable uncertainty. Even when disproving a hypothesis, the researcher cannot be 100% certain of the outcome. The researcher must therefore settle on a level of confidence that their finding is correct. The significance level is denoted by the Greek letter alpha and specifies the probability of being incorrect that the researcher is willing to accept.
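For readers who prefer symbols, the example's hypotheses might be written as below. This is only a sketch: the symbol \delta (the true mean reduction in systolic blood pressure attributable to the medication) and the labels H_1 and H_0 are notation assumed here, not taken from the original text. The last line restates alpha in its usual sense, the probability of rejecting the null hypothesis when it is in fact true.

```latex
\begin{align*}
H_1 &: \delta \ge 10 \text{ mm Hg} && \text{(research hypothesis)} \\
H_0 &: \delta < 10 \text{ mm Hg}   && \text{(null hypothesis)} \\
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ is true})
\end{align*}
```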
Generally, a researcher wants to be correct about their outcome 95% of the time, so the researcher is willing to be incorrect 5% of the time. Probabilities are expressed as decimals, with 1.0 meaning certain (100%) and 0 meaning impossible (0%). Alpha is the decimal expression of how often the researcher is willing to be incorrect. For the current example, the alpha (significance level) is 0.05, or a 5% chance of being incorrect about the study's outcome.
Now the researcher can perform the research. In this example, a prospective randomized controlled study is conducted in which the researcher gives some individuals the new medication and others a placebo. The researcher then measures the blood pressure of both groups after a specified time and performs a statistical analysis of the results to obtain a P value (probability value). Several different tests can be used, depending on the type of variable being studied and the number of subjects. The choice of test is outside the scope of this review, but the output would be a P value. Using the correct statistical test when calculating the P value is imperative. If the researchers use the wrong test, the P value will not be accurate, and this result can mislead the researcher. A P value is the probability, under a specified statistical model, that a statistical summary of the data (eg, the sample mean difference between 2 compared groups) would be equal to or more extreme than its observed value. In this example, suppose the researcher found that blood pressure tended to decrease after taking the new medication, with an average decrease of 15 mm Hg in the group taking the new medication.
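The text leaves the choice of test open, so the following is only one illustrative way to obtain a P value for a difference between 2 groups: a one-sided permutation test, written with the Python standard library. The function name, the `margin` parameter, and the blood pressure numbers are all hypothetical and chosen for illustration; they are not the study's data or the authors' method.

```python
import random


def permutation_p_value(treated, placebo, margin=0.0, n_resamples=5000, seed=0):
    """One-sided permutation P value for 'treated reduction exceeds placebo by more than margin'."""
    rng = random.Random(seed)
    # Shift the treated group so the boundary null (difference == margin)
    # becomes 'no difference', which makes the group labels exchangeable.
    shifted = [x - margin for x in treated]
    n_treated = len(shifted)
    n_placebo = len(placebo)
    observed = sum(shifted) / n_treated - sum(placebo) / n_placebo
    pooled = shifted + list(placebo)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = sum(pooled[:n_treated]) / n_treated - sum(pooled[n_treated:]) / n_placebo
        if diff >= observed:  # as extreme as, or more extreme than, the observed difference
            hits += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical reductions in systolic blood pressure (mm Hg); not real study data.
treated = [18, 12, 21, 15, 9, 17, 14, 20, 13, 16]
placebo = [2, -1, 4, 0, 3, -2, 5, 1, 2, 0]

p = permutation_p_value(treated, placebo)
print(f"P value (null: no difference): {p:.4f}")
```

Passing `margin=10.0` would instead test against the example's "at least 10 mm Hg" threshold, under the added assumption that the medication shifts every reading by the same amount.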
The researcher then worked with a statistician to perform the correct analysis and arrived at a P value of 0.02 for the decrease in blood pressure in those taking the new medication versus those not taking the new medication. The researcher now has the 3 pieces of information required to assess statistical significance: the null hypothesis, the significance level, and the P value. A study result is statistically significant if the P value of the data analysis is less than the prespecified alpha (significance level). In this example, the P value is 0.02, which is less than the prespecified alpha of 0.05, so the researcher rejects the null hypothesis at the predetermined significance level and accepts the hypothesis, concluding that the finding that the new medication lowers blood pressure is statistically significant.
What does this mean? The P value is not the probability of the null hypothesis itself. It is the probability that, if the null hypothesis were true and the study were repeated an infinite number of times, the findings would be as extreme as, or more extreme than, the one calculated in this study. A P value of 0.02 therefore signifies that only 2% of such repeated studies would find a result at least as extreme as the one in this study. Because a result this extreme would be rare if the medication truly did not lower blood pressure, the researcher treats the null hypothesis as implausible. As the P value implies, though, there remains a chance that this conclusion is wrong and the medication truly has no effect on blood pressure.
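The "repeated studies" reading of the P value can be checked with a small simulation. The sketch below assumes a simplified setting that is not in the original text: the medication has no effect at all, blood pressure changes are normally distributed with a known standard deviation, and each study is analyzed with a one-sided z test. Under those assumptions, roughly 5% of the simulated studies should still yield P < 0.05 by chance alone. All names and parameter values are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_null_rejection_rate(n_studies=2000, n_per_group=30, sigma=12.0,
                                 alpha=0.05, seed=1):
    """Share of simulated no-effect studies whose one-sided P value falls below alpha."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of the difference between two group means with known sigma.
    standard_error = sigma * (2.0 / n_per_group) ** 0.5
    rejections = 0
    for _ in range(n_studies):
        # Both groups are drawn from the same distribution: the null is true.
        treated = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        placebo = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        diff = sum(treated) / n_per_group - sum(placebo) / n_per_group
        p_value = 1.0 - std_normal.cdf(diff / standard_error)
        if p_value < alpha:
            rejections += 1
    return rejections / n_studies


rate = simulate_null_rejection_rate()
print(f"Share of no-effect studies with P < 0.05: {rate:.3f}")
```

The printed share is the long-run false-positive rate that alpha is meant to cap, which is why a small P value counts as evidence against the null hypothesis without being proof.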
Because the researcher prespecified an alpha of 0.05 and the P value of 0.02 is less than that alpha, the researcher rejects the null hypothesis. By rejecting the null hypothesis, the researcher accepts the alternative hypothesis. The researcher rejects the idea that there is no difference in systolic blood pressure with the new medication and accepts a difference of at least 10 mm Hg in systolic blood pressure when taking the new medication. If the researcher had prespecified an alpha of 0.01, implying they wanted to be 99% sure the new medication lowered the blood pressure by at least 10 mm Hg, the P value of 0.02 would be greater than the prespecified alpha of 0.01. The researcher would conclude that the study did not reach statistical significance, as the P value is equal to or greater than the prespecified alpha, and would not be able to reject the null hypothesis.
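The decision rule described above reduces to a single comparison. The short sketch below applies it to the example's P value of 0.02 at both alpha levels mentioned in the text; the function name is hypothetical.

```python
def is_statistically_significant(p_value, alpha):
    """Return True when the P value is strictly below the prespecified alpha."""
    if not (0.0 <= p_value <= 1.0 and 0.0 < alpha < 1.0):
        raise ValueError("p_value must be in [0, 1] and alpha in (0, 1)")
    return p_value < alpha


p_value = 0.02
for alpha in (0.05, 0.01):
    if is_statistically_significant(p_value, alpha):
        verdict = "reject"
    else:
        verdict = "fail to reject"
    print(f"alpha = {alpha}: {verdict} the null hypothesis")
```

The comparison is strict, matching the text: a P value equal to alpha does not reach statistical significance.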


