Ulrich, Rolf; Miller, Jeff
Department of Psychology, University of Tübingen.
Department of Psychology, University of Otago.
J Exp Psychol Gen. 2015 Dec;144(6):1137-45. doi: 10.1037/xge0000086.
Simonsohn, Nelson, and Simmons (2014) have suggested a novel test to detect p-hacking in research, that is, the practice by which researchers report excessive rates of "significant" effects that are actually false positives. Although this test is useful for identifying true effects in some cases, it fails to detect false positives in several situations in which researchers conduct multiple statistical tests (e.g., reporting only the most significant result). In such cases, p-curves are right-skewed and thereby mimic the existence of real effects even when no effect is actually present.
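A minimal simulation sketch (not part of the original article) illustrates the selection mechanism described above: if each study runs several independent tests on null data and reports only the smallest p value, the p-curve of the "significant" results is right-skewed even though no effect exists. The number of studies, the choice of k = 5 tests per study, and the alpha level are arbitrary assumptions for illustration.

import numpy as np

rng = np.random.default_rng(seed=1)

n_studies = 200_000  # simulated studies, none with a real effect
k = 5                # independent tests per study (an arbitrary assumption)
alpha = 0.05

# Under the null hypothesis, every p value is uniform on (0, 1).
p_all = rng.uniform(size=(n_studies, k))

# Post hoc selection: only the smallest p value of each study is reported.
p_reported = p_all.min(axis=1)

# Only "significant" results enter the p-curve.
p_sig = p_reported[p_reported < alpha]

# A flat p-curve (no effect, no selection) would put about 50% of these below alpha/2;
# with this selection the share rises to roughly .53, i.e., the curve is right-skewed.
share_low = np.mean(p_sig < alpha / 2)
print(f"{p_sig.size} of {n_studies} studies reach significance")
print(f"share of significant p values below {alpha / 2}: {share_low:.3f}")

Because a right-skewed p-curve is the pattern Simonsohn et al. (2014) interpret as evidence of a true effect, even this modest skew shows how post hoc selection among multiple tests can masquerade as evidential value.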