Institute for Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.
Department of Psychiatry, Harvard Medical School, and Center for Depression, Anxiety and Stress Research, McLean Hospital, Belmont, Massachusetts, USA.
Psychophysiology. 2022 Sep;59(9):e14058. doi: 10.1111/psyp.14058. Epub 2022 Apr 2.
Raw data typically must be processed before they are ready for statistical analysis, and processing pipelines are often characterized by substantial heterogeneity. Here, we applied seven different skin conductance response (SCR) quantification approaches (trough-to-peak scoring by two different raters, script-based baseline correction, Ledalab, and four different models implemented in the software PsPM) to two fear conditioning data sets. The selection of approaches was guided by a systematic literature search, using fear conditioning research as a case example. Our approach can be viewed as a set of robustness analyses (i.e., the same data subjected to different processing pipelines) that investigate whether, and to what extent, these quantification approaches yield comparable results. To our knowledge, no formal framework for evaluating robustness analyses exists to date, but some criteria can be borrowed from a framework suggested for evaluating "replicability" in general. Our results from seven different SCR quantification approaches applied to two data sets with different paradigms suggest that there may be no single approach that consistently yields larger effect sizes and could be universally considered "best." Yet at least some of the approaches show consistent effect sizes within each data set, indicating comparability. Finally, we highlight that substantial heterogeneity also exists within most quantification approaches, and we discuss implications and potential remedies.
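As an illustration of what a robustness analysis of this kind involves, the following minimal Python sketch scores the same toy traces with two simplified quantification rules and compares the resulting effect sizes. Everything in it is an assumption made for illustration: the function names, the window parameters, and the hand-written traces are hypothetical, the two rules are rough stand-ins for trough-to-peak scoring and baseline correction, and none of it reproduces the scoring criteria, Ledalab, or the PsPM models used in the study.

```python
from statistics import mean, stdev

def trough_to_peak(trace, onset, window):
    """Simplified stand-in: peak in the response window minus the
    lowest point that precedes the peak inside that window."""
    segment = trace[onset:onset + window]
    peak_idx = max(range(len(segment)), key=segment.__getitem__)
    trough = min(segment[:peak_idx + 1])
    return segment[peak_idx] - trough

def baseline_corrected(trace, onset, window, baseline_len):
    """Simplified stand-in: peak in the response window minus the
    mean of the samples immediately before stimulus onset."""
    baseline = mean(trace[onset - baseline_len:onset])
    return max(trace[onset:onset + window]) - baseline

def cohens_d(a, b):
    """Cohen's d for two independent groups, using the pooled SD."""
    pooled_var = ((len(a) - 1) * stdev(a) ** 2 +
                  (len(b) - 1) * stdev(b) ** 2) / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5

# Hypothetical toy traces (arbitrary units); stimulus onset at index 3.
cs_plus = [
    [1.0, 1.0, 1.0, 1.0, 1.2, 1.8, 1.6, 1.3],
    [2.0, 2.1, 2.0, 1.9, 2.0, 2.6, 2.5, 2.2],
    [0.5, 0.5, 0.6, 0.6, 0.9, 1.3, 1.2, 1.0],
]
cs_minus = [
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.1, 1.1, 1.0],
    [2.0, 2.0, 1.9, 1.9, 1.9, 2.1, 2.0, 2.0],
    [0.5, 0.6, 0.5, 0.5, 0.6, 0.7, 0.6, 0.6],
]

ONSET, WINDOW, BASELINE_LEN = 3, 5, 3  # assumed values, in samples

# Same data, two pipelines: score every trace with each rule.
pipelines = {
    "trough_to_peak": lambda t: trough_to_peak(t, ONSET, WINDOW),
    "baseline_corrected": lambda t: baseline_corrected(t, ONSET, WINDOW, BASELINE_LEN),
}
effect_sizes = {
    name: cohens_d([score(t) for t in cs_plus], [score(t) for t in cs_minus])
    for name, score in pipelines.items()
}

for name, d in effect_sizes.items():
    print(f"{name}: Cohen's d (CS+ vs. CS-) = {d:.2f}")
```

The two rules already disagree on individual traces whenever the pre-stimulus baseline differs from the trough inside the response window, which is one simple way that identical data can yield different effect sizes across pipelines.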