Department of Neurological Surgery, Weill Institute for Neurosciences, Brain and Spinal Injury Center, University of California San Francisco, San Francisco, CA, USA.
W.M. Keck Center for Collaborative Neuroscience, Rutgers University, New Brunswick, NJ, USA.
Neuroinformatics. 2022 Jan;20(1):39-52. doi: 10.1007/s12021-021-09512-z. Epub 2021 Mar 2.
Meta-analyses suggest that the published literature represents only a small minority of the total data collected in biomedical research, with most becoming 'dark data' unreported in the literature. Dark data results from publication bias toward novel results that confirm investigator hypotheses, and from the omission of data that do not. Publication bias contributes to scientific irreproducibility and failures in bench-to-bedside translation. Sharing dark data by making it Findable, Accessible, Interoperable, and Reusable (FAIR) may reduce the burden of irreproducible science by increasing transparency, and may support data-driven discoveries beyond the lifecycle of the original study. We illustrate the feasibility of dark data sharing by recovering original raw data from the Multicenter Animal Spinal Cord Injury Study (MASCIS), an NIH-funded multi-site preclinical drug trial conducted in the 1990s that tested the efficacy of several therapies after spinal cord injury (SCI). The original drug treatments did not produce clear positive results, and MASCIS data were stored in boxes for more than two decades. The goal of the present study was to independently confirm published machine learning findings that perioperative blood pressure is a major predictor of SCI neuromotor outcome (Nielson et al., 2015). We recovered, digitized, and curated the data from 1125 MASCIS rats. Analyses indicated that high perioperative blood pressure at the time of SCI is associated with poorer health and worse neuromotor outcomes in more severe SCI, whereas low perioperative blood pressure is associated with poorer health and worse neuromotor outcomes in moderate SCI. These findings confirm and expand prior results that a narrow window of blood-pressure control optimizes outcome, and demonstrate the value of recovering dark data for assessing the reproducibility of findings with implications for precision therapeutic approaches.