Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Mailstop M2-C200, Seattle, WA 98109-1024, USA.
Brief Bioinform. 2013 Jul;14(4):391-401. doi: 10.1093/bib/bbs078. Epub 2012 Nov 27.
With the development of novel assay technologies, biomedical experiments and analyses have gone through substantial evolution. Today, a typical experiment can simultaneously measure hundreds to thousands of individual features (e.g. genes) in dozens of biological conditions, resulting in gigabytes of data that need to be processed and analyzed. Because of the multiple steps involved in the data generation and analysis and the lack of details provided, it can be difficult for independent researchers to try to reproduce a published study. With the recent outrage following the halt of a cancer clinical trial due to the lack of reproducibility of the published study, researchers are now facing heavy pressure to ensure that their results are reproducible. Despite the global demand, too many published studies remain non-reproducible mainly due to the lack of availability of experimental protocol, data and/or computer code. Scientific discovery is an iterative process, where a published study generates new knowledge and data, resulting in new follow-up studies or clinical trials based on these results. As such, it is important for the results of a study to be quickly confirmed or discarded to avoid wasting time and money on novel projects. The availability of high-quality, reproducible data will also lead to more powerful analyses (or meta-analyses) where multiple data sets are combined to generate new knowledge. In this article, we review some of the recent developments regarding biomedical reproducibility and comparability and discuss some of the areas where the overall field could be improved.