Department of Computer Science, Tufts University, 177 College Ave, 02155, MA, USA.
Department of Biological Sciences, University of Rhode Island, 120 Flagg Rd, 02881, RI, USA.
Database (Oxford). 2022 Aug 17;2022. doi: 10.1093/database/baac065.
Reproducibility of research is essential for science. However, in the way modern computational biology research is done, it is easy to lose track of small but critical details. Key details, such as the specific version of the software used or the iteration of a genome, can easily be lost in the shuffle or never noted at all. Much work is being done on the database and storage side, ensuring that there is a place to store experiment-specific details, but current mechanisms for recording those details are cumbersome for scientists to use. We propose a new metadata description language, named MEtaData Format for Open Reef Data (MEDFORD), in which scientists can record all details relevant to their research. Being human-readable, easily editable and templatable, MEDFORD serves as a collection point for all notes that a researcher could find relevant to their research, be it for internal use or for future replication. MEDFORD has been applied to coral research, documenting research from RNA-seq analyses to photo collections.