McKinney Bill, Meyer Peter A, Crosas Mercè, Sliz Piotr
Department of Biochemistry and Molecular Pharmacology and SBGrid Initiative, Harvard Medical School, Boston, Massachusetts, and the Dataverse Project, Harvard University, Cambridge, Massachusetts.
Ann N Y Acad Sci. 2017 Jan;1387(1):95-104. doi: 10.1111/nyas.13272. Epub 2016 Nov 10.
Access to experimental X-ray diffraction image data is important for validation and reproduction of macromolecular models and indispensable for the development of structural biology processing methods. In response to the evolving needs of the structural biology community, we recently established a diffraction data publication system, the Structural Biology Data Grid (SBDG, data.sbgrid.org), to preserve primary experimental datasets supporting scientific publications. All datasets published through the SBDG are freely available to the research community under a public domain dedication license, with metadata compliant with the DataCite Schema (schema.datacite.org). A proof-of-concept study demonstrated community interest and utility. Publication of large datasets is a challenge shared by several fields, and the SBDG has begun collaborating with the Institute for Quantitative Social Science at Harvard University to extend the Dataverse (dataverse.org) open-source data repository system to structural biology datasets. Several extensions are necessary to support the size and metadata requirements for structural biology datasets. In this paper, we describe one such extension (functionality supporting preservation of file system structure within Dataverse), which is essential both for in-place computation and for supporting non-HTTP data transfers.