DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, California, 94720, USA.
Michigan State University, Department of Microbiology & Molecular Genetics, East Lansing, Michigan, 48824, USA.
Sci Data. 2024 Nov 6;11(1):1200. doi: 10.1038/s41597-024-04049-7.
Growth in sequencing capacity, combined with the rapid accumulation of publications and associated data resources, has made it more complex to maintain associations between literature and genomic data. As the volume of literature and data has exceeded the capacity of manual curation, automated approaches to maintaining and confirming associations among these resources have become necessary. Here we present the Data Citation Explorer (DCE), which discovers literature that incorporates genomic data without formally citing it. This service provides advantages over manual curation methods, including consistent resource coverage, metadata enrichment, documentation of new use cases, and identification of conflicting metadata. The service reduces labor costs associated with manual review, improves the quality of genome metadata maintained by the U.S. Department of Energy Joint Genome Institute (JGI), and increases the number of known publications that incorporate its data products. The DCE facilitates an understanding of JGI impact, improves credit attribution for data generators, and can encourage data sharing by allowing scientists to see how reuse amplifies the impact of their original studies.