IT Center for Clinical Research, University of Lübeck, Lübeck, Germany.
Institute of Medical Informatics, University of Lübeck, Lübeck, Germany.
J Med Internet Res. 2022 Jan 11;24(1):e25440. doi: 10.2196/25440.
Metadata are created to describe the corresponding data in a detailed and unambiguous way and are used for various applications in different research areas, for example, data identification and classification. However, a clear definition of metadata is crucial for further use. Unfortunately, extensive experience with the processing and management of metadata has shown that the term "metadata" and its use are not always unambiguous.
This study aimed to understand the definition of metadata and the challenges resulting from metadata reuse.
A systematic literature search was performed following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Five research questions were defined to streamline the review process; they addressed metadata characteristics, metadata standards, use cases, and problems encountered. The review was preceded by a harmonization process to reach a shared understanding of the terms used.
The harmonization process resulted in a clear set of definitions for metadata processing, with a focus on data integration. The subsequent literature review was conducted by 10 reviewers with different backgrounds, using the harmonized definitions. After several filtering steps to identify the most relevant papers, 81 peer-reviewed papers from the last decade were included. All 5 research questions could be answered, resulting in a broad overview of the standards, use cases, problems, and corresponding solutions for the application of metadata in different research areas.
Metadata can be a powerful tool for identifying, describing, and processing information, but its meaningful creation is costly and challenging. This review process uncovered many standards, use cases, problems, and solutions for dealing with metadata. The presented harmonized definitions and the new schema have the potential to improve the classification and generation of metadata by creating a shared understanding of metadata and its context.