Abaza Haitham, Shutsko Aliaksandra, Klopfenstein Sophie A I, Vorisek Carina N, Schmidt Carsten Oliver, Brünings-Kuppe Claudia, Clemens Vera, Darms Johannes, Hanß Sabine, Intemann Timm, Jannasch Franziska, Kasbohm Elisa, Lindstädt Birte, Löbe Matthias, Nimptsch Katharina, Nöthlings Ute, Ocanto Marisabel Gonzalez, Osei Tracy Bonsu, Perrar Ines, Peters Manuela, Pischon Tobias, Sax Ulrich, Schulze Matthias B, Schwarz Florian, Schwedhelm Carolina, Thun Sylvia, Waltemath Dagmar, Wünsche Hannes, Zeleke Atinkut A, Müller Wolfgang, Golebiewski Martin
Scientific Databases and Visualization, Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany.
Information Centre for Life Sciences, German National Library of Medicine (ZB MED), Cologne, Germany.
JMIR Med Inform. 2025 May 21;13:e63906. doi: 10.2196/63906.
Although the FAIR (findability, accessibility, interoperability, and reusability) principles are widely accepted in medical research, implementing them in certain health domains and achieving interoperability across data sources remain challenging. While clinical trial registries collect metadata about clinical studies, numerous epidemiological and public health studies remain unregistered or lack detailed information about relevant study documents. Making valuable data from these studies available to the research community could improve our understanding of various diseases and their risk factors. The National Research Data Infrastructure for Personal Health Data (NFDI4Health) seeks to optimize data sharing among the clinical, epidemiological, and public health research communities while preserving privacy and complying with ethical regulations.
We aimed to develop a tailored metadata schema (MDS) to support the standardized publication of health studies' metadata in NFDI4Health services and beyond. This study describes the development, structure, and implementation of this MDS designed to improve the FAIRness of metadata from clinical, epidemiological, and public health research while maintaining compatibility with metadata models of other resources to ease interoperability.
Based on the models of DataCite, ClinicalTrials.gov, and other data models and international standards, the first MDS version was developed by the NFDI4Health Task Force COVID-19. It was later extended in a modular fashion, combining generic and NFDI4Health use case-specific metadata items relevant to domains of nutritional epidemiology, chronic diseases, and record linkage. Mappings to schemas of clinical trial registries and international and local initiatives were performed to enable interfacing with external resources. The MDS is represented in Microsoft Excel spreadsheets. A transformation into an improved and interactive machine-readable format was completed using the ART-DECOR (Advanced Requirement Tooling-Data Elements, Codes, OIDs, and Rules) tool to facilitate editing, maintenance, and versioning.
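To illustrate what a mapping between an external registry schema and the MDS could look like in practice, the following is a minimal sketch in Python. The field names on both sides (for example `BriefTitle` and `title`) and the function `map_record` are hypothetical and chosen only for illustration; they are not taken from the actual MDS or its published mappings.

```python
from __future__ import annotations

from typing import Any

# Hypothetical mapping from registry-style field names to MDS-style item names.
# These names are assumptions for illustration, not the real NFDI4Health mapping.
REGISTRY_TO_MDS: dict[str, str] = {
    "BriefTitle": "title",
    "OverallStatus": "study_status",
    "StudyType": "study_type",
}


def map_record(
    record: dict[str, Any], mapping: dict[str, str]
) -> tuple[dict[str, Any], list[str]]:
    """Rename mapped fields and collect source fields that have no mapping."""
    mapped: dict[str, Any] = {}
    unmapped: list[str] = []
    for key, value in record.items():
        if key in mapping:
            mapped[mapping[key]] = value
        else:
            unmapped.append(key)
    return mapped, unmapped


# Example source record with one field that the mapping does not cover.
source = {
    "BriefTitle": "Example cohort",
    "StudyType": "Observational",
    "Sponsor": "Example institute",
}
mapped, unmapped = map_record(source, REGISTRY_TO_MDS)
```

Returning the unmapped fields alongside the mapped record makes gaps in the mapping visible, which is useful when a schema is extended over several versions.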
The MDS is implemented in NFDI4Health services (eg, the German Central Health Study Hub and the Local Data Hub) to structure and exchange study-related metadata. Its current version (3.3) comprises 220 metadata items in 5 modules. The core and design modules cover generic metadata, including bibliographic information, study design details, and data access information. Domain-specific metadata are included in use case-specific modules, currently comprising nutritional epidemiology, chronic diseases, and record linkage. All modules incorporate mandatory, optional, and conditional items. Mappings to the schemas of clinical trial registries and other resources enable integrating their study metadata in the NFDI4Health services. The current MDS version is available in both Excel and ART-DECOR formats.
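The distinction between mandatory, optional, and conditional items can be sketched as a small validation routine. This is an illustrative sketch only: the item names, the module assignments, and the rule that `intervention_name` is required for interventional studies are assumptions, not items or rules from MDS version 3.3.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Item:
    name: str
    module: str
    obligation: str  # "mandatory", "optional", or "conditional"
    # For conditional items: predicate deciding whether the item is required.
    condition: Optional[Callable[[dict[str, Any]], bool]] = None


# Hypothetical schema excerpt; the names and the condition are assumptions.
SCHEMA = [
    Item("title", "core", "mandatory"),
    Item("description", "core", "optional"),
    Item(
        "intervention_name",
        "design",
        "conditional",
        lambda record: record.get("study_type") == "interventional",
    ),
    Item("study_type", "design", "mandatory"),
]


def missing_items(record: dict[str, Any], schema: list[Item]) -> list[str]:
    """Return the names of required items that are absent or empty."""
    missing = []
    for item in schema:
        required = item.obligation == "mandatory" or (
            item.obligation == "conditional"
            and item.condition is not None
            and item.condition(record)
        )
        if required and record.get(item.name) in (None, ""):
            missing.append(item.name)
    return missing


# A conditional item becomes required only when its condition holds.
interventional = {"title": "Example trial", "study_type": "interventional"}
observational = {"title": "Example cohort", "study_type": "observational"}
```

Here `missing_items(interventional, SCHEMA)` reports the conditional item as missing, whereas the observational record passes because the condition does not apply to it.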
With its implementation in the German Central Health Study Hub and the Local Data Hub, the MDS improves the FAIRness of data from clinical, epidemiological, and public health research. Due to its generic nature and interoperability through mappings to other schemas, it is transferable to services from adjacent domains, making it useful for a broader user community.