Ruiz Oscar E, Wagenaar Joost B, Mehta Bella, Ziogas Ilias, Swanson Lyndie, Worley Kim C, Cruz-Almeida Yenisel, Johnson Alisa J, Boline Jyl, Boccanfuso Jacqueline, Martone Maryann E, Haelterman Nele A
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA.
Exp Neurol. 2025 Jun 5;392:115333. doi: 10.1016/j.expneurol.2025.115333.
Large, interdisciplinary team science initiatives are increasingly leveraged to uncover novel insights into complex scientific problems. Such projects typically aim to produce large, harmonized datasets that can be analyzed to yield breakthrough discoveries using cutting-edge scientific methods. Successfully harmonizing and integrating datasets generated by different technologies and research groups is a considerable task, which requires an extensive supportive framework that is built by all members involved. Such a data harmonization framework includes a shared language to communicate across teams and disciplines, harmonized methods and protocols, (meta)data standards and common data elements, and the appropriate infrastructure to support the framework's development and implementation. In addition, a supportive data harmonization framework also entails adopting processes to decide on which elements to harmonize and to help individual team members implement agreed-upon data workflows in their own laboratories/centers. Building an effective data harmonization framework requires buy-in, team building, and significant effort from all members involved. While the nature and individual elements of these frameworks are project-specific, some common challenges typically arise that are independent of the research questions, scientific techniques, or model systems involved. In this perspective, we build on our collective experiences as part of the REstoring JOINt health and function to reduce pain (RE-JOIN) Consortium to provide guidance for developing research-centered data collection and analysis pipelines that enable downstream integrated analyses within and across diverse teams.