Mugotitsa Bylhah, Bhattacharjee Tathagata, Ochola Michael, Mailosi Dorothy, Amadi David, Andeso Pauline, Kuria Joseph, Momanyi Reinpeter, Omondi Evans, Kajungu Dan, Todd Jim, Kiragga Agnes, Greenfield Jay
African Population and Health Research Center (APHRC), Nairobi, Kenya.
Strathmore University Business School, Strathmore University, Nairobi, Kenya.
Front Big Data. 2024 Oct 11;7:1435510. doi: 10.3389/fdata.2024.1435510. eCollection 2024.
Longitudinal studies are essential for understanding how mental health disorders progress over time, but combining data that were collected with different methods to assess conditions such as depression, anxiety, and psychosis poses significant challenges. This study describes a mapping technique that converts diverse longitudinal data into a standardized staging database, leveraging the Data Documentation Initiative (DDI) Lifecycle and the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) standards to ensure consistency and compatibility across datasets.
The "INSPIRE" project integrates longitudinal data from African studies into a staging database that follows metadata documentation standards and is structured as a snowflake schema. This facilitates the development of Extraction, Transformation, and Loading (ETL) scripts for integrating data into the OMOP CDM. The staging database schema is designed to capture the dynamic nature of longitudinal studies, including changes in research protocols and the use of different instruments across data collection waves.
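As an illustration only, a snowflake-style staging schema of this kind might be sketched as follows. Every table name, column name, and sample row here is a hypothetical assumption rather than the INSPIRE schema; PHQ-9 and GAD-7 are used merely as familiar example instruments. The central `response` table points to dimension tables (`wave`, `question`) that are themselves normalized into further tables (`study`, `instrument`), which is what allows one study to use different instruments in different waves.

```python
import sqlite3

# Hypothetical snowflake-style staging schema; names are illustrative,
# not taken from the INSPIRE project.
SCHEMA = """
CREATE TABLE study (
    study_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL
);
CREATE TABLE wave (
    wave_id          INTEGER PRIMARY KEY,
    study_id         INTEGER NOT NULL REFERENCES study(study_id),
    wave_number      INTEGER NOT NULL,
    protocol_version TEXT
);
CREATE TABLE instrument (
    instrument_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE question (
    question_id   INTEGER PRIMARY KEY,
    instrument_id INTEGER NOT NULL REFERENCES instrument(instrument_id),
    label         TEXT NOT NULL
);
CREATE TABLE response (
    response_id    INTEGER PRIMARY KEY,
    participant_id INTEGER NOT NULL,
    wave_id        INTEGER NOT NULL REFERENCES wave(wave_id),
    question_id    INTEGER NOT NULL REFERENCES question(question_id),
    value          TEXT
);
"""


def build_staging(conn):
    """Create the schema and load a few made-up rows."""
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO study VALUES (1, 'Example cohort')")
    conn.executemany(
        "INSERT INTO wave VALUES (?, ?, ?, ?)",
        [(1, 1, 1, "v1"), (2, 1, 2, "v2")],
    )
    conn.executemany(
        "INSERT INTO instrument VALUES (?, ?)",
        [(1, "PHQ-9"), (2, "GAD-7")],
    )
    conn.executemany(
        "INSERT INTO question VALUES (?, ?, ?)",
        [(1, 1, "PHQ-9 item 1"), (2, 2, "GAD-7 item 1")],
    )
    conn.executemany(
        "INSERT INTO response VALUES (?, ?, ?, ?, ?)",
        [(1, 101, 1, 1, "2"), (2, 101, 2, 1, "1"), (3, 101, 2, 2, "3")],
    )
    conn.commit()


def instruments_by_wave(conn):
    """List which instruments were actually used in each wave."""
    return conn.execute(
        """
        SELECT DISTINCT w.wave_number, i.name
        FROM response r
        JOIN wave w       ON w.wave_id = r.wave_id
        JOIN question q   ON q.question_id = r.question_id
        JOIN instrument i ON i.instrument_id = q.instrument_id
        ORDER BY w.wave_number, i.name
        """
    ).fetchall()
```

Calling `build_staging(sqlite3.connect(":memory:"))` and then `instruments_by_wave` on the same connection reports that the second wave added an instrument that the first wave did not use, the kind of protocol change the staging schema is meant to record.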
Utilizing this mapping method, we streamlined the data migration process to the staging database, enabling subsequent integration into the OMOP CDM. Adherence to metadata standards ensures data quality, promotes interoperability, and expands opportunities for data sharing in mental health research.
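To make the idea of an ETL step from a staging record into the OMOP CDM more concrete, the following is a minimal sketch under stated assumptions, not the authors' scripts. The `StagingResponse` type, the `CONCEPT_MAP` dictionary, and its placeholder concept IDs are hypothetical; the target field names follow the OMOP CDM `observation` table, and `0` is used for source codes without a mapped concept, as is conventional in OMOP.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class StagingResponse:
    """Hypothetical row read from a staging database."""
    participant_id: int
    question_code: str
    collected_on: date
    value: str


@dataclass
class OmopObservation:
    """Subset of fields from the OMOP CDM observation table."""
    person_id: int
    observation_concept_id: int
    observation_date: date
    value_as_number: Optional[float]
    value_as_string: Optional[str]
    observation_source_value: str


# Placeholder mapping: these IDs are invented for illustration and are
# not real vocabulary concept IDs.
CONCEPT_MAP = {"phq9_item1": 2000000001, "gad7_item1": 2000000002}
UNMAPPED_CONCEPT_ID = 0


def to_observation(row: StagingResponse) -> OmopObservation:
    """Map one staging response to an observation-like record."""
    try:
        number, text = float(row.value), None
    except ValueError:
        # Non-numeric answers are kept as strings.
        number, text = None, row.value
    return OmopObservation(
        person_id=row.participant_id,
        observation_concept_id=CONCEPT_MAP.get(row.question_code, UNMAPPED_CONCEPT_ID),
        observation_date=row.collected_on,
        value_as_number=number,
        value_as_string=text,
        observation_source_value=row.question_code,
    )
```

Keeping the original question code in `observation_source_value` preserves the link back to the staging record even when no standard concept has been assigned yet.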
The staging database serves as an innovative tool in managing longitudinal mental health data, going beyond simple data hosting to act as a comprehensive study descriptor. It provides detailed insights into each study stage and establishes a data science foundation for standardizing and integrating the data into OMOP CDM.