Mateussi Nadayca, Janjua Haroon, Grimsley Emily A, Kendall Melissa, Zander Tyler, Pietrobon Ricardo, Kuo Paul C
SporeData, Durham, USA.
Surgery, University of South Florida Morsani College of Medicine, Tampa, USA.
Cureus. 2024 Aug 13;16(8):e66763. doi: 10.7759/cureus.66763. eCollection 2024 Aug.
Big Data has revolutionized healthcare research through the three Vs: volume, veracity, and variety. This study introduces the OnetoMap meta-data repository, a centralized inventory developed in collaboration with the University of South Florida's Department of Surgery.
The repository offers extensive details about each database, including its primary purpose, available variables, and examples of high-impact research utilizing these databases. It aims to create a centralized inventory, enabling researchers to locate and link relevant datasets efficiently. Each dataset is described using standardized criteria, such as data type, source, collection methods, and potential linkages to other datasets, to ensure clarity and usability.
Currently, the OnetoMap repository contains descriptions of 49 datasets, with ongoing updates to include new datasets and additional data years. These datasets span a range of data types, including cross-sectional and longitudinal data, gathered through claims, registries, electronic health records, and surveys. The repository is hosted on GitHub, enabling version control, collaboration, and open access. Effective search functionalities and descriptive categorization enhance the findability of datasets.
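As a rough illustration of how standardized description criteria can support dataset search, the following Python sketch models one metadata entry and a simple filter. The `DatasetRecord` class, its field names, the `find_datasets` helper, and the catalog entries are all hypothetical and invented for this example; they do not reflect the actual OnetoMap schema or its contents.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical metadata entry; field names are illustrative only."""
    name: str
    purpose: str
    data_type: str           # e.g. "cross-sectional" or "longitudinal"
    source: str              # e.g. "claims", "registry", "ehr", "survey"
    collection_method: str
    linkable_to: List[str] = field(default_factory=list)


def find_datasets(records: List[DatasetRecord],
                  data_type: Optional[str] = None,
                  source: Optional[str] = None) -> List[DatasetRecord]:
    """Return the records that match every filter that was supplied."""
    matches = []
    for record in records:
        if data_type is not None and record.data_type != data_type:
            continue
        if source is not None and record.source != source:
            continue
        matches.append(record)
    return matches


# Placeholder entries, made up purely for illustration.
catalog = [
    DatasetRecord("Example Claims Dataset", "utilization and cost analysis",
                  "longitudinal", "claims", "administrative billing",
                  linkable_to=["Example Registry"]),
    DatasetRecord("Example Registry", "procedure outcomes tracking",
                  "longitudinal", "registry", "site-submitted case forms",
                  linkable_to=["Example Claims Dataset"]),
    DatasetRecord("Example Survey", "population health snapshot",
                  "cross-sectional", "survey", "annual questionnaire"),
]

longitudinal = find_datasets(catalog, data_type="longitudinal")
print([record.name for record in longitudinal])
```

Because every entry carries the same fields, a single filter function can serve any combination of criteria, and the `linkable_to` field gives a starting point for identifying candidate linkages between datasets.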
The data repository includes comprehensive records of patient health statuses, socioeconomic profiles, hospital structures, and physician practices, enabling nuanced interventions and addressing complex healthcare needs. It also promotes interdisciplinary research and accelerates novel discoveries by providing a centralized source of diverse data and facilitating collaboration among research teams.
The OnetoMap meta-data repository represents a significant advancement in healthcare research by providing a centralized, detailed, and easily accessible repository of clinical research databases. Future directions include implementing automatic annual updates of datasets, exploring automatic dataset linkage, providing monthly updates on published research, creating a user chat space for enhanced collaboration, and developing code applets for simplified data analysis. These efforts will ensure that the repository remains current, functional, and accessible, ultimately facilitating new discoveries and insights in healthcare outcomes research.