Ozaydin Bunyamin, Zengul Ferhat, Oner Nurettin, Feldman Sue S
University of Alabama at Birmingham, Birmingham, AL, United States.
J Med Internet Res. 2020 Jun 4;22(6):e18579. doi: 10.2196/18579.
Health services researchers spend a substantial amount of time integrating, cleansing, interpreting, and aggregating raw data from multiple public or private data sources. Often, each researcher (or someone on their team) duplicates this effort for their own project, facing the same challenges and pitfalls that others have already encountered.
This paper describes a design process for creating a data warehouse that includes the most frequently used databases in health services research.
The design is based on a conceptual iterative process model framework that utilizes the sociotechnical systems theory approach and includes the capacity for subsequent updates of the existing data sources and the addition of new ones. We introduce the theory and the framework and then explain how they are used to inform the methodology of this study.
We describe how the iterative process model was applied to the design research process of problem identification and solution design for the Healthcare Research and Analytics Data Infrastructure Solution (HRADIS). Each phase of the iterative model produced end products that inform the implementation of HRADIS. The analysis phase produced the problem statement and requirements documents. The projection phase produced a list of tasks and goals for the ideal system. Finally, the synthesis phase produced a plan for implementing HRADIS. HRADIS structures and integrates the data dictionaries provided by the data sources, allowing the creation of dimensions and measures for a multidimensional business intelligence system. We discuss how HRADIS is complemented by a set of data mining, analytics, and visualization tools that enable researchers to apply multiple methods to a given research project more efficiently. HRADIS also includes a built-in security and account management framework for data governance, so that authorization is customized by user role and by the parts of the data each role is permitted to access.
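The role-based authorization described above can be illustrated with a minimal sketch. It is not the HRADIS implementation, which the abstract does not specify; the role names, dataset names, and the helpers `authorized_datasets` and `can_access` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical mapping from roles to the parts of the data they may access.
# These role and dataset names are assumptions, not taken from HRADIS.
ROLE_PERMISSIONS = {
    "analyst": {"public_hospital_survey"},
    "steward": {"public_hospital_survey", "restricted_claims"},
}


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset


def authorized_datasets(user):
    """Return the union of datasets granted by each of the user's roles."""
    allowed = set()
    for role in user.roles:
        # Unknown roles grant nothing.
        allowed |= ROLE_PERMISSIONS.get(role, set())
    return allowed


def can_access(user, dataset):
    """Check whether any of the user's roles authorizes the dataset."""
    return dataset in authorized_datasets(user)
```

In this sketch a user holding only the `analyst` role could read `public_hospital_survey` but not `restricted_claims`, while a user with an unrecognized role would be denied everything by default.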
To address existing inefficiencies in the obtaining, extracting, preprocessing, cleansing, and filtering stages of data processing in health services research, we envision HRADIS as a full-service data warehouse that integrates frequently used data sources, processes, and methods with a variety of data analytics and visualization tools. This paper presents the application of the iterative process model to build such a solution. It also discusses several prominent issues, lessons learned, reflections and recommendations, and future considerations that arose as the model was applied.