Wang Liwei, Wen Andrew, Fu Sunyang, Ruan Xiaoyang, Huang Ming, Li Rui, Lu Qiuhao, Williams Andrew E, Liu Hongfang
McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, TX, USA.
Clinical and Translational Science Institute, Tufts Medical Center, Boston, MA, USA.
medRxiv. 2024 Aug 23:2024.08.23.24311950. doi: 10.1101/2024.08.23.24311950.
The Observational Medical Outcomes Partnership (OMOP) common data model (CDM), developed and maintained by the Observational Health Data Sciences and Informatics (OHDSI) community, supports large-scale cancer research by enabling distributed network analysis. As the number of studies using the OMOP CDM for cancer research increases, there is a growing need for an overview of the scope of cancer research that relies on the OMOP CDM ecosystem.
In this study, we present a comprehensive review of the adoption of the OMOP CDM for cancer research and offer insights into opportunities for leveraging the OMOP CDM ecosystem to advance cancer research.
Published literature databases were searched to retrieve OMOP CDM and cancer-related English language articles published between January 2010 and December 2023. A charting form was developed for two main themes, i.e., clinically focused data analysis studies and infrastructure development studies in the cancer domain.
In total, 50 unique articles were included: 30 under the data analysis theme and 23 under the infrastructure theme, with 3 articles belonging to both themes. The topics covered by the existing body of research were depicted.
By depicting the status quo of research efforts to improve or leverage the potential of the OMOP CDM ecosystem for advancing cancer research, we identify challenges and opportunities surrounding data analysis and infrastructure, including data quality, adoption of advanced analytics methodology, inclusion of in-depth phenotypic data through natural language processing (NLP), and multisite evaluation.