Voss Erica A, Makadia Rupa, Matcho Amy, Ma Qianli, Knoll Chris, Schuemie Martijn, DeFalco Frank J, Londhe Ajit, Zhu Vivienne, Ryan Patrick B
Epidemiology Analytics, Janssen Research & Development, Titusville, New Jersey, USA
J Am Med Inform Assoc. 2015 May;22(3):553-64. doi: 10.1093/jamia/ocu023. Epub 2015 Feb 10.
To evaluate the utility of applying the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) across multiple observational databases within an organization and to apply standardized analytics tools for conducting observational research.
Six deidentified patient-level datasets were transformed to the OMOP CDM. We evaluated the extent of information loss that occurred through the standardization process. We developed a standardized analytic tool to replicate the cohort construction process from a published epidemiology protocol and applied the analysis to all 6 databases to assess time-to-execution and comparability of results.
Transformation to the CDM resulted in minimal information loss across all 6 databases. Patients and observations were excluded because of identified data quality issues in the source system. Between 96% and 99% of condition records and between 90% and 99% of drug records were successfully mapped into the CDM using the standard vocabulary. The full cohort replication and descriptive baseline summary were executed for 2 cohorts in 6 databases in less than 1 hour.
The standardization process improved data quality, increased efficiency, and facilitated cross-database comparisons to support a more systematic approach to observational research. Comparisons across data sources using the protocol showed consistency in the impact of inclusion criteria and identified differences in patient characteristics and coding practices across databases.
Standardizing data structure (through a CDM), content (through a standard vocabulary with source code mappings), and analytics can enable an institution to apply a network-based approach to observational research across multiple, disparate observational health databases.
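As an illustration of how a mapping rate such as the 96% to 99% reported for condition records might be computed, the following is a minimal sketch and not the authors' tool. It assumes the common OMOP convention that a record whose source code has no standard-vocabulary match carries a concept ID of 0. The function name `mapping_rate` and the sample concept IDs are hypothetical.

```python
from typing import Iterable


def mapping_rate(concept_ids: Iterable[int]) -> float:
    """Return the percentage of records mapped to a standard concept.

    Assumption: a concept ID of 0 marks an unmapped record.
    """
    ids = list(concept_ids)
    if not ids:
        raise ValueError("no records supplied")
    mapped = sum(1 for cid in ids if cid != 0)
    return 100.0 * mapped / len(ids)


# Hypothetical concept IDs for ten condition records; two are unmapped (0).
sample = [201826, 0, 320128, 316866, 0, 201826, 4329847, 255573, 317009, 313217]
rate = mapping_rate(sample)
print(f"{rate:.1f}% of condition records mapped")
```

Running the same calculation per database and per domain (conditions, drugs) would give the kind of range quoted in the results, under the assumptions stated above.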