Making work visible for electronic phenotype implementation: Lessons learned from the eMERGE network.
Author information
Department of Biomedical Informatics, Columbia University, New York, NY, United States.
Northwestern University Feinberg School of Medicine, Chicago, IL, United States.
Publication information
J Biomed Inform. 2019 Nov;99:103293. doi: 10.1016/j.jbi.2019.103293. Epub 2019 Sep 19.
BACKGROUND
Implementing a phenotype algorithm requires phenotype engineers to interpret a human-readable description (text and flowcharts) and translate it into a computable phenotype, a process that can be labor intensive and error prone. Reducing this implementation effort is a critical need, and meeting it requires algorithms that are portable across sites.
METHODS
We conducted a retrospective analysis of phenotype algorithms developed in the Electronic Medical Records and Genomics (eMERGE) network and identified the common customization tasks required for implementation. A novel scoring system was developed to quantify portability from three aspects: Knowledge conversion, clause Interpretation, and Programming (KIP). Tasks were grouped into twenty representative categories. For each category, experienced phenotype engineers were asked to estimate the average time spent and to evaluate the time savings enabled by a common data model (CDM), specifically the Observational Medical Outcomes Partnership (OMOP) model.
RESULTS
A total of 485 distinct clauses (phenotype criteria) were identified from 55 phenotype algorithms, corresponding to 1153 customization tasks. Of these, 25 tasks are not phenotype-specific, 46 are related to interpretation, 613 to knowledge conversion, and 469 to programming. Each aspect is assigned a score between 0 and 2 (0 for easy, 1 for moderate, and 2 for difficult portability), yielding a total KIP score between 0 and 6. The average clause-wise KIP score is 1.37 ± 1.38. Specifically, the average knowledge (K) score is 0.64 ± 0.66, the average interpretation (I) score is 0.33 ± 0.55, and the average programming (P) score is 0.40 ± 0.64. By median time estimate, 5% of the categories can be completed within one hour, while 70% of the categories take from days to months to complete. The OMOP model can assist with vocabulary mapping tasks.
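The scoring arithmetic described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `ClauseScore` class, the `mean_kip` helper, and the three example clauses are hypothetical and chosen only to show how three per-aspect scores of 0 to 2 sum to a clause-wise KIP score of 0 to 6. The task counts in the final check are the ones reported in this abstract.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ClauseScore:
    """Hypothetical container for the three per-aspect scores of one clause."""
    knowledge: int       # K: 0 = easy, 1 = moderate, 2 = difficult
    interpretation: int  # I: same scale
    programming: int     # P: same scale

    def __post_init__(self):
        # Each aspect must be on the 0-2 scale described in the abstract.
        for name in ("knowledge", "interpretation", "programming"):
            value = getattr(self, name)
            if value not in (0, 1, 2):
                raise ValueError(f"{name} score must be 0, 1, or 2, got {value}")

    @property
    def kip(self) -> int:
        """Total KIP score for the clause, between 0 and 6."""
        return self.knowledge + self.interpretation + self.programming


def mean_kip(clauses):
    """Average clause-wise KIP score over a collection of clauses."""
    return mean(clause.kip for clause in clauses)


# Made-up example clauses, not data from the study.
clauses = [
    ClauseScore(knowledge=0, interpretation=0, programming=0),
    ClauseScore(knowledge=1, interpretation=0, programming=1),
    ClauseScore(knowledge=2, interpretation=1, programming=1),
]
totals = [clause.kip for clause in clauses]
average = mean_kip(clauses)

# Consistency check on the task counts reported in the abstract:
# non-phenotype-specific + interpretation + knowledge conversion + programming.
reported_task_total = 25 + 46 + 613 + 469
```

Because the total is a plain sum, the mean clause-wise KIP score equals the sum of the mean K, I, and P scores, which matches the reported figures (0.64 + 0.33 + 0.40 = 1.37), and the four task counts add up to the reported 1153.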
CONCLUSION
This study presents firsthand knowledge of the substantial implementation effort involved in phenotyping and introduces a novel metric (KIP) that measures the portability of phenotype algorithms, quantifying that effort across the eMERGE Network. Phenotype developers are encouraged to analyze and optimize portability with regard to knowledge, interpretation, and programming. CDMs can be used to improve portability for some 'knowledge-oriented' tasks.