Williams Richard, Kontopantelis Evangelos, Buchan Iain, Peek Niels
MRC Health eResearch Centre, University of Manchester, Manchester, UK; NIHR Greater Manchester Primary Care Patient Safety Translational Research Centre, University of Manchester, Manchester, UK.
MRC Health eResearch Centre, University of Manchester, Manchester, UK; NIHR School for Primary Care Research, University of Manchester, Manchester, UK.
J Biomed Inform. 2017 Jun;70:1-13. doi: 10.1016/j.jbi.2017.04.010. Epub 2017 Apr 22.
The construction of reliable, reusable clinical code sets is essential when re-using Electronic Health Record (EHR) data for research. Yet code set definitions are rarely transparent and their sharing is almost non-existent. Methodological standards for the management (construction, sharing, revision and reuse) of clinical code sets are lacking, and this gap must be addressed to ensure the reliability and credibility of studies that use code sets.
Our objective was to review the methodological literature on the management of sets of clinical codes used in research on clinical databases, and to provide a list of best practice recommendations for future studies and software tools.
We performed an exhaustive search for methodological papers about clinical code set engineering for re-using EHR data in research. This was supplemented with papers identified by snowball sampling. In addition, a list of e-phenotyping systems was constructed by merging references from several systematic reviews on this topic, and the processes adopted by those systems for code set management were reviewed.
Thirty methodological papers were reviewed. Common approaches included: creating an initial list of synonyms for the condition of interest (n=20); making use of the hierarchical nature of coding terminologies during searching (n=23); reviewing sets with clinician input (n=20); and reusing and updating an existing code set (n=20). Several open-source software tools (n=3) were discovered.
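The first three approaches listed above (synonym search, use of the terminology hierarchy, and clinician review) can be illustrated with a minimal sketch. It is not the method of any reviewed paper or tool. The terminology, its codes, and the function names `search_by_synonyms`, `descendants` and `build_code_set` are all invented for illustration, and clinician review is reduced to a set of excluded codes.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Toy terminology: code -> (description, parent code).
# Every code here is invented; a real terminology would be loaded from a release file.
TERMINOLOGY: Dict[str, Tuple[str, Optional[str]]] = {
    "A": ("Endocrine disorder", None),
    "A1": ("Diabetes mellitus", "A"),
    "A11": ("Type 1 diabetes mellitus", "A1"),
    "A12": ("Type 2 diabetes mellitus", "A1"),
    "A121": ("Insulin-treated T2DM", "A12"),
    "A2": ("Thyroid disorder", "A"),
    "B1": ("Diabetes insipidus", None),
}


def search_by_synonyms(synonyms: Iterable[str]) -> Set[str]:
    """Return codes whose description contains any synonym (case-insensitive)."""
    lowered = [s.lower() for s in synonyms]
    return {
        code
        for code, (description, _) in TERMINOLOGY.items()
        if any(s in description.lower() for s in lowered)
    }


def descendants(code: str) -> Set[str]:
    """Return every code below `code` in the hierarchy (children, grandchildren, ...)."""
    found: Set[str] = set()
    frontier = [code]
    while frontier:
        current = frontier.pop()
        for child, (_, parent) in TERMINOLOGY.items():
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def build_code_set(synonyms: Iterable[str], exclusions: Iterable[str] = ()) -> List[str]:
    """Synonym search, then hierarchical expansion, then removal of reviewed-out codes."""
    matched = search_by_synonyms(synonyms)
    expanded = set(matched)
    for code in matched:
        expanded |= descendants(code)
    return sorted(expanded - set(exclusions))


# "B1" stands in for a code a clinician rejects on review (diabetes insipidus).
code_set = build_code_set(["diabetes"], exclusions={"B1"})
print(code_set)
```

In this toy example, the code "A121" has a description that matches no synonym and is reached only through the hierarchy, while "B1" matches the synonym but is removed at the review step. This shows why the approaches are usually combined rather than used alone.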
There is a need for software tools that enable users to easily and quickly create, revise, extend, review and share code sets, and we provide a list of recommendations for their design and implementation.
Research re-using EHR data could be improved through the further development, more widespread use and routine reporting of the methods by which clinical codes were selected.