College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, Zhejiang Province, China.
Key Laboratory for Biomedical Engineering, Ministry of Education, Zhejiang University, Hangzhou, Zhejiang Province, China.
Stud Health Technol Inform. 2022 Jun 6;290:106-110. doi: 10.3233/SHTI220041.
Clinical data often have limited usefulness because the same concept is expressed in many different ways. Standardizing Chinese clinical data can improve their usability. Because cleaning and coding Chinese clinical data is complex, inefficient manual coding is giving way to computer-aided tools. This study established a general-purpose data cleaning and coding process, with a supporting tool, for Chinese clinical data standardization, which can greatly improve the efficiency of human coders. The process consists of preprocessing, a text similarity algorithm, and manual review. It proved effective for standardizing diagnosis, drug, and examination data, and it can be extended gradually to other clinical domains. The semi-automatic data cleaning and coding can cut the time needed for standardization by half, and it has been used in hospitals in Beijing.
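The three stages named in the abstract (preprocessing, text similarity, manual review) can be illustrated with a minimal sketch. This is not the authors' tool: the abstract does not say which similarity algorithm, vocabulary, or threshold was used, so the `difflib` ratio, the tiny `STANDARD_TERMS` table, its codes, and the 0.8 cutoff below are all assumptions chosen only for illustration.

```python
import difflib
import re
import unicodedata

# Hypothetical standard vocabulary (code -> standard term); illustrative only.
STANDARD_TERMS = {
    "D001": "2型糖尿病",
    "D002": "高血压",
    "D003": "冠状动脉粥样硬化性心脏病",
}


def preprocess(text):
    """Normalize full-width characters and case, then drop spaces and punctuation."""
    text = unicodedata.normalize("NFKC", text).lower()
    return re.sub(r"[\s\W_]+", "", text)


def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for the paper's algorithm."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def suggest_code(raw, threshold=0.8):
    """Return (best_code, best_score, needs_review) for one raw clinical term.

    Terms scoring below the (assumed) threshold are flagged for manual review.
    """
    cleaned = preprocess(raw)
    best_code, best_score = None, 0.0
    for code, term in STANDARD_TERMS.items():
        score = similarity(cleaned, preprocess(term))
        if score > best_score:
            best_code, best_score = code, score
    return best_code, best_score, best_score < threshold


if __name__ == "__main__":
    for raw in ["２型 糖尿病。", "高血压病", "感冒"]:
        print(raw, suggest_code(raw))
```

In a sketch like this, exact or near-exact matches are coded automatically and only low-scoring terms reach a human reviewer, which is one plausible way a semi-automatic workflow could save coding time.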