Dai Hong-Jie, Chen Chien-Chang, Mir Tatheer Hussain, Wang Ting-Yu, Wang Chen-Kai, Chang Ya-Chen, Yu Shu-Jung, Shen Yi-Wen, Huang Cheng-Jiun, Tsai Chia-Hsuan, Wang Ching-Yun, Chen Hsiao-Jou, Weng Pei-Shan, Lin You-Xiang, Chen Sheng-Wei, Tsai Ming-Ju, Juang Shian-Fei, Wu Su-Ying, Tsai Wen-Tsung, Huang Ming-Yii, Huang Chih-Jen, Yang Chih-Jen, Liu Ping-Zun, Huang Chiao-Wen, Huang Chi-Yen, Wang William Yu Chung, Chong Inn-Wen, Yang Yi-Hsin
Intelligent System Laboratory, Department of Electrical Engineering, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung 80778, Taiwan.
National Institute of Cancer Research, National Health Research Institutes, Tainan 70456, Taiwan.
Comput Struct Biotechnol J. 2024 Apr 7;24:322-333. doi: 10.1016/j.csbj.2024.04.007. eCollection 2024 Dec.
Data curation for a hospital-based cancer registry relies heavily on labor-intensive manual abstraction, in which cancer registrars identify cancer-related information in free-text electronic health records. To streamline this process, we developed a natural language processing system that combines deep learning-based and rule-based approaches for identifying lung cancer registry-related concepts with a symbolic expert system that generates registry codes from weighted rules. The system is integrated with the hospital information system at a medical center to provide cancer registrars with a patient journey visualization platform. The embedded system offers a comprehensive view of patient reports annotated with significant registry concepts, which facilitates manual coding and improves its overall quality. Extensive evaluations, including comparisons with state-of-the-art methods, were conducted on a lung cancer dataset of 1428 patients from the medical center. The experimental results demonstrate the effectiveness of the developed system, which consistently achieved F1-scores between 0.85 and 1.00 across 30 coding items. Registrar feedback highlights the system's reliability as a tool for assisting and auditing abstraction. By presenting key registry items along the timeline of a patient's reports together with accurate code predictions, the system improves the quality of registrar outcomes and reduces the labor and time required for data abstraction. Our study highlights advancements in cancer registry coding practices, demonstrating that the proposed hybrid weighted neural-symbolic cancer registry system is reliable and efficient in assisting cancer registrars with the coding workflow and in contributing to clinical outcomes.
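To make the idea of generating a registry code from weighted rules more concrete, the following is a minimal sketch and not the system described in the abstract. It assumes that an upstream concept recognizer has already produced a set of concept labels for a patient, and that each rule simply adds its weight to one candidate code when its concept is present. The `Rule` class, the `predict_code` function, and all concept labels, codes, and weights are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A hypothetical weighted rule: if `concept` is found, vote for `code`."""
    concept: str   # concept label produced by an upstream NLP step (assumed)
    code: str      # candidate registry code that the rule supports
    weight: float  # how strongly the rule counts toward that code


def predict_code(concepts, rules, default="unknown"):
    """Return the code with the highest summed weight among fired rules."""
    scores = defaultdict(float)
    for rule in rules:
        if rule.concept in concepts:
            scores[rule.code] += rule.weight
    if not scores:
        return default  # no rule fired for this patient
    # Iterate over sorted keys so that ties resolve alphabetically.
    return max(sorted(scores), key=lambda code: scores[code])


# Illustrative rules and labels only; none of them come from the paper.
RULES = [
    Rule("biopsy_report", "histology_confirmed", 0.9),
    Rule("cytology_report", "cytology_confirmed", 0.6),
    Rule("imaging_report", "imaging_only", 0.3),
]

if __name__ == "__main__":
    print(predict_code({"imaging_report", "cytology_report"}, RULES))
```

In this sketch, several weak rules can jointly outweigh a single stronger one because their weights are summed per code. How the actual system assigns and combines its weights is not stated in the abstract.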