Lee Seokmin, Im Gyeongmin
Statistics Research Institute, Statistics Korea, Daejeon, Korea.
Ewha Med J. 2025 Jul;48(3):e45. doi: 10.12771/emj.2025.00675. Epub 2025 Jul 28.
This study evaluated the feasibility and performance of a deep learning approach utilizing the Korean Medical BERT (KM-BERT) model for the automated classification of underlying causes of death within national mortality statistics. It aimed to assess predictive accuracy throughout the cause-of-death coding workflow and to identify limitations and opportunities for further artificial intelligence (AI) integration.
We performed a retrospective prediction study using 693,587 death certificates issued in Korea between January 2021 and December 2022. Free-text fields for immediate, antecedent, and contributory causes were concatenated into a single input, and KM-BERT was fine-tuned on that text. Three classification models were developed: (1) prediction of the final underlying cause (International Classification of Diseases, 10th Revision [ICD-10] code) from certificate inputs, (2) selection of the tentative underlying cause based on ICD-10 Volume 2 rules, and (3) classification of individual cause-of-death entries. Models were trained and validated on 2021 data (80% training, 20% validation) and evaluated on 2022 data. Performance metrics were overall accuracy, weighted F1 score, and macro F1 score.
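The input construction and the 80/20 split described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the `" [SEP] "` separator, the random shuffling, and the seed are all assumptions, since the abstract does not state how the fields were joined or how the 2021 data were partitioned.

```python
import random


def build_input(immediate, antecedent, contributory, sep=" [SEP] "):
    """Join the free-text cause fields into one string, skipping empty ones.

    The separator is a hypothetical choice; the source only says the
    fields were concatenated.
    """
    fields = (immediate, antecedent, contributory)
    parts = [f.strip() for f in fields if f and f.strip()]
    return sep.join(parts)


def split_train_valid(records, train_frac=0.8, seed=0):
    """Shuffle a copy of the records and split it into train/validation.

    A seeded random split is an assumption; the source gives only the
    80%/20% proportions.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


# Invented example certificate, for illustration only.
example_text = build_input("pneumonia", "", "diabetes mellitus")
train_set, valid_set = split_train_valid(range(10))
```

The resulting strings would then be tokenized and passed to the fine-tuning step, which is omitted here because the abstract gives no training details.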
On 306,898 certificates from 2022, the final cause model achieved 62.65% accuracy (F1-weighted, 0.5940; F1-macro, 0.1503). The tentative cause model demonstrated 95.35% accuracy (F1-weighted, 0.9516; F1-macro, 0.4996). The individual entry model yielded 79.51% accuracy (F1-weighted, 0.7741; F1-macro, 0.9250). Error analysis indicated reduced reliability for rare diseases and for specific ICD chapters whose coding requires supplementary administrative data.
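The gap between weighted and macro F1 reported for the final cause model (0.5940 versus 0.1503) is what one would expect when a few frequent codes are predicted well and many rare codes are not. The sketch below computes the three reported metrics from scratch on an invented toy set; the labels and predictions are illustrative assumptions, not data from the study, and the helper names are hypothetical.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return the F1 score for every label seen in either list."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every code counts equally."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's true frequency."""
    scores = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


# Toy, invented data: one common code and two rare codes that the
# hypothetical model never predicts.
y_true = ["I21"] * 8 + ["C34", "E85"]
y_pred = ["I21"] * 10

acc = accuracy(y_true, y_pred)
f1_macro = macro_f1(y_true, y_pred)
f1_weighted = weighted_f1(y_true, y_pred)
```

In this toy case accuracy is 0.80 and weighted F1 is about 0.71, while macro F1 falls to about 0.30 because the two missed rare codes each contribute a score of zero with the same weight as the common code.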
Despite strong performance in mapping free-text inputs and selecting tentative underlying causes, there remains a need for improved data quality, administrative record integration, and model refinement. A systematic, long-term approach is essential for the broad adoption of AI in mortality statistics.