Department of Computer Science, Advanced Studies Center in ICT (CEATIC), Universidad de Jaén, Campus Las Lagunillas, Jaén, 23071, Spain.
Natural Language Processing Unit, HT Médica, Carmelo Torres, nº 2, Jaén, 23007, Spain.
Med Biol Eng Comput. 2024 Nov;62(11):3373-3383. doi: 10.1007/s11517-024-03131-x. Epub 2024 Jun 7.
This paper presents the implementation of two automated text classification systems for prostate cancer findings based on the PI-RADS criteria: a traditional machine learning model using XGBoost and a language model-based approach using RoBERTa. The study focused on Spanish-language radiological MRI prostate reports, a setting that had not been explored before. The results show that the RoBERTa model outperforms the XGBoost model, although both achieve promising results. Furthermore, the best-performing system was integrated into the radiological company's information systems as an API, operating in a real-world environment.