Yao Jason, Alabousi Abdullah, Mironov Oleg
Department of Radiology, McMaster University, Hamilton, ON, Canada.
St Joseph's Healthcare Hamilton, Hamilton, ON, Canada.
Can Assoc Radiol J. 2025 May;76(2):265-272. doi: 10.1177/08465371241255895. Epub 2024 Jun 4.
This study evaluated the accuracy of a Bidirectional Encoder Representations from Transformers (BERT) Natural Language Processing (NLP) model for automating triage and protocol selection of cross-sectional imaging requisitions. A retrospective study was completed using 222 392 CT and MRI studies from a single Canadian university hospital database (January 2018-September 2022). Three hundred unique protocols (116 CT and 184 MRI) were included. A BERT model was trained, validated, and tested using an 80%-10%-10% stratified split. Naive Bayes (NB) and Support Vector Machine (SVM) machine learning models were used as comparators. Models were assessed using F1 score, precision, recall, and area under the receiver operating characteristic curve (AUROC). The BERT model was also assessed for multiclass (top-k) protocol suggestion and in subgroups based on referral location, modality, and imaging section. BERT was superior to SVM for protocol selection (F1 score: BERT 0.901 vs SVM 0.881). However, BERT was not significantly different from SVM for triage prediction (F1 score: BERT 0.844 vs SVM 0.845). Both models outperformed NB for protocol and triage. BERT had superior performance on minority classes compared to SVM and NB. For multiclass prediction, BERT accuracy was up to 0.991 for top-5 protocol suggestion and 0.981 for top-2 triage suggestion. Emergency department patients had the highest F1 scores for both protocol (0.957) and triage (0.986), compared to inpatients and outpatients. The BERT NLP model demonstrated strong performance in automating the triage and protocol selection of radiology studies, showing potential to enhance radiologist workflows. These findings suggest the feasibility of using advanced NLP models to streamline radiology operations.
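The top-5 protocol and top-2 triage figures are top-k accuracies: a prediction counts as correct when the true class is among the k highest-scoring suggestions. The following is a minimal sketch of that metric only, not the authors' implementation. The function name `top_k_accuracy`, the score layout (one row of per-class scores per requisition), and the toy numbers are assumptions made for illustration.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int,
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores[i][c] is the (hypothetical) model score for class c on sample i;
    labels[i] is the index of the true class for sample i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    if k < 1:
        raise ValueError("k must be at least 1")

    hits = 0
    for row, true_label in zip(scores, labels):
        # Rank class indices from highest to lowest score, then keep the first k.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy data (illustrative only): 4 requisitions, 3 candidate classes.
scores = [
    [0.70, 0.20, 0.10],  # true class 0: correct at k=1
    [0.40, 0.35, 0.25],  # true class 1: correct only from k=2
    [0.50, 0.30, 0.20],  # true class 2: correct only at k=3
    [0.10, 0.10, 0.80],  # true class 2: correct at k=1
]
labels = [0, 1, 2, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
top3 = top_k_accuracy(scores, labels, k=3)
```

Top-k accuracy can only stay the same or rise as k grows, which is why a suggestion list of five protocols can reach a higher accuracy than a single best prediction.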