Imaging Institute, Cleveland Clinic Foundation, 9500 Euclid Ave., P34, Cleveland, OH, 44195, USA.
J Digit Imaging. 2022 Oct;35(5):1120-1130. doi: 10.1007/s10278-022-00633-8. Epub 2022 Jun 2.
Correct protocol assignment is critical to high-quality imaging examinations, and its automation is amenable to natural language processing (NLP). Assigning protocols for abdominal CT scans is particularly challenging given the multiple organ-specific indications and parameters. We compared conventional machine learning, deep learning, and automated machine learning builder workflows for this multiclass text classification task. A total of 94,501 CT studies performed over 4 years, together with their assigned protocols, were obtained. Text data associated with each study, including the free-text study indication written by the ordering provider and the ICD codes, were used for NLP analysis and protocol class prediction. Each study was classified into one of 11 abdominal CT protocol classes, both before and after data augmentation used to offset imbalances in class sample sizes. Four machine learning (ML) algorithms, one deep learning algorithm, and one automated machine learning (AutoML) builder were used for the multiclass classification task: Random Forest (RF), Tree Ensemble (TE), Gradient Boosted Tree (GBT), and multi-layer perceptron (MLP); Universal Language Model Fine-tuning (ULMFiT); and Google's AutoML builder (Alphabet, Inc., Mountain View, CA). On the unbalanced dataset, the manually coded algorithms performed similarly, with F1 scores of 0.811 for RF, 0.813 for TE, 0.813 for GBT, 0.828 for MLP, and 0.847 for ULMFiT. The AutoML builder performed better, with an F1 score of 0.854. On the balanced dataset, the tree ensemble algorithm performed best, with an F1 score of 0.803 and a Cohen's kappa of 0.612. The AutoML builder took longer to complete NLP model training and evaluation: 4 h 45 min, compared with an average of 51 min for the manual methods. Machine learning and natural language processing can be used for the complex multiclass classification task of abdominal CT protocol assignment.
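The two metrics reported above, F1 score and Cohen's kappa, can be illustrated with a minimal Python sketch. The abstract does not state how F1 was averaged across the 11 classes, so both macro and support-weighted averages are shown as assumptions. The three class names and the ten toy labels are hypothetical and are not taken from the study data.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each class label to its one-vs-rest F1 score."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true samples per class."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement between two labelings, corrected for chance agreement."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement: product of the marginal class frequencies, summed.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelings use a single identical class
    return (observed - expected) / (1.0 - expected)


# Hypothetical, deliberately imbalanced toy labels (not study data).
y_true = ["routine"] * 6 + ["renal_stone"] * 2 + ["liver"] * 2
y_pred = ["routine"] * 5 + ["liver", "renal_stone", "routine", "liver", "liver"]

if __name__ == "__main__":
    print("macro F1:   ", round(macro_f1(y_true, y_pred), 3))
    print("weighted F1:", round(weighted_f1(y_true, y_pred), 3))
    print("kappa:      ", round(cohen_kappa(y_true, y_pred), 3))
```

On imbalanced data the weighted average is pulled toward the majority class, while the macro average and Cohen's kappa are more sensitive to errors on rare classes. That is one plausible reason to report kappa alongside F1 when class sizes differ widely.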