Department of Radiology, Perelman School of Medicine, Hospital of the University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA, 19104, USA.
Musculoskeletal Imaging Division, Department of Radiology, Hospital of the University of Pennsylvania, 3400 Spruce St., 1 Silverstein, Philadelphia, PA, 19104, USA.
J Digit Imaging. 2018 Apr;31(2):178-184. doi: 10.1007/s10278-017-0027-x.
A significant volume of medical data remains unstructured. Natural language processing (NLP) and machine learning (ML) techniques have been shown to successfully extract insights from radiology reports. However, the codependent effects of NLP and ML in this context have not been well studied. Between April 1, 2015 and November 1, 2016, 9418 cross-sectional abdomen/pelvis CT and MR examinations containing our internal structured reporting element for cancer were separated into four categories: Progression, Stable Disease, Improvement, or No Cancer. We combined each of three NLP techniques with each of five ML algorithms to predict the assigned label from the unstructured report text, and compared the performance of each combination. The three NLP techniques were term frequency-inverse document frequency (TF-IDF), term frequency weighting (TF), and 16-bit feature hashing. The five ML algorithms were logistic regression (LR), random decision forest (RDF), one-vs-all support vector machine (SVM), one-vs-all Bayes point machine (BPM), and fully connected neural network (NN). The best-performing NLP model consisted of tokenized unigrams and bigrams with TF-IDF. Increasing N-gram length yielded little to no added benefit for most ML algorithms. With all parameters optimized, SVM had the best performance on the test dataset, with an average accuracy of 90.6% and an F score of 0.813. The interplay between ML and NLP algorithms and their effect on interpretation accuracy is complex. The best accuracy is achieved when both algorithms are optimized concurrently.
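The three text representations named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the study's implementation: the whitespace tokenizer, the plain log(N/df) form of the inverse document frequency, the use of CRC32 as the hash function, and the example report snippets are all assumptions made for illustration.

```python
import math
import zlib
from collections import Counter


def ngrams(text, max_n=2):
    """Lowercase, split on whitespace, and return all 1..max_n-grams."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tf_vectors(docs, max_n=2):
    """Term frequency (TF): raw n-gram counts per document."""
    return [Counter(ngrams(d, max_n)) for d in docs]


def tfidf_vectors(docs, max_n=2):
    """TF-IDF using idf = log(N / df); the exact variant is an assumption."""
    tfs = tf_vectors(docs, max_n)
    # Document frequency: number of documents containing each n-gram.
    df = Counter(term for tf in tfs for term in tf)
    n_docs = len(docs)
    return [
        {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        for tf in tfs
    ]


def hashed_vectors(docs, max_n=2, bits=16):
    """Feature hashing: count n-grams in one of 2**bits buckets (CRC32 assumed)."""
    mask = (1 << bits) - 1
    vectors = []
    for d in docs:
        vec = Counter()
        for term in ngrams(d, max_n):
            vec[zlib.crc32(term.encode("utf-8")) & mask] += 1
        vectors.append(vec)
    return vectors


# Hypothetical report snippets, invented for illustration only.
reports = [
    "stable disease no new lesions",
    "new hepatic lesions consistent with progression",
    "no evidence of malignancy",
]

tf = tf_vectors(reports)
tfidf = tfidf_vectors(reports)
hashed = hashed_vectors(reports)
```

Each function returns one sparse vector per report. Any of the five classifiers listed above could in principle be trained on these vectors, which is what makes the NLP and ML choices separable and therefore comparable in combination. With 16 bits the hashed representation is capped at 65,536 features regardless of vocabulary size, at the cost of occasional collisions between unrelated n-grams.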