1 Zhongda Hospital, Medical College, Southeast University, Nanjing; and
2 Nanjing Jiangbei Hospital, Nanjing, Jiangsu, China.
Neurosurg Focus. 2022 Apr;52(4):E7. doi: 10.3171/2022.1.FOCUS21561.
The purpose of this study was to develop natural language processing (NLP)-based machine learning algorithms to automatically differentiate lumbar disc herniation (LDH) and lumbar spinal stenosis (LSS) based on positive symptoms in free-text admission notes. The secondary purpose was to compare the performance of the deep learning algorithm with the ensemble model on the current task.
In total, 1921 patients whose principal diagnosis was LDH or LSS between June 2013 and June 2020 at Zhongda Hospital, affiliated with Southeast University, were retrospectively analyzed. The data set was randomly divided into a training set and testing set at a 7:3 ratio. Long Short-Term Memory (LSTM) and extreme gradient boosting (XGBoost) models were developed in this study. NLP algorithms were assessed on the testing set by the following metrics: receiver operating characteristic (ROC) curve, area under the curve (AUC), accuracy score, recall score, F1 score, and precision score.
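The evaluation protocol described above (a random 7:3 split followed by threshold-based and ranking-based metrics) can be illustrated with a minimal, library-free sketch. The function names `split_7_3` and `binary_metrics`, the fixed seed, the 0.5 decision threshold, and the convention that label 1 marks one diagnosis and 0 the other are all assumptions made for illustration; the abstract does not state how the authors implemented the split or which class was treated as positive.

```python
import random


def split_7_3(records, seed=0):
    """Shuffle records and split them 70% / 30% (hypothetical helper)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true, y_score, threshold=0.5):
    """Compute accuracy, precision, recall, F1, and AUC for binary labels.

    y_true holds 0/1 labels; y_score holds predicted probabilities for
    class 1. The 0.5 threshold is an assumption, not taken from the study.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # AUC as the probability that a random positive case receives a higher
    # score than a random negative case (ties count as one half).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg)) if pos and neg else float("nan")

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "auc": auc}


if __name__ == "__main__":
    train, test = split_7_3(range(1921))
    print(len(train), len(test))
    print(binary_metrics([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

The pairwise AUC computation is quadratic in the number of cases, which is acceptable for a testing set of a few hundred notes; a rank-based formulation or an established library routine would be the usual choice at larger scale.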
In the testing set, the LSTM model achieved an AUC of 0.8487, accuracy score of 0.7818, recall score of 0.9045, F1 score of 0.8108, and precision score of 0.7347. In comparison, the XGBoost model achieved an AUC of 0.7565, accuracy score of 0.6961, recall score of 0.7387, F1 score of 0.7153, and precision score of 0.6934.
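Because the F1 score is the harmonic mean of precision and recall, the reported values can be cross-checked against each other. The short sketch below does that arithmetic for both models; the helper name `f1_from` is hypothetical, and the inputs are the precision and recall figures quoted above.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the testing set.
reported = {
    "LSTM": {"precision": 0.7347, "recall": 0.9045, "f1": 0.8108},
    "XGBoost": {"precision": 0.6934, "recall": 0.7387, "f1": 0.7153},
}

for name, r in reported.items():
    recomputed = f1_from(r["precision"], r["recall"])
    print(f"{name}: recomputed F1 = {recomputed:.4f}, reported F1 = {r['f1']:.4f}")
```

For both models the recomputed value agrees with the reported F1 score to four decimal places, so the three figures are internally consistent.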
NLP-based machine learning algorithms are a promising adjunct to the electronic health record in spine disease diagnosis. LSTM, the deep learning model, outperformed the widely used ensemble model XGBoost in differentiating LDH from LSS on the basis of positive symptoms. This study presents a proof of concept for the application of NLP in the prediagnosis of spine disease.