Miotto Riccardo, Percha Bethany L, Glicksberg Benjamin S, Lee Hao-Chih, Cruz Lisanne, Dudley Joel T, Nabeel Ismail
Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, United States.
Institute for Next Generation Healthcare, Icahn School of Medicine at Mount Sinai, New York, NY, United States.
JMIR Med Inform. 2020 Feb 27;8(2):e16878. doi: 10.2196/16878.
Acute and chronic low back pain (LBP) are different conditions with different treatments. However, they are coded in electronic health records with the same International Classification of Diseases, 10th revision (ICD-10) code (M54.5) and can be differentiated only by retrospective chart reviews. This hinders the efficient development of data-driven guidelines for billing and therapy recommendations, such as return-to-work options.
The objective of this study was to evaluate the feasibility of automatically distinguishing acute LBP episodes by analyzing free-text clinical notes.
We used a dataset of 17,409 clinical notes from different primary care practices; of these, 891 documents were manually annotated as acute LBP and 2973 were generally associated with LBP via the recorded ICD-10 code. We compared different supervised and unsupervised strategies for automated identification: keyword search, topic modeling, logistic regression with bag of n-grams and manual features, and deep learning (a convolutional neural network-based architecture [ConvNet]). We trained the supervised models using either manual annotations or ICD-10 codes as positive labels.
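As an illustration of the simplest strategy listed above, keyword search, the following is a minimal sketch and not the study's implementation. The keyword list, the function name `flag_acute_lbp`, and the three example notes are all assumptions invented for this example; the abstract does not state which keywords were used.

```python
import re

# Hypothetical keyword list; the keywords used in the study are not given in the abstract.
ACUTE_KEYWORDS = ("acute low back pain", "acute lbp", "acute lumbar strain")


def flag_acute_lbp(note, keywords=ACUTE_KEYWORDS):
    """Return True if any keyword phrase occurs in the note.

    Matching is case-insensitive, tolerant of repeated whitespace, and
    anchored on word boundaries so that "subacute" does not match "acute".
    """
    text = re.sub(r"\s+", " ", note.lower())
    return any(
        re.search(r"\b" + re.escape(keyword) + r"\b", text) is not None
        for keyword in keywords
    )


# Invented notes for illustration only; they are not taken from the dataset.
notes = [
    "Patient presents with acute low back pain after lifting a box.",
    "Chronic low back pain, stable on current regimen.",
    "Follow-up for hypertension; no musculoskeletal complaints.",
]
flags = [flag_acute_lbp(note) for note in notes]
```

A rule of this kind needs no training labels, which is why it can serve as an unsupervised baseline, but it cannot handle negation ("no acute low back pain") or paraphrase.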
The ConvNet trained using manual annotations obtained the best results, with an area under the receiver operating characteristic curve of 0.98 and an F score of 0.70. The ConvNet's results were also robust to a reduction in the number of manually annotated documents. In the absence of manual annotations, topic models performed better than the supervised methods trained using ICD-10 codes, which proved unsatisfactory for identifying LBP acuity.
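The two reported metrics measure different things: the area under the receiver operating characteristic curve is computed from the ranking of scores and needs no decision threshold, whereas the F score depends on the threshold used to turn scores into labels. The sketch below shows how each could be computed. It assumes the F score is the balanced F1 measure, which the abstract does not specify, and the labels, scores, and 0.5 threshold are invented for illustration.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number
    of documents, which is acceptable for a small illustrative example.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented labels and model scores; not results from the study.
y_true = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
```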
This study uses clinical notes to delineate a potential path toward systematic learning of therapeutic strategies, billing guidelines, and management options for acute LBP at the point of care.