Division of Hospital Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA.
Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, Ohio, USA.
J Hosp Med. 2023 May;18(5):405-412. doi: 10.1002/jhm.13080. Epub 2023 Mar 15.
Diagnostic uncertainty, when unrecognized or poorly communicated, can result in diagnostic error. However, diagnostic uncertainty is challenging to study due to a lack of validated identification methods. This study aims to identify distinct linguistic patterns associated with diagnostic uncertainty in clinical documentation.
DESIGN, SETTING AND PARTICIPANTS: This case-control study compared the clinical documentation of hospitalized children who received a novel uncertain diagnosis (UD) label during their admission with that of a set of matched controls. Linguistic analyses identified potential linguistic indicators (i.e., words or phrases) of diagnostic uncertainty, which a linguist and clinical experts then manually reviewed to select those most relevant to diagnostic uncertainty. A natural language processing program categorized medical terminology into semantic types (e.g., sign or symptom), from which we identified the subset that was both categorized reliably and relevant to diagnostic uncertainty. Finally, a competitive machine learning modeling strategy compared different predictive models that used the linguistic indicators and semantic types to identify diagnostic uncertainty.
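As a rough illustration of how linguistic indicators might be turned into model inputs, the sketch below counts indicator phrases in a clinical note and returns them as a feature vector. The phrase list, the function names, and the example note are hypothetical; the abstract does not list the study's actual indicators or describe its feature encoding.

```python
import re

# Hypothetical indicator phrases for illustration only;
# the study's actual indicator list is not given in this abstract.
INDICATORS = ["unclear etiology", "differential includes", "cannot rule out", "possibly"]


def indicator_counts(note, indicators=INDICATORS):
    """Count case-insensitive whole-phrase matches of each indicator in one note."""
    text = note.lower()
    counts = {}
    for phrase in indicators:
        # \b anchors keep "possibly" from matching inside a longer word.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        counts[phrase] = len(re.findall(pattern, text))
    return counts


def feature_vector(note, indicators=INDICATORS):
    """Return counts in a fixed order so every note yields a comparable vector."""
    counts = indicator_counts(note, indicators)
    return [counts[phrase] for phrase in indicators]


# Invented example note, not taken from the study data.
note = "Fever of unclear etiology. Differential includes viral illness; cannot rule out UTI."
print(feature_vector(note))
```

Vectors of this kind, one per note, could then be passed to any standard classifier; the abstract reports that a random forest performed best among the models compared.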
Our cohort included 242 UD-labeled patients and 932 matched controls, with a combined total of 3070 clinical notes. The best-performing model was a random forest that used both linguistic indicators and semantic types, yielding a sensitivity of 89.4% and a positive predictive value of 96.7%.
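For readers less familiar with the two reported metrics, the sketch below shows how sensitivity and positive predictive value are computed from confusion-matrix counts. The counts are illustrative placeholders chosen for easy arithmetic; they are not the study's data, which this abstract does not report.

```python
def sensitivity(tp, fn):
    """Share of truly uncertain cases that the model flags: TP / (TP + FN)."""
    return tp / (tp + fn)


def positive_predictive_value(tp, fp):
    """Share of flagged cases that are truly uncertain: TP / (TP + FP)."""
    return tp / (tp + fp)


# Illustrative counts only (assumed for this example, not from the study).
tp, fn, fp = 90, 10, 5

print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"PPV = {positive_predictive_value(tp, fp):.3f}")
```

Sensitivity does not depend on how common uncertain diagnoses are in the sample, whereas positive predictive value does, so the reported value applies to a cohort with this study's case-to-control ratio.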
Expert labeling, natural language processing, and machine learning methods, combined with human validation, produced highly predictive models for detecting diagnostic uncertainty in clinical documentation. This combination represents a promising approach to detecting, studying, and ultimately mitigating diagnostic uncertainty in clinical practice.