Schwartz Jessica L, Tseng Eva, Maruthur Nisa M, Rouhizadeh Masoud
Division of General Internal Medicine, Johns Hopkins School of Medicine, Baltimore, MD, United States.
Division of Hospital Medicine, Johns Hopkins Hospital, Baltimore, MD, United States.
JMIR Med Inform. 2022 Feb 24;10(2):e29803. doi: 10.2196/29803.
Prediabetes affects 1 in 3 US adults. Most are not receiving evidence-based interventions, so understanding how providers discuss prediabetes with patients will inform how to improve their care.
This study aimed to develop a natural language processing (NLP) algorithm using machine learning techniques to identify discussions of prediabetes in narrative documentation.
We developed and applied a keyword search strategy to identify discussions of prediabetes in clinical documentation for patients with prediabetes. We manually reviewed the matching notes to determine which represented actual prediabetes discussions. We then evaluated 7 machine learning models against our manual annotations.
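The keyword search step can be illustrated with a minimal sketch. The abstract does not list the study's search terms, so the `KEYWORDS` list, the `find_candidate_notes` helper, and the note format (a dict of note ID to text) are all hypothetical and shown only to make the idea concrete. Notes flagged this way would still need manual review, as described above.

```python
import re

# Hypothetical keyword list: the abstract does not give the actual search terms.
KEYWORDS = [
    "prediabetes",
    "pre-diabetes",
    "pre diabetes",
    "impaired fasting glucose",
    "borderline diabetes",
]

# One case-insensitive pattern that matches any of the keywords literally.
PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def find_candidate_notes(notes):
    """Return (note_id, matched_terms) for every note containing a keyword.

    `notes` maps a note ID to its free-text content (an assumed format).
    Matched terms are lowercased, de-duplicated, and sorted.
    """
    candidates = []
    for note_id, text in notes.items():
        matches = sorted({m.group(0).lower() for m in PATTERN.finditer(text)})
        if matches:
            candidates.append((note_id, matches))
    return candidates
```

A keyword hit only marks a note as a candidate; a phrase such as "no history of prediabetes" would also match, which is why a review or classification step follows the search.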
Machine learning classifiers achieved near-human performance, with up to 98% precision and recall, in identifying prediabetes discussions in clinical documentation.
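For readers less familiar with the two metrics, the sketch below shows how precision and recall are computed when classifier output is compared with manual annotations. The `precision_recall` function and the label encoding (1 for a true prediabetes discussion, 0 otherwise) are assumptions for illustration, not the study's evaluation code.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels.

    y_true: manual annotations (1 = prediabetes discussion, 0 = not).
    y_pred: classifier predictions using the same encoding.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Precision: share of predicted discussions that were real discussions.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of real discussions that the classifier found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Precision penalizes notes wrongly flagged as discussions, while recall penalizes true discussions that were missed, so reporting both gives a fuller picture than accuracy alone.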
We demonstrated that prediabetes discussions can be accurately identified using an NLP algorithm. This approach can be used to understand and identify prediabetes management practices in primary care, thereby informing interventions to improve guideline-concordant care.