Identifying Firearm Violence Exposure in Primary Care Clinical Notes: Protocol for Developing a Natural Language Processing Text Classifier.

Author information

Carwright Natalie, Biel Frances M, Hoopes Megan, Bataineh Ali Al, Rivera Pedro, Bet Kerry, Cook Nicole

Affiliations

Department of Mathematics, Norwich University, Northfield, VT, United States.

OCHIN, Portland, OR, United States.

Publication information

JMIR Res Protoc. 2025 Sep 5;14:e76681. doi: 10.2196/76681.

Abstract

BACKGROUND

Structured data codes capture acute bodily injury from firearm violence but do not necessarily describe follow-up care from bodily injury and secondary exposure to firearm violence (eg, witnessing a shooting, being threatened by a firearm, or losing a loved one to gun violence and injury from firearms) even though such exposure is associated with many short- and long-term health impacts. Clinical notes from electronic health records (EHRs) often contain data not otherwise captured in structured data fields and can be categorized using natural language processing (NLP).

OBJECTIVE

This study protocol outlines the steps being taken to develop an NLP text classifier for determination of exposure to firearm violence (both primary and secondary exposure) from ambulatory primary care and behavioral health EHR clinical notes for persons aged ≥5 years.

METHODS

The study will use unstructured data from clinical notes written between 2012 and 2022 from OCHIN, a multistate network of community health organizations using a single instance of the Epic EHR. We describe the process of developing a labeled dataset for supervised NLP development, which includes establishing a lexicon (words related to firearm violence) to identify potentially relevant notes, followed by a review of text extracted from a sample of these notes. We then describe the process of building, training, and evaluating candidate machine learning, neural network, and large language model NLP text classifiers. From these candidates, a final NLP model will be chosen and then evaluated on a new set of randomly selected notes. An engaged stakeholder advisory committee will provide input and guidance on methods and results to identify and address potential biases in the NLP text classifiers.
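The lexicon step described above can be illustrated with a minimal sketch. It is not the study's implementation: the term list, the context window size, the sampling seed, and the function names `extract_snippets` and `sample_for_review` are all assumptions made for illustration, and the toy notes are invented.

```python
import random
import re

# Hypothetical lexicon: the protocol's actual term list is not given in the abstract.
LEXICON = ["gun", "firearm", "shot", "shooting", "gunshot", "pistol", "rifle"]

# Word-boundary match so "gun" does not fire on "begun"; allows an optional plural "s".
PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, LEXICON)) + r")s?\b", re.IGNORECASE
)


def extract_snippets(note_text, window=40):
    """Return the text surrounding each lexicon hit, for manual review."""
    snippets = []
    for match in PATTERN.finditer(note_text):
        start = max(0, match.start() - window)
        end = min(len(note_text), match.end() + window)
        snippets.append(note_text[start:end])
    return snippets


def sample_for_review(notes, sample_size, seed=0):
    """Keep notes with at least one lexicon hit, then draw a reproducible random sample."""
    flagged = {note_id: extract_snippets(text) for note_id, text in notes.items()}
    flagged = {note_id: snips for note_id, snips in flagged.items() if snips}
    rng = random.Random(seed)
    chosen = rng.sample(sorted(flagged), min(sample_size, len(flagged)))
    return {note_id: flagged[note_id] for note_id in chosen}


# Toy, invented notes (not real patient data).
notes = {
    "n1": "Patient reports witnessing a shooting near home last month.",
    "n2": "Follow-up for hypertension; treatment has begun.",
    "n3": "Pt states ex-partner threatened her with a gun.",
}
review_set = sample_for_review(notes, sample_size=2)
```

A lexicon hit is only a candidate, not a label. A term such as "shot" would also match "flu shot", which is one reason a review of the extracted text would still be needed before the notes can serve as training data.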

RESULTS

The study was funded in September 2023. Study activities have been ongoing through July 2025, and we are currently evaluating NLP text classifiers. We expect that the final model will be selected by August 2025, and we will publish the results of NLP model development and the final model performance in 2026.

CONCLUSIONS

This work describes the development of a novel NLP text classifier to identify exposure to firearm violence in ambulatory primary care and behavioral health clinical notes. The NLP model developed in this study may lead to increased ascertainment of patients with exposure, laying the groundwork for understanding the long-term impacts and outcomes of firearm violence exposure and presenting opportunities for improved patient care.

INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/76681.
