A practical tool for public health surveillance: Semi-automated coding of short injury narratives from large administrative databases using Naïve Bayes algorithms.

Authors

Marucci-Wellman Helen R, Lehto Mark R, Corns Helen L

Affiliations

Center for Injury Epidemiology, Liberty Mutual Research Institute for Safety, Hopkinton, MA, USA.

School of Industrial Engineering, Purdue University, West Lafayette, IN, USA.

Publication information

Accid Anal Prev. 2015 Nov;84:165-76. doi: 10.1016/j.aap.2015.06.014. Epub 2015 Sep 26.

DOI:10.1016/j.aap.2015.06.014
PMID:26412196
Abstract

Public health surveillance programs in the U.S. are undergoing landmark changes with the availability of electronic health records and advancements in information technology. Injury narratives gathered from hospital records, workers compensation claims or national surveys can be very useful for identifying antecedents to injury or emerging risks. However, classifying narratives manually can become prohibitive for large datasets. The purpose of this study was to develop a human-machine system that could be relatively easily tailored to routinely and accurately classify injury narratives from large administrative databases such as workers compensation. We used a semi-automated approach based on two Naïve Bayesian algorithms to classify 15,000 workers compensation narratives into two-digit Bureau of Labor Statistics (BLS) event (leading to injury) codes. Narratives were filtered out for manual review if the algorithms disagreed or made weak predictions. This approach resulted in an overall accuracy of 87%, with consistently high positive predictive values across all two-digit BLS event categories including the very small categories (e.g., exposure to noise, needle sticks). The Naïve Bayes algorithms were able to identify and accurately machine code most narratives leaving only 32% (4853) for manual review. This strategy substantially reduces the need for resources compared with manual review alone.
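The filtering rule described in the abstract (machine-code a narrative only when both classifiers agree and neither prediction is weak) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the two Naïve Bayes models differ, so the split into a single-word model and a word-pair model is an assumption, as are the `0.8` confidence threshold, the add-one smoothing, the toy narratives, and the placeholder labels `fall` and `overexertion` (they are not actual BLS event codes).

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self, featurize):
        self.featurize = featurize                  # text -> list of features
        self.class_counts = Counter()               # narratives per code
        self.feature_counts = defaultdict(Counter)  # feature counts per code
        self.vocab = set()

    def fit(self, narratives, codes):
        for text, code in zip(narratives, codes):
            self.class_counts[code] += 1
            for feat in self.featurize(text):
                self.feature_counts[code][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, text):
        """Return (most probable code, its posterior probability)."""
        total = sum(self.class_counts.values())
        # Features never seen in training carry no evidence and are skipped.
        feats = [f for f in self.featurize(text) if f in self.vocab]
        log_scores = {}
        for code, n_docs in self.class_counts.items():
            denom = sum(self.feature_counts[code].values()) + len(self.vocab)
            score = math.log(n_docs / total)
            for f in feats:
                score += math.log((self.feature_counts[code][f] + 1) / denom)
            log_scores[code] = score
        # Convert log scores to normalized posteriors in a numerically stable way.
        top = max(log_scores.values())
        weights = {c: math.exp(s - top) for c, s in log_scores.items()}
        best = max(weights, key=weights.get)
        return best, weights[best] / sum(weights.values())


def words(text):
    return text.lower().split()


def word_pairs(text):
    tokens = words(text)
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def triage(text, model_a, model_b, threshold=0.8):
    """Return a machine-assigned code, or None to route to manual review."""
    code_a, p_a = model_a.predict(text)
    code_b, p_b = model_b.predict(text)
    if code_a != code_b or min(p_a, p_b) < threshold:
        return None  # models disagree or a prediction is weak
    return code_a


# Hypothetical toy training set; labels are placeholders, not BLS codes.
train_text = [
    "slipped on wet floor and fell",
    "fell from ladder while painting",
    "tripped over cord and fell",
    "strained back lifting heavy box",
    "lifting patient strained shoulder",
    "pulled muscle lifting heavy pallet",
]
train_code = ["fall"] * 3 + ["overexertion"] * 3

single = NaiveBayes(words).fit(train_text, train_code)
pairs = NaiveBayes(word_pairs).fit(train_text, train_code)

for narrative in ["slipped on wet floor", "worker injured at site"]:
    print(narrative, "->", triage(narrative, single, pairs) or "manual review")
```

Raising the threshold sends more narratives to human coders and lowers the risk of machine errors; lowering it does the reverse. In the study, this kind of filter left 4,853 of the 15,000 narratives (about 32%) for manual review.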

Similar articles

1
A practical tool for public health surveillance: Semi-automated coding of short injury narratives from large administrative databases using Naïve Bayes algorithms.
Accid Anal Prev. 2015 Nov;84:165-76. doi: 10.1016/j.aap.2015.06.014. Epub 2015 Sep 26.
2
Classifying injury narratives of large administrative databases for surveillance-A practical approach combining machine learning ensembles and human review.
Accid Anal Prev. 2017 Jan;98:359-371. doi: 10.1016/j.aap.2016.10.014. Epub 2016 Nov 15.
3
Bayesian methods: a useful tool for classifying injury narratives into cause groups.
Inj Prev. 2009 Aug;15(4):259-65. doi: 10.1136/ip.2008.021337.
4
A combined Fuzzy and Naive Bayesian strategy can be used to assign event codes to injury narratives.
Inj Prev. 2011 Dec;17(6):407-14. doi: 10.1136/ip.2010.030593. Epub 2011 Apr 11.
5
Comparison of methods for auto-coding causation of injury narratives.
Accid Anal Prev. 2016 Mar;88:117-23. doi: 10.1016/j.aap.2015.12.006. Epub 2015 Dec 30.
6
Development and evaluation of a Naïve Bayesian model for coding causation of workers' compensation claims.
J Safety Res. 2012 Dec;43(5-6):327-32. doi: 10.1016/j.jsr.2012.10.012. Epub 2012 Nov 1.
7
Near-miss narratives from the fire service: a Bayesian analysis.
Accid Anal Prev. 2014 Jan;62:119-29. doi: 10.1016/j.aap.2013.09.012. Epub 2013 Oct 1.
8
Occupational amputations in Illinois 2000-2007: BLS vs. data linkage of trauma registry, hospital discharge, workers compensation databases and OSHA citations.
Injury. 2013 May;44(5):667-73. doi: 10.1016/j.injury.2012.01.007. Epub 2012 Feb 24.
9
Etiology of work-related electrical injuries: a narrative analysis of workers' compensation claims.
J Occup Environ Hyg. 2009 Oct;6(10):612-23. doi: 10.1080/15459620903133683.
10
Establishment size and risk of occupational injury.
Am J Ind Med. 1995 Jul;28(1):1-21. doi: 10.1002/ajim.4700280102.

Cited by

1
Establishment-level occupational safety analytics: Challenges and opportunities.
Int J Ind Ergon. 2023 Mar;94. doi: 10.1016/j.ergon.2023.103428.
2
A Catalogue of Machine Learning Algorithms for Healthcare Risk Predictions.
Sensors (Basel). 2022 Nov 8;22(22):8615. doi: 10.3390/s22228615.
3
Workers' compensation claim counts and rates by injury event/exposure among state-insured private employers in Ohio, 2007-2017.
J Safety Res. 2021 Dec;79:148-167. doi: 10.1016/j.jsr.2021.08.015. Epub 2021 Sep 17.
4
The development of a machine learning algorithm to identify occupational injuries in agriculture using pre-hospital care reports.
Health Inf Sci Syst. 2021 Jul 29;9(1):31. doi: 10.1007/s13755-021-00161-9. eCollection 2021 Dec.
5
Testing and Validating Semi-automated Approaches to the Occupational Exposure Assessment of Polycyclic Aromatic Hydrocarbons.
Ann Work Expo Health. 2021 Jul 3;65(6):682-693. doi: 10.1093/annweh/wxab002.
6
Prediction of postoperative complications of pediatric cataract patients using data mining.
J Transl Med. 2019 Jan 3;17(1):2. doi: 10.1186/s12967-018-1758-2.
7
Mortality prediction in patients with isolated moderate and severe traumatic brain injury using machine learning models.
PLoS One. 2018 Nov 9;13(11):e0207192. doi: 10.1371/journal.pone.0207192. eCollection 2018.
8
Mortality, morbidity and health in developed societies: a review of data sources.
Genus. 2018;74(1):2. doi: 10.1186/s41118-018-0027-9. Epub 2018 Jan 29.
9
Comparison of methods for auto-coding causation of injury narratives.
Accid Anal Prev. 2016 Mar;88:117-23. doi: 10.1016/j.aap.2015.12.006. Epub 2015 Dec 30.
10
Harnessing information from injury narratives in the 'big data' era: understanding and applying machine learning for injury surveillance.
Inj Prev. 2016 Apr;22 Suppl 1(Suppl 1):i34-42. doi: 10.1136/injuryprev-2015-041813. Epub 2016 Jan 4.