Sex differences in work-related accidents extracted from free text in Spanish using natural language processing.

Author information

Dunstan Jocelyn, Campaña-Herrera Valentina, Miranda Luis, Ladrón de Guevara Rocío, Pincheira Pablo, Rocco Victor, Moyano-Dávila Daniela

Affiliations

Department of Computer Science, Pontificia Universidad Católica de Chile, Santiago, Chile.

Institute for Mathematical and Computational Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile.

Publication information

BMC Public Health. 2025 Aug 12;25(1):2746. doi: 10.1186/s12889-025-24130-z.

Abstract

BACKGROUND

Evidence from the Global North shows that women and men differ significantly in rates of work accidents and occupational diseases. However, more data are needed from countries elsewhere.

METHODS

Using natural language processing (NLP), we extracted accident mechanisms from 350,000 admission reports from the largest occupational health provider in Chile. Using the same technique, we also normalized occupations written in free text, following the nomenclature of the International Labour Organization (ILO).
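The occupation normalization step can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract says relies on NLP and large language models. It swaps in plain fuzzy string matching from the Python standard library to show what mapping free text to a controlled vocabulary means. The lexicon entries, the choice of ISCO-08 major groups as the target vocabulary, and the function names are all assumptions made for illustration.

```python
import difflib
import unicodedata

# Hypothetical mini-lexicon (an assumption, not taken from the paper):
# cleaned Spanish job titles mapped to ISCO-08 major groups.
LEXICON = {
    "operador de maquinaria": "8 Plant and machine operators, and assemblers",
    "auxiliar de aseo": "9 Elementary occupations",
    "contador": "2 Professionals",
    "vendedor": "5 Service and sales workers",
}


def strip_accents(text):
    """Remove combining accent marks so that 'mecánico' equals 'mecanico'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def normalize_occupation(raw, cutoff=0.6):
    """Return the label of the closest lexicon entry, or None if nothing is close."""
    # Lowercase, drop accents, and collapse repeated whitespace.
    cleaned = " ".join(strip_accents(raw).lower().split())
    match = difflib.get_close_matches(cleaned, list(LEXICON), n=1, cutoff=cutoff)
    return LEXICON[match[0]] if match else None
```

For example, `normalize_occupation("AUXILIAR  DE ASEO ")` resolves to the elementary occupations group despite the casing and spacing, while an unrelated string such as `"xyz"` returns `None`. A real system would need a far larger lexicon or a learned model to handle abbreviations and misspellings at the scale of 350,000 reports.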

RESULTS

We found that men were affected in 57.3% of accidents and women in 42.7%. The most common occupation for men is operator, while for women it is related to cleaning duties. The most common form of accident for women is falling from the same height, while for men it is contact with sharp objects. In this work, we demonstrate the power of NLP for the large-scale analysis of work-related accidents by reporting the use of large language models, with human expert annotation, to evaluate mechanism extraction.
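One plausible way to compare model-extracted mechanisms with expert annotations is to compute raw accuracy and Cohen's kappa over a labeled sample. The abstract does not state which metrics the authors used, so the sketch below is an assumption. The label names and function names are hypothetical.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of reports where the model label equals the expert label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def cohen_kappa(a, b):
    """Chance-corrected agreement between two label sequences of equal length."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    # Agreement expected if both raters labeled independently at their own rates.
    expected = sum(counts_a[l] * counts_b[l] for l in set(a) | set(b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical toy sample: expert labels versus model labels.
expert = ["fall", "cut", "fall", "cut"]
model = ["fall", "cut", "cut", "cut"]
print(accuracy(expert, model))     # 0.75
print(cohen_kappa(expert, model))  # 0.5
```

Kappa is lower than accuracy here because half of the observed agreement would be expected by chance given how often each label is used.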

CONCLUSION

By sharing our prompts and code, we aim to help other institutions and countries extract crucial information from free text and map it to the ILO controlled vocabulary. Future work includes the analysis of commuting accidents and occupational diseases.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ccbd/12341284/85594c0ca66b/12889_2025_24130_Fig1_HTML.jpg
