Dunstan Jocelyn, Campaña-Herrera Valentina, Miranda Luis, Ladrón de Guevara Rocío, Pincheira Pablo, Rocco Victor, Moyano-Dávila Daniela
Department of Computer Science, Pontificia Universidad Católica de Chile, Santiago, Chile.
Institute for Mathematical and Computational Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile.
BMC Public Health. 2025 Aug 12;25(1):2746. doi: 10.1186/s12889-025-24130-z.
Evidence from the Global North shows that women and men differ significantly in rates of work accidents and occupational diseases. However, more data are needed from countries outside that region.
Using natural language processing (NLP), we extracted accident mechanisms from 350,000 admission reports from the largest occupational health provider in Chile. Using the same technique, we also normalized occupations written in free text, following the nomenclature of the International Labour Organization (ILO).
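The normalization step described above, mapping free-text occupations onto a controlled vocabulary, can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract describes only as NLP-based. The example uses simple fuzzy string matching from the Python standard library, and the tiny vocabulary, the function names, and the `cutoff` value are all hypothetical, chosen only for illustration. The labels are modeled on ISCO-08 major group names, assuming that is the kind of ILO nomenclature targeted. The entries are in English for readability; the original reports would presumably be in Spanish.

```python
import difflib
import unicodedata

# Hypothetical toy vocabulary (assumption, not from the paper):
# normalized occupation keyword -> ISCO-08-style major group label.
VOCAB = {
    "machine operator": "Plant and machine operators, and assemblers",
    "cleaner": "Elementary occupations",
    "nurse": "Professionals",
    "salesperson": "Service and sales workers",
}


def clean(text):
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def normalize_occupation(free_text, cutoff=0.6):
    """Return the label of the closest vocabulary entry, or None if nothing is close enough."""
    matches = difflib.get_close_matches(clean(free_text), list(VOCAB), n=1, cutoff=cutoff)
    return VOCAB[matches[0]] if matches else None


if __name__ == "__main__":
    for raw in ["Machine  Operator", "cleanr", "xyz"]:
        print(repr(raw), "->", normalize_occupation(raw))
```

Fuzzy matching of this kind only tolerates spelling variation; handling synonyms or full descriptive sentences would need a stronger method, such as the language-model approach the abstract refers to.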
We found that men are affected in 57.3% of accidents and women in 42.7%. The most common occupation among men is operator, while among women it is related to cleaning duties. The most common form of accident for women is falling from the same height, while for men it is contact with sharp objects. In this work, we demonstrate the power of NLP for the large-scale analysis of work-related accidents by reporting the use of large language models, with human expert annotation to evaluate the extraction of mechanisms.
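Evaluating model-extracted mechanisms against human expert annotation typically comes down to comparing two label sequences. The abstract does not state which metrics were used, so the sketch below is only an assumption: it computes plain accuracy and Cohen's kappa (agreement corrected for chance) with the standard library. The function names and the example labels are hypothetical.

```python
from collections import Counter


def accuracy(predicted, expert):
    """Fraction of reports where the model label equals the expert label."""
    if len(predicted) != len(expert) or not expert:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == e for p, e in zip(predicted, expert)) / len(expert)


def cohen_kappa(predicted, expert):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(expert)
    observed = accuracy(predicted, expert)
    counts_pred, counts_exp = Counter(predicted), Counter(expert)
    labels = counts_pred.keys() | counts_exp.keys()
    # Chance agreement from the two marginal label distributions.
    expected = sum(counts_pred[k] * counts_exp[k] for k in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical labels for four reports.
    expert_labels = ["fall", "cut", "fall", "cut"]
    model_labels = ["fall", "cut", "cut", "cut"]
    print("accuracy:", accuracy(model_labels, expert_labels))
    print("kappa:", cohen_kappa(model_labels, expert_labels))
```

Kappa is worth reporting alongside accuracy when a few mechanisms dominate the data, since a model can reach high raw accuracy simply by predicting the most frequent category.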
By sharing our prompts and code, we aim to help other institutions and countries extract crucial information from free text and map it to the ILO's controlled vocabulary. Future work includes the analysis of commuting accidents and occupational diseases.