Operations and Supply Chain Management, Indian Institute of Management, Mumbai 400087, India.
Business School, University of Colorado Denver, Denver, CO 80202, USA.
J Safety Res. 2024 Jun;89:91-104. doi: 10.1016/j.jsr.2024.02.006. Epub 2024 Mar 1.
Workplace accidents in the petroleum industry can cause catastrophic damage to people, property, and the environment. Earlier studies in this domain indicate that the majority of the accident report information is available in unstructured text format. Conventional techniques for the analysis of accident data are time-consuming and heavily dependent on experts' subject knowledge, experience, and judgment. There is a need to develop a machine learning-based decision support system to analyze the vast amounts of unstructured text data that are frequently overlooked due to a lack of appropriate methodology.
To address this gap in the literature, we propose a hybrid methodology that pairs improved text-mining techniques with an unbiased group decision-making framework. The framework combines objective weights (based on text mining) with subjective weights (based on expert opinion) to prioritize risk factors. Using contextual word embedding models and term frequencies, we extracted five important clusters of risk factors comprising more than 32 risk sub-factors. A heterogeneous group of experts and employees in the petroleum industry was contacted to obtain opinions on the extracted risk factors, and the best-worst method was used to convert those opinions into weights.
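As a rough illustration of blending objective and subjective weights, the sketch below counts keyword occurrences per risk factor and mixes the normalized counts with expert-derived weights. It is not the authors' procedure. The abstract does not state the combination rule, so the convex combination and the mixing parameter `alpha` are assumptions, plain keyword counting stands in for the contextual embedding step, and the subjective weights are taken as given rather than computed with the best-worst method. The reports, factor names, and numbers are hypothetical.

```python
from collections import Counter


def objective_weights(reports, factor_terms):
    """Normalized keyword-frequency weight for each risk factor.

    Assumption: simple whitespace tokenization and exact keyword matches,
    used here only as a stand-in for the text-mining step.
    """
    counts = Counter()
    for text in reports:
        tokens = text.lower().split()
        for factor, terms in factor_terms.items():
            counts[factor] += sum(tokens.count(term) for term in terms)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no factor keywords found in the reports")
    return {factor: counts[factor] / total for factor in factor_terms}


def combine_weights(objective, subjective, alpha=0.5):
    """Convex combination of the two weight sets, renormalized to sum to 1.

    Assumption: alpha is a hypothetical mixing parameter, not from the paper.
    """
    mixed = {
        factor: alpha * objective[factor] + (1 - alpha) * subjective[factor]
        for factor in objective
    }
    total = sum(mixed.values())
    return {factor: weight / total for factor, weight in mixed.items()}


# Hypothetical accident report snippets and keyword lists.
reports = [
    "gas leak near valve",
    "worker slipped no harness",
    "valve leak caused fire",
]
factor_terms = {
    "equipment": ["valve", "leak"],
    "human": ["harness", "slipped"],
}
# Hypothetical expert weights, e.g. the output of a best-worst method step.
subjective = {"equipment": 0.4, "human": 0.6}

objective = objective_weights(reports, factor_terms)
combined = combine_weights(objective, subjective, alpha=0.5)

# Rank risk factors from highest to lowest combined weight.
ranking = sorted(combined, key=combined.get, reverse=True)
```

Setting `alpha` closer to 1 favors the text-derived weights, and closer to 0 favors expert opinion.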
The applicability of the proposed framework was tested on data compiled from accident data released by the petroleum industries in India. The framework can be extended to accident data from any industry to reduce analysis time and improve accuracy in classifying and prioritizing risk factors.