College of Safety Science and Engineering, Xi'an University of Science and Technology, Xi'an 710054, China.
Institute of Management Science, Ningxia University, Yinchuan 750021, China.
J Environ Public Health. 2023 Jan 28;2023:4181159. doi: 10.1155/2023/4181159. eCollection 2023.
Coal chemical enterprises face many risk factors, and the causes of their accidents are complex. Traditional risk assessment methods rely on expert experience and previous literature to determine accident causes, which limits their objectivity and interpretability. Analyzing accident reports helps to identify typical accident risk factors and to reveal how accidents evolve. However, this analysis is usually carried out manually by experts, which is subjective and time-consuming. This paper develops an improved approach that uses text mining (TM) to identify safety risk factors from a large volume of coal chemical accident reports. First, the accident reports were preprocessed, and term frequency-inverse document frequency (TF-IDF) was used for feature extraction. Then, the K-means algorithm and the Apriori algorithm were applied to the vectorized documents in the TF-IDF matrix for clustering and for association rule analysis, respectively, in order to quickly identify the hidden risk factors in the accident reports and the relationships among them, and to propose targeted safety management measures. The approach was applied to sample data covering 505 accidents that occurred over the past seven years at a large coal chemical enterprise in Western China. The analysis yielded six accident clusters and 13 association rules; the main risk factors of each accident cluster were then mined further, and corresponding management suggestions were put forward for the enterprise. The method offers coal chemical enterprises a new approach to safety management decision-making and helps to prevent safety accidents.
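The pipeline described above (TF-IDF features, K-means clustering, Apriori association rules) can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: the function names, the smoothed IDF formula, the farthest-point initialisation of K-means, and the toy tokens are all assumptions made for illustration, and the abstract does not state the tokenizer, weighting variant, or thresholds actually used.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf(docs):
    """Return (vocab, rows): one L2-normalised TF-IDF row per tokenised document."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    rows = []
    for d in docs:
        tf = Counter(d)
        # Assumed weighting: relative term frequency times a smoothed IDF.
        row = [tf[t] / len(d) * (math.log((1 + n) / (1 + df[t])) + 1) for t in vocab]
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        rows.append([x / norm for x in row])
    return vocab, rows


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(sqdist(r, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda c: sqdist(r, centers[c])) for r in rows]
        if new == labels:  # assignments stable: converged
            break
        labels = new
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def apriori(transactions, min_support):
    """Level-wise frequent-itemset search; returns {frozenset: support}."""
    tsets = [frozenset(t) for t in transactions]
    n = len(tsets)
    current = {frozenset([i]) for t in tsets for i in t}
    frequent, size = {}, 1
    while current:
        counts = {c: sum(c <= t for t in tsets) / n for c in current}
        level = {c: s for c, s in counts.items() if s >= min_support}
        frequent.update(level)
        size += 1
        # Join step: unions of frequent itemsets that contain exactly `size` items.
        joined = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must be frequent.
        current = {c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, size - 1))}
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) with one-item consequents."""
    found = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            antecedent = itemset - {item}
            confidence = support / frequent[antecedent]
            if confidence >= min_confidence:
                found.append((set(antecedent), item, support, confidence))
    return found


if __name__ == "__main__":
    # Hypothetical, already-tokenised accident descriptions (not data from the paper).
    reports = [
        ["valve", "leak", "gas", "poisoning"],
        ["gas", "leak", "pipeline", "poisoning"],
        ["scaffold", "fall", "harness", "missing"],
        ["fall", "height", "harness", "missing"],
    ]
    vocab, rows = tfidf(reports)
    print("cluster labels:", kmeans(rows, k=2))
    for rule in association_rules(apriori(reports, 0.5), 0.9):
        print(rule)
```

In practice a study of this kind would more likely rely on established libraries for vectorization, clustering, and rule mining; the hand-written versions here only make each step of the pipeline explicit.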