Topaloglu Fatih
Computer Engineering/Faculty of Engineering, Malatya Turgut Ozal University, Malatya, Turkey.
PeerJ Comput Sci. 2024 Jul 26;10:e2198. doi: 10.7717/peerj-cs.2198. eCollection 2024.
Every work environment contains different types of risks, and these risks interact with one another. The method used for a risk assessment is therefore very important. Many factors bear on the choice of a risk assessment method (RAM), such as the types of risks in the work environment, the interactions between these risks, and their distance from the employees. Although many RAMs are available, no single RAM suits all workplaces, and which method to choose remains the biggest question. There is no internationally accepted scale or trend on this subject. In this study, 26 sectors, 10 different RAMs and 10 criteria were determined. A hybrid approach was designed to determine the most suitable RAMs for the sectors, using two machine learning (ML) algorithms: k-means clustering and support vector machine (SVM) classification. First, the data set was divided into subsets with the k-means algorithm. Then, the SVM algorithm was run on each of the subsets, which have different characteristics. Finally, the results of all subsets were combined to obtain the result for the entire data set. Thus, instead of a single threshold value determined for one large cluster and imposed on all of its members, a flexible structure was created in which a separate threshold value is determined for each sub-cluster according to its characteristics. In this way, machine support was provided for selecting the most suitable RAMs for the sectors, relieving staff of the administrative and software problems of the selection phase. In the first comparison, the proposed hybrid method scored 96.63%, against 90.63% for k-means and 94.68% for SVM. In the second comparison, made with five other ML algorithms, the results were 87.44% for artificial neural networks (ANN), 91.29% for naive Bayes (NB), 89.25% for decision trees (DT), 81.23% for random forest (RF) and 85.43% for k-nearest neighbours (KNN).
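The cluster-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: it assumes scikit-learn, a synthetic data set standing in for the sector/criteria data (10 features to mirror the 10 criteria, but only 3 classes for brevity), and hypothetical helper names `fit_hybrid` and `predict_hybrid`. The number of clusters and the RBF kernel are likewise illustrative choices, not values taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def fit_hybrid(X, y, n_clusters=3, seed=0):
    """Partition X with k-means, then fit one SVM per cluster (hypothetical helper)."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X)
    models = {}
    for c in range(n_clusters):
        mask = km.labels_ == c
        labels = np.unique(y[mask])
        if labels.size >= 2:
            # Each cluster gets its own decision boundary.
            models[c] = SVC(kernel="rbf", gamma="scale").fit(X[mask], y[mask])
        else:
            # An SVM needs at least two classes; otherwise store a constant label
            # (the cluster's only class, or the overall majority if it is empty).
            models[c] = labels[0] if labels.size else np.bincount(y).argmax()
    return km, models


def predict_hybrid(km, models, X):
    """Route each sample to its cluster's model and merge the predictions."""
    clusters = km.predict(X)
    y_pred = np.zeros(len(X), dtype=int)
    for c, model in models.items():
        mask = clusters == c
        if not mask.any():
            continue
        y_pred[mask] = model.predict(X[mask]) if isinstance(model, SVC) else model
    return y_pred


# Synthetic stand-in data (assumption): 600 samples, 10 features, 3 classes.
X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

km, models = fit_hybrid(X_train, y_train)
y_pred = predict_hybrid(km, models, X_test)
accuracy = float(np.mean(y_pred == y_test))
print(f"hybrid accuracy on synthetic data: {accuracy:.3f}")
```

Because each cluster's SVM is fitted only on that cluster's samples, its decision boundary adapts to the local characteristics of the subset, which is the flexibility the abstract attributes to per-sub-cluster thresholds. The accuracy printed here refers to the synthetic data only and says nothing about the figures reported in the study.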