Farkhanda Abbas, Feng Zhang, Muhammad Ismail, Garee Khan, Javed Iqbal, Abdulwahed Fahad Alrefaei, Mohammed Fahad Albeshr
School of Computer Science, China University of Geosciences, Wuhan 430074, China.
Department of Computer Science, Karakoram International University, Gilgit 15100, Pakistan.
Sensors (Basel). 2023 Aug 1;23(15):6843. doi: 10.3390/s23156843.
Machine learning algorithms have found extensive use in numerous fields and applications. One important aspect of utilizing these algorithms effectively is tuning their hyperparameters to the task at hand, since hyperparameter selection and configuration directly affect model performance. Achieving optimal hyperparameter settings often requires a deep understanding of both the underlying models and the appropriate optimization techniques. Although many automatic optimization techniques are available, each with its own advantages and disadvantages, this article focuses on hyperparameter optimization for well-known machine learning models. It surveys cutting-edge optimization methods, including metaheuristic algorithms, deep learning-based optimization, Bayesian optimization, and quantum optimization, concentrating mainly on metaheuristic and Bayesian techniques, and provides guidance on applying them to different machine learning algorithms. The article also presents a real-world application of hyperparameter optimization through experiments on spatial data collections for landslide susceptibility mapping. In these experiments, both Bayesian optimization and metaheuristic algorithms performed well compared to baseline algorithms. For instance, a metaheuristic algorithm improved the random forest model's overall accuracy by 5% and 3% over the baseline optimization methods GS and RS, respectively, and by 4% and 2% over the baselines GA and PSO. For models such as KNN and SVM, Bayesian methods with Gaussian processes produced good results: compared to the baselines RS and GS, KNN accuracy was improved by BO-TPE by 1% and 11%, respectively, and by BO-GP by 2% and 12%, respectively. For SVM, BO-TPE outperformed GS and RS by 6%, while BO-GP improved results by 5%.
The paper discusses in detail why these algorithms are effective. By identifying appropriate hyperparameter configurations, this research aims to help researchers, spatial data analysts, and industrial users develop machine learning models more effectively. The findings and insights presented here can contribute to enhancing the performance and applicability of machine learning algorithms in various domains.
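To make the baseline comparison concrete, the sketch below contrasts the two baselines mentioned above, grid search (GS) and random search (RS), on a toy two-parameter tuning problem. The objective function here is a hypothetical stand-in for validation accuracy (the paper's actual objective is accuracy on landslide-susceptibility spatial data), and the parameter names and ranges are illustrative assumptions, not the paper's settings.

```python
import itertools
import random

# Toy stand-in for a model's validation accuracy as a function of two
# hyperparameters (think of an RF's n_estimators and max_depth).
# This is an assumed synthetic objective, peaking at (300, 12).
def objective(n_estimators, max_depth):
    return 1.0 - ((n_estimators - 300) / 500) ** 2 \
               - ((max_depth - 12) / 20) ** 2

def grid_search(grid):
    # GS baseline: exhaustively evaluate every combination in the grid.
    return max(itertools.product(*grid.values()),
               key=lambda cfg: objective(*cfg))

def random_search(space, n_trials, seed=0):
    # RS baseline: sample n_trials configurations uniformly at random.
    rng = random.Random(seed)
    trials = [(rng.randint(*space["n_estimators"]),
               rng.randint(*space["max_depth"]))
              for _ in range(n_trials)]
    return max(trials, key=lambda cfg: objective(*cfg))

grid = {"n_estimators": [100, 200, 300, 400], "max_depth": [5, 10, 15, 20]}
space = {"n_estimators": (100, 500), "max_depth": (3, 25)}

best_gs = grid_search(grid)                     # best config on the grid
best_rs = random_search(space, n_trials=16)     # best of 16 random trials
print("GS best:", best_gs, "score:", round(objective(*best_gs), 3))
print("RS best:", best_rs, "score:", round(objective(*best_rs), 3))
```

The contrast this sketch illustrates is the one the abstract's results turn on: GS is locked to the grid it is given, while RS (and, more deliberately, the Bayesian and metaheuristic methods the paper favors) can probe values between grid points, which is why smarter search strategies recover better configurations from the same budget.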