Karim Ali Abdul, Pardede Eric, Mann Scott
Department of Computer Science and Information Technology, La Trobe University, Melbourne, VIC 3086, Australia.
Entropy (Basel). 2023 Jul 30;25(8):1144. doi: 10.3390/e25081144.
This study examined whether the behaviour of Internet search users, as captured by Google Trends, contributes to the forecasting of two Australian macroeconomic indicators: the monthly unemployment rate and the monthly number of short-term visitors. We assessed the performance of a traditional linear time series model (SARIMA) against a widely used machine learning technique (support vector regression, SVR) and a deep learning technique (convolutional neural network, CNN) in forecasting both indicators across different data settings. Our study focused on the out-of-sample forecasting performance of the SARIMA, SVR, and CNN models for the two indicators. We adopted a multi-step approach to compare the performance of models built over different forecasting horizons and assessed the impact of incorporating Google Trends data in the modelling process. Our approach supports a data-driven framework that reduces the number of features before the best-performing model is selected. The experiments showed that incorporating Internet search data in the forecasting models improved forecasting accuracy and that the results depended on the forecasting horizon as well as the technique. To the best of our knowledge, this study is the first to assess the usefulness of Google search data in the context of these two economic variables. An extensive comparison of traditional and machine learning techniques on different data settings was conducted to enable the selection of an efficient model, including the forecasting technique, horizon, and modelling features.
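The comparison described above (out-of-sample accuracy at several horizons, with and without search data) can be illustrated with a minimal sketch. This is not the study's code or data: the series are synthetic, the horizons (1, 3, 6, 12), lag count, and test length are hypothetical choices, and ordinary least squares on lagged values stands in for the SARIMA, SVR, and CNN models purely to keep the example self-contained. All function names are made up for illustration.

```python
import numpy as np


def make_synthetic(n=120, seed=0):
    """Synthetic monthly target plus a hypothetical search-interest series."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    search = np.sin(2 * np.pi * t / 12) + 0.3 * rng.standard_normal(n)
    y = np.empty(n)
    y[0] = 5.0
    for i in range(1, n):
        # The target depends on its own past and on last month's search index.
        y[i] = 0.8 * y[i - 1] + 1.0 + 0.5 * search[i - 1] + 0.2 * rng.standard_normal()
    return y, search


def build_design(y, exog, horizon, n_lags, use_exog):
    """Each row holds values known at origin t; the target is y[t + horizon]."""
    rows, targets = [], []
    for t in range(n_lags - 1, len(y) - horizon):
        feats = list(y[t - n_lags + 1 : t + 1])
        if use_exog:
            feats += list(exog[t - n_lags + 1 : t + 1])
        rows.append([1.0] + feats)  # leading 1.0 is the intercept column
        targets.append(y[t + horizon])
    return np.array(rows), np.array(targets)


def out_of_sample_rmse(y, exog, horizon, n_lags=3, n_test=24, use_exog=False):
    """Fit a direct h-step linear model on early rows, score it on the last n_test rows."""
    X, target = build_design(y, exog, horizon, n_lags, use_exog)
    # Drop horizon - 1 extra rows so every training target is already
    # observed at the first test origin (no look-ahead).
    n_train = len(X) - n_test - (horizon - 1)
    X_train, y_train = X[:n_train], target[:n_train]
    X_test, y_test = X[-n_test:], target[-n_test:]
    coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    pred = X_test @ coef
    return float(np.sqrt(np.mean((pred - y_test) ** 2)))


def compare(horizons=(1, 3, 6, 12)):
    """RMSE per horizon for the baseline and the search-augmented feature set."""
    y, search = make_synthetic()
    return {
        h: {
            "baseline": out_of_sample_rmse(y, search, h, use_exog=False),
            "with_search": out_of_sample_rmse(y, search, h, use_exog=True),
        }
        for h in horizons
    }


if __name__ == "__main__":
    for h, scores in compare().items():
        print(h, scores)
```

Each horizon gets its own model (a direct multi-step strategy), so the two feature sets can be compared horizon by horizon. Replacing the least-squares fit with another estimator would leave the evaluation loop unchanged.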