School of Economics and Trade, Guangzhou Xinhua University, Dongguan, China.
Lingnan College, Sun Yat-Sen University, Guangzhou, China.
PLoS One. 2023 Jun 28;18(6):e0287754. doi: 10.1371/journal.pone.0287754. eCollection 2023.
Stock price prediction has long been a hot topic in the field of artificial intelligence. In recent years, computational intelligence methods such as machine learning and deep learning have been explored for prediction systems. However, accurately predicting the direction of stock prices remains a major challenge, because stock prices are affected by nonlinear, nonstationary, and high-dimensional features. Previous work has largely overlooked feature engineering, although selecting the optimal set of features that affect stock prices is a promising way to address this challenge. We therefore propose an improved multi-objective optimization algorithm integrated with random forest (I-NSGA-II-RF), combined with a three-stage feature engineering process, to reduce computational complexity and improve the accuracy of the prediction system. The model is optimized in two directions: maximizing accuracy and minimizing the optimal solution set. The population of the I-NSGA-II algorithm is initialized with the integrated information from two filter-based feature selection methods, and a multiple-chromosome hybrid coding scheme is used to select features and optimize model parameters simultaneously. Finally, the selected feature subset and parameters are fed to the RF for training, prediction, and iterative optimization. Experimental results show that, compared with the unmodified multi-objective feature selection algorithm and the single-objective feature selection algorithm, I-NSGA-II-RF achieves the highest average accuracy, the smallest optimal solution set, and the shortest running time. Compared with the deep learning model, it is interpretable, more accurate, and faster to run.
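The hybrid coding and the two optimization directions described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Chromosome` layout, the parameter genes `n_estimators` and `max_depth`, the `accuracy_fn` callback, and the reading of "minimizing the optimal solution set" as minimizing the number of selected features are all assumptions made for illustration. The sketch shows only how a candidate is scored on two objectives and how non-dominated candidates are identified, which is the comparison at the core of NSGA-II-style selection.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Chromosome:
    """Hypothetical hybrid encoding: a binary feature mask plus model parameter genes."""
    feature_mask: Tuple[int, ...]  # 1 = feature selected, 0 = feature dropped
    n_estimators: int              # assumed random forest parameter gene
    max_depth: int                 # assumed random forest parameter gene


def objectives(
    ch: Chromosome,
    accuracy_fn: Callable[[List[int], int, int], float],
) -> Tuple[float, int]:
    """Score a chromosome as (error, number of selected features); both are minimized.

    accuracy_fn is a placeholder for training and validating a random forest
    on the selected feature indices with the given parameters.
    """
    selected = [i for i, bit in enumerate(ch.feature_mask) if bit]
    accuracy = accuracy_fn(selected, ch.n_estimators, ch.max_depth)
    return (1.0 - accuracy, len(selected))


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Pareto dominance for minimization: a is no worse everywhere and better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: Sequence[Sequence[float]]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]
```

In practice, `accuracy_fn` would wrap a real classifier evaluated on held-out data, and the front would feed the ranking and crowding steps of the evolutionary loop, which this sketch omits.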