Xiao Qianyi, Ihnaini Baha
Department of Computer Science, Wenzhou Kean University, Wenzhou, Zhejiang, China.
PeerJ Comput Sci. 2023 Mar 20;9:e1293. doi: 10.7717/peerj-cs.1293. eCollection 2023.
The vast amount of data generated on the Internet is a new treasure trove for investors. Text mining and sentiment analysis techniques can be used to gauge investors' confidence in specific stocks and so support more accurate decisions. Most previous research simply sums the text sentiment scores for each natural (calendar) day and uses that aggregated score to predict various stock trends. However, a score aggregated over the natural day may not be useful for predicting different stock trends. In this research, we therefore designed two time divisions, 0:00∼0:00 and 9:30∼9:30, to study how tweets and news from different periods can predict the next-day stock trend. A total of 260,000 tweets and 6,000 news articles about Service stocks (Amazon, Netflix) and Technology stocks (Apple, Microsoft) were selected for the study. The experimental results show that the opening-hours division (9:30∼9:30) outperformed the natural-hours division (0:00∼0:00).
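The two time divisions can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sample scores, and the choice to label each window by the date on which it starts are assumptions made for illustration, and timestamps are assumed to be naive datetimes already expressed in the exchange's local time.

```python
from collections import defaultdict
from datetime import date, datetime, time, timedelta

MARKET_OPEN = time(9, 30)  # assumed opening time, in exchange-local time


def natural_window(ts: datetime) -> date:
    """0:00-to-0:00 division: the window is simply the calendar day."""
    return ts.date()


def opening_window(ts: datetime) -> date:
    """9:30-to-9:30 division, labeled by the date on which the window starts.

    Text posted before 9:30 belongs to the window that began at 9:30
    on the previous calendar day.
    """
    if ts.time() >= MARKET_OPEN:
        return ts.date()
    return ts.date() - timedelta(days=1)


def aggregate(scored_texts, window_fn):
    """Sum sentiment scores per window; scored_texts is (timestamp, score) pairs."""
    totals = defaultdict(float)
    for ts, score in scored_texts:
        totals[window_fn(ts)] += score
    return dict(totals)


# Hypothetical sentiment scores, for illustration only.
sample = [
    (datetime(2023, 3, 1, 8, 0), 0.5),    # before the open on 1 March
    (datetime(2023, 3, 1, 10, 0), -0.2),  # after the open on 1 March
    (datetime(2023, 3, 2, 9, 0), 0.4),    # before the open on 2 March
]

natural_scores = aggregate(sample, natural_window)
opening_scores = aggregate(sample, opening_window)
```

Under the natural division, the 8:00 and 10:00 posts on 1 March fall into the same bucket. Under the opening-hours division, the 8:00 post belongs to the window that started on 28 February, while the 10:00 post and the 9:00 post on the following morning share one window. How each window is then paired with a target trading day is not specified in the abstract and is left out of the sketch.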