Jazbec Metod, Pàsztor Barna, Faltings Felix, Antulov-Fantulin Nino, Kolm Petter N
Department of Computer Science, ETH Zurich, 8092 Zurich, Switzerland.
Computational Social Science, ETH Zurich, 8092 Zurich, Switzerland.
R Soc Open Sci. 2021 Jul 28;8(7):202321. doi: 10.1098/rsos.202321. eCollection 2021 Jul.
We quantify how information in a large-scale collection of publicly available news articles from the World Wide Web propagates to, and is absorbed by, financial markets. To extract publicly available information, we use the news archives from the Common Crawl, a non-profit organization that crawls a large part of the web. We develop a processing pipeline to identify news articles associated with the constituent companies in the S&P 500 index, an equity market index that measures the stock performance of US companies. Using machine learning techniques, we extract sentiment scores from the Common Crawl News data and employ tools from information theory to quantify the information transfer from public news articles to the US stock market. Furthermore, we analyse and quantify the economic significance of the news-based information with a simple sentiment-based portfolio trading strategy. Our findings support the view that information in publicly available news on the World Wide Web has a statistically and economically significant impact on events in financial markets.
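The abstract does not name the specific information-theoretic measure used. As an illustration only, the following minimal sketch assumes transfer entropy with a history length of one step and a simple plug-in (frequency count) estimator on discretized series. The function name `transfer_entropy`, the variables `news` and `returns`, the binary discretization, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate of lag-1 transfer entropy (in bits) from source to target.

    Assumption: both inputs are equal-length sequences of discrete symbols,
    e.g. a sentiment sign and a return sign per day. This is an illustrative
    estimator, not the procedure used in the paper.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    # Each triple is (target at t+1, target at t, source at t).
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    if n == 0:
        return 0.0
    c_full = Counter(triples)
    c_cond = Counter((y0, x0) for _, y0, x0 in triples)
    c_pair = Counter((y1, y0) for y1, y0, _ in triples)
    c_past = Counter(y0 for _, y0, _ in triples)
    te = 0.0
    for (y1, y0, x0), c in c_full.items():
        p_with_source = c / c_cond[(y0, x0)]          # p(y1 | y0, x0)
        p_without_source = c_pair[(y1, y0)] / c_past[y0]  # p(y1 | y0)
        te += (c / n) * math.log2(p_with_source / p_without_source)
    return te


# Synthetic, hypothetical data: the "return" sign copies the previous
# day's "news" sign, so information flows from news to returns only.
rng = random.Random(0)
news = [rng.randint(0, 1) for _ in range(2000)]
returns = [0] + news[:-1]

te_news_to_returns = transfer_entropy(news, returns)
te_returns_to_news = transfer_entropy(returns, news)
print(te_news_to_returns, te_returns_to_news)
```

In this toy setup the estimate from `news` to `returns` should be close to one bit, while the reverse direction should be close to zero, which is the kind of asymmetry a directed information-transfer measure is meant to expose. Real sentiment and return series would need a considered discretization and a significance test, neither of which is shown here.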