Usher Institute, University of Edinburgh, NINE Bioquarter, Edinburgh, United Kingdom.
PLoS One. 2022 Sep 7;17(9):e0273830. doi: 10.1371/journal.pone.0273830. eCollection 2022.
When studying financial markets, we often estimate a correlation matrix from asset returns. These estimates tend to be noisy, with many more dimensions than samples, so the resulting correlation matrix is often filtered. Popular filtering methods include the minimum spanning tree, the planar maximally filtered graph and the triangulated maximally filtered graph, which treat the correlation matrix as the adjacency matrix of a graph and then apply tools from graph theory. These methods assume the data fit some particular shape. We do not necessarily have reason to believe that the data fit this shape, and there have been few empirical investigations comparing how the methods perform. In this paper we look at how the filtered networks differ from the original networks, using stock returns from the US, UK, German, Indian and Chinese markets, and at how these methods affect our ability to distinguish between datasets created from different correlation matrices using a graph embedding algorithm. We find that the relationship between the full and filtered networks depends on the data and the state of the market and weakens as the size of the networks increases, and that the filtered networks do not improve classification accuracy compared to the full networks.
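As an illustration of the simplest of the filtering methods named above, the following is a minimal sketch of minimum spanning tree filtering in pure Python. It is not the paper's implementation. The function names, the toy return series, and the distance transform d = sqrt(2(1 - rho)), a common convention for turning correlations into distances, are all assumptions made for the example.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation of two equal-length series.

    Assumes both series have non-zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(returns):
    """Full correlation matrix from a list of per-asset return series."""
    n = len(returns)
    return [
        [1.0 if i == j else pearson(returns[i], returns[j]) for j in range(n)]
        for i in range(n)
    ]


def mst_filter(corr):
    """Return the n - 1 edges of a minimum spanning tree (Prim's algorithm).

    Correlations are mapped to distances with d = sqrt(2 * (1 - rho)),
    an assumed convention, so strongly correlated assets are close.
    """
    n = len(corr)

    def dist(i, j):
        return math.sqrt(max(0.0, 2.0 * (1.0 - corr[i][j])))

    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Pick the shortest edge that joins the tree to a new node.
        _, i, j = min(
            (dist(i, j), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges


# Hypothetical toy data: four assets, five return observations each.
toy_returns = [
    [0.010, -0.020, 0.030, -0.010, 0.020],
    [0.012, -0.018, 0.025, -0.008, 0.021],
    [-0.005, 0.010, -0.020, 0.015, -0.010],
    [0.002, 0.004, -0.001, 0.003, 0.000],
]
toy_corr = correlation_matrix(toy_returns)
toy_edges = mst_filter(toy_corr)
print(toy_edges)
```

The filtered network keeps only n - 1 of the n(n - 1)/2 pairwise correlations, which is the kind of structural assumption the abstract questions.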