Kachouie Nezamoddin N, Deebani Wejdan
Department of Mathematical Sciences, Florida Institute of Technology, Melbourne, FL 32901, USA.
Department of Mathematics, College of Science and Arts, King Abdulaziz University, P.O. Box 344, Rabigh 21911, Saudi Arabia.
Entropy (Basel). 2020 Apr 13;22(4):440. doi: 10.3390/e22040440.
In data analysis and machine learning, we often need to identify and quantify the correlation between variables. Although Pearson's correlation coefficient is widely used, its value is reliable only for linear relationships. Distance correlation was introduced to address this shortcoming.
Distance correlation can identify linear and nonlinear correlations. However, its performance drops in noisy conditions. In this paper, we introduce the Association Factor (AF) as a robust method for identification and quantification of linear and nonlinear associations in noisy conditions.
To test the performance of the proposed Association Factor, we simulated several linear and nonlinear relationships under different noise conditions and computed Pearson's correlation, distance correlation, and the proposed Association Factor for each.
Our results show that the proposed method is robust in two ways. First, it can identify both linear and nonlinear associations. Second, the proposed Association Factor is reliable in both noiseless and noisy conditions.
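The comparison described above can be illustrated with a minimal sketch. It covers only the two established measures, Pearson's correlation and the sample distance correlation, because the abstract does not define the Association Factor. The function names (`pearson_corr`, `distance_corr`, `simulate`), the two relationships, the sample size, and the noise levels are all assumptions chosen for illustration, not the authors' simulation settings.

```python
import numpy as np


def pearson_corr(x, y):
    """Pearson's correlation coefficient between two 1-D samples."""
    return float(np.corrcoef(x, y)[0, 1])


def _double_centered_distances(v):
    """Pairwise |v_i - v_j| matrix with row, column, and grand means removed."""
    d = np.abs(v[:, None] - v[None, :])
    return (d
            - d.mean(axis=0, keepdims=True)
            - d.mean(axis=1, keepdims=True)
            + d.mean())


def distance_corr(x, y):
    """Sample distance correlation of two 1-D samples (V-statistic form)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a = _double_centered_distances(x)
    b = _double_centered_distances(y)
    dcov2 = (a * b).mean()      # squared distance covariance
    dvar2_x = (a * a).mean()    # squared distance variance of x
    dvar2_y = (b * b).mean()    # squared distance variance of y
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def simulate(kind, n=500, noise_sd=0.0, seed=0):
    """Hypothetical test data: y = f(x) + Gaussian noise, x uniform on [-1, 1]."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, n)
    if kind == "linear":
        y = 2.0 * x
    elif kind == "quadratic":
        y = x ** 2
    else:
        raise ValueError(f"unknown relationship: {kind}")
    return x, y + rng.normal(0.0, noise_sd, n)


if __name__ == "__main__":
    # Illustrative noise levels only; not taken from the paper.
    for kind in ("linear", "quadratic"):
        for noise_sd in (0.0, 0.5, 1.0):
            x, y = simulate(kind, noise_sd=noise_sd)
            print(f"{kind:9s} noise_sd={noise_sd:.1f}  "
                  f"pearson={pearson_corr(x, y):+.3f}  "
                  f"dcor={distance_corr(x, y):.3f}")
```

Under these assumptions, the noiseless quadratic case is where the two measures are expected to disagree most: Pearson's coefficient stays near zero for a symmetric relationship, while distance correlation does not. Raising `noise_sd` shows how distance correlation weakens as noise grows.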