A novel piecewise-linear method for detecting associations between variables.

Affiliation

School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, China.

Publication information

PLoS One. 2023 Aug 24;18(8):e0290280. doi: 10.1371/journal.pone.0290280. eCollection 2023.

Abstract

Detecting the association between two variables is necessary and meaningful in the era of big data. Many measures exist for this purpose: some detect linear association, e.g., the simple and fast Pearson correlation coefficient, while others detect nonlinear association, e.g., the maximal information coefficient (MIC), which is computationally expensive and imprecise. In this study, we propose a novel maximal association coefficient (MAC), based on the idea that any nonlinear association can be regarded as a composition of piecewise-linear ones; MAC detects linear or nonlinear association between two variables through the Pearson coefficient. Experiments on simulated data show that MAC has both generality and equitability. We also apply MAC to two real datasets, the major-league baseball dataset from Baseball Prospectus and a dataset of credit card clients' defaults, to measure the association strength of pairs of variables in each. The experimental results show that MAC can detect the association between two variables and that it is less computationally expensive and more precise than MIC, which may be important for follow-up data analysis and for the conclusions drawn from it.
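The piecewise-linear idea in the abstract can be illustrated with a small sketch. The abstract does not describe how MAC is actually computed, so the code below is not the authors' algorithm: the function names `pearson` and `piecewise_pearson`, the equal-size segmentation along x, and the size-weighted average of absolute per-segment Pearson coefficients are all assumptions chosen only to show why a nonlinear relationship that looks uncorrelated globally can look strongly correlated piece by piece.

```python
import numpy as np


def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 when either input is constant."""
    if np.std(x) == 0 or np.std(y) == 0:
        return 0.0
    return float(np.corrcoef(x, y)[0, 1])


def piecewise_pearson(x, y, n_segments=2):
    """Hypothetical illustration, NOT the published MAC.

    Sorts the points by x, splits them into n_segments contiguous groups
    of (nearly) equal size, and returns the size-weighted mean of the
    absolute Pearson coefficient within each group.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)                      # indices that sort x
    groups = np.array_split(order, n_segments)  # contiguous index groups
    total = 0.0
    for idx in groups:
        if len(idx) < 2:
            continue                           # too few points to correlate
        total += len(idx) * abs(pearson(x[idx], y[idx]))
    return total / len(x)


# A symmetric parabola: no linear trend overall, near-linear on each half.
x = np.linspace(-1.0, 1.0, 200)
y = x ** 2

global_r = pearson(x, y)                 # close to 0
piecewise_r = piecewise_pearson(x, y, 2)  # well above 0.9

print(f"global Pearson:    {global_r:.3f}")
print(f"piecewise Pearson: {piecewise_r:.3f}")
```

On this example the global coefficient is essentially zero because the two halves of the parabola cancel, while each half taken alone is close to a straight line. How the segments should be chosen, and how the per-segment values should be combined, are exactly the parts of the method that the abstract leaves unspecified.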

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0914/10449123/c8986017741e/pone.0290280.g001.jpg
