A statistical comparison between Matthews correlation coefficient (MCC), prevalence threshold, and Fowlkes-Mallows index.

Affiliations

University of Toronto, Canada.

Fondazione Bruno Kessler, Italy.

Publication information

J Biomed Inform. 2023 Aug;144:104426. doi: 10.1016/j.jbi.2023.104426. Epub 2023 Jun 21.

Abstract

Although assessing binary classifications is a common task in scientific research, no consensus has yet been reached on a single statistic for summarizing the confusion matrix. In recent studies, we demonstrated the advantages of the Matthews correlation coefficient (MCC) over other popular rates such as cross-entropy error, F score, accuracy, balanced accuracy, bookmaker informedness, diagnostic odds ratio, Brier score, and Cohen's kappa. In this study, we compared the MCC with two other statistics: prevalence threshold (PT), frequently used in obstetrics and gynecology, and the Fowlkes-Mallows index, a metric employed in fuzzy logic and drug discovery. By investigating the mutual relations among the three metrics and studying some relevant use cases, we show that, when positive and negative data elements have the same importance, the MCC can once again be more informative than its two competitors.
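For readers who want to see how the three statistics relate to the confusion matrix, the following is a minimal sketch in Python. It uses the commonly cited definitions of MCC, the Fowlkes-Mallows index, and prevalence threshold; the abstract itself does not state these formulas, so the function names, argument order, and zero-denominator conventions below are assumptions made for illustration and are not taken from the paper.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient, in [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0.0 when a row or column of the matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def fowlkes_mallows(tp, fp, tn, fn):
    """Geometric mean of precision and recall; note that tn is never used."""
    denom = math.sqrt((tp + fp) * (tp + fn))
    return tp / denom if denom else 0.0


def prevalence_threshold(tp, fp, tn, fn):
    """sqrt(FPR) / (sqrt(TPR) + sqrt(FPR)); lower values are better."""
    if tp + fn == 0 or fp + tn == 0:
        return float("nan")  # undefined when one class is absent
    tpr = tp / (tp + fn)  # sensitivity
    fpr = fp / (fp + tn)  # 1 - specificity
    total = math.sqrt(tpr) + math.sqrt(fpr)
    return math.sqrt(fpr) / total if total else float("nan")


if __name__ == "__main__":
    # Hypothetical imbalanced case: 91 positives, 9 negatives,
    # and a classifier that labels almost everything as positive.
    counts = dict(tp=90, fp=8, tn=1, fn=1)
    print("MCC:", round(mcc(**counts), 3))
    print("FM: ", round(fowlkes_mallows(**counts), 3))
    print("PT: ", round(prevalence_threshold(**counts), 3))
```

In this made-up example the Fowlkes-Mallows index comes out close to 1 even though only one of the nine negatives is classified correctly, while the MCC stays low because it takes all four cells of the confusion matrix into account. This is only an illustration of the kind of disagreement the abstract refers to, not a reproduction of the paper's use cases.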
