
Hypothesis testing procedure for binary and multi-class F1-scores in the paired design.

Affiliations

Department of Biostatistics, Hyogo Medical University, Hyogo, Japan.

Department of Biostatistics, School of Medicine, Yokohama City University, Kanagawa, Japan.

Publication information

Stat Med. 2023 Oct 15;42(23):4177-4192. doi: 10.1002/sim.9853. Epub 2023 Aug 1.

DOI: 10.1002/sim.9853
PMID: 37527903
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11483486/
Abstract

In modern medicine, medical tests are used for various purposes including diagnosis, disease screening, prognosis, and risk prediction. To quantify the performance of the binary medical test, we often use sensitivity, specificity, and negative and positive predictive values as measures. Additionally, the F1-score, which is defined as the harmonic mean of precision (positive predictive value) and recall (sensitivity), has come to be used in the medical field due to its favorable characteristics. The F1-score has been extended for multi-class classification, and two types of F1-scores have been proposed for multi-class classification: a micro-averaged F1-score and a macro-averaged F1-score. The micro-averaged F1-score pools per-sample classifications across classes and then calculates the overall F1-score, whereas the macro-averaged F1-score computes an arithmetic mean of the F1-scores for each class. Additionally, Sokolova and Lapalme gave an alternative definition of the macro-averaged F1-score as the harmonic mean of the arithmetic means of the precision and recall over classes. Although some statistical methods of inference for binary and multi-class F1-scores have been proposed, the methodology development of hypothesis testing procedure for them has not been fully progressing yet. Therefore, we aim to develop hypothesis testing procedure for comparing two F1-scores in paired study design based on the large sample multivariate central limit theorem.
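The four F1 definitions named in the abstract can be made concrete with a small sketch. The code below is only an illustration of the definitions, not the authors' testing procedure: the function name `f1_summary`, the confusion-matrix layout (rows are true classes, columns are predicted classes), and the example counts are all assumptions introduced here. It also assumes every class occurs at least once as a true label and at least once as a prediction, so no division by zero arises.

```python
import numpy as np


def f1_summary(cm):
    """Compute F1 variants from a confusion matrix (hypothetical helper).

    Assumed layout: cm[i, j] = number of samples whose true class is i
    and whose predicted class is j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                 # true positives per class
    predicted = cm.sum(axis=0)       # TP + FP per class (column sums)
    actual = cm.sum(axis=1)          # TP + FN per class (row sums)

    precision = tp / predicted       # positive predictive value per class
    recall = tp / actual             # sensitivity per class

    # Per-class F1: harmonic mean of precision and recall.
    per_class = 2 * precision * recall / (precision + recall)

    # Micro-averaged F1: pool the counts over classes first, then compute F1.
    micro_p = tp.sum() / predicted.sum()
    micro_r = tp.sum() / actual.sum()
    micro = 2 * micro_p * micro_r / (micro_p + micro_r)

    # Macro-averaged F1: arithmetic mean of the per-class F1-scores.
    macro = per_class.mean()

    # Alternative macro F1 (Sokolova and Lapalme): harmonic mean of the
    # arithmetic means of precision and recall over classes.
    mean_p, mean_r = precision.mean(), recall.mean()
    macro_alt = 2 * mean_p * mean_r / (mean_p + mean_r)

    return {
        "per_class": per_class,
        "micro": micro,
        "macro": macro,
        "macro_alt": macro_alt,
    }


if __name__ == "__main__":
    # Made-up three-class counts, for illustration only.
    example = [[5, 1, 0],
               [2, 3, 1],
               [0, 1, 7]]
    for name, value in f1_summary(example).items():
        print(name, value)
```

In a paired design, two tests or classifiers are evaluated on the same samples, so each would yield its own confusion matrix and the quantity of interest is the difference between the two resulting F1-scores. The sketch stops at the point estimates; the variance of that difference is what the paper's procedure addresses.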


Similar articles

1
Hypothesis testing procedure for binary and multi-class F1-scores in the paired design.
Stat Med. 2023 Oct 15;42(23):4177-4192. doi: 10.1002/sim.9853. Epub 2023 Aug 1.
2
Confidence interval for micro-averaged F1 and macro-averaged F1 scores.
Appl Intell (Dordr). 2022 Mar;52(5):4961-4972. doi: 10.1007/s10489-021-02635-5. Epub 2021 Jul 31.
3
Optimal Thresholding of Classifiers to Maximize F1 Measure.
Mach Learn Knowl Discov Databases. 2014;8725:225-239. doi: 10.1007/978-3-662-44851-9_15.
4
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
5
MF-MNER: Multi-models Fusion for MNER in Chinese Clinical Electronic Medical Records.
Interdiscip Sci. 2024 Jun;16(2):489-502. doi: 10.1007/s12539-024-00624-z. Epub 2024 Apr 5.
6
Using a national surgical database to predict complications following posterior lumbar surgery and comparing the area under the curve and F1-score for the assessment of prognostic capability.
Spine J. 2021 Jul;21(7):1135-1142. doi: 10.1016/j.spinee.2021.02.007. Epub 2021 Feb 16.
7
More advantages in detecting bone and soft tissue metastases from prostate cancer using 18F-PSMA PET/CT.
Hell J Nucl Med. 2019 Jan-Apr;22(1):6-9. doi: 10.1967/s002449910952. Epub 2019 Mar 7.
8
Developing a FHIR-based EHR phenotyping framework: A case study for identification of patients with obesity and multiple comorbidities from discharge summaries.
J Biomed Inform. 2019 Nov;99:103310. doi: 10.1016/j.jbi.2019.103310. Epub 2019 Oct 14.
9
Development of an unsupervised machine learning algorithm for the prognostication of walking ability in spinal cord injury patients.
Spine J. 2020 Feb;20(2):213-224. doi: 10.1016/j.spinee.2019.09.007. Epub 2019 Sep 13.
10
Antibody Class(es) Predictor for Epitopes (AbCPE): A Multi-Label Classification Algorithm.
Front Bioinform. 2021 Sep 7;1:709951. doi: 10.3389/fbinf.2021.709951. eCollection 2021.

Cited by

1
Machine learning-assisted design of immunomodulatory lipid nanoparticles for delivery of mRNA to repolarize hyperactivated microglia.
Drug Deliv. 2025 Dec;32(1):2465909. doi: 10.1080/10717544.2025.2465909. Epub 2025 Mar 3.
2
Asymptotic Properties of Matthews Correlation Coefficient.
Stat Med. 2025 Jan 15;44(1-2):e10303. doi: 10.1002/sim.10303. Epub 2024 Dec 16.
3
Exploring the Impact of Batch Size on Deep Learning Artificial Intelligence Models for Malaria Detection.
Cureus. 2024 May 13;16(5):e60224. doi: 10.7759/cureus.60224. eCollection 2024 May.
4
Estimated electric conductivities of thermal plasma for air-fuel combustion and oxy-fuel combustion with potassium or cesium seeding.
Heliyon. 2024 May 22;10(11):e31697. doi: 10.1016/j.heliyon.2024.e31697. eCollection 2024 Jun 15.
