Primary Health Care Service of the Capital Area, Reykjavik, Iceland.
Department of Computer Science, Reykjavik University, Reykjavik, Iceland.
Scand J Prim Health Care. 2021 Dec;39(4):448-458. doi: 10.1080/02813432.2021.1973255. Epub 2021 Sep 29.
Machine learning (ML) is expected to play an increasing role within primary health care (PHC) in coming years. No peer-reviewed studies exist that evaluate the diagnostic accuracy of ML models compared to general practitioners (GPs). The aim of this study was to evaluate the diagnostic accuracy of an ML classifier on primary headache diagnoses in PHC, compare its performance to that of GPs, and examine which signs and symptoms have the greatest impact on its predictions.
A retrospective study on diagnostic accuracy, using electronic health records from the database of the Primary Health Care Service of the Capital Area (PHCCA) in Iceland.
Fifteen primary health care centers of the PHCCA.
All patients who consulted a physician from 1 January 2006 to 30 April 2020 and received one of the selected diagnoses.
Sensitivity, Specificity, Positive Predictive Value, Matthews Correlation Coefficient, Receiver Operating Characteristic (ROC) curve, and Area under the ROC curve (AUROC) score for primary headache diagnoses, as well as Shapley Additive Explanations (SHAP) values of the ML classifier.
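As an illustration of how the threshold-based outcome measures listed above relate to one another, the following minimal sketch computes sensitivity, specificity, positive predictive value, and the Matthews Correlation Coefficient from the four cells of a binary confusion matrix. The function name `binary_metrics` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not taken from the study, and the study's own computation may differ (for example, in how several headache diagnoses are handled).

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute diagnostic accuracy metrics from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives, and false negatives for one diagnosis versus the rest.
    Ratios with a zero denominator are returned as 0.0 (a convention
    assumed here, not one stated in the source).
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)   # share of true cases that are detected
    specificity = ratio(tn, tn + fp)   # share of non-cases correctly ruled out
    ppv = ratio(tp, tp + fp)           # share of positive calls that are correct
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_denominator)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not data from the study).
example = binary_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the Matthews Correlation Coefficient uses all four cells at once, which is why it is often reported alongside them when the diagnoses being compared are unevenly represented.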
The classifier outperformed the GPs on all metrics except specificity. The SHAP values indicate that, when distinguishing between headache diagnoses, the classifier uses the same signs and symptoms (features) as a physician would.
In a retrospective comparison, the diagnostic accuracy of the ML classifier for primary headache diagnoses is superior to that of GPs. According to SHAP values, the ML classifier relies on the same signs and symptoms as a physician when making a diagnostic prediction.
Key points
Little is known about the diagnostic accuracy of machine learning (ML) in the context of primary health care, despite its considerable potential to aid in clinical work.
This novel research sheds light on the diagnostic accuracy of ML in a clinical context, as well as the interpretation of its predictions.
If the vast potential of ML is to be utilized in primary health care, its performance, safety, and inner workings need to be understood by clinicians.