

Diagnostic Accuracy of Web-Based COVID-19 Symptom Checkers: Comparison Study.

Author information

Munsch Nicolas, Martin Alistair, Gruarin Stefanie, Nateqi Jama, Abdarahmane Isselmou, Weingartner-Ortner Rafael, Knapp Bernhard

Affiliations

Data Science Department, Symptoma, Vienna, Austria.

Medical Department, Symptoma, Attersee, Austria.

Publication information

J Med Internet Res. 2020 Oct 6;22(10):e21299. doi: 10.2196/21299.

Abstract

BACKGROUND

A large number of web-based COVID-19 symptom checkers and chatbots have been developed; however, anecdotal evidence suggests that their conclusions are highly variable. To our knowledge, no study has evaluated the accuracy of COVID-19 symptom checkers in a statistically rigorous manner.

OBJECTIVE

The aim of this study is to evaluate and compare the diagnostic accuracies of web-based COVID-19 symptom checkers.

METHODS

We identified 10 web-based COVID-19 symptom checkers, all of which were included in the study. We evaluated the COVID-19 symptom checkers by assessing 50 COVID-19 case reports alongside 410 non-COVID-19 control cases. A bootstrapping method was used to counter the unbalanced sample sizes and obtain confidence intervals (CIs). Results are reported as sensitivity, specificity, F1 score, and Matthews correlation coefficient (MCC).
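The evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not state how the bootstrap was configured, so the equal per-class draw (`n_per_class`), the number of rounds (`n_rounds`), and the percentile-based interval are all assumptions, and the function names are hypothetical.

```python
import math
import random


def metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, F1 and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity,
            "f1": f1, "mcc": mcc}


def balanced_bootstrap(pos_flags, neg_flags, n_per_class=50,
                       n_rounds=1000, seed=0):
    """Resample both classes to the same size (an assumed scheme).

    pos_flags: one bool per COVID-19 case, True if the checker flagged it.
    neg_flags: one bool per control case, True if the checker flagged it.
    Returns {metric: (mean, lower 2.5% bound, upper 97.5% bound)}.
    """
    rng = random.Random(seed)
    samples = {name: [] for name in ("sensitivity", "specificity", "f1", "mcc")}
    for _ in range(n_rounds):
        # Draw with replacement so each round is class-balanced.
        pos = rng.choices(pos_flags, k=n_per_class)
        neg = rng.choices(neg_flags, k=n_per_class)
        tp = sum(pos)
        fn = n_per_class - tp
        fp = sum(neg)
        tn = n_per_class - fp
        for name, value in metrics(tp, tn, fp, fn).items():
            samples[name].append(value)
    summary = {}
    for name, values in samples.items():
        values.sort()
        lower = values[int(0.025 * (n_rounds - 1))]
        upper = values[int(0.975 * (n_rounds - 1))]
        summary[name] = (sum(values) / n_rounds, lower, upper)
    return summary


if __name__ == "__main__":
    # Made-up outputs for a hypothetical checker, NOT data from the study:
    # it flags 45 of 50 COVID-19 cases and 41 of 410 controls.
    pos_flags = [True] * 45 + [False] * 5
    neg_flags = [True] * 41 + [False] * 369
    for name, (mean, lower, upper) in balanced_bootstrap(pos_flags, neg_flags).items():
        print(f"{name}: {mean:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Drawing the same number of cases from each class keeps prevalence-sensitive metrics such as F1 from being dominated by the 410 controls, which is one plausible reading of "counter the unbalanced sample sizes".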

RESULTS

The classification task between COVID-19-positive and COVID-19-negative for "high risk" cases among the 460 test cases yielded (sorted by F1 score): Symptoma (F1=0.92, MCC=0.85), Infermedica (F1=0.80, MCC=0.61), US Centers for Disease Control and Prevention (CDC) (F1=0.71, MCC=0.30), Babylon (F1=0.70, MCC=0.29), Cleveland Clinic (F1=0.40, MCC=0.07), Providence (F1=0.40, MCC=0.05), Apple (F1=0.29, MCC=-0.10), Docyet (F1=0.27, MCC=0.29), Ada (F1=0.24, MCC=0.27), and Your.MD (F1=0.24, MCC=0.27). For "high risk" and "medium risk" combined, the performance was: Symptoma (F1=0.91, MCC=0.83), Infermedica (F1=0.80, MCC=0.61), Cleveland Clinic (F1=0.76, MCC=0.47), Providence (F1=0.75, MCC=0.45), Your.MD (F1=0.72, MCC=0.33), CDC (F1=0.71, MCC=0.30), Babylon (F1=0.70, MCC=0.29), Apple (F1=0.70, MCC=0.25), Ada (F1=0.42, MCC=0.03), and Docyet (F1=0.27, MCC=0.29).
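For reference when reading the F1 and MCC values above, the standard definitions in terms of confusion-matrix counts are given below. The symbols TP, TN, FP, and FN (true and false positives and negatives) are notation introduced here, with COVID-19 assumed to be the positive class; the abstract itself does not spell out the formulas.

```latex
\begin{align}
\text{sensitivity} &= \frac{TP}{TP + FN}, &
\text{specificity} &= \frac{TN}{TN + FP}, \\
F_1 &= \frac{2\,TP}{2\,TP + FP + FN}, &
\mathrm{MCC} &= \frac{TP \cdot TN - FP \cdot FN}
  {\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
\end{align}
```

F1 does not use the true-negative count, whereas MCC uses all four cells and ranges from -1 to 1, with 0 corresponding to chance-level agreement. The two scores can therefore rank the same symptom checkers differently.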

CONCLUSIONS

We found that the number of correctly assessed COVID-19 and control cases varies considerably between symptom checkers, with different symptom checkers showing different strengths with respect to sensitivity and specificity. A good balance between sensitivity and specificity was only achieved by two symptom checkers.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f37f/7541039/f35d984cbaca/jmir_v22i10e21299_fig1.jpg
