What is the suitability of clinical vignettes in benchmarking the performance of online symptom checkers? An audit study.

Affiliations

Department of Primary Care and Public Health, Imperial College London, London, UK.

Self-Care Academic Research Unit (SCARU), Department of Primary Care and Public Health, Imperial College London Faculty of Medicine, London, UK.

Publication information

BMJ Open. 2022 Apr 27;12(4):e053566. doi: 10.1136/bmjopen-2021-053566.

DOI:10.1136/bmjopen-2021-053566
PMID:35477872
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9047920/
Abstract

OBJECTIVE

Assess the suitability of clinical vignettes in benchmarking the performance of online symptom checkers (OSCs).

DESIGN

Observational study using a publicly available free OSC.

PARTICIPANTS

Healthily OSC, which provided consultations in English, was used to record consultation outcomes from two lay and four expert inputters using 139 standardised patient vignettes. Each vignette included three diagnostic solutions and a triage recommendation in one of three categories of triage urgency. A panel of three independent general practitioners interpreted the vignettes to arrive at an alternative set of diagnostic and triage solutions. Both sets of diagnostic and triage solutions were consolidated to arrive at a final consolidated version for benchmarking.

MAIN OUTCOME MEASURES

Six inputters simulated 834 standardised patient evaluations using Healthily OSC and recorded outputs (triage solution, signposting, and whether the correct diagnostic solution appeared first or within the first three differentials). We estimated Cohen's kappa to assess how interpretations by different inputters could lead to divergent OSC output even when using the same vignette or when compared with a separate panel of physicians.
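
The 834 evaluations correspond to six inputters each entering all 139 vignettes (6 × 139 = 834). As a rough illustration of the agreement statistic named above, the following is a minimal sketch of Cohen's kappa for two raters. The abstract does not say how kappa was computed or aggregated across six inputters, so the pairwise form, the function name `cohen_kappa`, the triage labels, and the ten example values are all assumptions made up for illustration, not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Sanity check on the study size reported in the abstract.
assert 6 * 139 == 834

# Hypothetical triage outputs recorded by two inputters for the same
# ten vignettes (labels and values are invented for this sketch).
inputter_1 = ["self-care"] * 4 + ["primary care"] * 3 + ["emergency"] * 3
inputter_2 = ["self-care"] * 3 + ["primary care"] * 3 + ["emergency"] * 4

kappa = cohen_kappa(inputter_1, inputter_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example the two inputters agree on 8 of 10 vignettes, while their label frequencies alone would produce agreement of 0.33 by chance, so kappa is (0.80 − 0.33) / (1 − 0.33). With more than two raters, a multi-rater statistic or an average over pairs would be needed; which of these the authors used is not stated in the abstract.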

RESULTS

There was moderate agreement on triage recommendation (kappa=0.48), and substantial agreement on consultation outcomes between all inputters (kappa=0.73). OSC performance improved significantly from baseline when compared against the final consolidated diagnostic and triage solution (p<0.001).
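
The labels "moderate" and "substantial" match the descriptive bands commonly attributed to Landis and Koch (1977). The abstract does not state which scale the authors applied, so the mapping below is an assumption; `describe_kappa` is a hypothetical helper written only to show how the two reported values fall into those bands.

```python
def describe_kappa(kappa):
    """Map a kappa value to a Landis and Koch (1977) descriptive label.

    Assumed scale; the abstract does not name the one it used.
    """
    if kappa < 0:
        return "poor"
    # Upper bound of each band, paired with its label.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


print(describe_kappa(0.48))  # triage recommendation
print(describe_kappa(0.73))  # consultation outcomes
```

Under this scale, 0.48 falls in the 0.41 to 0.60 band and 0.73 in the 0.61 to 0.80 band, consistent with the wording of the results.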

CONCLUSIONS

Clinical vignettes are inherently limited in their utility to benchmark the diagnostic accuracy or triage safety of OSC. Real-world evidence studies involving real patients are recommended to benchmark the performance of OSC against a panel of physicians.
