Reference standards, judges, and comparison subjects: roles for experts in evaluating system performance.

Authors

Hripcsak George, Wilcox Adam

Affiliation

Department of Medical Informatics, Columbia University, New York, New York 10032, USA.

Publication information

J Am Med Inform Assoc. 2002 Jan-Feb;9(1):1-15. doi: 10.1136/jamia.2002.0090001.

DOI:10.1136/jamia.2002.0090001

PMID:11751799

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC349383/

Abstract

Medical informatics systems are often designed to perform at the level of human experts. Evaluation of the performance of these systems is often constrained by lack of reference standards, either because the appropriate response is not known or because no simple appropriate response exists. Even when performance can be assessed, it is not always clear whether the performance is sufficient or reasonable. These challenges can be addressed if an evaluator enlists the help of clinical domain experts. 1) The experts can carry out the same tasks as the system, and then their responses can be combined to generate a reference standard. 2) The experts can judge the appropriateness of system output directly. 3) The experts can serve as comparison subjects with which the system can be compared. These are separate roles that have different implications for study design, metrics, and issues of reliability and validity. Diagrams help delineate the roles of experts in complex study designs.
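
The first role described in the abstract, combining expert responses into a reference standard, can be illustrated with a minimal sketch. The abstract does not say how responses are combined or which metrics are used, so the majority vote, the sensitivity and specificity measures, the function names, and the data below are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def majority_reference(expert_labels):
    """Combine per-case expert labels by majority vote (an assumed rule).

    expert_labels: one list of booleans per case, one boolean per expert.
    A tie counts as False; an odd number of experts avoids ties.
    """
    reference = []
    for votes in expert_labels:
        counts = Counter(votes)
        reference.append(counts[True] > counts[False])
    return reference


def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity) of predicted against reference."""
    pairs = list(zip(predicted, reference))
    true_pos = sum(1 for p, r in pairs if p and r)
    true_neg = sum(1 for p, r in pairs if not p and not r)
    positives = sum(1 for r in reference if r)
    negatives = len(reference) - positives
    sens = true_pos / positives if positives else float("nan")
    spec = true_neg / negatives if negatives else float("nan")
    return sens, spec


# Hypothetical data: 5 cases, each labeled by 3 experts and by the system.
expert_labels = [
    [True, True, False],
    [False, False, False],
    [True, True, True],
    [False, True, False],
    [True, False, True],
]
system_output = [True, False, True, True, True]

reference = majority_reference(expert_labels)
sens, spec = sensitivity_specificity(system_output, reference)
print(reference)   # [True, False, True, False, True]
print(sens, spec)  # 1.0 0.5
```

The same `sensitivity_specificity` function could be applied to an individual expert's labels against a reference built from the remaining experts, which would correspond to the third role, experts as comparison subjects.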
