Impact of Different Study Populations on Reader Behavior and Performance Metrics: Initial Results.

Authors

Gallas Brandon D, Pisano Etta, Cole Elodia, Myers Kyle

Affiliations

FDA/CDRH/OSEL/DIDSR, Silver Spring, MD.

Beth Israel Deaconess Medical Center, Boston, MA.

Publication information

Proc SPIE Int Soc Opt Eng. 2017;10136. doi: 10.1117/12.2255977. Epub 2017 Mar 10.

DOI:10.1117/12.2255977

PMID:28845078

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5568780/

Abstract

The FDA recently completed a study on design methodologies surrounding the Validation of Imaging Premarket Evaluation and Regulation, called VIPER. VIPER consisted of five large reader sub-studies to compare the impact of different study populations on reader behavior as seen by sensitivity, specificity, and AUC, the area under the receiver operating characteristic (ROC) curve. The study investigated different prevalence levels and two kinds of sampling of non-cancer patients: a screening population and a challenge population. The VIPER study compared full-field digital mammography (FFDM) to screen-film mammography (SFM) for women with heterogeneously dense or extremely dense breasts. All cases and corresponding images were sampled from Digital Mammographic Imaging Screening Trial (DMIST) archives. There were 20 readers (American Board Certified radiologists) for each sub-study. Instead of every reader reading every case (a fully-crossed study), readers and cases were split into groups to reduce reader workload and the total number of observations (a split-plot study). For data collection, readers first decided whether or not they would recall a patient. Following that decision, they provided an ROC score for how close to or far from the recall decision threshold that patient was. Performance results for FFDM show that as prevalence increases to 50%, there is a moderate increase in sensitivity and a decrease in specificity, whereas AUC is mainly flat. Regarding precision, the statistical efficiency (ratio of variances) of sensitivity and specificity relative to AUC is 0.66 at best and decreases with prevalence. Analyses comparing the modalities and the study populations (screening vs. challenge) are still ongoing.
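As an illustration of the three metrics named in the abstract, the following is a minimal sketch for a single reader, not the VIPER analysis itself. It assumes a hypothetical 0 to 100 ROC score scale on which a score above 50 corresponds to a recall decision; the case data, the scale, and the function names are invented for the example. Sensitivity and specificity come from the binary recall decisions, and AUC is the empirical (Mann-Whitney) estimate from the scores.

```python
from typing import Sequence


def sensitivity(recalled: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of cancer cases (truth == 1) that the reader recalled."""
    cancer = [r for r, t in zip(recalled, truth) if t == 1]
    return sum(cancer) / len(cancer)


def specificity(recalled: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of non-cancer cases (truth == 0) that the reader did not recall."""
    normal = [r for r, t in zip(recalled, truth) if t == 0]
    return sum(1 for r in normal if r == 0) / len(normal)


def empirical_auc(scores: Sequence[float], truth: Sequence[int]) -> float:
    """Mann-Whitney estimate of AUC: the probability that a random cancer
    case scores higher than a random non-cancer case (ties count 0.5)."""
    pos = [s for s, t in zip(scores, truth) if t == 1]
    neg = [s for s, t in zip(scores, truth) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 cancer cases followed by 4 non-cancer cases.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [90, 80, 60, 40, 10, 20, 45, 70]    # assumed 0-100 ROC score scale
recalled = [1 if s > 50 else 0 for s in scores]  # assumed threshold at 50

print("sensitivity:", sensitivity(recalled, truth))  # 0.75
print("specificity:", specificity(recalled, truth))  # 0.75
print("AUC:", empirical_auc(scores, truth))          # 0.8125
```

In this toy example, sensitivity and specificity depend on where the reader places the recall threshold, while the AUC depends only on how the scores rank cancer cases against non-cancer cases. That is one way to see why a shift in reader behavior with prevalence could move sensitivity and specificity while leaving AUC roughly flat.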
