Gallas Brandon D, Pisano Etta, Cole Elodia, Myers Kyle
FDA/CDRH/OSEL/DIDSR, Silver Spring, MD.
Beth Israel Deaconess Medical Center, Boston, MA.
Proc SPIE Int Soc Opt Eng. 2017;10136. doi: 10.1117/12.2255977. Epub 2017 Mar 10.
The FDA recently completed a study of design methodologies for the Validation of Imaging Premarket Evaluation and Regulation, called VIPER. VIPER consisted of five large reader sub-studies that compared the impact of different study populations on reader behavior, as measured by sensitivity, specificity, and AUC (the area under the receiver operating characteristic, or ROC, curve). The study investigated different prevalence levels and two ways of sampling non-cancer patients: a screening population and a challenge population. VIPER compared full-field digital mammography (FFDM) with screen-film mammography (SFM) for women with heterogeneously dense or extremely dense breasts. All cases and corresponding images were sampled from the Digital Mammographic Imaging Screening Trial (DMIST) archives. Each sub-study had 20 readers (American Board Certified radiologists). Rather than having every reader read every case (a fully-crossed study), readers and cases were split into groups to reduce reader workload and the total number of observations (a split-plot study). For data collection, readers first decided whether or not they would recall a patient. Following that decision, they provided an ROC score indicating how close to or far from the recall decision threshold that patient was. Performance results for FFDM show that as prevalence increases to 50%, sensitivity increases moderately and specificity decreases, whereas AUC remains mainly flat. Regarding precision, the statistical efficiency (ratio of variances) of sensitivity and specificity relative to AUC is 0.66 at best and decreases with prevalence. Analyses comparing the modalities and the study populations (screening vs. challenge) are still ongoing.
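As a rough illustration of how the three endpoints relate to the two-step data collection described above, the following Python sketch computes sensitivity and specificity from binary recall decisions and an empirical AUC from ROC scores. All data are made up for illustration. The 0-100 score scale, the function names, and the use of the trapezoidal (Mann-Whitney) AUC estimate are assumptions, not details taken from the VIPER study.

```python
# Hypothetical single-reader data; values are illustrative only.
# truth: 1 = cancer, 0 = non-cancer
# recalled: the reader's binary recall decision
# scores: the reader's ROC score (assumed 0-100 scale, higher = more suspicious)
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
recalled = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = [90, 75, 60, 40, 55, 45, 30, 20, 10, 5]


def sensitivity_specificity(truth, recalled):
    """Return (sensitivity, specificity) from binary recall decisions."""
    tp = sum(1 for t, r in zip(truth, recalled) if t and r)
    fn = sum(1 for t, r in zip(truth, recalled) if t and not r)
    tn = sum(1 for t, r in zip(truth, recalled) if not t and not r)
    fp = sum(1 for t, r in zip(truth, recalled) if not t and r)
    return tp / (tp + fn), tn / (tn + fp)


def empirical_auc(truth, scores):
    """Empirical AUC: the fraction of (cancer, non-cancer) pairs in which
    the cancer case has the higher score, counting ties as one half."""
    pos = [s for t, s in zip(truth, scores) if t]
    neg = [s for t, s in zip(truth, scores) if not t]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


sens, spec = sensitivity_specificity(truth, recalled)
auc = empirical_auc(truth, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
```

Sensitivity and specificity depend only on where the reader places the recall threshold, while AUC uses the full ordering of the scores. This is one reason the first two can shift with prevalence while AUC stays roughly flat.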