Estimation and comparison of CAD system performance in clinical settings.

Author information

Bornefalk Hans

Affiliation

Royal Institute of Technology, AlbaNova University Center, Department of Physics, SE-106 91 Stockholm, Sweden.

Publication information

Acad Radiol. 2005 Jun;12(6):687-94. doi: 10.1016/j.acra.2005.02.005.

Abstract

RATIONALE AND OBJECTIVES

Computer-aided detection (CAD) systems are frequently compared using free-response receiver operating characteristic (FROC) curves. While there are ample statistical methods for comparing FROC curves, when one is interested in comparing the outcomes of 2 CAD systems applied in a typical clinical setting, there is the additional matter of correctly determining the system operating point. This article shows how the effect of the sampling error on determining the correct CAD operating point can be captured. By incorporating this uncertainty, a method is presented that allows estimation of the probability with which a particular CAD system performs better than another on unseen data in a clinical setting.

MATERIALS AND METHODS

The distribution of possible clinical outcomes from 2 artificial CAD systems with different FROC curves is examined. The sampling error is captured by the distribution of possible system thresholds of the classifying machine that yields a specified sensitivity. After introducing a measure of superiority, the probability of one system being superior to the other can be determined.
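The idea in this paragraph can be illustrated with a small simulation. The following is only a sketch under strong assumptions and is not the author's implementation: lesion and false-positive candidate scores are assumed to be unit-variance Gaussians, the number of false-positive candidates per image (`cand_per_image`) is a made-up parameter, and the superiority measure (`sensitivity - fp_weight * false positives per image`) is a hypothetical stand-in, since the abstract does not define the measure actually used.

```python
import math
import numpy as np

# Survival function of the standard normal, P(Z > x), vectorized over arrays.
_norm_sf = np.vectorize(lambda x: 0.5 * math.erfc(x / math.sqrt(2.0)))

def clinical_outcomes(rng, lesion_mean, cand_per_image, n_cases=100,
                      target_sens=0.90, n_trials=5000):
    """Simulate the spread of clinical operating points (assumed Gaussian model).

    Each trial draws one training set of n_cases lesion scores, picks the
    threshold that reaches target_sens on that sample, and then evaluates
    the threshold against the true (population) score distributions.
    """
    train = rng.normal(lesion_mean, 1.0, size=(n_trials, n_cases))
    # Threshold such that a fraction target_sens of training lesions score above it.
    thresholds = np.quantile(train, 1.0 - target_sens, axis=1)
    sens = _norm_sf(thresholds - lesion_mean)     # true sensitivity in the clinic
    fppi = cand_per_image * _norm_sf(thresholds)  # true false positives per image
    return sens, fppi

def prob_superior(out_a, out_b, fp_weight=0.5):
    """Fraction of paired trials in which system A beats system B.

    The utility below is a hypothetical superiority measure chosen for
    illustration only.
    """
    util_a = out_a[0] - fp_weight * out_a[1]
    util_b = out_b[0] - fp_weight * out_b[1]
    return float(np.mean(util_a > util_b))

rng = np.random.default_rng(0)
a = clinical_outcomes(rng, lesion_mean=2.0, cand_per_image=2.0)
b = clinical_outcomes(rng, lesion_mean=1.5, cand_per_image=2.0)
p = prob_superior(a, b)
print(f"Estimated P(A superior to B) = {p:.3f}")
```

Because each trial re-estimates the threshold from a fresh training sample, the spread of `sens` and `fppi` across trials reflects the sampling error in the operating point. Varying `n_cases` or the gap between the two `lesion_mean` values shows how training set size and curve separation affect the estimated probability in this toy model.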

RESULTS

It is shown that for 2 typical mammography CAD systems, each trained on independent representative datasets of 100 cases, the FROC curves must be separated by 0.20 false positives per image in order to conclude that there is a 90% probability that one is better than the other in a clinical setting. Also, there is no apparent gain in increasing the size of the training set beyond 100 cases.

DISCUSSION

CAD systems for mammography are modeled for illustrative purposes, but the method presented is applicable to any computer-aided detection system evaluated with FROC curves. The presented method is designed to construct confidence intervals around possible clinical outcomes and to assess the importance of training set size and separation between FROC curves of systems trained on different datasets.
