Department of Electrical Engineering, Vanderbilt University, Nashville, TN 37235 USA.
IEEE Trans Med Imaging. 2012 Feb;31(2):512-22. doi: 10.1109/TMI.2011.2172215. Epub 2011 Oct 14.
Image labeling and parcellation (i.e., assigning structure to a collection of voxels) are critical tasks for the assessment of volumetric and morphometric features in medical imaging data. Image labeling is inherently error prone because images are corrupted by noise and artifacts. Even expert interpretations are subjective and limited by the precision of the individual raters. Hence, all labels must be considered imperfect, with some degree of inherent variability. One may seek multiple independent assessments both to reduce this variability and to quantify the degree of uncertainty. Existing techniques have exploited maximum a posteriori statistics to combine data from multiple raters and simultaneously estimate rater reliabilities. Although these techniques have been quite successful, their wide-scale application has been hampered by unstable estimation on practical datasets, for example, label sets containing small or thin objects, or partial or limited datasets. In addition, these approaches have required each rater to generate a complete dataset, which is often impossible given both human foibles and the typical turnover rate of raters in a research or clinical environment. Herein, we propose a robust approach that improves estimation performance with small anatomical structures, allows for missing data, accounts for repeated label sets, and utilizes training/catch trial data. With this approach, numerous raters can label small, overlapping portions of a large dataset, and rater heterogeneity can be robustly controlled while simultaneously estimating a single, reliable label set and characterizing uncertainty. The proposed approach enables many individuals to collaborate in the construction of large datasets for labeling tasks (e.g., human parallel processing) and reduces the otherwise detrimental impact of rater unavailability.
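To make the idea of combining labels from multiple raters while estimating rater reliabilities more concrete, the following is a minimal sketch of a generic expectation-maximization label fusion that tolerates missing data. It is not the authors' implementation. The function name `fuse_labels`, the `MISSING` marker, the uniform pseudo-count `pseudo` used to stabilize the reliability estimates, and the 0.9 initial self-agreement are all assumptions made for illustration.

```python
import math

MISSING = -1  # hypothetical marker: this rater did not label this voxel


def fuse_labels(obs, n_labels, n_iter=30, pseudo=1.0):
    """Illustrative EM label fusion (assumes n_labels >= 2).

    obs[j][i] is rater j's label at voxel i, or MISSING.
    Returns (fused labels, per-voxel label posteriors, per-rater confusion matrices).
    """
    n_raters, n_voxels = len(obs), len(obs[0])

    # Label prior from pooled observed frequencies (add-one smoothed).
    counts = [1.0] * n_labels
    for row in obs:
        for s in row:
            if s != MISSING:
                counts[s] += 1.0
    total = sum(counts)
    prior = [c / total for c in counts]

    # theta[j][s_obs][s_true] = P(rater j reports s_obs | truth is s_true).
    # Assumed initialization: every rater is mostly correct.
    off = 0.1 / (n_labels - 1)
    theta = [[[0.9 if a == b else off for b in range(n_labels)]
              for a in range(n_labels)] for _ in range(n_raters)]

    w = []
    for _ in range(n_iter):
        # E-step: posterior over the true label at each voxel,
        # using only the raters who actually labeled that voxel.
        w = []
        for i in range(n_voxels):
            logp = [math.log(p) for p in prior]
            for j in range(n_raters):
                s = obs[j][i]
                if s != MISSING:
                    for t in range(n_labels):
                        logp[t] += math.log(theta[j][s][t])
            m = max(logp)
            p = [math.exp(v - m) for v in logp]
            z = sum(p)
            w.append([v / z for v in p])

        # M-step: re-estimate each rater's confusion matrix from the voxels
        # that rater labeled; the pseudo-count keeps every entry positive.
        for j in range(n_raters):
            conf = [[pseudo / n_labels] * n_labels for _ in range(n_labels)]
            for i in range(n_voxels):
                s = obs[j][i]
                if s != MISSING:
                    for t in range(n_labels):
                        conf[s][t] += w[i][t]
            for t in range(n_labels):
                col = sum(conf[a][t] for a in range(n_labels))
                for a in range(n_labels):
                    theta[j][a][t] = conf[a][t] / col

    fused = [wi.index(max(wi)) for wi in w]
    return fused, w, theta
```

In this sketch, a voxel that no rater labeled simply falls back to the pooled label prior, and the per-voxel posteriors in `w` give one simple way to express uncertainty in the fused label set. Training or catch-trial data, which the abstract mentions, could plausibly enter through the pseudo-counts, but that design choice is an assumption here rather than something the abstract specifies.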