Shieh Alexander, Mathai Tejas Sudharshan, Liu Jianfei, Paul Angshuman, Summers Ronald M
Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Bethesda MD, USA.
Indian Institute of Technology, Jodhpur, Rajasthan, India.
ArXiv. 2025 Apr 7:arXiv:2504.05207v1.
Universal lesion detection and tagging (ULDT) in CT studies is critical for assessing tumor burden and for tracking changes in lesion status (growth or shrinkage) over time. However, a lack of fully annotated data hinders the development of effective ULDT approaches. Prior work used the DeepLesion dataset (4,427 patients, 10,594 studies, 32,120 CT slices, 32,735 lesions, 8 body part labels) for algorithm development, but this dataset is incompletely annotated and class-imbalanced. To address these issues, we developed a self-training pipeline for ULDT. A VFNet model was trained on a limited 11.5% subset of DeepLesion (bounding boxes + tags) to detect and classify lesions in CT studies. The model then identified novel lesion candidates in a larger unseen data subset, incorporated them into its training set, and was retrained over multiple self-training rounds. We ran multiple self-training experiments with different threshold policies to select higher-quality predicted lesions and to counter the class imbalance. Direct self-training improved the sensitivities of over-represented lesion classes at the expense of under-represented classes. However, upsampling the lesions mined during self-training, combined with a variable threshold policy, raised sensitivity at 4 false positives (FP) by 6.5 percentage points over self-training without class balancing (72% vs. 78.5%) and by 11.7 percentage points over the same self-training policy without upsampling (66.8% vs. 78.5%). Furthermore, our approach either improved or maintained the sensitivity at 4 FP for all 8 lesion classes.
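The mining step described above, in which predicted lesions are filtered by a threshold that varies by class and the retained lesions are then upsampled, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how the thresholds vary or how upsampling was performed, so the per-class threshold values, the class names used in the demo, the balance-to-largest-class rule, and the function names `select_pseudo_labels` and `upsample_to_balance` are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def select_pseudo_labels(predictions, class_thresholds, default_threshold=0.9):
    """Keep predicted lesions whose confidence clears their class threshold.

    Assumption: thresholds are fixed per class; the abstract does not
    specify the actual variable threshold policy.
    """
    return [
        p for p in predictions
        if p["score"] >= class_thresholds.get(p["label"], default_threshold)
    ]


def upsample_to_balance(lesions, seed=0):
    """Duplicate lesions of rarer classes until every class matches the largest.

    Assumption: balancing by sampling with replacement; the paper's
    upsampling scheme may differ.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for lesion in lesions:
        by_class[lesion["label"]].append(lesion)
    if not by_class:
        return []
    target = max(len(items) for items in by_class.values())
    balanced = []
    for items in by_class.values():
        balanced.extend(items)
        # Draw extra copies only for classes below the target count.
        balanced.extend(rng.choices(items, k=target - len(items)))
    return balanced


# Hypothetical detector outputs on unseen slices (labels and scores invented).
predictions = [
    {"label": "lung", "score": 0.95},
    {"label": "lung", "score": 0.85},
    {"label": "lung", "score": 0.92},
    {"label": "bone", "score": 0.70},
    {"label": "bone", "score": 0.40},
    {"label": "kidney", "score": 0.65},
]
# Hypothetical policy: a stricter threshold for the over-represented class.
thresholds = {"lung": 0.9, "bone": 0.6, "kidney": 0.6}

mined = select_pseudo_labels(predictions, thresholds)
balanced = upsample_to_balance(mined)
print(Counter(lesion["label"] for lesion in balanced))
```

In a full self-training loop, the balanced set would be merged into the training data and the detector retrained before the next round of mining. That loop is omitted here because it depends on the detection framework in use.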