A quantitative analysis of the improvement provided by comprehensive annotation on CT lesion detection using deep learning.

Affiliations

Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, New York, USA.

Department of Radiology, Columbia University Irving Medical Center, New York, New York, USA.

Publication information

J Appl Clin Med Phys. 2024 Sep;25(9):e14434. doi: 10.1002/acm2.14434. Epub 2024 Jul 30.

Abstract

BACKGROUND

Data collected from hospitals are usually only partially annotated by radiologists because of time constraints. Developing and evaluating deep learning models on these data may lead to overestimation or underestimation of model performance.

PURPOSE

We aimed to quantitatively investigate how the percentage of annotated lesions in CT images influences the performance of universal lesion detection (ULD) algorithms.

METHODS

We trained a multi-view feature pyramid network with position-aware attention (MVP-Net) to perform ULD. Three versions of the DeepLesion dataset were created for training MVP-Net. The Original DeepLesion Dataset (OriginalDL) is the publicly available, widely studied DeepLesion dataset, which includes 32 735 lesions in 4427 patients that were partially labeled during routine clinical practice. The Enriched DeepLesion Dataset (EnrichedDL) is an enhanced dataset in which 4145 patients, with 34 317 lesions, are fully labeled at one or more time points. UnionDL is the union of OriginalDL and EnrichedDL, with 54 510 labeled lesions in 4427 patients. Each dataset was used separately to train MVP-Net, resulting in the following models: OriginalCNN (replicating the original result), EnrichedCNN (testing the effect of increased annotation), and UnionCNN (featuring the greatest number of annotations).
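The abstract does not say how the two annotation sets were combined into UnionDL. As a rough illustration only, one plausible way to take the union of two sets of lesion bounding boxes on the same slice is to keep every original box and add an enriched box only when it does not overlap an already kept box. The box format, the function names `iou` and `union_annotations`, and the 0.5 IoU threshold are all assumptions made for this sketch, not details from the paper.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def union_annotations(original: List[Box], enriched: List[Box],
                      iou_threshold: float = 0.5) -> List[Box]:
    """Merge two annotation lists, treating heavily overlapping boxes as one lesion.

    The threshold is an assumed value chosen for illustration.
    """
    merged = list(original)
    for box in enriched:
        # Add the box only if it does not duplicate a box that is already kept.
        if all(iou(box, kept) < iou_threshold for kept in merged):
            merged.append(box)
    return merged


# Toy slice: one lesion labeled in both sets, one labeled only in the enriched set.
original_boxes = [(10.0, 10.0, 50.0, 50.0)]
enriched_boxes = [(12.0, 11.0, 51.0, 49.0), (100.0, 100.0, 140.0, 150.0)]
merged_boxes = union_annotations(original_boxes, enriched_boxes)
```

In this toy case the first enriched box overlaps the original box strongly and is treated as the same lesion, so the merged list holds two boxes rather than three.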

RESULTS

Although the reported mean sensitivity of OriginalCNN was 84.3% on the OriginalDL testing set, performance fell sharply on the EnrichedDL testing set, where OriginalCNN, EnrichedCNN, and UnionCNN achieved mean sensitivities of 56.1%, 66.0%, and 67.8%, respectively. We also found that increasing the percentage of annotated lesions in the training set increased sensitivity, but the margin of improvement gradually diminished, following a power law.
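To make the diminishing-returns claim concrete, the sketch below fits a power law of the assumed form s(p) = a * p^b to sensitivity s as a function of the annotated fraction p, using least squares on log-transformed values. The abstract states only that the trend follows a power law, so the exact functional form, the helper name `fit_power_law`, and every data point here are assumptions; the points are synthetic and are not results from the paper.

```python
import math
from typing import List, Tuple


def fit_power_law(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Fit y = a * x**b by ordinary least squares on (log x, log y)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                       # slope in log-log space is the exponent
    a = math.exp(mean_y - b * mean_x)   # intercept in log-log space is log(a)
    return a, b


# Synthetic data generated from made-up parameters, for illustration only.
a_true, b_true = 0.66, 0.25
fractions = [0.2, 0.4, 0.6, 0.8, 1.0]          # fraction of lesions annotated
sensitivities = [a_true * p ** b_true for p in fractions]

a_hat, b_hat = fit_power_law(fractions, sensitivities)

# Gain in sensitivity for each additional 20% of annotations.
gains = [sensitivities[i + 1] - sensitivities[i]
         for i in range(len(sensitivities) - 1)]
```

With an exponent between 0 and 1, each equal step in annotated fraction yields a smaller gain than the previous one, which is the pattern of diminishing margins the results describe.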

CONCLUSIONS

We expanded and improved the existing DeepLesion dataset by annotating an additional 21 775 lesions, and we demonstrated that using fully labeled CT images avoided overestimating MVP-Net's performance while increasing the algorithm's sensitivity, which may have a substantial impact on future CT lesion detection research. The annotated lesions are available at https://github.com/ComputationalImageAnalysisLab/DeepLesionData.
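The lesion counts quoted in the abstract can be cross-checked with simple arithmetic. The short sketch below confirms that UnionDL exceeds OriginalDL by exactly the number of newly annotated lesions. The `shared` value is an inference that assumes UnionDL is a plain set union of the two datasets; it is not a figure stated in the abstract.

```python
# Lesion counts as stated in the abstract.
original_dl = 32_735       # OriginalDL
enriched_dl = 34_317       # EnrichedDL
union_dl = 54_510          # UnionDL
newly_annotated = 21_775   # additional lesions annotated by the authors

# Lesions that UnionDL contains beyond OriginalDL.
added_by_union = union_dl - original_dl

# Inferred overlap between OriginalDL and EnrichedDL, assuming a plain set union
# (inclusion-exclusion); this number is derived here, not reported by the authors.
shared = original_dl + enriched_dl - union_dl
```

Under that assumption, `added_by_union` equals the 21 775 newly annotated lesions and `shared` comes to 12 542 lesions present in both OriginalDL and EnrichedDL.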
