Learning From Multiple Datasets With Heterogeneous and Partial Labels for Universal Lesion Detection in CT.

Publication information

IEEE Trans Med Imaging. 2021 Oct;40(10):2759-2770. doi: 10.1109/TMI.2020.3047598. Epub 2021 Sep 30.

Abstract

Large-scale datasets with high-quality labels are desired for training accurate deep learning models. However, due to the annotation cost, datasets in medical imaging are often either partially labeled or small. For example, DeepLesion is a large-scale CT image dataset with lesions of various types, but it also has many unlabeled lesions (missing annotations). When training a lesion detector on a partially labeled dataset, the missing annotations will generate incorrect negative signals and degrade the performance. Besides DeepLesion, there are several small single-type datasets, such as LUNA for lung nodules and LiTS for liver tumors. These datasets have heterogeneous label scopes, i.e., different lesion types are labeled in different datasets with other types ignored. In this work, we aim to develop a universal lesion detection algorithm that detects a variety of lesions while tackling the problem of heterogeneous and partial labels. First, we build a simple yet effective lesion detection framework named Lesion ENSemble (LENS). LENS can efficiently learn from multiple heterogeneous lesion datasets in a multi-task fashion and leverage their synergy by proposal fusion. Next, we propose strategies to mine missing annotations from partially labeled datasets by exploiting clinical prior knowledge and cross-dataset knowledge transfer. Finally, we train our framework on four public lesion datasets and evaluate it on 800 manually labeled sub-volumes in DeepLesion. Our method brings a relative improvement of 49% compared to the current state-of-the-art approach in the metric of average sensitivity. We have publicly released our manual 3D annotations of DeepLesion online at https://github.com/viggin/DeepLesion_manual_test_set.
