Li Ping, Kong Xiangwen, Li Johann, Zhu Guangming, Lu Xiaoyuan, Shen Peiyi, Shah Syed Afaq Ali, Bennamoun Mohammed, Hua Tao
Shanghai BNC, Shanghai, China.
Embedded Technology & Vision Processing Research Center, Xidian University, Xi'an, China.
Front Digit Health. 2021 Feb 17;2:609349. doi: 10.3389/fdgth.2020.609349. eCollection 2020.
Lung cancer is a life-threatening disease, and its diagnosis is of great significance. Data scarcity and the unavailability of datasets are a major bottleneck in lung cancer research. In this paper, we introduce a dataset of pulmonary lesions for designing computer-aided diagnosis (CAD) systems. The dataset has fine contour annotations and nine attribute annotations. We define the structure of the dataset in detail, then discuss the relationship between the attributes and pathology, and use the chi-square test to examine the correlations among the nine attributes. To demonstrate the contribution of our dataset to CAD system design, we define four tasks that can be developed using it. We then use the dataset to model multi-attribute classification tasks and discuss the performance of the classification model in 2D, 2.5D, and 3D input modes. To improve performance, we introduce two attention mechanisms and verify their principles through visualization. The experimental results show how different models relate to attributes at different levels.
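The chi-square test mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the attribute names `attr_a` and `attr_b`, the toy labels, and the helper functions `chi_square_independence` and `p_value_dof1` are all hypothetical, and the example assumes two binary attributes annotated per lesion. It computes the Pearson chi-square statistic from the contingency table of two label lists, and for the 2x2 case (one degree of freedom) converts it to a p-value.

```python
import math
from collections import Counter


def chi_square_independence(xs, ys):
    """Return (statistic, degrees of freedom) of the Pearson chi-square
    test of independence for two equally long lists of categorical labels."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed cell counts
    rows = Counter(xs)            # marginal counts of the first attribute
    cols = Counter(ys)            # marginal counts of the second attribute
    stat = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n  # count expected under independence
            observed = joint.get((r, c), 0)
            stat += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(cols) - 1)
    return stat, dof


def p_value_dof1(stat):
    """Upper-tail p-value of a chi-square statistic with exactly one
    degree of freedom (valid for a 2x2 table only)."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical toy annotations for 80 lesions (1 = attribute present).
attr_a = [1] * 40 + [0] * 40
attr_b = [1] * 30 + [0] * 10 + [1] * 10 + [0] * 30

stat, dof = chi_square_independence(attr_a, attr_b)
print(f"chi2 = {stat:.2f}, dof = {dof}, p = {p_value_dof1(stat):.2e}")
```

With nine attributes, the same function could be applied to each of the 36 attribute pairs. Attributes with more than two levels give more than one degree of freedom, in which case the p-value would need the general chi-square distribution (for example from a statistics library) rather than `p_value_dof1`.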