Cao Weiwei, Guo Jianfeng, You Xiaohui, Liu Yuxin, Li Lei, Cui Wenju, Cao Yuzhu, Chen Xinjian, Zheng Jian
IEEE J Biomed Health Inform. 2024 Aug;28(8):4761-4771. doi: 10.1109/JBHI.2024.3400802. Epub 2024 Aug 6.
Breast lesion segmentation from ultrasound images is essential in computer-aided breast cancer diagnosis. To alleviate the problems of blurry lesion boundaries and irregular morphologies, common practice combines CNNs and attention to integrate global and local information. However, previous methods use two independent modules to extract global and local features separately; this feature-wise, inflexible integration ignores the semantic gap between them, resulting in representation redundancy or insufficiency and undesirable restrictions in clinical practice. Moreover, medical images are highly similar to one another owing to the imaging protocols and human anatomy, yet the global information captured by transformer-based methods in the medical domain is confined to individual images; semantic relations and common knowledge across images are largely ignored. To alleviate these problems, this paper takes a neighbor-centric view and develops a pixel neighbor representation learning method (NeighborNet) that flexibly integrates global and local context both within and across images for modeling lesion morphology and boundaries. Concretely, we design two neighbor layers to investigate two properties of neighbors: their number and their distribution. The number of neighbors for each pixel is not fixed but determined by the pixel itself, and the neighbor distribution is extended from a single image to all images in the dataset. With these two properties, at each feature level and for each pixel, NeighborNet can evolve into a transformer or degenerate into a CNN for adaptive context representation learning, coping with irregular lesion morphologies and blurry boundaries. State-of-the-art performance on three ultrasound datasets demonstrates the effectiveness of the proposed NeighborNet.
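The abstract's core mechanism, a per-pixel neighbor set whose size is decided by the pixel itself and whose candidates may come from many images, can be illustrated with a minimal sketch. This is not the paper's actual implementation: the function name, the soft threshold `tau`, and the use of a cross-image feature bank are illustrative assumptions chosen only to show how an attention map can yield a variable neighbor count per pixel (many neighbors behaves transformer-like; few neighbors behaves CNN-like).

```python
import numpy as np

def adaptive_neighbor_attention(query, bank, tau=0.02):
    """Hypothetical sketch: each query pixel keeps only the neighbors whose
    softmax attention weight exceeds tau, so the neighbor count varies per
    pixel. `bank` may hold pixel features pooled from several images,
    mimicking the cross-image neighbor distribution described in the paper.
    """
    # query: (N, C) pixel features; bank: (M, C) candidate neighbor features
    scores = query @ bank.T / np.sqrt(query.shape[1])      # (N, M) similarity
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)          # softmax over bank
    mask = weights > tau                                   # adaptive neighbor set
    weights = np.where(mask, weights, 0.0)
    weights /= weights.sum(axis=1, keepdims=True) + 1e-12  # renormalize kept weights
    # Aggregate kept neighbors; also return per-pixel neighbor counts.
    return weights @ bank, mask.sum(axis=1)

rng = np.random.default_rng(0)
query = rng.normal(size=(4, 8))      # 4 pixels, 8-dim features
bank = rng.normal(size=(32, 8))      # candidates drawn from multiple images
out, counts = adaptive_neighbor_attention(query, bank)
```

Because the maximum softmax weight over 32 candidates is always at least 1/32, every pixel keeps at least one neighbor under this threshold; pixels with a peaked attention map keep few neighbors (local, CNN-like context), while pixels with a flat map keep many (global, transformer-like context).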