IEEE Trans Med Imaging. 2020 Jun;39(6):2246-2255. doi: 10.1109/TMI.2020.2968397. Epub 2020 Jan 21.
Breast cancer is one of the most frequently diagnosed solid cancers, and mammography is the most commonly used screening technology for detecting it. Traditional machine learning methods for mammographic image classification or segmentation rely on hand-crafted features and require a large quantity of manually segmented annotation data to train the model and test the results. Manual labeling is expensive, time-consuming, and laborious, and it greatly increases the cost of system construction. To reduce this cost and the workload of radiologists, we propose an end-to-end full-image mammogram classification method based on deep neural networks, in which the classifier can be built without bounding boxes or mask ground-truth labels for the training data. The only label the method requires is the classification of each mammographic image, which is relatively easy to collect from diagnostic reports. Because breast lesions usually occupy only a small fraction of the total area visualized in a mammographic image, we propose pooling structures for convolutional neural networks (CNNs) that replace the common pooling methods: they divide the image into regions and select the few regions with a high probability of malignancy as the representation of the whole mammographic image. The proposed pooling structures can be applied to most CNN-based models and may greatly improve the models' performance on mammographic image data with the same input. Experimental results on the publicly available INbreast dataset and CBIS dataset indicate that the proposed pooling structures perform satisfactorily on mammographic image data compared with previous state-of-the-art mammographic image classifiers and detection algorithms that use segmentation annotations.
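The region-selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact pooling structures, so the version below is an assumption: it takes a grid of per-region malignancy probabilities (as a CNN's final layer might produce) and averages the k highest ones to get an image-level score. The function name `topk_region_pool`, the parameter `k`, and the example scores are all hypothetical and do not come from the paper.

```python
def topk_region_pool(region_probs, k=3):
    """Average the k highest per-region malignancy probabilities.

    region_probs: 2D list (rows of floats), one probability per image region.
    This is an illustrative top-k pooling, not the paper's exact structure.
    """
    # Flatten the grid of regions into a single list of probabilities.
    flat = [float(p) for row in region_probs for p in row]
    if not 1 <= k <= len(flat):
        raise ValueError("k must be between 1 and the number of regions")
    # Keep only the k most suspicious regions and average them.
    top_k = sorted(flat, reverse=True)[:k]
    return sum(top_k) / k


# Hypothetical 4x4 grid: a small lesion covers only three of sixteen regions.
scores = [
    [0.02, 0.05, 0.01, 0.03],
    [0.04, 0.90, 0.80, 0.02],
    [0.01, 0.70, 0.06, 0.05],
    [0.03, 0.02, 0.04, 0.01],
]

image_score = topk_region_pool(scores, k=3)  # mean of 0.90, 0.80, 0.70
global_avg = sum(p for row in scores for p in row) / 16  # plain average pooling
```

In this toy example the top-k score stays high while the global average is diluted by the many normal regions, which is the motivation the abstract gives for replacing common pooling when lesions occupy a small fraction of the image.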