Department of Pathology and Laboratory Medicine, University of California Davis Health, Sacramento, CA, USA.
Division of Biostatistics, University of California Davis, Sacramento, CA, USA.
Histopathology. 2019 Jul;75(1):39-53. doi: 10.1111/his.13844. Epub 2019 May 16.
Machine learning (ML) binary classification in diagnostic histopathology is an area of intense investigation. Several assumptions, including the required quality and format of training images and the number of training images needed, are shared across many studies despite a paucity of supporting evidence. We empirically compared training image file type, training set size, and two common convolutional neural networks (CNNs), ResNet50 and SqueezeNet, both used with transfer learning.
Thirty haematoxylin and eosin (H&E)-stained slides with carcinoma or normal tissue from three tissue types (breast, colon, and prostate) were photographed, generating 3000 partially overlapping images (1000 per tissue type). These lossless Portable Network Graphics (PNG) images were converted to lossy Joint Photographic Experts Group (JPG) images. Tissue type-specific binary classification ML models were developed using all PNG or JPG images, and the process was repeated with subsets of 500, 200, 100, 50, 30 and 10 images. Eleven models were generated for each tissue type, at each quantity of training images, for each file type, and for each CNN, resulting in 924 models. Internal accuracies and generalisation accuracies were compared. There was no meaningful or significant difference in accuracy between PNG and JPG models. Models trained with more images did not invariably perform better. ResNet50 typically outperformed SqueezeNet. Models were generalisable within a tissue type but not across tissue types.
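The total of 924 models follows from the factorial design described above. As a minimal sketch, the experimental grid can be enumerated as below. The variable names are illustrative and not taken from the study, and the "all images" condition is assumed here to mean 1000 images per tissue type, which gives seven training set sizes in total.

```python
from itertools import product

# Factors of the experimental design, as described in the abstract.
tissue_types = ["breast", "colon", "prostate"]
# Assumption: "all images" corresponds to 1000 per tissue type,
# followed by the six stated subsets.
training_set_sizes = [1000, 500, 200, 100, 50, 30, 10]
file_types = ["PNG", "JPG"]
cnns = ["ResNet50", "SqueezeNet"]
REPLICATES = 11  # models generated per configuration

# Every combination of tissue type, training set size, file type and CNN.
configurations = list(product(tissue_types, training_set_sizes, file_types, cnns))
total_models = len(configurations) * REPLICATES

print(f"{len(configurations)} configurations x {REPLICATES} replicates "
      f"= {total_models} models")
```

That is 3 × 7 × 2 × 2 = 84 configurations, each repeated 11 times, which matches the 924 models reported.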
Lossy JPG images were not inferior to lossless PNG images in our models. Large numbers of unique H&E-stained slides were not required for training optimal ML models. This reinforces the need for an evidence-based approach to best practices for histopathological ML.