Department of Research & Development, VitaDX, Paris, France.
Cytometry A. 2019 Nov;95(11):1198-1206. doi: 10.1002/cyto.a.23899. Epub 2019 Oct 8.
Building automated cancer screening systems based on image analysis is currently a hot topic in the computer vision and medical imaging communities. One of the biggest challenges for such systems, especially those using state-of-the-art deep learning techniques, is that they usually require a large amount of training data to be accurate. However, in the medical field, the confidentiality of the data and the need for medical expertise to label them significantly reduce the amount of training data available. A common practice to overcome this problem is to apply data set augmentation techniques to artificially increase the size of the training data set. Classical data set augmentation methods such as geometrical or color transformations are efficient but still produce a limited amount of new data. Hence, there has been interest in data set augmentation methods using generative models able to synthesize a wider variety of new data. VitaDX is currently developing an automated bladder cancer screening system based on the analysis of cell images contained in urinary cytology digital slides. At present, the number of available labeled cell images is limited, and therefore the full potential of deep learning techniques cannot be exploited. In an attempt to increase the number of labeled cell images, a new generic generator for 2D cell images has been developed and is described in this article. This framework combines previous work on cell image generation with a recent style transfer method, referred to in this article as doodle-style transfer. To the best of our knowledge, we are the first to use a doodle-style transfer method for synthetic cell image generation. The framework is modular and could be applied to other cell image generation problems. A statistical evaluation has shown that features of real and synthetic cell images follow roughly the same distribution. Finally, the realism of the synthetic cell images has been assessed through a visual evaluation performed with the help of medical experts.
© 2019 The Authors. Cytometry Part A published by Wiley Periodicals, Inc. on behalf of International Society for Advancement of Cytometry.