Zaki George, Gudla Prabhakar R, Lee Kyunghun, Kim Justin, Ozbun Laurent, Shachar Sigal, Gadkari Manasi, Sun Jing, Fraser Iain D C, Franco Luis M, Misteli Tom, Pegoraro Gianluca
Biomedical Informatics and Data Science Directorate, Frederick National Laboratory for Cancer Research (FNLCR), Frederick, Maryland, USA.
High-Throughput Imaging Facility (HiTIF), Center for Cancer Research (CCR), NCI/NIH, Bethesda, Maryland, USA.
Cytometry A. 2020 Dec;97(12):1248-1264. doi: 10.1002/cyto.a.24257. Epub 2020 Nov 19.
Deep learning is rapidly becoming the technique of choice for automated segmentation of nuclei in biological image analysis workflows. To evaluate the feasibility of training nuclear segmentation models on small, custom-annotated image datasets that have been augmented, we designed a computational pipeline to systematically compare different nuclear segmentation model architectures and model training strategies. Using this approach, we demonstrate that transfer learning and tuning of training parameters, such as the composition, size, and preprocessing of the training image dataset, can lead to robust nuclear segmentation models that match, and often exceed, the performance of existing off-the-shelf deep learning models pretrained on large image datasets. We envision a practical scenario in which deep learning nuclear segmentation models trained in this way can be shared across a laboratory, facility, or institution and continuously improved by training them on progressively larger and more varied image datasets. Our work provides computational tools and a practical framework for deep learning-based biological image segmentation using small annotated image datasets. Published [2020]. This article is a U.S. Government work and is in the public domain in the USA.
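The abstract mentions augmenting small annotated datasets but does not say which transformations the pipeline uses. As a purely illustrative sketch, assuming simple geometric augmentation (90-degree rotations and horizontal flips), the following shows how an image and its segmentation mask can be transformed together so that the annotation stays aligned with the pixels. The function name `augment_pair` and the toy data are hypothetical and are not taken from the published code.

```python
import numpy as np


def augment_pair(image, mask):
    """Return the eight rotation/flip variants of an (image, mask) pair.

    Assumption: geometric augmentation only. The same transform is applied
    to the image and to its mask so the labels remain aligned.
    """
    pairs = []
    for k in range(4):  # rotations by 0, 90, 180, and 270 degrees
        img_r = np.rot90(image, k)
        msk_r = np.rot90(mask, k)
        pairs.append((img_r, msk_r))
        # Horizontally flipped copy of the rotated pair
        pairs.append((np.fliplr(img_r), np.fliplr(msk_r)))
    return pairs


# Toy data standing in for one annotated field of view (hypothetical values)
rng = np.random.default_rng(0)
image = rng.random((64, 64))
mask = np.zeros((64, 64), dtype=np.int32)
mask[10:20, 30:45] = 1  # one rectangular "nucleus" of 150 pixels

augmented = augment_pair(image, mask)
```

Each annotated image yields eight training pairs in this sketch. Intensity-based or elastic transformations could be added in the same way, provided that only the geometric part is applied to the mask.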