Englbrecht Fabian, Ruider Iris E, Bausch Andreas R
Lehrstuhl für Biophysik (E27), Technische Universität München (TUM), Garching, Germany.
Center for Protein Assemblies (CPA), Garching, Germany.
PLoS One. 2021 Apr 16;16(4):e0250093. doi: 10.1371/journal.pone.0250093. eCollection 2021.
Dataset annotation is a time- and labor-intensive task and an integral requirement for training and testing deep learning models. Segmenting images in life science microscopy requires annotated image datasets for object detection tasks such as instance segmentation. Although methods such as data augmentation have steadily reduced the amount of annotated image data required, manual or semi-automated data annotation remains the most labor- and cost-intensive step in cell nuclei segmentation with deep neural networks. In this work we propose a system that fully automates the annotation of a custom fluorescent cell nuclei image dataset, reducing nuclei labelling time by up to 99.5%. The output of our system provides high-quality training data for machine learning applications that identify the position of cell nuclei in microscopy images. Our experiments show that the automatically annotated dataset yields segmentation performance equivalent to that obtained with manual data annotation. In addition, we show that our system enables a single workflow from raw data input to the desired nuclei segmentation and tracking results without relying on pre-trained models or third-party training datasets for neural networks.