Institute of Clinical Neurobiology, University Hospital Würzburg, Würzburg, Germany.
Department of Business and Economics, University of Würzburg, Würzburg, Germany.
Elife. 2020 Oct 19;9:e59780. doi: 10.7554/eLife.59780.
Bioimage analysis of fluorescent labels is widely used in the life sciences. Recent advances in deep learning (DL) allow automating time-consuming manual image analysis processes based on annotated training data. However, manual annotation of fluorescent features with a low signal-to-noise ratio is somewhat subjective. Training DL models on subjective annotations may be unstable or yield biased models. In turn, these models may be unable to reliably detect biological effects. An analysis pipeline integrating data annotation, ground truth estimation, and model training can mitigate this risk. To evaluate this integrated process, we compared different DL-based analysis approaches. With data from two model organisms (mice, zebrafish) and five laboratories, we show that ground truth estimation from multiple human annotators helps to establish objectivity in fluorescent feature annotations. Furthermore, ensembles of multiple models trained on the estimated ground truth establish reliability and validity. Our research provides guidelines for reproducible DL-based bioimage analyses.
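The two ideas named in the abstract, estimating a ground truth from several annotators and combining several trained models, can be illustrated with a minimal Python sketch. It assumes binary segmentation masks and uses a pixel-wise majority vote as a simple stand-in for ground truth estimation; the abstract does not specify the estimation algorithm, so this may differ from the authors' method. The function names `majority_vote` and `ensemble_predict`, the 0.5 threshold, and the toy arrays are hypothetical and chosen only for illustration.

```python
import numpy as np


def majority_vote(masks):
    """Pixel-wise majority vote over binary masks from several annotators.

    Assumption: a simple stand-in for ground truth estimation, not
    necessarily the algorithm used in the paper.
    """
    stack = np.stack([np.asarray(m, dtype=bool) for m in masks])
    # Foreground where strictly more than half of the annotators agree.
    return stack.sum(axis=0) * 2 > stack.shape[0]


def ensemble_predict(prob_maps, threshold=0.5):
    """Average per-model foreground probabilities, then threshold.

    Assumption: each model outputs a probability map of the same shape.
    """
    mean_prob = np.mean(np.stack(prob_maps), axis=0)
    return mean_prob >= threshold


# Toy annotations from three hypothetical annotators (1 = feature present).
annotations = [
    np.array([[1, 1, 0], [0, 1, 0]]),
    np.array([[1, 0, 0], [0, 1, 1]]),
    np.array([[1, 1, 0], [0, 0, 0]]),
]
consensus = majority_vote(annotations)

# Toy probability maps from three hypothetical models.
predictions = [
    np.array([[0.9, 0.4]]),
    np.array([[0.7, 0.2]]),
    np.array([[0.8, 0.6]]),
]
ensemble_mask = ensemble_predict(predictions)
```

In this sketch a pixel enters the consensus mask only when at least two of the three annotators marked it, so a single annotator's idiosyncratic choice does not reach the training target. Averaging model outputs plays the analogous role on the prediction side.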