Bruch Roman, Vitacolonna Mario, Nürnberg Elina, Sauer Simeon, Rudolf Rüdiger, Reischl Markus
Institute for Automation and Applied Informatics, Karlsruhe Institute of Technology, Eggenstein-Leopoldshafen, Germany.
Institute of Molecular and Cell Biology, Mannheim University of Applied Sciences, Mannheim, Germany.
Commun Biol. 2025 Jan 11;8(1):43. doi: 10.1038/s42003-025-07469-2.
Biomedical research increasingly relies on three-dimensional (3D) cell culture models, and artificial-intelligence-based analysis can potentially facilitate detailed and accurate feature extraction at the single-cell level. However, this requires precise segmentation of 3D cell datasets, which in turn demands high-quality ground truth for training. Manual annotation, the gold standard for ground truth data, is too time-consuming and thus not feasible for generating large 3D training datasets. To address this, we present a framework for generating 3D training data that integrates biophysical modeling for realistic cell shape and alignment. Our approach allows the in silico generation of coherent membrane and nuclei signals that enable the training of segmentation models utilizing both channels for improved performance. Furthermore, we present a generative adversarial network (GAN) training scheme that generates not only image data but also matching labels. Quantitative evaluation shows superior performance of biophysically motivated synthetic training data, even outperforming manual annotation and pretrained models. This underscores the potential of incorporating biophysical modeling to enhance synthetic training data quality.