IEEE Trans Neural Netw Learn Syst. 2023 May;34(5):2528-2538. doi: 10.1109/TNNLS.2021.3106831. Epub 2023 May 2.
Predictive modeling is useful but very challenging in biological image analysis due to the high cost of obtaining and labeling training data. For example, in the study of gene interaction and regulation in Drosophila embryogenesis, the analysis is most biologically meaningful when in situ hybridization (ISH) gene expression pattern images from the same developmental stage are compared. However, labeling training data with precise stages is very time-consuming, even for developmental biologists. Thus, a critical challenge is how to build accurate computational models for precise developmental stage classification from limited training samples. In addition, identification and visualization of developmental landmarks are required to enable biologists to interpret prediction results and calibrate models. To address these challenges, we propose a deep two-step low-shot learning framework that accurately classifies ISH images using limited training images. Specifically, to enable accurate model training on limited training samples, we formulate the task as a deep low-shot learning problem and develop a novel two-step learning approach consisting of data-level learning and feature-level learning. We use a deep residual network as our base model and achieve improved performance on the precise stage prediction task for ISH images. Furthermore, the deep model can be interpreted by computing saliency maps, which consist of the pixel-wise contributions of an image to its prediction result. In our task, saliency maps are used to assist the identification and visualization of developmental landmarks. Our experimental results show that the proposed model not only makes accurate predictions but also yields biologically meaningful interpretations. We anticipate that our methods will generalize easily to other biological image classification tasks with small training datasets. Our open-source code is available at https://github.com/divelab/lsl-fly.
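The saliency maps mentioned above assign each input pixel its contribution to the predicted class score, typically via the gradient of that score with respect to the input. As a minimal, hypothetical sketch (a toy linear classifier, not the paper's deep residual network, with made-up dimensions), the idea can be illustrated as follows: for a linear score, the gradient with respect to each pixel is simply the corresponding row of the weight matrix, and its magnitude serves as the saliency map.

```python
import numpy as np

def saliency_map(weights, target_class, img_shape):
    """Gradient-based saliency for a linear classifier score(x) = weights @ x.

    For class c, d score[c] / d x = weights[c], so the pixel-wise
    contribution map is |weights[c]| reshaped to the image dimensions.
    """
    grad = weights[target_class]          # exact gradient for a linear model
    return np.abs(grad).reshape(img_shape)

# Toy setup: 8x8 "images" flattened to 64 features, 6 hypothetical stage classes.
rng = np.random.default_rng(0)
height, width = 8, 8
num_classes = 6
weights = rng.standard_normal((num_classes, height * width))

smap = saliency_map(weights, target_class=2, img_shape=(height, width))
print(smap.shape)  # (8, 8): one nonnegative contribution value per pixel
```

For a deep network such as a residual network, the same quantity is obtained by backpropagating the class score to the input image (e.g., with automatic differentiation), and high-magnitude regions of the resulting map can then be inspected as candidate developmental landmarks.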