Hung Jane, Lopes Stefanie C P, Nery Odailton Amaral, Nosten Francois, Ferreira Marcelo U, Duraisingh Manoj T, Marti Matthias, Ravel Deepali, Rangel Gabriel, Malleret Benoit, Lacerda Marcus V G, Rénia Laurent, Costa Fabio T M, Carpenter Anne E
Massachusetts Institute of Technology.
Instituto Leônidas e Maria Deane, Fundação Oswaldo Cruz (FIOCRUZ); Fundação de Medicina Tropical Dr. Heitor Vieira Dourado, Gerência de Malária.
Conf Comput Vis Pattern Recognit Workshops. 2017 Jul;2017:808-813. doi: 10.1109/cvprw.2017.112. Epub 2021 Nov 18.
Deep learning-based models have had great success in object detection, but state-of-the-art models have not yet been widely applied to biological image data. For the first time, we apply an object detection model previously used on natural images to identify cells and recognize their stages in brightfield microscopy images of malaria-infected blood. Many microorganisms, such as malaria parasites, are still studied by expert manual inspection and hand counting. This type of object detection task is challenging because of variations in cell shape, density, and color, and because the class of some cells is uncertain. In addition, annotated training data is scarce, and the class distribution is inherently highly imbalanced because uninfected red blood cells dominate. We use the Faster Region-based Convolutional Neural Network (Faster R-CNN), one of the top-performing object detection models of recent years, pre-trained on ImageNet and fine-tuned on our data. We compare it to a baseline built on a traditional approach: cell segmentation, extraction of several single-cell features, and classification using random forests. For this initial study, we collect and label a dataset of 1,300 fields of view containing around 100,000 individual cells. We demonstrate that Faster R-CNN outperforms our baseline and place the results in the context of human performance.
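The class imbalance noted in the abstract can be made concrete with a small sketch. The code below computes inverse-frequency class weights, n_samples / (n_classes * class_count), a common heuristic for weighting rare classes more heavily during training. The abstract does not say whether the authors used any such weighting, so this is an illustration only. The function name `balanced_class_weights`, the stage names `ring` and `trophozoite`, and the toy counts are all assumptions and are not taken from the dataset.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return inverse-frequency weights: n_samples / (n_classes * class_count).

    Hypothetical helper for illustration; not the authors' method.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Toy labels invented for illustration: uninfected cells dominate,
# infected stages are rare. These counts are NOT from the paper.
toy_labels = ["uninfected"] * 95 + ["ring"] * 3 + ["trophozoite"] * 2

weights = balanced_class_weights(toy_labels)

# Print classes from most common (smallest weight) to rarest (largest weight).
for cls, w in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{cls:12s} weight={w:.2f}")
```

With these weights, each class contributes the same total weight across the dataset, so the rare infected stages are not swamped by the uninfected majority. Whether this helps in practice depends on the model and the data.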