Chen Shann-Ching, Murphy Robert F
Department of Biomedical Engineering and Center for Bioimage Informatics, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
BMC Bioinformatics. 2006 Feb 23;7:90. doi: 10.1186/1471-2105-7-90.
Knowledge of the subcellular location of a protein is critical to understanding how that protein works in a cell. This location is frequently determined by interpreting fluorescence microscope images. In recent years, automated systems have been developed for consistent and objective interpretation of such images, so that the protein pattern in a single cell can be assigned to a known location category. While these systems perform with nearly perfect accuracy on single-cell images of all major subcellular structures, their ability to distinguish subpatterns of an organelle (such as two Golgi proteins) is not perfect. Our goal in the work described here was to improve the ability of an automated system to decide which of two similar patterns is present in a field of cells by considering more than one cell at a time. Since cells displaying the same location pattern are often clustered together, considering multiple cells may be expected to improve discrimination between similar patterns.
We describe how to use information on experimental conditions to construct a graphical representation of the multiple cells in a field. Assuming that a field is composed of a small number of classes, classification accuracy can be improved by allowing the computed probability of each pattern for each cell to be influenced by the probabilities of its neighboring cells in the model. We describe a novel way to allow this influence to occur, in which we adjust the prior probabilities of each class to reflect the patterns that are present. When this graphical model approach is applied to synthetic multi-cell images in which the true class of each cell is known, the ability to distinguish similar classes improves with no degradation in the ability to distinguish dissimilar classes. The computational complexity of the method is low enough that improved class assignments can be obtained for fields of twelve cells in under 0.04 seconds on a 1600 MHz processor.
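The idea of adjusting class priors to reflect the patterns present in a field can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that every cell in the field influences every other cell (a fully connected neighborhood), that a single-cell classifier supplies a likelihood for each class, and that the priors are re-estimated as the mean posterior over the field until they stabilize. The function name `prior_updating`, its parameters, and the example likelihood values are hypothetical.

```python
import numpy as np


def prior_updating(likelihoods, n_iter=50, tol=1e-9):
    """Relabel the cells of one field by iteratively adjusting class priors.

    likelihoods: (n_cells, n_classes) array, entry [i, k] being an assumed
    single-cell likelihood p(features of cell i | class k).
    Returns (labels, posteriors, priors).
    """
    lik = np.asarray(likelihoods, dtype=float)
    n_classes = lik.shape[1]

    # Start from uniform priors: equivalent to classifying each cell alone.
    priors = np.full(n_classes, 1.0 / n_classes)

    for _ in range(n_iter):
        # Posterior of each class for each cell under the current priors.
        joint = lik * priors
        post = joint / joint.sum(axis=1, keepdims=True)

        # New priors: average posterior over all cells in the field, so
        # classes that are well supported by the field gain weight.
        new_priors = post.mean(axis=0)
        converged = np.abs(new_priors - priors).max() < tol
        priors = new_priors
        if converged:
            break

    # Final posteriors and labels under the adjusted priors.
    joint = lik * priors
    post = joint / joint.sum(axis=1, keepdims=True)
    return post.argmax(axis=1), post, priors


# Hypothetical field of twelve cells with two similar classes (0 and 1)
# and one dissimilar class (2). Ten cells weakly favor class 0 and two
# cells weakly favor class 1 when each cell is considered alone.
field = np.array([[0.60, 0.40, 0.001]] * 10 + [[0.45, 0.55, 0.001]] * 2)
independent_labels = field.argmax(axis=1)
field_labels, posteriors, priors = prior_updating(field)
```

Comparing `independent_labels` with `field_labels` shows how weakly supported assignments can be pulled toward the class that dominates the field. A sparser neighborhood structure, as suggested by the graphical representation described above, would restrict which cells contribute to each cell's priors.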
We demonstrate that graphical models can be used to improve the accuracy of classification of subcellular patterns in multi-cell fluorescence microscope images. We also describe a novel algorithm for inferring classes from a graphical model. The performance and speed suggest that the method will be particularly valuable for analysis of images from high-throughput microscopy. We also anticipate that it will be useful for analyzing the mixtures of cell types typically present in images of tissues. Lastly, we anticipate that the method can be generalized to other problems.