Li Rongjian, Zhang Wenlu, Ji Shuiwang
Department of Computer Science, Old Dominion University, 23529 Norfolk, VA, USA.
BMC Bioinformatics. 2014 Jun 20;15:209. doi: 10.1186/1471-2105-15-209.
Differential gene expression patterns in cells of the mammalian brain result in the morphological, connectional, and functional diversity of cells. A wide variety of studies have shown that certain genes are expressed only in specific cell-types. Analysis of cell-type-specific gene expression patterns can provide insights into the relationship between genes, connectivity, brain regions, and cell-types. However, automated methods for identifying cell-type-specific genes are lacking to date.
Here, we describe a set of computational methods for identifying cell-type-specific genes in the mouse brain by automated image computing of in situ hybridization (ISH) expression patterns. We applied invariant image feature descriptors to capture local gene expression information from cellular-resolution ISH images. We then built image-level representations by applying vector quantization to the image descriptors. We employed regularized learning methods to classify genes specifically expressed in different brain cell-types. These methods can also rank image features by their discriminative power. We used a data set of 2,872 genes from the Allen Brain Atlas in the experiments. The results showed that our methods are predictive of the cell-type-specificity of genes. Our classifiers achieved AUC values of approximately 87% when the enrichment level was set to 20. In addition, we showed that the highly ranked image features captured the relationship between cell-types.
Overall, our results showed that automated image computing methods could potentially be used to identify cell-type-specific genes in the mouse brain.
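As a rough illustration of two steps named above, the following Python sketch shows how local descriptors might be vector-quantized into an image-level histogram, and how an AUC value can be computed from classifier scores. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the descriptor type, the codebook construction, or the classifier, so the function names (`nearest_codeword`, `quantize_image`, `auc`), the toy two-dimensional descriptors, and the hand-written codebook are all hypothetical. In practice a codebook would typically be learned by clustering descriptors from many images.

```python
import math
from collections import Counter


def nearest_codeword(descriptor, codebook):
    """Return the index of the codeword closest to `descriptor`
    under squared Euclidean distance (first match wins on ties)."""
    best_index, best_dist = 0, math.inf
    for index, codeword in enumerate(codebook):
        dist = sum((d - c) ** 2 for d, c in zip(descriptor, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def quantize_image(descriptors, codebook):
    """Vector quantization: map each local descriptor to its nearest
    codeword and return the normalized histogram of assignments,
    which serves as the image-level representation."""
    counts = Counter(nearest_codeword(d, codebook) for d in descriptors)
    total = len(descriptors)
    if total == 0:
        return [0.0] * len(codebook)
    return [counts.get(k, 0) / total for k in range(len(codebook))]


def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example scores
    higher; tied pairs count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: three codewords and four local descriptors.
    codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    descriptors = [[0.1, 0.0], [0.9, 1.1], [0.8, 0.9], [0.1, 0.9]]
    print(quantize_image(descriptors, codebook))

    # Hypothetical classifier scores and cell-type-specificity labels.
    print(auc([0.9, 0.4, 0.3, 0.6], [1, 1, 0, 0]))
```

The histogram has one entry per codeword, so every image is represented by a vector of the same length regardless of how many descriptors were extracted from it. That fixed length is what lets a regularized classifier operate on, and rank, individual features.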