Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Murcia E-30100, Spain.
Department of Neurodegenerative Diseases, UCL Institute of Neurology, London WC1E 6BT, UK.
Bioinformatics. 2021 Sep 29;37(18):2905-2911. doi: 10.1093/bioinformatics/btab175.
Co-expression networks are a powerful method for gene expression analysis: they reveal how genes are co-expressed in clusters with functional coherence, and these clusters usually reflect cell type-specific behavior of the genes involved. They can be applied to bulk-tissue gene expression profiles to assign a function, and usually a cell type specificity, to a high percentage of the genes used to construct the network. One limitation of this method is that each gene is predicted to play a role in a single set of coherent functions in a single cell type (i.e. we obtain at most one <gene, function, cell type> triplet per gene). Here we present GMSCA (Gene Multifunctionality Secondary Co-expression Analysis), a software tool that exploits the co-expression paradigm to increase the number of functions and cell types ascribed to a gene in bulk-tissue co-expression networks.
We applied GMSCA to 27 co-expression networks derived from bulk-tissue gene expression profiling of a variety of brain tissues. Neurons and glial cells (microglia, astrocytes and oligodendrocytes) were considered the main cell types. With this approach, we increased the overall number of predicted <gene, function, cell type> triplets by 46.73%. Moreover, GMSCA predicts that the SNCA gene, traditionally thought to act mainly in neurons, also plays a relevant role in oligodendrocytes.
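To make the triplet bookkeeping concrete, the following is a minimal Python sketch of how a percentage increase in <gene, function, cell type> triplets could be computed. It is not GMSCA's implementation: the `Triplet` type, the `percent_increase` helper, the placeholder gene names and the toy annotations are all hypothetical, and the sketch assumes the increase is measured as newly gained triplets relative to the triplets from the primary network.

```python
from typing import Iterable, NamedTuple


class Triplet(NamedTuple):
    """One predicted <gene, function, cell type> annotation (hypothetical type)."""
    gene: str
    function: str
    cell_type: str


def percent_increase(primary: Iterable[Triplet], secondary: Iterable[Triplet]) -> float:
    """Percentage of new triplets gained, relative to the primary set.

    Assumption: triplets already present in the primary set are not counted again.
    """
    base = set(primary)
    if not base:
        raise ValueError("primary set must contain at least one triplet")
    gained = set(secondary) - base
    return 100.0 * len(gained) / len(base)


# Toy data for illustration only: a standard network yields at most one triplet per gene.
primary = [
    Triplet("GENE_A", "synaptic signaling", "neuron"),
    Triplet("GENE_B", "myelination", "oligodendrocyte"),
    Triplet("GENE_C", "immune response", "microglia"),
    Triplet("GENE_D", "ion transport", "astrocyte"),
]

# A secondary analysis may assign further functions and cell types to some genes.
secondary = [
    Triplet("GENE_A", "myelination", "oligodendrocyte"),
    Triplet("GENE_C", "ion transport", "astrocyte"),
]

increase = percent_increase(primary, secondary)
print(f"{increase:.2f}%")
```

In this toy case, two new triplets on top of four primary ones give a 50% increase; GENE_A ends up with annotations in two cell types, which is the kind of outcome described above for SNCA.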
The tool is available as open-source software on GitHub at https://github.com/drlaguna/GMSCA.
Supplementary data are available at Bioinformatics online.