Baran Yusuf, Doğan Berat
Department of Biomedical Engineering, Inonu University, Malatya, Turkey.
Comput Biol Med. 2023 Mar;155:106634. doi: 10.1016/j.compbiomed.2023.106634. Epub 2023 Feb 9.
Single-cell RNA sequencing (scRNA-seq) has provided unprecedented opportunities for exploring gene expression and thus uncovering regulatory relationships between genes at the single-cell level. However, scRNA-seq relies on isolating cells from tissues, so the spatial context of the regulatory processes is lost. A recent technological innovation, spatial transcriptomics, allows gene expression to be measured while preserving spatial information. An initial step in spatial transcriptomic analysis is to identify cell types, which requires a careful selection of cell-type-specific marker genes. For this purpose, scRNA-seq data is currently used to select, from among all genes, a limited number of marker genes that distinguish cell types from each other. This study proposes scMAGS (single-cell MArker Gene Selection), a novel method for marker gene selection from scRNA-seq data for spatial transcriptomics studies. scMAGS first applies a filtering step that identifies candidate genes, and then selects marker genes from these candidates. Marker genes are selected using cluster validity indices: the Silhouette index or, for large datasets, the Calinski-Harabasz index. Experimental results showed that, in comparison to existing methods, scMAGS is scalable, fast, and accurate. Even for large datasets with millions of cells, scMAGS could find the required number of marker genes in a reasonable amount of time and with lower memory requirements. scMAGS is freely available at https://github.com/doganlab/scmags and can be installed from the Python Package Index (PyPI) software repository with the command pip install scmags.
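The abstract names two cluster validity indices for marker selection but does not spell out how they are applied. The following is a minimal sketch of one plausible reading, not the scMAGS implementation: each candidate gene is scored by how well its expression alone separates the annotated cell types, using scikit-learn's `silhouette_score` for small inputs and `calinski_harabasz_score` for large ones. The function name `rank_genes_by_validity`, the `large_threshold` cutoff, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import calinski_harabasz_score, silhouette_score


def rank_genes_by_validity(expr, labels, large_threshold=10_000):
    """Rank genes by how well each one alone separates the labelled cell types.

    expr   : (n_cells, n_genes) expression matrix
    labels : (n_cells,) cell-type label per cell
    large_threshold is a hypothetical cutoff, not a value from the paper.
    """
    n_cells, n_genes = expr.shape
    # Silhouette relies on pairwise distances between cells, so its cost grows
    # quadratically with cell count; Calinski-Harabasz only needs cluster means
    # and variances, which is why it suits large datasets.
    if n_cells < large_threshold:
        score_fn = silhouette_score
    else:
        score_fn = calinski_harabasz_score
    # Score one gene at a time; expr[:, [g]] keeps the 2-D shape sklearn expects.
    scores = np.array([score_fn(expr[:, [g]], labels) for g in range(n_genes)])
    order = np.argsort(scores)[::-1]  # best-separating genes first
    return order, scores


# Synthetic data: 3 cell types, 20 genes, one planted marker per cell type.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 50)
expr = rng.normal(0.0, 1.0, size=(labels.size, 20))
expr[labels == 0, 3] += 6.0
expr[labels == 1, 7] += 6.0
expr[labels == 2, 11] += 6.0

order, scores = rank_genes_by_validity(expr, labels)
top3 = sorted(order[:3].tolist())
print("Top-ranked genes:", top3)
```

In this toy setting the three planted genes score well above the noise genes. A real pipeline would first restrict the search to filtered candidate genes, as the abstract describes, rather than scoring every gene.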