Eisele A S, Suter D M
Ecole Polytechnique Fédérale de Lausanne, School of Life Sciences, Institute of Bioengineering, Lausanne, Switzerland.
Methods Mol Biol. 2025;2886:375-400. doi: 10.1007/978-1-0716-4310-5_19.
Gene expression memory-based lineage inference (GEMLI) is a computational tool that predicts cell lineages solely from single-cell RNA-sequencing (scRNA-seq) datasets and is publicly available as an R package on GitHub. GEMLI relies on gene expression memory, i.e., the gene-specific maintenance of expression levels through cell divisions. This represents a shift away from experimental lineage tracing techniques based on genetic marks or physical separation of cell lineages, and it greatly eases and expands lineage annotation. GEMLI enables the study of cell lineages during differentiation in development, homeostasis, and regeneration, as well as during disease onset and progression, in various physiological and pathological contexts. This makes it possible to dissect cell type-specific gene expression memory, to discriminate between symmetric and asymmetric cell fate decisions, and to reconstruct individual multicellular structures from pooled scRNA-seq datasets. GEMLI is particularly promising for its ability to identify small lineages in human samples, a context in which no other lineage tracing methods are applicable. In this chapter, we provide a detailed protocol for using the GEMLI R package on gene expression matrices derived from standard scRNA-seq on various platforms. We cover the use of the main function to predict cell lineages and how to adjust its parameters for different tasks. We also show how lineage information is extracted, visualized, and fine-tuned. Finally, we describe the use of the package's functions for the detailed analysis of the predicted cell lineages. This includes the analysis of gene expression memory, the cell type composition of individual large lineages, and the identification of lineages at the transition point between two cell types.
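The notion of gene expression memory described above can be illustrated with a toy calculation. The sketch below is not GEMLI's algorithm and does not use the GEMLI R package; it is a minimal Python illustration that assumes lineage labels are already known (GEMLI itself predicts them). The function name `memory_score`, the input dictionaries, and all values are hypothetical. It scores a gene by the fraction of its expression variance explained by lineage membership, so that a gene whose level is maintained through cell divisions scores close to 1.

```python
from statistics import mean, pvariance


def memory_score(expression, lineage_of):
    """Fraction of one gene's variance explained by lineage membership (0 to 1).

    expression: dict mapping cell id -> expression level of a single gene
                (hypothetical toy input)
    lineage_of: dict mapping cell id -> lineage label (assumed known here)
    """
    values = list(expression.values())
    total_var = pvariance(values)
    if total_var == 0:
        return 0.0  # a constant gene carries no lineage information

    grand_mean = mean(values)

    # Group expression values by lineage.
    groups = {}
    for cell, value in expression.items():
        groups.setdefault(lineage_of[cell], []).append(value)

    # Between-lineage variance: size-weighted spread of lineage means.
    between = sum(
        len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values()
    ) / len(values)
    return between / total_var


# Toy data: two lineages (A, B) of two related cells each.
lineage_of = {"c1": "A", "c2": "A", "c3": "B", "c4": "B"}
memory_gene = {"c1": 10.0, "c2": 11.0, "c3": 2.0, "c4": 3.0}  # similar within a lineage
noisy_gene = {"c1": 10.0, "c2": 2.0, "c3": 11.0, "c4": 3.0}   # unrelated to lineage

scores = {
    "memory_gene": memory_score(memory_gene, lineage_of),
    "noisy_gene": memory_score(noisy_gene, lineage_of),
}
print(scores)
```

In this toy setting the gene whose level is shared by related cells receives a score near 1, whereas the gene that fluctuates independently of lineage scores near 0. This variance ratio is only a stand-in for the concept; the chapter's protocol should be consulted for how GEMLI actually analyzes gene expression memory.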