

scCancer2: data-driven in-depth annotations of the tumor microenvironment at single-level resolution.

Affiliations

MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Institute for Precision Medicine & Department of Automation, Tsinghua University, Beijing 100084, China.

Department of Finance, Shanghai Advanced Institute of Finance, Shanghai Jiao Tong University, Shanghai 200240, China.

Publication information

Bioinformatics. 2024 Feb 1;40(2). doi: 10.1093/bioinformatics/btae028.

Abstract

SUMMARY

Single-cell RNA-seq (scRNA-seq) is a powerful technique for decoding the complex cellular compositions in the tumor microenvironment (TME). As previous studies have defined many meaningful cell subtypes in several tumor types, there is a great need to computationally transfer these labels to new datasets. In addition, different studies have used different approaches or criteria to define cell subtypes within the same major cell lineages, so the relationships between the cell subtypes defined in different studies should be carefully evaluated. In this updated package, scCancer2, designed for integrative tumor scRNA-seq data analysis, we first developed a supervised machine learning framework to annotate TME cells with annotated cell subtypes from 15 scRNA-seq datasets with 594 samples in total. Based on the trained classifiers, we quantitatively constructed similarity maps between the cell subtypes defined in different references by testing on all 15 datasets. Second, to improve the identification of malignant cells, we designed a classifier by integrating large-scale pan-cancer TCGA bulk gene expression datasets and scRNA-seq datasets (10 cancer types, 175 samples, 663 857 cells). This classifier shows robust performance when no internal confidential reference cells are available. Third, scCancer2 integrates a module to process spatial transcriptomic data and analyze the spatial features of the TME.

AVAILABILITY AND IMPLEMENTATION

The package and user documentation are available at http://lifeome.net/software/sccancer2/ and https://doi.org/10.5281/zenodo.10477296.
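
The idea of comparing cell subtypes across references, as described in the summary, can be illustrated with a small sketch. This is not the scCancer2 implementation (the package's classifier and interface are not described here). It is a toy example that assumes a nearest-centroid classifier as a stand-in, with invented expression values and hypothetical label names. A classifier is trained on the labels of one reference, applied to cells from a second reference, and the predictions are cross-tabulated against the second reference's own labels.

```python
from collections import Counter, defaultdict
from math import dist


def fit_centroids(profiles, labels):
    """Return the mean expression profile (centroid) of each label."""
    sums, counts = {}, Counter(labels)
    for profile, label in zip(profiles, labels):
        acc = sums.setdefault(label, [0.0] * len(profile))
        for i, value in enumerate(profile):
            acc[i] += value
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, profile):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda lab: dist(centroids[lab], profile))


def similarity_map(centroids_a, profiles_b, labels_b):
    """For each label of reference B, compute the fraction of its cells that
    the classifier trained on reference A assigns to each label of A."""
    table = defaultdict(Counter)
    for profile, label_b in zip(profiles_b, labels_b):
        table[label_b][predict(centroids_a, profile)] += 1
    return {
        label_b: {label_a: n / sum(row.values()) for label_a, n in row.items()}
        for label_b, row in table.items()
    }


# Toy data: two "genes" per cell. Values and label names are hypothetical.
ref_a_profiles = [[5.0, 0.1], [4.8, 0.3], [0.2, 5.1], [0.1, 4.9]]
ref_a_labels = ["T_exhausted", "T_exhausted", "T_naive", "T_naive"]
ref_b_profiles = [[5.2, 0.2], [4.9, 0.0], [0.3, 5.0], [4.7, 0.4]]
ref_b_labels = ["CD8_Tex", "CD8_Tex", "CD8_Tn", "CD8_Tn"[:0] + "CD8_Tex"]

centroids_a = fit_centroids(ref_a_profiles, ref_a_labels)
sim = similarity_map(centroids_a, ref_b_profiles, ref_b_labels)
print(sim)
```

Each row of `sim` corresponds to a subtype from the second reference, and its entries show how that subtype's cells are distributed over the first reference's subtypes. A row concentrated on a single entry suggests the two studies describe a similar population under different names.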


Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b78b/10868330/212d00663b19/btae028f1.jpg
