MOE Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University.
BIOPIC and School of Life Sciences, Peking University.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa127.
Molecular heterogeneity and complex microenvironments pose great challenges for cancer diagnosis and treatment. Recent advances in single-cell RNA-sequencing (scRNA-seq) technology make it possible to study cancer cell heterogeneity and the tumor microenvironment at the single-cell transcriptomic level. Here, we develop an R package named scCancer, which focuses on processing and analyzing scRNA-seq data for cancer research. In addition to basic data processing steps, the package takes several cancer-specific features into account. First, it introduces comprehensive quality control metrics. Second, it uses a data-driven machine learning algorithm to accurately identify the major cell populations of the cancer microenvironment. Third, it estimates a malignancy score to classify cells as malignant (cancerous) or non-malignant. It then analyzes intra-tumor heterogeneity through key cellular phenotypes (such as cell cycle and stemness), gene signatures and cell-cell interactions. It also provides multi-sample data integration analysis with different batch-effect correction strategies. Finally, it generates user-friendly graphic reports for all the analyses. By testing on 56 samples with 433,405 cells in total, we demonstrate its good performance. The package is available at: http://lifeome.net/software/sccancer/.
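To make the quality control step concrete, the following is a minimal Python sketch of per-cell metrics commonly used for scRNA-seq data: total UMI count, number of detected genes and percentage of mitochondrial reads. It is not scCancer's code (the package is written in R and its metrics are not specified here). The function names `qc_metrics` and `passes_qc`, the `MT-` gene prefix convention and all thresholds are assumptions chosen for illustration only.

```python
def qc_metrics(counts, gene_names, mito_prefix="MT-"):
    """Compute simple per-cell QC metrics.

    counts: list of rows, one row per gene, one column per cell (UMI counts).
    gene_names: gene symbols, in the same order as the rows of `counts`.
    mito_prefix: assumed naming convention for mitochondrial genes.
    """
    n_cells = len(counts[0]) if counts else 0
    metrics = []
    for j in range(n_cells):
        column = [row[j] for row in counts]
        n_umi = sum(column)                        # total UMIs in the cell
        n_gene = sum(1 for c in column if c > 0)   # genes with at least one UMI
        mito = sum(
            c for c, g in zip(column, gene_names)
            if g.upper().startswith(mito_prefix)
        )
        # Guard against empty cells to avoid division by zero.
        mito_pct = 100.0 * mito / n_umi if n_umi > 0 else 0.0
        metrics.append({"n_umi": n_umi, "n_gene": n_gene, "mito_pct": mito_pct})
    return metrics


def passes_qc(m, min_umi=3, min_gene=2, max_mito_pct=50.0):
    """Apply illustrative (hypothetical) thresholds to one cell's metrics."""
    return (
        m["n_umi"] >= min_umi
        and m["n_gene"] >= min_gene
        and m["mito_pct"] <= max_mito_pct
    )


# Toy example: 3 genes x 3 cells.
genes = ["MT-CO1", "EPCAM", "PTPRC"]
counts = [
    [1, 9, 0],  # MT-CO1
    [5, 1, 0],  # EPCAM
    [4, 0, 0],  # PTPRC
]
cell_metrics = qc_metrics(counts, genes)
kept = [passes_qc(m) for m in cell_metrics]
```

In this toy input the first cell is kept, the second is flagged for a high mitochondrial fraction and the third is flagged as empty. Thresholds for real data would need to be chosen per sample rather than fixed as above.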