scDetect: a rank-based ensemble learning algorithm for cell type identification of single-cell RNA sequencing in cancer.

Author information

Shen Yifei, Chu Qinjie, Timko Michael P, Fan Longjiang

Affiliations

Department of Laboratory Medicine, First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou 310003, China.

Key Laboratory of Clinical In Vitro Diagnostic Techniques of Zhejiang Province, Hangzhou 310003, China.

Publication information

Bioinformatics. 2021 Nov 18;37(22):4115-4122. doi: 10.1093/bioinformatics/btab410.

Abstract

MOTIVATION

Single-cell RNA sequencing (scRNA-seq) has enabled the characterization of different cell types in many tissues and tumor samples. Cell type identification is an essential step in single-cell RNA profiling, a technology that is currently transforming the life sciences. Cell types are often identified by searching for combinations of genes that have previously been implicated as cell-type specific, an approach that is not quantitative and does not explicitly take advantage of other scRNA-seq studies. Batch effects and differences between data platforms greatly decrease predictive performance when methods are validated across laboratories and across data types.

RESULTS

Here, we present a new ensemble learning method named 'scDetect' that combines gene expression rank-based analysis with a majority-vote ensemble machine-learning, probability-based prediction method. It is capable of highly accurate classification of cells based on scRNA-seq data from different sequencing platforms. Because of tumor heterogeneity, we have also incorporated cell copy number variation consensus clustering and an epithelial score in the classification in order to accurately predict tumor cells in scRNA-seq data. We applied scDetect to scRNA-seq data from pancreatic tissue, mononuclear cells and tumor biopsy cells and show that scDetect classified individual cells with high accuracy and outperformed other publicly available tools.
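
The general idea of combining rank-based features with a majority vote can be illustrated with a small toy sketch. This is not scDetect's code: the gene indices, the gene-pair rules, the cell type labels and the helper names (`rank_transform`, `pair_vote`, `majority_vote`, `RULES`) are all hypothetical, and using the fraction of agreeing votes as a confidence value is an assumption made only for illustration. The sketch shows why within-cell ranks are attractive across platforms: any monotone rescaling of the expression values leaves the ranks, and therefore the votes, unchanged.

```python
from collections import Counter


def rank_transform(expr):
    """Replace each gene's expression value by its within-cell rank (0 = lowest)."""
    order = sorted(range(len(expr)), key=lambda i: expr[i])
    ranks = [0] * len(expr)
    for rank, gene_index in enumerate(order):
        ranks[gene_index] = rank
    return ranks


def pair_vote(ranks, rule):
    """One weak classifier: vote according to which gene of a pair ranks higher."""
    gene_a, gene_b, label_if_a_higher, label_otherwise = rule
    return label_if_a_higher if ranks[gene_a] > ranks[gene_b] else label_otherwise


def majority_vote(expr, rules):
    """Combine all pair votes; return the winning label and its vote fraction."""
    ranks = rank_transform(expr)
    votes = Counter(pair_vote(ranks, rule) for rule in rules)
    label, count = votes.most_common(1)[0]
    return label, count / len(rules)


# Hypothetical gene-pair rules: (gene_a, gene_b, label if a ranks higher, label otherwise).
RULES = [
    (0, 1, "T cell", "B cell"),
    (2, 3, "T cell", "B cell"),
    (1, 3, "B cell", "T cell"),
]

# A made-up cell measured on two "platforms" that differ only by a scale factor.
cell = [50.0, 2.0, 30.0, 1.0]
cell_rescaled = [value * 1000 for value in cell]

label, confidence = majority_vote(cell, RULES)
label_rescaled, _ = majority_vote(cell_rescaled, RULES)
print(label, label_rescaled)
```

An odd number of rules is used here so that a two-label vote cannot tie. A real classifier would learn the informative gene pairs from annotated reference data rather than hard-coding them.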

AVAILABILITY AND IMPLEMENTATION

scDetect is open-source software. Source code and test data are freely available from GitHub (https://github.com/IVDgenomicslab/scDetect/) and Zenodo (https://zenodo.org/record/4764132#.YKCOlrH5AYN). Examples and a tutorial are available at https://ivdgenomicslab.github.io/scDetect-Introduction/. scDetect will also be available from Bioconductor.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
