CaSTLe - Classification of single cells by transfer learning: Harnessing the power of publicly available single cell RNA sequencing experiments to annotate new experiments.

Affiliations

Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel.

Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel.

Publication information

PLoS One. 2018 Oct 10;13(10):e0205499. doi: 10.1371/journal.pone.0205499. eCollection 2018.

Abstract

Single-cell RNA sequencing (scRNA-seq) is an emerging technology for profiling the gene expression of thousands of cells at single-cell resolution. Currently, cells in an scRNA-seq dataset are labeled either by manually characterizing clusters of cells or by fluorescence-activated cell sorting (FACS). Both methods have inherent drawbacks: the first depends on the clustering algorithm used and on the knowledge and arbitrary decisions of the annotator, and the second involves an experimental step in addition to the sequencing and cannot be incorporated into the higher-throughput scRNA-seq methods. We therefore suggest a different approach for cell labeling, namely, classifying cells from scRNA-seq datasets by using a model transferred from different (previously labeled) datasets. This approach can complement existing methods and, in some cases, even replace them. Such a transfer-learning framework requires selecting informative features and training a classifier. The specific implementation of the framework that we propose, designated "CaSTLe: classification of single cells by transfer learning," is based on a robust feature engineering workflow and an XGBoost classification model built on these features. Evaluation of CaSTLe against two benchmark feature-selection and classification methods showed that it outperformed the benchmark methods in most cases and consistently yielded satisfactory classification accuracy. CaSTLe has the additional advantage of being parallelizable and well suited to large datasets. We showed that it was possible to classify cell types using transfer learning, even when the databases contained a very small number of genes, and our study thus indicates the potential applicability of this approach for analysis of scRNA-seq datasets.
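
The general shape of such a transfer-learning framework (restrict to genes measured in both datasets, select informative features on the labeled source data, train a classifier, then label the new cells) can be illustrated with a minimal sketch. This is not the CaSTLe implementation: the abstract does not give the feature engineering steps, so the gene score below is a simple assumed stand-in, and a nearest-centroid rule replaces the XGBoost model to keep the example dependency-free. All function names and the toy expression values are hypothetical.

```python
from statistics import mean

def shared_genes(source_genes, target_genes):
    """Keep only genes measured in both datasets, in source order."""
    target = set(target_genes)
    return [g for g in source_genes if g in target]

def select_top_genes(cells, labels, genes, k):
    """Rank genes by the spread of their per-class mean expression.

    Assumed stand-in score, not the feature engineering used by CaSTLe.
    """
    classes = sorted(set(labels))
    scores = {}
    for g in genes:
        class_means = [
            mean(cell[g] for cell, y in zip(cells, labels) if y == c)
            for c in classes
        ]
        scores[g] = max(class_means) - min(class_means)
    return sorted(genes, key=lambda g: scores[g], reverse=True)[:k]

def fit_centroids(cells, labels, genes):
    """Per-class mean profile over the selected genes (stand-in for XGBoost)."""
    centroids = {}
    for c in sorted(set(labels)):
        rows = [cell for cell, y in zip(cells, labels) if y == c]
        centroids[c] = [mean(r[g] for r in rows) for g in genes]
    return centroids

def predict(cell, centroids, genes):
    """Assign the class whose centroid is closest in squared distance."""
    def dist(c):
        return sum((cell[g] - m) ** 2 for g, m in zip(genes, centroids[c]))
    return min(centroids, key=dist)

# Toy, made-up expression values: a labeled source dataset ...
source_cells = [
    {"CD3E": 5.0, "CD19": 0.1, "ACTB": 8.0, "GAPDH": 7.0},
    {"CD3E": 4.5, "CD19": 0.0, "ACTB": 7.8, "GAPDH": 7.2},
    {"CD3E": 0.2, "CD19": 6.0, "ACTB": 8.1, "GAPDH": 6.9},
    {"CD3E": 0.0, "CD19": 5.5, "ACTB": 7.9, "GAPDH": 7.1},
]
source_labels = ["T cell", "T cell", "B cell", "B cell"]

# ... and an unlabeled target dataset with a partly different gene set.
target_cells = [
    {"CD3E": 4.8, "CD19": 0.3, "ACTB": 8.0, "MS4A1": 0.1},
    {"CD3E": 0.1, "CD19": 5.2, "ACTB": 7.5, "MS4A1": 6.0},
]

common = shared_genes(list(source_cells[0]), list(target_cells[0]))
selected = select_top_genes(source_cells, source_labels, common, k=2)
model = fit_centroids(source_cells, source_labels, selected)
predictions = [predict(cell, model, selected) for cell in target_cells]
print(selected, predictions)
```

In a fuller version, the nearest-centroid step would be replaced by a gradient-boosted tree model trained on the same selected features, which is the classifier family the abstract names.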

