ILoReg: a tool for high-resolution cell population identification from single-cell RNA-seq data.

Affiliations

Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku 20520, Finland.

Institute of Biomedicine, University of Turku, Turku, Finland.

Publication information

Bioinformatics. 2021 May 23;37(8):1107-1114. doi: 10.1093/bioinformatics/btaa919.

Abstract

MOTIVATION

Single-cell RNA-seq allows researchers to identify cell populations based on unsupervised clustering of the transcriptome. However, subpopulations can have only subtle transcriptomic differences and the high dimensionality of the data makes their identification challenging.

RESULTS

We introduce ILoReg, an R package implementing a new cell population identification method that improves identification of cell populations with subtle differences through a probabilistic feature extraction step that is applied before clustering and visualization. The feature extraction is performed using a novel machine learning algorithm, called iterative clustering projection (ICP), that uses logistic regression and clustering similarity comparison to iteratively cluster data. Remarkably, ICP also manages to integrate feature selection with the clustering through L1-regularization, enabling the identification of genes that are differentially expressed between cell populations. By combining solutions of multiple ICP runs into a single consensus solution, ILoReg creates a representation that enables investigating cell populations with a high resolution. In particular, we show that the visualization of ILoReg allows segregation of immune and pancreatic cell populations in a more pronounced manner compared with current state-of-the-art methods.
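The iterative idea behind ICP can be illustrated with a small, self-contained sketch. It is written in Python with the standard library only and does not call the ILoReg package. All names and parameters (`icp`, `fit_l1_logreg`, `d`, `max_retries`, `max_iter`, `lam`) are hypothetical, and the details are assumptions rather than the published procedure: a random balanced initial clustering, a fixed number of training cells drawn per cluster, acceptance of a projection only when the adjusted Rand index (ARI) to the previous clustering improves, and a consensus formed by simple column-wise concatenation of the probability matrices from several runs.

```python
import math
import random
from collections import Counter


def adjusted_rand_index(a, b):
    """Adjusted Rand index (ARI) between two labelings of the same cells."""
    def pairs(counts):
        return sum(math.comb(c, 2) for c in counts)
    together = pairs(Counter(zip(a, b)).values())
    in_a, in_b = pairs(Counter(a).values()), pairs(Counter(b).values())
    expected = in_a * in_b / math.comb(len(a), 2)
    max_index = (in_a + in_b) / 2
    if max_index == expected:  # degenerate: both labelings are trivial
        return 1.0
    return (together - expected) / (max_index - expected)


def predict_proba(model, X):
    """Softmax class probabilities for every row of X."""
    W, b = model
    out = []
    for row in X:
        scores = [sum(w * x for w, x in zip(W[c], row)) + b[c]
                  for c in range(len(W))]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def fit_l1_logreg(X, y, k, lam=0.01, lr=0.1, epochs=100):
    """Multinomial logistic regression with an L1 penalty (proximal gradient)."""
    n, p = len(X), len(X[0])
    W, b = [[0.0] * p for _ in range(k)], [0.0] * k
    for _ in range(epochs):
        probs = predict_proba((W, b), X)
        for c in range(k):
            err = [probs[i][c] - (1.0 if y[i] == c else 0.0) for i in range(n)]
            b[c] -= lr * sum(err) / n
            for j in range(p):
                w = W[c][j] - lr * sum(err[i] * X[i][j] for i in range(n)) / n
                # Soft-thresholding sets small weights exactly to zero,
                # which is how the L1 penalty performs feature selection.
                W[c][j] = math.copysign(max(abs(w) - lr * lam, 0.0), w)
    return W, b


def icp(X, k, d=0.5, max_retries=5, max_iter=30, seed=0):
    """One ICP-style run (assumed details): returns (labels, probabilities)."""
    rng = random.Random(seed)
    n = len(X)
    labels = [i % k for i in range(n)]
    rng.shuffle(labels)  # random, balanced initial clustering
    per_cluster = max(1, int(n / k * d))  # training cells drawn per cluster
    best_ari, probs, retries = 0.0, None, 0
    for _ in range(max_iter):
        train = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if len(members) >= per_cluster:
                train += rng.sample(members, per_cluster)   # down-sample
            elif members:
                train += rng.choices(members, k=per_cluster)  # over-sample
        model = fit_l1_logreg([X[i] for i in train],
                              [labels[i] for i in train], k)
        new_probs = predict_proba(model, X)
        # Project every cell onto its most probable cluster.
        new_labels = [max(range(k), key=row.__getitem__) for row in new_probs]
        ari = adjusted_rand_index(labels, new_labels)
        if probs is None or ari > best_ari:
            # Accept the projection (the first one is always accepted here).
            labels, probs, best_ari, retries = new_labels, new_probs, ari, 0
        else:
            retries += 1
            if retries >= max_retries:
                break
    return labels, probs


def toy_data(n_per=20, seed=1):
    """Three Gaussian blobs in four dimensions; the last feature is noise."""
    rng = random.Random(seed)
    centers = [(0, 0, 0, 0), (3, 3, 0, 0), (0, 3, 3, 0)]
    return [[rng.gauss(m, 0.5) for m in c]
            for c in centers for _ in range(n_per)]


X = toy_data()
runs = [icp(X, k=3, seed=s) for s in range(3)]
# Column-wise concatenation of the per-run probability matrices.
consensus = [[p for _, probs in runs for p in probs[i]] for i in range(len(X))]
print(len(consensus), len(consensus[0]))  # 60 9
```

In this sketch each run contributes `k` probability columns per cell, so three runs with `k=3` give a 60 x 9 matrix that could then be passed to a dimensionality reduction or clustering step. The toy data, learning rate and penalty strength are arbitrary illustrative choices and are not taken from the paper.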

AVAILABILITY AND IMPLEMENTATION

ILoReg is available as an R package at https://bioconductor.org/packages/ILoReg.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
