cytoKernel: Robust kernel embeddings for assessing differential expression of single cell data.

Authors

Ghosh Tusharkanti, Baxter Ryan M, Seal Souvik, Lui Victor G, Rudra Pratyaydipta, Vu Thao, Hsieh Elena Wy, Ghosh Debashis

Affiliations

Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.

Department of Immunology and Microbiology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.

Publication information

bioRxiv. 2024 Aug 19:2024.08.16.608287. doi: 10.1101/2024.08.16.608287.

Abstract

High-throughput sequencing of single-cell data can be used to rigorously evaluate cell specification and to reveal intricate variations between groups or conditions. Many popular existing methods for differential expression target differences in aggregate measurements (mean, median, sum) and are therefore limited to detecting only global differential changes. We present cytoKernel, a robust method for differential expression of single-cell data based on a kernel-based score test. cytoKernel is specifically designed to assess the differential expression of single-cell RNA sequencing and high-dimensional flow or mass cytometry data using the full probability distribution pattern. It is based on kernel embeddings, which use the probability distributions of the single-cell data by calculating the pairwise divergence/distance between the distributions of subjects. It can detect both patterns involving aggregate changes and more elusive variations that are often overlooked due to the multimodal characteristics of single-cell data. We performed extensive benchmarks across both simulated and real data sets from mass cytometry and single-cell RNA sequencing. The cytoKernel procedure effectively controls the False Discovery Rate (FDR) and shows favourable performance compared to existing methods. The method is able to identify more differential patterns than existing approaches. We apply cytoKernel to assess gene expression and protein marker expression differences from cell subpopulations in various publicly available single-cell RNA-seq and mass cytometry data sets. The methods described in this paper are implemented in the open-source R package cytoKernel, which is freely available from Bioconductor at http://bioconductor.org/packages/cytoKernel.
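The idea of comparing subjects through pairwise distances between their full expression distributions can be illustrated with a minimal Python sketch. This is not the cytoKernel R package or its API, and the abstract does not specify these details: the histogram density estimate, the Jensen-Shannon distance, the Gaussian kernel with a median bandwidth, and the permutation p-value (used here in place of the score test's own null distribution) are all assumptions chosen for illustration, as are every function name and the simulated data.

```python
import numpy as np

def histogram_density(values, bin_edges):
    """Discrete density estimate for one subject's cells (assumed estimator)."""
    counts, _ = np.histogram(values, bins=bin_edges)
    probs = counts.astype(float) + 1e-12  # tiny constant avoids log(0)
    return probs / probs.sum()

def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence (assumed distance)."""
    m = 0.5 * (p + q)
    jsd = 0.5 * np.sum(p * np.log(p / m)) + 0.5 * np.sum(q * np.log(q / m))
    return np.sqrt(max(jsd, 0.0))

def pairwise_distances(densities):
    """Symmetric matrix of distances between all pairs of subjects."""
    n = len(densities)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = jensen_shannon_distance(densities[i], densities[j])
            dist[i, j] = dist[j, i] = d
    return dist

def gaussian_kernel(dist):
    """Gaussian kernel on the distances; median bandwidth is an assumption."""
    positive = dist[dist > 0]
    sigma = np.median(positive) if positive.size else 1.0
    return np.exp(-dist**2 / (2.0 * sigma**2))

def kernel_permutation_test(K, labels, n_perm=2000, seed=0):
    """Permutation p-value for the quadratic form y' K y with centred labels."""
    rng = np.random.default_rng(seed)
    y = np.asarray(labels, dtype=float)
    y = y - y.mean()
    observed = y @ K @ y
    exceed = 0
    for _ in range(n_perm):
        y_perm = rng.permutation(y)
        if y_perm @ K @ y_perm >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Simulated marker: both groups have mean near 0, but group B is bimodal.
rng = np.random.default_rng(1)
bin_edges = np.linspace(-5.0, 5.0, 21)
group_a = [rng.normal(0.0, 1.0, 500) for _ in range(6)]
group_b = [
    np.concatenate([rng.normal(-2.0, 0.5, 250), rng.normal(2.0, 0.5, 250)])
    for _ in range(6)
]
densities = [histogram_density(v, bin_edges) for v in group_a + group_b]
labels = np.array([0] * 6 + [1] * 6)

K = gaussian_kernel(pairwise_distances(densities))
p_value = kernel_permutation_test(K, labels)
print(f"permutation p-value: {p_value:.4f}")
```

In this simulated example the two groups share roughly the same mean, so a test on aggregate measurements would have little to detect, while the distribution-level comparison still separates them.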
