The Poisson distribution model fits UMI-based single-cell RNA-sequencing data.

Authors

Pan Yue, Landis Justin T, Moorad Razia, Wu Di, Marron J S, Dittmer Dirk P

Publication information

Res Sq. 2023 Feb 6:rs.3.rs-2517698. doi: 10.21203/rs.3.rs-2517698/v1.


DOI: 10.21203/rs.3.rs-2517698/v1
PMID: 36798423
Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9934739/
Abstract

Modeling of single-cell RNA-sequencing (scRNA-seq) data remains challenging due to a high percentage of zeros and data heterogeneity, so improved modeling has strong potential to benefit many downstream data analyses. The existing zero-inflated or over-dispersed models are based on aggregations at either the gene or the cell level. However, they typically lose accuracy because aggregation at those two levels is too crude. We avoid the crude approximations entailed by such aggregation by proposing an Independent Poisson Distribution (IPD) at each individual entry in the scRNA-seq data matrix. This approach naturally and intuitively models the large number of zeros as matrix entries with a very small Poisson parameter. The critical challenge of cell clustering is approached via a novel data representation as Departures from a simple homogeneous IPD (DIPD) to capture the per-gene-per-cell intrinsic heterogeneity generated by cell clusters. Our experiments using real data and crafted experiments show that using DIPD as a data representation for scRNA-seq data can uncover novel cell subtypes that are missed or can only be found by careful parameter tuning using conventional methods. This new method has multiple advantages, including (1) no need for prior feature selection or manual optimization of hyperparameters; (2) flexibility to combine with and improve upon other methods, such as Seurat. Another novel contribution is the use of crafted experiments as part of the validation of our newly developed DIPD-based clustering pipeline. This new clustering pipeline is implemented in the R (CRAN) package .
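The idea of an entry-wise Poisson model and of departures from a homogeneous fit can be illustrated with a small sketch. The abstract does not specify the estimator or the form of the departure, so everything below is an assumption made for illustration: the rank-one rate estimate (gene total times cell total over grand total), the Anscombe-style square-root residual, the simulated data, and the function names `homogeneous_ipd_rates` and `departures` are all hypothetical and are not taken from the authors' pipeline or their R package.

```python
import numpy as np


def homogeneous_ipd_rates(counts):
    """Assumed homogeneous fit: one Poisson rate per matrix entry,
    lambda[g, c] = gene_total[g] * cell_total[c] / grand_total."""
    counts = np.asarray(counts, dtype=float)
    gene_totals = counts.sum(axis=1, keepdims=True)  # shape (genes, 1)
    cell_totals = counts.sum(axis=0, keepdims=True)  # shape (1, cells)
    return gene_totals * cell_totals / counts.sum()


def departures(counts, rates):
    """Assumed departure measure: difference of square-root
    (Anscombe-style) transforms of observed counts and fitted rates."""
    counts = np.asarray(counts, dtype=float)
    return 2.0 * (np.sqrt(counts + 0.375) - np.sqrt(rates + 0.375))


# Simulated UMI-like counts from a single homogeneous population
# (illustrative data only, not from the paper).
rng = np.random.default_rng(0)
gene_effect = rng.gamma(shape=0.5, scale=1.0, size=(200, 1))
cell_depth = rng.uniform(0.5, 1.5, size=(1, 50))
counts = rng.poisson(gene_effect * cell_depth)

rates = homogeneous_ipd_rates(counts)
resid = departures(counts, rates)

# Under a Poisson model, P(X = 0) = exp(-lambda), so small rates
# account for many zeros without a separate zero-inflation term.
expected_zero_fraction = np.exp(-rates).mean()
observed_zero_fraction = (counts == 0).mean()
print(expected_zero_fraction, observed_zero_fraction)
```

In this sketch the matrix `resid` plays the role of a departure representation: for data that really come from one homogeneous population the residuals should look like unstructured noise, whereas cluster structure would show up as systematic blocks of positive and negative values that a downstream clustering step could pick up.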

Similar articles

[1]
The Poisson distribution model fits UMI-based single-cell RNA-sequencing data.

Res Sq. 2023-2-6

[2]
The Poisson distribution model fits UMI-based single-cell RNA-sequencing data.

BMC Bioinformatics. 2023-6-17

[3]
DIMM-SC: a Dirichlet mixture model for clustering droplet-based single cell transcriptomic data.

Bioinformatics. 2018-1-1

[4]
Statistical methods for analysis of single-cell RNA-sequencing data.

MethodsX. 2021-11-17

[5]
scBGEDA: deep single-cell clustering analysis via a dual denoising autoencoder with bipartite graph ensemble clustering.

Bioinformatics. 2023-2-14

[6]
Deep structural clustering for single-cell RNA-seq data jointly through autoencoder and graph neural network.

Brief Bioinform. 2022-3-10

[7]
Comparison of scRNA-seq data analysis method combinations.

Brief Funct Genomics. 2022-11-17

[8]
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022-2-1

[9]
scGCL: an imputation method for scRNA-seq data based on graph contrastive learning.

Bioinformatics. 2023-3-1

[10]
scDCCA: deep contrastive clustering for single-cell RNA-seq data based on auto-encoder network.

Brief Bioinform. 2023-1-19
