JOINT for large-scale single-cell RNA-sequencing analysis via soft-clustering and parallel computing.

Authors

Cui Tao, Wang Tingting

Affiliations

Department of Pharmacology and Physiology, Georgetown University Medical Center, Washington, DC, 20057, USA.

Interdisciplinary Program in Neuroscience, Georgetown University Medical Center, Washington, DC, 20057, USA.

Publication information

BMC Genomics. 2021 Jan 11;22(1):47. doi: 10.1186/s12864-020-07302-6.

DOI:10.1186/s12864-020-07302-6
PMID:33430769
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7798298/
Abstract

BACKGROUND

Single-cell RNA-Sequencing (scRNA-Seq) has provided single-cell level insights into complex biological processes. However, the high frequency of gene expression detection failures in scRNA-Seq data makes it challenging to reliably identify cell types and Differentially Expressed Genes (DEG). Moreover, with the explosive growth of single-cell data generated using the 10x Genomics protocol, existing methods will soon reach their computational limits due to scalability issues. The single-cell transcriptomics field urgently needs new tools and frameworks to facilitate large-scale single-cell analysis.

RESULTS

To improve the accuracy, robustness, and speed of scRNA-Seq data processing, we propose a generalized zero-inflated negative binomial mixture model, "JOINT," that performs probability-based cell-type discovery and DEG analysis simultaneously, without the need for imputation. JOINT performs soft-clustering for cell-type identification by computing, for each individual cell, the probability that it belongs to each cell type; that is, a cell can belong to multiple cell types with different probabilities. This differs fundamentally from existing hard-clustering methods, in which each cell can belong to only one cell type. The soft-clustering component of the algorithm substantially improves the accuracy and robustness of single-cell analysis, especially when the scRNA-Seq datasets are noisy and contain a large number of dropout events. Moreover, JOINT determines the optimal number of cell types automatically rather than requiring it to be specified empirically. Fitting the proposed model is an unsupervised learning problem, which is solved using the Expectation-Maximization (EM) algorithm. The EM algorithm is implemented in the TensorFlow deep learning framework, which dramatically accelerates data analysis through parallel GPU computing.
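To make the soft-clustering idea concrete, the following is a minimal sketch of the E-step of a zero-inflated negative binomial (ZINB) mixture. It is not the authors' implementation, which the abstract says is written in TensorFlow. The function names `zinb_log_pmf` and `soft_assign`, the mean/dispersion parameterization `(mu, r, pi)`, the assumption that genes are independent given the cell type, and the toy parameter values are all assumptions made for illustration.

```python
import math


def zinb_log_pmf(x, mu, r, pi):
    """Log-probability of count x under a zero-inflated negative binomial.

    Assumed parameterization (illustrative, not taken from the paper):
    mu = NB mean (> 0), r = NB dispersion (> 0), pi = dropout probability in [0, 1).
    """
    log_nb = (
        math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
        + r * math.log(r / (r + mu))
        + x * math.log(mu / (r + mu))
    )
    if x == 0:
        # An observed zero is either a dropout or a genuine NB zero.
        return math.log(pi + (1.0 - pi) * math.exp(log_nb))
    # A non-zero count can only come from the NB component.
    return math.log(1.0 - pi) + log_nb


def soft_assign(cell, weights, params):
    """E-step for one cell: posterior probability of each cell type.

    cell:    list of counts, one per gene
    weights: mixture weight per cell type (positive, summing to 1)
    params:  params[k][g] = (mu, r, pi) for cell type k and gene g
    Genes are treated as independent given the cell type (an assumption).
    """
    log_joint = [
        math.log(w) + sum(zinb_log_pmf(x, *params[k][g]) for g, x in enumerate(cell))
        for k, w in enumerate(weights)
    ]
    m = max(log_joint)  # log-sum-exp trick for numerical stability
    unnorm = [math.exp(v - m) for v in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Toy example: two cell types, two genes, hypothetical parameters.
params = [
    [(10.0, 2.0, 0.3), (0.5, 2.0, 0.3)],  # type 0 expresses gene 0 highly
    [(0.5, 2.0, 0.3), (10.0, 2.0, 0.3)],  # type 1 expresses gene 1 highly
]
weights = [0.5, 0.5]

probs_clear = soft_assign([12, 0], weights, params)      # strongly favors type 0
probs_ambiguous = soft_assign([0, 0], weights, params)   # all zeros: no preference
print(probs_clear, probs_ambiguous)
```

In this toy setup, a cell with all-zero counts is split evenly between the two types rather than being forced into one. This is the behavior that distinguishes soft clustering from hard clustering when dropouts are frequent. A full EM fit would alternate this step with an M-step that re-estimates `weights` and `params`.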

CONCLUSIONS

Taken together, the JOINT algorithm is accurate and efficient for large-scale scRNA-Seq data analysis via parallel computing. The Python package that we have developed can be readily applied to aid future advances in parallel computing-based single-cell algorithms and research in various biological and biomedical fields.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/35ec/7798298/cffd0f3028f0/12864_2020_7302_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/35ec/7798298/6ddec47f5412/12864_2020_7302_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/35ec/7798298/1e341945627a/12864_2020_7302_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/35ec/7798298/aee868ec31aa/12864_2020_7302_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/35ec/7798298/4aa9f98f0b96/12864_2020_7302_Fig5_HTML.jpg

Similar articles

1
JOINT for large-scale single-cell RNA-sequencing analysis via soft-clustering and parallel computing.
BMC Genomics. 2021 Jan 11;22(1):47. doi: 10.1186/s12864-020-07302-6.
2
A comprehensive assessment of hurdle and zero-inflated models for single cell RNA-sequencing analysis.
Brief Bioinform. 2023 Sep 20;24(5). doi: 10.1093/bib/bbad272.
3
scBGEDA: deep single-cell clustering analysis via a dual denoising autoencoder with bipartite graph ensemble clustering.
Bioinformatics. 2023 Feb 14;39(2). doi: 10.1093/bioinformatics/btad075.
4
SSNMDI: a novel joint learning model of semi-supervised non-negative matrix factorization and data imputation for clustering of single-cell RNA-seq data.
Brief Bioinform. 2023 May 19;24(3). doi: 10.1093/bib/bbad149.
5
Consensus clustering of single-cell RNA-seq data by enhancing network affinity.
Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab236.
6
Deep structural clustering for single-cell RNA-seq data jointly through autoencoder and graph neural network.
Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbac018.
7
Contrastive self-supervised clustering of scRNA-seq data.
BMC Bioinformatics. 2021 May 27;22(1):280. doi: 10.1186/s12859-021-04210-8.
8
jSRC: a flexible and accurate joint learning algorithm for clustering of single-cell RNA-sequencing data.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbaa433.
9
Attention-based deep clustering method for scRNA-seq cell type identification.
PLoS Comput Biol. 2023 Nov 10;19(11):e1011641. doi: 10.1371/journal.pcbi.1011641. eCollection 2023 Nov.
10
Deep enhanced constraint clustering based on contrastive learning for scRNA-seq data.
Brief Bioinform. 2023 Jul 20;24(4). doi: 10.1093/bib/bbad222.

Cited by

1
A comprehensive assessment of hurdle and zero-inflated models for single cell RNA-sequencing analysis.
Brief Bioinform. 2023 Sep 20;24(5). doi: 10.1093/bib/bbad272.
2
Correction to: JOINT for large-scale single-cell RNA-sequencing analysis via soft-clustering and parallel computing.
BMC Genomics. 2021 Mar 29;22(1):223. doi: 10.1186/s12864-021-07408-5.