Robust self-supervised learning strategy to tackle the inherent sparsity in single-cell RNA-seq data.

Affiliations

School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, 61005, Gwangju, South Korea.

Artificial Intelligence Graduate School, Gwangju Institute of Science and Technology, 61005, Gwangju, South Korea.

Publication information

Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae586.

Abstract

Single-cell RNA sequencing (scRNA-seq) is a powerful tool for elucidating cellular heterogeneity and tissue function in various biological contexts. However, the sparsity in scRNA-seq data limits the accuracy of cell type annotation and transcriptomic analysis due to information loss. To address this limitation, we present scRobust, a robust self-supervised learning strategy to tackle the inherent sparsity of scRNA-seq data. Built upon the Transformer architecture, scRobust employs a novel self-supervised learning strategy comprising contrastive learning and gene expression prediction tasks. We demonstrated the effectiveness of scRobust using nine benchmarks, additional dropout scenarios, and combined datasets. scRobust outperformed recent methods in cell-type annotation tasks and generated cell embeddings that capture multi-faceted clustering information (e.g. cell types and HbA1c levels). In addition, cell embeddings of scRobust were useful for detecting specific marker genes related to drug tolerance stages. Furthermore, when we applied scRobust to scATAC-seq data, high-quality cell embedding vectors were generated. These results demonstrate the representational power of scRobust.
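The abstract names two self-supervised objectives, contrastive learning and gene expression prediction, without giving their exact form. The following is a minimal NumPy sketch of how such a pair of losses could be combined, not scRobust's actual implementation. It rests on several assumptions that are not in the abstract: the two views of a cell are random gene subsets, the contrastive term is a standard InfoNCE loss, the prediction term is a mean squared error on a dot product, and a weighted mean of gene embeddings stands in for the Transformer encoder. All function names and sizes are hypothetical.

```python
import numpy as np

def random_gene_view(cell_expr, n_genes, rng):
    """Assumed augmentation: sample a random gene subset from one cell."""
    idx = rng.choice(cell_expr.shape[0], size=n_genes, replace=False)
    return idx, cell_expr[idx]

def embed(idx, values, gene_emb):
    """Toy stand-in for a Transformer encoder:
    expression-weighted mean of the selected gene embeddings."""
    w = values + 1e-8  # keep weights positive even for all-zero (dropout) views
    return (w[:, None] * gene_emb[idx]).sum(axis=0) / w.sum()

def info_nce(z1, z2, temperature=0.1):
    """Generic InfoNCE loss: row i of z1 should match row i of z2."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def expression_prediction_loss(cell_emb, gene_emb, idx, target):
    """Assumed prediction head: expression ~ dot(gene embedding, cell embedding)."""
    pred = gene_emb[idx] @ cell_emb
    return np.mean((pred - target) ** 2)

rng = np.random.default_rng(0)
n_cells, n_total_genes, dim = 8, 200, 16
# Sparse toy count matrix (most entries are zero), not real scRNA-seq data.
expr = rng.poisson(0.3, size=(n_cells, n_total_genes)).astype(float)
gene_emb = rng.normal(size=(n_total_genes, dim))

views_a, views_b = [], []
for c in range(n_cells):
    ia, va = random_gene_view(expr[c], 50, rng)
    ib, vb = random_gene_view(expr[c], 50, rng)
    views_a.append(embed(ia, va, gene_emb))
    views_b.append(embed(ib, vb, gene_emb))
z_a, z_b = np.stack(views_a), np.stack(views_b)

contrastive = info_nce(z_a, z_b)
idx, target = random_gene_view(expr[0], 20, rng)
prediction = expression_prediction_loss(z_a[0], gene_emb, idx, target)
total = contrastive + prediction  # equal weighting is an arbitrary choice here
```

In a real training loop both terms would be backpropagated through a learned encoder; here the embeddings are fixed random vectors, so the values only illustrate how the two losses are computed.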

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/24b5/11568879/246490b12c88/bbae586f1.jpg
