
scLENS: data-driven signal detection for unbiased scRNA-seq data analysis

Authors

Kim Hyun, Chang Won, Chae Seok Joo, Park Jong-Eun, Seo Minseok, Kim Jae Kyoung

Affiliations

Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science, Daejeon, 34126, Republic of Korea.

Division of Statistics and Data Science, University of Cincinnati, Cincinnati, OH, 45221, USA.

Publication information

Nat Commun. 2024 Apr 27;15(1):3575. doi: 10.1038/s41467-024-47884-3.

DOI: 10.1038/s41467-024-47884-3
PMID: 38678050
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11519519/

Abstract

High dimensionality and noise have limited the new biological insights that can be discovered in scRNA-seq data. While dimensionality reduction tools have been developed to extract biological signals from the data, they often require manual determination of signal dimension, introducing user bias. Furthermore, a common data preprocessing method, log normalization, can unintentionally distort signals in the data. Here, we develop scLENS, a dimensionality reduction tool that circumvents the long-standing issues of signal distortion and manual input. Specifically, we identify the primary cause of signal distortion during log normalization and effectively address it by uniformizing cell vector lengths with L2 normalization. Furthermore, we utilize random matrix theory-based noise filtering and a signal robustness test to enable data-driven determination of the threshold for signal dimensions. Our method outperforms 11 widely used dimensionality reduction tools and performs particularly well for challenging scRNA-seq datasets with high sparsity and variability. To facilitate the use of scLENS, we provide a user-friendly package that automates accurate signal detection of scRNA-seq data without manual time-consuming tuning.
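
The two ideas named in the abstract, equalizing cell vector lengths with L2 normalization after log normalization and using random matrix theory to set a data-driven threshold on the number of signal dimensions, can be illustrated with a minimal NumPy sketch. This is not the scLENS package or its API. The function names, the scale factor of 10,000, the gene-wise standardization, and the use of the Marchenko-Pastur upper edge as the only cutoff are assumptions made for illustration, and the signal robustness test described in the abstract is not reproduced here.

```python
import numpy as np


def log_l2_normalize(counts, scale=1e4):
    """Log-normalize a cells x genes count matrix, then give every cell unit L2 length.

    `scale` is an assumed, conventional library-size factor; it is not taken from the paper.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    if np.any(totals == 0):
        raise ValueError("remove cells with zero total counts first")
    logged = np.log1p(counts / totals * scale)   # conventional log normalization
    norms = np.linalg.norm(logged, axis=1, keepdims=True)
    return logged / norms                        # every row now has length 1


def count_signal_dims(data):
    """Count eigenvalues above the Marchenko-Pastur upper edge (illustrative cutoff only)."""
    data = np.asarray(data, dtype=float)
    data = data[:, data.std(axis=0) > 0]         # drop constant genes
    n_cells, n_genes = data.shape
    z = (data - data.mean(axis=0)) / data.std(axis=0)   # zero mean, unit variance per gene
    eigenvalues = np.linalg.eigvalsh(z.T @ z / n_cells)
    # Upper edge of the noise bulk for unit-variance i.i.d. entries
    upper_edge = (1.0 + np.sqrt(n_genes / n_cells)) ** 2
    return int(np.sum(eigenvalues > upper_edge))


# Synthetic example: 200 cells, 50 genes, two groups differing in the first 10 genes
rng = np.random.default_rng(0)
rates = np.full((200, 50), 2.0)
rates[100:, :10] = 20.0
counts = rng.poisson(rates)

normalized = log_l2_normalize(counts)
n_signal = count_signal_dims(normalized)
print("dimensions above the noise edge:", n_signal)
```

In this sketch the threshold depends only on the matrix shape, so no user-chosen number of components is needed. In finite samples the largest noise eigenvalue can fluctuate slightly around the edge, which is one reason an additional robustness check, such as the one the abstract mentions, can be useful.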

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e91a/11519519/75f2350c240b/41467_2024_47884_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e91a/11519519/8822a0d301c0/41467_2024_47884_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e91a/11519519/564827b0ecae/41467_2024_47884_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e91a/11519519/7f0d0a29b804/41467_2024_47884_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e91a/11519519/f4609ba751d2/41467_2024_47884_Fig5_HTML.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e91a/11519519/d527b5f2c1b1/41467_2024_47884_Fig6_HTML.jpg
Figure 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e91a/11519519/502cc5f070d4/41467_2024_47884_Fig7_HTML.jpg
