Missing Value Imputation With Low-Rank Matrix Completion in Single-Cell RNA-Seq Data by Considering Cell Heterogeneity.

Authors

Huang Meng, Ye Xiucai, Li Hongmin, Sakurai Tetsuya

Affiliations

Department of Computer Science, University of Tsukuba, Tsukuba, Japan.

Center for Artificial Intelligence Research, University of Tsukuba, Tsukuba, Japan.

Publication information

Front Genet. 2022 Jul 14;13:952649. doi: 10.3389/fgene.2022.952649. eCollection 2022.

DOI: 10.3389/fgene.2022.952649
PMID: 35910201
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9329700/
Abstract

Single-cell RNA sequencing (scRNA-seq) technologies measure gene expression in individual cells, which is helpful for exploring cancer heterogeneity and for precision medicine. However, various sources of technical noise produce false zero values (missing gene expression values) in scRNA-seq data, termed dropout events. These zero values complicate the analysis of cell patterns and hinder high-precision analysis of intra-tumor heterogeneity. Recovering missing gene expression values remains a major obstacle in scRNA-seq data analysis. In this study, taking cell heterogeneity into consideration, we develop a novel method, called single cell Gauss-Newton Gene expression Imputation (scGNGI), to impute scRNA-seq expression matrices using low-rank matrix completion. Experimental results on simulated and real scRNA-seq datasets show that scGNGI imputes missing values in scRNA-seq gene expression more effectively and improves downstream analysis compared with other state-of-the-art methods. Moreover, we show that the proposed method better preserves gene expression variability among cells. Overall, this study supports the exploration of complex biological systems and precision medicine using scRNA-seq data.
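
To make the idea of low-rank matrix completion concrete, the following is a minimal sketch of a generic iterative truncated-SVD imputation on a toy genes-by-cells matrix. It is not the scGNGI algorithm described in the abstract (no Gauss-Newton step and no handling of cell heterogeneity); the function name `lowrank_impute`, the fixed `rank`, the simulated data, and the assumption that dropout positions are known through `mask` are all illustrative choices made here. In real scRNA-seq data, deciding which zeros are dropouts is itself part of the problem.

```python
import numpy as np

def lowrank_impute(X, mask, rank=2, n_iter=200, tol=1e-6):
    """Fill entries where mask is False using iterative truncated SVD.

    X    : (genes x cells) array; values at unobserved entries are ignored.
    mask : boolean array, True where the entry is treated as observed.
    """
    X = np.asarray(X, dtype=float)
    # Start by filling unobserved entries with the mean of the observed ones.
    filled = np.where(mask, X, X[mask].mean())
    for _ in range(n_iter):
        # Best rank-`rank` approximation of the current filled matrix.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep observed values fixed; update only the unobserved ones.
        updated = np.where(mask, X, approx)
        change = np.linalg.norm(updated - filled) / max(np.linalg.norm(filled), 1e-12)
        filled = updated
        if change < tol:
            break
    return filled

# Toy data (hypothetical): a rank-2 "expression" matrix, 30 genes x 20 cells.
rng = np.random.default_rng(0)
truth = rng.random((30, 2)) @ rng.random((2, 20))

# Simulate dropouts: roughly 20% of entries are zeroed out.
mask = rng.random(truth.shape) > 0.2
observed = np.where(mask, truth, 0.0)

imputed = lowrank_impute(observed, mask, rank=2)

# Compare errors on the dropped entries: imputed values vs. leaving zeros.
missing = ~mask
err_imputed = np.linalg.norm((imputed - truth)[missing])
err_zero = np.linalg.norm((observed - truth)[missing])
print(f"error with zeros left in place: {err_zero:.4f}")
print(f"error after low-rank imputation: {err_imputed:.4f}")
```

The sketch keeps observed values untouched and only rewrites the masked entries, which is the property that lets imputation recover signal without distorting measured expression. The rank is fixed by hand here; a practical method would need to choose or adapt it.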
