A reference-free approach for cell type classification with scRNA-seq.

Authors

Sun Qi, Peng Yifan, Liu Jinze

Affiliations

Department of Computer Science, University of Kentucky, Lexington, KY, 40508, USA.

Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, USA.

Publication information

iScience. 2021 Jul 14;24(8):102855. doi: 10.1016/j.isci.2021.102855. eCollection 2021 Aug 20.

DOI:10.1016/j.isci.2021.102855
PMID:34381979
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8335627/
Abstract

Single-cell RNA sequencing (scRNA-seq) has become a revolutionary technology to characterize cells under different biological conditions. Unlike bulk RNA-seq, gene expression from scRNA-seq is highly sparse due to limited sequencing depth per cell. This is worsened by tossing away a significant portion of reads that attribute to gene quantification. To overcome data sparsity and fully utilize original reads, we propose scSimClassify, a reference-free and alignment-free approach to classify cell types with k-mer level features. The compressed k-mer groups (CKGs), identified by the simhash method, contain k-mers with similar abundance profiles and serve as the cells' features. Our experiments demonstrate that CKG features lend themselves to better performance than gene expression features in scRNA-seq classification accuracy in the majority of experimental cases. Because CKGs are derived from raw reads without alignment to reference genome, scSimClassify offers an effective alternative to existing methods especially when reference genome is incomplete or insufficient to represent subject genomes.
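
The abstract says that k-mers with similar abundance profiles are grouped by simhash and that the groups serve as cell features, but it does not spell out the procedure. The following Python sketch is only an illustration of that general idea, not the scSimClassify implementation: it assumes a random-hyperplane sign hash as the simhash variant, and the function names (`count_kmers`, `simhash_bits`, `group_kmers`, `cell_features`), the fingerprint length, and the mean-centering step are all hypothetical choices made for this example.

```python
import random
from collections import Counter, defaultdict


def count_kmers(reads, k):
    """Count every length-k substring across one cell's reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def simhash_bits(profile, hyperplanes):
    """Fingerprint a real-valued profile: one sign bit per hyperplane."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, profile)) >= 0 else 0
        for plane in hyperplanes
    )


def group_kmers(cells, k, n_bits=8, seed=0):
    """Group k-mers whose cross-cell abundance profiles share a fingerprint.

    `cells` is a list of cells, each a list of read strings.
    Assumption: n_bits random Gaussian hyperplanes stand in for simhash.
    """
    per_cell = [count_kmers(reads, k) for reads in cells]
    vocab = sorted(set().union(*per_cell))
    rng = random.Random(seed)
    hyperplanes = [[rng.gauss(0, 1) for _ in cells] for _ in range(n_bits)]
    groups = defaultdict(list)
    for kmer in vocab:
        profile = [counts[kmer] for counts in per_cell]
        # Subtract the mean so a constant abundance offset
        # does not drive the sign bits.
        mean = sum(profile) / len(profile)
        centered = [x - mean for x in profile]
        groups[simhash_bits(centered, hyperplanes)].append(kmer)
    return per_cell, dict(groups)


def cell_features(per_cell, groups):
    """One feature per group: summed abundance of its k-mers in each cell."""
    keys = sorted(groups)
    return [
        [sum(counts[kmer] for kmer in groups[key]) for key in keys]
        for counts in per_cell
    ]
```

Calling `group_kmers([["ACGTACGT"], ["ACGT"], ["TTTT"]], k=4)` and passing the result to `cell_features` yields one row per cell and one column per k-mer group. K-mers with identical profiles always land in the same group, while merely similar profiles share a fingerprint only with some probability, which is the usual trade-off of locality-sensitive hashing.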

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96a7/8335627/8424ae2579dc/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96a7/8335627/9a6d60626a73/fx1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96a7/8335627/1e64d38dcbd6/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96a7/8335627/00db1a6bea47/gr2.jpg

Similar articles

1
A reference-free approach for cell type classification with scRNA-seq.
iScience. 2021 Jul 14;24(8):102855. doi: 10.1016/j.isci.2021.102855. eCollection 2021 Aug 20.
2
Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads.
Gigascience. 2015 Oct 19;4:48. doi: 10.1186/s13742-015-0089-y. eCollection 2015.
3
scCensus: Off-target scRNA-seq reads reveal meaningful biology.
bioRxiv. 2024 Jan 31:2024.01.29.577807. doi: 10.1101/2024.01.29.577807.
4
Detecting differential alternative splicing events in scRNA-seq with or without Unique Molecular Identifiers.
PLoS Comput Biol. 2020 Jun 5;16(6):e1007925. doi: 10.1371/journal.pcbi.1007925. eCollection 2020 Jun.
5
Scalable preprocessing for sparse scRNA-seq data exploiting prior knowledge.
Bioinformatics. 2018 Jul 1;34(13):i124-i132. doi: 10.1093/bioinformatics/bty293.
6
scNPF: an integrative framework assisted by network propagation and network fusion for preprocessing of single-cell RNA-seq data.
BMC Genomics. 2019 May 8;20(1):347. doi: 10.1186/s12864-019-5747-5.
7
A multitask clustering approach for single-cell RNA-seq analysis in Recessive Dystrophic Epidermolysis Bullosa.
PLoS Comput Biol. 2018 Apr 9;14(4):e1006053. doi: 10.1371/journal.pcbi.1006053. eCollection 2018 Apr.
8
Detection of high variability in gene expression from single-cell RNA-seq profiling.
BMC Genomics. 2016 Aug 22;17 Suppl 7(Suppl 7):508. doi: 10.1186/s12864-016-2897-6.
9
JingleBells: A Repository of Immune-Related Single-Cell RNA-Sequencing Datasets.
J Immunol. 2017 May 1;198(9):3375-3379. doi: 10.4049/jimmunol.1700272.
10
Cell-level somatic mutation detection from single-cell RNA sequencing.
Bioinformatics. 2019 Nov 1;35(22):4679-4687. doi: 10.1093/bioinformatics/btz288.

Cited by

1
Mapping Cell Identity from scRNA-seq: A primer on computational methods.
Comput Struct Biotechnol J. 2025 Apr 2;27:1559-1569. doi: 10.1016/j.csbj.2025.03.051. eCollection 2025.
2
Investigating biological nitrogen fixation via single-cell transcriptomics.
J Exp Bot. 2025 Feb 25;76(4):931-949. doi: 10.1093/jxb/erae454.
3
Sex-biased gene expression at single-cell resolution: cause and consequence of sexual dimorphism.
Evol Lett. 2023 Apr 14;7(3):148-156. doi: 10.1093/evlett/qrad013. eCollection 2023 Jun.
4
Statistical and Computational Methods for Proteogenomic Data Analysis.
Methods Mol Biol. 2023;2629:271-303. doi: 10.1007/978-1-0716-2986-4_13.
5
BLEND: a fast, memory-efficient and accurate mechanism to find fuzzy seed matches in genome analysis.
NAR Genom Bioinform. 2023 Jan 20;5(1):lqad004. doi: 10.1093/nargab/lqad004. eCollection 2023 Mar.

References

1
Immunophenotyping of COVID-19 and influenza highlights the role of type I interferons in development of severe COVID-19.
Sci Immunol. 2020 Jul 10;5(49). doi: 10.1126/sciimmunol.abd1554.
2
Integrative Analysis and Machine Learning based Characterization of Single Circulating Tumor Cells.
J Clin Med. 2020 Apr 22;9(4):1206. doi: 10.3390/jcm9041206.
3
scPred: accurate supervised method for cell-type classification from single-cell RNA-seq data.
Genome Biol. 2019 Dec 12;20(1):264. doi: 10.1186/s13059-019-1862-5.
4
Dopamine and cAMP-regulated phosphoprotein 32 kDa (DARPP-32) and survival in breast cancer: a retrospective analysis of protein and mRNA expression.
Sci Rep. 2019 Nov 18;9(1):16987. doi: 10.1038/s41598-019-53529-z.
5
A systematic evaluation of single cell RNA-seq analysis pipelines.
Nat Commun. 2019 Oct 11;10(1):4667. doi: 10.1038/s41467-019-12266-7.
6
A comparison of automatic cell identification methods for single-cell RNA sequencing data.
Genome Biol. 2019 Sep 9;20(1):194. doi: 10.1186/s13059-019-1795-z.
7
ACTINN: automated identification of cell types in single cell RNA sequencing.
Bioinformatics. 2020 Jan 15;36(2):533-538. doi: 10.1093/bioinformatics/btz592.
8
Benchmarking of alignment-free sequence comparison methods.
Genome Biol. 2019 Jul 25;20(1):144. doi: 10.1186/s13059-019-1755-7.
9
MetaPheno: A critical evaluation of deep learning and machine learning in metagenome-based disease prediction.
Methods. 2019 Aug 15;166:74-82. doi: 10.1016/j.ymeth.2019.03.003. Epub 2019 Mar 16.
10
Challenges in unsupervised clustering of single-cell RNA-seq data.
Nat Rev Genet. 2019 May;20(5):273-282. doi: 10.1038/s41576-018-0088-9.