Evaluation of some aspects in supervised cell type identification for single-cell RNA-seq: classifier, feature selection, and reference construction.

Affiliations

Department of Computer Science, Emory University, 400 Dowman Drive, Atlanta, GA, 30322, USA.

Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, 1518 Clifton Road NE, Atlanta, GA, 30322, USA.

Publication information

Genome Biol. 2021 Sep 9;22(1):264. doi: 10.1186/s13059-021-02480-2.

DOI:10.1186/s13059-021-02480-2
PMID:34503564
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8427961/
Abstract

BACKGROUND

Cell type identification is one of the most important questions in single-cell RNA sequencing (scRNA-seq) data analysis. With the accumulation of public scRNA-seq data, supervised cell type identification methods have gained increasing popularity due to better accuracy, robustness, and computational performance. Despite all the advantages, the performance of the supervised methods relies heavily on several key factors: feature selection, prediction method, and, most importantly, choice of the reference dataset.

RESULTS

In this work, we perform extensive real data analyses to systematically evaluate these strategies in supervised cell type identification. We first benchmark nine classifiers along with six feature selection strategies and investigate the impact of reference data size and number of cell types on cell type prediction. Next, we examine how discrepancies between reference and target datasets, and data preprocessing steps such as imputation and batch effect correction, affect prediction performance. We also investigate the strategies of pooling and purifying reference data.

CONCLUSIONS

Based on our analysis results, we provide guidelines for using supervised cell typing methods. We suggest combining all individuals from available datasets to construct the reference dataset and using a multi-layer perceptron (MLP) as the classifier, along with the F-test as the feature selection method. All the code used for our analysis is available on GitHub (https://github.com/marvinquiet/RefConstruction_supervisedCelltyping).
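The recommended workflow (pool all reference individuals, select features with an F-test, classify with an MLP) can be illustrated with a minimal sketch. This is not the authors' code, which lives in the GitHub repository above. It assumes scikit-learn's `SelectKBest` with `f_classif` (a one-way ANOVA F-test) as a stand-in for the paper's F-test feature selection and `MLPClassifier` as a stand-in for its MLP. The simulated data, the helper `simulate_individual`, and all parameter values (`k=30`, one hidden layer of 64 units) are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_genes, n_types = 200, 3

def simulate_individual(n_cells, shift):
    """Hypothetical toy data: a cells-by-genes matrix in which each cell
    type over-expresses its own block of 10 marker genes. `shift` mimics
    a mild per-individual offset."""
    labels = rng.integers(0, n_types, size=n_cells)
    X = rng.normal(loc=shift, scale=1.0, size=(n_cells, n_genes))
    for t in range(n_types):
        X[labels == t, t * 10:(t + 1) * 10] += 3.0
    return X, labels

# Reference construction: pool cells from all available individuals.
individuals = [simulate_individual(300, s) for s in (0.0, 0.2, -0.2)]
X_ref = np.vstack([X for X, _ in individuals])
y_ref = np.concatenate([y for _, y in individuals])

# Target dataset from a new, unseen individual.
X_target, y_target = simulate_individual(200, 0.1)

# F-test feature selection followed by an MLP classifier.
model = make_pipeline(
    SelectKBest(f_classif, k=30),  # keep the 30 genes with the largest F statistic
    MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
)
model.fit(X_ref, y_ref)

accuracy = model.score(X_target, y_target)
n_selected = int(model.named_steps["selectkbest"].get_support().sum())
print(f"selected genes: {n_selected}, target accuracy: {accuracy:.3f}")
```

Wrapping the selector and classifier in one pipeline means the genes are chosen from the reference labels only and the same gene subset is applied to the target cells at prediction time.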

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/accc/8427961/b8b16e819313/13059_2021_2480_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/accc/8427961/b59441726e47/13059_2021_2480_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/accc/8427961/670fdaf2c47f/13059_2021_2480_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/accc/8427961/fb072ec18963/13059_2021_2480_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/accc/8427961/e0b02eeeee17/13059_2021_2480_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/accc/8427961/33c7b7e53b43/13059_2021_2480_Fig6_HTML.jpg

Similar articles

1
Evaluation of some aspects in supervised cell type identification for single-cell RNA-seq: classifier, feature selection, and reference construction.
Genome Biol. 2021 Sep 9;22(1):264. doi: 10.1186/s13059-021-02480-2.
2
A comprehensive comparison of supervised and unsupervised methods for cell type identification in single-cell RNA-seq.
Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbab567.
3
Single-cell RNA-seq data semi-supervised clustering and annotation via structural regularized domain adaptation.
Bioinformatics. 2021 May 5;37(6):775-784. doi: 10.1093/bioinformatics/btaa908.
4
Target-Oriented Reference Construction for supervised cell-type identification in scRNA-seq.
Res Sq. 2024 Jun 26:rs.3.rs-4559348. doi: 10.21203/rs.3.rs-4559348/v1.
5
Cellcano: supervised cell type identification for single cell ATAC-seq data.
Nat Commun. 2023 Apr 3;14(1):1864. doi: 10.1038/s41467-023-37439-3.
6
CTISL: a dynamic stacking multi-class classification approach for identifying cell types from single-cell RNA-seq data.
Bioinformatics. 2024 Feb 1;40(2). doi: 10.1093/bioinformatics/btae063.
7
Evaluation of methods to assign cell type labels to cell clusters from single-cell RNA-sequencing data.
F1000Res. 2019 Mar 15;8. doi: 10.12688/f1000research.18490.3. eCollection 2019.
8
scGAD: a new task and end-to-end framework for generalized cell type annotation and discovery.
Brief Bioinform. 2023 Mar 19;24(2). doi: 10.1093/bib/bbad045.
9
scMRA: a robust deep learning method to annotate scRNA-seq data with multiple reference datasets.
Bioinformatics. 2022 Jan 12;38(3):738-745. doi: 10.1093/bioinformatics/btab700.
10
A multitask clustering approach for single-cell RNA-seq analysis in Recessive Dystrophic Epidermolysis Bullosa.
PLoS Comput Biol. 2018 Apr 9;14(4):e1006053. doi: 10.1371/journal.pcbi.1006053. eCollection 2018 Apr.

Cited by

1
MINGLE: a mutual information-based interpretable framework for automatic cell type annotation in single-cell chromatin accessibility data.
Genome Biol. 2025 Jun 11;26(1):162. doi: 10.1186/s13059-025-03603-9.
2
TORC: Target-Oriented Reference Construction for supervised cell-type identification in scRNA-seq.
Genome Biol. 2025 Jun 10;26(1):157. doi: 10.1186/s13059-025-03614-6.
3
scaLR: a low-resource deep neural network-based platform for single cell analysis and biomarker discovery.
Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf243.
4
CellFM: a large-scale foundation model pre-trained on transcriptomics of 100 million human cells.
Nat Commun. 2025 May 20;16(1):4679. doi: 10.1038/s41467-025-59926-5.
5
Binned multinomial logistic regression for integrative cell-type annotation.
Ann Appl Stat. 2023 Dec;17(4):3426-3449. doi: 10.1214/23-aoas1769.
6
Combining single-cell ATAC and RNA sequencing for supervised cell annotation.
BMC Bioinformatics. 2025 Feb 26;26(1):67. doi: 10.1186/s12859-025-06084-6.
7
Multimodal hierarchical classification of CITE-seq data delineates immune cell states across lineages and tissues.
Cell Rep Methods. 2025 Jan 27;5(1):100938. doi: 10.1016/j.crmeth.2024.100938. Epub 2025 Jan 14.
8
MultiKano: an automatic cell type annotation tool for single-cell multi-omics data based on Kolmogorov-Arnold network and data augmentation.
Protein Cell. 2025 May 28;16(5):374-380. doi: 10.1093/procel/pwae069.
9
A comparison of scRNA-seq annotation methods based on experimentally labeled immune cell subtype dataset.
Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae392.
10
Single-cell omics: experimental workflow, data analyses and applications.
Sci China Life Sci. 2025 Jan;68(1):5-102. doi: 10.1007/s11427-023-2561-0. Epub 2024 Jul 23.