Identification of kidney cell types in scRNA-seq and snRNA-seq data using machine learning algorithms.

Author information

Tisch Adam, Madapoosi Siddharth, Blough Stephen, Rosa Jan, Eddy Sean, Mariani Laura, Naik Abhijit, Limonte Christine, McCown Philip, Menon Rajasree, Rosas Sylvia E, Parikh Chirag R, Kretzler Matthias, Mahfouz Ahmed, Alakwaa Fadhl

Affiliations

Undergraduate Research Opportunity Program, University of Michigan, Ann Arbor, MI, USA.

University of Michigan Medical School, Ann Arbor, MI, USA.

Publication information

Heliyon. 2024 Sep 27;10(19):e38567. doi: 10.1016/j.heliyon.2024.e38567. eCollection 2024 Oct 15.

DOI:10.1016/j.heliyon.2024.e38567

PMID:39403515

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11471582/

Abstract

INTRODUCTION

Single-cell RNA sequencing (scRNA-seq) and single-nucleus RNA sequencing (snRNA-seq) provide valuable insights into the cellular states of kidney cells. However, the annotation of cell types often requires extensive domain expertise and time-consuming manual curation, limiting scalability and generalizability. To facilitate this process, we tested the performance of five supervised classification methods for automatic cell type annotation.

RESULTS

We analyzed publicly available sc/snRNA-seq datasets from five expert-annotated studies, comprising 62,120 cells from 79 kidney biopsy samples. Datasets were integrated by harmonizing cell type annotations across studies. Five supervised machine learning algorithms (support vector machines, random forests, multilayer perceptrons, k-nearest neighbors, and extreme gradient boosting) were applied to automatically annotate cell types using four training datasets and one testing dataset. Performance was evaluated by accuracy (F1 score) and rejection rate. All five algorithms achieved high accuracy, with a median F1 score of 0.94 and a median rejection rate of 1.8%. The algorithms performed equally well across different datasets and successfully rejected cell types that were not present in the training data. However, F1 scores were lower when models trained primarily on scRNA-seq data were tested on snRNA-seq data.
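The evaluation scheme described above (train on all but one study, test on the held-out study, report F1 and rejection rate) can be sketched in a few lines. The following is a minimal illustration only, not the authors' pipeline: it uses a hand-written k-nearest-neighbors classifier, and the rejection rule (reject a cell when the winning label's vote share falls below `min_vote`), the choice to score rejected cells as misses, the two-dimensional integer "expression" vectors, and the labels `PT`, `POD`, and `IC` are all assumptions made for the example.

```python
from collections import Counter
from statistics import median


def knn_predict(train, query, k=3, min_vote=0.75):
    """Label `query` by majority vote of its k nearest training cells.

    Returns "rejected" when the winning label's vote share is below
    `min_vote` (an assumed confidence rule, not taken from the paper).
    """
    by_dist = sorted(
        train,
        key=lambda cell: sum((a - b) ** 2 for a, b in zip(cell[0], query)),
    )
    votes = Counter(label for _, label in by_dist[:k])
    label, count = votes.most_common(1)[0]
    return label if count / k >= min_vote else "rejected"


def per_class_f1(true, pred, classes):
    """F1 for each class; a rejected cell counts as a miss for its true class."""
    scores = {}
    for cls in classes:
        tp = sum(t == cls and p == cls for t, p in zip(true, pred))
        fp = sum(t != cls and p == cls for t, p in zip(true, pred))
        fn = sum(t == cls and p != cls for t, p in zip(true, pred))
        scores[cls] = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return scores


def leave_one_study_out(studies, **knn_kwargs):
    """Hold out each study in turn and train on the remaining ones."""
    results = {}
    for held_out, test_cells in studies.items():
        train = [
            cell
            for name, cells in studies.items()
            if name != held_out
            for cell in cells
        ]
        true = [label for _, label in test_cells]
        pred = [knn_predict(train, x, **knn_kwargs) for x, _ in test_cells]
        # Score F1 only on cell types the model saw during training.
        known = {label for _, label in train} & set(true)
        f1 = per_class_f1(true, pred, known)
        results[held_out] = {
            "median_f1": median(f1.values()),
            "rejection_rate": pred.count("rejected") / len(pred),
        }
    return results


# Synthetic toy data: (expression vector, harmonized label) per cell.
# "IC" appears only in study_C, so it is unseen when study_C is held out.
studies = {
    "study_A": [((0, 0), "PT"), ((1, 0), "PT"), ((4, 4), "PT"),
                ((10, 10), "POD"), ((9, 10), "POD")],
    "study_B": [((0, 1), "PT"), ((1, 1), "PT"),
                ((10, 9), "POD"), ((9, 9), "POD"), ((6, 6), "POD")],
    "study_C": [((0, 2), "PT"), ((10, 8), "POD"), ((5, 5), "IC")],
}

results = leave_one_study_out(studies, k=3, min_vote=0.75)
for name, metrics in results.items():
    print(name, metrics)
```

In this toy setup the `IC` cell, whose type is absent from the training studies when `study_C` is held out, receives a split vote and is rejected rather than forced into a known class, which is the behavior the abstract describes for unseen cell types. With real data, a library classifier that exposes class probabilities would replace the hand-written vote share.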

CONCLUSIONS

Despite limitations including the number of biopsy samples, our findings demonstrate that machine learning algorithms can accurately annotate a wide range of adult kidney cell types in scRNA-seq/snRNA-seq data. This approach has the potential to standardize cell type annotation and facilitate further research on cellular mechanisms underlying kidney disease.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fdd/11471582/eb66a9cdc187/gr1.jpg
