Detecting retinal neural and stromal cell classes and ganglion cell subtypes based on transcriptome data with deep transfer learning.

Affiliations

Department of Ophthalmology, University of Tennessee Health Science Center, Memphis, TN, USA.

University of Tehran, Tehran, Iran.

Publication information

Bioinformatics. 2022 Sep 15;38(18):4321-4329. doi: 10.1093/bioinformatics/btac514.

DOI:10.1093/bioinformatics/btac514

PMID:35876552

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9991888/

Abstract

MOTIVATION

To develop and assess the accuracy of deep learning models that identify different retinal cell types, as well as different retinal ganglion cell (RGC) subtypes, based on patterns of single-cell RNA sequencing (scRNA-seq) in multiple datasets.

RESULTS

Deep domain adaptation models were developed and tested using three different datasets. The first dataset included 44 808 single retinal cells from mice (39 cell types) with 24 658 genes, the second dataset included 6225 single RGCs from mice (41 subtypes) with 13 616 genes, and the third dataset included 35 699 single RGCs from mice (45 subtypes) with 18 222 genes. We used four loss functions in the learning process to align the source and target distributions, reduce misclassification errors and maximize robustness. Models were evaluated based on classification accuracy and confusion matrices. The accuracy of the model for correctly classifying 39 different retinal cell types in the first dataset was ∼92%. Accuracy in the second and third datasets reached ∼97% and 97% in correctly classifying 40 and 45 different RGC subtypes, respectively. Across seven different batches in the first dataset, the accuracy of the lead model ranged from 74% to nearly 100%. The lead model provided high accuracy in identifying retinal cell types and RGC subtypes based on scRNA-seq data. Performance was also reasonable on data from different batches. The validated model could be readily applied to scRNA-seq data to identify different retinal cell types and subtypes.
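The evaluation described above (overall classification accuracy, a confusion matrix, and accuracy broken down by batch) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels and the batch identifiers are all hypothetical, and the example assumes class labels are encoded as integers from 0 to n_classes - 1.

```python
from collections import defaultdict


def confusion_matrix(y_true, y_pred, n_classes):
    """Count matrix: rows index the true class, columns the predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of cells whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def per_batch_accuracy(y_true, y_pred, batches):
    """Accuracy computed separately for each batch identifier."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p, b in zip(y_true, y_pred, batches):
        totals[b] += 1
        hits[b] += int(t == p)
    return {b: hits[b] / totals[b] for b in totals}


# Toy data (hypothetical, not taken from the paper's datasets):
# six cells, three classes, two batches.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
batches = ["b1", "b1", "b1", "b2", "b2", "b2"]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
acc = accuracy(y_true, y_pred)
batch_acc = per_batch_accuracy(y_true, y_pred, batches)

print(cm)
print(acc)
print(batch_acc)
```

Reporting accuracy per batch, as in the seven-batch result quoted above, shows whether a model that scores well overall still performs poorly on a particular batch.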

AVAILABILITY AND IMPLEMENTATION

The code and datasets are available at https://github.com/DM2LL/Detecting-Retinal-Cell-Classes-and-Ganglion-Cell-Subtypes. We have also added the class labels of all samples to the datasets.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
