Department of Ophthalmology, University of Tennessee Health Science Center, Memphis, TN, USA.
University of Tehran, Tehran, Iran.
Bioinformatics. 2022 Sep 15;38(18):4321-4329. doi: 10.1093/bioinformatics/btac514.
To develop and assess the accuracy of deep learning models that identify different retinal cell types, as well as different retinal ganglion cell (RGC) subtypes, based on patterns of single-cell RNA sequencing (scRNA-seq) in multiple datasets.
Deep domain adaptation models were developed and tested using three different datasets. The first dataset included 44 808 single retinal cells from mice (39 cell types) with 24 658 genes, the second dataset included 6225 single RGCs from mice (41 subtypes) with 13 616 genes and the third dataset included 35 699 single RGCs from mice (45 subtypes) with 18 222 genes. We used four loss functions in the learning process to align the source and target distributions, reduce misclassification errors and maximize robustness. Models were evaluated based on classification accuracy and confusion matrices. The accuracy of the model in classifying the 39 retinal cell types in the first dataset was ∼92%. Accuracy in the second and third datasets reached ∼97% and 97% in classifying 40 and 45 RGC subtypes, respectively. Across seven different batches in the first dataset, the accuracy of the lead model ranged from 74% to nearly 100%. The lead model provided high accuracy in identifying retinal cell types and RGC subtypes based on scRNA-seq data. Performance remained reasonable across data from different batches. The validated model could be readily applied to scRNA-seq data to identify different retinal cell types and subtypes.
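The evaluation described above (overall accuracy, a confusion matrix and accuracy per batch) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the integer label encoding and the toy arrays are assumptions made only for illustration, and the real pipeline in the linked repository may differ.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of cells whose predicted label matches the true label."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes (integer labels assumed)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when an index pair repeats.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def per_batch_accuracy(y_true, y_pred, batch_ids):
    """Accuracy computed separately for each batch identifier."""
    y_true, y_pred, batch_ids = map(np.asarray, (y_true, y_pred, batch_ids))
    return {
        b.item(): float(np.mean(y_true[batch_ids == b] == y_pred[batch_ids == b]))
        for b in np.unique(batch_ids)
    }


# Toy data (hypothetical): six cells, three classes, two batches.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
batches = [0, 0, 0, 1, 1, 1]

overall = accuracy(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred, n_classes=3)
by_batch = per_batch_accuracy(y_true, y_pred, batches)
```

The diagonal of the confusion matrix counts correctly classified cells, so its trace divided by the total number of cells equals the overall accuracy. Reporting accuracy per batch, as the abstract does for the seven batches of the first dataset, shows whether performance holds up under batch effects.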
The code and datasets are available at https://github.com/DM2LL/Detecting-Retinal-Cell-Classes-and-Ganglion-Cell-Subtypes. The class labels of all samples have also been added to the datasets.
Supplementary data are available at Bioinformatics online.