pscAdapt: Pre-Trained Domain Adaptation Network Based on Structural Similarity for Cell Type Annotation in Single Cell RNA-seq Data.

Author information

Zhao Yan, Shang Junliang, Qin Baojuan, Zhang Limin, He Xin, Ge Daohui, Ren Qianqian, Liu Jin-Xing

Publication information

IEEE J Biomed Health Inform. 2025 Jan;29(1):724-732. doi: 10.1109/JBHI.2024.3468310. Epub 2025 Jan 7.

Abstract

Cell type annotation is the process of categorizing and labeling cells to identify their specific cell types, and it is crucial for understanding cell functions and biological processes. Although many methods have been developed for automated cell type annotation, their performance is often compromised by batch effects, which arise from variations in data distribution across platforms and species. To address batch effects, this study proposes pscAdapt, a pre-trained domain adaptation model based on structural similarity, for cell type annotation. Specifically, a pre-training strategy is employed to initialize the model parameters so that the model learns the data distribution of the source domain. This strategy is combined with an adversarial learning strategy to train the domain adaptation network, achieving domain-level alignment and reducing domain discrepancy. Furthermore, to better distinguish different cell types, a structural similarity loss is designed that shortens the distances between cells of the same type and increases the distances between cells of different types in feature space, thus achieving cell-level alignment and enhancing the discriminability of cell types. Comprehensive experiments on simulated, cross-platform, and cross-species datasets validate the effectiveness of pscAdapt, and the results demonstrate that it outperforms several popular cell type annotation methods.
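
The abstract does not give the formula for the structural similarity loss, so the following is only a minimal sketch of the general idea it describes: pull cells of the same type together and push cells of different types apart in feature space. It uses a generic pairwise contrastive loss as a stand-in, not pscAdapt's actual loss. The function names, the `margin` parameter, and the toy embeddings are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_contrastive_loss(features, labels, margin=1.0):
    """Mean contrastive loss over all unordered pairs of cells.

    Hypothetical stand-in for a structure-aware loss (NOT the pscAdapt formula):
      - same-type pairs are penalized by their squared distance;
      - different-type pairs are penalized only when closer than `margin`.
    """
    total, pairs = 0.0, 0
    n = len(features)
    for i in range(n):
        for j in range(i + 1, n):
            d = euclidean(features[i], features[j])
            if labels[i] == labels[j]:
                total += d ** 2                      # pull same type together
            else:
                total += max(0.0, margin - d) ** 2   # push different types apart
            pairs += 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    # Toy 2-D embeddings for four cells (made-up values).
    embeddings = [[0.0, 0.0], [0.0, 0.0], [5.0, 5.0], [5.0, 5.0]]
    well_separated = ["T cell", "T cell", "B cell", "B cell"]
    mixed_up = ["T cell", "B cell", "T cell", "B cell"]
    print(pairwise_contrastive_loss(embeddings, well_separated))
    print(pairwise_contrastive_loss(embeddings, mixed_up))
```

In this toy setup the loss is lower when the labels agree with the cluster structure of the embeddings than when they cut across it, which is the behaviour a cell-level alignment term is meant to reward.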

