Chen Yuanyuan, Fan Xiaodan, Shi Chaowen, Shi Zhiyan, Wang Chaojie
School of Mathematical Science, Jiangsu University, Zhenjiang, 212013, Jiangsu, China.
Department of Statistics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong, SAR, China.
NPJ Syst Biol Appl. 2025 Jan 2;11(1):1. doi: 10.1038/s41540-024-00484-9.
CITE-seq provides a powerful method for simultaneously measuring RNA and protein expression at the single-cell level. The integrated analysis of RNA and protein expression in identical cells is crucial for revealing cellular heterogeneity. However, the high experimental costs associated with CITE-seq limit its widespread application. In this paper, we propose scTEL, a deep learning framework based on Transformer encoder layers, to establish a mapping from sequenced RNA expression to unobserved protein expression in the same cells. This computation-based approach significantly reduces the experimental costs of protein expression sequencing: protein expression can now be predicted from single-cell RNA sequencing (scRNA-seq) data, which is well established and available at a lower cost. Moreover, our scTEL model offers a unified framework for integrating multiple CITE-seq datasets, addressing the challenge posed by the partial overlap of protein panels across different datasets. Empirical validation on public CITE-seq datasets demonstrates that scTEL significantly outperforms existing methods.
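The partial-overlap problem mentioned above can be illustrated with a small sketch. It is not taken from the paper and may differ from how scTEL handles the issue. It only shows one common way to train on datasets whose protein panels differ: predict over the union of all panels, and compute the loss only on the proteins actually measured for a given cell. The function names (`union_panel`, `align_to_union`, `masked_mse`), the panels, and all numeric values are hypothetical.

```python
from typing import Dict, List, Tuple


def union_panel(panels: List[List[str]]) -> List[str]:
    """Return the ordered union of protein names across datasets."""
    seen: Dict[str, None] = {}
    for panel in panels:
        for name in panel:
            seen.setdefault(name, None)
    return list(seen)


def align_to_union(values: Dict[str, float],
                   union: List[str]) -> Tuple[List[float], List[int]]:
    """Place one cell's measured proteins into union order.

    The mask is 1 where the protein was measured and 0 where it is missing.
    Missing entries get a placeholder 0.0 that the loss below ignores.
    """
    target = [values.get(name, 0.0) for name in union]
    mask = [1 if name in values else 0 for name in union]
    return target, mask


def masked_mse(pred: List[float], target: List[float],
               mask: List[int]) -> float:
    """Mean squared error over the measured proteins only."""
    n_measured = sum(mask)
    if n_measured == 0:
        return 0.0
    squared = sum(m * (p - t) ** 2 for p, t, m in zip(pred, target, mask))
    return squared / n_measured


# Two hypothetical datasets with partially overlapping protein panels.
panel_a = ["CD3", "CD4", "CD8"]
panel_b = ["CD4", "CD19"]
union = union_panel([panel_a, panel_b])

# One cell from the second dataset: only CD4 and CD19 were measured.
cell_from_b = {"CD4": 1.0, "CD19": 2.0}
target, mask = align_to_union(cell_from_b, union)

# Stand-in for a model's prediction over the full union panel.
pred = [0.5, 1.5, 0.2, 2.0]
loss = masked_mse(pred, target, mask)
print(union, mask, loss)
```

With a loss of this form, cells from every dataset can share one output layer over the union panel, and proteins absent from a given dataset contribute nothing to that cell's error.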