Jin Nana, Nan Chuanchuan, Li Wanyang, Lin Peijing, Xin Yu, Wang Jun, Chen Yuelong, Wang Yuanhao, Yu Kaijiang, Wang Changsong, Chen Chunbo, Geng Qingshan, Cheng Lixin
Guangdong Provincial Clinical Research Center for Geriatrics; Shenzhen Clinical Research Center for Geriatrics, Shenzhen People's Hospital, the Second Clinical Medical College of Jinan University, the First Affiliated Hospital of Southern University of Science and Technology, 1017 Dongmen Rd N, Luohu District, Shenzhen 518020, China.
Post-doctoral Scientific Research Station of Basic Medicine, Jinan University, 601 Huangpu Blvd W, Tianhe District, Guangzhou 510632, China.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae661.
Sepsis is a dangerous bodily response triggered by infection. Transcriptional expression patterns of the host response aid in sepsis diagnosis, but models built on them generalize poorly. To facilitate sepsis diagnosis, we present scPAGE2, an updated version of single-cell Pair-wise Analysis of Gene Expression (scPAGE) that uses transfer learning to fuse single-cell and bulk transcriptome data. Compared with scPAGE, scPAGE2 uses improved differentially expressed gene pairs (DEPs) to pretrain a model on single-cell transcriptome data and then retrains it on bulk transcriptome data to construct a sepsis diagnostic model, effectively transferring cell-level information from the single-cell to the bulk transcriptome. Seven datasets spanning three transcriptome platforms and fluorescence-activated cell sorting (FACS) were used for performance validation. The resulting model comprised four DEPs and performed robustly across next-generation sequencing and microarray platforms, surpassing state-of-the-art models with an average AUROC of 0.947 and an average AUPRC of 0.987. Analysis of scRNA-seq data revealed a higher proportion of cells with JAM3-PIK3AP1 expression in sepsis monocytes and decreased ARG1-CCR7 in B and T cells. Elevated IRF6-HP in sepsis monocytes was confirmed by both scRNA-seq and an independent cohort using FACS. Both the superior performance of the model and the in vitro validation of IRF6-HP in monocytes indicate that scPAGE2 is effective and robust for constructing a sepsis diagnostic model. We additionally applied scPAGE2 to acute myeloid leukemia and demonstrated its superior classification performance. Overall, we provide a strategy for improving the generalizability of classification models that can be adapted to a broad range of clinical prediction scenarios.
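The abstract describes a classifier built on gene pairs and evaluated by AUROC. As a rough illustration only, the sketch below shows the general idea of pairwise within-sample comparison: each pair contributes a binary feature recording whether the first gene is expressed above the second in the same sample. This is not the scPAGE2 implementation. It omits DEP selection, single-cell pretraining, and bulk retraining entirely. The gene names (G1 to G4), the expression values, the unweighted voting score, and all function names are assumptions made for the example, not taken from the study.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical gene pairs with placeholder names (not the study's DEPs).
PAIRS: List[Tuple[str, str]] = [("G1", "G2"), ("G3", "G4")]


def pair_features(sample: Dict[str, float],
                  pairs: Sequence[Tuple[str, str]] = PAIRS) -> List[int]:
    """Return 1 for each pair whose first gene exceeds the second in this sample, else 0."""
    return [1 if sample[a] > sample[b] else 0 for a, b in pairs]


def pair_score(sample: Dict[str, float],
               pairs: Sequence[Tuple[str, str]] = PAIRS) -> float:
    """Fraction of pairs with first > second (an assumed, unweighted voting rule)."""
    feats = pair_features(sample, pairs)
    return sum(feats) / len(feats)


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy expression values, invented for illustration only.
samples = [
    {"G1": 5.0, "G2": 2.0, "G3": 7.0, "G4": 1.0},  # labelled case
    {"G1": 4.0, "G2": 3.0, "G3": 2.0, "G4": 6.0},  # labelled case
    {"G1": 1.0, "G2": 3.0, "G3": 2.0, "G4": 2.5},  # labelled control
    {"G1": 2.0, "G2": 1.0, "G3": 1.0, "G4": 4.0},  # labelled control
]
labels = [1, 1, 0, 0]

scores = [pair_score(s) for s in samples]
toy_auroc = auroc(scores, labels)

# Rescaling every gene in a sample by the same positive factor leaves
# the within-sample ordering, and hence the pair features, unchanged.
rescaled = {gene: 10.0 * value for gene, value in samples[0].items()}
same_after_rescaling = pair_features(rescaled) == pair_features(samples[0])
```

Because each feature depends only on the ordering of two genes within one sample, it is unchanged by any per-sample monotone rescaling of expression values. That property is one plausible reason pairwise features might carry across sequencing and microarray platforms, though the abstract itself does not state the mechanism.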