scGAL: unmask tumor clonal substructure by jointly analyzing independent single-cell copy number and scRNA-seq data.

Authors

Li Ruixiang, Shi Fangyuan, Song Lijuan, Yu Zhenhua

Affiliations

School of Information Engineering, Ningxia University, Yinchuan, 750021, China.

Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan, 750021, China.

Publication information

BMC Genomics. 2024 Apr 22;25(1):393. doi: 10.1186/s12864-024-10319-w.

DOI:10.1186/s12864-024-10319-w

PMID:38649804

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11034052/

Abstract

BACKGROUND

Accurately deciphering clonal copy number substructure can provide insights into the evolutionary mechanisms of cancer, and clustering single-cell copy number profiles has become an effective means of unmasking intra-tumor heterogeneity (ITH). However, copy numbers inferred from single-cell DNA sequencing (scDNA-seq) data are error-prone because of technical confounding factors such as amplification bias and allelic dropout, which makes it difficult to identify ITH precisely.

RESULTS

We introduce a hybrid model called scGAL to infer clonal copy number substructure. It combines an autoencoder with a generative adversarial network to jointly analyze independent single-cell copy number profiles and gene expression data from the same cell line. Under an adversarial learning framework, scGAL exploits complementary information from gene expression data to mitigate the effects of noise in copy number data, and learns latent representations of scDNA-seq cells for accurate inference of ITH. Evaluation results on three real cancer datasets suggest that scGAL accurately infers clonal architecture and outperforms other similar methods. In addition, assessment of scGAL on various simulated datasets demonstrates its robustness to changes in data size and distribution. scGAL is available at https://github.com/zhyu-lab/scgal.
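To make the autoencoder-plus-GAN idea concrete, the following is a minimal, standard-library-only sketch of how the loss terms of such a hybrid model could be computed on unpaired toy data. It is not the scGAL implementation: the single-layer networks, the wiring (a generator mapping the latent code to an expression-like profile that a discriminator compares with real scRNA-seq cells), the function name `scgal_style_losses`, and all dimensions are assumptions made for illustration, and no training step is shown.

```python
import math
import random

def linear(x, W, b):
    """Affine map W @ x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases for one linear layer."""
    W = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def scgal_style_losses(cn_batch, expr_batch, params):
    """Return reconstruction, discriminator and generator losses plus latents.

    Hypothetical wiring (an assumption, not taken from the paper):
    encoder -> latent -> decoder reconstructs the copy number profile,
    latent -> generator -> expression-like profile judged by the discriminator.
    """
    enc, dec, gen, disc = params
    eps = 1e-12  # guards against log(0)
    rec = d_loss = g_loss = 0.0
    latents = []
    for cn in cn_batch:
        z = [math.tanh(v) for v in linear(cn, *enc)]  # latent code of one scDNA-seq cell
        latents.append(z)
        rec += mse(linear(z, *dec), cn)               # autoencoder reconstruction term
        fake = linear(z, *gen)                        # expression-like profile from the latent code
        p_fake = sigmoid(linear(fake, *disc)[0])
        d_loss += -math.log(1.0 - p_fake + eps)       # discriminator: generated profile should score low
        g_loss += -math.log(p_fake + eps)             # encoder/generator: try to fool the discriminator
    for expr in expr_batch:
        p_real = sigmoid(linear(expr, *disc)[0])
        d_loss += -math.log(p_real + eps)             # discriminator: real scRNA-seq cell should score high
    n = len(cn_batch)
    return rec / n, d_loss / (n + len(expr_batch)), g_loss / n, latents

# Toy, unpaired data: 4 scDNA-seq cells and 5 scRNA-seq cells (all sizes are arbitrary).
rng = random.Random(0)
n_bins, n_genes, n_latent = 8, 6, 3
params = (
    init_layer(n_latent, n_bins, rng),   # encoder
    init_layer(n_bins, n_latent, rng),   # decoder
    init_layer(n_genes, n_latent, rng),  # generator
    init_layer(1, n_genes, rng),         # discriminator
)
cn_batch = [[float(rng.choice([1, 2, 3])) for _ in range(n_bins)] for _ in range(4)]
expr_batch = [[rng.random() for _ in range(n_genes)] for _ in range(5)]

rec, d_loss, g_loss, latents = scgal_style_losses(cn_batch, expr_batch, params)
print(f"reconstruction={rec:.4f} discriminator={d_loss:.4f} generator={g_loss:.4f}")
```

Note that the two batches have different numbers of cells: because the adversarial term only compares distributions, no cell-to-cell pairing between the modalities is needed, which matches the "independent" data setting described above. In a full pipeline the networks would be trained by alternating discriminator and encoder/generator updates, and the resulting `latents` would then be clustered to assign scDNA-seq cells to clones.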

CONCLUSIONS

Joint analysis of independent single-cell copy number and gene expression data from the same cell line can effectively exploit complementary information from each omics data type, and thus gives a more refined picture of clonal copy number substructure.
