Tumor Copy Number Deconvolution Integrating Bulk and Single-Cell Sequencing Data.

Affiliations

Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania.

Department of Mathematics, Rose-Hulman Institute of Technology, Terre Haute, Indiana.

Publication information

J Comput Biol. 2020 Apr;27(4):565-598. doi: 10.1089/cmb.2019.0302. Epub 2020 Mar 16.

DOI:10.1089/cmb.2019.0302

PMID:32181683

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7185355/

Abstract

Characterizing intratumor heterogeneity (ITH) is crucial to understanding cancer development, but it is hampered by limits of available data sources. Bulk DNA sequencing is the most common technology to assess ITH, but it involves the analysis of a mixture of many genetically distinct cells in each sample, which must then be computationally deconvolved. Single-cell sequencing is a promising alternative, but its limitations (for example, high noise, difficulty scaling to large populations, technical artifacts, and large data sets) have so far made it impractical for studying cohorts of sufficient size to identify statistically robust features of tumor evolution. We have developed strategies for deconvolution and tumor phylogenetics combining limited amounts of bulk and single-cell data to gain some advantages of single-cell resolution with much lower cost, with specific focus on deconvolving genomic copy number data. We developed a mixed membership model for clonal deconvolution via non-negative matrix factorization balancing deconvolution quality with similarity to single-cell samples via an associated efficient coordinate descent algorithm. We then improve on that algorithm by integrating deconvolution with clonal phylogeny inference, using a mixed integer linear programming model to incorporate a minimum evolution phylogenetic tree cost in the problem objective. We demonstrate the effectiveness of these methods on semisimulated data of known ground truth, showing improved deconvolution accuracy relative to bulk data alone.
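The factorization idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a squared-error objective, 0.5·||B − CF||² + 0.5·α·||C − S||², where B is a bulk copy-number matrix (regions × samples), C holds clone copy-number profiles, F holds mixture fractions whose columns lie on the probability simplex, and S is a set of single-cell-derived reference profiles. The names `deconvolve`, `objective`, `project_simplex`, and the weight `alpha` are hypothetical, copy numbers are relaxed to non-negative reals rather than integers, and alternating projected-gradient steps stand in for the paper's coordinate descent. The phylogenetic (mixed integer linear programming) extension is not covered.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def objective(B, C, F, S, alpha):
    """Assumed objective: reconstruction error plus distance to single-cell profiles."""
    fit = 0.5 * np.linalg.norm(B - C @ F) ** 2
    prior = 0.5 * alpha * np.linalg.norm(C - S) ** 2
    return fit + prior

def deconvolve(B, S, alpha=1.0, n_iter=200, seed=0):
    """Alternate projected-gradient updates of F (fractions) and C (clone profiles).

    B: (regions x samples) bulk copy numbers.
    S: (regions x clones) single-cell reference profiles (hypothetical input).
    """
    rng = np.random.default_rng(seed)
    k, n = S.shape[1], B.shape[1]
    C = S.astype(float).copy()          # start clone profiles at the references
    F = rng.random((k, n))
    F /= F.sum(axis=0)                  # each sample's fractions sum to 1
    for _ in range(n_iter):
        # F step: gradient step with step size 1/L, then project columns onto the simplex.
        L_F = np.linalg.norm(C.T @ C, 2) + 1e-12
        F = F - (C.T @ (C @ F - B)) / L_F
        F = np.apply_along_axis(project_simplex, 0, F)
        # C step: gradient step on fit + similarity penalty, then clip to non-negative.
        L_C = np.linalg.norm(F @ F.T, 2) + alpha
        grad_C = (C @ F - B) @ F.T + alpha * (C - S)
        C = np.maximum(C - grad_C / L_C, 0.0)
    return C, F
```

Larger values of `alpha` keep the inferred clone profiles closer to the single-cell references, while smaller values favor fitting the bulk mixtures; that trade-off is the balance the abstract describes, here in simplified form.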
