BnpC: Bayesian non-parametric clustering of single-cell mutation profiles.

Affiliations

Department of Biosystems Science and Engineering, ETH Zürich, Basel 4058, Switzerland.

SIB, Swiss Institute of Bioinformatics, Basel 4058, Switzerland.

Publication information

Bioinformatics. 2020 Dec 8;36(19):4854-4859. doi: 10.1093/bioinformatics/btaa599.

DOI:10.1093/bioinformatics/btaa599

PMID:32592465

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7750970/

Abstract

MOTIVATION

The high resolution of single-cell DNA sequencing (scDNA-seq) offers great potential to resolve intratumor heterogeneity (ITH) by distinguishing clonal populations based on their mutation profiles. However, the increasing size of scDNA-seq datasets and technical limitations, such as high error rates and a large proportion of missing values, complicate this task and limit the applicability of existing methods.

RESULTS

Here, we introduce BnpC, a novel non-parametric method to cluster individual cells into clones and infer their genotypes based on their noisy mutation profiles. We benchmarked our method comprehensively against state-of-the-art methods on simulated data using various data sizes, and applied it to three cancer scDNA-seq datasets. On simulated data, BnpC compared favorably against current methods in terms of accuracy, runtime and scalability. Its inferred genotypes were the most accurate, especially on highly heterogeneous data, and it was the only method able to run and produce results on datasets with 5000 cells. On tumor scDNA-seq data, BnpC was able to identify clonal populations missed by the original cluster analysis but supported by supplementary experimental data. With ever-growing scDNA-seq datasets, scalable and accurate methods such as BnpC will become increasingly relevant, not only to resolve ITH but also as a preprocessing step to reduce data size.
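To make the idea of clustering noisy, incomplete mutation profiles more concrete, the following is a minimal illustrative sketch and not BnpC's implementation. It assumes a simple per-site error model with hypothetical false-positive and false-negative rates (`fp_rate`, `fn_rate`), treats missing values as uninformative, and assigns each cell to the most likely of a fixed, already known set of clone genotypes. BnpC itself is non-parametric and infers the number of clones and their genotypes from the data; all function names and parameter values here are assumptions chosen for illustration.

```python
import math


def cell_log_likelihood(profile, genotype, fp_rate, fn_rate):
    """Log-likelihood of one cell's observed profile given a clone genotype.

    profile:  list of 0, 1, or None (None marks a missing value).
    genotype: list of 0/1 mutation states assumed for the clone.
    fp_rate, fn_rate: hypothetical false-positive / false-negative rates.
    """
    total = 0.0
    for observed, true_state in zip(profile, genotype):
        if observed is None:
            continue  # a missing entry contributes no evidence
        if true_state == 1:
            p = 1.0 - fn_rate if observed == 1 else fn_rate
        else:
            p = fp_rate if observed == 1 else 1.0 - fp_rate
        total += math.log(p)
    return total


def assign_to_clone(profile, genotypes, fp_rate=0.01, fn_rate=0.2):
    """Return the index of the clone genotype that best explains the profile."""
    scores = [
        cell_log_likelihood(profile, g, fp_rate, fn_rate) for g in genotypes
    ]
    return max(range(len(genotypes)), key=lambda k: scores[k])


# Two assumed clone genotypes over four mutation sites.
genotypes = [[1, 1, 0, 0], [0, 0, 1, 1]]

# Three noisy cells; None marks a site with no usable data.
cells = [
    [1, None, 0, 0],
    [0, 0, None, 1],
    [1, 0, 0, None],  # second site looks like a false negative for clone 0
]

labels = [assign_to_clone(cell, genotypes) for cell in cells]
print(labels)
```

In this toy setting the third cell is still placed with the first clone even though one of its sites disagrees, because a false negative is far more probable under the assumed rates than a false positive. A non-parametric method additionally has to decide how many clones exist, which this sketch deliberately leaves out.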

AVAILABILITY AND IMPLEMENTATION

BnpC is freely available under MIT license at https://github.com/cbg-ethz/BnpC.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
