Improved allele-specific single-cell copy number estimation in low-coverage DNA-sequencing.

Author information

School of Computing, University of Connecticut, Storrs, CT 06082, United States.

Institute for Systems Genomics, University of Connecticut, Storrs, CT 06082, United States.

Publication information

Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae506.

Abstract

MOTIVATION

Advances in whole-genome single-cell DNA sequencing (scDNA-seq) have led to the development of numerous methods for detecting copy number aberrations (CNAs), a key driver of genetic heterogeneity in cancer. While most of these methods are limited to the inference of total copy number, some recent approaches infer allele-specific CNAs using innovative techniques for estimating allele frequencies in low-coverage scDNA-seq data. However, these existing allele-specific methods are limited in their segmentation strategies, and segmentation is a crucial step in the CNA detection pipeline.

RESULTS

We present SEACON (Single-cell Estimation of Allele-specific COpy Numbers), an allele-specific copy number profiler for scDNA-seq data. SEACON uses a Gaussian Mixture Model to identify latent copy number states and breakpoints between contiguous segments across cells, filters the segments for high-quality breakpoints using an ensemble technique, and adopts several strategies for tolerating noisy read-depth and allele frequency measurements. Using a wide array of both real and simulated datasets, we show that SEACON derives accurate copy numbers and surpasses existing approaches under numerous experimental conditions, and identify its strengths and weaknesses.
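As a rough illustration of the idea of using a Gaussian mixture to find latent copy number states and breakpoints, the sketch below fits a one-dimensional mixture to simulated read-depth ratios for a single cell and reports the bins where the assigned state changes. It is not SEACON's implementation: the function names (`fit_gmm_1d`, `assign_states`, `find_breakpoints`), the single-feature single-cell setup, the starting means and variance, and the simulated data are all assumptions made for this example, and the ensemble filtering and allele-frequency handling described above are not modeled.

```python
import math

def fit_gmm_1d(values, init_means, n_iter=50, min_var=1e-4):
    """Fit a 1-D Gaussian mixture by EM, starting from the given means."""
    k = len(init_means)
    means = list(init_means)
    variances = [0.05] * k   # broad starting variance (an assumption)
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(min_var, var)
    return weights, means, variances

def assign_states(values, weights, means, variances):
    """Label each bin with its most likely mixture component (log scale)."""
    labels = []
    for x in values:
        scores = [math.log(w) - 0.5 * math.log(v) - (x - m) ** 2 / (2 * v)
                  for w, m, v in zip(weights, means, variances)]
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels

def find_breakpoints(labels):
    """Return bin indices at which the assigned state differs from the previous bin."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]

# Simulated read-depth ratios for one cell: three segments of 20 bins each,
# with a small fixed noise pattern so the example is deterministic.
noise = [0.04, -0.03, 0.01, -0.05, 0.02]
levels = [1.0] * 20 + [1.5] * 20 + [0.5] * 20
ratios = [level + noise[i % len(noise)] for i, level in enumerate(levels)]

weights, means, variances = fit_gmm_1d(ratios, init_means=[0.5, 1.0, 1.5])
labels = assign_states(ratios, weights, means, variances)
bps = find_breakpoints(labels)
print("component means:", [round(m, 3) for m in means])
print("breakpoints at bins:", bps)
```

In this toy setting each mixture component stands in for one latent copy number state, and a breakpoint is simply a change of state between adjacent bins. A real profiler would need to combine evidence across many cells and across both read depth and allele frequency, which is where the filtering and noise-tolerance strategies mentioned in the abstract come in.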

AVAILABILITY AND IMPLEMENTATION

SEACON is implemented in Python and is freely available as open source at https://github.com/NabaviLab/SEACON and https://doi.org/10.5281/zenodo.12727008.
