Improved allele-specific single-cell copy number estimation in low-coverage DNA-sequencing.

Affiliations

School of Computing, University of Connecticut, Storrs, CT 06082, United States.

Institute for Systems Genomics, University of Connecticut, Storrs, CT 06082, United States.

Publication information

Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae506.

Abstract

MOTIVATION

Advances in whole-genome single-cell DNA sequencing (scDNA-seq) have led to the development of numerous methods for detecting copy number aberrations (CNAs), a key driver of genetic heterogeneity in cancer. While most of these methods are limited to inferring total copy number, some recent approaches infer allele-specific CNAs using innovative techniques for estimating allele frequencies in low-coverage scDNA-seq data. However, these existing allele-specific methods are limited in their segmentation strategies, and segmentation is a crucial step in the CNA detection pipeline.

RESULTS

We present SEACON (Single-cell Estimation of Allele-specific COpy Numbers), an allele-specific copy number profiler for scDNA-seq data. SEACON uses a Gaussian Mixture Model to identify latent copy number states and breakpoints between contiguous segments across cells, filters the segments for high-quality breakpoints using an ensemble technique, and adopts several strategies for tolerating noisy read-depth and allele frequency measurements. Using a wide array of both real and simulated datasets, we show that SEACON derives accurate copy numbers and surpasses existing approaches under numerous experimental conditions, and identify its strengths and weaknesses.
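To make the idea of identifying latent copy number states and breakpoints with a Gaussian mixture concrete, the following is a minimal sketch and not SEACON's actual implementation. It assumes a single cell, one feature per genomic bin (a read-depth ratio), and two candidate states; the function names, the toy values, and the starting means are all hypothetical. SEACON itself works across cells and also uses allele frequencies, which this toy example omits.

```python
import math

def fit_gmm_1d(values, init_means, n_iter=50):
    """Fit a 1-D Gaussian mixture with EM, starting from the given means."""
    k = len(init_means)
    means = list(init_means)
    variances = [0.05] * k          # assumed starting variance
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var, 1e-4)  # floor to avoid a collapsing component
    return weights, means, variances

def assign_states(values, weights, means, variances):
    """Label each bin with its most probable mixture component."""
    states = []
    for x in values:
        scores = [
            math.log(w) - 0.5 * math.log(v) - (x - m) ** 2 / (2 * v)
            for w, m, v in zip(weights, means, variances)
        ]
        states.append(max(range(len(scores)), key=scores.__getitem__))
    return states

def find_breakpoints(states):
    """Return bin indices where the state differs from the previous bin."""
    return [i for i in range(1, len(states)) if states[i] != states[i - 1]]

# Toy read-depth ratios: 10 neutral bins, 10 gained bins, 10 neutral bins.
noise = [0.03, -0.02, 0.01, -0.04, 0.02, 0.00, -0.01, 0.04, -0.03, 0.01]
rdr = [1.0 + e for e in noise] + [1.5 + e for e in noise] + [1.0 + e for e in noise]

weights, means, variances = fit_gmm_1d(rdr, init_means=[1.0, 1.5])
states = assign_states(rdr, weights, means, variances)
cuts = find_breakpoints(states)
print("component means:", [round(m, 3) for m in means])
print("breakpoints at bins:", cuts)
```

On this toy input the two components settle near read-depth ratios of 1.0 and 1.5, and the state changes fall at bin indices 10 and 20. Real low-coverage data is much noisier, which is why the abstract describes filtering candidate breakpoints with an ensemble technique instead of accepting every state change.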

AVAILABILITY AND IMPLEMENTATION

SEACON is implemented in Python and is freely available as open source at https://github.com/NabaviLab/SEACON and https://doi.org/10.5281/zenodo.12727008.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d172/11346770/58f5cea74574/btae506f1.jpg
