Accucopy: accurate and fast inference of allele-specific copy number alterations from low-coverage low-purity tumor sequencing data.

Affiliations

Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China.

University of Chinese Academy of Sciences, Beijing, 100049, China.

Publication information

BMC Bioinformatics. 2021 Jan 15;22(1):23. doi: 10.1186/s12859-020-03924-5.

Abstract

BACKGROUND

Copy number alterations (CNAs), due to their large impact on the genome, have been an important contributing factor to oncogenesis and metastasis. Detecting genomic alterations from the shallow-sequencing data of a low-purity tumor sample remains a challenging task.

RESULTS

We introduce Accucopy, a method to infer total copy numbers (TCNs) and allele-specific copy numbers (ASCNs) from challenging low-purity and low-coverage tumor samples. Accucopy adopts many robust statistical techniques, such as kernel smoothing of coverage differentiation information, to discern signals from noise. It also combines ideas from time-series analysis and the signal-processing field to derive a range of estimates for the period in a histogram of coverage differentiation information. Statistical learning models such as the tiered Gaussian mixture model, the expectation-maximization algorithm, and sparse Bayesian learning were customized and built into the model. Accucopy is implemented in C++/Rust, packaged in a Docker image, and supports non-human samples; more information is available at http://www.yfish.org/software/.
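To make the idea of estimating a period from a smoothed histogram concrete, here is a minimal sketch in Python. It is not Accucopy's implementation: the function names, the synthetic data, the bin settings, and the use of a plain autocorrelation to pick the period are all assumptions made for illustration. It only shows how kernel smoothing followed by a time-series style analysis can recover the spacing between evenly spaced peaks in noisy coverage ratios.

```python
import math
import random


def simulate_ratios(period=0.2, seed=0):
    """Hypothetical data: coverage ratios clustered at evenly spaced levels."""
    rng = random.Random(seed)
    levels = [(1.0 - 2 * period, 300), (1.0 - period, 500), (1.0, 1000),
              (1.0 + period, 500), (1.0 + 2 * period, 300)]
    return [rng.gauss(center, 0.02) for center, n in levels for _ in range(n)]


def gaussian_smooth(counts, sigma_bins):
    """Smooth histogram counts with a truncated, normalized Gaussian kernel."""
    radius = int(3 * sigma_bins)
    kernel = [math.exp(-0.5 * (k / sigma_bins) ** 2)
              for k in range(-radius, radius + 1)]
    total = sum(kernel)
    kernel = [w / total for w in kernel]
    n = len(counts)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < n:
                acc += w * counts[j]
        smoothed.append(acc)
    return smoothed


def estimate_period(values, lo=0.0, hi=2.0, n_bins=200, sigma_bins=2.0):
    """Estimate peak spacing from the autocorrelation of a smoothed histogram."""
    width = (hi - lo) / n_bins
    counts = [0.0] * n_bins
    for v in values:
        if lo <= v < hi:
            counts[int((v - lo) / width)] += 1
    smooth = gaussian_smooth(counts, sigma_bins)
    mean = sum(smooth) / n_bins
    centered = [s - mean for s in smooth]
    max_lag = n_bins // 2
    acf = [sum(centered[i] * centered[i + lag] for i in range(n_bins - lag))
           for lag in range(max_lag + 1)]
    # Skip the central lobe around lag 0, then take the strongest remaining lag.
    lag = 1
    while lag < max_lag and acf[lag] > 0:
        lag += 1
    best = max(range(lag, max_lag + 1), key=lambda l: acf[l])
    return best * width


if __name__ == "__main__":
    ratios = simulate_ratios(period=0.2)
    print("estimated period:", estimate_period(ratios))
```

In this sketch the smoothing bandwidth and the bin width are fixed by hand; the abstract indicates that Accucopy derives a range of period estimates rather than a single value, which this example does not attempt to reproduce.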

CONCLUSIONS

We describe Accucopy, a method that can predict both TCNs and ASCNs from low-coverage low-purity tumor sequencing data. Through comparative analyses in both simulated and real-sequencing samples, we demonstrate that Accucopy is more accurate than Sclust, ABSOLUTE, and Sequenza.
