
Hierarchical discovery of large-scale and focal copy number alterations in low-coverage cancer genomes.

Affiliations

School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore, 639798, Singapore.

School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore.

Publication information

BMC Bioinformatics. 2020 Apr 16;21(1):147. doi: 10.1186/s12859-020-3480-3.

DOI:10.1186/s12859-020-3480-3
PMID:32299346
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7160937/
Abstract

BACKGROUND

Detection of DNA copy number alterations (CNAs) is critical to understanding genetic diversity, genome evolution and pathological conditions such as cancer. Cancer genomes are plagued with widespread multi-level structural aberrations of chromosomes, which make it challenging to discover CNAs of different length scales and of distinct biological origins and functions. Although several computational tools are available to identify CNAs using the read depth (RD) signal, they fail to distinguish between large-scale and focal alterations because they model the RD signal of cancer genomes inaccurately. Additionally, the RD signal is affected by overdispersion-driven biases at low coverage, which significantly inflate false detection of CNA regions.

RESULTS

We have developed the CNAtra framework to hierarchically discover and classify 'large-scale' and 'focal' copy number gain/loss from a single whole-genome sequencing (WGS) sample. CNAtra first utilizes a multimodal-based distribution to estimate the copy number (CN) reference from the complex RD profile of the cancer genome. We implemented a Savitzky-Golay smoothing filter and Modified Varri segmentation to capture the change points of the RD signal. We then developed a CN state-driven merging algorithm to identify the large segments with distinct copy numbers. Next, we identified focal alterations in each large segment using coverage-based thresholding to mitigate the adverse effects of signal variations. Using cancer cell lines and patient datasets, we confirmed CNAtra's ability to detect and distinguish segmental aneuploidies and focal alterations. To benchmark CNAtra against other single-sample detection tools, we used realistic simulated data in which CNAs were artificially introduced into the original cancer profiles. We found that CNAtra is superior in terms of precision, recall and F-measure. CNAtra shows the highest sensitivity, 93% and 97% for detecting large-scale and focal alterations, respectively. Visual inspection of CNAs revealed that CNAtra is the most robust detection tool for low-coverage cancer data.
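The first stages described above (smoothing the RD signal, assigning CN states relative to a CN reference, and merging bins by state) can be illustrated with a small Python sketch. This is not the CNAtra implementation, which is written in MATLAB and uses Modified Varri segmentation and a multimodal fit for the CN reference. Here the CN reference is simply passed in as a number, change points are implied by changes in the rounded CN state, and focal-alteration detection is left out. The function names `savgol5`, `cn_states` and `merge_by_state`, the 5-bin window, and the default ploidy of 2 are all assumptions made for illustration.

```python
from itertools import groupby

# Standard 5-point quadratic Savitzky-Golay smoothing coefficients
# (-3, 12, 17, 12, -3) / 35. The window size is an illustrative choice.
SG5 = (-3, 12, 17, 12, -3)
SG5_NORM = 35


def savgol5(signal):
    """Smooth a read-depth signal with a 5-point quadratic Savitzky-Golay filter.

    The two bins at each end are left unsmoothed to keep the sketch short.
    """
    out = list(signal)
    for i in range(2, len(signal) - 2):
        window = signal[i - 2:i + 3]
        out[i] = sum(c * x for c, x in zip(SG5, window)) / SG5_NORM
    return out


def cn_states(rd, cn_reference, ploidy=2):
    """Map each bin to an integer CN state.

    Assumes `cn_reference` is the read depth that corresponds to `ploidy`
    copies; in this sketch it is supplied by the caller rather than estimated.
    """
    depth_per_copy = cn_reference / ploidy
    return [round(x / depth_per_copy) for x in rd]


def merge_by_state(states):
    """Merge consecutive bins with the same CN state.

    Returns a list of (start, end_exclusive, state) segments.
    """
    segments = []
    pos = 0
    for state, run in groupby(states):
        length = len(list(run))
        segments.append((pos, pos + length, state))
        pos += length
    return segments


if __name__ == "__main__":
    # Hypothetical RD profile: six bins at 2 copies, then six bins at 3 copies.
    rd = [100] * 6 + [150] * 6
    smoothed = savgol5(rd)
    print(merge_by_state(cn_states(smoothed, cn_reference=100)))
```

In practice a library routine such as `scipy.signal.savgol_filter` would replace the hand-written filter, and real low-coverage data would need a minimum segment length or similar guard so that noise near state boundaries does not fragment the large segments.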

CONCLUSIONS

CNAtra is a single-sample CNA detection tool that provides an analytical and visualization framework for CNA profiling without relying on any reference control. It can detect chromosome-level segmental aneuploidies and high-confidence focal alterations, even from low-coverage data. CNAtra is open-source software implemented in MATLAB and is freely available at https://github.com/AISKhalil/CNAtra.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4920/7160937/4c1857b3325f/12859_2020_3480_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4920/7160937/65e8ee1a344e/12859_2020_3480_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4920/7160937/96548dd16667/12859_2020_3480_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4920/7160937/2f807394ae33/12859_2020_3480_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4920/7160937/1e63ed8672e6/12859_2020_3480_Fig5_HTML.jpg
