HCMMCNVs: hierarchical clustering mixture model of copy number variants detection using whole exome sequencing technology.

Author information

Division of Biostatistics, Ohio State University, Columbus, OH 43210, USA.

Whole-Genome Research Core Laboratory of Human Diseases, Chang Gung Memorial Hospital, Keelung 204, Taiwan.

Publication information

Bioinformatics. 2021 Sep 29;37(18):3026-3028. doi: 10.1093/bioinformatics/btab183.

DOI:10.1093/bioinformatics/btab183

PMID:33714997

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8479678/

Abstract

SUMMARY

In this article, we introduce a method that combines hierarchical clustering with a Gaussian mixture model, fitted by the expectation-maximization (EM) algorithm, to detect copy number variants (CNVs) from whole exome sequencing (WES) data. We also developed the R Shiny package 'HCMMCNVs' for processing user-provided BAM files, running the CNV detection algorithm and visualizing the results. By applying our approach to 325 cancer cell lines across 22 tumor types from the Cancer Cell Line Encyclopedia (CCLE), we show that our algorithm is competitive with existing methods and that using multiple cancer cell lines for CNV estimation is feasible. In addition, when applied to WES data from 120 oral squamous cell carcinoma (OSCC) samples, our algorithm, using the tumor samples only, shows more power to detect CNVs than methods that use both tumors and matched normal counterparts.
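To make the mixture-model idea concrete, the following is a minimal sketch of fitting a one-dimensional Gaussian mixture by EM and assigning each exon to a copy-number state. It is not the HCMMCNVs implementation and it omits the hierarchical clustering step entirely. The input (per-exon log2 coverage ratios), the three states (loss, neutral, gain), the starting means, the starting variance and all function names are assumptions chosen for illustration, and the data are simulated.

```python
import math
import random

# Assumed starting means in log2-ratio space: one-copy loss, neutral, one-copy gain.
INIT_MEANS = [-1.0, 0.0, math.log2(1.5)]


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior(x, weights, means, variances):
    """Posterior probability of each mixture component given one observation."""
    dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(dens)
    if total <= 0.0:  # numerical underflow: fall back to a uniform posterior
        return [1.0 / len(dens)] * len(dens)
    return [d / total for d in dens]


def fit_gmm_1d(values, init_means, init_var=0.02, n_iter=100, min_var=1e-4):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, variances)."""
    n, k = len(values), len(init_means)
    weights = [1.0 / k] * k
    means = list(init_means)
    variances = [init_var] * k  # illustrative starting variance, an assumption
    for _ in range(n_iter):
        # E-step: responsibilities under the current parameters.
        resp = [posterior(x, weights, means, variances) for x in values]
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:  # empty component: keep its previous parameters
                continue
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var, min_var)  # floor guards against collapse
    return weights, means, variances


def call_states(values, weights, means, variances):
    """Assign each value to its most probable component (0=loss, 1=neutral, 2=gain)."""
    states = []
    for x in values:
        post = posterior(x, weights, means, variances)
        states.append(post.index(max(post)))
    return states


def simulate_log_ratios(seed=1, noise_sd=0.1):
    """Simulated per-exon log2 ratios with known states (toy data, not real WES)."""
    rng = random.Random(seed)
    counts = [60, 300, 60]  # exons simulated as loss, neutral, gain
    values, truth = [], []
    for state, count in enumerate(counts):
        for _ in range(count):
            values.append(rng.gauss(INIT_MEANS[state], noise_sd))
            truth.append(state)
    return values, truth


if __name__ == "__main__":
    values, truth = simulate_log_ratios()
    weights, means, variances = fit_gmm_1d(values, INIT_MEANS)
    states = call_states(values, weights, means, variances)
    agreement = sum(s == t for s, t in zip(states, truth)) / len(truth)
    print("estimated means:", [round(m, 3) for m in means])
    print("agreement with simulated truth:", round(agreement, 3))
```

Starting the component means at the expected log2 ratios for one-copy loss, neutral and one-copy gain is a design choice of this sketch: it keeps the fitted components interpretable as copy-number states. Real exome coverage would also need normalization (for example for GC content and capture efficiency) before any such model is fitted.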

AVAILABILITY AND IMPLEMENTATION

The HCMMCNVs R Shiny software is freely available from the GitHub repository at https://github.com/lunching/HCMM_CNVs and from Zenodo at https://doi.org/10.5281/zenodo.4593371.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

FishingCNV: a graphical software package for detecting rare copy number variations in exome-sequencing data.

Bioinformatics. 2013 Jun 1;29(11):1461-2. doi: 10.1093/bioinformatics/btt151. Epub 2013 Mar 28.

CLAMMS: a scalable algorithm for calling common and rare copy number variants from exome sequencing data.

Bioinformatics. 2016 Jan 1;32(1):133-5. doi: 10.1093/bioinformatics/btv547. Epub 2015 Sep 17.

Bamgineer: Introduction of simulated allele-specific copy number variants into exome and targeted sequence data sets.

PLoS Comput Biol. 2018 Mar 28;14(3):e1006080. doi: 10.1371/journal.pcbi.1006080. eCollection 2018 Mar.

PatternCNV: a versatile tool for detecting copy number changes from exome sequencing data.

Bioinformatics. 2014 Sep 15;30(18):2678-80. doi: 10.1093/bioinformatics/btu363. Epub 2014 May 29.

Joint detection of germline and somatic copy number events in matched tumor-normal sample pairs.

Bioinformatics. 2019 Dec 1;35(23):4955-4961. doi: 10.1093/bioinformatics/btz429.

An evaluation of copy number variation detection tools for cancer using whole exome sequencing data.

BMC Bioinformatics. 2017 May 31;18(1):286. doi: 10.1186/s12859-017-1705-x.

WaveCNV: allele-specific copy number alterations in primary tumors and xenograft models from next-generation sequencing.

Bioinformatics. 2014 Mar 15;30(6):768-74. doi: 10.1093/bioinformatics/btt611. Epub 2013 Nov 4.

DEFOR: depth- and frequency-based somatic copy number alteration detector.

Bioinformatics. 2019 Oct 1;35(19):3824-3825. doi: 10.1093/bioinformatics/btz170.

Pre-capture multiplexing provides additional power to detect copy number variation in exome sequencing.

BMC Bioinformatics. 2021 Jul 20;22(1):374. doi: 10.1186/s12859-021-04246-w.
