Liu Ze-Kun, Shang Yu-Kui, Chen Zhi-Nan, Bian Huijie
Department of Cell Biology and National Translational Science Center for Molecular Medicine, Fourth Military Medical University, Xi'an, Shaanxi 710032, P.R. China.
Mol Med Rep. 2017 May;15(5):2489-2494. doi: 10.3892/mmr.2017.6336. Epub 2017 Mar 16.
Rapid advances in next-generation sequencing (NGS) technologies, coupled with a dramatic decrease in cost, have made NGS one of the leading approaches in cancer research. It is also increasingly used in clinical practice for cancer diagnosis and treatment. Somatic (cancer-only) single nucleotide variants and small insertions and deletions (indels) are the simplest classes of mutation; however, their identification in whole exome sequencing data is complicated by germline polymorphisms, tumor heterogeneity, and errors in sequencing and analysis. A growing number of software tools and methodological guidelines are being published for the analysis of sequencing data. Variants are usually identified with the algorithms of MuTect, VarScan or the Genome Analysis Toolkit. However, any one of these algorithms used alone yields incomplete genomic information. To address this issue, the present study developed a systematic pipeline, named the three-caller pipeline, that combines the three algorithms to analyze whole exome sequencing data of hepatocellular carcinoma (HCC). Applying the three-caller pipeline to HCC whole exome data improved the detection of true positive mutations, and a total of 75 tumor-specific somatic variants were identified. Functional enrichment analysis revealed mutations in genes involved in cell adhesion and regulation of Ras GTPase activity. This pipeline provides an effective approach for identifying variants from NGS data for subsequent functional analyses.
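The idea of combining calls from several variant callers can be illustrated with a minimal sketch. The abstract does not state how the three call sets are merged, so the rule below (keep a variant when at least `min_callers` callers report it) is an assumption, as are the function name `combine_calls`, the `(chrom, pos, ref, alt)` variant key, and the toy variants, which are invented for illustration and are not results from the study.

```python
from typing import Dict, Set, Tuple

# A variant is keyed by (chromosome, 1-based position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def combine_calls(
    calls_by_caller: Dict[str, Set[Variant]],
    min_callers: int = 2,
) -> Dict[Variant, Set[str]]:
    """Return variants reported by at least `min_callers` callers.

    The result maps each retained variant to the set of callers that reported it.
    min_callers=1 gives the union of all call sets; min_callers equal to the
    number of callers gives their intersection.
    """
    support: Dict[Variant, Set[str]] = {}
    for caller, variants in calls_by_caller.items():
        for variant in variants:
            support.setdefault(variant, set()).add(caller)
    return {v: callers for v, callers in support.items() if len(callers) >= min_callers}


# Hypothetical toy call sets (not data from the study).
calls = {
    "MuTect": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    "VarScan": {("chr1", 100, "A", "T"), ("chr3", 300, "C", "CA")},
    "GATK": {("chr1", 100, "A", "T"), ("chr3", 300, "C", "CA")},
}

union = combine_calls(calls, min_callers=1)
consensus = combine_calls(calls, min_callers=2)
unanimous = combine_calls(calls, min_callers=3)

for variant, callers in sorted(consensus.items()):
    print(variant, sorted(callers))
```

Lowering `min_callers` favors sensitivity (fewer missed variants), while raising it favors specificity (fewer false positives); which trade-off the published pipeline makes would need to be checked against the full paper.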