Institute of Chemical Biology and Fundamental Medicine SB RAS, Novosibirsk, 630090, Russia; Novosibirsk State University, Novosibirsk, 630090, Russia.
Institute of Chemical Biology and Fundamental Medicine SB RAS, Novosibirsk, 630090, Russia.
Comput Biol Chem. 2018 Dec;77:297-306. doi: 10.1016/j.compbiolchem.2018.10.012. Epub 2018 Oct 23.
Targeted next-generation sequencing (NGS) provides great new opportunities for molecular and medical genetics. However, to take advantage of these opportunities, we need reliable tools for extracting the necessary information from the huge amount of data that NGS generates. Here we present BRCA-analyzer, an automatic multithreaded workflow for processing NGS data of the BRCA1 and BRCA2 genes. By optimizing it on the sequencing data of 899 samples from 693 patients, we identified the most reliable tools and adjusted their parameters so that all pathogenic variants found were confirmed by Sanger sequencing. For 82 and 24 DNA samples from blood and formalin-fixed paraffin-embedded blocks, respectively, NGS libraries were prepared with the GeneRead BRCA panel v2 (Qiagen). The reads obtained were processed with BRCA-analyzer and with the Qiagen GeneRead Data analysis workflow. In total, 27 pathogenic variants were found and confirmed by Sanger sequencing, and BRCA-analyzer detected all of them. The Qiagen GeneRead Data analysis workflow discarded 5 true pathogenic variants because they were located in homopolymeric sequence stretches. For the other 793 samples, libraries were prepared by an in-house method, and the NGS data were analyzed with BRCA-analyzer and, for comparison, with Canary, another free automatic amplicon NGS workflow. Of 137 pathogenic variants in total, BRCA-analyzer found 135 and Canary found 123. The variants that BRCA-analyzer missed were lost because primer sequences are trimmed from reads before mapping, an issue to be fixed in the next version. Using freely available NGS data, we showed that BRCA-analyzer can also be used for hybrid capture gene panels, although it needs more extensive testing with such library preparation methods. Thus, BRCA-analyzer is an automatic workflow for processing NGS data of the BRCA1/2 genes, with variant filters adapted to amplicon-based targeted NGS data.
BRCA-analyzer can be used to identify germline as well as somatic mutations. BRCA-analyzer is freely available at https://github.com/aakechin/BRCA-analyzer.
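The detection counts reported above can be turned into per-workflow detection rates with a few lines of Python. This is only an illustrative sketch and not part of BRCA-analyzer: the function name `detection_rate` and the `counts` table are hypothetical, and the figure of 22 variants for the Qiagen workflow is derived here as 27 confirmed variants minus the 5 it discarded.

```python
def detection_rate(found: int, total: int) -> float:
    """Return the fraction of confirmed pathogenic variants that a workflow reported."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= found <= total:
        raise ValueError("found must be between 0 and total")
    return found / total


# (variants reported, confirmed variants in the sample set), taken from the abstract.
# The Qiagen count is derived: 27 confirmed variants minus 5 discarded.
counts = {
    ("GeneRead panel, 106 samples", "BRCA-analyzer"): (27, 27),
    ("GeneRead panel, 106 samples", "Qiagen GeneRead Data analysis"): (27 - 5, 27),
    ("in-house libraries, 793 samples", "BRCA-analyzer"): (135, 137),
    ("in-house libraries, 793 samples", "Canary"): (123, 137),
}

for (sample_set, workflow), (found, total) in counts.items():
    rate = detection_rate(found, total)
    print(f"{sample_set}: {workflow} reported {found}/{total} ({rate:.1%})")
```

These rates describe only the variants that were confirmed in the two sample sets; they say nothing about false positives, which the abstract does not quantify.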