Department of Computational and Systems Biology, Agency for Science, Technology and Research, Genome Institute of Singapore, Singapore, Singapore.
School of Computing, National University of Singapore, Singapore, Singapore.
Nat Commun. 2022 Jul 22;13(1):4248. doi: 10.1038/s41467-022-31765-8.
Identification of somatic mutations in tumor samples is commonly based on statistical methods in combination with heuristic filters. Here we develop VarNet, an end-to-end deep learning approach for identification of somatic variants from aligned tumor and matched normal DNA reads. VarNet is trained using image representations of 4.6 million high-confidence somatic variants annotated in 356 tumor whole genomes. We benchmark VarNet across a range of publicly available datasets, demonstrating performance often exceeding current state-of-the-art methods. Overall, our results demonstrate how a scalable deep learning approach could augment and potentially supplant human-engineered features and heuristic filters in somatic variant calling.