National Human Genome Research Institute, National Institutes of Health, 50 South Drive Room 5140, Bethesda, MD, 20892, USA.
NIAID Collaborative Bioinformatics Resource, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA.
BMC Bioinformatics. 2021 Apr 8;22(1):181. doi: 10.1186/s12859-021-04090-y.
The widespread use of next-generation sequencing has identified an important role for somatic mosaicism in many diseases. However, detecting low-level mosaic variants from next-generation sequencing data remains challenging.
Here, we present a method for Position-Based Variant Identification (PBVI) that uses empirically derived distributions of alternate nucleotides from a control dataset. We modeled this approach on 11 segmental overgrowth genes. We show that this method improves detection of single-nucleotide mosaic variants with a variant allele fraction of 0.01-0.05 compared to other low-level variant callers. At depths of 600× and 1200×, we observed >85% and >95% sensitivity, respectively. In a cohort of 26 individuals with somatic overgrowth disorders, PBVI showed an improved signal-to-noise ratio, identifying pathogenic variants in 17 individuals.
PBVI can facilitate identification of low-level mosaic variants, thus increasing the utility of next-generation sequencing data for research and diagnostic purposes.
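The abstract does not specify the statistical model behind PBVI, so the following is only a minimal sketch of the general idea of position-based calling against an empirical control distribution, not the authors' implementation. It assumes that each position and alternate base has a list of alternate-allele fractions observed in control samples, and it flags a test sample when its fraction is rarely matched or exceeded in the controls. The names `empirical_p_value` and `call_variants`, the pseudocount-based empirical p-value, and the `min_depth` and `alpha` defaults are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical key: (chromosome, 1-based position, alternate base).
Position = Tuple[str, int, str]


@dataclass
class Call:
    position: Position
    vaf: float          # variant allele fraction in the test sample
    empirical_p: float  # share of controls at least as extreme


def empirical_p_value(control_vafs: List[float], sample_vaf: float) -> float:
    """Fraction of control observations >= the sample value.

    A +1 pseudocount keeps the estimate above zero (an assumption of this
    sketch, not something stated in the abstract).
    """
    n_extreme = sum(1 for v in control_vafs if v >= sample_vaf)
    return (n_extreme + 1) / (len(control_vafs) + 1)


def call_variants(
    controls: Dict[Position, List[float]],
    sample: Dict[Position, Tuple[int, int]],  # position -> (alt reads, depth)
    min_depth: int = 600,   # assumed cutoff, chosen only for illustration
    alpha: float = 0.05,    # assumed significance level
) -> List[Call]:
    """Flag positions whose alternate fraction stands out from the controls."""
    calls = []
    for pos, (alt_reads, depth) in sample.items():
        if depth < min_depth or pos not in controls:
            continue  # too shallow, or no background distribution available
        vaf = alt_reads / depth
        p = empirical_p_value(controls[pos], vaf)
        if p <= alpha:
            calls.append(Call(pos, vaf, p))
    return calls


# Synthetic toy data (not from the study): 30 controls per position with
# background alternate fractions between 0.001 and 0.003.
background = [0.001 * (1 + i % 3) for i in range(30)]
toy_controls = {("chr1", 100, "T"): background, ("chr1", 200, "A"): background}
toy_sample = {
    ("chr1", 100, "T"): (24, 1200),  # fraction 0.02, above every control
    ("chr1", 200, "A"): (2, 1200),   # fraction within the background range
}
toy_calls = call_variants(toy_controls, toy_sample)
```

In this sketch, the per-position background is what distinguishes a true low-level variant from a site that is simply noisy: the same 0.02 fraction would not be flagged at a position where controls routinely reach that level. With only 30 controls the smallest attainable empirical p-value is 1/31, so the size of the control set limits how stringent `alpha` can be.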