Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, 08193 Bellaterra (Barcelona), Spain.
CIBIO/InBIO Research Center in Biodiversity and Genetic Resources, University of Porto, 4485-661 Vairão, Portugal.
Genome Res. 2020 May;30(5):724-735. doi: 10.1101/gr.255273.119. Epub 2020 May 18.
Despite the interest in characterizing genomic variation, the presence of large repeats at the breakpoints hinders the analysis of many structural variants. This is especially problematic for inversions, since there is typically no gain or loss of DNA. Here, we tested novel linkage-based droplet digital PCR (ddPCR) assays to study 20 inversions ranging from 3.1 to 742 kb and flanked by inverted repeats (IRs) up to 134 kb long. Of those, we validated 13 inversions predicted by different genome-wide techniques. In addition, we obtained new experimental human population information across 95 African, European, and East Asian individuals for 16 inversions, including four previously validated variants for which no high-throughput genotyping method was available. Through comparison with previous data, independent replicates, and results from both inversion breakpoints, we demonstrate that the technique is highly accurate and reproducible. Most studied inversions are widespread across continents, and their frequency is negatively correlated with genetic length. Moreover, all except two show clear signs of being recurrent, and we could better define the factors affecting recurrence levels and estimate the inversion rate across the genome. Finally, the generated genotypes have allowed us to test the functional effects of inversions, validating gene expression differences reported before for two inversions and finding new candidate associations. Therefore, the developed methodology makes it possible, for the first time, to screen these and other complex genomic variants quickly in a large number of samples, highlighting the importance of direct genotyping to assess their potential consequences and clinical implications.