Department of Human Genetics, McGill University, Montréal, Canada.
Canadian Center for Computational Genomics, Montréal, Canada.
PLoS Genet. 2018 Apr 12;14(4):e1007285. doi: 10.1371/journal.pgen.1007285. eCollection 2018 Apr.
Epilepsy will affect nearly 3% of people at some point during their lifetime. Previous copy number variant (CNV) studies of epilepsy have used array-based technology and were restricted to the detection of large or exonic events. In contrast, whole-genome sequencing (WGS) has the potential to profile CNVs more comprehensively, but existing analytic methods suffer from limited accuracy. We show that this is in part due to the non-uniformity of read coverage, even after intra-sample normalization. To improve on this, we developed PopSV, an algorithm that uses multiple samples to control for technical variation and enables the robust detection of CNVs. Using WGS and PopSV, we performed a comprehensive characterization of CNVs in 198 individuals affected with epilepsy and 301 controls. For both large and small variants, we found an enrichment of rare exonic events in epilepsy patients, especially in genes with predicted loss-of-function intolerance. Notably, this genome-wide survey also revealed an enrichment of rare non-coding CNVs near previously known epilepsy genes. This enrichment was strongest for non-coding CNVs located within 100 Kbp of an epilepsy gene and in regions associated with changes in gene expression, such as expression QTLs or DNase I hypersensitive sites. Finally, we report on 21 potentially damaging events that could be associated with known or new candidate epilepsy genes. Our results suggest that comprehensive sequence-based profiling of CNVs could help explain a larger fraction of epilepsy cases.
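To make the idea of using multiple samples to control for technical variation more concrete, the following is a minimal sketch, not the PopSV implementation. It assumes read counts per genomic bin are already available, and every name in it (`normalize`, `bin_z_scores`, `call_bins`, the threshold of 3, and the toy counts) is hypothetical and chosen for illustration only. Each bin of a test sample is compared with the same bin across a set of reference samples, so a bin whose coverage is low in every sample is not mistaken for a deletion.

```python
from statistics import mean, median, stdev


def normalize(counts):
    """Intra-sample normalization: divide bin counts by the sample's median count."""
    m = median(counts)
    return [c / m for c in counts]


def bin_z_scores(test_counts, reference_counts):
    """Z-score of each bin in the test sample against that bin in the references.

    Assumption for this sketch: coverage in a given bin is roughly normally
    distributed across reference samples.
    """
    test = normalize(test_counts)
    refs = [normalize(r) for r in reference_counts]
    scores = []
    for i, value in enumerate(test):
        ref_values = [r[i] for r in refs]
        mu, sd = mean(ref_values), stdev(ref_values)
        scores.append((value - mu) / sd if sd > 0 else 0.0)
    return scores


def call_bins(scores, threshold=3.0):
    """Flag bins whose absolute z-score reaches the (hypothetical) threshold."""
    return [
        (i, "gain" if s > 0 else "loss")
        for i, s in enumerate(scores)
        if abs(s) >= threshold
    ]


# Toy data: bin 2 has low coverage in every sample (technical bias),
# while bin 4 is halved only in the test sample (candidate deletion).
references = [
    [100, 102, 60, 98, 101, 99],
    [99, 100, 62, 101, 98, 100],
    [101, 98, 58, 100, 102, 99],
    [100, 101, 61, 99, 99, 102],
    [98, 99, 59, 102, 100, 101],
]
test_sample = [100, 101, 61, 99, 50, 100]

calls = call_bins(bin_z_scores(test_sample, references))
print(calls)
```

In this toy example only bin 4 is flagged. Bin 2 remains low after intra-sample normalization, but because it is equally low in all reference samples, its z-score stays close to zero.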