Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain.
Hum Genet. 2023 Sep;142(9):1327-1343. doi: 10.1007/s00439-023-02579-5. Epub 2023 Jun 14.
We provide the first whole-genome copy number variant (CNV) study addressing the Roma, along with reference populations from South Asia, the Middle East and Europe. Using CNV calling software for short-read sequence data, we identified 3171 deletions and 489 duplications. Taking into account the known population history of the Roma, as inferred from whole-genome nucleotide variation, we could discern how this history has shaped CNV variation. As expected, patterns of deletion variation in the Roma, but not of duplication variation, followed those obtained from single nucleotide polymorphisms (SNPs). A reduced effective population size, resulting in slightly relaxed natural selection, may explain our observation of an increase in intronic (but not exonic) deletions within loss-of-function (LoF)-intolerant genes. Over-representation analysis of the LoF-intolerant gene sets hosting intronic deletions highlights a substantial accumulation of shared biological processes in the Roma, intriguingly related to signaling, the nervous system and development, which may be connected to the known profile of private disease in the population. Finally, we show a link between deletions and known trait-related SNPs reported in the genome-wide association study (GWAS) catalog; these deletions and SNPs exhibited even frequency distributions among the studied populations. This suggests that the strong association between deletions and SNPs associated with biomedical conditions and traits could be widespread across continental populations, reflecting a common background of potentially disease- or trait-related CNVs.
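For readers unfamiliar with over-representation analysis, the idea can be sketched with a one-sided hypergeometric test: given a background of genes, how surprising is the overlap between a gene list of interest and the genes annotated to one biological process? The abstract does not state which tool or test was used, so the following is only a minimal illustration of the general technique, not the authors' pipeline. The function name `hypergeom_sf` and all counts in the usage lines are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for a hypergeometric variable X.

    N: size of the background gene set (assumed, e.g. all LoF-intolerant genes)
    K: background genes annotated to the biological process
    n: size of the gene list of interest (e.g. genes hosting intronic deletions)
    k: observed overlap between the gene list and the process
    """
    lower = max(k, 0)
    upper = min(K, n)
    total = comb(N, n)  # number of ways to draw n genes from the background
    # Sum the probabilities of every overlap at least as large as the observed one.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(lower, upper + 1))
    return tail / total


# Hypothetical counts, for illustration only (not taken from the study).
p_value = hypergeom_sf(k=12, N=3000, K=150, n=80)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice such a test is repeated for many processes, so the resulting p-values would normally be corrected for multiple testing before any process is reported as enriched.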