Division of Life Science and Applied Genomics Centre, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China.
HKUST Shenzhen Research Institute, 9 Yuexing First Road, Nanshan, Shenzhen, China.
Hum Genomics. 2021 Mar 19;15(1):19. doi: 10.1186/s40246-021-00318-3.
Genetic variants, which underlie phenotypic diversity, are known to be unevenly distributed across the human genome. A comprehensive understanding of how different genetic variants are distributed is important for insights into genetic functions and disorders.
Herein, a sliding-window scan was performed of the regional densities of eight kinds of germline genetic variants in the human genome, including single-nucleotide polymorphisms (SNPs) and four size classes of copy-number variations (CNVs).
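As an illustration of what a sliding-window density scan involves, the sketch below counts variant positions in overlapping windows and merges the dense windows into hotspot intervals. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names `window_densities` and `call_hotspots`, the window size, the step, and the count threshold `min_count` are all hypothetical, and the abstract does not state which values or statistical criteria the study used.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple


def window_densities(
    positions: Sequence[int], region_length: int, window: int, step: int
) -> List[Tuple[int, int]]:
    """Count variant positions in each half-open window [start, start + window)."""
    sorted_pos = sorted(positions)
    densities = []
    # Only full-length windows that fit inside the region are scanned.
    for start in range(0, region_length - window + 1, step):
        end = start + window
        # Binary search gives the number of positions with start <= p < end.
        count = bisect_left(sorted_pos, end) - bisect_left(sorted_pos, start)
        densities.append((start, count))
    return densities


def call_hotspots(
    densities: Sequence[Tuple[int, int]], window: int, min_count: int
) -> List[Tuple[int, int]]:
    """Merge overlapping or touching windows with count >= min_count into intervals."""
    hotspots: List[Tuple[int, int]] = []
    for start, count in densities:
        if count < min_count:
            continue
        end = start + window
        if hotspots and start <= hotspots[-1][1]:
            # Extend the previous hotspot instead of opening a new one.
            hotspots[-1] = (hotspots[-1][0], max(hotspots[-1][1], end))
        else:
            hotspots.append((start, end))
    return hotspots


# Toy data (hypothetical): variant coordinates on a 100-bp region.
toy_positions = [5, 6, 7, 8, 50, 95, 96, 97]
toy_densities = window_densities(toy_positions, region_length=100, window=10, step=5)
toy_hotspots = call_hotspots(toy_densities, window=10, min_count=3)
```

A fixed count threshold is used here only for simplicity; a real analysis would more likely define hotspots relative to the genome-wide density distribution for each variant type.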
The study identified 44,379 hotspots with high genetic-variant densities and 1135 hotspot clusters comprising more than one type of hotspot, accounting for 3.1% and 0.2% of the genome, respectively. The hotspots and clusters co-localize with different functional genomic features, as exemplified by the association of hotspots of middle-size CNVs with histone-modification sites. They work together with balancing and positive selection to meet the need for diversity in immune proteins, and they facilitate the development of sensory-perception and neuroactive ligand-receptor interaction pathways in the function-sparse, late-replicating genomic sequences. Genetic variants of different lengths co-localize with retrotransposons of different ages on a "long-with-young" and "short-with-all" basis. Hotspots and clusters are highly associated with tumor suppressor genes and oncogenes (p < 10), and are enriched with somatic tumor CNVs and with the trait- and disease-associated SNPs identified by genome-wide association studies, with enrichment exceeding tenfold in clusters comprising SNPs and extra-long CNVs.
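To make the notion of fold enrichment concrete, the sketch below compares the fraction of features (for example, trait-associated SNPs) that fall inside a set of intervals with the fraction of the genome those intervals cover. This is one common way to express enrichment and is assumed here for illustration; the abstract does not specify the exact statistic used in the study, and the helper names and toy numbers are hypothetical.

```python
from typing import Sequence, Tuple


def fraction_inside(points: Sequence[int], intervals: Sequence[Tuple[int, int]]) -> float:
    """Fraction of points lying in any half-open interval [start, end)."""
    if not points:
        raise ValueError("points must not be empty")
    inside = sum(any(s <= p < e for s, e in intervals) for p in points)
    return inside / len(points)


def fold_enrichment(observed_fraction: float, genome_fraction: float) -> float:
    """Observed fraction of features in the regions divided by the expected fraction.

    Under a uniform null, the expected fraction equals the share of the
    genome that the regions cover.
    """
    if genome_fraction <= 0:
        raise ValueError("genome_fraction must be positive")
    return observed_fraction / genome_fraction


# Toy example (hypothetical data): two intervals covering 25 of 100 positions.
toy_intervals = [(0, 15), (90, 100)]
toy_points = [1, 12, 50, 95]
toy_observed = fraction_inside(toy_points, toy_intervals)
toy_fold = fold_enrichment(toy_observed, genome_fraction=0.25)
```

By the same arithmetic, regions covering 3.1% of the genome would need to contain about 31% of a feature set to reach tenfold enrichment.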
In conclusion, the genetic-variant hotspots and clusters represent two-edged swords that spearhead both positive and negative genomic changes. Their strong associations with complex traits and diseases also open up a potential "Common Disease-Hotspot Variant" approach to the missing heritability problem.