Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China.
College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.
Nucleic Acids Res. 2022 Mar 21;50(5):2493-2508. doi: 10.1093/nar/gkac128.
Mobile element insertions (MEIs) are a major class of structural variants (SVs) and have been linked to many human genetic disorders, including hemophilia, neurofibromatosis, and various cancers. However, human MEI resources from large-scale genome sequencing remain scarce compared with those for SNPs and other SVs. Here, we report a comprehensive map of 36 699 non-reference MEIs constructed from 5675 genomes, comprising 2998 Chinese samples (∼26.2×, NyuWa) and 2677 samples from the 1000 Genomes Project (∼7.4×, 1KGP). We discovered that LINE-1 insertions were highly enriched in centromere regions, implying a role for chromosomal context in retroelement insertion. After functional annotation, we estimated that MEIs are responsible for about 9.3% of all protein-truncating events per genome. Finally, we built a companion database, HMEID, for public use. This resource represents the latest and largest genome-wide study of MEIs and will have broad utility for exploring human MEIs.