Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
Department of Genetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, 300070, China.
Hum Genet. 2018 Jan;137(1):73-83. doi: 10.1007/s00439-017-1857-9. Epub 2017 Dec 5.
We describe copy number variation of a ~10 kb region overlapping the long intergenic noncoding RNA (lincRNA) gene TTTY22, within the IR3 inverted repeat on the short arm of the human Y chromosome, which leaves individuals in the general population with 0-3 copies of this region. The variation is common: among 1234 individuals from the 1000 Genomes Project, 266 have 0 copies, 943 (including the reference sequence) have 1 copy, 23 have 2 copies, and two have 3 copies. Copy numbers were validated in subsets of these individuals by breakpoint PCR, fibre-FISH, and 10× Genomics Chromium linked-read sequencing. Mapping the changes in copy number onto the phylogeny of these Y chromosomes, previously established by the Project, identified at least 20 mutational events. Investigation of flanking paralogous sequence variants showed that the mutations involved flanking sequences in 18 of these events and could extend over more than 30 kb of DNA. While either gene conversion or double crossover between misaligned sister chromatids could formally explain the 0-2 copy events, gene conversion is the more likely mechanism, and these events include the longest non-allelic gene conversion reported thus far. Chromosomes with three copies of this CNV have arisen just once in our data set, via another mechanism: a duplication of 420 kb that places the third copy 230 kb proximal to the existing proximal copy. Our results establish gene conversion as a previously under-appreciated mechanism for generating copy number changes in humans and reveal the exceptionally large size of the conversion events that can occur.
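As a quick arithmetic check on the counts reported above, the following minimal Python sketch tallies the four copy-number classes and derives their frequencies and the mean copy number. The counts are taken directly from the abstract; the names `COPY_NUMBER_COUNTS` and `summarize` are illustrative only and are not part of the authors' analysis.

```python
from fractions import Fraction

# Carrier counts per copy number, as reported in the abstract.
COPY_NUMBER_COUNTS = {0: 266, 1: 943, 2: 23, 3: 2}


def summarize(counts):
    """Return (total, per-class frequency, mean copy number) for a count table."""
    total = sum(counts.values())
    # Exact fractions avoid rounding drift when the frequencies are summed.
    freqs = {cn: Fraction(n, total) for cn, n in counts.items()}
    mean_cn = Fraction(sum(cn * n for cn, n in counts.items()), total)
    return total, freqs, mean_cn


total, freqs, mean_cn = summarize(COPY_NUMBER_COUNTS)
print(f"total individuals: {total}")
for cn in sorted(freqs):
    print(f"{cn} copies: {COPY_NUMBER_COUNTS[cn]:4d} ({float(freqs[cn]):.1%})")
print(f"mean copy number: {float(mean_cn):.3f}")
```

The four classes sum to the 1234 individuals stated in the abstract; roughly 21.6% carry no copy of the region and about 76.4% carry one, so the mean copy number is about 0.806.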