Genomics Research Center, Academia Sinica, Taipei, Taiwan.
Biomed Res Int. 2013;2013:264532. doi: 10.1155/2013/264532. Epub 2013 Aug 1.
Human and other primate genomes contain many segmental duplications (SDs) due to the fixation of copy number variations (CNVs). The structure of these duplications within the human genome has been shown to be a complex mosaic of juxtaposed subunits, called duplicons. These duplicons are difficult to uncover from the mosaic repeat structure, and their distribution and evolution among primates remain poorly investigated. In this paper, we develop a statistical framework for discovering duplicons that integrates a Hidden Markov Model (HMM) with a permutation test. Our comparative analysis indicates that the mosaic structure of duplicons is common in CNV/SD regions of both the human and chimpanzee genomes, and that a subset of core duplicons is shared by the majority of CNVs/SDs. Phylogenetic analyses using duplicons suggest that most CNVs/SDs share a common duplication ancestry. Many human and chimpanzee duplicons flank both ends of CNVs and may be hotspots of nonallelic homologous recombination.
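The abstract names a permutation test but does not give the test statistic or the null model, so the following is only a generic sketch of how such a test can assess a candidate segment, not the authors' procedure. It assumes a hypothetical list of per-position scores (for example, similarity scores along a CNV/SD region) and asks whether the mean score inside a candidate segment is higher than expected when the scores are shuffled. The function name `permutation_p_value` and all parameters are illustrative assumptions; the HMM step is not shown.

```python
import random


def permutation_p_value(scores, start, end, n_perm=1000, seed=0):
    """One-sided permutation p-value for the segment scores[start:end].

    Hypothetical helper for illustration only: the statistic (segment mean)
    and the null model (free shuffling of positions) are assumptions, not
    taken from the paper.
    """
    if not 0 <= start < end <= len(scores):
        raise ValueError("segment must lie within the score sequence")

    rng = random.Random(seed)  # fixed seed so the result is reproducible
    length = end - start
    observed = sum(scores[start:end]) / length

    shuffled = list(scores)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # After shuffling, a fixed window is a random sample of positions.
        if sum(shuffled[:length]) / length >= observed:
            hits += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: a high-scoring block at the end of an otherwise flat region.
    toy_scores = [0] * 90 + [1] * 10
    print(permutation_p_value(toy_scores, 90, 100))
```

Free shuffling ignores any dependence between neighbouring positions; a real analysis would likely need a null model that preserves such structure.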