Department of Systems Biology, Columbia University College of Physicians and Surgeons, 1130 St. Nicholas Avenue, New York, NY 10032, USA; Department of Biomedical Informatics, Columbia University College of Physicians and Surgeons, 1130 St. Nicholas Avenue, New York, NY 10032, USA.
Cell Syst. 2016 Jul;3(1):83-94. doi: 10.1016/j.cels.2016.05.008. Epub 2016 Jun 23.
Meiotic recombination is a fundamental evolutionary process driving diversity in eukaryotes. In mammals, recombination is known to occur preferentially at specific genomic regions. Using topological data analysis (TDA), a branch of applied topology that extracts global features from large data sets, we developed an efficient method for mapping recombination at fine scales. Compared with standard linkage-based methods, TDA can handle larger numbers of SNPs and genomes without incurring prohibitive computational costs. We applied TDA to 1000 Genomes Project data and constructed high-resolution whole-genome recombination maps of seven human populations. Our analysis shows that recombination is generally under-represented within transcription start sites. However, the binding sites of specific transcription factors are enriched for sites of recombination. These include transcription factors that regulate the expression of meiosis- and gametogenesis-specific genes, cell cycle progression, and differentiation blockage. Additionally, our analysis identifies an enrichment for sites of recombination at repeat-derived loci matched by piwi-interacting RNAs.