Faculty of Medicine, University Hospital Cologne, Cologne 50937, Germany.
Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne 50829, Germany.
Plant Physiol. 2023 May 31;192(2):821-836. doi: 10.1093/plphys/kiad191.
Meiotic recombination is an essential mechanism during sexual reproduction and includes the exchange of chromosome segments between homologous chromosomes. New allelic combinations are transmitted to the next generation, introducing novel genetic variation in the offspring genomes. With the improvement of high-throughput whole-genome sequencing technologies, large numbers of recombinant individuals can now be sequenced at low sequencing depth and low cost, necessitating computational methods for reconstructing their haplotypes. The main challenge is the uncertainty in haplotype calling that arises from the low information content of a single genomic position. Straightforward sliding window-based approaches are difficult to tune and fail to place recombination breakpoints precisely. Hidden Markov model (HMM)-based approaches, on the other hand, tend to over-segment the genome. Here, we present RTIGER, an HMM-based model that exploits in a mathematically precise way the fact that true chromosome segments typically have a certain minimum length. We further separate the task of identifying the correct haplotype sequence from the accurate placement of haplotype borders, thereby maximizing the accuracy of border positions. By comparing segmentations based on simulated data with known underlying haplotypes, we highlight the reasons for RTIGER outperforming traditional segmentation approaches. We then analyze the meiotic recombination pattern of segregants of 2 Arabidopsis (Arabidopsis thaliana) accessions and a previously described hyper-recombining mutant. RTIGER is available as an R package with an efficient Julia implementation of the core algorithm.
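The idea of building a minimum segment length into an HMM can be illustrated with a small sketch. The Python code below is not RTIGER's algorithm or API. It is a generic Viterbi decoder in which each haplotype state is expanded into run-length sub-states, so that a switch is only permitted once the current segment has reached `min_len` markers. The function names (`emissions_from_calls`, `min_length_viterbi`), the two-state setup, the error rate, and the switch probability are all hypothetical choices made for illustration. The sketch also assumes that the final segment may be cut short by the end of the chromosome.

```python
import math

NEG_INF = float("-inf")

def emissions_from_calls(calls, error=0.1):
    """Toy emission model (assumption): a call matches the true state with prob. 1 - error."""
    return [[math.log(1 - error) if c == h else math.log(error) for h in (0, 1)]
            for c in calls]

def min_length_viterbi(log_emis, min_len, p_switch=0.01):
    """Most likely state path in which a switch requires a run of >= min_len markers.

    log_emis[t][h] is the log-likelihood of marker t under haplotype state h.
    Sub-state (h, k) means "in state h for k + 1 markers", with k capped at min_len - 1.
    """
    n_states = len(log_emis[0])
    cap = min_len - 1
    log_stay = math.log(1 - p_switch)
    log_switch = math.log(p_switch / max(n_states - 1, 1))

    # Every path starts a fresh run (k = 0) at the first marker.
    score = [[NEG_INF] * min_len for _ in range(n_states)]
    for h in range(n_states):
        score[h][0] = log_emis[0][h]

    back = []  # back[t - 1][h][k] is the predecessor sub-state of (h, k) at marker t
    for t in range(1, len(log_emis)):
        new = [[NEG_INF] * min_len for _ in range(n_states)]
        ptr = [[None] * min_len for _ in range(n_states)]
        for h in range(n_states):
            # Continue the current run; the run-length counter saturates at cap.
            for k in range(min_len):
                nk = min(k + 1, cap)
                cand = score[h][k] + log_stay
                if cand > new[h][nk]:
                    new[h][nk], ptr[h][nk] = cand, (h, k)
            # Leaving state h is allowed only from the saturated sub-state.
            for h2 in range(n_states):
                if h2 != h:
                    cand = score[h][cap] + log_switch
                    if cand > new[h2][0]:
                        new[h2][0], ptr[h2][0] = cand, (h, cap)
        for h in range(n_states):
            for k in range(min_len):
                new[h][k] += log_emis[t][h]
        score = new
        back.append(ptr)

    # Trace back from the best final sub-state.
    h, k = max(((h, k) for h in range(n_states) for k in range(min_len)),
               key=lambda s: score[s[0]][s[1]])
    path = [h]
    for ptr in reversed(back):
        h, k = ptr[h][k]
        path.append(h)
    path.reverse()
    return path

# Toy chromosome: one erroneous call inside the first block, true breakpoint after marker 10.
calls = [0] * 4 + [1] + [0] * 5 + [1] * 10
log_emis = emissions_from_calls(calls, error=0.1)
print("plain HMM  :", min_length_viterbi(log_emis, min_len=1, p_switch=0.4))
print("min_len = 3:", min_length_viterbi(log_emis, min_len=3, p_switch=0.4))
```

With `min_len=1` the decoder reduces to an ordinary Viterbi pass, which in this toy setting follows the single erroneous call and over-segments the chromosome. With `min_len=3` the isolated call can no longer form its own segment, and only the breakpoint between the two blocks remains.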