Reliable genotyping of recombinant genomes using a robust hidden Markov model.

Affiliations

Faculty of Medicine, University Hospital Cologne, Cologne 50937, Germany.

Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne 50829, Germany.

Publication information

Plant Physiol. 2023 May 31;192(2):821-836. doi: 10.1093/plphys/kiad191.

Abstract

Meiotic recombination is an essential mechanism during sexual reproduction and includes the exchange of chromosome segments between homologous chromosomes. New allelic combinations are transmitted to the new generation, introducing novel genetic variation in the offspring genomes. With the improvement of high-throughput whole-genome sequencing technologies, large numbers of recombinant individuals can now be sequenced with low sequencing depth at low costs, necessitating computational methods for reconstructing their haplotypes. The main challenge is the uncertainty in haplotype calling that arises from the low information content of a single genomic position. Straightforward sliding window-based approaches are difficult to tune and fail to place recombination breakpoints precisely. Hidden Markov model (HMM)-based approaches, on the other hand, tend to over-segment the genome. Here, we present RTIGER, an HMM-based model that exploits in a mathematically precise way the fact that true chromosome segments typically have a certain minimum length. We further separate the task of identifying the correct haplotype sequence from the accurate placement of haplotype borders, thereby maximizing the accuracy of border positions. By comparing segmentations based on simulated data with known underlying haplotypes, we highlight the reasons for RTIGER outperforming traditional segmentation approaches. We then analyze the meiotic recombination pattern of segregants of 2 Arabidopsis (Arabidopsis thaliana) accessions and a previously described hyper-recombining mutant. RTIGER is available as an R package with an efficient Julia implementation of the core algorithm.
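The abstract does not say how the minimum segment length is built into the model, so the following is only a minimal sketch of one textbook way to get that behaviour, not RTIGER's actual implementation. It runs Viterbi decoding over an expanded state space in which each haplotype state carries a dwell counter, and a switch is allowed only once the counter shows that at least `min_len` markers have been spent in the current state. The function name `min_length_viterbi`, the input layout `log_emis[t][s]` (log-likelihood of marker `t` under haplotype state `s`), the stay and switch probabilities, and the toy data are all assumptions made for illustration.

```python
import math


def min_length_viterbi(log_emis, min_len,
                       log_stay=math.log(0.99), log_switch=math.log(0.01)):
    """Most likely state path in which every run spans >= min_len markers.

    log_emis[t][s] is the log-likelihood of marker t under state s
    (hypothetical input layout). min_len=1 reduces to plain Viterbi.
    """
    T, S = len(log_emis), len(log_emis[0])
    if min_len < 1 or T < min_len:
        raise ValueError("need 1 <= min_len <= number of markers")
    NEG = -math.inf
    top = min_len - 1  # dwell counter value meaning "minimum length reached"

    # score[s][k]: best log-score of a path ending in state s with counter k
    score = [[NEG] * min_len for _ in range(S)]
    for s in range(S):
        score[s][0] = log_emis[0][s]

    back = []  # back[t-1][s][k] = predecessor (state, counter) at marker t-1
    for t in range(1, T):
        new = [[NEG] * min_len for _ in range(S)]
        ptr = [[None] * min_len for _ in range(S)]
        for s in range(S):
            for k in range(min_len):
                if score[s][k] == NEG:
                    continue
                # Stay in the same state; the counter saturates at `top`.
                k2 = min(k + 1, top)
                cand = score[s][k] + log_stay + log_emis[t][s]
                if cand > new[s][k2]:
                    new[s][k2], ptr[s][k2] = cand, (s, k)
                # Switch only after the minimum dwell has been served.
                if k == top:
                    for s2 in range(S):
                        if s2 == s:
                            continue
                        cand = score[s][k] + log_switch + log_emis[t][s2]
                        if cand > new[s2][0]:
                            new[s2][0], ptr[s2][0] = cand, (s, k)
        score = new
        back.append(ptr)

    # The last segment must also reach the minimum length.
    s = max(range(S), key=lambda i: score[i][top])
    k = top
    path = [s]
    for ptr in reversed(back):
        s, k = ptr[s][k]
        path.append(s)
    path.reverse()
    return path


# Toy data: 16 markers supporting state 0, then 8 supporting state 1,
# with one strongly misleading marker at index 8.
good, bad = math.log(0.9), math.log(0.1)
toy = [[good, bad]] * 16 + [[bad, good]] * 8
toy[8] = [-12.0, 0.0]

plain = min_length_viterbi(toy, min_len=1)   # ordinary Viterbi
robust = min_length_viterbi(toy, min_len=3)  # runs of at least 3 markers
```

On this toy input, plain decoding (`min_len=1`) follows the single misleading marker and creates a one-marker segment, which is the over-segmentation the abstract attributes to standard HMM approaches. With `min_len=3` the excursion is no longer worth its cost and only the true breakpoint remains. The price of the expanded state space is a factor of `min_len` in memory and running time.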
