Genotyping Polyploids from Messy Sequencing Data.

Author information

Department of Mathematics and Statistics, American University, Washington, DC 20016

Horticultural Sciences Department, University of Florida, Gainesville, Florida 32611.

Publication information

Genetics. 2018 Nov;210(3):789-807. doi: 10.1534/genetics.118.301468. Epub 2018 Sep 5.

Abstract

Detecting and quantifying the differences in individual genomes (i.e., genotyping) plays a fundamental role in most modern bioinformatics pipelines. Many scientists now use reduced representation next-generation sequencing (NGS) approaches for genotyping. Genotyping diploid individuals using NGS is a well-studied field, and similar methods for polyploid individuals are just emerging. However, there are many aspects of NGS data, particularly in polyploids, that remain unexplored by most methods. Our contributions in this paper are fourfold: (i) We draw attention to, and then model, common aspects of NGS data: sequencing error, allelic bias, overdispersion, and outlying observations. (ii) Many datasets feature related individuals, and so we use the structure of Mendelian segregation to build an empirical Bayes approach for genotyping polyploid individuals. (iii) We develop novel models to account for preferential pairing of chromosomes, and harness these for genotyping. (iv) We derive oracle genotyping error rates that may be used for read depth suggestions. We assess the accuracy of our method in simulations, and apply it to a dataset of hexaploid sweet potato (Ipomoea batatas). An R package implementing our method is available at https://cran.r-project.org/package=updog.
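
To make contributions (i) and (ii) concrete, the following Python sketch shows one way a read-count likelihood with sequencing error, allelic bias, and overdispersion can be combined with a Mendelian segregation prior. It is an illustration under stated assumptions, not the updog implementation: the function names, the parameterization of bias and overdispersion, the default parameter values, and the simplified segregation model (no double reduction, no preferential pairing, no outlier component) are all assumptions made for this example.

```python
import math

def ref_read_prob(dosage, ploidy, seq_error=0.005, bias=1.0):
    """Assumed probability that a read carries the reference allele."""
    p = dosage / ploidy
    # Sequencing error flips a small fraction of reads in either direction.
    f = p * (1 - seq_error) + (1 - p) * seq_error
    # Allelic bias (assumed form): bias == 1 means both alleles are observed equally.
    return f / (bias * (1 - f) + f)

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def betabinom_logpmf(x, n, mean, overdispersion):
    """Beta-binomial log-pmf with mean in (0, 1) and overdispersion in (0, 1)."""
    scale = (1 - overdispersion) / overdispersion
    a, b = mean * scale, (1 - mean) * scale
    log_choose = math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    return log_choose + log_beta(x + a, n - x + b) - log_beta(a, b)

def gamete_probs(parent_dosage, ploidy):
    """Hypergeometric gamete dosages (assumes no double reduction)."""
    half = ploidy // 2
    denom = math.comb(ploidy, half)
    return [math.comb(parent_dosage, g) * math.comb(ploidy - parent_dosage, half - g) / denom
            for g in range(half + 1)]

def offspring_prior(dosage1, dosage2, ploidy):
    """Offspring dosage distribution: convolution of the two parental gametes."""
    g1, g2 = gamete_probs(dosage1, ploidy), gamete_probs(dosage2, ploidy)
    prior = [0.0] * (ploidy + 1)
    for i, pi in enumerate(g1):
        for j, pj in enumerate(g2):
            prior[i + j] += pi * pj
    return prior

def genotype_posterior(ref_reads, total_reads, ploidy, prior=None,
                       seq_error=0.005, bias=1.0, overdispersion=0.01):
    """Posterior over dosages 0..ploidy given reference and total read counts."""
    if prior is None:
        prior = [1.0 / (ploidy + 1)] * (ploidy + 1)  # uniform prior by default
    log_post = []
    for k in range(ploidy + 1):
        log_prior = math.log(prior[k]) if prior[k] > 0 else float("-inf")
        xi = ref_read_prob(k, ploidy, seq_error, bias)
        log_post.append(log_prior + betabinom_logpmf(ref_reads, total_reads, xi, overdispersion))
    top = max(log_post)
    weights = [math.exp(v - top) for v in log_post]  # subtract max for stability
    total = sum(weights)
    return [w / total for w in weights]

# Example: a hexaploid offspring of two parents that each carry dosage 3.
posterior = genotype_posterior(30, 60, ploidy=6, prior=offspring_prior(3, 3, 6))
```

In this sketch the segregation prior is fixed by the assumed parental dosages. An empirical Bayes treatment such as the one the abstract describes would instead estimate such quantities from the data across all individuals.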
