Developing best practices for genotyping-by-sequencing analysis in the construction of linkage maps.

Affiliations

Department of Genetics, University of São Paulo, São Paulo 13418-900, Brazil.

Department of Horticultural Sciences, Texas A&M University, College Station, TX 77843-0001, USA.

Publication information

Gigascience. 2022 Dec 28;12. doi: 10.1093/gigascience/giad092. Epub 2023 Oct 27.

DOI:10.1093/gigascience/giad092
PMID:37889010
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10603770/
Abstract

BACKGROUND

Genotyping-by-sequencing (GBS) provides affordable methods for genotyping hundreds of individuals using millions of markers. However, this challenges bioinformatic procedures that must overcome possible artifacts such as the bias generated by polymerase chain reaction duplicates and sequencing errors. Genotyping errors lead to data that deviate from what is expected from regular meiosis. This, in turn, leads to difficulties in grouping and ordering markers, resulting in inflated and incorrect linkage maps. Therefore, genotyping errors can be easily detected by linkage map quality evaluations.
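To make the link between genotyping errors and inflated maps concrete, here is a minimal sketch. It is not the Reads2Map or OneMap method. It assumes a simplified two-class (backcross-like) marker, an independent miscall probability `e` per genotype, evenly spaced markers, and a naive two-point estimate converted with the Haldane mapping function. All function names are hypothetical.

```python
import math


def haldane_cm(r):
    """Haldane map distance in centimorgans for a recombination fraction r < 0.5."""
    return -50.0 * math.log(1.0 - 2.0 * r)


def apparent_recombination(r, e):
    """Expected recombination fraction seen between two adjacent markers when
    each genotype call is independently flipped with probability e.

    Assumption: two genotype classes per marker, so a miscall at exactly one
    of the two markers turns a non-recombinant into an apparent recombinant
    (and vice versa).
    """
    flip = 2.0 * e * (1.0 - e)  # probability that exactly one call is wrong
    return r * (1.0 - flip) + (1.0 - r) * flip


def map_length_cm(n_markers, r, e):
    """Total map length for n_markers evenly spaced markers, ignoring any error model."""
    return (n_markers - 1) * haldane_cm(apparent_recombination(r, e))


# 101 markers with a true recombination fraction of 0.01 between neighbours
true_len = map_length_cm(101, 0.01, 0.0)
noisy_len = map_length_cm(101, 0.01, 0.02)  # 2% miscalls, an illustrative value
print(f"error-free map: {true_len:.1f} cM, with 2% errors: {noisy_len:.1f} cM")
```

Under these assumptions, a small error rate adds spurious recombination events to every interval, so the inflation grows with marker density. This is why map length serves as a sensitive quality indicator.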

RESULTS

We developed and used the Reads2Map workflow to build linkage maps with simulated and empirical GBS data of diploid outcrossing populations. The workflows run GATK, Stacks, TASSEL, and Freebayes for single-nucleotide polymorphism calling and updog, polyRAD, and SuperMASSA for genotype calling, as well as OneMap and GUSMap to build linkage maps. Using simulated data, we observed which genotype call software fails in identifying common errors in GBS sequencing data and proposed specific filters to better handle them. We tested whether it is possible to overcome errors in a linkage map using genotype probabilities from each software or global error rates to estimate genetic distances with an updated version of OneMap. We also evaluated the impact of segregation distortion, contaminant samples, and haplotype-based multiallelic markers in the final linkage maps. Through our evaluations, we observed that some of the approaches produce different results depending on the dataset (dataset dependent) and others produce consistent advantageous results among them (dataset independent).
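As an illustration of what screening for segregation distortion can look like, the sketch below applies a chi-square goodness-of-fit test to genotype counts at a single marker. It is a generic textbook test, not necessarily the procedure used in the workflow. The 1:2:1 ratio assumes a marker that is heterozygous in both parents of a diploid outcrossing population, and the function names, `alpha`, and the Bonferroni-style `n_tests` correction are assumptions made for the example.

```python
import math


def chi_square_stat(observed, expected_ratio):
    """Pearson chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    stat = 0.0
    for obs, ratio in zip(observed, expected_ratio):
        expected = total * ratio / ratio_sum
        stat += (obs - expected) ** 2 / expected
    return stat


def chi_square_pvalue(stat, df):
    """Upper-tail probability of the chi-square distribution.

    Closed forms exist for 1 and 2 degrees of freedom, which cover the
    1:1 and 1:2:1 marker types used in this sketch.
    """
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def is_distorted(observed, expected_ratio, alpha=0.05, n_tests=1):
    """Flag a marker whose p-value falls below a Bonferroni-corrected threshold."""
    stat = chi_square_stat(observed, expected_ratio)
    p_value = chi_square_pvalue(stat, len(observed) - 1)
    return p_value < alpha / n_tests


# Hypothetical genotype counts (AA, AB, BB) for two markers in 100 progeny
print(is_distorted((25, 50, 25), (1, 2, 1), n_tests=1000))  # matches 1:2:1
print(is_distorted((50, 40, 10), (1, 2, 1), n_tests=1000))  # excess of AA
```

Whether distorted markers should be removed or kept is a separate question. The abstract only states that the impact of distortion on the final maps was evaluated.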

CONCLUSIONS

We set as default in the Reads2Map workflows the approaches that showed to be dataset independent for GBS datasets according to our results. This reduces the number of required tests to identify optimal pipelines and parameters for other empirical datasets. Using Reads2Map, users can select the pipeline and parameters that best fit their data context. The Reads2MapApp shiny app provides a graphical representation of the results to facilitate their interpretation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37c7/10603770/798645fa8f1c/giad092fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37c7/10603770/602594811ff1/giad092fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37c7/10603770/f7935e634e5c/giad092fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37c7/10603770/6fe957deb210/giad092fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37c7/10603770/ce5eed07e12b/giad092fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37c7/10603770/f3599501fa9e/giad092fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37c7/10603770/907f24a4e4b4/giad092fig7.jpg
