Fast-GBS: a new pipeline for the efficient and highly accurate calling of SNPs from genotyping-by-sequencing data.

Authors

Torkamaneh Davoud, Laroche Jérôme, Bastien Maxime, Abed Amina, Belzile François

Affiliations

Département de Phytologie, Université Laval, Quebec City, QC, Canada.

Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC, Canada.

Publication information

BMC Bioinformatics. 2017 Jan 3;18(1):5. doi: 10.1186/s12859-016-1431-9.

Abstract

BACKGROUND

Next-generation sequencing (NGS) technologies have considerably accelerated the investigation of genome composition and function. Genotyping-by-sequencing (GBS) is a genotyping approach that uses NGS to scan a genome rapidly and economically. It has been shown to allow the simultaneous discovery and genotyping of thousands to millions of SNPs across a wide range of species. For most users, the main challenge in GBS is the bioinformatics analysis of the large volume of sequence data produced by sequencing GBS libraries in order to call alleles at SNP loci. Here we describe a new GBS bioinformatics pipeline, Fast-GBS, designed to provide highly accurate genotyping, require modest computing resources, and be easy to use.

RESULTS

Fast-GBS is built upon standard bioinformatics language and file formats, can handle data from different sequencing platforms, and can detect different kinds of variants (SNPs, MNPs, and indels). To illustrate its performance, we called variants in three collections of samples (soybean, barley, and potato) that cover a range of genome sizes, levels of genome complexity, and ploidy. Within these small sets of samples, we called 35 k, 32 k, and 38 k SNPs for soybean, barley, and potato, respectively. To assess genotype accuracy, we compared these GBS-derived SNP genotypes with independent data sets obtained from whole-genome sequencing or SNP arrays. This analysis yielded estimated accuracies of 98.7%, 95.2%, and 94% for soybean, barley, and potato, respectively.
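
The abstract does not say how accuracy was computed. One common way to compare two genotype call sets is simple concordance: the fraction of shared, non-missing calls that agree. The sketch below illustrates that idea only and should not be read as the authors' method. The function name `genotype_concordance`, the `(sample, chromosome, position)` key, and the encoding of a genotype as an alternate-allele count are all assumptions made for this example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical encoding: key is (sample, chromosome, position);
# value is the alternate-allele count, or None for a missing call.
Genotypes = Dict[Tuple[str, str, int], Optional[int]]


def genotype_concordance(gbs: Genotypes, reference: Genotypes) -> Tuple[float, int]:
    """Return (fraction of agreeing calls, number of calls compared)."""
    compared = 0
    matches = 0
    for key, gbs_call in gbs.items():
        ref_call = reference.get(key)
        # Skip sites that are missing or absent in either data set.
        if gbs_call is None or ref_call is None:
            continue
        compared += 1
        if gbs_call == ref_call:
            matches += 1
    if compared == 0:
        raise ValueError("no shared, non-missing genotype calls to compare")
    return matches / compared, compared


# Toy data, invented purely for illustration.
gbs_calls: Genotypes = {
    ("s1", "chr1", 100): 0,
    ("s1", "chr1", 200): 1,
    ("s2", "chr1", 100): 2,
    ("s2", "chr1", 200): None,  # missing GBS call, excluded from the comparison
}
reference_calls: Genotypes = {
    ("s1", "chr1", 100): 0,
    ("s1", "chr1", 200): 2,
    ("s2", "chr1", 100): 2,
    ("s2", "chr1", 200): 1,
}

accuracy, n_compared = genotype_concordance(gbs_calls, reference_calls)
print(f"{accuracy:.3f} over {n_compared} calls")
```

Counting alternate alleles rather than storing allele strings keeps the comparison independent of allele order and extends to polyploid samples, where the count can exceed 2.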

CONCLUSIONS

We conclude that Fast-GBS provides a highly efficient and reliable tool for calling SNPs from GBS data.
