TASSEL-GBS: a high capacity genotyping by sequencing analysis pipeline.

Authors

Glaubitz Jeffrey C, Casstevens Terry M, Lu Fei, Harriman James, Elshire Robert J, Sun Qi, Buckler Edward S

Affiliations

Institute for Genomic Diversity, Cornell University, Ithaca, New York, United States of America.

Biotechnology Resource Center Bioinformatics Facility, Cornell University, Ithaca, New York, United States of America.

Publication information

PLoS One. 2014 Feb 28;9(2):e90346. doi: 10.1371/journal.pone.0090346. eCollection 2014.

Abstract

Genotyping by sequencing (GBS) is a next generation sequencing based method that takes advantage of reduced representation to enable high throughput genotyping of large numbers of individuals at a large number of SNP markers. The relatively straightforward, robust, and cost-effective GBS protocol is currently being applied in numerous species by a large number of researchers. Herein we describe a bioinformatics pipeline, TASSEL-GBS, designed for the efficient processing of raw GBS sequence data into SNP genotypes. The TASSEL-GBS pipeline successfully fulfills the following key design criteria: (1) Ability to run on the modest computing resources that are typically available to small breeding or ecological research programs, including desktop or laptop machines with only 8-16 GB of RAM, (2) Scalability from small to extremely large studies, where hundreds of thousands or even millions of SNPs can be scored in up to 100,000 individuals (e.g., for large breeding programs or genetic surveys), and (3) Applicability in an accelerated breeding context, requiring rapid turnover from tissue collection to genotypes. Although a reference genome is required, the pipeline can also be run with an unfinished "pseudo-reference" consisting of numerous contigs. We describe the TASSEL-GBS pipeline in detail and benchmark it based upon a large scale, species wide analysis in maize (Zea mays), where the average error rate was reduced to 0.0042 through application of population genetic-based SNP filters. Overall, the GBS assay and the TASSEL-GBS pipeline provide robust tools for studying genomic diversity.
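The abstract does not spell out which population genetic-based SNP filters were applied. As a rough illustration of the idea only, the sketch below filters biallelic SNPs on two standard population-genetic statistics: minor allele frequency and the inbreeding coefficient F = 1 - H_obs / H_exp. The `SnpCounts` container, the function names, and the thresholds `min_maf` and `min_f` are all hypothetical and are not taken from TASSEL-GBS. The assumption behind the F threshold is that the samples are largely inbred, so a site with many apparent heterozygotes is more likely to reflect an artifact than true segregation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SnpCounts:
    """Genotype counts at one biallelic SNP (hypothetical container)."""
    n_ref_hom: int  # individuals homozygous for the reference allele
    n_het: int      # heterozygous individuals
    n_alt_hom: int  # individuals homozygous for the alternate allele

    @property
    def n_called(self) -> int:
        return self.n_ref_hom + self.n_het + self.n_alt_hom


def alt_allele_freq(c: SnpCounts) -> Optional[float]:
    """Frequency of the alternate allele, or None if nothing was called."""
    if c.n_called == 0:
        return None
    return (2 * c.n_alt_hom + c.n_het) / (2 * c.n_called)


def minor_allele_freq(c: SnpCounts) -> float:
    """Frequency of the rarer allele (0.0 if nothing was called)."""
    p = alt_allele_freq(c)
    return 0.0 if p is None else min(p, 1.0 - p)


def inbreeding_coefficient(c: SnpCounts) -> Optional[float]:
    """F = 1 - H_obs / H_exp; None when the site is monomorphic or uncalled."""
    p = alt_allele_freq(c)
    if p is None:
        return None
    h_exp = 2.0 * p * (1.0 - p)   # expected heterozygosity under Hardy-Weinberg
    if h_exp == 0.0:
        return None
    h_obs = c.n_het / c.n_called  # observed heterozygosity
    return 1.0 - h_obs / h_exp


def passes_filters(c: SnpCounts, min_maf: float = 0.01, min_f: float = 0.8) -> bool:
    """Keep a SNP only if it is polymorphic enough and not heterozygote-rich.

    Both thresholds are illustrative assumptions, not published defaults.
    """
    f = inbreeding_coefficient(c)
    return f is not None and minor_allele_freq(c) >= min_maf and f >= min_f
```

For example, a site with 90 reference homozygotes, no heterozygotes, and 10 alternate homozygotes passes under these thresholds, while a site with 25, 50, and 25 (Hardy-Weinberg proportions, F = 0) is rejected. Appropriate thresholds depend on the mating system and sample composition of the study population.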

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a64/3938676/7035fc65ade4/pone.0090346.g001.jpg
