Fresnedo-Ramírez Jonathan, Yang Shanshan, Sun Qi, Karn Avinash, Reisch Bruce I, Cadle-Davidson Lance
Department of Horticulture and Crop Science, The Ohio State University, Wooster, OH, United States.
School of Integrative Plant Science, Cornell AgriTech, Geneva, NY, United States.
Front Plant Sci. 2019 May 14;10:599. doi: 10.3389/fpls.2019.00599. eCollection 2019.
Amplicon sequencing (AmpSeq) is a practical, intuitive strategy with a semi-automated computational pipeline for analysis of highly multiplexed PCR-derived sequences. This genotyping platform is particularly cost-effective when multiplexing 96 or more samples with anywhere from a few to thousands of amplicons. Amplicons can target anything from a single nucleotide up to the upper limit of the sequencing platform. The flexibility of its wet-lab methods makes AmpSeq a tool of broad interest for diverse species, and it also excels in throughput, cost, accuracy, and semi-automated analysis. Here we provide an open science framework procedure for generating data from an AmpSeq project, with an emphasis on the bioinformatics pipeline used to generate SNPs, haplotypes, and presence/absence variants in a set of diverse genotypes. Open-access tutorial datasets with actual data and a containerized open-source software instance are provided to enable training in each of these genotyping applications. The pipelines presented here should be applicable to the analysis of various target-enriched (e.g., amplicon or sequence capture) Illumina sequence data.