Department of Biology, University of Kentucky, Lexington, KY 40506, USA.
Mol Ecol. 2013 Jan;22(1):111-29. doi: 10.1111/mec.12049. Epub 2012 Oct 12.
Modern analytical methods for population genetics and phylogenetics are expected to provide more accurate results when data from multiple genome-wide loci are analysed. We present the results of an initial application of parallel tagged sequencing (PTS) on a next-generation platform to sequence thousands of barcoded PCR amplicons generated from 95 nuclear loci and 93 individuals sampled across the range of the tiger salamander (Ambystoma tigrinum) species complex. To manage the bioinformatic processing of this large data set (344 330 reads), we developed a pipeline that sorts PTS data by barcode and locus, identifies high-quality variable nucleotides and yields phased haplotype sequences for each individual at each locus. Our sequencing and bioinformatic strategy resulted in a genome-wide data set with relatively low levels of missing data and a wide range of nucleotide variation. STRUCTURE analyses of these data in a genotypic format resulted in strongly supported assignments for the majority of individuals into nine geographically defined genetic clusters. Species tree analyses of the most variable loci using a multi-species coalescent model resulted in strong support for most branches in the species tree; however, analyses including more than 50 loci produced parameter sampling trends that indicated a lack of convergence on the posterior distribution. Overall, these results demonstrate the potential for amplicon-based PTS to rapidly generate large-scale data for population genetic and phylogenetic-based research.
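The first pipeline step described above, sorting PTS reads by barcode and locus, can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes each read begins with an individual-specific barcode followed directly by a locus-specific primer, and it uses exact prefix matching only. The function names, barcode sequences, primer sequences and sample labels are all hypothetical.

```python
from collections import defaultdict


def match_prefix(seq, table):
    """Return (name, remainder) for the first prefix in `table` that `seq` starts with.

    `table` maps a prefix sequence to a label. Returns (None, seq) if nothing matches.
    """
    for prefix, name in table.items():
        if seq.startswith(prefix):
            return name, seq[len(prefix):]
    return None, seq


def demultiplex(reads, barcodes, primers):
    """Sort reads into (individual, locus) bins by exact barcode and primer match.

    Assumption (not from the source): read layout is barcode + primer + insert.
    Reads that fail either match are returned separately as unassigned.
    """
    bins = defaultdict(list)
    unassigned = []
    for read in reads:
        individual, rest = match_prefix(read, barcodes)
        locus, insert = match_prefix(rest, primers)
        if individual is None or locus is None:
            unassigned.append(read)
        else:
            bins[(individual, locus)].append(insert)
    return dict(bins), unassigned


if __name__ == "__main__":
    # Hypothetical barcodes and primers, for illustration only.
    barcodes = {"ACGT": "ind01", "TGCA": "ind02"}
    primers = {"GGATCC": "locusA", "CCTAGG": "locusB"}
    reads = [
        "ACGTGGATCCAAATTT",
        "TGCACCTAGGCCCGGG",
        "ACGTGGATCCAAATTA",
        "NNNNGGATCCAAAT",
    ]
    bins, unassigned = demultiplex(reads, barcodes, primers)
    for key, inserts in sorted(bins.items()):
        print(key, inserts)
    print("unassigned:", unassigned)
```

Real amplicon data would normally also need tolerance for sequencing errors in the barcode and primer, plus quality filtering, before the variant-calling and haplotype-phasing steps; those are left out of this sketch.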