The Fish Molecular Genetics and Biotechnology Laboratory, Department of Fisheries and Allied Aquacultures and Program of Cell and Molecular Biosciences, Aquatic Genomics Unit, Auburn University, Auburn, AL 36849, USA.
BMC Genomics. 2011 Jan 21;12:53. doi: 10.1186/1471-2164-12-53.
Single nucleotide polymorphisms (SNPs) have become the marker of choice for genome-wide association studies. To provide the best genome coverage for the analysis of performance and production traits, a large number of relatively evenly distributed SNPs is needed. Gene-associated SNPs may fulfill these requirements of large numbers and genome-wide distribution. In addition, gene-associated SNPs could themselves be causative SNPs for traits. The objective of this project was to identify large numbers of gene-associated SNPs using high-throughput next-generation sequencing.
Transcriptome sequencing was conducted for channel catfish and blue catfish using Illumina next-generation sequencing technology. Approximately 220 million reads (15.6 Gb) for channel catfish and 280 million reads (19.6 Gb) for blue catfish were obtained by sequencing gene transcripts derived from various tissues of multiple individuals with diverse genetic backgrounds. In total, more than 35 billion base pairs of expressed short-read sequences were generated. Over two million putative SNPs were identified from channel catfish and almost 2.5 million putative SNPs were identified from blue catfish. From these putative SNPs, a set of filtered SNPs was identified, including 342,104 intra-specific SNPs for channel catfish, 366,269 intra-specific SNPs for blue catfish, and 420,727 inter-specific SNPs between channel catfish and blue catfish. These filtered SNPs are distributed within 16,562 unique genes in channel catfish and 17,423 unique genes in blue catfish.
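The sequencing figures above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: the dictionary layout and the function name `summarize` are illustrative assumptions. The read and base counts are the rounded values quoted in the abstract, so the derived read lengths are approximate. The SNPs-per-gene ratio is only a rough indicator, since it assumes the intra-specific SNPs of each species fall within the gene count reported for that species.

```python
# Rounded figures quoted in the abstract; the layout here is an illustrative assumption.
DATASETS = {
    "channel catfish": {"reads": 220e6, "bases": 15.6e9,
                        "intra_snps": 342_104, "genes": 16_562},
    "blue catfish": {"reads": 280e6, "bases": 19.6e9,
                     "intra_snps": 366_269, "genes": 17_423},
}


def summarize(datasets):
    """Derive total yield, mean read length and a rough SNPs-per-gene ratio."""
    summary = {"total_gb": sum(d["bases"] for d in datasets.values()) / 1e9}
    for species, d in datasets.items():
        summary[species] = {
            # Mean read length in bases, from total bases divided by read count.
            "mean_read_length": d["bases"] / d["reads"],
            # Rough density: intra-specific SNPs divided by unique genes.
            "snps_per_gene": d["intra_snps"] / d["genes"],
        }
    return summary


if __name__ == "__main__":
    result = summarize(DATASETS)
    print(f"Total yield: {result['total_gb']:.1f} Gb")
    for species in DATASETS:
        stats = result[species]
        print(f"{species}: ~{stats['mean_read_length']:.1f} bases per read, "
              f"~{stats['snps_per_gene']:.1f} intra-specific SNPs per gene")
```

The combined yield of about 35.2 Gb agrees with the reported total of more than 35 billion base pairs, and both data sets imply reads of roughly 70 bases.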
For aquaculture species, transcriptome analysis of pooled RNA samples from multiple individuals using Illumina sequencing technology is both technically efficient and cost-effective for generating expressed sequences. Such an approach is most effective when coupled with existing EST resources generated using traditional sequencing approaches, because the reference ESTs facilitate assembly of the expressed short reads. When multiple individuals with different genetic backgrounds are used, RNA-Seq is very effective for the identification of SNPs. The SNPs identified in this report will provide a much-needed resource for genetic studies in catfish and will contribute to the development of a high-density SNP array. Validation and testing of these SNPs using SNP arrays will form the material basis for genome-wide association studies and whole-genome-based selection in catfish.