Institute of Molecular Biosciences, Massey University, Palmerston North, New Zealand.
BMC Genomics. 2012 Jul 19;13:322. doi: 10.1186/1471-2164-13-322.
Expression profiling has been proposed as a means of screening non-model organisms in their natural environments to identify genes potentially important in adaptive diversification. Tag profiling using high-throughput sequencing is a relatively low-cost means of expression profiling with deep coverage. However, the extent to which very short cDNA sequences can be used effectively in screening for candidate genes is unclear. Here we investigate this question using an evolutionarily distant as well as a closely related transcriptome for referencing tags. We do this by comparing differentially expressed genes and processes between two closely related allopolyploid species of Pachycladon which have distinct altitudinal preferences in the New Zealand Southern Alps. We validate biological inferences against earlier microarray analyses.
Statistical and gene annotation enrichment analyses of tag profiles identified more differentially expressed genes of potential adaptive significance than previous analyses of array-based expression profiles. These include genes involved in glucosinolate metabolism, flowering time, and response to cold, desiccation, fungi and oxidation. In addition, despite the short length of 20mer tags, we were able to infer patterns of homeologous gene expression for 700 genes in our reference library of 7,128 full-length Pachycladon ESTs. We also demonstrate significant information loss when tags are mapped to the non-conspecific reference transcriptome of A. thaliana rather than to P. fastigiatum ESTs, and we describe mapping strategies by which the larger collection of A. thaliana ESTs can nevertheless be used as a reference.
When coupled with a reference transcriptome generated using RNA-seq, tag sequencing offers a promising approach for screening natural populations and identifying candidate genes of potential adaptive significance. We identify computational issues important for the successful application of tag profiling in a non-model allopolyploid plant species.