Witjes Lotte, Kooke Rik, van der Hooft Justin J J, de Vos Ric C H, Keurentjes Joost J B, Medema Marnix H, Nijveen Harm
Bioinformatics Group, Wageningen University & Research, Wageningen, The Netherlands.
Laboratory of Genetics, Wageningen University & Research, Wageningen, The Netherlands.
BMC Res Notes. 2019 Apr 2;12(1):194. doi: 10.1186/s13104-019-4222-3.
Plants produce a plethora of specialized metabolites to defend themselves against pathogens and insects, to attract pollinators and to communicate with other organisms. Many of these metabolites are also applied in the clinic and in agriculture. Genes encoding the enzymes that drive the biosynthesis of these metabolites are sometimes physically grouped on the chromosome, in regions called biosynthetic gene clusters (BGCs). Several algorithms have been developed to identify plant BGCs, but on closer inspection a large percentage of predicted gene clusters either show no coexpression or do not encode a single functional biosynthetic pathway. Hence, further prioritization is needed.
Here, we introduce a strategy to systematically evaluate potential functions of predicted BGCs by superimposing their locations on metabolite quantitative trait loci (mQTLs). We show the feasibility of such an approach by integrating automated BGC prediction with mQTL datasets originating from a recombinant inbred line (RIL) population of Oryza sativa and a genome-wide association study (GWAS) of Arabidopsis thaliana. In these data, we identified several links for which the enzyme content of the BGCs matches well with the chemical features observed in the metabolite structure, suggesting that this method can effectively guide bioprospecting of plant BGCs.
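The core idea of superimposing predicted BGC locations on mQTL intervals can be illustrated with a simple coordinate-overlap check. The following is only a minimal sketch, not the authors' pipeline: the `Region` type, the function names, and all coordinates and identifiers are hypothetical, and it assumes that both BGCs and mQTLs are available as closed intervals on the same reference genome.

```python
from collections import defaultdict
from typing import NamedTuple


class Region(NamedTuple):
    """A named genomic interval (hypothetical structure for illustration)."""
    name: str
    chrom: str
    start: int  # inclusive, base pairs
    end: int    # inclusive, base pairs


def overlaps(a: Region, b: Region) -> bool:
    """Return True if two closed intervals on the same chromosome overlap."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def link_bgcs_to_mqtls(bgcs, mqtls):
    """Pair every predicted BGC with each mQTL interval it overlaps."""
    # Group mQTLs by chromosome so each BGC is compared only with
    # intervals on its own chromosome.
    by_chrom = defaultdict(list)
    for qtl in mqtls:
        by_chrom[qtl.chrom].append(qtl)

    links = []
    for bgc in bgcs:
        for qtl in by_chrom.get(bgc.chrom, []):
            if overlaps(bgc, qtl):
                links.append((bgc.name, qtl.name))
    return links


# Made-up example coordinates, not taken from the study.
example_bgcs = [
    Region("BGC_1", "chr1", 100, 200),
    Region("BGC_2", "chr2", 500, 600),
]
example_mqtls = [
    Region("mQTL_A", "chr1", 150, 400),
    Region("mQTL_B", "chr2", 700, 900),
    Region("mQTL_C", "chr1", 300, 350),
]

example_links = link_bgcs_to_mqtls(example_bgcs, example_mqtls)
```

A positional overlap of this kind only nominates candidate BGC-metabolite links. As the abstract notes, a link becomes informative when the enzyme content of the BGC is also consistent with the chemical features of the associated metabolite.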