Bassel George W, Provart Nicholas J
Department of Cell & Systems Biology, University of Toronto, Toronto, ON, Canada.
Methods Mol Biol. 2009;495:21-37. doi: 10.1007/978-1-59745-477-3_3.
Analysis of large-scale gene expression data sets is proving to be a powerful tool for gene function prediction, cis-element discovery and hypothesis generation in Arabidopsis thaliana. Public initiatives led by the AtGenExpress Consortium, together with experiments conducted by individual researchers to document the Arabidopsis thaliana transcriptome, have made a large number of data sets publicly available for data mining by so-called "electronic northerns", co-expression analysis and other means. Given that approximately 50% of the genes in Arabidopsis have no function ascribed to them by "traditional" homology searches, and that only around 10% of the genes have had their function confirmed in the laboratory, these analyses can accelerate the identification of potential gene function with a mouse click. This chapter covers the use of data mining tools available at the Bio-Array Resource (www.bar.utoronto.ca) for hypothesis generation in the context of plant hormone biology.