To Cuong C, Vohradsky Jiri
Laboratory of Bioinformatics, Institute of Microbiology, ASCR, Videnska 1083, 142 20 Prague, Czech Republic.
BMC Genomics. 2007 Feb 13;8:49. doi: 10.1186/1471-2164-8-49.
Identification of coordinately regulated genes according to the level of their expression during the time course of a process allows for discovering functional relationships among genes involved in the process.
We present a single-class classification method for the identification of genes of similar function from a gene expression time series. It is based on a parallel genetic algorithm, a supervised machine learning method that exploits prior knowledge of gene function to identify unknown genes of similar function from expression data. The algorithm was tested with a set of randomly generated patterns; the results were compared with seven other classification algorithms, including support vector machines. The algorithm avoids several problems associated with unsupervised clustering methods, and it shows better performance than the other algorithms. The algorithm was applied to the identification of secondary metabolite gene clusters of the antibiotic-producing eubacterium Streptomyces coelicolor. The algorithm also identified pathways associated with transport of the secondary metabolites out of the cell. We used the method to predict the functional role of particular ORFs based on the expression data.
Through analysis of a time series of gene expression, the algorithm identifies pathways which are directly or indirectly associated with genes of interest, and which are active during the time course of the experiment.
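The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: a small, serial genetic algorithm learns a prototype expression pattern from genes of known function, and candidate genes are accepted into the class when their time course matches that prototype. The fitness function (mean Pearson correlation), the selection, crossover and mutation scheme, the acceptance threshold, and all names (`evolve_prototype`, `classify`, `orfA`, and so on) are assumptions made for illustration, not the authors' parallel algorithm.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def fitness(prototype, known):
    """Assumed fitness: mean correlation between a prototype and the known genes."""
    return sum(pearson(prototype, k) for k in known) / len(known)


def evolve_prototype(known, pop_size=40, generations=200, sigma=0.2, seed=0):
    """Evolve a prototype time course that correlates with the known-class genes."""
    rng = random.Random(seed)
    n_points = len(known[0])  # number of time points; must be at least 2
    pop = [[rng.uniform(-1, 1) for _ in range(n_points)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, known), reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; parents survive unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_points)  # one-point crossover
            child = [x + rng.gauss(0, sigma) for x in a[:cut] + b[cut:]]  # Gaussian mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda p: fitness(p, known))


def classify(prototype, candidates, threshold=0.8):
    """Return names of candidate genes whose profile matches the prototype."""
    return [name for name, profile in candidates.items()
            if pearson(prototype, profile) >= threshold]


# Hypothetical toy data: three genes of known function and three unknown ORFs.
known = [[0, 0, 1, 3, 5, 6], [0, 1, 1, 4, 6, 7], [1, 0, 2, 3, 6, 6]]
candidates = {
    "orfA": [0, 0, 2, 4, 5, 7],  # rises late, like the known genes
    "orfB": [6, 5, 3, 2, 1, 0],  # opposite trend
    "orfC": [3, 3, 3, 3, 3, 3],  # flat profile
}
prototype = evolve_prototype(known)
print(classify(prototype, candidates))
```

Only examples of the class of interest are used for training, which is what makes this a single-class setting: genes outside the class are never labeled, they simply fail to match the learned pattern.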