Learning rule-based models of biological process from gene expression time profiles using gene ontology.

Author information

Hvidsten Torgeir R, Laegreid Astrid, Komorowski Jan

Affiliation

Department of Computer and Information Science, Norwegian University of Science and Technology, N-7491 Trondheim, Norway.

Publication information

Bioinformatics. 2003 Jun 12;19(9):1116-23. doi: 10.1093/bioinformatics/btg047.

DOI:10.1093/bioinformatics/btg047

PMID:12801872

Abstract

MOTIVATION

Microarray technology enables large-scale inference of the participation of genes in biological process from similar expression profiles. Our aim is to induce classificatory models from expression data and biological knowledge that can automatically associate genes with novel hypotheses of biological process.

RESULTS

We report a systematic supervised learning approach to predicting biological process from time series of gene expression data and biological knowledge. Biological knowledge is expressed using gene ontology and this knowledge is associated with discriminatory expression-based features to form minimal decision rules. The resulting rule model is first evaluated on genes coding for proteins with known biological process roles using cross validation. Then it is used to generate hypotheses for genes for which no knowledge of participation in biological process could be found. The theoretical foundation for the methodology based on rough sets is outlined in the paper, and its practical application demonstrated on a data set previously published by Cho et al. (Nat. Genet., 27, 48-54, 2001).
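The abstract does not spell out the features or the rule induction procedure, so the following is only a minimal sketch of the rough-set idea the methodology builds on: genes with identical discretized feature values are indiscernible, and a biological process class is bounded by a lower approximation (genes that certainly belong) and an upper approximation (genes that possibly belong). The gene names, feature values, and process labels are hypothetical, and the function `approximations` is an illustrative helper, not part of the Rosetta system.

```python
from collections import defaultdict


def approximations(table, target):
    """Return (lower, upper) rough-set approximations of the genes labelled `target`.

    `table` maps a gene name to a pair (feature_tuple, label).
    """
    # Group genes that are indiscernible, i.e. share identical feature values.
    blocks = defaultdict(set)
    for gene, (features, _label) in table.items():
        blocks[features].add(gene)

    # The concept to approximate: all genes annotated with the target label.
    concept = {gene for gene, (_features, label) in table.items() if label == target}

    lower, upper = set(), set()
    for block in blocks.values():
        if block <= concept:   # every gene in the block carries the target label
            lower |= block
        if block & concept:    # at least one gene in the block carries it
            upper |= block
    return lower, upper


# Hypothetical toy data: two discretized expression features per gene,
# plus an assumed biological process annotation.
toy = {
    "g1": (("increasing", "decreasing"), "cell cycle"),
    "g2": (("increasing", "decreasing"), "cell cycle"),
    "g3": (("constant", "increasing"), "cell cycle"),
    "g4": (("constant", "increasing"), "transport"),
    "g5": (("decreasing", "constant"), "transport"),
}

lower, upper = approximations(toy, "cell cycle")
print(sorted(lower))  # genes that certainly belong to the class
print(sorted(upper))  # genes that possibly belong to the class
```

In this toy table, g3 and g4 share the same feature values but carry different labels, so they fall in the boundary region: in the upper approximation but not the lower one. A rule learned from that block can only be approximate, which is the situation rough-set rule models are designed to handle.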

AVAILABILITY

The Rosetta system is available at http://www.idi.ntnu.no/~aleks/rosetta.

SUPPLEMENTARY INFORMATION

http://www.lcb.uu.se/~hvidsten/bioinf_cho/
