Predicting gene function in Saccharomyces cerevisiae.

Authors

Clare A, King R D

Affiliation

Department of Computer Science, University of Wales, Aberystwyth, Penglais, Aberystwyth, Wales, UK.

Publication

Bioinformatics. 2003 Oct;19 Suppl 2:ii42-9. doi: 10.1093/bioinformatics/btg1058.

Abstract

MOTIVATION

S.cerevisiae is one of the most important model organisms and has been the focus of over a century of study. In spite of these efforts, 40% of its open reading frames (ORFs) remain classified as having unknown function (MIPS: Munich Information Center for Protein Sequences). We wished to predict the function of these ORFs using data mining, as we have previously done successfully for the genomes of M.tuberculosis and E.coli. Applying this approach to the larger, eukaryotic S.cerevisiae genome requires modifying the machine learning and data mining algorithms, because the genome is larger, more data are available, and the functional classification is more challenging.

RESULTS

Novel extensions to the machine learning and data mining algorithms have been devised in order to deal with the challenges. Accurate rules have been learned and predictions have been made for many of the ORFs whose function is currently unknown. The rules are informative, agree with known biology and allow for scientific discovery.

AVAILABILITY

All predictions are freely available from http://www.genepredictions.org; all datasets used in this study are freely available from http://www.aber.ac.uk/compsci/Research/bio/dss/yeastdata; and software for relational data mining is available from http://www.aber.ac.uk/compsci/Research/bio/dss/polyfarm.
