Predicting gene function in Saccharomyces cerevisiae.

Authors

Clare A, King R D

Affiliation

Department of Computer Science, University of Wales, Aberystwyth, Penglais, Aberystwyth, Wales, UK.

Publication

Bioinformatics. 2003 Oct;19 Suppl 2:ii42-9. doi: 10.1093/bioinformatics/btg1058.

DOI:10.1093/bioinformatics/btg1058

PMID:14534170

Abstract

MOTIVATION

S.cerevisiae is one of the most important model organisms, and has been the focus of over a century of study. In spite of these efforts, 40% of its open reading frames (ORFs) remain classified as having unknown function (MIPS: Munich Information Center for Protein Sequences). We wished to make predictions for the function of these ORFs using data mining, as we have previously successfully done for the genomes of M.tuberculosis and E.coli. Applying this approach to the larger and eukaryotic S.cerevisiae genome involves modifying the machine learning and data mining algorithms, as this is a larger organism with more data available, and a more challenging functional classification.

RESULTS

Novel extensions to the machine learning and data mining algorithms have been devised in order to deal with the challenges. Accurate rules have been learned and predictions have been made for many of the ORFs whose function is currently unknown. The rules are informative, agree with known biology and allow for scientific discovery.

AVAILABILITY

All predictions are freely available from http://www.genepredictions.org, all datasets used in this study are freely available from http://www.aber.ac.uk/compsci/Research/bio/dss/yeastdata, and software for relational data mining is available from http://www.aber.ac.uk/compsci/Research/bio/dss/polyfarm.

Similar articles

Predicting gene function in Saccharomyces cerevisiae.

Bioinformatics. 2003 Oct;19 Suppl 2:ii42-9. doi: 10.1093/bioinformatics/btg1058.

Confirmation of data mining based predictions of protein function.

Bioinformatics. 2004 May 1;20(7):1110-8. doi: 10.1093/bioinformatics/bth047. Epub 2004 Feb 5.

Finding motifs in protein secondary structure for use in function prediction.

J Comput Biol. 2006 Apr;13(3):719-31. doi: 10.1089/cmb.2006.13.719.

Predicting gene function through systematic analysis and quality assessment of high-throughput data.

Bioinformatics. 2005 Apr 15;21(8):1644-52. doi: 10.1093/bioinformatics/bti103. Epub 2004 Nov 5.

ORFer--retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files.

BMC Bioinformatics. 2002 Dec 19;3:40. doi: 10.1186/1471-2105-3-40.

PROTEIOS: an open source proteomics initiative.

Bioinformatics. 2005 May 1;21(9):2085-7. doi: 10.1093/bioinformatics/bti291. Epub 2005 Feb 3.

SPrCY: comparison of structural predictions in the Saccharomyces cerevisiae genome.

Bioinformatics. 2004 Sep 22;20(14):2312-4. doi: 10.1093/bioinformatics/bth223. Epub 2004 Apr 1.

Genome wide prediction of protein function via a generic knowledge discovery approach based on evidence integration.

BMC Bioinformatics. 2006 May 25;7:268. doi: 10.1186/1471-2105-7-268.

An ontology for a Robot Scientist.

Bioinformatics. 2006 Jul 15;22(14):e464-71. doi: 10.1093/bioinformatics/btl207.

A weighted power framework for integrating multisource information: gene function prediction in yeast.

IEEE Trans Biomed Eng. 2012 Apr;59(4):1162-8. doi: 10.1109/TBME.2012.2186689. Epub 2012 Feb 3.

Cited by

Integrated transcriptomic meta-analysis and comparative artificial intelligence models in maize under biotic stress.

Sci Rep. 2023 Sep 23;13(1):15899. doi: 10.1038/s41598-023-42984-4.

Knowledge-based classification of fine-grained immune cell types in single-cell RNA-Seq data.

Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab039.

Automated Estimation of Food Type from Body-worn Audio and Motion Sensors in Free-Living Environments.

Proc Mach Learn Res. 2019 Aug;106:641-662.

YAGM: a web tool for mining associated genes in yeast based on diverse biological associations.

BMC Syst Biol. 2015;9 Suppl 6(Suppl 6):S1. doi: 10.1186/1752-0509-9-S6-S1. Epub 2015 Dec 9.

Hierarchical ensemble methods for protein function prediction.

ISRN Bioinform. 2014 May 4;2014:901419. doi: 10.1155/2014/901419. eCollection 2014.

Integration of molecular network data reconstructs Gene Ontology.

Bioinformatics. 2014 Sep 1;30(17):i594-600. doi: 10.1093/bioinformatics/btu470.

Using multi-instance hierarchical clustering learning system to predict yeast gene function.

PLoS One. 2014 Mar 12;9(3):e90962. doi: 10.1371/journal.pone.0090962. eCollection 2014.

Collective prediction of protein functions from protein-protein interaction networks.

BMC Bioinformatics. 2014;15 Suppl 2(Suppl 2):S9. doi: 10.1186/1471-2105-15-S2-S9. Epub 2014 Jan 24.

Using PPI network autocorrelation in hierarchical multi-label classification trees for gene function prediction.

BMC Bioinformatics. 2013 Sep 26;14:285. doi: 10.1186/1471-2105-14-285.

The use of classification trees for bioinformatics.

Wiley Interdiscip Rev Data Min Knowl Discov. 2011 Jan;1(1):55-63. doi: 10.1002/widm.14. Epub 2011 Jan 6.
