The advantage of functional prediction based on clustering of yeast genes and its correlation with non-sequence based classifications.

Authors

Bilu Yonatan, Linial Michal

Affiliation

Institute of Computer Sciences, Life Science Institute, The Hebrew University, Jerusalem 91904, Israel.

Publication information

J Comput Biol. 2002;9(2):193-210. doi: 10.1089/10665270252935412.


DOI: 10.1089/10665270252935412
PMID: 12015877
Abstract

Sequence similarity is probably the most widely used tool to infer functional linkage between proteins. The fully sequenced, much-researched genome of Saccharomyces cerevisiae gives us an opportunity to compare and statistically quantify computational methods based on sequence similarity that aim to detect such linkage. In addition, the amount of data on Saccharomyces cerevisiae genes and proteins that is not directly based on sequence is rapidly increasing. Consequently, it allows investigation of the connections and correlation between classifications based on these types of data and those based solely on sequence similarity. In this work we start with a simple clustering algorithm to cluster genes based on the BLAST E-score of their similarity. We analyze how well one can infer function from these clusters and for how many of the genes whose function is currently unknown one can suggest a prediction. Given these parameters, we show that even a simple algorithm achieves better results than simply considering the BLAST output of matching genes. In the second part of the paper, we show that there is a highly significant correlation (p-value < 10^-4 for the vast majority of the experiments) between the aforementioned clusters and other types of classifications. Namely, we show that a pair of genes being clustered together is correlated with these genes having similar expression patterns in DNA array experiments and with the encoded proteins being involved in protein-protein interactions. Although this correlation is highly significant, it is, of course, not strong enough to be, by itself, a tool for predicting co-regulation of genes or interaction of proteins. We discuss possible explanations for this correlation. Furthermore, the statistical evaluation of these results should be considered when developing tools that are aimed at making such predictions.
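The abstract does not say which clustering algorithm the authors used. As a rough illustration of what clustering genes by BLAST E-score can look like, the sketch below uses single-linkage clustering: two genes are linked whenever their pairwise E-value falls at or below a cutoff, and each connected group becomes a cluster. The function name `cluster_by_evalue`, the cutoff `1e-10`, and the gene labels and E-values are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def cluster_by_evalue(genes, hits, threshold=1e-10):
    """Single-linkage clustering of genes from pairwise BLAST E-values.

    genes: iterable of gene identifiers.
    hits: iterable of (gene_a, gene_b, evalue) tuples.
    threshold: hypothetical cutoff; pairs with evalue <= threshold are linked.
    Returns a list of clusters, each a sorted list of gene identifiers.
    """
    genes = list(genes)
    parent = {g: g for g in genes}  # union-find forest, one node per gene

    def find(g):
        # Follow parent links to the root, halving the path as we go.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b, evalue in hits:
        if evalue <= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(set)
    for g in genes:
        clusters[find(g)].add(g)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Made-up example data, for illustration only.
genes = ["geneA", "geneB", "geneC", "geneD", "geneE"]
hits = [
    ("geneA", "geneB", 1e-30),  # strong similarity: linked
    ("geneB", "geneC", 1e-15),  # linked, so geneC joins geneA's cluster
    ("geneD", "geneE", 0.5),    # weak similarity: not linked
]
clusters = cluster_by_evalue(genes, hits)
print(clusters)
```

Under this scheme geneA and geneC end up in the same cluster even though no direct hit between them is listed, which is one way clustering can suggest functional links that a plain list of BLAST matches would miss.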

Similar articles

[1]
The advantage of functional prediction based on clustering of yeast genes and its correlation with non-sequence based classifications.

J Comput Biol. 2002

[2]
Hierarchical signature clustering for time series microarray data.

Adv Exp Med Biol. 2011

[3]
Beyond synexpression relationships: local clustering of time-shifted and inverted gene expression profiles identifies new, biologically relevant interactions.

J Mol Biol. 2001-12-14

[4]
Learning gene functional classifications from multiple data types.

J Comput Biol. 2002

[5]
Extraction of correlated gene clusters by multiple graph comparison.

Genome Inform. 2001

[6]
Combining multisource information through functional-annotation-based weighting: gene function prediction in yeast.

IEEE Trans Biomed Eng. 2009-2

[7]
Cluster, function and promoter: analysis of yeast expression array.

Pac Symp Biocomput. 2000

[8]
Correlation and prediction of gene expression level from amino acid and dipeptide composition of its protein.

BMC Bioinformatics. 2005-3-17

[9]
Influence of microarrays experiments missing values on the stability of gene groups by hierarchical clustering.

BMC Bioinformatics. 2004-8-23

[10]
Novel symmetry-based gene-gene dissimilarity measures utilizing Gene Ontology: Application in gene clustering.

Gene. 2018-9-2

Cited by

[1]
Hierarchical ensemble methods for protein function prediction.

ISRN Bioinform. 2014-5-4

[2]
Intensity dependent confidence intervals on microarray measurements of differentially expressed genes: a case study of the effect of MK5, FKRP and TAF4 on the transcriptome.

Gene Regul Syst Bio. 2007-7-17

[3]
Model order selection for bio-molecular data clustering.

BMC Bioinformatics. 2007-5-3

[4]
PANDORA: keyword-based analysis of protein sets by integration of annotation sources.

Nucleic Acids Res. 2003-10-1
