
The advantage of functional prediction based on clustering of yeast genes and its correlation with non-sequence based classifications.

Authors

Bilu Yonatan, Linial Michal

Affiliation

Institute of Computer Sciences, Life Science Institute, The Hebrew University, Jerusalem 91904, Israel.

Publication

J Comput Biol. 2002;9(2):193-210. doi: 10.1089/10665270252935412.

DOI: 10.1089/10665270252935412
PMID: 12015877
Abstract

Sequence similarity is probably the most widely used tool for inferring functional linkage between proteins. The fully sequenced, much-researched genome of Saccharomyces cerevisiae gives us an opportunity to compare and statistically quantify computational methods based on sequence similarity that aim to detect such linkage. In addition, the amount of data on Saccharomyces cerevisiae genes and proteins that is not directly based on sequence is rapidly increasing. Consequently, it allows investigation of the connections and correlation between classifications based on these types of data and those based solely on sequence similarity. In this work we start with a simple clustering algorithm that clusters genes based on the BLAST E-score of their similarity. We analyze how well one can infer function from these clusters and for how many of the genes whose function is currently unknown one can suggest a prediction. Given these parameters, we show that even a simple algorithm achieves better results than simply considering the BLAST output of matching genes. In the second part of the paper, we show that there is a highly significant correlation (p-value < 10^-4 for the vast majority of the experiments) between the aforementioned clusters and other types of classifications. Namely, we show that a pair of genes being clustered together is correlated with these genes having similar expression patterns in DNA array experiments and with the encoded proteins being involved in protein-protein interactions. Although this correlation is highly significant, it is, of course, not strong enough to be, by itself, a tool for predicting co-regulation of genes or interaction of proteins. We discuss possible explanations for this correlation. Furthermore, the statistical evaluation of these results should be considered when developing tools that are aimed at making such predictions.
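The abstract does not say which clustering algorithm the authors used. Purely as an illustration of the general idea, the sketch below groups genes by single linkage: two genes fall in the same cluster whenever a chain of BLAST hits at or below an E-value cutoff connects them. The function name `cluster_by_evalue`, the cutoff of `1e-10`, and the gene names and E-values are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def cluster_by_evalue(hits, threshold=1e-10):
    """Single-linkage clustering of genes from pairwise BLAST hits.

    hits: iterable of (gene_a, gene_b, evalue) tuples.
    threshold: assumed E-value cutoff; pairs above it are not linked.
    Returns a list of clusters, each a sorted list of gene names.
    """
    parent = {}

    def find(gene):
        # Union-find lookup with path halving.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for gene_a, gene_b, evalue in hits:
        # find() registers both genes even when the pair is not linked.
        root_a, root_b = find(gene_a), find(gene_b)
        if evalue <= threshold and root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for gene in list(parent):
        clusters[find(gene)].add(gene)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Made-up example data, for illustration only.
hits = [
    ("geneA", "geneB", 1e-30),
    ("geneB", "geneC", 1e-12),
    ("geneC", "geneD", 0.5),    # too weak: does not link the two groups
    ("geneD", "geneE", 1e-20),
]
clusters = cluster_by_evalue(hits)
print(clusters)
```

With clusters like these, one simple way to suggest a function for an unannotated gene would be to transfer the most common annotation among its cluster mates. Whether that matches the procedure evaluated in the paper cannot be determined from the abstract.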
