A statistical framework for combining and interpreting proteomic datasets.

Authors

Gilchrist Michael A, Salter Laura A, Wagner Andreas

Affiliation

Department of Biology, University of New Mexico, Albuquerque 87106, USA.

Publication information

Bioinformatics. 2004 Mar 22;20(5):689-700. doi: 10.1093/bioinformatics/btg469. Epub 2004 Jan 22.

DOI:10.1093/bioinformatics/btg469
PMID:15033876
Abstract

MOTIVATION

Accurately identifying protein function on a proteome-wide scale requires integrating data within and between high-throughput experiments. High-throughput proteomic datasets often have high error rates and thus yield incomplete and contradictory information. In this study, we develop a simple statistical framework using Bayes' law to interpret such data and combine information from different high-throughput experiments. To illustrate our approach, we apply it to two protein complex purification datasets.

RESULTS

Our approach shows how to use high-throughput data to accurately calculate the probability that two proteins are part of the same complex. Importantly, our approach does not need a reference set of verified protein interactions to determine the false positive and false negative error rates of protein association. We also demonstrate how to combine information from two separate protein purification datasets into a combined dataset that has greater coverage and accuracy than either dataset alone. Finally, we provide a technique for estimating the total number of proteins that can be detected using a particular experimental technique.
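
The general idea of applying Bayes' law to noisy association data can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `posterior_same_complex`, the prior, and the error rates below are hypothetical values chosen for illustration, and the sketch assumes that the datasets are conditionally independent given the true state, which the abstract does not state.

```python
from typing import Iterable, Tuple

# Each observation is (associated, false_positive_rate, false_negative_rate).
# All numbers used with this sketch are illustrative assumptions, not values
# taken from the paper.
Observation = Tuple[bool, float, float]


def posterior_same_complex(prior: float, observations: Iterable[Observation]) -> float:
    """Posterior probability that two proteins share a complex.

    Applies Bayes' law, assuming the observations are conditionally
    independent given whether the proteins truly share a complex.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")

    like_true = 1.0   # P(data | proteins share a complex)
    like_false = 1.0  # P(data | proteins do not share a complex)
    for associated, fpr, fnr in observations:
        if associated:
            like_true *= 1.0 - fnr   # true association was detected
            like_false *= fpr        # spurious association was reported
        else:
            like_true *= fnr         # true association was missed
            like_false *= 1.0 - fpr  # absence was correctly reported

    numerator = prior * like_true
    return numerator / (numerator + (1.0 - prior) * like_false)


if __name__ == "__main__":
    prior = 0.01  # hypothetical prior probability of co-membership
    one_dataset = [(True, 0.05, 0.5)]
    two_datasets = [(True, 0.05, 0.5), (True, 0.05, 0.5)]
    print(posterior_same_complex(prior, one_dataset))
    print(posterior_same_complex(prior, two_datasets))
```

Under these assumed rates, a pair reported as associated in both datasets receives a higher posterior than a pair reported in only one, which is the sense in which combining datasets can improve accuracy.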

AVAILABILITY

A suite of simple programs to accomplish some of the above tasks is available at www.unm.edu/~compbio/software/DatasetAssess
