Accounting for redundancy when integrating gene interaction databases.

Affiliation

Biotechnology Center, TU Dresden, Dresden, Germany.

Publication information

PLoS One. 2009 Oct 22;4(10):e7492. doi: 10.1371/journal.pone.0007492.

DOI:10.1371/journal.pone.0007492
PMID:19847299
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2760779/
Abstract

In recent years, gene interaction networks have increasingly been used for the assessment and interpretation of biological measurements. Knowledge of the interaction partners of an unknown protein allows scientists to understand the complex relationships between genetic products, helps to reveal unknown biological functions and pathways, and gives a more detailed picture of an organism's complexity. Measuring all protein interactions under all relevant conditions is virtually impossible. Hence, computational methods that integrate different datasets to predict gene interactions are needed. However, when integrating different sources, one has to account for the fact that some of the information may be redundant, which may lead to an overestimation of the true likelihood of an interaction. Our method integrates information derived from three different databases (Bioverse, HiMAP and STRING) to predict human gene interactions. A Bayesian approach was implemented to integrate the different data sources on a common quantitative scale. An important assumption of the Bayesian integration is independence of the input data (features). Our study shows that conditional dependency cannot be ignored when combining gene interaction databases that rely on partially overlapping input data. In addition, we show how the correlation structure between the databases can be detected, and we propose a linear model to correct for this bias. Benchmarking the results against two independent reference datasets shows that the integrated model outperforms the individual datasets. Our method provides an intuitive strategy for weighting the different features while accounting for their conditional dependencies.
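
The overestimation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the prior, the likelihood ratios, the function names, and the weights are all hypothetical, and the weighted sum of log-likelihood ratios is only a stand-in for the linear correction, which the abstract does not specify. The sketch combines evidence from two sources under the naive independence assumption and shows that counting a fully redundant source twice inflates the posterior probability of an interaction.

```python
import math


def log_likelihood_ratio(p_given_interaction, p_given_no_interaction):
    """Log of P(evidence | interaction) / P(evidence | no interaction)."""
    return math.log(p_given_interaction / p_given_no_interaction)


def posterior_probability(prior, llrs, weights=None):
    """Posterior log-odds = prior log-odds + sum_i w_i * LLR_i.

    With all weights equal to 1 this is the naive Bayes combination,
    which assumes the sources are conditionally independent.
    Weights below 1 down-weight sources that repeat information
    already counted elsewhere (hypothetical correction).
    """
    if weights is None:
        weights = [1.0] * len(llrs)
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += sum(w * llr for w, llr in zip(weights, llrs))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical numbers, chosen only for illustration.
prior = 0.001                              # assumed prior probability of an interaction
llr_a = log_likelihood_ratio(0.30, 0.01)   # source A: likelihood ratio of 30
llr_b = llr_a                              # source B repeats exactly the same evidence

single = posterior_probability(prior, [llr_a])
naive = posterior_probability(prior, [llr_a, llr_b])
corrected = posterior_probability(prior, [llr_a, llr_b], weights=[1.0, 0.0])

print(f"one source:            {single:.3f}")
print(f"naive, both sources:   {naive:.3f}")
print(f"redundancy-corrected:  {corrected:.3f}")
```

With these assumed numbers, the naive combination raises the posterior from about 0.029 to about 0.474 even though the second source adds no new information. Setting the weight of the fully redundant source to 0 recovers the single-source value. For partially overlapping sources, a weight between 0 and 1 would be the analogous choice.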

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/91b7/2760779/62bb18135559/pone.0007492.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/91b7/2760779/84ab003d224e/pone.0007492.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/91b7/2760779/4e635c8e98bf/pone.0007492.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/91b7/2760779/a4319531ed50/pone.0007492.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/91b7/2760779/0143f55773be/pone.0007492.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/91b7/2760779/f812681b2847/pone.0007492.g005.jpg
