Linearity of network proximity measures: implications for set-based queries and significance testing.

Authors

Maxwell Sean, Chance Mark R, Koyutürk Mehmet

Affiliations

Center for Proteomics and Bioinformatics.

Department of Nutrition.

Publication information

Bioinformatics. 2017 May 1;33(9):1354-1361. doi: 10.1093/bioinformatics/btw733.

Abstract

MOTIVATION

In recent years, various network proximity measures have been proposed to facilitate the use of biomolecular interaction data in a broad range of applications. These applications include functional annotation, disease gene prioritization, comparative analysis of biological systems and prediction of new interactions. In such applications, a major task is the scoring or ranking of the nodes in the network in terms of their proximity to a given set of 'seed' nodes (e.g. a group of proteins that are identified to be associated with a disease, or are differentially expressed in a certain condition). Many different network proximity measures are utilized for this purpose, and these measures are quite diverse in terms of the benefits they offer.

RESULTS

We propose a unifying framework for characterizing network proximity measures for set-based queries. We observe that many existing measures are linear, in that the proximity of a node to a set of nodes can be represented as an aggregation of its proximity to the individual nodes in the set. Based on this observation, we propose methods for processing set-based proximity queries that take advantage of sparse local proximity information. In addition, we provide an analytical framework for characterizing the distribution of proximity scores based on reference models that accurately capture the characteristics of the seed set (e.g. degree distribution and biological function). The resulting framework facilitates computation of exact figures for the statistical significance of network proximity scores, enabling assessment of the accuracy of Monte Carlo simulation-based estimation methods.
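The linearity property can be illustrated with a small sketch. The example below is not the authors' implementation; it assumes random walk with restart (RWR) as the proximity measure, a toy 5-node graph, a restart probability of 0.3, and a deliberately simple reference model in which seed sets of a fixed size are drawn uniformly at random. All function names are hypothetical. It checks that the set-based score equals the mean of the per-seed scores, and that under this simplified reference model the null mean of each node's score follows directly from the per-seed proximity matrix, without simulation.

```python
from itertools import combinations

import numpy as np


def rwr_matrix(adj, restart=0.3):
    """Column j holds the RWR proximity of every node to the single seed j."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)          # assumes no isolated nodes
    W = adj / col_sums                  # column-stochastic transition matrix
    n = adj.shape[0]
    # Closed form of the RWR fixed point p = (1-c) W p + c r
    return restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * W)


def set_proximity(P, seeds):
    """Linear aggregation: mean of the per-seed proximity columns."""
    return P[:, list(seeds)].mean(axis=1)


def rwr_direct(adj, seeds, restart=0.3, iters=200):
    """Iterative RWR with a restart vector spread uniformly over the seed set."""
    adj = np.asarray(adj, dtype=float)
    W = adj / adj.sum(axis=0)
    r = np.zeros(adj.shape[0])
    r[list(seeds)] = 1.0 / len(seeds)
    p = r.copy()
    for _ in range(iters):
        p = (1.0 - restart) * (W @ p) + restart * r
    return p


# Toy undirected graph (illustrative only)
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
seeds = [0, 3]

P = rwr_matrix(A)
linear = set_proximity(P, seeds)   # aggregate of individual proximities
direct = rwr_direct(A, seeds)      # set-based query computed from scratch

# Simplified reference model: every seed set of the same size is equally likely.
# By linearity of expectation, the null mean of each node's score is a row mean of P.
exact_null_mean = P.mean(axis=1)

# Exhaustive enumeration of the null distribution (feasible only for toy graphs)
all_sets = list(combinations(range(A.shape[0]), len(seeds)))
null_scores = np.array([set_proximity(P, s) for s in all_sets])

node = 4
p_value = np.mean(null_scores[:, node] >= linear[node] - 1e-12)

print(np.allclose(linear, direct))
print(np.allclose(exact_null_mean, null_scores.mean(axis=0)))
print(p_value)
```

Precomputing `P` once is what makes repeated set-based queries cheap in this sketch: each new seed set only needs a column average. On realistic networks one would store sparse or truncated per-seed proximities rather than a dense inverse, and exhaustive enumeration would be replaced by an analytical characterization or by Monte Carlo sampling.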

AVAILABILITY AND IMPLEMENTATION

Implementations of the methods in this paper are available at https://bioengine.case.edu/crosstalker, which includes a robust visualization for viewing results.

CONTACT

stm@case.edu or mxk331@case.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

