Suppr超能文献

通过颜色编码进行生物分子网络基序计数与发现

Biomolecular network motif counting and discovery by color coding.

作者信息

Alon Noga, Dao Phuong, Hajirasouliha Iman, Hormozdiari Fereydoun, Sahinalp S Cenk

机构信息

School of Mathematical Sciences, Tel Aviv University, Ramat Aviv, Israel.

出版信息

Bioinformatics. 2008 Jul 1;24(13):i241-9. doi: 10.1093/bioinformatics/btn163.

Abstract

Protein-protein interaction (PPI) networks of many organisms share global topological features such as degree distribution, k-hop reachability, betweenness and closeness. Yet, some of these networks can differ significantly from the others in terms of local structures: e.g. the number of specific network motifs can vary significantly among PPI networks. Counting the number of network motifs provides a major challenge to compare biomolecular networks. Recently developed algorithms have been able to count the number of induced occurrences of subgraphs with k < or = 7 vertices. Yet no practical algorithm exists for counting non-induced occurrences, or counting subgraphs with k > or = 8 vertices. Counting non-induced occurrences of network motifs is not only challenging but also quite desirable as available PPI networks include several false interactions and miss many others. In this article, we show how to apply the 'color coding' technique for counting non-induced occurrences of subgraph topologies in the form of trees and bounded treewidth subgraphs. Our algorithm can count all occurrences of motif G' with k vertices in a network G with n vertices in time polynomial with n, provided k = O(log n). We use our algorithm to obtain 'treelet' distributions for k < or = 10 of available PPI networks of unicellular organisms (Saccharomyces cerevisiae Escherichia coli and Helicobacter Pyloris), which are all quite similar, and a multicellular organism (Caenorhabditis elegans) which is significantly different. Furthermore, the treelet distribution of the unicellular organisms are similar to that obtained by the 'duplication model' but are quite different from that of the 'preferential attachment model'. The treelet distribution is robust w.r.t. sparsification with bait/edge coverage of 70% but differences can be observed when bait/edge coverage drops to 50%.

摘要

许多生物体的蛋白质-蛋白质相互作用(PPI)网络具有一些全局拓扑特征,如度分布、k跳可达性、介数和紧密性。然而,其中一些网络在局部结构方面可能与其他网络有显著差异:例如,特定网络基序的数量在PPI网络之间可能有很大差异。计算网络基序的数量是比较生物分子网络的一项重大挑战。最近开发的算法能够计算具有k≤7个顶点的子图的诱导出现次数。然而,对于计算非诱导出现次数或计算具有k≥8个顶点的子图,尚无实用算法。计算网络基序的非诱导出现次数不仅具有挑战性,而且非常必要,因为现有的PPI网络包含一些错误的相互作用,并且遗漏了许多其他相互作用。在本文中,我们展示了如何应用“颜色编码”技术来计算树和有界树宽子图形式的子图拓扑的非诱导出现次数。我们的算法可以在时间复杂度为n的多项式时间内,计算出具有n个顶点的网络G中具有k个顶点的基序G'的所有出现次数,前提是k = O(log n)。我们使用我们的算法获得了单细胞生物(酿酒酵母、大肠杆菌和幽门螺杆菌)和多细胞生物(秀丽隐杆线虫)现有PPI网络中k≤10的“小树”分布,其中单细胞生物的分布都非常相似,而多细胞生物的分布则有显著差异。此外,单细胞生物的小树分布与通过“复制模型”获得的分布相似,但与“优先连接模型”的分布有很大不同。小树分布对于诱饵/边覆盖率为70%的稀疏化是稳健的,但当诱饵/边覆盖率降至50%时,可以观察到差异。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7077/2718641/0e78d062c73d/btn163f1.jpg

相似文献

1
Biomolecular network motif counting and discovery by color coding.
Bioinformatics. 2008 Jul 1;24(13):i241-9. doi: 10.1093/bioinformatics/btn163.
2
Discovering functional interaction patterns in protein-protein interaction networks.
BMC Bioinformatics. 2008 Jun 11;9:276. doi: 10.1186/1471-2105-9-276.
3
k-Partite cliques of protein interactions: A novel subgraph topology for functional coherence analysis on PPI networks.
J Theor Biol. 2014 Jan 7;340:146-54. doi: 10.1016/j.jtbi.2013.09.013. Epub 2013 Sep 19.
4
Subgraph ensembles and motif discovery using an alternative heuristic for graph isomorphism.
Phys Rev E Stat Nonlin Soft Matter Phys. 2006 Nov;74(5 Pt 1):051903. doi: 10.1103/PhysRevE.74.051903. Epub 2006 Nov 3.
5
Efficient detection of network motifs.
IEEE/ACM Trans Comput Biol Bioinform. 2006 Oct-Dec;3(4):347-59. doi: 10.1109/TCBB.2006.51.
6
Modifying the DPClus algorithm for identifying protein complexes based on new topological structures.
BMC Bioinformatics. 2008 Sep 25;9:398. doi: 10.1186/1471-2105-9-398.
7
Motif search in graphs: application to metabolic networks.
IEEE/ACM Trans Comput Biol Bioinform. 2006 Oct-Dec;3(4):360-8. doi: 10.1109/TCBB.2006.55.
10
Fitting a geometric graph to a protein-protein interaction network.
Bioinformatics. 2008 Apr 15;24(8):1093-9. doi: 10.1093/bioinformatics/btn079. Epub 2008 Mar 14.

引用本文的文献

1
Restless reachability problems in temporal graphs.
Knowl Inf Syst. 2025;67(7):5651-5697. doi: 10.1007/s10115-025-02405-6. Epub 2025 Apr 1.
2
Counting Temporal Paths.
Algorithmica. 2025;87(5):736-782. doi: 10.1007/s00453-025-01301-3. Epub 2025 Feb 25.
3
Faster algorithms for counting subgraphs in sparse graphs.
Algorithmica. 2021;83(8):2578-2605. doi: 10.1007/s00453-021-00811-0. Epub 2021 Feb 22.
4
Intrinsic limitations in mainstream methods of identifying network motifs in biology.
BMC Bioinformatics. 2020 Apr 29;21(1):165. doi: 10.1186/s12859-020-3441-x.
5
Characterizing building blocks of resource constrained biological networks.
BMC Bioinformatics. 2019 Jun 20;20(Suppl 12):318. doi: 10.1186/s12859-019-2838-x.
6
NoMAS: A Computational Approach to Find Mutated Subnetworks Associated With Survival in Genome-Wide Cancer Studies.
Front Genet. 2019 Apr 10;10:265. doi: 10.3389/fgene.2019.00265. eCollection 2019.
7
INBIA: a boosting methodology for proteomic network inference.
BMC Bioinformatics. 2018 Jul 9;19(Suppl 7):188. doi: 10.1186/s12859-018-2183-5.
8
Combinatorial algorithm for counting small induced graphs and orbits.
PLoS One. 2017 Feb 9;12(2):e0171428. doi: 10.1371/journal.pone.0171428. eCollection 2017.
9
Biomarker gene signature discovery integrating network knowledge.
Biology (Basel). 2012 Feb 27;1(1):5-17. doi: 10.3390/biology1010005.
10
Rosen's (M,R) system in process algebra.
BMC Syst Biol. 2013 Nov 17;7:128. doi: 10.1186/1752-0509-7-128.

本文引用的文献

1
QNet: a tool for querying protein interaction networks.
J Comput Biol. 2008 Sep;15(7):913-25. doi: 10.1089/cmb.2007.0172.
2
Not all scale-free networks are born equal: the role of the seed graph in PPI network evolution.
PLoS Comput Biol. 2007 Jul;3(7):e118. doi: 10.1371/journal.pcbi.0030118.
3
QPath: a method for querying pathways in a protein-protein interaction network.
BMC Bioinformatics. 2006 Apr 10;7:199. doi: 10.1186/1471-2105-7-199.
4
Efficient algorithms for detecting signaling pathways in protein interaction networks.
J Comput Biol. 2006 Mar;13(2):133-44. doi: 10.1089/cmb.2006.13.133.
5
Effect of sampling on topology predictions of protein-protein interaction networks.
Nat Biotechnol. 2005 Jul;23(7):839-44. doi: 10.1038/nbt1116.
6
Modeling interactome: scale-free or geometric?
Bioinformatics. 2004 Dec 12;20(18):3508-15. doi: 10.1093/bioinformatics/bth436. Epub 2004 Jul 29.
7
Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs.
Bioinformatics. 2004 Jul 22;20(11):1746-58. doi: 10.1093/bioinformatics/bth163. Epub 2004 Mar 4.
8
Duplication models for biological networks.
J Comput Biol. 2003;10(5):677-87. doi: 10.1089/106652703322539024.
9
Preferential attachment in the protein network evolution.
Phys Rev Lett. 2003 Sep 26;91(13):138701. doi: 10.1103/PhysRevLett.91.138701.
10
Network motifs: simple building blocks of complex networks.
Science. 2002 Oct 25;298(5594):824-7. doi: 10.1126/science.298.5594.824.

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验