
Mining contextually meaningful subgraphs from a vertex-attributed graph.

Affiliations

Computer Science, North Dakota State University, Fargo, North Dakota, USA.

Computer Science and Engineering, Qatar University, Doha, Qatar.

Publication information

BMC Bioinformatics. 2024 Nov 14;25(1):356. doi: 10.1186/s12859-024-05960-x.

DOI: 10.1186/s12859-024-05960-x
PMID: 39543486
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11566210/
Abstract

Networks have emerged as a natural data structure to represent relations among entities. Proteins interact to carry out cellular functions, and protein-protein interaction network analysis has been employed for understanding the cellular machinery. Advances in genomics technologies have enabled the collection of large datasets that annotate proteins in interaction networks. Integrative analysis of interaction networks with gene expression and annotations enables the discovery of context-specific complexes and improves the identification of functional modules and pathways. Extracting subnetworks whose vertices are connected and have high attribute similarity has applications in diverse domains. We present an enumeration approach for mining sets of connected and cohesive subgraphs, where vertices in the subgraphs have similar attribute profiles. Because the number of cohesive connected subgraphs is large, and to overcome the overlap among them, we propose an algorithm for enumerating a set of representative subgraphs, the set of all closed subgraphs. We propose pruning strategies for efficiently enumerating the search tree without missing any pattern or reporting duplicate subgraphs. On a real protein-protein interaction network with attributes representing the dysregulation profile of genes in multiple cancers, we mine closed cohesive connected subnetworks and show their biological significance. Moreover, we conduct a runtime comparison with existing algorithms to show the efficiency of our proposed algorithm.
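
To make the notions of "cohesive", "connected", and "closed" concrete, the following is a minimal brute-force sketch in Python. It is not the authors' algorithm and does not reproduce their pruning strategies. The abstract does not define the cohesion measure, so the sketch assumes one: each vertex carries a vector of values in {-1, 0, 1} (down-regulated, unchanged, up-regulated), and a vertex set is cohesive when at least `min_dims` dimensions hold the same nonzero value for every vertex. A cohesive connected set is treated as closed when no neighboring vertex can be added without shrinking that set of shared dimensions. The names `shared_dims`, `closed_cohesive_subgraphs`, `min_dims`, and the toy graph are all hypothetical.

```python
def shared_dims(attrs, vertices):
    """Return the attribute indices on which all vertices agree with the same nonzero value."""
    vs = list(vertices)
    first = attrs[vs[0]]
    return frozenset(
        i for i, x in enumerate(first)
        if x != 0 and all(attrs[v][i] == x for v in vs)
    )


def closed_cohesive_subgraphs(adj, attrs, min_dims):
    """Enumerate closed cohesive connected vertex sets by brute force.

    adj:      dict mapping each vertex to the set of its neighbors (undirected)
    attrs:    dict mapping each vertex to a tuple of values in {-1, 0, 1}
    min_dims: assumed cohesion threshold (minimum number of shared dimensions)
    """
    seen = set()      # vertex sets already visited, to avoid duplicates
    closed = []       # (vertex set, shared dimensions) pairs
    stack = [frozenset([v]) for v in adj]
    while stack:
        s = stack.pop()
        if s in seen:
            continue
        seen.add(s)
        dims = shared_dims(attrs, s)
        if len(dims) < min_dims:
            # Adding vertices can only shrink the shared dimensions,
            # so no superset of s can be cohesive either.
            continue
        # Growing only by neighbors keeps every candidate connected.
        frontier = {u for v in s for u in adj[v]} - s
        is_closed = True
        for u in frontier:
            t = s | {u}
            if shared_dims(attrs, t) == dims:
                is_closed = False  # a larger set has the same profile
            stack.append(t)
        if is_closed:
            closed.append((s, dims))
    return closed


# Toy path graph a - b - c - d with two attribute dimensions.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
attrs = {"a": (1, 1), "b": (1, 1), "c": (1, -1), "d": (-1, -1)}

result = closed_cohesive_subgraphs(adj, attrs, min_dims=1)
for vertices, dims in sorted(result, key=lambda p: sorted(p[0])):
    print(sorted(vertices), sorted(dims))
```

The `seen` set is what prevents duplicates here, at the cost of memory proportional to the number of visited sets; an approach like the one the abstract describes would instead organize the search tree so that each subgraph is reached only once.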


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/18cd/11566210/120bad628b6a/12859_2024_5960_Figa_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/18cd/11566210/d5d677f13226/12859_2024_5960_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/18cd/11566210/0f2cb970a8c4/12859_2024_5960_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/18cd/11566210/bbf6b66e9178/12859_2024_5960_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/18cd/11566210/0a308b5c6443/12859_2024_5960_Figb_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/18cd/11566210/381e2471600b/12859_2024_5960_Figc_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/18cd/11566210/64c4672ed55b/12859_2024_5960_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/18cd/11566210/4eb53f9621d7/12859_2024_5960_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/18cd/11566210/877b9ac378b7/12859_2024_5960_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/18cd/11566210/129e48c5090a/12859_2024_5960_Fig7_HTML.jpg

