Clustering approaches for visual knowledge exploration in molecular interaction networks.

Affiliations

Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7, Avenue des Hauts-Fourneaux, Esch-Belval, Luxembourg.

Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, 6, Avenue de la Fonte, Esch-Belval, Luxembourg.

Publication information

BMC Bioinformatics. 2018 Aug 29;19(1):308. doi: 10.1186/s12859-018-2314-z.

DOI: 10.1186/s12859-018-2314-z
PMID: 30157777
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6116538/
Abstract

BACKGROUND

Biomedical knowledge grows in complexity, and becomes encoded in network-based repositories, which include focused, expert-drawn diagrams, networks of evidence-based associations and established ontologies. Combining these structured information sources is an important computational challenge, as large graphs are difficult to analyze visually.

RESULTS

We investigate knowledge discovery in manually curated and annotated molecular interaction diagrams. To evaluate similarity of content we use: i) Euclidean distance in expert-drawn diagrams, ii) shortest path distance using the underlying network and iii) ontology-based distance. We employ clustering with these metrics used separately and in pairwise combinations. We propose a novel bi-level optimization approach together with an evolutionary algorithm for informative combination of distance metrics. We compare the enrichment of the obtained clusters between the solutions and with expert knowledge. We calculate the number of Gene and Disease Ontology terms discovered by different solutions as a measure of cluster quality. Our results show that combining distance metrics can improve clustering accuracy, based on the comparison with expert-provided clusters. Also, the performance of specific combinations of distance functions depends on the clustering depth (number of clusters). By employing a bi-level optimization approach we evaluated the relative importance of distance functions and found that the order in which they are combined indeed affects clustering performance. Next, with the enrichment analysis of clustering results we found that both hierarchical and bi-level clustering schemes discovered more Gene and Disease Ontology terms than expert-provided clusters for the same knowledge repository. Moreover, bi-level clustering found more enriched terms than the best hierarchical clustering solution for three distinct distance metric combinations in three different instances of disease maps.
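To make the idea of clustering on a pairwise combination of distance metrics concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' method: the weighted sum of normalized matrices, the average-linkage rule, the toy coordinates and edges, and all function names (`euclidean_matrix`, `shortest_path_matrix`, `normalize`, `combine`, `agglomerate`) are hypothetical choices made for this example. The abstract does not specify how the metrics are combined, and the ontology-based distance is left out for brevity.

```python
import math
from collections import deque


def euclidean_matrix(coords):
    """Pairwise Euclidean distance between element positions in a drawn diagram."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def shortest_path_matrix(n, edges):
    """Pairwise hop counts in an undirected network, via breadth-first search."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = [[math.inf] * n for _ in range(n)]
    for s in range(n):
        dist[s][s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[s][v] == math.inf:
                    dist[s][v] = dist[s][u] + 1
                    queue.append(v)
    return dist


def normalize(matrix):
    """Scale distances to [0, 1]; unreachable pairs (inf) are set to 1."""
    finite = [d for row in matrix for d in row if d != math.inf]
    top = max(finite) or 1.0
    return [[1.0 if d == math.inf else d / top for d in row] for row in matrix]


def combine(m1, m2, weight=0.5):
    """Assumed combination rule: weighted sum of two normalized matrices."""
    n = len(m1)
    return [[weight * m1[i][j] + (1 - weight) * m2[i][j] for j in range(n)]
            for i in range(n)]


def agglomerate(dist, k):
    """Average-linkage agglomerative clustering, stopped at k clusters."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        pairs = ((x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters)))
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy data (hypothetical): six elements drawn in two groups, joined by one edge.
coords = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

combined = combine(normalize(euclidean_matrix(coords)),
                   normalize(shortest_path_matrix(len(coords), edges)))
print(agglomerate(combined, k=2))
```

Changing `k` corresponds to the clustering depth mentioned in the abstract, and changing `weight` shifts the balance between layout and network distance. A symmetric weighted sum like this cannot express the order in which metrics are combined, which is the aspect the bi-level approach is reported to evaluate.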

CONCLUSIONS

In this work we examined the impact of different distance functions on clustering of a visual biomedical knowledge repository. We found that combining distance functions may be beneficial for clustering and may improve exploration of such repositories. We proposed bi-level optimization to evaluate the importance of the order in which the distance functions are combined. Both the combination and the order of these functions affected clustering quality and knowledge recognition in the considered benchmarks. We propose that multiple dimensions can be utilized simultaneously for visual knowledge exploration.
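The cluster-quality measure used in this work, the number of ontology terms discovered by a clustering, can be sketched as follows. This is a hedged illustration only: the abstract does not name the statistical test, so the one-sided hypergeometric test, the 0.05 threshold, the absence of multiple-testing correction, and the names `hypergeom_pvalue` and `count_enriched_terms` are all assumptions, and the gene sets are invented toy data.

```python
from math import comb


def hypergeom_pvalue(population, annotated, cluster_size, overlap):
    """P(X >= overlap) when drawing cluster_size items without replacement
    from a population in which `annotated` items carry the term."""
    upper = min(annotated, cluster_size)
    favourable = sum(
        comb(annotated, x) * comb(population - annotated, cluster_size - x)
        for x in range(overlap, upper + 1)
    )
    return favourable / comb(population, cluster_size)


def count_enriched_terms(clusters, annotations, population, alpha=0.05):
    """Count distinct terms over-represented in at least one cluster.
    Assumption: no multiple-testing correction is applied in this sketch."""
    enriched = set()
    for cluster in clusters:
        members = set(cluster) & population
        for term, genes in annotations.items():
            overlap = len(members & genes)
            if overlap == 0:
                continue
            p = hypergeom_pvalue(len(population), len(genes & population),
                                 len(members), overlap)
            if p < alpha:
                enriched.add(term)
    return len(enriched)


# Toy data (hypothetical): 20 genes, two clusters, two ontology terms.
population = {f"g{i}" for i in range(20)}
clusters = [[f"g{i}" for i in range(5)], [f"g{i}" for i in range(5, 10)]]
annotations = {
    "T1": {f"g{i}" for i in range(5)},   # coincides with the first cluster
    "T2": {"g0", "g10", "g15", "g19"},   # scattered across the population
}
print(count_enriched_terms(clusters, annotations, population))
```

Running the same count on two alternative clusterings of one repository gives a simple basis for comparing them, in the spirit of the comparison between hierarchical, bi-level and expert-provided clusters described above.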
