
Multi-way association extraction and visualization from biological text documents using hyper-graphs: applications to genetic association studies for diseases.

Affiliation

Department of Computer and Information Science, Indiana University Purdue University Indianapolis, 723 West Michigan Street SL 280J, Indianapolis, IN 46202, USA.

Publication information

Artif Intell Med. 2010 Jul;49(3):145-54. doi: 10.1016/j.artmed.2010.03.002. Epub 2010 Apr 9.

DOI: 10.1016/j.artmed.2010.03.002
PMID: 20382004

Abstract

OBJECTIVES

Biological research literature, like the literature of many other domains of human endeavor, is a rich, ever-growing source of knowledge. An important form of such biological knowledge consists of associations among biological entities such as genes, proteins, diseases, drugs, and chemicals. There has been considerable recent research on extracting various kinds of binary associations (e.g., gene-gene, gene-protein, protein-protein) using different text mining approaches. However, an important aspect of such associations (e.g., "gene A activates protein B") is identifying the context in which they occur (e.g., "gene A activates protein B in the context of disease C in organ D under the influence of chemical E"). Such contexts can be represented appropriately by a multi-way relationship involving more than two objects (e.g., objects A, B, C, D, E) rather than the usual binary relationship (objects A and B).
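To make the contrast between a multi-way relationship and a binary one concrete, the following is a minimal sketch, not taken from the paper. It assumes a hyper-edge can be modelled as a plain set of entity labels; the labels and the helper name `binary_projection` are hypothetical and simply reuse the A to E example from the paragraph above.

```python
from itertools import combinations

# Hypothetical entity labels, reusing the example sentence from the abstract.
hyperedge = frozenset({"gene A", "protein B", "disease C", "organ D", "chemical E"})

def binary_projection(edge):
    """Return every unordered pair of entities contained in one hyper-edge."""
    return {frozenset(pair) for pair in combinations(sorted(edge), 2)}

# Flattening one 5-way relation gives C(5, 2) = 10 separate binary edges.
pairs = binary_projection(hyperedge)
print(len(pairs))

# The pair (gene A, protein B) survives the projection, but nothing in the
# pair itself records that it was observed together with C, D and E.
print(frozenset({"gene A", "protein B"}) in pairs)
```

Under this assumption, the hyper-edge keeps all five entities together as one unit, while the binary projection keeps only pairwise facts and loses the shared context.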

METHODS

Such multi-way relations naturally lead to a hyper-graph representation of the knowledge rather than a binary graph. Hyper-graph based multi-way knowledge extraction from biological text literature is a computationally difficult problem (due to its combinatorial nature) that has not received much attention from the bioinformatics research community. In this paper, we describe and compare two different approaches to such multi-way hyper-graph extraction: one based on an exhaustive enumeration of all multi-way hyper-edges, and the other based on an extension of the well-known A Priori algorithm for structured data to the case of unstructured textual data. We also present a representative-graph-based approach to visualizing these genetic association hyper-graphs.
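The two extraction strategies named above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes each document has already been reduced to a set of recognized entity names, that a hyper-edge is a set of entities co-occurring in at least `min_support` documents, and that hyper-edge size is capped at `max_size`. The function names, parameters, and toy documents are all hypothetical.

```python
from itertools import combinations

def exhaustive_hyperedges(docs, min_support, max_size):
    """Count every entity subset of size 2..max_size found in any document."""
    counts = {}
    for entities in docs:
        for k in range(2, max_size + 1):
            for combo in combinations(sorted(entities), k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {edge: c for edge, c in counts.items() if c >= min_support}

def apriori_hyperedges(docs, min_support, max_size):
    """Level-wise search: size-k candidates are built only from frequent (k-1)-sets."""
    def support(candidate):
        return sum(1 for entities in docs if candidate <= entities)

    items = {e for entities in docs for e in entities}
    frequent = {frozenset([e]) for e in items
                if support(frozenset([e])) >= min_support}
    result = {}
    k = 2
    while frequent and k <= max_size:
        # Join step: merge frequent (k-1)-sets whose union has exactly k entities.
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune step: drop a candidate if any of its (k-1)-subsets is infrequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        frequent = set()
        for c in candidates:
            s = support(c)
            if s >= min_support:
                result[c] = s
                frequent.add(c)
        k += 1
    return result

# Toy input, invented for illustration only (not data from the paper).
docs = [
    {"EGFR", "KRAS", "lung cancer"},
    {"EGFR", "KRAS", "lung cancer", "gefitinib"},
    {"EGFR", "lung cancer"},
    {"KRAS", "TP53"},
]
exhaustive = exhaustive_hyperedges(docs, min_support=2, max_size=3)
levelwise = apriori_hyperedges(docs, min_support=2, max_size=3)
print(exhaustive == levelwise)
```

On this toy input both functions return the same hyper-edges. The level-wise version never counts subsets that contain an infrequent entity such as "gefitinib" or "TP53", which is where the saving over exhaustive enumeration would come from.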

RESULTS

Two case studies are conducted for two biomedical problems (related to lung cancer and colorectal cancer, respectively), illustrating that the latter approach (the text-based A Priori method) identifies the same hyper-edges as the former approach (the exhaustive method), but at a much lower computational cost. The extracted hyper-relations are presented in the paper as cognition-rich representative graphs, representing the corresponding hyper-graphs.
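The abstract does not report cost figures, but the combinatorial growth behind the exhaustive method's cost can be illustrated with simple counting. The sketch below assumes a single text unit mentions `n_entities` distinct entities and that hyper-edges of size 2 up to `max_size` are of interest; both parameters and the function name are hypothetical.

```python
from math import comb

def exhaustive_candidate_count(n_entities, max_size):
    """Number of entity subsets of size 2..max_size that exhaustive enumeration visits."""
    return sum(comb(n_entities, k) for k in range(2, max_size + 1))

# Candidate counts for a few hypothetical entity counts, hyper-edges up to size 5.
for n in (5, 10, 20):
    print(n, exhaustive_candidate_count(n, 5))
```

A level-wise method avoids most of these subsets whenever few of the smaller entity sets are frequent, because every larger candidate must be built from frequent smaller ones.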

CONCLUSIONS

The text-based A Priori algorithm is a practical, useful method to extract hyper-graphs representing multi-way associations among biological objects. These hyper-graphs and their visualization using representative graphs can provide important contextual information for understanding gene-gene associations relevant to specific diseases.


