Building a glaucoma interaction network using a text mining approach.

Author information

Soliman Maha, Nasraoui Olfa, Cooper Nigel G F

Affiliations

Department of Anatomical Sciences and Neurobiology, University of Louisville, School of Medicine, Louisville, KY, USA.

Knowledge Discovery & Web Mining Lab, Department of Computer Engineering & Computer Science, University of Louisville, J.B. Speed School of Engineering, Louisville, KY, USA.

Publication information

BioData Min. 2016 May 5;9:17. doi: 10.1186/s13040-016-0096-2. eCollection 2016.

DOI:10.1186/s13040-016-0096-2

PMID:27152122

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4857381/

Abstract

BACKGROUND

The volume of biomedical literature and its underlying knowledge base is expanding rapidly, beyond the ability of any single person to read it all. Several automated methods have been developed to help address this problem. The present study reports the results of a text mining approach that extracts gene interactions from the data warehouse of published experimental results; these interactions are then used to benchmark an interaction network associated with glaucoma. To the best of our knowledge, there is as yet no glaucoma interaction network derived solely from text mining approaches. Such a network could provide a useful summative knowledge base to complement other forms of clinical information related to this disease.

RESULTS

A glaucoma corpus was constructed from PubMed Central, and a text mining approach was applied to extract genes and their relations from this corpus. The extracted relations between genes were checked against reference interaction databases and broadly classified as known or new relations. The extracted genes and relations were then used to construct a glaucoma interaction network. Analysis of the resulting network indicated that it bears the characteristics of a small-world interaction network. Our analysis showed the presence of seven glaucoma-linked genes that defined the network modularity. A web-based system for browsing and visualizing the extracted glaucoma-related interaction networks is available at http://neurogene.spd.louisville.edu/GlaucomaINViewer/Form1.aspx.
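The steps described above (checking extracted relations against a reference set, building a network, and inspecting small-world properties) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the gene names `G1` to `G4`, the toy relation lists, and all function names are hypothetical placeholders, and the two statistics shown (average clustering coefficient and average shortest path length) are only the standard quantities usually compared against a random graph when assessing small-world structure.

```python
from collections import deque
from itertools import combinations


def normalize(pair):
    """Treat a relation as undirected: (A, B) and (B, A) are the same edge."""
    a, b = pair
    return tuple(sorted((a, b)))


def classify_relations(extracted, reference):
    """Split extracted relations into known (in reference) and new (absent)."""
    ref = {normalize(p) for p in reference}
    known, new = set(), set()
    for pair in extracted:
        edge = normalize(pair)
        (known if edge in ref else new).add(edge)
    return known, new


def build_graph(edges):
    """Build an undirected adjacency map from a collection of edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean BFS distance over all reachable ordered node pairs."""
    total, count = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count if count else 0.0


# Hypothetical toy data: placeholder gene symbols, not results from the study.
extracted = [("G1", "G2"), ("G2", "G3"), ("G3", "G1"), ("G3", "G4"), ("G2", "G1")]
reference = [("G2", "G1"), ("G4", "G3")]

known, new = classify_relations(extracted, reference)
graph = build_graph(known | new)
clustering = average_clustering(graph)
path_length = average_path_length(graph)
```

A network is typically called small-world when its clustering is much higher than that of a random graph with the same number of nodes and edges while its average path length stays comparably short; this sketch computes the two quantities but leaves that comparison out.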

CONCLUSIONS

This study reports the first version of a glaucoma interaction network built with a text mining approach. The strength of such an approach is its ability to cover a wide range of glaucoma-related studies published over many years, so that a broader picture of the disease can be established. To the best of our knowledge, this is the first glaucoma interaction network to summarize the known literature. The major findings were a set of new relations that could not be found in existing interaction databases, and a smaller subnetwork consisting of interconnected clusters of seven glaucoma genes. Future improvements could yield a better version of this network.
