An integrated text mining framework for metabolic interaction network reconstruction.

Author information

Patumcharoenpol Preecha, Doungpan Narumol, Meechai Asawin, Shen Bairong, Chan Jonathan H, Vongsangnak Wanwipa

Affiliations

Systems Biology and Bioinformatics Laboratory, King Mongkut's University of Technology Thonburi, Bangkok, Thailand.

Center for Systems Biology, Soochow University, Suzhou, China.

Publication information

PeerJ. 2016 Mar 21;4:e1811. doi: 10.7717/peerj.1811. eCollection 2016.

Abstract

Text mining (TM) in biology is fast becoming a routine analysis for extracting and curating biological entities (e.g., genes, proteins, simple chemicals) and their relationships. Because TM applies widely to situations involving complex relationships, it is valuable to apply it to the extraction of metabolic interactions (i.e., enzyme and metabolite interactions) through metabolic events. Here we present an integrated TM framework containing two modules: the Metabolic Event Extraction (MEE) module, which extracts metabolic events, and the Metabolic Interaction Network Reconstruction (MINR) module, which constructs a metabolic interaction network. The proposed framework performed well on the standard measures of recall, precision and F-score. Evaluation of the MEE module on the constructed Metabolic Entities (ME) corpus yielded F-scores of 59.15% and 48.59% for the detection of production and consumption metabolic events, respectively. When the entity tagger for Gene and Protein (GP) and metabolite was tested on the test corpus, the F-score was greater than 80% for the superpathway of leucine, valine, and isoleucine biosynthesis. Mapping of enzyme and metabolite interactions through network reconstruction showed fair performance for the MINR module on the test corpus, with an F-score >70%. Finally, applying the framework to large-scale data (i.e., EcoCyc extraction data) to reconstruct a metabolic interaction network gave reasonable precision of 69.93%, 70.63% and 46.71% for enzyme, metabolite and enzyme-metabolite interaction, respectively. This study presents the first open-source integrated TM framework for reconstructing a metabolic interaction network. The framework can be a powerful tool that helps biologists extract metabolic events for further reconstruction of a metabolic interaction network.
The ME corpus, test corpus, source code, and virtual machine image with pre-configured software are available at www.sbi.kmutt.ac.th/preecha/metrecon.
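
The abstract reports recall, precision and F-score without defining them. The sketch below shows how these measures are conventionally computed when extracted enzyme-metabolite interactions are compared with a gold-standard set. It is an illustration only, not the authors' evaluation code: the function name `precision_recall_f1` and the toy gene and metabolite pairs are hypothetical, and exact matching of pairs is an assumption about how a hit is counted.

```python
def precision_recall_f1(predicted, gold):
    """Compare two collections of (enzyme, metabolite) pairs by exact match."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)

    # Precision: fraction of extracted pairs that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of gold-standard pairs that were extracted.
    recall = true_positives / len(gold) if gold else 0.0
    # F-score: harmonic mean of precision and recall.
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical toy data, not taken from the ME corpus or the test corpus.
gold = {
    ("ilvE", "L-valine"),
    ("ilvE", "L-leucine"),
    ("leuB", "3-isopropylmalate"),
    ("ilvC", "2-acetolactate"),
}
predicted = {
    ("ilvE", "L-valine"),
    ("ilvE", "L-leucine"),
    ("leuB", "pyruvate"),  # a spurious extraction
}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2%} recall={recall:.2%} F-score={f1:.2%}")
```

In this toy case two of the three extracted pairs are correct and two of the four gold pairs are recovered, so precision is higher than recall and the F-score falls between them.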


