A text-mining system for extracting metabolic reactions from full-text articles.

Affiliation

Department of Biological Sciences and Institute of Molecular and Structural Biology, Birkbeck, University of London, London, UK.

Publication information

BMC Bioinformatics. 2012 Jul 23;13:172. doi: 10.1186/1471-2105-13-172.

DOI:10.1186/1471-2105-13-172
PMID:22823282
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3475109/
Abstract

BACKGROUND

Increasingly, biological text mining research is focusing on the extraction of complex relationships relevant to the construction and curation of biological networks and pathways. However, one important category of pathway - metabolic pathways - has been largely neglected. Here we present a relatively simple method for extracting metabolic reaction information from free text that scores different permutations of assigned entities (enzymes and metabolites) within a given sentence based on the presence and location of stemmed keywords. This method extends an approach that has proved effective in the context of the extraction of protein-protein interactions.
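
The permutation-scoring idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the keyword stems, the crude `stem` helper, the scoring weights, and the example sentence are all assumptions made for illustration, and the enzyme is left out of the assignment for brevity. The sketch only shows the general shape of scoring each ordering of tagged metabolites by whether a stemmed keyword is present and where it falls.

```python
from itertools import permutations

# Hypothetical stemmed keywords; the abstract does not list the real ones.
KEYWORD_STEMS = {"catalys", "convert", "produc", "form"}


def stem(token):
    """Crude suffix stripper standing in for a proper stemmer."""
    token = token.lower().strip(".,;()")
    for suffix in ("ing", "es", "ed", "e", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token


def score(tokens, substrate_idx, product_idx):
    """Score one (substrate, product) ordering of two tagged metabolites."""
    total = 0
    for i, tok in enumerate(tokens):
        if stem(tok) in KEYWORD_STEMS:
            total += 1  # a keyword is present in the sentence
            if substrate_idx < i < product_idx:
                total += 2  # and it sits between substrate and product
    return total


def best_assignment(tokens, metabolite_idxs):
    """Return the highest-scoring ordered (substrate, product) index pair."""
    return max(permutations(metabolite_idxs, 2),
               key=lambda pair: score(tokens, *pair))


tokens = "glucose is converted to glucose-6-phosphate by hexokinase".split()
substrate, product = best_assignment(tokens, [0, 4])
print(tokens[substrate], "->", tokens[product])  # glucose -> glucose-6-phosphate
```

In this toy sentence the ordering (glucose, glucose-6-phosphate) scores 3 and the reverse ordering scores 1, so the first is chosen. A real system would use a proper stemmer and a curated keyword list.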

RESULTS

When evaluated on a set of manually-curated metabolic pathways using standard performance criteria, our method performs surprisingly well. Precision and recall rates are comparable to those previously achieved for the well-known protein-protein interaction extraction task.
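
For reference, precision and recall can be computed by comparing extracted reactions against a curated set, as in the hedged sketch below. The function name `precision_recall` and the toy reaction pairs are invented for illustration; they are not data or results from the paper.

```python
def precision_recall(predicted, gold):
    """Precision = TP / |predicted|, recall = TP / |gold|."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy (substrate, product) pairs, invented for this example only.
extracted = {
    ("glucose", "glucose-6-phosphate"),
    ("pyruvate", "lactate"),
    ("citrate", "malate"),
}
curated = {
    ("glucose", "glucose-6-phosphate"),
    ("pyruvate", "lactate"),
    ("fumarate", "malate"),
    ("citrate", "isocitrate"),
}

precision, recall = precision_recall(extracted, curated)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Two of the three extracted pairs appear in the curated set, and two of the four curated pairs are recovered.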

CONCLUSIONS

We conclude that automated metabolic pathway construction is more tractable than has often been assumed, and that (as in the case of protein-protein interaction extraction) relatively simple text-mining approaches can prove surprisingly effective. It is hoped that these results will provide an impetus to further research and act as a useful benchmark for judging the performance of more sophisticated methods that are yet to be developed.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a6d7/3475109/d693e0094413/1471-2105-13-172-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a6d7/3475109/efc8b4108e91/1471-2105-13-172-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a6d7/3475109/c82a6236e7e9/1471-2105-13-172-2.jpg
