Mining metabolites: extracting the yeast metabolome from the literature.

Author information

Nobata Chikashi, Dobson Paul D, Iqbal Syed A, Mendes Pedro, Tsujii Jun'ichi, Kell Douglas B, Ananiadou Sophia

Publication information

Metabolomics. 2011 Mar;7(1):94-101. doi: 10.1007/s11306-010-0251-6. Epub 2010 Oct 31.

Abstract

Text mining methods have added considerably to our capacity to extract biological knowledge from the literature. Recently the field of systems biology has begun to model and simulate metabolic networks, requiring knowledge of the set of molecules involved. While genomics and proteomics technologies are able to supply the macromolecular parts list, the metabolites are less easily assembled. Most metabolites are known and reported through the scientific literature, rather than through large-scale experimental surveys. Thus it is important to recover them from the literature. Here we present a novel tool to automatically identify metabolite names in the literature, and associate structures where possible, to define the reported yeast metabolome. With ten-fold cross validation on a manually annotated corpus, our recognition tool generates an f-score of 78.49 (precision of 83.02) and demonstrates greater suitability in identifying metabolite names than other existing recognition tools for general chemical molecules. The metabolite recognition tool has been applied to the literature covering an important model organism, the yeast Saccharomyces cerevisiae, to define its reported metabolome. By coupling to ChemSpider, a major chemical database, we have identified structures for much of the reported metabolome and, where structure identification fails, been able to suggest extensions to ChemSpider. Our manually annotated gold-standard data on 296 abstracts are available as supplementary materials. Metabolite names and, where appropriate, structures are also available as supplementary materials. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s11306-010-0251-6) contains supplementary material, which is available to authorized users.
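The abstract reports an F-score and a precision but no recall. As a minimal sketch, assuming the reported F-score is the balanced F1 (the harmonic mean of precision and recall) and that both figures come from the same pooled evaluation, the implied recall can be estimated as follows. The function names are illustrative, and the actual recall in the paper may differ because of rounding or per-fold averaging.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recall_from_f_score(f: float, precision: float) -> float:
    """Solve F = 2PR / (P + R) for R, giving R = F * P / (2P - F)."""
    denominator = 2 * precision - f
    if denominator <= 0:
        raise ValueError("F-score is too large for the given precision")
    return f * precision / denominator


# Values quoted in the abstract (percentages).
reported_f = 78.49
reported_precision = 83.02

# Assumption: the F-score is F1 computed from this precision and one recall value.
implied_recall = recall_from_f_score(reported_f, reported_precision)
print(f"implied recall (under the F1 assumption): {implied_recall:.2f}")
```

Under these assumptions the implied recall is lower than the precision, which would suggest the tool misses more metabolite names than it wrongly reports.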

Similar articles

1
YMDB: the Yeast Metabolome Database.
Nucleic Acids Res. 2012 Jan;40(Database issue):D815-20. doi: 10.1093/nar/gkr916. Epub 2011 Nov 7.

Cited by

1
Thalia: semantic search engine for biomedical abstracts.
Bioinformatics. 2019 May 15;35(10):1799-1801. doi: 10.1093/bioinformatics/bty871.
2
How close are we to complete annotation of metabolomes?
Curr Opin Chem Biol. 2017 Feb;36:64-69. doi: 10.1016/j.cbpa.2017.01.001. Epub 2017 Jan 21.
3
Context-based resolution of semantic conflicts in biological pathways.
BMC Med Inform Decis Mak. 2015;15 Suppl 1(Suppl 1):S3. doi: 10.1186/1472-6947-15-S1-S3. Epub 2015 May 20.
