LAITOR--Literature Assistant for Identification of Terms co-Occurrences and Relationships.

Affiliation

Max-Delbrück Center for Molecular Medicine, Berlin, Germany.

Publication information

BMC Bioinformatics. 2010 Feb 1;11:70. doi: 10.1186/1471-2105-11-70.

Abstract

BACKGROUND

Biological knowledge is recorded in the scientific literature, which often describes the function of genes and proteins (bioentities) in terms of their interactions (biointeractions). Such bioentities are often related to biological concepts of interest that are specific to a particular research field. Studying the current literature on a selected topic deposited in public databases therefore facilitates the generation of novel hypotheses that associate a set of bioentities with a common context.

RESULTS

We created a text mining system (LAITOR: Literature Assistant for Identification of Terms co-Occurrences and Relationships) that analyses co-occurrences of bioentities, biointeractions, and other biological terms in MEDLINE abstracts. The method accounts for the position of the co-occurring terms within sentences or abstracts. The system detected abstracts mentioning protein-protein interactions in a standard test (BioCreative II IAS test data) with a precision of 0.82-0.89 and a recall of 0.48-0.70. We illustrate the application of LAITOR to the detection of plant response genes in a dataset of 1000 abstracts relevant to the topic.
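The idea of position-aware co-occurrence can be illustrated with a minimal Python sketch. It is not LAITOR's implementation: the naive sentence splitter, the function names, the term list, and the example text are all assumptions made for illustration. It only shows the difference between terms that co-occur within one sentence and terms that merely share an abstract.

```python
import re
from collections import Counter
from itertools import combinations


def split_sentences(abstract):
    """Naive sentence splitter (assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", abstract) if s.strip()]


def find_terms(text, terms):
    """Return the dictionary terms that appear in text as whole words, ignoring case."""
    lowered = text.lower()
    return {
        t for t in terms
        if re.search(r"\b" + re.escape(t.lower()) + r"\b", lowered)
    }


def cooccurrences(abstract, terms):
    """Count term pairs per sentence and collect term pairs for the whole abstract."""
    sentence_pairs = Counter()
    for sentence in split_sentences(abstract):
        found = sorted(find_terms(sentence, terms))
        sentence_pairs.update(combinations(found, 2))
    # Abstract-level pairs include terms that never share a sentence.
    abstract_pairs = set(combinations(sorted(find_terms(abstract, terms)), 2))
    return sentence_pairs, abstract_pairs


# Hypothetical example text and term dictionary, for illustration only.
example = (
    "NPR1 interacts with TGA2 in response to salicylic acid. "
    "EDS1 is also required for defence."
)
term_list = ["NPR1", "TGA2", "EDS1", "salicylic acid"]

sentence_pairs, abstract_pairs = cooccurrences(example, term_list)
print(sentence_pairs)
print(abstract_pairs - set(sentence_pairs))
```

In this sketch, pairs found in the same sentence could be weighted more strongly than pairs found only at the abstract level. How LAITOR itself scores positions is not specified in the abstract.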

CONCLUSIONS

Text mining tools combining the extraction of interacting bioentities and biological concepts with network displays can be helpful in developing reasonable hypotheses in different scientific backgrounds.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ed5/3098111/e4fd7f3e57f2/1471-2105-11-70-1.jpg

Similar articles

1
LAITOR--Literature Assistant for Identification of Terms co-Occurrences and Relationships.
BMC Bioinformatics. 2010 Feb 1;11:70. doi: 10.1186/1471-2105-11-70.
2
SciMiner: web-based literature mining tool for target identification and functional enrichment analysis.
Bioinformatics. 2009 Mar 15;25(6):838-40. doi: 10.1093/bioinformatics/btp049. Epub 2009 Feb 2.
3
Hierarchical network analysis of co-occurring bioentities in literature.
Sci Rep. 2022 May 12;12(1):7885. doi: 10.1038/s41598-022-12093-9.
4
Biological information extraction and co-occurrence analysis.
Methods Mol Biol. 2014;1159:77-92. doi: 10.1007/978-1-4939-0709-0_5.
5
A comprehensive and quantitative comparison of text-mining in 15 million full-text articles versus their corresponding abstracts.
PLoS Comput Biol. 2018 Feb 15;14(2):e1005962. doi: 10.1371/journal.pcbi.1005962. eCollection 2018 Feb.
6
An evaluation of GO annotation retrieval for BioCreAtIvE and GOA.
BMC Bioinformatics. 2005;6 Suppl 1(Suppl 1):S17. doi: 10.1186/1471-2105-6-S1-S17. Epub 2005 May 24.
7
Textpresso: an ontology-based information retrieval and extraction system for biological literature.
PLoS Biol. 2004 Nov;2(11):e309. doi: 10.1371/journal.pbio.0020309. Epub 2004 Sep 21.
8
Text mining tools for extracting information about microbial biodiversity in food.
Food Microbiol. 2019 Aug;81:63-75. doi: 10.1016/j.fm.2018.04.011. Epub 2018 Apr 21.
9
Information content in Medline record fields.
Int J Med Inform. 2004 Jun 30;73(6):515-27. doi: 10.1016/j.ijmedinf.2004.02.008.
10
Text-mining approaches in molecular biology and biomedicine.
Drug Discov Today. 2005 Mar 15;10(6):439-45. doi: 10.1016/S1359-6446(05)03376-3.

Cited by

2
LAITOR4HPC: A text mining pipeline based on HPC for building interaction networks.
BMC Bioinformatics. 2020 Aug 24;21(1):365. doi: 10.1186/s12859-020-03620-4.
3
Text Mining for Protein Docking.
PLoS Comput Biol. 2015 Dec 9;11(12):e1004630. doi: 10.1371/journal.pcbi.1004630. eCollection 2015 Dec.
4
Database constraints applied to metabolic pathway reconstruction tools.
ScientificWorldJournal. 2014;2014:967294. doi: 10.1155/2014/967294. Epub 2014 Aug 17.
5
Large-scale structure of a network of co-occurring MeSH terms: statistical analysis of macroscopic properties.
PLoS One. 2014 Jul 9;9(7):e102188. doi: 10.1371/journal.pone.0102188. eCollection 2014.
7
Extracting rate changes in transcriptional regulation from MEDLINE abstracts.
BMC Bioinformatics. 2014;15 Suppl 2(Suppl 2):S4. doi: 10.1186/1471-2105-15-S2-S4. Epub 2014 Jan 24.
8
Systems biology elucidates common pathogenic mechanisms between nonalcoholic and alcoholic-fatty liver disease.
PLoS One. 2013;8(3):e58895. doi: 10.1371/journal.pone.0058895. Epub 2013 Mar 13.
10
Preimplantation development regulatory pathway construction through a text-mining approach.
BMC Genomics. 2011 Dec 22;12 Suppl 4(Suppl 4):S3. doi: 10.1186/1471-2164-12-S4-S3.

References

1
PLAN2L: a web tool for integrated text mining and literature-based bioentity relation extraction.
Nucleic Acids Res. 2009 Jul;37(Web Server issue):W160-5. doi: 10.1093/nar/gkp484. Epub 2009 Jun 11.
2
MedlineRanker: flexible ranking of biomedical literature.
Nucleic Acids Res. 2009 Jul;37(Web Server issue):W141-6. doi: 10.1093/nar/gkp353. Epub 2009 May 8.
4
Arena3D: visualization of biological networks in 3D.
BMC Syst Biol. 2008 Nov 28;2:104. doi: 10.1186/1752-0509-2-104.
5
STRING 8--a global view on proteins and their functional interactions in 630 organisms.
Nucleic Acids Res. 2009 Jan;37(Database issue):D412-6. doi: 10.1093/nar/gkn760. Epub 2008 Oct 21.
7
Overview of the protein-protein interaction annotation extraction task of BioCreative II.
Genome Biol. 2008;9 Suppl 2(Suppl 2):S4. doi: 10.1186/gb-2008-9-s2-s4. Epub 2008 Sep 1.
8
Salicylic acid in plant defence--the players and protagonists.
Curr Opin Plant Biol. 2007 Oct;10(5):466-72. doi: 10.1016/j.pbi.2007.08.008. Epub 2007 Sep 27.
9
Mechanisms of high salinity tolerance in plants.
Methods Enzymol. 2007;428:419-38. doi: 10.1016/S0076-6879(07)28024-3.
10
Automatic reconstruction of a bacterial regulatory network using Natural Language Processing.
BMC Bioinformatics. 2007 Aug 7;8:293. doi: 10.1186/1471-2105-8-293.
