Bajic Vladimir B, Veronika Merlin, Veladandi Pardha Sarathi, Meka Archana, Heng Mok-Wei, Rajaraman Kanagasabai, Pan Hong, Swarup Sanjay
Knowledge Extraction Lab, Institute for Infocomm Research, Singapore 119613.
Plant Physiol. 2005 Aug;138(4):1914-25. doi: 10.1104/pp.105.060863.
We introduce a text-mining tool, Dragon Plant Biology Explorer (DPBE), that integrates information on Arabidopsis (Arabidopsis thaliana) genes with their functions, based on gene ontologies and biochemical entity vocabularies, and presents the associations as interactive networks. The associations are based on (1) user-provided PubMed abstracts; (2) a list of Arabidopsis genes compiled by The Arabidopsis Information Resource; (3) user-defined combinations of four vocabulary lists based on those developed by the general, plant, and Arabidopsis GO consortia; and (4) three lists developed here based on metabolic pathways, enzymes, and metabolites derived from AraCyc, BRENDA, and other metabolism databases. We demonstrate how various combinations can be applied to the fields of (1) gene function and gene interaction analyses, (2) plant development, (3) biochemistry and metabolism, and (4) pharmacology of bioactive compounds. Furthermore, we show the suitability of DPBE for systems approaches by integrating it with "omics" platform outputs. Using a list of abiotic stress-related genes identified by microarray experiments, we show how this tool can be used to rapidly build an information base on previously reported relationships. This tool complements existing biological resources for systems biology by using text analysis to identify potentially novel associations between cellular entities based on genome annotation terms. Thus, it allows researchers to efficiently summarize existing information for a group of genes or pathways, so as to make better-informed choices when designing validation experiments. Last, DPBE can help beginning researchers and graduate students summarize the large body of information in an unfamiliar area. DPBE is freely available for academic and nonprofit users at http://research.i2r.a-star.edu.sg/DRAGON/ME2/.
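The abstract does not describe how DPBE scores or extracts associations, so the following is only a minimal sketch of one common text-mining approach: counting how often a gene name and a vocabulary term appear in the same abstract, then treating each count as a weighted network edge. The function names (`find_terms`, `cooccurrence_edges`), the placeholder genes (`GeneA`, `GeneB`), the vocabulary terms, and the toy abstracts are all assumptions made for illustration, not DPBE's actual method or data.

```python
import re
from collections import Counter
from itertools import product


def find_terms(text, terms):
    """Return the terms that occur in text as whole words (case-insensitive)."""
    found = set()
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(term)
    return found


def cooccurrence_edges(abstracts, genes, vocabulary):
    """Count the abstracts in which a gene and a vocabulary term both appear."""
    edges = Counter()
    for text in abstracts:
        genes_hit = find_terms(text, genes)
        terms_hit = find_terms(text, vocabulary)
        # Every gene/term pair found in the same abstract adds one to that edge.
        for gene, term in product(sorted(genes_hit), sorted(terms_hit)):
            edges[(gene, term)] += 1
    return edges


# Toy inputs, invented for illustration only (hypothetical genes and text).
abstracts = [
    "GeneA is induced by drought and regulates response to water deprivation.",
    "Expression of GeneB increases under drought and cold acclimation.",
    "GeneA binds the GeneB promoter during drought.",
]
genes = ["GeneA", "GeneB"]
vocabulary = ["drought", "cold acclimation", "response to water deprivation"]

edges = cooccurrence_edges(abstracts, genes, vocabulary)
for (gene, term), count in sorted(edges.items()):
    print(f"{gene} -- {term}: {count} abstract(s)")
```

In a sketch like this, the edge counts could feed a graph library or a viewer to give the kind of interactive network the abstract mentions. A real system would also need gene-name synonym handling and a statistical filter for chance co-occurrences, neither of which is attempted here.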