Department of Informatics and Telecommunication Engineering, University of Catania, Catania, Italy.
Brief Bioinform. 2012 Jan;13(1):61-82. doi: 10.1093/bib/bbr018. Epub 2011 Jun 15.
A large amount of important biomedical information is buried in the body of biomedical research articles. At the same time, the publication of biological databases and of experimental datasets generated by high-throughput methods is expanding rapidly, and a wealth of annotated gene databases and of chemical, genomic (including microarray datasets), clinical and other data repositories is now available on the Web. A current challenge of bioinformatics is therefore to develop targeted methods and tools that integrate scientific literature, biological databases and experimental data, both to reduce the time needed for database curation and to give access to evidence, in the literature or in the datasets, that is useful for the analysis at hand. Against this background, this article reviews knowledge discovery systems that fuse information gathered from the literature by text mining with microarray data, in order to enrich lists of down- and upregulated genes with elements that aid biological understanding and to generate and validate new biological hypotheses. Finally, it describes GeneWizard, an easy-to-use and freely accessible tool that exploits the fusion of text mining and microarray data to support researchers in discovering gene-disease relationships.