Ng SK, Wong M
Genome Inform Ser Workshop Genome Inform. 1999;10:104-112.
We are entering a new era of research in which the latest scientific discoveries are often first reported online and are readily accessible to scientists worldwide. This rapid electronic dissemination of research breakthroughs has greatly accelerated the pace of genomics and proteomics research. The race to discover a gene or a drug now depends increasingly on how quickly a scientist can scan the voluminous information available online and construct the relevant picture (such as protein-protein interaction pathways) as it takes shape within the rapidly expanding pool of globally accessible biological data (e.g. GENBANK) and scientific literature (e.g. MEDLINE). We describe a prototype system for automatic pathway discovery from online text abstracts, combining technologies that (1) retrieve research abstracts from online sources, (2) extract relevant information from the free text, and (3) present the extracted information graphically and intuitively. Our work demonstrates that this framework allows online scientific literature to be scanned routinely for automatic discovery of knowledge, giving modern scientists a competitive edge in managing the information explosion of this electronic age.
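The three stages named in the abstract (retrieve, extract, present) can be illustrated with a minimal Python sketch. This is not the authors' system: the abstract does not say how extraction or rendering is done, so the verb list, the single-token protein-name pattern, the function names `extract_interactions` and `to_dot`, and the placeholder proteins ProtA, ProtB and ProtC are all assumptions made for illustration. Retrieval is stubbed with in-memory strings rather than a live query, and the graphical step is reduced to emitting Graphviz DOT text.

```python
import re

# Hypothetical interaction verbs; a real system would need a far richer lexicon.
# Longer alternatives come first so "binds to" is tried before "binds".
VERBS = ["interacts with", "binds to", "binds", "activates", "inhibits"]

# Assumption: a protein name is a single token of letters, digits and hyphens.
NAME = r"[A-Za-z][A-Za-z0-9-]*"
PATTERN = re.compile(r"\b(" + NAME + r")\s+(" + "|".join(VERBS) + r")\s+(" + NAME + r")\b")


def extract_interactions(abstracts):
    """Stage 2 (sketch): pull (source, verb, target) triples out of free text."""
    triples = set()
    for text in abstracts:
        for src, verb, dst in PATTERN.findall(text):
            triples.add((src, verb, dst))
    return sorted(triples)


def to_dot(triples):
    """Stage 3 (sketch): render the triples as a Graphviz DOT digraph string."""
    lines = ["digraph pathway {"]
    for src, verb, dst in triples:
        lines.append(f'  "{src}" -> "{dst}" [label="{verb}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Stage 1 (stub): invented sentences stand in for retrieved abstracts.
    abstracts = [
        "Our data show that ProtA activates ProtB.",
        "In turn, ProtB binds to ProtC.",
    ]
    print(to_dot(extract_interactions(abstracts)))
```

The DOT text can be passed to any Graphviz renderer to draw the pathway. A simple pattern like this will miss passive constructions, multi-word protein names and negations, which is why it should be read only as a sketch of the pipeline's shape.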