Müller Hans-Michael, Rangarajan Arun, Teal Tracy K, Sternberg Paul W
Division of Biology and Howard Hughes Medical Institute, California Institute of Technology, Pasadena, CA, USA.
Neuroinformatics. 2008 Sep;6(3):195-204. doi: 10.1007/s12021-008-9031-0. Epub 2008 Oct 24.
Textpresso is a text-mining system for scientific literature. Its two major features are access to the full text of research papers and the development and use of categories of biological concepts, together with categories that describe or relate objects. A search engine lets the user search an entire literature for one or a combination of these categories and/or keywords. Here we describe Textpresso for Neuroscience, part of the core Neuroscience Information Framework (NIF). The Textpresso site currently contains 67,500 full-text papers and 131,300 abstracts. We show that using categories in literature searches can make a pure keyword query more refined and meaningful. We also show how semantic queries can be formulated with categories only. We explain how the database is built and what it contains, and describe the main features of the web pages and the advanced search options. We also give detailed illustrations of the web service developed to provide programmatic access to Textpresso. This web service is used by the NIF interface to access Textpresso. The standalone website of Textpresso for Neuroscience can be accessed at http://www.textpresso.org/neuroscience/.
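The idea of refining a keyword query with categories, or querying with categories alone, can be illustrated with a small toy sketch. This is not Textpresso's implementation or its web service interface: the category names, their member terms, the example sentences, and the `search` function are all hypothetical and chosen only for illustration. A sentence matches when it contains every keyword and at least one term from every requested category.

```python
import re

# Hypothetical toy lexicon: category name -> member terms.
# These are illustrative only, not Textpresso's actual categories.
CATEGORIES = {
    "brain_region": {"hippocampus", "amygdala", "cerebellum"},
    "regulation": {"regulates", "inhibits", "activates"},
}

# Made-up example sentences standing in for full-text content.
SENTENCES = [
    "BDNF regulates synaptic plasticity in the hippocampus.",
    "BDNF levels were measured in serum.",
    "Dopamine inhibits firing in the amygdala.",
]


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def search(sentences, keywords=(), categories=()):
    """Return sentences containing all keywords and, for each requested
    category, at least one of that category's terms."""
    hits = []
    for sentence in sentences:
        tokens = tokenize(sentence)
        has_keywords = all(k.lower() in tokens for k in keywords)
        has_categories = all(tokens & CATEGORIES[c] for c in categories)
        if has_keywords and has_categories:
            hits.append(sentence)
    return hits


# Pure keyword query: matches both sentences that mention BDNF.
keyword_only = search(SENTENCES, keywords=["BDNF"])

# Keyword plus category: keeps only the sentence that also names a brain region.
refined = search(SENTENCES, keywords=["BDNF"], categories=["brain_region"])

# Categories only: a simple "semantic" query for regulation in a brain region.
semantic = search(SENTENCES, categories=["regulation", "brain_region"])
```

In this sketch the keyword-only query returns two sentences, adding the `brain_region` category narrows it to one, and the category-only query finds sentences about regulation in a brain region without naming any specific gene or region.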