BOSS: context-enhanced search for biomedical objects.

Affiliation

Department of Computer Science, Korea University, Seoul, Korea.

Publication information

BMC Med Inform Decis Mak. 2012 Apr 30;12 Suppl 1(Suppl 1):S7. doi: 10.1186/1472-6947-12-S1-S7.

Abstract

BACKGROUND

Many academic search solutions exist, and most of them fall at one of two ends of a spectrum: general-purpose search systems and domain-specific "deep" search systems. General-purpose search systems, such as PubMed, offer a flexible query interface but return a list of matching documents that users must read through to find the answers to their queries. "Deep" search systems, such as PPI Finder and iHOP, instead return precompiled results in a structured way. Their results, however, are often available only within some predefined contexts. To alleviate these problems, we introduce a new search engine, BOSS (Biomedical Object Search System).

METHODS

Unlike conventional search systems, BOSS indexes segments rather than documents. A segment is a Maximal Coherent Semantic Unit (MCSU), such as a phrase, clause, or sentence, that is semantically coherent in the given context (e.g., biomedical objects or their relations). For a user query, BOSS finds all matching segments, identifies the objects appearing in those segments, and aggregates the segments for each object. Finally, it returns a ranked list of the objects along with their matching segments.
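The pipeline described above (index segments, match the query, identify objects, aggregate, rank) can be illustrated with a minimal sketch. This is not the BOSS implementation. Several things here are assumptions made only for illustration: each sentence is treated as one segment (a simplification of an MCSU), objects are recognized with a small hypothetical dictionary named KNOWN_OBJECTS, a query matches a segment only when all query terms occur in it, and objects are ranked by the number of matching segments. The abstract does not specify how BOSS handles any of these steps.

```python
import re
from collections import defaultdict

# Hypothetical object dictionary (an assumption for this sketch):
# lowercase surface form -> canonical object name.
KNOWN_OBJECTS = {"brca1": "BRCA1", "tp53": "TP53", "rad51": "RAD51"}

# Made-up sample abstracts, used only to exercise the sketch.
SAMPLE_ABSTRACTS = [
    "BRCA1 is involved in DNA repair. TP53 regulates apoptosis.",
    "RAD51 interacts with BRCA1 during DNA repair. TP53 is mutated in many tumors.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_segments(abstract):
    """Split an abstract into sentences; each sentence stands in for a segment."""
    parts = re.split(r"(?<=[.!?])\s+", abstract)
    return [p.strip() for p in parts if p.strip()]


def build_index(abstracts):
    """Build an inverted index whose postings are segment ids, not document ids."""
    segments = []
    index = defaultdict(set)
    for abstract in abstracts:
        for segment in split_segments(abstract):
            segment_id = len(segments)
            segments.append(segment)
            for token in tokenize(segment):
                index[token].add(segment_id)
    return segments, index


def search(query, segments, index):
    """Return (object, matching segments) pairs, most-supported object first."""
    terms = tokenize(query)
    if not terms:
        return []
    # Assumed AND semantics: a segment must contain every query term.
    matching = set.intersection(*(index.get(t, set()) for t in terms))
    by_object = defaultdict(list)
    for segment_id in sorted(matching):
        seen = set()
        for token in tokenize(segments[segment_id]):
            obj = KNOWN_OBJECTS.get(token)
            if obj is not None and obj not in seen:
                seen.add(obj)
                by_object[obj].append(segments[segment_id])
    # Assumed ranking: more matching segments first, ties broken by name.
    return sorted(by_object.items(), key=lambda item: (-len(item[1]), item[0]))


if __name__ == "__main__":
    segments, index = build_index(SAMPLE_ABSTRACTS)
    for obj, matched in search("DNA repair", segments, index):
        print(obj, matched)
```

Because the postings hold segment ids instead of document ids, the index has the same shape as a standard document index, which is consistent with the scalability argument made in the conclusion.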

RESULTS

A working prototype of BOSS is available at http://boss.korea.ac.kr. The current version of BOSS has indexed the abstracts of more than 20 million articles published over the 16 years from 1996 to 2011 across all science disciplines.

CONCLUSION

BOSS fills the gap between the two ends of the spectrum by allowing users to pose context-free queries and by returning a structured set of results. Furthermore, BOSS scales well, as conventional document search engines do, because it is designed to use a standard document-indexing model with minimal modifications. With these features, BOSS advances the state of traditional solutions for searching biomedical information.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a86/3339395/dd081661e130/1472-6947-12-S1-S7-1.jpg
