MedScan, a natural language processing engine for MEDLINE abstracts.

Author information

Novichkova Svetlana, Egorov Sergei, Daraselia Nikolai

Affiliation

Ariadne Genomics, Inc, 9100 Great Seneca HWY, Rockville, MD 20850, USA.

Publication information

Bioinformatics. 2003 Sep 1;19(13):1699-706. doi: 10.1093/bioinformatics/btg207.

Abstract

MOTIVATION

The importance of extracting biomedical information from scientific publications is well recognized. A number of information extraction systems for the biomedical domain have been reported, but none of them have become widely used in practical applications. Most proposals to date make rather simplistic assumptions about the syntactic aspect of natural language. There is an urgent need for a system that has broad coverage and performs well in real-text applications.

RESULTS

We present a general biomedical domain-oriented NLP engine called MedScan that efficiently processes sentences from MEDLINE abstracts and produces a set of regularized logical structures representing the meaning of each sentence. The engine utilizes a specially developed context-free grammar and lexicon. Preliminary evaluation of the system's performance, accuracy, and coverage exhibited encouraging results. Further approaches for increasing the coverage and reducing parsing ambiguity of the engine, as well as its application for information extraction are discussed.

