Furrer Lenz, Jancso Anna, Colic Nicola, Rinaldi Fabio

Institute of Computational Linguistics, University of Zurich, Andreasstr. 15, 8050, Zürich, Switzerland.

Fondazione Bruno Kessler, Via Sommarive, 18, 38123, Trento, Italy.

J Cheminform. 2019 Jan 21;11(1):7. doi: 10.1186/s13321-018-0326-3.

BACKGROUND

We present a text-mining tool for recognizing biomedical entities in scientific literature. OGER++ is a hybrid system for named entity recognition and concept recognition (linking), which combines a dictionary-based annotator with a corpus-based disambiguation component. The annotator uses an efficient look-up strategy combined with a normalization method for matching spelling variants. The disambiguation classifier is implemented as a feed-forward neural network which acts as a postfilter to the previous step.
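The dictionary look-up with spelling-variant normalization and a postfilter can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the OGER++ implementation: the normalization rule (lowercasing and splitting at letter/digit boundaries and punctuation), the greedy longest-match strategy, the function names, and the `EX:` concept identifiers are all hypothetical. The `keep` callable is only a stand-in for the feed-forward neural network postfilter described in the abstract.

```python
import re
from typing import Callable, Dict, List, NamedTuple, Tuple

# Assumed tokenization: runs of letters or runs of digits; everything else separates.
TOKEN = re.compile(r"[A-Za-z]+|[0-9]+")


class Annotation(NamedTuple):
    start: int
    end: int
    text: str
    concept_id: str


def normalize(term: str) -> Tuple[str, ...]:
    """Map spelling variants such as 'IL-2', 'IL2' and 'il 2' to one key."""
    return tuple(m.group().lower() for m in TOKEN.finditer(term))


def build_index(terminology: List[Tuple[str, str]]) -> Dict[Tuple[str, ...], List[str]]:
    """Index (concept_id, name) pairs by normalized name."""
    index: Dict[Tuple[str, ...], List[str]] = {}
    for concept_id, name in terminology:
        index.setdefault(normalize(name), []).append(concept_id)
    return index


def annotate(
    text: str,
    index: Dict[Tuple[str, ...], List[str]],
    keep: Callable[[Annotation], bool] = lambda ann: True,
) -> List[Annotation]:
    """Greedy longest-match look-up; `keep` is a placeholder postfilter."""
    tokens = list(TOKEN.finditer(text))
    max_len = max((len(key) for key in index), default=0)
    found: List[Annotation] = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.group().lower() for t in tokens[i:i + n])
            if key in index:
                start, end = tokens[i].start(), tokens[i + n - 1].end()
                for concept_id in index[key]:
                    ann = Annotation(start, end, text[start:end], concept_id)
                    if keep(ann):
                        found.append(ann)
                i += n  # continue after the matched span
                break
        else:
            i += 1  # no entry starts at this token
    return found


# Hypothetical terminology entries, for illustration only.
terminology = [("EX:0001", "interleukin-2"), ("EX:0002", "IL-2"), ("EX:0003", "T cell")]
index = build_index(terminology)
annotations = annotate("IL2 and Interleukin 2 activate each T-cell.", index)
```

In this sketch the postfilter sees one candidate at a time and returns a keep/discard decision; a trained classifier would take the place of the default `keep` function.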

RESULTS

We evaluated the system in terms of processing speed and annotation quality. In the speed benchmarks, the OGER++ web service processes 9.7 abstracts or 0.9 full-text documents per second. On the CRAFT corpus, we achieved 71.4% and 56.7% F1 for named entity recognition and concept recognition, respectively.
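For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below shows the standard computation from true-positive, false-positive and false-negative counts. The counts in the usage line are hypothetical and are not taken from the CRAFT evaluation; the abstract reports only the resulting F1 values.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    predicted = true_pos + false_pos
    expected = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / expected if expected else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
example = f1_score(true_pos=8, false_pos=2, false_neg=2)
```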

CONCLUSIONS

Combining knowledge-based and data-driven components allows creating a system with competitive performance in biomedical text mining.

OGER++: hybrid multi-type entity recognition.
