Final Report on the German Clinical Reference Corpus 3000PA.

Affiliations

Jena University Language & Information Engineering (JULIE) Lab, Friedrich-Schiller-Universität Jena, Jena, Germany.

Medizinische Informatik, TU München, München, Germany.

Publication information

Stud Health Technol Inform. 2024 Jan 25;310:599-603. doi: 10.3233/SHTI231035.

Abstract

We report on one of the outcomes of the Medical Informatics Initiative (MII), a large-scale German research program that aims to develop a solid data and software infrastructure for German-language clinical natural language processing. Within this framework, we have developed 3000PA, a national clinical reference corpus composed of patient records from three clinical university sites and annotated with multiple semantic annotation layers (medical named entities, semantic and temporal relations between entities, and certainty and negation information related to entities and relations). This non-sharable corpus is complemented by three sharable ones (JSYNCC, GGPONC, and GRASCCO). Overall, 3000PA, JSYNCC, and GRASCCO feature about 2.1 million metadata points.
