Ndangang Marie, Grosjean Julien, Lelong Romain, Dahamna Badisse, Kergourlay Ivan, Griffon Nicolas, Darmoni Stéfan J
Department of Biomedical Informatics, Rouen University Hospital, Normandy, France.
Stud Health Technol Inform. 2018;255:20-24.
Unstructured health documents (e.g. discharge summaries) represent an important and unavoidable source of information.
A semantic annotator identified all the concepts present in the health documents from the clinical data warehouse of the Rouen University Hospital.
A total of 2,087,784,055 annotations were generated from a corpus of about 11.9 million documents, an average of 175 annotations per document. SNOMED CT, NCIt and MeSH were the three terminologies that yielded the most annotations.
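The reported average can be checked against the two totals with a short calculation. This is a minimal sketch: the function name is illustrative, and the document count is the rounded "about 11.9 million" figure, since the exact count is not stated here.

```python
def average_per_document(total_annotations: int, document_count: int) -> float:
    """Return the mean number of annotations per document."""
    return total_annotations / document_count


# Total annotations as reported in the abstract.
total_annotations = 2_087_784_055
# Assumption: "about 11.9 million" taken as exactly 11,900,000 documents.
approx_documents = 11_900_000

average = average_per_document(total_annotations, approx_documents)
print(f"{average:.1f} annotations per document")  # prints "175.4 annotations per document"
```

With the rounded document count, the result agrees with the stated average of 175 annotations per document.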
As expected, the most general terminologies, which were also those with the most translated concepts, were the ones with the most concepts identified.