A survey of word embeddings for clinical text.

Authors

Khattak Faiza Khan, Jeblee Serena, Pou-Prom Chloé, Abdalla Mohamed, Meaney Christopher, Rudzicz Frank

Affiliations

Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada; Li Ka Shing Knowledge Institute, St Michael's Hospital, Toronto, Ontario, Canada.

Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada.

Publication information

J Biomed Inform. 2019;100S:100057. doi: 10.1016/j.yjbinx.2019.100057. Epub 2019 Oct 28.

DOI:10.1016/j.yjbinx.2019.100057

PMID:34384583

Abstract

Representing words as numerical vectors based on the contexts in which they appear has become the de facto method of analyzing text with machine learning. In this paper, we provide a guide for training these representations on clinical text data, using a survey of relevant research. Specifically, we discuss different types of word representations, clinical text corpora, available pre-trained clinical word vector embeddings, intrinsic and extrinsic evaluation, applications, and limitations of these approaches. This work can be used as a blueprint for clinicians and healthcare workers who may want to incorporate clinical text features in their own models and applications.

