Deng Yihan, Faulstich Lukas, Denecke Kerstin
Bern University of Applied Sciences, Bern, Switzerland.
ID Information and Documentation GmbH, Berlin, Germany.
Stud Health Technol Inform. 2017;245:1260.
Automatic encoding of diagnoses and procedures can increase the interoperability and efficacy of clinical cooperation. Concept-based, rule-based, and machine learning classification methods for automatic code generation can quickly reach their limits because they depend on handcrafted rules and on a concept library with limited vocabulary coverage. As a first step toward applying deep learning methods to automatic encoding in the clinical domain, a suitable semantic representation must be generated. In this work, we focus on embedding mechanisms and dimensionality reduction methods for text representation, which mitigate the sparseness of the input data in the clinical domain. Different methods, such as word embedding and random projection, will be evaluated on logs of query-document matching.
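To illustrate how random projection can turn sparse text input into a dense, low-dimensional representation for query-document matching, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the projection type, the target dimension, or the similarity measure, so the Gaussian projection matrix, `target_dim=16`, whitespace tokenization, cosine similarity, and the example texts are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def build_vocabulary(texts):
    """Map every distinct lowercase token to a column index."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def make_projection(vocab_size, target_dim, seed=0):
    """Draw a Gaussian random matrix with one row of length target_dim per term."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(target_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(target_dim)]
            for _ in range(vocab_size)]


def project(text, vocab, projection):
    """Project the sparse term-count vector of `text` into the dense space."""
    target_dim = len(projection[0])
    dense = [0.0] * target_dim
    # Only terms that occur in the text contribute, so sparsity is exploited.
    counts = Counter(t for t in text.lower().split() if t in vocab)
    for token, count in counts.items():
        row = projection[vocab[token]]
        for j in range(target_dim):
            dense[j] += count * row[j]
    return dense


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical documents and query, invented for this sketch.
documents = [
    "acute appendicitis with peritonitis",
    "fracture of the femoral neck",
    "type 2 diabetes mellitus without complications",
]
query = "acute appendicitis"

vocab = build_vocabulary(documents + [query])
projection = make_projection(len(vocab), target_dim=16)

doc_vectors = [project(d, vocab, projection) for d in documents]
query_vector = project(query, vocab, projection)

# Rank documents by similarity to the query in the projected space.
scores = [cosine(query_vector, dv) for dv in doc_vectors]
for doc, score in sorted(zip(documents, scores), key=lambda p: -p[1]):
    print(f"{score:+.3f}  {doc}")
```

With a realistic clinical vocabulary of many thousands of terms, the target dimension would be far smaller than the vocabulary size. The projection approximately preserves distances between the sparse vectors only in a probabilistic sense, so rankings produced by a sketch like this can differ from those computed on the original vectors.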