Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.
Department of Computer Science, University of Massachusetts, Lowell, Massachusetts, USA.
J Am Med Inform Assoc. 2014 Sep-Oct;21(5):842-9. doi: 10.1136/amiajnl-2013-002133. Epub 2014 Jan 17.
Objective: To evaluate state-of-the-art unsupervised methods on the word sense disambiguation (WSD) task in the clinical domain. In particular, to compare graph-based approaches relying on a clinical knowledge base with bottom-up topic-modeling-based approaches. We investigate several enhancements to the topic-modeling techniques that use domain-specific knowledge sources.
Methods: The graph-based methods use variations of PageRank and distance-based similarity metrics, operating over the Unified Medical Language System (UMLS). Topic-modeling methods use unlabeled data from the Multiparameter Intelligent Monitoring in Intensive Care (MIMIC II) database to derive models for each ambiguous word. We investigate the impact of using different linguistic features for topic models, including UMLS-based and syntactic features. We use a sense-tagged clinical dataset from the Mayo Clinic for evaluation.
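The graph-based idea can be illustrated with a minimal sketch of personalized PageRank: rank mass is teleported only to the concepts found in the context, and the candidate sense that accumulates the most rank is selected. This is an illustration under stated assumptions, not the authors' implementation. The function names, the damping value, and the toy graph are hypothetical, and the node labels are invented stand-ins rather than real UMLS concept identifiers.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power iteration for PageRank with teleportation restricted to `seeds`.

    `edges` is a list of undirected (node, node) pairs.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(neighbors)

    # Teleport mass is spread uniformly over the context (seed) concepts.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)

    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbor passes on an equal share of its current rank.
            incoming = sum(rank[m] / len(neighbors[m]) for m in neighbors[n])
            new_rank[n] = (1 - damping) * teleport[n] + damping * incoming
        rank = new_rank
    return rank


def disambiguate(edges, context_concepts, candidate_senses):
    """Pick the candidate sense with the highest personalized PageRank."""
    rank = personalized_pagerank(edges, set(context_concepts))
    return max(candidate_senses, key=lambda sense: rank.get(sense, 0.0))


# Hypothetical toy graph for the ambiguous word "cold" (not real UMLS data).
toy_edges = [
    ("common_cold", "cough"),
    ("common_cold", "fever"),
    ("common_cold", "viral_infection"),
    ("viral_infection", "fever"),
    ("cold_temperature", "hypothermia"),
    ("cold_temperature", "weather"),
]
chosen = disambiguate(
    toy_edges,
    context_concepts=["cough", "fever"],
    candidate_senses=["common_cold", "cold_temperature"],
)
print(chosen)
```

In this toy graph the temperature sense sits in a component that receives no teleport mass, so the disease sense is selected. A real knowledge-base graph is far denser, which is one reason such rankings can be noisy.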
Results: The topic-modeling methods achieve 66.9% accuracy on a subset of the Mayo Clinic's data, while the graph-based methods reach only 40-50%, below the most-frequent-sense baseline of 56.5%. Features derived from the UMLS semantic type and concept hierarchies do not produce a gain over bag-of-words features in the topic models, but identifying phrases from UMLS and using syntax does help.
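For readers unfamiliar with the most-frequent-sense baseline, the sketch below shows one common way to compute it: every instance of an ambiguous word is assigned that word's most common gold sense, and accuracy is pooled over all instances. Whether the reported 56.5% was pooled over instances or averaged per word is not stated here, so the pooling choice is an assumption, and the function name and toy labels are hypothetical.

```python
from collections import Counter


def most_frequent_sense_accuracy(gold_by_word):
    """Accuracy of always predicting each word's most frequent gold sense.

    `gold_by_word` maps an ambiguous word to the list of gold sense labels
    of its instances. Accuracy is pooled over all instances (an assumption).
    """
    correct = 0
    total = 0
    for senses in gold_by_word.values():
        if not senses:
            continue
        # Count of the single most common sense = instances the baseline gets right.
        correct += Counter(senses).most_common(1)[0][1]
        total += len(senses)
    return correct / total if total else 0.0


# Hypothetical sense-tagged instances (not taken from the Mayo Clinic dataset).
toy_gold = {
    "cold": ["disease", "disease", "temperature"],
    "ra": ["arthritis", "atrium", "arthritis", "arthritis"],
}
baseline = most_frequent_sense_accuracy(toy_gold)
print(f"{baseline:.3f}")
```

An unsupervised method is only useful in practice if it beats this baseline, which is the comparison the results above report.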
Discussion: Although topic models outperform graph-based methods, semantic features derived from the UMLS prove too noisy to improve performance beyond bag-of-words.
Conclusions: Topic modeling for WSD provides superior results in the clinical domain; however, effective integration of domain knowledge into these models remains an open problem.