Chiao Yun-Chuang, Zweigenbaum P
STIM/DSI, Assistance Publique - Hôpitaux de Paris, Paris Cedex 13, 75634, France. [ycc, pz]@biomath.jussieu.fr
Proc AMIA Symp. 2002:150-4.
Cross-language retrieval of medical information requires translating input queries into the target language, and it must cope with 'new' words not yet listed in a multilingual lexicon. We address the problem of finding translational equivalents of such 'unknown' words from French to English in the medical domain. We rely on non-parallel, comparable corpora and an initial bilingual medical lexicon. We compare the distributional contexts of source and target words, testing several weighting factors and similarity measures. For the best combination (the Jaccard similarity measure, with or without weighting), the correct translation is found among the top 10 candidates for more than 60% of the test words. This shows the potential of this technique to help extend bilingual medical lexicons.
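The context-comparison idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window size, the toy sentences, the seed lexicon, and all function names are assumptions made for illustration, and the min/max form of the weighted Jaccard measure is one common generalization that may differ from the formula used in the paper. Each word gets a vector of co-occurring words, the source vector is projected into English through the seed lexicon, and English candidates are ranked by similarity.

```python
from collections import Counter, defaultdict


def context_vectors(sentences, window=2):
    """Map each word to a Counter of words seen within `window` tokens of it.
    The window size is an assumption; the abstract does not specify one."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def translate_vector(vector, lexicon):
    """Project a source-language context vector into the target language.
    Context words absent from the seed lexicon are dropped."""
    translated = Counter()
    for word, count in vector.items():
        if word in lexicon:
            translated[lexicon[word]] += count
    return translated


def weighted_jaccard(u, v):
    """Weighted Jaccard: sum of minima over sum of maxima (assumed variant)."""
    keys = set(u) | set(v)
    denom = sum(max(u[k], v[k]) for k in keys)
    if denom == 0:
        return 0.0
    return sum(min(u[k], v[k]) for k in keys) / denom


def rank_candidates(source_word, src_vectors, tgt_vectors, lexicon, top_n=10):
    """Return the top_n target words most similar to the translated source context."""
    query = translate_vector(src_vectors[source_word], lexicon)
    scored = [(weighted_jaccard(query, vec), cand) for cand, vec in tgt_vectors.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(cand, score) for score, cand in scored[:top_n]]


# Hypothetical toy data standing in for comparable (non-parallel) corpora.
french = [
    ["le", "patient", "présente", "une", "tachycardie", "sévère"],
    ["tachycardie", "et", "fièvre", "chez", "le", "patient"],
]
english = [
    ["the", "patient", "has", "severe", "tachycardia"],
    ["tachycardia", "and", "fever", "in", "the", "patient"],
    ["the", "patient", "has", "a", "fracture"],
]
# Hypothetical seed lexicon; "tachycardie" is deliberately the unknown word.
lexicon = {"patient": "patient", "sévère": "severe", "fièvre": "fever",
           "le": "the", "et": "and"}

ranked = rank_candidates("tachycardie", context_vectors(french),
                         context_vectors(english), lexicon)
for candidate, score in ranked:
    print(f"{candidate}\t{score:.3f}")
```

On real corpora the raw counts would typically be replaced by a weighting such as tf-idf or log-likelihood before the comparison, which corresponds to the "weighting factors" the abstract mentions.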