Pereira Suzanne, Névéol Aurélie, Kerdelhué Gaétan, Serrot Elisabeth, Joubert Michel, Darmoni Stéfan J
CISMeF, LITIS EA 4108, University of Rouen, France.
AMIA Annu Symp Proc. 2008 Nov 6;2008:586-90.
To assist with the development of a French online quality-controlled health gateway (CISMeF), an automatic indexing tool was created to assign MeSH descriptors to medical text in French. The French Multi-Terminology Indexer (F-MTI) relies on a multi-terminology approach involving four prominent medical terminologies and the mappings between them.
In this paper, we compare lemmatization and stemming as methods to process French medical text for indexing. We also evaluate the multi-terminology approach implemented in F-MTI.
The indexing strategies were assessed on a corpus of 18,814 resources indexed manually.
Indexing performance differs little between lemmatization and stemming. However, the multi-terminology approach outperforms indexing that relies on a single terminology in terms of recall.
F-MTI will soon be used in the CISMeF production environment and in a Health MultiTerminology Server in French.
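The recall comparison reported above can be illustrated with a minimal sketch of how automatic indexing is typically scored against manual indexing. The function name `precision_recall` and the descriptor sets are hypothetical and chosen only for illustration; they are not taken from the paper's corpus or results, and the paper's exact evaluation procedure may differ.

```python
def precision_recall(predicted, reference):
    """Score a set of automatically assigned descriptors against a manual reference set."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Precision: share of assigned descriptors that the manual indexer also chose.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of manually chosen descriptors that were recovered automatically.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical example data (not from the paper).
manual = {"asthma", "child", "anti-asthmatic agents", "therapeutics"}
single_terminology = {"asthma", "child"}
multi_terminology = {"asthma", "child", "anti-asthmatic agents", "hospitalization"}

print(precision_recall(single_terminology, manual))
print(precision_recall(multi_terminology, manual))
```

In this toy case the multi-terminology set recovers one more manual descriptor, which raises recall, while the extra unmatched descriptor lowers precision. This is the kind of trade-off such an evaluation is designed to expose.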