MedEval - A Swedish medical test collection with doctors and patients user groups.

Author

Heppin Karin Friberg

Affiliation

NLP-Unit, Department of Swedish, University of Gothenburg, S-405 30 Gothenburg, Sweden.

Publication information

J Biomed Semantics. 2011;2 Suppl 3(Suppl 3):S4. doi: 10.1186/2041-1480-2-S3-S4. Epub 2011 Jul 14.

DOI:10.1186/2041-1480-2-S3-S4
PMID:21992659
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3194176/
Abstract

BACKGROUND

Test collections for information retrieval are scarce. Domain-specific test collections are scarcer still, and medical test collections in Swedish did not exist before MedEval was built. Most research in information retrieval has been performed on English, so most test collections contain English documents. However, English is morphologically poor compared to many other European languages, and a number of interesting and important aspects have therefore not been investigated. Building a medical test collection in Swedish opens new research opportunities.

METHODS

This article describes the construction and potential uses of MedEval, a Swedish medical test collection with assessments not only for topical relevance but also for target reader group: Doctors or Patients. A user of the test collection may choose to search in the Doctors or the Patients scenario, where the topical relevance assessments have been adjusted to take the user group into account, or in a scenario that considers only topical relevance.

In addition to these three user scenarios, MedEval in its present form has two indexes: one where the terms are lemmatized, and one where the terms are lemmatized, the compounds are split, and the constituents are indexed together with the whole compound.
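The difference between the two indexes can be illustrated with a minimal Python sketch. It is not the MedEval implementation: the lemma table, the compound table, the function names, and the three one-word documents are all hypothetical stand-ins for a real Swedish lemmatizer and compound splitter.

```python
from collections import defaultdict

# Hypothetical toy resources (assumptions for illustration only).
# A real system would use a Swedish lemmatizer and compound splitter.
LEMMAS = {
    "hjärtinfarkten": "hjärtinfarkt",
    "infarkter": "infarkt",
    "blodtrycket": "blodtryck",
}
COMPOUNDS = {
    "hjärtinfarkt": ["hjärta", "infarkt"],
    "blodtryck": ["blod", "tryck"],
}


def lemmatize(token):
    """Map an inflected form to its lemma, falling back to the lowercased token."""
    token = token.lower()
    return LEMMAS.get(token, token)


def index_terms(token, split_compounds):
    """Return the terms under which a token is indexed."""
    lemma = lemmatize(token)
    terms = [lemma]
    if split_compounds:
        # Constituents are indexed together with the whole compound.
        terms.extend(COMPOUNDS.get(lemma, []))
    return terms


def build_index(docs, split_compounds=False):
    """Build an inverted index mapping each term to a set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            for term in index_terms(token, split_compounds):
                index[term].add(doc_id)
    return index


docs = {"d1": "hjärtinfarkten", "d2": "infarkter", "d3": "blodtrycket"}
lemma_index = build_index(docs)                        # lemmatized only
split_index = build_index(docs, split_compounds=True)  # lemmatized + split
```

In this toy setup, a query for `infarkt` matches only `d2` in the lemmatized index, but matches both `d1` and `d2` in the split index, because the compound `hjärtinfarkt` is also indexed under its constituents.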

RESULTS

Differences found between documents written for medical professionals and documents written for laypersons are presented. These differences may be used in further studies of the retrieval of documents aimed at particular groups of readers. For example, professional documents have a higher ratio of compounds, a greater average word length, and more multi-word expressions.

An experiment is described in which the user scenarios were used to search with expert terms and lay terms, separately and in combination. The tendency observed is that the medical expert gets the best results using expert terms. The layperson gets the best results using lay terms, but also quite good results using expert terms alone or lay and expert terms in combination.
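Two of the document features mentioned above, average word length and compound ratio, are simple to compute. The following Python sketch shows one possible way, assuming whitespace tokenization and a known compound lexicon. The function names, the lexicon, and the sample token lists are hypothetical and are not taken from the study.

```python
def average_word_length(tokens):
    """Mean number of characters per token (0.0 for an empty list)."""
    if not tokens:
        return 0.0
    return sum(len(t) for t in tokens) / len(tokens)


def compound_ratio(tokens, compound_lexicon):
    """Share of tokens that appear in a lexicon of known compounds."""
    if not tokens:
        return 0.0
    hits = sum(t.lower() in compound_lexicon for t in tokens)
    return hits / len(tokens)


# Hypothetical lexicon and token lists, for illustration only.
KNOWN_COMPOUNDS = {"hjärtinfarkt", "blodtryck"}
professional_tokens = "hjärtinfarkt blodtryck behandling".split()
lay_tokens = "ont i bröstet".split()

for name, tokens in [("professional", professional_tokens), ("lay", lay_tokens)]:
    print(name, average_word_length(tokens), compound_ratio(tokens, KNOWN_COMPOUNDS))
```

A lexicon lookup is the simplest possible compound detector. It misses any compound not listed, so a real study would more likely rely on a morphological analyzer.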

CONCLUSIONS

The many features of MedEval give a variety of research possibilities, such as comparing the effectiveness of search terms in retrieving documents aimed at the different user groups, or studying the effect of compound decomposition on document retrieval. Because Swedish, the language of MedEval, is morphologically more complex than English, it is possible to study additional aspects of the effect of natural language processing in information retrieval, for example the use of different inflectional word forms in the retrieval of expert versus lay documents. MedEval is the first Swedish test collection in the medical domain.

AVAILABILITY

The Department of Swedish at the University of Gothenburg is in the process of making the MedEval test collection available to academic researchers.
