Natural Language Search Interfaces: Health Data Needs Single-Field Variable Search.

Authors

Jay Caroline, Harper Simon, Dunlop Ian, Smith Sam, Sufi Shoaib, Goble Carole, Buchan Iain

Affiliation

Information Management Group, School of Computer Science, University of Manchester, Manchester, United Kingdom.

Publication details

J Med Internet Res. 2016 Jan 14;18(1):e13. doi: 10.2196/jmir.4912.

DOI:10.2196/jmir.4912
PMID:26769334
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4731680/
Abstract

BACKGROUND

Data discovery, particularly the discovery of key variables and their inter-relationships, is key to secondary data analysis and, in turn, to the evolving field of data science. Interface designers have presumed that their users are domain experts, and so they have provided complex interfaces to support these "experts." Such interfaces hark back to a time when searches needed to be accurate the first time, as there was a high computational cost associated with each search. Our work is part of a governmental research initiative between the medical and social research funding bodies to improve the use of social data in medical research.

OBJECTIVE

Given the cross-disciplinary nature of data science, no assumptions can be made regarding the domain expertise of a particular scientist, whose interests may intersect multiple domains. Here we consider the common requirement for scientists to seek archived data for secondary analysis. This has more in common with the search needs of the "Google generation" than with those of their single-domain, single-tool forebears. Our study compares a Google-like interface with traditional ways of searching for noncomplex health data in a data archive.

METHODS

Two user interfaces are evaluated for the same set of tasks in extracting data from surveys stored in the UK Data Archive (UKDA). One interface, Web search, is "Google-like," enabling users to browse, search for, and view metadata about study variables, whereas the other, traditional search, has a standard multioption user interface.
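To make the idea of a single-field variable search concrete, the following is a minimal sketch in Python. It is not the interface evaluated in the study: the `Variable` record, the catalogue entries, and the term-overlap ranking are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    name: str   # variable identifier within a survey (hypothetical)
    label: str  # human-readable description of the variable
    study: str  # survey the variable belongs to


# Hypothetical catalogue entries, invented for illustration only.
CATALOGUE = [
    Variable("smoke_now", "Whether respondent currently smokes cigarettes", "Survey A"),
    Variable("bmi", "Body mass index of respondent", "Survey A"),
    Variable("alc_units", "Units of alcohol consumed in the last week", "Survey B"),
    Variable("cig_day", "Number of cigarettes smoked per day", "Survey B"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(query, catalogue=CATALOGUE):
    """Rank variables by how many query terms appear in their name or label."""
    terms = set(tokenize(query))
    scored = []
    for var in catalogue:
        words = set(tokenize(var.name + " " + var.label))
        score = len(terms & words)
        if score:
            scored.append((score, var))
    # Highest overlap first; the sort is stable, so ties keep catalogue order.
    scored.sort(key=lambda pair: -pair[0])
    return [var for _, var in scored]
```

A free-text query such as `search("cigarettes smoked per day")` goes into one field and returns ranked variable metadata, in contrast to a form that asks the user to fill in several separate options before searching.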

RESULTS

Using a comprehensive set of tasks with 20 volunteers, we found that the Web search interface met data discovery needs and expectations better than the traditional search. A task × interface repeated measures analysis showed a main effect of interface, indicating that answers found through the Web search interface were more likely to be correct (F(1,19)=37.3, P<.001), along with a main effect of task (F(3,57)=6.3, P<.001). Further, participants completed the tasks significantly faster using the Web search interface (F(1,19)=18.0, P<.001). There was also a main effect of task on completion time (F(2,38)=4.1, P=.025, Greenhouse-Geisser correction applied). Participants were also asked to rate overall learnability, ease of use, and satisfaction. Paired mean comparisons showed that the Web search interface received significantly higher ratings than the traditional search interface for learnability (P=.002, 95% CI [0.6-2.4]), ease of use (P<.001, 95% CI [1.2-3.2]), and satisfaction (P<.001, 95% CI [1.8-3.5]). The results show superior cross-domain usability of Web search, which is consistent with its general familiarity and with its support for refining queries as the search proceeds, treating serendipity as part of the refinement.
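The paired mean comparisons reported above can be illustrated with a short sketch. The abstract does not state how the intervals were computed, so this assumes the usual approach: a confidence interval on the mean of the per-participant differences using Student's t. The ratings below are synthetic placeholders, not the study's data, and the function name `paired_ci` is hypothetical.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 19 degrees of freedom
# (20 participants); taken from standard t tables.
T_CRIT_DF19 = 2.093


def paired_ci(a, b, t_crit):
    """Return the mean paired difference (a - b) and its confidence interval."""
    diffs = [x - y for x, y in zip(a, b)]
    mean = statistics.mean(diffs)
    # Standard error of the mean difference, using the sample standard deviation.
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean, (mean - t_crit * se, mean + t_crit * se)


if __name__ == "__main__":
    # Synthetic ratings for 20 hypothetical participants (NOT the study's data).
    web = [6, 7, 5, 6, 7, 6, 5, 7, 6, 6, 7, 5, 6, 7, 6, 6, 5, 7, 6, 7]
    trad = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4, 5, 4, 4, 5, 3, 4, 4, 5, 4, 5]
    mean, (low, high) = paired_ci(web, trad, T_CRIT_DF19)
    print(f"mean difference {mean:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

An interval that excludes zero, as all three reported intervals do, indicates that the rating difference between the two interfaces is significant at the 5% level.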

CONCLUSIONS

The results provide clear evidence that data science should adopt single-field natural language search interfaces for variable search, supporting in particular: query reformulation; data browsing; faceted search; surrogates; relevance feedback; summarization, analytics, and visual presentation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd75/4731680/5cc6cb4197c0/jmir_v18i1e13_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd75/4731680/587c667ac2d6/jmir_v18i1e13_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd75/4731680/fe1b34ec73ef/jmir_v18i1e13_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd75/4731680/fc4b375ce66f/jmir_v18i1e13_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd75/4731680/ad064bb2fa4b/jmir_v18i1e13_fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd75/4731680/8b570afc66c9/jmir_v18i1e13_fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd75/4731680/8e4c1b28f1f5/jmir_v18i1e13_fig7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd75/4731680/c651132e2b3d/jmir_v18i1e13_fig8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd75/4731680/350a55881004/jmir_v18i1e13_fig9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd75/4731680/a5a742b0ec71/jmir_v18i1e13_fig10.jpg
Similar articles

1
Natural Language Search Interfaces: Health Data Needs Single-Field Variable Search.
J Med Internet Res. 2016 Jan 14;18(1):e13. doi: 10.2196/jmir.4912.
2
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
3
A study of the influence of task familiarity on user behaviors and performance with a MeSH term suggestion interface for PubMed bibliographic search.
Int J Med Inform. 2013 Sep;82(9):832-43. doi: 10.1016/j.ijmedinf.2013.04.005. Epub 2013 May 31.
4
Improving Access to Online Health Information With Conversational Agents: A Randomized Controlled Experiment.
J Med Internet Res. 2016 Jan 4;18(1):e1. doi: 10.2196/jmir.5239.
5
Searching for cancer information on the internet: analyzing natural language search queries.
J Med Internet Res. 2003 Dec 11;5(4):e31. doi: 10.2196/jmir.5.4.e31.
6
Comparing image search behaviour in the ARRS GoldMiner search engine and a clinical PACS/RIS.
J Biomed Inform. 2015 Aug;56:57-64. doi: 10.1016/j.jbi.2015.04.013. Epub 2015 May 19.
7
FacetMap: A scalable search and browse visualization.
IEEE Trans Vis Comput Graph. 2006 Sep-Oct;12(5):797-804. doi: 10.1109/TVCG.2006.142.
8
HERALD: A domain-specific query language for longitudinal health data analytics.
Int J Med Inform. 2024 Dec;192:105646. doi: 10.1016/j.ijmedinf.2024.105646. Epub 2024 Oct 5.
9
Clinician search behaviors may be influenced by search engine design.
J Med Internet Res. 2010 Jun 30;12(2):e25. doi: 10.2196/jmir.1396.
10
The BioPrompt-box: an ontology-based clustering tool for searching in biological databases.
BMC Bioinformatics. 2007 Mar 8;8 Suppl 1(Suppl 1):S8. doi: 10.1186/1471-2105-8-S1-S8.