Reducing systematic review workload through certainty-based screening.

Authors

Miwa Makoto, Thomas James, O'Mara-Eves Alison, Ananiadou Sophia

Affiliations

The National Centre for Text Mining and School of Computer Science, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester M1 7DN, UK; Toyota Technological Institute, 2-12-1 Hisakata, Tempaku-ku, Nagoya 468-8511, Japan.

Evidence for Policy and Practice Information and Coordinating (EPPI-)Centre, Social Science Research Unit, Institute of Education, University of London, London, UK.

Publication information

J Biomed Inform. 2014 Oct;51:242-53. doi: 10.1016/j.jbi.2014.06.005. Epub 2014 Jun 19.

Abstract

In systematic reviews, the growing number of published studies imposes a significant screening workload on reviewers. Active learning is a promising approach to reducing this workload by automating some of the screening decisions, but it has been evaluated in only a limited number of disciplines. The suitability of active learning for complex topics in disciplines such as social science has not been studied, and the selection of useful criteria and enhancements to address the data imbalance problem in systematic reviews remains an open problem. We applied active learning with two criteria (certainty and uncertainty) and several enhancements to data sets from both clinical medicine and social science (specifically, public health), and compared the results. The results show that the certainty criterion is useful for finding relevant documents, and that weighting positive instances is a promising way to overcome the data imbalance problem in both data sets. Latent Dirichlet allocation (LDA) is also shown to be promising when little manually assigned information is available. Active learning is effective for complex topics, although its efficiency is limited by the difficulty of text classification. The most promising criterion and weighting method are the same regardless of the review topic, and unsupervised techniques such as LDA may boost the performance of active learning without manual annotation.
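The two selection criteria and the positive-instance weighting described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the classifier, so a small dependency-free logistic regression stands in for it, and the function names (`train_logreg`, `select`, `screen`), the `pos_weight` parameter, and the toy corpus are hypothetical rather than taken from the paper.

```python
import math


def predict(w, x):
    """Probability that document x is relevant under weights w (last entry is the bias)."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, pos_weight=1.0, epochs=200, lr=0.5):
    """Fit a tiny logistic regression by batch gradient descent.

    pos_weight > 1 up-weights relevant (positive) examples, one simple way
    to counter the class imbalance typical of screening data.
    """
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            weight = pos_weight if yi == 1 else 1.0
            for j, xj in enumerate(xi):
                grad[j] += weight * err * xj
            grad[-1] += weight * err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def select(probs, criterion):
    """Return the position in probs of the next document to screen."""
    if criterion == "certainty":  # most confidently relevant document
        return max(range(len(probs)), key=lambda i: probs[i])
    if criterion == "uncertainty":  # document closest to the decision boundary
        return min(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    raise ValueError(f"unknown criterion: {criterion}")


def screen(X, labels, seed_idx, criterion="certainty", pos_weight=1.0, budget=4):
    """Simulate screening; labels stand in for the human reviewer's decisions."""
    labelled = list(seed_idx)
    pool = [i for i in range(len(X)) if i not in labelled]
    for _ in range(budget):
        if not pool:
            break
        w = train_logreg([X[i] for i in labelled],
                         [labels[i] for i in labelled], pos_weight)
        probs = [predict(w, X[i]) for i in pool]
        labelled.append(pool.pop(select(probs, criterion)))
    return labelled


# Toy corpus (made up): 5 relevant documents among 50, two features each.
X = [[1.0, 0.0]] * 5 + [[0.0, 1.0]] * 45
labels = [1] * 5 + [0] * 45
found = screen(X, labels, seed_idx=[0, 5], criterion="certainty", pos_weight=9.0)
print(sum(labels[i] for i in found), "relevant documents screened")
```

On this deliberately easy, perfectly separable toy corpus, the certainty criterion screens all remaining relevant documents first. Real screening data are noisier, so the sketch only illustrates the mechanics of the loop and says nothing about the performance reported in the paper.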

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7a6b/4199186/1a25220c3ec8/fx1.jpg