Ganguly Debasis, Deleris Léa A, Mac Aonghusa Pol, Wright Alison J, Finnerty Ailbhe N, Norris Emma, Marques Marta M, Michie Susan
IBM Research, Dublin, Ireland.
Centre for Behaviour Change, University College London, UK.
Stud Health Technol Inform. 2018;247:680-684.
This paper describes our approach to constructing a scalable system for unsupervised information extraction from the behaviour change intervention literature. Because many different types of attributes must be extracted, we adopt a passage-retrieval-based framework that returns the most likely value for each attribute. The proposed method handles variable-length passages and applies different validation criteria to the extracted values depending on the attribute being sought. We evaluate our approach against a manually annotated ground truth constructed from a set of 50 research papers reporting studies on smoking cessation.
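To make the idea of passage-retrieval-based extraction concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple scoring rule (the fraction of tokens in a passage that match the attribute's query terms), a fixed set of window sizes, and a per-attribute validator function; the names `extract_attribute`, `is_plausible_age`, and the example query terms are all hypothetical.

```python
import re
from typing import Callable, Iterable, List, Optional


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9]+", text.lower())


def extract_attribute(
    text: str,
    query_terms: Iterable[str],
    is_valid: Callable[[str], bool],
    window_sizes: Iterable[int] = (5, 10, 20),
) -> Optional[str]:
    """Return the most likely value for one attribute, or None.

    Assumed scoring: passages are token windows of several lengths; a
    window's score is the fraction of its tokens that are query terms.
    Only windows containing at least one token accepted by `is_valid`
    are considered. From the best window, the valid token closest to a
    query term is returned.
    """
    tokens = tokenize(text)
    query = {t.lower() for t in query_terms}
    best_score, best_value = 0.0, None

    for size in window_sizes:
        step = max(1, size // 2)  # overlapping windows
        last_start = max(0, len(tokens) - size)
        for start in range(0, last_start + 1, step):
            window = tokens[start:start + size]
            query_pos = [i for i, t in enumerate(window) if t in query]
            cand_pos = [i for i, t in enumerate(window) if is_valid(t)]
            if not query_pos or not cand_pos:
                continue
            score = len(query_pos) / len(window)
            if score > best_score:
                nearest = min(
                    cand_pos,
                    key=lambda i: min(abs(i - q) for q in query_pos),
                )
                best_score, best_value = score, window[nearest]
    return best_value


def is_plausible_age(token: str) -> bool:
    """Hypothetical validator for a 'minimum age' attribute."""
    return token.isdigit() and 10 <= int(token) <= 100


sample = (
    "Participants were recruited from 12 clinics. Eligible smokers had "
    "a minimum age of 18 years and smoked daily."
)
min_age = extract_attribute(sample, ["minimum", "age", "years"], is_plausible_age)
```

In this sketch, each attribute supplies its own query terms and validator, which is one way to realise attribute-specific validation criteria; trying several window sizes is a simple stand-in for variable-length passages.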