Ganguly Debasis, Hou Yufang, Deleris Léa A, Bonin Francesca
IBM Research - Ireland, Dublin, Ireland.
AMIA Jt Summits Transl Sci Proc. 2019 May 6;2019:182-191. eCollection 2019.
We describe an information extraction (IE) approach for populating a knowledge base with findings from scientific studies of behavior change interventions. In this paper, we focus on building a system that characterizes the specific intervention techniques used within behavior change intervention studies. We investigated three configurations of a general information retrieval based framework for information extraction: a) an unsupervised approach that relies on specifying a query for each attribute to be extracted, together with a few parameters for rule-based post-processing; b) a semi-supervised approach, which uses part of the ground-truth annotations as a training set to automatically learn an optimal representation of the queries; and c) a supervised approach that replaces the rule-based post-processing with a text classifier. To train and evaluate our system, we use a ground-truth dataset annotated by behavior science experts. This dataset consists of 226 research papers on smoking cessation.
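The unsupervised configuration (a) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the attribute names, the query terms, the sentence-level passages, the term-overlap score, and the `threshold` parameter are hypothetical stand-ins, since the abstract specifies neither the retrieval model nor the post-processing rules.

```python
import re

# Hypothetical attribute queries (one query per attribute to extract).
# The actual queries used by the authors are not given in the abstract.
ATTRIBUTE_QUERIES = {
    "goal_setting": "goal set target quit date plan",
    "nicotine_replacement": "nicotine patch gum replacement therapy",
}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def split_passages(document):
    """Naive passage segmentation: split after sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", document)
    return [p.strip() for p in parts if p.strip()]


def score(query, passage):
    """Fraction of distinct query terms that occur in the passage.

    This is a simple stand-in for a real retrieval model.
    """
    query_terms = set(tokenize(query))
    if not query_terms or passage is None:
        return 0.0
    passage_terms = set(tokenize(passage))
    return len(query_terms & passage_terms) / len(query_terms)


def extract(document, threshold=0.3):
    """Return the best-matching passage per attribute, or None.

    The threshold plays the role of an assumed rule-based
    post-processing step: low-scoring passages are rejected.
    """
    passages = split_passages(document)
    results = {}
    for attribute, query in ATTRIBUTE_QUERIES.items():
        best = max(passages, key=lambda p: score(query, p), default=None)
        results[attribute] = best if score(query, best) >= threshold else None
    return results


if __name__ == "__main__":
    text = ("Participants were asked to set a quit date as a goal. "
            "The weather was mild.")
    for attribute, passage in extract(text).items():
        print(attribute, "->", passage)
```

In this sketch, the semi-supervised configuration (b) would correspond to learning the contents of `ATTRIBUTE_QUERIES` from annotated examples, and the supervised configuration (c) to replacing the threshold rule in `extract` with a trained text classifier.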