Department of Biomedical Informatics, Columbia University, New York, NY, USA.
Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY, USA.
Stud Health Technol Inform. 2022 Jun 6;290:592-596. doi: 10.3233/SHTI220146.
Complex interventions are ubiquitous in healthcare. The lack of computational representations and information extraction solutions for complex interventions hinders accurate and efficient evidence synthesis. In this study, we manually annotated and analyzed 3,447 intervention snippets from 261 randomized clinical trial (RCT) abstracts and developed a compositional representation for complex interventions that captures the spatial, temporal, and Boolean relations between intervention components. We also developed an intervention normalization pipeline that automates three tasks: (i) treatment entity extraction; (ii) intervention component relation extraction; and (iii) attribute extraction and association. We evaluated the pipeline on 361 intervention snippets from 29 unseen abstracts. The average F-measure was 0.74 for treatment entity extraction under exact matching and 0.82 for attribute extraction. The F-measure for relation extraction of multi-component complex interventions was 0.90. Of the extracted attributes, 93% were correctly associated with their corresponding treatment entities.
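To make the idea of a compositional representation concrete, the following is a minimal sketch of how intervention components, their attributes, and their spatial, temporal, or Boolean relations might be encoded as a tree. The abstract does not give the actual schema, so the class names (`Treatment`, `Relation`), the field names, the operator labels (`AND`, `THEN`), and the example drugs are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Treatment:
    """A single treatment entity with its associated attributes (hypothetical schema)."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Relation:
    """An inner node that combines components (hypothetical schema).

    kind is one of "boolean", "temporal", or "spatial";
    operator is an illustrative label such as "AND", "OR", or "THEN".
    """
    kind: str
    operator: str
    operands: List["Node"]


Node = Union[Treatment, Relation]


def treatments(node: Node) -> List[str]:
    """Collect treatment names in left-to-right order."""
    if isinstance(node, Treatment):
        return [node.name]
    names: List[str] = []
    for child in node.operands:
        names.extend(treatments(child))
    return names


def render(node: Node) -> str:
    """Render the tree as a parenthesized expression."""
    if isinstance(node, Treatment):
        if not node.attributes:
            return node.name
        attrs = ", ".join(f"{k}={v}" for k, v in node.attributes.items())
        return f"{node.name}[{attrs}]"
    inner = f" {node.operator} ".join(render(c) for c in node.operands)
    return f"({inner})"


# Illustrative snippet: "drug A 10 mg plus drug B, followed by drug C"
example = Relation(
    kind="temporal",
    operator="THEN",
    operands=[
        Relation(
            kind="boolean",
            operator="AND",
            operands=[Treatment("drug A", {"dose": "10 mg"}), Treatment("drug B")],
        ),
        Treatment("drug C"),
    ],
)

print(render(example))
print(treatments(example))
```

A nested structure of this kind keeps each attribute attached to its own treatment entity, which mirrors the attribute association task, while the inner nodes carry the relations between components.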
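For readers unfamiliar with exact-match scoring, the sketch below shows one common way to compute an F-measure in which a predicted entity counts as correct only if its boundaries and label match a gold annotation exactly. The abstract does not specify the span format or the averaging scheme used, so the `(start, end, label)` tuple format, the function name, and the example spans are assumptions.

```python
from typing import Iterable, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, label): an assumed format


def exact_match_f_measure(gold: Iterable[Span], predicted: Iterable[Span]) -> float:
    """F-measure where a prediction is correct only if it equals a gold span exactly."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative data only: three gold entities, four predictions, two exact matches.
gold = [(0, 9, "treatment"), (14, 23, "treatment"), (30, 36, "treatment")]
predicted = [
    (0, 9, "treatment"),
    (14, 20, "treatment"),  # boundary mismatch, so not counted under exact matching
    (30, 36, "treatment"),
    (40, 45, "treatment"),  # spurious prediction
]

score = exact_match_f_measure(gold, predicted)
print(round(score, 4))
```

Here precision is 2/4 and recall is 2/3, giving an F-measure of 4/7. A partial-overlap criterion would credit the second prediction and yield a higher score, which is why the matching criterion is reported alongside the number.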