
Retrieval augmented scientific claim verification.

Authors

Liu Hao, Soroush Ali, Nestor Jordan G, Park Elizabeth, Idnay Betina, Fang Yilu, Pan Jane, Liao Stan, Bernard Marguerite, Peng Yifan, Weng Chunhua

Affiliations

School of Computing, Montclair State University, Montclair, NJ 07043, United States.

Department of Medicine, Columbia University, New York, NY 10027, United States.

Publication

JAMIA Open. 2024 Feb 21;7(1):ooae021. doi: 10.1093/jamiaopen/ooae021. eCollection 2024 Apr.

Abstract

OBJECTIVE

To automate scientific claim verification using PubMed abstracts.

MATERIALS AND METHODS

We developed CliVER, an end-to-end scientific claim verification system that leverages retrieval-augmented techniques to automatically retrieve relevant clinical trial abstracts, extract pertinent sentences, and use the PICO framework to support or refute a scientific claim. We also created an ensemble of three state-of-the-art deep learning models to classify rationales as support, refute, or neutral. We then constructed CoVERt, a new COVID claim verification dataset comprising 15 PICO-encoded drug claims accompanied by 96 manually selected and labeled clinical trial abstracts that either support or refute each claim. We used CoVERt and SciFact (a public scientific claim verification dataset) to assess CliVER's performance in predicting labels. Finally, we compared CliVER with clinicians in the verification of 19 claims from 6 disease domains, using 189 648 PubMed abstracts extracted from January 2010 to October 2021.
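The label-prediction step combines three deep learning models into an ensemble. The abstract does not specify the combination rule, so the following is a minimal sketch of one plausible scheme (simple majority vote with a deterministic tie-break); the model names and the `(claim, sentences) -> label` signature are hypothetical, not from the paper:

```python
from collections import Counter

# The three rationale labels used by the system, per the abstract.
LABELS = ("SUPPORT", "REFUTE", "NEUTRAL")

def ensemble_predict(claim, sentences, models):
    """Majority-vote over the label predictions of several models.

    `models` is a list of callables mapping (claim, sentences) -> label.
    Ties are broken by the fixed order of LABELS (a simplifying assumption).
    """
    votes = Counter(model(claim, sentences) for model in models)
    best = max(votes.values())
    # Deterministic tie-break: first label in LABELS with the top vote count.
    return next(label for label in LABELS if votes.get(label, 0) == best)

# Hypothetical stand-ins for the three deep learning models.
model_a = lambda claim, sents: "SUPPORT"
model_b = lambda claim, sents: "SUPPORT"
model_c = lambda claim, sents: "REFUTE"

label = ensemble_predict("Drug X reduces mortality", ["..."],
                         [model_a, model_b, model_c])
print(label)  # SUPPORT
```

A majority vote is the simplest way an ensemble can beat each member: a model's idiosyncratic error is outvoted whenever the other two agree, which is consistent with the 3% to 11% absolute F1 gain reported in the results.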

RESULTS

In the evaluation of label prediction accuracy on CoVERt, CliVER achieved a notable F1 score of 0.92, highlighting the efficacy of the retrieval-augmented models. The ensemble model outperformed each individual state-of-the-art model, with an absolute F1 increase of 3% to 11%. Moreover, when compared with four clinicians, CliVER achieved a precision of 79.0% for abstract retrieval, 67.4% for sentence selection, and 63.2% for label prediction.

CONCLUSION

CliVER demonstrates its early potential to automate scientific claim verification using retrieval-augmented strategies to harness the wealth of clinical trial abstracts in PubMed. Future studies are warranted to further test its clinical utility.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/add8/10919922/aaf3c775430a/ooae021f1.jpg
