Evidence triangulator: using large language models to extract and synthesize causal evidence across study designs.

Author information

Shi Xuanyu, Zhao Wenjing, Chen Ting, Yang Chao, Du Jian

Affiliations

Institute of Medical Technology, Peking University, Beijing, China.

National Institute of Health Data Science, Peking University, Beijing, China.

Publication information

Nat Commun. 2025 Aug 9;16(1):7355. doi: 10.1038/s41467-025-62783-x.

Abstract

Health strategies increasingly emphasize both behavioural and biomedical interventions, yet the complex and often contradictory guidance on diet, behaviour, and health outcomes complicates evidence-based decision-making. Evidence triangulation across diverse study designs is essential for balancing biases and establishing causality, but scalable, automated methods for achieving this are lacking. In this study, we assess the performance of large language models in extracting both ontological and methodological information from scientific literature to automate evidence triangulation. A two-step extraction approach (focusing on exposure-outcome concepts first, followed by relation extraction) outperforms a one-step method, particularly in identifying the direction of effect (F1 = 0.86) and statistical significance (F1 = 0.96). Using salt intake and blood pressure as a case study, we calculate the Convergency of Evidence and Level of Convergency, finding a strong excitatory effect of salt on blood pressure (942 studies) and a weak excitatory effect on cardiovascular diseases and deaths (124 studies). This approach complements traditional meta-analyses by integrating evidence across study designs and enabling rapid, dynamic assessment of scientific controversies.
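
For readers unfamiliar with the F1 values quoted above, the following is a minimal sketch of how a per-label F1 score (the harmonic mean of precision and recall) can be computed for a categorical field such as direction of effect. The label names, the toy data, and the function name `f1_for_label` are illustrative assumptions; the abstract does not state which labels or averaging scheme the authors used.

```python
def f1_for_label(gold, pred, label):
    """Per-label F1: harmonic mean of precision and recall.

    gold and pred are equal-length lists of labels; label is the class
    treated as positive. All names here are hypothetical.
    """
    pairs = list(zip(gold, pred))
    tp = sum(g == label and p == label for g, p in pairs)  # true positives
    fp = sum(g != label and p == label for g, p in pairs)  # false positives
    fn = sum(g == label and p != label for g, p in pairs)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (invented for illustration, not taken from the study).
gold = ["increase", "increase", "decrease", "no_change"]
pred = ["increase", "decrease", "decrease", "no_change"]

for lab in ("increase", "decrease", "no_change"):
    print(lab, round(f1_for_label(gold, pred, lab), 3))
```

An F1 of 0.96 for statistical significance would therefore mean that both precision and recall for that field are high, since the harmonic mean is pulled down by whichever of the two is lower.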
