Language models for data extraction and risk of bias assessment in complementary medicine.

Authors

Lai Honghao, Liu Jiayi, Bai Chunyang, Liu Hui, Pan Bei, Luo Xufei, Hou Liangying, Zhao Weilong, Xia Danni, Tian Jinhui, Chen Yaolong, Zhang Lu, Estill Janne, Liu Jie, Liao Xing, Shi Nannan, Sun Xin, Shang Hongcai, Bian Zhaoxiang, Yang Kehu, Huang Luqi, Ge Long

Affiliations

Department of Health Policy and Health Management, School of Public Health, Lanzhou University, Lanzhou, China.

Evidence-Based Social Science Research Center, School of Public Health, Lanzhou University, Lanzhou, China.

Publication information

NPJ Digit Med. 2025 Jan 31;8(1):74. doi: 10.1038/s41746-025-01457-w.

DOI: 10.1038/s41746-025-01457-w
PMID: 39890970
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11785717/

Abstract

Large language models (LLMs) have the potential to enhance evidence synthesis efficiency and accuracy. This study assessed LLM-only and LLM-assisted methods in data extraction and risk of bias assessment for 107 trials on complementary medicine. Moonshot-v1-128k and Claude-3.5-sonnet achieved high accuracy (≥95%), with LLM-assisted methods performing better (≥97%). LLM-assisted methods significantly reduced processing time (14.7 and 5.9 min vs. 86.9 and 10.4 min for conventional methods). These findings highlight LLMs' potential when integrated with human expertise.
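
The timing figures above can be turned into relative savings with simple arithmetic. The sketch below is illustrative only and is not part of the study. The function name `relative_reduction` is hypothetical, and pairing 14.7 with 86.9 min and 5.9 with 10.4 min (labeled here as data extraction and risk of bias assessment, respectively) is an assumption based on the order in which the abstract lists them.

```python
def relative_reduction(conventional_min: float, assisted_min: float) -> float:
    """Percentage of time saved relative to the conventional method."""
    if conventional_min <= 0:
        raise ValueError("conventional_min must be positive")
    return (conventional_min - assisted_min) / conventional_min * 100


# Times in minutes as reported in the abstract; the task labels are assumed.
reported = {
    "data extraction (assumed)": (86.9, 14.7),
    "risk of bias assessment (assumed)": (10.4, 5.9),
}

for task, (conventional, assisted) in reported.items():
    saved = relative_reduction(conventional, assisted)
    speedup = conventional / assisted
    print(f"{task}: {saved:.1f}% less time, {speedup:.1f}x faster")
```

Under these assumptions, the reported times correspond to reductions of roughly 83% and 43%.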

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a323/11785717/406ef6923c82/41746_2025_1457_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a323/11785717/40ba16cb335b/41746_2025_1457_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a323/11785717/94aaddc334ca/41746_2025_1457_Fig3_HTML.jpg

Similar articles

1
Language models for data extraction and risk of bias assessment in complementary medicine.
NPJ Digit Med. 2025 Jan 31;8(1):74. doi: 10.1038/s41746-025-01457-w.
2
Evaluating text and visual diagnostic capabilities of large language models on questions related to the Breast Imaging Reporting and Data System Atlas 5th edition.
Diagn Interv Radiol. 2025 Mar 3;31(2):111-129. doi: 10.4274/dir.2024.242876. Epub 2024 Sep 9.
3
Large language models for data extraction from unstructured and semi-structured electronic health records: a multiple model performance evaluation.
BMJ Health Care Inform. 2025 Jan 19;32(1):e101139. doi: 10.1136/bmjhci-2024-101139.
4
Examining the Role of Large Language Models in Orthopedics: Systematic Review.
J Med Internet Res. 2024 Nov 15;26:e59607. doi: 10.2196/59607.
5
Assessing the Risk of Bias in Randomized Clinical Trials With Large Language Models.
JAMA Netw Open. 2024 May 1;7(5):e2412687. doi: 10.1001/jamanetworkopen.2024.12687.
6
Accuracy of Large Language Models for Literature Screening in Thoracic Surgery: Diagnostic Study.
J Med Internet Res. 2025 Mar 11;27:e67488. doi: 10.2196/67488.
7
Benchmarking Vision Capabilities of Large Language Models in Surgical Examination Questions.
J Surg Educ. 2025 Apr;82(4):103442. doi: 10.1016/j.jsurg.2025.103442. Epub 2025 Feb 9.
8
Evaluating the Capabilities of Generative AI Tools in Understanding Medical Papers: Qualitative Study.
JMIR Med Inform. 2024 Sep 4;12:e59258. doi: 10.2196/59258.
9
Data extraction for evidence synthesis using a large language model: A proof-of-concept study.
Res Synth Methods. 2024 Jul;15(4):576-589. doi: 10.1002/jrsm.1710. Epub 2024 Mar 3.
10
Generalization bias in large language model summarization of scientific research.
R Soc Open Sci. 2025 Apr 30;12(4):241776. doi: 10.1098/rsos.241776. eCollection 2025 Apr.

Cited by

1
Artificial intelligence across the cancer care continuum.
Cancer. 2025 Aug 15;131(16):e70050. doi: 10.1002/cncr.70050.
2
Risk of Bias Assessment of Diagnostic Accuracy Studies Using QUADAS 2 by Large Language Models.
Diagnostics (Basel). 2025 Jun 6;15(12):1451. doi: 10.3390/diagnostics15121451.
3
Large Language Model-Assisted Risk-of-Bias Assessment in Randomized Controlled Trials Using the Revised Risk-of-Bias Tool: Usability Study.
J Med Internet Res. 2025 Jun 24;27:e70450. doi: 10.2196/70450.

References

1
Performance of two large language models for data extraction in evidence synthesis.
Res Synth Methods. 2024 Sep;15(5):818-824. doi: 10.1002/jrsm.1732. Epub 2024 Jun 19.
2
Assessing the Risk of Bias in Randomized Clinical Trials With Large Language Models.
JAMA Netw Open. 2024 May 1;7(5):e2412687. doi: 10.1001/jamanetworkopen.2024.12687.
3
Ensemble pretrained language models to extract biomedical knowledge from literature.
J Am Med Inform Assoc. 2024 Sep 1;31(9):1904-1911. doi: 10.1093/jamia/ocae061.
4
Can large language models replace humans in systematic reviews? Evaluating GPT-4's efficacy in screening and extracting data from peer-reviewed and grey literature in multiple languages.
Res Synth Methods. 2024 Jul;15(4):616-626. doi: 10.1002/jrsm.1715. Epub 2024 Mar 14.
5
How to write effective prompts for large language models.
Nat Hum Behav. 2024 Apr;8(4):611-615. doi: 10.1038/s41562-024-01847-2.
6
Data extraction for evidence synthesis using a large language model: A proof-of-concept study.
Res Synth Methods. 2024 Jul;15(4):576-589. doi: 10.1002/jrsm.1710. Epub 2024 Mar 3.
7
Integrating large language models in systematic reviews: a framework and case study using ROBINS-I for risk of bias assessment.
BMJ Evid Based Med. 2024 Nov 22;29(6):394-398. doi: 10.1136/bmjebm-2023-112597.
8
Streamlining Systematic Reviews: Harnessing Large Language Models for Quality Assessment and Risk-of-Bias Evaluation.
Cureus. 2023 Aug 6;15(8):e43023. doi: 10.7759/cureus.43023. eCollection 2023 Aug.
9
Accuracy and Efficiency of Machine Learning-Assisted Risk-of-Bias Assessments in "Real-World" Systematic Reviews: A Noninferiority Randomized Controlled Trial.
Ann Intern Med. 2022 Jul;175(7):1001-1009. doi: 10.7326/M22-0092. Epub 2022 May 31.
10
An overview of methodological approaches in systematic reviews.
J Evid Based Med. 2022 Mar;15(1):39-54. doi: 10.1111/jebm.12468.