Lai Honghao, Liu Jiayi, Bai Chunyang, Liu Hui, Pan Bei, Luo Xufei, Hou Liangying, Zhao Weilong, Xia Danni, Tian Jinhui, Chen Yaolong, Zhang Lu, Estill Janne, Liu Jie, Liao Xing, Shi Nannan, Sun Xin, Shang Hongcai, Bian Zhaoxiang, Yang Kehu, Huang Luqi, Ge Long
Department of Health Policy and Health Management, School of Public Health, Lanzhou University, Lanzhou, China.
Evidence-Based Social Science Research Center, School of Public Health, Lanzhou University, Lanzhou, China.
NPJ Digit Med. 2025 Jan 31;8(1):74. doi: 10.1038/s41746-025-01457-w.
Large language models (LLMs) have the potential to improve the efficiency and accuracy of evidence synthesis. This study assessed LLM-only and LLM-assisted methods for data extraction and risk of bias assessment in 107 trials on complementary medicine. Moonshot-v1-128k and Claude-3.5-sonnet achieved high accuracy (≥95%) when used alone, and LLM-assisted methods performed better still (≥97%). LLM-assisted methods also significantly reduced processing time: 14.7 min for data extraction and 5.9 min for risk of bias assessment, versus 86.9 and 10.4 min, respectively, for conventional methods. These findings highlight the potential of LLMs when integrated with human expertise.