Evaluating the Performance of ChatGPT-4o in Risk of Bias Assessments.

Authors

Kuitunen Ilari, Ponkilainen Ville T, Liukkonen Rasmus, Nyrhi Lauri, Pakarinen Oskari, Vaajala Matias, Uimonen Mikko M

Affiliations

Institute of Clinical Medicine and Department of Pediatrics, University of Eastern Finland, Kuopio, Finland.

Department of Pediatrics, Kuopio University Hospital, Kuopio, Finland.

Publication information

J Evid Based Med. 2024 Dec;17(4):700-702. doi: 10.1111/jebm.12662. Epub 2024 Dec 15.

DOI:10.1111/jebm.12662

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11684499/

Abstract

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/de03/11684499/ed16c38b6c18/JEBM-17-700-g001.jpg

Similar articles

1

Evaluating the Performance of ChatGPT-4o in Risk of Bias Assessments.

J Evid Based Med. 2024 Dec;17(4):700-702. doi: 10.1111/jebm.12662. Epub 2024 Dec 15.

2

ChatGPT-4o in Risk-of-Bias Assessments in Neonatology: A Validity Analysis.

Neonatology. 2025;122(3):360-365. doi: 10.1159/000544857. Epub 2025 Feb 25.

3

ChatGPT as an effective tool for quality evaluation of radiomics research.

Eur Radiol. 2025 Apr;35(4):2030-2042. doi: 10.1007/s00330-024-11122-7. Epub 2024 Oct 15.

4

Comparative Analysis of ChatGPT-4o and Gemini Advanced Performance on Diagnostic Radiology In-Training Exams.

Cureus. 2025 Mar 20;17(3):e80874. doi: 10.7759/cureus.80874. eCollection 2025 Mar.

5

Comparative performance of artificial intelligence models in rheumatology board-level questions: evaluating Google Gemini and ChatGPT-4o.

Clin Rheumatol. 2024 Nov;43(11):3507-3513. doi: 10.1007/s10067-024-07154-5. Epub 2024 Sep 28.

6

Evaluating ChatGPT-4o as a decision support tool in multidisciplinary sarcoma tumor boards: heterogeneous performance across various specialties.

Front Oncol. 2025 Jan 17;14:1526288. doi: 10.3389/fonc.2024.1526288. eCollection 2024.

7

Can the large language model ChatGPT-4omni predict outcomes in adult patients with status epilepticus?

Epilepsia. 2025 Mar;66(3):674-685. doi: 10.1111/epi.18215. Epub 2024 Dec 26.

8

Comparing performances of French orthopaedic surgery residents with the artificial intelligence ChatGPT-4/4o in the French diploma exams of orthopaedic and trauma surgery.

Orthop Traumatol Surg Res. 2024 Dec 4:104080. doi: 10.1016/j.otsr.2024.104080.

9

Assessing the clinical support capabilities of ChatGPT 4o and ChatGPT 4o mini in managing lumbar disc herniation.

Eur J Med Res. 2025 Jan 22;30(1):45. doi: 10.1186/s40001-025-02296-x.

10

Evaluation of Advanced Artificial Intelligence Algorithms' Diagnostic Efficacy in Acute Ischemic Stroke: A Comparative Analysis of ChatGPT-4o and Claude 3.5 Sonnet Models.

J Clin Med. 2025 Jan 17;14(2):571. doi: 10.3390/jcm14020571.

Cited by

1

Human Versus Artificial Intelligence: Comparing Cochrane Authors' and ChatGPT's Risk of Bias Assessments.

Cochrane Evid Synth Methods. 2025 Aug 31;3(5):e70044. doi: 10.1002/cesm.70044. eCollection 2025 Sep.

2

Large Language Models and the Analyses of Adherence to Reporting Guidelines in Systematic Reviews and Overviews of Reviews (PRISMA 2020 and PRIOR).

J Med Syst. 2025 Jun 12;49(1):80. doi: 10.1007/s10916-025-02212-0.

3

ChatGPT-4o in Risk-of-Bias Assessments in Neonatology: A Validity Analysis.

Neonatology. 2025;122(3):360-365. doi: 10.1159/000544857. Epub 2025 Feb 25.

References

1

Incorrect blinding assessments are common in meta-analyses published in high impact journals.

J Evid Based Med. 2024 Sep;17(3):471-473. doi: 10.1111/jebm.12636. Epub 2024 Aug 29.

2

Blinding Assessments in Neonatal Ventilation Meta-Analyses: A Systematic Meta-Epidemiological Review.

Neonatology. 2024;121(6):659-666. doi: 10.1159/000539203. Epub 2024 Jun 11.

3

Pilot study on large language models for risk-of-bias assessments in systematic reviews: A(I) new type of bias?

BMJ Evid Based Med. 2025 Jan 22;30(1):71-74. doi: 10.1136/bmjebm-2024-112990.

4

Integrating large language models in systematic reviews: a framework and case study using ROBINS-I for risk of bias assessment.

BMJ Evid Based Med. 2024 Nov 22;29(6):394-398. doi: 10.1136/bmjebm-2023-112597.

5

The Pros and Cons of Using ChatGPT in Medical Education: A Scoping Review.

Stud Health Technol Inform. 2023 Jun 29;305:644-647. doi: 10.3233/SHTI230580.

6

Risk-of-bias assessment using Cochrane's revised tool for randomized trials (RoB 2) was useful but challenging and resource-intensive: observations from a systematic review.

J Clin Epidemiol. 2023 Sep;161:39-45. doi: 10.1016/j.jclinepi.2023.06.015. Epub 2023 Jun 24.

7

Adherence of systematic reviews to Cochrane RoB2 guidance was frequently poor: a meta epidemiological study.

J Clin Epidemiol. 2022 Dec;152:47-55. doi: 10.1016/j.jclinepi.2022.09.003. Epub 2022 Sep 23.

8

The revised Cochrane risk of bias tool for randomized trials (RoB 2) showed low interrater reliability and challenges in its application.

J Clin Epidemiol. 2020 Oct;126:37-44. doi: 10.1016/j.jclinepi.2020.06.015. Epub 2020 Jun 18.

9

Cochrane risk of bias tool was used inadequately in the majority of non-Cochrane systematic reviews.

J Clin Epidemiol. 2020 Jul;123:114-119. doi: 10.1016/j.jclinepi.2020.03.019. Epub 2020 Apr 1.

10

RoB 2: a revised tool for assessing risk of bias in randomised trials.

BMJ. 2019 Aug 28;366:l4898. doi: 10.1136/bmj.l4898.