Can artificial intelligence generate scientific discussion that passes peer review for publication in a high-impact orthopaedic journal?

Authors

Sheridan Gerard A, Howard Lisa C, Neufeld Michael E, Doyle Tom R, Hughes Andrew J, Sculco Peter K, Beverland David E, Garbuz Donald S, Masri Bassam A

Affiliations

University of British Columbia, Vancouver, BC, Canada.

Hospital for Special Surgery, New York, NY, USA.

Publication information

Ir J Med Sci. 2025 Jun 12. doi: 10.1007/s11845-025-03971-y.

DOI: 10.1007/s11845-025-03971-y
PMID: 40504456

Abstract

BACKGROUND

There is huge interest in the use of artificial intelligence (AI) in the production and assessment of academic material; however, the role of AI remains unclear.

AIM

The purpose of this study was to perform a reviewer-blinded assessment of the quality of scientific discussion generated by an advanced AI language model (ChatGPT-4, OpenAI) and determine whether this could be recommended for high-impact journal publication.

METHODS

The introduction, methods and results sections of a recently published article from a high-impact journal were input into a current AI model. The AI application then produced a discussion and conclusion based on the provided text using a standardized prompt. Six experienced blinded reviewers scored all five sections of the hybrid article. A one-way analysis of variance (ANOVA) was used to assess significant differences between scores of each section. Reviewers recommended a decision regarding the suitability of the article for publication.
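
As an illustration of the statistical step named above, the following is a minimal sketch of a one-way ANOVA F statistic computed across the five sections. It assumes each of the six reviewers contributes one score per section and treats the sections as independent groups. The function name `one_way_anova_f` and every score in `hypothetical_scores` are invented for illustration; they are not the study's data, and the study's actual analysis software is not stated.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# HYPOTHETICAL scores for illustration only; these are NOT the study's data.
hypothetical_scores = {
    "introduction": [70, 75, 80, 80, 90, 90],
    "methods": [70, 70, 75, 80, 90, 90],
    "results": [50, 70, 80, 85, 90, 90],
    "discussion": [40, 40, 55, 65, 75, 80],
    "conclusion": [40, 40, 55, 65, 80, 85],
}

f_stat, df_between, df_within = one_way_anova_f(list(hypothetical_scores.values()))
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

With five sections and six scores per section, the degrees of freedom are 4 between groups and 25 within groups. Converting F to a p-value requires the F distribution; a library routine such as `scipy.stats.f_oneway` returns both the statistic and the p-value.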

RESULTS

AI composed a scientific discussion and conclusion. The median score was 80 (IQR 70-90) for introduction, 77.5 (IQR 70-90) for methods, 82.5 (IQR 50-90) for results, 60 (IQR 40-75) for discussion and 60 (IQR 40-80) for the conclusion. The median scores for the AI-generated sections were non-significantly lower than other sections (p = 0.37). The majority of reviewers (5/6, 83%) recommended "acceptance for publication after major revision". One reviewer recommended "resubmission with no guarantee of acceptance". There were no recommendations for rejection.
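
The summary statistics reported above (median with interquartile range, and the share of reviewers recommending each decision) can be reproduced in form with a short sketch. The scores in `hypothetical_discussion_scores` are invented for illustration and are not the reviewers' actual scores. The quartile convention is also an assumption, since the abstract does not state which method was used, and different conventions give slightly different IQR bounds.

```python
from statistics import median, quantiles


def median_iqr(scores):
    """Return (median, Q1, Q3) using the 'inclusive' quartile convention."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores), q1, q3


# HYPOTHETICAL scores from six reviewers; NOT the study's data.
hypothetical_discussion_scores = [40, 50, 60, 60, 75, 80]

med, q1, q3 = median_iqr(hypothetical_discussion_scores)
print(f"median {med} (IQR {q1}-{q3})")

# Share of reviewers recommending acceptance after major revision (5 of 6).
share = 5 / 6
print(f"{share:.0%}")
```

The last line formats 5/6 as a whole-number percentage, which matches the 83% stated in the results.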

CONCLUSION

Current AI large language models are now capable of generating content that passes experienced peer review and is acceptable for publication in a high-impact orthopaedic journal, after revision. There are still many concerns regarding the integration of AI into the process of scientific writing, mainly the tendency of AI to rely on advanced pattern recognition and fabricated or inadequate references.

LEVEL OF EVIDENCE

Level IV.
