Human versus artificial intelligence-generated arthroplasty literature: A single-blinded analysis of perceived communication, quality, and authorship source.

Affiliation

Department of Orthopedic Surgery, NYU Langone Health, New York, New York, USA.

Publication information

Int J Med Robot. 2024 Feb;20(1):e2621. doi: 10.1002/rcs.2621.

DOI: 10.1002/rcs.2621
PMID: 38348740

Abstract

BACKGROUND

Large language models (LLMs) have unknown implications for medical research. This study assessed whether LLM-generated abstracts are distinguishable from human-written abstracts and compared their perceived quality.

METHODS

The LLM ChatGPT was used to generate 20 arthroplasty abstracts (AI-generated) based on full-text manuscripts, which were compared to originally published abstracts (human-written). Six blinded orthopaedic surgeons rated abstracts on overall quality, communication, and confidence in the authorship source. Authorship-confidence scores were compared to a test value representing complete inability to discern authorship.

RESULTS

Modestly increased confidence in human authorship was observed for human-written abstracts compared with AI-generated abstracts (p = 0.028), though AI-generated abstract authorship-confidence scores were statistically consistent with inability to discern authorship (p = 0.999). Overall abstract quality was higher for human-written abstracts (p = 0.019).

CONCLUSIONS

AI-generated abstracts' absolute authorship-confidence ratings demonstrated difficulty in discerning authorship but did not achieve the perceived quality of human-written abstracts. Caution is warranted in implementing LLMs into scientific writing.
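
The comparison of authorship-confidence scores against a fixed test value, described in Methods, can be illustrated with a short sketch. The abstract names neither the statistical test nor the rating scale, so everything here is an assumption: a one-sample t-test, a hypothetical 0 to 100 scale on which 50 stands for complete inability to discern authorship, and made-up scores that are not the study's data. The function name `one_sample_t` is likewise illustrative.

```python
import math
import statistics


def one_sample_t(scores, test_value):
    """Return (t statistic, degrees of freedom) for H0: mean(scores) == test_value.

    Assumption: a one-sample t-test; the abstract does not name the test used.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("scores have zero variance")
    standard_error = sd / math.sqrt(n)
    return (mean - test_value) / standard_error, n - 1


# Hypothetical neutral point on an assumed 0-100 confidence scale.
NEUTRAL = 50.0

# Synthetic ratings for illustration only; NOT data from the study.
synthetic_scores = [40.0, 60.0, 45.0, 55.0, 50.0, 50.0]

t_stat, dof = one_sample_t(synthetic_scores, NEUTRAL)
print(f"t = {t_stat:.3f}, df = {dof}")
```

A t statistic near zero, as these synthetic scores produce, corresponds to a mean rating that cannot be told apart from the neutral test value. A two-sided p-value would then be read from the Student t distribution with the returned degrees of freedom.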

Similar articles

1
Human versus artificial intelligence-generated arthroplasty literature: A single-blinded analysis of perceived communication, quality, and authorship source.
Int J Med Robot. 2024 Feb;20(1):e2621. doi: 10.1002/rcs.2621.
2
Human vs machine: identifying ChatGPT-generated abstracts in Gynecology and Urogynecology.
Am J Obstet Gynecol. 2024 Aug;231(2):276.e1-276.e10. doi: 10.1016/j.ajog.2024.04.045. Epub 2024 May 6.
3
Comparisons of Quality, Correctness, and Similarity Between ChatGPT-Generated and Human-Written Abstracts for Basic Research: Cross-Sectional Study.
J Med Internet Res. 2023 Dec 25;25:e51229. doi: 10.2196/51229.
4
A Study on Distinguishing ChatGPT-Generated and Human-Written Orthopaedic Abstracts by Reviewers: Decoding the Discrepancies.
Cureus. 2023 Nov 21;15(11):e49166. doi: 10.7759/cureus.49166. eCollection 2023 Nov.
5
Assessing the Reproducibility of the Structured Abstracts Generated by ChatGPT and Bard Compared to Human-Written Abstracts in the Field of Spine Surgery: Comparative Analysis.
J Med Internet Res. 2024 Jun 26;26:e52001. doi: 10.2196/52001.
6
Reviewer Experience Detecting and Judging Human Versus Artificial Intelligence Content: The Journal Essay Contest.
Stroke. 2024 Oct;55(10):2573-2578. doi: 10.1161/STROKEAHA.124.045012. Epub 2024 Sep 3.
7
Quality and correctness of AI-generated versus human-written abstracts in psychiatric research papers.
Psychiatry Res. 2024 Nov;341:116145. doi: 10.1016/j.psychres.2024.116145. Epub 2024 Aug 17.
8
Residency Application Selection Committee Discriminatory Ability in Identifying Artificial Intelligence-Generated Personal Statements.
J Surg Educ. 2024 Jun;81(6):780-785. doi: 10.1016/j.jsurg.2024.02.009. Epub 2024 Apr 27.
9
Comparison of Medical Research Abstracts Written by Surgical Trainees and Senior Surgeons or Generated by Large Language Models.
JAMA Netw Open. 2024 Aug 1;7(8):e2425373. doi: 10.1001/jamanetworkopen.2024.25373.
10
What is the rate of text generated by artificial intelligence over a year of publication in Orthopedics & Traumatology: Surgery & Research? Analysis of 425 articles before versus after the launch of ChatGPT in November 2022.
Orthop Traumatol Surg Res. 2023 Dec;109(8):103694. doi: 10.1016/j.otsr.2023.103694. Epub 2023 Sep 29.

Cited by

1
ChatGPT in Academic Writing: A Scientometric Analysis of Literature Published Between 2022 and 2023.
J Empir Res Hum Res Ethics. 2025 Jul;20(3):131-148. doi: 10.1177/15562646251350203. Epub 2025 Jun 22.
2
Large Language Models in Spine Surgery: A Promising Technology.
HSS J. 2025 May 29:15563316251340696. doi: 10.1177/15563316251340696.
3
Large language models in medicine: A review of current clinical trials across healthcare applications.
PLOS Digit Health. 2024 Nov 19;3(11):e0000662. doi: 10.1371/journal.pdig.0000662. eCollection 2024 Nov.
4
Examining the Role of Large Language Models in Orthopedics: Systematic Review.
J Med Internet Res. 2024 Nov 15;26:e59607. doi: 10.2196/59607.
5
The transformative impact of large language models on medical writing and publishing: current applications, challenges and future directions.
Korean J Physiol Pharmacol. 2024 Sep 1;28(5):393-401. doi: 10.4196/kjpp.2024.28.5.393.