Department of Orthopedic Surgery, NYU Langone Health, New York, New York, USA.
Int J Med Robot. 2024 Feb;20(1):e2621. doi: 10.1002/rcs.2621.
Large language models (LLMs) have unknown implications for medical research. This study assessed whether LLM-generated abstracts are distinguishable from human-written abstracts and compared their perceived quality.
The LLM ChatGPT was used to generate 20 arthroplasty abstracts (AI-generated) based on full-text manuscripts, which were compared to the originally published abstracts (human-written). Six blinded orthopedic surgeons rated abstracts on overall quality, communication, and confidence in the authorship source. Authorship-confidence scores were compared to a test value representing complete inability to discern authorship.
Modestly increased confidence in human authorship was observed for human-written abstracts compared with AI-generated abstracts (p = 0.028), though AI-generated abstract authorship-confidence scores were statistically consistent with inability to discern authorship (p = 0.999). Overall abstract quality was higher for human-written abstracts (p = 0.019).
Absolute authorship-confidence ratings indicated that reviewers had difficulty discerning the authorship of AI-generated abstracts, yet these abstracts did not achieve the perceived quality of human-written abstracts. Caution is warranted in implementing LLMs into scientific writing.