Hakam Hassan Tarek, Prill Robert, Korte Lisa, Lovreković Bruno, Ostojić Marko, Ramadanov Nikolai, Muehlensiepen Felix
Center of Orthopaedics and Trauma Surgery, University Clinic of Brandenburg, Brandenburg Medical School, Brandenburg an der Havel, Germany.
Faculty of Health Sciences, University Clinic of Brandenburg, Brandenburg an der Havel, Germany.
JMIR Form Res. 2024 Feb 16;8:e52164. doi: 10.2196/52164.
As large language models (LLMs) become increasingly integrated into different aspects of health care, questions about their implications for the medical academic literature have begun to emerge. With artificial intelligence (AI) now generating linguistically accurate and grammatically sound text, key aspects of academic writing, such as authenticity, are at stake.
The objective of this study is to compare human-written with AI-generated scientific literature in orthopedics and sports medicine.
Five original abstracts were selected from the PubMed database and rewritten with the assistance of 2 LLMs of differing proficiency. Researchers with varying levels of expertise and different areas of specialization were then asked to rank the abstracts according to linguistic and methodological parameters. Finally, the researchers were asked to classify each article as AI generated or human written.
Neither the researchers nor the AI-detection software could reliably identify the AI-generated texts. Furthermore, the criteria previously suggested in the literature correlated neither with whether the researchers deemed a text to be AI generated nor with whether they classified it correctly.
The primary finding of this study was that researchers were unable to distinguish between LLM-generated and human-written texts. However, owing to the small sample size, the results cannot be generalized. As with any tool used in academic research, the potential for harm can be mitigated through the transparency and integrity of the researchers who use it. With scientific integrity at stake, further research with a similar study design should be conducted to determine the magnitude of this issue.