Center for Perioperative Optimization, Department of Surgery, Copenhagen University Hospital - Herlev Hospital, Denmark.
Dan Med J. 2023 Nov 23;70(12):A06230412.
Artificial intelligence has started to become a part of scientific studies and may help researchers with a wide range of tasks. However, no scientific studies have been published on its usefulness in writing cover letters for scientific articles. This study aimed to determine whether Generative Pre-Trained Transformer (GPT)-4 is as good as humans at writing cover letters for scientific papers.
In this randomised non-inferiority study, we included two parallel arms consisting of cover letters written by humans and by GPT-4. Each arm had 18 cover letters, which were assessed by three different blinded assessors. The assessors completed a questionnaire in which they rated the cover letters on impression, readability, criteria satisfaction, and degree of detail. Subsequently, we performed readability tests using the Lix score and the Flesch-Kincaid grade level.
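As an illustration of the two objective readability measures named above, the following is a minimal sketch using the standard published formulas for the Lix score and the Flesch-Kincaid grade level. It is not the tooling used in the study: the helper names (`split_words`, `split_sentences`, `count_syllables`), the naive sentence splitting, and the vowel-group syllable heuristic are all assumptions made for illustration, and dedicated readability tools will give somewhat different values.

```python
import re


def split_sentences(text):
    # Assumption: a sentence ends at one or more of '.', '!' or '?'.
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def split_words(text):
    # Assumption: a word is a run of ASCII letters.
    return re.findall(r"[A-Za-z]+", text)


def count_syllables(word):
    # Rough heuristic (an assumption, not a dictionary lookup):
    # count groups of consecutive vowels and drop a trailing silent 'e'.
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def lix(text):
    # Lix = words per sentence + percentage of long words (> 6 letters).
    words = split_words(text)
    sentences = split_sentences(text)
    long_words = [w for w in words if len(w) > 6]
    return len(words) / len(sentences) + 100 * len(long_words) / len(words)


def flesch_kincaid_grade(text):
    # FK grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    words = split_words(text)
    sentences = split_sentences(text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


if __name__ == "__main__":
    sample = "The cat sat. The dog ran away quickly."
    print(round(lix(sample), 2), round(flesch_kincaid_grade(sample), 2))
```

On both measures, longer sentences and longer words raise the score, so a higher value corresponds to text that is harder to read.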
No significant or relevant difference was found on any parameter. A total of 61% of the blinded assessors guessed correctly as to whether the cover letter was written by GPT-4 or a human. GPT-4 scored higher on our objective readability tests (Lix and Flesch-Kincaid grade level), where higher scores indicate text that is harder to read. Nevertheless, it performed better than human writing on readability in the subjective assessments.
We found that GPT-4 was non-inferior to humans at writing cover letters. GPT-4 may therefore be used to streamline cover letter writing for researchers, giving all researchers an equal chance of advancing to peer review.
This study received no financial support from external sources.
This study was not registered before it commenced.