Le Bonheur Children's Hospital, The University of Tennessee Health Science Center, Memphis, TN, USA.
Yale J Biol Med. 2023 Sep 29;96(3):415-420. doi: 10.59249/SKDH9286. eCollection 2023 Sep.
The increasing volume of research submissions to academic journals poses a significant challenge for traditional peer-review processes. To address this issue, this study explores the potential of employing ChatGPT, an advanced large language model (LLM) developed by OpenAI, as an artificial intelligence (AI) reviewer for academic journals. By leveraging the broad knowledge and natural language processing capabilities of ChatGPT, we hypothesize that it may be possible to enhance the efficiency, consistency, and quality of the peer-review process. This research investigated key aspects of integrating ChatGPT into the journal review workflow. We compared the critical analysis produced by ChatGPT, acting as an AI reviewer, with the human reviews of a single published article. As this is a feasibility study, only one article was reviewed: a case report on scurvy. The entire article was entered into ChatGPT with the instruction "Please perform a review of the following article and give points for revision." Because this was a case report with a limited word count, the entire article fit in one chat box. The output of ChatGPT was then compared with the comments of the human reviewers. Key performance metrics, including precision and overall agreement, were assessed subjectively to gauge the efficacy of ChatGPT as an AI reviewer relative to its human counterparts. In this analysis, ChatGPT's critical analyses aligned with those of the human reviewers, as evidenced by the inter-rater agreement.
Notably, ChatGPT exhibited commendable capability in identifying methodological flaws, articulating insightful feedback on theoretical frameworks, and gauging the overall contribution of the articles to their respective fields. While the integration of ChatGPT showed considerable promise, certain challenges and caveats surfaced. For example, complex research articles may introduce ambiguities, leading to nuanced discrepancies between AI and human reviews. In addition, ChatGPT cannot review figures and images, and lengthy articles must be reviewed in parts because the entire article will not fit in one chat/response. The benefits include a reduction in the time journals need to review submitted articles, and an AI assistant that offers a perspective on research papers different from that of the human reviewers. In conclusion, this research provides a foundation for incorporating ChatGPT into the pool of journal reviewers. The guidelines outlined distill key insights into operationalizing ChatGPT as a proficient reviewer within academic journal frameworks, paving the way for a more efficient and insightful review process.
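The abstract notes that a short case report fits in one chat box, whereas a lengthy article has to be reviewed in parts. The following minimal Python sketch illustrates one way such parts could be prepared. It is not the authors' procedure: the function names `split_article` and `build_review_prompts`, the paragraph-based splitting, the "[Part i of n]" label, and the 8000-character limit are all assumptions made for illustration (real limits are measured in tokens and vary by model). Only the review instruction itself is taken from the abstract. The sketch builds prompt strings and does not call any API.

```python
# The review instruction quoted in the abstract.
REVIEW_PROMPT = (
    "Please perform a review of the following article "
    "and give points for revision."
)


def split_article(text: str, max_chars: int = 8000) -> list[str]:
    """Split an article on blank lines into parts of at most max_chars.

    Assumption: max_chars is a stand-in for the chat input limit.
    A single paragraph longer than max_chars is kept whole rather
    than cut mid-sentence, so such a part can exceed the limit.
    """
    parts: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            parts.append(current)
            current = paragraph
    if current:
        parts.append(current)
    return parts


def build_review_prompts(text: str, max_chars: int = 8000) -> list[str]:
    """Return one prompt per part; a short article yields a single prompt."""
    parts = split_article(text, max_chars)
    if len(parts) == 1:
        return [f"{REVIEW_PROMPT}\n\n{parts[0]}"]
    total = len(parts)
    # Hypothetical labelling so each chat message states which part it holds.
    return [
        f"{REVIEW_PROMPT}\n\n[Part {index} of {total}]\n\n{part}"
        for index, part in enumerate(parts, start=1)
    ]
```

Each returned string could then be pasted into a separate chat message. Reviewing parts independently means the model sees no cross-section context, which may contribute to the discrepancies with human reviews that the abstract mentions for complex articles.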