Distinguishing ChatGPT(-3.5, -4)-generated and human-written papers through Japanese stylometric analysis.

Affiliations

Department of Psychological Counselling, Faculty of Psychology, Mejiro University, Tokyo, Japan.

Institute of Interdisciplinary Research, Kyoto University of Advanced Science, Kyoto, Japan.

Publication information

PLoS One. 2023 Aug 9;18(8):e0288453. doi: 10.1371/journal.pone.0288453. eCollection 2023.

DOI:10.1371/journal.pone.0288453

PMID:37556434

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10411719/

Abstract

In the first half of 2023, text-generative artificial intelligence (AI), including ChatGPT from OpenAI, attracted considerable attention worldwide. In this study, we first compared the Japanese stylometric features of texts generated by ChatGPT (equipped with GPT-3.5 and GPT-4) with those of texts written by humans. We performed multidimensional scaling (MDS) to examine the distributions of 216 texts in three classes (72 academic papers written by 36 single authors, 72 texts generated by GPT-3.5, and 72 texts generated by GPT-4 on the basis of the titles of the aforementioned papers), focusing on the following stylometric features: (1) bigrams of parts of speech, (2) bigrams of postpositional particle words, (3) positioning of commas, and (4) rate of function words. MDS revealed that the GPT (3.5 and 4) texts and the human texts were distributed distinctly for each stylometric feature. Although GPT-4 is more powerful than GPT-3.5 because it has more parameters, the GPT-3.5 and GPT-4 distributions overlapped. These results indicate that, even if the number of parameters increases in the future, GPT-generated texts may not come close to human-written texts in terms of stylometric features. Second, we verified the classification performance of a random forest (RF) classifier for two classes (GPT and human), focusing on Japanese stylometric features. The RF classifier performed well on each stylometric feature: the classifier focusing on the rate of function words achieved 98.1% accuracy. Furthermore, the RF classifier focusing on all stylometric features reached 100% on all performance indexes (accuracy, recall, precision, and F1 score). This study concluded that, at this stage, we humans can discriminate ChatGPT from humans, with the finding limited to the Japanese language.
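The abstract names the stylometric features but does not define how they are computed. The following is a minimal sketch, under the assumption that each feature is a relative frequency taken over one text. The function names, the toy sentence, its hand-assigned tags, and the small function-word set are all hypothetical and are not taken from the paper. A real pipeline would obtain tokens and part-of-speech tags from a Japanese morphological analyzer.

```python
from collections import Counter


def bigram_rates(tags):
    """Relative frequency of each adjacent pair in a tag sequence."""
    pairs = Counter(zip(tags, tags[1:]))
    total = sum(pairs.values())
    return {pair: n / total for pair, n in pairs.items()} if total else {}


def function_word_rate(tokens, function_words):
    """Share of tokens that belong to the given function-word set."""
    if not tokens:
        return 0.0
    return sum(tok in function_words for tok in tokens) / len(tokens)


def comma_position_rates(tokens, comma="、"):
    """Relative frequency of the token immediately preceding each comma."""
    before = Counter(
        prev for prev, cur in zip(tokens, tokens[1:]) if cur == comma
    )
    total = sum(before.values())
    return {tok: n / total for tok, n in before.items()} if total else {}


# Toy sentence with hand-assigned tags (hypothetical illustration only).
tokens = ["私", "は", "、", "本", "を", "読む"]
tags = ["noun", "particle", "symbol", "noun", "particle", "verb"]

pos_features = bigram_rates(tags)
fw_rate = function_word_rate(tokens, {"は", "を", "が", "に"})
comma_features = comma_position_rates(tokens)
```

Applying `bigram_rates` to the sequence of particles alone, instead of the full tag sequence, would give one possible reading of the particle-bigram feature. Per-text feature dictionaries like these could then be aligned into vectors and passed to an MDS or random forest implementation.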
