Department of Psychological Counselling, Faculty of Psychology, Mejiro University, Tokyo, Japan.
Institute of Interdisciplinary Research, Kyoto University of Advanced Science, Kyoto, Japan.
PLoS One. 2023 Aug 9;18(8):e0288453. doi: 10.1371/journal.pone.0288453. eCollection 2023.
In the first half of 2023, text-generative artificial intelligence (AI), including ChatGPT from OpenAI, attracted considerable attention worldwide. In this study, we first compared the Japanese stylometric features of texts generated by ChatGPT (equipped with GPT-3.5 and GPT-4) with those of texts written by humans. We performed multi-dimensional scaling (MDS) to examine the distributions of 216 texts in three classes (72 academic papers written by 36 single authors, 72 texts generated by GPT-3.5, and 72 texts generated by GPT-4 on the basis of the titles of the aforementioned papers), focusing on the following stylometric features: (1) bigrams of parts-of-speech, (2) bigrams of postpositional particle words, (3) positioning of commas, and (4) rate of function words. For each stylometric feature, MDS revealed distinct distributions for GPT-generated (3.5 and 4) and human-written texts. Although GPT-4 is more powerful than GPT-3.5 because it has more parameters, the GPT-3.5 and GPT-4 distributions overlap. These results indicate that even if the number of parameters increases in the future, GPT-generated texts may not come close to human-written texts in terms of stylometric features. Second, we verified the classification performance of a random forest (RF) classifier for two classes (GPT and human), focusing on the same Japanese stylometric features. The RF classifier performed well on each stylometric feature: the classifier using the rate of function words achieved 98.1% accuracy. Furthermore, the RF classifier using all stylometric features reached 100% on all performance indexes (accuracy, recall, precision, and F1 score). This study concludes that, at this stage, ChatGPT-generated text can be discriminated from human-written text, at least for the Japanese language.
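As a rough illustration of two of the stylometric features named in the abstract (bigrams of parts-of-speech and rate of function words), the following minimal sketch computes relative frequencies from a text that has already been tokenized and tagged. It is not the authors' implementation: the abstract does not describe the tooling, so the `Token` type, the `FUNCTION_POS` tag set, the function names, and the English tag labels are all assumptions made for this example, and the tagging itself is assumed to come from a Japanese morphological analyzer that is not shown here.

```python
from collections import Counter
from typing import Dict, List, Tuple

# A token is (surface form, part-of-speech tag). The tagging step is assumed
# to be done beforehand by a morphological analyzer (not shown).
Token = Tuple[str, str]

# Hypothetical tag set: which POS tags count as "function words" is an
# assumption for this sketch, not something stated in the abstract.
FUNCTION_POS = {"particle", "auxiliary"}


def pos_bigram_rates(tokens: List[Token]) -> Dict[Tuple[str, str], float]:
    """Relative frequency of each adjacent pair of POS tags."""
    tags = [pos for _, pos in tokens]
    bigrams = Counter(zip(tags, tags[1:]))
    total = sum(bigrams.values())
    if total == 0:
        return {}
    return {pair: count / total for pair, count in bigrams.items()}


def function_word_rates(tokens: List[Token]) -> Dict[str, float]:
    """Relative frequency of each function word among all tokens."""
    if not tokens:
        return {}
    counts = Counter(surface for surface, pos in tokens if pos in FUNCTION_POS)
    return {word: count / len(tokens) for word, count in counts.items()}


# Hand-tagged toy sentence ("I read a book"), used only for illustration.
sample: List[Token] = [
    ("私", "pronoun"),
    ("は", "particle"),
    ("本", "noun"),
    ("を", "particle"),
    ("読む", "verb"),
]

bigram_features = pos_bigram_rates(sample)
function_features = function_word_rates(sample)
```

Per-text feature dictionaries like these could then be aligned into fixed-length vectors and passed to an MDS or random forest implementation of the reader's choice; that step is left out because the abstract does not specify how it was done.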