Tabone Wilbert, de Winter Joost
Department of Cognitive Robotics, Faculty of Mechanical, Maritime and Materials Engineering, Delft University of Technology, Delft 2628CD, The Netherlands.
R Soc Open Sci. 2023 Sep 13;10(9):231053. doi: 10.1098/rsos.231053. eCollection 2023 Sep.
ChatGPT could serve as a tool for text analysis within the field of Human-Computer Interaction, though its validity requires investigation. This study applied ChatGPT to: (1) textbox questionnaire responses on nine augmented-reality interfaces, (2) interview data from participants who experienced these interfaces in a virtual simulator, and (3) transcribed think-aloud data of participants who viewed a real painting and its replica. Using a hierarchical approach, ChatGPT produced scores or summaries of text batches, which were then aggregated. Results showed that (1) ChatGPT generated sentiment scores of the interfaces that correlated extremely strongly (r > 0.99) with human rating scale outcomes and with a rule-based sentiment analysis method (criterion validity). Additionally, (2) when given automatically transcribed interviews as input, ChatGPT provided meaningful meta-summaries of the qualities of the interfaces (face validity). One meta-summary analysed in depth was found to have substantial but imperfect overlap with a content analysis conducted by an independent researcher (criterion validity). Finally, (3) ChatGPT's summary of the think-aloud data highlighted subtle differences between the real painting and the replica (face validity), a distinction that corresponded with a keyword analysis (criterion validity). In conclusion, our research indicates that, with appropriate precautions, ChatGPT can be used as a valid tool for analysing text data.
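The hierarchical approach and the correlation check described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the batch size, the prompt, or the aggregation rule, so the sketch assumes that batch scores are averaged. The names `make_batches`, `toy_batch_scorer`, `hierarchical_score`, and `pearson_r` are hypothetical, and `toy_batch_scorer` is a keyword-counting stand-in for a ChatGPT call so that the example runs offline.

```python
from math import sqrt
from typing import Callable, List, Sequence


def make_batches(items: Sequence[str], size: int) -> List[List[str]]:
    """Split a sequence of text responses into consecutive batches."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def toy_batch_scorer(batch: Sequence[str]) -> float:
    """Hypothetical stand-in for a ChatGPT call that scores one batch.

    Returns a sentiment score from 0 to 100 based on a tiny word list.
    In practice this function would send the batch to a language model.
    """
    positive = {"clear", "helpful", "good"}
    negative = {"confusing", "distracting", "bad"}
    words = [w.strip(".,!?").lower() for text in batch for w in text.split()]
    pos = sum(w in positive for w in words)
    neg = sum(w in negative for w in words)
    total = pos + neg
    return 50.0 if total == 0 else 100.0 * pos / total


def hierarchical_score(
    responses: Sequence[str],
    scorer: Callable[[Sequence[str]], float],
    batch_size: int,
) -> float:
    """Score each batch, then aggregate the batch scores (assumed: mean)."""
    batch_scores = [scorer(b) for b in make_batches(responses, batch_size)]
    return sum(batch_scores) / len(batch_scores)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)
```

Under these assumptions, one would call `hierarchical_score` once per interface and then pass the resulting list, together with the matching human rating scale means, to `pearson_r`. Note that a mean of batch means weights a smaller final batch as heavily as a full one, which is one reason the aggregation rule is flagged here as an assumption.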