

Assessing the response quality and readability of ChatGPT in stuttering.

Authors

Saeedi Saeed, Bakhtiar Mehdi

Affiliations

Speech and Neuromodulation Laboratory, Unit of Human Communication, Learning and Development, Faculty of Education, The University of Hong Kong, Hong Kong Special Administrative Region of China.

Publication

J Fluency Disord. 2025 Sep;85:106149. doi: 10.1016/j.jfludis.2025.106149. Epub 2025 Aug 15.

DOI: 10.1016/j.jfludis.2025.106149
PMID: 40848602
Abstract

OBJECTIVE

This study aimed to examine how frequently asked questions regarding stuttering were comprehended and answered by ChatGPT.

METHODS

In this exploratory study, eleven common questions about stuttering were asked in a single conversation with GPT-4o mini. A panel of five certified speech and language pathologists (SLPs), blinded to the source of the answers (AI or SLPs), was asked to judge whether each response was produced by the ChatGPT chatbot or provided by SLPs. The panel was also instructed to evaluate the responses on several criteria: the presence of inaccuracies, the potential for causing harm and the degree of harm that could result, and alignment with the prevailing consensus within the SLP community. All ChatGPT responses were additionally scored on a range of readability features, including the Flesch Reading Ease Score (FRES), Gunning Fog Scale Level (GFSL), Dale-Chall Score (D-CS), number of words, number of sentences, words per sentence (WPS), characters per word (CPW), and percentage of difficult words. Spearman's rank correlation coefficient was then used to examine the relationship between the panel's evaluations and the readability features.
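The readability indices named above follow standard published formulas. As a rough illustration only (not the authors' actual tooling), a minimal Python sketch might look like the following; the vowel-group syllable counter is a crude assumption, since production tools use pronunciation dictionaries:

```python
import re

def count_syllables(word):
    # Crude heuristic: count contiguous vowel groups as syllables.
    # Real readability tools use pronunciation dictionaries instead.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text):
    # Split into sentences and words with simple regexes (an approximation).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    complex_words = [w for w in words if count_syllables(w) >= 3]
    wps = len(words) / len(sentences)                           # words per sentence
    spw = sum(count_syllables(w) for w in words) / len(words)   # syllables per word
    return {
        "FRES": 206.835 - 1.015 * wps - 84.6 * spw,             # Flesch Reading Ease
        "GFSL": 0.4 * (wps + 100 * len(complex_words) / len(words)),  # Gunning Fog
        "words": len(words),
        "sentences": len(sentences),
        "WPS": wps,
    }
```

On the Flesch scale, higher scores mean easier text; scores below 30, as reported for the ChatGPT responses here, correspond to very difficult prose suited to college graduates.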

RESULTS

A substantial proportion of the AI-generated responses (45.50 %) were incorrectly identified by the SLP panel as being written by other SLPs, indicating high perceived human-likeness (origin). Regarding content quality, 83.60 % of the responses were found to be accurate (incorrectness), 63.60 % were rated as harmless (harm), and 38.20 % were considered to cause only minor to moderate impact (extent of harm). In terms of professional alignment, 62 % of the responses reflected the prevailing views within the SLP community (consensus). The means ± standard deviations of FRES, GFSL, and D-CS were 26.52 ± 13.94 (readable for college graduates), 18.17 ± 3.39 (readable for graduate students), and 9.90 ± 1.08 (readable for 13th- to 15th-grade [college] readers), respectively. Each response contained an average of 99.73 words, 6.80 sentences, 17.44 WPS, 5.79 CPW, and 27.96 % difficult words. The correlation coefficients ranged from a large, significant negative value (r = -0.909, p < 0.05) to a very large positive value (r = 0.918, p < 0.05).
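Spearman's rank correlation, used here to relate the panel ratings to the readability features, is simply Pearson correlation computed on ranks, with ties assigned the average of the ranks they span. A self-contained sketch (variable names are illustrative, not from the paper):

```python
def rank(values):
    # Fractional (average) ranking, 1-based: tied values share the mean rank.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    # Pearson correlation applied to the ranks of x and y.
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den
```

Because it operates on ranks, the coefficient captures any monotonic relationship, not just linear ones, which suits ordinal panel ratings.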

CONCLUSION

The results suggest that ChatGPT has a promising capability to provide appropriate responses to frequently asked questions about stuttering, as attested by the fact that the panel of certified SLPs perceived about 45 % of its responses to be generated by SLPs. However, given the increasing accessibility of AI tools, particularly among individuals with limited access to professional services, it is crucial to emphasize that such tools are intended solely for educational purposes and should not replace diagnosis or treatment by qualified SLPs.


Similar Articles

1. Stench of Errors or the Shine of Potential: The Challenge of (Ir)Responsible Use of ChatGPT in Speech-Language Pathology. Int J Lang Commun Disord. 2025 Jul-Aug;60(4):e70088. doi: 10.1111/1460-6984.70088.
2. Using Artificial Intelligence ChatGPT to Access Medical Information About Chemical Eye Injuries: Comparative Study. JMIR Form Res. 2025 Aug 13;9:e73642. doi: 10.2196/73642.
3. Is Information About Musculoskeletal Malignancies From Large Language Models or Web Resources at a Suitable Reading Level for Patients? Clin Orthop Relat Res. 2025 Feb 1;483(2):306-315. doi: 10.1097/CORR.0000000000003263. Epub 2024 Sep 25.
4. Evaluating ChatGPT's Utility in Biologic Therapy for Systemic Lupus Erythematosus: Comparative Study of ChatGPT and Google Web Search. JMIR Form Res. 2025 Aug 28;9:e76458. doi: 10.2196/76458.
5. Can Artificial Intelligence Improve the Readability of Patient Education Materials? Clin Orthop Relat Res. 2023 Nov 1;481(11):2260-2267. doi: 10.1097/CORR.0000000000002668. Epub 2023 Apr 28.
6. Readability, reliability and quality of responses generated by ChatGPT, Gemini, and Perplexity for the most frequently asked questions about pain. Medicine (Baltimore). 2025 Mar 14;104(11):e41780. doi: 10.1097/MD.0000000000041780.
7. Home treatment for mental health problems: a systematic review. Health Technol Assess. 2001;5(15):1-139. doi: 10.3310/hta5150.