Frankenstein, thematic analysis and generative artificial intelligence: Quality appraisal methods and considerations for qualitative research.

Author information

Jowsey Tanisha, Stapleton Peta, Campbell Shawna, Davidson Alexandra, McGillivray Cher, Maugeri Isabella, Lee Megan, Keogh Justin

Affiliations

Faculty of Health Sciences and Medicine, Bond University, Gold Coast, Australia.

Faculty of Society and Design, Bond University, Gold Coast, Australia.

Publication information

PLoS One. 2025 Sep 5;20(9):e0330217. doi: 10.1371/journal.pone.0330217. eCollection 2025.

DOI: 10.1371/journal.pone.0330217
PMID: 40911617
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12412986/
Abstract

OBJECTIVE

To determine accuracy and efficiency of using generative artificial intelligence (GenAI) to undertake thematic analysis.

INTRODUCTION

With the increasing use of GenAI in data analysis, testing the reliability and suitability of using GenAI to conduct qualitative data analysis is needed. We propose a method for researchers to assess reliability of GenAI outputs using deidentified qualitative datasets.

METHODS

We searched three databases (United Kingdom Data Service, Figshare, and Google Scholar) and five journals (PlosOne, Social Science and Medicine, Qualitative Inquiry, Qualitative Research, Sociology Health Review) to identify studies on health-related topics, published prior to whereby: humans undertook thematic analysis and published both their analysis in a peer-reviewed journal and the associated dataset. We prompted a closed system GenAI (Microsoft Copilot) to undertake thematic analysis of these datasets and analysed the GenAI outputs in comparison with human outputs. Measures include time (GenAI only), accuracy, overlap with human analysis, and reliability of selected data and quotes.

RESULTS

Five studies were identified that met our inclusion criteria. The themes identified by human researchers and Copilot showed minimal overlap, with human researchers often using discursive thematic analyses (40%) and Copilot focusing on thematic analysis (100%). Copilot's outputs often included fabricated quotes (58%, SD = 45%), and none of the Copilot outputs provided participant spread by theme. Additionally, Copilot's outputs primarily drew themes and quotes from the first 2-3 pages of textual data, rather than from the entire dataset. Human researchers provided broader representation and accurate quotes (79% of quotes were correct, SD = 27%).

CONCLUSIONS

Based on these results, we cannot recommend the current version of Copilot for undertaking thematic analyses. This study raises concerns about the validity of both human-generated and GenAI-generated qualitative data analysis and reporting.
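
The quote-reliability measure named in the Methods can be illustrated with a short sketch. The abstract does not say how the authors checked quotes, so this is only an assumption: a quote counts as verified if it appears verbatim in the source transcript after case, whitespace, and curly quotation marks are normalised. The function names (`normalise`, `quote_is_verbatim`, `fabrication_rate`) and the sample data are hypothetical and do not come from the study.

```python
import re


def normalise(text: str) -> str:
    """Lower-case, straighten curly quotes, and collapse whitespace."""
    text = text.replace("\u2018", "'").replace("\u2019", "'")
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_is_verbatim(quote: str, transcript: str) -> bool:
    """Return True if the normalised quote occurs in the normalised transcript."""
    return normalise(quote) in normalise(transcript)


def fabrication_rate(quotes: list[str], transcript: str) -> float:
    """Fraction of quotes that cannot be found verbatim in the transcript."""
    if not quotes:
        raise ValueError("at least one quote is required")
    missing = sum(1 for q in quotes if not quote_is_verbatim(q, transcript))
    return missing / len(quotes)


# Hypothetical data, invented for illustration only (not from the study).
transcript = "I felt  supported by the nurses.\nThe waiting times were very long."
quotes = [
    "I felt supported by the nurses.",    # present in the transcript
    "The doctors never listened to me.",  # absent: counted as fabricated here
]
print(fabrication_rate(quotes, transcript))
```

A strict substring match like this would also flag lightly paraphrased or trimmed quotes as fabricated, so a real appraisal would likely pair it with manual review.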


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/81fc/12412986/5f04ba9fc352/pone.0330217.g001.jpg
