Generative AI for thematic analysis in a maternal health study: coding semistructured interviews using large language models.

Authors

Qiao Shan, Fang Xingyu, Wang Junbo, Zhang Ran, Li Xiaoming, Kang Yuhao

Affiliations

Department of Health Promotion, Education, and Behavior, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina, USA.

South Carolina SmartState Center for Healthcare Quality (CHQ), University of South Carolina, Columbia, South Carolina, USA.

Publication information

Appl Psychol Health Well Being. 2025 Jun;17(3):e70038. doi: 10.1111/aphw.70038.

Abstract

STUDY OBJECTIVES

The coding of semistructured interview transcripts is a critical step for thematic analysis of qualitative data. However, the coding process is often labor-intensive and time-consuming. The emergence of generative artificial intelligence (GenAI) presents new opportunities to enhance the efficiency of qualitative coding. This study proposed a computational pipeline using GenAI to automatically extract themes from interview transcripts.

METHODS

Using transcripts from interviews conducted with maternity care providers in South Carolina, we leveraged ChatGPT for inductive coding to generate codes from interview transcripts without a predetermined coding scheme. Structured prompts were designed to instruct ChatGPT to generate and summarize codes. The performance of GenAI was evaluated by comparing the AI-generated codes with those generated manually.
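
The abstract does not reproduce the prompts themselves, so the following is only a minimal sketch of what a structured prompt for inductive coding might look like. The wording of the prompt, the JSON reply format, and the helper names `build_inductive_coding_prompt` and `parse_codes` are all assumptions made for illustration; no model API is called here.

```python
import json


def build_inductive_coding_prompt(excerpt: str, max_codes: int = 5) -> str:
    """Assemble a structured prompt for inductive coding (hypothetical wording)."""
    return (
        "You are assisting with thematic analysis of a semistructured interview.\n"
        "No predetermined coding scheme is provided; derive codes from the text itself.\n"
        f"Return at most {max_codes} codes as a JSON list of objects with the keys "
        '"code", "summary", and "quote".\n\n'
        f"Transcript excerpt:\n{excerpt}"
    )


def parse_codes(response_text: str) -> list[dict]:
    """Parse a model reply, keeping only entries that carry all expected keys."""
    try:
        items = json.loads(response_text)
    except json.JSONDecodeError:
        return []  # Reply was not valid JSON; treat as no usable codes.
    if not isinstance(items, list):
        return []
    required = {"code", "summary", "quote"}
    return [
        item for item in items
        if isinstance(item, dict) and required.issubset(item)
    ]
```

Asking for a fixed reply structure is one way to make the generated codes easy to collect and compare against a manual codebook; the study may have structured its prompts differently.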

RESULTS

GenAI demonstrated promise in detecting and summarizing codes from interview transcripts. ChatGPT exhibited an overall accuracy exceeding 80% in inductive coding. More impressively, GenAI reduced the time required for coding by 81%.
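
The abstract does not state how accuracy or time savings were calculated. As a rough sketch under assumed definitions, accuracy can be read as the share of AI-generated codes that also appear in the manual code set, and time reduction as the relative drop in coding time. The function names and the exact-match rule below are hypothetical; in practice, deciding whether two codes match usually requires human judgment rather than string comparison.

```python
def _normalize(label: str) -> str:
    """Lowercase a code label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def coding_accuracy(ai_codes: list[str], manual_codes: list[str]) -> float:
    """Share of AI-generated codes found in the manual code set (assumed definition)."""
    manual = {_normalize(code) for code in manual_codes}
    ai = [_normalize(code) for code in ai_codes]
    if not ai:
        return 0.0
    return sum(code in manual for code in ai) / len(ai)


def time_reduction(manual_minutes: float, ai_minutes: float) -> float:
    """Relative reduction in coding time, as a fraction of the manual time."""
    if manual_minutes <= 0:
        raise ValueError("manual_minutes must be positive")
    return (manual_minutes - ai_minutes) / manual_minutes
```

Under these definitions, a reported 81% reduction would correspond to AI-assisted coding taking about 19% of the manual coding time.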

DISCUSSION

GenAI models are capable of efficiently processing language datasets and performing multi-level semantic identification. However, challenges such as inaccuracy, systematic biases, and privacy concerns must be acknowledged and addressed. Future research should focus on refining these models to enhance reliability and address inherent limitations associated with their application in qualitative research.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/651a/12083056/e2ecbb2ddb95/APHW-17-0-g001.jpg
