Qiao Shan, Fang Xingyu, Wang Junbo, Zhang Ran, Li Xiaoming, Kang Yuhao
Department of Health Promotion, Education, and Behavior, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina, USA.
South Carolina SmartState Center for Healthcare Quality (CHQ), University of South Carolina, Columbia, South Carolina, USA.
Appl Psychol Health Well Being. 2025 Jun;17(3):e70038. doi: 10.1111/aphw.70038.
The coding of semistructured interview transcripts is a critical step for thematic analysis of qualitative data. However, the coding process is often labor-intensive and time-consuming. The emergence of generative artificial intelligence (GenAI) presents new opportunities to enhance the efficiency of qualitative coding. This study proposed a computational pipeline using GenAI to automatically extract themes from interview transcripts.
Using transcripts from interviews conducted with maternity care providers in South Carolina, we leveraged ChatGPT for inductive coding to generate codes from interview transcripts without a predetermined coding scheme. Structured prompts were designed to instruct ChatGPT to generate and summarize codes. The performance of GenAI was evaluated by comparing the AI-generated codes with those generated manually.
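A structured prompt for inductive coding might be assembled as in the sketch below. This is illustrative only: the abstract does not reproduce the authors' prompts, so the function name `build_coding_prompt`, the `max_codes` parameter, and the prompt wording are all assumptions, and no model API is called.

```python
def build_coding_prompt(excerpt: str, max_codes: int = 5) -> str:
    """Assemble a structured inductive-coding prompt for one transcript excerpt.

    Hypothetical wording; the study's actual prompts are not given in the abstract.
    """
    instructions = [
        "You are assisting with thematic analysis of a semistructured interview.",
        "No predetermined coding scheme is provided; derive codes from the text itself.",
        f"Return at most {max_codes} codes.",
        "For each code, give a short label and a one-sentence summary.",
    ]
    # Keep the transcript text clearly delimited from the instructions.
    return "\n".join(instructions) + "\n\nTranscript excerpt:\n\"\"\"\n" + excerpt.strip() + "\n\"\"\""


if __name__ == "__main__":
    print(build_coding_prompt("We are short on nurses, so appointments get delayed."))
```

The resulting string would then be sent to whichever chat model is in use; that step is omitted here because it depends on the provider and on how the transcripts are de-identified.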
GenAI demonstrated promise in detecting and summarizing codes from interview transcripts. ChatGPT exhibited an overall accuracy exceeding 80% in inductive coding. GenAI also reduced the time required for coding by 81%.
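The two headline metrics can be illustrated with a minimal sketch. The abstract does not state how accuracy was defined, so the exact-match rule below (the share of manual codes that also appear among the AI-generated codes, after case and whitespace normalization) is an assumption, as are the function names and the sample values.

```python
def normalize(code: str) -> str:
    """Lowercase a code label and collapse surrounding/internal whitespace."""
    return " ".join(code.lower().split())


def coding_accuracy(ai_codes: list[str], manual_codes: list[str]) -> float:
    """Share of manual codes that also appear among the AI codes.

    Assumed definition (exact match after normalization); not taken from the study.
    """
    manual = {normalize(c) for c in manual_codes}
    if not manual:
        raise ValueError("manual_codes must not be empty")
    ai = {normalize(c) for c in ai_codes}
    return len(manual & ai) / len(manual)


def time_reduction(manual_minutes: float, ai_minutes: float) -> float:
    """Relative reduction in coding time: (manual - AI) / manual."""
    if manual_minutes <= 0:
        raise ValueError("manual_minutes must be positive")
    return (manual_minutes - ai_minutes) / manual_minutes


if __name__ == "__main__":
    # Made-up example values, not data from the study.
    manual = ["Staff shortage", "Telehealth access", "Burnout"]
    ai = ["staff shortage ", "Burnout", "Funding"]
    print(f"accuracy: {coding_accuracy(ai, manual):.2f}")
    print(f"time reduction: {time_reduction(100, 19):.0%}")
```

In practice, matching AI-generated and manual codes usually calls for human judgment or semantic similarity rather than exact string equality; exact matching is used here only to keep the sketch self-contained.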
GenAI models are capable of efficiently processing language datasets and performing multi-level semantic identification. However, challenges such as inaccuracy, systematic biases, and privacy concerns must be acknowledged and addressed. Future research should focus on refining these models to enhance reliability and address inherent limitations associated with their application in qualitative research.