Automated Survey Collection with LLM-based Conversational Agents

Author information

Kaiyrbekov Kurmanbek, Dobbins Nicholas J, Mooney Sean D

Affiliations

Cyberinfrastructure and Artificial Intelligence Platforms Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, USA.

Biomedical Informatics & Data Science, Department of Medicine, Johns Hopkins University, Baltimore, Maryland, USA.

Publication information

ArXiv. 2025 Apr 2:arXiv:2504.02891v1.

PMID:40735102

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12306824/

Abstract

OBJECTIVE

Traditional phone-based surveys are among the most accessible and widely used methods for collecting biomedical and healthcare data; however, they are often costly, labor-intensive, and difficult to scale. To overcome these limitations, we propose an end-to-end survey collection framework driven by conversational Large Language Models (LLMs).

MATERIALS AND METHODS

Our framework consists of four components: a researcher who designs the survey and recruits participants; a conversational phone agent, powered by an LLM, that calls participants and administers the survey; a second LLM (GPT-4o) that analyzes the conversation transcripts generated during the surveys; and a database for storing and organizing the results. To test the framework, we recruited 8 participants (5 native and 3 non-native English speakers) and administered 40 surveys. We evaluated the correctness of the LLM-generated conversation transcripts, the accuracy of the survey responses inferred by GPT-4o, and the overall participant experience.
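
The two quantitative evaluations named above, transcript correctness and response accuracy, can be illustrated with a minimal Python sketch. The abstract does not say how either metric was computed, so everything here is an assumption: the function names (word_error_rate, mean_per_line_wer, extraction_accuracy) are hypothetical, word error rate is taken as the word-level edit distance divided by the number of reference words after lowercasing and whitespace tokenization, and accuracy is taken as exact-match agreement between extracted and ground-truth answers.

```python
from typing import Dict, List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: lowercasing plus whitespace tokenization; the paper's
    actual text normalization is not described in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference line must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def mean_per_line_wer(reference_lines: List[str],
                      hypothesis_lines: List[str]) -> float:
    """Average the word error rate over aligned transcript lines."""
    if len(reference_lines) != len(hypothesis_lines):
        raise ValueError("transcripts must have the same number of lines")
    rates = [word_error_rate(r, h)
             for r, h in zip(reference_lines, hypothesis_lines)]
    return sum(rates) / len(rates)


def extraction_accuracy(true_answers: Dict[str, str],
                        extracted_answers: Dict[str, str]) -> float:
    """Fraction of questions whose extracted answer equals the ground truth.

    Assumption: exact string match per question; a missing answer
    counts as incorrect.
    """
    correct = sum(1 for question, answer in true_answers.items()
                  if extracted_answers.get(question) == answer)
    return correct / len(true_answers)
```

Averaging per line, rather than pooling all words, gives a short utterance the same weight as a long one; whether that matches the study's definition of "per-line" word error rate is not stated in the abstract.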

RESULTS

Survey responses were successfully extracted by GPT-4o from conversation transcripts with an average accuracy of 98% despite transcripts exhibiting an average per-line word error rate of 7.7%. While participants noted occasional errors made by the conversational LLM agent, they reported that the agent effectively conveyed the purpose of the survey, demonstrated good comprehension, and maintained an engaging interaction.

CONCLUSIONS

Our study highlights the potential of LLM agents in conducting and analyzing phone surveys for healthcare applications. By reducing the workload on human interviewers and offering a scalable solution, this approach paves the way for real-world, end-to-end AI-powered phone survey collection systems.
