

Evaluating for Evidence of Sociodemographic Bias in Conversational AI for Mental Health Support.

Author Information

Yeo Yee Hui, Peng Yuxin, Mehra Muskaan, Samaan Jamil, Hakimian Joshua, Clark Allistair, Suchak Karisma, Krut Zoe, Andersson Taiga, Persky Susan, Liran Omer, Spiegel Brennan

Affiliations

Karsh Division of Gastroenterology and Hepatology, Cedars-Sinai Medical Center, Los Angeles, California, USA.

School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, China.

Publication Information

Cyberpsychol Behav Soc Netw. 2025 Jan;28(1):44-51. doi: 10.1089/cyber.2024.0199. Epub 2024 Oct 24.

DOI: 10.1089/cyber.2024.0199
PMID: 39446671
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11807910/
Abstract

The integration of large language models (LLMs) into healthcare highlights the need to ensure their efficacy while mitigating potential harms, such as the perpetuation of biases. Current evidence on the existence of bias within LLMs remains inconclusive. In this study, we present an approach to investigate the presence of bias within an LLM designed for mental health support. We simulated physician-patient conversations by using a communication loop between an LLM-based conversational agent and digital standardized patients (DSPs) that engaged the agent in dialogue while remaining agnostic to sociodemographic characteristics. In contrast, the conversational agent was made aware of each DSP's characteristics, including age, sex, race/ethnicity, and annual income. The agent's responses were analyzed with the Linguistic Inquiry and Word Count tool to discern potential systematic biases. Multivariate regression analysis, trend analysis, and group-based trajectory models were used to quantify potential biases. Among 449 conversations, there was no evidence of bias in either the descriptive assessments or the multivariable linear regression analyses. Moreover, when evaluating changes in mean tone scores throughout a dialogue, the conversational agent showed understanding of the DSPs' chief complaints and elevated the DSPs' tone scores over the course of each conversation. This finding did not vary by any sociodemographic characteristic of the DSP. Using an objective methodology, our study did not uncover significant evidence of bias within an LLM-enabled mental health conversational agent. These findings offer a complementary approach to examining bias in LLM-based conversational agents for mental health support.
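The evaluation pipeline the abstract describes — run many agent-DSP conversations across sociodemographic profiles, tone-score the agent's replies, then compare scores across groups — can be sketched as below. This is a minimal illustrative simulation, not the authors' code: the profile grid, the `simulated_tone_score` function, and all names are assumptions standing in for the real LLM agent and the Linguistic Inquiry and Word Count (LIWC) scoring used in the study.

```python
import random
import statistics

random.seed(0)  # deterministic demo

# Hypothetical DSP profiles crossing age, sex, and income
# (the study also varied race/ethnicity).
PROFILES = [
    {"age": a, "sex": s, "income": i}
    for a in (25, 45, 65)
    for s in ("female", "male")
    for i in ("low", "high")
]

def simulated_tone_score(profile, turn):
    """Stand-in for a LIWC tone score of one agent reply.

    The score rises with the turn index and ignores the profile,
    mirroring the paper's finding: tone improves over the dialogue
    and does not vary with sociodemographic characteristics.
    """
    return 40 + 5 * turn + random.gauss(0, 2)

def run_conversation(profile, turns=5):
    # One simulated dialogue: tone-score each agent turn in order.
    return [simulated_tone_score(profile, t) for t in range(turns)]

# Pool tone scores by sex; "no bias" shows up as near-equal group means.
by_sex = {"female": [], "male": []}
for p in PROFILES:
    by_sex[p["sex"]].extend(run_conversation(p))

gap = abs(statistics.mean(by_sex["female"]) - statistics.mean(by_sex["male"]))
print(f"mean tone gap between groups: {gap:.2f}")
```

In the actual study, the group comparison was done with multivariable regression and group-based trajectory models over 449 conversations rather than a simple difference of means.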
