Detecting Conversation Topics in Recruitment Calls of African American Participants to the All of Us Research Program Using Machine Learning: Model Development and Validation Study.

Author information

Pemu Priscilla, Prude Michael, McCaslin Atuarra, Ojemakinde Elizabeth, Awad Christopher, Igwe Kelechi, Rodriguez Anny, Foriest Jasmine, Idris Muhammed

Affiliations

Department of Medicine, Morehouse School of Medicine, Atlanta, GA, United States.

Clinical Research Center, Morehouse School of Medicine, Atlanta, GA, United States.

Publication information

JMIR Form Res. 2025 Jul 17;9:e65320. doi: 10.2196/65320.

Abstract

BACKGROUND

Advancements in science and technology can exacerbate health disparities, particularly when there is a lack of diversity in clinical research, which limits the benefits of innovations for underrepresented communities. Programs like the All of Us Research Program (AoURP) are actively working to address this issue by ensuring that underrepresented populations are represented in biomedical research, promoting equitable participation, and advancing health outcomes for all. African American communities have been particularly underrepresented in clinical research, often due to historical instances of research misconduct, such as the Tuskegee Syphilis Study, which have deeply impacted trust and willingness to participate in research studies. With the US population becoming increasingly diverse, it is crucial that clinical research studies reflect this diversity to improve health outcomes. However, limited data and small sample sizes in qualitative studies on the inclusion of underrepresented groups hinder progress in this area.

OBJECTIVE

The goal of this paper is to analyze recruitment conversations between research assistants (RAs) and potential participants in the AoURP to identify key topics that influence enrollment. By examining these interactions, we aim to provide insights that can improve engagement strategies and recruitment practices for underrepresented groups in biomedical research.

METHODS

We conducted an observational, retrospective study using machine learning for content analysis. Specifically, we used structural topic modeling to identify and compare latent topics of conversation in recruitment calls made by Morehouse School of Medicine RAs between February 2021 and April 2022, estimating expected topic proportions in the corpus as a function of enrollment and participation in AoURP.
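To make the idea of "expected topic proportions as a function of enrollment" concrete, the following is a minimal sketch, not the authors' pipeline. It assumes per-call topic proportions have already been estimated by some topic model and simply compares their group means, which is what a regression on a single binary covariate reduces to. A structural topic model instead incorporates the covariate during model fitting, so this is a simplified post hoc stand-in. The data values and the function names `expected_proportions` and `enrollment_effect` are hypothetical.

```python
from statistics import mean

# Hypothetical toy data: each call has an enrollment flag and a vector of
# topic proportions (theta) that sums to 1. These values are invented for
# illustration and are NOT taken from the study.
calls = [
    {"enrolled": True,  "theta": [0.6, 0.3, 0.1]},
    {"enrolled": True,  "theta": [0.4, 0.5, 0.1]},
    {"enrolled": False, "theta": [0.2, 0.3, 0.5]},
    {"enrolled": False, "theta": [0.2, 0.1, 0.7]},
]


def expected_proportions(calls, enrolled):
    """Mean proportion of each topic among calls with the given enrollment flag."""
    rows = [c["theta"] for c in calls if c["enrolled"] == enrolled]
    n_topics = len(rows[0])
    return [mean(row[k] for row in rows) for k in range(n_topics)]


def enrollment_effect(calls):
    """Per-topic difference in expected proportion: enrolled minus not enrolled."""
    enrolled_means = expected_proportions(calls, True)
    not_enrolled_means = expected_proportions(calls, False)
    return [e - n for e, n in zip(enrolled_means, not_enrolled_means)]


effects = enrollment_effect(calls)
for topic_index, effect in enumerate(effects):
    # Positive values mean the topic is more prevalent in calls that led to enrollment.
    print(f"topic {topic_index}: {effect:+.2f}")
```

In this sketch a positive difference marks a topic that is more prevalent in calls that ended in enrollment, and a negative difference marks one more prevalent in calls that did not. A full analysis would also report uncertainty around each difference, which this example omits.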

RESULTS

In total, our model estimated 45 topics, of which 12 were identified as coherent. Notable topics that were more likely to occur in conversations between RAs and participants who enrolled and participated included closing or following up to schedule an appointment; COVID-19 protocols for in-person visits; explaining precision medicine and the need for representation; and working through objections, including concerns about costs, insurance, care changes, and health fears. Topics among potential participants who did not enroll included technical challenges and descriptions of physical measurement visits (eg, collection of basic physical data, such as height, weight, and blood pressure).

CONCLUSIONS

Using an approach that leverages machine learning to identify topical structure and themes with limited human subjectivity is a promising strategy to identify gaps in, and opportunities to improve, the recruitment of underserved communities into clinical trials.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4cb/12314466/992c15ce48ac/formative_v9i1e65320_fig1.jpg
