Toward Using Twitter for PrEP-Related Interventions: An Automated Natural Language Processing Pipeline for Identifying Gay or Bisexual Men in the United States.

Author affiliations

Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.

Department of Family and Community Health, School of Nursing, University of Pennsylvania, Philadelphia, PA, United States.

Publication information

JMIR Public Health Surveill. 2022 Apr 25;8(4):e32405. doi: 10.2196/32405.

Abstract

BACKGROUND

Pre-exposure prophylaxis (PrEP) is highly effective at preventing the acquisition of HIV. There is a substantial gap, however, between the number of people in the United States who have indications for PrEP and the number of them who are prescribed PrEP. Although Twitter content has been analyzed as a source of PrEP-related data (eg, barriers), methods have not been developed to enable the use of Twitter as a platform for implementing PrEP-related interventions.

OBJECTIVE

Men who have sex with men (MSM) are the population most affected by HIV in the United States. Therefore, the objectives of this study were to (1) develop an automated natural language processing (NLP) pipeline for identifying men in the United States who have reported on Twitter that they are gay, bisexual, or MSM and (2) assess the extent to which they demographically represent MSM in the United States with new HIV diagnoses.

METHODS

Between September 2020 and January 2021, we used the Twitter Streaming Application Programming Interface (API) to collect more than 3 million tweets containing keywords that men may include in posts reporting that they are gay, bisexual, or MSM. We deployed handwritten, high-precision regular expressions (designed to filter out noise and identify actual self-reports) on the tweets and their user profile metadata. We identified 10,043 unique users geolocated in the United States and drew upon a validated NLP tool to automatically identify their ages.
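The regular-expression filtering step described above might look something like the following minimal sketch. The patterns, the noise cues, and the function names (`is_self_report`, `user_matches`) are hypothetical and chosen only for illustration; the abstract does not list the authors' actual expressions.

```python
import re

# Hypothetical self-report pattern, for illustration only (not the authors' expressions).
SELF_REPORT = re.compile(
    r"\b(?:i\s+am|i['’]?m)\s+(?:a\s+)?(?:proud\s+)?"
    r"(?:gay|bisexual|bi)\s+(?:man|guy|male|dude)\b",
    re.IGNORECASE,
)

# Hypothetical noise cues: words shortly before a match that suggest the
# statement is hypothetical or attributed to someone else.
NOISE = re.compile(
    r"\b(?:if|thinks?|assume[sd]?|pretend(?:ing|ed)?)\b[^.!?]{0,20}$",
    re.IGNORECASE,
)


def is_self_report(text: str) -> bool:
    """Return True if the text contains a first-person self-report with no noise cue before it."""
    match = SELF_REPORT.search(text)
    if match is None:
        return False
    prefix = text[: match.start()]  # text immediately preceding the match
    return NOISE.search(prefix) is None


def user_matches(tweet_text: str, profile_description: str) -> bool:
    """Check both the tweet and the user's profile description."""
    return is_self_report(tweet_text) or is_self_report(profile_description)
```

Favoring narrow first-person patterns and rejecting matches preceded by a noise cue trades recall for precision, which is the trade-off the abstract describes.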

RESULTS

By manually distinguishing true- and false-positive self-reports in the tweets or profiles of 1000 (10%) of the 10,043 users identified by our automated pipeline, we established that our pipeline has a precision of 0.85. Among the 8756 users for which a US state-level geolocation was detected, 5096 (58.2%) were in the 10 states with the highest numbers of new HIV diagnoses. Among the 6240 users for which a county-level geolocation was detected, 4252 (68.1%) were in counties or states considered priority jurisdictions by the Ending the HIV Epidemic initiative. Furthermore, the age distribution of the users reflected that of MSM in the United States with new HIV diagnoses.
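The percentages reported above can be rechecked with a few lines of arithmetic. In this sketch the counts are taken from the abstract, while the 850/150 split of the 1000 annotated users is an assumption chosen only to be consistent with the reported precision of 0.85; the abstract does not state the exact true- and false-positive counts.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def precision(true_positives: int, false_positives: int) -> float:
    """Precision = TP / (TP + FP)."""
    return true_positives / (true_positives + false_positives)


# Counts reported in the abstract
state_share = share(5096, 8756)    # users in the 10 states with the most new HIV diagnoses
county_share = share(4252, 6240)   # users in Ending the HIV Epidemic priority jurisdictions
sample_share = share(1000, 10043)  # fraction of users manually annotated

# Assumed split of the 1000 annotated users (hypothetical, consistent with 0.85)
assumed_precision = precision(true_positives=850, false_positives=150)

print(state_share, county_share, sample_share, assumed_precision)
```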

CONCLUSIONS

Our automated NLP pipeline can be used to identify MSM in the United States who may be at risk of acquiring HIV, laying the groundwork for using Twitter on a large scale to directly target PrEP-related interventions at this population.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c6b8/9086871/cba4f4398c18/publichealth_v8i4e32405_fig1.jpg
