• 文献检索
  • 文档翻译
  • 深度研究
  • 学术资讯
  • Suppr Zotero 插件Zotero 插件
  • 邀请有礼
  • 套餐&价格
  • 历史记录
应用&插件
Suppr Zotero 插件Zotero 插件浏览器插件Mac 客户端Windows 客户端微信小程序
定价
高级版会员购买积分包购买API积分包
服务
文献检索文档翻译深度研究API 文档MCP 服务
关于我们
关于 Suppr公司介绍联系我们用户协议隐私条款
关注我们

Suppr 超能文献

核心技术专利:CN118964589B侵权必究
粤ICP备2023148730 号-1Suppr @ 2026

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验

利用OpenAI的GPT-3.5 Turbo模型探索新冠疫情期间Reddit上关于炎症性肠病的讨论:分类模型验证与案例研究

Exploring Inflammatory Bowel Disease Discourse on Reddit Throughout the COVID-19 Pandemic Using OpenAI's GPT-3.5 Turbo Model: Classification Model Validation and Case Study.

作者信息

Babinski Tyler, Karley Sara, Cooper Marita, Shaik Salma, Wang Y Ken

机构信息

Division of Gastroenterology, Hepatology, and Nutrition, Children's Hospital of Philadelphia, Philadelphia, PA, United States.

Division of Management and Education, University of Pittsburgh at Bradford, Bradford, PA, United States.

出版信息

J Med Internet Res. 2025 Jul 3;27:e53332. doi: 10.2196/53332.

DOI:10.2196/53332
PMID:40607732
原文链接:https://pmc.ncbi.nlm.nih.gov/articles/PMC12271966/
Abstract

BACKGROUND

Inflammatory bowel disease (IBD) is a chronic autoimmune disorder with an increasing prevalence in the general population. Internet-based communities have become vital for communication among patients with IBD, especially throughout the COVID-19 pandemic. However, these internet-based patient-to-patient communications remain largely underexplored.

OBJECTIVE

This study aims to analyze community posts from 3 of the largest IBD support groups on Reddit between March 1, 2020, and December 31, 2022, using a pretrained transformer model, and to validate the classification system's results via comparison to human scoring.

METHODS

We collected posts (N=53,333) from subreddits r/CrohnsDisease, r/UlcerativeColitis, and r/IBD and classified them using OpenAI's GPT-3.5 Turbo model to determine sentiment, categorize topics, and identify demographic information and mentions of the COVID-19 pandemic. A subset of posts (n=397) was manually scored to measure interrater agreement between human raters and the GPT-3.5 Turbo model.

RESULTS

Fleiss κ and Gwet AC1 coefficients indicated a high level of agreement between raters, with values ranging from 0.53 to 0.91. The raters demonstrated almost perfect agreement on the classification of gender, with a Fleiss κ of 0.91 (P<.001). Medications (14,909/53,333) and symptoms (14,939/53,333) emerged as the most discussed topics, and most posts conveyed a neutral sentiment. While most users did not disclose their age, those who did primarily belonged to the 20-29 years (2392/4828) and 30-39 years (859/4828) age groups. Based on self-reported gender, we identified 1509 men and 1502 women among our IBD Reddit users. When comparing the users on the IBD subreddits to the general IBD population, there was a significant difference in gender distribution (N=3,090,011; χ=69.53; P<.001; φ<0.001). After an initial spike in posts within the first month, most posts did not reference the COVID-19 pandemic.

CONCLUSIONS

Our study showcases the potential of generative pretrained transformer models in processing and extracting insights from medical social media data. Future research can benefit from further subanalyses of our validated dataset or use OpenAI's model to analyze social media data for other conditions, particularly those for which patient experiences are challenging to collect.

摘要

背景

炎症性肠病(IBD)是一种慢性自身免疫性疾病,在普通人群中的患病率呈上升趋势。基于互联网的社区对于IBD患者之间的交流变得至关重要,尤其是在整个新冠疫情期间。然而,这些基于互联网的患者之间的交流在很大程度上仍未得到充分探索。

目的

本研究旨在使用预训练的Transformer模型分析2020年3月1日至2022年12月31日期间Reddit上3个最大的IBD支持小组的社区帖子,并通过与人工评分比较来验证分类系统的结果。

方法

我们从子版块r/CrohnsDisease、r/UlcerativeColitis和r/IBD收集了帖子(N = 53333),并使用OpenAI的GPT-3.5 Turbo模型对其进行分类,以确定情感、对主题进行分类,并识别人口统计学信息以及提及的新冠疫情。手动对一部分帖子(n = 397)进行评分,以测量人工评分者与GPT-3.5 Turbo模型之间的评分者间一致性。

结果

Fleiss κ和Gwet AC1系数表明评分者之间具有高度一致性,值范围为0.53至0.91。评分者在性别分类上表现出几乎完美的一致性,Fleiss κ为0.91(P <.001)。药物(14909/53333)和症状(14939/53333)是讨论最多的主题,大多数帖子传达出中性情感。虽然大多数用户未透露其年龄,但透露年龄的用户主要属于20 - 29岁(2392/4828)和30 - 三十九岁(859/4828)年龄组。根据自我报告的性别,我们在IBD Reddit用户中识别出1509名男性和1502名女性。将IBD子版块上的用户与一般IBD人群进行比较时,性别分布存在显著差异(N = 3090011;χ = 69.53;P <.001;φ < 0.001)。在第一个月内帖子数量出现初始峰值后,大多数帖子未提及新冠疫情。

结论

我们的研究展示了生成式预训练Transformer模型在处理和从医学社交媒体数据中提取见解方面的潜力。未来的研究可以从对我们经过验证的数据集进行进一步的子分析中受益,或者使用OpenAI的模型来分析其他疾病的社交媒体数据,特别是那些患者体验难以收集的疾病。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/749b/12271966/314c1361b21b/jmir_v27i1e53332_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/749b/12271966/717dd8b96496/jmir_v27i1e53332_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/749b/12271966/7d62c6db3bfd/jmir_v27i1e53332_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/749b/12271966/d91b22bddf33/jmir_v27i1e53332_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/749b/12271966/314c1361b21b/jmir_v27i1e53332_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/749b/12271966/717dd8b96496/jmir_v27i1e53332_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/749b/12271966/7d62c6db3bfd/jmir_v27i1e53332_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/749b/12271966/d91b22bddf33/jmir_v27i1e53332_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/749b/12271966/314c1361b21b/jmir_v27i1e53332_fig4.jpg

相似文献

1
Exploring Inflammatory Bowel Disease Discourse on Reddit Throughout the COVID-19 Pandemic Using OpenAI's GPT-3.5 Turbo Model: Classification Model Validation and Case Study.利用OpenAI的GPT-3.5 Turbo模型探索新冠疫情期间Reddit上关于炎症性肠病的讨论:分类模型验证与案例研究
J Med Internet Res. 2025 Jul 3;27:e53332. doi: 10.2196/53332.
2
Patient education interventions for the management of inflammatory bowel disease.炎症性肠病管理的患者教育干预措施。
Cochrane Database Syst Rev. 2023 May 4;5(5):CD013854. doi: 10.1002/14651858.CD013854.pub2.
3
Sentiment Analysis Using a Large Language Model-Based Approach to Detect Opioids Mixed With Other Substances Via Social Media: Method Development and Validation.使用基于大语言模型的方法通过社交媒体检测与其他物质混合的阿片类药物的情感分析:方法开发与验证
JMIR Infodemiology. 2025 Jun 19;5:e70525. doi: 10.2196/70525.
4
Sexual Harassment and Prevention Training性骚扰与预防培训
5
Signs and symptoms to determine if a patient presenting in primary care or hospital outpatient settings has COVID-19.在基层医疗机构或医院门诊环境中,如果患者出现以下症状和体征,可判断其是否患有 COVID-19。
Cochrane Database Syst Rev. 2022 May 20;5(5):CD013665. doi: 10.1002/14651858.CD013665.pub3.
6
Stigma Management Strategies of Autistic Social Media Users.自闭症社交媒体用户的污名管理策略
Autism Adulthood. 2025 May 28;7(3):273-282. doi: 10.1089/aut.2023.0095. eCollection 2025 Jun.
7
Risk of thromboembolism in patients with COVID-19 who are using hormonal contraception.COVID-19 患者使用激素避孕的血栓栓塞风险。
Cochrane Database Syst Rev. 2023 Jan 9;1(1):CD014908. doi: 10.1002/14651858.CD014908.pub2.
8
The effect of sample site and collection procedure on identification of SARS-CoV-2 infection.样本采集部位和采集程序对严重急性呼吸综合征冠状病毒2(SARS-CoV-2)感染鉴定的影响。
Cochrane Database Syst Rev. 2024 Dec 16;12(12):CD014780. doi: 10.1002/14651858.CD014780.
9
Measures implemented in the school setting to contain the COVID-19 pandemic.学校为控制 COVID-19 疫情而采取的措施。
Cochrane Database Syst Rev. 2022 Jan 17;1(1):CD015029. doi: 10.1002/14651858.CD015029.
10
Improving Suicidal Ideation Detection in Social Media Posts: Topic Modeling and Synthetic Data Augmentation Approach.提高社交媒体帖子中自杀意念检测的能力:主题建模与合成数据增强方法
JMIR Form Res. 2025 Jun 11;9:e63272. doi: 10.2196/63272.

本文引用的文献

1
Using Large Language Models for sentiment analysis of health-related social media data: empirical evaluation and practical tips.使用大语言模型对健康相关社交媒体数据进行情感分析:实证评估与实用技巧
AMIA Annu Symp Proc. 2025 May 22;2024:503-512. eCollection 2024.
2
An Insight into Patients' Perspectives of Ulcerative Colitis Flares via Analysis of Online Public Forum Posts.通过分析在线公共论坛帖子洞察溃疡性结肠炎发作患者的观点。
Inflamm Bowel Dis. 2024 Oct 3;30(10):1748-1758. doi: 10.1093/ibd/izad247.
3
Users' Concerns About Endometriosis on Social Media: Sentiment Analysis and Topic Modeling Study.
社交媒体中用户对子宫内膜异位症的关注:情感分析和主题建模研究。
J Med Internet Res. 2023 Aug 15;25:e45381. doi: 10.2196/45381.
4
ChatGPT outperforms humans in emotional awareness evaluations.ChatGPT在情绪感知评估方面表现优于人类。
Front Psychol. 2023 May 26;14:1199058. doi: 10.3389/fpsyg.2023.1199058. eCollection 2023.
5
The Effect of Monetary Incentives on Health Care Social Media Content: Study Based on Topic Modeling and Sentiment Analysis.金钱激励对医疗保健社交媒体内容的影响:基于主题建模和情感分析的研究。
J Med Internet Res. 2023 May 11;25:e44307. doi: 10.2196/44307.
6
What are IBD Patients Talking About on Twitter? Using Natural Language Understanding to Investigate Patients' Tweets.炎症性肠病患者在推特上都在谈论什么?运用自然语言理解来研究患者的推文。
SN Comput Sci. 2023;4(4):343. doi: 10.1007/s42979-023-01772-7. Epub 2023 Apr 20.
7
Topics Analysis of Reddit and Twitter Posts Discussing Inflammatory Bowel Disease and Distress From 2017 to 2019.2017年至2019年Reddit和Twitter上讨论炎症性肠病及痛苦的帖子主题分析
Crohns Colitis 360. 2021 Jul 7;3(3):otab044. doi: 10.1093/crocol/otab044. eCollection 2021 Jul.
8
Associations between inflammatory bowel disease, social isolation, and mortality: evidence from a longitudinal cohort study.炎症性肠病、社会隔离与死亡率之间的关联:一项纵向队列研究的证据。
Therap Adv Gastroenterol. 2022 Sep 30;15:17562848221127474. doi: 10.1177/17562848221127474. eCollection 2022.
9
Treatment of Inflammatory Bowel Disease: A Comprehensive Review.炎症性肠病的治疗:全面综述
Front Med (Lausanne). 2021 Dec 20;8:765474. doi: 10.3389/fmed.2021.765474. eCollection 2021.
10
Impact of the COVID-19 pandemic on inflammatory bowel disease: The role of emotional stress and social isolation.COVID-19 大流行对炎症性肠病的影响:情绪压力和社会隔离的作用。
Stress Health. 2022 Apr;38(2):222-233. doi: 10.1002/smi.3080. Epub 2021 Jul 26.