Text classification for assisting moderators in online health communities.

Affiliation

Department of Telecommunication, Information Studies, and Media, Michigan State University, 404 Wilson Rd, Rm 409, East Lansing, MI 48864, USA.

Publication information

J Biomed Inform. 2013 Dec;46(6):998-1005. doi: 10.1016/j.jbi.2013.08.011. Epub 2013 Sep 8.

DOI: 10.1016/j.jbi.2013.08.011
PMID: 24025513
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3874858/

Abstract

OBJECTIVES

Patients increasingly visit online health communities for help managing their health. These communities are too large for moderators to engage in every conversation, yet some conversations need their expertise. Our work applies low-cost text classification methods to a new problem: determining whether a thread in an online health forum needs moderators' help.

METHODS

We applied a binary classifier to data from WebMD's online diabetes community. To train the classifier, we considered three feature types: (1) word unigrams, (2) sentiment analysis features, and (3) thread length. We applied feature selection based on χ² statistics, and used undersampling to account for unbalanced data. We then performed a qualitative error analysis to investigate the appropriateness of the gold standard.
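The two preprocessing steps named above, χ² feature selection over word unigrams and undersampling of the majority class, can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names (`chi2_scores`, `select_top_k`, `undersample`), the toy threads, and the choice to balance before selecting features are all assumptions made for the example.

```python
import random
from collections import Counter


def chi2_scores(docs, labels):
    """Chi-square score of each unigram against a binary label (2x2 table)."""
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    df_pos, df_neg = Counter(), Counter()
    for text, y in zip(docs, labels):
        # Count document frequency, not raw term frequency.
        for word in set(text.lower().split()):
            (df_pos if y == 1 else df_neg)[word] += 1
    scores = {}
    for word in set(df_pos) | set(df_neg):
        a = df_pos[word]   # positive threads containing the word
        b = df_neg[word]   # negative threads containing the word
        c = n_pos - a      # positive threads without the word
        d = n_neg - b      # negative threads without the word
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[word] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring unigrams (ties broken alphabetically)."""
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


def undersample(docs, labels, seed=0):
    """Randomly drop majority-class examples until both classes are equal."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = sorted(minority + rng.sample(majority, len(minority)))
    return [docs[i] for i in keep], [labels[i] for i in keep]


# Hypothetical toy threads; label 1 means "needs a moderator".
threads = [
    "my sugar is very high please help",
    "thanks for the recipe",
    "lovely weather today",
    "help my insulin dose seems wrong",
    "good morning everyone",
    "great recipe thanks",
]
needs_moderator = [1, 0, 0, 1, 0, 0]

balanced_docs, balanced_labels = undersample(threads, needs_moderator)
vocabulary = select_top_k(chi2_scores(balanced_docs, balanced_labels), k=5)
print(vocabulary)
```

The selected vocabulary would then define the unigram features passed to whatever binary classifier is used; sentiment features and thread length would be appended as extra columns.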

RESULTS

Using sentiment analysis features, feature selection, and balanced training data increased the AUC to as much as 0.75 and the F1-score to as much as 0.54, compared with a baseline of word unigrams with no feature selection on unbalanced data (0.65 AUC and 0.40 F1-score). The error analysis uncovered additional reasons why moderators respond to patients' posts.
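For readers unfamiliar with the two metrics reported here, the sketch below computes AUC and F1 from classifier scores in plain Python. It is only an illustration of the standard definitions: the function names, the 0.5 threshold, and the toy scores are assumptions, and nothing here reproduces the paper's numbers.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def precision_recall_f1(scores, labels, threshold=0.5):
    """Precision, recall, and F1 when scores >= threshold are predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical classifier scores for five threads (1 = needs a moderator).
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_labels = [1, 0, 1, 0, 0]
print(auc(toy_scores, toy_labels))
print(precision_recall_f1(toy_scores, toy_labels))
```

AUC is threshold-free and summarizes ranking quality, whereas F1 depends on the chosen threshold, which is one reason the two values can move by different amounts.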

DISCUSSION

We showed how feature selection methods and balanced training data can improve the overall classification performance. We present implications of weighing precision versus recall for assisting moderators of online health communities. Our error analysis uncovered social, legal, and ethical issues around addressing community members' needs. We also note challenges in producing a gold standard, and discuss potential solutions for addressing these challenges.
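One common way to make the precision-versus-recall trade-off explicit is the F-beta score, which weights recall beta times as heavily as precision. The abstract does not say the authors used it; the sketch below is an assumption-laden illustration with hypothetical operating points, showing how the preferred classifier setting changes with beta.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score: beta > 1 favors recall, beta < 1 favors precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Two hypothetical operating points of the same classifier.
high_recall = {"precision": 0.5, "recall": 1.0}     # flags many threads
high_precision = {"precision": 1.0, "recall": 0.5}  # flags few threads

for beta in (0.5, 1.0, 2.0):
    print(
        beta,
        round(f_beta(beta=beta, **high_recall), 3),
        round(f_beta(beta=beta, **high_precision), 3),
    )
```

If missing a thread that needs help is costlier than wasting a moderator's time, a beta above 1 ranks the high-recall setting first; if moderator time is the scarcer resource, a beta below 1 favors the high-precision setting.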

CONCLUSION

Social media environments provide popular venues in which patients gain health-related information. Our work contributes to understanding scalable solutions for providing moderators' expertise in these large-scale, social media environments.
