Augmented Ensemble Model (AEM) for health trends prediction on social networks.

Authors

Saini Sonia, Agarwal Ruchi, Singh S P, Gupta Punit, Vidhyarthi Ankit, Verma Rohit

Affiliations

Associate Consultant, Tata Consultancy Services, Noida, India.

Professor, Computer Applications Department, JIMS Engineering Management Technical Campus, Greater Noida, India.

Publication information

PLoS One. 2025 Jun 5;20(6):e0323449. doi: 10.1371/journal.pone.0323449. eCollection 2025.


DOI: 10.1371/journal.pone.0323449
PMID: 40472295
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12140654/
Abstract

Social media has given rise to an ever-connected world at an exponential pace. Health data that was once confined to hospital or clinical records is now shared as text over social media. Information and updates regarding pandemic outbreaks, clinical visit results, general health, and similar topics are being analyzed. The data is shared ever more frequently in formats such as images, text, documents, and videos. With fast streaming systems and no constraints on storage space, this shared rich media data is both voluminous and informative. Shared health data, such as discussions of ailments, hospital visits, general well-being updates, and drug research updates posted via the official Twitter handles of pharmaceutical companies and healthcare organizations, poses a distinctive challenge for analysis. The text indicating an ailment ranges from proper medical jargon to common names for the same condition, yet the intent in predicting the disease or ailment term is the same. This paper focuses on how to extract and analyze health-related data exchanged on social media and introduces an Augmented Ensemble Model (AEM), which identifies frequently shared health topics and discussions on social networks in order to predict emerging health trends. The analytical model works with chronological datasets to classify text into health-related topics. This hybrid model uses text data augmentation to address class imbalance for health terms and further employs a clustering technique for location-based aggregation. An algorithm for a health-term word vector embedding model is formulated. This word vector model is then used in text data augmentation to reduce the class imbalance. We evaluate the accuracy of the classifiers by constructing a machine learning pipeline. For our Augmented Ensemble Model, text classification accuracy is evaluated after augmentation using a voting ensemble technique, and greater accuracy is observed. Emerging health trends are analyzed via temporal classification and location-wise aggregation of the health terms. The model demonstrates that a text-augmented ensemble machine learning approach for health topics is more efficient than conventional machine learning classification techniques.
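The abstract names two steps, text augmentation to reduce class imbalance and a voting ensemble, without giving their details. The following is a minimal sketch of both ideas under stated assumptions, not the authors' implementation. The `SYNONYMS` table is a hypothetical stand-in for nearest neighbours drawn from a health-term word vector model, and the function names `augment`, `balance`, and `hard_vote` are illustrative only.

```python
import random
from collections import Counter

# Hypothetical synonym table (assumption): stands in for nearest
# neighbours looked up in a health-term word vector model.
SYNONYMS = {
    "hypertension": ["high blood pressure"],
    "influenza": ["flu"],
    "myocardial infarction": ["heart attack"],
}


def augment(text, rng):
    """Return a copy of text with each known medical term swapped for a synonym."""
    for term, alternatives in SYNONYMS.items():
        if term in text:
            text = text.replace(term, rng.choice(alternatives))
    return text


def balance(samples, seed=0):
    """Add augmented copies of minority-class texts until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append(text)
    target = max(len(texts) for texts in by_label.values())
    balanced = list(samples)
    for label, texts in by_label.items():
        for i in range(target - len(texts)):
            # Cycle through the existing texts of this class.
            balanced.append((augment(texts[i % len(texts)], rng), label))
    return balanced


def hard_vote(predictions):
    """Return the majority label among base-classifier predictions.
    Ties resolve to the label that appears first in the list."""
    return Counter(predictions).most_common(1)[0][0]
```

In use, `balance` would be applied to the labelled training texts before fitting the base classifiers, and `hard_vote` would combine their per-text predictions. A soft-voting variant that averages predicted probabilities is an equally plausible reading of "voting ensemble"; the abstract does not say which one is used.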

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bc5/12140654/932b1701f294/pone.0323449.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bc5/12140654/8078af9797b9/pone.0323449.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bc5/12140654/351b0ffe55db/pone.0323449.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bc5/12140654/b6ad46c32f16/pone.0323449.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bc5/12140654/b5f714347e9a/pone.0323449.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bc5/12140654/bd3a73c277f2/pone.0323449.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bc5/12140654/ef11393ceed2/pone.0323449.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bc5/12140654/c2b7d66768b9/pone.0323449.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bc5/12140654/d20ffc1e2357/pone.0323449.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bc5/12140654/1068cf76ebe3/pone.0323449.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bc5/12140654/260fe85680ad/pone.0323449.g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bc5/12140654/0cbbff7998c3/pone.0323449.g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bc5/12140654/45c10090261c/pone.0323449.g013.jpg
