Liu Ying, Hou Yu, Yeung Jeremy, Thao Tou, Song Meijia, Rizvi Rubina, Bian Jiang, Zhang Rui
University of Minnesota, Twin Cities, MN, USA.
AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:322-331. eCollection 2025.
This study advances relationship identification in social media by analyzing dietary supplement-related tweets, with the aim of expanding iDisk, a dataset of drug-supplement interactions. We collected more than 90,000 tweets (2007-2022) and annotated 1,000 of them for nuanced relationships and entities. Using a BioBERT model and ChatGPT-generated prompts, we performed entity type and relationship identification. The BioBERT model achieved an F1 score of 0.90 for relationship prediction, while the ChatGPT prompts reached 0.99. Entity type recognition proved more challenging because high semantic similarity between types reduced accuracy. Our methodology significantly improves relationship identification from social media data, particularly for dietary supplement usage, and offers promising methods for improved post-market surveillance and public health monitoring. This work demonstrates the potential of combining traditional NLP models with large language models for complex text analysis tasks in healthcare.