Fu Li, Yi Yao, Liu Lina, Chen Ran
Department of Management, Taiyuan Normal University, Jinzhong City, China.
Tourism Information Technology Innovation Center, Jinzhong City, Shanxi Province, China.
PeerJ Comput Sci. 2025 Apr 18;11:e2807. doi: 10.7717/peerj-cs.2807. eCollection 2025.
In recent years, tourism has become a significant driver of many countries' economies. To maximize tourism revenue, it is crucial to manage scenic spots and tourist attractions effectively and to raise awareness of them. Social media platforms play a pivotal role in promoting tourism, as users frequently share tourism-related videos and reviews. Analyzing and managing these reviews is essential for understanding tourists' opinions about specific destinations. In this study, we evaluated a scenic spot by analyzing tourists' sentiments. Data were collected from popular social media sites such as TripAdvisor and Twitter using web scraping and the Twitter API. The raw data were preprocessed to remove irrelevant and redundant information and were annotated for further processing. We applied two approaches to analyze tourists' sentiments. First, we vectorized the review text using term frequency-inverse document frequency (TF-IDF) and applied big data analytics approaches to extract meaningful insights. Second, we employed a pre-trained large language model, bidirectional encoder representations from transformers (BERT), with a linear classifier to classify tourists' sentiments. The results of the big data analytics approaches were compared with those of BERT and of previously proposed methods. BERT outperformed the other machine learning models, achieving an average accuracy of 83.5% on the test set. These insights are valuable for evaluating the informatization of tourist spots and for destination management, hospitality, and tourist attractions in general.
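The TF-IDF vectorization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the three reviews are hypothetical toy data, the whitespace tokenizer is a simplification, and the smoothed IDF formula is one common variant chosen here as an assumption, since the abstract does not state which weighting was used.

```python
import math
from collections import Counter

# Hypothetical toy reviews; the study's TripAdvisor/Twitter data is not reproduced here.
reviews = [
    "the view was beautiful and the staff were friendly",
    "long queues and the tickets were expensive",
    "beautiful scenery but long queues at the entrance",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberate simplification)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vocabulary, one TF-IDF vector per document)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    vocab = sorted(df)

    # Assumed smoothed IDF variant: ln((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency (relative count) scaled by inverse document frequency.
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


vocab, vectors = tfidf_vectors(reviews)

# Show the three highest-weighted terms of the first review.
top = sorted(zip(vocab, vectors[0]), key=lambda pair: pair[1], reverse=True)[:3]
print(top)
```

Terms that occur in only one review (such as "view") receive a higher weight than terms of equal count shared across reviews (such as "beautiful"), which is the property that makes these vectors useful as input to a downstream sentiment classifier.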