Rashid Mohammad Harun Or, Jubaer Md Tanbeer, Chowdhury Barisha, Islam Md Minhazul
Department of Humanities, Rajshahi University of Engineering and Technology, Rajshahi, Bangladesh.
Department of Computer Science and Engineering, Rajshahi University of Engineering and Technology, Rajshahi, Bangladesh.
Data Brief. 2024 Dec 10;58:111219. doi: 10.1016/j.dib.2024.111219. eCollection 2025 Feb.
The dataset contains user engagement and language-related information from two YouTube channels that produce audio stories. It offers a comparative view of live and mediated engagement, including information on how users interact with audio-story-based YouTube content. The distinguishing feature of this dataset is its inclusion of the text of live comments on YouTube videos. It covers the period from July 2022 to February 2024 and comprises 230 audio stories from the two channels. More than 250,000 comments and nearly 300,000 live chat messages from the videos are included. The dataset provides quantitative information about the content, such as the number of views, comments, and likes. Along with the textual data and the numerical engagement data, it contains a language categorization of the users' comments. The dataset is expected to support further research across disciplines, uncovering patterns of digital engagement, language use on different platforms, and the dynamics of live versus post-live interactions. Content creators and marketers can also draw on it to optimize their audience engagement strategies. The dataset serves as a resource for cross-disciplinary studies in digital media, linguistics, and social media analysis.