Social Media Mining Toolkit (SMMT).

Author information

Tekumalla Ramya, Banda Juan M

Affiliation

Georgia State University, Atlanta, GA 30303, USA.

Publication information

Genomics Inform. 2020 Jun;18(2):e16. doi: 10.5808/GI.2020.18.2.e16. Epub 2020 Jun 15.

Abstract

The use of social media data for research has increased dramatically within the biomedical community. In PubMed alone, nearly 2,500 publication entries since 2014 deal with analyzing social media data from Twitter and Reddit. However, the vast majority of those works do not share the code or data needed to replicate their studies. With minimal exceptions, the few that do place the burden on the researcher to figure out how to fetch the data, how best to format it, and how to create automatic and manual annotations on the acquired data. To address this pressing issue, we introduce the Social Media Mining Toolkit (SMMT), a suite of tools that aims to encapsulate the cumbersome details of acquiring, preprocessing, annotating, and standardizing social media data. The purpose of our toolkit is to let researchers focus on answering research questions rather than on the technical aspects of using social media data. By using a standard toolkit, researchers will be able to acquire, use, and release data in a consistent way that is transparent to everybody using the toolkit, thereby simplifying research reproducibility and accessibility in the social media domain.
