Department of Psychology and Social Behavior, University of California, Irvine.
Department of Data and Analytics, Upworthy.
Psychol Methods. 2016 Dec;21(4):458-474. doi: 10.1037/met0000111.
The massive volume of data that now covers a wide variety of human behaviors offers researchers in psychology an unprecedented opportunity to conduct innovative theory- and data-driven field research. This article is a practical guide to conducting big data research, covering data management, acquisition, processing, and analytics (including key supervised and unsupervised learning data mining methods). It is accompanied by walkthrough tutorials on data acquisition, text analysis with latent Dirichlet allocation topic modeling, and classification with support vector machines. Big data practitioners in academia, industry, and the community have built a comprehensive base of tools and knowledge that makes big data research accessible to researchers in a broad range of fields. However, big data research does require knowledge of software programming and a different analytical mindset. For those willing to acquire the requisite skills, innovative analyses of unexpected or previously untapped data sources can offer fresh ways to develop, test, and extend theories. When conducted with care and respect, big data research can become an essential complement to traditional research.