Utrecht University, Department of Sociology / ICS, Padualaan 14, 3584 CH, Utrecht, The Netherlands.
Soc Sci Res. 2022 Nov;108:102784. doi: 10.1016/j.ssresearch.2022.102784. Epub 2022 Sep 2.
The emergence of big data and computational tools has introduced new possibilities for using large-scale textual sources in sociological research. Recent work in the sociology of culture, the sociology of science, and economic sociology has shown how computational text analysis can be used in theory building and testing. This review starts with an introduction to the history of computer-assisted text analysis in sociology and then discusses five families of computational methods used in contemporary research. Using exemplary studies, it shows how dictionary methods, semantic and network analysis tools, language models, and unsupervised and supervised machine learning can assist sociologists with different analytical tasks. After presenting recent methodological developments, this review summarizes several important implications of using large datasets and computational methods to infer complex meaning in texts. Finally, it calls on researchers from different methodological traditions to adopt text mining tools while remaining mindful of lessons learned from working with conventional data and methods.