School of Electrical Engineering and Computer Science, Oregon State University, Corvallis.
School of Social and Behavioral Health Sciences, Oregon State University, Corvallis.
J Gerontol B Psychol Sci Soc Sci. 2017 Sep 1;72(5):742-751. doi: 10.1093/geronb/gbx014.
Social scientists need practical methods for harnessing large, publicly available datasets that inform the social context of aging. We describe our development of a semi-automated text coding method and demonstrate its use with a content analysis of how Alzheimer's disease (AD) and dementia are portrayed on Twitter. The approach improves the feasibility of examining large, publicly available datasets.
We used machine learning techniques to model the stigmatization expressed in 31,150 AD-related tweets, collected via Twitter's search API using 9 AD-related keywords. Two researchers manually coded 311 randomly selected tweets on 6 dimensions. These manual codes, covering 1% of the dataset, were used to train a classifier on the tweet text, which then coded the remaining 99% of the dataset.
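The semi-automated workflow described above (hand-code a small sample, train a classifier on it, then let the classifier code the rest) can be illustrated with a minimal sketch. The abstract does not name the classifier or the features used, so everything here is an assumption made for illustration: a bag-of-words multinomial naive Bayes model with add-one smoothing, a single binary code (1 = stigmatizing use, 0 = not) instead of the 6 dimensions the researchers coded, and invented toy tweets in place of the real data. The names `tokenize`, `NaiveBayes`, `hand_coded`, and `uncoded` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy examples standing in for the manually coded 1% sample.
hand_coded = [
    ("my grandmother was diagnosed with alzheimer's last year", 0),
    ("support dementia research and caregivers", 0),
    ("lol I forgot my keys again, total alzheimer's moment", 1),
    ("that politician must have dementia, what a joke", 1),
]

# Invented toy examples standing in for the remaining, uncoded 99%.
uncoded = [
    "forgot my password again lol dementia",
    "donate to alzheimer's research today",
]

texts, labels = zip(*hand_coded)
model = NaiveBayes().fit(texts, labels)

# Code the remaining tweets automatically and summarize the result.
predictions = [model.predict(t) for t in uncoded]
share_stigmatizing = sum(predictions) / len(predictions)
print(predictions, share_stigmatizing)
```

In a real study the hand-coded sample would be far larger than four tweets, and the classifier's agreement with held-out manual codes would need to be checked before trusting the automated codes for the rest of the dataset.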
Our automated process identified that 21.13% of the AD-related tweets used AD-related keywords to perpetuate public stigma, which could impact stereotypes and negative expectations for individuals with the disease and increase "excess disability".
This technique could be applied to questions in social gerontology related to how social media outlets reflect and shape attitudes bearing on other developmental outcomes. Recommendations for the collection and analysis of large Twitter datasets are discussed.