Relia Kunal, Akbari Mohammad, Duncan Dustin, Chunara Rumi
New York University, USA.
New York University School of Medicine, USA.
Proc ACM Hum Comput Interact. 2018 Nov;2. doi: 10.1145/3274414.
Social media offers a unique window into attitudes like racism and homophobia, exposure to which is an important, hard-to-measure, and understudied social determinant of health. However, individual geo-located observations from social media are noisy and geographically inconsistent. Existing areas by which exposures are measured, like ZIP codes, average over administratively defined boundaries that are irrelevant to the exposure. Hence, to enable studies of online social environmental measures, such as attitudes on social media, and their possible relationship to health outcomes, a method is first needed to define the collective, underlying degree of social media attitudes by region. To address this, we create the Socio-spatial Self-organizing Map ("SS-SOM") pipeline to best identify regions by their latent social attitude from Twitter posts. SS-SOMs use neural embeddings for text classification and augment traditional SOMs to generate a controlled number of nonoverlapping, topologically constrained, and topically similar clusters. We find that SS-SOMs are robust to missing data and that the exposure of a cohort of men who are susceptible to multiple racism- and homophobia-linked health outcomes changes by up to 42% when using SS-SOM measures as compared to ZIP code-based measures.
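For readers unfamiliar with the traditional SOMs that the abstract says SS-SOMs augment, the following is a minimal sketch of a plain self-organizing map in Python with NumPy. It is not the authors' SS-SOM pipeline: the spatial and topical constraints they add are not reproduced here. The function names, the grid size, the learning-rate schedule, and the synthetic three-column input (two scaled coordinates plus a hypothetical per-post attitude score) are all assumptions made for illustration.

```python
import numpy as np


def train_som(data, grid_shape=(3, 3), n_iter=2000, lr0=0.5, sigma0=1.5, seed=0):
    """Train a plain SOM and return unit weights, shape (rows * cols, n_features).

    Illustrative only: a traditional SOM, not the SS-SOM described in the abstract.
    """
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    # Position of every unit on the 2-D map grid, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise each unit's weight vector from a randomly chosen data point.
    weights = data[rng.choice(len(data), size=rows * cols, replace=True)].astype(float)

    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighborhood radius
        x = data[rng.integers(len(data))]        # one random training sample
        # Best-matching unit: the unit whose weights are closest to the sample.
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        # Gaussian neighborhood around the BMU, measured on the map grid.
        grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
        h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
        # Pull the BMU and its grid neighbors toward the sample.
        weights += lr * h[:, None] * (x - weights)
    return weights


def assign_regions(data, weights):
    """Assign each observation to exactly one unit (its nearest weight vector)."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return np.argmin(d2, axis=1)


# Synthetic stand-in data (assumption): scaled x, scaled y, attitude score in [0, 1].
rng = np.random.default_rng(42)
points = rng.uniform(0.0, 1.0, size=(200, 3))

weights = train_som(points)
labels = assign_regions(points, weights)
```

Because every observation is assigned to a single nearest unit, the resulting clusters cannot overlap, and the grid size caps how many clusters can appear. These two properties correspond loosely to the "controlled number of nonoverlapping" clusters mentioned in the abstract. The sketch does nothing to enforce geographic contiguity.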