Department of Geography, The University of Hong Kong, Hong Kong, China.
Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.
Proc Natl Acad Sci U S A. 2023 Jul 4;120(27):e2220417120. doi: 10.1073/pnas.2220417120. Epub 2023 Jun 26.
A longstanding line of research in urban studies explores how cities can be understood through their appearance. However, it remains unclear to what extent urban dwellers' everyday lives can be explained by visual clues in the urban environment. In this paper, we address this question by applying a computer vision model to 27 million street view images across 80 counties in the United States. We then use the spatial distribution of notable urban features identified in the street view images, such as street furniture, sidewalks, building façades, and vegetation, to predict the socioeconomic profiles of their immediate neighborhoods. Our results show that these urban features alone can account for up to 83% of the variance in people's travel behavior, 62% in poverty status, 64% in crime, and 68% in health behaviors. These results outperform models based on points of interest (POI), population, and other demographic data alone. Moreover, incorporating urban features captured from street view images can improve the explanatory power of these other methods by 5% to 25%. We propose "urban visual intelligence" as a process for uncovering hidden city profiles and for inferring and synthesizing urban information using computer vision and street view images. This study serves as a foundation for future urban research interested in this process and in understanding the role of the visual aspects of the city.
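To make "variance accounted for" and "improved explanatory power" concrete, the following is a minimal sketch, not the study's model. The abstract does not specify the predictive method, so ordinary least squares is assumed here purely for illustration, and all data are synthetic. The names `visual`, `baseline`, `outcome`, and `r_squared` are hypothetical: `visual` stands in for per-neighborhood shares of street view features, `baseline` for POI or demographic covariates, and `outcome` for a socioeconomic indicator.

```python
import numpy as np


def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return in-sample R^2."""
    X1 = np.column_stack([np.ones(len(X)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)   # least-squares coefficients
    residuals = y - X1 @ coef
    ss_res = float(residuals @ residuals)           # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())     # total variation
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)
n = 500  # hypothetical number of neighborhoods

# Synthetic stand-ins (assumptions, not data from the study):
# columns of `visual` ~ shares of sidewalk, vegetation, facade, street furniture
visual = rng.random((n, 4))
# columns of `baseline` ~ POI density and population density
baseline = rng.random((n, 2))
# a synthetic outcome that depends on both feature groups plus noise
outcome = (
    visual @ np.array([0.5, -0.3, 0.2, 0.4])
    + baseline @ np.array([0.6, -0.2])
    + rng.normal(0.0, 0.1, n)
)

r2_visual = r_squared(visual, outcome)      # visual features alone
r2_baseline = r_squared(baseline, outcome)  # baseline covariates alone
r2_combined = r_squared(np.column_stack([baseline, visual]), outcome)

# Gain in explanatory power from adding visual features to the baseline model
gain = r2_combined - r2_baseline

print(f"R^2 visual only:   {r2_visual:.3f}")
print(f"R^2 baseline only: {r2_baseline:.3f}")
print(f"R^2 combined:      {r2_combined:.3f}")
print(f"Gain over baseline: {gain:.3f}")
```

Because the combined model nests the baseline model, its in-sample R² can never be lower than the baseline's. A real comparison would therefore use held-out data or cross-validation rather than the in-sample fit shown here.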