David Bamman, Rachael Samberg, Richard Jean So, Naitian Zhou
School of Information, University of California, Berkeley, CA 94704.
Library, University of California, Berkeley, CA 94704.
Proc Natl Acad Sci U S A. 2024 Nov 12;121(46):e2409770121. doi: 10.1073/pnas.2409770121. Epub 2024 Nov 4.
Movies are a massively popular and influential form of media, but their computational study at scale has largely been off-limits to researchers in the United States due to the Digital Millennium Copyright Act. In this work, we illustrate use of a new regulatory framework to enable computational research on film that permits circumvention of technological protection measures on digital video discs (DVDs). We use this exemption to legally digitize a collection of 2,307 films representing the top 50 movies by U.S. box office over the period 1980 to 2022, along with award nominees. We design a computational pipeline for measuring the representation of gender and race/ethnicity in film, drawing on computer vision models for recognizing actors and human perceptions of gender and race/ethnicity. Doing so allows us to learn substantive facts about representation and diversity in Hollywood over this period, confirming earlier studies that see an increase in diversity over the past decade, while allowing us to use computational methods to uncover a range of ad hoc analytical findings. Our work illustrates the affordances of the data-driven analysis of film at a large scale.