Kasieczka Gregor, Nachman Benjamin, Shih David, Amram Oz, Andreassen Anders, Benkendorfer Kees, Bortolato Blaz, Brooijmans Gustaaf, Canelli Florencia, Collins Jack H, Dai Biwei, De Freitas Felipe F, Dillon Barry M, Dinu Ioan-Mihail, Dong Zhongtian, Donini Julien, Duarte Javier, Faroughy D A, Gonski Julia, Harris Philip, Kahn Alan, Kamenik Jernej F, Khosa Charanjit K, Komiske Patrick, Le Pottier Luc, Martín-Ramiro Pablo, Matevc Andrej, Metodiev Eric, Mikuni Vinicius, Murphy Christopher W, Ochoa Inês, Park Sang Eon, Pierini Maurizio, Rankin Dylan, Sanz Veronica, Sarda Nilai, Seljak Uroš, Smolkovic Aleks, Stein George, Suarez Cristina Mantilla, Szewc Manuel, Thaler Jesse, Tsan Steven, Udrescu Silviu-Marian, Vaslin Louis, Vlimant Jean-Roch, Williams Daniel, Yunus Mikaeel
Institut für Experimentalphysik, Universität Hamburg, Germany.
Physics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, United States of America.
Rep Prog Phys. 2021 Dec 7;84(12). doi: 10.1088/1361-6633/ac36b9.
A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, one that aims to leverage recent breakthroughs in anomaly detection and machine learning. Developing and benchmarking new anomaly detection methods within this framework requires standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants developed their methods using an R&D dataset and then tested them on black boxes: datasets that may or may not contain an unknown anomaly. The methods made use of modern machine learning tools and were based on unsupervised learning (autoencoders, generative adversarial networks, normalizing flows), weakly supervised learning, and semi-supervised learning. This paper reviews the LHC Olympics 2020 challenge, including an overview of the competition, a description of the methods deployed in it, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders.