Alvarez E, Spannowsky M, Szewc M
International Center for Advanced Studies (ICAS) and CONICET, UNSAM, San Martin, Argentina.
Institute for Particle Physics Phenomenology, Durham University, Durham, United Kingdom.
Front Artif Intell. 2022 Mar 17;5:852970. doi: 10.3389/frai.2022.852970. eCollection 2022.
The classification of jets induced by quarks or gluons is important for New Physics searches at high-energy colliders. However, available taggers usually rely on modeling the data through Monte Carlo simulations, which could veil intractable theoretical and systematic uncertainties. To significantly reduce biases, we propose an unsupervised learning algorithm that, given a sample of jets, can learn the SoftDrop Poissonian rates for quark- and gluon-initiated jets and their fractions. We extract the Maximum Likelihood Estimates for the mixture parameters and the posterior probability over them. We then construct a quark-gluon tagger and estimate its accuracy in actual data to be in the 0.65-0.7 range, below that of supervised algorithms but nevertheless competitive. We also show that relevant unsupervised metrics perform well, allowing for an unsupervised hyperparameter selection. Further, we find that this result is not affected by an angular smearing introduced to simulate detector effects for central jets. The presented unsupervised learning algorithm is simple; its result is interpretable and depends on very few assumptions.
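As an illustration of the mixture idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each jet is summarized by a single integer count drawn from one of two Poisson distributions, and it fits the quark fraction and the two rates with a standard expectation-maximization (EM) loop, which is one common way to obtain Maximum Likelihood Estimates for such a mixture; the abstract does not specify the fitting procedure. The function names, the synthetic rates (3 and 8), and the fraction (0.5) are hypothetical choices made for the demonstration, and the sketch does not compute the posterior over the mixture parameters.

```python
import math
import random
from collections import Counter


def poisson_pmf(n, rate):
    """Poisson probability of observing count n at the given rate."""
    return math.exp(n * math.log(rate) - rate - math.lgamma(n + 1))


def sample_poisson(rate, rng):
    """Draw one Poisson count with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def quark_posterior(n, fraction, rate_q, rate_g):
    """Probability that a jet with count n belongs to the quark component."""
    quark = fraction * poisson_pmf(n, rate_q)
    gluon = (1.0 - fraction) * poisson_pmf(n, rate_g)
    return quark / (quark + gluon)


def fit_mixture(counts, n_iter=300):
    """EM fit of a two-component Poisson mixture; returns (fraction, rate_q, rate_g)."""
    hist = Counter(counts)  # counts are small integers, so work on the histogram
    total = len(counts)
    mean = sum(counts) / total
    # Assumed starting point: equal fractions, rates on either side of the mean.
    fraction, rate_q, rate_g = 0.5, 0.5 * mean, 1.5 * mean
    for _ in range(n_iter):
        # E-step: accumulate responsibility-weighted sums over the histogram.
        weight_q = sum_q = sum_g = 0.0
        for n, freq in hist.items():
            resp = quark_posterior(n, fraction, rate_q, rate_g)
            weight_q += freq * resp
            sum_q += freq * resp * n
            sum_g += freq * (1.0 - resp) * n
        # M-step: closed-form updates for the fraction and both rates.
        fraction = weight_q / total
        rate_q = sum_q / weight_q
        rate_g = sum_g / (total - weight_q)
    # Convention: the lower-rate component is labeled "quark".
    if rate_q > rate_g:
        fraction, rate_q, rate_g = 1.0 - fraction, rate_g, rate_q
    return fraction, rate_q, rate_g


# Synthetic demonstration with hypothetical true parameters.
rng = random.Random(0)
TRUE_FRACTION, TRUE_RATE_Q, TRUE_RATE_G = 0.5, 3.0, 8.0
labels = [rng.random() < TRUE_FRACTION for _ in range(20000)]
counts = [sample_poisson(TRUE_RATE_Q if is_q else TRUE_RATE_G, rng) for is_q in labels]

fraction, rate_q, rate_g = fit_mixture(counts)
# Tag a jet as quark-like when its posterior quark probability exceeds 0.5.
tags = [quark_posterior(n, fraction, rate_q, rate_g) > 0.5 for n in counts]
accuracy = sum(t == l for t, l in zip(tags, labels)) / len(labels)
print(f"fraction={fraction:.3f} rate_q={rate_q:.3f} rate_g={rate_g:.3f} accuracy={accuracy:.3f}")
```

The fit never sees the truth labels; they are used only afterwards to score the tagger. Because the toy rates are well separated by construction, the accuracy printed here says nothing about the 0.65-0.7 range quoted for actual data.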