Generalization Performance of Quantum Metric Learning Classifiers.

Affiliations

GSK R&D Stevenage, GlaxoSmithKline, Stevenage SG1 2NY, UK.

Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA 22908, USA.

Publication information

Biomolecules. 2022 Oct 27;12(11):1576. doi: 10.3390/biom12111576.

DOI:10.3390/biom12111576

PMID:36358927

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9687469/

Abstract

Quantum computing holds great promise for many fields, including biology and medicine. A major application in which quantum computers could yield an advantage is machine learning, especially kernel-based approaches. A recent method termed quantum metric learning, which learns a quantum embedding that maximally separates data into classes, was able to perfectly separate ant and bee image training data. The separation is achieved with an intrinsically quantum objective function, and the overall approach was shown to work naturally as a hybrid classical-quantum computation, enabling high-dimensional feature data to be embedded into a small number of qubits. However, the ability of the trained classifier to predict test sample data was never assessed. We assessed the performance of quantum metric learning on ant and bee test image data as well as breast cancer clinical data. We applied the original approach as well as variants in which we performed principal component analysis (PCA) on the feature data to reduce its dimensionality for quantum embedding, thereby limiting the number of model parameters. Provided that the degree of dimensionality reduction was limited and the number of model parameters was constrained to be far less than the number of training samples, quantum metric learning accurately classified test data.
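
The PCA step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic, the feature dimension (512), the number of retained components (8), and the helper names `pca_reduce` and `linear_layer_params` are all assumptions chosen for illustration, and a single dense layer is used only as a hypothetical stand-in for the classical part of a hybrid model. The sketch fits PCA on the training features alone, applies the same projection to the test features, and checks that the parameter count stays well below the number of training samples.

```python
import numpy as np


def pca_reduce(X_train, X_test, n_components):
    """Project both sets onto the top principal components of the training set."""
    mean = X_train.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    components = Vt[:n_components]
    # The test set reuses the training mean and components (no refitting on test data).
    return (X_train - mean) @ components.T, (X_test - mean) @ components.T


def linear_layer_params(n_inputs, n_outputs):
    """Weights plus biases of one dense layer (hypothetical classical stage)."""
    return n_inputs * n_outputs + n_outputs


rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 512))  # synthetic stand-in for high-dimensional features
X_test = rng.normal(size=(50, 512))

Z_train, Z_test = pca_reduce(X_train, X_test, n_components=8)
n_params = linear_layer_params(n_inputs=8, n_outputs=2)

print(Z_train.shape, Z_test.shape)
print("parameters:", n_params, "| fewer than training samples:", n_params < len(X_train))
```

Fitting the projection on the training set only keeps the test set untouched until evaluation, which is what makes a test-set accuracy a fair estimate of generalization.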
