Influence Function and Robust Variant of Kernel Canonical Correlation Analysis

Authors

Alam Md Ashad, Fukumizu Kenji, Wang Yu-Ping

Affiliations

Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118, USA.

The Institute of Statistical Mathematics, Tachikawa, Tokyo 190-8562, Japan.

Publication information

Neurocomputing (Amst). 2018 Aug 23;304:12-29. doi: 10.1016/j.neucom.2018.04.008. Epub 2018 May 3.

DOI:10.1016/j.neucom.2018.04.008

PMID:30416263

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6223640/

Abstract

Many unsupervised kernel methods rely on the estimation of the kernel covariance operator (kernel CO) or the kernel cross-covariance operator (kernel CCO). Both are sensitive to contaminated data, even when bounded positive definite kernels are used. To the best of our knowledge, there are few well-founded robust kernel methods for statistical unsupervised learning. In addition, while the influence function (IF) of an estimator can characterize its robustness, asymptotic properties and standard error, the IF of standard kernel canonical correlation analysis (standard kernel CCA) has not yet been derived. To fill this gap, we first propose a robust kernel covariance operator (robust kernel CO) and a robust kernel cross-covariance operator (robust kernel CCO) based on a generalized loss function instead of the quadratic loss function. Second, we derive the IF for the robust kernel CCO and for standard kernel CCA. Using the IF of standard kernel CCA, we can detect influential observations from two sets of data. Finally, we propose a method based on the robust kernel CO and the robust kernel CCO, called robust kernel CCA, which is less sensitive to noise than standard kernel CCA. The introduced principles can also be applied to many other kernel methods involving the kernel CO or kernel CCO. Our experiments on both synthesized and imaging genetics data demonstrate that the proposed IF of standard kernel CCA can identify outliers. The proposed robust kernel CCA method also performs better than standard kernel CCA on both ideal and contaminated data.
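
To make the idea of replacing the quadratic loss with a generalized loss concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Huber loss, a Gaussian kernel, ordinary (non-robust) centering of the Gram matrices, and a threshold set to the median initial residual; none of these choices is stated in the abstract. The helper names `gaussian_gram`, `center`, and `huber_irls_weights` are hypothetical. The sketch uses the fact that the Hilbert-Schmidt inner product of two rank-one tensors factorizes, so the residual of each observation from a weighted cross-covariance estimate can be computed from the elementwise product of the two centered Gram matrices.

```python
import numpy as np

def gaussian_gram(X, sigma=1.0):
    """Gram matrix of a Gaussian kernel on the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def center(K):
    """Center a Gram matrix in feature space: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def huber_irls_weights(G, c=None, n_iter=200, tol=1e-10):
    """Iteratively reweighted least squares for a Huber-type robust mean
    in the feature space whose Gram matrix is G.

    Returns weights w (non-negative, summing to 1) such that the robust
    estimate is sum_i w_i * feature_i. Uniform weights give the usual
    quadratic-loss estimate.
    """
    n = G.shape[0]
    w = np.full(n, 1.0 / n)
    diag = np.diag(G)
    for _ in range(n_iter):
        Gw = G @ w
        # Squared distance of each feature from the current weighted mean.
        r = np.sqrt(np.maximum(diag - 2.0 * Gw + w @ Gw, 1e-12))
        if c is None:
            c = np.median(r)  # assumed threshold: median initial residual
        # Huber weight function psi(r) / r: 1 inside the threshold, c / r outside.
        phi = np.where(r <= c, 1.0, c / r)
        w_new = phi / phi.sum()
        converged = np.abs(w_new - w).sum() < tol
        w = w_new
        if converged:
            break
    return w

# Synthetic data: 95 related pairs plus 5 gross outliers in both views.
rng = np.random.default_rng(0)
x = rng.normal(size=(95, 1))
y = np.sin(x) + 0.1 * rng.normal(size=(95, 1))
X = np.vstack([x, np.full((5, 1), 8.0)])
Y = np.vstack([y, np.full((5, 1), 8.0)])

# Gram matrix of the tensor features phi_x(x_i) (x) phi_y(y_i).
G = center(gaussian_gram(X)) * center(gaussian_gram(Y))
w = huber_irls_weights(G)
```

Under these assumptions, the weighted sum of the centered tensor features with weights `w` plays the role of a robust cross-covariance estimate, and passing the elementwise square of a single centered Gram matrix gives the analogous weights for a covariance operator. Observations with large residuals receive smaller weights, which is the mechanism by which a non-quadratic loss limits the effect of contamination.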
