MIRP Lab-Parque i, Instituto Tecnológico Metropolitano (ITM), Medellín 050013, Colombia.
Grupo de Investigación del Instituto de Alta Tecnología Médica (IATM), Ayudas Diagnósticas Sura, Medellín 050026, Colombia.
Sensors (Basel). 2020 Jul 14;20(14):3919. doi: 10.3390/s20143919.
Advances in computing and sensing technologies have generated exponential growth in the data available for developing systems that support decision-making in fields such as health, entertainment, and manufacturing, among others. As a result, the fusion of data from multiple heterogeneous sources has become one of the most promising research fields in machine learning. However, in real-world applications, reducing the number of sources while maintaining optimal system performance is an important task, because of limits on data availability and the costs associated with processing, implementation, and development time. In this work, a novel method for the objective selection of relevant information sources in a multimodality system is proposed. This approach takes advantage of the ability of multiple kernel learning (MKL) and the support vector machine (SVM) classifier to perform an optimal fusion of data by assigning weights to the kernels according to their discriminative value in the classification task; when one kernel is designed to represent each data source, these weights can be used as a measure of the relevance of the sources. Moreover, three algorithms for tuning the Gaussian kernel bandwidth in the classifier prediction stage are introduced to reduce the computational cost of searching for an optimal solution; these algorithms are adaptations of local scaling, a common technique in unsupervised learning. Two real application tasks were used to evaluate the proposed method: the selection of electrodes for a classification task in Brain-Computer Interface (BCI) systems and the selection of relevant Magnetic Resonance Imaging (MRI) sequences for the detection of breast cancer. The results show that the proposed method allows the selection of a small number of information sources.
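To make the two ideas in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes the standard form of local scaling, in which each sample's bandwidth is the distance to its k-th nearest neighbour, and it does not reproduce any of the three tuning algorithms the paper introduces. Because the abstract does not give the MKL-SVM optimisation, the per-source weights here are computed with kernel-target alignment, a simpler stand-in heuristic that is named plainly as a substitute. The function names `local_scaling_kernel` and `alignment_weights`, the value `k=7`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def local_scaling_kernel(X, k=7):
    """Gaussian kernel with one bandwidth per sample (local scaling).

    Assumption: sigma_i is the distance from sample i to its k-th nearest
    neighbour, and K_ij = exp(-d_ij^2 / (sigma_i * sigma_j)).
    Requires more than k samples.
    """
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances, clamped against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, 0.0)
    d = np.sqrt(d2)
    # Column 0 of each sorted row is the sample itself (distance 0),
    # so column k holds the distance to the k-th nearest neighbour.
    sigma = np.sort(d, axis=1)[:, k]
    return np.exp(-d2 / (sigma[:, None] * sigma[None, :] + 1e-12))

def alignment_weights(kernels, y):
    """Stand-in relevance score per source: kernel-target alignment.

    This is NOT the MKL-SVM weighting described in the abstract; it is a
    simpler heuristic used here only to illustrate 'one weight per kernel'.
    Labels y are assumed to be in {-1, +1}.
    """
    Y = np.outer(y, y)  # ideal kernel: +1 for same class, -1 otherwise
    scores = np.array([
        np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))
        for K in kernels
    ])
    scores = np.clip(scores, 0.0, None)
    total = scores.sum()
    if total == 0.0:
        # No source aligns with the labels: fall back to uniform weights.
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / total

# Synthetic example (hypothetical data): one informative source, one noise source.
rng = np.random.default_rng(0)
y = np.repeat([-1.0, 1.0], 20)
informative = rng.normal(size=(40, 2)) + 3.0 * y[:, None]
noise = rng.normal(size=(40, 2))

kernels = [local_scaling_kernel(S) for S in (informative, noise)]
weights = alignment_weights(kernels, y)
print(weights)
```

Sources whose weight falls below a chosen threshold would be candidates for removal. In the method described in the abstract, that ranking comes from the weights learned jointly with the SVM rather than from alignment scores.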