Xia Kaijian, Ni Tongguang, Yin Hongsheng, Chen Bo
IEEE/ACM Trans Comput Biol Bioinform. 2021 Jan-Feb;18(1):53-61. doi: 10.1109/TCBB.2020.2973978. Epub 2021 Feb 3.
Conventional classification models for epileptic EEG signal recognition need sufficient labeled samples as a training dataset. In addition, when the training and testing EEG samples are drawn from different distributions, for example because of differences in patient groups or acquisition devices, such methods generally perform poorly. In this paper, a cross-domain classification model with knowledge utilization maximization, called CDC-KUM, is presented. It exploits the global data structure provided by labeled samples in the related (source) domain and unlabeled samples in the current (target) domain. After the data are mapped into kernel space, a pairwise constraint regularization term is combined with the predictive differences of the labeled data in the source domain. Meanwhile, a soft clustering regularization term using quadratic weights and Gini-Simpson diversity is applied to exploit the distribution information of the unlabeled data in the target domain. Experimental results show that the CDC-KUM model outperforms several traditional non-transfer and transfer classification methods for the recognition of epileptic EEG signals.
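The abstract does not give the exact form of the soft clustering regularizer, so the following is only a minimal sketch of the standard Gini-Simpson diversity index, 1 - sum_k p_k^2, evaluated on soft cluster memberships. The function names and the membership matrix `U` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def gini_simpson(probabilities: Sequence[float]) -> float:
    """Gini-Simpson diversity index: 1 - sum(p_k^2) for a probability vector."""
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Normalize so the entries form a valid probability distribution.
    normalized = [p / total for p in probabilities]
    return 1.0 - sum(p * p for p in normalized)


def mean_membership_diversity(memberships: List[Sequence[float]]) -> float:
    """Average Gini-Simpson index over the rows of a soft membership matrix."""
    return sum(gini_simpson(row) for row in memberships) / len(memberships)


# Hypothetical soft memberships of three unlabeled samples over two classes.
U = [
    [1.0, 0.0],  # confident assignment, lowest diversity
    [0.5, 0.5],  # maximally uncertain assignment for two classes
    [0.9, 0.1],
]
scores = [gini_simpson(row) for row in U]
average = mean_membership_diversity(U)
```

The index is zero for a crisp assignment and largest for a uniform one, which is why a term of this kind can be used to control how fuzzy the memberships of unlabeled target samples are allowed to be.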