Department of Electrical Engineering, Jadavpur University, Kolkata, 700032, India.
Departamento de Electrónica, Universidad de Guadalajara, CUCEI, Av. Revolución 1500, Guadalajara, Jal, Mexico.
Comput Biol Med. 2022 May;144:105349. doi: 10.1016/j.compbiomed.2022.105349. Epub 2022 Mar 10.
The modern data-driven era has enabled the collection of large amounts of biomedical and clinical data. DNA microarray gene expression datasets in particular have attracted significant attention from the research community owing to their ability to identify diseases through "bio-markers", that is, specific alterations in the gene sequence that characterize a particular disease (for example, different types of cancer). However, gene expression datasets are very high-dimensional, and only a few of the genes they contain are "bio-markers". Meta-heuristic-based feature selection efficiently filters the relevant genes out of a large set of attributes, reducing data storage and computation requirements. To this end, we propose an Altruistic Whale Optimization Algorithm (AltWOA) for feature selection in high-dimensional microarray data. AltWOA improves on the basic Whale Optimization Algorithm: we embed the concept of altruism in the whale population to help efficiently propagate candidate solutions that can reach the global optimum over the iterations. Evaluation on eight high-dimensional microarray datasets shows that AltWOA outperforms popular and classical techniques from the literature on the same datasets, in terms of both accuracy and the final number of features selected. The code for the proposed approach is publicly available at https://github.com/Rohit-Kundu/AltWOA.
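To make the feature selection setting concrete, the following is a minimal sketch of how a wrapper-style meta-heuristic might score one candidate gene subset. It is not taken from the AltWOA paper or its repository: the function names `binarize` and `fitness`, the sigmoid transfer function, the 0.5 threshold, and the weight `alpha` are all illustrative assumptions, chosen because they are common conventions in meta-heuristic feature selection. The score rewards both goals named in the abstract, high accuracy and few selected features.

```python
import numpy as np


def binarize(position, threshold=0.5):
    """Map a continuous position vector to a boolean gene mask.

    Assumption: a sigmoid transfer function followed by a fixed threshold.
    The paper may use a different binarization scheme.
    """
    position = np.asarray(position, dtype=float)
    return 1.0 / (1.0 + np.exp(-position)) > threshold


def fitness(mask, error_rate, alpha=0.99):
    """Score a candidate gene subset; lower is better.

    Assumption: a weighted sum of the classifier's error rate on the
    selected genes and the fraction of genes kept. `alpha` is a
    hypothetical weight, not a value reported in the abstract.
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        # An empty subset cannot be classified, so give it the worst score.
        return 1.0
    selected_fraction = mask.sum() / mask.size
    return alpha * error_rate + (1.0 - alpha) * selected_fraction


# Example: a candidate that keeps 2 of 3 genes with a 10% error rate.
candidate = binarize([-2.0, 0.1, 3.0])
score = fitness(candidate, error_rate=0.10)
```

In a full optimizer, `error_rate` would come from training a classifier on only the columns where the mask is true, and the search (here, the whale population) would move positions so as to minimize this score.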