School of Intelligent Manufacturing, Wenzhou Polytechnic, Wenzhou, 325035, China.
Wenzhou Vocational College of Science and Technology, Wenzhou, 325006, China.
Comput Biol Med. 2023 Sep;163:107197. doi: 10.1016/j.compbiomed.2023.107197. Epub 2023 Jun 21.
Modern medicine and biology produce large genetic data sets of high dimensionality, and clinical practice and its associated processes depend heavily on data-driven decision-making. The high dimensionality of these data, however, increases both the complexity and the scale of processing, and it can be challenging to identify representative genes while reducing the dimensionality. Successful gene selection lowers computing costs and improves classification accuracy by eliminating superfluous or redundant features. To address this problem, this study proposes a wrapper gene selection approach based on hunger games search (HGS), combined with a dispersed foraging strategy and a differential evolution (DE) strategy, to form a new algorithm named DDHGS. Applying DDHGS to global optimization, and its binary variant bDDHGS to feature selection, is intended to improve the balance between exploration and exploitation during the search. We assess the efficacy of DDHGS by comparing it with DE, HGS combined with a single strategy, seven classic algorithms, and ten advanced algorithms on the IEEE CEC 2017 test suite. To evaluate its performance further, we also compare DDHGS with several CEC winners and highly efficient DE-based techniques on 23 popular optimization functions and the IEEE CEC 2014 benchmark test suite. In experiments on fourteen feature selection datasets from the UCI repository, bDDHGS outperformed bHGS and a variety of existing methods. All four measured metrics (classification accuracy, number of selected features, fitness score, and execution time) improved markedly with bDDHGS. Taken together, the results indicate that DDHGS is a strong optimizer and that bDDHGS is an effective feature selection tool in wrapper mode.
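The wrapper-mode evaluation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the transfer function, the classifier, or the fitness weighting, so everything below is an assumption chosen for illustration: an S-shaped (sigmoid) transfer function to binarize a continuous position, a leave-one-out 1-nearest-neighbour classifier, and a fitness that weights the error rate by a hypothetical `alpha = 0.99` against the fraction of features kept. The function names are hypothetical, and this is not the DDHGS search itself, only the kind of objective a binary wrapper optimizer would minimize.

```python
import math
import random


def binarize(position, rng):
    """Assumed S-shaped transfer: map a continuous position to a 0/1 feature mask."""
    return [1 if rng.random() < 1.0 / (1.0 + math.exp(-x)) else 0 for x in position]


def loo_1nn_error(X, y, mask):
    """Leave-one-out error rate of a 1-nearest-neighbour classifier on the kept features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 1.0  # no feature selected: treat as the worst case
    wrong = 0
    for i in range(len(X)):
        best_d, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if d < best_d:
                best_d, best_label = d, y[k]
        wrong += best_label != y[i]
    return wrong / len(X)


def wrapper_fitness(X, y, mask, alpha=0.99):
    """Lower is better: alpha * error rate + (1 - alpha) * fraction of features kept."""
    ratio = sum(mask) / len(mask)
    return alpha * loo_1nn_error(X, y, mask) + (1.0 - alpha) * ratio


# Toy data (made up for illustration): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 0.9], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    for mask in ([1, 0], [0, 1], [1, 1]):
        print(mask, round(wrapper_fitness(X, y, mask), 4))
```

On this toy data the mask that keeps only the informative feature scores lower than the mask that keeps both, which is the behaviour a wrapper objective of this shape is meant to reward: fewer features at equal or better accuracy.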