Department of Computer Science, National University of Computer and Emerging Sciences (NUCES), Lahore 54000, Pakistan.
Genes (Basel). 2020 Jul 18;11(7):819. doi: 10.3390/genes11070819.
A number of feature selection and classification techniques have been proposed in the literature, including parameter-free and parameter-based algorithms. The former are quick but may converge to local maxima, while the latter use dataset-specific parameter tuning to achieve higher accuracy. However, higher accuracy does not necessarily mean higher model reliability. Thus, generalized optimization remains an open challenge for further research. This paper presents a warzone-inspired optimization algorithm based on "infiltration tactics" (ITO), not to be confused with the ITO algorithm based on the Itô process in stochastic calculus. The proposed ITO algorithm combines parameter-free and parameter-based classifiers to produce a high-accuracy-high-reliability (HAHR) binary classifier. The algorithm produces results in two phases: (i) a Lightweight Infantry Group (LIG) converges quickly to find non-local maxima and produces comparable results (i.e., 70 to 88% accuracy); (ii) a Followup Team (FT) uses advanced tuning to enhance the baseline performance (i.e., 75 to 99%). Every soldier of the ITO army is a base model with its own independently chosen subset selection method, pre-processing and validation methods, and classifier. The successful soldiers are combined through heterogeneous ensembles for optimal results. The proposed approach addresses the data scarcity problem, is flexible in the choice of heterogeneous base classifiers, and is able to produce HAHR models comparable to the established MAQC-II results.
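The two-phase idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. All names (`make_toy_data`, `nearest_centroid`, `knn`, `ito_sketch`) are hypothetical, the toy data is synthetic, random feature subsets stand in for the paper's subset selection methods, and a nearest-centroid rule and k-nearest-neighbours stand in for the parameter-free and parameter-based classifiers.

```python
import random

def make_toy_data(n, seed):
    """Hypothetical data: features 0 and 1 carry signal, features 2-5 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        signal = [rng.gauss(2.0 * label, 1.0) for _ in range(2)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
        X.append(signal + noise)
        y.append(label)
    return X, y

def sq_dist(a, b, cols):
    """Squared Euclidean distance restricted to the feature indices in `cols`."""
    return sum((a[c] - b[c]) ** 2 for c in cols)

def nearest_centroid(X, y, cols):
    """Parameter-free soldier (assumed stand-in): pick the closer class centroid."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = {c: sum(r[c] for r in rows) / len(rows) for c in cols}
    return lambda x: min((0, 1), key=lambda lab: sq_dist(x, centroids[lab], cols))

def knn(X, y, cols, k):
    """Parameter-based soldier (assumed stand-in): vote among k nearest rows."""
    def predict(x):
        nearest = sorted(range(len(X)), key=lambda i: sq_dist(x, X[i], cols))[:k]
        return int(2 * sum(y[i] for i in nearest) > len(nearest))
    return predict

def accuracy(model, X, y):
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

def ito_sketch(train, valid, n_soldiers=20, subset_size=2, keep=5,
               ks=(1, 3, 5), seed=0):
    rng = random.Random(seed)
    (Xtr, ytr), (Xva, yva) = train, valid
    n_features = len(Xtr[0])

    # Phase 1 (LIG): many cheap soldiers, each on its own random feature subset.
    soldiers = []
    for _ in range(n_soldiers):
        cols = sorted(rng.sample(range(n_features), subset_size))
        model = nearest_centroid(Xtr, ytr, cols)
        soldiers.append((accuracy(model, Xva, yva), cols, model))
    soldiers.sort(key=lambda s: s[0], reverse=True)
    survivors = soldiers[:keep]  # the "successful soldiers"

    # Phase 2 (FT): try tuned models on each surviving subset and keep a tuned
    # model only if it beats the phase-1 baseline on the validation split.
    team = []
    for base_acc, cols, base_model in survivors:
        best_acc, best_model = base_acc, base_model
        for k in ks:
            candidate = knn(Xtr, ytr, cols, k)
            acc = accuracy(candidate, Xva, yva)
            if acc > best_acc:
                best_acc, best_model = acc, candidate
        team.append(best_model)

    # Heterogeneous ensemble: simple majority vote (ties resolve to class 0).
    def ensemble(x):
        votes = [m(x) for m in team]
        return int(2 * sum(votes) > len(votes))
    return ensemble

train, valid, test = make_toy_data(200, 1), make_toy_data(100, 2), make_toy_data(200, 3)
model = ito_sketch(train, valid)
print("held-out accuracy:", accuracy(model, *test))
```

Because a tuned model replaces a phase-1 soldier only when it improves validation accuracy, the final team can mix both kinds of classifier, which mirrors the heterogeneous ensemble described above. The selection here reuses a single validation split for simplicity, whereas the abstract states that each soldier chooses its own validation method.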