Faculty of Computer Science, Dalhousie University, Halifax, B3H 1W5, Canada.
Evol Comput. 2011 Spring;19(1):137-66. doi: 10.1162/EVCO_a_00016. Epub 2010 Nov 30.
Intuitively, population-based algorithms such as genetic programming (GP) provide a natural environment for supporting solutions that learn to decompose the overall task between multiple individuals, or a team. This work presents a framework for evolving teams without recourse to prespecifying the number of cooperating individuals. To do so, each individual evolves a mapping to a distribution of outcomes that, following clustering, establishes the parameterization of a (Gaussian) local membership function. This gives individuals the opportunity to represent subsets of tasks, where the overall task is that of classification under the supervised learning domain. Thus, rather than each team member representing an entire class, individuals are free to identify unique subsets of the overall classification task. The framework is supported by techniques from evolutionary multiobjective optimization (EMO) and Pareto competitive coevolution. EMO establishes the basis for encouraging individuals to provide accurate yet nonoverlapping behaviors, whereas competitive coevolution provides the mechanism for scaling to potentially large, unbalanced datasets. Benchmarking is performed against recent examples of nonlinear SVM classifiers over 12 UCI datasets with between 150 and 200,000 training instances. Solutions from the proposed coevolutionary multiobjective GP framework appear to provide a good balance between classification performance and model complexity, especially as the dataset instance count increases.
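The step from an individual's outputs to a Gaussian local membership function can be illustrated with a minimal sketch. The abstract does not say which clustering method is used or how the cluster is selected, so everything below is an assumption made for illustration: a 1-D two-means clustering stands in for the clustering step, the largest cluster is taken as the individual's region of interest, and the names `two_means_1d`, `fit_membership`, and `membership` are hypothetical.

```python
import math
from statistics import mean, pstdev


def two_means_1d(values, iters=20):
    """Assumed stand-in for the clustering step: 1-D k-means with k = 2."""
    centers = [min(values), max(values)]
    groups = [list(values), []]
    for _ in range(iters):
        groups = [[], []]
        for v in values:
            # Assign each output to the nearer of the two centers.
            idx = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[idx].append(v)
        # Recompute centers; an empty group keeps its previous center.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups


def fit_membership(values):
    """Parameterize a Gaussian from the largest cluster (an assumption)."""
    largest = max(two_means_1d(values), key=len)
    mu = mean(largest)
    sigma = pstdev(largest) or 1e-6  # guard against zero spread
    return mu, sigma


def membership(x, mu, sigma):
    """Gaussian local membership: 1.0 at mu, decaying with distance."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))


# Hypothetical outputs of one individual over six training exemplars.
outputs = [0.9, 1.0, 1.1, 1.0, 5.0, 5.2]
mu, sigma = fit_membership(outputs)
print(membership(1.05, mu, sigma), membership(5.0, mu, sigma))
```

Under these assumptions, exemplars whose output falls near the dominant cluster receive a membership close to 1, while distant exemplars receive a membership near 0, so the individual effectively claims only a subset of the data rather than a whole class.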