Abstract: |
Classification rules capture knowledge that data mining can extract from a database. We began by considering a hybrid model, combining particle swarm optimization, a genetic algorithm and a hill climber, to evolve such rules. This paper studies hybrid heuristic models in the context of classification rule discovery. Nature-inspired search algorithms such as Genetic Algorithms (GA), Ant Colony Optimization and Particle Swarm Optimization (PSO) have previously been applied to data mining tasks, in particular classification rule discovery. We extend this work by applying hybrid models that combine GA, PSO and/or hill climbers to the same type of classification task. Such hybrid models have already been tested and shown to outperform standalone search algorithms on various combinatorial optimization problems. Our research investigates whether similar performance gains can be obtained in classification rule discovery. We developed a model for a hybrid heuristic-based classifier and implemented several variations of it in Java. These algorithms were benchmarked against the well-known decision tree induction algorithm C4.5 on data sets previously studied in the literature. Results were compared in terms of prediction accuracy, speed and comprehensibility. Our results showed that heuristic-based classifiers compete with C4.5 in prediction accuracy on certain data sets and generally outperform C4.5 in comprehensibility. C4.5 always outperformed the heuristic-based classifiers in speed, owing to the relative inefficiency inherent in heuristic-based classification models. We also showed that hybridizing heuristics can improve execution speed compared with plain standalone heuristic-based classifiers.\n\nKeywords: classification rules, hybrid evolutionary models, life cycle model |