Hybrid GA-SVM for Efficient Feature Selection in E-mail Classification
Abstract
Feature selection is a global combinatorial optimization problem in machine learning in which subsets of relevant features are selected to build robust learning models. Including irrelevant and redundant features in the dataset can result in poor predictions and high computational overhead. Selecting relevant feature subsets can therefore reduce the computational cost of feature measurement, speed up the learning process and improve model interpretability. The SVM classifier has been shown to produce inaccurate classification results on large e-mail datasets while also consuming considerable computational resources. In this study, a Genetic Algorithm-Support Vector Machine (GA-SVM) feature selection technique is developed to optimize the SVM classification parameters, the prediction accuracy and the computation time. The SpamAssassin dataset was used to validate the performance of the proposed system. The hybrid GA-SVM showed marked improvements over the plain SVM in terms of classification accuracy and computation time.
Keywords: E-mail Classification, Feature Selection, Genetic Algorithm, Support Vector Machine
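The wrapper idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `ga_select`, its parameters (population size, crossover and mutation rates, seed) and the stand-in `toy_fitness` are all assumptions introduced here for illustration. In a GA-SVM setting, the fitness of a feature mask would plausibly be the validation accuracy of an SVM trained on the selected features; the toy fitness below merely rewards a hypothetical set of informative features and penalizes subset size so that the example runs with the standard library alone.

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30,
              crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a binary feature mask that maximizes fitness(mask).

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Each chromosome is a list of 0/1 flags, one flag per feature.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    for _ in range(generations):
        scores = [fitness(ind) for ind in population]

        def tournament():
            # Return the fitter of two randomly drawn individuals.
            i, j = rng.randrange(pop_size), rng.randrange(pop_size)
            return population[i] if scores[i] >= scores[j] else population[j]

        children = [best[:]]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            a, b = tournament(), tournament()
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # single-point crossover
                child = a[:cut] + b[cut:]
            else:
                child = a[:]
            # Bit-flip mutation: toggle each gene with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)

        population = children
        best = max(population + [best], key=fitness)
    return best


# Hypothetical stand-in for SVM validation accuracy: features 0, 3 and 7
# are treated as informative, and every selected feature costs 0.1.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


mask = ga_select(12, toy_fitness)
selected = [i for i, bit in enumerate(mask) if bit]
print("selected feature indices:", selected)
```

To approximate the study's setup, `toy_fitness` would be replaced by a function that trains an SVM on the columns flagged in the mask and returns its held-out accuracy; the SVM hyperparameters could be appended to the chromosome if they are to be tuned jointly, which is one possible reading of the abstract rather than a confirmed detail.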
ISSN (Paper) 2222-1727, ISSN (Online) 2222-2863
This journal follows the ISO 9001 management standard and is licensed under a Creative Commons Attribution 3.0 License.
Copyright © www.iiste.org