An Ensemble Model for Multiclass Classification and Outlier Detection Method in Data Mining

Dalton Ndirangu; Waweru Mwangi; Lawrence Nderu

Abstract

Real-world datasets often exhibit a multiclass classification structure characterized by imbalanced classes, in which the minority classes are treated as outlier classes. The study followed the cross-industry standard process for data mining (CRISP-DM) methodology. A heterogeneous multiclass ensemble was developed by combining several strategies and ensemble techniques. The datasets were drawn from the UCI Machine Learning Repository. Experiments validating the model were conducted, and the results are presented in tables and figures. An ensemble filter selection method was developed and used to preprocess the datasets. Point outliers were filtered using the interquartile range (IQR) filter algorithm. Datasets were resampled using the Synthetic Minority Oversampling Technique (SMOTE). Multiclass datasets were transformed into binary class problems using the One-vs-One decomposition technique. An ensemble model was developed using the AdaBoost and random subspace algorithms, with random forest as the base classifier. The resulting classifiers were combined by voting. The model was validated with classification and outlier detection performance measures: recall, precision, F-measure and AUC-ROC. The classifiers were evaluated using 10-fold stratified cross-validation. The model showed better performance in both outlier detection and classification prediction for the multiclass problem, outperforming well-known classification and outlier detection algorithms such as Naïve Bayes, KNN, Bagging, JRipper, decision trees, RandomTree and random forest. The findings indicate that combining ensemble techniques, dataset resampling and multiclass decomposition improves the detection of minority (rare) outlier classes.
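Three of the steps named in the abstract (interquartile range filtering, One-vs-One decomposition and voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the fence multiplier of 1.5 is the conventional default rather than a value stated in the abstract, and the sketch handles a single numeric attribute using only the Python standard library.

```python
from collections import Counter
from itertools import combinations
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (low, high) fences Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is an assumed conventional default, not taken from the paper.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def filter_point_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


def one_vs_one_pairs(labels):
    """List every unordered pair of classes: one binary problem per pair."""
    return list(combinations(sorted(set(labels)), 2))


def majority_vote(predictions):
    """Return the label predicted most often by the combined classifiers."""
    return Counter(predictions).most_common(1)[0][0]
```

With c classes, One-vs-One decomposition yields c(c-1)/2 binary problems, so a three-class dataset produces three pairwise classifiers whose predictions are then combined by the vote.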

Keywords: Multiclass, Outlier, Ensemble, Model, Classification

DOI: 10.7176/JIEA/9-2-04

Publication date: April 30, 2019

ISSN (Paper): 2224-5782; ISSN (Online): 2225-0506

This journal follows the ISO 9001 management standard and is licensed under a Creative Commons Attribution 3.0 License.

Journal of Information Engineering and Applications
