Evaluation Measures for Models Assessment over Imbalanced Data Sets

Mohamed Bekkar, Hassiba Kheliouane Djemaa, Taklit Akrouf Alitouche

Abstract


Imbalanced data learning is one of the challenging problems in data mining; among this matter, founding the right model assessment measures is almost a primary research issue. Skewed class distribution causes a misreading of common evaluation measures as well it lead a biased classification. This article presents a set of alternative for imbalanced data learning assessment, using a combined measures (G-means, likelihood ratios, Discriminant power, F-Measure Balanced Accuracy, Youden index, Matthews correlation coefficient), and graphical performance assessment (ROC curve, Area Under Curve, Partial AUC, Weighted AUC, Cumulative Gains Curve and lift chart, Area Under Lift AUL), that aim to provide a more credible evaluation. We analyze the applications of these measures in churn prediction models evaluation, a well known application of imbalanced data

Keywords: imbalanced data, Model assessment, accuracy , G-means, likelihood ratios, F-Measure, Youden index, Matthews correlation coefficient, ROC, AUC, P-AUC,W-AUC, Lift, AUL


Full Text: PDF
Download the IISTE publication guideline!

To list your conference here. Please contact the administrator of this platform.

Paper submission email: JIEA@iiste.org
ISSN (Paper)2224-5782 ISSN (Online)2225-0506
Please add our address "contact@iiste.org" into your email contact list.
This journal follows ISO 9001 management standard and licensed under a Creative Commons Attribution 3.0 License.
Copyright © www.iiste.org