Positive Unlabeled Learning Algorithm for One Class Classification of Social Text Stream with only very few Positive Training Samples

Abhinandan Vishwakarma

Abstract


Text classification from a small labelled set of positive examples and a large pool of unlabeled data is seen as a promising technique, especially for text stream classification, where it is highly likely that only a few positive examples and no negative examples are available. This paper studies how to devise a positive and unlabeled learning technique for the text stream environment. Our proposed approach works in two steps. First, we use the PNLH (Positive example and Negative example Labelling Heuristic) approach to extract both positive and negative examples from the unlabeled data. This extraction enables us to obtain an enriched vector representation for new test messages. Second, we construct a one class classifier using a one class SVM. Given the enriched vector representation as input, the one class SVM predicts the importance level of each text message.
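
The two-step flow described above can be illustrated with a minimal sketch. The abstract does not give the details of PNLH or the SVM configuration, so both are replaced here by simple stand-ins, which are assumptions made for illustration only: step one ranks unlabeled messages by cosine similarity to the positive centroid, and step two scores a new message by the difference between its similarity to the positive and to the negative centroid instead of training a one class SVM. All function names and the sample messages are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a lower-cased, whitespace-split message."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(messages):
    """Summed term counts over a list of messages (unnormalised centroid)."""
    total = Counter()
    for m in messages:
        total.update(vectorize(m))
    return total


def split_unlabeled(positives, unlabeled, k):
    """Step 1 stand-in (NOT the PNLH heuristic itself): rank unlabeled
    messages by similarity to the positive centroid, then take the k most
    similar as likely positives and the k least similar as likely negatives."""
    pos_c = centroid(positives)
    ranked = sorted(unlabeled,
                    key=lambda m: cosine(vectorize(m), pos_c),
                    reverse=True)
    return ranked[:k], ranked[-k:]


def enrich(message, pos_c, neg_c):
    """Enriched representation (assumed form): similarity of the message
    to the positive centroid and to the negative centroid."""
    v = vectorize(message)
    return (cosine(v, pos_c), cosine(v, neg_c))


def importance(message, pos_c, neg_c):
    """Step 2 stand-in (NOT a one class SVM): positive-minus-negative
    similarity; larger values mean the message looks more important."""
    s_pos, s_neg = enrich(message, pos_c, neg_c)
    return s_pos - s_neg


# Hypothetical toy data: a few labelled positives and an unlabeled stream.
positives = ["server outage reported in data center",
             "critical outage affects login server"]
unlabeled = ["login server outage again tonight",
             "great pizza recipe for dinner",
             "data center outage update",
             "pizza dinner with friends tonight"]

likely_pos, likely_neg = split_unlabeled(positives, unlabeled, k=2)
pos_c = centroid(positives + likely_pos)
neg_c = centroid(likely_neg)
```

In a fuller implementation the enriched vectors would be passed to an actual one class SVM; the score used here only shows where that classifier would sit in the pipeline.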

Keywords: Positive and unlabeled learning, one class SVM (Support Vector Machine), one class classification, text stream classification.



ISSN (Paper) 2222-1727, ISSN (Online) 2222-2863

This journal follows the ISO 9001 management standard and is licensed under a Creative Commons Attribution 3.0 License.

Copyright © www.iiste.org