Prohibited Comments Classification
This work is inspired by an exercise from the NLP Course. The goal is to build a model that detects offensive social media comments so that they can be banned. The dataset is rather small (about 3,500 comments). In this study, we will:
- Build a Naive Bayes classifier from scratch using bag-of-words features
- Compare the Naive Bayes classifier with a logistic regression model
- Compare logistic regression trained on bag-of-words features with logistic regression trained on TF-IDF features
- Use word-vector features
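
As a rough illustration of the first step, a multinomial Naive Bayes classifier over bag-of-words counts can be sketched in a few lines of plain Python. This is not the notebook's code. The whitespace tokenizer, the add-one (Laplace) smoothing, the function names, the label convention (1 = prohibited, 0 = allowed), and the toy comments are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed tokenizer: lowercase, then split on whitespace.
    return text.lower().split()


def train_naive_bayes(comments, labels, alpha=1.0):
    """Fit multinomial Naive Bayes on bag-of-words counts.

    alpha is the additive smoothing constant (an assumption; 1.0 is Laplace smoothing).
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for text, label in zip(comments, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)

    n_docs = len(labels)
    # log P(class)
    log_prior = {c: math.log(n / n_docs) for c, n in class_counts.items()}
    # log P(word | class), smoothed so unseen (word, class) pairs get non-zero mass
    log_likelihood = {}
    for c in class_counts:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_likelihood, vocab


def predict(text, log_prior, log_likelihood, vocab):
    """Return the class with the highest log posterior score."""
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for w in tokenize(text):
            if w in vocab:  # words never seen in training are ignored
                score += log_likelihood[c][w]
        scores[c] = score
    return max(scores, key=scores.get)


# Toy data invented for illustration: 1 = prohibited, 0 = allowed.
train_comments = [
    "you are an idiot",
    "what a stupid idiot",
    "thanks for the helpful answer",
    "great post thanks",
]
train_labels = [1, 1, 0, 0]
model = train_naive_bayes(train_comments, train_labels)
```

Calling `predict("stupid idiot", *model)` then scores the comment under each class and returns the more likely label. Working in log space avoids numerical underflow when many small probabilities are multiplied.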
The details are available in the linked Jupyter Notebook.