Prohibited Comments Classification


This work is inspired by an exercise from the NLP Course. The goal is to build a model that identifies offensive social media comments so they can be banned. The dataset is rather small, with about 3500 comments. In this study, we will:

  • Build a Naive Bayes classifier from scratch using Bag of Words features
  • Compare the Naive Bayes classifier with a logistic regression model
  • Compare logistic regression trained on Bag of Words features with logistic regression trained on TF-IDF features
  • Use word vector features
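
To give an idea of the first step, here is a minimal sketch of a multinomial Naive Bayes classifier over Bag of Words counts, written in plain Python. It is not the notebook's code: the whitespace tokenizer, the Laplace smoothing parameter `alpha`, the class name `NaiveBayes`, the label convention (1 = prohibited, 0 = allowed) and the four toy comments are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; real comments need more cleaning.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes on Bag of Words counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength (hypothetical default)

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        # Bag of Words: one word-count table per class.
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        class_counts = Counter(labels)
        self.log_prior = {
            c: math.log(class_counts[c] / len(labels)) for c in self.classes
        }
        self.total = {c: sum(self.word_counts[c].values()) for c in self.classes}
        return self

    def log_likelihood(self, word, c):
        # Smoothed estimate of log P(word | class).
        num = self.word_counts[c][word] + self.alpha
        den = self.total[c] + self.alpha * len(self.vocab)
        return math.log(num / den)

    def predict(self, text):
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += self.log_likelihood(word, c)
            scores[c] = score
        return max(scores, key=scores.get)


# Toy data, invented for this sketch (1 = prohibited, 0 = allowed).
train_texts = [
    "you are an idiot",
    "what an idiot move",
    "thanks for the great post",
    "great point thanks",
]
train_labels = [1, 1, 0, 0]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("such an idiot"))
print(model.predict("thanks great post"))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, and the smoothing term keeps a word that appears in only one class from driving the other class's probability to zero.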

The details are available in the linked Jupyter Notebook.
