Naive Bayes Classifier

  • A Naive Bayes classifier combines two components [1]:
    • Probability model
    • Classification model

Probability Model

The probability model applies Bayes’ rule under two assumptions:

  1. Independence of Features: All features are assumed to be independent of each other given the class. For example, a higher temperature is assumed to say nothing about the humidity. This rarely holds exactly in real data, which is why the model is called “naive”.
  2. Equal Weight of Features: All features are assumed to contribute equally to the prediction; none is treated as more important than another.
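
Written out, Bayes’ rule combined with the independence assumption gives the standard Naive Bayes factorization. This is a sketch in the usual textbook notation: y is the class, x_1, ..., x_n are the feature values, and n (the number of features) is a symbol introduced here for illustration.

```latex
\begin{align}
P(y \mid x_1, \dots, x_n)
  &= \frac{P(y)\, P(x_1, \dots, x_n \mid y)}{P(x_1, \dots, x_n)} \\
  &= \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \dots, x_n)}
  \quad \text{(features independent given } y\text{)} \\
  &\propto P(y) \prod_{i=1}^{n} P(x_i \mid y)
\end{align}
```

The denominator does not depend on y, so it can be dropped when comparing classes.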

Classification Model

The classification model involves the following steps:

  1. Probability of Each Class: P(y) is the prior probability of each class, estimated from the class frequencies in the training set.
  2. Probability Estimation of Feature Values: The goal is to estimate the probability distribution of each feature value given a specific class, denoted as P(x_i|y). For discrete (categorical) features, this can be done by counting how often each value occurs within each class. For continuous features, a Gaussian distribution can be used. For count data, such as word counts, a multinomial distribution is suitable.
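  3. Parameter Estimation: The parameters of each distribution are estimated separately for every combination of class and feature, for example one mean and one variance per class and feature in the Gaussian case.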
  4. Scikit-learn and Distribution Types: Scikit-learn provides Gaussian, Bernoulli, and multinomial Naive Bayes classifiers. The names refer to the distribution assumed for the features. Different features can follow different distributions, so the choice of distribution may need to be customized for the application.
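
The steps above can be sketched in plain Python. This is a minimal illustration, not scikit-learn’s implementation: the function names (`fit`, `predict`), the `kinds` labels, the Laplace smoothing parameter `alpha`, and the toy weather data are all assumptions made for this example. It mixes a categorical feature with a Gaussian one to show how different features can use different distributions.

```python
import math
from collections import Counter, defaultdict


def fit(rows, labels, kinds, alpha=1.0):
    """Estimate P(y) and the parameters of P(x_i|y) for each class and feature.

    kinds[i] is "categorical" or "gaussian" (labels made up for this sketch).
    alpha is a Laplace smoothing constant (an assumption, not from the post).
    """
    n = len(labels)
    # Step 1: class priors P(y) from class frequencies.
    priors = {y: c / n for y, c in Counter(labels).items()}
    params = {}
    for i, kind in enumerate(kinds):
        # Group the values of feature i by class.
        by_class = defaultdict(list)
        for row, y in zip(rows, labels):
            by_class[y].append(row[i])
        # Steps 2 and 3: one set of parameters per (class, feature) pair.
        if kind == "categorical":
            vocab = sorted({row[i] for row in rows})
            for y, values in by_class.items():
                counts = Counter(values)
                denom = len(values) + alpha * len(vocab)
                params[(y, i)] = {v: (counts[v] + alpha) / denom for v in vocab}
        else:
            for y, values in by_class.items():
                mean = sum(values) / len(values)
                # Population variance, with a tiny floor to avoid division by zero.
                var = sum((v - mean) ** 2 for v in values) / len(values) + 1e-9
                params[(y, i)] = (mean, var)
    return priors, params


def log_likelihood(kind, param, value):
    """log P(x_i|y); categorical values must have been seen during training."""
    if kind == "categorical":
        return math.log(param[value])
    mean, var = param
    return -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)


def predict(model, kinds, row):
    """Return the class maximizing log P(y) + sum_i log P(x_i|y)."""
    priors, params = model
    scores = {
        y: math.log(prior)
        + sum(log_likelihood(k, params[(y, i)], row[i]) for i, k in enumerate(kinds))
        for y, prior in priors.items()
    }
    return max(scores, key=scores.get)


# Toy data (invented): (outlook, temperature in degrees Celsius) -> play outside?
kinds = ["categorical", "gaussian"]
rows = [("sunny", 30.0), ("sunny", 28.0), ("rainy", 15.0),
        ("rainy", 17.0), ("sunny", 16.0), ("rainy", 29.0)]
labels = ["yes", "yes", "no", "no", "no", "yes"]

model = fit(rows, labels, kinds)
print(predict(model, kinds, ("sunny", 27.0)))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied. For real projects, a library implementation is usually the better choice.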

Advantages

  • Fast and Easy Implementation: Naive Bayes classifiers are simple to implement and fast to train and to apply.
  • Acceptable Classification Performance: The predicted probabilities are often inaccurate, but the predicted class is usually good enough in practice.

Disadvantage

  • Independence Assumption: Features are often correlated in real data, so the independence assumption is violated, which can reduce the model’s accuracy.
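
A small sketch of what a violated independence assumption does to the probabilities: if the same feature is (wrongly) counted several times as if each copy were independent evidence, the posterior drifts toward 0 or 1. The function name and all numbers below are hypothetical and chosen only for illustration.

```python
def posterior_with_copies(prior_pos, p_x_given_pos, p_x_given_neg, copies):
    """Naive Bayes posterior P(pos | x) for a two-class problem when one
    feature is duplicated `copies` times and treated as independent."""
    pos = prior_pos * p_x_given_pos ** copies
    neg = (1 - prior_pos) * p_x_given_neg ** copies
    return pos / (pos + neg)


# Hypothetical values: P(pos) = 0.5, P(x|pos) = 0.8, P(x|neg) = 0.4.
for k in (1, 2, 3):
    print(k, round(posterior_with_copies(0.5, 0.8, 0.4, k), 3))
```

With these numbers the posterior rises from about 0.67 with one copy to about 0.80 and 0.89 with two and three copies, even though no new information was added. The predicted class stays the same, which is one way to see why the classification can remain acceptable while the probabilities become overconfident.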

References

[0] https://towardsdatascience.com/naive-bayes-classifier-81d512f50a7c

[1] https://en.wikipedia.org/wiki/Naive_Bayes_classifier
