Naive Bayes

From Rice Wiki


Naive Bayes is an approach to Bayesian networks that simplify the computation of joint probability of an outcome based on high dimensional features.

Motivation

Consider binary classification output C is dependent on binary features X1~X3. By Bayes theorem, we can compute C's probability based on the features with Bayes' theorem:

This, in turn, mean that we need to estimate the probability of every combination of features (0 0 0, 0 0 1...). This is computationally expensive.

How it works

By assuming that the features are independent, Naive Bayes simplifies the computation to

We can divide P(C|X) over P(notC|X) to avoid calculating P(X1,X2,X3). We can then apply a log to avoid zero denominators.