Naive Bayes

Naive Bayes is an approach to Bayesian networks that simplifies the computation of the joint probability of an outcome given high-dimensional features.

Motivation

Consider a binary classification task where the output C depends on binary features X1, X2, and X3. By Bayes' theorem, we can compute the probability of C given the features:
<math>
P(C|X_1,X_2,X_3)=\frac{P(X_1,X_2,X_3|C)P(C)}{P(X_1,X_2,X_3)}
</math>
This, in turn, means that we need to estimate the probability of every combination of features (0 0 0, 0 0 1, ...). With three binary features that is already 2^3 = 8 combinations, and the count grows exponentially with the number of features, so this is computationally expensive.

How it works

By assuming that the features are conditionally independent given the class, Naive Bayes factorizes the likelihood and simplifies the computation to
<math>
P(C|X_1,X_2,X_3)\propto P(X_1|C)P(X_2|C)P(X_3|C)
</math>
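To make the saving concrete (a standard counting argument, sketched here for n binary features, not taken from the original page): estimating the full class-conditional joint versus the factorized form requires

<math>
P(X_1,\dots,X_n|C)\colon\ 2^n-1 \text{ parameters per class} \qquad\text{vs.}\qquad \prod_{i=1}^{n}P(X_i|C)\colon\ n \text{ parameters per class}
</math>

so the naive assumption replaces an exponential table with a linear number of estimates.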
We can divide P(C|X) by P(¬C|X) so that the shared denominator P(X1,X2,X3) cancels and never needs to be calculated. We can then apply a log, which turns the product into a sum and avoids numerical underflow from multiplying many small probabilities.
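Written out, this log-odds score (a sketch following the steps above, with ¬C denoting the other class) is

<math>
\log\frac{P(C|X_1,X_2,X_3)}{P(\neg C|X_1,X_2,X_3)}=\log\frac{P(C)}{P(\neg C)}+\sum_{i=1}^{3}\log\frac{P(X_i|C)}{P(X_i|\neg C)}
</math>

and C is predicted whenever the score is positive.

The sketch below puts the whole pipeline together in Python, under the assumptions of this page (binary class, binary features). The names fit and log_odds are made up for this example, and Laplace (+1) smoothing is added so that no estimated probability is exactly 0 or 1, keeping every log finite.

<syntaxhighlight lang="python">
import math

def fit(X, y):
    """Estimate P(C=1) and P(X_i=1|C) from 0/1 data, with Laplace smoothing."""
    n_features = len(X[0])
    counts = {0: [0] * n_features, 1: [0] * n_features}
    totals = {0: 0, 1: 0}
    for xs, c in zip(X, y):
        totals[c] += 1
        for i, x in enumerate(xs):
            counts[c][i] += x
    # +1/+2 smoothing keeps every probability strictly between 0 and 1,
    # so every log taken below is finite.
    prior1 = (totals[1] + 1) / (len(y) + 2)
    theta = {c: [(counts[c][i] + 1) / (totals[c] + 2) for i in range(n_features)]
             for c in (0, 1)}
    return prior1, theta

def log_odds(xs, prior1, theta):
    """log P(C=1|x) - log P(C=0|x); the evidence P(x) cancels in the ratio."""
    score = math.log(prior1) - math.log(1 - prior1)
    for i, x in enumerate(xs):
        p1, p0 = theta[1][i], theta[0][i]
        on1, on0 = (p1, p0) if x else (1 - p1, 1 - p0)
        score += math.log(on1) - math.log(on0)
    return score

# Toy data: class 1 tends to have the first feature switched on.
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = [1, 1, 0, 0]
prior1, theta = fit(X, y)
print(log_odds([1, 0], prior1, theta))  # positive, so predict class 1
</syntaxhighlight>

Working in log space is exactly the trick described above: the per-feature terms are added rather than multiplied, so long feature vectors do not drive the score toward zero.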