Naive Bayes
Naive Bayes is an approach to Bayesian networks that simplifies the computation of the joint probability of an outcome given high-dimensional features.
Motivation
Consider a binary classification output C that depends on binary features X1 through X3. By Bayes' theorem, we can compute the probability of C given the features:

<math>
P(C|X_1,X_2,X_3)=\frac{P(X_1,X_2,X_3|C)P(C)}{P(X_1,X_2,X_3)}
</math>
This, in turn, means that we need to estimate the likelihood of every combination of features (0 0 0, 0 0 1, ...), which grows exponentially with the number of features and is computationally expensive.
How it works
By assuming that the features are conditionally independent given C, Naive Bayes simplifies the computation to

<math>
P(C|X_1,X_2,X_3)\propto P(X_1|C)P(X_2|C)P(X_3|C)
</math>
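The following is a minimal Python sketch (not from this wiki) of that computation for the three binary features above. The function names, the toy data, and the use of Laplace smoothing (to keep estimated conditional probabilities away from zero) are assumptions introduced for illustration.

<syntaxhighlight lang="python">
import numpy as np

def fit_naive_bayes(X, y, alpha=1.0):
    """Estimate P(C) and P(X_i = 1 | C) with Laplace smoothing alpha."""
    classes = np.array([0, 1])
    prior = np.array([(y == c).mean() for c in classes])
    # cond[c, i] = P(X_i = 1 | C = c)
    cond = np.array([
        (X[y == c].sum(axis=0) + alpha) / ((y == c).sum() + 2 * alpha)
        for c in classes
    ])
    return prior, cond

def unnormalized_posterior(x, prior, cond):
    """P(C = c) * prod_i P(X_i = x_i | C = c), proportional to P(C = c | x)."""
    likelihood = np.where(x == 1, cond, 1 - cond).prod(axis=1)
    return prior * likelihood

# Toy usage with made-up data: rows are samples, columns are X1, X2, X3.
X = np.array([[1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 0, 0]])
y = np.array([1, 1, 0, 0])
prior, cond = fit_naive_bayes(X, y)
print(unnormalized_posterior(np.array([1, 0, 1]), prior, cond))
</syntaxhighlight>

Only per-feature conditional probabilities P(X_i|C) and the prior P(C) are estimated, so the number of parameters grows linearly rather than exponentially in the number of features.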
To classify, we can divide P(C|X) by P(not C|X); this ratio cancels the shared denominator P(X1,X2,X3), so it never needs to be computed. Taking the logarithm of the ratio then turns the product of likelihoods into a sum, which avoids numerical problems from multiplying many small probabilities.
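A short sketch of that decision rule follows. The hardcoded prior and cond values stand in for the quantities estimated by the fitting sketch above and are purely illustrative.

<syntaxhighlight lang="python">
import numpy as np

# Assumed toy estimates, standing in for the output of fit_naive_bayes above.
prior = np.array([0.5, 0.5])                 # P(C = 0), P(C = 1)
cond = np.array([[0.2, 0.3, 0.6],            # P(X_i = 1 | C = 0)
                 [0.8, 0.7, 0.9]])           # P(X_i = 1 | C = 1)

def log_odds(x):
    """log P(C=1|x) - log P(C=0|x); the shared P(X1,X2,X3) term cancels."""
    like = np.where(x == 1, cond, 1 - cond)  # P(x_i | C = c) for c = 0, 1
    log_post = np.log(prior) + np.log(like).sum(axis=1)
    return log_post[1] - log_post[0]

# Predict C = 1 when the log-odds is positive.
print(log_odds(np.array([1, 0, 1])) > 0)
</syntaxhighlight>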