The '''Bayesian network''' is a probabilistic graphical model that describes conditional dependencies among a set of random variables.
= Properties =
== DAG ==
Bayesian networks are directed acyclic [[graph]]s (DAGs).
Each edge identifies a causal relation, usually temporal: an event in the future cannot cause an event in the past. As such, the graph is ''acyclic''.
Each node carries a conditional probability distribution: given that its parents occur, what is the probability that the node's event occurs?
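As a minimal sketch, such a DAG and its conditional probability tables can be stored as plain dictionaries (the rain/sprinkler/wet-grass network below is a toy example; all names and numbers are illustrative):
<syntaxhighlight lang="python">
# A toy Bayesian network stored as a DAG: each node maps to its parents.
parents = {
    "Rain": [],
    "Sprinkler": ["Rain"],
    "GrassWet": ["Rain", "Sprinkler"],
}

# Conditional probability tables: P(node is True | parent values).
# Keys are tuples of parent truth values, in the order listed above.
cpt = {
    "Rain": {(): 0.2},
    "Sprinkler": {(True,): 0.01, (False,): 0.4},
    "GrassWet": {
        (True, True): 0.99, (True, False): 0.80,
        (False, True): 0.90, (False, False): 0.00,
    },
}
</syntaxhighlight>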
= Joint probability =
Bayesian networks are primarily used to calculate the ''joint probability'' of a set of events given their dependencies. This can be done with the following factorization:
<math>
P(x_1, x_2,\ldots,x_n)=\prod_{i=1}^nP(x_i|Parents(x_i))
</math>
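Continuing the toy network sketched above, this factorization translates directly into a short function (an illustrative sketch, not a production implementation):
<syntaxhighlight lang="python">
def joint_probability(assignment, parents, cpt):
    """P(x_1, ..., x_n) = product over i of P(x_i | Parents(x_i))."""
    p = 1.0
    for node, value in assignment.items():
        parent_values = tuple(assignment[q] for q in parents[node])
        p_true = cpt[node][parent_values]  # P(node=True | parent values)
        p *= p_true if value else 1.0 - p_true
    return p

# P(Rain=T, Sprinkler=F, GrassWet=T) = 0.2 * 0.99 * 0.8 = 0.1584
print(joint_probability(
    {"Rain": True, "Sprinkler": False, "GrassWet": True}, parents, cpt))
</syntaxhighlight>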
By assuming the features are conditionally independent given the class, [[Naive Bayes]] reduces the computational complexity.
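Concretely (the standard Naive Bayes factorization, stated here for reference), with a class variable <math>y</math> and features <math>x_1,\ldots,x_n</math>, the joint probability collapses to
<math>
P(y, x_1,\ldots,x_n)=P(y)\prod_{i=1}^n P(x_i|y)
</math>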
= Training =
After the structure of a Bayesian network is defined, its parameters (the conditional probabilities) are estimated from training data. The model structure is then refined and the parameters re-estimated.
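With fully observed data, the maximum-likelihood estimate of each table entry is a ratio of counts. A sketch, assuming complete binary samples (the helper <code>estimate_cpt</code> is a made-up name for illustration):
<syntaxhighlight lang="python">
from collections import Counter

def estimate_cpt(samples, node, node_parents):
    """Maximum-likelihood estimate of P(node=True | parent values)
    from fully observed samples (each sample is a dict of booleans)."""
    seen = Counter()       # how often each parent configuration occurs
    node_true = Counter()  # how often the node is also True
    for s in samples:
        key = tuple(s[p] for p in node_parents)
        seen[key] += 1
        if s[node]:
            node_true[key] += 1
    return {key: node_true[key] / n for key, n in seen.items()}

samples = [
    {"Rain": True, "Sprinkler": False, "GrassWet": True},
    {"Rain": True, "Sprinkler": False, "GrassWet": False},
    {"Rain": False, "Sprinkler": True, "GrassWet": True},
]
# -> {(True, False): 0.5, (False, True): 1.0}
print(estimate_cpt(samples, "GrassWet", ["Rain", "Sprinkler"]))
</syntaxhighlight>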
= Data-driven learning =
Instead of relying on domain knowledge, one can use learning algorithms to automatically learn the dependencies between the variables from data.
An example is score-based structure search using the Bayesian Information Criterion (BIC), which scores each candidate structure by how well it fits the data, penalized by its complexity.
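For reference, under one common convention the BIC score of a candidate structure <math>G</math> with maximized likelihood <math>\hat{L}</math>, <math>d</math> free parameters, and <math>N</math> training samples is
<math>
\mathrm{BIC}(G) = \log\hat{L} - \frac{d}{2}\log N
</math>
Structure search keeps the highest-scoring graph; the second term penalizes dense structures.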
= Analysis =
Bayesian networks are interpretable.
However, they are computationally expensive on high-dimensional data: a node's conditional probability table grows exponentially with its number of parents, so a node with <math>k</math> binary parents needs <math>2^k</math> table entries. This is also the motivation behind [[Naive Bayes]].


= Application =


Bayesian networks have applications in machine learning tasks that deal with dependent features. Bayes' theorem can be used in classification problems.


An example is [[Part-of-speech]] tagging, where each word in a string is classified grammatically. This involves a complex network of dependencies between objects, subjects, verbs, nouns, etc. that can be modeled and optimized with a Bayesian network.

Another example is spam detection, where each node represents the conditional probability that a word appears given that the email is spam.
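A minimal sketch of that idea, combining the word nodes with Bayes' theorem under the [[Naive Bayes]] independence simplification (all probabilities below are made up):
<syntaxhighlight lang="python">
# Hypothetical per-word probabilities: P(word | spam) and P(word | ham).
p_word_given_spam = {"winner": 0.30, "meeting": 0.02, "free": 0.25}
p_word_given_ham  = {"winner": 0.01, "meeting": 0.20, "free": 0.05}
p_spam = 0.4  # assumed prior probability of spam

def spam_posterior(words):
    """P(spam | words) via Bayes' theorem, treating words as
    conditionally independent given the spam/ham label."""
    score_spam = p_spam
    score_ham = 1.0 - p_spam
    for w in words:
        score_spam *= p_word_given_spam.get(w, 0.5)
        score_ham *= p_word_given_ham.get(w, 0.5)
    return score_spam / (score_spam + score_ham)

print(spam_posterior(["winner", "free"]))  # ~0.99 for spam-flavored words
</syntaxhighlight>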

[[Category:Machine Learning]]