Bayesian network

From Rice Wiki


The Bayesian network is a network probabilistic, graphical model that describes dependencies.

Properties

DAG

Bayesian networks are directed, acyclic graphs.

Each edge identifies a causal relation, usually temporal: something that happened in the future cannot cause something to happen in the past. As such, the graph is acyclic.

Each node is a conditional probability: Given that the parents happen, what is the probability of the node event happening.

Joint probability

Bayesian networks are primarily used to calculate the joint probability of an event given its dependencies. This can be done with the following formula

By relaxing some assumptions, Naive Bayes reduces the computational complexity.

Training

After the structure of a Bayesian network is defined, its parameters (the conditional probabilities) are estimated by training data. The model structure is then refined and the parameters updated again.

Data-driven learning

Instead of using domain knowledge, one can use learning algorithms to automatically learn the dependencies between the variables using data.

An example algorithm is BIC.

Analysis

Bayesian networks are interpretable.

However, they are computationally expensive on higher dimensional data. We need to compute a large number of combinations of feature input probabilities. This is also the motivation behind Naive Bayes

Application

Bayesian networks have applications in machine learning tasks that deal with dependent features. Bayes theorem can be used in classification problems.

An example is spam detection. Each node represents the conditional probability of a word being part of a spam email.