Bayesian Machine Learning

 

Bayesian machine learning is a way of using Bayes theorem to estimate the posterior of model distribution given the observed data.

Parameter Estimation

Parameter estimation is a common application of Bayesian machine learning. Given a model with some unknown parameters \theta , we use Bayes theorem to estimate the probability distribution p(\theta|x) of these parameters.

To make the posterior distributions easy to calculate, we often select the prior distribution to be conjugate prior as the posterior distributions.

CMU lecture video:

https://scs.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=49482c13-0a60-4b02-8220-cf70f20ecf3a

Maximum a Posteriori (MAP) Estimation

Calculating the posterior distributions can involve complex integrals that are not directly calculatable. MAP is a method to simplify this calculation. Instead of calculating the whole posterior distribution, we estimate \theta to be the point with maximum posterior probability. It is equivalent to finding the mode in the distribution. The problem then becomes an optimization problem: to find the variable \theta that maximize the posterior probability of the model.

Although MAP is easier to implement, it also has some problems.

Problem 1: the mode can be an atypical point in the distribution. The following figure is an example of it.

Evernote Snapshot 20170313 105331

Problem 2: MAP estimate is not invariant to reparmeterization. That is, given a transformation y=f(x), the mode of y does not necessarily equal to f(x_{mode}).

Reference:

https://metacademy.org/roadmaps/rgrosse/bayesian_machine_learning

Envisage

verb (used with object)envisaged, envisaging.
1.to contemplate; visualize:

He envisages an era of great scientific discoveries.
2.Archaic. to look in the face of; face.
Synonyms: confront, consider, image, picture, regard, visualize

For example, in GENI it is envisaged that a researcher will be allocated a slice of re- sources across the whole network, consisting of a portion of network links, packet processing elements (e.g. routers) and end-hosts; researchers program their slices to behave as they wish.