Bayesian Machine Learning


Bayesian machine learning uses Bayes' theorem to estimate the posterior distribution of a model's unknown quantities given the observed data.

Parameter Estimation

Parameter estimation is a common application of Bayesian machine learning. Given a model with unknown parameters \theta and observed data x, we use Bayes' theorem to estimate the posterior distribution p(\theta|x) of these parameters.
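Written out, the update takes the following form. This is the standard statement of Bayes' theorem for parameters; the names "likelihood" for p(x|\theta), "prior" for p(\theta) and "evidence" for p(x) are the usual ones and are not used elsewhere in this post.

```latex
p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{p(x)},
\qquad
p(x) = \int p(x \mid \theta)\, p(\theta)\, d\theta
```

The integral in the denominator is typically the part that is hard to compute.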

To make the posterior distribution easy to calculate, we often choose a prior that is conjugate to the likelihood, so that the posterior belongs to the same family of distributions as the prior.
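As a small illustration of conjugacy, here is a sketch using the Beta prior with a Bernoulli likelihood, a standard conjugate pair. The function name, the Beta(2, 2) prior and the coin-flip data are made up for this example.

```python
def beta_bernoulli_update(alpha, beta, observations):
    """Return the Beta posterior parameters after observing 0/1 outcomes.

    With a Beta(alpha, beta) prior and a Bernoulli likelihood, the posterior
    is Beta(alpha + successes, beta + failures), so no integral is needed.
    """
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures


# Hypothetical data: 6 ones and 2 zeros.
data = [1, 0, 1, 1, 0, 1, 1, 1]

# Hypothetical prior: Beta(2, 2).
posterior = beta_bernoulli_update(2.0, 2.0, data)

# Mean of a Beta(a, b) distribution is a / (a + b).
posterior_mean = posterior[0] / (posterior[0] + posterior[1])

print(posterior, posterior_mean)
```

The posterior is again a Beta distribution, so updating reduces to adding counts to the prior parameters.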

CMU lecture video:

Maximum a Posteriori (MAP) Estimation

Calculating the full posterior distribution can involve integrals that have no closed form. MAP estimation is a way to avoid this calculation. Instead of computing the whole posterior distribution, we estimate \theta as the point with maximum posterior density, which is the mode of the posterior. The problem then becomes an optimization problem: find the \theta that maximizes p(\theta|x). Because p(x) does not depend on \theta, it is enough to maximize p(x|\theta)p(\theta).
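The optimization view can be sketched in a few lines. The example below assumes a Bernoulli likelihood with a Beta(2, 2) prior and made-up counts, and uses a plain grid search as the optimizer for simplicity; in practice a gradient-based optimizer would be more common.

```python
import math


def log_posterior(theta, heads, tails, alpha=2.0, beta=2.0):
    """Unnormalized log posterior: Bernoulli log-likelihood plus Beta log-prior.

    The evidence p(x) is omitted because it does not depend on theta.
    """
    log_likelihood = heads * math.log(theta) + tails * math.log(1.0 - theta)
    log_prior = (alpha - 1.0) * math.log(theta) + (beta - 1.0) * math.log(1.0 - theta)
    return log_likelihood + log_prior


# Hypothetical counts.
heads, tails = 6, 2

# Grid over the open interval (0, 1); the endpoints are excluded to avoid log(0).
grid = [i / 10000 for i in range(1, 10000)]
theta_map = max(grid, key=lambda t: log_posterior(t, heads, tails))

# Closed-form mode of the Beta(alpha + heads, beta + tails) posterior, for comparison.
theta_closed_form = (2.0 + heads - 1.0) / (2.0 + 2.0 + heads + tails - 2.0)

print(theta_map, theta_closed_form)
```

The grid search only needs the unnormalized posterior, which is why MAP avoids the integral over \theta.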

Although MAP is easier to implement, it also has some problems.

Problem 1: the mode can be an atypical point of the distribution, far from where most of the probability mass lies. The following figure shows an example.

[Figure: Evernote Snapshot 20170313 105331]
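Problem 1 can also be checked numerically. The sketch below uses a made-up density, a mixture of a narrow Gaussian spike and a broad Gaussian bump, which is not necessarily the distribution shown in the figure. The mixture weights, means and standard deviations are assumptions chosen for illustration.

```python
import math


def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


# Hypothetical mixture as (weight, mean, std): a tall narrow spike at 0
# holding 20% of the mass and a broad bump at 5 holding 80% of the mass.
COMPONENTS = [(0.2, 0.0, 0.05), (0.8, 5.0, 1.0)]


def density(x):
    return sum(w * normal_pdf(x, m, s) for w, m, s in COMPONENTS)


def mass_between(lo, hi):
    return sum(w * (normal_cdf(hi, m, s) - normal_cdf(lo, m, s)) for w, m, s in COMPONENTS)


# Locate the mode by grid search over [-2, 10].
grid = [i / 1000 for i in range(-2000, 10001)]
mode = max(grid, key=density)

# Mean of a mixture is the weighted average of the component means.
mean = sum(w * m for w, m, s in COMPONENTS)

# Probability mass within one unit of the mode.
mass_near_mode = mass_between(mode - 1.0, mode + 1.0)

print(mode, mean, mass_near_mode)
```

In this example the mode sits on the spike, while most of the probability mass, and the mean, lie around the broad bump.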

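Problem 2: the MAP estimate is not invariant to reparameterization. That is, given a transformation y=f(x), the mode of the distribution of y is not necessarily equal to f(x_{mode}), because the density picks up a Jacobian factor under a nonlinear change of variables.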
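A quick numerical check of Problem 2, as a sketch under assumed choices: x is taken to be standard normal and the transformation is f(x) = exp(x), so y is log-normal. Both choices are made up for illustration.

```python
import math

# Hypothetical parameters of the distribution of x.
MU, SIGMA = 0.0, 1.0


def density_x(x):
    z = (x - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))


def density_y(y):
    # Change of variables for y = exp(x): p_y(y) = p_x(log y) * |d(log y)/dy|
    # which equals p_x(log y) / y.
    return density_x(math.log(y)) / y


# Find both modes by grid search.
x_grid = [i / 1000 for i in range(-5000, 5001)]
y_grid = [i / 1000 for i in range(1, 10001)]
x_mode = max(x_grid, key=density_x)
y_mode = max(y_grid, key=density_y)

# Pushing the mode of x through f does not give the mode of y.
mapped_x_mode = math.exp(x_mode)

print(x_mode, mapped_x_mode, y_mode)
```

Here f(x_{mode}) is 1, while the mode of y is close to exp(-1), which matches the known log-normal mode exp(\mu - \sigma^2).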
