PMTK: Probabilistic Modeling Toolkit

Written by Kevin Murphy and Matt Dunham.
Includes code written by various other people.
Last updated: 25 May 2009.

Important notice (15 July 2009): PMTK is no longer being developed. A brand new version, called PMTK 2.0, was started, with a much cleaner design (slides describing an initial version are available). However, PMTK 2 is also no longer being developed, since Matt Dunham has left UBC. Thus both of these packages are in an incomplete state. Sorry.

PMTK is a Matlab package for probabilistic modeling of data. A large variety of models are supported, including multivariate Gaussians, (sparse) linear and logistic regression models, directed and undirected graphical models, etc. A large variety of algorithms are also supported, for both Bayesian inference (including exact computation, dynamic programming and MCMC) and MAP/ML estimation (including EM, bound optimization, conjugate and projected gradient methods, etc.).

PMTK is designed to accompany my book Machine learning: a probabilistic approach, but can be used independently of it. Consequently, PMTK emphasises readable source code; the goal is to provide "reference" implementations of commonly used methods, with a unified interface. (Of course, the code also has to be fast enough to be useful.) The toolkit is built around the "holy trinity" of Bayesian statistics, graphical models and machine learning.

Note: PMTK requires Matlab 2008a or newer to run, since it uses the latest object-oriented features of Matlab. We chose Matlab instead of R because, while R has many useful statistical packages, many of the more interesting ones do not provide high-level source code, making them hard to understand and/or modify. (For example, glasso is written in Fortran, and many other packages are written in C.) In addition, Matlab is about 2-5 times faster than R.
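The version requirement above can be checked before loading the toolkit. The following is a minimal sketch, not part of PMTK itself: it relies on Matlab's built-in verLessThan and on the fact that release 2008a corresponds to Matlab version 7.6. The variable name tooOld and the error identifier are made up for illustration.

```matlab
% Sketch: stop early if this Matlab is older than 2008a (version 7.6).
% tooOld and the identifier 'PMTK:oldMatlab' are illustrative names only.
tooOld = verLessThan('matlab', '7.6');
if tooOld
    error('PMTK:oldMatlab', ...
          'PMTK needs Matlab 2008a (7.6) or newer; found %s.', version);
end
```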

Documentation

Detailed documentation will be added once the package is debugged and stable.
Meanwhile, you may find the following useful:

Installation

Basics

To get started, do the following: unzip pmtk.zip, start Matlab, and then type
cd pmtk
loadPMTK
To check it's working, type
testPMTK
To run all the demos listed here you can use
runDemos
This takes about 20 minutes.

Compiling C code

Advanced users may wish to compile some of the C code as follows:
mex -setup % only needed once per matlab installation
compilePMTKmex
After compiling, check that 'testPMTK' still works. If not, you can roll back to pure Matlab code using
removePMTKmex
PMTK includes Tom Minka's lightspeed 2.2 library. There are some problems compiling this on Macs, so PMTK will not compile the C version of the lightspeed functions if it detects you are using a Mac.
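The compile, test and roll back cycle described above can be scripted. This is only a sketch under one assumption that the text does not confirm: that testPMTK raises an error when something fails. The variable mexOk and the warning identifier are made up for illustration; compilePMTKmex, testPMTK and removePMTKmex are the commands named above.

```matlab
% Sketch of the compile / test / roll back cycle.
% Assumption: testPMTK raises an error when a test fails.
mexOk = false;
if exist('compilePMTKmex', 'file')   % only act if PMTK is on the path
    compilePMTKmex;
    try
        testPMTK;
        mexOk = true;                % compiled code passed the tests
    catch err
        removePMTKmex;               % fall back to pure Matlab code
        warning('PMTK:mexRollback', ...
                'Rolled back the mex files: %s', err.message);
    end
end
```

If testPMTK reports failures without raising an error, the try/catch will not trigger and the output has to be inspected by hand instead.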

Graphviz

PMTK includes the graphlayout class, which can be used to visualize graph structures and then edit them interactively. For best results, first install graphviz and add it to your path. To check that it is installed correctly, type the following from within Matlab:
system('neato -V')
As an example of an automatically produced layout (using graphviz), click here. If you cannot, or do not want to, install graphviz, graphlayout can still do a "bare bones" layout.
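The graphviz check above can also be done programmatically, by capturing the exit status of the command instead of reading its output. This is a minimal sketch that assumes a zero exit status means neato was found on the path; the variable hasGraphviz is made up for illustration.

```matlab
% Sketch: test whether graphviz's neato is reachable from Matlab.
% Assumption: a zero exit status means neato is installed and on the path.
[status, out] = system('neato -V');
hasGraphviz = (status == 0);
if hasGraphviz
    fprintf('graphviz found. %s\n', strtrim(out));
else
    disp('neato not found; graphlayout can still do a bare-bones layout.');
end
```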

Examples of use

Here are a few "tasters"