I decided to revisit the state map recognition problem, only this time, rather than using an SVM on the Hu moments, I used an MLP. This is not the same as applying a "deep neural network" to the raw image pixels themselves, as I am still using domain-specific knowledge to build my features (the Hu moments).
As of this writing, scikit-learn does not support MLPs (see this GSoC project for plans to add the feature). Instead, I turned to pylearn2, the machine learning library from the LISA lab. While pylearn2 is not as easy to use as scikit-learn, there are some great tutorials to get you started.
I added a line to the bottom of my last notebook to dump the Hu moments to a CSV so that I could start working in a fresh notebook.
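It was something along these lines, assuming the moments are collected in a pandas DataFrame named `hu_df` (that name and the CSV filename are placeholders):

```python
# hu_df: one row per image with the seven Hu moments plus a label
# column (both names are placeholders for whatever the notebook used).
hu_df.to_csv('hu_moments.csv', index=False)
```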
In my new notebook, I performed the same feature normalization that I had for the SVM: taking the log of the Hu moments and then mean-centering and std-dev scaling those log-transformed values. I did test the MLP classifier without this normalization, and, as with the SVM, classification performance degraded.
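Roughly, that normalization looks like this (a sketch; the column layout and the guard against taking the log of zero or negative moments are my assumptions):

```python
import numpy as np
import pandas as pd

df = pd.read_csv('hu_moments.csv')    # file dumped from the last notebook
X = df.drop('label', axis=1).values   # 'label' column name is assumed

# Hu moments span many orders of magnitude, so work in log space.
# Using the absolute value is an assumption (some moments can be
# negative), and the small epsilon guards against log(0).
X_log = np.log(np.abs(X) + 1e-30)

# Mean-center and scale each feature to unit standard deviation.
X_norm = (X_log - X_log.mean(axis=0)) / X_log.std(axis=0)
```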
With the data normalized, I shuffled the rows and then split them into test, validation, and training sets (15%, 15%, and 70% respectively). The simplest way I found to shuffle the rows of a Pandas DataFrame was `df.ix[np.random.permutation(df.index)]`. I initially tried `df.apply(np.random.permutation)`, which led to each column being shuffled independently (and my fits being horrible).
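Put together, the shuffle and three-way split might look like this (a sketch; `df` is assumed to hold the normalized features and labels):

```python
import numpy as np

# Shuffle whole rows. Note that df.apply(np.random.permutation) would
# shuffle each column independently and scramble the feature/label pairs.
shuffled = df.ix[np.random.permutation(df.index)]

# 70% train / 15% validation / 15% test.
n = len(shuffled)
n_train = int(0.70 * n)
n_valid = int(0.85 * n)

shuffled[:n_train].to_csv('train.csv', index=False)
shuffled[n_train:n_valid].to_csv('valid.csv', index=False)
shuffled[n_valid:].to_csv('test.csv', index=False)
```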
With the three data sets saved to separate CSV files, I could then use the CSVDataset in the training YAML to load each one. I created my YAML file by taking the one used for classifying the MNIST digits in the MLP tutorial, modifying the datasets and changing `n_classes` (output classes) from 10 to 50 and `nvis` (input features) from 784 to 7. I also had to reduce `sparse_init` from 15 to 7 after a mildly helpful assertion error was thrown (I am not really sure what this variable does, but it has to be <= the number of input features). For good measure, I also reduced the size of the first layer from 500 nodes to 50, given that I have far fewer features than the 784 pixels in the MNIST images.
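For reference, here is a trimmed sketch of what the modified YAML could look like, embedded as a Python string so the notebook stays self-contained (the CSVDataset keyword arguments, file layout, and SGD settings are my assumptions, not a copy of the original config):

```python
from pylearn2.config import yaml_parse

yaml = """
!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.csv_dataset.CSVDataset {
        path: 'train.csv',   # assumed layout: label in the first column
        task: 'classification',
        expect_headers: True,
    },
    model: !obj:pylearn2.models.mlp.MLP {
        nvis: 7,             # 7 Hu moments instead of 784 pixels
        layers: [
            !obj:pylearn2.models.mlp.Sigmoid {
                layer_name: 'h0',
                dim: 50,         # first layer shrunk from 500 to 50
                sparse_init: 7,  # must be <= number of input features
            },
            !obj:pylearn2.models.mlp.Softmax {
                layer_name: 'y',
                n_classes: 50,   # 50 states instead of 10 digits
                irange: 0.,
            },
        ],
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        batch_size: 10,      # small batches for a small dataset (assumed)
        learning_rate: .01,
        monitoring_dataset: {
            'train': *train,
            'valid': !obj:pylearn2.datasets.csv_dataset.CSVDataset {
                path: 'valid.csv', task: 'classification',
                expect_headers: True,
            },
            'test': !obj:pylearn2.datasets.csv_dataset.CSVDataset {
                path: 'test.csv', task: 'classification',
                expect_headers: True,
            },
        },
        termination_criterion: !obj:pylearn2.termination_criteria.EpochCounter {
            max_epochs: 400,
        },
    },
    save_path: 'mlp.pkl',
    save_freq: 1,
}
"""

train = yaml_parse.load(yaml)
train.main_loop()
```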
After that I ran the fit and saw that my `test_y_misclass` went to 0.0 in the final epochs. I was able to plot this convergence using the `plot_monitor` script. Since I like to have my entire notebook self-contained, I ended up modifying the `plot_monitor` script to run as a function call from within IPython Notebook (see my fork).
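If you would rather skip the script entirely, the monitor channels can also be plotted by hand from the pickled model; a rough sketch (channel names follow the dataset names in the YAML above):

```python
import matplotlib.pyplot as plt
from pylearn2.utils import serial

# Load the model saved during training (path matches save_path above).
model = serial.load('mlp.pkl')

# Each monitoring channel records its value at every epoch.
channel = model.monitor.channels['test_y_misclass']
plt.plot(channel.epoch_record, channel.val_record)
plt.xlabel('epoch')
plt.ylabel('test_y_misclass')
plt.show()
```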
Here is the entire IPython Notebook.
I didn’t tinker with regularization, but this topic is well addressed in the pylearn2 tutorials.
In a future post, I plan to use a CNN to classify the state map images without having to rely on domain-specific knowledge for feature design.