I decided to revisit the state map recognition problem, only this time, rather than using an SVM on the Hu moments, I used an MLP. This is not the same as applying a "deep neural network" to the raw image pixels themselves, as I am still using domain-specific knowledge to build my features (the Hu moments).
As of this writing, scikit-learn does not support MLPs (see this GSoC project for plans to add the feature). Instead, I turned to pylearn2, the machine learning library from the LISA lab. While pylearn2 is not as easy to use as scikit-learn, there are some great tutorials to get you started.
I added a line to the bottom of my last notebook to dump the Hu moments to a CSV so that I could start working in a fresh notebook.
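It was something along these lines, assuming the moments are collected in a pandas DataFrame named `hu_df` (that name and the CSV filename are placeholders):

```python
# hu_df: one row per image with the seven Hu moments plus a label
# column (both names are placeholders for whatever the notebook used).
hu_df.to_csv('hu_moments.csv', index=False)
```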
In my new notebook, I performed the same feature normalization that I had for the SVM: taking the log of the Hu moments and then mean-centering and std-dev scaling those log-transformed values. I did test the MLP classifier without this normalization, and, as with the SVM, classification performance degraded.
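Roughly, that normalization looks like this (a sketch; the column layout and the guard against taking the log of zero or negative moments are my assumptions):

```python
import numpy as np
import pandas as pd

df = pd.read_csv('hu_moments.csv')    # file dumped from the last notebook
X = df.drop('label', axis=1).values   # 'label' column name is assumed

# Hu moments span many orders of magnitude, so work in log space.
# Using the absolute value is an assumption (some moments can be
# negative), and the small epsilon guards against log(0).
X_log = np.log(np.abs(X) + 1e-30)

# Mean-center and scale each feature to unit standard deviation.
X_norm = (X_log - X_log.mean(axis=0)) / X_log.std(axis=0)
```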
With the data normalized, I shuffled the rows and then split them into test, validation, and training sets (15%, 15%, and 70% respectively). The simplest way I found to shuffle the rows of a Pandas DataFrame was `df.ix[np.random.permutation(df.index)]`. I initially tried `df.apply(np.random.permutation)`, which led to each column being shuffled independently (and my fits being horrible).
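Put together, the shuffle and three-way split might look like this (a sketch; `df` is assumed to hold the normalized features and labels):

```python
import numpy as np

# Shuffle whole rows. Note that df.apply(np.random.permutation) would
# shuffle each column independently and scramble the feature/label pairs.
shuffled = df.ix[np.random.permutation(df.index)]

# 70% train / 15% validation / 15% test.
n = len(shuffled)
n_train = int(0.70 * n)
n_valid = int(0.85 * n)

shuffled[:n_train].to_csv('train.csv', index=False)
shuffled[n_train:n_valid].to_csv('valid.csv', index=False)
shuffled[n_valid:].to_csv('test.csv', index=False)
```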
With the three data sets saved to separate CSV files, I could then use the CSVDataset in the training YAML to load each one. I created my YAML file by taking the one used for classifying the MNIST digits in the MLP tutorial, modifying the datasets and changing `n_classes` (output classes) from 10 to 50 and `nvis` (input features) from 784 to 7. I also had to reduce `sparse_init` from 15 to 7 after a mildly helpful assertion error was thrown (I am not really sure what this variable does, but it has to be <= the number of input features). For good measure, I also reduced the size of the first layer from 500 nodes to 50, given that I have far fewer features than the 784 pixels in the MNIST images.
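For reference, here is a trimmed sketch of what the modified YAML could look like, embedded as a Python string so the notebook stays self-contained (the CSVDataset keyword arguments, file layout, and SGD settings are my assumptions, not a copy of the original config):

```python
from pylearn2.config import yaml_parse

yaml = """
!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.csv_dataset.CSVDataset {
        path: 'train.csv',   # assumed layout: label in the first column
        task: 'classification',
        expect_headers: True,
    },
    model: !obj:pylearn2.models.mlp.MLP {
        nvis: 7,             # 7 Hu moments instead of 784 pixels
        layers: [
            !obj:pylearn2.models.mlp.Sigmoid {
                layer_name: 'h0',
                dim: 50,         # first layer shrunk from 500 to 50
                sparse_init: 7,  # must be <= number of input features
            },
            !obj:pylearn2.models.mlp.Softmax {
                layer_name: 'y',
                n_classes: 50,   # 50 states instead of 10 digits
                irange: 0.,
            },
        ],
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        batch_size: 10,      # small batches for a small dataset (assumed)
        learning_rate: .01,
        monitoring_dataset: {
            'train': *train,
            'valid': !obj:pylearn2.datasets.csv_dataset.CSVDataset {
                path: 'valid.csv', task: 'classification',
                expect_headers: True,
            },
            'test': !obj:pylearn2.datasets.csv_dataset.CSVDataset {
                path: 'test.csv', task: 'classification',
                expect_headers: True,
            },
        },
        termination_criterion: !obj:pylearn2.termination_criteria.EpochCounter {
            max_epochs: 400,
        },
    },
    save_path: 'mlp.pkl',
    save_freq: 1,
}
"""

train = yaml_parse.load(yaml)
train.main_loop()
```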
After that I ran the fit and saw that my `test_y_misclass` went to 0.0 in the final epochs. I was able to plot this convergence using the `plot_monitor` script. Since I like to have my entire notebook self-contained, I ended up modifying the `plot_monitor` script to run as a function call from within IPython Notebook (see my fork).
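If you would rather skip the script entirely, the monitor channels can also be plotted by hand from the pickled model; a rough sketch (channel names follow the dataset names in the YAML above):

```python
import matplotlib.pyplot as plt
from pylearn2.utils import serial

# Load the model saved during training (path matches save_path above).
model = serial.load('mlp.pkl')

# Each monitoring channel records its value at every epoch.
channel = model.monitor.channels['test_y_misclass']
plt.plot(channel.epoch_record, channel.val_record)
plt.xlabel('epoch')
plt.ylabel('test_y_misclass')
plt.show()
```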
Here is the entire IPython Notebook.
I didn’t tinker with regularization, but this topic is well addressed in the pylearn2 tutorials.
In a future post, I plan to use a CNN to classify the state map images without having to rely on domain-specific knowledge for feature design.