Sunday, July 15, 2018

Scikit Learn: Timeout When Fetching MNIST original

I've been learning some Machine Learning using Scikit Learn.  It was not a good start: I kept getting a timeout when fetching the MNIST original dataset.  The code was like this


from sklearn.datasets import fetch_mldata
mnist = fetch_mldata('MNIST original')

It turns out Scikit Learn fetches the dataset from mldata.org.  I don't know whether I was just unlucky or the site is generally unstable.  I ended up downloading the data from somewhere else first, then moving mnist-original.mat to


~/scikit_learn_data/mldata/mnist-original.mat

For Windows users, the ~ is your home directory.  That's where Scikit Learn stores downloaded datasets, so it works as a cache too: once the file is there, fetch_mldata loads it from disk instead of going to mldata.org.  Now it works.
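If you'd rather script the manual step, something like the sketch below should do it.  It only uses the Python standard library.  The helper names (mnist_cache_path, place_mnist_in_cache) are made up for this post, and it assumes Scikit Learn is using its default data directory, ~/scikit_learn_data, which is the case unless you changed it (for example with the SCIKIT_LEARN_DATA environment variable).

```python
import os
import shutil
import urllib.request

# Mirror URL from this post; swap in another source if it goes away.
MNIST_URL = (
    "https://github.com/amplab/datascience-sp14/raw/master/"
    "lab7/mldata/mnist-original.mat"
)


def mnist_cache_path(data_home=None):
    """Return the path where the cached MNIST file is expected (hypothetical helper).

    Assumes the default Scikit Learn data directory unless data_home is given.
    """
    if data_home is None:
        data_home = os.path.join("~", "scikit_learn_data")
    return os.path.join(
        os.path.expanduser(data_home), "mldata", "mnist-original.mat"
    )


def place_mnist_in_cache(url=MNIST_URL, data_home=None, timeout=60):
    """Download the .mat file into the cache unless it is already there."""
    target = mnist_cache_path(data_home)
    if os.path.exists(target):
        return target  # already cached, nothing to do

    os.makedirs(os.path.dirname(target), exist_ok=True)

    # Write to a temporary name first so an interrupted download
    # never leaves a truncated file at the final cache path.
    partial = target + ".part"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(partial, "wb") as out:
            shutil.copyfileobj(response, out)
    os.replace(partial, target)
    return target
```

Call place_mnist_in_cache() once before fetch_mldata('MNIST original').  Since the file is then already in the cache directory, the fetch should not need to contact mldata.org.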

The dataset was downloaded from:


https://github.com/amplab/datascience-sp14/raw/master/lab7/mldata/mnist-original.mat