Image Processing and Deep Learning (EEEM063)
In this lab you will use a web-based interface called “DIGITS” to run some simple classification experiments using a convolutional neural network (CNN). This lab assumes use of DIGITS v3.0.0 or higher.
CNNs are state-of-the-art deep neural networks that perform well at machine perception tasks such as image classification. You will start learning about these in detail in Week 6.
To do this work you will need to connect to a teaching server called aineko1 on the campus network. The server is not visible off campus, so you are encouraged to use the lab or library PCs on campus to do this work. If you want to use your own laptop then you will be able to connect via the campus-wide “eduroam” wifi network. The campus-wide “The Cloud” network will not work. It is possible to connect from off campus using the facility and entering the web interface address, including http://, into the text box in the upper-right of the portal.
The address of the web interface is
Verify now that you can connect to the web interface via your browser, or you will not be able to progress any further.
It is possible to install your own version of DIGITS on your local lab PC (e.g. in the Penguin lab). Since we have a large class this year, this may be useful if the aineko server becomes very busy. You can refer to the supplementary instructions on SurreyLearn if you find it desirable to do this. If you want to install DIGITS on your own machine then you might find those instructions a useful starting point, but note that we do not have the resources to assist 50+ students installing DIGITS on their own variants/configurations of Linux, so if you try this you are on your own.
The aineko server (pictured right) is an Intel i7 PC with 4 Nvidia Titan-X GPUs which power our deep learning experiments.
- Getting Started
We will be working with a dataset of hand-written digits 0-9 called MNIST, derived from hand-writing samples collected by NIST. It contains 70k images, each only 28×28 pixels in size.
You must first create your own work area on the server and download the database.
Remotely log in to the teaching server using ssh. In the Linux labs you can open up a terminal window using Ctrl-Alt-T and then enter the following
ssh aineko.eps.surrey.ac.uk -l your_username
Note that the character after the hyphen is a lower-case L, not a 1! (The equivalent form ssh your_username@aineko.eps.surrey.ac.uk also works.) Please substitute your actual username. The password is your URN. If you are prompted by an “are you sure…” prompt just type the word yes. If you can’t log in then we will have to create an account for you.
Alternatively you may issue the same command in the Mac OS “Terminal” (usually found under /Applications/Utilities) or use a Windows ssh client such as PuTTY.
When you have logged in, please create a folder in the ‘scratch’ area of the server to work in:
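For example (a minimal sketch – the exact path is assumed from the download step below; substitute your actual username):

```shell
# Path assumed from the download step later in this sheet.
mkdir -p /scratch/Teaching/your_username   # -p: no error if the folder already exists
cd /scratch/Teaching/your_username
```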
Please note that anyone will be able to access anyone else’s work on this part of the teaching server, so take care not to trample over each other’s files or leave anything sensitive, such as coursework submissions, in this space.
Now download the MNIST dataset in a format that can be used with DIGITS, using the built-in tool
cd /opt/DIGITS
python -m digits.download_data mnist /scratch/Teaching/your_username/mnist
Change into the workspace you created:
cd /scratch/Teaching/your_username
If you type ls to list the files in your workspace you will see that a folder mnist has been created containing folders train and test. Both folders contain subfolders 0,1,2..,9 which contain the images.
We will be using train as our training image set (contains around 60k images) which we will show to the CNN during training. We will use test as our test image set (contains around 10k images) which we won’t show to the CNN during training, but will use after training to measure how well the CNN has learned to recognise the ten kinds of digit.
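You can sanity-check these set sizes yourself with standard shell tools (a sketch; run from inside the mnist folder, and expect the counts to be approximate):

```shell
# Each list file has one line per image, so line counts give the set sizes.
wc -l train/train.txt   # roughly 60000 training images
wc -l test/test.txt     # roughly 10000 test images
ls train                # the ten class subfolders 0 1 2 ... 9 (plus the text files)
```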
In addition to images, each folder train and test contains a pair of text files
- train.txt or test.txt: a list of every file in the image set, followed by a space and then a number identifying its class (there are 10 classes, numbered 0-9). One line in the file corresponds to one image file.
- labels.txt: contains 10 lines, each providing a descriptive name for one of the 10 classes – which, coincidentally, in this case are also the names 0,1,2,…,9.
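To make the format concrete, here is a tiny hypothetical two-class mock-up of these files (illustrative only – the real files are produced by the download tool):

```shell
# Miniature mock-up of the DIGITS list-file format.
mkdir -p demo
printf '0/img001.png 0\n1/img002.png 1\n' > demo/train.txt  # "<image path> <class number>"
printf 'zero\none\n' > demo/labels.txt                      # line N names class number N-1
cat demo/train.txt
cat demo/labels.txt
```

Here class number 0 maps to the first line of labels.txt, exactly as in the real MNIST files.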
Take a look inside the files using the Linux cat command to see how they are formatted, e.g. from inside the mnist folder:
cat train/train.txt
cat train/labels.txt
Remember you can use Ctrl-C to stop if it is scrolling for a long time.
Imagine if we were working on a different image classification task with the ImageNet (ILSVRC) dataset, which contains over a million images of 1000 classes of object. We would see numbers in train.txt from 0-999 and then 1000 lines in labels.txt containing the actual names of each class e.g. dog, cat, tree, etc.
- Import the dataset into DIGITS
Go to the DIGITS web interface and look at the “Datasets” tab – there may or may not be datasets already listed there from other users. In any case you will be creating your own by following these steps. Click on the blue “Images” button by “New Dataset” and select “Classification” as the dataset type from the dropdown menu.
Now fill in the form you are presented with as per the following page. On the server everyone has access to everything – there are no private work areas. So, it is very important that you name everything using a standard convention. We will create a dataset called ‘yourusername_mnist_dataset’.
Make sure that everything you create in DIGITS starts with the prefix yourusername_
You need to click on the “Use Text Files” tab, which will use the train.txt etc. files we just inspected as the lists from which to build the dataset.
Note that the dataset name starts with the username prefix, e.g. jc0028_mnist_dataset. Ensure you follow this naming convention to prevent problems with other users. Note that:
- images are greyscale and of size 28×28.
- we have unchecked “validation” and checked “test”.
- we are going to use files already on the teaching server (in your area) rather than uploading them via the browser, so check “Use local paths on server”.
- the locations of the training, test and labels text files are:
  /scratch/Teaching/yourusername/mnist/train/train.txt
  /scratch/Teaching/yourusername/mnist/test/test.txt
  /scratch/Teaching/yourusername/mnist/train/labels.txt
- finally note that “image folder (optional)” is filled in with /scratch/Teaching/yourusername/
Click create and you will see some progress bars in blue on the right hand side of the screen. It will take about 60 seconds to create the dataset from the 70k images in MNIST.
If you get errors, check you didn’t leave off the trailing / on that last field, and check all spelling.
If you click on the word “DIGITS” on the top-left to go home, or go to the original URL, you will see your dataset listed among the active datasets with status “Done” (or in progress if you didn’t wait for the blue bars to reach 100%, in which case wait for the job to complete on the main screen).
- Training a CNN
Now we will train a standard CNN called “LeNet” to recognise the numbers 0-9 in the MNIST dataset. This popular yet simple CNN architecture is included as a preset within DIGITS so is easy to try out. On the Models tab/box click on the blue Images button by New Model, and pick Classification.
Then fill in the form that appears to tell DIGITS how to run the training. First you need to select the dataset you prepared (which will be easy to find, because you named it using yourusername_ as a prefix). Next you need to select the LeNet CNN. Finally you need to name this training job. Again we keep to a careful convention. We will use:
yourusername_dataset_network_anythingyouwant
So, I have used, for example, jc0028_mnist_lenet_exp1, where exp1 means experiment 1, but you can use anything you like for this. It makes sense to keep a notepad beside the PC to record what settings you used for each experiment, for ease of reference later. For now, leave all the other settings as they were originally and click Create.
You will see a blue progress bar again on the right, and a graph in the centre of the page which will update itself as training proceeds. After about 60 seconds (30 passes through the training data, or “epochs” to use the terminology), the job will be complete. If you click out of this screen back to the DIGITS home page (click on DIGITS on the top-left) you will see a list of all experiments. You will be able to get back into these results by clicking on the old experiment. Later, when you run longer experiments, you can do this to leave a job running on the server and return later to analyse the results. Please do not use more than 1 GPU out of the 4 available for any single job. Your end result should look something like this – a blue graph spanning 30 epochs that converges to zero after about 5 epochs.
The first graph is a plot of something called the “training loss”. The second is a plot of the “learning rate”. Both are vs. the epoch number 1-30. We will discuss the meaning of the graphs shortly.
NOTE: Choosing a GPU
The server you are using has 4 GPUs. You can either let DIGITS choose which GPU to use (the default option) or select GPUs at the bottom of the screen in DIGITS by highlighting them in a list.
We advise you let DIGITS choose the GPU for you by not adjusting anything in this section.
However, sometimes DIGITS will get confused and allocate all jobs to a single GPU, and you may see “out of memory”, “cuDNN 0=4” or similarly phrased errors. In this case the GPU is fully loaded and you should manually select a different GPU. If you want to see which GPUs are heavily loaded you can run the nvidia-smi command within the ssh window you created in step 1.