image-classifier-project-tutorial

Image classification system

Understanding a new dataset.
Model the data using an ANN.
Analyze the results and optimize the model.

🌱 How to start this project

Follow the instructions below:

Create a new repository based on the machine learning project by clicking here.
Open the newly created repository in Codespaces using the Codespaces button extension.
Once VS Code has finished opening in the Codespace, start your project by following the instructions below.

🚛 How to deliver this project

Once you have finished solving the exercises, be sure to commit your changes, push them to your repository, and go to 4Geeks.com to upload the repository link.

📝 Instructions

Image classification system

The dataset is composed of dog and cat photos provided as a subset of photos from a much larger collection of 3 million manually annotated photos. This data was obtained through a collaboration between Petfinder.com and Microsoft.

The dataset was originally used as a CAPTCHA, i.e., a task that humans are believed to find trivial but that a machine cannot solve, which websites use to distinguish human users from bots. The task was named "Asirra". When Asirra was introduced, it was mentioned "that user studies indicate that humans can solve it 99.6% of the time in less than 30 seconds. Barring a breakthrough in computer vision, we expect that computers will have no more than a 1/54,000 chance of solving it."

At the time the competition was published, the state-of-the-art result had been achieved with an SVM. It was described in a 2007 paper titled "Machine Learning Attacks against Asirra's CAPTCHA" (PDF), which reached 80% classification accuracy. Shortly after the task was proposed, this paper showed that it was no longer suitable as a CAPTCHA.

Step 1: Loading the dataset

The dataset is hosted on Kaggle, and you will need access to Kaggle to download it. You can find the competition here (or by copying and pasting the following link into your browser: https://www.kaggle.com/c/dogs-vs-cats/data).

Download the dataset folder and unzip the files. You will now have a folder called train containing 25,000 image files (.jpg format) of dogs and cats. The pictures are labeled by their file name, with the word dog or cat.
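Since the label lives in the file name, a small helper can extract it. The following is a minimal sketch, assuming the file names start with the class word (for example cat.0.jpg or dog.0.jpg); the function name label_from_filename is hypothetical.

```python
from pathlib import Path

# Assumption: each file name starts with its class word, e.g. "cat.0.jpg".
CLASSES = ("cat", "dog")


def label_from_filename(path):
    """Return "cat" or "dog" based on the start of the file name."""
    name = Path(path).name.lower()
    for label in CLASSES:
        if name.startswith(label):
            return label
    raise ValueError(f"Cannot infer a label from file name: {name}")


# Usage: label_from_filename("train/dog.42.jpg") gives "dog".
```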

Step 2: Visualize the input information

The first step when faced with an image classification problem is to learn as much as possible from the pictures themselves. Therefore, load and plot the first nine pictures of dogs in a single figure. Repeat the same for cats. You will see that the pictures are in color and have different shapes and sizes.

This variety of sizes and formats must be resolved before training the model. Make sure all the images have a fixed size of 200×200 pixels.
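One possible way to inspect the pictures is a 3×3 grid. This sketch assumes Matplotlib and Pillow are installed and that file names look like cat.0.jpg; the helper names first_n_images and plot_grid are hypothetical. Passing size=(200, 200) previews the resized versions.

```python
from pathlib import Path


def first_n_images(folder, label, n=9):
    """Return the first n image paths of a class, sorted by file name."""
    # Assumption: files are named "<label>.<number>.jpg".
    return sorted(Path(folder).glob(f"{label}.*.jpg"))[:n]


def plot_grid(paths, size=None):
    """Plot up to nine images in a 3x3 figure, optionally resized first."""
    # Imported here so the path helper above works without plotting libraries.
    import matplotlib.pyplot as plt
    from PIL import Image

    fig, axes = plt.subplots(3, 3, figsize=(8, 8))
    for ax in axes.ravel():
        ax.axis("off")
    for ax, path in zip(axes.ravel(), paths):
        img = Image.open(path)
        if size is not None:
            img = img.resize(size)  # size is (width, height)
        ax.imshow(img)
        ax.set_title(f"{path.name} {img.size}")
    return fig


# Usage (assumes the unzipped "train" folder is in the working directory):
# plot_grid(first_n_images("train", "dog"))
# plot_grid(first_n_images("train", "cat"), size=(200, 200))
```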

As you can see, there are a lot of images. Make sure you stick to the following rules:

If you have more than 12 gigabytes of RAM, use the Keras image processing API to load the 25,000 photos into the training dataset and resize them to 200×200 pixel square photos. The label must also be determined for each photo from its file name. Save the result as a tuple of photos and labels.
If you have no more than 12 gigabytes of RAM, load the images progressively using the Keras ImageDataGenerator class and its flow_from_directory() method. This will be slower to run, but it will run on less capable hardware. This method expects the data to be split into separate train and test directories, each containing one subdirectory per class.
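The two options above can be sketched as follows. For scale, 25,000 photos of 200×200×3 values stored as 32-bit floats take about 12 GB, which is why the first option needs that much memory. The function names, the 20% test split, and the cat.0.jpg naming pattern are assumptions made for illustration, not part of the exercise statement.

```python
import random
import shutil
from pathlib import Path

CLASSES = ("cat", "dog")  # assumption: file names start with one of these


def load_in_memory(folder, size=(200, 200)):
    """Option 1: load every photo into RAM as a (photos, labels) tuple."""
    # Imported here so the directory helper below works without TensorFlow.
    import numpy as np
    from tensorflow.keras.preprocessing.image import img_to_array, load_img

    photos, labels = [], []
    for path in sorted(Path(folder).glob("*.jpg")):
        label = path.name.split(".")[0]
        if label not in CLASSES:
            continue
        photos.append(img_to_array(load_img(str(path), target_size=size)))
        labels.append(CLASSES.index(label))  # 0 = cat, 1 = dog
    return np.asarray(photos), np.asarray(labels)


def split_into_class_dirs(src_dir, dst_dir, test_ratio=0.2, seed=42):
    """Option 2: copy photos into dst_dir/{train,test}/{cat,dog}/."""
    rng = random.Random(seed)
    counts = {"train": 0, "test": 0}
    for path in sorted(Path(src_dir).glob("*.jpg")):
        label = path.name.split(".")[0]
        if label not in CLASSES:
            continue
        subset = "test" if rng.random() < test_ratio else "train"
        target = Path(dst_dir) / subset / label
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target / path.name)
        counts[subset] += 1
    return counts
```

The second helper produces the layout that flow_from_directory() reads: one directory per subset, with one subdirectory per class inside it.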

Once you have all the images processed, create one ImageDataGenerator object for the training data (trdata) and another for the test data (tsdata). Then pass the folder that contains the training data to the trdata object and, similarly, pass the folder that contains the test data to the tsdata object. This way, the images are labeled automatically from their subdirectory names, and everything is ready to feed into the network.
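As an illustration, the two generators might be set up like this. It is a sketch that assumes TensorFlow's bundled Keras; the directory arguments, the batch size of 32, and the pixel rescaling to the 0 to 1 range are assumptions rather than requirements of the exercise. The class_mode of "categorical" yields one-hot labels, which suits a two-unit softmax output.

```python
IMAGE_SIZE = (200, 200)  # fixed size chosen in this tutorial
BATCH_SIZE = 32          # assumption: any size that fits in memory works


def make_generators(train_dir, test_dir, size=IMAGE_SIZE, batch_size=BATCH_SIZE):
    """Build directory iterators for the training and test folders."""
    # Imported here so the constants above are usable without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Assumption: scale pixel values from 0-255 to 0-1.
    trdata = ImageDataGenerator(rescale=1.0 / 255)
    tsdata = ImageDataGenerator(rescale=1.0 / 255)

    traindata = trdata.flow_from_directory(
        train_dir,
        target_size=size,
        batch_size=batch_size,
        class_mode="categorical",  # one-hot labels, one column per class
    )
    testdata = tsdata.flow_from_directory(
        test_dir,
        target_size=size,
        batch_size=batch_size,
        class_mode="categorical",
        shuffle=False,  # keep file order so predictions line up with files
    )
    return traindata, testdata


# Usage (hypothetical paths):
# traindata, testdata = make_generators("data/train", "data/test")
```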

Step 3: Build an ANN

Any classifier that fits this problem will have to be robust, because some images show the cat or dog in a corner, or two cats or dogs in the same picture. If you have researched some of the winning implementations of other image-related competitions, you will have seen VGG16, a CNN architecture from the team that took first place in the localization task and second place in the classification task of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2014. It is still regarded as a strong reference architecture for vision models.

It uses the following architecture, here with a two-class output layer:

from tensorflow.keras.layers import Conv2D, Dense, Flatten, MaxPool2D
from tensorflow.keras.models import Sequential

model = Sequential()
# Block 1: two 64-filter convolutions, then pooling (input_shape must match your image size)
model.add(Conv2D(input_shape=(224, 224, 3), filters=64, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(Conv2D(filters=64, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2)))
# Block 2: two 128-filter convolutions, then pooling
model.add(Conv2D(filters=128, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(Conv2D(filters=128, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2)))
# Block 3: three 256-filter convolutions, then pooling
model.add(Conv2D(filters=256, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2)))
# Block 4: three 512-filter convolutions, then pooling
model.add(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2)))
# Block 5: three 512-filter convolutions, then pooling
model.add(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2)))
# Classifier head: flatten the feature maps, two dense layers, two-class softmax output
model.add(Flatten())
model.add(Dense(units=4096, activation="relu"))
model.add(Dense(units=4096, activation="relu"))
model.add(Dense(units=2, activation="softmax"))

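The above code applies convolutions to the data (Conv2D layers, interleaved with MaxPool2D downsampling layers) and then applies fully connected layers (Dense layers) to process the features obtained from the convolutions. Note that input_shape is set to the original VGG16 input of 224×224 pixels. It must match the size of the images you feed in, so use (200, 200, 3) for the 200×200 pictures prepared in Step 2.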

Then add the remaining elements to form the model, train it and measure its performance.
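The remaining elements are typically a compile step, a training loop, and an evaluation. Below is a minimal sketch, assuming model is a Keras model with a two-unit softmax output and traindata/testdata are iterators such as those returned by flow_from_directory() with one-hot labels. The optimizer, the number of epochs, and the function name compile_and_train are placeholders to tune, not prescribed values.

```python
def compile_and_train(model, traindata, testdata, epochs=10, callbacks=None):
    """Compile the model, train it, and measure loss and accuracy."""
    model.compile(
        optimizer="adam",                 # assumption: a common default
        loss="categorical_crossentropy",  # matches one-hot labels + softmax
        metrics=["accuracy"],
    )
    history = model.fit(
        traindata,
        validation_data=testdata,  # reports val_loss and val_accuracy per epoch
        epochs=epochs,
        callbacks=callbacks,
    )
    # With one metric, evaluate() returns [loss, accuracy].
    loss, accuracy = model.evaluate(testdata)
    return history, loss, accuracy


# Usage:
# history, loss, accuracy = compile_and_train(model, traindata, testdata)
```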

Step 4: Optimize the above model

Import the ModelCheckpoint and EarlyStopping callbacks from Keras. Create an object of each and pass both as callbacks to fit (fit_generator in older Keras versions, where it has since been deprecated).
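A possible configuration of the two callbacks is sketched below. The checkpoint file name, the patience of 5 epochs, and the monitored metric are assumptions. Monitoring val_accuracy requires that the model was compiled with metrics=["accuracy"] and that validation data is passed to fit. The .keras extension assumes a recent Keras release; older releases may need .h5 instead.

```python
CHECKPOINT_PATH = "best_model.keras"  # hypothetical file name
MONITOR = "val_accuracy"              # assumption: accuracy on validation data


def make_callbacks(checkpoint_path=CHECKPOINT_PATH, patience=5):
    """Create the checkpoint and early-stopping callbacks."""
    # Imported here so the constants above are usable without TensorFlow.
    from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

    # Save the model only when the monitored metric improves.
    checkpoint = ModelCheckpoint(
        filepath=checkpoint_path,
        monitor=MONITOR,
        save_best_only=True,
        verbose=1,
    )
    # Stop training after `patience` epochs without improvement.
    early_stop = EarlyStopping(
        monitor=MONITOR,
        patience=patience,
        restore_best_weights=True,
        verbose=1,
    )
    return [checkpoint, early_stop]


# Usage:
# model.fit(traindata, validation_data=testdata, epochs=50,
#           callbacks=make_callbacks())
```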

Load the best model saved during training and use the test set to make predictions.
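This step could look like the sketch below. It assumes the best model was saved to a file by ModelCheckpoint and that testdata is a non-shuffled iterator from flow_from_directory(), whose class_indices attribute maps class names to output columns. The function names are hypothetical.

```python
def decode_predictions(probabilities, class_indices):
    """Map each row of class probabilities to its most likely class name."""
    index_to_name = {index: name for name, index in class_indices.items()}
    decoded = []
    for row in probabilities:
        best = max(range(len(row)), key=lambda i: row[i])  # argmax of the row
        decoded.append(index_to_name[best])
    return decoded


def predict_with_best_model(checkpoint_path, testdata):
    """Load the saved model and return one predicted class name per image."""
    # Imported here so decode_predictions works without TensorFlow.
    from tensorflow.keras.models import load_model

    model = load_model(checkpoint_path)
    probabilities = model.predict(testdata)  # shape: (n_images, n_classes)
    return decode_predictions(probabilities, testdata.class_indices)


# Usage (hypothetical checkpoint path):
# predictions = predict_with_best_model("best_model.keras", testdata)
```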

Step 5: Save the model

Store the model in the corresponding folder.
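A minimal sketch of this step follows. The folder name models and the file name are assumptions, since the exercise does not name them; model.save() is the standard Keras call for writing a model to disk.

```python
from pathlib import Path


def save_model(model, folder="models", filename="dogs_vs_cats_vgg16.keras"):
    """Save the model inside `folder`, creating the folder if needed."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    target_path = target_dir / filename
    model.save(str(target_path))
    return target_path


# Usage:
# save_model(model)
```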

Note: We also included a sample solution in ./solution.ipynb. We strongly suggest that you use it only if you have been stuck for more than 30 minutes or if you have already finished and want to compare it with your approach.
