Judging books by their covers
Image Recognition | Using Flux.jl for GCI'19
(Don't) judge a book by its cover.
Task Description
Create a machine learning model to predict the category of a book from its cover image. This task is inspired by this paper. Your task is to use the Flux machine learning library to predict the category of books in this dataset based on their cover images. You can find the Flux documentation here and sample models for image categorization in the model zoo. We recommend starting with a simple model like this one and then optionally using a more complex one if you are interested.
Aim :
In this notebook, I'll attempt to judge a book by its cover (sorry Mom!). Pretty simple, right? I think not... Shoutout to Akshat Mehortra and Mudit Somani for their helpful messages in GCI Slack.
1. Importing required libraries.
using Flux
using CSV, Images, FileIO
2. Getting the data
Data is sourced from The Book Dataset. We'll use CSV to read the listing file and FileIO to load the cover images into variables.
It would have been better if the researchers had also provided a Julia script to download the full images. I'll try writing one myself when I get some free time.
Data Courtesy :
B. K. Iwana, S. T. Raza Rizvi, S. Ahmed, A. Dengel, and S. Uchida, "Judging a Book by its Cover," arXiv preprint arXiv:1610.09204 (2016).
data_train_csv = CSV.File("book30-listing-train.csv");
data_train_csv[42]
So we can see that every item (or row here) is of the form,
ID | FileName | Image URL | Title | Author | CategoryNum | Category
From the data README on GitHub, we learn that there are 30 categories of books, each with 1,710 train and 190 test images.
Total number of images: 51,300 (train) | 5,700 (test)
Our model will accept an image as a vector of floats. I'll also convert it to greyscale, a common preprocessing step in image-classification workflows.
function grey_arr(img)
return vec(Float64.(Gray.(img)))
end
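As a quick sanity check on the shapes involved, here is a sketch using a dummy 224×224 matrix in place of a real cover (a greyscale image is just a matrix of values in [0, 1]):

```julia
# A dummy greyscale "image": a 224×224 matrix of Float64 values in [0, 1].
img = rand(224, 224)

# vec flattens the matrix column-major into a 224*224 = 50176-element
# vector, matching the input size of the first Dense layer below.
v = vec(img)
@assert length(v) == 224 * 224
@assert v[1] == img[1, 1]   # column-major: first element is the top-left pixel
```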
function batcher(size)
    images = []
    labels = []
    for x in data_train_csv[1:size]
        push!(images, grey_arr(load("./data/$(x[2])")))
        push!(labels, Flux.onehot(x[6] + 1, 1:30)) # plus 1 to account for 1-based indexing
    end
    return (Flux.batch(images), Flux.batch(labels))
end
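Flux.batch stacks the per-sample vectors into a matrix with one column per sample; a minimal illustration of the same effect with plain hcat (toy data, not the actual covers):

```julia
# Three toy "images", each flattened to a 4-element vector.
imgs = [rand(4) for _ in 1:3]

# Stacking them as columns gives a 4×3 matrix, the layout Flux.batch
# produces and the layout Dense layers expect for batched input.
batch = reduce(hcat, imgs)
@assert size(batch) == (4, 3)
@assert batch[:, 2] == imgs[2]
```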
Making batches of 2000 and 1000 book images using our newly created function.
trainbatch = batcher(2000);
trainbatch_2 = batcher(1000)
const alpha = 0.000075;
const epoch = 20;
Using a NN with 3 dense layers, since my fellow peers at GCI said they were unable to get a convolutional NN to work.
relu as the activation function because it's my go-to for image classification tasks, and also because its non-saturating gradient greatly accelerates the convergence of stochastic gradient descent compared to the sigmoid/tanh functions.
softmax to return a 30-element array with the probabilities of the predicted labels.
model = Chain(
    Dense(224 * 224, 512, relu),
    Dense(512, 64),
    Dense(64, 30),
    softmax,
)
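Each Dense layer above is just an affine map with an optional activation, out = σ.(W*x .+ b). A hand-rolled sketch of one relu layer (for illustration only, not Flux internals):

```julia
# One dense layer by hand: out = relu.(W*x .+ b)
my_relu(x) = max(zero(x), x)
W = randn(5, 8)            # weights: 8 inputs -> 5 outputs
b = zeros(5)               # bias
x = randn(8)               # a single input sample
out = my_relu.(W * x .+ b)
@assert length(out) == 5
@assert all(out .>= 0)     # relu never outputs negatives
```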
using Flux: onehotbatch, crossentropy, throttle
using Statistics
optim = ADAM(alpha);
loss(x,y) = Flux.crossentropy(model(x), y);
acc(a, b) = mean(Flux.onecold(model(a)) .== Flux.onecold(b));
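Flux.onecold is essentially argmax over the label range, so the accuracy is simply the fraction of samples where the predicted index matches the true one. A Flux-free illustration:

```julia
using Statistics

# onecold picks the index of the largest entry, i.e. argmax.
probs = [0.1, 0.05, 0.7, 0.15]
@assert argmax(probs) == 3

# Accuracy: element-wise comparison of predicted vs. true indices.
preds = [3, 1, 2]
truth = [3, 2, 2]
@assert mean(preds .== truth) ≈ 2 / 3
```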
function mod_cb()
c_acc = acc(trainbatch_2...)
c_loss = loss(trainbatch_2...)
println("Current Accuracy: ", c_acc, " | Current Loss: ", c_loss)
end
Flux.train!(loss, params(model), Iterators.repeated(trainbatch_2, 10), optim, cb = Flux.throttle(mod_cb, 10))
Flux.train!(loss, params(model), Iterators.repeated(trainbatch, 10), optim, cb = Flux.throttle(mod_cb, 10))
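Iterators.repeated(trainbatch, 10) simply yields the same (input, label) tuple ten times, so train! sees ten passes over that one batch:

```julia
# repeated(x, n) yields x exactly n times without copying it.
data = Iterators.repeated((1, 2), 10)
@assert length(collect(data)) == 10
@assert first(data) == (1, 2)
```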
We can see that the accuracy nearly doubled. Let's train it further, and also increase the iterations.
trainbatch_3 = batcher(3000)
Flux.train!(loss, params(model), Iterators.repeated(trainbatch_3, 50), optim, cb = Flux.throttle(mod_cb, 10))
We get a train accuracy of 25.5%, which is swell: with 30 categories, random guessing would only score about 1/30 ≈ 3.3%, so the model is well above chance.
loss(trainbatch_3...)
acc(trainbatch_3...)
Loading and predicting the label for a new image. First we read the test listing, which we haven't loaded yet.
data_test_csv = CSV.File("book30-listing-test.csv");
load("./data/$(data_test_csv[7][2])")
data_test_csv[7]
output_arr = model(grey_arr(load("./data/$(data_test_csv[7][2])")))
maxval = maximum(output_arr)
findall(x -> x==maxval, output_arr)
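The findall/maximum pattern above can be written as a one-liner with argmax, and since we added 1 to CategoryNum when one-hot encoding, subtracting 1 from the predicted index recovers the original category number (a sketch on a toy probability vector):

```julia
# A toy 30-element probability vector standing in for the model output.
out = rand(30)
out ./= sum(out)

# argmax is the idiomatic replacement for findall(x -> x == maximum(out), out).
pred_idx = argmax(out)
@assert out[pred_idx] == maximum(out)

# Undo the +1 shift applied during one-hot encoding to get CategoryNum.
pred_category_num = pred_idx - 1
@assert 0 <= pred_category_num <= 29
```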