Using Machine Learning to Identify My Pets

Chances are, if you’re interested in technology, you’ve heard the buzzword “Machine Learning” come up a lot over the past few years. But what exactly is machine learning? Simply put, machine learning is a type of artificial intelligence that gets better at a task by learning from data, rather than by being explicitly programmed for it. You are probably affected by some form of machine learning every single day without thinking about it. From your Netflix recommendations, to unlocking your iPhone with Face ID, to Google Assistant’s uncanny voice recognition abilities, machine learning has been subtly shaping our modern lives for years. What you might not realize is how accessible this technology has become.

I thought it would be fun to create a simple machine learning model to help me better understand how the process works. I decided to go with an image recognition model that identifies whether a picture is of my dog Rudy, or of my girlfriend’s dog Meadow. Rudy is a yellow Labrador Retriever, and Meadow is a Golden Retriever. These two breeds look pretty similar, to the point that many people would have a hard time telling them apart if they haven’t spent much time around either breed. This makes the two dogs a perfect subject to demonstrate the power of machine learning.

To achieve this, I’m going to be using Create ML, which is part of Apple’s software development suite, Xcode. The fundamentals that I’ll be discussing apply to any tool used to create machine learning models. If you don’t have a Mac, a similar process can be achieved with something like Google’s Teachable Machine, or Microsoft’s Azure Machine Learning service. I encourage you to give one a try!

A machine learning model is typically made by using three different datasets: one for Training, one for Validation, and one for Testing. The first thing we need to do is to start building our training dataset. Thankfully, as a dog owner I’m required to take multiple pictures of my pet per day, so I had years of photos of both dogs to choose from. I chose about 100 photos of each dog, and I organized the photos into two folders, one for Rudy and one for Meadow.
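
To make the three-dataset idea concrete, here is a minimal Python sketch of how a pile of photos could be divided into training, validation, and testing sets. It is only an illustration: the function name `split_dataset`, the 80/10/10 proportions, and the file names are all assumptions made up for this example, and none of it is something Create ML requires you to write yourself.

```python
import random


def split_dataset(items, val_fraction=0.1, test_fraction=0.1, seed=42):
    """Shuffle items and split them into train, validation, and test lists.

    The fractions are illustrative defaults, not values used by any
    particular tool.
    """
    if not 0 <= val_fraction + test_fraction < 1:
        raise ValueError("validation and test fractions must sum to less than 1")

    shuffled = list(items)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical file names standing in for 100 photos of one dog.
photos = [f"rudy_{i:03d}.jpg" for i in range(100)]
train, val, test = split_dataset(photos)
print(len(train), len(val), len(test))
```

The important property is that no photo ends up in more than one set, so the model is always evaluated on images it did not learn from.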

Next, I needed to annotate these images to tell the computer which part of each image represents Rudy or Meadow. Create ML takes this data in as a JSON file, so I needed a tool that would allow me to easily create a compatible file. I found a free service from IBM called Cloud Annotations. I was able to upload all of my training images to this service, and then draw a box around the dog in each image, tagging the dogs in all 200 images. For this part, I was conflicted as to whether I should tag their entire bodies, or just their faces. I ended up tagging the whole dogs, but let me know how it goes if you try something similar with just faces!

Cloud Annotations then gave me a .zip with the images and a Create ML compatible JSON file, which Create ML recognized right away. For validation data, I let Create ML automatically set aside some images from the training pile. I then sourced a few brand new images that weren’t included in the training data and added those to the Testing Data field. I was ready to begin training the model.
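
For the curious, here is a rough Python sketch of what an annotation file for a Create ML object detector can look like. It assumes the commonly documented layout: a JSON list with one entry per image, each holding a label and a box described by its center point plus width and height in pixels. The file names, pixel values, and helper names (`to_center_box`, `make_entry`) are invented for illustration, so check Apple's documentation before relying on the exact field names.

```python
import json


def to_center_box(left, top, right, bottom):
    """Convert corner coordinates (pixels) into a center-based box."""
    width = right - left
    height = bottom - top
    return {
        "x": left + width / 2,   # horizontal center of the box
        "y": top + height / 2,   # vertical center of the box
        "width": width,
        "height": height,
    }


def make_entry(image_name, label, left, top, right, bottom):
    """Build one annotation entry containing a single labeled box."""
    return {
        "image": image_name,
        "annotations": [
            {"label": label, "coordinates": to_center_box(left, top, right, bottom)}
        ],
    }


# Made-up examples: one box drawn around each dog.
entries = [
    make_entry("rudy_001.jpg", "Rudy", 40, 60, 440, 360),
    make_entry("meadow_001.jpg", "Meadow", 100, 80, 500, 420),
]

annotations_json = json.dumps(entries, indent=2)
print(annotations_json)
```

A tool like Cloud Annotations produces this kind of file for you, which is why drawing the boxes by hand is the only tedious part.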

Training the model took around two hours on my i9-9980HK MacBook. I suspect using a cloud-based solution like Google’s would have achieved this faster, but I haven’t experimented with that. Over the course of the process, you can watch the system iterate and slowly get better at picking out which dog is which, with the loss going down accordingly.
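
If "the loss goes down" sounds abstract, this toy Python sketch shows the same idea on a much smaller problem. It is not what Create ML does internally. It is just plain gradient descent fitting a single number `w` so that `w * x` matches some made-up data, with the loss (the average squared error) recorded at every iteration. The data, learning rate, and iteration count are all arbitrary choices for the example.

```python
def train(xs, ys, iterations=50, learning_rate=0.01):
    """Fit y = w * x with gradient descent, recording the loss each iteration."""
    w = 0.0
    losses = []
    for _ in range(iterations):
        predictions = [w * x for x in xs]
        # Loss: mean squared error between predictions and targets.
        loss = sum((p - y) ** 2 for p, y in zip(predictions, ys)) / len(xs)
        losses.append(loss)
        # Gradient of the loss with respect to w.
        gradient = sum(2 * (p - y) * x for p, y, x in zip(predictions, ys, xs)) / len(xs)
        # Nudge w in the direction that reduces the loss.
        w -= learning_rate * gradient
    return w, losses


# Toy data where the "right answer" is w = 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w, losses = train(xs, ys)
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.8f}, w: {w:.4f}")
```

Each pass makes a small correction, so the loss shrinks step by step, which is the same pattern a training graph shows on a far larger scale.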

After it blew through the default 5000 iterations, it was time to give my creation a try… and the results were nothing short of impressive.

Just like that, my machine learning model was identifying Rudy and Meadow in completely different lighting conditions, from different angles, in different poses. Some photos showed the dogs’ faces, some didn’t. Sometimes one of the dogs was wearing or holding something, but that didn’t matter: the model could still figure out which dog was which with a surprising degree of accuracy. It was so accurate that I thought I might have done something wrong, and that it was using the test images to train the model, giving me misleadingly good results. I verified this wasn’t the case by taking fresh photos of each dog, and lo and behold, it was able to pick out the dogs just as accurately.
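
Taking fresh photos is the most convincing check, but the worry about test images sneaking into the training set can also be tested mechanically. The Python sketch below is one possible approach, not the method used above: it hashes every file in a training folder and a testing folder and reports any test file whose contents are byte-for-byte identical to a training file. The function names, folder names, and file contents are placeholders, and the demo uses a temporary directory with fake "images" so it can run anywhere.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_leaked_files(train_dir, test_dir):
    """List (test_file, train_file) pairs that have identical contents."""
    train_hashes = {
        file_digest(p): p for p in Path(train_dir).rglob("*") if p.is_file()
    }
    leaked = []
    for test_file in sorted(Path(test_dir).rglob("*")):
        if test_file.is_file():
            digest = file_digest(test_file)
            if digest in train_hashes:
                leaked.append((test_file, train_hashes[digest]))
    return leaked


# Demo with placeholder files in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    train_dir = Path(tmp) / "train"
    test_dir = Path(tmp) / "test"
    train_dir.mkdir()
    test_dir.mkdir()

    (train_dir / "rudy_001.jpg").write_bytes(b"pretend image bytes A")
    (test_dir / "rudy_copy.jpg").write_bytes(b"pretend image bytes A")    # duplicate
    (test_dir / "meadow_new.jpg").write_bytes(b"pretend image bytes B")   # unique

    leaked = find_leaked_files(train_dir, test_dir)
    leaked_names = [(test.name, train.name) for test, train in leaked]

print(leaked_names)
```

Note that this only catches exact copies. A resized or re-saved version of a training photo would have a different hash and slip through, so fresh photos remain the stronger test.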

It blows me away that this relatively new technology is accessible enough that anyone with a little bit of tech-savvy can train a machine learning model. This same method can be applied to video, audio, motion, and text data. Does this spark any ideas as to how you can use a machine learning application to improve your business? If so, feel free to reach out so we can discuss!