Andrew Hall


DDD EA – Session 2 – Building Skynet

This was live blogged during the talk at DDD EA. The talk is by Anthony Brown.

People think that machine learning is scary or the stuff of science fiction, but machine learning is all around us.

Advertising, online product suggestions and even online game matchmaking are all examples of machine learning.

Even CPUs since ~1995 use machine learning to optimise.

Target in the US used machine learning to predict what a 16-year-old girl would want to buy, and its model predicted that she was pregnant! Her father saw the flyer that she received and went mental, complaining to the shop. However, it turned out she really was pregnant: the retailer knew this before her own father!

When guessing people's names, we have an idea that a person “looks like a <NAME>”. When we are growing up, we learn that something is an apple by its shape, size, colour etc.
This is no different to how machines learn. They build up a picture of a particular subject and gain experience.

Data + assumptions = results.

But often these could be the wrong results.

Bayes introduced a new way of thinking.
Spam filters work by working out which emails are valid and which are spammy. But we need to give the filter some data for it to learn from. This is like us learning as children, building our experience.

We can build classifiers based on certain rules and then work out whether an email is real or spam. This is similar to how Gmail's spam classifier works.
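
To make the idea concrete, here is a minimal sketch of a naive Bayes spam classifier in Python. It is not how Gmail or the speaker's demo actually works; the training messages, the "spam"/"ham" labels and the function names are all made up for illustration, and it assumes every label appears at least once in the training data.

```python
import math
from collections import Counter

def train(messages):
    """Count words per label. messages is a list of (text, label) pairs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest (log) probability for this text."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Start from the prior: how common is this label overall?
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training data, the "experience" we give the filter.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

word_counts, label_counts = train(TRAINING)
print(classify("free prize money", word_counts, label_counts))
print(classify("team meeting monday", word_counts, label_counts))
```

The more labelled emails we feed into train, the better the word counts reflect real spam, which is the "building experience" step described above.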

The nearest neighbour algorithm works by determining which of the known data classes is closest to the target we are looking to classify.
For example: when trying to work out which character a hand drawn letter is, we compare it with each character in our list of known characters and pick the closest one, i.e. the one with the most pixels in the same place as the letter we want to classify.
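
The pixel comparison described here can be sketched in a few lines of Python. This is a toy version under stated assumptions: the 3x3 grids, the three known letters and the function names are invented for the example, and a real system would use far larger images and many samples per character.

```python
def to_pixels(rows):
    """Flatten rows of '#' (ink) and '.' (blank) into a list of booleans."""
    return [ch == "#" for row in rows for ch in row]

def matching_pixels(a, b):
    """Count the positions where both images agree (ink or blank)."""
    return sum(1 for x, y in zip(a, b) if x == y)

def nearest_neighbour(target, known):
    """Return the label of the known image sharing the most pixels with target."""
    return max(known, key=lambda label: matching_pixels(target, known[label]))

# Hypothetical known characters on a tiny 3x3 grid.
KNOWN = {
    "L": to_pixels(["#..", "#..", "###"]),
    "T": to_pixels(["###", ".#.", ".#."]),
    "I": to_pixels([".#.", ".#.", ".#."]),
}

# A slightly wobbly hand drawn "L" with one pixel missing.
drawn = to_pixels(["#..", "#..", "##."])
print(nearest_neighbour(drawn, KNOWN))
```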

But how is this useful in production?

Microsoft Azure has a product called MAML (Microsoft Azure Machine Learning) which generates R code.
We can build a recommendation engine in about 30 minutes. For example, if a user previously watched a Tarantino film, they are probably interested in his other films and similar ones.
This is a graphical, flow chart style tool which makes this simple to do. The tool contains various machine learning algorithms to help you.
The tool also allows you to manipulate and split data for training purposes.
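
Outside the tool, splitting data for training is easy to sketch by hand. The Python below is only an illustration of the idea, not what MAML does internally; the split_data name, the 80/20 ratio, the fixed seed and the sample ratings are all assumptions.

```python
import random

def split_data(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of rows and split it into training and test sets."""
    shuffled = list(rows)
    # A fixed seed keeps the split repeatable between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical (user, film, rating) rows.
ratings = [("user%d" % i, "film%d" % (i % 3), i % 5 + 1) for i in range(10)]
train_rows, test_rows = split_data(ratings)
print(len(train_rows), len(test_rows))
```

We train on the first set and keep the second back, so we can check the model against data it has never seen.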

Given some test data about which films a user does and doesn’t like, we can recommend new films based on titles from IMDB. We can also join this data to other data sets: for example, if a user liked a film, we may be able to infer other things about them (although the joined data might be uncorrelated and give strange results).
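
As a rough idea of what a recommender does under the hood, here is a small Python sketch that recommends films liked by users with overlapping taste. It is a simple overlap count, not the algorithm MAML uses, and the users, their likes and the recommend function are hypothetical.

```python
from collections import Counter

# Hypothetical data: which films each user liked.
LIKES = {
    "alice": {"Pulp Fiction", "Kill Bill", "Reservoir Dogs"},
    "bob": {"Pulp Fiction", "Kill Bill", "Jackie Brown"},
    "carol": {"Toy Story", "Finding Nemo"},
}

def recommend(user, likes, top_n=3):
    """Suggest films liked by users who share at least one like with user."""
    mine = likes[user]
    scores = Counter()
    for other, theirs in likes.items():
        if other == user:
            continue
        overlap = len(mine & theirs)
        if overlap == 0:
            continue  # no shared taste, so ignore this user
        for film in theirs - mine:
            # Films from users with more shared likes score higher.
            scores[film] += overlap
    return [film for film, _ in scores.most_common(top_n)]

print(recommend("alice", LIKES))
```

Here alice and bob share two films, so bob's remaining like is suggested to alice, while carol gets nothing until she shares a like with someone.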

When we look at the results, we will have to tweak the training to get more accurate results.

The MAML tool allows you to create a web service based on this model, which can take inputs (for example from your web application) and return outputs from your model to read back into your web application. This will also scale to your company's needs.

If you have some data in your application right now, you can use machine learning to predict future outcomes – even if you are only adding this to a small application, don’t forget that user experience is key.

For example, if a user always goes to their invoice page to print their invoice the day after it’s generated, consider automatically redirecting them there when they log in on that day.
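
Even a simple frequency check gets you part of the way there. The Python sketch below is a plain counting heuristic rather than a trained model, and everything in it is assumed for illustration: the landing_page function, the page paths, the 80% threshold and the minimum of five past visits.

```python
from collections import Counter

def landing_page(history, days_since_invoice, default="/dashboard",
                 threshold=0.8, min_visits=5):
    """Pick where to send a user on login, based on past behaviour.

    history is a list of (days_since_invoice, first_page_visited) pairs
    recorded from the user's previous logins.
    """
    pages = Counter(page for day, page in history if day == days_since_invoice)
    visits = sum(pages.values())
    if visits < min_visits:
        return default  # not enough experience to guess yet
    page, count = pages.most_common(1)[0]
    # Only redirect when the habit is strong enough.
    return page if count / visits >= threshold else default

# Hypothetical login history for one user.
history = [(1, "/invoices")] * 6 + [(0, "/dashboard")] * 6 + [(1, "/dashboard")]
print(landing_page(history, 1))
print(landing_page(history, 3))
```

Keeping a default and a threshold means users only get redirected when their habit is clear, which matters when the goal is a better user experience.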

Giving users this experience makes your product better than your competition.

Leave a comment or tweet me