Why? Well, in a nutshell, machine learning is the use of computers to find and refine patterns in data at a scale beyond human ability. It’s kind of like if our brains were supercharged and able to store thousands of years of data, keep it all in memory, isolate the most important information, and then use all that information to predict results in real time.
Let’s look at an example inspired by an earlier piece posted on Quora:
Say you’re looking to buy the sweetest oranges. You ask your mother, who always has the ones you like, and she tells you that the biggest oranges from the local market are the sweetest.
So, following mom’s directions, you go to the local market and you buy ten of the biggest oranges you can find. Sadly, only about half of your oranges are sweet. Further investigation reveals that the oranges that happen to be the sweetest also have bright spots.
You come to the conclusion that the sweetest oranges are the ones with the bright spots… and that your mother is a liar.
Armed with this knowledge, the next time you go to the store you only get oranges with bright spots. Once again, only half of your oranges are sweet, but this time, the differentiating quality is that the biggest are the sweetest.
You realize that the oranges that are both big and have bright spots are sweetest.
Now you can go to the store and always pick the sweetest oranges… after a quick apology to your mother.
Next time you go to the local market, however, your roommate asks you to pick up the juiciest oranges. But you don’t have any information on how to buy the juiciest oranges, so it’s back to the drawing board. It would take multiple trips back and forth from the market, trying different combinations of oranges, and maybe even stores, before you would ever find the right mix of attributes that equaled a “juicy orange”.
That means a lot of wasted time and money… unless, of course, you have a machine with a good algorithm to predict results based on a sampling of millions of different oranges across thousands of different stores. It would be able to use this information to find the desirable oranges (whether sweet, juicy, or sour) by generalizing from the sample and isolating the combinations of attributes (in the case of oranges, these might be big, small, smooth, lumpy, spotty, soft, hard, pale, etc.) that are of statistical importance among the nearly endless possible variations. By paying attention only to the attributes that show statistical significance towards a specific outcome, the algorithm would be able to pinpoint what matters and calculate results in a highly efficient manner, providing answers before you even leave for the store… and without having to eat a ton of “bad” oranges!
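To make the idea of isolating the attributes that matter a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the ten oranges are invented, the attribute names are hypothetical, and the scoring is a plain difference in sweet rates rather than a real significance test or the algorithm any particular system uses.

```python
# Hypothetical sample: each orange is (set of attributes, was it sweet?).
# The data is invented for illustration; it is not from any real dataset.
ORANGES = [
    ({"big", "bright_spots"}, True),
    ({"big", "bright_spots", "smooth"}, True),
    ({"big", "bright_spots", "pale"}, True),
    ({"big", "smooth"}, False),
    ({"big", "pale"}, False),
    ({"bright_spots", "smooth"}, False),
    ({"bright_spots", "pale"}, False),
    ({"smooth"}, False),
    ({"pale"}, False),
    ({"smooth", "pale"}, False),
]


def sweet_rate(oranges):
    """Fraction of the given oranges that were sweet (0.0 for an empty list)."""
    if not oranges:
        return 0.0
    return sum(1 for _, sweet in oranges if sweet) / len(oranges)


def attribute_lift(oranges):
    """For each attribute: sweet rate with it minus sweet rate without it."""
    attributes = set().union(*(attrs for attrs, _ in oranges))
    lift = {}
    for attr in attributes:
        with_attr = [o for o in oranges if attr in o[0]]
        without_attr = [o for o in oranges if attr not in o[0]]
        lift[attr] = sweet_rate(with_attr) - sweet_rate(without_attr)
    return lift


def rule_accuracy(oranges, required):
    """How often 'has every required attribute' agrees with 'is sweet'."""
    hits = sum(1 for attrs, sweet in oranges if (required <= attrs) == sweet)
    return hits / len(oranges)


if __name__ == "__main__":
    # Rank attributes from most to least associated with sweetness.
    ranked = sorted(attribute_lift(ORANGES).items(), key=lambda kv: -kv[1])
    for attr, score in ranked:
        print(f"{attr}: {score:+.2f}")
    print("big only:", rule_accuracy(ORANGES, {"big"}))
    print("big + bright spots:", rule_accuracy(ORANGES, {"big", "bright_spots"}))
```

On this made-up sample, "big" and "bright_spots" stand out while "smooth" and "pale" do not, and the combined rule matches every orange where either attribute alone does not. That is the same conclusion the shopper reached after two trips, except the machine gets there in one pass and can repeat the exercise for "juicy" as soon as it has labels for that outcome.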
But this isn’t just about oranges. Let’s look at how this process may affect an advertiser:
The graph below shows the daily spend of a fashion brand that ran on our network. Follow the paths of daily spend. Over the first few days of the campaign we were spending more with MSIE and less with Safari. The CPA on MSIE was high and was burning through most of the budget too quickly, so the system cut back on buying it. In this case, the machine was able to anticipate the outcome by recognizing trends in the data. Eventually, we were spending essentially nothing against MSIE.
The best part? The client asked us to stop spending against MSIE on August 22, but looking at the graphs you can see that we were already automatically decreasing spending against it.
The machines were a step ahead, and this gave the client great confidence that our machine learning system works.
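For readers who want a feel for how spend can drift away from a high-CPA segment like MSIE, here is a rough Python sketch. It is not our production system: the function name `reallocate_budget`, the rule of splitting budget in proportion to conversions per dollar, and all of the numbers are assumptions chosen only to illustrate the idea.

```python
def reallocate_budget(daily_budget, spend, conversions):
    """Split the next day's budget in proportion to conversions per dollar.

    spend and conversions map a segment name (for example a browser) to
    yesterday's spend and conversion count. Conversions per dollar is the
    inverse of CPA, so high-CPA segments receive a smaller share.
    """
    if not spend:
        return {}
    efficiency = {}
    for segment, spent in spend.items():
        # A segment with no spend has no evidence yet; this toy rule scores
        # it 0. A real system would keep some exploration budget instead.
        efficiency[segment] = conversions.get(segment, 0) / spent if spent > 0 else 0.0
    total = sum(efficiency.values())
    if total == 0:
        # No conversions anywhere yet: fall back to an even split.
        return {segment: daily_budget / len(spend) for segment in spend}
    return {segment: daily_budget * eff / total for segment, eff in efficiency.items()}


if __name__ == "__main__":
    # Hypothetical first-day figures: most spend went to MSIE, few conversions.
    spend = {"MSIE": 800.0, "Safari": 200.0}
    conversions = {"MSIE": 4, "Safari": 16}
    for segment, amount in reallocate_budget(1000.0, spend, conversions).items():
        print(f"{segment}: {amount:.2f}")
```

Run day after day, a rule like this keeps shrinking the share of any segment whose CPA stays high, which is the shape of the MSIE curve described above. Real bidding systems are considerably more involved, but the feedback loop is the same.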