The Akamai Blog

Machine Learning - The new bicycle of the mind

I was on a flight to Brazil last night to kick off a week of meetings with partners and customers in Latin America. During the eight-and-a-half-hour flight from Atlanta, I got an opportunity to watch a few movies I've been meaning to catch up on, and at the top of the list was Steve Jobs. There's a scene near the end of the movie where Steve is trying to recruit John Sculley, the CEO of Pepsi, to join Apple as its new CEO. Steve Jobs' winning pitch was that his vision for the Macintosh would be the equivalent of a bicycle for our minds.

"I think one of the things that really separates us from the high primates is that we're tool builders. I read a study that measured the efficiency of locomotion for various species on the planet. The condor used the least energy to move a kilometer. And, humans came in with a rather unimpressive showing, about a third of the way down the list. It was not too proud a showing for the crown of creation. So, that didn't look so good. But, then somebody at Scientific American had the insight to test the efficiency of locomotion for a man on a bicycle. And, a man on a bicycle, a human on a bicycle, blew the condor away, completely off the top of the charts.

And that's what a computer is to me. What a computer is to me is it's the most remarkable tool that we've ever come up with, and it's the equivalent of a bicycle for our minds."

If I could rephrase Steve's quote for application security, it would be that Web Application Firewalls (WAFs) are traditionally hard to configure - they take a lot of energy to work with and aren't operationally efficient to maintain. But I think machine learning is the new equivalent of a bicycle for our minds. And in web security, machine learning is the next frontier in simplifying the process of maintaining a security perimeter for web applications.

In my last blog post on Machine Learning in Security, I introduced some of the basic concepts of what machine learning is and referenced limitations in today's implementations of artificial intelligence (AI). As I mentioned last week, computers aren't truly intelligent today, but they're great at pattern recognition. In this blog post, I want to demonstrate how pattern recognition works, and how good pattern recognition can make WAF policy tuning both easier and more effective.

Pattern recognition algorithms are designed to optimally extract patterns from data and then separate and categorize data based on similarities in pattern. The process of categorizing data based on patterns is called classification.

Figure 1: Pattern recognition & classification of shapes
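
To make the idea of classification concrete, here is a minimal sketch of a nearest-centroid classifier sorting shapes, in the spirit of Figure 1. It is a toy illustration, not the algorithm Akamai uses; the feature choices (number of corners, roundness) and the function names `fit_centroids` and `classify` are assumptions made up for this example.

```python
from math import dist
from statistics import mean


def fit_centroids(samples):
    """Average the feature vectors of each labelled group into one centroid."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(mean(axis) for axis in zip(*vectors))
        for label, vectors in grouped.items()
    }


def classify(centroids, features):
    """Assign the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical features: (number of corners, roundness from 0.0 to 1.0).
training = [
    ((3, 0.10), "triangle"),
    ((3, 0.20), "triangle"),
    ((4, 0.30), "square"),
    ((4, 0.40), "square"),
    ((0, 1.00), "circle"),
    ((0, 0.90), "circle"),
]

centroids = fit_centroids(training)
predicted = classify(centroids, (4, 0.35))  # an unseen four-cornered shape
print(predicted)
```

The classifier never sees a rule that says "four corners means square"; it only measures which known group a new sample most resembles. That is the sense in which pattern recognition separates data by similarity.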

Web requests vary widely and are difficult to classify automatically as either "good" or "bad". We tackle this difficulty by estimating, for each WAF rule that triggers, how likely the trigger was a true positive or a false positive. From the probability that a rule fired accurately against a client, we can estimate the probability that the request or client was malicious, and help our customers tune their WAF policies more efficiently.

For reference, a false positive is any request that a WAF labeled as malicious but that was actually legitimate. A true positive is any request that a WAF labeled as malicious and that really was malicious.

Figure 2: An illustrative definition of "false positives" and "false negatives"
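
The definitions above can be sketched in a few lines of code. This is an illustrative example only: the function name `label_outcome` and the sample data are hypothetical, and in practice the "actually malicious" ground truth is exactly the thing that is hard to know.

```python
from collections import Counter


def label_outcome(flagged_by_waf, actually_malicious):
    """Name the outcome of one WAF decision, given the (assumed known) truth."""
    if flagged_by_waf and actually_malicious:
        return "true positive"
    if flagged_by_waf:
        return "false positive"
    if actually_malicious:
        return "false negative"
    return "true negative"


# Hypothetical (flagged_by_waf, actually_malicious) pairs for five requests.
requests = [
    (True, True),
    (True, False),
    (True, False),
    (False, True),
    (False, False),
]

counts = Counter(label_outcome(flagged, malicious) for flagged, malicious in requests)

# Share of the WAF's alerts that were raised against legitimate traffic.
flagged_total = counts["true positive"] + counts["false positive"]
false_positive_share = counts["false positive"] / flagged_total
print(counts, round(false_positive_share, 2))
```

A rule whose alerts are mostly false positives is a "noisy" rule, and it is the kind of rule a tuning recommendation would target.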

Before 2016, Akamai ran machine learning algorithms "behind the scenes" to build better products for our customers. Lately, our customers have asked for greater visibility into how our machine learning algorithms work. We answered by opening our WAF policy recommendation engine for our customers to use. The WAF policy recommendation engine is just a small piece of the Akamai Security Center, which launched earlier this year and is available to all Kona Site Defender customers. The new security dashboard is a one-stop shop for intelligence into the who, what, when, and how of attacks across a customer's web properties.

Figure 3: A screenshot of the Kona Site Defender Security Center

By aggregating all security events for a customer in one place, we're able to run classification algorithms that show customers which rules are noisy and which are accurate. As I mentioned above, web requests vary widely and are difficult to classify automatically, and pattern recognition and classification algorithms are quite complicated to build. In Figure 4 below, you'll see that Akamai takes care of the heavy lifting and provides the data points that back up the probability percentage. A lot happens in the back end, but we want to show our customers enough information to justify the probability.

Figure 4: WAF Tuning Recommendation Report

Figure 4 shows a sample recommendation report on how to reduce false positives for a particular rule (or, in this instance, rules). At the top of the screenshot, the first outlined box recommends an exception to two specific command injection rules to eliminate the false positives. The second box details that the probability of a false positive is high (but not 100%), and that over a seven-day period, 7139 requests were made across 440 IPs (which can be visualized in the Active Days diagram). One signal the pattern recognition algorithm uses is that if a diverse set of IPs triggers a rule frequently, every day, there's a high probability that the rule is producing false positives. There's clearly more that goes into the pattern recognition algorithm than just the number of times a rule has triggered, and our customers can open each tab below the "Active Days" graph to see the top ten sources of noise for this particular set of rules.
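
The "many different IPs, every day" signal can be sketched as a simple score. To be clear, this is not Akamai's algorithm, which uses more inputs than this. The function `false_positive_signal`, the `ip_threshold` of 100, and the multiplication of the two factors are all assumptions chosen to illustrate the intuition.

```python
from collections import defaultdict


def false_positive_signal(events, window_days=7, ip_threshold=100):
    """Score from 0.0 to 1.0 for one rule; higher suggests a noisier rule.

    events: iterable of (day_index, client_ip) pairs for the rule's triggers.
    window_days and ip_threshold are hypothetical tuning knobs.
    """
    ips_by_day = defaultdict(set)
    for day, ip in events:
        ips_by_day[day].add(ip)

    distinct_ips = set().union(*ips_by_day.values())
    active_share = len(ips_by_day) / window_days          # how many days saw triggers
    diversity = min(1.0, len(distinct_ips) / ip_threshold)  # how spread out the sources are
    return active_share * diversity


# A rule hit by 440 different clients on each of seven days (synthetic data).
broad_events = [
    (day, f"10.0.{i // 256}.{i % 256}") for day in range(7) for i in range(440)
]
# A rule hit 500 times by a single client on a single day (synthetic data).
narrow_events = [(2, "203.0.113.7")] * 500

noisy = false_positive_signal(broad_events)
targeted = false_positive_signal(narrow_events)
print(noisy, targeted)
```

The first pattern looks like ordinary users tripping over a rule day after day, so it scores high. The second looks like one source hammering the site, which is more consistent with a real attack, so it scores low.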

Again, the goal of Security Center is to improve the efficiency of tuning our WAF by leveraging machine learning. In other words, machine learning is the bicycle of our WAF. Through visibility into wide swaths of web data, a Cloud Security Intelligence data analysis engine, and a team of data scientists who can run heuristics on that data on an hourly basis, we are able to recognize patterns and develop baseline rulesets on a regular basis. These help our customers start with a negative security model that is as far ahead of our competitors' models as the bicycle-powered human is ahead of the condor.

Now of course this is all B2B enterprise stuff, and unless Michael Fassbender wants to play our CEO Tom Leighton in an Akamai movie, we probably won't be seeing a feature-length film about this WAF of ours. So if application security is your thing, you might want to sign up for at least a live demo of our capabilities.

If you're interested in learning more about the recommendation tools I've mentioned above, please sign up for a personal demo so that we can dive deeper into the artificial intelligence algorithms that can simplify the maintenance of your application security posture.