The Akamai Blog

The Myth of the self tuning / machine learning Web Application Firewall

There's an old adage that if something seems too good to be true, it probably is. If you're like me, you can apply this to your own experiences. For example, about five years ago a small chain of gyms that exclusively used vibrating exercise machines popped up near my home. Their gym goers would stand on a vibration platform for 15 minutes while reading or watching TV. The gym promised weight loss, fat burn, improved flexibility, and enhanced blood flow. The thought of getting a complete workout in 15 minutes without breaking a sweat is pretty appealing. I'm in! Unfortunately, research (or the lack thereof) brings us back to reality, and the adage about something being too good to be true applies once again: the people who stood on a vibrating platform for exercise experienced, at best, a minor caloric burn.

Similar logic applies to many security technologies, including Web Application Firewalls. For review, a Web Application Firewall (or WAF, as we'll refer to it) acts as a reverse proxy between your users and your front-end web servers. A WAF's job is to intercept malicious traffic while allowing non-malicious traffic through to your web infrastructure. Here's the big question: can you implement a WAF to stop malicious cyber attacks without putting in the sweat equity of an intense workout? As you might expect, the answer isn't black or white. Let's explore below.

When using a WAF (and many other types of security devices), there are two types of outcomes we want to avoid. A False Positive occurs when a WAF mistakes a legitimate request for a malicious one and stops it. We don't want False Positives because they stop legitimate users from accessing your web property. A False Negative occurs when a WAF examines malicious traffic, mistakenly classifies it as clean, and allows it through to the front-end web servers. We know it's bad for malicious intrusion attempts to get past any type of security device, so False Negatives are also something we want to avoid.

So how do we control our False Positive and False Negative rates? The two are related. Simply put, as a WAF becomes more aggressive at filtering traffic there is an increased chance of False Positives. The inverse is true as well. The more conservative a WAF is when filtering traffic, the more likely malicious requests are to slip through.
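To make the trade-off concrete, here is a minimal sketch in Python. It assumes a WAF that assigns each request an anomaly score and blocks anything at or above a threshold. The scores, labels, and function name are made up for illustration and do not describe how any particular product scores traffic.

```python
# Hypothetical sample traffic: (anomaly_score, is_malicious).
# The numbers are invented for illustration only.
REQUESTS = [
    (1, False), (2, False), (3, False), (4, False), (6, False),
    (3, True), (5, True), (7, True), (8, True), (9, True),
]


def error_rates(requests, threshold):
    """Block every request scoring >= threshold.

    Returns (false_positive_rate, false_negative_rate).
    """
    legit = [score for score, malicious in requests if not malicious]
    attacks = [score for score, malicious in requests if malicious]
    # Legitimate requests that were blocked.
    false_positives = sum(1 for score in legit if score >= threshold)
    # Malicious requests that were allowed through.
    false_negatives = sum(1 for score in attacks if score < threshold)
    return false_positives / len(legit), false_negatives / len(attacks)


# A lower threshold means a more aggressive WAF.
for threshold in (3, 5, 7):
    fp_rate, fn_rate = error_rates(REQUESTS, threshold)
    print(f"threshold={threshold}: FP rate={fp_rate:.0%}, FN rate={fn_rate:.0%}")
```

With this sample data, the aggressive threshold of 3 blocks every attack but also 60% of legitimate requests, while the conservative threshold of 7 blocks no legitimate requests but lets 40% of the attacks through.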

The reality is that False Positives and False Negatives are both bad, for different reasons. Ideally, we'd like to have neither. Which one is worse? From a WAF vendor's point of view, if a WAF has even a marginal False Positive rate, website operators will hear from legitimate users who are unable to access their website. I've never met a website operator who is willing to accept that their website would be inaccessible to even a fraction of a percent of legitimate users. To use an analogy, it would be like your local store turning away a small number of shoppers because it mistakenly thinks they are trying to shoplift. If any detectable level of False Positives occurs, most website operators won't operate the WAF in deny mode, which allows malicious threats to come through and renders the WAF ineffective. We've just established that, in practice, a WAF can't have False Positives.

Behind every WAF is a methodology for how it filters and a set of rules that specify the bad stuff the WAF will try to detect. In a perfect world, the WAF's rules would be incredibly awesome and produce no (or extremely low) false positives and false negatives. Yes, in an ideal world that would be the case, and in an ideal world we'd all be doing passive vibration exercises in place of going to the gym. So why can't we have the perfect WAF with a balanced set of rules that produces no undesired effects? There's a major variable that we haven't touched on: every application is unique. Even applications based on a common framework (à la WordPress, SAP Hybris, etc.) can vary, for example in how cookies are set or in irregular output caused by different configurations or plugins. The truth is that a WAF that is turned on without any tuning can't possibly stop most or all malicious traffic AND avoid blocking legitimate users, because it doesn't understand how the application it is protecting behaves. Since a WAF would be quickly removed if it produced false positives, "set and forget" WAFs typically have a low false positive rate at the expense of letting more (but certainly not all) malicious traffic through.
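A toy example shows why application knowledge matters. The sketch below assumes a single generic rule and a hypothetical forum application where users legitimately post SQL snippets. The pattern, paths, parameter names, and the `is_blocked` helper are all invented for illustration and are not taken from any vendor's rule set.

```python
import re

# Hypothetical generic rule: flag input that looks like SQL injection.
SQLI_PATTERN = re.compile(r"\b(union\s+select|drop\s+table)\b", re.IGNORECASE)


def is_blocked(path, params, exclusions=frozenset()):
    """Return True if any inspected parameter matches the generic rule.

    `exclusions` is a set of (path, parameter_name) pairs the rule skips,
    standing in for per-application tuning.
    """
    for name, value in params.items():
        if (path, name) in exclusions:
            continue  # tuned out for this application
        if SQLI_PATTERN.search(value):
            return True
    return False


# A legitimate post on a (hypothetical) database help forum.
forum_post = ("/forum/post", {"body": "How do I use UNION SELECT in MySQL?"})
# An injection attempt against a (hypothetical) login form.
attack = ("/login", {"user": "x' UNION SELECT password FROM users--"})

# Untuned, the generic rule blocks the legitimate post: a false positive.
print(is_blocked(*forum_post))                      # True

# Tuned for this application, the post gets through and the attack does not.
tuning = frozenset({("/forum/post", "body")})
print(is_blocked(*forum_post, exclusions=tuning))   # False
print(is_blocked(*attack, exclusions=tuning))       # True
```

Note that the exclusion also means the forum's `body` field is no longer inspected by this rule, which is why tuning has to be revisited as the application changes.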

How do we solve this problem at Akamai? There's no silver bullet. Our Kona Site Defender is a highly sophisticated cloud WAF. It avoids false positives and false negatives by operating a modular and granular rule base that is tuned to match the nuances of each application being protected. This requires more effort when implementing the WAF, and it carries the responsibility of ensuring the tuned WAF rules remain effective as both your application and the threat landscape change. Let's face it, not every organization has the resources to devote to this. For these customers, we offer the Web Application Protector, or WAP for short. WAP is a WAF that provides protection from the most common types of attacks. However, because a WAF can't produce false positives, WAP won't be as granular as Kona Site Defender at detecting malicious traffic, and it won't be as effective at stopping the most targeted attacks from getting through.

The same is true outside of Akamai too. If a vendor is selling a WAF with buzzwords like "self tuning" or "machine learning" that supposedly produces incredibly low false positive and false negative rates, you might want to look a little closer and, just for fun, make sure they don't sell vibrating exercise platforms!


Thanks for your view Les.
It will be interesting if you develop further the topic started in this blog post.
I'd be interested to read more about WAP-WAF-RASP combination.

Good blog, however, a little slanted to your current offerings.

Whilst I agree with some of your comments, I think you are being unfair on other vendors, as the ones who offer AI and machine learning often also use the same core mod2 rule set as Akamai. So a more accurate description would be that they are not just a vibrating exercise platform but an entire gym equipped with the latest technology that may include a vibrating platform; however, you use what works best for individuals.

ps. your captcha sucks