The Akamai Blog

WAF: trade-off between false positives and false negatives

In the previous article, we introduced what is arguably the most important metric for measuring WAF quality (in subsequent entries we will talk about WAF performance). But we left one question up in the air: how can our WAF rules achieve a virtually null False Positive rate while keeping a very low percentage of False Negatives?
Akamai based its first commercial WAF, back in 2011, on the ModSecurity Core Rule Set, an open source project led by OWASP. Since then, we have worked continuously to improve the accuracy of our WAF technology. In mid-2013, Akamai introduced the first version of the Kona Rule Set (KRS), a proprietary technology that improved the accuracy and effectiveness of our WAF. This new strategy is based on three factors:

1. Combination of broader and more flexible rules.

Traditional WAF technology models each attack with a complex, sophisticated rule that represents the specific pattern of that attack. This approach harms accuracy because attacks of one kind, although similar, may not be exactly the same, and in some cases new attacks are variations of previous ones. In these cases, because the specific rule doesn't exactly match the attack, the WAF may not detect it. Akamai uses a different strategy: KRS defines a probabilistic combination of a smaller number of more flexible rules, each modeling a factor that may be present in an attack. The outcome is a more accurate system, since the Akamai WAF can detect unknown attacks and attacks that have not been individually modeled. This is possible because such attacks share some of the same suspicious behaviors, and those behaviors are correlated. This strategy is also more effective at detecting zero-day vulnerabilities.

Let's use the following analogy to better explain this concept. In a hypothetical automatic system for detecting a bank robbery, our intelligent system evaluates each of the following factors individually:

• An individual enters the bank with a weapon
• The individual shouts at bank personnel
• The individual receives a large amount of money from bank personnel
• The individual runs out of the bank
• The individual jumps into a car that was waiting right outside the bank with the engine running

Please note that each of the above factors, taken by itself, doesn't necessarily mean that a robbery is being committed. For example, one of the security professionals working in the bank may carry a weapon, but this shouldn't lead us to infer that he or she is robbing the bank. Likewise, a customer may be yelling at a bank employee because a loan was denied, or a customer may run out of the bank and jump into a car simply because he or she is in a hurry; again, this alone doesn't imply a crime. However, an intelligent system will evaluate every factor and weigh the occurrence of each one based on past experience of robberies. Finally, different combinations of these factors will yield an accurate estimate of the probability that an attack is actually taking place.

This combined and flexible strategy is more accurate than the traditional one of creating either individual, isolated rules or specific, complex rules that include all the factors for each threatening situation. Continuing with the previous analogy, in the traditional rigid model we could create an individual rule for each suspicious behavior, but then a single hint would be enough to trigger an alert, potentially producing a false positive. Alternatively, the traditional model could create a single rule that covers every one of the factors mentioned above. However, if any factor were missing, the rule wouldn't trigger and the threat would go unnoticed, leading to false negatives. The failure of these complex and specific rules would become evident if an attacker, trying to evade the detection system, decided not to use a weapon during the robbery, as in this memorable scene from Pulp Fiction.
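
The contrast between the three strategies in the bank analogy can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how KRS is implemented: the factor names, the integer weights and the alert threshold are all hypothetical values chosen for the example.

```python
# Hypothetical factor weights (points). In a real system these would be
# derived from past incidents; the values here are illustrative assumptions.
WEIGHTS = {
    "weapon": 35,
    "shouting": 15,
    "large_cash_handover": 25,
    "runs_away": 10,
    "getaway_car": 15,
}
THRESHOLD = 50  # assumed score at which the combined strategy raises an alert


def isolated_rules(observed):
    """Traditional model A: one rule per factor, alert on any single hit."""
    return any(factor in WEIGHTS for factor in observed)


def single_complex_rule(observed):
    """Traditional model B: one rule that requires every factor to be present."""
    return set(WEIGHTS) <= set(observed)


def weighted_score(observed):
    """Sum the weights of the known factors that were observed."""
    return sum(WEIGHTS[factor] for factor in set(observed) if factor in WEIGHTS)


def combined_rules(observed):
    """Flexible model: alert when the weighted combination crosses the threshold."""
    return weighted_score(observed) >= THRESHOLD


# Two scenarios from the analogy.
guard = {"weapon"}  # armed security professional, no robbery
unarmed_robbery = {"shouting", "large_cash_handover", "runs_away", "getaway_car"}

for name, scenario in [("guard", guard), ("unarmed robbery", unarmed_robbery)]:
    print(name, isolated_rules(scenario), single_complex_rule(scenario),
          combined_rules(scenario))
```

With these assumed weights, the isolated rules flag the armed guard (a false positive), the single complex rule misses the unarmed robbery (a false negative), and the weighted combination gets both cases right: the guard scores 35 points and stays below the threshold, while the unarmed robbery scores 65 and triggers an alert.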

2. Accuracy: feedback with real data.
Akamai WAF technology constantly receives feedback from an enormous number of samples of real customer traffic observed on the Akamai Intelligent Platform. Delivering between 15% and 30% of worldwide web traffic provides a unique visibility that no other WAF solution can even come close to. This is another reason our rules strategy is more accurate than any hardware-based WAF or even a smaller-scale cloud-based WAF solution. The analysis of every request that passes through the WAF is based on data from the broadest set of real requests, both legitimate and illegitimate, covering customers of every size, industry and geography. This methodology includes identifying the sources of false negatives and false positives, so that both rates can be improved over time and our WAF accuracy sharpened.
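
As a rough illustration of how such feedback can be turned into numbers, the sketch below computes the False Positive and False Negative rates from a set of labeled requests. It assumes the common definitions (false positives over all legitimate requests, false negatives over all attacks); the function name and the sample data are hypothetical and do not come from Akamai.

```python
def error_rates(samples):
    """Return (false_positive_rate, false_negative_rate).

    `samples` is an iterable of (is_attack, was_flagged) boolean pairs:
    the true label of each request and the decision the WAF made on it.
    """
    tp = fp = tn = fn = 0
    for is_attack, was_flagged in samples:
        if is_attack and was_flagged:
            tp += 1          # attack correctly blocked
        elif is_attack:
            fn += 1          # attack missed: false negative
        elif was_flagged:
            fp += 1          # legitimate request blocked: false positive
        else:
            tn += 1          # legitimate request allowed
    legit = fp + tn
    attacks = fn + tp
    fp_rate = fp / legit if legit else 0.0
    fn_rate = fn / attacks if attacks else 0.0
    return fp_rate, fn_rate


# Made-up sample: 8 legitimate requests (1 wrongly flagged)
# and 4 attacks (1 missed).
samples = (
    [(False, False)] * 7 + [(False, True)]
    + [(True, True)] * 3 + [(True, False)]
)
fp_rate, fn_rate = error_rates(samples)
print(f"False Positive rate: {fp_rate:.1%}, False Negative rate: {fn_rate:.1%}")
```

Tracking these two rates after each rule change is one simple way to check whether an update moved accuracy in the right direction.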

3. Rule updates
Thanks to the methodology described in the two points above, Akamai is constantly updating its rules to provide better accuracy or new, specific defenses against recently detected attack patterns. Customers can download these rule updates through Luna Control Center, which uses a versioning system that eases update management.

Does all of this matter? Of course it does! The outcome of this innovative rule design, together with the continuous evolution and improvement of WAF accuracy, is displayed in the chart below, which shows the False Positive and False Negative rates for the Akamai WAF solution in October of three consecutive years.

[Chart: WAF trade-off between false positives and false negatives]

Based on the three aspects described above, our strategy proves to be more effective than that of traditional WAFs. It is also supported by a cloud architecture that provides greater accuracy than hardware devices. The combination of more flexible rules is a powerful mechanism as long as the correlation among rules can be monitored at large scale. Finally, testing and feedback are more effective thanks to the breadth of the sample, and updates are more efficient when they are based on and governed by cloud-distributed collective intelligence.

The next blog post will continue to identify characteristics that help us evaluate the quality of a WAF solution. In the meantime, as in my previous entry, I recommend reading this white paper for further details on why the Akamai approach to WAF is more accurate, efficient, and reliable than the approach of our competitors.