I recently started studying for the GMAT and ran into an interesting passage while taking a diagnostic test: a reading comprehension argument by the philosopher John Searle, one of the first philosophers to challenge the idea of artificial intelligence. Searle argued that the human brain is not like a computer processor, and that computers are syntactic (rule-based) rather than semantic (meaning-based) creatures. The question refers to Searle's thought experiment, the Chinese Room. The idea is that if you lock a person who knows no Chinese in a room with a rulebook, written in English, for manipulating Chinese characters, that person will be able to respond in Chinese to questions posed in Chinese. The experiment suggests that no matter how intelligently the computer (the person locked in the room) appears to respond, a program (the rulebook) cannot give the computer "understanding", and therefore a computer cannot "think" (i.e., "strong AI" does not exist).
The kind of rule-following Searle described maps closely to what we now call supervised learning. Supervised learning is the process of training a computer to identify patterns based on pre-existing knowledge of those patterns. The person teaching the computer must first understand an existing data set, then train the computer to copy that understanding. For example, a human might show a computer pictures of dogs, each labeled with the word "dog". After enough labeled pictures, the computer will be able to identify a dog. In the Chinese Room's terms: if you lock me in a room with rules for manipulating Chinese characters, I will be able to answer, in Chinese, questions posed to me in Chinese, but I won't have any actual understanding of the Chinese language.
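The dog-labeling idea above can be sketched as a tiny supervised learner. This is a minimal illustration, not any production algorithm: the "features" are made-up 2D numbers standing in for image features, and the nearest-centroid approach is just one simple way to copy the human's labels.

```python
# Minimal supervised learning sketch: a nearest-centroid classifier.
# Features here are hypothetical 2D vectors; a real system would
# extract them from pixels.

def train(examples):
    """Compute the mean feature vector (centroid) for each label."""
    sums, counts = {}, {}
    for features, label in examples:
        s = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            s[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in s] for label, s in sums.items()}

def predict(centroids, features):
    """Assign the label whose centroid is closest (squared distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, features))
    return min(centroids, key=lambda label: dist(centroids[label]))

# The human supplies the "dog"/"cat" answers up front -- that is what
# makes this *supervised*.
training = [([1.0, 1.2], "dog"), ([0.9, 1.0], "dog"),
            ([3.0, 3.1], "cat"), ([3.2, 2.9], "cat")]
model = train(training)
print(predict(model, [1.1, 1.1]))  # → dog
```

The classifier never "understands" what a dog is; it only measures closeness to examples it was handed, which is exactly Searle's syntax-without-semantics point.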
Machine learning is the science of getting computers to act without being explicitly programmed. Today, computers are really good at "acting" smart (supervised learning). To put it as Searle might: computers are syntactic. Pattern recognition like Facebook's ability to tag friends in a picture, or a self-driving car's ability to stay in a lane and avoid hitting pedestrians - these are examples of syntactic learning. In web security, web application firewalls (WAFs) are specifically designed to fight known exploits: security researchers build and configure rules based on pre-existing knowledge of known threats. From an artificial intelligence perspective, machines can be fed web logs and trained to identify application behavior and application vulnerabilities. Securosis wrote a great article about Maximizing WAF Value, noting that this class of protection still has a way to go in terms of maturity and effectiveness. If machine learning capabilities help tune your WAF policies without a lot of work on your part, you're getting huge value. But much like every other example of supervised learning, a WAF's pattern recognition capabilities are only as good as the data you provide it.
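At its core, rule-based WAF detection is string and pattern matching against known attack signatures. Here is a toy sketch of that idea; the rule names and regular expressions below are illustrative only, far simpler than real WAF rule sets:

```python
import re

# Toy signature rules in the spirit of rule-based (syntactic) WAF
# detection. These patterns are illustrative, not production-grade.
RULES = [
    ("sql-injection", re.compile(r"(\bunion\b.*\bselect\b|'\s*or\s+1=1)", re.I)),
    ("xss", re.compile(r"<script\b", re.I)),
    ("path-traversal", re.compile(r"\.\./")),
]

def inspect(request_line):
    """Return the names of any rules the request matches."""
    return [name for name, pattern in RULES if pattern.search(request_line)]

print(inspect("GET /search?q=' OR 1=1 --"))        # → ['sql-injection']
print(inspect("GET /files?path=../../etc/passwd")) # → ['path-traversal']
print(inspect("GET /index.html"))                  # → []
```

Note the limitation this makes visible: an exploit with no matching signature sails straight through, which is why the quality of a WAF tracks the quality of the rules (and data) fed into it.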
There are clear limitations to supervised learning. Without Searle's "semantics", machine learning systems can't take knowledge learned in one area and easily apply it to another situation. Mark Zuckerberg gave a great example of these limitations: when you give a child a book, the child can hold it, turn its pages, drop it, and grasp that gravity exists. Computers can't. This means computers can't effectively react to new problems or situations they haven't seen before - i.e., they aren't intelligent.
Computers still have a hard time detecting unknown exploits; doing so is a problem for unsupervised learning. A computer would have to discover hidden patterns in traffic without being fed labeled data. Unsupervised learning is learning how the world works by observing and trying things out, rather than being told what to do. Today, no one has built a perfect model for unsupervised learning; we don't yet know how to give computers context and meaning.
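One very simple flavor of unsupervised detection is statistical anomaly detection: flag traffic that sits far from the rest without ever being shown a labeled "attack". The sketch below uses a z-score over made-up request-rate numbers; real systems are far more sophisticated, but the shape of the idea is the same - no labels, only structure discovered in the data itself.

```python
import statistics

def find_outliers(samples, threshold=2.5):
    """Flag values more than `threshold` standard deviations from the mean.

    No labeled examples are used -- the data alone defines "normal".
    """
    mean = statistics.fmean(samples)
    stdev = statistics.pstdev(samples)
    if stdev == 0:
        return []
    return [x for x in samples if abs(x - mean) / stdev > threshold]

# Hypothetical requests-per-minute counts for one client; the burst at
# the end is the kind of pattern we'd want surfaced without a signature.
requests_per_minute = [52, 48, 50, 47, 53, 49, 51, 50, 900]
print(find_outliers(requests_per_minute))  # → [900]
```

Even this toy shows the gap Searle pointed at: the code can flag that 900 is unusual, but it has no idea *why* - it attaches no meaning to the anomaly.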
As I mentioned above, pattern recognition capabilities in security are only as good as the number and quality of the exploits you see. In the next series of blog posts, I'll show how Akamai's visibility into 15-30% of web traffic drives machine learning in our products: training our WAF, detecting malicious clients on the web, and helping our customers make better decisions when tuning their rules.