A new kind of DDoS attack is currently stressing DNS infrastructure everywhere. Attackers gain access to DNS resolvers through home gateways with open DNS proxies. The proxies forward large bursts of queries with spoofed source IP addresses to whatever resolver they are configured to use, usually an ISP resolver. In these attacks, the overwhelming majority of queries require recursion, so resolvers in turn query authoritative servers to get answers.
The volume of queries can stress resolvers and authoritative servers significantly. Nominum researchers studying these attacks regularly see evidence that authorities have failed. There appears to be a cascading effect: as authorities fail, resolvers try other authorities, which increases the load on them and causes additional failures. The net effect is that resources associated with target names become unavailable when all of the authorities responsible for them become unreachable or unresponsive.
Recently some new behaviors have been observed that make these attacks even more problematic. Authorities that implement Response Rate Limiting (RRL) create even more work for themselves and resolvers. RRL was developed to address DNS amplification attacks targeting authorities, which started to become problematic again in 2012.
RRL allows an authority to either drop answers or send truncated responses to queries that exceed a configured rate limit for a target name. A truncated response signals the sender to retry over TCP. This mechanism foils queries sent from spoofed source IPs: the truncated response goes to the spoofed address, whose owner never sent the query and so will never retry.
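The drop-or-truncate decision can be sketched as a toy limiter. This is a minimal illustration and not any server's actual RRL implementation: the fixed one-second window, the (client network, query name) key, and the `slip` parameter (every `slip`-th over-limit query gets a truncated reply instead of silence) are all assumptions made for the sketch.

```python
from collections import defaultdict


class ResponseRateLimiter:
    """Toy fixed-window limiter keyed by (client network, query name)."""

    def __init__(self, limit_per_second, slip=2):
        self.limit = limit_per_second
        self.slip = slip            # assumed knob: 0 means never truncate
        self.window = None          # the one-second window being counted
        self.counts = defaultdict(int)

    def decide(self, client_net, qname, now):
        """Return 'answer', 'truncate', or 'drop' for one incoming query."""
        window = int(now)
        if window != self.window:   # a new second starts: reset all counters
            self.window = window
            self.counts.clear()
        key = (client_net, qname.lower())
        self.counts[key] += 1
        over = self.counts[key] - self.limit
        if over <= 0:
            return "answer"
        if self.slip and over % self.slip == 0:
            return "truncate"       # tells a real sender to retry over TCP
        return "drop"


limiter = ResponseRateLimiter(limit_per_second=2, slip=2)
decisions = [limiter.decide("198.51.100.0/24", "target.example", 0.0)
             for _ in range(6)]
print(decisions)
```

With a limit of two per second, the first two queries are answered and the rest alternate between dropped and truncated until the window rolls over.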
In the current attack environment, RRL creates a couple of different problems because the queries reaching authorities come from legitimate ISP resolvers:
- If a resolver gets a truncated response back from an authority, it does what it's supposed to do: it resends the query over TCP! BOTH the resolver and the authority have to do more work to deal with a query that was not legitimate to begin with. Meltdown.
- If a resolver doesn't get a response back because the authority dropped the request, it may mark the authority as unavailable. This can start the cascading effect as the resolver tries successive authorities.
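Both reactions can be made concrete with a small sketch of the resolver side. It assumes two hypothetical callables, `send_udp` and `send_tcp`, standing in for real network I/O, and it records each unit of work so the extra effort is visible. Real resolvers use more elaborate server selection and timeouts than this.

```python
def resolve(qname, authorities, send_udp, send_tcp):
    """Try each authority in turn; return (answer, work_log).

    send_udp(auth, qname) is assumed to return an answer, the string
    "truncated", or None when the query was dropped or timed out.
    """
    log = []
    for auth in authorities:
        log.append(("udp", auth))
        reply = send_udp(auth, qname)
        if reply == "truncated":
            # Truncation: the resolver dutifully repeats the query over TCP,
            # so both sides handle the same bogus query twice.
            log.append(("tcp", auth))
            return send_tcp(auth, qname), log
        if reply is None:
            # Drop: the authority looks dead, so move on to the next one,
            # shifting the load onto it.
            log.append(("mark_unavailable", auth))
            continue
        return reply, log
    return None, log


# ns1 drops the query, ns2 truncates it.
behaviour = {"ns1": None, "ns2": "truncated"}
answer, work = resolve(
    "x7f3k.target.example",
    ["ns1", "ns2", "ns3"],
    send_udp=lambda auth, q: behaviour.get(auth, "answer"),
    send_tcp=lambda auth, q: "answer-over-tcp",
)
print(answer, work)
```

One attack query here costs three network exchanges and leaves one authority marked unavailable, which is the amplification of work described in the two bullets above.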
RRL was developed following an uptick in DNS amplification attacks directly targeting authoritative servers. The dynamics of DNS DDoS attacks changed when attackers discovered that home gateways with open DNS proxies gave them access to ISP resolvers. The extra level of indirection introduced by home gateways means a well-intended protection now produces these unexpected behaviors.
Because of this, the right way to address these attacks is to filter attack-related traffic at ingress to resolvers. In other words, minimize the amount of work the resolver has to do to identify attack traffic, and then get rid of it.
When resolvers eliminate unwanted traffic, authorities don't see it in the first place, so RRL and truncation won't be invoked (at least toward resolvers). Resolvers benefit too, since there is no need for additional recursive processing. And authorities can still use RRL to protect themselves when they're attacked directly - the original design goal.
Resolvers equipped with fine-grained policies and dynamic lists can put the two to work together. Separate dynamic lists can track malicious subdomains (blocklist) and legitimate subdomains (whitelist). A policy tying them together protects good traffic and drops bad traffic. Keep operations happy and keep subscribers happy!
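A policy of this shape might look like the sketch below. It is an illustration under stated assumptions and not any product's policy engine: the class and method names are hypothetical, blocklist entries are treated as domains under attack (matching every name beneath them), and whitelist entries are exact known-good names that always win.

```python
def normalize(name):
    """Lower-case a DNS name and strip the trailing root dot."""
    return name.rstrip(".").lower()


def is_under(name, domain):
    """True if name equals domain or is a subdomain of it."""
    return name == domain or name.endswith("." + domain)


class IngressPolicy:
    """Hypothetical ingress filter combining two dynamic lists."""

    def __init__(self):
        self.blocklist = set()   # domains currently used in attack queries
        self.whitelist = set()   # exact names known to be legitimate

    def block(self, domain):
        self.blocklist.add(normalize(domain))

    def allow(self, name):
        self.whitelist.add(normalize(name))

    def decide(self, qname):
        """Return 'resolve' or 'drop' before any recursion happens."""
        name = normalize(qname)
        if name in self.whitelist:
            return "resolve"     # good traffic is protected first
        if any(is_under(name, d) for d in self.blocklist):
            return "drop"        # attack traffic never reaches an authority
        return "resolve"


policy = IngressPolicy()
policy.block("target.example")
policy.allow("www.target.example")
for q in ("WWW.target.example.", "k3j9x.target.example", "other.example"):
    print(q, policy.decide(q))
```

Checking the whitelist before the blocklist is the design choice that keeps subscribers happy: a domain can be under attack while its well-known names keep resolving.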
Let's circle back to truncated responses for a minute. They're problematic when used by authorities responding to resolvers because resolvers will retry over TCP, stressing both. However, when resolvers use truncation in responding to client (stub) queries, the client usually doesn't retry because the source IP address was spoofed. If the client does retry, in most cases it's a legitimate query the resolver should answer. We have seen exceptions that can cause problems, such as when forwarders are used in front of resolvers, but overall truncation provides the right result - deterring unwanted traffic and protecting good traffic.
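For readers who want to see what a truncated response is on the wire, here is a minimal sketch that builds one from a raw query. It relies on the standard DNS header layout (six 16-bit fields, with the TC flag at 0x0200), assumes a single uncompressed question, and simply discards anything after the question, such as an EDNS record. The function name is hypothetical.

```python
import struct

QR, TC, RD, RA = 0x8000, 0x0200, 0x0100, 0x0080
OPCODE_MASK = 0x7800


def truncated_reply(query: bytes) -> bytes:
    """Build an empty response with the TC bit set for a raw DNS query."""
    qid, flags, qdcount, _an, _ns, _ar = struct.unpack("!6H", query[:12])
    if qdcount != 1:
        raise ValueError("sketch handles exactly one question")
    # Walk the length-prefixed labels of the question name.
    pos = 12
    while query[pos] != 0:
        pos += 1 + query[pos]
    pos += 1 + 4                      # root label, then QTYPE and QCLASS
    # Echo the opcode and RD bit; mark as a truncated response.
    new_flags = QR | TC | RA | (flags & (OPCODE_MASK | RD))
    header = struct.pack("!6H", qid, new_flags, 1, 0, 0, 0)
    return header + query[12:pos]     # question only, no records


# A query for "a.example" (type A, class IN) with RD set.
question = b"\x01a\x07example\x00" + struct.pack("!2H", 1, 1)
query = struct.pack("!6H", 0x1234, RD, 1, 0, 0, 0) + question
reply = truncated_reply(query)
print(reply.hex())
```

A spoofed source that receives this small reply has no reason to act on it, while a real stub that sent the query will retry over TCP and get its answer.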