DNS and security have had a long and tangled relationship. The DNS has always been an attractive target because it is a network leverage point. At DNS OARC 30 in Bangkok in 2019, Akamai's Ralf Weber gave a presentation called "DNS Security: Past, Present, and Future (It's Not Easy)" covering numerous DNS security issues that have arisen over the years. Various forms of DNS DDoS were a recurring theme in the presentation. DNS DDoS became a major problem starting in late 2013, when attackers figured out that a massive pool of open DNS resolvers, conveniently situated in networks around the world, made it trivially simple to attack essential resources. Weber and many of his peers from Nominum, a company focused on DNS for ISPs, communicated about this issue extensively at DNS OARC and many other venues that service providers depend on for meaningful insights.
In May 2020, a new kind of DNS DDoS attack was disclosed, called the NoneXistent Name Servers (NXNS) attack. The name of the exploit is modeled after the NoneXistent Domain (NXD) attacks that plagued the DNS starting in 2014. A group of researchers discovered that the practical reality of how resolvers operate is quite different from the theoretical reality conveyed in the simple diagrams commonly used to explain how the DNS works. Theory assumes resolvers receive nameserver IP addresses (i.e., glue records) along with the corresponding nameserver (NS) records from the higher layers of the DNS hierarchy. In practice, resolvers often don't receive these records, which means they must exchange many more DNS messages to proactively resolve nameserver IPs. Data presented in the researchers' paper showed that an amplification factor of more than 1,620x in the number of packets exchanged by the resolver was possible. Collateral damage included impact to resolver caches and recursion capacity. The potential impact to resolver operations and amplification targets was substantial.
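To make the message fan-out concrete, here is a back-of-the-envelope sketch in Python. It rests on a deliberately simplified model that is an assumption for illustration, not the paper's methodology: a referral carries some number of NS names with no glue, and the resolver issues one A and one AAAA lookup for each name it decides to pursue. The function name and the optional cap are hypothetical.

```python
from typing import Optional

# Simplified, illustrative model (an assumption, not the paper's methodology):
# each glueless NS name the resolver pursues costs one A and one AAAA lookup.
ADDRESS_TYPES = ("A", "AAAA")


def followup_queries(glueless_ns_names: int,
                     max_ns_pursued: Optional[int] = None) -> int:
    """Count the extra lookups triggered by one referral that has no glue."""
    if glueless_ns_names < 0:
        raise ValueError("glueless_ns_names must be non-negative")
    pursued = glueless_ns_names
    if max_ns_pursued is not None:
        # A resolver that caps how many nameservers it chases bounds the fan-out.
        pursued = min(pursued, max_ns_pursued)
    return pursued * len(ADDRESS_TYPES)


if __name__ == "__main__":
    print(followup_queries(100))                    # uncapped resolver
    print(followup_queries(100, max_ns_pursued=3))  # capped resolver
```

In this model an uncapped resolver turns a single referral with 100 glueless names into 200 follow-up lookups, while a cap of three holds it to six. Real-world amplification also depends on retries, caching, and how referrals chain, none of which this sketch attempts to capture.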
Our technical teams engaged with the NXNS research team and ran the attack against the DNS infrastructure (DNSi) CacheServe resolver, which is widely deployed in ISP networks. We were pleased to discover that CacheServe resisted the attack. A discussion with the developers of the CacheServe code revealed that they paid careful attention to the recursion path in the original design because they saw that problems such as the ones presented in the NXNS paper could consume significant resources. At the same time, they saw the need to find the most responsive nameservers so clients would get answers to their queries as quickly as possible, so they also developed algorithms to collect metrics and evaluate nameserver responsiveness. The result was optimal use of recursive resources and optimal responses for clients: a nice balance!
Another Akamai recursive resolver, DNSi AnswerX, also has mechanisms to protect the resources associated with resolving lists of nameservers. Its resolution algorithms limit the number of nameservers consulted and temporarily blacklist misbehaving nameservers on a zone-plus-nameserver IP address basis. Slow or nonexistent nameservers are monitored and eliminated from consideration.
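The two protections described above, a cap on the nameservers consulted and a temporary blacklist keyed on zone plus nameserver IP, can be sketched roughly as follows. This is an illustrative Python sketch only and not AnswerX code: the class name, method names, default limits, and the injectable clock are all hypothetical.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple


class NameserverSelector:
    """Caps nameservers consulted and temporarily skips misbehaving ones.

    Hypothetical illustration; names and defaults are assumptions.
    """

    def __init__(self, max_consulted: int = 3, penalty_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_consulted = max_consulted
        self.penalty_seconds = penalty_seconds
        self._clock = clock
        # Keyed on (zone, nameserver IP): a server is penalized only for the
        # zone it misbehaved on, not for every zone it serves.
        self._blocked_until: Dict[Tuple[str, str], float] = {}

    def report_failure(self, zone: str, ns_ip: str) -> None:
        """Record a timeout or bad answer; skip this pair for a while."""
        key = (zone.lower(), ns_ip)
        self._blocked_until[key] = self._clock() + self.penalty_seconds

    def is_blocked(self, zone: str, ns_ip: str) -> bool:
        key = (zone.lower(), ns_ip)
        expiry = self._blocked_until.get(key)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._blocked_until[key]  # penalty expired, forgive the server
            return False
        return True

    def candidates(self, zone: str, ns_ips: Iterable[str]) -> List[str]:
        """Return at most max_consulted usable nameserver IPs, in input order."""
        usable = [ip for ip in ns_ips if not self.is_blocked(zone, ip)]
        return usable[: self.max_consulted]
```

A caller would ask `candidates()` for the servers to try and call `report_failure()` whenever one times out or returns a bad answer. Keying the penalty on the zone as well as the IP means a shared nameserver that is broken for one zone stays usable for the others.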
Akamai's trusted position in provider networks affords operational insights that guide development of our software. The result is a strong track record of innovation serving the unique needs of the ISP and mobile network operator markets. There are other precedents where software design helped thwart visible and dangerous attacks on the DNS. To cite one important example, when Dan Kaminsky disclosed his cache poisoning attack in 2008, he had the opportunity to work with an earlier version of the CacheServe resolver (branded as "Vantio" at the time) and learn about its built-in cache poisoning defenses. Its layers of protection resisted his exploit far better than anything else available. The development team went on to build in UDP source port randomization (the industry-agreed solution) and added even more protection with a feature that screened query responses for spoofed answers.
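For readers unfamiliar with these defenses, the general idea behind source port randomization and response screening can be sketched as below. This is a generic Python illustration of the technique, not Vantio or CacheServe code: the type names, function names, and port range are assumptions, and a real resolver checks considerably more than this.

```python
import secrets
from dataclasses import dataclass

# Assumed port range for illustration; real resolvers choose their own pool.
PORT_MIN = 1024
PORT_MAX = 65535


@dataclass(frozen=True)
class OutstandingQuery:
    server_ip: str
    source_port: int
    txid: int
    qname: str
    qtype: str


@dataclass(frozen=True)
class Response:
    from_ip: str
    dest_port: int
    txid: int
    qname: str
    qtype: str


def new_query(server_ip: str, qname: str, qtype: str) -> OutstandingQuery:
    """Pick an unpredictable source port and transaction ID for a query."""
    port = PORT_MIN + secrets.randbelow(PORT_MAX - PORT_MIN + 1)
    txid = secrets.randbelow(1 << 16)  # DNS transaction IDs are 16 bits
    return OutstandingQuery(server_ip, port, txid, qname.lower(), qtype)


def response_matches(query: OutstandingQuery, response: Response) -> bool:
    """Accept a response only if every field lines up with the query sent."""
    return (response.from_ip == query.server_ip
            and response.dest_port == query.source_port
            and response.txid == query.txid
            and response.qname.lower() == query.qname
            and response.qtype == query.qtype)
```

An off-path attacker now has to guess both the 16-bit transaction ID and the source port, which makes a forged answer far less likely to be accepted than when only the transaction ID varies.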
There are many other examples of DNSi innovations from Akamai helping to protect provider networks and subscribers, reduce the operational burden, and improve content delivery and the subscriber experience. We're happy to have a more detailed discussion.