DNS & Anycast (Part 1): Two Technologies that Work Great Together
DNS is a foundational pillar of the StackPath platform, and engineering it soundly with Anycast is part of how we deliver performant edge services. With that in mind, we decided to write about what makes DNS and Anycast such a good pair. This is part one. You can find part two here.
DNS is a foundational technology with a simple job: given a human-readable name, like stackpath.com, it returns a computer-usable IP address. Without this translation, the Internet would be practically unusable.
Because DNS is so important, it has to be extremely reliable. DNS clients get a high level of redundancy by maintaining a list of multiple DNS servers, so that if one of them stops responding to queries, the client can talk to another. Anycast helps DNS achieve this in an efficient and performant manner.
Below, I’ll answer some common questions about DNS failure and then explain how Anycast helps prevent these failures.
How does a computer deal with a failed DNS server?
DNS clients have timeout settings and number-of-attempt settings. Although the specifics vary depending on the operating system in question, most will follow some simple retry logic:
- There’s a list of multiple DNS servers; the first one on the list is tried first.
- If this server fails to respond within the timeout (5 seconds is a common default), the same server is retried.
- These retries continue until the number-of-attempt setting is reached.
- The second DNS server on the list is tried.
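The retry steps above can be sketched in a few lines of Python. This is only an illustration of the logic as described here, not the code of any particular operating system's resolver. The names `resolve_with_retries`, `send_query`, and `worst_case_wait` are hypothetical, and the default of 2 attempts is an assumption chosen for the example.

```python
from typing import Callable, List, Optional


def resolve_with_retries(
    servers: List[str],
    send_query: Callable[[str, float], Optional[str]],
    timeout: float = 5.0,   # seconds to wait per attempt
    attempts: int = 2,      # assumed number-of-attempts setting
) -> Optional[str]:
    """Try servers in list order, exhausting all attempts on one before moving on."""
    for server in servers:
        for _ in range(attempts):
            # send_query is a stand-in for the real network call;
            # it returns None when the server did not answer in time.
            answer = send_query(server, timeout)
            if answer is not None:
                return answer
    return None


def worst_case_wait(failed_servers: int, timeout: float = 5.0, attempts: int = 2) -> float:
    """Seconds spent waiting on dead servers before a healthy one is tried."""
    return failed_servers * attempts * timeout


# With one dead server ahead of a healthy one, the user waits 2 * 5 seconds.
print(worst_case_wait(1))  # 10.0
```

Under these assumed settings, a single unresponsive server at the top of the list costs the user ten seconds before the second server is even contacted.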
Some performance-related issues are implied here. One is that a DNS server is given multiple chances to fail before the next server is tried, and because each failure takes a full timeout, the end user has to wait. Another is that the client doesn't distribute its queries across all available DNS servers by default. Each new lookup starts again at the top of the list, so even if the first DNS server is failing, it keeps receiving subsequent queries.
What does a failing DNS server look like?
DNS servers can get overloaded and stop sending responses to queries. When this happens, a client has to pause and cycle through its retry attempts in hopes that it can catch the server after it has had some time to process its current workload and begin responding again.
DNS servers can also suffer from bad networks. Packet loss at the network layer can cause the DNS server to fail to receive a query in the first place. Or it can receive the request and send the response, but network loss can cause the response to get lost. This, again, causes the client to step through its retry attempts.
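To get a feel for how packet loss turns into retries, here is a rough back-of-the-envelope model. It assumes that the query and the response are each lost independently with the same probability, which is a simplification since real networks tend to lose packets in bursts. The function names are hypothetical.

```python
def attempt_failure_probability(loss_rate: float) -> float:
    """Chance that one attempt fails: either the query or the response is lost."""
    # The attempt succeeds only if both packets arrive.
    return 1.0 - (1.0 - loss_rate) ** 2


def all_attempts_fail_probability(loss_rate: float, attempts: int) -> float:
    """Chance that every attempt against one server fails (independent losses assumed)."""
    return attempt_failure_probability(loss_rate) ** attempts
```

Under this model, 10% packet loss in each direction makes roughly 19% of individual attempts fail, and about 3.6% of lookups would exhaust two attempts against the same server.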
Why doesn’t the network protocol help?
By default, DNS requests and responses are not sent via TCP, the Web’s usual communication protocol. They’re sent via a different protocol, UDP.
UDP lacks TCP's safety mechanisms. In UDP there is no retransmission of lost packets, no window scaling, and no pacing. It's a very simple protocol that sends out a packet, and that's it, which is why the DNS client has to handle errors and implement its own retry logic.
So why use DNS over UDP? The answer is efficiency.
A typical DNS query fits inside a single UDP packet. With DNS over TCP, a handshake has to complete to create the connection before any data moves, which adds round trips. Only then can that very small query be sent over the TCP connection.
TCP also carries machinery designed for large transfers, such as congestion control and window scaling, which only pays off when a lot of data is being transmitted. None of it helps when the data that needs to be sent is so incredibly small.
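To show how small a query really is, the sketch below builds a standard DNS question for an A record by hand using only the Python standard library. The function names and the query ID are arbitrary choices for this example, and the resolver address passed to `send_query_udp` would be whatever DNS server you want to ask.

```python
import socket
import struct


def build_dns_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE = 1 (A record), QCLASS = 1 (Internet).
    return header + qname + struct.pack("!HH", 1, 1)


def send_query_udp(query: bytes, resolver_ip: str, timeout: float = 5.0) -> bytes:
    """Send one datagram and wait for one reply; raises socket.timeout on no answer."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (resolver_ip, 53))
        response, _ = sock.recvfrom(512)
        return response


query = build_dns_query("stackpath.com")
print(len(query))  # 31
```

The whole question for stackpath.com is 31 bytes: a 12-byte header, 15 bytes for the encoded name, and 4 bytes for the type and class. One datagram out and one datagram back is the entire exchange.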
How do we get better DNS performance?
We can take advantage of modern Internet routing and implement a technology called Anycast. Anycast allows us to announce the same DNS server IP address from multiple locations throughout the world called PoPs (Points of Presence). With Anycast, ISPs (Internet Service Providers) route each user to the PoP that is closest in routing terms, which is usually also the one physically nearest to them.
Bringing up multiple identical DNS servers, for example one per PoP, each configured to answer a given query in exactly the same way, means that the DNS client can talk to any of those DNS servers and receive the same response.
Since the client's ISP routes to the closest available PoP, the client typically gets the fastest response. Even better, if the PoP has problems and its IP announcements are pulled, routing automatically adjusts to send that same DNS client's requests to the next closest PoP, where they are served by another DNS server.
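The failover behavior can be pictured with a toy model. This is not how BGP is implemented; it only mimics the outcome, where a client's traffic lands on whichever announcing PoP has the lowest routing cost from that client's point of view. The PoP names, the cost values, and the function `pick_pop` are all made up for illustration.

```python
from typing import Dict, Optional, Set


def pick_pop(route_cost: Dict[str, int], announcing: Set[str]) -> Optional[str]:
    """Return the announcing PoP with the lowest routing cost, or None if none announce."""
    candidates = {pop: cost for pop, cost in route_cost.items() if pop in announcing}
    if not candidates:
        return None
    return min(candidates, key=lambda pop: candidates[pop])


# Hypothetical routing costs from one client's ISP to each PoP.
route_cost = {"dallas": 1, "chicago": 3, "amsterdam": 9}
announcing = {"dallas", "chicago", "amsterdam"}

print(pick_pop(route_cost, announcing))  # dallas

# The Dallas PoP has problems and withdraws its announcement.
announcing.discard("dallas")
print(pick_pop(route_cost, announcing))  # chicago
```

The client never changes the IP address it queries. Only the set of PoPs announcing that address changes, and the network does the rest.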
Because this happens over UDP and is a single packet, it doesn’t matter which PoP returns the response. There is no TCP connection to interrupt and no wasted overhead. The automatic DNS client retry logic could send an initial request to one PoP and a second request to a second PoP—all without intervention on the part of the DNS client.
Combining technologies like this ultimately leads to a more robust and better performing DNS system.
Want to learn more about DNS and Anycast? Check out part two of this series that deals with monitoring DNS over Anycast.