Some thoughts on TTL Values

We recently had a serious system outage caused by broken access connections to one of our sites, which our esteemed telecom supplier took an inordinate amount of time to fix. One consequence of this outage was that we revisited the issue of TTL values and their impact on change propagation, resilience and fail-over.

Background

It may be worth re-stating the obvious - DNS TTL values reflect a balance between three issues:

  1. DNS load: The number of times the authoritative name servers and the DNS hierarchy need to be accessed. The lower the TTL the more frequently the DNS is accessed. If you are not careful, DNS reliability may become more important than the reliability of, say, the corporate web server.
  2. DNS Changes: The ability to make changes within a reasonable period of time for those values that are likely to change. The lower the TTL the quicker changes will propagate.
  3. Cache poisoning: Every read from the DNS hierarchy also offers a DNS poisoning possibility. Simply stated: the longer the TTL, the less frequently the RR is read and the lower the possibility that it may be poisoned.

We have competing and mutually exclusive requirements, so the TTL value is always a trade-off.

Since RFC 2308 the default TTL value for a zone has been defined using the $TTL zone file directive. Best practices for TTLs and other topics are described in RFC 1912. The zone RRs and BIND parameters that have the most impact on resilience and fail-over are the NS RRs, A/AAAA RRs, MX RRs and BIND's rrset-order statement.

DNS is able to provide a measure of resilience and load-balancing (minute-to-minute) which can be surprisingly effective. A separate article investigates the use of short TTLs and multiple RRs in a fail-over context, and another discusses the minimum sensible TTL. Our experiments suggest that for browser-based applications the minimum fail-over time, irrespective of the TTL values used, may be as high as 30 minutes for a single A (or AAAA) RR solution but can be as low as 1 minute 30 seconds when using multiple A/AAAA RRs. In general our conclusion is that short TTLs are a flawed and pointless strategy - again for browser-based applications. There may be other applications that can genuinely use short TTLs effectively. I don't know of any - but that may simply reflect the prejudice that comes from a profound - verging on pathological - dislike of short TTLs.

Trying to use DNS to solve short-term load-balancing - typically by using very short TTLs - can open the door to far more serious problems. A paper on DNS spoofing (cache poisoning) suggests that some of the simpler attacks, which are pretty ineffective with longish TTLs (1 hour or longer), can suddenly become very viable with low (sub one minute) TTLs. In general short-term load-balancing (second-to-second) can only be effective when using specialised load-balancers. But simple multiple A RR strategies can be effective as a longer-term (minute-to-minute) load-balancer and extremely effective, if used with geographic separation, for service resilience.
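
A multiple A RR strategy needs nothing more than two address RRs with the same name. The fragment below is a minimal sketch only: the name www and both addresses (taken from the reserved documentation ranges) are assumptions, and each address is presumed to sit at a different site with a different access supplier.

```
; zone file fragment for example.com
; two A RRs for one name, each served from a different site
$TTL 12h ; a long TTL - fail-over here does not rely on a short one
$ORIGIN example.com.
www   A   192.0.2.10    ; site 1 (assumed address)
www   A   198.51.100.10 ; site 2 (assumed address)
```

Both addresses are returned in answer to a query for www.example.com, so a client that cannot reach one can try the other. The order in which they are returned is what BIND's rrset-order statement controls.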

TTLs and RRs

While the $TTL directive is very convenient for creating a zone-wide TTL value it is not always appropriate. The group of RRs at the zone apex (or root) - SOA, NS and MX RRs - are frequently and collectively referred to as the infrastructure RRs. These RRs have different properties to those of address RRs - such as A/AAAA RRs - in that they involve multiple DNS queries. The first query will read, say, the MX RR(s) and subsequent queries will return the A (or AAAA) RR(s) of the name pointed to in the MX RR(s).

The first question here is: how frequently will you change the NAME of your infrastructure servers, such as name servers, mail servers etc.? You may well change a server's IP address(es) (the A or AAAA RRs) - perhaps even very frequently - but its name? Thus the MX RRs could run with a TTL of multiple weeks while the A/AAAA RR(s) may run with only hours. Since finding a mail server takes two lookups (the MX RR, then the A/AAAA RR), caching the MX RR for weeks means that only the address lookup is repeated, so the net effect of this trivial strategy could be to reduce your DNS load by a figure approaching 50% for this record type. Which in the case of the MX RR means that you can get your spam with a reduced load on your server - that was meant to be a semi-serious joke for those entirely devoid of a sense of humor.

Note: It may be worth digressing for a second or so. There is no reason that the server name (as it appears in /etc/hosts) of your name servers or your mail servers or your web servers need be the same name as that used by external users. Thus for operational reasons you may have a DNS server and refer to it as, say, ns1.example.com in an NS RR (and with a corresponding A/AAAA RR) but the service could be provided by a server whose real name is, say, super-fast-dns-master.example.com. Other than the overhead of defining two A RRs (one for ns1, another for super-fast-dns-master), both pointing to the same IP, this is a perfectly legitimate configuration. Some people even use the dreaded CNAME RR instead of multiple A/AAAA RRs for this purpose. The following zone file snippet illustrates this point:

; zone file snippet for example.com
; illustrating multiple server names
; uses BIND's short form time parameters for clarity
$TTL 2h  ; default zone TTL
$ORIGIN example.com.
...
         3w           NS ns1.example.com. ;very long TTL
...
; both A RRs below use the default TTL of 2 hours (7200 seconds)
ns1                   A  192.168.1.2 ;external name
super-fast-dns-master A  192.168.1.2 ;internal - real - name
...
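
For those who do go the CNAME route, the alias has to run from the internal name to the external one and not the other way round, since RFC 2181 requires that the name in an NS (or MX) RR is not an alias. A hedged sketch of that variant, reusing the names and address of the snippet above:

```
; zone file fragment for example.com - CNAME variant
; the NS RR must name a host with an A/AAAA RR, never a CNAME
$TTL 2h  ; default zone TTL
$ORIGIN example.com.
@                     3w NS    ns1.example.com.
ns1                      A     192.168.1.2 ; external name, used by the NS RR
super-fast-dns-master    CNAME ns1         ; internal name, alias for ns1
```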

TTL values

When considering TTL values it is important that we exploit the fact that not all RRs were created equal, so each can use an appropriate TTL value and thus contribute to a quieter life for the DNS hierarchy, caching name servers everywhere and the poor old zone's authoritative name servers in particular. Here, for what they are worth, are some thoughts on TTL values for the major RRs and RR groups:

SOA RR: Unless DDNS (Dynamic Update) is being used or the name of the zone support contact changes frequently (joke) this RR, for all types of site, can have an extended TTL - 3 to 7 days, perhaps even weeks. There are no caching implications for the slave since it explicitly reads the SOA on receipt of a NOTIFY or when the refresh value is reached. For slaves the serial number is significant, not the TTL. If DDNS is being used, the Primary Master referenced in the SOA is used to confirm which DNS to update, and if this NAME is likely to change the TTL value may need to be lower than would otherwise be desirable.
SOA Refresh: The refresh parameter of the SOA RR defines when the slave will read the master's SOA and compare its current serial number with that received from the master. If it cannot reach the master the slave will continue to service the domain until the expiry value is reached. If NOTIFY is being used then there is little merit in setting this to anything but 12+ hours since its overall effect is simply to reset the expiry timer. If NOTIFY is not being used then this value will determine the rate at which zone changes are propagated to the slaves.
SOA Expiry: The expiry parameter of the SOA RR defines when the slave will stop responding to zone requests if it cannot reach the master and has a significant effect on overall DNS availability. Setting this to a very high value (2+ weeks) will ensure that even if the master name server is out for an extended period DNS service will continue for the zone using the zone slave copies. There are no caching implications for this value since it is only used by the slave, which uses the serial number, not the SOA TTL, to determine when to transfer the zone.
NS RR: Assuming you have two (or more) NS RRs, as per RFC 1912, and they are physically separated by location and access supplier, the NS RRs can have very long TTLs - 3 to 7 days, perhaps even weeks - since if one is down for a prolonged period the others will continue to service the zone. There is clearly a timeout overhead for every access to an out-of-service DNS but since most observed name servers use a round-robin NS list this timeout should, with two name servers, only affect 50% of DNS requests as a worst case. In the event that an NS change is required (changing an external DNS supplier) this would typically be a planned activity, and changing the TTL to a lower value (hours) during the transition would probably be a wise move before restoring it to a very high value after the change.
MX RR: The MX RR apparently has built-in redundancy. But this can be extremely misleading unless well understood. Most mail systems forward mail to the backup (higher preference value in the MX RR) as soon as they fail to access the primary - either because it's too busy or is not responding (permanently or temporarily). The backup mail server's role is to forward mail to the primary - mail is not read (by the user) from the backup. There is no advantage in having anything but long TTLs - typically weeks - for MX RRs. The Preference parameter of the MX RR provides for resilience, not the TTL. Again if mail servers are to be changed (as opposed to their A/AAAA RRs) then this will typically be a planned activity, and changing the TTL to a lower value (hours) during the transition would probably be a wise move before restoring it to a very high value after the change.
A/AAAA RR: The increasing use of very low TTLs (sub one minute) is extremely misguided if not fundamentally flawed. The most charitable explanation for the trend to lower TTL values may be to try and create a dynamic load-balancer or a fast fail-over strategy. More likely the effect will be to break the name server through increased load. Many commercial devices seem to use this low TTL strategy (we have seen the lunatically short 1 - 5 second TTLs) - one hesitates to suggest in a desire to sell more equipment. Some experiments we ran using single and multiple A/AAAA RRs provided very effective fail-over in approximately 1 minute 30 seconds for multiple RRs irrespective of the TTL value - without the need for expensive equipment. In the case of a single RR, even with a lunatically short TTL, the fail-over time increased to 30 minutes. In a multiple RR strategy the TTL value is NOT important when considering fail-over times and thus the A/AAAA RR's TTL value should be set to the acceptable propagation delay, say, 12+ hours. In the event of an IP address change, which in any case is typically a planned activity, TTLs can be lowered prior to the change and increased afterwards. Certainly running with multiple A/AAAA RR TTLs in the range 12+ hours should be the norm - significantly higher if IP addresses are regarded as stable. In cases where a single IP address may need to be replaced quickly a TTL value of 1800 (30 minutes) is really the lowest sensible value imaginable.
SRV/NAPTR RR: These two RRs are similar in characteristic to infrastructure RRs in that they define a name (or label) which requires a further DNS A/AAAA query. Again long TTLs (3 - 7 days) are appropriate.
TXT/SPF RR: The SPF RR may contain a mixture of names and IP addresses. If using explicit addresses it has properties similar to the A/AAAA RR above. If only names, the properties are those of the infrastructure RRs (NS, MX, SOA).
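
The lower-then-restore routine that recurs above for planned NS, MX and A/AAAA changes might look like the following three successive versions of a single RR. This is only a sketch: the name www, the 1h transition TTL and the addresses (from the reserved documentation range 192.0.2.0/24) are assumptions.

```
; successive versions of ONE RR in example.com - not a single zone file
; increment the SOA serial number with each version

; version 1 - publish at least 12h (the old TTL) before the change
www  1h   A  192.0.2.10 ; old address, TTL lowered from 12h

; version 2 - the change itself, once every cached 12h copy has expired
www  1h   A  192.0.2.20 ; new address, still on the short TTL

; version 3 - after the new address has proved stable
www  12h  A  192.0.2.20 ; long TTL restored
```

The wait before version 2 matters because a cache that read the RR just before the TTL was lowered will hold the old address for the full 12 hours.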

OK so we are all lazy and do not want to make changes all the time to many zone RRs. Since there are - usually - more A (or AAAA) RRs in a zone and they are most likely to change frequently - default all these to the $TTL value for the zone and run with explicit overrides on all other RRs. Minimum typing - maximum flexibility. Does life get any better?
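
Put together, the suggestions above might give a zone file along the following lines. It is a sketch rather than a recommendation for any particular site: the host names, the hostmaster mailbox, the serial number, the retry and negative caching values and all the addresses (reserved documentation ranges) are assumptions, while the TTL, refresh and expiry values are picked from the ranges discussed above.

```
; zone file sketch for example.com
; A RRs take the $TTL default - everything else has an explicit override
$TTL 12h ; default zone TTL - used by the A RRs
$ORIGIN example.com.
@     1w  SOA ns1.example.com. hostmaster.example.com. (
              2024010101 ; serial number (placeholder)
              12h        ; refresh - NOTIFY assumed to be in use
              15m        ; retry (assumed value)
              3w         ; expiry - slaves keep answering for 3 weeks
              3h         ; negative caching TTL (assumed value)
              )
      3w  NS  ns1.example.com.
      3w  NS  ns2.example.net.     ; second site, second access supplier
      3w  MX  10 mail.example.com.
      3w  MX  20 mail.example.net. ; backup - forwards to the primary
; address RRs - no explicit TTL so the 12h default applies
ns1       A   192.0.2.2
mail      A   192.0.2.3
www       A   192.0.2.10    ; site 1
www       A   198.51.100.10 ; site 2
```

Only the infrastructure RRs carry a TTL of their own, so adding another host is a one-line change with nothing to remember.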


Pro DNS and BIND by Ron Aitchison
