January 6th, 2012, 09:41 AM
Odd issue with named caching server. Please Help!
The issue arises when running the host command or a Java app that requires lookups. It appears to be either a problem with the named caching server running on CentOS, or the response is mangled in some way for the hostname webservices.securetrading.net.
The problem arises as soon as a lookup is made for the IPv6 AAAA record.
It only occurs with the above-mentioned hostname.
I have installed a separate instance of CentOS running a named caching server on a VirtualBox host under OS X on my home network, and can confirm the issue still arises there. So it's not limited to our business network.
I can also confirm the issue arises when changing to different nameservers, i.e. when using the Google public nameserver 188.8.131.52
After a period of time nslookup starts to resolve correctly, but as soon as I execute a host command or a Java program that requires a lookup of the IPv6 AAAA record, the issue returns.
Please find details of the issue below.
Host command lookup:
January 6th, 2012, 10:59 AM
A quad-A (AAAA) record doesn't exist for webservices.securetrading.net. Normally a NOERROR would be returned with an empty answer and the SOA, since the name exists but has no AAAA data; however, it is returning an NXDOMAIN instead. I'm not sure exactly what would cause that, but it isn't normal.
The name servers for securetrading.net are most likely not BIND servers, since this shouldn't happen in BIND and a version query returns NOTIMP. This could mean a lot of things, but my best guess is they are using some other DNS software that responds in its own nonstandard way.
The problem this causes is that NXDOMAIN responses are cached for the time given by the SOA minimum field, which is 600 seconds in this case. So you will essentially have an intermittent issue, but this is due to the securetrading.net server configuration. The TTL on the A record for that host is 20 seconds, but that low TTL doesn't matter. A query for the AAAA that gets the NXDOMAIN response back will override anything cached for webservices.securetrading.net regardless of record type (since the auth server is saying the name doesn't exist at all).
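To make the caching behaviour described above concrete, here is a toy model in Python. It is only a sketch of the idea, not how BIND actually implements its cache; the class name, method names and the 192.0.2.10 address are made up for illustration.

```python
class ToyResolverCache:
    """Toy model of a caching resolver (hypothetical, not BIND's real cache)."""

    def __init__(self):
        self.positive = {}  # (name, rtype) -> (value, expires_at)
        self.nxdomain = {}  # name -> expires_at

    def store_answer(self, name, rtype, value, ttl, now):
        # A normal answer is cached per name AND record type.
        self.positive[(name, rtype)] = (value, now + ttl)

    def store_nxdomain(self, name, negative_ttl, now):
        # NXDOMAIN means "this name does not exist", so it covers every
        # record type and displaces any positive data held for the name.
        self.nxdomain[name] = now + negative_ttl
        for key in [k for k in self.positive if k[0] == name]:
            del self.positive[key]

    def lookup(self, name, rtype, now):
        expires = self.nxdomain.get(name)
        if expires is not None and now < expires:
            return "NXDOMAIN"
        entry = self.positive.get((name, rtype))
        if entry is not None and now < entry[1]:
            return entry[0]
        return None  # cache miss: a real resolver would query upstream


host = "webservices.securetrading.net"
cache = ToyResolverCache()

# The A record is cached with its 20 second TTL (placeholder address).
cache.store_answer(host, "A", "192.0.2.10", ttl=20, now=0)
print(cache.lookup(host, "A", now=5))

# An AAAA lookup comes back NXDOMAIN and is cached for 600 seconds.
cache.store_nxdomain(host, negative_ttl=600, now=6)
print(cache.lookup(host, "A", now=30))   # A lookups now fail as well
print(cache.lookup(host, "A", now=606))  # entry expired, so a cache miss
```

In this model the A lookup works until the first AAAA query, then everything for the name fails until the 600 seconds run out, which matches the intermittent pattern being reported.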
I don't think there's too much you can do in this case unless you have a way of contacting the people that run the server and figuring out what's happening.
January 6th, 2012, 01:59 PM
Thank you CaptPikel for your very informative reply.
One thing to note is that if I disable named on the proxy and edit /etc/resolv.conf to point to an external DNS, the lookups do not hang.
Where can I set the TTL for the cache, i.e. drop it down from 600 to, say, 20 seconds?
I am sure the above will not help for the reason you mentioned in your previous response but unfortunately I need to prove this to securetrading.com so as to satisfy them.
January 6th, 2012, 02:12 PM
The maximum negative cache TTL (max-ncache-ttl) is set globally in the options statement in BIND. Other servers on the internet may never give you an issue because a lot are running custom versions of DNS software. Like Google, for example. They do not use BIND and don't always follow RFCs (which isn't bad but makes troubleshooting unreliable). Some issues I have on my server, Google doesn't have, and vice versa. A lot of public systems are set to notice problems like the one you have and reissue direct queries to the authoritative nameservers rather than answer from cache (or ncache). So honestly it's hard to guess why one may work and another may not. If you lower your max ncache time, just know it will lower it for everything.
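As a rough illustration of how that global cap interacts with the zone's own values: under RFC 2308 the negative TTL is the smaller of the SOA record's TTL and its minimum field, and BIND then limits it to max-ncache-ttl (default 10800 seconds). The function name below is made up, and the SOA TTL of 3600 is an assumed value, since the real one isn't given in this thread.

```python
def effective_negative_ttl(soa_ttl, soa_minimum, max_ncache_ttl=10800):
    """Seconds an NXDOMAIN would stay in the negative cache (sketch only)."""
    # RFC 2308: negative TTL = min(SOA TTL, SOA minimum field);
    # the resolver's max-ncache-ttl then acts as an upper bound.
    return min(soa_ttl, soa_minimum, max_ncache_ttl)


# SOA minimum of 600 as reported above; SOA TTL of 3600 is an assumption.
print(effective_negative_ttl(3600, 600))                     # default cap
print(effective_negative_ttl(3600, 600, max_ncache_ttl=20))  # lowered cap
```

With the default cap the 600 second minimum wins; with the cap lowered to 20, every negative answer from every zone is held for at most 20 seconds.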
January 6th, 2012, 02:23 PM
Thanks again CaptPikel.
The thing is, this was working fine until around the 22nd December. The workaround for the moment is to hard code it in /etc/hosts, but it was niggling at me as to why this was occurring.
I am guessing securetrading made some changes to their DNS servers on the 22nd Dec.
I presume the max-ncache-ttl directive should go into the named.conf?
January 7th, 2012, 11:27 AM
Yeah, just put it in the options section of named.conf.
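A minimal sketch of what that could look like, assuming you want the 20 second cap mentioned earlier in the thread. Your existing options stay as they are; only the max-ncache-ttl line is new.

```
options {
    // ... your existing options ...

    // Cap negative (NXDOMAIN / no data) cache entries at 20 seconds.
    // This applies to every zone the server resolves, not just one host.
    max-ncache-ttl 20;
};
```

Running named-checkconf before reloading named should catch any syntax slip, and rndc flush clears whatever is already sitting in the cache.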