One of two DNS servers going down causes impact


 
# 1  
Old 01-17-2020

Our computing environment consists of Linux, Solaris, AIX, and Windows hosts. The /etc/resolv.conf file on each *nix host has two nameserver entries. When the 2nd one goes down, we see impacts on AIX-hosted services. We have been racking our brains over this, to no avail so far. We have not seen any impact on non-AIX-hosted services. While the 2nd DNS server remains down, nslookup returns hostnames immediately.

We are trying to avoid running tcpdump, and were trying to capture DNS traffic from the client with netstat instead. But netstat does not show the DNS traffic either.

Would you please give us a hand?
# 2  
Old 01-17-2020
Why are you focused on network traffic analysis?

It seems best to audit the DNS server which is failing, or "going down" in your words.

What does "going down" mean? What is crashing, exactly, and why?
# 3  
Old 01-18-2020
Hi Neo,
One DNS server out of two failed, only once, and that caused an impact. We are able to reproduce the problem. We want to find out why the impact was felt even though the other DNS server was fine. The impact was reported only for AIX-hosted services.

--- Post updated at 05:45 AM ---

Since we have not found any cause at the upper layers, we want to investigate at the "netstat" layer. There we found that netstat does not report DNS requests.
# 4  
Old 01-18-2020
Yes, but what actually failed?

What process? What does "failed" mean?

Did the DNS daemon process crash? Did it need to be restarted? Did a single DNS query fail?

What EXACTLY failed?
# 5  
Old 01-18-2020
Hi Neo,
First-level failure: CPU usage on the DNS server (a Windows DC) spiked. This DNS server appears as the 2nd server in the /etc/resolv.conf file.
Second-level failure (a result of the first): application servers were unable to connect to the DB server. The logs reported: unable to find DB connection stream

When we reproduced the situation (kept the 2nd DNS server down), the application server was unable to connect to the DB server, but "nslookup <host>" worked.

--- Post updated at 07:22 AM ---

Neo,
Also, please note that fixing the root cause (the CPU spike or death of one DNS server) is not what I am after. I want to understand why resiliency did not work: why were the app servers unable to connect while only one DNS server out of two was down?
# 6  
Old 01-18-2020
I've seen this mostly related to DNS query timeouts configured on the client side.
The defaults are quite high on most Linux/UNIX operating systems. From the AIX man page online:
Quote:
timeout:n Enables you to specify the initial timeout for a query to a nameserver. The default value is five seconds. The maximum value is 30 seconds. For the second and successive rounds of queries, the resolver doubles the initial timeout and is divided by the number of nameservers in the resolv.conf file.
attempts:n Enables you to specify how many queries the resolver should send to each nameserver in the resolv.conf file before it stops execution. The default value is 4. The maximum value is 5.
In practice, if you have, for instance, two DNS servers and the first one in /etc/resolv.conf goes down, the system will first query it with a timeout of 5 seconds and 4 attempts, totaling 20 seconds, before the second one is tried.

This will certainly hit timeouts on the application side, i.e. the application will time out before the system returns a valid DNS entry.

As for nslookup working, I'm unsure. Is this from the same box?
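
One possible reason for the difference: nslookup sends its queries to the nameservers with its own resolver code, while applications normally go through the system resolver routines (getaddrinfo and friends), so the two can behave differently during an outage. A minimal sketch for timing a lookup through the system resolver, assuming Python 3 is available on the affected box (the function name time_lookup and the host "localhost" are placeholders, not anything from this thread):

```python
import socket
import time


def time_lookup(host, resolver=socket.getaddrinfo):
    """Resolve host via the system resolver and return (ok, elapsed_seconds).

    The resolver argument exists only so the function can be exercised
    without real DNS; by default it calls socket.getaddrinfo, which uses
    the same C library lookup path most applications use.
    """
    start = time.monotonic()
    try:
        resolver(host, None)
        ok = True
    except socket.gaierror:
        # Name resolution failed (or gave up after its retries).
        ok = False
    return ok, time.monotonic() - start


if __name__ == "__main__":
    host = "localhost"  # placeholder; use the DB server's hostname instead
    ok, elapsed = time_lookup(host)
    print(f"{host}: {'resolved' if ok else 'failed'} in {elapsed:.2f}s")
```

Running this with one DNS server down and comparing the elapsed time against the application's own connect timeout should show whether the resolver delay is what the application is tripping over.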

My suggestion is to change the defaults to lower values and/or implement a local DNS caching mechanism on the AIX box.

Hope that helps
Regards
Peasant.
# 7  
Old 01-18-2020
Obviously the default timeout is too high.
Add two lines to /etc/resolv.conf:
Code:
options timeout:2
options attempts:2

These values will give a total delay of 2 * 2 = 4 seconds when the first DNS server (nameserver) is down.
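
The arithmetic here and in post #6 can be written down as a rough estimate. This is only a sketch of the simple model used in this thread (a fixed timeout, multiplied by the number of attempts against the dead server); real resolvers may interleave servers and scale the timeout between rounds, as the man page quote above describes, so treat the result as an approximation. The function name worst_case_delay is made up for illustration:

```python
def worst_case_delay(timeout_s, attempts, dead_servers=1):
    """Seconds spent waiting on unresponsive nameservers.

    Simplified model from this thread: every attempt against a dead
    server costs one full timeout, and the timeout stays fixed.
    """
    return timeout_s * attempts * dead_servers


if __name__ == "__main__":
    # AIX defaults quoted in post #6: timeout 5, attempts 4
    print(worst_case_delay(5, 4))
    # Values suggested above: timeout 2, attempts 2
    print(worst_case_delay(2, 2))
```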

Further, ensure that local comes first for hosts in /etc/netsvc.conf (before any reference to bind or dns), so that, for example, a lookup for localhost is answered from /etc/hosts (the entry must be there, of course) without querying DNS.
See also: AIX ClearCase server is not responsive during DNS outage

Last edited by MadeInGermany; 01-18-2020 at 04:17 AM..