Serious un-pingable stumper of a problem...


 
# 1  
Old 01-24-2008

I have been busting my head over a network issue at work recently. I believe the problem is in the L2 domain, but "the powers that be" believe it looks more like a problem with the server's port. Worst of all, EVERYBODY in the Engineering Department uses this file server...

The symptoms are as follows:
  • A Samba share is exported from "FileServ_1" to my desktop. While I have a file open for read/write, I lose the file (i.e., the connection drops) and my app prompts me to save a local copy (lucky me).
  • From that point, I immediately (being prepared) switch to a shell in which I kick off a ping to "FileServ_1"; in another shell I bypass DNS and ping the IP directly; in a third shell I have a remote connection from a totally different subnet, also pinging "FileServ_1"; and finally I run a traceroute from both my desktop and the remote connection.
  • ALL pings time out, and every trace dies at the last hop.
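
The manual routine above (several shells, pinging by name and by IP) could be left running unattended with a small logger that prints a timestamp only when a target changes state. This is a minimal sketch, not a tested tool: it assumes Python 3 and a Linux-style `ping` that accepts `-c` and `-W`, and the names `probe`, `outages`, and `watch` are made up for illustration.

```python
import subprocess
import time
from datetime import datetime


def probe(target, timeout_s=1):
    """Send one echo request via the system ping; True if a reply came back."""
    # Assumption: Linux iputils ping (-c 1 = one packet, -W = reply timeout in seconds).
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), target],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def outages(samples):
    """Collapse [(timestamp, ok), ...] into (first_failure, first_success_after) pairs."""
    spans, start = [], None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                    # outage begins
        elif ok and start is not None:
            spans.append((start, ts))     # outage ends
            start = None
    if start is not None:
        spans.append((start, None))       # still down at the last sample
    return spans


def watch(targets, interval_s=1.0):
    """Probe each target forever and print a line whenever its state flips."""
    state = {t: True for t in targets}    # assume everything is up at start
    while True:
        for t in targets:
            ok = probe(t)
            if ok != state[t]:
                stamp = datetime.now().isoformat(timespec="seconds")
                print(f"{stamp} {t} {'UP' if ok else 'DOWN'}", flush=True)
                state[t] = ok
        time.sleep(interval_s)
```

Calling `watch(["FileServ_1", "192.0.2.10"])` (the IP is a placeholder, not the real address) on both the desktop and the remote host would give timestamps that can be lined up against the "network topology changes" in the switch log.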

Although "the powers that be" make a strong case for their point, I have noticed "network topology changes" being reported at the switch (indicating a loop?), and I have been able to get on the serial console of "FileServ_1" and watch it while it is supposedly "down"... only problem is: it never thinks that it is down.
  • Eth1 (which until last week was the only port plugged in) never reports any issues (at least not at the default log levels), and from what I can see there is no way to tell whether the ICMP packets are dying on the way in or on the way out.

Finally, as if things were not bad enough, they decided last week to make Eth0 a redundant fail-over for Eth1... which, amazingly, seemed to lighten the problem from "a few minutes of un-ping" to "a few seconds of un-ping"... and now, instead of happening 10 times a day, it happens only once or twice.

So first things first (unless you have better ideas): I am wondering how to turn up the logging of ICMP (that's kernel level, right?) and possibly Eth* logging, so that I don't have to resort to sniffing all day until it happens. Because if nothing else, I would like to diagnose this problem correctly and get something done about it.

Any Help?
# 2  
Old 01-25-2008
This is how it can be done on the router's side; that would certainly require the net-admins to get involved. On your end, you may find the "ngrep" utility useful for tracking ICMP traffic. You can also run "netstat -s", which shows per-protocol network statistics.
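
The "netstat -s" idea can be automated so nobody has to watch it by hand. The sketch below assumes "FileServ_1" runs Linux (the thread does not say) and reads the ICMP counters from `/proc/net/snmp`; the function names are hypothetical. Run from the serial console, it prints how many echo requests arrived (`InEchos`) and how many echo replies were sent (`OutEchoReps`) in each interval.

```python
import time

# Counters of interest; the names are the ones Linux uses in /proc/net/snmp.
ICMP_FIELDS = ("InEchos", "OutEchoReps", "InErrors", "OutErrors")


def parse_icmp(snmp_text):
    """Return {counter_name: value} from the 'Icmp:' header/value line pair."""
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Icmp:")]
    if len(rows) < 2:
        raise ValueError("no Icmp: section found")
    names, values = rows[0][1:], rows[1][1:]
    return dict(zip(names, map(int, values)))


def delta(before, after, fields=ICMP_FIELDS):
    """Per-counter change between two snapshots."""
    return {f: after[f] - before[f] for f in fields
            if f in before and f in after}


def sample(path="/proc/net/snmp"):
    """Read and parse the current counters (assumes a Linux /proc)."""
    with open(path) as fh:
        return parse_icmp(fh.read())


def monitor(interval_s=5.0):
    """Print the counter deltas every interval_s seconds, forever."""
    prev = sample()
    while True:
        time.sleep(interval_s)
        cur = sample()
        print(time.strftime("%Y-%m-%d %H:%M:%S"), delta(prev, cur), flush=True)
        prev = cur
```

During an outage, the deltas suggest where to look: if neither counter moves, the requests are not reaching the server; if `InEchos` climbs but `OutEchoReps` does not, the server is receiving pings and not answering; if both climb, the replies are leaving the host and getting lost on the way back.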