Interpretation of Ping behaviour


 
# 1  
Old 10-20-2014

Hi,

I'm working on Solaris 10 and need your help interpreting some ping behaviour that I encountered.

I ping from the source host to the destination:

Code:
-bash-3.2# ping -s -t 128  10.10.10.200
PING 10.10.10.200: 56 data bytes   <===== stops here for 2 minutes before the first reply comes back
64 bytes from 10.10.10.200: icmp_seq=0. time=1.05 ms
64 bytes from 10.10.10.200: icmp_seq=1. time=7.91e+04 ms
.....
64 bytes from 10.10.10.200: icmp_seq=80. time=33.7 ms
64 bytes from 10.10.10.200: icmp_seq=81. time=0.470 ms
64 bytes from 10.10.10.200: icmp_seq=82. time=0.526 ms
64 bytes from 10.10.10.200: icmp_seq=219. time=3.71 ms
64 bytes from 10.10.10.200: icmp_seq=220. time=3.10 ms
64 bytes from 10.10.10.200: icmp_seq=221. time=11.9 ms
64 bytes from 10.10.10.200: icmp_seq=222. time=6.37 ms
64 bytes from 10.10.10.200: icmp_seq=223. time=4.57 ms
64 bytes from 10.10.10.200: icmp_seq=224. time=2.61 ms
64 bytes from 10.10.10.200: icmp_seq=225. time=4.70 ms
64 bytes from 10.10.10.200: icmp_seq=226. time=5.50 ms
64 bytes from 10.10.10.200: icmp_seq=227. time=6.08 ms
64 bytes from 10.10.10.200: icmp_seq=228. time=2.67 ms
....

What could be the cause? I also notice that replies sometimes take longer than 1 ms, with some reaching 30+ ms.
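To make the pattern in a transcript like this easier to see, here is a minimal Python sketch that scans saved ping output for missing sequence numbers and slow replies. The function name `summarize_ping` and the 30 ms threshold are assumptions chosen for illustration; a line consisting only of dots is treated as the poster's own elision, not as packet loss.

```python
import re

# Matches Solaris-style reply lines such as
# "64 bytes from 10.10.10.200: icmp_seq=82. time=0.526 ms"
REPLY = re.compile(r"icmp_seq=(\d+)\.?\s+time=([0-9.eE+-]+)\s*ms")


def summarize_ping(lines, slow_ms=30.0):
    """Return (gaps, slow) for a saved ping transcript.

    gaps: (last_seq_seen, next_seq_seen) pairs where replies are missing.
    slow: (seq, rtt_ms) pairs whose round-trip time exceeds slow_ms.
    """
    gaps, slow = [], []
    prev = None
    for line in lines:
        text = line.strip()
        if text and set(text) == {"."}:
            prev = None  # output was elided by hand here; do not count a gap
            continue
        m = REPLY.search(text)
        if not m:
            continue  # header or blank line
        seq, rtt = int(m.group(1)), float(m.group(2))
        if prev is not None and seq > prev + 1:
            gaps.append((prev, seq))
        if rtt > slow_ms:
            slow.append((seq, rtt))
        prev = seq
    return gaps, slow
```

Run against the output above, this would flag the jump from icmp_seq=82 to icmp_seq=219 (136 replies never printed) and the reply with time=7.91e+04 ms, which is about 79 seconds.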

If I use truss to look further:
Code:
16825:  xstat(2, "/etc/resolv.conf", 0x080472F8)        = 0
16825:  sysconfig(_CONFIG_OPEN_FILES)                   = 256
16825:  so_socket(PF_INET, SOCK_DGRAM, IPPROTO_IP, "", SOV_DEFAULT) = 5
16825:  connect(5, 0x0809BF20, 16, SOV_DEFAULT)         = 0
16825:  send(5, " d v01\0\001\0\0\0\0\0\0".., 44, 0)    = 44
16825:      Received signal #14, SIGALRM, in pollsys() [caught]
16825:  pollsys(0x08046CE8, 1, 0x08046CA0, 0x00000000)  Err#4 EINTR
16825:  lwp_sigmask(SIG_SETMASK, 0x00002000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
16825:  sendto(3, "\b\0 gE6 AB9\002 L90 D T".., 64, 32768, 0x0806AA90, 16) = 64
16825:  alarm(1)                                        = 0
16825:  setcontext(0x08046470)
16825:      Received signal #14, SIGALRM, in pollsys() [caught]
16825:  pollsys(0x08046CE8, 1, 0x08046CA0, 0x00000000)  Err#4 EINTR
16825:  lwp_sigmask(SIG_SETMASK, 0x00002000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
16825:  sendto(3, "\b\0D5E4 AB9\003 M90 D T".., 64, 32768, 0x0806AA90, 16) = 64
16825:  alarm(1)                                        = 0
16825:  setcontext(0x08046470)
16825:      Received signal #14, SIGALRM, in pollsys() [caught]
16825:  pollsys(0x08046CE8, 1, 0x08046CA0, 0x00000000)  Err#4 EINTR
16825:  lwp_sigmask(SIG_SETMASK, 0x00002000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
16825:  sendto(3, "\b\0\nE6 AB9\004 N90 D T".., 64, 32768, 0x0806AA90, 16) = 64
16825:  alarm(1)                                        = 0
16825:  setcontext(0x08046470)
16825:      Received signal #14, SIGALRM, in pollsys() [caught]
16825:  pollsys(0x08046CE8, 1, 0x08046CA0, 0x00000000)  Err#4 EINTR
16825:  lwp_sigmask(SIG_SETMASK, 0x00002000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
16825:  sendto(3, "\b\0 ME6 AB9\005 O90 D T".., 64, 32768, 0x0806AA90, 16) = 64
16825:  alarm(1)                                        = 0
16825:  setcontext(0x08046470)
16825:      Received signal #14, SIGALRM, in pollsys() [caught]
16825:  pollsys(0x08046CE8, 1, 0x08046CA0, 0x00000000)  Err#4 EINTR
16825:  lwp_sigmask(SIG_SETMASK, 0x00002000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
16825:  sendto(3, "\b\094E6 AB9\006 P90 D T".., 64, 32768, 0x0806AA90, 16) = 64
16825:  alarm(1)                                        = 0

I found that there are many Err#4 EINTR errors.

Can anyone shed some light? Thanks.
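In the trace above, every Err#4 EINTR from pollsys() comes directly after a "Received signal #14, SIGALRM ... [caught]" line and is followed by sendto() and alarm(1). That pattern is consistent with ping's own one-second send timer interrupting the wait for replies, not with a failing system call. The sketch below, in Python, counts EINTR returns in a saved truss log by the signal reported just before them; the function name `classify_eintr` and the "unknown" bucket are assumptions made for illustration.

```python
import re

# truss reports a delivered signal as "Received signal #14, SIGALRM, in ..."
SIGNAL = re.compile(r"Received signal #\d+, (SIG\w+)")


def classify_eintr(truss_lines):
    """Count 'Err#4 EINTR' returns by the signal seen on the previous line.

    Returns a dict of signal name -> count. EINTR lines that do not
    directly follow a signal line are counted under "unknown".
    """
    counts = {}
    pending = None  # signal name from the immediately preceding line, if any
    for line in truss_lines:
        s = SIGNAL.search(line)
        if s:
            pending = s.group(1)
            continue
        if "Err#4 EINTR" in line:
            key = pending or "unknown"
            counts[key] = counts.get(key, 0) + 1
        pending = None
    return counts
```

If every EINTR lands in the SIGALRM bucket, as it would for the excerpt above, the interruptions are most likely just the timer at work and the cause of the delay lies elsewhere.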

Last edited by ghostdog74; 10-20-2014 at 02:36 AM..
# 2  
Old 10-20-2014
Your network or network connectivity is bad.
Are other destinations affected too? If so, check the cables first.
# 3  
Old 10-20-2014
Hi,
I would also like to ask about this line in my truss output, where ping checks the /etc/resolv.conf file:
xstat(2, "/etc/resolv.conf", 0x080472F8) = 0
In this file, I notice the following:

Code:
search mydomain.com
nameserver 172.x.x.x

I suspect the delay is caused by ping querying a DNS server and getting a slow reply. As far as I know, my machine is not supposed to use the nameserver at 172.x.x.x.

How does /etc/resolv.conf work? Does Solaris use this file by default when querying DNS, or does it use /etc/hosts first?

thanks
# 4  
Old 10-20-2014
The hosts: entry in /etc/nsswitch.conf sets the host resolution order.
files corresponds to /etc/hosts, and dns corresponds to /etc/resolv.conf.
The commands nslookup and host bypass /etc/nsswitch.conf and use DNS (/etc/resolv.conf) directly.
The command getent hosts ... uses the default lookup, i.e. via /etc/nsswitch.conf.
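As a small illustration of the resolution order described above, this Python sketch extracts the sources from the hosts: line of an nsswitch.conf file. The function name `hosts_lookup_order` is hypothetical, and the parsing is deliberately simple: comments and bracketed action items such as [NOTFOUND=return] are dropped, and everything else on the line is treated as a source name.

```python
import re


def hosts_lookup_order(nsswitch_text):
    """Return the sources on the hosts: line, in order, e.g. ['files', 'dns'].

    Returns an empty list if the text has no hosts: entry.
    """
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line.startswith("hosts:"):
            continue
        rest = line[len("hosts:"):]
        rest = re.sub(r"\[[^\]]*\]", " ", rest)  # drop [STATUS=action] items
        return rest.split()
    return []
```

For a line such as "hosts: files dns", the result is ['files', 'dns'], meaning /etc/hosts is consulted before the DNS servers listed in /etc/resolv.conf. On a real system it could be fed with open("/etc/nsswitch.conf").read().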
# 5  
Old 10-20-2014
Quote:
Originally Posted by MadeInGermany
The hosts: entry in /etc/nsswitch.conf sets the host resolution order.
files corresponds to /etc/hosts, and dns corresponds to /etc/resolv.conf.
The commands nslookup and host bypass /etc/nsswitch.conf and use DNS (/etc/resolv.conf) directly.
The command getent hosts ... uses the default lookup, i.e. via /etc/nsswitch.conf.
Hi, thanks.
So if I want to force the OS not to query DNS, can I remove the nameserver line from resolv.conf, or remove /etc/resolv.conf altogether?
# 6  
Old 10-20-2014
You can remove dns from the hosts: entry in /etc/nsswitch.conf.
Deleting /etc/resolv.conf has no effect.
BTW, host resolution has nothing to do with your ping problem. You pinged an IP address, so there is nothing to resolve.
# 7  
Old 10-20-2014
/etc/resolv.conf just lists the DNS servers to query, in order. You may need to modify that. We use several DNS servers in our network: two Infoblox appliances and one Windows domain controller.

As MadeInGermany said, /etc/nsswitch.conf controls where to look in general.

Is your name service cache daemon running? Turn on DNS caching.

Code:
/fmd> svcs /system/name-service-cache
STATE          STIME    FMRI
online         Oct_17   svc:/system/name-service-cache:default

It should say 'online'.

Next, check the performance of the caching with:
Code:
nscd -g

You want to see:
Code:
CACHE: hosts

         CONFIG:
         enabled: yes
         per user cache: no
         avoid name service: no
         check file: yes
         check file interval: 0
         positive ttl: 3600
         negative ttl: 5
         keep hot count: 20
         hint size: 2048
         max entries: 0 (unlimited)

         STATISTICS:
         positive hits: 39
         negative hits: 2
         positive misses: 2
         negative misses: 3
         total entries: 2
         queries queued: 0
         queries dropped: 0
         cache invalidations: 0
         cache hit rate:       89.1


CACHE: ipnodes

         CONFIG:
         enabled: yes
         per user cache: no
         avoid name service: no
         check file: yes
         check file interval: 0
         positive ttl: 3600
         negative ttl: 5
         keep hot count: 20
         hint size: 2048
         max entries: 0 (unlimited)

         STATISTICS:
         positive hits: 1104
         negative hits: 2
         positive misses: 25
         negative misses: 3
         total entries: 4
         queries queued: 0
         queries dropped: 0
         cache invalidations: 18
         cache hit rate:       97.5
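The "cache hit rate" figures in this output can be reproduced from the four counters above them. The formula in this Python sketch is an assumption that happens to match both sections of the sample output; it is not taken from the nscd documentation, and the function name `cache_hit_rate` is made up for illustration.

```python
def cache_hit_rate(pos_hits, neg_hits, pos_misses, neg_misses):
    """Hit rate in percent: all hits divided by all lookups.

    Assumed formula; returns 0.0 when there have been no lookups.
    """
    hits = pos_hits + neg_hits
    total = hits + pos_misses + neg_misses
    return 100.0 * hits / total if total else 0.0


# hosts cache from the sample output: 41 hits out of 46 lookups
hosts_rate = round(cache_hit_rate(39, 2, 2, 3), 1)

# ipnodes cache from the sample output: 1106 hits out of 1134 lookups
ipnodes_rate = round(cache_hit_rate(1104, 2, 25, 3), 1)
```

With the counters shown, hosts_rate comes out at 89.1 and ipnodes_rate at 97.5, matching the values nscd -g reports. A rate that stays low under steady load would suggest the cache is too small or entries expire too quickly.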

You may need to increase your local DNS cache size. Primarily, what you need is a sysadmin or network admin who knows this stuff and is not following a rote playbook for how to maintain a network.

The optimal solution for DNS problems like this is most often to set up caching DNS servers and turn off nscd.

As a side note, there is a slight chance that your cache is becoming stale, possibly because a DNS server has problems. If the caching is otherwise working, you may want to bounce the nscd process, which clears the caches. If the problem persists on an immediate rerun, then you have other issues, which IMO tend to be nasty.

N.B.:
This kind of advice is hard to give without actually being there; there are too many moving parts to do a decent job vicariously like this.