Network related issues
#1  ggayathri, Registered User, 2 Weeks Ago
Of late, a few of our servers have been experiencing severe slowness. Which commands should I run to do a post-mortem of the situation?
#2  jim mcnamara, Forum Staff, 2 Weeks Ago
What OS? Your other post mentions AIX. Getting data from the past is really difficult unless you had already set up monitoring or auditing.

If you have detailed logs from applications, you can sometimes infer that application A has been taking longer and longer to complete.

Many kinds of problems are sporadic or are hard to reproduce. These can only be found by creating monitors before the fact.

Please give us more system details: specific OS, main application(s) for the system.
Example: AIX 7.3, Sybase server on SAN.
#3  rbatte1, Forum Staff, 2 Weeks Ago
Some wild guesses:-
  • Loss of access to DNS server (slow reverse IP lookup for auditing, so slow login or application)
  • Database locks - hugely dependent on your application
  • Missing database index causing full table scans
  • Poor data queries, e.g. get all records from the database then check each in turn on criteria rather than building the condition into the query
  • Database log files filling and flushing too slowly
  • Exhausting real memory causing paging (potentially DB consuming too much real memory)
  • Network speed/duplex mismatch, e.g. if the NIC is 10M-half and the switch is 100M-full, it will work, but any file transfer will cripple it with lots of dropped packets.
  • IO issues, especially with NFS or an HA cluster if you fail over
  • Scheduled work, e.g. current stock summary
  • Ad-hoc jobs, e.g. current stock summary
  • Resource stealing by another LPAR if the definitions allow it
  • Large write volume to direct disk (e.g. local) rather than cached disk (RAID or SAN etc.)
  • High NFS contention especially with other seemingly unrelated servers
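
A few of these guesses can be given a rough first check from the shell. The sketch below is Linux-flavoured and only an illustration: the function names are made up for this post, and getent and ethtool may not exist on your platform (on AIX you would look at host and entstat instead). Nothing in it changes system state.

```shell
#!/bin/sh
# Rough first-pass checks for three of the guesses above.
# All function names are hypothetical helpers, not standard tools.

# Run a command and print how many whole seconds it took.
# Assumes a date that supports +%s (epoch seconds).
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || true   # the exit status is ignored on purpose
    end=$(date +%s)
    echo $((end - start))
}

# Slow reverse DNS? Pass the IP address of a client that sees slow logins.
check_dns() {
    echo "reverse lookup of $1 took $(elapsed getent hosts "$1")s"
}

# Paging? Watch the swap-in/swap-out columns (si/so on Linux, pi/po on AIX).
check_paging() {
    if command -v vmstat >/dev/null 2>&1; then
        vmstat 5 3
    fi
}

# Speed/duplex mismatch? Pass the interface name; ethtool is Linux only.
check_duplex() {
    if command -v ethtool >/dev/null 2>&1; then
        ethtool "$1" 2>/dev/null | grep -E 'Speed|Duplex' || true
    fi
}
```

Typical use would be something like check_dns with a real client address, check_paging, then check_duplex with your interface name. A lookup that takes several seconds, or steadily non-zero paging columns, narrows the list considerably.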

You can see it is a very very VERY wide spread of options so far - and the list is a long way from being exhaustive. You need to be a fair bit more explicit about what you have (including OS), what goes slow, what's happening at the time, and what dependencies you have on other servers.



Robin
#4  otheus, Forum Advisor, 2 Weeks Ago
Most *NIX systems (AIX, Linux, Solaris, BSD) keep some kind of system activity and accounting records. You can run
Code:
sar

to see if it is properly deployed on your system. If you run it and get loads of output, you may be in luck. To use it, refer to the man pages. Typically you want to check options for memory and swap usage, CPU usage, and I/O activity.
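
As a rough sketch of what "check memory, swap, CPU and I/O" looks like in practice: the flags below are the ones from Linux sysstat's sar, and other platforms use different letters, so treat them as an assumption and check your man page. The function name and the example file path are hypothetical.

```shell
#!/bin/sh
# Print the usual historical reports from one sar data file.
# Flags are those of Linux sysstat; AIX/Solaris sar differ.
sar_overview() {
    file=$1   # e.g. /var/log/sa/sa05 on RedHat/CentOS (assumed layout)
    if ! command -v sar >/dev/null 2>&1; then
        echo "sar not found; on Linux install the sysstat package" >&2
        return 1
    fi
    # -u CPU, -r memory, -S swap, -b I/O transfer rates, -q run queue and load
    for flag in -u -r -S -b -q; do
        echo "== sar $flag -f $file =="
        sar "$flag" -f "$file"
    done
}
```

Run it as sar_overview followed by the data file for the day the slowness occurred, then look for the time window where %iowait, swap usage or the run queue jumps.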

If it's not installed, consider deploying this first before installing some complex monitoring software; it's a very standard Unix utility that has been around for ages, but the implementation and features vary from platform to platform. On Linux, install the sysstat package.

On most systems, sar's data is collected through another program which is run as a cronjob. On a typical RedHat/CentOS Linux system, you will find /etc/cron.d/sysstat to contain:


Code:
* * * * * root /usr/lib64/sa/sa1 -S XALL 1 1

which I immediately change to


Code:
*/5 * * * * root /usr/lib64/sa/sa1 -L -S XALL 10 30

The original form collects data once per minute, which is often simply not enough granularity to get a feel for rapid changes to the system, the kind that cause instability and crashes. Also, if memory becomes extremely scarce, cron might not be able to spawn the job every minute.

My form, however, spawns a new job every 5 minutes. It writes 30 records, one every 10 seconds. The corresponding reports contain enough detail to know very precisely when the problem started. You will need an additional 1.5 GB of disk space on /var/log if you do this.
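
The arithmetic behind the two trailing numbers (interval in seconds, then count) can be checked quickly. The helper names here are made up for illustration.

```shell
#!/bin/sh
# Seconds covered by one sa1 run: interval * count.
sa1_coverage() {
    echo $(( $1 * $2 ))
}

# Samples written per day at a given sampling interval in seconds.
samples_per_day() {
    echo $(( 86400 / $1 ))
}

sa1_coverage 10 30    # 300 seconds, exactly the 5-minute cron period
samples_per_day 10    # 8640 samples per day
samples_per_day 60    # 1440 samples per day at one sample per minute
```

So each run fills the whole gap until cron starts the next one, and you end up with six times as many records as with one sample per minute.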

If you want graphs and pretty output, you may be able to export the data into graphing engines or spreadsheets. Linux's sar has such a program (sadf), and other related projects can slurp up the data and present graphs.
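
A minimal sketch of such an export on Linux, assuming sysstat's sadf is installed: its -d option writes semicolon-separated lines that most spreadsheets can import, and everything after -- is passed through as sar options. The wrapper name and the file paths are hypothetical.

```shell
#!/bin/sh
# Export one sar data file as semicolon-separated text.
# Usage: sadf_export DATAFILE [sar options...]
sadf_export() {
    file=$1
    shift
    if ! command -v sadf >/dev/null 2>&1; then
        echo "sadf not found (it ships with sysstat)" >&2
        return 1
    fi
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C sadf -d "$file" -- "$@"
}
```

For example, calling sadf_export with /var/log/sa/sa05 and -u, redirected to a file such as cpu.csv, should give one line of CPU figures per sample.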