Threshold for open connections


 
# 1  
Old 05-30-2012

Hey guys,

We've been having issues on one of our CentOS 6 servers with one of the Java programs that runs on it. As the software hasn't caused issues in the past, I'm wondering if it's a problem with the CentOS server.

Basically, the software drops its TCP connection and won't reconnect, timing out on each attempt. I'm wondering if the number of open TCP connections on the server has exceeded a limit? Where would a TCP limit be held? I've had a look at /proc/sys/net/core/somaxconn and the limit in there is 128 (which seems quite low, although it is the default on our other 5 CentOS boxes with no issues). Would this benefit from being increased? Also, does anyone have any ideas on what else I could check? I've had a look in /etc/sysctl.conf and there's nothing in there that relates to any incoming connection limit.
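For reference, somaxconn is not a cap on the total number of open connections: it caps the backlog argument of listen(), i.e. how many fully established connections may wait to be accepted on one listening socket. A minimal Python sketch of reading the value and of how the cap applies; the helper names read_sysctl_int and effective_backlog are made up for illustration:

```python
from pathlib import Path


def read_sysctl_int(path, default=None):
    """Return the first integer stored in a /proc/sys file, or default if unreadable."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return default


def effective_backlog(requested, somaxconn):
    """The kernel caps the backlog passed to listen() at somaxconn."""
    return min(requested, somaxconn)


if __name__ == "__main__":
    limit = read_sysctl_int("/proc/sys/net/core/somaxconn")
    print("somaxconn:", limit)
    if limit is not None:
        # A server asking for a backlog of 1024 would only get `limit` slots.
        print("backlog actually granted for listen(fd, 1024):",
              effective_backlog(1024, limit))
```

Raising the value would only matter if the accept queue of a listening socket were overflowing, which is a different symptom from an established connection being dropped.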

Many thanks in advance for any help, it's driving me crazy!

Jim
# 2  
Old 05-30-2012
Depending on the type of network connection (I suppose TCP/IP?) there are a few limits, but I doubt that you have hit them:

A normal TCP connection (say, a telnet session) is done via a "virtual channel": when it is established, a "socket" connection is defined, via which all the communication is done. (A socket is a Layer-4 addressing device: an IP address [=layer 3] combined with a port number [=layer 4].) The communication runs from host-a:port-x to host-b:port-y and vice versa. Once the session is closed, this socket connection is torn down and the used ports are released and ready to be used again.

So, in a way, the number of available ports is a limiting factor, but as they are 16-bit numbers (1-65535) this isn't all too imposing.
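On Linux, the local ports used for outgoing connections come from a narrower range than the full 16 bits, stored in /proc/sys/net/ipv4/ip_local_port_range as two numbers (first and last port). Incoming connections to a listening port do not consume ports from this range. A small Python sketch of checking how many ports the range offers; the function names are illustrative only:

```python
from pathlib import Path

PORT_RANGE_FILE = "/proc/sys/net/ipv4/ip_local_port_range"


def parse_port_range(text):
    """Parse 'low high' as found in ip_local_port_range."""
    low, high = (int(field) for field in text.split()[:2])
    return low, high


def usable_ports(low, high):
    """Number of ports in the range, both ends inclusive."""
    return high - low + 1


if __name__ == "__main__":
    try:
        low, high = parse_port_range(Path(PORT_RANGE_FILE).read_text())
        print(f"ephemeral ports {low}-{high}: {usable_ports(low, high)} available")
    except (OSError, ValueError):
        print("could not read", PORT_RANGE_FILE)
```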

Another limit is the available memory. Some OSes have tuning options for how much memory is set aside for various aspects of the networking stack (for instance TCP reassembly buffers), but I don't know how this is done in Linux, just that this exists. Hopefully someone more knowledgeable regarding Linux than me will fill in this gap.

Still, modern systems on modern networks usually have enough memory to handle the network with ease. While this should be investigated (to make sure the problem doesn't sit there) it is rather unlikely that this is the problem.

What might be a problem: if the application runs under a normal UID, its ulimits might be too low for what it needs.
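The ulimit most relevant to connections is the number of open files, since every socket uses a file descriptor. As a rough sketch, Python's standard resource module can show the soft and hard limit of the current process; describe_limit and nofile_limits are helper names invented for this example:

```python
import resource


def describe_limit(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def nofile_limits():
    """Return (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("open files: soft =", describe_limit(soft), "hard =", describe_limit(hard))
```

This reports the limits of the process that runs the script. The limits a service was started with can differ from those of an interactive shell; on Linux, the limits of an already running process are listed in /proc/<pid>/limits.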

What does the application do when it drops the sessions? Does it throw a coredump or does it close normally or does it still run? What do you have to do to revive the system? Reboot? Restart of the application? Just wait?

The more you tell us about the system, the easier it is to give concrete advice instead of some general chit-chat. So help us help you and explain in more detail what your system looks like (versions, how much memory/processors/swap space, which network connections), what it does (running applications, what they are doing, how many logins/how much network traffic/how much disk I/O) and post some traces: "vmstat", "iostat", etc.

I hope this helps.

bakunin
# 3  
Old 05-30-2012
Thanks for your reply, bakunin.

The application itself is run under root and its ulimit is unlimited. When the application drops its connection, it constantly attempts to reconnect and just logs this every 30 seconds, as the service is still running. To get it to work again, we simply restart the application and, on the very odd occasion, the full server. I don't think memory is a factor: all 6 servers have 16GB of RAM installed and have around 20GB of swap.

I'm probably going off on a tangent, but the number of connections was something that popped into my head.

Jim
# 4  
Old 05-31-2012
Could you post the output of vmstat 1 during the loss of connection?

You say that the application runs for some time and only then stops. This sounds like some resource being used and not being released properly, so that it exhausts over time. It might be that "vmstat" is showing some effects of this.
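One resource that can leak this way is file descriptors (each socket holds one). A hedged Python sketch, assuming a Linux /proc filesystem: it counts the entries in /proc/<pid>/fd and checks whether a series of such counts keeps growing. The function names and the idea of sampling at intervals are illustrative, not something taken from the application in question:

```python
import os


def count_open_fds(pid):
    """Return the number of open file descriptors of pid, or None if it cannot be read."""
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        return None


def is_growing(samples, min_increase=1):
    """True if every sample exceeds the previous one by at least min_increase."""
    pairs = zip(samples, samples[1:])
    return len(samples) >= 2 and all(b - a >= min_increase for a, b in pairs)


if __name__ == "__main__":
    # Uses this script's own PID as a stand-in; substitute the PID of the
    # Java process to watch the real application.
    print("open descriptors:", count_open_fds(os.getpid()))
```

Recording the count every few minutes (from cron, for instance) and comparing it with the open files limit would show whether the application creeps toward exhaustion before the connection drops.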

To be honest, there is no obvious reason one could point at. So good old problem-solving skills and patient investigation are what is called for.

I hope this helps.

bakunin

Moderator's Comments:
As this is far from being a "Unix for Dummies" problem, I am going to transfer the thread from here to the experts section.
# 5  
Old 05-31-2012
I've seen this behavior when clients do not close the connection properly, resulting in lots of connections in state TIME_WAIT or CLOSE_WAIT when viewed with netstat -n on the server.
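A quick way to see this is to tally the State column of the netstat output. The Python sketch below assumes the usual Linux netstat -n layout (protocol in the first column, state in the sixth); the SAMPLE text is invented for illustration and is not output from the server discussed here:

```python
from collections import Counter

# Invented sample in the layout printed by `netstat -tn` on Linux.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:8080           10.0.0.9:51234          ESTABLISHED
tcp        1      0 10.0.0.5:8080           10.0.0.9:51240          CLOSE_WAIT
tcp        1      0 10.0.0.5:8080           10.0.0.7:40112          CLOSE_WAIT
tcp        0      0 10.0.0.5:8080           10.0.0.8:39001          TIME_WAIT
"""


def count_tcp_states(netstat_output):
    """Count TCP connections per state in `netstat -n` style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Header lines and non-TCP sockets are skipped.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


if __name__ == "__main__":
    for state, count in count_tcp_states(SAMPLE).most_common():
        print(f"{state:12} {count}")
```

On a live system the text could come from subprocess.run(["netstat", "-tn"], capture_output=True, text=True).stdout instead of SAMPLE. A steadily growing CLOSE_WAIT count usually means the local application is not closing its sockets after the peer has closed its side.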
# 6  
Old 05-31-2012
bakunin - thanks for your reply. Once the issue happens again, I'll post the output of vmstat. It's a peculiar problem in that it could happen today and then not happen again for a day, a week or a month; there's no pattern to it.

cero - thanks for your reply. When the problem arises again, I'll check the open connections.

Thanks guys, your advice and help are appreciated.