UNIX & Linux Forums


shibz 03-23-2002 05:21 AM

Time_wait ??

My machine is an Enterprise 250 running Solaris 2.6, with Oracle 9i Application Server (1022) and Apache 1.3.

The problem is that the machine appears slow when accessed remotely. Logging in takes a long time to connect, typed characters appear only after a delay, and so on...

I checked with netstat -a: there are nearly 175 connections in TIME_WAIT originated by java.
(Found out using lsof; thanks to the forum for the 'lsof' information.)
In my earlier thread I was told that TIME_WAIT does not cause any problems. (But here something seems odd.)

Is this causing the problem? I have a standby machine with the same configuration (but Application Server and Apache are not running) on the same network. It works well remotely.

Hope to receive some information.

Thanks in Advance,

halfling 03-26-2002 06:30 PM

Man, I can tell this post is gonna be long already ;)

:p :p Disclaimer :p :p These are only suggestions for research, and should be tested in a development environment... as everything should... before being put into production.

TIME_WAIT states aren't too great an indication of a problem, especially on web servers. If people repeatedly press the stop button in their browser, it can create TIME_WAIT states on the server :D How long connections stay in this state is tunable, read below...

This sounds more like a link, cable, hardware, or network congestion problem than an application issue.

You're on the right track with netstat, but use the -s option. That'll give you some nice stats. With netstat -s, the section you want to pay attention to is TCP, pipe it to more or a file for easier reading. Some values you want to look at are:

tcpOutDataSegs =456318
tcpRetransSegs = 86
tcpInInorderSegs =229512
tcpInUnorderSegs = 0
tcpInDupSegs = 34

I've included some sample numbers.

Divide the tcpRetransSegs by tcpOutDataSegs and multiply by 100 to get your retransmission percent. In a LAN environment on an intranet server, the retrans percent should be below 1%... for an average work-load server.

If the retrans rate is too high, there could be a cable problem, or the hardware is failing.
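To make the arithmetic concrete, here is a rough sketch that pulls the two counters out of netstat -s style text and prints the retransmission percent. The function name retrans_pct is made up for this example, and it assumes a POSIX awk (on Solaris you would likely want nawk rather than the old /usr/bin/awk).

```shell
# retrans_pct (hypothetical helper): read `netstat -s` style text on stdin
# and print tcpRetransSegs / tcpOutDataSegs * 100.
retrans_pct() {
    # netstat -s prints "name =value", so turn "=" into a space first
    tr '=' ' ' | awk '
        { for (i = 1; i < NF; i++) {
              if ($i == "tcpOutDataSegs") out = $(i + 1)
              if ($i == "tcpRetransSegs") re  = $(i + 1)
          } }
        END { if (out > 0) printf "%.3f\n", re / out * 100 }'
}

# Sample numbers from the post above
sample='tcpOutDataSegs =456318
tcpRetransSegs = 86'
printf '%s\n' "$sample" | retrans_pct    # prints 0.019
```

On a live box the same function would be fed directly, as in netstat -s | retrans_pct. With the sample numbers the rate is about 0.019%, well under the 1% mark.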

I've also seen problems with Sun equipment and Cisco switches not autonegotiating properly. You may want to try setting the speed manually. In /etc/system, add the following (assuming a 100 Mbit full duplex connection and an hme interface; /etc/system changes take effect at the next reboot):

set hme:hme_adv_autoneg_cap=0
set hme:hme_adv_100fdx_cap=1
set hme:hme_adv_100hdx_cap=0
set hme:hme_adv_10fdx_cap=0
set hme:hme_adv_10hdx_cap=0
set hme:hme_adv_100T4_cap=0

Also, if needed, have your network admin set the port speed on his end.
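To check what the interface actually negotiated, the hme driver exposes link_status, link_speed and link_mode through ndd. The sketch below only translates those three values into words; describe_link is a made-up helper name, and the commented ndd lines assume the first hme instance and root access.

```shell
# describe_link (hypothetical helper): args are the hme ndd values
#   link_status (1 = up), link_speed (1 = 100 Mbit, 0 = 10 Mbit),
#   link_mode   (1 = full duplex, 0 = half duplex)
describe_link() {
    case $1 in 1) st=up ;;            *) st=down ;;          esac
    case $2 in 1) sp="100 Mbit" ;;    *) sp="10 Mbit" ;;     esac
    case $3 in 1) md="full duplex" ;; *) md="half duplex" ;; esac
    echo "link $st, $sp, $md"
}

# On the Sun box, as root (assumes instance 0):
#   ndd -set /dev/hme instance 0
#   describe_link "$(ndd -get /dev/hme link_status)" \
#                 "$(ndd -get /dev/hme link_speed)" \
#                 "$(ndd -get /dev/hme link_mode)"

# Example with hand-typed values: a duplex mismatch candidate
describe_link 1 1 0    # prints: link up, 100 Mbit, half duplex
```

If the box reports half duplex while the switch port is forced to full (or the other way round), that mismatch alone can produce the kind of sluggishness described in the first post.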

traceroute from/to the box, do a little snoop'ing... any errors in the /var/adm/messages logs? Ftp a large file, setting hash marks, and see how fast it is...

Here are also some more aggressive tcp settings for web servers. I'd recommend using ndd to set these, perhaps in an /etc/rc2.d/S69inet_mods script (make sure you grab your current values so you can revert to them on the fly):

# how long to keep connections in a TIME_WAIT state. Default 4min, new is 1min
ndd -set /dev/tcp tcp_close_wait_interval 60000

# max number of completed connections waiting to be accepted by the application, default 128, new is 1024
ndd -set /dev/tcp tcp_conn_req_max_q 1024
# max number of incomplete connections (ie: SYN_RCVD state), default 1024, new is 4096
ndd -set /dev/tcp tcp_conn_req_max_q0 4096

# how long to keep retransmitting on an ESTABLISHED connection before giving up, default 8min, new is 1min
ndd -set /dev/tcp tcp_ip_abort_interval 60000
# how long a connection may sit idle before keepalive probes start, default 2hr, new is 15min (900000 ms)
ndd -set /dev/tcp tcp_keepalive_interval 900000

# how long to wait before retransmitting, default 3sec, no change
ndd -set /dev/tcp tcp_rexmit_interval_initial 3000
# max wait time for retransmission after initial retransmit, default 1min, new is 10sec
ndd -set /dev/tcp tcp_rexmit_interval_max 10000
# minimum wait before retransmitting after the initial retransmit, default .4sec, new is 3sec (this value "grows" to max on subsequent retransmissions)
ndd -set /dev/tcp tcp_rexmit_interval_min 3000

# increase the send/receive buffers, default 8192, new is 32768
ndd -set /dev/tcp tcp_xmit_hiwat 32768
ndd -set /dev/tcp tcp_recv_hiwat 32768
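On grabbing the current values first: one possible approach is to read each parameter with ndd -get and write out the matching ndd -set line, so the output is itself a revert script. This is only a sketch; save_tcp_settings and the /var/tmp/tcp_revert.sh path are made-up names, and the NDD variable exists only so the function can be tried with a stand-in command on a box without ndd.

```shell
# save_tcp_settings (hypothetical helper): for each /dev/tcp parameter named
# on the command line, print the "ndd -set" line that restores its current value.
save_tcp_settings() {
    for p in "$@"; do
        # NDD can point at a stand-in command for dry runs; defaults to ndd
        v=$(${NDD:-ndd} -get /dev/tcp "$p") || return 1
        printf 'ndd -set /dev/tcp %s %s\n' "$p" "$v"
    done
}

# On the Sun box, as root, before changing anything:
#   save_tcp_settings tcp_close_wait_interval tcp_conn_req_max_q \
#       tcp_conn_req_max_q0 tcp_ip_abort_interval tcp_keepalive_interval \
#       tcp_rexmit_interval_initial tcp_rexmit_interval_max \
#       tcp_rexmit_interval_min tcp_xmit_hiwat tcp_recv_hiwat \
#       > /var/tmp/tcp_revert.sh
# To back out later:
#   sh /var/tmp/tcp_revert.sh
```

Values set with ndd do not survive a reboot, so rebooting is the other way back to the defaults as long as the startup script has been removed.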

Managing the TIME_WAIT queue does take system resources, but 175 isn't anything to worry about. The tcp_close_wait_interval above will lower the amount of time connections will stay in this state.
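To keep an eye on that number over time, something like the following tallies connections per TCP state from netstat -an style output. It is a sketch under assumptions: count_states is a made-up name, the state is taken to be the last column of each connection line (as in Solaris netstat output), and the sample addresses are invented purely for illustration.

```shell
# count_states (hypothetical helper): read `netstat -an` style text on stdin
# and print "STATE count" for every all-uppercase state in the last column.
count_states() {
    LC_ALL=C awk 'NF > 1 && $NF ~ /^[A-Z][A-Z_0-9]*$/ { n[$NF]++ }
                  END { for (s in n) print s, n[s] }' | LC_ALL=C sort
}

# Invented sample lines in the Solaris layout
sample='TCP
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q  State
-------------------- -------------------- ----- ------ ----- ------ -------
10.0.0.5.80          10.0.0.9.40112        8760      0  8760      0 TIME_WAIT
10.0.0.5.80          10.0.0.9.40113        8760      0  8760      0 TIME_WAIT
10.0.0.5.22          10.0.0.7.1023         8760      0  8760      0 ESTABLISHED'
printf '%s\n' "$sample" | count_states
```

On the real machine this would be netstat -an | count_states, run a few times while the slowness is happening to see whether any state is piling up.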

The tcp_conn_req_max_q and tcp_conn_req_max_q0 settings are important for web servers, as they directly affect how many pending connections the server can queue at any one moment. Larger queues should help the box ride out SYN-flood style DoS attacks, though they won't stop every attack.

You may also need to increase file descriptor limits in /etc/system, if you haven't already. These settings aren't too aggressive, but more than default values:

set rlim_fd_cur=256
set rlim_fd_max=1024

Hmmm... man, good luck... any more information or errors you can provide would be helpful

halfling 03-26-2002 06:34 PM

Should have asked this basic question first. Have you tried shutting down Oracle and Apache for a time, then testing the connection?

That will more clearly define whether it's a system resource issue or a possible hardware problem.

shibz 03-29-2002 11:04 AM

Thanks for the inputs. Currently the server is up and running, and I don't have clearance for downtime. Of course I will come back with further details after my next shutdown.

Thanks again,
