AIX 6.1 reaches the Streams threshold (no -a | grep strthresh)
Last night I wanted to do a full Oracle backup with expdp, but when I switched to the oracle user it hung. It looks like:
su - oracle
There is no feedback at all and it hangs, but su - root works fine.
Then I ran truss su - oracle and found it stuck at "ENOSR". I changed the kernel parameter strthresh from 85 to 90, and after that the su - oracle command worked fine.
The no -a command output:
I also found that the "delayed" column in the netstat -m output is not 0. It looks like:
My questions are:
1. How can I find out the Streams usage on AIX 6.1?
I now suspect the problem is a network issue, but I don't know how to confirm that.
thanks!
tony
2013/2/17
What strthresh means:
AIX has another no option called "strthresh" which is defined as "Specifies the maximum number of bytes Streams are normally allowed to allocate. When the threshold is passed, does not allow users without the appropriate privilege to open Streams, push modules, or write to Streams devices, and returns ENOSR. The threshold applies only to output side and does not affect data coming into the system (e.g. console continues to work properly). A value of zero means that there is no threshold. The strthresh attribute represents a percentage of the thewall attribute and you can set its value from 0 to 100. The thewall attribute indicates the maximum number of bytes that can be allocated by Streams and Sockets using the net_malloc() call. When you change thewall attribute, the threshold gets updated accordingly." Thank you for using AIX Support Family Services.
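Since strthresh is a percentage of thewall, the byte limit behind each setting can be worked out by hand. Below is a rough sketch of that arithmetic, not an official formula. It assumes thewall is reported in 1 KB units and uses the thewall value quoted later in this thread (47448064); strthresh_kb is just a helper name made up for this example. On a live system the two inputs would come from no -o thewall and no -o strthresh.

```shell
# thewall is assumed to be in 1 KB units; strthresh is a percentage of it.
# The numbers are the ones quoted in this thread, not read from a live system.
strthresh_kb() {
    # $1 = thewall in KB, $2 = strthresh in percent (0 means no threshold)
    awk -v wall="$1" -v pct="$2" 'BEGIN {
        if (pct == 0) { print "unlimited"; exit }
        printf "%d\n", wall * pct / 100
    }'
}

# Compare the three settings mentioned in the thread.
for pct in 85 90 92; do
    echo "strthresh=$pct -> $(strthresh_kb 47448064 "$pct") KB"
done
```

This only shows where the limit sits for a given setting. It does not show how much of it is currently in use.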
sb_max, at 4 MB, looks large enough, but I would increase tcp_sendspace and tcp_recvspace to 256k or 512k, rather than 64k. Note that an application can override the defaults, so maybe your real sizes are larger already.
How much real memory?
Thanks for your reply. The physical memory size is 96 GB.
The application running on this machine is Oracle 11gR2 RAC. I set tcp_sendspace according to the Oracle manual, and I have done the same on other machines many times without ever facing this problem. How can I monitor the Streams usage in AIX?
A rather simple way to monitor socket activity (aka streams), especially for blockage is to look at netstat -tn output.
What you are looking for is numbers in the Send-Q and/or Recv-Q. If they are consistently at the sendspace/recvspace size then you may be suffering from network congestion outside the box: TCP is doing what it can, then stopping and waiting for acknowledgements (Send-Q at max), while the "outside" is waiting for the server to wake up and respond when the Recv-Q is "stuck" at max.
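A quick way to pick out such connections is to filter the netstat output on the two queue columns. This is only a sketch: it assumes the column order Proto, Recv-Q, Send-Q, Local Address, Foreign Address, state. The function name full_queues and the two sample lines are made up for illustration; on a real system you would pipe the netstat output into the function instead.

```shell
# Print TCP connections whose Recv-Q or Send-Q is at or above a limit.
# Assumed column order: Proto, Recv-Q, Send-Q, Local, Foreign, State.
full_queues() {
    # $1 = queue size in bytes to treat as "full" (e.g. tcp_sendspace)
    awk -v limit="$1" '$1 ~ /^tcp/ && ($2 + 0 >= limit || $3 + 0 >= limit) {
        print $4, $5, "Recv-Q=" $2, "Send-Q=" $3
    }'
}

# Made-up sample lines, for illustration only.
sample='tcp4       0      0  10.0.0.1.1521   10.0.0.9.40112  ESTABLISHED
tcp4       0  65536  10.0.0.1.1521   10.0.0.8.51234  ESTABLISHED'

printf '%s\n' "$sample" | full_queues 65536
```

A single hit means little. It is the same connection showing up as full run after run that points to a blockage.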
I have looked at netstat -nm again. It is normal that there are some "delayed" numbers. Not sure why - probably has something to do with setting up the stack. What you want to watch for is "failed" - as that indicates, mainly, not enough memory for communications.
Question: as this sometimes occurs: are you using large sends (e.g., MTU of 9000) while the network and/or endpoints cannot support that?
1. The "failed" count in the netstat -m output is consistently zero.
2. MTU
Both en1 and en2 have an MTU of 1500.
The application running on this machine is Oracle 11gR2 RAC, and the client is Tuxedo middleware.
3. First output
4. Second output
5. netstat -p tcp output
6. This problem has appeared twice. The first time I changed strthresh from 85 to 90, the second time from 90 to 92. A few days ago an IBM engineer told me I could set strthresh to 0 so that Streams has no limit. I have not done that, because I am worried that with 0 the Streams usage could reach 100% and the whole system would hang until I reboot it from the HMC or something.
7. The system was rebooted a week ago, and the hang when switching user was found before that, so the netstat statistics were lost.
Thanks a lot for your help.
Quote:
and now i suspect the problem is network issue but i don't know how to affirm that.
I have been approaching this as an AIX configuration issue because changing a setting has helped it "go away". Needed: a better definition of what you mean by "network issue".
Also needed: some data from during/after the problem (during: repeating commands to look for deltas helps pinpoint what the system is trying to say).
Sort of: no Pain, no Gain.
In any case - real data values - during a backup are needed to know if we are looking at this properly.
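Collecting those deltas can be as simple as running the same command every few seconds with a timestamp in front. A minimal sketch follows; sample_cmd is a made-up helper, and the log path in the comment is only a placeholder.

```shell
# Run a command repeatedly and prefix each run with a timestamp, so that
# counters can be compared between runs.
sample_cmd() {
    # $1 = number of samples, $2 = seconds between samples, rest = command
    count=$1; interval=$2; shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        i=$((i + 1))
        echo "=== $(date '+%H:%M:%S') sample $i"
        "$@"
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# During a backup this could be, for example:
#   sample_cmd 10 30 netstat -m >> /tmp/netstat_m.log   (placeholder path)
sample_cmd 2 1 echo "replace echo with the command to watch"
```

Comparing the same counter between two timestamps then shows what moved while the problem was happening.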
FYI: not sure what the limits are these days. Back when CHRP (Common Hardware Reference Platform) first came out IP buffers were limited to 4x 256MB memory, or 1G - up to 50% of memory (so, when more than 2G of memory, maximum was 1G)
no -o thewall tells us the HW limit (in 1k units), so roughly: drop 6 digits and you get the GByte value. On my system with 9G that still makes it near 50%.
Your number, thewall = 47448064, comes down to 47.
I do not see this as being your limiting factor - unless it is conflicting with something else. However, there is a second variable to set a limit under thewall.
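For reference, the same rule of thumb done exactly, as a small sketch. It assumes thewall is in 1 KB units and uses the 96 GB of real memory mentioned earlier in the thread; wall_summary is a name invented for this example. Dividing by 1024*1024 gives a slightly lower figure than simply dropping six digits.

```shell
# Convert thewall (assumed to be in 1 KB units) to GiB and to a share of RAM.
wall_summary() {
    # $1 = thewall in KB, $2 = real memory in GiB
    awk -v wall="$1" -v ram="$2" 'BEGIN {
        gib = wall / 1048576
        pct = 100 * wall / (ram * 1048576)
        printf "thewall = %.2f GiB, %.0f%% of %d GiB RAM\n", gib, pct, ram
    }'
}

wall_summary 47448064 96
```

Either way of counting puts thewall a little under half of real memory on this machine.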
Sorry about my English. By "network issue" I mean a network problem: maybe one of the network-related kernel parameters is set to a wrong value, or something is wrong with the network card or cable.
is default 0.
The physical memory is 96G and thewall is set to 47G, which is nearly 50% of RAM.
thanks!