I am posting this because my searches for this problem turned up only two posts and no helpful suggestions. I have a "solution" (read: work-around hack) but have not yet tried to find the root cause, and may never, because I am busy doing other things (read: working to pay the bills).
However, I post this with two goals:
1. To help the poor shmuck who hits this at 3am
2. To document it in case someone really has a wild hair (hare?) up their butt and wants to chase the root cause
Simply put, msgget(2) will sometimes return 0 for no reason I can see, and msgsnd(2) and msgrcv(2) do not like that ID. My notes indicate msgsnd() was OK and msgrcv() complained, but this was 12 hours into a debugging session....
There are two threads I have found in the interwebs:
forums.codeguru (dot) com/showthread.php?403036-strange-problem-in-using-msgget%28%29-in-Linux
and
unix (dot) com/programming/3755-about-msgget-troble.html
Both of these threads are "old" and closed, otherwise I would have responded to one of them.
NOTE: The codeguru.com thread has the best code example. The unix.com code has what may be a fatal flaw: it passes IPC_EXCL along with IPC_CREAT in the flags, so the second time it is run it should complain, unless he first removed the message queue. However, he should have gotten errno == EEXIST and it appears he did not - he does print errno.
The Linux distro is Ubuntu 8, not patched. Because the other posts are from 2005 and 2006, the CPU does not seem to be the issue.
The interesting thing is:
Running ipcs gives (in addition to various semaphores and shared memory):
The original key was 0xF0, which returned 0x8000 when it was working. Decimal 163840 is 0x28000 in hex. I arbitrarily tried a key of 0x7B (well, decimal 123) and got a msgqid = 0x8001 (which == 32769 decimal).
I also see cases in my slime trail where, when msgget() was returning non-zero, for a while it returned 0x10001. In all cases I am using an int to hold the msgQ_id. The key = 0xF0 returns 0, not 0x8000, so truncation is not an issue. I have not tried switching back to a key = 0xF0. I will try looking on another system running the same code (i.e. using 0xF0) to see what ipcs shows.
Another thing: 0 is supposed to be a legal return:
Quote:
Upon successful completion, msgget() returns a non-negative integer, namely a message queue identifier. Otherwise, it returns -1 and errno is set to indicate the error.
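To make that concrete, here is a minimal sketch of an error check that treats 0 as a valid queue ID. The helper names msgget_failed() and open_queue() are made up for illustration and are not from either thread; the only point is that the test is "== -1", not "<= 0".

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical helper: true only when msgget() really failed.
 * 0 is a legal queue identifier, so compare against -1, not "<= 0". */
int msgget_failed(int msqid)
{
    return msqid == -1;
}

/* Hypothetical wrapper: open the queue for `key`, creating it if needed.
 * Returns the queue ID (possibly 0) or -1 after reporting errno. */
int open_queue(key_t key)
{
    int msqid = msgget(key, IPC_CREAT | 0666);
    if (msgget_failed(msqid)) {
        /* errno is only meaningful here, right after a real failure */
        fprintf(stderr, "msgget: %s (%d)\n", strerror(errno), errno);
        return -1;
    }
    return msqid;
}
```

With a check like "msgid <= 0", a returned ID of 0 would be reported as an error with whatever stale value errno happened to hold, which may be part of the confusion in the older threads.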
So - I don't know why msgget() will start returning 0. Honestly, I had another bug which (for a while) masked what msgsnd() was doing - a "(u)" instead of a "(%lu)" in a printf was throwing SIGSEGV (sigh) - and I fixed both at the same time (i.e. switched to the new key). This is a non-trivial system to run a code build on, and one wants to do as much as one can between runs.
The only suggestion I can make is to have the system come up with a unique key using ftok() every time, and to remove old message queues. A good start on a key would be the parent process PID.
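A rough sketch of that suggestion, under some assumptions: make_key() and fresh_queue() are names I made up, the path handed to ftok() must be an existing file that both processes agree on, and the 0600 permissions are just a placeholder. Note that ftok() only uses the low 8 bits of its second argument, so a PID would have to be masked down (and should not come out as zero).

```c
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

/* Hypothetical helper: derive a key from an existing file plus a project id.
 * Returns (key_t)-1 with errno set if the path cannot be stat'ed. */
key_t make_key(const char *path, int proj_id)
{
    return ftok(path, proj_id);
}

/* Hypothetical helper: remove any stale queue for `key`, then create a
 * fresh one. Returns the new queue ID (0 is valid) or -1 on failure. */
int fresh_queue(key_t key)
{
    int old = msgget(key, 0);   /* look up only, do not create */
    if (old != -1 && msgctl(old, IPC_RMID, NULL) == -1)
        perror("msgctl(IPC_RMID)");

    /* IPC_EXCL makes this fail with EEXIST if the removal did not work */
    return msgget(key, IPC_CREAT | IPC_EXCL | 0600);
}
```

From the shell, ipcs -q lists the existing queues and ipcrm -q <msqid> removes one by ID, which does the same cleanup by hand.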
(please forgive the chopped links - apparently I am not yet blessed to give raw links :^)