Machine dependent problems when using Sockets.


 
# 1  
Old 10-31-2011

I am trying to write code for a client-server scenario using AF_INET sockets.

As is usually the case, everything works fine and dandy on my machine, but on another machine it gives me the following error at runtime:

send: Socket operation on non-socket

The error is thrown by the server when trying to send the next plan of action to the client. Note that this is neither the first send nor the first receive between the client and the server. Many previous exchanges were successful, and the failure occurs consistently at this point.

The snippet below is inside a while(1) select loop that accepts incoming connections from clients. no_of_clients is a fixed parameter giving the maximum client count.
argv[3] is the hop count, i.e. the number of hops a packet is to bounce through the network.

Code:
                        clientfds[client_count] = new_client;

                        if(client_count == no_of_clients)
                        {
                            //FINAL--
                            printf("Sending command\n");
                            if(atoi(argv[3])==0)
                            {
                                printf("Trace of packet:\n");
                                for(i=1; i<=no_of_clients; i++)
                                {
                                    len = send(clientfds[i], "Shutdown", 8, 0);
                                    if (len != 8)
                                    {
                                        fprintf(stderr,"Send sent partial string!\n");
                                        perror("send");
                                        exit(1);
                                    }
                                }

                                exit(0);
                            }
                            else
                            {
                                for(i=1; i<=no_of_clients; i++)
                                {
                                    len = send(clientfds[i], "Charge!!", 8, 0);
                                    if (len != 8)
                                    {
                                        fprintf(stderr,"Yes I Send sent partial string!\n");
                                        perror("send");
                                        exit(1);
                                    }
                                }
                            }
                            //FINAL--
                            printf("Command sent..waiting for listener ready setup\n");

Relevant output portion:
...
Sending command
Yes I Send sent partial string!
send: Socket operation on non-socket
...
The only root cause for this error that I could find on Google (apart from semantic errors) was exceeding the MTU size. The send() is definitely not exceeding any MTU here, as the message is very small.

The code runs fine on a 64-bit Ubuntu 11.04 install, but fails on a 64-bit RHEL 5 machine.

Any ideas guys?

---------- Post updated at 01:43 PM ---------- Previous update was at 04:24 AM ----------

Still haven't been able to resolve the issue.
# 2  
Old 10-31-2011
It says 'socket operation on non-socket'. Somehow a non-socket got put in there...

You should print the FD of the socket you're sending to, and call system("ls -l /proc/self/fd"); to list the open descriptors.

I suspect a buffer got overrun somewhere and the FD list corrupted with something unexpected. A buffer overrun would be very compiler dependent.
# 3  
Old 10-31-2011
Corona: you were spot on. The FD array was getting corrupted.
The array was dynamically allocated.

However, I tried statically allocating the array and it worked like a charm. The interesting bit is that I reverted to dynamic allocation so that I could show you the corruption, but now the array doesn't get corrupted. The corruption has magically disappeared. In such a scenario, would you recommend sticking with dynamic allocation?

[Removed personal info]

Earlier, FD 4 was corrupted and the command was being sent to FD 0 instead.

Last edited by ab_tall; 11-06-2011 at 12:19 AM..
# 4  
Old 11-01-2011
Quote:
However, I tried statically allocating the array and it worked like a charm. The interesting bit is that I reverted to dynamic allocation so that I could show you the corruption, but now the array doesn't get corrupted. The corruption has magically disappeared. In such a scenario, would you recommend sticking with dynamic allocation?
It means that you still have a pointer or buffer overrunning somewhere. It is just not hitting the FD array at present.
# 5  
Old 11-01-2011
Agreed. That's the trouble with debugging buffer overruns: occasionally they are harmless, but the instant you change anything, bang.
# 6  
Old 11-01-2011
My initial corruption was due to not accounting for the 1-based index (the loops run from 1 to no_of_clients). However, even after I accounted for it, I was still getting the corruption, so I concluded that the missing index was not its root cause.

Static allocation has worked around the problem for now.

Two other problems I am facing now:

1) What I am having trouble understanding is why getaddrinfo() is only able to resolve a host if I give it the entire hostname, whereas leaving out the domain name gives me an error stating that the service name or host was not found.

2) Some combinations of hosts (on the same domain) are unable to interconnect.
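On the first problem, a small probe can make the difference between the two name forms visible. The sketch below is only an illustration, with a hypothetical helper `try_resolve()` and the port 5555 used purely as a placeholder; it prints the reason getaddrinfo() gives via gai_strerror(). One possible explanation, which is an assumption and not something the thread confirms, is that short names rely on the resolver's search domain (for example the `search` line in /etc/resolv.conf), and that this is configured differently on the failing machine.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the thread): try to resolve host for a
 * TCP/IPv4 connection. Returns 0 on success, otherwise the nonzero
 * getaddrinfo() error code, after printing the reason. */
static int try_resolve(const char *host, const char *service)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* the thread uses AF_INET sockets */
    hints.ai_socktype = SOCK_STREAM;

    rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo(%s): %s\n", host, gai_strerror(rc));
        return rc;
    }
    freeaddrinfo(res);
    return 0;
}

int main(int argc, char **argv)
{
    /* Pass the short name and then the full name to compare them. */
    const char *host = (argc > 1) ? argv[1] : "127.0.0.1";

    assert(host != NULL);
    if (try_resolve(host, "5555") == 0)
        printf("%s resolves\n", host);
    return 0;
}
```

Running it once with the short name and once with the fully qualified name on each machine shows whether the failure comes from name resolution itself or from the program.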

The scenario for the second problem is as follows.
Initially the master waits for incoming connections. When all clients have connected, it passes each client its right neighbour's details and instructs client 1 to go ahead and connect to its right neighbour, and so on around the ring.
However, what I am seeing is that on some machines, where the master is on one machine and the clients are on another, client 1 receives a connection refused error, even though I traced the code of the other client and ensured it was listening.
So the master is on Host 1.
Client 1 and Client 2 are on Host 2.
Client 2 listens on port 5555 (assume).
Client 1 tries to connect to its neighbour on Host 2 and port 5555.
It gets connect:: Connection refused.

Output of actual Run

Server Output
-------------
./master 65450 2 10
packet Master on bn19****
clients = 2
Hops = 10
client 1 is on host nom2684*****
client 2 is on host nom2684*****
<Hangs at this point>

Client 1 Output
---------------

./client bn19****
Connected as client 1
Trying to connect to host nom2684**** on port 34437
connect:: Connection refused
<exits>

Client 2 Output
---------------

Connected as client 2
I am client 2 and I am listening on port 34437
I sent my port 34437 to master
<Hangs>