semaphore access speed


 
Thread Tools Search this Thread
Top Forums Programming semaphore access speed
# 29  
Old 09-24-2008
My version of truss does not have -c flag, or any other flag similar to the one I used with trace under Linux.


On Linux :
$ /sbin/sysctl -a |grep shm
... <snip error msgs>
vm.hugetlb_shm_group = 0
kernel.shmmni = 4096
kernel.shmall = 2097152
kernel.shmmax = 33554432

Q: Why would SHM parameters affect semaphore performance?

Also, just in case I ran this
$ /sbin/sysctl -a |grep sem
<snip error msgs>
kernel.sem = 250 32000 32 128

Last edited by migurus; 09-24-2008 at 08:55 PM.. Reason: added info
# 30  
Old 09-25-2008
Quote:
Originally Posted by migurus
Q: Why would SHM parameters affect semaphore performance?
*Possibly* because of the number of page tables and structures required to make use of all that memory. It might result in every op requiring two to three cache misses per call. If there are no cache misses, because shmem is less, then maybe it takes 3 times as fast. Jim??
# 31  
Old 09-29-2008
I'd like to ask gurus where else can I post my question. Would you recommend me any other group or forum?
Your suggestions would be appreciated.
# 32  
Old 09-30-2008
As I previously suggested: LinuxQuestions.org.

Since the bottleneck appears to be within the system call, I suggest you also look at Kernel-related BB's (KernelTrap.org is a good one).
# 33  
Old 09-30-2008
Thanks Otheus and Jim, I got quite detailed answer here:
semaphore access speed | KernelTrap

So, the 2.6.9 kernel is not the best for modern h/w.

I appreciate everebodys time!
# 34  
Old 10-01-2008
Data

Kudos to you for your tenacity! However, I don't think this is the end of it.

I did a little research on strcmp's answer. 2.6.9 was released in 2004 and is standard with RHEL 4, which shipped with glibc 2.3.4. Pentium 3's were old in 2004. RHEL 5 ships with kernel 2.6.18 and glibc 2.5.12.

So I did some benchmarks.

I followed strcmp's suggestion and used a "falling timer" method, where the loop starts and ends after the time() call notes a change in seconds. There's a 10 to 100 ms variance on either side of the fall, so I took an average of several runs. Then I divide the ops/s number by the CPU speed (cycles/s) to get "tics per op".
  • 2.6.18 / P3 / 800 MHz: 548300/s (average, 19 runs) = 1459 tics/op
  • 2.6.18 / AMD Opteron 285 / 2.6 GHz: 1689138 (avg 6 runs) = 1539 tics/op
  • 2.6.18 / AMD Opteron 270 / 1.0 GHz: 974228 (avg 7 runs) = 1026 tics/op
  • 2.6.9 / Xeon / 3.6 GHz: 917196 (avg, 4 runs) = 3925 tics/op
  • 2.6.9 / P3 / 1.25 GHz : 733927 (avg, 5 runs) = 1703 tics/op
  • 2.6.9 / Xeon / 2.3 GHz: 1127894 (avg, 10 runs) = 2608 tics/op

For tics/op, smaller is better. So the 2.6.18 kernel is indeed faster than the 2.6.9 kernels. The Xeon is MUCH slower. Presumably the kernels were compiled by a lowest common denominator. No Optimization flags were enabled, but there was a difference in compilers: the 2.6.9 hosts used gcc 3.4.6, while the newer ones were with gcc 4.1.1. Also, it should be noted that we don't have an AMD running 2.6.9 nor a Xeon running 2.6.18.

It very may well be that the problem is that these kernels were not compiled optimally for the various architectures. Why the Xeons are so much slower is quite surprising, given their characteristic use as HPC components.

Regardless, none of these results seem to explain the fundamental question: Why is SCO so much faster??

Last edited by otheus; 10-08-2008 at 06:57 AM.. Reason: I said the Xeon is much faster when it was much slower.
# 35  
Old 10-01-2008
Here is my code:
Code:
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <time.h>
#define NSEMS   2

/* change this per CPU to run between 8 and 12 s*/
const static int maxloop = 10000000; 

main(int argc, char *argv[])
{
    time_t start,last,stop;
    long int i;
    int estimate = 100;
    int sid;
    key_t key;
    ushort vals[NSEMS] = { 0, 0 };

    key = ftok("/tmp",99);
    last=start=time(NULL);
    for (i = 0; i < 1000; ++i) {
        usleep(10);
        last=time(NULL);
        if (last > start) break;
    }
    start=last;
    last = 0;

    for (i = maxloop/8; i < maxloop; i++) {
      if ((sid = semget(key, NSEMS, IPC_CREAT | 0777)) == -1) {
          perror("Can Not Get Semaphore ID");
      }
      if (semctl(sid, NSEMS, GETALL, vals) == -1) {
          perror("Can Not Get Semaphore Values");
      }
    }

/* do the last 1/8th until the second changes.
    If your processor reaches the maxloop before that,
    change the maxloop or the divisor or the "estimate" */

    stop=time(NULL);
    for (i = maxloop - maxloop/8; i < maxloop; ++i) {
      if ( !(i % estimate) ) {
        last=time(NULL);
        if (last > stop) break;
        stop=last;
      }

      /* repeat semaphore opts */
      if ((sid = semget(key, NSEMS, IPC_CREAT | 0777)) == -1) {
          perror("Can Not Get Semaphore ID");
      }
      if (semctl(sid, NSEMS, GETALL, vals) == -1) {
          perror("Can Not Get Semaphore Values");
      }
    }
    stop=last;

    printf("%.2f semop/s (%i/%i) [%d]\n", (double)i/(stop-start), i, stop-start, estimate);
}

Login or Register to Ask a Question

Previous Thread | Next Thread

9 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Semaphore

I was asked to add this piece of code to a c program which I will execute through the shell: for(long i = 0; i < NITER; i++) { sem_wait( &sema); count++; sem_post( &sema); } I didn't get it, which is the critical section ? if it's "count++" how would a thread wake up in order to enter it... (1 Reply)
Discussion started by: uniran
1 Replies

2. Programming

Semaphore

If I create a semaphore and then I fork a number of child processes then all the child process use that same semaphore. Since the process address spaces are different rfom each other then how all the child process are able to access the same semaphore? I understand that semaphore/mutex is at os... (0 Replies)
Discussion started by: rupeshkp728
0 Replies

3. Shell Programming and Scripting

semaphore

Control two exclusively shared resources(semaphore). The two resources are two files. The producer will write even numbers to one file, and odd numbers to another one. The consumer respectively reads from each file until it gets 5 even numbers and 5 odd numbers. Can any one help me with the... (0 Replies)
Discussion started by: gokult
0 Replies

4. Filesystems, Disks and Memory

data from blktrace: read speed V.S. write speed

I analysed disk performance with blktrace and get some data: read: 8,3 4 2141 2.882115217 3342 Q R 195732187 + 32 8,3 4 2142 2.882116411 3342 G R 195732187 + 32 8,3 4 2144 2.882117647 3342 I R 195732187 + 32 8,3 4 2145 ... (1 Reply)
Discussion started by: W.C.C
1 Replies

5. UNIX for Dummies Questions & Answers

semaphore

what is semaphore? can any body explain it in a more simple way than the manual ?? replies appreciated Regards raguram R (7 Replies)
Discussion started by: raguramtgr
7 Replies

6. Shell Programming and Scripting

Semaphore

Hi, I am looking to use a semaphore for the first time in one of my scripts. I am just wondering if there are any simple examples or tutorials around? I am a beginner so the simpler the better :) Thanks -Jaken (2 Replies)
Discussion started by: Jaken
2 Replies

7. Filesystems, Disks and Memory

dmidecode, RAM speed = "Current Speed: Unknown"

Hello, I have a Supermicro server with a P4SCI mother board running Debian Sarge 3.1. This is the "dmidecode" output related to RAM info: RAM speed information is incomplete.. "Current Speed: Unknown", is there anyway/soft to get the speed of installed RAM modules? thanks!! Regards :)... (0 Replies)
Discussion started by: Santi
0 Replies

8. UNIX for Dummies Questions & Answers

Semaphore

Hi, I'm new to UNIX. I need to know what's a semaphore Do reply. Thanks VJ (3 Replies)
Discussion started by: vjsony
3 Replies

9. UNIX for Dummies Questions & Answers

semaphore

hi, is there any command where we can monitor semaphores? (1 Reply)
Discussion started by: yls177
1 Replies
Login or Register to Ask a Question