The UNIX and Linux Forums  
Hello and Welcome from United States to the UNIX and Linux Forums! Thank You for Visiting and Joining Our Global Community.

Go Back   The UNIX and Linux Forums > Top Forums > High Level Programming
.
google unix.com



High Level Programming Post questions about C, C++, Java, SQL, and other programming languages here.

More UNIX and Linux Forum Topics You Might Find Helpful
Thread Thread Starter Forum Replies Last Post
semaphore raguramtgr UNIX for Dummies Questions & Answers 7 06-15-2009 09:39 AM
Semaphore Jaken Shell Programming and Scripting 2 04-04-2009 05:10 PM
dmidecode, RAM speed = "Current Speed: Unknown" Santi Filesystems, Disks and Memory 0 02-16-2006 06:16 AM
Semaphore vjsony UNIX for Dummies Questions & Answers 3 04-07-2003 02:06 PM
semaphore yls177 UNIX for Dummies Questions & Answers 1 10-08-2002 11:18 PM

Closed Thread
English Japanese Spanish French German Portuguese Italian Dutch Swedish Russian Norwegian Hungarian Hebrew Danish Powered by Powered by Google
 
LinkBack Thread Tools Search this Thread Rating: Thread Rating: 1 votes, 4.00 average. Display Modes
  #29 (permalink)  
Old 09-24-2008
migurus migurus is offline
Registered User
  
 

Join Date: Sep 2008
Location: US
Posts: 49
My version of truss does not have -c flag, or any other flag similar to the one I used with trace under Linux.


On Linux :
$ /sbin/sysctl -a |grep shm
... <snip error msgs>
vm.hugetlb_shm_group = 0
kernel.shmmni = 4096
kernel.shmall = 2097152
kernel.shmmax = 33554432

Q: Why would SHM parameters affect semaphore performance?

Also, just in case I ran this
$ /sbin/sysctl -a |grep sem
<snip error msgs>
kernel.sem = 250 32000 32 128

Last edited by migurus; 09-24-2008 at 07:55 PM.. Reason: added info
  #30 (permalink)  
Old 09-25-2008
otheus's Avatar
otheus otheus is offline Forum Staff  
Moderator ala Mode
  
 

Join Date: Feb 2007
Location: Innsbruck, Austria
Posts: 1,884
Quote:
Originally Posted by migurus View Post
Q: Why would SHM parameters affect semaphore performance?
*Possibly* because of the number of page tables and structures required to make use of all that memory. It might result in every op requiring two to three cache misses per call. If there are no cache misses, because shmem is less, then maybe it takes 3 times as fast. Jim??
  #31 (permalink)  
Old 09-29-2008
migurus migurus is offline
Registered User
  
 

Join Date: Sep 2008
Location: US
Posts: 49
I'd like to ask gurus where else can I post my question. Would you recommend me any other group or forum?
Your suggestions would be appreciated.
  #32 (permalink)  
Old 09-30-2008
otheus's Avatar
otheus otheus is offline Forum Staff  
Moderator ala Mode
  
 

Join Date: Feb 2007
Location: Innsbruck, Austria
Posts: 1,884
As I previously suggested: LinuxQuestions.org.

Since the bottleneck appears to be within the system call, I suggest you also look at Kernel-related BB's (KernelTrap.org is a good one).
  #33 (permalink)  
Old 09-30-2008
migurus migurus is offline
Registered User
  
 

Join Date: Sep 2008
Location: US
Posts: 49
Thanks Otheus and Jim, I got quite detailed answer here:
semaphore access speed | KernelTrap

So, the 2.6.9 kernel is not the best for modern h/w.

I appreciate everebodys time!
  #34 (permalink)  
Old 10-01-2008
otheus's Avatar
otheus otheus is offline Forum Staff  
Moderator ala Mode
  
 

Join Date: Feb 2007
Location: Innsbruck, Austria
Posts: 1,884
Unhappy

Kudos to you for your tenacity! However, I don't think this is the end of it.

I did a little research on strcmp's answer. 2.6.9 was released in 2004 and is standard with RHEL 4, which shipped with glibc 2.3.4. Pentium 3's were old in 2004. RHEL 5 ships with kernel 2.6.18 and glibc 2.5.12.

So I did some benchmarks.

I followed strcmp's suggestion and used a "falling timer" method, where the loop starts and ends after the time() call notes a change in seconds. There's a 10 to 100 ms variance on either side of the fall, so I took an average of several runs. Then I divide the ops/s number by the CPU speed (cycles/s) to get "tics per op".
  • 2.6.18 / P3 / 800 MHz: 548300/s (average, 19 runs) = 1459 tics/op
  • 2.6.18 / AMD Opteron 285 / 2.6 GHz: 1689138 (avg 6 runs) = 1539 tics/op
  • 2.6.18 / AMD Opteron 270 / 1.0 GHz: 974228 (avg 7 runs) = 1026 tics/op
  • 2.6.9 / Xeon / 3.6 GHz: 917196 (avg, 4 runs) = 3925 tics/op
  • 2.6.9 / P3 / 1.25 GHz : 733927 (avg, 5 runs) = 1703 tics/op
  • 2.6.9 / Xeon / 2.3 GHz: 1127894 (avg, 10 runs) = 2608 tics/op

For tics/op, smaller is better. So the 2.6.18 kernel is indeed faster than the 2.6.9 kernels. The Xeon is MUCH slower. Presumably the kernels were compiled by a lowest common denominator. No Optimization flags were enabled, but there was a difference in compilers: the 2.6.9 hosts used gcc 3.4.6, while the newer ones were with gcc 4.1.1. Also, it should be noted that we don't have an AMD running 2.6.9 nor a Xeon running 2.6.18.

It very may well be that the problem is that these kernels were not compiled optimally for the various architectures. Why the Xeons are so much slower is quite surprising, given their characteristic use as HPC components.

Regardless, none of these results seem to explain the fundamental question: Why is SCO so much faster??

Last edited by otheus; 10-08-2008 at 05:57 AM.. Reason: I said the Xeon is much faster when it was much slower.
  #35 (permalink)  
Old 10-01-2008
otheus's Avatar
otheus otheus is offline Forum Staff  
Moderator ala Mode
  
 

Join Date: Feb 2007
Location: Innsbruck, Austria
Posts: 1,884
Here is my code:
Code:
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <time.h>
#define NSEMS   2

/* change this per CPU to run between 8 and 12 s*/
const static int maxloop = 10000000; 

main(int argc, char *argv[])
{
    time_t start,last,stop;
    long int i;
    int estimate = 100;
    int sid;
    key_t key;
    ushort vals[NSEMS] = { 0, 0 };

    key = ftok("/tmp",99);
    last=start=time(NULL);
    for (i = 0; i < 1000; ++i) {
        usleep(10);
        last=time(NULL);
        if (last > start) break;
    }
    start=last;
    last = 0;

    for (i = maxloop/8; i < maxloop; i++) {
      if ((sid = semget(key, NSEMS, IPC_CREAT | 0777)) == -1) {
          perror("Can Not Get Semaphore ID");
      }
      if (semctl(sid, NSEMS, GETALL, vals) == -1) {
          perror("Can Not Get Semaphore Values");
      }
    }

/* do the last 1/8th until the second changes.
    If your processor reaches the maxloop before that,
    change the maxloop or the divisor or the "estimate" */

    stop=time(NULL);
    for (i = maxloop - maxloop/8; i < maxloop; ++i) {
      if ( !(i % estimate) ) {
        last=time(NULL);
        if (last > stop) break;
        stop=last;
      }

      /* repeat semaphore opts */
      if ((sid = semget(key, NSEMS, IPC_CREAT | 0777)) == -1) {
          perror("Can Not Get Semaphore ID");
      }
      if (semctl(sid, NSEMS, GETALL, vals) == -1) {
          perror("Can Not Get Semaphore Values");
      }
    }
    stop=last;

    printf("%.2f semop/s (%i/%i) [%d]\n", (double)i/(stop-start), i, stop-start, estimate);
}
Sponsored Links
Closed Thread

Bookmarks

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On




All times are GMT -4. The time now is 08:51 AM.


Powered by: vBulletin, Copyright ©2000 - 2006, Jelsoft Enterprises Limited. Language Translations Powered by .
vBCredits v1.4 Copyright ©2007 - 2008, PixelFX Studios
The UNIX and Linux Forums Content Copyright ©1993-2009. All Rights Reserved.Ad Management by RedTyger

Content Relevant URLs by vBSEO 3.2.0