a question about pthread performance


 
# 1  
Old 12-11-2008

Hello,

I am running my pthread code on a Linux machine with 4 processors. However, the speedup is only about 2x.

The code solves the equations (G + s(i)C) z(i) = B*us(i), i = 1,...,n. Here G and C are m*m matrices, B*us(i) is an m*1 vector, and s(1),...,s(n) are n different numbers. I need to solve the equation n times to get z(1),...,z(n), and I use multiple threads to split up the n equations.
E.g., with 4 threads and n = 16, Thread(1) solves 4 equations (G + s(1,2,3,4)C) z(1,2,3,4) = B*us(1,2,3,4), Thread(2) solves (G + s(5,6,7,8)C) z(5,6,7,8) = B*us(5,6,7,8), and so on.
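
For reference, the way the equations are split across threads can be sketched as a small helper. This is only an illustration: `Range`, `posted_range`, and `balanced_range` are hypothetical names, not part of the posted code. `posted_range` mirrors the start/stop arithmetic in the code further down, where the last thread takes whatever is left over; `balanced_range` is an alternative (my assumption, not from the post) that spreads the remainder so no thread gets more than one extra equation. When n is a multiple of the thread count the two agree, so this by itself may not explain the slowdown.

```cpp
#include <cassert>

// Inclusive index range [start, stop] of the equations given to one thread.
struct Range {
  int start;
  int stop;
  int count() const { return stop - start + 1; }
};

// Same arithmetic as the posted code: every thread gets n / nthreads
// equations and the last thread additionally takes the remainder.
Range posted_range(int thread_no, int nthreads, int n) {
  assert(nthreads > 0 && thread_no >= 0 && thread_no < nthreads);
  int chunk = n / nthreads;
  Range r;
  r.start = thread_no * chunk;
  r.stop = (thread_no == nthreads - 1) ? n - 1 : r.start + chunk - 1;
  return r;
}

// Alternative (assumption, not from the post): the first n % nthreads
// threads get one extra equation each, so the load differs by at most one.
Range balanced_range(int thread_no, int nthreads, int n) {
  assert(nthreads > 0 && thread_no >= 0 && thread_no < nthreads);
  int chunk = n / nthreads;
  int extra = n % nthreads;
  Range r;
  r.start = thread_no * chunk + (thread_no < extra ? thread_no : extra);
  r.stop = r.start + chunk + (thread_no < extra ? 1 : 0) - 1;
  return r;
}
```

For example, with n = 14 and 4 threads the posted scheme hands out 3, 3, 3 and 5 equations, while the balanced one hands out 4, 4, 3 and 3.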

I use pthread_create() and pthread_join(). However, 4 threads are not 4 times as fast as 1 thread, only about 2 times. With 2 threads it is about 1.5 times as fast as 1 thread. What I have observed is that solving one equation while 4 threads are running is much slower than solving one equation with 1 thread.
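
One way to check that last observation is to time each thread's own work rather than only the whole run. The sketch below is an illustration under stated assumptions, not the posted code: `Job`, `dummy_solve`, `worker`, and `run_timed` are hypothetical names, and `dummy_solve` is only a CPU-bound stand-in for one solve. It uses pthread_create() and pthread_join() as in the post and records the average time per item inside each thread.

```cpp
#include <pthread.h>
#include <cassert>
#include <chrono>
#include <vector>

// Per-thread bookkeeping (hypothetical; stands in for the post's parm struct).
struct Job {
  int items;            // number of work items ("equations") for this thread
  double sec_per_item;  // written by the thread: average seconds per item
  double checksum;      // keeps the compiler from discarding the work
};

// Placeholder for one solve; replace with the real per-equation work.
static double dummy_solve(int seed) {
  double acc = 0.0;
  for (int k = 1; k <= 200000; ++k) acc += (seed % 7 + 1) / static_cast<double>(k);
  return acc;
}

static void* worker(void* arg) {
  Job* job = static_cast<Job*>(arg);
  double local = 0.0;  // accumulate locally, write shared memory once at the end
  std::chrono::steady_clock::time_point t0 = std::chrono::steady_clock::now();
  for (int i = 0; i < job->items; ++i) local += dummy_solve(i);
  std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - t0;
  job->checksum = local;
  job->sec_per_item = job->items > 0 ? elapsed.count() / job->items : 0.0;
  return nullptr;
}

// Runs nthreads workers and returns each thread's average seconds per item.
std::vector<double> run_timed(int nthreads, int items_per_thread) {
  assert(nthreads > 0);
  std::vector<Job> jobs(nthreads);
  std::vector<pthread_t> tids(nthreads);
  int started = 0;
  for (int t = 0; t < nthreads; ++t) {
    jobs[t] = Job{items_per_thread, 0.0, 0.0};
    if (pthread_create(&tids[t], nullptr, worker, &jobs[t]) != 0) break;
    ++started;
  }
  std::vector<double> out;
  for (int t = 0; t < started; ++t) {  // join only threads that were created
    pthread_join(tids[t], nullptr);
    out.push_back(jobs[t].sec_per_item);
  }
  return out;
}
```

Comparing `run_timed(1, n)` with `run_timed(4, n)` shows whether the per-item time grows when more threads run at once. With this CPU-only placeholder I would expect it to stay roughly constant on 4 independent cores; a real solve that allocates memory or streams large matrices may behave differently.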

Can anybody tell me the reason? Thanks a lot.

Below is the code. Thanks,
Code:
typedef struct{
  int thread_no;
  int allthread;
  mat *G,*C,*B;
  mat *us, *Z;
  vec *samples;
  double *Control, *Info;
} parm;

 
void sampleLU(int thread_no, int allthread, mat *G, mat *C, mat *B, mat &us, mat &Z, vec &samples, double Control[], double Info[])
{
  Real_Timer lu_symbolic_init, lu_symbolic_free, lu_numerical, lu_solve_time;
  Real_Timer sCpG_run_time;
  int np = samples.size();
  int nDim = B->m;
  int start, stop;
 
  start = thread_no * (int)(np/allthread);
  stop = start + (int)(np/allthread) - 1;
  if( thread_no == allthread-1 ) stop = np-1; 
  for (int i = start; i<=stop; i++ ){
    cs *A;
    /* Form A = G + s(i)*C (pseudocode); only thread 0 is timed. */
    if(thread_no == 0) sCpG_run_time.start();
    A = G+samples(i)*C;
    if(thread_no == 0) sCpG_run_time.stop();
 
 /* LU decomposition */
...
    /* Done only for the first sample of each thread (pseudocode); only thread 0 is timed. */
    if(i == start){
      if(thread_no == 0) lu_symbolic_init.start();
      A = LU;
      if(thread_no == 0) lu_symbolic_init.stop();
    }

 /* solve Az = b  */
 double* z = new double[nDim];
 vec b(nDim);
 b.zeros();
 if(thread_no == 0) lu_solve_time.start();
 LUz=b
 vec zz(z, nDim);
 delete [] z;
 Z.set_col(i, zz); 
 ...
  }
  if(thread_no == 0){
    std::cout << "Thread" << thread_no << " sC+G \t:" << sCpG_run_time.get_time() << std::endl;
    std::cout << "Thread" << thread_no << " symbolic initial time: \t"<<lu_symbolic_init.get_time()<<std::endl;
    std::cout << "Thread" << thread_no << " LU decomposition time \t:" << lu_numerical.get_time() << std::endl;
    std::cout << "Thread" << thread_no << " symbolic free time: \t"<<lu_symbolic_free.get_time()<<std::endl;
    std::cout << "Thread" << thread_no << " LU solve time   \t: " << lu_solve_time.get_time() <<std::endl;
    
  }
}
void * psampleLU(void *arg)
{
  parm *p = (parm *)arg;
  sampleLU(p->thread_no, p->allthread, p->G, p->C, p->B, *(p->us), *(p->Z), *(p->samples), p->Control, p->Info);
  return NULL;
}


Last edited by otheus; 12-15-2008 at 09:24 AM.. Reason: Added code tags
# 2  
Old 12-15-2008
Quote:
Originally Posted by mgig
Hello,

I run my pthread code on Linux with 4 processors. However, the speed up is only 2 times.

Stop right there. Post your /proc/cpuinfo before we proceed. I believe that may help answer some questions.
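
If it helps, here is a rough sketch of what one could look for in that file: the number of logical processors versus the number of distinct physical cores. It assumes the x86 Linux layout of /proc/cpuinfo, with `processor`, `physical id`, and `core id` lines in each block; on other architectures those fields may be missing, in which case `physical` comes out as 0. `CpuCount`, `value_of`, `count_cpus`, and `count_cpus_text` are hypothetical names made up for this example.

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>
#include <utility>

struct CpuCount {
  int logical;   // number of "processor" entries
  int physical;  // number of distinct (physical id, core id) pairs
};

// Returns the text after the ':' on a cpuinfo line, trimmed of spaces and tabs.
static std::string value_of(const std::string& line) {
  std::string::size_type colon = line.find(':');
  if (colon == std::string::npos) return "";
  std::string::size_type first = line.find_first_not_of(" \t", colon + 1);
  if (first == std::string::npos) return "";
  std::string::size_type last = line.find_last_not_of(" \t\r");
  return line.substr(first, last - first + 1);
}

// Counts logical processors and distinct cores in cpuinfo-formatted text.
CpuCount count_cpus(std::istream& in) {
  CpuCount c = {0, 0};
  std::set<std::pair<std::string, std::string> > cores;
  std::string line, phys;
  while (std::getline(in, line)) {
    if (line.compare(0, 9, "processor") == 0) {
      ++c.logical;   // a new per-processor block starts here
      phys.clear();
    } else if (line.compare(0, 11, "physical id") == 0) {
      phys = value_of(line);
    } else if (line.compare(0, 7, "core id") == 0) {
      cores.insert(std::make_pair(phys, value_of(line)));
    }
  }
  c.physical = static_cast<int>(cores.size());
  return c;
}

// Convenience wrapper for text that is already in memory.
CpuCount count_cpus_text(const std::string& text) {
  std::istringstream in(text);
  return count_cpus(in);
}
```

To run it on the real file, open `/proc/cpuinfo` with a `std::ifstream` and pass it to `count_cpus`. If `logical` is 4 but `physical` is 2, the "4 processors" are two cores with two hardware threads each, which could be one reason for a speedup well below 4x.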