Vectorization


 
# 1  
Old 03-28-2014
Vectorization

Hi,

I have the following vectorized code:
Code:
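long valor = 0, i = 0;

 __m128i vsum, vecPi, vecCi, vecQCi;

 vsum = _mm_set1_epi32(0);

 int32_t * const pA = A->data;
 int32_t * const pB = B->data;

 /* _mm_storeu_si128 writes 128 bits, so the buffer needs four 32-bit slots */
 int32_t sumDot[4];

 /* Main loop: four 32-bit products per iteration, summed in 32-bit lanes */
 for ( ; i < SIZE - 3; i += 4) {
     vecPi = _mm_loadu_si128((__m128i *)&(pA)[i]);
     vecCi = _mm_loadu_si128((__m128i *)&(pB)[i]);
     vecQCi = _mm_mullo_epi32(vecPi, vecCi);   /* keeps only the low 32 bits of each product */
     vsum = _mm_add_epi32(vsum, vecQCi);       /* 32-bit add, wraps on overflow */
 }

 /* Horizontal add: after two passes every lane holds the total */
 vsum = _mm_hadd_epi32(vsum, vsum);
 vsum = _mm_hadd_epi32(vsum, vsum);
 _mm_storeu_si128((__m128i *)sumDot, vsum);

 /* Scalar tail for the elements left over after the vector loop */
 for ( ; i < SIZE; i++)
     valor += A->data[i] * B->data[i];

 valor += sumDot[0];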

However, I get overflows and need to handle those cases. Could you please help me with that?

Thanks

Last edited by bartus11; 03-28-2014 at 06:06 PM.. Reason: Please use [code][/code] tags.
# 2  
Old 03-28-2014
What compiler is this?
# 3  
Old 03-29-2014
icpc and g++; I use both.

---------- Post updated 03-29-14 at 11:29 AM ---------- Previous update was 03-28-14 at 04:52 PM ----------

Someone?
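
The thread does not say how the overflows should be handled, so the following is only a sketch of one possible approach: widen each product to 64 bits with `_mm_mul_epi32` (SSE4.1) and accumulate in 64-bit lanes. The function name `dot_i32_wide` and its plain pointer-plus-length signature are hypothetical, not part of the original code. It also assumes the final sum fits in 64 bits, and that the compiler accepts the GCC/Clang `target` attribute; with other compilers, drop the attribute and enable SSE4.1 through the compiler's own options.

```c
#include <stddef.h>
#include <stdint.h>
#include <smmintrin.h>   /* SSE4.1: _mm_mul_epi32 */

/* Hypothetical helper: dot product of two int32_t arrays with 64-bit
 * products and a 64-bit accumulator. The attribute lets GCC/Clang
 * build this function without passing -msse4.1 on the command line. */
__attribute__((target("sse4.1")))
int64_t dot_i32_wide(const int32_t *a, const int32_t *b, size_t n)
{
    __m128i acc = _mm_setzero_si128();   /* two 64-bit partial sums */
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)&a[i]);
        __m128i vb = _mm_loadu_si128((const __m128i *)&b[i]);

        /* _mm_mul_epi32 multiplies the signed 32-bit values in
         * elements 0 and 2 and returns two full 64-bit products. */
        __m128i even = _mm_mul_epi32(va, vb);

        /* Shift each 64-bit lane right by 32 bits so that elements
         * 1 and 3 land in the positions the multiply reads. */
        __m128i odd = _mm_mul_epi32(_mm_srli_epi64(va, 32),
                                    _mm_srli_epi64(vb, 32));

        acc = _mm_add_epi64(acc, even);
        acc = _mm_add_epi64(acc, odd);
    }

    /* Combine the two 64-bit lanes. */
    int64_t lanes[2];
    _mm_storeu_si128((__m128i *)lanes, acc);
    int64_t sum = lanes[0] + lanes[1];

    /* Scalar tail; the cast forces a 64-bit multiply. */
    for (; i < n; i++)
        sum += (int64_t)a[i] * b[i];

    return sum;
}
```

With the data from the first post this would be called as `dot_i32_wide(A->data, B->data, SIZE)`. The trade-off is that each iteration needs two multiplies instead of one `_mm_mullo_epi32`, and the 64-bit accumulator can itself still overflow for very long arrays of large values.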