Here’s my attempt. Any tips on a better solution?
    // for loop to convert 32 to 16 bits
    uint32_t i;
    int32_t * samps32 = (int32_t *)&(inIQbuffer[0]);
    int16_t * samps16 = (int16_t *)&(outIQbuffer[0]);
    for( i = 0; i < ( num_samples * 2 /* because each sample is two int32s */ ); i++ )
    {
        overflowCount += ( abs(samps32[i]) & 0xFFFF8000 ) ? 1 : 0;
        samps16[i] = (int16_t)samps32[i];
    }

    // Only report error every 4096 accumulated overflows
    if( ( overflowCount & 0x1FFF ) > 4096 )
    {
        printf( "ERROR: Overflow has occurred while scaling from 32 "
                "bit to 16 bit samples %d times", overflowCount );
    }
Here’s the part that actually checks for overflow:
overflowCount += ( abs(samps32[i]) & 0xFFFF8000 ) ? 1 : 0;
It seems that you are checking for the overflow of a 16-bit addition. You can avoid a branch in the assembler code by just having
This generates three ALU operations but no branch in the code. It may or may not be faster than a branching version.
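One possible branch-free check, as a minimal sketch and not necessarily the code this answer had in mind: add 0x8000 to the value as an unsigned number, so that the valid range [-32768, 32767] maps onto [0, 65535], then test whether any bit above bit 15 is set. That is an add, a shift, and a compare. The names `overflows_int16` and `narrow_32_to_16` are hypothetical helpers made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if x does not fit in int16_t, else 0, with no conditional.
 * Adding 0x8000 (in unsigned arithmetic, so wraparound is well defined)
 * maps [-32768, 32767] onto [0, 65535]; anything outside that range
 * leaves at least one bit set above bit 15. */
static inline uint32_t overflows_int16(int32_t x)
{
    return (((uint32_t)x + 0x8000u) >> 16) != 0;
}

/* Hypothetical helper: narrows `count` values from src into dst and
 * returns how many of them were out of range for int16_t. */
static uint32_t narrow_32_to_16(const int32_t *src, int16_t *dst, size_t count)
{
    assert(src != NULL && dst != NULL);

    uint32_t overflow_count = 0;
    for (size_t i = 0; i < count; i++) {
        overflow_count += overflows_int16(src[i]);
        /* Plain narrowing cast, as in the question's loop; the result for
         * out-of-range values is implementation-defined in C. */
        dst[i] = (int16_t)src[i];
    }
    return overflow_count;
}
```

Compared with the `abs(...) & 0xFFFF8000` test, this sketch treats -32768 as in range (it fits in an `int16_t`) and never calls `abs` on `INT32_MIN`, which is undefined behavior in C. Whether the compiler really emits branch-free code still depends on the target and optimization level, so it is worth checking the generated assembly.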