Intro to removing code branches

removing_branches

On Intel’s 64 and IA-32 Architectures Optimization Reference Manual there is a section on how to optimize code via branch removal.

I found this interesting and started looking around for some places that would explain this in a bit more detail. I found a great post on Stackoverflow explaining how to Remove Branching via Bitwise Select and some more examples of branch removing via the bit-twiddling hacks article that has been reference many times.

I haven’t yet found a place that explains this in more detail so I thought that I’d give it a go.

Example #1:

Code with branch is:


if(value > some_other_value)
{
 value *= 23;
}
else
{
 value -= 5;
}

Solution #1:


const int Mask = (some_other_value-value)>>31;
value=((value*23)&Mask)|((value-5)&~Mask);

How does this work?

  1. Get the difference between value and some_other_value
  2. Mask will be a string of 32 1’s if the value is negative (due to 2’s compliment), 0 if the value is positive.
  3. Both multiplications are done either way, but by doing a Bitwise AND with the Mask (and its compliment via ~Mask), we either set one of the multiplications to 0, and leave the other alone by multiplying it by 1, since the ~ operator will turn all the 1’s to 0, except for the last bit.

Though this example is too specific, if it were more general then we could create some #defines to easily let you write branchless code, while keeping some sort of readability by naming the defines something like:

NO_BRANCH_EVAL_LESS_THAN_MULT_SUB(value, some_other_value, 23, 5)

Which would expand itself to be equal to the expression in Solution #1

Example #2:

Calculate the absolute value of an integer without branching,

Solution #2:

#define CHAR_BIT 8
int v; // we want to find the absolute value of v
unsigned int r; // the result goes here
int const mask = v >> sizeof(int) * CHAR_BIT - 1;

r = (v + mask) ^ mask;

How does this work?

  1. Again, get the MSB out of this by doing a 31 bit right shift
  2. Add the mask to the original value, and XOR it with the mask
  3. This means that if the value is positive, it remains unmodified
  4. However, if the value was negative, then things get interesting as this approach exploits the concept of 2’s compliment
  5. The value of the mask in this case becomes -1 (or 32 1s as represented by 2’s compliment)
  6. By adding -1 to the original value, we get -2
  7. Lastly, by doing an XOR with the new value,  we strip away the bits that represent -1 from the value, and arrive at the original value, with no negative sign.

Using -2 as an example:

abs

Until next time!

Leave a Reply

Your email address will not be published. Required fields are marked *