The 'select' instruction is used to choose one value based on a condition, without branching.
I want to know the differences between branching and select instructions (preferably for both x86 architectures and PTX). As far as I know, select is more optimal compared to branching instructions, but I don’t have a clear picture.
Branching is a general-purpose mechanism used to redirect control flow. It is used to implement most forms of the `if` statement (when specific optimizations don’t apply).

Selection is a specialized instruction available on some instruction sets which can implement some forms of the conditional expression, for example `z = cond ? x : y` or the equivalent `if (cond) z = x; else z = y;`,
provided that
`x` and `y` are plain values (if they were expressions, they would both have to be computed before the select, which might incur performance penalties or incorrect side-effect evaluation). Such an instruction is necessarily more limited than branching, but has the distinct advantage that control flow stays sequential: the instruction pointer simply advances to the next instruction. As a result, there is no branch to mispredict, so the processor never has to flush its pipeline because of it. Because of this, a select instruction (where available) is usually faster when the condition is hard to predict; a well-predicted branch can cost about the same or less, since the select has to wait for both of its inputs.

On SIMT architectures such as CUDA GPUs, divergent branches are very expensive performance-wise because the threads of a warp execute in lockstep. If threads within the same warp take different paths, the warp steps through both paths one after the other, and the threads that are not on the current path are masked off (effectively executing no-operations). A select instruction, however, doesn’t incur this kind of penalty.
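To make the distinction concrete, here is a minimal sketch in C. The function names (`pick_branch`, `pick_select`, `load_or_default`) are made up for illustration. Whether the compiler actually emits a jump, an x86 `cmov`, or (for the same expression in CUDA code) a PTX `selp` depends on the compiler, target and optimization level, so treat the comments as what the compiler is allowed to do, not what it is guaranteed to do.

```c
#include <stddef.h>
#include <stdint.h>

/* Branching form: control flow decides which assignment runs.
 * Without optimization this is typically a compare plus a jump. */
int32_t pick_branch(int cond, int32_t x, int32_t y)
{
    int32_t z;
    if (cond) {
        z = x;
    } else {
        z = y;
    }
    return z;
}

/* Select form: x and y are plain, already computed values, so the
 * compiler is free (but not required) to use a conditional move
 * instead of a jump. */
int32_t pick_select(int cond, int32_t x, int32_t y)
{
    return cond ? x : y;
}

/* Here one operand is an expression that is only valid on one path:
 * *p must not be evaluated when p is NULL. A select would need both
 * operands computed up front, so the load cannot simply be hoisted
 * above the test. */
int32_t load_or_default(const int32_t *p, int32_t fallback)
{
    return p ? *p : fallback;
}
```

Comparing the generated assembly (for example with `-O2 -S`) for the first two functions is an easy way to see which form your compiler picks.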
Note that most compilers will, with suitable options, generate ‘select’-style instructions like `cmov` if given a simple-enough `if` statement. Also, in some cases, it is possible to use bitwise manipulation or logical operations to combine a boolean conditional with expression values without performing a branch.
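As a rough illustration of the bitwise approach, the sketch below builds a mask from the condition and uses it to pick one of two values. The name `select_bitwise` is hypothetical, and the example assumes unsigned 32-bit operands; it is one common way to write this, not the only one.

```c
#include <stdint.h>

/* Branch-free select built from bitwise operations.
 * (cond != 0) is 0 or 1; subtracting it from 0 in unsigned
 * arithmetic gives a mask of all zeros or all ones. */
uint32_t select_bitwise(int cond, uint32_t x, uint32_t y)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
    return (x & mask) | (y & ~mask);
}
```

A call such as `select_bitwise(a < b, a, b)` then computes a minimum without an explicit `if`. Modern compilers often produce the same machine code for the plain `cond ? x : y`, so this form is mostly useful when you need to rule out a branch explicitly.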