Or, maybe, what I don’t get is unary coding:
In Golomb, or Rice, coding, you split a number N into two parts by dividing it by another number M and then code the integer result of that division in unary and the remainder in binary.
In the Wikipedia example, they use 42 as N and 10 as M, so we end up with a quotient q of 4 (in unary: 1110) and a remainder r of 2 (in binary 010), so that the resulting message is 1110,010, or 8 bits (the comma can be skipped). The simple binary representation of 42 is 101010, or 6 bits.
To me, this seems due to the unary representation of q which always has to be more bits than binary.
Clearly, I’m missing some important point here. What is it?
The important point is that Golomb codes are not meant to be shorter than the shortest binary encoding for one particular number. Rather, by providing a specific kind of variable-length encoding, they reduce the average length per encoded value compared to fixed-width encoding, if the encoded values are from a large range, but the most common values are generally small (and hence are using only a small fraction of that range most of the time).
As an example, if you were to transmit integers in the range from 0 to 1000, but a large majority of the actual values were in the range between 0 and 10, in a fixed-width encoding, most of the transmitted codes would have leading 0s that contain no information:
To cover all values between 0 and 1000, you need a 10-bit wide encoding in fixed-width binary. Now, as most of your values would be below 10, at least the first 6 bits of most numbers would be 0 and would carry little information.
To rectify this with Golomb codes, you split the numbers by dividing them by 10 and encoding the quotient and the remainder separately. For most values, all that would have to be transmitted is the remainder which can be encoded using 4 bits at most (if you use truncated binary for the remainder it can be less). The quotient is then transmitted in unary, which encodes as a single
0bit for all values below 10, as10for 10..19,110for 20..29 etc.Now, for most of your values, you have reduced the message size to 5 bits max, but you are still able to transmit all values unambigously without separators.
This comes at a rather high cost for the larger values (for example, values in the range 990..999 need 100 bits for the quotient), which is why the coding is optimal for 2-sided geometric distributions.
The long runs of 1 bits in the quotients of larger values can be addressed with subsequent run-length encoding. However, if the quotients consume too much space in the resulting message, this could indicate that other codes might be more appropriate than Golomb/Rice.