A code is the assignment of a unique string of characters (a
codeword) to each character in an alphabet.
A code in which the codewords contain only zeroes and ones is
called a binary code.
All ASCII codewords have the same length. This ensures that an
important property called the prefix property holds true for the
ASCII code.
The encoding of a string of characters from an alphabet (the
cleartext) is the concatenation of the codewords corresponding to
the characters of the cleartext, in order, from left to right. A code
is uniquely decodable if the encoding of every possible cleartext
using that code is unique.
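To make the two definitions above concrete, here is a minimal Python sketch. The names `encode`, `decodings`, and the three-character code `toy` are made up for illustration and are not part of the exercise. `encode` concatenates codewords from left to right, and `decodings` brute-forces every cleartext that produces a given bit string; a code is uniquely decodable only if that list never holds more than one entry.

```python
def encode(cleartext, code):
    """Concatenate the codewords of the cleartext characters, left to right."""
    return "".join(code[ch] for ch in cleartext)


def decodings(bits, code):
    """Return every cleartext whose encoding equals `bits` (brute-force search).

    Assumes every codeword is a non-empty string.
    """
    if not bits:
        return [""]
    results = []
    for ch, word in code.items():
        if bits.startswith(word):
            # Strip the matched codeword and decode the remainder recursively.
            results += [ch + rest for rest in decodings(bits[len(word):], code)]
    return results


# Hypothetical toy code, chosen only because it is easy to trace by hand.
toy = {"X": "0", "Y": "01", "Z": "10"}

print(encode("XZ", toy))      # 010
print(decodings("010", toy))  # ['XZ', 'YX']
```

Because `010` can be read back as either `XZ` or `YX`, this toy code is not uniquely decodable.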
Based on the above information, I was trying to do some exercises.
Consider the following table:
     Code1  Code2  Code3  Code4
A    0      0      1      1
B    100    1      01     01
C    10     00     001    001
D    11     11     0001   000
The confusions:
- Are all of the above assignments considered codes, since each character has a unique string of characters?
- I understand that Code1 and Code2 are prefix-free since their codewords do not have equal length. Having said that, if you look at Code4, the codewords for D and C both consist of 3 digits. Would Code4 be considered prefix-free too?
- Is Code3 the only uniquely decodable code?
I think you have misunderstood the prefix property. It isn't mainly about length (although enforcing the same length n on every codeword will make the code prefix-free; you cannot have unique codewords otherwise). Rather, it is about being able to identify each codeword uniquely, so that a decoder can greedily take the first translation that matches. In the case of fixed length, the decoder knows that it has to read n digits.

In the case of a variable-length code like Code1, you don't know upon reading 10 whether it should be translated to C or whether it is the first two digits of the three-digit B: 10 is a prefix of 100. The same holds true for Code2: 0 is a prefix of 00 and 1 is a prefix of 11.

Consider reading the sequence 100 one digit at a time with Code1: after 1 no codeword is complete; after 10 the decoder has matched C but cannot tell whether a B is still in progress; and the full 100 can be read either as B or as C followed by A.

Hope this helps you forward!
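If it helps, the check can also be done mechanically. The following is only a sketch in Python: the helper names `prefix_clashes` and `greedy_decode` are my own, and the `CODES` dictionary is simply the table from the question typed in by hand. `prefix_clashes` lists every pair where one codeword is a proper prefix of another, and `greedy_decode` is the "take the first translation that matches" decoder described above.

```python
# The four codes from the question, transcribed from the table.
CODES = {
    "Code1": {"A": "0", "B": "100", "C": "10", "D": "11"},
    "Code2": {"A": "0", "B": "1", "C": "00", "D": "11"},
    "Code3": {"A": "1", "B": "01", "C": "001", "D": "0001"},
    "Code4": {"A": "1", "B": "01", "C": "001", "D": "000"},
}


def prefix_clashes(code):
    """Return (short, long) pairs where `short` is a proper prefix of `long`."""
    words = list(code.values())
    return [(s, l) for s in words for l in words if s != l and l.startswith(s)]


def greedy_decode(bits, code):
    """Emit a character as soon as the buffered digits match a codeword."""
    lookup = {word: ch for ch, word in code.items()}
    out, buf = [], ""
    for digit in bits:
        buf += digit
        if buf in lookup:
            out.append(lookup[buf])
            buf = ""
    if buf:
        raise ValueError("input ended in the middle of a codeword")
    return "".join(out)


for name, code in CODES.items():
    clashes = prefix_clashes(code)
    print(name, "prefix-free" if not clashes else f"clashes: {clashes}")

# Greedy decoding of B's codeword under Code1 never gets as far as B:
print(greedy_decode("100", CODES["Code1"]))      # CA
# Under Code4 the greedy decoder recovers the cleartext D, A, C:
print(greedy_decode("0001001", CODES["Code4"]))  # DAC
```

Under these definitions Code3 and Code4 both come out prefix-free even though Code4 has two codewords of the same length, while Code1 and Code2 each contain a clash. A prefix-free code is always uniquely decodable, so Code3 is not the only uniquely decodable code in the table.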