Im trying to better understand how the huffman decoder works. Ive got a code table but im having a hard time trying to understand how the decoder would work because of ambiguity in the binary string.
(im learning this in prep for my final year at uni)
my table:
Data Hcode
0, 0
1, 1
2, 10
3, 11
17, 100
18, 101
19, 110
29, 111
If i have a huffman code string like 010011 i can return many different combinations of data so how can i discriminate?
i understand the huffman logic in BST representation and you follow a path to a given leaf which the path resembles the code for that given value between (0-255(ascii)) but i still dont know how you can discriminate between returning data: 0,1,0 or data: 0,17
do i really have to enforce 2 bit codes on data 0 and 1? (00 and 01)
i hope ive explained the best i can XD
If your wondering how I generated the table – your gonna kill me because i didnt use tree logic to generate it. Althought i sorted the data (random bytes) on frequency – i generated the Hcodes by converting the element position number into binary (hency why i called this post Poor Mans Huffman).
Many Thanks for any advice.
The code table is wrong. Huffman odes are supposed to be prefix free. This is neccessary in order to decode them afterwards without ambiguities.
If you would use a binary tree for creating the codes, this would automatically ensure the “prefix freeness”. See: http://en.wikipedia.org/wiki/Huffman_coding
And now, I am going to kill you … 😉