I wrote a silly Huffman compressor in Python so that I can compress image and sound data for my Tandy Color Computer projects. The decompressor is written in 6809 assembly.
I couldn't find a way to store the Huffman tree, so I generate assembly code that walks the tree and produces the correct uncompressed data. Here is an example:
DECOMP_HUFFMAN: PSHS A,B,X,Y,U
LDB #8
STB $2100
pshs x
ldx $2102
stx $2106
puls x
LDB ,X+
JMP inicio
prox_bit: LSLB
PSHS CC
DEC $2100
BNE S_P_B
LDB #8
STB $2100
LDB ,X+
S_P_B: PULS CC
RTS
armazena: STA ,U+
LEAY -1,Y
BNE inicio
PULS U,Y,X,B,A
RTS
inicio: jsr prox_bit
tfr cc,a
anda #1
sta $2104
lda ($2102)
bne n1
lda $2104
n0: pshs x
ldx $2102
leax 1,x
lda a,x
puls x
bsr armazena
pshs x
ldx $2106
stx $2102
puls x
bra inicio
n1: cmpa #1
bne n2
lda $2104
bne n0
bra n4
n2: cmpa #2
bne n3
lda $2104
beq n0
n3: lda $2104
n4: pshs x
ldx $2102
leax 1,x
lda a,x
leax a,x
stx $2102
puls x
bra inicio
I would like to store and use the real Huffman tree instead of generating assembly code that does the job.
Thank you for your time.
You can transmit a Huffman code simply by sending the code length for each symbol. You do not need to send a tree. A code length of zero indicates that that symbol does not occur.
What you send might be something like a plain list of code lengths, where you only send the numbers: the assignment to symbols is implied by symbol order.
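As a rough sketch of such a header on the Python side, assuming symbols are small integers (byte values) and that one byte per code length is acceptable. The function name `pack_code_lengths` and the four-symbol alphabet are made up for illustration and do not come from the question:

```python
def pack_code_lengths(lengths_by_symbol, alphabet_size=256):
    """Return one length byte per symbol, in symbol order.

    A length of 0 marks a symbol that never occurs in the data.
    """
    header = bytearray(alphabet_size)  # every symbol starts out as "unused"
    for symbol, length in lengths_by_symbol.items():
        if not 0 < length < 256:
            raise ValueError(f"code length {length} does not fit in one byte")
        header[symbol] = length
    return bytes(header)


# Hypothetical alphabet of 4 symbols; symbol 2 does not occur.
header = pack_code_lengths({0: 2, 1: 1, 3: 2}, alphabet_size=4)
print(list(header))
```

The decoder can rebuild the whole code from these bytes alone, so no tree structure has to be stored alongside the compressed data.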
Both ends would assume a canonical Huffman code, where the code values are assigned in order from the shortest code lengths to the longest. Within a bit length, the codes are assigned incrementally to the symbols in their order. For example (symbol: code length – code):
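A small case in that format can be generated with the Python sketch below. The lengths are hypothetical, not taken from the question's data, and `canonical_codes` is an illustrative name. It assigns codes from the shortest length to the longest, and in symbol order within a length:

```python
def canonical_codes(lengths):
    """Map symbol -> (length, code) for a canonical Huffman code.

    lengths[i] is the code length of symbol i; 0 means the symbol is unused.
    """
    codes = {}
    code = 0
    prev_len = 0
    # Sort by (length, symbol): shortest codes first, symbol order within a length.
    for length, symbol in sorted((l, s) for s, l in enumerate(lengths) if l > 0):
        code <<= length - prev_len  # append zero bits when the length grows
        codes[symbol] = (length, code)
        code += 1                   # next code of the same length
        prev_len = length
    return codes


lengths = [2, 1, 3, 3]  # hypothetical lengths for symbols A, B, C, D
for symbol, (length, code) in sorted(canonical_codes(lengths).items()):
    print(f"{'ABCD'[symbol]}: {length} - {code:0{length}b}")
```

With these lengths, B gets `0`, A gets `10`, C gets `110` and D gets `111`, so no code is a prefix of another.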
Now the decoder only has to compare the low bits with integer values at the cutoff between each bit length (store the bits above reversed), starting with the shortest. Within each bit length, an index from the start provides an offset into a lookup table for the symbol.
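One common way to realize a decoder along these lines is to read one bit at a time and, at each bit length, compare the accumulated code with the first code of that length. The Python sketch below does this with the most significant bit first, which is a simplification of the reversed, low-bit comparison described above. The names `decode`, `count` and `symbols` and the sample lengths are assumptions for illustration only:

```python
def decode(bits, lengths, n_symbols):
    """Decode n_symbols symbols from an iterable of 0/1 bits.

    lengths[i] is the canonical code length of symbol i (0 = unused).
    """
    max_len = max(lengths)
    count = [0] * (max_len + 1)          # count[n] = number of codes of length n
    for length in lengths:
        if length:
            count[length] += 1
    # Lookup table: symbols ordered by (length, symbol), as the encoder assigned them.
    symbols = [s for _, s in sorted((l, s) for s, l in enumerate(lengths) if l > 0)]

    bit_iter = iter(bits)
    out = []
    for _ in range(n_symbols):
        code = first = index = 0
        for length in range(1, max_len + 1):
            code |= next(bit_iter)
            if code - first < count[length]:          # code falls in this length's range
                out.append(symbols[index + (code - first)])
                break
            index += count[length]                    # skip this length's symbols
            first = (first + count[length]) << 1      # first code of the next length
            code <<= 1
        else:
            raise ValueError("bit stream does not match the code")
    return out


# Hypothetical code: lengths for symbols 0..3 give 0 -> 10, 1 -> 0, 2 -> 110, 3 -> 111.
stream = [1, 0, 0, 1, 1, 0, 1, 1, 1, 0]
print(decode(stream, [2, 1, 3, 3], 5))
```

The per-length state is only a count and a running first code, which keeps the tables small enough to be practical on an 8-bit target; the same loop should translate to a handful of compares and shifts in 6809 assembly.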