I’m trying to rewrite a C++ Patricia trie in Java.
The C++ code is from here
I’m a bit stuck.
So here’s my understanding:
    #define ZEROTAB_SIZE 256
    head->key = (char*)calloc(ZEROTAB_SIZE, 1);
We create an array of 256 bits for the key, so we can have a string with a maximum length of 32 characters, with every character represented by 8 bits. Can I implement this with a char array in Java?
    template <class T>
    int PatriciaTrie<T>::bit_get(PatriciaTrieKey bit_stream, int n) {
        if (n < 0) return 2; // "pseudo-bit" with a value of 2.
        int k = (n & 0x7);
        return ( (*(bit_stream + (n >> 3))) >> k) & 0x1;
    }
k gets the last 7 bits of n. We move to character n/8 of the string (not exactly n/8, since the right shift rounds down, so anything lower than 8 becomes zero), then we shift the value of bit_stream[n>>3] right by k and take the last bit. If I use arrays in Java, could I rewrite this as
    return (bit_stream[n>>3] >> k) & 0x1;
?
    template <class T>
    int PatriciaTrie<T>::bit_first_different(PatriciaTrieKey k1, PatriciaTrieKey k2) {
        if (!k1 || !k2)
            return 0; // First bit is different!
        int n = 0;
        int d = 0;
        while ( (k1[n] == k2[n]) &&
                (k1[n] != 0) &&
                (k2[n] != 0) )
            n++;
        while (bit_get(&k1[n], d) == bit_get(&k2[n], d))
            d++;
        return ((n << 3) + d);
    }
Now this is where it gets confusing. The first part, up to the second while loop, looks clear enough: loop and check how many bits are equal and non-zero. But I’m not sure what the second loop is doing. We take the address of the two keys and check whether their first bits are equal, and if they are, we check again until we find unequal bits?
Mainly I’m not sure how the address of the key is used here, but I might also be confused about the bit shifting in bit_get.
I want to compare the trie in C++ and Java for my Java class, and I want to keep the implementations as similar as possible.
I’m not familiar with this data structure, but there are some problems with your understanding of this code.
First, `calloc` allocates 256 bytes, not bits. `new byte[256]` would be comparable in Java.

Second, `n & 0x7` gets three bits of `n`, not seven. A clearer way to write this would be `n/8` and `n%8` instead of `n>>3` and `n & 7`, but the bitwise operations might be slightly faster if your compiler is stupid.

You are correct that `(bit_stream[n>>3]>>k) & 1` is the same.

Now, the first loop in `bit_first_different` loops over bytes, not bits. The check for 0 is to prevent running off the end of the keys. Once that loop terminates, `n` refers to the first differing byte. The second loop is then looking for which bit is different.

Note that if the two keys are not different, then the second loop may run off the end of the keys, potentially causing a segmentation fault.

Now, the `&` is taking the address of `k1[n]` because the `bit_get` function is expecting a pointer to a character… this passes in the `n`th element of the bit stream. After the loop, `d` is the offset of the first different bit of `k[n]`.

Finally the code combines `n` (which byte?) with `d` (which bit in that byte?) to give the bit. Again I would advocate `8*n+d` for clarity, but that’s a matter of taste.
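To tie this together for the Java port, here is a rough sketch of how the two functions might look. It is only one possible translation, and it makes several assumptions that are mine rather than the original code's: keys are stored as zero-terminated `byte[]` arrays with no zero bytes inside them, the class name `PatriciaBits` and the helper `toKey` are made up for illustration, the C++ pointer `&k1[n]` is replaced by an extra `offset` argument, and identical keys return -1 instead of running off the end of the array.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PatriciaBits {
    // Hypothetical helper: encode a string as a zero-terminated byte key,
    // similar to a C string. Assumes the string contains no '\0' characters.
    static byte[] toKey(String s) {
        byte[] raw = s.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.copyOf(raw, raw.length + 1); // copyOf pads with a trailing 0
    }

    // Returns bit n of the key, counting from byte index `offset`.
    // The offset stands in for the pointer arithmetic in the C++ version.
    static int bitGet(byte[] key, int offset, int n) {
        if (n < 0) return 2;       // same "pseudo-bit" as the C++ code
        int k = n & 0x7;           // which bit inside the byte (n % 8)
        // key[...] is sign-extended to int, but the low 8 bits are unchanged,
        // so masking with 1 after shifting by at most 7 is still correct.
        return (key[offset + (n >> 3)] >> k) & 0x1;
    }

    // Index of the first differing bit, or -1 if the keys are identical
    // (the -1 is an addition of this sketch, not part of the original).
    static int bitFirstDifferent(byte[] k1, byte[] k2) {
        if (k1 == null || k2 == null) return 0;
        int n = 0;
        // Skip over equal bytes, stopping at the terminating zero.
        while (k1[n] == k2[n] && k1[n] != 0) n++;
        if (k1[n] == k2[n]) return -1; // both hit the terminator: same key
        int d = 0;
        // The bytes differ, so some bit in 0..7 must differ as well.
        while (bitGet(k1, n, d) == bitGet(k2, n, d)) d++;
        return (n << 3) + d;       // same as 8*n + d
    }
}
```

For example, `bitFirstDifferent(toKey("ab"), toKey("ac"))` skips the shared first byte and then finds that `'b'` (0x62) and `'c'` (0x63) differ in their lowest bit, so it returns 8.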