Given a pseudorandom number generator int64 rand64(), I would like to build a set of pseudorandom numbers. The set should have the property that no non-empty subset of it XORs to the value 0.
I’m thinking of the following algorithm:
count = 0
set = {}
while (count < desiredSetSize)
    set[count] = rand64()
    if propertyIsNotFullfilled(set[0] to set[count])
        continue
    count = count + 1
The question is: How can propertyIsNotFullfilled be implemented?
Notes: The reason I want to generate such a set is the following: I have a hash table where the hash values are generated via Zobrist hashing. Instead of keeping a boolean value with each hash table entry to indicate whether the entry is filled, I thought the hash value, which is stored with each entry, would be sufficient for this information (0 … empty, != 0 … set). There is another reason to carry this information as a sentinel value inside the hash-key table: I’m trying to switch from an AoS (Array of Structures) to an SoA (Structure of Arrays) memory layout, to avoid padding and to test whether there are fewer cache misses. I hope that in most cases an access to the hash-key table is enough (assuming the hash value tells whether the entry is empty or not).
I also thought about reserving the most significant bit of the hash values for this information, but this would reduce the range of possible hash values more than necessary. Theoretically the range would shrink from 2^64 (minus the sentinel value 0) to 2^63.
One can also read the question the other way round: Given a set of 84 pseudorandom numbers, is there any number which can’t be generated by XORing some subset of this set, and how can I find it? Such a number could be used as the sentinel value.
Now, what I need it for: I have developed a Connect Four game engine. There are 6 x 7 possible moves for player A and the same number for player B, so there are 84 possible moves in total (and therefore 84 random values are needed). The hash value of a board state is generated from the precalculated random values in the following manner: hash(board) = randomset[move1] XOR randomset[move2] XOR randomset[move3] ...
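The hashing scheme described above can be sketched roughly as follows. This is only an illustration: the names `make_random_set`, `move_index` and `board_hash`, the (player, row, col) representation of a move, and the use of Python's `random` module in place of rand64() are all assumptions, not part of the engine in question.

```python
import random

ROWS, COLS, PLAYERS = 6, 7, 2  # 6 x 7 cells, two players -> 84 moves


def make_random_set(seed=0):
    """Return 84 pseudorandom 64-bit values (stand-in for calls to rand64())."""
    rng = random.Random(seed)
    return [rng.getrandbits(64) for _ in range(ROWS * COLS * PLAYERS)]


def move_index(player, row, col):
    """Map a move (player 0/1, row 0..5, col 0..6) to an index 0..83 (assumed layout)."""
    return (player * ROWS + row) * COLS + col


def board_hash(moves, randomset):
    """XOR together the random values of all moves played so far."""
    h = 0
    for player, row, col in moves:
        h ^= randomset[move_index(player, row, col)]
    return h
```

Two things follow directly from the XOR: the hash does not depend on the order of the moves, and the empty board always hashes to 0, whatever the random values are.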
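IMHO this restricts the maximum size of the set to 64 (pigeonhole principle): a set of n values has 2^n subsets, but there are only 2^64 possible 64-bit XOR results, so for n > 64 two different subsets must XOR to the same value, and the elements in which they differ form a non-empty subset that XORs to zero. For sets of 64 or fewer elements, the property can be fulfilled.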
To further illustrate my point: consider a system of 64 equations in 64 unknown variables, then add one extra equation. The fact that the equations and variables are Boolean does not make the problem different.
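For sets of at most 64 values, one possible way to implement the check is Gaussian elimination over single bits (often called an XOR basis): a candidate is rejected if it can be reduced to 0 by XORing it with values already accepted. The sketch below is a minimal illustration, not a tuned implementation; `try_insert`, `build_independent_set` and the `rand64` stand-in built on Python's `random` module are names assumed for the example.

```python
import random

WORD_BITS = 64
_rng = random.Random(12345)  # assumed stand-in for the asker's generator


def rand64():
    """Hypothetical replacement for int64 rand64()."""
    return _rng.getrandbits(WORD_BITS)


def try_insert(basis, value):
    """Try to add value to the XOR basis.

    basis[i] is either 0 (free slot) or a value whose highest set bit is i.
    Returns True if value was independent of everything inserted so far,
    False if it is the XOR of some subset of the earlier values (or is 0).
    """
    for bit in range(WORD_BITS - 1, -1, -1):
        if not (value >> bit) & 1:
            continue
        if basis[bit] == 0:
            basis[bit] = value  # new leading bit: value is independent
            return True
        value ^= basis[bit]     # clear this bit and keep reducing
    return False                # reduced to 0: a subset would XOR to zero


def build_independent_set(desired_size, rand=rand64):
    """Collect desired_size values such that no non-empty subset XORs to 0."""
    if desired_size > WORD_BITS:
        raise ValueError("at most 64 independent 64-bit values exist")
    basis = [0] * WORD_BITS
    result = []
    while len(result) < desired_size:
        candidate = rand()
        if try_insert(basis, candidate):
            result.append(candidate)
    return result
```

Once 64 values have been accepted, every slot of the basis is occupied and any further candidate is rejected, which is the limit described above. For the 84 values of the Connect Four case, the sketch therefore raises an error instead of looping forever.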
–EDIT/UPDATE–: Since the application appears to be the game Connect Four, you could instead enumerate all possible configurations. Not having to encode the impossible board configurations saves enough coding space to fit any valid board position in 64 bits:
Encoding the colored stones as {A,B} and irrelevant positions as {X}, the configuration of a column (height = 6) can be one of:
(and similar for B instead of A). The numbers below the piles are the number of possibilities for the Xs on top; the negative numbers are the number of forbidden/impossible configurations. For the column with one A and 4 Xs, every value for the Xs is valid, except 3*A (the game would already have ended). The same holds for the rightmost pile: the bottom 3 Xs cannot all be A, and the Xs cannot all be B.
This leads to a total of 1 + 2 * (63-7) = 113.
(1 is for the empty column, 2 is the number of colors.) So 113 is the number of configurations for one column, which fits within 7 bits. For 7 columns we need 7*7 = 49 bits. (We might save one bit for the L/R mirror symmetry, maybe even one for the color symmetry, but that would only complicate things, IMHO.)
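As a rough illustration of the bit budget, the sketch below packs seven per-column codes into one 49-bit key. It assumes that each column configuration has already been mapped to a code in the range 0..112; that mapping is not shown here, and `pack_board` / `unpack_board` are hypothetical names.

```python
COLUMNS = 7
BITS_PER_COLUMN = 7
CONFIGS_PER_COLUMN = 1 + 2 * (63 - 7)  # 113 configurations, fits in 7 bits


def pack_board(column_codes):
    """Pack 7 column codes (each assumed to be in 0..112) into one integer."""
    if len(column_codes) != COLUMNS:
        raise ValueError("expected one code per column")
    key = 0
    for i, code in enumerate(column_codes):
        if not 0 <= code < CONFIGS_PER_COLUMN:
            raise ValueError("column code out of range")
        key |= code << (i * BITS_PER_COLUMN)
    return key


def unpack_board(key):
    """Recover the 7 column codes from a packed key."""
    mask = (1 << BITS_PER_COLUMN) - 1
    return [(key >> (i * BITS_PER_COLUMN)) & mask for i in range(COLUMNS)]
```

Since such a key never reaches 2^49, any 64-bit value at or above 2^49 (for instance all bits set) is unused and could serve as the "empty entry" sentinel.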
There will still be a lot of coding space wasted (the columns are not independent, the number of As on the board is equal to the number of Bs or one more, etc.), but I don’t think it would be easy to avoid that. Fortunately, it will not be necessary.