I have the following code:
procedure TCellBlock.GeneratePtoQ;
var
  x,y: integer;
  i: integer;
  Change: cardinal;
begin
  i:= 0;
  //Walk the grid of changed (alive) cells
  for x:= GridMaxX downto 1 do begin
    for y:= GridMaxY downto 1 do begin
      if Active[cIndexP][x, y] then begin
        Active[cIndexP][x,y]:= false;
        //Put active items on the stack.
        ToDo[i]:= x shl 16 or y;
        Inc(i);
      end; {if}
    end; {for y}
  end; {for x}
  while i > 0 do begin
    Dec(i);
    y:= ToDo[i] and $FFFF;
    x:= ToDo[i] shr 16;
    //Calculate the cell, Change = (oldval XOR newval)
    Change:= Grid[x,y].GeneratePtoQ;
    //Mark the cells in the grid that need to be recalculated next generation.
    Active[cIndexQ][x,y]:= Active[cIndexQ][x,y] or (Change <> 0);
    Active[cIndexQ][x+1,y+1]:= Active[cIndexQ][x+1,y+1] or ((Change and $cc000000) <> 0);
    Active[cIndexQ][x+1,y]:= Active[cIndexQ][x+1,y] or ((Change and $ff000000) <> 0);
    Active[cIndexQ][x,y+1]:= Active[cIndexQ][x,y+1] or ((Change and $cccccccc) <> 0);
  end; {while}
end;
The above is a code snippet from a test program that calculates Conway's Game of Life.
The code needs to be as fast as possible, and for this purpose I'm trying different approaches.
It walks through a grid of active cells, looks to see which cells are active and puts those
on a stack.
Next it processes the items on the stack and sees which cells have changed.
If a cell has changed, it records the change in the grid for the next generation.
I store cells in 32-bit cardinals (4 bits in Y, 8 bits in X). The P (even) generations are offset by 1,1 pixel relative to the Q (odd) generations, so I only have to take 3 neighbors into account instead of 8.
Question
I want to get rid of the grid; I just want to deal with the stack.
How do I implement a stack that eliminates duplicates?
Note that it needs to be as fast as possible and I’m not above using dirty tricks to get that.
I’ve been thinking about it and I think I have a solution.
Some background
Here's how the data is laid out in memory.
I haven’t decided on the size of the building block, but this is the general idea.
Even generations are called P, odd generations are called Q. They are staggered like this:
This way, when generating P -> Q, I only need to look at P itself and its S, SE and E neighbors instead of all 8 neighbors. Ditto for Q -> P: I only need to look at Q itself and its N, NW and W neighbors.
Also notice that the staggering saves me time in translating the result of the lookup, because I have to do less bit shifting to put the results in place.
When I loop through a grid (Figure A) I walk through the cells (Figure B) in the order shown in Figure A: always in strictly increasing order in a P-cycle and always in decreasing order in a Q-cycle.
In fact the Q-cycle works in exactly the opposite order from the P-cycle; this speeds things up by reusing the cache as much as possible.
I want to minimize the use of pointers, because pointers cannot be predicted and are not accessed sequentially (they jump all over the place). So I want to use arrays, stacks and queues as much as possible.
What data do I need to keep track of
I need to keep track of only the cells that change. If a cell (that is, an int32) does not change from one generation to the next, I remove it from consideration.
This is what the code in the question does. It uses a grid to keep track of the changes, but I want to use a stack, not a grid, and I only want to deal with active cells; I don't want to know about stable or dead cells.
Some background on the data
Notice how the index of the cell itself is always monotonically increasing, as are the indexes of its S neighbor, its E neighbor and its SE neighbor. I can use this info to cheat.
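The monotonic order is what makes cheap duplicate removal possible: if keys only ever arrive in non-decreasing order, a duplicate can only be equal to the entry most recently pushed. Below is a minimal sketch of that idea, not the final code. `TCellList`, `PushUnique` and `MaxItems` are hypothetical names, and the key packing `(x shl 16) or y` is borrowed from the `ToDo` array in the question.

```pascal
program UniquePushDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  MaxItems = 1024;  // hypothetical capacity, pick whatever fits the block size

type
  TCellList = record
    Items: array[0..MaxItems - 1] of Cardinal;
    Count: Integer;
  end;

// Append Key unless it equals the most recently appended key.
// This only removes all duplicates if keys arrive in non-decreasing order.
procedure PushUnique(var List: TCellList; Key: Cardinal);
begin
  if (List.Count = 0) or (List.Items[List.Count - 1] <> Key) then begin
    List.Items[List.Count] := Key;
    Inc(List.Count);
  end;
end;

var
  A: TCellList;
  i: Integer;
begin
  A.Count := 0;
  // Cell (2,5) changed and also touches its y+1 neighbor (2,6).
  PushUnique(A, (2 shl 16) or 5);
  PushUnique(A, (2 shl 16) or 6);
  // Cell (2,6) is processed next and pushes itself: dropped as a duplicate.
  PushUnique(A, (2 shl 16) or 6);
  PushUnique(A, (2 shl 16) or 7);
  for i := 0 to A.Count - 1 do
    WriteLn(A.Items[i] shr 16, ',', A.Items[i] and $FFFF);
end.
```

This prints `2,5`, `2,6` and `2,7`: four pushes, three stored entries. The check costs one compare per push and needs no grid or hash table, but it silently lets duplicates through if the ordering assumption is ever violated.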
The solution
I use a stack to keep track of the cell itself and its S neighbor, and a queue to keep track of its E and SE neighbors. When I'm done I merge the two.
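The merge step can be sketched as an ordinary two-way merge that skips equal keys. This is only an illustration under simplifying assumptions: both inputs are treated as plain ascending arrays (rather than a real stack and queue), keys are packed as `(x shl 16) or y` like the `ToDo` entries in the question, and `MergeUnique`, `ListA`, `ListB` and `NextGen` are hypothetical names.

```pascal
program MergeUniqueDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Merge two ascending key lists into Dest, writing each key only once.
// Returns the number of keys written.
// Dest must have room for at least CountA + CountB entries.
function MergeUnique(const A: array of Cardinal; CountA: Integer;
  const B: array of Cardinal; CountB: Integer;
  var Dest: array of Cardinal): Integer;
var
  ia, ib, n: Integer;
  Key: Cardinal;
begin
  ia := 0; ib := 0; n := 0;
  while (ia < CountA) or (ib < CountB) do begin
    // Take the smaller head; if B is exhausted, take from A, and vice versa.
    if (ib >= CountB) or ((ia < CountA) and (A[ia] <= B[ib])) then begin
      Key := A[ia]; Inc(ia);
    end else begin
      Key := B[ib]; Inc(ib);
    end;
    // Equal keys come out next to each other, so one compare removes them.
    if (n = 0) or (Dest[n - 1] <> Key) then begin
      Dest[n] := Key;
      Inc(n);
    end;
  end;
  Result := n;
end;

const
  // Cell itself plus its y+1 neighbor.
  ListA: array[0..2] of Cardinal = ($00020005, $00020006, $00030005);
  // The two x+1 neighbors.
  ListB: array[0..1] of Cardinal = ($00030005, $00030006);
var
  NextGen: array[0..4] of Cardinal;
  n, i: Integer;
begin
  n := MergeUnique(ListA, 3, ListB, 2, NextGen);
  for i := 0 to n - 1 do
    WriteLn(NextGen[i] shr 16, ',', NextGen[i] and $FFFF);
end.
```

The key `$00030005` appears in both inputs but only once in the output, so this prints `2,5`, `2,6`, `3,5` and `3,6`. The merge reads both inputs strictly sequentially, which fits the goal of avoiding pointer chasing.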
Suppose in the Grid the following cells come out as active after I’ve calculated them:
And even better
I don't actually have to build the two stacks and do the merging as two separate steps. If I make A a stack and B a queue, I can do the second step described in the pseudo code and the building of the two stacks in one pass.
Note
When a cell changes, its S, E or SE border does not necessarily change, but I can test for that using the masks in table C, and only add the cells that really need checking in the next generation to the list.
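A rough sketch of that mask test follows. Table C is not reproduced here, so the masks are simply the three constants from the code in the question, and which border each one covers is an assumption tied to that code's `[x+1,y]`, `[x,y+1]` and `[x+1,y+1]` updates. `TCellList`, `PushUnique`, `QueueChanged` and the mask names are hypothetical.

```pascal
program MaskedPushDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  MaxItems = 1024;       // hypothetical capacity
  // Masks copied from the question's code, not from table C.
  MaskXY = $cc000000;    // change reaches [x+1, y+1]
  MaskX  = $ff000000;    // change reaches [x+1, y]
  MaskY  = $cccccccc;    // change reaches [x, y+1]

type
  TCellList = record
    Items: array[0..MaxItems - 1] of Cardinal;
    Count: Integer;
  end;

// Append Key unless it equals the last key (keys must be non-decreasing).
procedure PushUnique(var List: TCellList; Key: Cardinal);
begin
  if (List.Count = 0) or (List.Items[List.Count - 1] <> Key) then begin
    List.Items[List.Count] := Key;
    Inc(List.Count);
  end;
end;

// Record which cells need recalculating, given Change = oldval xor newval.
// A receives the cell and its y+1 neighbor, B the two x+1 neighbors.
procedure QueueChanged(x, y, Change: Cardinal; var A, B: TCellList);
begin
  if Change = 0 then Exit;  // stable cell: nothing to schedule
  PushUnique(A, (x shl 16) or y);
  if (Change and MaskY) <> 0 then PushUnique(A, (x shl 16) or (y + 1));
  if (Change and MaskX) <> 0 then PushUnique(B, ((x + 1) shl 16) or y);
  if (Change and MaskXY) <> 0 then PushUnique(B, ((x + 1) shl 16) or (y + 1));
end;

var
  A, B: TCellList;
begin
  A.Count := 0;
  B.Count := 0;
  QueueChanged(2, 5, $01000000, A, B);  // only the MaskX border is hit
  QueueChanged(2, 6, $00000004, A, B);  // only the MaskY border is hit
  WriteLn('A: ', A.Count, ' B: ', B.Count);
end.
```

With these two calls, A ends up holding (2,5), (2,6) and (2,7) and B holds only (3,5), so the program prints `A: 3 B: 1`. Cells must be visited in increasing key order for the duplicate check inside `PushUnique` to be sufficient.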
Benefits