OK, guys : memory optimization is definitely not my thing and since I’m currently working on a big, and cpu and memory-intensive project, I think I need your help.
The project is a Chess Engine and the actual problem lies (I guess) in one of the 2 following methods (the code above is not 100% exact but it’s more-or-less that) :
Tree Search (MiniMax) – the actual code is an Alpha-Beta with different additions, but this quite basic example is more illustrative :
int Board::miniMax_(int ply)
{
if (ply == this->searchDepth) return this->eval();
int best = MINUS_INF;
vector<Move*> moves = this->possibleMoves();
FOREACH(Move*, move, moves)
{
HASHMAKE((*move),this);
int val = -this->miniMax_(ply+1);
UNHASHMAKE((*move),this);
if (val>best) {
if (ply==0) this->bestMove = (*move);
best = val;
}
}
return best;
}
Move Generation (if you haven’t ever played with Chess programming and bitboards, the following might simply look almost non-sensical; but you’ll still get the idea in terms of memory handling – well, hopefully…):
vector<Move*> Board::possibleMoves()
{
U64 ownPieces = this->piecesForColor(this->playing);
U64 occupied = ~(this->pieces[empty]);
vector<Move*> moves;
//-----------------------------------------
// "Normal" Piece Moves
//-----------------------------------------
const int from = (1+(this->playing))*3;
const int to = (1+(this->playing))*3+6;
for (int pieceType=from; pieceType<to; pieceType++)
{
U64 piece = this->pieces[pieceType];
for (; piece != 0; piece &= (piece - 1))
{
UINT pos = log2(piece & ~(piece-1));
U64 move;
switch (pieceType)
{
case bPawns: move = BPAWN_(pos,ownPieces,occupied); break;
case wPawns: move = WPAWN_(pos,ownPieces,occupied); break;
case bRooks:
case wRooks: move = ROOK_(pos,ownPieces,occupied); break;
case bKnights:
case wKnights: move = KNIGHT_(pos,ownPieces,occupied); break;
case bBishops:
case wBishops: move = BISHOP_(pos,ownPieces,occupied); break;
case bQueen:
case wQueen: move = QUEEN_(pos,ownPieces,occupied); break;
case bKing:
case wKing: move = KING_(pos,ownPieces,occupied); break;
default:break;
}
for (; move !=0; move &= (move-1))
{
moves += new Move(pos, log2(move&~(move-1)),this);
}
}
}
return moves;
}
The Move class
//=======================================================
// Prototype
//=======================================================
class Move
{
public:
Move (string m, Board* b) : from(ALG2FROM(m)), to(ALG2TO(m)), pieceFrom(b->atPosition[from]), pieceTo(b->atPosition[to]) {}
Move (int f, int t, Board* b) : from(f), to(t), pieceFrom(b->atPosition[from]), pieceTo(b->atPosition[to]) {}
Move (int val) : value(val) {}
inline string notation();
inline string out();
//----------------------
int from, to;
int pieceFrom, pieceTo;
int value;
};
//=======================================================
// Inline Functions
//=======================================================
inline string Move::notation()
{
return SSTR(SPOS(this->from) << SPOS(this->to));
}
inline string Move::out()
{
return SSTR(this->notation() << " :: " << this->value);
}
Obviously, the first function being recursive AND called some millions of times, there is some expected load. The thing is once the search goes up to the 4th,5th ply or something, the app already takes up like 2GB. And the thing is that once it’s complete (the search), the memory is still not freed – so I suppose this is indicating a problem.
So, any ideas?
Please, just let me know in case you need to know anything else about the implementation.
Hints :
FOREACHis just a macro for a vector iterator- the
+=for vector appending comes from Boost - everything in bold is a macro, but in terms of memory overhead, none of them is doing anything intensive (so I decided to omit them)
- no destructors implemented whatsoever
Here:
A vector of pointers does not free the pointers for you. You could try a vector of
std::unique_ptr<Move>instead.You will have better performance if you do not do individual allocations of moves. Use a memory pool. In that respect, you might not really need a vector at all.
Unless
Moveis a polymorphic class, I suggest you don’t allocate it at all. Instead, make a bunch ofSetfunctions onMoveand declare yourpossibleMovesfunction like this:And obviously call it like this:
So, this means that when you add a move, instead of doing a
new, you can do something like this:That invokes the copy constructor. If you want to avoid an extra copy, then make an empty constructor for
Moveand make the afore-mentioned ‘setter’ function:I am not 100% sure whether that will be any quicker. Anyway, another thing… If you expect that a typical board almost always has less than 50 possible moves, you can improve performance by doing this at the beginning of
possibleMoves:That means the vector will hardly ever have to be resized, and thus makes the
push_backoperation faster.