I am trying to write a C++ program for the canonical genetic algorithm, where you have a population of individuals (chromosomes) of length N, and each element is a 0 or a 1.
I have started writing my program using STL vectors, but before I go more deeply into it I would like to ask your opinions about how to write the functions and the data structures in the most efficient way.
Memory footprint is not a problem: I have a population of about 100 individuals, each of which is a 64-character string of 0s and 1s. Performance, on the other hand, is very important, as there will be thousands of generations, each involving thousands of operations.
Here is my implementation so far (just the most important functions and the data structures):
typedef vector<int> chromosome;
typedef vector<chromosome> population;
population popul;
float eval[number];
// Single-point crossover. child_a and child_b must be empty on entry,
// because the genes are appended to them.
void cross_chromosomes( const chromosome &parent_a, const chromosome &parent_b, chromosome &child_a, chromosome &child_b )
{
    int crossing_point = crossing_point_gen( gen );
    // child_a = head of parent_a + tail of parent_b
    child_a.reserve( length );
    child_a.insert( child_a.end(), parent_a.cbegin(), parent_a.cbegin() + crossing_point );
    child_a.insert( child_a.end(), parent_b.cbegin() + crossing_point, parent_b.cend() );
    // child_b = head of parent_b + tail of parent_a
    child_b.reserve( length );
    child_b.insert( child_b.end(), parent_b.cbegin(), parent_b.cbegin() + crossing_point );
    child_b.insert( child_b.end(), parent_a.cbegin() + crossing_point, parent_a.cend() );
}
// Evaluate the fitness of every individual in the population.
void calculate_eval()
{
    for( int i = 0; i < number; i++ )
    {
        eval[i] = evaluate_chromosome( popul[i] );
    }
}
Do you think this is an efficient way of implementing the algorithm? I originally used vector for the chromosome, but after reading this question: C++ Vector vs Array (Time), I updated my code to vector<int>.
Are there other optimisations I should make to my code? Is the crossover code efficient as it stands?
The crossover code looks about as efficient as it can be for what you are doing with vectors. In my experience with genetic algorithms, the fitness function and the selection operator take the most time. Since crossover and mutation are applied only to a sample of the population, you don’t have to worry too much about the efficiency of the crossover operator. Focus on choosing a good representation for your data and on an efficient implementation of the fitness function.
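As one illustration of what a different representation could look like: since each chromosome is exactly 64 bits long, it could fit into a single 64-bit integer, so crossover becomes a couple of mask operations and nothing is allocated. This is only a sketch under that assumption, not a drop-in replacement for your code. The names chromosome64, cross_chromosomes64 and count_ones are made up for the example, and the bit-counting fitness is just a placeholder for whatever evaluate_chromosome really does. The crossing point is passed in as a parameter, so you would still draw it from your own generator.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical alternative representation: one 64-bit word per chromosome,
// with bit i holding gene i.
using chromosome64 = std::uint64_t;

// Single-point crossover on packed chromosomes.
// Genes [0, point) of the first child come from parent_a and the rest from
// parent_b; the second child is the mirror image.
inline std::pair<chromosome64, chromosome64>
cross_chromosomes64( chromosome64 parent_a, chromosome64 parent_b, unsigned point )
{
    assert( point <= 64 );
    // Mask with the low `point` bits set. Shifting a 64-bit value by 64 is
    // undefined behaviour, so that case is handled separately.
    const chromosome64 low = ( point == 64 ) ? ~chromosome64{ 0 }
                                             : ( ( chromosome64{ 1 } << point ) - 1 );
    const chromosome64 child_a = ( parent_a & low ) | ( parent_b & ~low );
    const chromosome64 child_b = ( parent_b & low ) | ( parent_a & ~low );
    return { child_a, child_b };
}

// Placeholder fitness (an assumption for this example): the number of genes
// set to 1. Each loop iteration clears the lowest set bit.
inline int count_ones( chromosome64 c )
{
    int n = 0;
    while( c != 0 )
    {
        c &= c - 1;
        ++n;
    }
    return n;
}
```

With this layout the whole population of 100 individuals would be a vector of 100 integers, which stays in cache easily. Whether it is actually faster for you depends on how your fitness function reads the genes, so measure before committing to it.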