I’m attempting to write a genetic algorithm framework in Python, and am running into issues with shallow/deep copying. My background is mainly C/C++, and I’m struggling to understand how these connections are persisting.
What I am seeing is an explosion in the length of an attribute list within a subclass. My code is below…I’ll point out the problems.
This is the class for a single gene. Essentially, it should have a name, value, and boolean flag. Instances of Gene populate a list within my Individual class.
# gene class
class Gene():
# constructor
def __init__(self, name, is_float):
self.name_ = name
self.is_float_ = is_float
self.value_ = self.randomize_gene()
# create a random gene
def randomize_gene(self):
return random.random()
This is my Individual class. Each generation, a population of these are created (I’ll show the creation code after the class declaration) and have the typical genetic algorithm operations applied. Of note is the print len(self.Genes_) call, which grows each time this class is instantiated.
# individual class
class Individual():
# genome definition
Genes_ = [] # genes list
evaluated_ = False # prevent re-evaluation
fitness_ = 0.0 # fitness value (from evaluation)
trace_ = "" # path to trace file
generation_ = 0 # generation to which this individual belonged
indiv_ = 0 # identify this individual by number
# constructor
def __init__(self, gen, indv):
# assign indices
self.generation_ = gen
self.indiv_ = indv
self.fitness_ = random.random()
# populate genome
for lp in cfg.params_:
g = Gene(lp[0], lp[1])
self.Genes_.append(g)
print len(self.Genes_)
> python ga.py
> 24
> 48
> 72
> 96
> 120
> 144
......
As you can see, each Individual should have 24 genes, however this population explodes quite rapidly.
I create an initial population of new Individuals like this:
# create a randomized initial population
def createPopulation(self, gen):
loc_population = []
for i in range(0, cfg.population_size_):
indv = Individual(gen, i)
loc_population.append(indv)
return loc_population
and later on my main loop (apologies for the whole dump, but felt it was necessary – if my secondary calls (mutation/crossover) are needed please let me know))
for i in range(0, cfg.generations_):
# evaluate current population
self.evaluate(i)
# sort population on fitness
loc_pop = sorted(self.population_, key=operator.attrgetter('fitness_'), reverse=True)
# create next population & preserve elite individual
next_population = []
elitist = copy.deepcopy(loc_pop[0])
elitist.generation_ = i
next_population.append(elitist)
# perform selection
selection_pool = []
selection_pool = self.selection(elitist)
# perform crossover on selection
new_children = []
new_children = self.crossover(selection_pool, i)
# perform mutation on selection
muties = []
muties = self.mutation(selection_pool, i)
# add members to next population
next_population = next_population + new_children + muties
# fill out the rest with random
for j in xrange(len(next_population)-1, cfg.population_size_ - 1):
next_population.append(Individual(i, j))
# copy next population over old population
self.population_ = copy.deepcopy(next_population)
# clear old lists
selection_pool[:] = []
new_children[:] = []
muties[:] = []
next_population[:] = []
I’m not not completely sure that I understand your question, but I suspect that your problem is that the Genes_ variable in your Individual() class is declared in the class namespace. This namespace is available to all members of the class. In other words, each instance of Individual() will share the same variable Genes_.
Consider the following two snippets:
and
The first snippet outputs
while the second snippet outputs
In the first scenario, when the second Individual() is instantiated the genes list variable already exists, and the genes from the second individual are added to this existing list.
Rather than creating the Individual() class like this,
you should consider declaring the Genes_ variable in init so that each Individual() instance gets its own gene set