The term genetic algorithm is used to describe a program which simulates the Darwinian theory of evolution by natural selection. It may sound complicated, but the idea and code are actually really simple, and an incredibly powerful tool.
Firstly let's recap that Biology class you slept through. In 1859, Darwin published the results of his findings in a book called "On the Origin of Species", which detailed the change in characteristics of animals over time. This would be the first time anyone had put forward the idea of evolution.
But what is evolution? When an animal is born it inherits some characteristics from one parent, and some characteristics from another parent, so this animal has unique DNA, and this is what is known as variance. Let's say we were looking at birds over many thousands of years. If one day a chick hatches that happens to have a bigger beak, then that bird will be a more successful hunter and more able to open nuts. So this bird is more likely to survive. This means that the bird is also more likely to reproduce, and pass on it's own genetic information to it's offspring.
It's hard to notice, but over a long period of time, animals evolve to become more adapted to their environments.
But what has any of this got to do with computer science and artificial intelligence? Well, the so called genetic algorithm is basically a way of simulating natural selection from within a computer program. Of course, we're not simulating the sizes of birds' beaks over time, but we are training the AI to adapt to some task we want it to do.
There are 3 main steps in evolution:
- Heredity
- Natural Selection
- Reproduction
We can create some object that contains "genes", which we encode with some information on how the individual should behave. This could be a list of cities to visit, or what move to make in a chess game, it can really be anything. Now, we use something called a "fitness function" to test each individual on how successful they are at performing a task.
Imagine for a minute that we wanted to teach an AI how to walk. We can create say 100 different individuals, all with different shaped an ideas of how to walk, which are encoded into their "genes". Then we let them try to walk, and use our "fitness function" to see how far they can walk. This is the process of natural selection.
Now, we can take the individuals who were the most successful, or could walk the furthest, and we get them to reproduce. Of course they don't actually reproduce, but they do what's called a "crossover", and the genes encoded in both parents are shuffled around and given to the child. We can repeat this many times, and create a whole new generation, each with genes which should (in theory) let them be more successful.
However, this is only a very small difference, and the members in the two generations are unlikely to have really "learnt" how to walk. So, we can create hundreds of new generations, or as many as we like, each generation being slightly more successful than the last. Obviously this is all within the confines of a computer, and so we have the advantage that we don't have to wait millions of years for species to evolve.
Here's some code from my GitHub page https://github.com/Riptide684, which I used to create a genetic algorithm to solve a maze. Rather than going over it all at once, let's look at it in small chunks.
***************************************
class DNA:
def__init__(self):
self.genes= []
self.fitness=0
def create_genes(self, number):
for i in range(number):
self.genes.append(directions[random.randint(0, 3)])
***************************************
In case you aren't familiar with python, the above code is very simple. It simply creates an object called "DNA" and gives it some random genes. This will be one of the many starting individuals we will use to evolve our algorithm. In this case, since I am trying to solve a maze, the "genes" are just a sequence of directions saying to go up, down, left or right.
***************************************
def update_fitness():
self.fitness = calculate_fitness()
***************************************
Now, we have created a way of calculating the "fitness" of our individuals, which we can do however we like. (I omitted the code here as it was quite long but feel free to read it on my GitHub page). This will be our way of determining how successful an individual is, or in my case, how far they can make it through some maze.
***************************************
def crossover(self, data):
parent1=data[0]
parent2=data[1]
midpoint=random.randint(1, len(parent1.genes)-1)
self.genes=list(parent1.genes[:midpoint] + parent2.genes[midpoint:])
***************************************
I mentioned earlier that we don't get the individuals to "reproduce" as such, but instead do a "crossover". What this function does, is it takes the genes of the two parents, and splits it somewhere at random down the middle. Then some of the genes are taken from one parent and some from the other parent. This will be how we make the new generation.
There is another function here which I am choosing to miss out, but it has a low chance of making the genes "mutate" which is to increase variation, and stop the population from becoming too similar.
***************************************
def populate(self):
for n in range(self.population_size):
chromosome = DNA()
chromosome.create_genes(maze_size**2)
self.population.append(chromosome)
***************************************
Now we can start to build our population. This function runs 100 times, and each time it creates a random new individual (from our first function earlier) with random genes. This will be our starting population.
***************************************
def calculate_fitness(self):
for solution in self.population:
solution.update_fitness()
if solution.fitness > self.best:
self.best=solution.fitness
self.complete=solution.genes.copy()
***************************************
This is the function we use to calculate the fittest members of the population. Using the fitness function we defined earlier, we calculate the fitness of every member of the population, and see who is the most successful.
The final function is also very long, but I encourage you to see how it works on my GitHub page if you are interested. This final function is the generation function, and is how we create our next generation of individuals. We take two parents from our previous generation based off of who are the most successful. Then we make them "crossover", and we add the child to the next generation. We repeat this until our next generation is full of new individuals.
Now we are done! All that is left to do is to run the program many times, until we get one member who is able to complete the maze. (If you want to see how this is done the code is here https://github.com/Riptide684/genetic-maze-algorithm/blob/main/MazeSolver.py).
So that was an introduction to genetic algorithms, and also how you can implement them into your own code, and use it to solve problems.
If you have any questions on genetic algorithms feel free to contact me via the form on my website.
If you're still unsure on how anything worked, or your just interested and want to learn more, I would highly recommend this YouTube playlist, https://www.youtube.com/watch?v=9zfeTw-uFCw&list=PLRqwX-V7Uu6bJM3VgzjNV5YxVxUwzALHV.
Make sure to subscribe so you don't miss any of my future posts on computing and artificial intelligence!
コメント