by a set of cities and the distance between each pair of cities. • The problem is to find a circuit that goes through each city exactly once and ends where it starts. (This in itself isn't difficult.) • What makes the problem interesting is finding the shortest circuit among all those that are possible.
need to compute the length of all possible circuits. • Keep the shortest one. The issue is that the number of such circuits grows very quickly with the number of cities. If there are n cities, this number is the factorial of n-1: (n-1)! = (n-1) × (n-2) × ... × 3 × 2. Indeed, we can arbitrarily select one city to start from (the starting point doesn't matter, given that one must end at the same place). Then we have n-1 different choices for the second city to be visited, n-2 choices for the third city, and so on. For example, the factorial of 10 is 3628800, but the factorial of 20 is a gigantic 2432902008176640000 (several times the number of seconds since the Big Bang, which is roughly 4 × 10^17).
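The exhaustive approach described above can be sketched in a few lines of Python. This is only an illustration: the function names and the 4-city distance matrix `DIST` are made up for the example, and the matrix is assumed to be symmetric.

```python
import itertools
import math

def route_length(route, dist):
    """Length of the closed circuit that visits `route` in order and returns to the start."""
    n = len(route)
    return sum(dist[route[i]][route[(i + 1) % n]] for i in range(n))

def brute_force_tsp(dist):
    """Try every circuit that starts at city 0 and keep the shortest one."""
    n = len(dist)
    best_route, best_length = None, math.inf
    # Fixing city 0 as the start leaves (n-1)! orderings of the remaining cities.
    for rest in itertools.permutations(range(1, n)):
        route = (0,) + rest
        length = route_length(route, dist)
        if length < best_length:
            best_route, best_length = route, length
    return best_route, best_length

# Hypothetical distance matrix for 4 cities (illustrative values only).
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

if __name__ == "__main__":
    print(brute_force_tsp(DIST))
    # Number of circuits to check for 10 and for 20 cities.
    print(math.factorial(10 - 1), math.factorial(20 - 1))
```

With 4 cities the loop only checks 6 circuits, but the same loop for 20 cities would have to check 19! of them, which is why this approach stops being practical very quickly.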
contains an array of traits, called DNA. When two parents mate, they produce a child containing a mixture of their DNA. How well those traits work together to help the child survive long enough to reproduce determines whether those traits pass into future generations. Occasionally a random trait that was not inherited from the parents enters a child's DNA; we call this mutation. If the mutation is beneficial, the organism will survive long enough to carry the mutation over to future generations. So the cycle continues: over many generations the population keeps being optimized by "mixing and matching" and "trial and error" of DNA.
used to find optimal or near-optimal solutions by evolution-inspired search and optimization. They are generally used for problems where linear or brute-force searches are not viable in terms of time, such as the Travelling Salesman Problem, timetable scheduling, finding neural network weights, Sudoku, trees (data structure), etc. The first requirement is an encoding scheme suitable for representing individuals; the second is an evaluation function that measures the fitness of an individual.
initialized by randomly generating a collection of DNA samples. The size of the population depends on the size of the problem's search space and on the computational time it takes to evaluate each individual. Most of the time you will be dealing with population sizes of about 50 up to 1000.
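As a minimal sketch of this initialization step, assume each individual's DNA is a route, i.e. a random ordering of city indices (an encoding chosen here for the travelling salesman case). The names `random_route` and `init_population` are illustrative.

```python
import random

def random_route(num_cities, rng):
    """One individual: a random ordering of the city indices 0..num_cities-1."""
    route = list(range(num_cities))
    rng.shuffle(route)
    return route

def init_population(pop_size, num_cities, seed=None):
    """Generate `pop_size` random individuals; `seed` makes the run repeatable."""
    rng = random.Random(seed)
    return [random_route(num_cities, rng) for _ in range(pop_size)]

if __name__ == "__main__":
    population = init_population(pop_size=50, num_cities=8, seed=1)
    print(len(population), population[0])
```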
the fitness of an individual. This is the most important part of the genetic algorithm: if this function is flawed, the algorithm will not produce useful results. The evaluation function should not return a Boolean (true/false) value; it has to return a comparable result. If individuals can be sorted from fittest to weakest, it is a viable evaluation function. For evaluating distances between cities, you may return the total distance traveled as the fitness score (the shorter the distance, the fitter the individual).
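For the city example, an evaluation function along these lines might look as follows. It is a sketch that assumes a route is a list of city indices and that `dist[a][b]` holds the distance between cities `a` and `b`; both names are illustrative.

```python
def total_distance(route, dist):
    """Fitness score of a route: length of the closed circuit (lower is fitter)."""
    n = len(route)
    return sum(dist[route[i]][route[(i + 1) % n]] for i in range(n))

def rank_population(population, dist):
    """Sort individuals from fittest (shortest circuit) to weakest (longest)."""
    return sorted(population, key=lambda route: total_distance(route, dist))
```

Because the score is a number rather than true/false, any two individuals can be compared, which is what makes sorting the population possible.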
will be choosing the parents for mating. Selection happens for each child in the new population. There are three types of parent selection methods: Fitness Proportionate Selection, Tournament Selection, and Truncation Selection. A viable option is to carry the selected individuals over to the next generation as well.
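Of the three methods named above, tournament selection is probably the shortest to write down. The sketch below assumes that a lower score is better (as with total distance) and that `scores[i]` is the score of `population[i]`; the tournament size `k=3` is an arbitrary illustrative default.

```python
import random

def tournament_select(population, scores, k=3, rng=random):
    """Pick k individuals at random and return the fittest of them (lowest score)."""
    contenders = rng.sample(range(len(population)), k)
    winner = min(contenders, key=lambda i: scores[i])
    return population[winner]

if __name__ == "__main__":
    population = ["A", "B", "C", "D", "E"]
    scores = [30.0, 12.5, 18.0, 40.0, 25.0]
    # Each call chooses one parent; call it twice to get a pair for mating.
    print(tournament_select(population, scores), tournament_select(population, scores))
```

A larger `k` puts more pressure on the fittest individuals, while `k=1` degenerates into picking parents purely at random.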
selected individual(s), also known as the parents, we need to mix and match their DNA to produce a new population of children. We call this process crossover.
to introduce diversity into the population, expanding the opportunity to search unexplored areas of the search space for fitter solutions. Mutation is implemented by giving each element in the DNA array a small probability (typically 2-5%) of being randomly altered. The mutation procedure is called for every child. After generating the new generation, we go back to the evaluation step.
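A per-element mutation of this kind could be sketched as follows, assuming for simplicity that the DNA is a list of 0/1 genes so that "randomly altered" means flipping the bit. The 3% default rate is just one value from the 2-5% range mentioned above.

```python
import random

def mutate(dna, rate=0.03, rng=random):
    """Return a copy of `dna` in which each 0/1 gene is flipped with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in dna]

if __name__ == "__main__":
    child = [0, 1, 1, 0, 1, 0, 0, 1] * 10
    mutated = mutate(child)
    changed = sum(a != b for a, b in zip(child, mutated))
    print(f"{changed} of {len(child)} genes were altered")
```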
Evaluation step, we can stop when any of the following holds: • The optimal (or a good enough) solution has been found. • The generations are no longer progressing (evolving) in fitness. • The available time has run out.
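One possible way to express these stopping conditions in code is shown below. It assumes that the best score of each generation is recorded in a list and that lower scores are better; `target`, `patience`, and `time_limit` are hypothetical parameters introduced for this sketch.

```python
import time

def should_stop(best_history, start_time, target=None, patience=50, time_limit=10.0):
    """Decide whether to stop, given the best score of each generation so far."""
    if not best_history:
        return False
    # 1. A good enough solution has been found.
    if target is not None and best_history[-1] <= target:
        return True
    # 2. No improvement over the last `patience` generations.
    if len(best_history) > patience:
        if min(best_history[-patience:]) >= min(best_history[:-patience]):
            return True
    # 3. The time budget (in seconds) is used up.
    return time.monotonic() - start_time > time_limit
```

In a main loop this would typically be checked right after the evaluation step, with `start_time = time.monotonic()` taken before the first generation.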
solution to the travelling salesman problem requires that we set up a genetic algorithm in a specialized way. For instance, a valid solution must represent a route in which every location is included exactly once. If a route contained a single location more than once, or missed a location out completely, it wouldn't be valid and we would be wasting valuable computation time calculating its distance. To ensure the genetic algorithm does indeed meet this requirement, special types of mutation and crossover methods are needed.
should only be capable of shuffling the route; it should never add or remove a location, otherwise it would risk creating an invalid solution. One type of mutation method we could use is swap mutation, in which two locations in the route are selected at random and their positions are swapped.
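Swap mutation picks two positions in the route at random and exchanges the locations found there, so no location is ever added or lost. A minimal sketch, assuming a route is a list of distinct locations:

```python
import random

def swap_mutation(route, rng=random):
    """Return a copy of `route` with two randomly chosen positions swapped."""
    mutated = list(route)
    i, j = rng.sample(range(len(mutated)), 2)  # two distinct positions
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

if __name__ == "__main__":
    route = ["A", "B", "C", "D", "E"]
    print(swap_mutation(route))
```

The result always contains exactly the same locations as the input, only in a different order.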
that's able to produce a valid route is ordered crossover. In this crossover method we select a subset from the first parent and add that subset to the offspring. Any missing values are then added to the offspring from the second parent, in the order in which they are found.
Implementation
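The ordered crossover described above could be sketched as follows. This is a simple variant that copies a slice of the first parent into the same positions of the child and then fills the remaining positions left to right from the second parent. The optional `start` and `end` arguments are an addition for this example so the slice can be fixed by hand; if they are omitted, the slice is chosen at random.

```python
import random

def ordered_crossover(parent1, parent2, start=None, end=None, rng=random):
    """Build a child route from parent1[start:end] plus the rest in parent2's order."""
    size = len(parent1)
    if start is None or end is None:
        # Random non-empty slice [start, end) of the first parent.
        start, end = sorted(rng.sample(range(size + 1), 2))
    child = [None] * size
    child[start:end] = parent1[start:end]
    taken = set(parent1[start:end])
    # Locations still missing, in the order they appear in the second parent.
    missing = (city for city in parent2 if city not in taken)
    for i in range(size):
        if child[i] is None:
            child[i] = next(missing)
    return child

if __name__ == "__main__":
    p1 = [1, 2, 3, 4, 5, 6]
    p2 = [6, 5, 4, 3, 2, 1]
    print(ordered_crossover(p1, p2, start=1, end=4))  # [6, 2, 3, 4, 5, 1]
```

Since every location is taken either from the slice or from the remainder of the second parent, the child visits each location exactly once.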