Genetic Algorithm and Traveling Salesman Problem

Sample Image

Environment: VC++ 6.0. SP5, Win2k SP2, MS Platform SDK April 2001

Table of contents

  1. Genetic Algorithm
    • Theory
    • GA and TSP
  2. Base implementation, Template class GA<> and GA Selection classes
  3. Genome of Travel
  4. TSP Application
    • GA thread
    • UI interface
  5. Environment
  6. Reference

Disclaimer

I am not a genetic algorighm (GA) guru and I do not have any degree in GA so this article can’t be used as a GA GA tutorial. There are not any mathematics, logic, or algebra about GA. It’s only a programmer’s view on Genetic Algorithms and only an example of GA coding. Use it carefully! Any comments and criticism are highly appreciated.

Genetic Algorithm, Theory

There are many books and resources on the WEB about Genetics Algorithm. Best that I can do, I can quote some nice description from my preferred sites.

Definition from Marek Obitko’s Site

“Genetic algorithms are a part of evolutionary computing, which is a rapidly growing area of artificial intelligence. As you can guess, genetic algorithms are inspired by Darwin’s theory about evolution. Simply said, solution to a problem solved by genetic algorithms is evolved.”

Explanation from Generation5.org Site

“Genetic algorithms are not too hard to program or understand, since they are biological based. Thinking in terms of real-life evolution may help you understand. Here is the general algorithm for a GA:

Create a Random Initial State

An initial population is created from a random selection of solutions (which are analagous to chromosomes). This is unlike the situation for Symbolic AI systems, where the initial state in a problem is already given instead.

Evaluate Fitness

A value for fitness is assigned to each solution (chromosome) depending on how close it actually is to solving the problem (thus arriving to the answer of the desired problem). (These “solutions” are not to be confused with “answers” to the problem, think of them as possible characteristics that the system would employ in order to reach the answer.)

Reproduce (& Children Mutate)

Those chromosomes with a higher fitness value are more likely to reproduce offspring (which can mutate after reproduction). The offspring is a product of the father and mother, whose composition consists of a combination of genes from them (this process is known as “crossing over”.

Next Generation

If the new generation contains a solution that produces an output that is close enough or equal to the desired answer then the problem has been solved. If this is not the case, then the new generation will go through the same process as their parents did. This will continue until a solution is reached.”

My opinion, the GA is easy to understanding and easy to implementation on C++.
The main advantage and disadvantage of GA at the same time is robustness.
Even if you implement some features incorrect, the GA will continue run and sooner or later will solve the problem (or find any local optimum). Definitely such feature can produce some troubles during debugging and tuning program. Another interesting problem, to choose best algorithms from existing plenty of algorithms of crossover, mutation, gene presentation, etc.

See references topic for some useful links about GA theory.

Genetic Algorithm and Traveling Salesman Problem

About Traveling Salesman Problem

Again quotation from http://www.math.princeton.edu/tsp/

“The traveling salesman problem, or TSP for short, is this: given a finite number of ‘cities’ along with the cost of travel between each pair of them, find the cheapest way of visiting all the cities and returning to your starting point.”

Genome and Algorithm

We can’t use a traditional presentation and algorithms for TSP problem, because every city must be unique in gene, and can’t be duplicated.

For gene presentation, I used sequential representation where the cities are listed in the order in which they are visited. It’s common way for TSP Genome.

Example: [9 3 4 0 1 2 5 7 6 8]

For crossover operation after several tests and researching I selected Greedy Crossover by J. Grefenstette.

The citation from Sushil J. Louis

“Greedy crossover selects the first city of one parent, compares the cities leaving that city in both parents, and chooses the closer one to extend the tour. If one city has already appeared in the tour, we choose the other city. If both cities have already appeared, we randomly select a non-selected city.”

From my experience it’s a very effective method.

Mutation

We can’t change gene’s bits as usual traditional mutation does. Instead of it we must swap order of cities in a path.

Example:
  Before mutation   [0 1 2 3 4 5 6]
  After mutation    [0 1 3 2 4 5 6]

There are a lot of manners for doing such swapping operation. Easiest way in using random swap. Unfortunately, such strategy unable to achieve optimum a quickly but can prevent convergence into a local optimum. Additionally I used a greedy-swap mutation.

Once more citation from Sushil J. Louis

“The basic idea of greedy-swap is to randomly select two cities from one chromosome and swap them if the new (swapped) tour length is shorter than the old one”

In during browsing WEB I discover a excellent researching into GA “A fast TSP solver using a genetic algorithm” by Hiroaki Sengoku and Ikuo Yoshihara. It’s fastest algorithm I ever saw. Unfortunately, It’s Java implementation and without any sources. I could find only one PDF document with a description about this algorithm:
arob98.pdf
After reading and studying I used “Mutation by 2opt” idea in my code. This method has the same idea as greedy-swap mutation but more expansive and more effective. After adding into code I can improve speed of my program greatly on small and middle sets (till 200 cities).
However for big sets (1000 and more) this heuristics is very slow method.

Selection

I implemented three selection methods : routlette rank, roulette cost and tournamnet.
Of course I used elitism too.


Roulette Wheel Selection


Definition from Marek Obitko’s Site

“Cost Selection : Parents are selected according to their fitness. The better the chromosomes are, the more chances to be selected they have. Imagine a roulette wheel where are placed all chromosomes in the population, every has its place big accordingly to its fitness function

Rank Selection : The previous selection will have problems when the fitnesses differ very much. For example, if the best chromosome fitness is 90% of all the roulette wheel then the other chromosomes will have very few chances to be selected.
Rank selection first ranks the population and then every chromosome receives fitness from this ranking. The worst will have fitness 1, second worst 2 etc. and the best will have fitness N (number of chromosomes in population).”


Tournamnet Selection and Elitism


Definition from W. B. Langdon, University College, London

“A mechanism for choosing individuals from a population. A group (typically between 2 and 7 individuals) are selected at random from the population and the best (normally only one, but possibly more) is chosen An elitist genetic algorithm is one that always retains in the population the best individual found so far, Tournamnet Selection is naturally elitist.”

In my opinion and after testing the roulette rank and tournament selections are slightly faster for TSP case. For another problems, others selection algorithms can be best.

Co-evolutions. Migrations

I didn’t find many documents about these methods in WEB.
The base idea: allow evolving of several populations at the same time.
The description found in Generation5.org Site

“Genetic algorithms are neat, but they do come with their own set of problems. One big problem is that genetic algorithms have a tendency to get stuck at local optima. In other words, they will find a reasonable solution, but not the best solution. There are several methods that have been devised to counter this problem, and the one we will look at is coevolution”

Simultaneously, co-evolution idea allows utilizing the SMP ability of WinNT\2k machines with multi CPUs. We can easily run several GA in separate threads without any penalty. For exchange data between different GA we can migrate the best genes in population.

Base implementation
Template class GA<> and GA Selection classes

The GA<> class implements a base logic of Genetic Algorithms: recombination, mutation.
User can manage the gene population via methods of GA<> class. It holds a gene population and gene context, selection methods, and method of randomization. User can specify behavior of this class via template parameters.

template <typename Traits, typename Selection> class GA {}

Template parameter Traits must define typedefs for Gene class, Random class, Population container class and Thread Synchronize class.

Template parameter Selection must provide the GA selection algorithm.
Now there are three such classes: selection_tournament<>, selection_roulette_cost<>, selection_roulette_rank<>.

GA<> interface

  • init – initializes population
  • update – computes fitness’s values and return optimal gene or end()
  • find_best – finds the gene with best fitness
  • epoch – makes next population (selection, crossover, mutation, etc)
  • recombine – makes selection, produce new genes, removes non elite parents and removes twins gene
  • mutate – attempts to mutate genes
  • migration – exchanges the best genes between populations
  • sort – orders the genes in population depending on fitness value (moves the best into a beginning of population)
  • begin – returns iterator at the first (best) gene of the population
  • end – returns a iterator that points just beyond the end of the population
  • size – returns the size of populations

The example of using GA<> class

typedef ga_traits<RandomCRT> traits;
typedef GA<traits, selection_tournament<traits> > tGA;
traits::Gene::Context context;
tGA ga(50, &context);
tGA::iterator it = ga.update();
while(it == ga.end())
{
  ga.epoch();
  it = ga.update();
}
traits::Gene* gnp = (*it);

Genome of Travel

class TSPData

Context of travel holds a travel data, auxiliary data and methods for crossover operation.

struct TSPBase

Base gene class has thread specific memory pool for memory optimization
In process of computing GA creates and destroys a lot of dynamic gene objects. It’s very ineffective to use default memory allocation routine. Instead of it I used a special memory pool class, which pre-allocated a large block of memory from process heap and then caches a freeing blocks.

class TSPGene<> : TSPBase

Gene’s implementation
Every gene holds a path (travel) of salesman and fitness value of this travel.
Of course the lower the cost of travel the better fitness of gene.
It has some constructors and methods for mutation and heuristics computing.
The default constructor creates gene with random travel and for crossover operation uses another constructor:

TSPGene* gnp = new TSPGene(parent1, parent2).

It creates offspring from genetic materials of parents’ genes.

TSP Application, GA thread

For every co-evolution _Main class creates a separate thread with exemplar of GA<> class.
Depending on user’s setting it creates GA with one of three selection methods and sets a population size, elite size, migration’s size, heuristics value, mutation and crossover probability.

Then every turn (or AI term “epoch”) it calls recombination, mutation, and heuristics methods and update the Application’s board with result of best gene. Also it tries to prevent convergence to a local optimum and kills 50% population if fitness stays unchanged some time.

TSP Application, UI interface

User can change a setting of GA for tuning it depending on size and shape of set of points.

  • Co-evolution field: a number of co-evolutions (separate threads) from 1 till 16. Default value is number of CPU * 2.
  • Population field: a size of population per co-evolution, from 10 till 1000
  • Elite field: an elite size in the population, from 0 till size of population.
  • Migration field: a size of migrated genes, from 0 till size of population.
  • Heuristics filed: a size of best genes improved via heuristics method in every epoch. Attention, in large population it can extremely slow the computing.
  • Crossover field: probability of crossover, from 0 till 100
  • Mutation filed: probability of mutation. From 0 till 100
  • Selection combo: selection method (roulette rank, roulette cost and tournament)
  • Remove Twins checkbox: set using a natural selection algorithm. It eliminates similar gene to for avoiding the immature convergence.

Environment

I used VC++ 6.0. SP5, Win2k SP2, MS Platform SDK April 2001.

And tested on Win2k SP2 IE 6.0,Win2k SP1 IE 5.0, Win ME IE 5.5, Win 98 SE IE 5.0.

References

1. S.Hsiung and J.Matthews,
Generation 5 – Genetic Algorithms and Genetic Programming

2. Marek Obitko,
Introduction to Genetic
Algorithms

3. Hiroaki Sengoku and Ikuo Yoshihara,
A fast TSP solver using a genetic algorithm

4. Sushil J. Louis and Rilun Tang,
Interactive
Genetic Algorithms for the Traveling Salesman Problem
,
Genetic
Algorithms with Memory for Traveling Salesman Problems
,
Augmenting
Genetic Algorithms with Memory to Solve Traveling Salesman Problems

5. Sergey Isaev,
Genetic Algorithm

6. W. B. Langdon,
Genetic Programming and Data Structures

7. Solving Traveling Salesman Problem

Downloads

Download TSPApp executable – 159 Kb
Download TSPApp source – 119 Kb

More by Author

Get the Free Newsletter!

Subscribe to Developer Insider for top news, trends & analysis

Must Read