
Genetic Algorithms with Go

Sau Sheong Chang

January 29, 2019

Transcript

  1. 24 years in the tech industry • 20 years managing software product organizations • Spoke at and organized tech conferences • Wrote 4 programming books • @sausheong
  2. Genetic algorithms are software algorithms that are based on the process of natural selection
  3. Natural selection

    A natural process that causes populations of organisms to adapt to their environment over time
  4. Natural selection

    • Before the early 1800s, the peppered moths in England were mostly white
    • This helped them hide from predatory birds, as they blended well with light-colored lichens and English trees
  5. Natural selection

    • During the Industrial Revolution, lichens died due to pollution and many trees became blackened by soot
    • This gave the dark-colored moths an advantage in hiding from predators
    • By the end of the century, almost all peppered moths were of the dark variety
    • After the Clean Air Act 1956, the dark-colored moths became rare again
  6. Natural selection jargon

    • Organism: the subject of study that is struggling for survival
    • DNA: carries genetic information for the organism
    • Population: a group of organisms with different genes (values) for their DNA
    • Fitness: a measurement of how well adapted an organism is to its environment
  7. Natural selection jargon

    • Selection: organisms with the best fitness have higher chances to reproduce
    • Reproduction: the next generation of the population is reproduced from the selected best-fit organisms
    • Inheritance: the next generation must inherit the values of the genes
    • Mutation: with each generation, there is a small chance that the values of the genes change
  8. The genetic algorithm

    • Define organism
    • Create initial population of organisms
    • Find fitness of organisms
    • Select organisms with best fitness
    • Let selected organisms reproduce next generation
    • Next generation inherits from previous generation
    • Randomly mutate each generation
    • Reproduce until goal achieved!
  9. Infinite monkey problem
  10. To be or not to be (18 characters)

    • What’s the probability of randomly typing out the exact sequence?
    • 1/26 x 1/26 x 1/26 … x 1/26 = (1/26)^18
    • 1 out of 29,479,510,200,013,900,000,000,000 (about 29 trillion trillion)
    • If the monkey types a letter every second, there is just 1 chance in 934,789,136,225,707,600 years that it will type out that quote
    • That’s 1 time in about 934 quadrillion years!
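The arithmetic on this slide can be checked with a short program. This is only a sketch: it keeps the slide's simplifications (a 26-letter alphabet, one attempt per second, 365-day years), and the helper names `sequences` and `yearsAtOnePerSecond` are made up for this example.

```go
package main

import (
	"fmt"
	"math/big"
)

// sequences returns alphabet^length: the number of equally likely
// strings of the given length over an alphabet of the given size.
func sequences(alphabet, length int64) *big.Int {
	return new(big.Int).Exp(big.NewInt(alphabet), big.NewInt(length), nil)
}

// yearsAtOnePerSecond converts a count of seconds into whole 365-day years.
func yearsAtOnePerSecond(seconds *big.Int) *big.Int {
	secondsPerYear := big.NewInt(365 * 24 * 60 * 60)
	return new(big.Int).Div(seconds, secondsPerYear)
}

func main() {
	n := sequences(26, 18)
	fmt.Println("possible sequences:", n)
	fmt.Println("years at one per second:", yearsAtOnePerSecond(n))
}
```

`math/big` is used because 26^18 does not fit in a 64-bit integer.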
  11. Let’s start with defining an organism

    • Byte array represents the DNA of the organism
    • Fitness is how well the DNA matches “To be or not to be”

        type Organism struct {
            DNA     []byte
            Fitness float64
        }
  12. Create an initial population of organisms

        // createOrganism builds one organism whose DNA is random printable ASCII.
        func createOrganism(target []byte) (organism Organism) {
            ba := make([]byte, len(target))
            for i := 0; i < len(target); i++ {
                ba[i] = byte(rand.Intn(95) + 32) // random byte in 32..126
            }
            organism = Organism{
                DNA:     ba,
                Fitness: 0,
            }
            organism.calcFitness(target)
            return
        }

        // createPopulation builds PopSize random organisms.
        func createPopulation(target []byte) (population []Organism) {
            population = make([]Organism, PopSize)
            for i := 0; i < PopSize; i++ {
                population[i] = createOrganism(target)
            }
            return
        }
  13. Find the fitness of the organisms

    • Find the fitness of each organism
    • The fitness of the organism is how closely it matches the phrase “To be or not to be”, byte by byte
    • 0 means totally different, 1 means a total match

        // calcFitness sets Fitness to the fraction of bytes that match the target.
        func (d *Organism) calcFitness(target []byte) {
            score := 0
            for i := 0; i < len(d.DNA); i++ {
                if d.DNA[i] == target[i] {
                    score++
                }
            }
            d.Fitness = float64(score) / float64(len(d.DNA))
        }
  14. Select the organisms with the best fitness and give them higher chances to reproduce

    • Create a breeding pool and place a number of copies of the same organism into the pool according to its fitness
    • The higher the fitness of the organism, the more copies of the organism end up in the pool
    • This ensures the fittest organisms have higher chances of being picked to pass on their DNA to the next generation
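  15. Creating a breeding pool

        // createPool builds the breeding pool for the next generation.
        func createPool(population []Organism, target []byte, maxFitness float64) (pool []Organism) {
            // Guard: if no organism matches a single byte yet, every ratio
            // below would be 0/0, so breed from the whole population instead.
            if maxFitness == 0 {
                return population
            }
            pool = make([]Organism, 0)
            for i := 0; i < len(population); i++ {
                population[i].calcFitness(target)
                // up to 100 copies, in proportion to the best fitness
                num := int((population[i].Fitness / maxFitness) * 100)
                for n := 0; n < num; n++ {
                    pool = append(pool, population[i])
                }
            }
            return
        }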
  16. Create the next generation of the population from the selected best-fit organisms

    • Randomly pick the 2 parents from the breeding pool to create the next generation

        func naturalSelection(pool []Organism, population []Organism, target []byte) []Organism {
            next := make([]Organism, len(population))
            for i := 0; i < len(population); i++ {
                // pick 2 parents at random from the pool
                r1, r2 := rand.Intn(len(pool)), rand.Intn(len(pool))
                a := pool[r1]
                b := pool[r2]
                child := crossover(a, b)
                child.mutate()
                child.calcFitness(target)
                next[i] = child
            }
            return next
        }
  17. The next generation of the population must inherit the values of the genes

    • Child is bred from the crossover between 2 randomly picked organisms, and inherits the DNA from both

        func crossover(d1 Organism, d2 Organism) Organism {
            child := Organism{
                DNA:     make([]byte, len(d1.DNA)),
                Fitness: 0,
            }
            mid := rand.Intn(len(d1.DNA))
            for i := 0; i < len(d1.DNA); i++ {
                // genes up to and including mid come from d2, the rest from d1
                if i > mid {
                    child.DNA[i] = d1.DNA[i]
                } else {
                    child.DNA[i] = d2.DNA[i]
                }
            }
            return child
        }
  18. [Diagram: crossover of 2 parent DNA strings into a child, “f c e b s r f s”]

    • 1: Randomly select 2 parents from the breeding pool
    • 2: Randomly select a mid-point in the DNA
    • 3: Combine the 2 DNA fragments to form the child
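The three steps in the diagram can be traced with fixed inputs. The sketch below copies the loop from the `crossover` function on the previous slide but takes the mid-point as a parameter, so the result is reproducible; `crossoverAt` is a hypothetical helper for this example, not part of the deck.

```go
package main

import "fmt"

// crossoverAt mirrors the loop in the slides' crossover function, with the
// mid-point passed in instead of chosen at random (a change made only for
// this example). Genes at index <= mid come from d2; genes after mid come
// from d1.
func crossoverAt(d1, d2 []byte, mid int) []byte {
	child := make([]byte, len(d1))
	for i := range d1 {
		if i > mid {
			child[i] = d1[i]
		} else {
			child[i] = d2[i]
		}
	}
	return child
}

func main() {
	a := []byte("AAAAAAAA")
	b := []byte("bbbbbbbb")
	fmt.Println(string(crossoverAt(a, b, 2))) // prints "bbbAAAAA"
}
```

Because the comparison is `i > mid`, the child always takes at least its first gene from the second parent, even when the mid-point is 0.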
  19. Randomly mutate each generation

    • If mutation never occurs, the DNA within the population will always remain the same as in the original population
    • This means if the original population doesn’t have a particular gene that is needed, the optimal result will never be achieved
    • For example, if the letter t is not found in the initial population at all, we will never be able to come up with the quote no matter how many generations we go through

        // mutate replaces each byte with a random printable ASCII byte
        // with probability MutationRate.
        func (d *Organism) mutate() {
            for i := 0; i < len(d.DNA); i++ {
                if rand.Float64() < MutationRate {
                    d.DNA[i] = byte(rand.Intn(95) + 32)
                }
            }
        }
  20. Let’s run our genetic algorithm!

        func main() {
            start := time.Now()
            rand.Seed(time.Now().UTC().UnixNano())

            target := []byte("To be or not to be that is the question")
            population := createPopulation(target)

            found := false
            generation := 0
            for !found {
                generation++
                bestOrganism := getBest(population)
                fmt.Printf("\r generation: %d | %s | fitness: %.2f", generation, string(bestOrganism.DNA), bestOrganism.Fitness)

                if bytes.Equal(bestOrganism.DNA, target) {
                    found = true
                } else {
                    maxFitness := bestOrganism.Fitness
                    pool := createPool(population, target, maxFitness)
                    population = naturalSelection(pool, population, target)
                }
            }
            elapsed := time.Since(start)
            fmt.Printf("\nTime taken: %s\n", elapsed)
        }
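The code on the slides refers to `getBest`, `PopSize` and `MutationRate` without showing them. A minimal sketch of what they might look like follows; the constant values are placeholders chosen for illustration, and this `getBest` is an assumption based on how `main` uses it, not the author's implementation.

```go
package main

import "fmt"

// Placeholder values: the slides use these names but do not show
// their definitions.
const (
	PopSize      = 500
	MutationRate = 0.005
)

// Organism matches the struct defined on the slides.
type Organism struct {
	DNA     []byte
	Fitness float64
}

// getBest returns the organism with the highest fitness.
// It assumes the population is non-empty.
func getBest(population []Organism) Organism {
	best := population[0]
	for _, o := range population[1:] {
		if o.Fitness > best.Fitness {
			best = o
		}
	}
	return best
}

func main() {
	population := []Organism{
		{DNA: []byte("xx be"), Fitness: 0.6},
		{DNA: []byte("To be"), Fitness: 1.0},
		{DNA: []byte("xxxxx"), Fitness: 0.0},
	}
	best := getBest(population)
	fmt.Println(string(best.DNA), best.Fitness)
}
```

In the text example a higher fitness is better; the image example later in the deck treats lower values as fitter, so a matching `getBest` there would compare with `<` instead.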
  21. Scope of difficulty (image of 100 pixels by 67 pixels)

    • 100 x 67 = 6,700 pixels
    • 1 pixel = 4 bytes
    • 6,700 pixels = 26,800 bytes
    • Compared to 18 bytes in the Shakespeare example, ~1,500x more data!
  22. Let’s start with defining an organism

    • *image.RGBA is the DNA of the organism
    • Fitness is how well the DNA matches the picture of Mona Lisa

        type Organism struct {
            DNA     *image.RGBA
            Fitness int64
        }
  23. DNA

        type RGBA struct {
            // Pix holds the image's pixels, in R, G, B, A order.
            Pix []uint8
            // Stride is the Pix stride (in bytes) between vertically adjacent pixels.
            Stride int
            // Rect is the image's bounds.
            Rect Rectangle
        }

    Pix is our byte array
  24. Pix is a byte array of pixels

    R G B A (pixel 1) R G B A (pixel 2) R G B A …
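To make the layout concrete, the sketch below looks up one pixel's four bytes in `Pix` using the standard library's `PixOffset` method. The helper names `pixelBytes` and `demo` are invented for this example.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// pixelBytes returns the 4 bytes (R, G, B, A) of the pixel at (x, y).
func pixelBytes(img *image.RGBA, x, y int) []uint8 {
	i := img.PixOffset(x, y) // index of the pixel's R byte within Pix
	return img.Pix[i : i+4]
}

// demo builds a 2x1 image, colors its second pixel, and reports where
// that pixel lives in Pix.
func demo() (offset int, px []uint8) {
	img := image.NewRGBA(image.Rect(0, 0, 2, 1))
	img.SetRGBA(1, 0, color.RGBA{R: 10, G: 20, B: 30, A: 255})
	return img.PixOffset(1, 0), pixelBytes(img, 1, 0)
}

func main() {
	offset, px := demo()
	fmt.Println(offset, px) // prints "4 [10 20 30 255]"
}
```

The second pixel starts at index 4 because the first pixel occupies bytes 0 to 3.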
  25. Create an initial population of organisms

        func createOrganism(target *image.RGBA) (organism Organism) {
            organism = Organism{
                DNA:     createRandomImageFrom(target),
                Fitness: 0,
            }
            organism.calcFitness(target)
            return
        }

        // createRandomImageFrom returns an image of the same size as img,
        // filled with random bytes.
        func createRandomImageFrom(img *image.RGBA) (created *image.RGBA) {
            pix := make([]uint8, len(img.Pix))
            rand.Read(pix)
            created = &image.RGBA{
                Pix:    pix,
                Stride: img.Stride,
                Rect:   img.Rect,
            }
            return
        }
  26. Find the fitness of the organisms

    • The fitness of the organism is the difference between its image and the image of Mona Lisa
    • The lower the difference, the fitter the organism is
    • To find the difference, we use the Pythagorean theorem
  27. Difference between 2 images

    • In 3-dimensional space, we simply apply the Pythagorean theorem twice, and in 4-dimensional space, we apply it 3 times
    • The RGBA values of a pixel are essentially a point in a 4-dimensional space, so the difference between 2 pixels is

      $$\mathit{diff} = \sqrt{(r_2 - r_1)^2 + (g_2 - g_1)^2 + (b_2 - b_1)^2 + (a_2 - a_1)^2}$$

    • But since Pix is essentially a byte array holding a sequence of RGBA values, the same sum simply continues from pixel 1 through every following pixel:

      $$\mathit{diff} = \sqrt{(r_2 - r_1)^2 + (g_2 - g_1)^2 + (b_2 - b_1)^2 + (a_2 - a_1)^2 + \dots + (b_2 - b_1)^2 + (a_2 - a_1)^2}$$
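  28. Using the Pix trick to find the difference

        // diff returns the difference between 2 images: the square root of
        // the sum of squared differences over every byte in Pix.
        func diff(a, b *image.RGBA) (d int64) {
            d = 0
            for i := 0; i < len(a.Pix); i++ {
                d += int64(squareDifference(a.Pix[i], b.Pix[i]))
            }
            return int64(math.Sqrt(float64(d)))
        }

        // squareDifference squares the difference between 2 bytes.
        // The subtraction is done in a signed type so that x < y
        // does not rely on unsigned wrap-around.
        func squareDifference(x, y uint8) uint64 {
            d := int64(x) - int64(y)
            return uint64(d * d)
        }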
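The slides call `calcFitness` on image organisms but never show it. The sketch below assumes it simply stores the result of `diff`, and runs it on two 1x1 images so the arithmetic can be followed by hand. Both `calcFitness` (as written here) and the `onePixel` helper are assumptions made for this example.

```go
package main

import (
	"fmt"
	"image"
	"math"
)

// Organism matches the image-based struct on the slides.
type Organism struct {
	DNA     *image.RGBA
	Fitness int64
}

// diff is a compact version of the function on the slides: the square
// root of the sum of squared byte differences. It assumes both images
// have Pix slices of the same length.
func diff(a, b *image.RGBA) int64 {
	var sum int64
	for i := range a.Pix {
		d := int64(a.Pix[i]) - int64(b.Pix[i])
		sum += d * d
	}
	return int64(math.Sqrt(float64(sum)))
}

// calcFitness is an assumed implementation: fitness is the distance to
// the target image, so lower values mean fitter organisms.
func (o *Organism) calcFitness(target *image.RGBA) {
	o.Fitness = diff(o.DNA, target)
}

// onePixel builds a 1x1 image from raw R, G, B, A bytes (helper for this example).
func onePixel(r, g, b, a uint8) *image.RGBA {
	img := image.NewRGBA(image.Rect(0, 0, 1, 1))
	copy(img.Pix, []uint8{r, g, b, a})
	return img
}

func main() {
	target := onePixel(0, 0, 0, 255)
	o := Organism{DNA: onePixel(3, 4, 0, 255)}
	o.calcFitness(target)
	fmt.Println(o.Fitness) // sqrt(3*3 + 4*4) = 5
}
```

Note that the result is truncated to an integer, so small differences between two candidate images can collapse to the same fitness value.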
  29. Select the organisms with the best fitness and give them higher chances to reproduce

    • Still using the breeding pool, but with a slight difference
    • Instead of using the fitness directly to determine the number of copies of an organism in the pool, we use a differentiated fitness: the fitness of the least fit organism in the top group minus the organism’s own fitness (lower fitness values are better here)
    • For example, if the difference between the best-fit organism and the least fit organism in the top group is 20, we place 20 copies of the best-fit organism in the breeding pool
    • If there is no difference among the top best-fit organisms, we can’t really create a proper breeding pool. To overcome this, we set the pool to be the whole population
  30. Creating the breeding pool

        func createPool(population []Organism, target *image.RGBA) (pool []Organism) {
            pool = make([]Organism, 0)
            // sort ascending: the lowest difference (fittest) comes first
            sort.SliceStable(population, func(i, j int) bool {
                return population[i].Fitness < population[j].Fitness
            })
            top := population[0 : PoolSize+1]
            // no spread among the top organisms: use the whole population
            if top[len(top)-1].Fitness-top[0].Fitness == 0 {
                pool = population
                return
            }
            for i := 0; i < len(top)-1; i++ {
                // copies = gap between this organism and the least fit of the top group
                num := (top[PoolSize].Fitness - top[i].Fitness)
                for n := int64(0); n < num; n++ {
                    pool = append(pool, top[i])
                }
            }
            return
        }
  31. Create the next generation of the population from the selected best-fit organisms

        // target is the *image.RGBA being matched, as in createOrganism.
        func naturalSelection(pool []Organism, population []Organism, target *image.RGBA) []Organism {
            next := make([]Organism, len(population))
            for i := 0; i < len(population); i++ {
                r1, r2 := rand.Intn(len(pool)), rand.Intn(len(pool))
                a := pool[r1]
                b := pool[r2]
                child := crossover(a, b)
                child.mutate()
                child.calcFitness(target)
                next[i] = child
            }
            return next
        }
  32. The next generation of the population must inherit the values of the genes

        func crossover(d1 Organism, d2 Organism) Organism {
            pix := make([]uint8, len(d1.DNA.Pix))
            child := Organism{
                DNA: &image.RGBA{
                    Pix:    pix,
                    Stride: d1.DNA.Stride,
                    Rect:   d1.DNA.Rect,
                },
                Fitness: 0,
            }
            mid := rand.Intn(len(d1.DNA.Pix))
            for i := 0; i < len(d1.DNA.Pix); i++ {
                // bytes up to and including mid come from d2, the rest from d1
                if i > mid {
                    child.DNA.Pix[i] = d1.DNA.Pix[i]
                } else {
                    child.DNA.Pix[i] = d2.DNA.Pix[i]
                }
            }
            return child
        }
  33. Randomly mutate each generation

        func (o *Organism) mutate() {
            for i := 0; i < len(o.DNA.Pix); i++ {
                if rand.Float64() < MutationRate {
                    // Intn(256) covers the full byte range 0..255;
                    // Intn(255) could never produce 255
                    o.DNA.Pix[i] = uint8(rand.Intn(256))
                }
            }
        }
  34. Let’s run our genetic algorithm!

        func main() {
            start := time.Now()
            rand.Seed(time.Now().UTC().UnixNano())
            target := load("./ml.png")
            printImage(target.SubImage(target.Rect))
            population := createPopulation(target)

            found := false
            generation := 0
            for !found {
                generation++
                bestOrganism := getBest(population)
                // lower fitness means a closer match to the target image
                if bestOrganism.Fitness < FitnessLimit {
                    found = true
                } else {
                    pool := createPool(population, target)
                    population = naturalSelection(pool, population, target)
                    // report progress and save a snapshot every 100 generations
                    if generation%100 == 0 {
                        sofar := time.Since(start)
                        fmt.Printf("\nTime taken so far: %s | generation: %d | fitness: %d | pool size: %d", sofar, generation, bestOrganism.Fitness, len(pool))
                        save("./evolved.png", bestOrganism.DNA)
                        fmt.Println()
                        printImage(bestOrganism.DNA.SubImage(bestOrganism.DNA.Rect))
                    }
                }
            }
            elapsed := time.Since(start)
            fmt.Printf("\nTotal time taken: %s\n", elapsed)
        }
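The `load` and `save` helpers used in `main` are not shown on the slides. One possible shape for them, using only the standard library, is sketched below. The names `loadPNG`, `savePNG` and `roundTrip` are hypothetical, and unlike the calls on the slide these versions return an error rather than handling it internally.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
	"image/draw"
	"image/png"
	"os"
	"path/filepath"
)

// loadPNG decodes a PNG file and copies it into an *image.RGBA, so that
// Pix can be used as the organism's DNA whatever the file's color model.
func loadPNG(path string) (*image.RGBA, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	src, err := png.Decode(f)
	if err != nil {
		return nil, err
	}
	rgba := image.NewRGBA(src.Bounds())
	draw.Draw(rgba, rgba.Bounds(), src, src.Bounds().Min, draw.Src)
	return rgba, nil
}

// savePNG writes an image to path as a PNG file.
func savePNG(path string, img *image.RGBA) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	if err := png.Encode(f, img); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// roundTrip saves a small opaque image to a temporary directory and
// loads it back, as a quick check of the two helpers.
func roundTrip() (*image.RGBA, error) {
	dir, err := os.MkdirTemp("", "ga-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	img := image.NewRGBA(image.Rect(0, 0, 2, 2))
	fill := image.NewUniform(color.RGBA{R: 200, G: 100, B: 50, A: 255})
	draw.Draw(img, img.Bounds(), fill, image.Point{}, draw.Src)

	path := filepath.Join(dir, "evolved.png")
	if err := savePNG(path, img); err != nil {
		return nil, err
	}
	return loadPNG(path)
}

func main() {
	img, err := roundTrip()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(img.Bounds(), img.Pix[:4])
}
```

Copying through `draw.Draw` means the target always ends up as RGBA, which keeps the byte-by-byte comparison in `diff` valid even if the source PNG is stored as grayscale or paletted.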