Slide 1

Slide 1 text

Genetic Algorithms with Go
Chang Sau Sheong

Slide 2

Slide 2 text

• 24 years in the tech industry
• 20 years managing software product organizations
• Spoke at and organized tech conferences
• Wrote 4 programming books
• @sausheong

Slide 3

Slide 3 text

Genetic algorithms are software algorithms that are based on the process of natural selection

Slide 4

Slide 4 text

Natural selection
A natural process that causes populations of organisms to adapt to their environment over time

Slide 5

Slide 5 text

Natural selection
• Before the early 1800s, peppered moths in England were mostly white
• This helped them hide from predatory birds, as they blended well with light-colored lichens and English trees

Slide 6

Slide 6 text

Natural selection
• During the Industrial Revolution, lichens died due to pollution and many trees became blackened by soot
• This gave the dark-colored moths an advantage in hiding from predators
• By the end of the century, almost all peppered moths were of the dark variety
• After the Clean Air Act 1956, the dark-colored moths became rare again

Slide 7

Slide 7 text

Natural selection jargon
• Organism: The subject of study that is struggling for survival
• DNA: Carries genetic information for the organism
• Population: A group of organisms with different genes (values) for their DNA
• Fitness: A measurement of how well adapted an organism is to its environment

Slide 8

Slide 8 text

Natural selection jargon
• Selection: Organisms with the best fitness have higher chances to reproduce
• Reproduction: The next generation of the population is reproduced from the selected best-fit organisms
• Inheritance: The next generation must inherit the values of the genes
• Mutation: With each generation, there is a small chance that the values of the genes change

Slide 9

Slide 9 text

The genetic algorithm
1. Define organism
2. Create initial population of organisms
3. Find fitness of organisms
4. Select organisms with best fitness
5. Let selected organisms reproduce next generation
6. Next generation inherits from previous generation
7. Randomly mutate each generation
8. Reproduce until goal achieved!

Slide 10

Slide 10 text

Infinite monkey problem

Slide 11

Slide 11 text

To be or not to be (18 characters)
• What’s the probability of randomly typing the exact sequence out?
• 1/26 x 1/26 x 1/26 … 1/26 = (1/26)^18
• 1 out of 29,479,510,200,013,900,000,000,000 (29 trillion trillion)
• If the monkey types a letter every second, there is just 1 chance in 934,789,136,225,707,600 years that it will type out that quote
• That’s 1 time in 934 quadrillion years!
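The arithmetic on this slide can be checked with a short sketch. It assumes, as the slide does, an alphabet of 26 equally likely keys, 18 characters, one keystroke per second, and a 365-day year. The helper names `combinations` and `yearsAtOnePerSecond` are hypothetical and are not part of the talk's code.

```go
package main

import (
	"fmt"
	"math/big"
)

// combinations returns alphabet^length, the number of equally likely
// strings a random typist could produce.
func combinations(alphabet, length int64) *big.Int {
	return new(big.Int).Exp(big.NewInt(alphabet), big.NewInt(length), nil)
}

// yearsAtOnePerSecond converts a count of one-second keystrokes into
// whole 365-day years (integer division, remainder dropped).
func yearsAtOnePerSecond(n *big.Int) *big.Int {
	secondsPerYear := big.NewInt(365 * 24 * 60 * 60)
	return new(big.Int).Div(n, secondsPerYear)
}

func main() {
	total := combinations(26, 18)
	fmt.Println("possible sequences:", total)
	fmt.Println("years at one letter per second:", yearsAtOnePerSecond(total))
}
```

The first number is roughly 2.9 x 10^25 and the second roughly 9.3 x 10^17, which is why brute-force typing is hopeless.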

Slide 12

Slide 12 text

Genetic algorithms to the rescue!

Slide 13

Slide 13 text

Let’s start with defining an organism
• Byte array represents the DNA of the organism
• Fitness is how well the DNA matches “To be or not to be”

    type Organism struct {
        DNA     []byte
        Fitness float64
    }

Slide 14

Slide 14 text

Create an initial population of organisms

    // createOrganism builds one organism with random printable ASCII DNA (codes 32 to 126)
    func createOrganism(target []byte) (organism Organism) {
        ba := make([]byte, len(target))
        for i := 0; i < len(target); i++ {
            ba[i] = byte(rand.Intn(95) + 32)
        }
        organism = Organism{
            DNA:     ba,
            Fitness: 0,
        }
        organism.calcFitness(target)
        return
    }

    // createPopulation builds PopSize random organisms
    func createPopulation(target []byte) (population []Organism) {
        population = make([]Organism, PopSize)
        for i := 0; i < PopSize; i++ {
            population[i] = createOrganism(target)
        }
        return
    }

Slide 15

Slide 15 text

Find the fitness of the organisms
• Find the fitness of each organism
• The fitness of the organism is how closely it matches the phrase “to be or not to be”, byte by byte
• 0 means totally different, 1 means a total match

    func (d *Organism) calcFitness(target []byte) {
        score := 0
        for i := 0; i < len(d.DNA); i++ {
            if d.DNA[i] == target[i] {
                score++
            }
        }
        // fraction of positions that match the target
        d.Fitness = float64(score) / float64(len(d.DNA))
    }

Slide 16

Slide 16 text

Select the organisms with the best fitness and give them higher chances to reproduce
• Create a breeding pool and place a number of copies of the same organism according to its fitness into the pool
• The higher the fitness of the organism, the more copies of the organism end up in the pool
• This ensures the fittest organisms have higher chances of being picked to pass on the DNA to the next generation

Slide 17

Slide 17 text

Creating a breeding pool

    // create a pool for next generation
    // Note: assumes maxFitness > 0
    func createPool(population []Organism, target []byte, maxFitness float64) (pool []Organism) {
        pool = make([]Organism, 0)
        for i := 0; i < len(population); i++ {
            population[i].calcFitness(target)
            // up to 100 copies, proportional to fitness relative to the best organism
            num := int((population[i].Fitness / maxFitness) * 100)
            for n := 0; n < num; n++ {
                pool = append(pool, population[i])
            }
        }
        return
    }

Slide 18

Slide 18 text

Create the next generation of the population from the selected best-fit organisms
• Randomly pick the 2 parents from the breeding pool to create the next generation

    func naturalSelection(pool []Organism, population []Organism, target []byte) []Organism {
        next := make([]Organism, len(population))
        for i := 0; i < len(population); i++ {
            r1, r2 := rand.Intn(len(pool)), rand.Intn(len(pool))
            a := pool[r1]
            b := pool[r2]
            child := crossover(a, b)
            child.mutate()
            child.calcFitness(target)
            next[i] = child
        }
        return next
    }

Slide 19

Slide 19 text

The next generation of the population must inherit the values of the genes
• Child is bred from the crossover between 2 randomly picked organisms, and inherits the DNA from both

    func crossover(d1 Organism, d2 Organism) Organism {
        child := Organism{
            DNA:     make([]byte, len(d1.DNA)),
            Fitness: 0,
        }
        mid := rand.Intn(len(d1.DNA))
        for i := 0; i < len(d1.DNA); i++ {
            // positions after the mid-point come from d1, the rest from d2
            if i > mid {
                child.DNA[i] = d1.DNA[i]
            } else {
                child.DNA[i] = d2.DNA[i]
            }
        }
        return child
    }

Slide 20

Slide 20 text

1. Randomly select 2 parents from the breeding pool
2. Randomly select a mid-point in the DNA
3. Combine the 2 DNA fragments to form the child
(Diagram letters: parents f c e b s x d w t d g p r f s, child f c e b s r f s)
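To make the three steps easy to check by hand, here is a minimal sketch with the mid-point passed in rather than drawn at random. `crossoverAt` is a hypothetical helper, not part of the talk's code, and the parent strings are made up. It uses the same comparison as `crossover` (positions after the mid-point come from the first parent) and assumes both parents have the same length.

```go
package main

import "fmt"

// crossoverAt is a hypothetical, deterministic variant of crossover: the
// mid-point is a parameter instead of a rand.Intn result. Positions after
// mid come from d1 and the rest come from d2.
func crossoverAt(d1, d2 []byte, mid int) []byte {
	child := make([]byte, len(d1))
	for i := range d1 {
		if i > mid {
			child[i] = d1[i]
		} else {
			child[i] = d2[i]
		}
	}
	return child
}

func main() {
	d1 := []byte("AAAAAAAA")
	d2 := []byte("bbbbbbbb")
	fmt.Println(string(crossoverAt(d1, d2, 4))) // bbbbbAAA
	fmt.Println(string(crossoverAt(d1, d2, 7))) // bbbbbbbb
}
```

With a mid-point at the last index, the child is simply a copy of the second parent, so some children carry no new combination at all.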

Slide 21

Slide 21 text

Randomly mutate each generation
• If mutation never occurs, the DNA within the population will always remain the same as the original population
• This means if the original population doesn’t have a particular gene that is needed, the optimal result will never be achieved
• For example, if the letter t is not found in the initial population at all, we will never be able to come up with the quote no matter how many generations we go through

    func (d *Organism) mutate() {
        for i := 0; i < len(d.DNA); i++ {
            // each byte is replaced by a random printable character with probability MutationRate
            if rand.Float64() < MutationRate {
                d.DNA[i] = byte(rand.Intn(95) + 32)
            }
        }
    }

Slide 22

Slide 22 text

Let’s run our genetic algorithm!

    func main() {
        start := time.Now()
        // rand.Seed is deprecated since Go 1.20, which seeds the generator automatically
        rand.Seed(time.Now().UTC().UnixNano())

        target := []byte("To be or not to be that is the question")
        population := createPopulation(target)

        found := false
        generation := 0
        for !found {
            generation++
            bestOrganism := getBest(population)
            fmt.Printf("\r generation: %d | %s | fitness: %.2f", generation, string(bestOrganism.DNA), bestOrganism.Fitness)

            if bytes.Equal(bestOrganism.DNA, target) {
                found = true
            } else {
                maxFitness := bestOrganism.Fitness
                pool := createPool(population, target, maxFitness)
                population = naturalSelection(pool, population, target)
            }
        }
        elapsed := time.Since(start)
        fmt.Printf("\nTime taken: %s\n", elapsed)
    }
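The slides use `getBest`, `PopSize` and `MutationRate` without showing their definitions. The sketch below is one plausible way to fill them in. The constant values are assumptions chosen only for illustration, and `getBest` is a guess at the behaviour implied by its name (highest fitness wins), not the author's code.

```go
package main

import "fmt"

// Assumed values: the slides reference these names but never define them.
const (
	PopSize      = 500
	MutationRate = 0.005
)

type Organism struct {
	DNA     []byte
	Fitness float64
}

// getBest returns the organism with the highest fitness.
// It assumes the population is not empty.
func getBest(population []Organism) Organism {
	best := population[0]
	for _, o := range population[1:] {
		if o.Fitness > best.Fitness {
			best = o
		}
	}
	return best
}

func main() {
	population := []Organism{
		{DNA: []byte("Tx"), Fitness: 0.5},
		{DNA: []byte("To"), Fitness: 1.0},
		{DNA: []byte("xx"), Fitness: 0.0},
	}
	best := getBest(population)
	fmt.Printf("%s %.2f\n", best.DNA, best.Fitness) // To 1.00
}
```

A larger population or a higher mutation rate trades time per generation against the number of generations needed, so both values are worth experimenting with.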

Slide 23

Slide 23 text

Demo

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

Evolving Mona Lisa

Slide 26

Slide 26 text

Demo

Slide 27

Slide 27 text

Scope of difficulty (image size: 100 pixels by 67 pixels)
• 100 x 67 = 6,700 pixels
• 1 pixel = 4 bytes
• 6,700 pixels = 26,800 bytes
• Compared to 18 bytes in the Shakespeare example, ~1,500x more data!

Slide 28

Slide 28 text

Let’s start with defining an organism
• *image.RGBA is the DNA of the organism
• Fitness is how well the DNA matches the picture of Mona Lisa

    type Organism struct {
        DNA     *image.RGBA
        Fitness int64
    }

Slide 29

Slide 29 text

DNA

    type RGBA struct {
        // Pix holds the image's pixels, in R, G, B, A order.
        Pix []uint8 // <- this is our byte array
        // Stride is the Pix stride (in bytes) between vertically adjacent pixels.
        Stride int
        // Rect is the image's bounds.
        Rect Rectangle
    }

Slide 30

Slide 30 text

Pix is a byte array of pixels
R G B A (pixel 1) R G B A (pixel 2) R G B A …

Slide 31

Slide 31 text

Create an initial population of organisms

    func createOrganism(target *image.RGBA) (organism Organism) {
        organism = Organism{
            DNA:     createRandomImageFrom(target),
            Fitness: 0,
        }
        organism.calcFitness(target)
        return
    }

    // createRandomImageFrom builds an image of random bytes with the same dimensions as img
    func createRandomImageFrom(img *image.RGBA) (created *image.RGBA) {
        pix := make([]uint8, len(img.Pix))
        rand.Read(pix)
        created = &image.RGBA{
            Pix:    pix,
            Stride: img.Stride,
            Rect:   img.Rect,
        }
        return
    }

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

Find the fitness of the organisms
• The fitness of the organism is the difference between the image and the image of Mona Lisa
• The lower the difference, the fitter the organism is
• To find the difference, we use the Pythagorean theorem

Slide 34

Slide 34 text

Difference between 2 images
• In 3-dimensional space, we simply apply the Pythagorean theorem twice, and in 4-dimensional space, we apply it 3 times
• The RGBA values of a pixel are essentially a point in a 4-dimensional space, so to find the difference between 2 pixels:
$$\mathit{diff} = \sqrt{(R_2 - R_1)^2 + (G_2 - G_1)^2 + (B_2 - B_1)^2 + (A_2 - A_1)^2}$$
• But since Pix is essentially a byte array sequence of RGBA, the difference between 2 images sums the same terms over every pixel:
$$\mathit{diff} = \sqrt{\underbrace{(R_2 - R_1)^2 + (G_2 - G_1)^2 + (B_2 - B_1)^2 + (A_2 - A_1)^2}_{\text{pixel 1}} + \dots + (B_2 - B_1)^2 + (A_2 - A_1)^2}$$

Slide 35

Slide 35 text

Using the Pix trick to find the difference

    // difference between 2 images
    func diff(a, b *image.RGBA) (d int64) {
        d = 0
        for i := 0; i < len(a.Pix); i++ {
            d += int64(squareDifference(a.Pix[i], b.Pix[i]))
        }
        return int64(math.Sqrt(float64(d)))
    }

    // square the difference
    func squareDifference(x, y uint8) uint64 {
        // subtract as signed integers so that x < y does not wrap around
        d := int64(x) - int64(y)
        return uint64(d * d)
    }

Slide 36

Slide 36 text

Select the organisms with the best fitness and give them higher chances to reproduce
• Still using the breeding pool but with a slight difference
• Instead of using the fitness directly to determine the number of copies of an organism in the pool, we use a differentiated fitness: the fitness of the least fit organism among the top candidates minus the fitness of the organism itself
• For example, if the difference between the best-fit organism and the least fit organism among the top candidates is 20, we place 20 copies of the best-fit organism in the breeding pool
• If there is no difference between the top best-fit organisms, we can’t really create a proper breeding pool. To overcome this, we set the pool to be the whole population

Slide 37

Slide 37 text

Creating the breeding pool

    func createPool(population []Organism, target *image.RGBA) (pool []Organism) {
        pool = make([]Organism, 0)
        // sort ascending: lowest difference (fittest) first
        sort.SliceStable(population, func(i, j int) bool {
            return population[i].Fitness < population[j].Fitness
        })
        top := population[0 : PoolSize+1]
        // no spread among the top candidates: use the whole population as the pool
        if top[len(top)-1].Fitness-top[0].Fitness == 0 {
            pool = population
            return
        }
        for i := 0; i < len(top)-1; i++ {
            num := (top[PoolSize].Fitness - top[i].Fitness)
            for n := int64(0); n < num; n++ {
                pool = append(pool, top[i])
            }
        }
        return
    }

Slide 38

Slide 38 text

Create the next generation of the population from the selected best-fit organisms

    // target is the image to match, so it is an *image.RGBA here rather than a byte slice
    func naturalSelection(pool []Organism, population []Organism, target *image.RGBA) []Organism {
        next := make([]Organism, len(population))
        for i := 0; i < len(population); i++ {
            r1, r2 := rand.Intn(len(pool)), rand.Intn(len(pool))
            a := pool[r1]
            b := pool[r2]
            child := crossover(a, b)
            child.mutate()
            child.calcFitness(target)
            next[i] = child
        }
        return next
    }

Slide 39

Slide 39 text

The next generation of the population must inherit the values of the genes

    func crossover(d1 Organism, d2 Organism) Organism {
        pix := make([]uint8, len(d1.DNA.Pix))
        child := Organism{
            DNA: &image.RGBA{
                Pix:    pix,
                Stride: d1.DNA.Stride,
                Rect:   d1.DNA.Rect,
            },
            Fitness: 0,
        }
        mid := rand.Intn(len(d1.DNA.Pix))
        for i := 0; i < len(d1.DNA.Pix); i++ {
            // bytes after the mid-point come from d1, the rest from d2
            if i > mid {
                child.DNA.Pix[i] = d1.DNA.Pix[i]
            } else {
                child.DNA.Pix[i] = d2.DNA.Pix[i]
            }
        }
        return child
    }

Slide 40

Slide 40 text

Randomly mutate each generation

    func (o *Organism) mutate() {
        for i := 0; i < len(o.DNA.Pix); i++ {
            if rand.Float64() < MutationRate {
                // Intn(256) covers the full byte range 0 to 255; Intn(255) would never produce 255
                o.DNA.Pix[i] = uint8(rand.Intn(256))
            }
        }
    }

Slide 41

Slide 41 text

Let’s run our genetic algorithm!

    func main() {
        start := time.Now()
        rand.Seed(time.Now().UTC().UnixNano())
        target := load("./ml.png")
        printImage(target.SubImage(target.Rect))
        population := createPopulation(target)

        found := false
        generation := 0
        for !found {
            generation++
            bestOrganism := getBest(population)
            if bestOrganism.Fitness < FitnessLimit {
                found = true
            } else {
                pool := createPool(population, target)
                population = naturalSelection(pool, population, target)
                // report progress and save the best image every 100 generations
                if generation%100 == 0 {
                    sofar := time.Since(start)
                    fmt.Printf("\nTime taken so far: %s | generation: %d | fitness: %d | pool size: %d",
                        sofar, generation, bestOrganism.Fitness, len(pool))
                    save("./evolved.png", bestOrganism.DNA)
                    fmt.Println()
                    printImage(bestOrganism.DNA.SubImage(bestOrganism.DNA.Rect))
                }
            }
        }
        elapsed := time.Since(start)
        fmt.Printf("\nTotal time taken: %s\n", elapsed)
    }

Slide 42

Slide 42 text

Getting back to the demo …

Slide 43

Slide 43 text

Sample results

Slide 44

Slide 44 text

Sample results by drawing triangles

Slide 45

Slide 45 text

Sample results by drawing circles

Slide 46

Slide 46 text

Questions?
https://sausheong.github.io
https://github.com/sausheong/ga