This post supplements a YouTube video, and largely just consists of a few technical details about the Pi Fractal and its genesis. **The bulk of the story about the Pi Fractal is in the YouTube video.**

Here’s a large image of the 8th iteration of the Pi Fractal (click to expand).

For those interested, the string rewriting rules required to generate the pi fractal follow. The pattern begins with a single 0 block. The darkest block is 0 and the lightest is 29.

0 -> 11 25 25, 3 6 12, 16 15 11 1 -> 26 26 28, 25 27 29, 28 29 29 2 -> 28 28 29, 22 26 26, 29 29 28 3 -> 22 29 24, 29 29 24, 29 22 21 4 -> 28 29 24, 28 26 29, 29 26 17 5 -> 0 0 2, 8 2 2, 1 3 2 6 -> 23 7 8, 18 29 12, 16 22 12 7 -> 2 2 1, 2 1 2, 1 3 1 8 -> 19 13 22, 24 18 22, 1 27 29 9 -> 0 8 23, 11 7 28, 1 5 2 10 -> 22 29 26, 27 15 19, 25 21 10 11 -> 16 17 1, 1 21 10, 13 5 29 12 -> 23 29 29, 23 29 29, 18 29 29 13 -> 27 27 29, 26 22 15, 24 12 24 14 -> 14 28 29, 28 28 14, 29 23 19 15 -> 16 7 12, 7 7 24, 29 7 29 16 -> 22 8 21, 29 21 21, 29 13 10 17 -> 4 23 23, 2 24 13, 23 0 16 18 -> 5 17 1, 27 8 2, 5 25 1 19 -> 25 25 15, 19 29 29, 19 28 21 20 -> 1 5 0, 1 6 2, 3 0 2 21 -> 27 27 20, 22 22 27, 29 22 28 22 -> 1 1 1, 4 3 2, 3 2 2 23 -> 29 27 24, 29 28 24, 29 26 24 24 -> 21 25 27, 23 28 27, 19 28 29 25 -> 28 28 28, 21 13 26, 3 7 27 26 -> 5 22 20, 15 22 15, 24 14 22 27 -> 1 3 2, 2 1 7, 3 2 0 28 -> 3 15 10, 22 7 29, 5 22 7 29 -> 1 1 3, 1 1 2, 1 1 3

### Update: 19th September, 2017

I’ve had a number of people ask me for more precise information about my implementation of the genetic algorithm. I wrote this post on Reddit and thought that I would share it here since it might be helpful.

*There is a lot of experimentation involved. I’ve tried to justify some of my choices below, but you may think of better ideas at every turn, so please try them out. If you find anything interesting, I’d love to hear about it. If it’s not clear what I mean by “at random” I always mean uniformly at random unless otherwise specified.*

#### Fitness function

- I tried a lot of different things for this, and arguably this is the most important part of the process. The function I found to be most useful was:

For some cell , let be the mean value of it and its eight neighboring cells. I just ignored the edges, but you might like to do something a bit more clever. - For each cell in the target image, let be the corresponding cell in your generated image. Sum over each cell.

I experimented with just comparing exact cells (rather than averages) and this resulted in a much more pixelated image. Average is considerably slower (well, in my case, certainly, since I used the naive algorithm of recomputing from scratch per pixel. You could be a little clever with this to optimise it, but I found that to be overkill for my needs. I always program the quickest solution first and optimise later if necessary – it wasn’t). I believe that a fitness function not using averages is probably better if you have reason to believe that your image can be represented exactly with the number of rules that you’re working with. Otherwise, an average will lead to better results.

I experimented with different exponents too. Higher exponents discourage exploration via random mutation, but this is mitigated somewhat by taking an average, and color switching in the rule breeding function. I found squaring to be a good balance (it also had the convenience factor of not having to worry about taking absolute values or anything).

#### Rule breeding

This is the exact function that I used. It started out very different, and I tweaked it as I observed the results it produced. As such, it’s a little Frankenstein. I hope to go back and improve it at some point.

- Input two rules and .
- With probability 0.2, swap two randomly selected colors in (completely, which means swapping the two rules
*and*swapping all instances of those colors in all rules). - Otherwise, choose a selection of rules from and at random. Initially, I had one of or selected as the “main” parent and it passed on about 90% of its rules while the other only contributed 10%, but I found this actually performed worse. Perhaps there is a middle ground between 50% and 90%.
- Finally, take whatever you have from above, and change a random number (from 0 to 5 inclusive) of output numbers in the ruleset. I just mean single instances, so of the output numbers, I am only changing from 0 to 5. I chose a low number here because I feel like this is quite sensitive since a small change here could ruin the image by iteration 1 or 2, let alone 4. Having said that, I didn’t really experiment with increasing this parameter. Maybe my intuition is wrong – this is something worth playing around with.

#### Evolution

This is the part that I think largely looks after itself in this problem. I wouldn’t worry too much about the parameters here, since the above two things will impact your results much, much more. I have highlighted the two things that I think are critical in bold.

These parameters will only affect the speed at which you converge and how long each generation takes, which I didn’t feel was *too* important in my case, since a few hours is really nothing in the grand scheme of things.

Measuring things in “generations” is a little artificial since the final form of my generations had 200 rulesets, but I experimented with anywhere from 200 to 2000 or more. Different sizes result in correspondingly more or fewer generations until convergence, but in real-world time it’s hard to tell which would be faster without trying them (since a generation of fewer rulesets takes less time).

- Start with 2000 completely random rulesets.
- At each generation, evaluate the fitness function of everything you’ve got. Take the best 10 and breed random pairs together to produce 195 new rulesets. (In the video I mistakenly say 15 and 495 – that was from an earlier version of the program. I’m fairly confident that that version would have produced nearly identical results).
**Throw away all of your previous rulesets.**This is very important, otherwise, it is very very likely that you will end up in a local minimum where all of the best 15 are identical. You could experiment with throwing away all but the best 1, which would mitigate this but keep progress moving in the same direction. I opted to throw away everything because I was going out for a walk and I wanted to give my program a chance to explore many different avenues of solution without input from me (in the form of resetting the program).*I don’t think that the details are critical as long as you throw away most of what you had.***Add in 5 completely random rulesets.**You need to add in*something*. I question the efficacy of adding in just 5, but any more would start to dominate and make the program essentially brute force. Perhaps it’s worth experimenting further.- Pass these 195 + 5 rulesets to the next generation.

Loop until happy with your results.

Thanks for creating and sharing the video. You made me think of image compression. The rules for your pi image are very compact, and could be encoded using just 5 bits per color, with 9 colors per rule, for 30 rules, which adds up to 1350 bits. That’s in the ballpark of a very low quality JPEG! What do you think? Is there a nugget of a new image compression algorithm here?

LikeLike

You’re not the first to mention this, and I must say I have thought about it a little too. I’ll explore these ideas a little further, but my gut feeling is that actually finding these rules is so hard as to be impractical for large images (unless they are inordinately symmetrical) and that the end result has so much loss…

LikeLike

Really cool video, I was curious about your comment about how we don’t understand neural-nets in some ways as well as we understand genetic algorithms. Is there a field of math that could speak to how “fractle-able” a picture is? For example, you mentioned (in response to Ron Little) that the more symmetric a picture is, the more feasible this process might be as a compression-algorithm, have you more rigorously explored what makes a picture a feasible subject for this process?

LikeLike

Thanks! My comment on genetic algorithms vs neural networks was distilled into a one-liner for a general audience. My opinion, really, is slightly more involved. We understand well that a neural net is just a network of linear approximations with non-linearity between (see this post for a great discussion: http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/). What isn’t understood at all well is what exactly adding/removing a layer/node in a large neural network really does, in terms of its implications for quality and quantity of training data needed, and the eventual quality of the network.

I haven’t thought very much about genetic string rewriting and its applications to compression yet, but it seems intuitively clear that an image with greater self-similarity will be easier to encode in this way, thought that is true of most (all?!) compression algorithms. I’ve contacted a friend of mine in Cambridge who is far more knowledgable about compression than me, and we hope to explore this concept in greater detail soon.

LikeLike

Hey buddy. Awesome blog post and video. You mentioned you didn’t want to spend 90 pounds on the book. But guess what. There’s this beautiful website called http://Gen.Lib.Rus.Ec where you can download pretty much all the math books for free.

LikeLike

Brilliant, thank you!

LikeLike

Also I subscribed! Plus I wanted to say that you should watch the following lecture from MIT’s AI course just on genetic algorithms – https://www.youtube.com/watch?v=kHyNqSnzP8Y – this lecture has some extra insight into how to accelerate genetic algorithms and produce offsprings that converge to near-optimum solution super super quickly. Cheers.

LikeLike

I’m here because another former Corpuscle shared your Corpus Prime on Facebook. Amazing. I noticed your carpets and ended up here. The maths is way over my head but you have very clear delivery and an engaging style. I’ll keep coming back to read about more super interesting stuff that I don’t understand!

LikeLike