Build simple fuzzer - part 4


The right way to start this part is by apologizing to all of you who waited so long for it. I had a pretty busy and yet not terribly productive week. The result was a grave need for rest and reset - that caused a delay in writing this blog post. Thank you all for the patience and I hope you will enjoy the fourth part of the Build your own fuzzer series.


In the previous part we’ve added some instrumentation that allowed us to track the execution coverage. This cost us a lot of performance, but it was a necessary sacrifice, as the coverage is an essential part of the next iteration of our fuzzer. In the end we want it to be able to select the most promising samples for further mutation and discard the ones that we consider inferior. So, in essence, “it’s evolution, baby”.


As always, I will be presenting relevant code fragments, but for the sake of space conservation some boring parts will be skipped. The full code, if you are inclined to read it, is available on my GitHub. Eventually commits will be tagged properly and links will point to a version of the fuzzer relevant for a given part of the series.


Before we dive into genetic algorithms there are a few things we need to fix and reimplement. The first one is our basic coverage and how it is represented.


Simpler and faster coverage

As you might remember we had a fairly naive implementation where we stored