Build simple fuzzer - part 1

First of all, we are learning here and this fuzzer is in no way going to be a proper tool used against real targets (at least not initially). This is why we are going to code it in python. For real tools we should have picked something way closer to the metal - language that compile to a native code. Most of the fuzzers used professionally are written in C/C++ and some cool kids use Rust, but almost nobody uses python. Fuzzing is primarily about how many executions per second you can squeeze out and use of interpreted language incurs many speed penalties.


Second important thing is to pick a right target - we are going to use the exif library mentioned in h0mbre’s article because it was coded many years ago and will most likely spew crashes like there is no tomorrow. There is nothing worse than picking a target that might be actually well written. You will spend the rest of your day wondering if you suck at coding/fuzzing or maybe there are no crashes to be found. Remember - we are learning, we want some quick results.


Main parts of fuzzer

The premise of fuzzing is deceptively simple - you feed random data to a program and see if it crashed. Then you change the data a little bit and feed it to a program again. And again. And again. So essentially it is doing exactly the same thing over and over again expecting different outcomes. Like insanity. Before we move further there is one golden rule of fuzzing that you have to remember till the end of your days. It’s like the equivalent of Heisenberg principle or Schroedinger paradox - an observed fuzzer never crashes.


Back to the general architecture - every fuzzer has at least two main components - mutation and execution engine. This is roughly how I’ve initially implemented it (and remember, it heavily borrows from ‘Fuzzing like Cavemen’ article):


def main():
    if len(sys.argv) < 2:
        print('Usage: {} <valid_jpg>'.format(sys.argv[0]))
    else:
        filename = sys.argv[1]
        orig_data = get_bytes(filename)
        dbg = debugger.PtraceDebugger()
    
    counter = 0
    while counter < 100000:
        data = orig_data[:]
        mutated_data = mutate(data)
        create_new(mutated_data)
        execute_fuzz(dbg, mutated_data, counter)
    
    if counter % 100 == 0:
        print('Counter: {}\r'.format(counter)