Andres

Posted on Sep 7, 2021

Crafting an Interpreter in Zig - part 2

#learn

This is the second post in Crafting an Interpreter in Zig, the blog series where I read the book Crafting Interpreters, implement Part III in Zig, highlight some of the C features used in the book, and compare them with how I did the same thing in Zig. You can find the code here, and if you haven't read the first part, you can read it here.

In this chapter, chapter 15, we start implementing the virtual machine (VM) that will power our interpreter. One of the first things that caught my attention was making the VM a static global instance. The author argues that while a global instance might be bad for larger code bases, it is good enough for this book, and that the benefit of not having to pass a pointer to the VM to every function outweighs the potential problems of a global instance. Zig offers a nice way to call functions that take a pointer to an instance of a struct, a.k.a. methods, using dot syntax.

In Zig it is as easy as defining your function like this

pub const Vm = struct {
    const Self = @This();

    pub fn interpret(self: *Self, chunk: *Chunk) InterpretError!void {...}
};

Note that const Self = @This(); is optional: it is just an alias for the enclosing struct type, so you could write *Vm instead. We can call the method like this

var vm = Vm.init();
defer vm.deinit();
try vm.interpret(&chunk);

Compare that to the way it is implemented in the book using C.

// definition
InterpretResult interpret(Chunk* chunk);

// usage
interpret(&chunk);

No big difference in how we call it, but in the C version it is not clear that the function uses, mutates, and needs the VM, which can definitely be confusing if you are seeing a code base for the first time. In Zig we are avoiding this by making it clear that the function uses the VM instance and avoiding some other potential pitfalls of declaring our VM instance globally.
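
To make the method-call syntax concrete, here is a minimal sketch, assuming a recent Zig release. The Chunk struct, the executed field, and the body of interpret are hypothetical stand-ins so the example compiles on its own; they are not the code from the repository. It shows that the dot syntax is only sugar for passing the instance as the first argument.

```zig
const std = @import("std");

// Hypothetical stand-in for the real Chunk; only here to make the sketch compile.
const Chunk = struct {
    code: []const u8,
};

const InterpretError = error{
    CompileError,
    RuntimeError,
};

const Vm = struct {
    // Illustrative field (not from the post): number of bytes "executed" so far.
    executed: usize = 0,

    pub fn init() Vm {
        return Vm{};
    }

    pub fn deinit(self: *Vm) void {
        self.* = undefined;
    }

    // Placeholder body: rejects empty chunks and counts the bytes it was given.
    pub fn interpret(self: *Vm, chunk: *const Chunk) InterpretError!void {
        if (chunk.code.len == 0) return error.RuntimeError;
        self.executed += chunk.code.len;
    }
};

pub fn main() !void {
    const chunk = Chunk{ .code = &[_]u8{ 0, 1, 2 } };

    var vm = Vm.init();
    defer vm.deinit();

    // Method-call syntax: the compiler passes &vm as the first argument.
    try vm.interpret(&chunk);
    // The same call written out as a plain function call.
    try Vm.interpret(&vm, &chunk);

    std.debug.assert(vm.executed == 6);
}
```

Either way the VM instance is visible at the call site, which is the point being made above about the C version.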

Another thing you might notice from the previous code snippets is that the C version returns InterpretResult. In the book, InterpretResult is an enum with two error values and one success value. The Zig version does not need that: our function returns InterpretError!void, where InterpretError lists only the errors that can happen in this specific function. Also note that we call the function with the try keyword. Zig does not let us silently discard an error, so this makes it explicit that the function can fail and forces the users of the code (myself only) to handle the errors. We define InterpretError like so.

pub const InterpretError = error{
    CompileError,
    RuntimeError,
};

So far I have not used the error handling features of Zig for this codebase, but I am sure they will come in handy once we start reporting and handling errors, since Zig's story around errors is very solid.
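
As a rough sketch of what handling these errors could look like, the example below uses catch with a switch instead of try. The free function interpret, the exitCode helper, and the exit code values are assumptions made for illustration only; the real interpret is a method on the VM.

```zig
const std = @import("std");

pub const InterpretError = error{
    CompileError,
    RuntimeError,
};

// Hypothetical stand-in for Vm.interpret: fails on demand so both error paths run.
fn interpret(source: []const u8) InterpretError!void {
    if (source.len == 0) return error.CompileError;
    if (source[0] == '!') return error.RuntimeError;
}

// Maps the outcome to an exit code; the specific numbers are illustrative.
fn exitCode(source: []const u8) u8 {
    interpret(source) catch |err| switch (err) {
        error.CompileError => return 65,
        error.RuntimeError => return 70,
    };
    return 0;
}

pub fn main() void {
    std.debug.assert(exitCode("1 + 2") == 0);
    std.debug.assert(exitCode("") == 65);
    std.debug.assert(exitCode("!") == 70);
}
```

Because the switch is over an error set, the compiler checks that every error is covered, so adding a new error to InterpretError would turn this into a compile error until the new case is handled.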

This chapter heavily uses the C preprocessor. The author uses it for conditional compilation, to reduce boilerplate, and as a form of code reuse. I remember watching a talk by Andrew Kelley, the creator of a niche programming language that nobody uses (I am kidding; if you don't know him, Andrew Kelley is the President of the Zig Software Foundation and creator of Zig), where he mentions all the problems that the C preprocessor brings to the C programming language and says it was one of the things he specifically wanted to improve over C. So we have no preprocessor in Zig. Can we accomplish the same goals as the author of the book without one? Let's see...

The author uses the preprocessor to conditionally compile debug-only code

#ifdef DEBUG_TRACE_EXECUTION
    disassembleInstruction(vm.chunk,
                           (int)(vm.ip - vm.chunk->code));
#endif

We don't want this piece of code to be part of the executable once we disable debug tracing. Zig accomplishes this with just normal code. Zig tries to evaluate as much code as possible at compile time. It even has a keyword, comptime, to force compile-time execution of specific blocks of code, and some values exist only at compile time; for example, all types can be used as values at compile time. This enables very powerful features like generics. For this specific use of the preprocessor we just need to write this Zig code.

if (DEBUG_TRACE_EXECUTION) {
    debug.disassemble_instruction(self.chunk, @ptrToInt(self.ip) - @ptrToInt(self.chunk.code.items.ptr));
}

I am defining DEBUG_TRACE_EXECUTION at the top level of the file.
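
One way to convince yourself that the disabled branch really is left out is the sketch below. It assumes DEBUG_TRACE_EXECUTION is a const bool, and the step function is a hypothetical placeholder for the interpreter loop. When the condition of an if is known at compile time, Zig only analyzes the branch that is taken, so even a @compileError inside the dead branch does not stop the build.

```zig
const std = @import("std");

// Setting this to true makes the build fail on the @compileError below,
// which shows the branch is only analyzed when the flag is true.
const DEBUG_TRACE_EXECUTION = false;

// Hypothetical placeholder for one iteration of the interpreter loop.
fn step(ip: usize) usize {
    if (DEBUG_TRACE_EXECUTION) {
        // Never analyzed while the flag is false.
        @compileError("tracing was compiled in");
    }
    return ip + 1;
}

pub fn main() void {
    std.debug.assert(step(0) == 1);
}
```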

The author also uses the C preprocessor to add more semantic meaning to some pointer operations, for example

#define READ_BYTE() (*vm.ip++)
#define READ_CONSTANT() (vm.chunk->constants.values[READ_BYTE()])

// ... Some code
#undef READ_BYTE
#undef READ_CONSTANT

As the author says, "Undefining these macros explicitly might seem needlessly fastidious, but C tends to punish sloppy users, and the C preprocessor doubly so." So don't forget to be tidy when using the preprocessor.

For my Zig implementation I used normal methods, so I don't need to be as tidy as I would be if I were using the C preprocessor.

fn read_instruction(self: *Self) OpCode {
    const instruction = @intToEnum(OpCode, self.ip[0]);
    self.ip += 1;
    return instruction;
}

fn read_constant(self: *Self) Value {
    const constant = self.chunk.constants.items[self.ip[0]];
    self.ip += 1;
    return constant;
}

I hope that Zig inlines these function calls. I will see in the future whether I need to optimize this, given that it is on the hottest path in our code, but for now it doesn't look necessary.
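
If relying on the optimizer is not enough, Zig also lets you mark a function declaration as inline, which requests inlining at every call site. The sketch below assumes a recent Zig release. It uses a simplified Vm that tracks ip as an index into a slice rather than as a pointer, so the fields and the read_byte name are illustrative and not the repository's code.

```zig
const std = @import("std");

const Vm = struct {
    code: []const u8,
    ip: usize = 0,

    // `inline` asks the compiler to inline every call to this function,
    // which brings it close to the READ_BYTE macro from the book.
    inline fn read_byte(self: *Vm) u8 {
        const byte = self.code[self.ip];
        self.ip += 1;
        return byte;
    }
};

pub fn main() void {
    var vm = Vm{ .code = &[_]u8{ 7, 9 } };
    std.debug.assert(vm.read_byte() == 7);
    std.debug.assert(vm.read_byte() == 9);
    std.debug.assert(vm.ip == 2);
}
```

Whether this is actually faster than letting the compiler decide is something to measure rather than assume.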

The last use of the preprocessor I want to highlight is its use as a tool for code reuse and generic programming. The author defines this macro

#define BINARY_OP(op) \
    do { \
      double b = pop(); \
      double a = pop(); \
      push(a op b); \
    } while (false)

and uses it like so

// Inside a switch statement
case OP_ADD:      BINARY_OP(+); break;
case OP_SUBTRACT: BINARY_OP(-); break;
case OP_MULTIPLY: BINARY_OP(*); break;
case OP_DIVIDE:   BINARY_OP(/); break;

This one is really interesting. Other than the weird do/while that wraps everything, this macro basically lets us use the binary operators as first-class constructs and reduces some code duplication. I couldn't find a direct way of translating this to Zig. I thought about using function pointers and wrapping the math operators in functions so I could pass them around, I looked through the standard library for functions already defined for these basic operations, and so on. In the end I took inspiration from the @reduce builtin function, which takes as its first argument an enum of the possible operations it can perform. Here is the signature

@reduce(comptime op: std.builtin.ReduceOp, value: anytype) std.meta.Child(value)
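
For readers who have not used @reduce, here is a small sketch of how it is called. The vector contents are arbitrary example data. The first argument is a std.builtin.ReduceOp value such as .Add, .Mul, or .Max, and it selects the operation applied across the vector's elements.

```zig
const std = @import("std");

pub fn main() void {
    const v = @Vector(4, i32){ 1, 2, 3, 4 };

    // The first argument selects the operation at compile time.
    const sum = @reduce(.Add, v);
    const product = @reduce(.Mul, v);
    const largest = @reduce(.Max, v);

    std.debug.assert(sum == 10);
    std.debug.assert(product == 24);
    std.debug.assert(largest == 4);
}
```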

I did something similar when defining my binary_op function

fn binary_op(self: *Self, op: BinaryOp) void {
    const b = self.pop();
    const a = self.pop();
    const result = switch (op) {
        .add => a + b,
        .sub => a - b,
        .mul => a * b,
        .div => a / b,
    };
    self.push(result);
}

And I use it like this

//... Inside a switch statement
.op_add => self.binary_op(.add),
.op_sub => self.binary_op(.sub),
.op_mul => self.binary_op(.mul),
.op_div => self.binary_op(.div),

If you know a better way or have an idea on how to solve this problem in a different and interesting way, please let me know.
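
One variation on the binary_op function above is to mark the op parameter as comptime. Each call site is then specialized for its operator and the switch is resolved at compile time, which brings the function closer to the C macro. The sketch below is self-contained under some assumptions: the fixed-size stack, the top field, and the push and pop bodies are simplified stand-ins, and values are plain f64.

```zig
const std = @import("std");

const BinaryOp = enum { add, sub, mul, div };

const Vm = struct {
    // Fixed-size stack to keep the sketch short; the size is arbitrary.
    stack: [16]f64 = undefined,
    top: usize = 0,

    fn push(self: *Vm, value: f64) void {
        self.stack[self.top] = value;
        self.top += 1;
    }

    fn pop(self: *Vm) f64 {
        self.top -= 1;
        return self.stack[self.top];
    }

    // `comptime op` specializes the function per operator, so only the
    // selected switch prong is compiled into each specialization.
    fn binary_op(self: *Vm, comptime op: BinaryOp) void {
        const b = self.pop();
        const a = self.pop();
        const result = switch (op) {
            .add => a + b,
            .sub => a - b,
            .mul => a * b,
            .div => a / b,
        };
        self.push(result);
    }
};

pub fn main() void {
    var vm = Vm{};
    vm.push(6);
    vm.push(3);
    vm.binary_op(.sub); // 6 - 3
    vm.push(4);
    vm.binary_op(.mul); // 3 * 4
    std.debug.assert(vm.pop() == 12);
}
```

The call sites stay exactly as they were, since .add, .sub, and the other enum literals are already known at compile time.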

In this post we saw how, in Zig, we can live without a C preprocessor equivalent across a variety of examples. Compile-time execution solves a lot of the problems for which we would normally reach for the C preprocessor. Sometimes Zig offers a better solution than the preprocessor does, sometimes it is just as good, and sometimes it requires a bit more code, but overall the experience with comptime is pleasant. It takes some intuition to know exactly when an expression will be evaluated at compile time or at runtime, but the more I use Zig the more natural it feels.

Hope you liked this post and see you in the next one!

Cover Photo by Anthony Shkraba from Pexels.

Top comments (3)

Loris Cro • Sep 8 '21

I've never done a Crafting Interpreters run, but I'm happy to follow along your experience. It's a bit like watching somebody on Twitch do a full run of a game. Keep it up!

I have two comments that might interest you about the content:

I hope that Zig inline this function calls.

You can use the inline keyword to enforce this. Compilation will fail if for some reason Zig will not be able to inline a call so you know for sure that those functions will be functionally equivalent to the original C macro.

If you now a better way or have an idea on how to solve this problem in different and interesting way, please let me know.

One thing that you can do is mark the op argument as comptime, which will allow Zig to resolve the switch at comptime, basically making the function equivalent to the C macro.

fn binary_op(self: *Self, comptime op: BinaryOp) void

One final suggestion: Forem supports article series, go in the edit page of your first article and there you will find an option to create a series, then go to the edit page of this new one and add it to the same series. Doing so will add a neat table of content at the top of each post that gets automatically updated when you add a new one.

Cheers!

Antonio Patriarca • Jan 23 '22

I am currently doing a Crafting Interpreters run as well. While I like your approach for the binary op macro, I personally found it was more work than just copying the code multiple times, particularly because, after following the suggestion in challenge 4, it is just two lines of code for each op:

self.stack[self.stackTop - 2] += self.stack[self.stackTop - 1];
self.stackTop -= 1;

David Vanderson • Sep 9 '21

This is great. Very nice to contrast to how it's done in C (with preprocessor). Thanks!
