LeRoyce Pearson
LeRoyce Pearson

Posted on

Thoughts on Parsing Part 3 - Write Cursors

In djot.zig, the parsed document is, at it's core, a list of events. While I'm parsing the document, I want to reduce the amount of redundant/temporary allocations that are done, so I put all of the events directly into the ArrayList that will make up the finished Document. To handle resetting the state, I use a pattern similar to the Read Cursors I discussed in the last post.

Posts in my Thoughts on Parsing series:

  1. Part 1 - Parsed Representation
  2. Part 2 - Read Cursors
  3. Part 3 - Write Cursors

What is a Write Cursor

Like a read cursor, it's really just a pointer to an index. However this will be an index into a growable array, not a slice. Let's modify the parseIntList function from last time to use this pattern:

fn parseIntList(list: *std.ArrayList(u64), parent_event_index: *usize, text: []const u8, parent: *usize) !?void {
    const start = parent.*;
    var index = start;
    var event_index = parent_event_index.*;

    // A list starts with `[`
    _ = parseExpectCharacter(text, &index, '[') orelse return null;

    while (true) {
        // Here we ignore the `.len` field on the ArrayList.
        // We use `event_index` as our source of truth of where
        // to write to.
        const int = parseInt(text, &index) orelse break; 
        try list.resize(event_index + 1);
        list.items[event_index] = int;
        event_index += 1;

        _ = parseExpectCharacter(text, &index, ',') orelse break;

    // Ends with a `]`
    _ = parseExpectCharacter(text, &index, ']') orelse return null;

    parent.* = index;
    parent_event_index.* = event_index;
Enter fullscreen mode Exit fullscreen mode

Using an ArrayList like this let's us fulfill the properties I set out for Read Cursor's in the last post:

  1. easy to copy: It's a pointer to an ArrayList and an index into it
  2. easy to update: The ArrayList handles copying over data when it needs to grow the list

Now we have a Transactional way to write data.


djot.zig gets a couple of benefits from using this pattern:

  • Reduced allocations: Since I can reuse a single ArrayList, djot.zig not only avoids allocating lists that would've just been copied over anyway, it also benefits from the amortization of allocation that ArrayLists provide.
  • Simpler lifetime management: As nested objects grow, it becomes more difficult to track what can be freed and what needs to stick around. The single ArrayList of events gives us a single space to check, and a single region of memory to free.

In the next post I'll talk about how I'm wrapping these patterns into a concrete types.

Top comments (0)