Zig NEWS

How to Add Buffering to a Reader / Writer in Zig

Loris Cro
I swear I didn't put that bug there
Updated on ・4 min read

Once you get access to an open file descriptor, you can start reading or writing to it. In this example we're going to use stdin and stdout, but the same applies to sockets, files, and any stream that offers a reader/writer interface.

Why buffer reads and writes?

Long story short: for performance.

Every time you issue an unbuffered read or write, the program executes a syscall to ask the OS to perform the corresponding operation. Unfortunately, syscalls are slow compared to ordinary function calls: each one has to switch from user mode into the kernel and back, and then navigate through lots of abstraction layers in the system.

On top of that, many situations require issuing many small reads/writes. For example, many parsers try to read one token at a time. Buffering lets you batch those small reads/writes, resulting in a much lower number of syscalls.

While buffering is just a pattern like many others, it's a very useful one to know: it lets you keep things clean at a high level (e.g., parser code) while drastically improving performance at a lower level, with very little complexity added between the two layers.

Buffering stdout

First we need to get a handle to stdout.

const std = @import("std");

pub fn main() !void {
   const out = std.io.getStdOut();
   _ = out; // placeholder so this snippet compiles on its own; `out` is used in the next steps
}

out is one specific type of thing that can be written to. You can obtain a unified writer interface by calling its writer() method. This is what gives you access to (unbuffered) print and other similar methods (as exposed by the generic writer interface).

const w = out.writer();
try w.print("Hello {s}!", .{"World"});

More on writer interfaces
If you look at how Writer is defined in the standard library, you will notice that it is a generic struct. Why is that?

There are multiple ways of implementing interfaces in Zig, with different degrees of runtime dynamism.

This specific implementation is concerned with two main things. The first is knowing the set of possible errors that the stream can produce, so that they can be included in the error set of print etc. The second is the ability to pass a (correctly typed) reference to the original stream to the write function, which is the only primitive that a stream has to expose for print and all the other goodies already implemented in Writer to work.
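
To make those two points concrete, here is a minimal sketch of a custom stream. LimitedStream is a hypothetical type invented for this illustration, not something from the standard library, and the code assumes a Zig release from before the 0.15 I/O rework, where std.io.Writer is still the generic type described here.

```zig
const std = @import("std");

// Hypothetical stream that accepts a limited number of bytes and then fails.
const LimitedStream = struct {
    remaining: usize,

    // The set of errors this stream can produce.
    pub const WriteError = error{NoSpaceLeft};

    // Generic parameters: the correctly typed reference to the stream,
    // its error set, and the one primitive it has to expose.
    pub const Writer = std.io.Writer(*LimitedStream, WriteError, write);

    fn write(self: *LimitedStream, bytes: []const u8) WriteError!usize {
        if (bytes.len > self.remaining) return error.NoSpaceLeft;
        self.remaining -= bytes.len;
        return bytes.len;
    }

    pub fn writer(self: *LimitedStream) Writer {
        return .{ .context = self };
    }
};

pub fn main() !void {
    var stream = LimitedStream{ .remaining = 8 };
    const w = stream.writer();

    // print comes for free from Writer; 5 bytes fit in the stream.
    try w.print("{s}", .{"Hello"});

    // print's error set is the stream's WriteError, so this switch is
    // exhaustive with a single prong.
    w.print("{s}", .{"World"}) catch |err| switch (err) {
        error.NoSpaceLeft => std.debug.print("stream is full\n", .{}),
    };
}
```

The stream itself only implements write; print, writeAll and the rest are provided by Writer, and they report exactly the errors that write can return.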


Obtain a BufferedWriter

Buffering is implemented as a sort of wrapper around Writer and, similarly to Writer itself, it's a generic type because it needs, among other things, to know the error set of the underlying Writer so that those errors can be added to the error set exposed by its implementation of print etc.

BufferedWriter is implemented in std.io and its (type) constructor has the following signature:

pub fn BufferedWriter(comptime buffer_size: usize, comptime WriterType: type) type

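Here you can see that it needs WriterType, as mentioned above, but it also needs a buffer_size. This has to be a comptime parameter because it determines the size of the array that buffers the writes, and that array is stored directly inside the BufferedWriter value. If you declare the BufferedWriter as a local variable, as we do in this article, the buffer takes up stack memory. This is an important detail to know:

  1. the buffer is a fixed-size array that lives wherever the BufferedWriter lives (on the stack, for a local variable)
  2. the BufferedWriter doesn't do dynamic allocations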

At this point you could use that function directly to obtain a buffered writer, but if you look at the same file where it's implemented, near the bottom you will see a very nice helper function that does all this work for us. Here is the full implementation:

pub fn bufferedWriter(underlying_stream: anytype) BufferedWriter(4096, @TypeOf(underlying_stream)) {
    return .{ .unbuffered_writer = underlying_stream };
}

As you can see, it automatically wires in the generic parameter and defaults to a 4 KiB (4096-byte) buffer. Pretty handy!
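
If the default size doesn't fit your use case, you can call the type constructor yourself. The sketch below does what the helper does, just with a caller-chosen size. The greet function and the value 256 are made up for this example, and the code assumes a standard library from before the 0.15 I/O rework, where BufferedWriter has the unbuffered_writer field shown in the helper.

```zig
const std = @import("std");

// Hypothetical helper: wraps `underlying` in a BufferedWriter with a
// caller-chosen buffer size, writes a greeting and flushes.
fn greet(comptime buffer_size: usize, underlying: anytype) !void {
    var buf = std.io.BufferedWriter(buffer_size, @TypeOf(underlying)){
        .unbuffered_writer = underlying,
    };
    const w = buf.writer();

    try w.print("Hello {s}!\n", .{"World"});
    try buf.flush();
}

pub fn main() !void {
    const out = std.io.getStdOut();
    // 256 is an arbitrary size picked for illustration.
    try greet(256, out.writer());
}
```

Since buffer_size is part of the type, two BufferedWriters with different sizes are different types, even when they wrap the same kind of writer.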

Flushing

The BufferedWriter will automatically flush (i.e., issue a write syscall with the content of its buffer and empty it) when full, but it has no way of knowing when you intend to issue the last write. For this reason you need to conclude your writing session with a call to its flush method.
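
Here is a small sketch of what happens if you forget. It swaps stdout for an in-memory stream (std.io.fixedBufferStream) so that we can check how much data has actually reached the underlying writer. The Observed struct and the observeFlush function are names made up for this example, and the code assumes a pre-0.15 standard library.

```zig
const std = @import("std");

const Observed = struct { before: usize, after: usize };

// Writes through a BufferedWriter into an in-memory stream and reports how
// many bytes had reached that stream before and after the call to flush().
fn observeFlush() !Observed {
    var storage: [64]u8 = undefined;
    var sink = std.io.fixedBufferStream(&storage);

    var buf = std.io.bufferedWriter(sink.writer());
    const w = buf.writer();

    // 12 bytes, far less than the 4096-byte buffer, so nothing is
    // forwarded to the underlying stream yet.
    try w.print("Hello {s}!", .{"World"});
    const before = sink.getWritten().len;

    try buf.flush();
    const after = sink.getWritten().len;

    return Observed{ .before = before, .after = after };
}

pub fn main() !void {
    const seen = try observeFlush();
    std.debug.print("before flush: {d} bytes, after flush: {d} bytes\n", .{ seen.before, seen.after });
}
```

This should report 0 bytes before the flush and 12 bytes after it. With a real stdout, the "before" state is exactly what you are left with if the program exits without flushing.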

Writer Writer

BufferedWriter, despite its name, is not itself a Writer. Instead, it just implements write() and exposes a Writer interface to gain access to the usual functionality (e.g., print). This way it can reuse the same implementation that all other writers share.

From an architectural perspective, the full "abstraction cake" looks like this:

[Writer]
   ▽
[BufferedWriter]
   ▽
[Writer]
   ▽
[Stdout]

The final code should look like this:

const std = @import("std");

pub fn main() !void {
   const out = std.io.getStdOut();
   var buf = std.io.bufferedWriter(out.writer());

   // Get the Writer interface from BufferedWriter
   const w = buf.writer();

   try w.print("Hello {s}!", .{"World"});

   // Don't forget to flush!
   try buf.flush();
}
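
To see the batching effect in isolation, the following sketch replaces stdout with a stream that only counts how many times its write function is called, each call standing in for one syscall. WriteCounter, writeTokens, unbufferedCalls and bufferedCalls are hypothetical names introduced for this example, and the code assumes a pre-0.15 standard library where std.io.Writer is the generic type.

```zig
const std = @import("std");

// Hypothetical stand-in for a file descriptor: every call to `write`
// plays the role of one write syscall.
const WriteCounter = struct {
    calls: usize = 0,

    const Error = error{};
    const Writer = std.io.Writer(*WriteCounter, Error, write);

    fn write(self: *WriteCounter, bytes: []const u8) Error!usize {
        self.calls += 1;
        return bytes.len;
    }

    fn writer(self: *WriteCounter) Writer {
        return .{ .context = self };
    }
};

// Issues `n` tiny writes, the way token-at-a-time code might.
fn writeTokens(w: anytype, n: usize) !void {
    var i: usize = 0;
    while (i < n) : (i += 1) try w.writeAll("ab");
}

// Number of underlying writes when writing directly to the stream.
fn unbufferedCalls(n: usize) !usize {
    var sink = WriteCounter{};
    try writeTokens(sink.writer(), n);
    return sink.calls;
}

// Number of underlying writes when going through a BufferedWriter.
fn bufferedCalls(n: usize) !usize {
    var sink = WriteCounter{};
    var buf = std.io.bufferedWriter(sink.writer());
    try writeTokens(buf.writer(), n);
    try buf.flush();
    return sink.calls;
}

pub fn main() !void {
    std.debug.print("unbuffered: {d} writes, buffered: {d} writes\n", .{
        try unbufferedCalls(100),
        try bufferedCalls(100),
    });
}
```

With 100 two-byte writes, the unbuffered path reaches the stream 100 times, while the buffered path reaches it once, when flush is called, because 200 bytes fit comfortably in the default buffer.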

About Readers

The same thing that we've seen for BufferedWriter also applies to readers: there's a Reader interface and a BufferedReader generic type implemented in std.io. The only difference is that you don't have to flush() readers.

Here's some sample code just for the sake of clarity:

const std = @import("std");

pub fn main() !void {
   const in = std.io.getStdIn();
   var buf = std.io.bufferedReader(in.reader());

   // Get the Reader interface from BufferedReader
   const r = buf.reader();

   std.debug.print("Write something: ", .{});
   // Ideally we would want to issue more than one read
   // otherwise there is no point in buffering.
   var msg_buf: [4096]u8 = undefined;
   const msg = try r.readUntilDelimiterOrEof(&msg_buf, '\n');

   if (msg) |m| {
       std.debug.print("msg: {s}\n", .{m});
   }
}
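
As the comment in the sample says, buffering pays off when you issue many reads. Here is a sketch of that situation: a loop that reads one line at a time until the input ends. The countLines function is a name made up for this example, and the code assumes a pre-0.15 standard library where readUntilDelimiterOrEof is available on readers.

```zig
const std = @import("std");

// Reads '\n'-delimited lines one at a time and returns how many it saw.
// Lines longer than the 256-byte scratch buffer make it return an error.
fn countLines(reader: anytype) !usize {
    var line_buf: [256]u8 = undefined;
    var count: usize = 0;
    while (try reader.readUntilDelimiterOrEof(&line_buf, '\n')) |_| {
        count += 1;
    }
    return count;
}

pub fn main() !void {
    const in = std.io.getStdIn();
    var buf = std.io.bufferedReader(in.reader());

    // Every small read inside countLines is served from the buffer;
    // the underlying stream is only read when the buffer runs empty.
    const n = try countLines(buf.reader());
    std.debug.print("{d} lines\n", .{n});
}
```

Because countLines takes any reader, the same function also works on an unbuffered reader, which makes it easy to compare the two.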

Discussion (3)

Dmitry Matveyev

I think there's a typo reader()->writer()

out is one specific type of thing that can be written to. You can obtain a unified writer interface by calling its reader() method. This is what gives you access to (unbuffered) print and other similar methods (as exposed by the generic writer inferface).

Loris Cro Author

Fixed, thank you very much!

David Vanderson

Great explanation and thanks for the code examples!