Musings

Zig Readers and Writers

Zig 0.15 introduces a new Reader and Writer interface where "the buffer is in the interface". While there are a lot of examples of what this means, none of them quite struck me as clear. To learn more, I wanted to write an implementation of PackBits, which is a simple run-length compression scheme.

PackBits

PackBits uses a simple ruleset for encoding:

Header Byte    Data
0..127         1 + n literal bytes of data
129..255       one byte repeated 1 - n times (n read as a signed byte, i.e. 257 - n for the unsigned value)
128            skip
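To make the ruleset concrete, here is a minimal decoder sketch. The function name `unpackBits` and its error set are hypothetical, chosen only for illustration; it works on plain slices and does not use the new Writer interface yet. It treats the header as an unsigned byte, so the repeat count is computed as 257 - header, which is the same value as 1 - n when n is read as a signed byte.

```zig
const std = @import("std");

/// Decodes PackBits data from `input` into `out` and returns the number of
/// bytes written. Hypothetical helper for illustration only.
fn unpackBits(input: []const u8, out: []u8) error{ Truncated, NoSpaceLeft }!usize {
    var i: usize = 0; // read position in `input`
    var o: usize = 0; // write position in `out`
    while (i < input.len) {
        const header = input[i];
        i += 1;
        if (header == 128) continue; // skip: the header carries no data
        if (header < 128) {
            // Literal run: copy the next 1 + n bytes unchanged.
            const count: usize = @as(usize, header) + 1;
            if (input.len - i < count) return error.Truncated;
            if (out.len - o < count) return error.NoSpaceLeft;
            @memcpy(out[o..][0..count], input[i..][0..count]);
            i += count;
            o += count;
        } else {
            // Repeat run: the next byte is repeated 257 - header times.
            const count: usize = 257 - @as(usize, header);
            if (i >= input.len) return error.Truncated;
            if (out.len - o < count) return error.NoSpaceLeft;
            @memset(out[o..][0..count], input[i]);
            i += 1;
            o += count;
        }
    }
    return o;
}

test "literal run, repeat run, skip" {
    var out: [16]u8 = undefined;
    // 0x02: three literal bytes; 0xFE: one byte repeated three times; 0x80: skip
    const packed_data = [_]u8{ 0x02, 'a', 'b', 'c', 0xFE, 'z', 0x80 };
    const n = try unpackBits(&packed_data, &out);
    try std.testing.expectEqualStrings("abczzz", out[0..n]);
}
```

Running this with `zig test` exercises all three header kinds in one input.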

One thing to note is that we can compress at most 127 bytes of data at a time, so we should ensure that whatever buffer we're using is at least 127 bytes. The buffer is what we will scan to compute our run lengths. In this simple version, we just want to get something working; we can figure out how to make it more flexible later.

Writer

A std.Io.Writer is a struct in Zig that consists of a buffer, an end index into that buffer marking how much of it currently holds data (buffer[0..end]), and a vtable of functions that provide the implementation details. Only one function is required, which is drain:

// drain function
pub const VTable = struct {
    /// Sends bytes to the logical sink. A write will only be sent here if it
    /// could not fit into `buffer`, or during a `flush` operation.
    ///
    /// `buffer[0..end]` is consumed first, followed by each slice of `data` in
    /// order. Elements of `data` may alias each other but may not alias
    /// `buffer`.
    ///
    /// This function modifies `Writer.end` and `Writer.buffer` in an
    /// implementation-defined manner.
    ///
    /// `data.len` must be nonzero.
    ///
    /// The last element of `data` is repeated as necessary so that it is
    /// written `splat` number of times, which may be zero.
    ///
    /// This function may not be called if the data to be written could have
    /// been stored in `buffer` instead, including when the amount of data to
    /// be written is zero and the buffer capacity is zero.
    ///
    /// Number of bytes consumed from `data` is returned, excluding bytes from
    /// `buffer`.
    ///
    /// Number of bytes returned may be zero, which does not indicate stream
    /// end. A subsequent call may return nonzero, or signal end of stream via
    /// `error.WriteFailed`.
    drain: *const fn (w: *Writer, data: []const []const u8, splat: usize) Error!usize,
    /// ...other functions that we don't need
};

drain is a heavily loaded function. We are expected to handle the buffer, the slices of data, and the repetition requested by splat. I think this documentation is unclear about what's required versus what is available for optimization. The most important part, in my view, is in the last two paragraphs: the implementation may (at its discretion) consume only the contents of the buffer and report that it consumed nothing from data. This means we can forget about data and splat, and the higher-level methods of Writer will respond accordingly.
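Even if we end up ignoring data and splat, it helps to see what a single drain call is being asked to write. The sketch below is a reading of the doc comment, not code from the standard library: the helper name `logicalLen` is hypothetical. Every slice of data counts once, in order, except the last one, which counts splat times.

```zig
const std = @import("std");

/// Number of bytes that `data` and `splat` describe together, not counting
/// anything already sitting in the writer's buffer. Hypothetical helper.
fn logicalLen(data: []const []const u8, splat: usize) usize {
    std.debug.assert(data.len != 0); // the doc comment requires a nonzero `data.len`
    const last = data[data.len - 1];
    var total: usize = last.len * splat; // only the last slice is repeated
    for (data[0 .. data.len - 1]) |slice| total += slice.len;
    return total;
}

test "splat repeats only the last slice" {
    const data = [_][]const u8{ "ab", "c" };
    // "ab" once, then "c" three times: the logical bytes are "abccc".
    try std.testing.expectEqual(@as(usize, 5), logicalLen(&data, 3));
    // splat may be zero, in which case the last slice contributes nothing.
    try std.testing.expectEqual(@as(usize, 2), logicalLen(&data, 0));
}
```

A drain that returns 0 is telling the caller that none of these bytes were consumed, only the ones in buffer.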

So in practice, we should:

  1. Consume all of the data in buffer[0..end]
  2. Move end to 0 to indicate that we cleared the buffer
  3. Return 0 to indicate that we didn't consume anything from data.

Consuming the entire buffer isn't required, but doing so keeps the implementation simpler.
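The three steps above can be sketched as follows. `CountingSink` is a hypothetical type invented for this example: its "sink" is just a running byte count, which stands in for whatever would really be done with the bytes. The sketch assumes the other vtable entries have usable defaults (as they appear to in the 0.15 standard library), and that every individual write fits in the buffer once it has been emptied. A single write larger than the buffer could never make progress, because this drain never consumes anything from data.

```zig
const std = @import("std");
const Writer = std.Io.Writer;

/// Hypothetical sink that only counts the bytes handed to it by `drain`.
const CountingSink = struct {
    total: usize = 0,
    writer: Writer,

    fn init(buffer: []u8) CountingSink {
        return .{ .writer = .{ .vtable = &.{ .drain = drain }, .buffer = buffer } };
    }

    fn drain(w: *Writer, data: []const []const u8, splat: usize) Writer.Error!usize {
        // We choose not to look at `data` or `splat` at all.
        _ = data;
        _ = splat;
        // Recover the enclosing struct from the pointer to its `writer` field.
        const self: *CountingSink = @alignCast(@fieldParentPtr("writer", w));
        self.total += w.buffer[0..w.end].len; // 1. consume buffer[0..end]
        w.end = 0; // 2. mark the buffer as empty
        return 0; // 3. nothing was consumed from `data`
    }
};

test "every byte eventually passes through drain" {
    var buf: [8]u8 = undefined;
    var sink: CountingSink = .init(&buf);
    const w = &sink.writer; // use the field in place; do not copy the Writer
    try w.writeAll("hello");
    try w.writeAll("world"); // does not fit next to "hello" in 8 bytes
    try w.flush();
    try std.testing.expectEqual(@as(usize, 10), sink.total);
    try std.testing.expectEqual(@as(usize, 0), w.end);
}
```

Because drain recovers its parent with @fieldParentPtr, the Writer has to be used through a pointer to the field inside the sink, as the test does.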
