LeRoyce Pearson

Posted on Dec 19, 2023

Wayland From the Wire: Part 2

#wayland #windowing

To write a graphical application for Wayland, you need to connect to a Wayland server, make a window, and render something to it.

This article is part 2 of a series:

Wayland From the Wire: Part 1 -- We connect to a Wayland compositor and get a list of global objects. We pick out the global objects that we need to create a window with a framebuffer.
Wayland From the Wire: Part 2 -- We create a Window and a Framebuffer

By the end of this series, you should have a window that looks like this:

And here are some useful links:

Wayland Explorer: A site that makes the Wayland protocols easy to browse
zig-wayland-wire: The library I wrote while learning enough to write these articles.
zig-wayland-wire/examples/00_client_connect: This is the code you should have by the end of this series
How to Use Abstraction to Kill Your API - Jonathan Marler - Software You Can Love Vancouver 2023: A talk given by Jonathan Marler that covers a similar topic to this series, but for X11.

Creating an XDG Toplevel window

Last time we left off after binding some global objects to client-side ids. This next section will show how to create a window.

Creating a window is not complicated, but it does take several steps:

1. Create a wl_surface using wl_compositor::create_surface
2. Assign that wl_surface to the xdg_surface role using xdg_wm_base::get_xdg_surface
3. Assign that xdg_surface to the xdg_toplevel role using xdg_surface::get_toplevel
4. Commit the changes to the wl_surface
5. Wait for an xdg_surface::configure event to arrive before trying to attach a buffer to it

The protocol description of wl_compositor:create_surface is straightforward:

wl_compositor::create_surface(id: new_id<wl_surface>)

All we have to do is bind a wl_surface to a client-side id.

// Create a surface using wl_compositor::create_surface
const surface_id = next_id;
next_id += 1;
// https://wayland.app/protocols/wayland#wl_compositor:request:create_surface
const WL_COMPOSITOR_REQUEST_CREATE_SURFACE = 0;
try writeRequest(socket, compositor_id, WL_COMPOSITOR_REQUEST_CREATE_SURFACE, &[_]u32{
    // id: new_id<wl_surface>
    surface_id,
});
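To make the wire format concrete, here is a minimal sketch of what a request like wl_compositor::create_surface becomes as bytes. It assumes the 8-byte header from Part 1 (object id, opcode, total size) on a little-endian machine. The `encodeRequest` helper and the object ids 4 and 5 are hypothetical, chosen only for illustration; this is not the `writeRequest` used in the article.

```zig
const std = @import("std");

// Assumed header layout on a little-endian machine: object id, opcode, total message size.
const Header = extern struct {
    object_id: u32,
    opcode: u16,
    size: u16,
};

/// Hypothetical helper: serializes one request into `out` and returns the bytes written.
fn encodeRequest(out: []u8, object_id: u32, opcode: u16, args: []const u32) ![]const u8 {
    const args_bytes = std.mem.sliceAsBytes(args);
    const total = @sizeOf(Header) + args_bytes.len;
    if (total > out.len or total > std.math.maxInt(u16)) return error.MessageTooLarge;

    const header = Header{
        .object_id = object_id,
        .opcode = opcode,
        // The size field counts the header as well as the arguments.
        .size = @intCast(total),
    };
    @memcpy(out[0..@sizeOf(Header)], std.mem.asBytes(&header));
    @memcpy(out[@sizeOf(Header)..total], args_bytes);
    return out[0..total];
}

pub fn main() !void {
    var buffer: [64]u8 = undefined;
    // Illustrative ids: wl_compositor bound to 4, the new wl_surface gets 5.
    const bytes = try encodeRequest(&buffer, 4, 0, &[_]u32{5});
    std.debug.print("create_surface is {d} bytes on the wire\n", .{bytes.len});
}
```

A request with a single new_id argument is therefore 8 bytes of header plus 4 bytes of body.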

Steps 2, 3, and 4 are similarly simple:

xdg_wm_base::get_xdg_surface(id: new_id<xdg_surface>, surface: object<wl_surface>)
xdg_surface::get_toplevel(id: new_id<xdg_toplevel>)
wl_surface::commit()

// Create an xdg_surface
const xdg_surface_id = next_id;
next_id += 1;
// https://wayland.app/protocols/xdg-shell#xdg_wm_base:request:get_xdg_surface
const XDG_WM_BASE_REQUEST_GET_XDG_SURFACE = 2;
try writeRequest(socket, xdg_wm_base_id, XDG_WM_BASE_REQUEST_GET_XDG_SURFACE, &[_]u32{
    // id: new_id<xdg_surface>
    xdg_surface_id,
    // surface: object<wl_surface>
    surface_id,
});

// Get the xdg_surface as an xdg_toplevel object
const xdg_toplevel_id = next_id;
next_id += 1;
// https://wayland.app/protocols/xdg-shell#xdg_surface:request:get_toplevel
const XDG_SURFACE_REQUEST_GET_TOPLEVEL = 1;
try writeRequest(socket, xdg_surface_id, XDG_SURFACE_REQUEST_GET_TOPLEVEL, &[_]u32{
    // id: new_id<xdg_toplevel>
    xdg_toplevel_id,
});

// Commit the surface. This tells the compositor that the current batch of
// changes is ready, and they can now be applied.

// https://wayland.app/protocols/wayland#wl_surface:request:commit
const WL_SURFACE_REQUEST_COMMIT = 6;
try writeRequest(socket, surface_id, WL_SURFACE_REQUEST_COMMIT, &[_]u32{});

Step 5 takes a bit more code and a little more effort to understand. Let's first go over why we needed to call wl_surface::commit.

The xdg_surface documentation says the following:

A role must be assigned before any other requests are made to the xdg_surface object.

The client must call wl_surface.commit on the corresponding wl_surface for the xdg_surface state to take effect.

What this means is we must first send a request that assigns a role (like xdg_surface::get_toplevel), and then put that role into effect by committing the surface (using wl_surface::commit).

Even then we aren't allowed to attach a buffer until we respond to a configure event:

After creating a role-specific object and setting it up, the client must perform an initial commit without any buffer attached. The compositor will reply with initial wl_surface state such as wl_surface.preferred_buffer_scale followed by an xdg_surface.configure event. The client must acknowledge it and is then allowed to attach a buffer to map the surface.

To wait for the configure event we create another while loop:

// Wait for the surface to be configured before moving on
while (true) {
    const event = try Event.read(socket, &message_buffer);

    // TODO: match events by object_id and opcode
}

We can then check if the event is a configure event meant for our xdg_surface object:

    if (event.header.object_id == xdg_surface_id) {
        switch (event.header.opcode) {
            // https://wayland.app/protocols/xdg-shell#xdg_surface:event:configure
            0 => {
                // TODO
            },
        }
    }

An xdg_surface::configure event must be responded to with xdg_surface::ack_configure:

# We must respond to this event:
xdg_surface::configure(serial: uint)

# With this request:
xdg_surface::ack_configure(serial: uint)

# Followed by another commit:
wl_surface::commit()

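// https://wayland.app/protocols/xdg-shell#xdg_surface:request:ack_configure
const XDG_SURFACE_REQUEST_ACK_CONFIGURE = 4;

// A `configure` event marks the end of a batch of state changes from the compositor (for
// example, a new suggested size sent through xdg_toplevel::configure). We must answer it with
// `ack_configure` before our next commit, so the compositor knows which configure that commit
// responds to. (The liveness check is a separate mechanism: xdg_wm_base::ping.)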
const serial: u32 = @bitCast(event.body[0..4].*);

try writeRequest(socket, xdg_surface_id, XDG_SURFACE_REQUEST_ACK_CONFIGURE, &[_]u32{
    // We respond with the number it sent us, so it knows which configure we are responding to.
    serial,
});

try writeRequest(socket, surface_id, WL_SURFACE_REQUEST_COMMIT, &[_]u32{});

// The surface has been configured! We can move on
break;

All together, it looks like this:

// https://wayland.app/protocols/xdg-shell#xdg_surface:request:ack_configure
const XDG_SURFACE_REQUEST_ACK_CONFIGURE = 4;

while (true) {
    const event = try Event.read(socket, &message_buffer);

    if (event.header.object_id == xdg_surface_id) {
        switch (event.header.opcode) {
            // https://wayland.app/protocols/xdg-shell#xdg_surface:event:configure
            0 => {
                // A `configure` event marks the end of a batch of state changes from the compositor (for
                // example, a new suggested size sent through xdg_toplevel::configure). We must answer it with
                // `ack_configure` before our next commit, so the compositor knows which configure that commit
                // responds to. (The liveness check is a separate mechanism: xdg_wm_base::ping.)
                const serial: u32 = @bitCast(event.body[0..4].*);

                try writeRequest(socket, xdg_surface_id, XDG_SURFACE_REQUEST_ACK_CONFIGURE, &[_]u32{
                    // We respond with the number it sent us, so it knows which configure we are responding to.
                    serial,
                });

                try writeRequest(socket, surface_id, WL_SURFACE_REQUEST_COMMIT, &[_]u32{});

                // The surface has been configured! We can move on
                break;
            },
            else => return error.InvalidOpcode,
        }
    }
}
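The matching logic in the loop can also be pulled out into a small pure function, which makes it easy to test without a compositor. The following is only a sketch: `configureSerial` is a hypothetical helper, and it assumes the event body is a byte slice and the header has the same three fields used above.

```zig
const std = @import("std");

// Assumed header layout: object id, opcode, total message size.
const Header = extern struct {
    object_id: u32,
    opcode: u16,
    size: u16,
};

// https://wayland.app/protocols/xdg-shell#xdg_surface:event:configure
const XDG_SURFACE_EVENT_CONFIGURE = 0;

/// Hypothetical helper: returns the serial if this event is an
/// xdg_surface::configure addressed to `xdg_surface_id`, otherwise null.
fn configureSerial(header: Header, body: []const u8, xdg_surface_id: u32) ?u32 {
    if (header.object_id != xdg_surface_id) return null;
    if (header.opcode != XDG_SURFACE_EVENT_CONFIGURE) return null;
    // The body must hold at least the 4-byte serial.
    if (body.len < 4) return null;
    const serial: u32 = @bitCast(body[0..4].*);
    return serial;
}

pub fn main() void {
    const wire_serial: u32 = 1234;
    const header = Header{ .object_id = 7, .opcode = 0, .size = 12 };
    if (configureSerial(header, std.mem.asBytes(&wire_serial), 7)) |serial| {
        std.debug.print("would ack_configure with serial {d}\n", .{serial});
    }
}
```

Returning an optional keeps the caller's loop simple: a null result means "keep reading events".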

Now, one thing I like to do (though it isn't necessary) is to add an else branch that prints out the events that we are not handling:

    if (event.header.object_id == xdg_surface_id) {
        // -- snip --
    } else {
        std.log.warn("unknown event {{ .object_id = {}, .opcode = {x}, .message = \"{}\" }}", .{ event.header.object_id, event.header.opcode, std.zig.fmtEscapes(std.mem.sliceAsBytes(event.body)) });
    }

This makes it easier to debug if something goes wrong.

Create a Framebuffer

Like creating a window, this section requires several steps. However, these steps are not as straightforward as the steps for creating a window. We won't need to create another loop (besides some kind of main loop), but we do need to understand some Linux syscalls.

The steps to create a framebuffer are as follows:

1. Create a shared memory file using memfd_create
2. Allocate space in the shared memory file using ftruncate
3. Create a shared memory pool using wl_shm::create_pool
4. Allocate a wl_buffer from the shared memory pool using wl_shm_pool::create_buffer

Steps 1 and 2: Allocate a memory backed file

Steps 1 and 2 require interfacing with the Linux kernel. Luckily for us, the Zig standard library already wraps both syscalls: std.os.memfd_create and std.os.ftruncate.

Before we make use of those functions, let's do some math to figure out how much memory we should allocate. For this article, we are only going to support a 128x128 argb8888 framebuffer.

const Pixel = [4]u8;
const framebuffer_size = [2]usize{ 128, 128 };
const shared_memory_pool_len = framebuffer_size[0] * framebuffer_size[1] * @sizeOf(Pixel);
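Spelling out the arithmetic: each argb8888 pixel is 4 bytes, so one row of 128 pixels is 512 bytes, and 128 rows come to 65536 bytes (64 KiB). The sketch below checks those numbers at compile time; the `stride` constant is a name introduced here for illustration.

```zig
const std = @import("std");

const Pixel = [4]u8;
const framebuffer_size = [2]usize{ 128, 128 };

// Bytes in one row of pixels: 128 * 4 = 512.
const stride = framebuffer_size[0] * @sizeOf(Pixel);
// Bytes in the whole framebuffer: 512 * 128 = 65536.
const shared_memory_pool_len = stride * framebuffer_size[1];

comptime {
    // Fail the build if the arithmetic above is wrong.
    std.debug.assert(stride == 512);
    std.debug.assert(shared_memory_pool_len == 65536);
}

pub fn main() void {
    std.debug.print("stride = {d} bytes, pool = {d} bytes\n", .{ stride, shared_memory_pool_len });
}
```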

Now we can create and resize the file:

const shared_memory_pool_fd = try std.os.memfd_create("my-wayland-framebuffer", 0);
try std.os.ftruncate(shared_memory_pool_fd, shared_memory_pool_len);

Step 3: Creating the memory pool

Step 3 is much more complex. We are sending a message to the Wayland compositor (which is easy), but this time we must attach the shared memory pool file descriptor to a control message. So while the protocol definition looks simple:

wl_shm::create_pool(id: new_id<wl_shm_pool>, fd: fd, size: int)

It will require an entirely separate code path to send. I'm going to split it out into a separate function so we can clearly see what it requires:

/// https://wayland.app/protocols/wayland#wl_shm:request:create_pool
const WL_SHM_REQUEST_CREATE_POOL = 0;

/// This request is more complicated than most other requests, because it has to send the file descriptor to the
/// compositor using a control message.
///
/// Returns the id of the newly created wl_shm_pool
pub fn writeWlShmRequestCreatePool(socket: std.net.Stream, wl_shm_id: u32, next_id: *u32, fd: std.os.fd_t, fd_len: i32) !u32 {
    _ = socket;
    _ = wl_shm_id;
    _ = next_id;
    _ = fd;
    _ = fd_len;
    return error.Unimplemented;
}

First we'll get the current value of next_id:

    const wl_shm_pool_id = next_id.*;

But we'll leave incrementing it until we know the message has been sent:

    // Wait to increment until we know the message has been sent
    next_id.* += 1;
    return wl_shm_pool_id;

Next, we'll create the body of the message:

    const wl_shm_pool_id = next_id.*;

    const message = [_]u32{
        // id: new_id<wl_shm_pool>
        wl_shm_pool_id,
        // size: int
        @intCast(fd_len),
    };

If you're paying close attention, you'll notice that our message only has two parameters in it, despite the documentation calling for three. This is because fd is sent in the control message, and so is not included in the regular message body.

Creating the message header is the same as in a regular request:

    // Create the message header as usual
    const message_bytes = std.mem.sliceAsBytes(&message);
    const header = Header{
        .object_id = wl_shm_id,
        .opcode = WL_SHM_REQUEST_CREATE_POOL,
        .size = @sizeOf(Header) + @as(u16, @intCast(message_bytes.len)),
    };
    const header_bytes = std.mem.asBytes(&header);

Instead of writing the bytes directly to the socket, we create a vectored I/O (iovec) array with both the header and the body:

    // we'll be using `std.os.sendmsg` to send a control message, so we may as well use the vectorized
    // IO to send the header and the message body while we're at it.
    const msg_iov = [_]std.os.iovec_const{
        .{
            .iov_base = header_bytes.ptr,
            .iov_len = header_bytes.len,
        },
        .{
            .iov_base = message_bytes.ptr,
            .iov_len = message_bytes.len,
        },
    };

Before we continue, we must make another detour to define the cmsg function. In C, CMSG is a set of macros for creating control messages. In Zig, I have a comptime function generate an extern struct with the correct layout, and a default value for the length field.

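fn cmsg(comptime T: type) type {
    // Size of the cmsghdr fields (len, level, type) that come before the data.
    const header_size = @sizeOf(c_ulong) + 2 * @sizeOf(c_int);
    // Size of the data once rounded up to a multiple of @sizeOf(c_long), as CMSG_ALIGN does.
    const padded_size = (@sizeOf(T) + @sizeOf(c_long) - 1) & ~(@as(usize, @sizeOf(c_long)) - 1);
    const padding_size = padded_size - @sizeOf(T);
    return extern struct {
        // Equivalent of CMSG_LEN(@sizeOf(T)): header plus data, without the trailing padding.
        // Counting the padding here would make the kernel read extra file descriptors from it.
        len: c_ulong = header_size + @sizeOf(T),
        level: c_int,
        type: c_int,
        data: T,
        _padding: [padding_size]u8 align(1) = [_]u8{0} ** padding_size,
    };
}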
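For reference, the C macros this replaces are CMSG_ALIGN, CMSG_LEN, and CMSG_SPACE. The sketch below mirrors them as plain functions so the expected sizes can be checked; `CmsgHeader`, `cmsgAlign`, `cmsgLen`, and `cmsgSpace` are hypothetical names, and the layout is assumed to follow the Linux `struct cmsghdr`. On a 64-bit Linux target, a single 4-byte file descriptor should give a length of 20 bytes and a total space of 24 bytes.

```zig
const std = @import("std");

// Mirrors the Linux `struct cmsghdr`: a length followed by the level and type.
const CmsgHeader = extern struct {
    len: usize,
    level: c_int,
    type: c_int,
};

/// Rounds `len` up to a multiple of the machine word size, like CMSG_ALIGN.
fn cmsgAlign(len: usize) usize {
    const word = @sizeOf(usize);
    return (len + word - 1) & ~(@as(usize, word) - 1);
}

/// Value for the `len` field: header plus data, without trailing padding (CMSG_LEN).
fn cmsgLen(data_len: usize) usize {
    return cmsgAlign(@sizeOf(CmsgHeader)) + data_len;
}

/// Bytes the whole control message occupies, including padding (CMSG_SPACE).
fn cmsgSpace(data_len: usize) usize {
    return cmsgAlign(@sizeOf(CmsgHeader)) + cmsgAlign(data_len);
}

pub fn main() void {
    const fd_size = @sizeOf(c_int);
    std.debug.print("len = {d}, space = {d}\n", .{ cmsgLen(fd_size), cmsgSpace(fd_size) });
}
```

The length is what the kernel uses to decide how many file descriptors follow the header, while the space is what goes into the controllen field of the msghdr.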

With the cmsg function in hand, we can return to writing the writeWlShmRequestCreatePool function.

    // Send the file descriptor through a control message

    // This is the control message! It is not a fixed size struct. Instead it varies depending on the message you want to send.
    // C uses macros to define it, here we make a comptime function instead.
    const control_message = cmsg(std.os.fd_t){
        .level = std.os.SOL.SOCKET,
        .type = 0x01, // value of SCM_RIGHTS
        .data = fd,
    };
    const control_message_bytes = std.mem.asBytes(&control_message);

SCM_RIGHTS is a Unix domain socket control message that will duplicate an open file descriptor (or a list of file descriptors) over to the receiving process.

We now have all the pieces we need to assemble a std.os.msghdr_const struct:

    const socket_message = std.os.msghdr_const{
        .name = null,
        .namelen = 0,
        .iov = &msg_iov,
        .iovlen = msg_iov.len,
        .control = control_message_bytes.ptr,
        // This is the size of the control message in bytes
        .controllen = control_message_bytes.len,
        .flags = 0,
    };

Then we send the message and check that all of the bytes were sent:

    const bytes_sent = try std.os.sendmsg(socket.handle, &socket_message, 0);
    if (bytes_sent < header_bytes.len + message_bytes.len) {
        return error.ConnectionClosed;
    }

The full functions look like this:

/// https://wayland.app/protocols/wayland#wl_shm:request:create_pool
const WL_SHM_REQUEST_CREATE_POOL = 0;

/// This request is more complicated than most other requests, because it has to send the file descriptor to the
/// compositor using a control message.
///
/// Returns the id of the newly created wl_shm_pool
pub fn writeWlShmRequestCreatePool(socket: std.net.Stream, wl_shm_id: u32, next_id: *u32, fd: std.os.fd_t, fd_len: i32) !u32 {
    const wl_shm_pool_id = next_id.*;

    const message = [_]u32{
        // id: new_id<wl_shm_pool>
        wl_shm_pool_id,
        // size: int
        @intCast(fd_len),
    };
    // If you're paying close attention, you'll notice that our message only has two parameters in it, despite the
    // documentation calling for 3: wl_shm_pool_id, fd, and size. This is because `fd` is sent in the control message,
    // and so not included in the regular message body.

    // Create the message header as usual
    const message_bytes = std.mem.sliceAsBytes(&message);
    const header = Header{
        .object_id = wl_shm_id,
        .opcode = WL_SHM_REQUEST_CREATE_POOL,
        .size = @sizeOf(Header) + @as(u16, @intCast(message_bytes.len)),
    };
    const header_bytes = std.mem.asBytes(&header);

    // we'll be using `std.os.sendmsg` to send a control message, so we may as well use the vectorized
    // IO to send the header and the message body while we're at it.
    const msg_iov = [_]std.os.iovec_const{
        .{
            .iov_base = header_bytes.ptr,
            .iov_len = header_bytes.len,
        },
        .{
            .iov_base = message_bytes.ptr,
            .iov_len = message_bytes.len,
        },
    };

    // Send the file descriptor through a control message

    // This is the control message! It is not a fixed size struct. Instead it varies depending on the message you want to send.
    // C uses macros to define it, here we make a comptime function instead.
    const control_message = cmsg(std.os.fd_t){
        .level = std.os.SOL.SOCKET,
        .type = 0x01, // value of SCM_RIGHTS
        .data = fd,
    };
    const control_message_bytes = std.mem.asBytes(&control_message);

    const socket_message = std.os.msghdr_const{
        .name = null,
        .namelen = 0,
        .iov = &msg_iov,
        .iovlen = msg_iov.len,
        .control = control_message_bytes.ptr,
        // This is the size of the control message in bytes
        .controllen = control_message_bytes.len,
        .flags = 0,
    };

    const bytes_sent = try std.os.sendmsg(socket.handle, &socket_message, 0);
    if (bytes_sent < header_bytes.len + message_bytes.len) {
        return error.ConnectionClosed;
    }

    // Wait to increment until we know the message has been sent
    next_id.* += 1;
    return wl_shm_pool_id;
}

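fn cmsg(comptime T: type) type {
    // Size of the cmsghdr fields (len, level, type) that come before the data.
    const header_size = @sizeOf(c_ulong) + 2 * @sizeOf(c_int);
    // Size of the data once rounded up to a multiple of @sizeOf(c_long), as CMSG_ALIGN does.
    const padded_size = (@sizeOf(T) + @sizeOf(c_long) - 1) & ~(@as(usize, @sizeOf(c_long)) - 1);
    const padding_size = padded_size - @sizeOf(T);
    return extern struct {
        // Equivalent of CMSG_LEN(@sizeOf(T)): header plus data, without the trailing padding.
        // Counting the padding here would make the kernel read extra file descriptors from it.
        len: c_ulong = header_size + @sizeOf(T),
        level: c_int,
        type: c_int,
        data: T,
        _padding: [padding_size]u8 align(1) = [_]u8{0} ** padding_size,
    };
}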

Now we can return to the main function and create the memory pool:

    // Create a wl_shm_pool (wayland shared memory pool). This will be used to create framebuffers,
    // though in this article we only plan on creating one.
    const wl_shm_pool_id = try writeWlShmRequestCreatePool(
        socket,
        shm_id,
        &next_id,
        shared_memory_pool_fd,
        @intCast(shared_memory_pool_len),
    );

Step 4: Allocate a framebuffer

Step 4 is much simpler. We need to send the wl_shm_pool::create_buffer request to specify the size and format of our framebuffer.

wl_shm_pool::create_buffer(id: new_id<wl_buffer>, offset: int, width: int, height: int, stride: int, format: uint<wl_shm.format>)

It has a lot of parameters, but they don't require any funky control messages to send:

    // Now we allocate a framebuffer from the shared memory pool
    const wl_buffer_id = next_id;
    next_id += 1;

    // https://wayland.app/protocols/wayland#wl_shm_pool:request:create_buffer
    const WL_SHM_POOL_REQUEST_CREATE_BUFFER = 0;
    // https://wayland.app/protocols/wayland#wl_shm:enum:format
    const WL_SHM_POOL_ENUM_FORMAT_ARGB8888 = 0;
    try writeRequest(socket, wl_shm_pool_id, WL_SHM_POOL_REQUEST_CREATE_BUFFER, &[_]u32{
        // id: new_id<wl_buffer>,
        wl_buffer_id,
        // Byte offset of the framebuffer in the pool. In this case we allocate it at the very start of the file.
        0,
        // Width of the framebuffer.
        framebuffer_size[0],
        // Height of the framebuffer.
        framebuffer_size[1],
        // Stride of the framebuffer, or rather, how many bytes are in a single row of pixels.
        framebuffer_size[0] * @sizeOf(Pixel),
        // The format of the framebuffer. In this case we choose argb8888.
        WL_SHM_POOL_ENUM_FORMAT_ARGB8888,
    });

And now we have a wl_buffer that we can render into.
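The offset, height, and stride together determine which bytes of the pool the buffer covers, so it can be worth checking that the buffer fits before sending the request. Here is a minimal sketch of such a check; `bufferFitsInPool` is a hypothetical helper, and how a compositor reacts to an out-of-bounds buffer is left to the protocol documentation.

```zig
const std = @import("std");

/// Hypothetical helper: true if a buffer starting at `offset` with `height` rows of
/// `stride` bytes each lies entirely inside a pool of `pool_len` bytes.
fn bufferFitsInPool(pool_len: usize, offset: usize, height: usize, stride: usize) bool {
    // Treat arithmetic overflow as "does not fit" rather than wrapping around.
    const buffer_len = std.math.mul(usize, stride, height) catch return false;
    const end = std.math.add(usize, offset, buffer_len) catch return false;
    return end <= pool_len;
}

pub fn main() void {
    const pool_len: usize = 128 * 128 * 4;
    // The article's single 128x128 buffer at offset 0 uses the whole pool.
    std.debug.print("fits at offset 0: {}\n", .{bufferFitsInPool(pool_len, 0, 128, 128 * 4)});
    // A second buffer of the same size would need a pool twice as large.
    std.debug.print("fits at offset {d}: {}\n", .{ pool_len, bufferFitsInPool(pool_len, pool_len, 128, 128 * 4) });
}
```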

Conclusion

We've created a Window and a Framebuffer. In the next article we'll combine the two to render something to the display.
