Zig NEWS

LeRoyce Pearson

Wayland From the Wire: Part 1

To write a graphical application for Wayland, you need to connect to a Wayland server, make a window, and render something to it.

While we could use a library like libwayland to manage this connection for us, doing it with nothing but Linux syscalls gives a deeper understanding of the protocol.

I have written a (WIP) pure Zig Wayland library as part of the learning process for this article, but the article focuses on building a similar library yourself rather than on using mine. You can find it here.

You can find the complete code for this post in the same repository, under examples/00_client_connect.zig. It was written using Zig 0.11.0.

This post will focus on software rendering. Creating an OpenGL or Vulkan context is left as an exercise for the reader and other articles.

If you enjoy this post, you may want to check out this talk by Jonathan Marler where he does a similar thing for X11.

By the end of this series, you should have a window that looks like this:

Image description

Overview

  1. Open a connection to the Wayland display server
  2. Get a list of global objects and bind the ones we are using (wl_shm, wl_compositor, xdg_wm_base)
  3. Create an xdg_toplevel surface object, starting with a core surface
  4. Create a framebuffer in shared memory
  5. Render to it

You can find the other articles in the series here:

  1. Wayland From the Wire: Part 1 -- We connect to a Wayland compositor and get a list of global objects. We pick out the global objects that we need to create a window with a framebuffer.
  2. Wayland From the Wire: Part 2 -- We create a window and a framebuffer

Connect to Display Server

The first thing we need to do is establish a connection to the display server. Wayland display servers are accessible via Unix domain sockets, at a path specified via environment variables or at a predefined location. Let's start with a function to get the display socket path inside main.zig:

pub fn getDisplayPath(gpa: std.mem.Allocator) ![]u8 {
    const xdg_runtime_dir_path = try std.process.getEnvVarOwned(gpa, "XDG_RUNTIME_DIR");
    defer gpa.free(xdg_runtime_dir_path);
    const display_name = try std.process.getEnvVarOwned(gpa, "WAYLAND_DISPLAY");
    defer gpa.free(display_name);

    return try std.fs.path.join(gpa, &.{ xdg_runtime_dir_path, display_name });
}
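
The function above requires both environment variables to be set. As a rough sketch of the "predefined location" case, the helper below works on values that have already been read from the environment. `resolveDisplayPath` is a hypothetical name, not part of the article's code. It assumes the libwayland conventions that an unset WAYLAND_DISPLAY means "wayland-0" and that an absolute WAYLAND_DISPLAY is used as-is.

```zig
const std = @import("std");

/// Hypothetical helper: builds the socket path from already-read environment values.
/// `display_name` is null when WAYLAND_DISPLAY is unset.
fn resolveDisplayPath(buf: []u8, runtime_dir: []const u8, display_name: ?[]const u8) ![]const u8 {
    // Assumption: fall back to "wayland-0", as libwayland does when the variable is unset.
    const name = display_name orelse "wayland-0";

    // An absolute WAYLAND_DISPLAY is used directly and XDG_RUNTIME_DIR is ignored.
    if (std.fs.path.isAbsolute(name)) {
        return std.fmt.bufPrint(buf, "{s}", .{name});
    }
    return std.fmt.bufPrint(buf, "{s}/{s}", .{ runtime_dir, name });
}

pub fn main() !void {
    var buf: [256]u8 = undefined;
    const path = try resolveDisplayPath(&buf, "/run/user/1000", null);
    std.debug.print("{s}\n", .{path});
}
```

The article's `getDisplayPath` stays simpler by treating a missing variable as an error, which is a reasonable choice for a first experiment.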

Now we can use the path to open a connection to the server.

const std = @import("std");

pub fn main() !void {
    var general_allocator = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = general_allocator.deinit();
    const gpa = general_allocator.allocator();

    const display_path = try getDisplayPath(gpa);
    defer gpa.free(display_path);

    std.log.info("wayland display = {}", .{std.zig.fmtEscapes(display_path)});

    const socket = try std.net.connectUnixSocket(display_path);
    defer socket.close();
}

Running this program will find the display path, log it, open a socket and then exit.

Creating a wl_registry and Listening for Global Objects

Now that the socket is open, we are going to construct and send two packets over it. The first packet will get the wl_registry and bind it to an id. The second packet tells the server to send a reply to the client.

Wayland messages require knowing the schema beforehand. You can see a description of the various protocols here.

The first message will be a Request on a wl_display object. Wayland specifies that every connection will automatically get a wl_display object assigned to the id 1.

// in `main`
const display_id = 1;
var next_id: u32 = 2;

// reserve an object id for the registry
const registry_id = next_id;
next_id += 1;

try socket.writeAll(std.mem.sliceAsBytes(&[_]u32{
    // ID of the object; in this case the default wl_display object at 1
    1,

    // The size (in bytes) of the message and the opcode, which is object specific.
    // In this case we are using opcode 1, which corresponds to `wl_display::get_registry`.
    //
    // The size includes the size of the header.
    (0x000C << 16) | (0x0001),

    // Finally, we pass in the only argument that this opcode takes: an id for the `wl_registry`
    // we are creating.
    registry_id,
}));
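
To make the second header word less magical, here is a small sketch of packing and unpacking it. The names `packSizeOpcode`, `unpackSizeOpcode`, and `SizeOpcode` are made up for illustration. The only thing taken from the packet above is the layout: size in the upper 16 bits, opcode in the lower 16 bits.

```zig
const std = @import("std");

/// Illustrative pair type for the two halves of the second header word.
const SizeOpcode = struct { size: u16, opcode: u16 };

/// Packs the message size (upper 16 bits) and opcode (lower 16 bits) into one word.
fn packSizeOpcode(size: u16, opcode: u16) u32 {
    return (@as(u32, size) << 16) | opcode;
}

/// Splits the second header word back into its two halves.
fn unpackSizeOpcode(word: u32) SizeOpcode {
    return SizeOpcode{
        .size = @as(u16, @intCast(word >> 16)),
        .opcode = @as(u16, @truncate(word)),
    };
}

pub fn main() void {
    // wl_display::get_registry: 8 header bytes + 4 argument bytes, opcode 1
    const word = packSizeOpcode(12, 1);
    const parts = unpackSizeOpcode(word);
    std.debug.print("word = 0x{x:0>8}, size = {}, opcode = {}\n", .{ word, parts.size, parts.opcode });
}
```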

Now we create the second packet, a wl_display sync request. This will let us loop until the server has finished sending us global object events.

// create a sync callback so we know when we are caught up with the server
const registry_done_callback_id = next_id;
next_id += 1;

try socket.writeAll(std.mem.sliceAsBytes(&[_]u32{
    display_id,

    // The size (in bytes) of the message and the opcode.
    // In this case we are using opcode 0, which corresponds to `wl_display::sync`.
    //
    // The size includes the size of the header.
    (0x000C << 16) | (0x0000),

    // Finally, we pass in the only argument that this opcode takes: an id for the `wl_callback`
    // that the server will signal once it has processed everything we sent before it.
    registry_done_callback_id,
}));
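
The opcodes 0 and 1 used in the two packets come from the order in which requests are declared for each interface in the protocol XML. One way to avoid bare numbers is to give them names. The enum names below are my own invention, and only the handful of opcodes relevant to this article are listed.

```zig
const std = @import("std");

/// Requests on `wl_display`, numbered in declaration order.
const WlDisplayRequest = enum(u16) {
    sync = 0,
    get_registry = 1,
};

/// Requests on `wl_registry`.
const WlRegistryRequest = enum(u16) {
    bind = 0,
};

/// Events sent by `wl_registry`. Events are numbered separately from requests.
const WlRegistryEvent = enum(u16) {
    global = 0,
    global_remove = 1,
};

pub fn main() void {
    std.debug.print("wl_display::sync = {}, wl_display::get_registry = {}\n", .{
        @intFromEnum(WlDisplayRequest.sync),
        @intFromEnum(WlDisplayRequest.get_registry),
    });
}
```

Requests and events each start counting from 0, so the same number can mean different things depending on the direction of the message.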

We have to allocate ids as we go, because the Wayland protocol only allows a new id to be one higher than the highest id previously used.
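
The `next_id` bookkeeping can be wrapped in a tiny helper if you prefer. This is only a sketch: `IdAllocator` is a hypothetical type, and it never reuses ids that the server has released.

```zig
const std = @import("std");

/// Hypothetical helper that hands out client-side object ids in increasing order.
const IdAllocator = struct {
    /// Id 1 is reserved for `wl_display`, so allocation starts at 2.
    next_id: u32 = 2,

    fn allocate(self: *IdAllocator) u32 {
        const id = self.next_id;
        self.next_id += 1;
        return id;
    }
};

pub fn main() void {
    var ids = IdAllocator{};
    const registry_id = ids.allocate();
    const registry_done_callback_id = ids.allocate();
    std.debug.print("registry = {}, callback = {}\n", .{ registry_id, registry_done_callback_id });
}
```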

The next step is listening for messages from the server. We'll start by reading the header, which is two 32-bit words containing the object id, message size, and opcode (the same layout as the request header we sent to the server earlier). This time we'll create an extern struct to read the bytes into.

/// A wayland packet header
const Header = extern struct {
    object_id: u32 align(1),
    opcode: u16 align(1),
    size: u16 align(1),

    pub fn read(socket: std.net.Stream) !Header {
        var header: Header = undefined;
        const header_bytes_read = try socket.readAll(std.mem.asBytes(&header));
        if (header_bytes_read < @sizeOf(Header)) {
            return error.UnexpectedEOF;
        }
        return header;
    }
};

And while we're at it, we might as well make some code to abstract reading Events, as we'll need it later.

/// This is the general shape of a Wayland `Event` (a message from the compositor to the client).
const Event = struct {
    header: Header,
    body: []const u8,

    pub fn read(socket: std.net.Stream, body_buffer: *std.ArrayList(u8)) !Event {
        const header = try Header.read(socket);

        // read bytes until we match the size in the header, not including the bytes in the header.
        try body_buffer.resize(header.size - @sizeOf(Header));
        const message_bytes_read = try socket.readAll(body_buffer.items);
        if (message_bytes_read < body_buffer.items.len) {
            return error.UnexpectedEOF;
        }

        return Event{
            .header = header,
            .body = body_buffer.items,
        };
    }
};
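
`Event.read` trusts the size in the header. If the reported size were ever smaller than the header itself, the subtraction would overflow. A defensive variant could validate the size first. The sketch below is one possible check, not something from the original code. `bodyLength` and its error names are hypothetical, and it assumes messages are always a multiple of 4 bytes, since Wayland arguments are 32-bit aligned.

```zig
const std = @import("std");

/// Size of the two-word Wayland message header in bytes.
const header_size: u16 = 8;

/// Returns the number of body bytes for a message whose header reports `size`.
fn bodyLength(size: u16) !u16 {
    // The size field includes the header, so anything smaller is malformed.
    if (size < header_size) return error.MessageTooSmall;
    // Assumption: every message is a whole number of 32-bit words.
    if (size % 4 != 0) return error.MessageNotAligned;
    return size - header_size;
}

pub fn main() !void {
    const len = try bodyLength(12);
    std.debug.print("a 12 byte message has a {} byte body\n", .{len});
}
```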

With the functions we defined above, the general shape of our loop looks like this:

// create an ArrayList that we will read messages into for the rest of the program
var message_buffer = std.ArrayList(u8).init(gpa);
defer message_buffer.deinit();
while (true) {
    const event = try Event.read(socket, &message_buffer);

    // TODO: check what events we received
    _ = event;
}

First, let's check if we've received the sync callback. We'll exit the loop as soon as we see it:

while (true) {
    const event = try Event.read(socket, &message_buffer);

    // Check if the object_id is the sync callback we made earlier
    if (event.header.object_id == registry_done_callback_id) {
        // No need to parse the message body, there is only one possible opcode
        break;
    }
}

Next, let's abstract writing to the socket a bit, so we don't have to manually construct the header each time:

/// Handles creating a header and writing the request to the socket.
pub fn writeRequest(socket: std.net.Stream, object_id: u32, opcode: u16, message: []const u32) !void {
    const message_bytes = std.mem.sliceAsBytes(message);
    const header = Header{
        .object_id = object_id,
        .opcode = opcode,
        .size = @sizeOf(Header) + @as(u16, @intCast(message_bytes.len)),
    };

    try socket.writeAll(std.mem.asBytes(&header));
    try socket.writeAll(message_bytes);
}
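
To convince yourself that a helper like this produces the same bytes as the hand-built packets, it can help to encode into a buffer and compare. The function below is a hypothetical buffer-based cousin of `writeRequest`, written only for this check. It builds the header from plain u32 words rather than from the `Header` struct.

```zig
const std = @import("std");

/// Hypothetical variant of `writeRequest` that fills `out` instead of writing to a socket.
/// Returns the bytes that would be sent. `out` must hold at least `2 + args.len` words.
fn encodeRequest(out: []u32, object_id: u32, opcode: u16, args: []const u32) []const u8 {
    const word_count = 2 + args.len;
    const size_in_bytes: u32 = @intCast(word_count * @sizeOf(u32));
    // The size field is only 16 bits wide.
    std.debug.assert(size_in_bytes <= std.math.maxInt(u16));

    out[0] = object_id;
    out[1] = (size_in_bytes << 16) | opcode;
    @memcpy(out[2..word_count], args);
    return std.mem.sliceAsBytes(out[0..word_count]);
}

pub fn main() void {
    var buf: [8]u32 = undefined;
    // Same request as the first packet: wl_display (id 1), get_registry (opcode 1), new id 2
    const bytes = encodeRequest(&buf, 1, 1, &[_]u32{2});
    std.debug.print("encoded {} bytes\n", .{bytes.len});
}
```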

Now we check for the registry global event, and parse out the parameters:

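    // https://wayland.app/protocols/wayland#wl_registry:event:global
    const WL_REGISTRY_EVENT_GLOBAL = 0;
    // https://wayland.app/protocols/wayland#wl_registry:request:bind
    // Used later when binding globals; `bind` is the only request on `wl_registry`.
    const WL_REGISTRY_REQUEST_BIND = 0;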

    if (event.header.object_id == registry_id and event.header.opcode == WL_REGISTRY_EVENT_GLOBAL) {
        // Parse out the fields of the global event
        const name: u32 = @bitCast(event.body[0..4].*);

        const interface_str_len: u32 = @bitCast(event.body[4..8].*);
        // The interface_str is `interface_str_len - 1` bytes long because `interface_str_len` includes the null terminator
        const interface_str: [:0]const u8 = event.body[8..][0 .. interface_str_len - 1 :0];

        const interface_str_len_u32_align = std.mem.alignForward(u32, interface_str_len, @alignOf(u32));
        const version: u32 = @bitCast(event.body[8 + interface_str_len_u32_align ..][0..4].*);

        // TODO: match the interfaces
    }
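
The parsing logic is easier to experiment with when it is pulled into a function that takes only the body bytes. Below is a sketch along those lines. `Global` and `parseGlobal` are names I made up, the sample event is hand-built rather than captured from a compositor, and the function assumes the body is well formed.

```zig
const std = @import("std");

/// Illustrative container for the three arguments of a `wl_registry::global` event.
const Global = struct {
    name: u32,
    interface: []const u8,
    version: u32,
};

/// Parses the body of a `wl_registry::global` event. Assumes `body` is well formed.
fn parseGlobal(body: []const u8) Global {
    const name: u32 = @bitCast(body[0..4].*);

    // The string length on the wire includes the null terminator.
    const len_with_nul: u32 = @bitCast(body[4..8].*);
    const interface = body[8..][0 .. len_with_nul - 1];

    // The string contents are padded so the next argument starts on a 4 byte boundary.
    const padded_len = std.mem.alignForward(u32, len_with_nul, 4);
    const version: u32 = @bitCast(body[8 + padded_len ..][0..4].*);

    return Global{ .name = name, .interface = interface, .version = version };
}

pub fn main() void {
    // Hand-built event body: name = 1, interface = "wl_shm", version = 1
    const sample = [_]u32{
        1,
        "wl_shm".len + 1,
        @bitCast(@as([4]u8, "wl_s".*)),
        @bitCast(@as([4]u8, "hm\x00\x00".*)),
        1,
    };
    const global = parseGlobal(std.mem.sliceAsBytes(&sample));
    std.debug.print("{s} version {} (name {})\n", .{ global.interface, global.version, global.name });
}
```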

We are looking for three global objects: wl_shm, wl_compositor, and xdg_wm_base. This is the minimum set of protocols we need to create a window with a framebuffer. These global objects also have a version field, which allows us to check if the compositor supports the protocol versions we are targeting. Let's define our targeted versions as constants:

/// The version of the wl_shm protocol we will be targeting.
const WL_SHM_VERSION = 1;
/// The version of the wl_compositor protocol we will be targeting.
const WL_COMPOSITOR_VERSION = 5;
/// The version of the xdg_wm_base protocol we will be targeting.
const XDG_WM_BASE_VERSION = 2;

In addition, let's create some variables outside of the loop so we can check if the global objects were found afterwards.

var shm_id_opt: ?u32 = null;
var compositor_id_opt: ?u32 = null;
var xdg_wm_base_id_opt: ?u32 = null;

To bind the wl_shm global object to a client id, we need to do the following:

  1. Check that interface_str is equal to "wl_shm"
  2. Make sure that the version is WL_SHM_VERSION or higher.
  3. Send wl_registry:bind request to the compositor

Now, the wl_registry:bind request is a bit tricky. Unlike the other requests with a new_id that we've seen, its new_id argument has no specific interface type in the protocol. This means we must tell the server which interface we are binding in the request. Instead of sending a single u32 for the id, we send three parameters in this order: (interface: string, version: u32, new_id: u32). That makes four parameters once we include the "numeric name" parameter.

        if (std.mem.eql(u8, interface_str, "wl_shm")) {
            if (version < WL_SHM_VERSION) {
                std.log.err("compositor supports only {s} version {}, client expected version >= {}", .{ interface_str, version, WL_SHM_VERSION });
                return error.WaylandInterfaceOutOfDate;
            }
            shm_id_opt = next_id;
            next_id += 1;

            try writeRequest(socket, registry_id, WL_REGISTRY_REQUEST_BIND, &[_]u32{
                // The numeric name of the global we want to bind.
                name,

                // `new_id` arguments have three parts when the sub-type is not specified by the protocol:
                //   1. A string specifying the textual name of the interface
                "wl_shm".len + 1, // length of "wl_shm" plus one for the required null byte
                @bitCast(@as([4]u8, "wl_s".*)),
                @bitCast(@as([4]u8, "hm\x00\x00".*)), // we have two 0x00 bytes to align the string with u32

                //   2. The version you are using, affects which functions you can access
                WL_SHM_VERSION,

                //   3. And the `new_id` part, where we tell it which client id we are giving it
                shm_id_opt.?,
            });
        }
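
Splitting the interface name into four-byte chunks by hand gets tedious. As a sketch of an alternative, the helper below writes a string argument into a u32 buffer at runtime. `encodeString` is a hypothetical name, and the caller is assumed to provide a buffer that is large enough.

```zig
const std = @import("std");

/// Hypothetical helper: writes `str` as a Wayland string argument into `out` and
/// returns the number of u32 words used (one length word plus the padded contents).
fn encodeString(out: []u32, str: []const u8) usize {
    const len_with_nul = str.len + 1;
    const content_words = (len_with_nul + 3) / 4;

    // The length word counts the bytes of the string including its null terminator.
    out[0] = @as(u32, @intCast(len_with_nul));

    const content = std.mem.sliceAsBytes(out[1 .. 1 + content_words]);
    // Zeroing first provides both the null terminator and the alignment padding.
    @memset(content, 0);
    @memcpy(content[0..str.len], str);

    return 1 + content_words;
}

pub fn main() void {
    var words: [8]u32 = undefined;
    const used = encodeString(&words, "wl_shm");
    std.debug.print("{} words used, length field = {}\n", .{ used, words[0] });
}
```

The hand-written chunks in the article have the advantage of being resolved entirely at compile time, so this is a trade of a little runtime work for less typing.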

Writing out the entire loop we get this:

    while (true) {
        const event = try Event.read(socket, &message_buffer);

        // Parse event messages based on which object it is for
        if (event.header.object_id == registry_done_callback_id) {
            // No need to parse the message body, there is only one possible opcode
            break;
        }

        if (event.header.object_id == registry_id and event.header.opcode == WL_REGISTRY_EVENT_GLOBAL) {
            // Parse out the fields of the global event
            const name: u32 = @bitCast(event.body[0..4].*);

            const interface_str_len: u32 = @bitCast(event.body[4..8].*);
            // The interface_str is `interface_str_len - 1` bytes long because `interface_str_len` includes the null terminator
            const interface_str: [:0]const u8 = event.body[8..][0 .. interface_str_len - 1 :0];

            const interface_str_len_u32_align = std.mem.alignForward(u32, interface_str_len, @alignOf(u32));
            const version: u32 = @bitCast(event.body[8 + interface_str_len_u32_align ..][0..4].*);

            // Check to see if the interface is one of the globals we are looking for
            if (std.mem.eql(u8, interface_str, "wl_shm")) {
                if (version < WL_SHM_VERSION) {
                    std.log.err("compositor supports only {s} version {}, client expected version >= {}", .{ interface_str, version, WL_SHM_VERSION });
                    return error.WaylandInterfaceOutOfDate;
                }
                shm_id_opt = next_id;
                next_id += 1;

                try writeRequest(socket, registry_id, WL_REGISTRY_REQUEST_BIND, &[_]u32{
                    // The numeric name of the global we want to bind.
                    name,

                    // `new_id` arguments have three parts when the sub-type is not specified by the protocol:
                    //   1. A string specifying the textual name of the interface
                    "wl_shm".len + 1, // length of "wl_shm" plus one for the required null byte
                    @bitCast(@as([4]u8, "wl_s".*)),
                    @bitCast(@as([4]u8, "hm\x00\x00".*)), // we have two 0x00 bytes to align the string with u32

                    //   2. The version you are using, affects which functions you can access
                    WL_SHM_VERSION,

                    //   3. And the `new_id` part, where we tell it which client id we are giving it
                    shm_id_opt.?,
                });
            } else if (std.mem.eql(u8, interface_str, "wl_compositor")) {
                if (version < WL_COMPOSITOR_VERSION) {
                    std.log.err("compositor supports only {s} version {}, client expected version >= {}", .{ interface_str, version, WL_COMPOSITOR_VERSION });
                    return error.WaylandInterfaceOutOfDate;
                }
                compositor_id_opt = next_id;
                next_id += 1;

                try writeRequest(socket, registry_id, WL_REGISTRY_REQUEST_BIND, &[_]u32{
                    name,
                    "wl_compositor".len + 1, // add one for the required null byte
                    @bitCast(@as([4]u8, "wl_c".*)),
                    @bitCast(@as([4]u8, "ompo".*)),
                    @bitCast(@as([4]u8, "sito".*)),
                    @bitCast(@as([4]u8, "r\x00\x00\x00".*)),
                    WL_COMPOSITOR_VERSION,
                    compositor_id_opt.?,
                });
            } else if (std.mem.eql(u8, interface_str, "xdg_wm_base")) {
                if (version < XDG_WM_BASE_VERSION) {
                    std.log.err("compositor supports only {s} version {}, client expected version >= {}", .{ interface_str, version, XDG_WM_BASE_VERSION });
                    return error.WaylandInterfaceOutOfDate;
                }
                xdg_wm_base_id_opt = next_id;
                next_id += 1;

                try writeRequest(socket, registry_id, WL_REGISTRY_REQUEST_BIND, &[_]u32{
                    name,
                    "xdg_wm_base".len + 1,
                    @bitCast(@as([4]u8, "xdg_".*)),
                    @bitCast(@as([4]u8, "wm_b".*)),
                    @bitCast(@as([4]u8, "ase\x00".*)),
                    XDG_WM_BASE_VERSION,
                    xdg_wm_base_id_opt.?,
                });
            }
            continue;
        }
    }
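
The three branches in the loop differ only in the interface name and the minimum version. If that repetition bothers you, one option is to describe the wanted globals in a table and look them up by name. This is just a sketch of the lookup half. `Wanted`, `wanted_globals`, and `findWanted` are hypothetical names, and the versions mirror the constants defined earlier.

```zig
const std = @import("std");

/// One global we want to bind, along with the minimum version we target.
const Wanted = struct {
    interface: []const u8,
    min_version: u32,
};

const wanted_globals = [_]Wanted{
    .{ .interface = "wl_shm", .min_version = 1 },
    .{ .interface = "wl_compositor", .min_version = 5 },
    .{ .interface = "xdg_wm_base", .min_version = 2 },
};

/// Returns the index into `wanted_globals` for `interface`, or null if we don't need it.
fn findWanted(interface: []const u8) ?usize {
    for (wanted_globals, 0..) |wanted, index| {
        if (std.mem.eql(u8, wanted.interface, interface)) return index;
    }
    return null;
}

pub fn main() void {
    const advertised = [_][]const u8{ "wl_seat", "wl_compositor" };
    for (advertised) |interface| {
        if (findWanted(interface)) |index| {
            std.debug.print("{s}: wanted, needs version >= {}\n", .{ interface, wanted_globals[index].min_version });
        } else {
            std.debug.print("{s}: ignored\n", .{interface});
        }
    }
}
```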

Let's ensure that we have all the necessary global objects.

const shm_id = shm_id_opt orelse return error.NecessaryWaylandExtensionMissing;
const compositor_id = compositor_id_opt orelse return error.NecessaryWaylandExtensionMissing;
const xdg_wm_base_id = xdg_wm_base_id_opt orelse return error.NecessaryWaylandExtensionMissing;

std.log.debug("wl_shm client id = {}; wl_compositor client id = {}; xdg_wm_base client id = {}", .{ shm_id, compositor_id, xdg_wm_base_id });

Now, assuming you've followed along, running the program with zig run main.zig should give output similar to the following:

$ zig run main.zig
debug: wl_shm client id = 4; wl_compositor client id = 5; xdg_wm_base client id = 6

In the next article, we'll create a window and a framebuffer.

Oldest comments (2)

ashwith2427

Where can I find the opcodes? I saw you used 0 and 1, but what about the others? Could you provide a link to them?

bentheklutz

Each interface has its own sets of event and request opcodes. The interfaces are defined in xml files. The opcodes are implicit in the definitions in the xml files based on the order in which they appear. The first request to appear in an interface gets opcode 0, then 1, and so on. The main protocol is here. If you have a system running a wayland compositor and a wayland dev package installed from your package manager you should have a similar file in something like /usr/share/wayland/. Some windowing stuff is in extension protocols like xdg-shell.