To write a graphical application for Wayland, you need to connect to a Wayland server, make a window, and render something to it.
While we could use a library like libwayland to manage this connection for us, doing it with nothing but Linux syscalls gives a deeper understanding of the protocol.
I have created a (WIP) pure Zig Wayland library as part of the learning process for this article, but the focus here is on building a similar library yourself, not on using mine. You can find it here.
You can find the complete code for this post in the same repository, under examples/00_client_connect.zig. It was written using Zig 0.11.0.
This post will focus on software rendering. Creating an OpenGL or Vulkan context is left as an exercise for the reader and other articles.
If you enjoy this post, you may want to check out this talk by Jonathan Marler where he does a similar thing for X11.
By the end of this series, you should have a window that looks like this:
Overview
- Open a connection to the Wayland display server
- Get a list of global objects, bind ones we are using (wl_shm, wl_compositor, xdg_wm_base)
- Create an xdg toplevel surface object, starting with a core surface
- Create a framebuffer in shared memory
- Render to it
You can find the other articles in the series here:
- Wayland From the Wire: Part 1 -- We connect to a Wayland compositor and get a list of global objects. We pick out the global objects that we need to create a window with a framebuffer.
- Wayland From the Wire: Part 2 -- We create a window and a framebuffer.
Connect to Display Server
The first thing we need to do is establish a connection to the display server. Wayland display servers are accessible via Unix domain sockets, at a path specified via an environment variable or a predefined location. Let's start with a function to get the display socket path inside main.zig:
pub fn getDisplayPath(gpa: std.mem.Allocator) ![]u8 {
    const xdg_runtime_dir_path = try std.process.getEnvVarOwned(gpa, "XDG_RUNTIME_DIR");
    defer gpa.free(xdg_runtime_dir_path);
    const display_name = try std.process.getEnvVarOwned(gpa, "WAYLAND_DISPLAY");
    defer gpa.free(display_name);
    return try std.fs.path.join(gpa, &.{ xdg_runtime_dir_path, display_name });
}
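The function above fails if WAYLAND_DISPLAY is unset. As far as I understand libwayland's behaviour, it falls back to the name wayland-0 in that case and uses WAYLAND_DISPLAY as-is when it is an absolute path. Here is a minimal sketch of that lookup, written against Zig 0.11.0. `resolveDisplayPath` and `error.MissingXdgRuntimeDir` are hypothetical names, and the environment values are passed in as parameters so the logic can be tried without touching the real environment.

```zig
const std = @import("std");

/// Hypothetical helper: resolves the socket path from the two environment
/// values. Pass `null` for a variable that is not set.
pub fn resolveDisplayPath(
    gpa: std.mem.Allocator,
    xdg_runtime_dir: ?[]const u8,
    wayland_display: ?[]const u8,
) ![]u8 {
    // Assumption: "wayland-0" is the conventional default socket name.
    const display_name = wayland_display orelse "wayland-0";

    // Assumption: an absolute WAYLAND_DISPLAY is used without XDG_RUNTIME_DIR.
    if (std.fs.path.isAbsolute(display_name)) return gpa.dupe(u8, display_name);

    const runtime_dir = xdg_runtime_dir orelse return error.MissingXdgRuntimeDir;
    return std.fs.path.join(gpa, &.{ runtime_dir, display_name });
}

pub fn main() !void {
    const gpa = std.heap.page_allocator;
    // Example values only; a real program would read the environment here.
    const path = try resolveDisplayPath(gpa, "/run/user/1000", null);
    defer gpa.free(path);
    std.debug.print("{s}\n", .{path});
}
```

The caller owns the returned slice and frees it, the same way `getDisplayPath` is used in `main`.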
Now we can use the path to open a connection to the server.
const std = @import("std");
pub fn main() !void {
    var general_allocator = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = general_allocator.deinit();
    const gpa = general_allocator.allocator();
    const display_path = try getDisplayPath(gpa);
    defer gpa.free(display_path);
    std.log.info("wayland display = {}", .{std.zig.fmtEscapes(display_path)});
    const socket = try std.net.connectUnixSocket(display_path);
    defer socket.close();
}
Running this program will find the display path, log it, open a socket and then exit.
Creating a wl_registry and Listening for Global Objects
Now that the socket is open, we are going to construct and send two packets over it. The first packet will get the wl_registry and bind it to an id. The second packet tells the server to send a reply to the client.
Wayland messages require knowing the schema beforehand. You can see a description of the various protocols here.
The first message will be a Request on a wl_display object. Wayland specifies that every connection will automatically get a wl_display object assigned to the id 1.
// in `main`
const display_id = 1;
var next_id: u32 = 2;
// reserve an object id for the registry
const registry_id = next_id;
next_id += 1;
try socket.writeAll(std.mem.sliceAsBytes(&[_]u32{
    // ID of the object; in this case the default wl_display object at 1
    display_id,
    // The size (in bytes) of the message and the opcode, which is object specific.
    // In this case we are using opcode 1, which corresponds to `wl_display::get_registry`.
    //
    // The size includes the size of the header.
    (0x000C << 16) | (0x0001),
    // Finally, we pass in the only argument that this opcode takes: an id for the `wl_registry`
    // we are creating.
    registry_id,
}));
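The second word of the header is easy to get wrong, so here is a small sketch that builds it from named parts instead of hex literals. `headerWord` and the two opcode constant names are hypothetical and not part of the article's code; the opcode values are the ones used in this post.

```zig
const std = @import("std");

// Request opcodes for wl_display, as used in this article.
const WL_DISPLAY_REQUEST_SYNC: u16 = 0;
const WL_DISPLAY_REQUEST_GET_REGISTRY: u16 = 1;

/// Hypothetical helper: packs the message size (upper 16 bits) and the
/// opcode (lower 16 bits) into the second header word.
fn headerWord(size_in_bytes: u16, opcode: u16) u32 {
    return (@as(u32, size_in_bytes) << 16) | opcode;
}

pub fn main() void {
    const display_id: u32 = 1;
    const registry_id: u32 = 2;
    // Three 32-bit words: object id, size + opcode, and the new registry id.
    const message = [_]u32{
        display_id,
        headerWord(3 * @sizeOf(u32), WL_DISPLAY_REQUEST_GET_REGISTRY),
        registry_id,
    };
    std.debug.print("second word = 0x{x:0>8}, message bytes = {}\n", .{
        message[1],
        std.mem.sliceAsBytes(&message).len,
    });
}
```

The size is computed from the number of words rather than typed in by hand, which keeps the header in sync when a request grows more arguments.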
Now we create the second packet, a wl_display sync request. This will let us loop until the server has finished sending us global object events.
// create a sync callback so we know when we are caught up with the server
const registry_done_callback_id = next_id;
next_id += 1;
try socket.writeAll(std.mem.sliceAsBytes(&[_]u32{
    display_id,
    // The size (in bytes) of the message and the opcode.
    // In this case we are using opcode 0, which corresponds to `wl_display::sync`.
    //
    // The size includes the size of the header.
    (0x000C << 16) | (0x0000),
    // Finally, we pass in the only argument that this opcode takes: an id for the `wl_callback`
    // that the server will signal once it has processed the requests before it.
    registry_done_callback_id,
}));
We have to allocate ids as we go, because the Wayland protocol only allows a new id to be one higher than the highest id previously used.
The next step is listening for messages from the server. We'll start by reading the header, which is two 32-bit words containing the object id, message size, and opcode (same as the Request header we sent to the server earlier). This time we'll create an extern struct to read the bytes into.
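The "reserve the next id, then bump the counter" pattern repeats for every object we create, so it could be wrapped in a tiny helper. This is only a sketch; `IdAllocator` is a hypothetical name and the article itself keeps using a plain `next_id` variable.

```zig
const std = @import("std");

/// Hypothetical helper that hands out client object ids in increasing order.
const IdAllocator = struct {
    // Id 1 is already taken by the wl_display object.
    next_id: u32 = 2,

    fn allocate(self: *IdAllocator) u32 {
        const id = self.next_id;
        self.next_id += 1;
        return id;
    }
};

pub fn main() void {
    var ids = IdAllocator{};
    const registry_id = ids.allocate();
    const registry_done_callback_id = ids.allocate();
    std.debug.print("registry = {}, callback = {}\n", .{ registry_id, registry_done_callback_id });
}
```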
/// A wayland packet header
const Header = extern struct {
    object_id: u32 align(1),
    opcode: u16 align(1),
    size: u16 align(1),
    pub fn read(socket: std.net.Stream) !Header {
        var header: Header = undefined;
        const header_bytes_read = try socket.readAll(std.mem.asBytes(&header));
        if (header_bytes_read < @sizeOf(Header)) {
            return error.UnexpectedEOF;
        }
        return header;
    }
};
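One thing worth noting about the struct above: placing `opcode` before `size` matches the wire format only on a little-endian host, where the low 16 bits of the second word come first in memory. A sketch of a decoder that does not depend on field order works on the two 32-bit words instead. `DecodedHeader` and `decodeHeader` are hypothetical names used for illustration.

```zig
const std = @import("std");

const DecodedHeader = struct {
    object_id: u32,
    opcode: u16,
    size: u16,
};

/// Hypothetical helper: splits the two header words with shifts, so the
/// result does not depend on how the host lays out 16-bit fields.
fn decodeHeader(words: [2]u32) DecodedHeader {
    return DecodedHeader{
        .object_id = words[0],
        // The opcode lives in the lower 16 bits of the second word.
        .opcode = @truncate(words[1]),
        // The size (in bytes, header included) lives in the upper 16 bits.
        .size = @truncate(words[1] >> 16),
    };
}

pub fn main() void {
    // The header of the `wl_display::sync` request sent earlier.
    const header = decodeHeader([2]u32{ 1, 0x000C0000 });
    std.debug.print("object = {}, opcode = {}, size = {}\n", .{ header.object_id, header.opcode, header.size });
}
```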
And while we're at it, we might as well write some code to abstract reading Events, as we'll need it later.
/// This is the general shape of a Wayland `Event` (a message from the compositor to the client).
const Event = struct {
    header: Header,
    body: []const u8,
    pub fn read(socket: std.net.Stream, body_buffer: *std.ArrayList(u8)) !Event {
        const header = try Header.read(socket);
        // read bytes until we match the size in the header, not including the bytes in the header.
        try body_buffer.resize(header.size - @sizeOf(Header));
        const message_bytes_read = try socket.readAll(body_buffer.items);
        if (message_bytes_read < body_buffer.items.len) {
            return error.UnexpectedEOF;
        }
        return Event{
            .header = header,
            .body = body_buffer.items,
        };
    }
};
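`Event.read` trusts the size in the header: a size smaller than the header itself would make the subtraction overflow. Below is a sketch of a check that could run before resizing the buffer. `bodyLength` and both error names are hypothetical, and the word-alignment check assumes every Wayland message is padded to a multiple of 4 bytes.

```zig
const std = @import("std");

// Size of the two-word message header, in bytes.
const header_size: u16 = 8;

/// Hypothetical helper: validates the size field from a header and returns
/// the number of body bytes that follow the header.
fn bodyLength(size: u16) !usize {
    // The size includes the header, so anything smaller is malformed.
    if (size < header_size) return error.MessageTooShort;
    // Assumption: messages are always padded to whole 32-bit words.
    if (size % 4 != 0) return error.MessageNotWordAligned;
    return size - header_size;
}

pub fn main() !void {
    // A 12-byte message, like the requests sent earlier, has a 4-byte body.
    std.debug.print("body bytes = {}\n", .{try bodyLength(12)});
}
```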
With the functions we defined above, the general shape of our loop looks like this:
// create an ArrayList that we will read messages into for the rest of the program
var message_buffer = std.ArrayList(u8).init(gpa);
defer message_buffer.deinit();
while (true) {
    const event = try Event.read(socket, &message_buffer);
    // TODO: check what events we received
}
First, let's check if we've received the sync callback. We'll exit the loop as soon as we see it:
while (true) {
    const event = try Event.read(socket, &message_buffer);
    // Check if the object_id is the sync callback we made earlier
    if (event.header.object_id == registry_done_callback_id) {
        // No need to parse the message body, there is only one possible opcode
        break;
    }
}
Next, let's abstract writing to the socket a bit, so we don't have to manually construct the header each time:
/// Handles creating a header and writing the request to the socket.
pub fn writeRequest(socket: std.net.Stream, object_id: u32, opcode: u16, message: []const u32) !void {
    const message_bytes = std.mem.sliceAsBytes(message);
    const header = Header{
        .object_id = object_id,
        .opcode = opcode,
        .size = @sizeOf(Header) + @as(u16, @intCast(message_bytes.len)),
    };
    try socket.writeAll(std.mem.asBytes(&header));
    try socket.writeAll(message_bytes);
}
Now we check for the registry global event, and parse out the parameters:
// https://wayland.app/protocols/wayland#wl_registry:event:global
const WL_REGISTRY_EVENT_GLOBAL = 0;
if (event.header.object_id == registry_id and event.header.opcode == WL_REGISTRY_EVENT_GLOBAL) {
    // Parse out the fields of the global event
    const name: u32 = @bitCast(event.body[0..4].*);
    const interface_str_len: u32 = @bitCast(event.body[4..8].*);
    // The interface_str is `interface_str_len - 1` because `interface_str_len` includes the null terminator
    const interface_str: [:0]const u8 = event.body[8..][0 .. interface_str_len - 1 :0];
    const interface_str_len_u32_align = std.mem.alignForward(u32, interface_str_len, @alignOf(u32));
    const version: u32 = @bitCast(event.body[8 + interface_str_len_u32_align ..][0..4].*);
    // TODO: match the interfaces
}
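To see the layout of a global event concretely, here is a sketch that builds the body of one by hand and parses it back. `Global` and `parseGlobal` are hypothetical names; the parsing follows the snippet above, with bounds checks added so a truncated body returns an error instead of tripping a safety check.

```zig
const std = @import("std");

const Global = struct {
    name: u32,
    interface: []const u8,
    version: u32,
};

/// Hypothetical helper: parses the body of a `wl_registry::global` event.
fn parseGlobal(body: []const u8) !Global {
    if (body.len < 8) return error.TruncatedEvent;
    const name: u32 = @bitCast(body[0..4].*);
    const interface_str_len: u32 = @bitCast(body[4..8].*);
    if (interface_str_len == 0) return error.MissingInterfaceString;
    // Strings are padded so the next argument starts on a 32-bit boundary.
    const padded_len = std.mem.alignForward(usize, interface_str_len, @alignOf(u32));
    if (body.len < 8 + padded_len + 4) return error.TruncatedEvent;
    const version: u32 = @bitCast(body[8 + padded_len ..][0..4].*);
    return Global{
        .name = name,
        // Drop the null terminator that is counted in the length.
        .interface = body[8..][0 .. interface_str_len - 1],
        .version = version,
    };
}

pub fn main() !void {
    // A hand-built event body: name, string length, padded string, version.
    const words = [_]u32{
        1,
        "wl_shm".len + 1,
        @bitCast(@as([4]u8, "wl_s".*)),
        @bitCast(@as([4]u8, "hm\x00\x00".*)),
        1,
    };
    const global = try parseGlobal(std.mem.sliceAsBytes(&words));
    std.debug.print("name={} interface={s} version={}\n", .{ global.name, global.interface, global.version });
}
```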
We are looking for three global objects: wl_shm, wl_compositor, and xdg_wm_base. This is the minimum set of protocols we need to create a window with a framebuffer. These global objects also have a version field, which allows us to check if the compositor supports the protocol versions we are targeting. Let's define our targeted versions as constants:
/// The version of the wl_shm protocol we will be targeting.
const WL_SHM_VERSION = 1;
/// The version of the wl_compositor protocol we will be targeting.
const WL_COMPOSITOR_VERSION = 5;
/// The version of the xdg_wm_base protocol we will be targeting.
const XDG_WM_BASE_VERSION = 2;
In addition, let's create some variables outside of the loop so we can check if the global objects were found afterwards.
var shm_id_opt: ?u32 = null;
var compositor_id_opt: ?u32 = null;
var xdg_wm_base_id_opt: ?u32 = null;
To bind the wl_shm global object to a client id, we need to do the following:
- Check that interface_str is equal to "wl_shm"
- Make sure that the version is WL_SHM_VERSION or higher
- Send a wl_registry:bind request to the compositor
Now, the wl_registry:bind request is a bit tricky. Unlike the other requests with a new_id that we've seen, it does not specify a specific type in the protocol! This means we must tell the server which interface we are binding in the request. Instead of sending a simple u32 for the id, we send 3 parameters, in this order on the wire: (interface: string, version: u32, new_id: u32). This makes 4 parameters when we include the "numeric name" parameter.
// `bind` is the first request defined on `wl_registry`, so its opcode is 0.
const WL_REGISTRY_REQUEST_BIND = 0;
if (std.mem.eql(u8, interface_str, "wl_shm")) {
    if (version < WL_SHM_VERSION) {
        std.log.err("compositor supports only {s} version {}, client expected version >= {}", .{ interface_str, version, WL_SHM_VERSION });
        return error.WaylandInterfaceOutOfDate;
    }
    shm_id_opt = next_id;
    next_id += 1;
    try writeRequest(socket, registry_id, WL_REGISTRY_REQUEST_BIND, &[_]u32{
        // The numeric name of the global we want to bind.
        name,
        // `new_id` arguments have three parts when the sub-type is not specified by the protocol:
        // 1. A string specifying the textual name of the interface
        "wl_shm".len + 1, // length of "wl_shm" plus one for the required null byte
        @bitCast(@as([4]u8, "wl_s".*)),
        @bitCast(@as([4]u8, "hm\x00\x00".*)), // we have two 0x00 bytes to align the string with u32
        // 2. The version you are using, affects which functions you can access
        WL_SHM_VERSION,
        // 3. And the `new_id` part, where we tell it which client id we are giving it
        shm_id_opt.?,
    });
}
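Splitting the interface name into 4-byte chunks by hand works, but it is easy to miscount the padding. As a sketch, a comptime helper could produce the length word and the padded string words for any interface name. `stringArg` is a hypothetical name and not part of the article's code.

```zig
const std = @import("std");

/// Hypothetical helper: encodes a string argument as a length word (which
/// counts the null terminator) followed by the bytes padded to 32 bits.
fn stringArg(comptime s: []const u8) [1 + (s.len + 4) / 4]u32 {
    // Bytes including the terminator, rounded up to whole 32-bit words.
    const word_count = (s.len + 4) / 4;
    // Zero-filled, so the terminator and padding are already in place.
    var bytes = [_]u8{0} ** (word_count * 4);
    for (s, 0..) |char, i| bytes[i] = char;

    var words: [1 + word_count]u32 = undefined;
    words[0] = @intCast(s.len + 1);
    for (0..word_count) |i| {
        words[1 + i] = @bitCast(bytes[i * 4 ..][0..4].*);
    }
    return words;
}

pub fn main() void {
    // Prints the length word followed by the four padded string words.
    for (stringArg("wl_compositor")) |word| {
        std.debug.print("0x{x:0>8}\n", .{word});
    }
}
```

The words returned for "wl_shm" are the same three values written out by hand in the bind request above: the length 7, then "wl_s", then "hm" with two zero bytes.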
Writing out the entire loop we get this:
while (true) {
    const event = try Event.read(socket, &message_buffer);
    // Parse event messages based on which object it is for
    if (event.header.object_id == registry_done_callback_id) {
        // No need to parse the message body, there is only one possible opcode
        break;
    }
    if (event.header.object_id == registry_id and event.header.opcode == WL_REGISTRY_EVENT_GLOBAL) {
        // Parse out the fields of the global event
        const name: u32 = @bitCast(event.body[0..4].*);
        const interface_str_len: u32 = @bitCast(event.body[4..8].*);
        // The interface_str is `interface_str_len - 1` because `interface_str_len` includes the null terminator
        const interface_str: [:0]const u8 = event.body[8..][0 .. interface_str_len - 1 :0];
        const interface_str_len_u32_align = std.mem.alignForward(u32, interface_str_len, @alignOf(u32));
        const version: u32 = @bitCast(event.body[8 + interface_str_len_u32_align ..][0..4].*);
        // Check to see if the interface is one of the globals we are looking for
        if (std.mem.eql(u8, interface_str, "wl_shm")) {
            if (version < WL_SHM_VERSION) {
                std.log.err("compositor supports only {s} version {}, client expected version >= {}", .{ interface_str, version, WL_SHM_VERSION });
                return error.WaylandInterfaceOutOfDate;
            }
            shm_id_opt = next_id;
            next_id += 1;
            try writeRequest(socket, registry_id, WL_REGISTRY_REQUEST_BIND, &[_]u32{
                // The numeric name of the global we want to bind.
                name,
                // `new_id` arguments have three parts when the sub-type is not specified by the protocol:
                // 1. A string specifying the textual name of the interface
                "wl_shm".len + 1, // length of "wl_shm" plus one for the required null byte
                @bitCast(@as([4]u8, "wl_s".*)),
                @bitCast(@as([4]u8, "hm\x00\x00".*)), // we have two 0x00 bytes to align the string with u32
                // 2. The version you are using, affects which functions you can access
                WL_SHM_VERSION,
                // 3. And the `new_id` part, where we tell it which client id we are giving it
                shm_id_opt.?,
            });
        } else if (std.mem.eql(u8, interface_str, "wl_compositor")) {
            if (version < WL_COMPOSITOR_VERSION) {
                std.log.err("compositor supports only {s} version {}, client expected version >= {}", .{ interface_str, version, WL_COMPOSITOR_VERSION });
                return error.WaylandInterfaceOutOfDate;
            }
            compositor_id_opt = next_id;
            next_id += 1;
            try writeRequest(socket, registry_id, WL_REGISTRY_REQUEST_BIND, &[_]u32{
                name,
                "wl_compositor".len + 1, // add one for the required null byte
                @bitCast(@as([4]u8, "wl_c".*)),
                @bitCast(@as([4]u8, "ompo".*)),
                @bitCast(@as([4]u8, "sito".*)),
                @bitCast(@as([4]u8, "r\x00\x00\x00".*)),
                WL_COMPOSITOR_VERSION,
                compositor_id_opt.?,
            });
        } else if (std.mem.eql(u8, interface_str, "xdg_wm_base")) {
            if (version < XDG_WM_BASE_VERSION) {
                std.log.err("compositor supports only {s} version {}, client expected version >= {}", .{ interface_str, version, XDG_WM_BASE_VERSION });
                return error.WaylandInterfaceOutOfDate;
            }
            xdg_wm_base_id_opt = next_id;
            next_id += 1;
            try writeRequest(socket, registry_id, WL_REGISTRY_REQUEST_BIND, &[_]u32{
                name,
                "xdg_wm_base".len + 1,
                @bitCast(@as([4]u8, "xdg_".*)),
                @bitCast(@as([4]u8, "wm_b".*)),
                @bitCast(@as([4]u8, "ase\x00".*)),
                XDG_WM_BASE_VERSION,
                xdg_wm_base_id_opt.?,
            });
        }
        continue;
    }
}
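The three branches in the loop differ only in the interface name and the minimum version. One possible way to cut the repetition, sketched here under the assumption that a small lookup table is acceptable, is to keep the wanted globals in an array and check each announced global against it. `WantedGlobal`, `wanted_globals`, and `checkGlobal` are hypothetical names; the versions are the constants defined earlier.

```zig
const std = @import("std");

const WantedGlobal = struct {
    interface: []const u8,
    min_version: u32,
};

// The same interfaces and target versions as the constants defined earlier.
const wanted_globals = [_]WantedGlobal{
    .{ .interface = "wl_shm", .min_version = 1 },
    .{ .interface = "wl_compositor", .min_version = 5 },
    .{ .interface = "xdg_wm_base", .min_version = 2 },
};

/// Hypothetical helper: returns the table index of a wanted global, `null`
/// for a global we do not care about, or an error if it is too old.
fn checkGlobal(interface: []const u8, version: u32) !?usize {
    for (wanted_globals, 0..) |wanted, index| {
        if (!std.mem.eql(u8, wanted.interface, interface)) continue;
        if (version < wanted.min_version) return error.WaylandInterfaceOutOfDate;
        return index;
    }
    return null;
}

pub fn main() !void {
    const index = try checkGlobal("xdg_wm_base", 4);
    std.debug.print("table index = {any}\n", .{index});
}
```

The returned index could then select where to store the newly allocated client id, leaving a single bind request in the loop body.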
Let's ensure that we have all the necessary global objects.
const shm_id = shm_id_opt orelse return error.NeccessaryWaylandExtensionMissing;
const compositor_id = compositor_id_opt orelse return error.NeccessaryWaylandExtensionMissing;
const xdg_wm_base_id = xdg_wm_base_id_opt orelse return error.NeccessaryWaylandExtensionMissing;
std.log.debug("wl_shm client id = {}; wl_compositor client id = {}; xdg_wm_base client id = {}", .{ shm_id, compositor_id, xdg_wm_base_id });
Now, assuming you've followed along, running the program with zig run main.zig should give output similar to the following:
$ zig run main.zig
debug: wl_shm client id = 4; wl_compositor client id = 5; xdg_wm_base client id = 6
In the next article, we'll create a window and a framebuffer.
Oldest comments (2)
Where can I find the opcodes? I saw you used 0 and 1; what about the others? Can you provide a link to where they are listed?
Each interface has its own sets of event and request opcodes. The interfaces are defined in xml files. The opcodes are implicit in the definitions in the xml files based on the order in which they appear. The first request to appear in an interface gets opcode 0, then 1, and so on. The main protocol is here. If you have a system running a wayland compositor and a wayland dev package installed from your package manager you should have a similar file in something like /usr/share/wayland/. Some windowing stuff is in extension protocols like xdg-shell.