In game development being able to transparently store an array of structs in a transposed memory layout is a topic so hot that some even want it to be a first class feature offered by their programming language in order to have good ergonomics.
In Zig, if there's one thing we can boast about, is that we can often get great ergonomics with simple userland code, and it turns out this case is no different :^)
What's SoA/AoS?
When you have a collection of items, you normally would like for each to be represented by a struct instance. The problem is that sometimes certain operations would be faster if instead you had a collection of arrays each corresponding to a different struct field.
Game developers are very familiar with this concept and if you're not a game developer, you might have seen it when it comes to databases. The most natural way of thinking of data in a SQL database is by row, but some operations (e.g. analytics) are much more performant on databases that store data column-major.
Performance in games is often a critical aspect, and so developers want to exploit the fact that access to contiguous memory tends to be more performant, but at the same time they would still like to be able to reason about their objects using "row" semantics, which is what SoA/AoS (Struct of Arrays / Array of Structs) is all about.
Meet MultiArrayList
In the Zig standard library you can find MultiArrayList
, a struct that uses comptime generics to implement SoA. This is what the docstring has to say about it:
A MultiArrayList stores a list of a struct type. Instead of storing a single list of items, MultiArrayList stores separate lists for each field of the struct. This allows for memory savings if the struct has padding, and also improves cache usage if only some fields are needed for a computation.
This is how you can use it:
const std = @import("std");
const Monster = struct {
element: enum { fire, water, earth, wind },
hp: u32,
};
const MonsterList = std.MultiArrayList(Monster);
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
var soa = MonsterList{};
defer soa.deinit(&gpa.allocator);
// Normally you would want to append many monsters
try soa.append(&gpa.allocator, .{
.element = .fire,
.hp = 20,
});
// Count the number of fire monsters
var total_fire: usize = 0;
for (soa.items(.element)) |t| {
if (t == .fire) total_fire += 1;
}
// Heal all monsters
for (soa.items(.hp)) |*hp| {
hp.* = 100;
}
}
That's it! Not much different than using a normal ArrayList
, huh?
The ergonomic trick is that you can use enum literal syntax to refer to a specific field when iterating on a "column". If you want to see how that's managed in the Zig code, I recommend reading the source code directly.
Future
EDIT: New for loops are out!
While the entire implementation of MultiArrayList
doesn't depend on ad-hoc features in the language, one future addition will make it even nicer to use.
Proposal #7257 was accepted and, once implemented, will make the following syntax possible:
// Heal water monsters
for (soa.items(.element), soa.items(.hp)) |elem, *hp| {
if (elem == .water) hp.* = 100;
}
Watch the talk
Andrew gave a talk at Handmade Seattle to show how the Zig self-hosted compiler uses data-oriented techniques to improve compilation performance.
Oldest comments (6)
Awesome stuff!
I wanted to make an ECS using SOA and was trying to implement something similar myself. Glad it was already done by someone that knows what they are doing. Thanks for sharing this
I've read the std library implementation and I think MultiArrayList is AOS not SOA. The reason is because in SOA the struct holds lists to each individual component, meanwhile the library stores each component together. Quick example, in a soa I can create an entity with velocity and another with position and they would be separated in different arrays. In multiarraylist I can only create an entity with both velocity And position. I may be wrong and have misread the documentation or have misinterpreted SOA and AOS though
The interface is meant to give you syntax sugar to make it feel like an AoS list, but internally it’s stored in SoA form. You might think that it’s not SoA because you see a single array of bytes in the MultiArrayList implementation, but that’s the chunk of memory that holds all the per-field arrays.
Re-read the documentation and I was wrong on how I thought it worked, I blame the fact that I read it on a phone lol. Anyways thanks for the clarification
How does one properly sort this? Say you sort by a field and now your index is out of order. I came up with a way of capturing the indexes before sorting and then binary search the values after sorting, passing the indexes to a list and swap the values of the rest of the fields based on the new index orders, except for the field that was sorted. I bet that there has to be some other way. The .sort method is confusing me. It wants ctx: anytype ( I assume ctx means context ) but idk what type I am supposed to pass in there.