Recently I decided to stop awaiting async's return (blocked on async, the irony) and resumed development of Bork, my Twitch chat client for livecoding.
https://github.com/kristoff-it/bork
That made me look at two-year-old code, and the biggest thing I experienced was annoyance at how I kept naming my allocators.
To make a long story short, here's my advice: in code that is not meant to be a fully reusable library (e.g. in application code), don't call your allocators `allocator` (or contractions like `alloc` or `ally`) and instead name them `gpa`, `arena`, `fba`, so that you can better convey their intended usage pattern.
Additionally, don't shy away from passing around two allocators at a time, if it makes sense:
pub fn doSomething(gpa: std.mem.Allocator, arena: std.mem.Allocator) Result {}
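As a rough illustration of what the two names buy you, here is a minimal sketch. `formatLine` and its parameters are hypothetical and not taken from Bork; the only assumption is that the caller frees `gpa` allocations individually and releases `arena` allocations in bulk.

```zig
const std = @import("std");

// Hypothetical helper, not code from Bork.
// `gpa` backs the returned string, which the caller must free.
// `arena` backs scratch data that the caller releases all at once.
pub fn formatLine(
    gpa: std.mem.Allocator,
    arena: std.mem.Allocator,
    nick: []const u8,
    text: []const u8,
) ![]u8 {
    // Scratch copy: never freed individually, it goes away with the arena.
    const trimmed = try arena.dupe(u8, std.mem.trim(u8, text, " "));
    // Long-lived result: allocated with `gpa`, owned by the caller.
    return std.fmt.allocPrint(gpa, "<{s}> {s}", .{ nick, trimmed });
}
```

With the generic name `allocator` on both parameters, the signature alone would not tell the reader which of the two allocations needs a matching `free`.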
I don't yet have a concrete example to show in Bork because I haven't fully designed how it should free memory. For now Bork keeps the full list of messages in memory until the application exits, but I eventually want to move to a model where the user configures a maximum amount of memory that can be used to store message history, and we automatically evict old messages once that memory fills up.
In that scenario I will have some kind of ring buffer allocator for storing data related to one specific message, and another one for data that should never be evicted, like emote images.
In conclusion
Application code doesn't normally need to be allocator-agnostic, so by giving concrete names to your allocator interfaces you can gain more clarity and unlock the concept of having multiple allocators at hand at once.
Go from this:
var gpa: std.heap.GeneralPurposeAllocator(.{}) = .{};
const alloc = gpa.allocator();
var arena = std.heap.ArenaAllocator.init(alloc);
const alloc1 = arena.allocator();
To this:
var gpa_impl: std.heap.GeneralPurposeAllocator(.{}) = .{};
const gpa = gpa_impl.allocator();
var arena_impl = std.heap.ArenaAllocator.init(gpa);
const arena = arena_impl.allocator();
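For completeness, here is a sketch of how that setup might look inside a full `main`, with cleanup added. It keeps the `GeneralPurposeAllocator(.{})` spelling used above, which newer Zig releases may rename or deprecate; the `history` and `scratch` allocations are made up purely to show each allocator in use.

```zig
const std = @import("std");

pub fn main() !void {
    var gpa_impl: std.heap.GeneralPurposeAllocator(.{}) = .{};
    // deinit reports leaked allocations; the result is ignored here.
    defer _ = gpa_impl.deinit();
    const gpa = gpa_impl.allocator();

    var arena_impl = std.heap.ArenaAllocator.init(gpa);
    // Frees everything allocated through `arena` in one go.
    defer arena_impl.deinit();
    const arena = arena_impl.allocator();

    // Hypothetical long-lived buffer: every gpa allocation gets its own free.
    const history = try gpa.alloc(u8, 64);
    defer gpa.free(history);

    // Hypothetical scratch data: no individual free, the arena owns it.
    const scratch = try arena.dupe(u8, "temporary");
    std.debug.assert(scratch.len == 9);
}
```

The `_impl` variables hold the allocator state and are the ones you call `deinit` on, while `gpa` and `arena` are just the `std.mem.Allocator` interfaces you pass around.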
And same with functions that accept allocators:
fn foo(gpa: std.mem.Allocator) !Result {}
fn bar(arena: std.mem.Allocator) !Result {}
fn baz(gpa: std.mem.Allocator, arena: std.mem.Allocator) !Result {}
Top comments (9)
I'd like to propose `_state` instead of `_impl` in order to better convey that it is storing data the allocator works with, not just a container for some functions.
Re: the last example, this seems as though it defeats the purpose of having generic allocators in the first place. Perhaps it would be better to explain the usage of each allocator in a doc comment instead of suggesting a specific implementation. This might just be a difference in philosophy though, I generally enjoy writing code as though it were library code.
I don't get the non-typed approach either. If you want an arena allocator, why wouldn't you expect a std.heap.ArenaAllocator? If you get allocators passed in, it's the caller who determines the strategy. Naming doesn't enforce the correct choreography. This, together with signaling of memory ownership (e.g. needing to pass a copy of keys for allocated keys in a hashmap), is still an area of question marks for me.
Because "arena" and "gpa" are usage patterns, and not unique implementations. Both std.heap.GeneralPurposeAllocator and malloc are "gpa"s, and likewise arenas can be implemented in different ways, like FixedBufferAllocator for example.
Is this just a current quirk of Zig and the way the Allocator interface is implemented?
It would be nice to have a vision laid out by Andrew on how he imagines these(*) things for 0.12 (the "stable" 0) / 1.0, if he has one. I'm not aware of such a vision being documented; if anyone has an article or video to share, I'm curious to get to know it.
(*) I see accepting allocators as a tightly scoped strategy pattern, which would mean it's the caller who determines the strategy.
If there are requirements to use a specific strategy (but via an object that the caller passes in, manages and owns), I would expect/wish/hope for a strong reflection of that fact in the fn signature, and a way for both author and user to make sure at compile time that a compatible implementation has been passed. I would also assume the names (in the example with multiple allocators) would communicate their use, instead of the author's preferred strategy, e.g. communicating whether an allocator is used for few, big allocations with a long lifetime or for recurring small allocations of identically typed/sized things with high frequency and churn. Obviously the interface can only take so much, and there's a need for documentation as well IMO...
You are definitely not the first one to come up with the idea that even at the usage site there are two types of allocations. I remember reading someone's C blog saying that we should use an additional allocator and name it `scratch`. I can't remember who.
I would suggest that instead of naming the allocators after their concrete types, you just use at most two types of allocators:
I like it :)
Passing two allocators to a function with different usage in mind looks quite a bit like Odin's context system.
It's also very similar to what Casey Muratori did in his Handmade Hero series with one arena for permanent data and one for temporary data that gets freed after a frame.
I'm currently using Odin (originally I wanted to use Zig but I found that it's still too early) and the context system does annoy me quite a bit, since opting out is theoretically possible but impractical for general usage because the standard library relies on it. It's also just a little more magic than I'm happy with in a low-level language. So the explicit style suggested here is pretty much the sweet spot for me.
My gut tells me that name-based typing is not great. I think when you specifically need an Arena or a GPA, what you really want to encode is who (caller or callee) is responsible for freeing (owns) the allocated data. Almost every time you use an Arena, you are implying that the caller owns the data.
Personally I think it's fine if you use naming to convey intended usage in your own application codebase.
I agree with you if you mean that this convention could get tricky if you have a team and some people coming and going. In this case though, you don't have to be super academic and dogmatic about everything. You want your little tool to be useful in a reasonable timeframe. It's fine to cut some corners and Loris has also given the caveat "in code that is not meant to be a fully reusable library", which you and I might want to expand a little but that's not a reason to dismiss the idea entirely.