Newcomers sometimes get confused by undefined, wonder how one is supposed to use the keyword, and might even have misconceptions based on the fact that JavaScript has it too (but with different semantics).
Undefined initialization
When creating a variable in Zig, you are forced to decide how to initialize it. Sometimes you will have a good initial value to offer, but occasionally you won't have one.
One particularly relevant case is when you are creating an array that you plan to fill over time. In this case you will only care about a small subset of the array in the beginning, so what should be done about the rest of the bytes?
const std = @import("std");

const TEAM_SIZE = 6;

const Pokemon = struct {
    name: []const u8,
    pk_type: enum { fire, water, electricity },
    hp: u32,
};

const Player = struct {
    name: []const u8,
    slots: [TEAM_SIZE]Pokemon,
    active_slots: usize,

    // Stores pk in the next free slot and returns the initialized part of the team.
    fn give_pokemon(self: *Player, pk: Pokemon) []Pokemon {
        if (self.active_slots == self.slots.len)
            @panic("Too many Pokemon!");
        self.slots[self.active_slots] = pk;
        self.active_slots += 1;
        return self.slots[0..self.active_slots];
    }
};

pub fn main() void {
    var player: Player = .{
        .name = "Red",
        .slots = undefined,
        .active_slots = 0,
    };

    // Give a starter Pokemon to the player
    const player_team = player.give_pokemon(.{
        .name = "Pikachu",
        .pk_type = .electricity,
        .hp = 12,
    });

    // Use the returned slice, since Zig rejects unused locals.
    std.debug.print("{s} has {d} Pokemon\n", .{ player.name, player_team.len });
}
In the code example above you can see how initially we only want to give a single Pokemon to the player. In this case we don't really care about the initial state of the array and so we can set it to undefined:
var player: Player = .{
    .name = "Red",
    .slots = undefined, // <---
    .active_slots = 0,
};
What does undefined do?
If we distill the code above to its essence, this is what we did:
var slots: [6]Pokemon = undefined;
Setting something to undefined, from a semantic perspective, means that we're telling the compiler "I don't care about the state of that memory".
In Release modes, assigning undefined results in a no-op: the computer will not write anything to that memory and will leave it in its (potentially dirty) initial state. As an example, you can expect memory that an allocator recycles to be in a dirty state.
In Debug mode, assigning undefined will write 0xAA bytes to that memory (i.e., a 101010... bit pattern). This value has no correct semantic meaning for the variable, but it lets you spot programming errors more easily when using a debugger: if you see an important variable like a pointer set to 0xAAAA, then you can suspect that the variable was never properly initialized. Having a pattern to look for is much easier than spotting generic corrupted data. Additionally, if running the program in Valgrind, Zig will make a client request to mark the memory as undefined, helping Valgrind properly analyze the program. The Valgrind integration code is available in the standard library under std.valgrind.
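If you want to see the pattern for yourself, something like the following minimal sketch should do. The names scratch and bytes are made up for illustration, and reading undefined memory like this is only acceptable as an experiment: real code must never depend on what it finds there.

```zig
const std = @import("std");

pub fn main() void {
    // Deliberately left undefined so we can look at what ends up in memory.
    var scratch: [4]u8 = undefined;
    const bytes: []u8 = &scratch;

    // Inspection only: never base program logic on these values.
    for (bytes) |b| {
        std.debug.print("{x} ", .{b});
    }
    std.debug.print("\n", .{});
}
```

Based on the behavior described above, a Debug build should print aa for every byte, while a Release build prints whatever happened to be in that memory.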
Can you check if something is undefined?
This is a question that I imagine comes most often from people with JavaScript experience, since in JS you can check for equality with undefined (which is the value assigned to parameters that were not set when calling a function).
// JavaScript
function couple(a, b) {
    return [a, b];
}
>>> couple(1, 2)
[1, 2]
>>> couple("uh-oh")
["uh-oh", undefined]
>>> couple()
[undefined, undefined]
In JavaScript we don't say "wrong number of arguments", we say undefined and I think it's beautiful.
As we saw above, undefined in Zig has to be different because there is no runtime. In JS everything is a dynamically typed object that can be inspected by the runtime. In Zig, if you create a u32, it becomes 32 bits' worth of memory where every single bit combination represents a number, including the 101010... pattern. So not only does Zig have no runtime to begin with, but you can also see how u32 has no space to even represent an undefined state.
This might be annoying on one hand but, on the other, maximizing resource utility is part of the reason why systems programming languages like Zig can implement things that would be impossible in pure JS even with the latest and greatest hardware available today.
On the upside, Zig will try in Debug mode to sneak in extra bytes and checks that can perform limited introspection (e.g., in debug mode unions have a hidden tag to spot misuse), but the core point is that you need to recalibrate your expectations when it comes to languages without a runtime, and accept that you will sometimes need to rely on tooling (e.g., debuggers, Valgrind, {ub, a, t}san) to help you diagnose problems.
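One way to make the "no space" point concrete is to compare a u32 with an optional, which is the Zig type you would reach for when "no value yet" is a state you actually need to check at runtime. This is just a sketch: the report helper and the hp variable are hypothetical names, not something from the example above.

```zig
const std = @import("std");

// Hypothetical helper: an optional can be inspected, unlike undefined.
fn report(hp: ?u32) void {
    if (hp) |value| {
        std.debug.print("hp = {d}\n", .{value});
    } else {
        std.debug.print("hp is not set yet\n", .{});
    }
}

pub fn main() void {
    // Every bit pattern of a u32 is a valid number, so no pattern is
    // left over to mean "not set".
    std.debug.print("u32 takes {d} bytes\n", .{@sizeOf(u32)});
    // An optional has to pay for the extra "is there a value?" information.
    std.debug.print("?u32 takes {d} bytes\n", .{@sizeOf(?u32)});

    var hp: ?u32 = null;
    report(hp);
    hp = 12;
    report(hp);
}
```

The optional is bigger than the plain integer, which is exactly the cost that undefined lets you avoid when you know you will never need to ask the question.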
Why use undefined?
I just spent a lot of words describing the problems related to undefined, so why use it at all?
First reason: performance
This should be easy to understand: generally speaking, not doing anything is faster than doing something. This matters most with large amounts of memory, whether because of one big allocation or because of many reuses of a smaller one: skipping the useless step of zeroing memory that you're soon going to overwrite anyway can make a big difference in terms of performance.
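A common place where this shows up is scratch buffers. In the sketch below, describe is a hypothetical helper that formats text into caller-provided memory using std.fmt.bufPrint. Since the formatter only hands back the bytes it actually wrote, zeroing the buffer beforehand would be wasted work.

```zig
const std = @import("std");

// Hypothetical helper: formats a status line into caller-provided scratch space.
fn describe(scratch: []u8, name: []const u8, hp: u32) ![]u8 {
    return std.fmt.bufPrint(scratch, "{s} has {d} HP", .{ name, hp });
}

pub fn main() !void {
    // No zeroing needed: we only ever read the slice that bufPrint returns.
    var scratch: [64]u8 = undefined;
    const line = try describe(&scratch, "Pikachu", 12);
    std.debug.print("{s}\n", .{line});
}
```

The rest of the buffer stays undefined, and that's fine because nothing ever looks at it.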
Second reason: modeling
In the starting code example we decided to allocate upfront all 6 Pokemon slots that the player has, but this is far from the only possibility and we could have used an ArrayList that grows as needed, for example. In this second case we would not have had any need for undefined, as the list would start empty, allowing us to add slots on-demand. From a modeling perspective this seems great: we have made bad states much harder to represent than in the first design!
Unfortunately this is not a strict improvement. Yes, we have made bad states harder to represent, but now we have also introduced the possibility of allocation failure. In the original code we were able to exploit the knowledge that the player will only need 6 slots to place the entire array on stack memory.
In a small-scale example like this one, it really doesn't matter much in practice, but the more general point is that pre-allocating memory -- and thus having to be careful about it -- is an inherent part of the design of some programs. In that regard Zig does you a favor by having first-class support for the concept.
A note on zero initialization
In some circles, like Go and Odin, there's this very popular idea of making the zero value of an element be useful. This brings some very good properties, from lowering the number of invalid states to improving the API of a struct by making it ready to use without any .init step required.
It's a neat idea that I recommend keeping in mind, but in my opinion it's not always possible to make the zero value useful (on top of the fact that in Zig it's not the idiomatic choice). More importantly, Andrew has pointed out a few times that, when it comes to mistakes, it's much easier to recognize an operation that depends on an undefined value than one that depends on a value that was forcefully set to zero, or to any other semantically wrong value that instrumentation cannot catch.
Watch the talk
If you want to watch... oh, turns out nobody has ever given a talk on the use of undefined. Maybe you could be the one?
Apply to speak on Zig SHOWTIME
Bonus credit
You can also use undefined to explicitly mark memory as dirty, not just as a form of initialization. The benefit is that there is no overhead in Release, while in Debug you will be able to rely on tooling in case of programming errors.
In practice this is how it normally looks: a struct has a deinit function that frees any other resource (which is usually the main reason why a struct would have a deinit function in the first place) and right at the end sets the entire struct to undefined.
const Foo = struct {
    bar: Bar,
    baz: Baz,

    fn deinit(self: *Foo) void {
        // ...
        self.* = undefined;
    }
};
Now it will be easier to spot if some of the logic in your program depends erroneously on the state of an instance of Foo that you've deinited.
To be clear, this is not something unique to structs; in fact, any variable involved in a complex life cycle might benefit from being set to undefined.
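For a slightly more complete picture, here is a sketch of the same pattern with an actual resource to free. NameBuffer and its fields are hypothetical, and I'm using std.heap.page_allocator only to keep the example short.

```zig
const std = @import("std");

// Hypothetical struct that owns a heap allocation.
const NameBuffer = struct {
    allocator: std.mem.Allocator,
    data: []u8,

    fn init(allocator: std.mem.Allocator, len: usize) !NameBuffer {
        return NameBuffer{
            .allocator = allocator,
            .data = try allocator.alloc(u8, len),
        };
    }

    fn deinit(self: *NameBuffer) void {
        // Free the owned resource first...
        self.allocator.free(self.data);
        // ...then mark the whole struct as dirty.
        self.* = undefined;
    }
};

pub fn main() !void {
    var buf = try NameBuffer.init(std.heap.page_allocator, 16);
    for (buf.data) |*b| b.* = 0;
    buf.deinit();
    // From here on, any use of buf is a bug that Debug builds and
    // tooling have a better chance of surfacing.
}
```

Note that the free has to come before the assignment, since afterwards the slice stored in data is no longer meaningful.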