This is a continuation of my previous post, where I explored how Zig returns structs via the stack.
We saw that for a small struct, the caller allocates some space in its own stack frame and passes a pointer to it, which the function returning the struct then fills in.
What if the returned struct is very big? Something like the following:
const std = @import("std");

const X = struct { x: u32, y: u64, r: [32000]u32 };

fn Xmaker() X {
    return X{
        .x = 455,
        .y = 497,
        .r = [_]u32{0} ** 32000,
    };
}

pub fn main() void {
    var q = Xmaker();
    std.debug.print("{}", .{q});
}
We are now storing an array of 32,000 unsigned 32-bit integers instead of the 8 integers used previously. A single struct takes 16 + (32,000 * 4) = 128,016 bytes (roughly 128 KB), where the 16 bytes cover x, y and alignment padding!
What does Zig do in this case? Let us take a look at the generated assembly.
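The size arithmetic above can be checked from Zig itself. The following is a minimal sketch, assuming an x86-64 target and Zig's default struct layout (which the language does not guarantee), so it prints whatever `@sizeOf` reports instead of hard-coding a number.

```zig
const std = @import("std");

const X = struct { x: u32, y: u64, r: [32000]u32 };

pub fn main() void {
    // @sizeOf includes any alignment padding the compiler adds.
    const size: usize = @sizeOf(X);
    std.debug.print("sizeof(X)     = {d} bytes (0x{x})\n", .{ size, size });
    // Two copies: one slot for the returned value, one for the local `q`.
    std.debug.print("2 * sizeof(X) = {d} bytes (0x{x})\n", .{ 2 * size, 2 * size });
}
```

In the build disassembled in this post, these two values show up as the memcpy length 0x1f410 and the probe size 0x3e820.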
000000000024c280 <main>:
24c280: 55 push rbp
24c281: 48 89 e5 mov rbp,rsp
24c284: b8 20 e8 03 00 mov eax,0x3e820
24c289: e8 92 a0 00 00 call 256320 <__zig_probe_stack>
24c28e: 48 29 c4 sub rsp,rax
24c291: 48 8d bd f0 0b fe ff lea rdi,[rbp-0x1f410]
24c298: e8 c3 72 00 00 call 253560 <Xmaker>
24c29d: 48 8d bd e0 17 fc ff lea rdi,[rbp-0x3e820]
24c2a4: 48 8d b5 f0 0b fe ff lea rsi,[rbp-0x1f410]
24c2ab: ba 10 f4 01 00 mov edx,0x1f410
24c2b0: e8 1b 9e 00 00 call 2560d0 <memcpy>
Notice that the fourth instruction of main is a call to __zig_probe_stack. We did not call this function ourselves, so the Zig compiler must have injected the call into our code. What does __zig_probe_stack do?
0000000000256320 <__zig_probe_stack>:
256320: 51 push rcx
256321: 48 89 c1 mov rcx,rax
256324: 48 81 f9 00 10 00 00 cmp rcx,0x1000
25632b: 72 1c jb 256349 <__zig_probe_stack+0x29>
25632d: 48 81 ec 00 10 00 00 sub rsp,0x1000
256334: 83 4c 24 10 00 or DWORD PTR [rsp+0x10],0x0
256339: 48 81 e9 00 10 00 00 sub rcx,0x1000
256340: 48 81 f9 00 10 00 00 cmp rcx,0x1000
256347: 77 e4 ja 25632d <__zig_probe_stack+0xd>
256349: 48 29 cc sub rsp,rcx
25634c: 83 4c 24 10 00 or DWORD PTR [rsp+0x10],0x0
256351: 48 01 c4 add rsp,rax
256354: 59 pop rcx
256355: c3 ret
We call __zig_probe_stack from main with a value of 0x3e820 (decimal 256032 = 2 * sizeof(X)). This argument is passed to __zig_probe_stack via the rax register.
__zig_probe_stack decrements rsp by this value. The jb and ja branches act as an if/else: the subtraction happens in a single step if rax is below 4096 (0x1000), or in 4096-byte steps if rax is larger.
After each subtraction, __zig_probe_stack accesses the value 16 bytes above rsp and ORs it with 0x0. Before returning, it adds rax back to rsp and restores rcx, so the actual allocation is the sub rsp,rax that follows the call in main.
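The control flow of the probe routine can be modelled in a few lines of Zig. This is only a sketch of the disassembly above, not the real implementation: `countProbes` is a hypothetical helper that counts how many `or` probes would be issued for a given size, without touching the stack at all.

```zig
const std = @import("std");

const page_size: usize = 0x1000; // 4096, the step used by the probe loop

// Hypothetical helper (not part of Zig): counts the `or` probes the
// disassembled routine would perform for a request of `size` bytes.
fn countProbes(size: usize) usize {
    var remaining = size; // plays the role of rcx
    var probes: usize = 0;
    if (remaining >= page_size) { // `jb` skips the loop when rcx < 0x1000
        while (true) {
            probes += 1; // sub rsp,0x1000 ; or [rsp+0x10],0x0
            remaining -= page_size; // sub rcx,0x1000
            if (remaining <= page_size) break; // `ja` loops only while rcx > 0x1000
        }
    }
    return probes + 1; // final sub rsp,rcx ; or [rsp+0x10],0x0
}

pub fn main() void {
    std.debug.print("probes for 0x3e820 bytes: {d}\n", .{countProbes(0x3e820)});
}
```

For the 0x3e820 bytes requested by main, this works out to 62 steps of 4096 bytes plus one final probe for the remaining 0x820 bytes.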
This seems weird. The or looks useless (OR-ing with 0 changes nothing) and touches a seemingly arbitrary location. Why?
The source of __zig_probe_stack doesn't shed a lot of light, except for noting that any access below rsp will cause a segfault on Linux with kernel versions below 5.1.
Some further Stack Overflow-ing later, I found a plausible explanation: sub rsp is a way to extend the stack of a process lazily (until it hits the limits set on the process by the kernel). Think of it like malloc, but for the stack.
Once rsp has been decremented, we access a location just above it in order to trigger the stack expansion if necessary.
What Zig is doing here is making sure that there is enough space on the stack to allocate 2 instances of our large struct. If there isn't, then this lazy allocation of stack will segfault, causing our program to crash early. A pretty elegant solution, I must say!
Once the process has been able to extend the stack, execution proceeds: Xmaker stores the large struct on the stack, and memcpy then makes a copy of it, as in the previous example.
Latest comments (4)
Will Zig systematically panic in case of a memory allocation failure?
Or is there a way to handle the error?
I don't think this can be handled. Unlike malloc, which internally uses a system call such as sbrk, there is no system call here whose return value you can check. It is up to the OS to decide whether a process can have a larger stack, and the process can crash if the OS decides that it cannot have more stack storage.
Thank you for the article!
If you want, consider putting part 1 and part 2 into a series (edit the first article, then click the button with a gear icon to create a new series). Also, since you mentioned that the first article was originally posted in your blog, you can also add a setting for that (in the same menu opened by the button with a gear icon).
Hi Loris,
Thanks for the tip! I updated my bio, linked to the articles, and made them into a series.
I am learning more Zig and plan to write more such intro articles. Maybe this will get more beginners interested in exploring Zig and operating systems / assembly :)