Zig NEWS

Govind

Beware the copy!

TL;DR: Be aware of when Zig copies data and when it creates a reference!

Imagine a simple struct

const OtherStruct = struct {
    b: u8
};

And let's try to add a function to update this struct:

fn cannot_modify_struct(o: OtherStruct) void {
    o.b = @as(u8, 23);
}

Calling this function would immediately result in an error

no_pointer_alist.zig:39:6: error: cannot assign to constant
    o.b = @as(u8, 23);
    ~^~

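This is because Zig treats all function parameters as constants. The solution is to pass o as *OtherStruct and update it in the function using o.*.b = @as(u8, 23). (Zig also dereferences single-item pointers automatically on field access, so o.b = 23 works too.)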

fn modify_struct(o: *OtherStruct) void {
    o.*.b = @as(u8, 23);
}
// .... Usage
    modify_struct(&ostruct);
    std.debug.print("{}", .{ostruct});
//.... Output
> no_pointer_alist.OtherStruct{ .b = 23 }
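To make the pointer version easy to try, here is a minimal, self-contained sketch of it. The initial value `.b = 0` and the `main` wrapper are my assumptions for illustration; the article only shows the function and the call.

```zig
const std = @import("std");

const OtherStruct = struct {
    b: u8,
};

// Takes a pointer, so the caller's struct is updated in place.
fn modify_struct(o: *OtherStruct) void {
    o.b = 23; // shorthand for o.*.b = 23
}

pub fn main() void {
    // Assumption: the struct starts at 0; the article does not show its initial value.
    var ostruct = OtherStruct{ .b = 0 };
    modify_struct(&ostruct);
    std.debug.print("{}\n", .{ostruct.b}); // prints 23
}
```

The caller has to write `&ostruct`, so the possibility of mutation is visible at the call site.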

What if we have a function that appends to an ArrayList instead of updating a struct, and we pass this ArrayList as a parameter to the function? Will that work?

const OtherStructList = std.ArrayList(OtherStruct);

fn modify_struct_list(olist: OtherStructList) !void {
    try olist.append(OtherStruct{ .b = @as(u8, 33)});
}
pub fn main() !void {
    var arena = ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const alloc = arena.allocator();
    var oLoost = OtherStructList.init(alloc);
    try modify_struct_list(oLoost);
    std.debug.print("{}", .{oLoost});
}

Anddd.... no. Function parameters are constants, so inside the function olist.append(...) hands append a *const pointer to the ArrayList, and append needs a mutable pointer to the ArrayList to do its work.

no_pointer_alist.zig:46:14: error: expected type '*array_list.ArrayListAligned(no_pointer_alist.OtherStruct,null)', found '*const array_list.ArrayListAligned(no_pointer_alist.OtherStruct,null)'
    try olist.append(OtherStruct{ .b = @as(u8, 33)});
        ~~~~~^~~~~~~
no_pointer_alist.zig:46:14: note: cast discards const qualifier

However, within the function, I can take a copy of the ArrayList and append to it:

fn modify_struct_list_with_copy(olist: OtherStructList) !void {
    var head = olist;
    try head.append(OtherStruct{ .b = @as(u8, 33)});
    std.debug.print("{}\n", .{head.items[0]});
}
//... Output
no_pointer_alist.OtherStruct{ .b = 33 }

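The passed-in parameter olist is NOT modified. The changes happen only to the function-local copy head: the copy has its own items slice (pointer and length) and capacity fields, so the caller's list never sees the new item. (In languages like Python or Java this would behave differently, because they pass objects by reference and do not copy non-primitives by default.)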
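A small sketch may help show why the append "disappears". `TinyList` below is a hypothetical stand-in I made up for illustration, not the real layout of std.ArrayList: it only mimics the idea of a struct that holds a slice plus a length. Copying the struct copies the length field, so a push through the copy never updates the caller's length, while a push through a pointer does.

```zig
const std = @import("std");

// Hypothetical stand-in for a list type: a buffer slice plus a length.
// This is NOT std.ArrayList; it only imitates the "slice + bookkeeping" shape.
const TinyList = struct {
    buf: []u8,
    len: usize = 0,

    fn push(self: *TinyList, v: u8) void {
        self.buf[self.len] = v;
        self.len += 1;
    }
};

fn pushToCopy(list: TinyList) void {
    var head = list; // shallow copy: same buffer, separate len field
    head.push(33); // only head.len changes
}

fn pushInPlace(list: *TinyList) void {
    list.push(33); // updates the caller's len
}

pub fn main() void {
    var storage: [4]u8 = undefined;
    var list = TinyList{ .buf = &storage };

    pushToCopy(list);
    std.debug.print("len after pushToCopy: {}\n", .{list.len}); // 0

    pushInPlace(&list);
    std.debug.print("len after pushInPlace: {}\n", .{list.len}); // 1
}
```

The same reasoning suggests that, with the ArrayList from the article, taking the parameter as a pointer (`olist: *OtherStructList`) and calling `modify_struct_list(&oLoost)` is the way to let the function append to the caller's list.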

Okay, so simple structs and ArrayLists passed by value are not modified. What about ArrayLists embedded inside other lists?

const MyStruct = struct {
    a: i32,
    o: OtherStructList
};
const MyStructList = std.ArrayList(MyStruct);

fn modify_embedded_list(m: MyStructList) !void {
   try m.items[0].o.append(OtherStruct{ .b = @as(u8, 66)});
}
// And in main()
    var arena = ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const alloc = arena.allocator();
    var myLoost = MyStructList.init(alloc);
    var oLoost = OtherStructList.init(alloc);
    try myLoost.append(MyStruct{ .a = @as(i32, 345), .o = oLoost });
    try modify_embedded_list(myLoost);
    std.debug.print("{}\n", .{myLoost.items[0].o.items.len});
// Output 
>> 1

Note that while m in fn modify_embedded_list is constant, that constness does not carry through the pointers it holds: m.items is a slice that points at mutable heap memory. (I was not sure at first what the expected behaviour should be; some of the background is probably in the proposal/reference for Result Location semantics: https://github.com/ziglang/zig/issues/287)
So, even though m is constant, m.items[0].o can still be appended to inside a function.
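One way to see this in isolation is a constant struct that holds a slice. The sketch below uses a hypothetical `Holder` type and `setFirst` function (my names, not from the article): the parameter `h` cannot be reassigned or have its fields changed, but the memory its slice points to is not const, so writing through the slice is allowed.

```zig
const std = @import("std");

// Hypothetical type for illustration: a struct holding a slice of mutable bytes.
const Holder = struct {
    values: []u8,
};

fn setFirst(h: Holder) void {
    // h is a constant parameter, so `h.values = ...` would not compile.
    // The bytes that h.values points to are not const, so this is fine:
    h.values[0] = 66;
}

pub fn main() void {
    var backing = [_]u8{ 1, 2, 3 };
    const h = Holder{ .values = &backing };
    setFirst(h);
    std.debug.print("{}\n", .{backing[0]}); // prints 66
}
```

If the field were declared as `[]const u8` instead, the write inside `setFirst` would be rejected at compile time.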

Here is where Zig's implicit copying might cause unintentional bugs if you don't know how the language works. What if we make a copy of myLoost.items[0] and then append to the copy instead?

fn modify_embedded_list_with_copy(m: MyStructList) !void {
    var top = m.items[0]; // copies the MyStruct, including its ArrayList fields
    try top.o.append(OtherStruct{ .b = @as(u8, 66) });
}
// And in main()
    try modify_embedded_list_with_copy(myLoost);
    std.debug.print("{}\n", .{myLoost.items[0].o.items.len});
// Output: KABOOM !
>> 0

Yes, the append happens to the copy (local var top) rather than to the original list embedded inside another list. While I was writing some code, this behaviour caused subtle bugs that took almost 2-3 days to track down, until I realized that the copy was the reason appends to my embedded lists were simply vanishing.
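The difference between copying an element into a local variable and taking a pointer to it can be sketched without ArrayList at all. `Counter`, `bumpCopy` and `bumpInPlace` below are hypothetical names for illustration; the only change between the two functions is `items[0]` versus `&items[0]`.

```zig
const std = @import("std");

// Hypothetical element type for illustration.
const Counter = struct {
    n: u32,
};

fn bumpCopy(items: []Counter) void {
    var top = items[0]; // copies the struct out of the slice
    top.n += 1; // only the local copy changes
}

fn bumpInPlace(items: []Counter) void {
    const top = &items[0]; // pointer into the slice, no copy
    top.n += 1; // changes the caller's element
}

pub fn main() void {
    var storage = [_]Counter{.{ .n = 0 }};

    bumpCopy(&storage);
    std.debug.print("after bumpCopy: {}\n", .{storage[0].n}); // 0

    bumpInPlace(&storage);
    std.debug.print("after bumpInPlace: {}\n", .{storage[0].n}); // 1
}
```

Applied to the article's code, this suggests writing `const top = &m.items[0];` before calling `top.o.append(...)`, so the append lands in the embedded list rather than in a copy of it.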

Summing up my article, the lesson learnt is: Beware the copy, especially of non-primitive data types.

Top comments (7)

David Vanderson

Great write up! I had a similar experience - it took me a while to realize that obj.foo() works the same way and could pass a copy of obj.

It was surprising enough that I proposed changing zig:
github.com/ziglang/zig/issues/13249

Trying to implement the proposal showed it wasn't as good as I hoped.

This is one of those things that people (at least me!) have to learn through experience and reading writeups like this!

Govind

Interesting issue. I admit that one area where Zig (or the Zig documentation) fails big time is memory, more specifically when copies are made and when they aren't. In Rust, this is more explicit because it is part of the type, but in Zig, for example, when you have an ArrayList(SomeStruct) and call append, you are making a copy of SomeStruct, and that copy is what ends up in the ArrayList. For a lot of us who come from memory-managed languages, understanding all this nuance without docs explaining why is a big learning curve before feeling comfortable with Zig.

guidorice

Ah gotcha- I think I see now.

Tau

Of course the members of a constant value are const too! A constant value can contain a pointer to or slice of non-const data though, which is why you can append to m.items[0].o. Maybe the language reference should explicitly state that all values, not just parameters, are copied by-value. This seems to be generally implicitly understood for systems languages, but I guess it can be surprising if you are coming from an object-oriented language.

Pyrolistical

Feels like something a linter could warn about

guidorice • Edited

Related section in the language reference (27.1 Pass-by-value Parameters)

Govind

Yeah, it's clear on parameters, but not on function-local variables, which is where I had indecipherable bugs for a while.