Zig NEWS

Govind

Beware the copy!

TL;DR: Be aware of when Zig copies data and when it creates a reference!

Imagine a simple struct:

const OtherStruct = struct {
    b: u8
};

And let's try to add a function to update this struct:

fn cannot_modify_struct(o: OtherStruct) void {
    o.b = @as(u8, 23);
}

Compiling this immediately results in an error:

no_pointer_alist.zig:39:6: error: cannot assign to constant
    o.b = @as(u8, 23);
    ~^~

This is because Zig treats all parameters as constants. The solution is to pass o as *OtherStruct and update it in the function using o.*.b = @as(u8, 23) (plain o.b = 23 also works, since Zig dereferences a pointer automatically on field access):

fn modify_struct(o: *OtherStruct) void {
    o.*.b = @as(u8, 23);
}
// .... Usage
    modify_struct(&ostruct);
    std.debug.print("{}", .{ostruct});
//.... Output
> no_pointer_alist.OtherStruct{ .b = 23 }
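
To see the two parameter styles side by side, here is a minimal sketch. It only assumes the OtherStruct definition from above; the function names bump_copy and bump_in_place are made up for illustration and are not from the original code.

```zig
const std = @import("std");

const OtherStruct = struct {
    b: u8,
};

// Hypothetical helper: takes the struct by value. The parameter is constant,
// so we mutate a local copy, and the caller never sees that change.
fn bump_copy(o: OtherStruct) u8 {
    var local = o;
    local.b += 1;
    return local.b;
}

// Hypothetical helper: takes a pointer, so the caller's struct is updated in place.
fn bump_in_place(o: *OtherStruct) void {
    o.b += 1;
}

pub fn main() void {
    var s = OtherStruct{ .b = 1 };

    const bumped = bump_copy(s);
    std.debug.assert(bumped == 2);
    std.debug.assert(s.b == 1); // the original is untouched

    bump_in_place(&s);
    std.debug.assert(s.b == 2); // the original changed through the pointer
}
```

The asserts spell out the difference: the by-value call leaves s alone, the pointer call does not.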

What if we have a function that appends to an ArrayList instead of updating a struct, and we pass the ArrayList as a parameter? Will that work?

const std = @import("std");
const OtherStructList = std.ArrayList(OtherStruct);

fn modify_struct_list(olist: OtherStructList) !void {
    // Does not compile: olist is a constant parameter, but append needs a mutable pointer.
    try olist.append(OtherStruct{ .b = @as(u8, 33) });
}

pub fn main() !void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const alloc = arena.allocator();
    var oLoost = OtherStructList.init(alloc);
    try modify_struct_list(oLoost);
    std.debug.print("{}", .{oLoost});
}

Anddd.... no. Function parameters are constant, so when append is called on olist, Zig takes its address as a *const ArrayList, and append needs a mutable *ArrayList:

no_pointer_alist.zig:46:14: error: expected type '*array_list.ArrayListAligned(no_pointer_alist.OtherStruct,null)', found '*const array_list.ArrayListAligned(no_pointer_alist.OtherStruct,null)'
    try olist.append(OtherStruct{ .b = @as(u8, 33)});
        ~~~~~^~~~~~~
no_pointer_alist.zig:46:14: note: cast discards const qualifier

However, within the function, I can take a copy of the ArrayList and append to it:

fn modify_struct_list_with_copy(olist: OtherStructList) !void {
    var head = olist;
    try head.append(OtherStruct{ .b = @as(u8, 33)});
    std.debug.print("{}\n", .{head.items[0]});
}
//... Output
no_pointer_alist.OtherStruct{ .b = 33 }

The passed-in parameter olist is NOT modified; the changes happen only to the function-local copy. (In languages like Python or Java the caller would see the append, because objects there are passed as references rather than copied.)

Okay, so simple structs and ArrayLists are not modified. What about ArrayLists embedded inside other lists?

const MyStruct = struct {
    a: i32,
    o: OtherStructList
};
const MyStructList = std.ArrayList(MyStruct);

fn modify_embedded_list(m: MyStructList) !void {
    try m.items[0].o.append(OtherStruct{ .b = @as(u8, 66) });
}
// And in main()
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const alloc = arena.allocator();
    var myLoost = MyStructList.init(alloc);
    var oLoost = OtherStructList.init(alloc);
    try myLoost.append(MyStruct{ .a = @as(i32, 345), .o = oLoost });
    try modify_embedded_list(myLoost);
    std.debug.print("{}\n", .{myLoost.items[0].o.items.len});
// Output 
>> 1

Note that while m in modify_embedded_list is constant, that does not stop us from mutating what its items slice points to. (I am not sure what the intended behaviour is here; it may be covered in the proposal for result location semantics: https://github.com/ziglang/zig/issues/287)
So, even though m is constant, m.items[0].o can still be appended to inside the function.
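
One way to picture why this works: an ArrayList keeps its elements behind a slice (the items field), and a constant struct can still hold a slice that points to mutable memory. The sketch below uses a hypothetical Wrapper type and a made-up write_through function rather than std.ArrayList itself, so treat it as an illustration of the idea only.

```zig
const std = @import("std");

// Hypothetical stand-in for a container: it owns nothing, it only
// holds a slice that points at memory living somewhere else.
const Wrapper = struct {
    items: []u8,
};

fn write_through(w: Wrapper) void {
    // `w` is constant, so `w.items = ...` would not compile.
    // The bytes behind the slice are not constant, so this is allowed.
    w.items[0] = 42;
}

pub fn main() void {
    var data = [_]u8{ 1, 2, 3 };
    const w = Wrapper{ .items = &data };
    write_through(w);
    std.debug.assert(data[0] == 42); // the caller's memory was changed
}
```

The constness applies to the slice header stored in the struct (pointer and length), not to the elements it refers to.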

Here is where Zig's implicit copying can cause unintended bugs if you don't know how the language works. What if we make a copy of myLoost.items[0] and then append to the copy instead?

fn modify_embedded_list_with_copy(m: MyStructList) !void {
    // top is a copy of the MyStruct element, including its embedded ArrayList struct.
    var top = m.items[0];
    try top.o.append(OtherStruct{ .b = @as(u8, 66) });
}
// And in main()
    try modify_embedded_list_with_copy(myLoost);
    std.debug.print("{}\n", .{myLoost.items[0].o.items.len});
// Output: KABOOM !
>> 0

Yes, the append happens to the copy (local var top) rather than to the original list embedded inside the other list. While I was writing some code, this behaviour caused subtle bugs that took 2-3 days to track down, before I realized that the copy was the reason appends to my embedded lists were simply vanishing. Taking a pointer to the element instead (const top = &m.items[0];) makes the append land in the original.
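
Here is a rough sketch of why the appended element seems to vanish. TinyList is a hypothetical, much simplified stand-in for std.ArrayList (fixed storage, no allocator, no bounds handling); the only assumption is that the list's length lives inside the struct, next to a slice that points at the storage.

```zig
const std = @import("std");

// Hypothetical simplified list: a slice to external storage plus a length.
const TinyList = struct {
    storage: []u8,
    len: usize = 0,

    fn append(self: *TinyList, value: u8) void {
        self.storage[self.len] = value;
        self.len += 1;
    }
};

pub fn main() void {
    var backing = [_]u8{0} ** 4;
    var original = TinyList{ .storage = &backing };

    // Assignment copies the struct itself: the slice header and `len`,
    // but not the storage the slice points to.
    var copy = original;
    copy.append(66);

    // The byte landed in the shared storage, but only the copy's length grew.
    std.debug.assert(backing[0] == 66);
    std.debug.assert(copy.len == 1);
    std.debug.assert(original.len == 0);

    // Going through a pointer updates the original's bookkeeping.
    const ptr = &original;
    ptr.append(77);
    std.debug.assert(original.len == 1);
}
```

With a real ArrayList the copy may additionally reallocate its buffer on append, so the original can end up pointing at stale memory on top of having the wrong length.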

Summing up my article, the lesson learnt is: Beware the copy, especially of non-primitive data types.

Discussion (4)

Tau

Of course the members of a constant value are const too! A constant value can contain a pointer to or slice of non-const data though, which is why you can append to m.items[0].o. Maybe the language reference should explicitly state that all values, not just parameters, are copied by-value. This seems to be generally implicitly understood for systems languages, but I guess it can be surprising if you are coming from an object-oriented language.

guidorice

Ah gotcha- I think I see now.

guidorice

Related section in the language reference (27.1 Pass-by-value Parameters)

Govind Author

Yeah, it's clear on parameters, but not on fn local variables, which is where I had indecipherable bugs for a while.