The hack and complexity of Package Manager in Zig 0.11.0
Ed Yu (@edyu on GitHub and @edyu on Twitter)
Oct.18.2023
Introduction
Zig is a modern systems programming language, and although it claims to be a better C, many people who initially didn't need systems programming were attracted to it because its syntax is simpler than that of alternatives such as C++ or Rust.
However, due to the power of the language, some of the syntax is not obvious to those first coming to the language. I was actually one such person.
Several months ago, when I first tried out the new Zig package manager, Zig 0.11.0 had not yet been officially released. Not only was the language unstable, but the package manager itself also suffered from many stability issues, especially with TLS. I had to hack together a system that worked for my needs, and I documented my journey in Zig Package Manager - WTF is Zon.
Since then I've discussed the Zig package manager with Andrew and various others on the Zig Discord and Ziggit, and even opened a GitHub issue.
Now that Zig 0.11.0 has been released (August 2023) and many of the stability problems have been resolved, I want to revisit my hack to see whether I can come up with a better one.
A special shoutout to my friend InKryption, who was tremendously helpful in my understanding of the package manager. I wouldn't have been able to come up with this better hack without his help.
Disclaimer
As I mentioned in my previous article, I changed my typical subtitle of power and complexity to hack and complexity because not only was Zig 0.11.0 (which first introduced the package manager) not released yet but also I had to do a pretty ugly hack to make it work.
I just want to reiterate my stance on Zig and the package manager. I'm not writing this to discourage you from using it but to set the right expectation and hopefully help you in case you encounter similar issues.
Zig along with its package manager is being constantly improved and I'm looking forward to the 0.12.0 release.
Today, I'll introduce a better hack than what I had to do in June, 2023 and ideally I can retire my hack after the 0.12.0 release.
I'll most likely write a follow-up article once Zig 0.12.0 is released (hopefully) by the end of the year.
I will not reiterate concepts introduced in Part 1, so please read that first if you find this article confusing.
Package (Manager) vs Binary Library
One of my previous misunderstandings of the package manager was that I was using a Zig package as a library.
Let's reuse the same example of C -> B -> A from Part 1, in which our program C depended on package B, which in turn depended on package A.
The way I built program C and packages B and A was basically to copy everything package A produced over to package B, and then to copy both what package B produced and what package A produced over to program C as part of the build process. The thing that is produced is called an artifact in the Zig package manager.
That was not the correct way to use a package manager, because one of the benefits of a package manager is that you only need to concern yourself with the packages you depend on directly, without needing to care about the additional packages those direct dependencies themselves depend on.
In the example of C -> B -> A, program C should only know and care about package B and should not need to care at all that package B needs package A internally, because the package manager should take care of the transitive dependencies.
In other words, a package manager should provide good enough encapsulation for packages that users need not care about packages not directly required by their own programs.
As an example, despite many of the dependency problems, npm does a good job (probably too good a job) of encapsulation.
It's so good that sometimes when you add 1 package, you might be surprised when npm automatically pulls down hundreds of packages because it recursively downloads all dependencies.
However, such clean encapsulation is not always possible when we are building native programs in Zig especially when shared libraries are involved.
Artifact vs Module
In addition to artifacts, the Zig package manager also has the concept of a module, but a module mainly refers to Zig source code and is primarily used so that your program can import the Zig package as a library.
A module is equivalent to a Zig library (source code) exposed by the package manager. A module is not useful when the binary library you depend on is not written in Zig.
When building your program, you need access to the artifact produced by the dependency in order to access the specific items produced by such dependency.
To summarize, if your package is written in Zig, then you can access the Zig code in such package as a module and you can access either the shared library, static library, or the executable produced by such package as artifacts. However, if your package is not written in Zig, then you need to do some additional work to expose the code/library as a module and expose the resulting items as part of the artifact.
The main problem I had to deal with was that the Zig package manager revolves around the idea of an artifact, which requires a Compile step that involves compilation, linking, or both. As stated earlier, an artifact is whatever is produced as part of the build process. Where this falls apart is when we need to package together items that do not require a build (Compile) step.
Hence, the existing artifact conceptualization doesn't work well when we have to deal with a package composed of an existing binary library, such as a shared library that doesn't require any additional compilation or linking. Note that this can be the case even if you have the source code, because you may not want to compile the source code yourself if the project publishes binary packages as part of its releases.
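To make the module/artifact distinction concrete, here is a minimal sketch of a consumer's build.zig under Zig 0.11. The names app and some_pkg are hypothetical placeholders (they are not from this article's projects), and the sketch assumes the dependency is written in Zig, exposes a module with addModule, installs a library with installArtifact, and declares the standard target and optimize options.

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    const exe = b.addExecutable(.{
        .name = "app", // hypothetical program name
        .root_source_file = .{ .path = "src/main.zig" },
        .target = target,
        .optimize = optimize,
    });

    // "some_pkg" is a hypothetical key under .dependencies in build.zig.zon.
    // Passing target/optimize assumes the dependency declares those options.
    const dep = b.dependency("some_pkg", .{
        .target = target,
        .optimize = optimize,
    });

    // Module: Zig source code exposed by the dependency.
    // After this call the program can use @import("some_pkg").
    exe.addModule("some_pkg", dep.module("some_pkg"));

    // Artifact: a compiled output (here a library) produced by the
    // dependency's own Compile step.
    exe.linkLibrary(dep.artifact("some_pkg"));

    b.installArtifact(exe);
}
```

Both lookups work only for things the dependency's build script registered itself, which is why a prebuilt shared library with no Compile step has no natural place in this picture.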
The Problem
I'll reintroduce the problem mentioned in Part 1.
The scenario is quite common in projects that use packages written in a different language from the main project:
A: You often would need the shared or static library from the package written in another language compiled for your environment (such as Linux).
B: You would also need to write a wrapper for such library in your native language.
C: You then would write your program calling the functions provided by the wrapper B.
Our concrete example has 3 packages: A, B, and C. Our program my-wtf-project is in package C, which needs to use DuckDb for its database needs.
The project C will use the Zig layer provided by package B, which in turn will need the actual DuckDb implementation provided by package A.
For my-wtf-project, our main program will call the Zig library provided by zig-duckdb. zig-duckdb is just a Zig wrapper around libduckdb, which provides the dynamic library of release 0.9.1 of DuckDb.
To use the C -> B -> A example from the earlier section, program C is our project my-wtf-project, package B is zig-duckdb, and package A is libduckdb.
Note that package B used to be called duckdb.zig but it has since been renamed to zig-duckdb.
The Hack in Part 1
There are two hacks I had to do for the build.zig of package A (libduckdb), package B (zig-duckdb), and program C (my-wtf-project):
First, in the build.zig of libduckdb, I had to create an artifact even though libduckdb.so is a shared library that doesn't need additional compilation/linking. I did this by creating a new static library that is linked to libduckdb.so, just so I could use the artifact in zig-duckdb.
Second, I had to use installHeader to install both duckdb.h and libduckdb.so in all the build.zig files, to copy these 2 files over to zig-out/include and zig-out/lib respectively.
The New Hack
I'm still calling this a hack because, as stated, a module is mainly used to refer to Zig source code that can be used as a library to be imported by your program. Just like how a shared library is not meant to be installed via calls to install header files, a module is not meant to be used to refer to individual artifacts in a package. However, this is exactly what I had to do.
I believe this is better than how I was using installHeader and installLibraryHeaders to install artifacts produced by dependencies.
A big benefit of using the module to refer to non-Zig-produced artifacts is that we do not need to copy over artifacts from the dependencies anymore.
A: libduckdb
DuckDb is written in C++, and the libduckdb-linux-amd64 release from DuckDb provides only 3 files: duckdb.h, duckdb.hpp, and libduckdb.so.
I unzipped the package and placed duckdb.h under the include directory and libduckdb.so under the lib directory.
build.zig.zon of A: libduckdb
Because libduckdb has no dependencies, the zon file is extremely simple.
It just lists the name and the version. I've intentionally been using the actual version number of the underlying DuckDb.
// build.zig.zon
// there are no dependencies
.{
    // note that we don't have to call this libduckdb
    .name = "duckdb",
    .version = "0.9.1",
}
build.zig of A: libduckdb
This is the first big change from Part 1. We are no longer building a fake artifact. We are only introducing some modules so that any package depending on this package can reference these items using the various module names. This is still a hack because technically these items are artifacts, not modules, but at least we don't have to compile a shared library that doesn't need to be compiled.
const std = @import("std");
pub fn build(b: *std.Build) !void {
    // each "module" is only a named path into this package, not Zig source
    _ = b.addModule("libduckdb.lib", .{ .source_file = .{ .path = b.pathFromRoot("lib") } });
    _ = b.addModule("libduckdb.include", .{ .source_file = .{ .path = b.pathFromRoot("include") } });
    _ = b.addModule("duckdb.h", .{ .source_file = .{ .path = b.pathFromRoot("include/duckdb.h") } });
    _ = b.addModule("libduckdb.so", .{ .source_file = .{ .path = b.pathFromRoot("lib/libduckdb.so") } });
}
This will make more sense in the next sections.
B: zig-duckdb
The zig-duckdb is still a minimal Zig wrapper to DuckDb. It suits my needs for now and the only changes added since last time are the ability to query for boolean and optional values.
The big change is that we no longer need to install libduckdb.so or duckdb.h from libduckdb.
build.zig.zon of B: zig-duckdb
We do have a dependency now as we need to refer to a release of A: libduckdb.
// build.zig.zon
// Now we depend on a release of A: libduckdb
.{
    .name = "duck",
    .version = "0.0.5",
    .dependencies = .{
        // this is the name you want to use in the build.zig to reference this dependency
        // note that we didn't have to call this libduckdb or even duckdb
        .duckdb = .{
            .url = "https://github.com/beachglasslabs/libduckdb/archive/refs/tags/v0.9.1.3.tar.gz",
            .hash = "1220e182337ada061ebf86df2a73bda40e605561554f9dfebd6d1cd486a86c964e09",
        },
    },
}
build.zig of B: zig-duckdb
Note that we no longer install libduckdb.so or duckdb.h as part of the build process, as we previously had to do in Part 1.
We do have to call addModule multiple times, not only to expose the library libduck.a (the artifact of this package) itself but also to re-export the modules provided by libduckdb.
Note how we now call duck_dep.builder.pathFromRoot(duck_dep.module("libduckdb.include").source_file.path) to access the include directory and duck_dep.builder.pathFromRoot(duck_dep.module("libduckdb.lib").source_file.path) to access the lib directory.
You can think of this as the equivalent of reaching inside libduckdb to access these items, and therefore we don't have to copy these items into our output directory anymore, as we previously had to do with lib.installLibraryHeaders(duck_dep.artifact("duckdb")).
const std = @import("std");
pub fn build(b: *std.Build) !void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});
    const duck_dep = b.dependency("duckdb", .{});
    // this is our main wrapper file
    _ = b.addModule("duck", .{
        .source_file = .{ .path = "src/main.zig" },
    });
    // (re-)add modules from libduckdb
    _ = b.addModule("libduckdb.include", .{
        .source_file = .{ .path = duck_dep.builder.pathFromRoot(
            duck_dep.module("libduckdb.include").source_file.path,
        ) },
    });
    _ = b.addModule("libduckdb.lib", .{
        .source_file = .{ .path = duck_dep.builder.pathFromRoot(
            duck_dep.module("libduckdb.lib").source_file.path,
        ) },
    });
    _ = b.addModule("duckdb.h", .{
        .source_file = .{ .path = duck_dep.builder.pathFromRoot(
            duck_dep.module("duckdb.h").source_file.path,
        ) },
    });
    _ = b.addModule("libduckdb.so", .{
        .source_file = .{ .path = duck_dep.builder.pathFromRoot(
            duck_dep.module("libduckdb.so").source_file.path,
        ) },
    });
    const lib = b.addStaticLibrary(.{
        .name = "duck",
        // In this case the main source file is merely a path, however, in more
        // complicated build scripts, this could be a generated file.
        .root_source_file = .{ .path = "src/main.zig" },
        .target = target,
        .optimize = optimize,
    });
    // search the lib and include directories inside the libduckdb package
    lib.addLibraryPath(.{ .path = duck_dep.builder.pathFromRoot(
        duck_dep.module("libduckdb.lib").source_file.path,
    ) });
    lib.addIncludePath(.{ .path = duck_dep.builder.pathFromRoot(
        duck_dep.module("libduckdb.include").source_file.path,
    ) });
    lib.linkSystemLibraryName("duckdb");
    b.installArtifact(lib);
}
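The path lookup in the build script above is repeated many times, so it could be factored into a small helper. The following is only a sketch, assuming Zig 0.11 (where a module stores its path in source_file); the helper name depModulePath is hypothetical and is not part of zig-duckdb or of the Zig standard library.

```zig
const std = @import("std");

/// Hypothetical helper: returns the absolute path that a dependency
/// registered under `module_name`, resolved against the dependency's root.
/// Assumes the module was created from a `.path` value, as libduckdb does.
fn depModulePath(dep: *std.Build.Dependency, module_name: []const u8) []const u8 {
    return dep.builder.pathFromRoot(dep.module(module_name).source_file.path);
}

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});
    const duck_dep = b.dependency("duckdb", .{});

    const lib = b.addStaticLibrary(.{
        .name = "duck",
        .root_source_file = .{ .path = "src/main.zig" },
        .target = target,
        .optimize = optimize,
    });
    // Same two lookups as in the build script above, via the helper.
    lib.addLibraryPath(.{ .path = depModulePath(duck_dep, "libduckdb.lib") });
    lib.addIncludePath(.{ .path = depModulePath(duck_dep, "libduckdb.include") });
    lib.linkSystemLibraryName("duckdb");
    b.installArtifact(lib);
}
```

The helper changes nothing about the hack itself; it only keeps the "module as a named path" convention in one place so it is easier to retire later.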
Note that if you really want to install libduckdb.so, for example, you can do so with the following call:
_ = b.installLibFile(duck_dep.builder.pathFromRoot(
    duck_dep.module("libduckdb.so").source_file.path,
), "libduckdb.so");
If you look into the project, you will see that I introduced a new file called test.zig that is meant to test the new boolean and optional values.
In order to run the test, I've added a new test step in build.zig:
const unit_tests = b.addTest(.{
    .root_source_file = .{ .path = "src/test.zig" },
    .target = target,
    .optimize = optimize,
});
unit_tests.step.dependOn(b.getInstallStep());
unit_tests.linkLibC();
// note how I use modules to access these directories
unit_tests.addLibraryPath(.{ .path = duck_dep.builder.pathFromRoot(
    duck_dep.module("libduckdb.lib").source_file.path,
) });
unit_tests.addIncludePath(.{ .path = duck_dep.builder.pathFromRoot(
    duck_dep.module("libduckdb.include").source_file.path,
) });
unit_tests.linkSystemLibraryName("duckdb");
const run_unit_tests = b.addRunArtifact(unit_tests);
run_unit_tests.setEnvironmentVariable("LD_LIBRARY_PATH", duck_dep.builder.pathFromRoot(
    duck_dep.module("libduckdb.lib").source_file.path,
));
const test_step = b.step("test", "Run unit tests");
test_step.dependOn(&run_unit_tests.step);
Once again, you can see why I've exposed the lib and include directories of libduckdb via modules.
I can now call addIncludePath and addLibraryPath by referencing their modules.
Note the call to setEnvironmentVariable: -L is only useful for linking, not for running the test/program. Hence you need to point to libduckdb.so using LD_LIBRARY_PATH, once again by accessing the location of the shared library inside the libduckdb package.
C: my-wtf-project
Now to create the executable for our project, we need to link to the packages A libduckdb and B zig-duckdb.
build.zig.zon of C: my-wtf-project
Our only dependency is the release of B: zig-duckdb.
// build.zig.zon
// Now we depend on a release of B: zig-duckdb
.{
// this is the name of our own project
.name = "my-wtf-project",
// this is the version of our own project
.version = "0.0.2",
.dependencies = .{
// we depend on the duck package described in B
.duck = .{
.url = "https://github.com/beachglasslabs/zig-duckdb/archive/refs/tags/v0.0.5.tar.gz",
.hash = "12207c44a5bc996bb969915a5091ca9b70e5bb0f9806827f2e3dd210c946e346a05e",
},
},
}
build.zig of C: my-wtf-project
This is somewhat similar to the build.zig of B (zig-duckdb).
Note once again that we do not need to call installLibraryHeaders to install libduckdb.so and duckdb.h anymore.
I've also added setEnvironmentVariable to set LD_LIBRARY_PATH for running the test program.
const std = @import("std");
pub fn build(b: *std.Build) !void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});
    const exe = b.addExecutable(.{
        .name = "my-wtf-project",
        .root_source_file = .{ .path = "testzon.zig" },
        .target = target,
        .optimize = optimize,
    });
    const duck = b.dependency("duck", .{
        .target = target,
        .optimize = optimize,
    });
    exe.addModule("duck", duck.module("duck"));
    exe.linkLibrary(duck.artifact("duck"));
    // these modules are the ones zig-duckdb re-exported from libduckdb
    exe.addIncludePath(.{ .path = duck.builder.pathFromRoot(
        duck.module("libduckdb.include").source_file.path,
    ) });
    exe.addLibraryPath(.{ .path = duck.builder.pathFromRoot(
        duck.module("libduckdb.lib").source_file.path,
    ) });
    // You'll get segmentation fault if you don't link with libC
    exe.linkLibC();
    exe.linkSystemLibraryName("duckdb");
    b.installArtifact(exe);
    const run_cmd = b.addRunArtifact(exe);
    run_cmd.step.dependOn(b.getInstallStep());
    // you must set the LD_LIBRARY_PATH to find libduckdb.so
    run_cmd.setEnvironmentVariable("LD_LIBRARY_PATH", duck.builder.pathFromRoot(
        duck.module("libduckdb.lib").source_file.path,
    ));
    const run_step = b.step("run", "Run the test");
    run_step.dependOn(&run_cmd.step);
}
Running the executable
You can now just call zig build run to run the test program because we already set LD_LIBRARY_PATH using setEnvironmentVariable in our build.zig.
I ~/w/z/wtf-zig-zon-2 6m 10.7s ❱ zig build run
info: duckdb: opened in-memory db
info: duckdb: db connected
debug: duckdb: query sql select * from pragma_version();
Database version is v0.9.1
STOPPED!
Leaks detected: false
I ~/w/z/wtf-zig-zon-2 4.1s ❱
Bonus: Package Cache
When I mentioned reaching inside the package, what happens behind the scenes is that the package is in ~/.cache/zig, so all this magic with modules is really just specifying the paths to the particular packages under ~/.cache/zig.
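As a rough illustration of that layout, here is a small standalone sketch that builds the directory a package ends up in from the global cache directory and the package hash. The function name packageDir is hypothetical, and the p/<hash> layout is only what the paths printed by this article's build suggest for Zig 0.11 on Linux; it is not a documented guarantee.

```zig
const std = @import("std");

/// Hypothetical helper: joins the global cache directory, the "p"
/// subdirectory, and a package hash into one path. Caller frees the result.
fn packageDir(
    allocator: std.mem.Allocator,
    global_cache_dir: []const u8,
    hash: []const u8,
) ![]u8 {
    return std.fs.path.join(allocator, &[_][]const u8{ global_cache_dir, "p", hash });
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    // The hash is the one listed for libduckdb in zig-duckdb's build.zig.zon.
    const dir = try packageDir(
        allocator,
        "/home/ed/.cache/zig",
        "1220e182337ada061ebf86df2a73bda40e605561554f9dfebd6d1cd486a86c964e09",
    );
    defer allocator.free(dir);
    // These two directories are what the libduckdb.include and
    // libduckdb.lib modules resolve to on the author's machine.
    std.debug.print("{s}/include\n{s}/lib\n", .{ dir, dir });
}
```

This is why nothing has to be copied into zig-out anymore: the include and library search paths point straight into the cached package.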
You can see more clearly what's going on if you add --verbose to your zig build or zig build run commands.
I ~/w/z/wtf-zig-zon-2 4.1s ❱ zig build run --verbose
/snap/zig/8241/zig build-lib /home/ed/.cache/zig/p/1220fe38df4d196b7aeca68ee6de3f7b36f1424196466038000f7485113cf704f478/src/main.zig -lduckdb --cache-dir /home/ed/ws/zig/wtf-zig-zon-2/zig-cache --global-cache-dir /home/ed/.cache/zig --name duck -static -target native-native -mcpu znver3-mwaitx-pku+shstk-wbnoinvd -I /home/ed/.cache/zig/p/1220e182337ada061ebf86df2a73bda40e605561554f9dfebd6d1cd486a86c964e09/include -L /home/ed/.cache/zig/p/1220e182337ada061ebf86df2a73bda40e605561554f9dfebd6d1cd486a86c964e09/lib --listen=-
/snap/zig/8241/zig build-exe /home/ed/ws/zig/wtf-zig-zon-2/testzon.zig /home/ed/ws/zig/wtf-zig-zon-2/zig-cache/o/b893f00994b9c79eab2c150de991b233/libduck.a -lduckdb -lduckdb -lc --cache-dir /home/ed/ws/zig/wtf-zig-zon-2/zig-cache --global-cache-dir /home/ed/.cache/zig --name my-wtf-project --mod duck::/home/ed/.cache/zig/p/1220fe38df4d196b7aeca68ee6de3f7b36f1424196466038000f7485113cf704f478/src/main.zig --deps duck -I /home/ed/.cache/zig/p/1220e182337ada061ebf86df2a73bda40e605561554f9dfebd6d1cd486a86c964e09/include -L /home/ed/.cache/zig/p/1220e182337ada061ebf86df2a73bda40e605561554f9dfebd6d1cd486a86c964e09/lib --listen=-
LD_LIBRARY_PATH=/home/ed/.cache/zig/p/1220e182337ada061ebf86df2a73bda40e605561554f9dfebd6d1cd486a86c964e09/lib /home/ed/ws/zig/wtf-zig-zon-2/zig-out/bin/my-wtf-project
info: duckdb: opened in-memory db
info: duckdb: db connected
debug: duckdb: query sql select * from pragma_version();
Database version is v0.9.1
STOPPED!
Leaks detected: false
I ~/w/z/wtf-zig-zon-2 ❱
The End
Part 1 is here.
You can find the code here.
Here is the code for zig-duckdb and libduckdb.
Special thanks to @InKryption for helping out on the new hack for the Zig package manager!
Top comments (7)
Thanks for this writing.
There is a missing part on how to use a local dep, which is probably something that will definitely show up in our life.
Namely, imagine that the data structure is like follows
In my_app.zig I want to use code from my_mod.zig; how to set it up? This is not obvious and not covered.
It turns out (after reading the zig build source a bit) that createModule should be used, like the following build.zig for my_app
Some explanations of the above:
b.createModule helps us create a module live, and then it can be given an alias name
exe.addModule("my_mod", my_mod); is the line that gives our live module the alias my_mod, so in my_app it can be imported as const my_mod = @import("my_mod")
my_app exe (as you can see in lines of addObjectFile (static lib), and linkSystemLibrary2. As far as I know this is the way
With the above method, just inside the my_app folder, zig build run should then correctly use my_mod and behave!
Hope this can help others searching for the same question.
Edit: I think 0.12.0's package manager, or the package manager of the master branch now, should handle local packages better by relative path or the file:// protocol.
Related Discussions on Github Issues: #17364, #14603
In this case could the my-mod package be consumed online? In other words, how should other zig apps depending on it (let's say another-app written by others) declare the dependency? I assume it is impossible to depend on it even if the correct github link is included, as there is no build.zig in my-mod?
Would it be possible to instead first publish my-mod as with zig-duckdb in the blog post, so that the consumer app, let it be my-wtf-project or my-app, has a build.zig.zon with something like this:
Thanks again. I have read and tried. Local file URL works!
With zig version 0.12.0-dev.1828+225fe6ddb (or up :)), zig fetch --save "../my_mod" inside the my_app folder will save the correct dependency entry into build.zig.zon. Something like below
then in my_app build.zig add the correct code to use it and it will be fine, something similar to below
One thing to mention is that whenever the files of my_mod change, you need to zig fetch again inside my_app, otherwise zig is using the old version.
Thanks @liyu1981! I'm using a similar approach with just one difference. I prefer .path over .url as it does not require a .hash check. It allows me to combine build.zig.zon and git submodule, so I can download packages in an environment that lives behind a network proxy. I wrote a post on this if you would like to have a try. zig.news/fuzhouch/use-git-submodul...
Thanks for sharing the pointers. I will read them.
My use case for a local mod is that when developing some lib, for example a string lib, I want to write some benchmark using it before publishing it. So the string lib is my-mod, and the benchmark is my-app. My-mod does not necessarily go out before the benchmark is done. This is common practice in other modern languages, which is why I am searching.
I will definitely read the file:// support see what I find :)
I'm curious if you could give an update on this situation now that Zig has reached 0.13? In particular, are these hacks needed? Are there still lesser hacks needed? Or is package management done and solved (lol)? I joke, but I may actually consider cargo to be done and solved, as it's been a delight every time to use and many of my developer tools are rust projects.
I'm learning about zig and it's a good time to evaluate its capabilities in terms of project management, as that of course has an impact on achievable project scale. As a metaprogramming enthusiast, I'm loving what I've read so far about the comptime features. The convention where zig build instructions are written in zig is also coherent.
I write lots of different kinds of software (graphics/gamedev, all kinds of automation) but have been on a metaprogramming kick lately, doing that with TypeScript. Although zig is not well suited for functional programming, which is needed at a high level for many complex projects, I can continue to express those layers in more functional friendly languages, and I'm optimistic that Zig may be how I want to approach growing out the performance critical sections of software I make from here on out. Up till recently, my choice of non-GC systems lang would have been C++ or Rust, but both of those seem to import a lot of baggage.
All this work just to install a simple library?
Honestly I'd rather just use plain old C instead.