(The following is a summary I wrote for myself as I got better at this day by day. Hopefully it will be useful to others :))
One thing that attracted me to zig is that it is a much better c, and that it is really serious about working with c code, even legacy c code. There is a whole section dedicated to this in the zig language book. But as I followed along, more and more questions surfaced. Some of them have answers, while others may not yet. Anyway, let the list below be the catch-them-all section.
All the code shown below is uploaded to https://github.com/liyu1981/zig_c_tips
General usage points
1. convert c headers to zig
TLDR; use zig translate-c hello.h, or use @cImport in zig source.
But it will output a lot, so sometimes I use prefixed names and grep to cut the output down.
For example
#include <stdio.h>
int my_create_a_hello_string(char** buf);
Running zig translate-c directly produces too many lines (most of them unnecessary), so I use zig translate-c hello.h | grep "my_" to get the following result
pub extern fn my_create_a_hello_string(buf: [*c][*c]u8) c_int;
For a complex .h file, I have written a small tool called translate_c_extract, which can accept something like the following
#include <stddef.h>
#include <stdint.h>
// translate-c provide-begin: /#define\s(?<tk>\S+)\s.+/
#define CONST_A 'a'
#define CONST_ONE 1
// translate-c provide-end: /#define\s(?<tk>\S+)\s.+/
// translate-c provide: RegexMatchResult
typedef struct {
size_t start;
size_t len;
} Loc;
// translate-c provide: get_last_error_message
void get_last_error_message(Loc* loc);
// translate-c provide: my_create_a_hello_string
int my_create_a_hello_string(char** buf);
and turn it into the following
pub extern fn get_last_error_message(loc: [*c]Loc) void;
pub extern fn my_create_a_hello_string(buf: [*c][*c]u8) c_int;
pub const CONST_A = 'a';
pub const CONST_ONE = @as(c_int, 1);
2. convert .zig to .h
TLDR; zig provides this feature, but currently it is not as smooth as I hoped.
// hello.zig
pub export fn hello(buf: [*c]u8, buf_len: usize) u8 {
const h = "hello";
const to_copy_len = @min(buf_len, h.len);
for (0..to_copy_len) |i| buf[i] = h[i];
return to_copy_len;
}
To get a hello.h, we need to run zig build-lib -femit-h hello.zig. A file hello.h will be emitted in the same dir as hello.zig. But it is also a bit messy. In particular, it will be
#include "zig.h"
zig_extern uint8_t hello(uint8_t *const a0, uintptr_t const a1);
...lots of other fns...
Only the hello line is what we need. And if we take these 2 lines to a c/cpp compiler (zig cc/clang/gcc), there will be errors around zig.h. As discussed here, there is so far no good way of getting this done. So my solution is to take out the hello line and write the following .h
// hello.h
#include <stdint.h>
#define zig_extern
zig_extern uint8_t hello(uint8_t *const a0, uintptr_t const a1);
This file then works without problems in other c/cpp compilers.
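To make the calling side concrete, here is a minimal C sketch of how a caller might use the cleaned-up declaration. The body of hello below is only a C stand-in for the Zig implementation (an assumption, so the sketch compiles without the library from zig build-lib), and copy_greeting is a hypothetical helper name. The point it illustrates: hello only copies bytes and reports how many it wrote, so the caller has to add the NUL terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same declaration as in the cleaned-up hello.h. */
#define zig_extern
zig_extern uint8_t hello(uint8_t *const a0, uintptr_t const a1);

/* Stand-in for the Zig implementation (assumption: in a real build this
 * symbol comes from the library produced by `zig build-lib`). */
uint8_t hello(uint8_t *const a0, uintptr_t const a1) {
    static const char h[] = "hello";
    uintptr_t n = a1 < 5 ? a1 : 5;
    memcpy(a0, h, n);
    return (uint8_t)n;
}

/* Hypothetical helper: fills `out` and NUL-terminates it. One byte of
 * `cap` is reserved for the terminator. Returns the string length. */
size_t copy_greeting(char *out, size_t cap) {
    assert(out != NULL && cap > 0);
    uint8_t n = hello((uint8_t *)out, cap - 1);
    out[n] = '\0';
    return n;
}
```

With a 16-byte buffer, copy_greeting leaves "hello" in it; with a 3-byte buffer the result is truncated to "he".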
With the headers and zig files generated, we may find that not everything can be mapped from zig to c or vice versa. Pay attention: what I am talking about is not whether zig can operate on the c ABI or vice versa (that is guaranteed to work by zig's design), but the syntax sugar/good parts of zig.
In the rest of this note, I will try to list them one by one.
Use case and example
pointers
Most scalar data types have their c counterparts, so just look them up in the language spec. They are simple to deal with. In the reverse direction, zig also provides common c types like c_int etc, as their size (or alignment) is platform dependent. Again, check the language spec.
One tricky thing worth discussing more is pointers. zig has the special [*c]T for c pointers. So
- c int* will be zig [*c]c_int, or c uint8_t* will be zig [*c]u8
- c char** will be zig [*c][*c]c_char, and c char*** will be zig [*c][*c][*c]c_char
- const applies, like
// c zig
// pointer to u8, pointer & value mutable
uint8_t * p1; => var p1: *u8 = undefined;
// pointer to const u8, only pointer mutable
const uint8_t * p2; => var p2: *const u8 = undefined;
// const pointer to u8, only value mutable
uint8_t * const p3; => const p3: *u8 = undefined;
// const pointer to const u8, pointer & value immutable
const uint8_t * const p4; => const p4: *const u8 = undefined;
(a wonderful example from Pointers and constness in Zig (and why it is confusing to a C programmer))
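The C column of the table can be exercised directly. The sketch below is only an illustration (constness_demo is a hypothetical name): it performs every assignment each declaration allows, and leaves the forbidden ones as comments because they would not compile.

```c
#include <assert.h>
#include <stdint.h>

/* Walks through the four const combinations and returns a + b at the end. */
uint8_t constness_demo(void) {
    uint8_t a = 1, b = 2;

    uint8_t *p1 = &a;             /* pointer and value mutable */
    *p1 = 10;                     /* ok: a is now 10 */
    p1 = &b;                      /* ok: p1 now points at b */

    const uint8_t *p2 = &a;       /* value is read-only through p2 */
    p2 = &b;                      /* ok: only the pointer changes */
    /* *p2 = 3;      would not compile */

    uint8_t *const p3 = &a;       /* pointer fixed, value mutable */
    *p3 = 20;                     /* ok: a is now 20 */
    /* p3 = &b;      would not compile */

    const uint8_t *const p4 = &b; /* nothing mutable through p4 */
    assert(*p2 == *p4);           /* both read b */

    *p1 = 30;                     /* p1 points at b, so b becomes 30 */
    return (uint8_t)(a + *p4);    /* 20 + 30 */
}
```

Note that p4 only promises not to modify b through p4 itself; b can still change through p1, which is why the final read sees 30.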
from zig, call c
simple char pointers
// ptr.c
#include <stdio.h>
void hello_c(const char* str) {
printf("%s\n", str);
}
in zig we can use the ptr of the slice
// ptr.zig
pub extern fn hello_c(str: [*c]const u8) void;
pub fn main() void {
const msg = "world";
hello_c(msg.ptr);
}
zig cc -c ptr.c
zig run ptr.zig ptr.o
then how about char** or char* msgs[]
// ptr.c
#include <stdio.h>
void hello_all_c(const char* msgs[], int howmany) {
for (int i = 0; i < howmany; i++) {
printf("%s\n", msgs[i]);
}
}
This time it takes a few more steps, as a normal zig slice does not come with a [*c]T ready to use. So we need to convert the items, again using ptr from each slice.
pub extern fn hello_all_c(msgs: [*c][*c]const u8, howmany: c_int) void;
pub fn main() void {
var msgs = [_][]const u8{ "hello", "world" };
_ = &msgs;
var msgs_for_c: [2][*c]const u8 = undefined;
msgs_for_c[0] = msgs[0].ptr;
msgs_for_c[1] = msgs[1].ptr;
hello_all_c(msgs_for_c[0..].ptr, 2);
}
zig cc -c ptr.c
zig run ptr.zig ptr.o
from c, call zig
// ptr.zig
const std = @import("std");
pub export fn hello(str: [*c]const u8, len: usize) void {
std.debug.print("{s}\n", .{str[0..len]});
}
generate ptr.h and clean it up as described above.
// ptr.h
#include <stdint.h>
#define zig_extern
zig_extern void hello(uint8_t const *const a0, uintptr_t const a1);
(notice that the zig str has the uint8_t const *const type, not char*)
then in ptr.c
#include "ptr.h"
int main() {
char* str = "world";
hello((uint8_t*)str, 5);
return 0;
}
and run it with zig cc ptr.c libptr.a && ./a.out
Notice that we cast char* to (uint8_t*) in c; without the cast there will be a warning, but it will work too.
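If the cast and the hard-coded length feel error prone, one option is to keep them in a single place. The sketch below assumes a hypothetical wrapper named hello_str. The hello body and the last_seen buffer are stand-ins so the sketch links without libptr.a, and are not part of the original code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Declaration as in the cleaned-up ptr.h. */
void hello(uint8_t const *const a0, uintptr_t const a1);

/* Stand-in for the Zig export (assumption: the real symbol comes from
 * libptr.a). It records the bytes it was given instead of printing. */
static char last_seen[64];
void hello(uint8_t const *const a0, uintptr_t const a1) {
    uintptr_t n = a1 < sizeof last_seen - 1 ? a1 : sizeof last_seen - 1;
    memcpy(last_seen, a0, n);
    last_seen[n] = '\0';
}

/* Hypothetical wrapper: measures the C string and performs the
 * char* -> uint8_t* cast in one place. */
void hello_str(const char *s) {
    assert(s != NULL);
    hello((const uint8_t *)s, strlen(s));
}
```

Callers can then write hello_str("world") without repeating the cast or counting characters by hand.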
Now let us try char* msgs[]
// ptr.zig
const std = @import("std");
pub export fn hello_all(msgs: [*c][*c]const u8, len: usize) void {
for (0..len) |i| {
const msg_ptr = msgs[i];
var j: usize = 0;
while (true) : (j += 1) {
if (msg_ptr[j] == 0) {
break;
}
}
std.debug.print("{s}\n", .{msg_ptr[0..j]});
}
}
Notice that this time the zig implementation is more complex, as a c pointer does not carry the len information (and we can not use a zig slice in an export fn), so we need to manually find each msg's len by looking for the 0 sentinel. After that we create a slice from the c pointer and feed it to print.
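For readers coming from the c side, the inner while loop is the same scan that strlen performs. A minimal C sketch of it, with hypothetical function names sentinel_len and total_len:

```c
#include <assert.h>
#include <stddef.h>

/* Counts the bytes before the terminating 0, like the inner while loop in
 * hello_all (and like strlen from <string.h>). */
size_t sentinel_len(const char *s) {
    assert(s != NULL);
    size_t j = 0;
    while (s[j] != 0) j++;
    return j;
}

/* Sums the lengths of `count` NUL-terminated strings. The count has to be
 * passed in, because the outer array carries no length either. */
size_t total_len(const char *const *msgs, size_t count) {
    size_t total = 0;
    for (size_t i = 0; i < count; i++) total += sentinel_len(msgs[i]);
    return total;
}
```

This also shows why hello_all needs its len parameter: the sentinel only marks the end of each string, not the end of the array of strings.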
The generated and cleaned ptr.h is as follows
// ptr.h
#include <stdint.h>
#define zig_extern
zig_extern void hello(uint8_t const *const a0, uintptr_t const a1);
zig_extern void hello_all(uint8_t const **const a0, uintptr_t const a1);
and finally ptr.c
// ptr.c
#include "ptr.h"
int main() {
char* msgs[] = {"hello", "world"};
hello_all((const uint8_t**)msgs, 2);
return 0;
}
we still need the cast in c, as char is not uint8_t.
allocator
It can not be used in an exported zig fn, as
hello.zig:10:21: error: parameter of type 'mem.Allocator' not allowed in function with calling convention 'C'
pub export fn test1(allocator: std.mem.Allocator) void {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
hello.zig:10:21: note: only extern structs and ABI sized packed structs are extern compatible
because an allocator is more than just a function. I personally find that reading the std.heap.ArenaAllocator source is an extra good way of understanding what an allocator is. Its source is concise and short, so it is easy to digest at a high level what an allocator is doing.
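One possible workaround, sketched here purely as an assumption and not something the original code does, is to describe the allocator on the c side as a struct with a fixed layout: a context pointer plus function pointers. All names below (c_allocator, counting_ctx, make_counting_allocator) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical C-side allocator handle: an untyped context pointer plus
 * function pointers, i.e. a plain struct both sides can agree on. */
typedef struct {
    void *ctx;
    void *(*alloc_fn)(void *ctx, size_t len);
    void (*free_fn)(void *ctx, void *ptr);
} c_allocator;

/* Example context: counts live allocations on top of malloc/free. */
typedef struct {
    size_t live;
} counting_ctx;

static void *counting_alloc(void *ctx, size_t len) {
    void *p = malloc(len);
    if (p != NULL) ((counting_ctx *)ctx)->live++;
    return p;
}

static void counting_free(void *ctx, void *ptr) {
    if (ptr == NULL) return;
    ((counting_ctx *)ctx)->live--;
    free(ptr);
}

/* Bundles the context and the two functions into one handle. */
c_allocator make_counting_allocator(counting_ctx *ctx) {
    assert(ctx != NULL);
    c_allocator a = { ctx, counting_alloc, counting_free };
    return a;
}
```

A struct like this contains only pointers, so it could be mirrored as an extern struct on the zig side and passed across the boundary, which std.mem.Allocator itself can not.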
opaque, void* and *opaque
opaque structs and void* are very common in important and mature c libs. It is a widely used technique in c to hide the internal implementation. For example, if you ever look into using SQLite with its c API, you will find something like the following
// from https://sqlite.org/c3ref/prepare.html
int sqlite3_prepare(
sqlite3 *db, /* Database handle */
const char *zSql, /* SQL statement, UTF-8 encoded */
int nByte, /* Maximum length of zSql in bytes. */
sqlite3_stmt **ppStmt, /* OUT: Statement handle */
const char **pzTail /* OUT: Pointer to unused portion of zSql */
);
but if we try to locate the sqlite3 type in sqlite3.h, this is what we will find
// https://github.com/GaloisInc/sqlite/blob/master/sqlite3.5/sqlite3.h#L169
typedef struct sqlite3 sqlite3;
and nowhere in sqlite3.h will we find how struct sqlite3 is defined. c allows this kind of declaration, and it is called opaque. (the real struct sqlite3 is defined here, which is only available in the full source code).
void* is usually used in c libs for a handle: some resource that could later turn out to be one of several types. So the user can provide a simple pointer, which is a void*, and let the lib deal with it. For example, in the PCRE2 lib's pcre2.h we can find something like below
// https://github.com/PCRE2Project/pcre2/blob/master/src/pcre2.h.in#L576
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION pcre2_config(uint32_t, void *);
and if we read its doc, the 2nd param is where, the location for the requested config, which could then be one of many different data types.
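The pattern can be sketched in a few lines of C. The function below is only loosely modeled on that signature: my_config, its option codes and its return values are all hypothetical, and they do not reproduce the real pcre2_config behavior.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical option codes selecting what the caller asks for. */
enum { MY_CONFIG_VERSION = 0, MY_CONFIG_MAX_DEPTH = 1 };

/* `where` must point at storage matching `what`: a char[16] for
 * MY_CONFIG_VERSION, a uint32_t for MY_CONFIG_MAX_DEPTH.
 * Returns 0 on success, -1 for an unknown code or a NULL pointer. */
int my_config(uint32_t what, void *where) {
    if (where == NULL) return -1;
    switch (what) {
    case MY_CONFIG_VERSION:
        strcpy((char *)where, "1.0");
        return 0;
    case MY_CONFIG_MAX_DEPTH:
        *(uint32_t *)where = 250;
        return 0;
    default:
        return -1;
    }
}

/* Typed convenience wrapper around the untyped interface. */
uint32_t query_max_depth(void) {
    uint32_t depth = 0;
    int rc = my_config(MY_CONFIG_MAX_DEPTH, &depth);
    assert(rc == 0);
    (void)rc;
    return depth;
}
```

The compiler can not check that where matches what, which is exactly the flexibility (and the risk) of a void* parameter.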
call opaque and *opaque from zig
With the knowledge gained above, this part should not be hard
// opaque.h
typedef struct Op* Op_t;
Op_t new_op(const char* name);
Op_t new_op_all(const char** names, int len);
void free_op(Op_t op);
void hello(Op_t op);
void hello_all(Op_t op);
// opaque.c
#include "opaque.h"
#include <stdio.h>
#include <stdlib.h>
typedef struct Op {
const char *name;
const char **names;
int howmany;
} *Op_t;
Op_t new_op(const char *name) {
Op_t op = malloc(sizeof(struct Op));
if (op != NULL) {
op->name = name;
}
return op;
}
Op_t new_op_all(const char **names, int howmany) {
Op_t op = malloc(sizeof(struct Op));
if (op != NULL) {
op->names = names;
op->howmany = howmany;
}
return op;
}
void free_op(Op_t op) {
free(op);
}
void hello(Op_t op) {
printf("%s\n", op->name);
}
void hello_all(Op_t op) {
for (int i = 0; i < op->howmany; i++) {
printf("%s\n", op->names[i]);
}
}
Notice that our c code provides the new_op* functions, which is what a c lib will usually provide for creating opaque structs.
Translating with zig translate-c and cleaning up, we will have
pub const struct_Op = opaque {};
pub const Op_t = ?*struct_Op;
pub extern fn new_op(name: [*c]const u8) Op_t;
pub extern fn new_op_all(names: [*c][*c]const u8, howmany: c_int) Op_t;
pub extern fn free_op(op: Op_t) void;
pub extern fn hello(op: Op_t) void;
pub extern fn hello_all(op: Op_t) void;
and then we can easily write some code to call our c functions
const std = @import("std");
pub const struct_Op = opaque {};
pub const Op_t = ?*struct_Op;
pub extern fn new_op(name: [*c]const u8) Op_t;
pub extern fn new_op_all(names: [*c][*c]const u8, howmany: c_int) Op_t;
pub extern fn free_op(op: Op_t) void;
pub extern fn hello(op: Op_t) void;
pub extern fn hello_all(op: Op_t) void;
pub fn main() !void {
{
const maybe_op: Op_t = new_op("world");
if (maybe_op) |op| {
hello(op);
free_op(op);
}
}
{
const names = [_][]const u8{ "hello", "world" };
var names_for_c: [2][*c]const u8 = undefined;
names_for_c[0] = names[0].ptr;
names_for_c[1] = names[1].ptr;
const maybe_op: Op_t = new_op_all(names_for_c[0..].ptr, 2);
if (maybe_op) |op| {
hello_all(op);
free_op(op);
}
}
}
zig cc -c opaque.c, then zig run opaque.zig opaque.o, should work.
But maybe we want to do something hacky, like modifying or creating the opaque struct from zig, outside the c lib? Can we just manually redefine the struct in zig so that we can access its fields? It sounds possible, but on the other hand, the zig and c compilers have different opinions on how to arrange the memory layout of a plain struct's fields, so this may fail. The zig doc describes extern struct, which follows the c ABI layout, but in my attempt so far it is not working yet.
const std = @import("std");
//pub const struct_Op = opaque {};
pub const struct_Op = extern struct {
name: [*c]u8,
names: [*c][*c]u8,
howmany: c_int,
};
pub const Op_t = ?*struct_Op;
pub extern fn new_op(name: [*c]const u8) Op_t;
pub extern fn new_op_all(names: [*c][*c]const u8, howmany: c_int) Op_t;
pub extern fn free_op(op: Op_t) void;
pub extern fn hello(op: Op_t) void;
pub extern fn hello_all(op: Op_t) void;
pub fn main() !void {
{
const names = [_][]const u8{ "hello", "world" };
var zig: [3:0]u8 = undefined;
zig[0] = 'z';
zig[1] = 'i';
zig[2] = 'g';
zig[3] = 0;
var names_for_c: [2][*c]const u8 = undefined;
names_for_c[0] = names[0].ptr;
names_for_c[1] = names[1].ptr;
var maybe_op = new_op_all(names_for_c[0..].ptr, 2);
_ = &maybe_op;
if (maybe_op != null) {
std.debug.print("{any}\n", .{maybe_op.?.names[2]});
var zig_s = zig[0..3];
_ = &zig_s;
std.debug.print("{any}\n", .{zig_s});
maybe_op.?.names[2] = zig_s.ptr;
hello_all(maybe_op.?);
//free_op(maybe_op.?);
}
}
}
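The code above is what I have tried, but every time it causes a SIGTRAP. As I investigated further, this is because the modification of names[2] ruins memory: names points at an array of only two items, so index 2 is out of bounds and the write can damage whatever sits next to it.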
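Two checks on the c side can help when experimenting like this. The sketch below is an illustration under my own assumptions, with hypothetical helper names op_layout_matches and op_set_name: the first confirms where the fields of struct Op sit, the second refuses a store at or past howmany instead of writing out of bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Same field order as struct Op in opaque.c. */
struct Op {
    const char *name;
    const char **names;
    int howmany;
};

/* Returns 1 when the fields sit where a mirror with two pointers followed
 * by an int expects them, 0 otherwise. */
int op_layout_matches(void) {
    return offsetof(struct Op, name) == 0
        && offsetof(struct Op, names) == sizeof(char *)
        && offsetof(struct Op, howmany) == 2 * sizeof(char *)
        && sizeof(struct Op) >= 2 * sizeof(char *) + sizeof(int);
}

/* Bounds-checked store into the names array. Returns 0 on success and -1
 * when the index is outside [0, howmany). */
int op_set_name(struct Op *op, int index, const char *value) {
    assert(op != NULL);
    if (index < 0 || index >= op->howmany) return -1;
    op->names[index] = value;
    return 0;
}
```

With a two-item names array, op_set_name(op, 2, "zig") is rejected, while index 0 or 1 is accepted.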
call opaque and *opaque from c
This does not make much sense, as opaque is specifically designed in zig for c libs using this technique. For zig itself, there seems to be no need for this technique, as zig has the pub keyword to control which code is visible to the outside.
call void* from zig
Quite similar to opaque. Just watch the output of zig translate-c, which uses *anyopaque for void*. An example is as follows
// voidstar.h
void* set(const char* name);
void* set_all(const char** names, int howmany);
void hello(void* h);
void hello_all(void* h);
// voidstar.c
#include <stdio.h>
#include <stdlib.h>
const char* name_info;
void* set(const char* name) {
name_info = name;
return (void*)name_info;
}
struct names_info_t {
const char** names;
int howmany;
} names_info;
void* set_all(const char** names, int howmany) {
names_info.names = names;
names_info.howmany = howmany;
return (void*)&names_info;
}
void hello(void* h) {
printf("%s\n", (char*)h);
}
void hello_all(void* h) {
struct names_info_t* ni = (struct names_info_t*)h;
for (int i = 0; i < ni->howmany; i++) {
printf("%s\n", ni->names[i]);
}
}
// voidstar_z.zig
const std = @import("std");
pub extern fn set(name: [*c]const u8) ?*anyopaque;
pub extern fn set_all(names: [*c][*c]const u8, howmany: c_int) ?*anyopaque;
pub extern fn hello(h: ?*anyopaque) void;
pub extern fn hello_all(h: ?*anyopaque) void;
pub fn main() !void {
{
var h = set("hello");
_ = &h;
hello(h);
}
{
const names = [_][]const u8{ "hello", "world" };
var names_for_c: [2][*c]const u8 = undefined;
names_for_c[0] = names[0].ptr;
names_for_c[1] = names[1].ptr;
var h = set_all(names_for_c[0..].ptr, 2);
_ = &h;
hello_all(h);
}
}