New way to split and iterate over strings

#std

std.mem.window has been merged!

std.mem.split is very useful when there is a known delimiter, but there is no easy way to split a buffer every N items.

Manually implementing this for every 3 items looks like:

const buffer = "abcdefg";
const size = 3;
var i: usize = 0;
// Loop on the index, not the size, so iteration stops at the end of the buffer.
while (i < buffer.len) : (i += size) {
  const end = @min(i + size, buffer.len);
  const slice = buffer[i..end];
  _ = slice; // slice is "abc", "def", "g"
}

std.mem.window simplifies that to:

const buffer = "abcdefg";
var it = std.mem.window(u8, buffer, 3, 3);
while (it.next()) |slice| {
  _ = slice; // slice is "abc", "def", "g"
}

But there's more! It isn't named splitEvery because std.mem.window is more general: it takes both a size and an advance parameter. When the two are equal, it behaves like a splitEvery would.

By choosing an advance smaller than size we get a sliding window:

const buffer = "abcdefg";
var it = std.mem.window(u8, buffer, 3, 1);
while (it.next()) |slice| {
  _ = slice; // slice is "abc", "bcd", "cde", "def", "efg"
}

Going the other way, an advance larger than size skips items, so we can pick out every Nth element. For example, if we only want the items with an even index:

const buffer = "abcdefg";
var it = std.mem.window(u8, buffer, 1, 2);
while (it.next()) |slice| {
  _ = slice; // slice is "a", "c", "e", "g"
}
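The three modes above can be checked together in one small test file. This is a minimal sketch, assuming a Zig toolchain whose standard library already ships std.mem.window; the helper name expectWindows is made up for this example and is not part of std. The expected slices are the ones listed in the snippets above.

```zig
const std = @import("std");

// Hypothetical helper (not part of std): walks a window iterator and
// compares each slice it yields against the expected list.
fn expectWindows(
    buffer: []const u8,
    size: usize,
    advance: usize,
    expected: []const []const u8,
) !void {
    var it = std.mem.window(u8, buffer, size, advance);
    for (expected) |want| {
        const got = it.next() orelse return error.TestUnexpectedResult;
        try std.testing.expectEqualStrings(want, got);
    }
    // The iterator must be exhausted after the expected slices.
    try std.testing.expect(it.next() == null);
}

test "advance == size splits every N items" {
    try expectWindows("abcdefg", 3, 3, &[_][]const u8{ "abc", "def", "g" });
}

test "advance < size gives a sliding window" {
    try expectWindows("abcdefg", 3, 1, &[_][]const u8{ "abc", "bcd", "cde", "def", "efg" });
}

test "advance > size picks every Nth item" {
    try expectWindows("abcdefg", 1, 2, &[_][]const u8{ "a", "c", "e", "g" });
}
```

Saved as, say, window_test.zig (the file name is arbitrary), it can be run with zig test window_test.zig.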

Top comments (4)

Jean-Pierre • Dec 17 '22 • Edited

On Linux with Zig 0.10.0 I get "error: root struct of file 'mem' has no member named 'window'" for this line:

var it = std.mem.window(u8, buffer, 1, 1);

With this test:

const buffer = "àéç";
var it = std.mem.window(u8, buffer, 1, 1);
while (it.next()) |slice| {
  std.debug.print("value:{any}", .{slice});
}

It only works with the master version. On the other hand, it does not support UTF-8, only 7-bit ASCII (128 characters). Too bad, because we are not far from Rune in nim-lang.

Lisael • Dec 21 '22

It's a low-level memory operation: just a sliding window along an array in memory. The examples present it as a string tool (because []const u8 values are the easiest arrays to create in a small snippet), but it really isn't one. Low-level memory operations are useless when dealing with real-world strings (except as building blocks for higher-level operations).

What you want is github.com/JakubSzark/zig-string, which does exactly that.

Jean-Pierre • Dec 22 '22

zig-string isn't bad, but it's missing some stuff.

Jean-Pierre • Dec 16 '22 • Edited

Hello, does it work with UTF-8? For example: éçà...
