Discussion on: Cool Zig Patterns - Gotta alloc fast

ikrima profile image

if you have fixed upper bound + 64bit address space, another common approach is to vmem reserve the address space upfront + commit/decommit on-demand

  • the freelist can be made more cache friendly with array instead of linked list
  • depending on frequency of alloc vs. dealloc vs. size, can micro-optimize it to a bitvector

of course some subtle, confusing and non-deterministic complications (the best kind, right?):

  • (on windows at least) even if you reserve + commit upfront, the OS always page maps on-demand on page first touch. A naive implementation may inexplicable performance stalls (e.g. depends on # of process standby page pool, # of zeroed page pool, # of demand-zero page faults generated, etc). Bruce Dawson has excellent write-ups as usual
  • if used liberally, you may also need to misalign or randomize your vmem base address so it's not page aligned; otherwise, you might cripple your cache. @danluu wrote a more detailed explanation
  • there are more nuances implementation modulo multi-threading/contention/segregating the freelist metadata into hot/cold; but at that point, you're writing a full blown allocator so you might as well just grab someone's slab allocator implementation :P