Scaling OBJ Parsing in Zig: Streaming, Allocators, and Web Integration
Context: Personal Zig study (`obj-parser` repo). Built to understand memory management outside JavaScript; not shipped in any production renderer.
AI assist: ChatGPT/Copilot provided starter snippets (allocators, WGSL integration). I audited every line before committing.
Status: Benchmarks come from my M2 laptop and sample OBJ files. Treat them as directional, not formal performance claims.
Reality snapshot
- Goal: parse multi-megabyte OBJ files without loading them entirely into memory, then stream the vertices directly into a browser renderer.
- Approach: streaming I/O + allocator mix (Arena, GeneralPurposeAllocator, FixedBufferAllocator) + binary export for PixiJS.
- Limitations: Handles ~50 MB files comfortably. Anything bigger exposes TODOs (MTL parsing, normals, materials).
Architecture
```
obj-parser/
├── src/
│   ├── main.zig
│   ├── parser.zig
│   └── exporter.zig
├── tests/
│   └── parser_test.zig
├── examples/
│   ├── cube.obj
│   └── teapot.obj
└── web/
    └── preview.js
```
- `parser.zig` – streaming reader + line processor.
- `exporter.zig` – writes a compact binary format (vertex count + raw floats) so the web client can map it into typed arrays.
- `web/preview.js` – PixiJS viewer to prove the data actually renders.
Streaming & allocators
```zig
pub fn parse(allocator: std.mem.Allocator, reader: anytype) !Model {
    var buffered = std.io.bufferedReader(reader);
    var stream = buffered.reader();
    var model = Model.init(allocator);
    var line: [1024]u8 = undefined;
    while (try stream.readUntilDelimiterOrEof(&line, '\n')) |slice| {
        try model.processLine(slice);
    }
    return model;
}
```
- Reads one line at a time, reducing peak memory ~90% versus slurping the whole file.
- `errdefer` ensures partially built models release memory when parsing fails.
- Allocator mix: an arena handles temporary buffers, the GeneralPurposeAllocator owns the final vertex/face arrays, and a FixedBufferAllocator backs tokenization in hot loops.
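The same bounded, line-at-a-time idea translates to the web side too. Below is a hypothetical JavaScript sketch (not from the repo) of a chunk-fed line splitter with the same 1024-byte clamp as the Zig buffer, erroring out instead of growing unbounded:

```javascript
// Hypothetical sketch: feed arbitrary chunks in, emit complete lines,
// and reject any line longer than the clamp (mirroring parser.zig's
// fixed 1024-byte line buffer).
function makeLineStream(maxLineLength, onLine) {
  let pending = "";
  return {
    push(chunk) {
      pending += chunk;
      let idx;
      while ((idx = pending.indexOf("\n")) !== -1) {
        const line = pending.slice(0, idx);
        pending = pending.slice(idx + 1);
        if (line.length > maxLineLength) throw new Error("line exceeds limit");
        onLine(line);
      }
      if (pending.length > maxLineLength) throw new Error("line exceeds limit");
    },
    end() {
      if (pending.length > 0) onLine(pending); // final unterminated line
    },
  };
}

const lines = [];
const stream = makeLineStream(1024, (l) => lines.push(l));
stream.push("v 1.0 2.0 ");            // chunk boundaries can split a line
stream.push("3.0\nv 4.0 5.0 6.0\n");
stream.end();
// lines: ["v 1.0 2.0 3.0", "v 4.0 5.0 6.0"]
```

The clamp-and-error behavior matches the "surface useful errors when models exceed limits" policy rather than silently truncating.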
Binary hand-off to the browser
```zig
pub fn exportBinary(model: Model, writer: anytype) !void {
    // vertex count as a little-endian u32 (len is usize, so cast explicitly)
    try writer.writeInt(u32, @intCast(model.vertices.len), .Little);
    for (model.vertices) |v| {
        try writer.writeAll(std.mem.asBytes(&v));
    }
}
```
```js
const response = await fetch("model.bin");
const buffer = await response.arrayBuffer();
const count = new DataView(buffer).getUint32(0, true);
const vertices = new Float32Array(buffer, 4, count * 3);
```
- Zero-copy: the browser maps the ArrayBuffer directly. No JSON, no duplicate arrays.
- Endianness is explicit (`.Little` on the Zig side, `true` on the `DataView` side), so the JS client knows exactly what to expect.
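The whole format can be exercised without the Zig side at all. This is a hypothetical round-trip sketch (the `encodeModel` helper is mine, not from the repo) that builds the same layout in JavaScript, a little-endian `u32` count followed by raw `f32` triples, and decodes it the way `preview.js` does:

```javascript
// Hypothetical encoder matching exporter.zig's layout:
// [u32 vertex count, little-endian][count * 3 raw f32 values]
function encodeModel(flatCoords) {
  const count = flatCoords.length / 3;
  const buffer = new ArrayBuffer(4 + flatCoords.length * 4);
  new DataView(buffer).setUint32(0, count, true); // little-endian count
  new Float32Array(buffer, 4).set(flatCoords);    // raw floats, no copy step
  return buffer;
}

const buffer = encodeModel([1, 2, 3, 4, 5, 6]); // two vertices
const count = new DataView(buffer).getUint32(0, true);
const vertices = new Float32Array(buffer, 4, count * 3);
// count: 2; vertices: Float32Array [1, 2, 3, 4, 5, 6]
```

Note that the `Float32Array` view works here only because the 4-byte count keeps the float data 4-byte aligned; a header that wasn't a multiple of 4 bytes would force a copy or a `DataView` loop.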
Benchmarks (anecdotal)
- 50 MB OBJ → ~150 ms parse on an M2 Max, ~5 MB peak memory.
- JavaScript baseline (old parser) → ~800 ms + 90 MB peak memory.
`zig test` plus fuzz cases cover malformed vertices/faces; coverage reports hover around 90%. Failures trigger bug tickets before I merge.
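For a feel of what "malformed vertices" means in practice, here is a hypothetical JavaScript validator (the `parseVertexLine` helper is illustrative, not the repo's code) covering the kinds of inputs the fuzz cases target:

```javascript
// Hypothetical sketch: a vertex line must be "v" followed by exactly three
// finite floats. (Simplified: the OBJ spec also allows an optional w
// component, which this check deliberately ignores.)
function parseVertexLine(line) {
  const parts = line.trim().split(/\s+/);
  if (parts.length !== 4 || parts[0] !== "v") return null;
  const coords = parts.slice(1).map(Number);
  return coords.every(Number.isFinite) ? coords : null;
}

// parseVertexLine("v 1.0 2.0 3.0")  → [1, 2, 3]
// parseVertexLine("v 1.0 nope 3.0") → null (non-numeric coordinate)
// parseVertexLine("v 1.0 2.0")      → null (missing coordinate)
```

Returning `null` (or an error value in Zig) instead of throwing keeps a single bad line from aborting a multi-megabyte parse.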
Lessons learned
- Allocator choice is architecture. Arenas speed up temp allocations; GPA’s leak detection saved me countless times.
- Streaming keeps memory predictable but forces careful buffer sizing. I clamp line length and surface useful errors when models exceed limits.
- Interop demands rigor: typed arrays, endianness, and struct packing must line up or PixiJS draws garbage.
- Zig isn’t “faster Rust”; it just gives me explicit control, which fit this problem well.
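The "PixiJS draws garbage" failure mode from the interop lesson is easy to reproduce in isolation. A minimal sketch: write a float little-endian, then read the same four bytes back with the wrong endianness:

```javascript
// 1.0 as f32 is 0x3F800000; little-endian on disk that's bytes 00 00 80 3F.
// Read big-endian, those bytes are 0x0000803F — a subnormal near 4.6e-41.
const buf = new ArrayBuffer(4);
const view = new DataView(buf);
view.setFloat32(0, 1.0, true);           // write little-endian
const right = view.getFloat32(0, true);  // read little-endian → 1.0
const wrong = view.getFloat32(0, false); // read big-endian → garbage
// right === 1.0, wrong is a tiny subnormal, not 1.0
```

Every vertex coordinate silently corrupted this way still "renders", just as noise, which is why the format makes endianness explicit on both sides.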
TODOs
- Parse MTL files + textures. Right now it’s vertices/faces only.
- Compile to WebAssembly so the parsing can happen in-browser (bye-bye network trip).
- Better logging and benchmark automation (currently rely on scripts + manual recording).
- Document a step-by-step tutorial so other Zig learners can follow along.
Links
- Repo: https://github.com/BradleyMatera/obj-parser
- Prompt log: `notes/ai-prompts.md`
- PixiJS demo: `web/preview.js` (GitHub Pages build coming soon)