Archive

Performance

ZML is a production inference stack, purpose-built to decouple AI workloads from proprietary hardware.

01

engineering

10x faster tokenization

Integrating with the IREE tokenizer for a 10x uplift in tokenization performance.

Apr 7, 2026 monitoring / iree

02

engineering

Introducing zml-smi

A universal diagnostic and monitoring tool for GPUs, TPUs and NPUs.

Mar 30, 2026 monitoring / diagnostics

03

engineering

Introducing ZML/v2

Frontier performance through composability.

Mar 24, 2026 compilers / inference