<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>ZML - Model to Metal</title><link>https://zml.ai/</link><description>Recent content on ZML - Model to Metal</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Mon, 30 Mar 2026 16:00:00 +0100</lastBuildDate><atom:link href="https://zml.ai/index.xml" rel="self" type="application/rss+xml"/><item><title>Introducing zml-smi</title><link>https://zml.ai/posts/zml-smi/</link><pubDate>Mon, 30 Mar 2026 16:00:00 +0100</pubDate><guid>https://zml.ai/posts/zml-smi/</guid><description>&lt;p&gt;&lt;code&gt;zml-smi&lt;/code&gt; is a universal diagnostic and monitoring tool for GPUs, TPUs and NPUs.
It provides real-time insights into the performance and health of your hardware.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://zml.ai/img/posts/zml-smi/1.png" alt="zml-smi terminal output"&gt;&lt;/p&gt;
&lt;p&gt;It is a cross between &lt;code&gt;nvidia-smi&lt;/code&gt; and &lt;code&gt;nvtop&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;It transparently supports every platform ZML supports: NVIDIA and AMD GPUs, Google TPUs,
and AWS Trainium devices. More platforms will be added as ZML
continues to expand its hardware support.&lt;/p&gt;</description></item><item><title>Introducing ZML/v2</title><link>https://zml.ai/posts/zml-v2/</link><pubDate>Tue, 24 Mar 2026 16:00:00 +0100</pubDate><guid>https://zml.ai/posts/zml-v2/</guid><description>&lt;p&gt;ZML is an inference stack built close to the hardware. It lowers models directly onto NVIDIA, AMD, TPU, and Trainium
targets from a single codebase, without depending on the Python-heavy runtime layers that most of the
ecosystem is built around.&lt;/p&gt;
&lt;p&gt;The guiding idea behind zml/v1 was simplicity: give ZML a model and its weights, and the system would take care of
compilation, placement, and execution for you. That made the first version approachable and effective for standard
deployments, but it also baked too much behavior into implicit global state. As the project pushed into partial
compilation, custom passes, sharding, quantization, and more backend-specific execution paths, those implicit shortcuts
became constraints. ZML/v2 is the rewrite that makes those concepts explicit: platform ownership, compilation, memory,
IO, and placement are now first-class, so advanced use cases can be expressed directly instead of forced through
workarounds.&lt;/p&gt;</description></item><item><title>About</title><link>https://zml.ai/about/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://zml.ai/about/</guid><description>&lt;p&gt;The ZML Blog is a technical publication about running modern AI systems in production.&lt;/p&gt;
&lt;p&gt;We write about:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;inference systems&lt;/li&gt;
&lt;li&gt;compiler architecture&lt;/li&gt;
&lt;li&gt;hardware portability&lt;/li&gt;
&lt;li&gt;deployment ergonomics&lt;/li&gt;
&lt;li&gt;observability and operating discipline&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The editorial bias is simple: practical speed, maintainable systems, and fewer hidden compromises.&lt;/p&gt;</description></item></channel></rss>