Components

Pie has three parts: the engine that runs inferlets, the SDKs for writing them and talking to a running engine, and Bakery, the toolchain for building and publishing them.

Engine

The engine is the "operating system" for inferlets, made of three pieces:

  • Runtime (runtime/). The Rust core. Hosts the Wasmtime-based WebAssembly runtime, the inferlet scheduler, KV cache mapping, contention management, and the WebSocket protocol clients connect to.
  • Driver (driver/). The hardware layer. Pie ships a CUDA driver and a local driver for CPU and Apple Silicon. Support for vLLM and SGLang as drivers is experimental.
  • Server. Glues the runtime and driver together, and provides a CLI to configure and start them.
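
The layering described above can be pictured with a small toy model. This is only an illustrative sketch and not Pie's actual code or API: every name in it (Driver, LocalDriver, Runtime, Server, echo_inferlet) is hypothetical, the "driver" does trivial arithmetic instead of running a model, and a plain Python function stands in for a WebAssembly inferlet.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


class Driver(Protocol):
    """Hypothetical hardware-layer interface (not Pie's real driver API)."""

    name: str

    def forward(self, tokens: list[int]) -> int: ...


@dataclass
class LocalDriver:
    """Toy stand-in for a CPU driver: 'predicts' the token sum modulo 100."""

    name: str = "local"

    def forward(self, tokens: list[int]) -> int:
        return sum(tokens) % 100


@dataclass
class Runtime:
    """Toy runtime: queues inferlets and runs each one against the driver."""

    driver: Driver
    queue: list[Callable[[Driver], int]] = field(default_factory=list)

    def submit(self, inferlet: Callable[[Driver], int]) -> None:
        self.queue.append(inferlet)

    def run_all(self) -> list[int]:
        results = [inferlet(self.driver) for inferlet in self.queue]
        self.queue.clear()
        return results


@dataclass
class Server:
    """Toy server: picks a driver by name and wires it to a runtime."""

    runtime: Runtime

    @classmethod
    def start(cls, driver_name: str) -> "Server":
        drivers = {"local": LocalDriver}  # assumed registry, for illustration
        return cls(Runtime(driver=drivers[driver_name]()))


def echo_inferlet(driver: Driver) -> int:
    """A stand-in 'inferlet' that makes one call into the driver."""
    return driver.forward([1, 2, 3])


if __name__ == "__main__":
    server = Server.start("local")
    server.runtime.submit(echo_inferlet)
    print(server.runtime.run_all())  # [6]
```

The point of the split is that the runtime only sees the driver interface, so swapping the hardware backend does not change how inferlets are scheduled.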

SDKs

Each SDK is available in Rust, Python, and JavaScript / TypeScript.

  • Inferlet SDK (sdk/). A library for writing inferlets. Comes with helpful utilities and macros to reduce boilerplate.
  • Client SDK (client/). A programmatic client for controlling a running Pie engine, for example to install, run, and manage inferlets.

Bakery

bakery (in sdk/tools/bakery/) is the inferlet toolchain. It provides CLI commands to scaffold and build inferlets, plus an experimental package manager for sharing them.