Quickstart

Get from zero to generating text in under 5 minutes.

Prerequisites

Make sure you have:

  • Pie installed (Installation Guide)
  • A model downloaded (pie model download meta-llama/Llama-3.2-1B-Instruct)
  • Configuration initialized (pie config init)
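
To confirm all three at once, you can run the commands below. The --version check is an assumption based on common CLI conventions; the other two commands come straight from the list above.

pie --version
pie model download meta-llama/Llama-3.2-1B-Instruct
pie config init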

Run Your First Prompt

The simplest way to use Pie is with the run command:

pie run text-completion -- --prompt "The capital of France is"

You should see output like:

╭─ Pie Run ─────────────────────────────────────╮
│ Inferlet  text-completion                     │
│ Model     meta-llama/Llama-3.2-1B-Instruct    │
│ Device    cuda:0                              │
╰───────────────────────────────────────────────╯

The capital of France is Paris, which is located in...

Understanding the Command

Let's break down what happened:

pie run text-completion -- --prompt "The capital of France is"
Part               Meaning
pie run            Run an inferlet one-shot (engine starts, runs, shuts down)
text-completion    The inferlet to run (from the standard registry)
--                 Separator between Pie args and inferlet args
--prompt "..."     Argument passed to the inferlet
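
Everything after the -- is forwarded to the inferlet untouched, so you can pass it more than one argument. A hypothetical example (the --max-tokens flag is an assumption here; check your inferlet's own help for the options it actually accepts):

pie run text-completion -- --prompt "Once upon a time" --max-tokens 64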

Configuration

Your configuration lives at ~/.pie/config.toml. View it:

pie config show

Key settings:

[engine]
host = "127.0.0.1" # Server bind address
port = 8080 # Server port
enable_auth = true # Require authentication

[[model]]
hf_repo = "meta-llama/Llama-3.2-1B-Instruct"
device = ["cuda:0"] # GPU(s) to use
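
Since [[model]] uses TOML's array-of-tables syntax, a config file can hold more than one model entry. A sketch, assuming extra entries follow the same shape (this guide doesn't confirm how Pie selects among multiple entries):

[[model]]
hf_repo = "meta-llama/Llama-3.2-1B-Instruct"
device = ["cuda:0"]

[[model]]
hf_repo = "Qwen/Qwen2.5-3B-Instruct"
device = ["cuda:0"]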

Change the Model

Update your config to use a different model:

# Download a new model
pie model download Qwen/Qwen2.5-3B-Instruct

# Update config
pie config update --hf-repo "Qwen/Qwen2.5-3B-Instruct"
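
You can verify the change with the same command from the Configuration section:

pie config show

Subsequent pie run invocations should then use the new model.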

Run with a Local Inferlet

If you have a custom .wasm inferlet:

pie run --path ./my_inferlet.wasm -- --arg1 value1
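
The -- separator behaves the same as with registry inferlets, so a local build that accepts a prompt could be invoked like this (assuming your inferlet defines a --prompt flag, as text-completion does):

pie run --path ./my_inferlet.wasm -- --prompt "The capital of France is"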

Next Steps