Quickstart

Get from zero to generating text in under 5 minutes.

Prerequisites

Make sure you have:

  • Pie installed (Installation Guide)
  • A model downloaded (pie model download meta-llama/Llama-3.2-1B-Instruct)
  • Configuration initialized (pie config init)
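
To confirm all three at once, you can run the commands below. The --version check is an assumption based on common CLI conventions; the other two commands come straight from the list above.

pie --version
pie model download meta-llama/Llama-3.2-1B-Instruct
pie config init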

Run Your First Prompt

The simplest way to use Pie is with the run command:

pie run text-completion -- --prompt "The capital of France is"

You should see output like:

╭─ Pie Run ─────────────────────────────────────╮
│ Inferlet  text-completion                     │
│ Model     meta-llama/Llama-3.2-1B-Instruct    │
│ Device    cuda:0                              │
╰───────────────────────────────────────────────╯

The capital of France is Paris, which is located in...

Understanding the Command

Let's break down what happened:

pie run text-completion -- --prompt "The capital of France is"
Part               Meaning
pie run            Run an inferlet one-shot (engine starts, runs, shuts down)
text-completion    The inferlet to run (from the standard registry)
--                 Separator between Pie args and inferlet args
--prompt "..."     Argument passed to the inferlet
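
Everything after the -- is forwarded to the inferlet untouched, so you can pass it more than one argument. A hypothetical example (the --max-tokens flag is an assumption here; check your inferlet's own help for the options it actually accepts):

pie run text-completion -- --prompt "Once upon a time" --max-tokens 64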

Configuration

Your configuration lives at ~/.pie/config.toml. View it:

pie config show

Key settings:

[engine]
host = "127.0.0.1" # Server bind address
port = 8080 # Server port
enable_auth = true # Require authentication

[[model]]
hf_repo = "meta-llama/Llama-3.2-1B-Instruct"
device = ["cuda:0"] # GPU(s) to use
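
Since [[model]] uses TOML's array-of-tables syntax, a config file can hold more than one model entry. A sketch, assuming extra entries follow the same shape (this guide doesn't confirm how Pie selects among multiple entries):

[[model]]
hf_repo = "meta-llama/Llama-3.2-1B-Instruct"
device = ["cuda:0"]

[[model]]
hf_repo = "Qwen/Qwen2.5-3B-Instruct"
device = ["cuda:0"]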

Change the Model

Update your config to use a different model:

# Download a new model
pie model download Qwen/Qwen2.5-3B-Instruct

# Update config
pie config update --hf-repo "Qwen/Qwen2.5-3B-Instruct"
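
You can verify the change with the same command from the Configuration section:

pie config show

Subsequent pie run invocations should then use the new model.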

Run with a Local Inferlet

If you have a custom .wasm inferlet:

pie run --path ./my_inferlet.wasm -- --arg1 value1
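
The -- separator behaves the same as with registry inferlets, so a local build that accepts a prompt could be invoked like this (assuming your inferlet defines a --prompt flag, as text-completion does):

pie run --path ./my_inferlet.wasm -- --prompt "The capital of France is"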

Next Steps