# Quickstart
Get from zero to generating text in under 5 minutes.
## Prerequisites
Make sure you have:

- Pie installed (see the Installation Guide)
- A model downloaded (`pie model download meta-llama/Llama-3.2-1B-Instruct`)
- Configuration initialized (`pie config init`)
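
If you haven't completed the last two steps yet, they can be run back to back:

```bash
# Fetch the model weights from Hugging Face, then write a default config
pie model download meta-llama/Llama-3.2-1B-Instruct
pie config init
```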
## Run Your First Prompt
The simplest way to use Pie is with the `run` command:

```bash
pie run text-completion -- --prompt "The capital of France is"
```
You should see output like:

```text
╭─ Pie Run ─────────────────────────────────────╮
│ Inferlet  text-completion                     │
│ Model     meta-llama/Llama-3.2-1B-Instruct    │
│ Device    cuda:0                              │
╰───────────────────────────────────────────────╯
The capital of France is Paris, which is located in...
```
## Understanding the Command

Let's break down what happened:

```bash
pie run text-completion -- --prompt "The capital of France is"
```
| Part | Meaning |
|---|---|
| `pie run` | Run an inferlet one-shot (the engine starts, runs it, and shuts down) |
| `text-completion` | The inferlet to run (from the standard registry) |
| `--` | Separator between Pie arguments and inferlet arguments |
| `--prompt "..."` | Argument passed through to the inferlet |
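
Everything after the `--` is forwarded to the inferlet untouched, so additional inferlet arguments simply go after it. As a sketch, assuming the inferlet also accepts a hypothetical `--max-tokens` flag:

```bash
# --max-tokens is a hypothetical inferlet flag, used only to illustrate
# that all arguments after `--` belong to the inferlet, not to Pie.
pie run text-completion -- --prompt "The capital of France is" --max-tokens 64
```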
## Configuration
Your configuration lives at `~/.pie/config.toml`. View it:

```bash
pie config show
```
Key settings:

```toml
[engine]
host = "127.0.0.1"   # Server bind address
port = 8080          # Server port
enable_auth = true   # Require authentication

[[model]]
hf_repo = "meta-llama/Llama-3.2-1B-Instruct"
device = ["cuda:0"]  # GPU(s) to use
```
## Change the Model
Update your config to use a different model:

```bash
# Download a new model
pie model download Qwen/Qwen2.5-3B-Instruct

# Update config
pie config update --hf-repo "Qwen/Qwen2.5-3B-Instruct"
```
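
To verify the switch, re-check the config and run a quick prompt against the new model:

```bash
pie config show
pie run text-completion -- --prompt "The capital of France is"
```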
## Run with a Local Inferlet
If you have a custom `.wasm` inferlet:

```bash
pie run --path ./my_inferlet.wasm -- --arg1 value1
```
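
How you produce that `.wasm` file depends on your toolchain, and the project's own docs have the authoritative steps. As one common sketch, a Rust project targeting WASI might be built like this (the target name is an assumption, not something this guide specifies):

```bash
# Assumption: a Rust inferlet compiled for a WASI target; adjust to your toolchain.
cargo build --target wasm32-wasip1 --release
pie run --path ./target/wasm32-wasip1/release/my_inferlet.wasm -- --arg1 value1
```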
## Next Steps
- Learn about Server Mode for persistent serving
- Connect from code with Client Basics