# Manual Installation
We are actively working on providing first-class support for Docker-based installation and deployment of Pie. Stay tuned!
This guide will walk you through setting up the Pie development environment from the source code.
## Step 1: Install Prerequisites
First, ensure you have the following tools installed on your system.
- **Rust**: Install via `rustup` from the official website.
- **uv**: A fast Python package installer.
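If you do not already have these tools, the following is a minimal sketch using each project's official installer script (it assumes a Unix-like shell; see the respective websites for other platforms and installation options):

```bash
# Install Rust via rustup (official installer script).
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Install uv (official installer script).
curl -LsSf https://astral.sh/uv/install.sh | sh
```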
Once Rust is installed, you must add the WebAssembly (Wasm) target required to build Pie's inferlets.
```bash
rustup target add wasm32-wasip2
```
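To confirm the target was added, you can list the installed targets (standard `rustup` functionality):

```bash
# Verify that the WebAssembly target is now installed.
rustup target list --installed | grep wasm32-wasip2
```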
## Step 2: Clone the Pie Repository
Clone the project repository from GitHub. All subsequent commands in this guide should be run from the root of the `pie` directory unless otherwise noted.
```bash
git clone https://github.com/pie-project/pie.git
cd pie
```
## Step 3: Set Up the Python Backend
The Pie engine relies on a Python backend for model execution. These steps configure the Python environment inside the `backend/backend-python` directory. (A quick sanity check of the finished environment is sketched after this list.)
- **Navigate to the Backend Directory**

  ```bash
  cd backend/backend-python
  ```
- **Create a Virtual Environment.** Create a new virtual environment using `uv`:

  ```bash
  uv venv
  ```

  **Important:** You must use `uv` to create and manage the virtual environment, as the Pie CLI directly invokes `uv` to run the backend.
- **Install flashinfer**

  ```bash
  uv pip install flashinfer
  ```
- **Install PyTorch.** Next, install the correct version of PyTorch for your system's hardware (e.g., a specific CUDA version for NVIDIA GPUs). You can find the correct command for your system on the PyTorch website.

  **Use `--force-reinstall`:** You must include the `--force-reinstall` flag. This ensures the GPU-enabled version of PyTorch correctly overwrites any CPU-only version that `flashinfer` may have installed as a dependency.

  ```bash
  uv pip install torch --index-url https://download.pytorch.org/whl/cu121 --force-reinstall
  ```
- **Install Triton.** The Triton package you need depends on your system's CPU architecture:

  - **x86:**

    ```bash
    uv pip install triton
    ```

  - **aarch64:**

    ```bash
    uv pip install pytorch_triton --index-url https://download.pytorch.org/whl --force-reinstall
    ```
- **Return to the Root Directory**

  ```bash
  cd ../..
  ```
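Before moving on, you can optionally sanity-check the Python environment. The snippet below is a minimal sketch that assumes `uv` created the virtual environment at its default `.venv` location inside `backend/backend-python`; run it from the repository root. On a CUDA-capable machine, the first command should report `CUDA available: True`.

```bash
# Check that the GPU-enabled PyTorch build is in place.
backend/backend-python/.venv/bin/python -c "import torch; print(torch.__version__, 'CUDA available:', torch.cuda.is_available())"

# Check that flashinfer is importable in the same environment.
backend/backend-python/.venv/bin/python -c "import flashinfer; print('flashinfer imported OK')"
```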
## Step 4: Build the Pie CLI and Examples
Now, compile the Rust-based command-line interface and the example Wasm inferlets.
- **Build and Install the Pie CLI.** This command compiles and installs the `pie` executable into your Cargo home directory (`~/.cargo/bin`):

  ```bash
  cargo install --path ./pie-cli
  ```
- **Build the Example Inferlets.** This command compiles the example applications to WebAssembly:

  ```bash
  cargo build --target wasm32-wasip2 --release --manifest-path example-apps/Cargo.toml
  ```
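As a quick check, you can confirm that the `pie` binary is reachable and that the example inferlets were produced (the output path below is the same one used in Step 5):

```bash
# Requires ~/.cargo/bin to be on your PATH.
which pie

# List the compiled example inferlets.
ls example-apps/target/wasm32-wasip2/release/*.wasm
```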
## Step 5: Verify Your Installation
Let's confirm that everything is working correctly by downloading a model and running an inference task.
- **Download a Model.** Use the `pie` CLI to download and register a model. We'll use a small Llama 3.2 model for this test:

  ```bash
  pie model add "llama-3.2-1b-instruct"
  ```
- **Start the Pie Engine.** Launch the Pie engine using the example configuration file. This will start the backend services and open the interactive Pie shell:

  ```bash
  pie start --config ./pie-cli/example_config.toml
  ```
- **Run an Inference Task.** From within the interactive Pie shell (`pie>`), execute the `text_completion.wasm` inferlet you built earlier:

  ```
  pie> run example-apps/target/wasm32-wasip2/release/text_completion.wasm -- --prompt "What is the capital of France?"
  ```

  **First Run May Be Slow:** The very first inference run can take a few minutes. This is expected, as the engine is performing a one-time compilation of the CUDA kernels optimized for your specific GPU. Subsequent runs will be much faster.
- **Expected Output.** After a few moments, you should see a response similar to this:

  ```
  The capital of France is Paris.
  ```
Congratulations! You have successfully installed Pie and run your first inference task.