Manual Installation
We are actively working on providing first-class support for Docker-based installation and deployment of Pie. Stay tuned!
This guide will walk you through setting up the Pie development environment from the source code.
Step 1: Install Prerequisites
First, ensure you have the following tools installed on your system.
- Rust: Install via rustup from the official website.
- uv: A fast Python package installer (typical installer commands for both tools are shown below).
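If you don't already have them, both tools provide one-line installers; the commands below reflect the standard official installers, but check each project's site for current instructions.

# Install Rust via rustup (https://rustup.rs)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Install uv (https://docs.astral.sh/uv/)
curl -LsSf https://astral.sh/uv/install.sh | sh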
Once Rust is installed, you must add the WebAssembly (Wasm) target required to build Pie's inferlets.
rustup target add wasm32-wasip2
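You can confirm the target was added by listing the installed targets:

rustup target list --installed | grep wasm32-wasip2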
Step 2: Clone the Pie Repository
Clone the project repository from GitHub. Unless otherwise noted, all subsequent commands in this guide should be run from the root of the cloned pie directory.
git clone https://github.com/pie-project/pie.git
cd pie
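As a quick sanity check, the directories used in later steps of this guide should be present at the repository root:

ls -d backend pie-cli example-apps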
Step 3: Set Up the Python Backend
The Pie engine relies on a Python backend for model execution. These steps configure the Python environment inside the backend/backend-python directory.
- Navigate to the Backend Directory

  cd backend/backend-python

- Create a Virtual Environment

  Create a new virtual environment using uv.

  uv venv

  Important: You must use uv to create and manage the virtual environment, as the Pie CLI directly invokes uv to run the backend.

- Install flashinfer

  uv pip install flashinfer

- Install PyTorch

  Next, install the correct version of PyTorch for your system's hardware (e.g., specific CUDA version for NVIDIA GPUs). You can find the correct command for your system on the PyTorch website. A quick sanity check for the finished environment appears after this list.

  Use --force-reinstall: You must include the --force-reinstall flag. This ensures the GPU-enabled version of PyTorch correctly overwrites any CPU-only version that flashinfer may have installed as a dependency.

  uv pip install torch --index-url https://download.pytorch.org/whl/cu121 --force-reinstall

- Install Triton

  The Triton package required depends on your system's CPU architecture.

  - x86: uv pip install triton
  - aarch64: uv pip install pytorch_triton --index-url https://download.pytorch.org/whl --force-reinstall

- Return to the Root Directory

  cd ../..
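Before moving on, you can confirm the backend environment imports cleanly and sees your GPU. This is only a sketch: it assumes the virtual environment was created at backend/backend-python/.venv (uv's default location) and that the packages expose their usual module names.

# Sanity check for the Python backend environment (run from the repository root).
# Assumes the venv lives at backend/backend-python/.venv, which is uv's default.
source backend/backend-python/.venv/bin/activate
python -c "import torch, flashinfer, triton; print(torch.__version__, 'CUDA available:', torch.cuda.is_available())"
deactivate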
Step 4: Build the Pie CLI and Examples
Now, compile the Rust-based command-line interface and the example Wasm inferlets.
- Build and Install the Pie CLI

  This command compiles and installs the pie executable into your Cargo home directory (~/.cargo/bin).

  cargo install --path ./pie-cli

- Build the Example Inferlets

  This command compiles the example applications to WebAssembly (a quick verification follows the list).

  cargo build --target wasm32-wasip2 --release --manifest-path example-apps/Cargo.toml
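To verify both builds, check that the pie binary is on your PATH and that the compiled Wasm artifacts exist. The output path below is the Cargo target directory for the example workspace, as used in Step 5.

which pie
ls example-apps/target/wasm32-wasip2/release/*.wasm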
Step 5: Verify Your Installation
Let's confirm that everything is working correctly by downloading a model and running an inference task.
- Download a Model

  Use the pie CLI to download and register a model. We'll use a small Llama 3.2 model for this test.

  pie model add "llama-3.2-1b-instruct"

- Start the Pie Engine

  Launch the Pie engine using the example configuration file. This will start the backend services and open the interactive Pie shell.

  pie start --config ./pie-cli/example_config.toml

- Run an Inference Task

  From within the interactive Pie shell (pie>), execute the text_completion.wasm inferlet you built earlier.

  pie> run example-apps/target/wasm32-wasip2/release/text_completion.wasm -- --prompt "What is the capital of France?"

  First Run May Be Slow: The very first inference run can take a few minutes. This is expected, as the engine is performing a one-time compilation of the CUDA kernels optimized for your specific GPU. Subsequent runs will be much faster.

- Expected Output

  After a few moments, you should see a response similar to this:

  The capital of France is Paris.

Congratulations! You have successfully installed Pie and run your first inference task.
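From here, you can rerun the same inferlet with a different prompt by reusing the invocation pattern above, for example:

pie> run example-apps/target/wasm32-wasip2/release/text_completion.wasm -- --prompt "Summarize what an inferlet is in one sentence."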