
Manual Installation

Docker support

We are actively working on providing first-class support for Docker-based installation and deployment of Pie. Stay tuned!

This guide will walk you through setting up the Pie development environment from the source code.

Step 1: Install Prerequisites

First, ensure the following tools are installed on your system: the Rust toolchain (rustup and cargo) and the uv Python package manager.
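
If Rust or uv is not already installed, the commands below are the standard installers published by the rustup and uv projects for Linux and macOS; for other platforms, follow the instructions on each project's website.

# Install the Rust toolchain (rustup, cargo): https://rustup.rs
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Install uv: https://docs.astral.sh/uv/
curl -LsSf https://astral.sh/uv/install.sh | sh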

Once Rust is installed, you must add the WebAssembly (Wasm) target required to build Pie's inferlets.

rustup target add wasm32-wasip2

Step 2: Clone the Pie Repository

Clone the project repository from GitHub. All subsequent commands in this guide should be run from the root of the cloned pie directory unless otherwise noted.

git clone https://github.com/pie-project/pie.git
cd pie

Step 3: Set Up the Python Backend

The Pie engine relies on a Python backend for model execution. These steps configure the Python environment inside the backend/backend-python directory.

  1. Navigate to the Backend Directory

    cd backend/backend-python
  2. Create a Virtual Environment

    Create a new virtual environment using uv.

    uv venv
    Important

    You must use uv to create and manage the virtual environment, as the Pie CLI directly invokes uv to run the backend.

  3. Install flashinfer

    uv pip install flashinfer
  4. Install PyTorch

    Next, install the correct version of PyTorch for your system's hardware (e.g., a specific CUDA version for NVIDIA GPUs). You can find the correct command for your system on the PyTorch website. You can verify the result with the optional sanity check sketched after this list.

    Use --force-reinstall

    You must include the --force-reinstall flag. This ensures the GPU-enabled version of PyTorch correctly overwrites any CPU-only version that flashinfer may have installed as a dependency.

    uv pip install torch --index-url https://download.pytorch.org/whl/cu121 --force-reinstall
  5. Install Triton

    The required Triton package depends on your system's CPU architecture.

    uv pip install triton
  6. Return to the Root Directory

    cd ../..
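
Before moving on, you can optionally sanity-check the Python environment. The snippet below is a minimal sketch: it assumes a Unix-like shell and that uv created the environment at backend/backend-python/.venv (uv's default location); the import check only confirms that the GPU-enabled PyTorch build and Triton are visible, it does not exercise the backend itself.

cd backend/backend-python
source .venv/bin/activate
python -c "import torch, triton; print(torch.__version__, torch.cuda.is_available())"
deactivate
cd ../..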

Step 4: Build the Pie CLI and Examples

Now, compile the Rust-based command-line interface and the example Wasm inferlets.

  1. Build and Install the Pie CLI

    This command compiles and installs the pie executable into Cargo's bin directory (~/.cargo/bin).

    cargo install --path ./pie-cli
  2. Build the Example Inferlets

    This command compiles the example applications to WebAssembly. A quick check of the build outputs is sketched after this list.

    cargo build --target wasm32-wasip2 --release --manifest-path example-apps/Cargo.toml
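
If you want to confirm both builds succeeded, the paths below follow directly from the commands above: cargo install places the binary in ~/.cargo/bin (make sure that directory is on your PATH), and the Wasm build writes its artifacts under example-apps/target/wasm32-wasip2/release.

# The pie executable installed by cargo install
ls ~/.cargo/bin/pie

# The compiled example inferlets
ls example-apps/target/wasm32-wasip2/release/*.wasm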

Step 5: Verify Your Installation

Let's confirm that everything is working correctly by downloading a model and running an inference task.

  1. Download a Model

    Use the pie CLI to download and register a model. We'll use a small Llama 3.2 model for this test.

    pie model add "llama-3.2-1b-instruct"
  2. Start the Pie Engine

    Launch the Pie engine using the example configuration file. This will start the backend services and open the interactive Pie shell.

    pie start --config ./pie-cli/example_config.toml
  3. Run an Inference Task

    From within the interactive Pie shell (pie>), execute the text_completion.wasm inferlet you built earlier.

    pie> run example-apps/target/wasm32-wasip2/release/text_completion.wasm -- --prompt "What is the capital of France?"
    First Run May Be Slow

    The very first inference run can take a few minutes. This is expected, as the engine is performing a one-time compilation of the CUDA kernels optimized for your specific GPU. Subsequent runs will be much faster.

  4. Expected Output

    After a few moments, you should see a response similar to this:

    The capital of France is Paris.

    Congratulations! You have successfully installed Pie and run your first inference task.