
Run and iterate

This page builds the research-agent inferlet you wrote on the previous page, runs it with pie run, and walks through the output. By the end you can edit the inferlet, rebuild, rerun, and see your changes in seconds.

pie run boots a one-shot engine, runs the inferlet, prints its output, and exits. It is the right tool while you are iterating. The next page covers running the engine as a long-lived service.

Build

bakery build compiles the source tree to a WebAssembly component.

bakery build . -o research-agent.wasm

The first build pulls and compiles dependencies (futures, serde_json, urlencoding). Subsequent builds are incremental. Output lands at research-agent.wasm next to your Pie.toml.

If you prefer raw cargo:

cargo build --target wasm32-wasip2 --release
# output: target/wasm32-wasip2/release/research_agent.wasm

Both paths produce the same artifact.

See bakery for the full flag list.

Run

pie run \
--path ./research-agent.wasm \
--manifest ./Pie.toml \
-- \
--question "Compare the climates of Tokyo, Reykjavik, and Singapore."

The first run is slower because of JIT and kernel warmup. Expected output:

╭─ Pie Run ───────────────────────────────────────╮
│ Inferlet  research-agent@0.1.0                  │
│ Model     default (Qwen/Qwen3-0.6B)             │
│ Driver    cuda_native                           │
│ Device    cuda:0                                │
╰─────────────────────────────────────────────────╯

Tokyo has a humid subtropical climate, with hot, humid summers
and cool winters. Reykjavik has a subpolar oceanic climate,
moderated by the Gulf Stream, with cool summers and chilly winters.
Singapore lies near the equator and has a tropical rainforest
climate: hot and humid year-round with no distinct seasons. The
three differ primarily by latitude and ocean influence.

The CLI flags after -- fold into the inferlet's input. --question "..." becomes {"question": "..."}. Type inference handles values such as --max-tokens 64 (int) or --temperature 0.7 (float); Pie.toml documents those parameters, but the inferlet's input type performs the actual validation.
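As a rough sketch of that inference (the `infer_value` helper and its try-int-then-float order are illustrative assumptions, not pie's actual code):

```rust
/// Illustrative value inference for CLI flags: try integer, then float,
/// otherwise keep the raw string. Not pie's actual implementation.
#[derive(Debug, PartialEq)]
enum Inferred {
    Int(i64),
    Float(f64),
    Str(String),
}

fn infer_value(raw: &str) -> Inferred {
    if let Ok(i) = raw.parse::<i64>() {
        Inferred::Int(i)
    } else if let Ok(f) = raw.parse::<f64>() {
        Inferred::Float(f)
    } else {
        Inferred::Str(raw.to_string())
    }
}
```

Under these rules, `--max-tokens 64` arrives as an integer and `--temperature 0.7` as a float, which is why the inferlet's input type can validate them directly.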

Watching the steps

By default, pie run prints only the final string. To watch the four steps as they happen, stream intermediate chunks with session.send and add eprintln! / print(file=sys.stderr) / console.error for diagnostics. Diagnostic output goes to the engine's stderr stream, which pie run mirrors to your terminal.

use inferlet::pie::core::session;

// plan_text, plan, and start come from the inferlet on the previous page.
eprintln!("[plan] {plan_text}");
session::send(&format!("Looking up: {}\n", plan.titles.join(", ")));

// after the join_all:
let elapsed = start.elapsed().as_millis();
eprintln!("[fetch] {} summaries in {} ms", plan.titles.len(), elapsed);

eprintln! writes to stderr; session::send streams a chunk back to the client (here, pie run) before the final result.

After rebuilding and rerunning, you'll see something like:

[plan] {"titles": ["Tokyo", "Reykjavik", "Singapore"]}
Looking up: Tokyo, Reykjavik, Singapore
[fetch] 3 summaries in 412 ms

Tokyo has a humid subtropical climate, ...

The fetch is reported as elapsed wall-clock time across all three requests, not the sum of individual request times. That is the win from running them in parallel.
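A standalone illustration of that difference (plain std threads stand in for the async join_all the inferlet actually uses, so this compiles anywhere):

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Run `n` simulated fetches of `ms` milliseconds each in parallel and
/// return the wall-clock time for the whole batch, in milliseconds.
fn parallel_fetches(n: usize, ms: u64) -> u128 {
    let start = Instant::now();
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(move || thread::sleep(Duration::from_millis(ms))))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    start.elapsed().as_millis()
}
```

Three simulated 100 ms fetches complete in roughly 100 ms of wall-clock time; run sequentially they would take about 300 ms.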

The edit loop

# After editing the source:
bakery build . -o research-agent.wasm
pie run --path ./research-agent.wasm --manifest ./Pie.toml -- --question "Compare the climates of Tokyo and Singapore."

A Rust rebuild after a one-line change typically takes a few seconds. Python and JavaScript rebuilds are faster because they skip native compilation.

To run the same inferlet against a different model without rebuilding, change the active model in your config:

pie config set model.0.hf_repo Qwen/Qwen2.5-7B-Instruct
pie model download Qwen/Qwen2.5-7B-Instruct

pie run reads the config on every invocation, so the next run uses the new model.

When something goes wrong

A few common failure modes and what to do.

The planner returned malformed JSON

Symptom: the inferlet returns an error like expected value at line 1 column 1. The planner emitted prose around the JSON, or no JSON at all.

What to do: add eprintln! / print(file=sys.stderr) around the parse to see what the planner produced. If the model is small (Qwen3-0.6B), it sometimes wraps JSON in code fences. Either strip the fences before parsing, or constrain the planner with a JSON schema. See Structured generation.
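A minimal fence-stripping helper might look like this (a sketch; real model output can be messier, and the `strip_code_fences` name is ours, not part of any pie API):

```rust
/// Strip a Markdown code fence (``` or ```json) wrapping model output,
/// if one is present. Illustrative only; tune it to what your model emits.
fn strip_code_fences(raw: &str) -> &str {
    let s = raw.trim();
    let Some(rest) = s.strip_prefix("```") else {
        return s; // no fence, return as-is
    };
    // Drop an optional language tag ("json") on the opening fence line.
    let rest = rest.strip_prefix("json").unwrap_or(rest);
    let body = rest.trim_start_matches(|c| c == '\r' || c == '\n').trim_end();
    body.strip_suffix("```").map(str::trim_end).unwrap_or(body)
}
```

Run the planner output through this before handing it to serde_json, and log the raw string on parse failure so you can see what the model actually produced.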

A Wikipedia title is wrong

Symptom: [fetch failed: HTTP 404] for one of the titles. The model invented a title that does not exist on Wikipedia, or used a non-canonical name (e.g. "Tokyo, Japan" instead of "Tokyo").

What to do: include in the planner prompt a hint like "use exact Wikipedia article titles, e.g. 'Tokyo' not 'Tokyo, Japan'". For more robust handling, fall back to the Wikipedia search API when the summary endpoint 404s.
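The fallback shape can be kept generic. This sketch (our own `fetch_with_fallback` helper, not part of the inferlet API) tries the primary lookup and only runs the fallback on error:

```rust
/// Try a primary lookup and, if it fails, a fallback. The two closures stand
/// in for the summary endpoint and the search API; this helper is illustrative.
fn fetch_with_fallback<T, E>(
    primary: impl FnOnce() -> Result<T, E>,
    fallback: impl FnOnce() -> Result<T, E>,
) -> Result<T, E> {
    primary().or_else(|_| fallback())
}
```

In the inferlet you would wire the fallback to a search request that resolves a name like "Tokyo, Japan" to a canonical article title before refetching the summary.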

pie run says "no models available"

Symptom: the inferlet returns Err("no models") immediately.

What to do: run pie model list to see what is downloaded, and pie config show to see what the engine is configured to load. If the configured model is not downloaded, either download it or pie config set model.0.hf_repo <repo> to one you do have.

Build fails on wasm32-wasip2

Symptom: cargo build errors with "target wasm32-wasip2 not installed".

What to do: rustup target add wasm32-wasip2. The bakery toolchain installs this automatically, but a raw cargo build does not.

Next