Core WIT Definitions

The inferlet:core world defines the complete set of APIs available to an inferlet. It aggregates several specialized interfaces, each responsible for a distinct area of functionality, from model interaction and runtime environment access to messaging and data storage.

This document provides a concise overview of each interface and its components.

world inferlet:core
package inferlet:core;

world imports {
    import common;
    import runtime;
    import forward;
    import tokenize;
    import kvs;
    import message;
}

common Interface

The common interface provides fundamental types, resources, and functions that are shared across other interfaces. It forms the foundation for interacting with models and managing resources.

Core Types

  • pointer: A u32 representing a memory address or resource handle.
  • priority: An enum (low, normal, high) for setting the execution priority of a command queue.

Resources

  • blob / blob-result: Used for handling raw binary data. blob-result is an async resource for polling the result of a blob receive operation.
  • model: Represents a specific model instance, providing methods to query its metadata like name, traits, prompt template, and stop tokens. It's the entry point for creating a command queue.
  • queue: Represents a command queue for a specific model. It's used to synchronize operations, set priority, and manage resources.
  • synchronization-result: An async resource to check the status of a synchronize operation on a queue.
  • debug-query-result: An async resource to get the string result from a debug query.
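
The async resources above share one shape: a pollable plus a get that returns an option. The following is a minimal Python sketch of that pattern, not the actual bindings. PendingResult, complete, is_ready, and wait_for are hypothetical names, the WASI pollable is replaced by a plain readiness flag, and it assumes that get yields nothing until the operation has finished.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class PendingResult(Generic[T]):
    """Toy stand-in for an async result resource such as blob-result.

    Hypothetical model only: the real resources expose a WASI pollable,
    which this sketch replaces with a simple is_ready() flag.
    """

    def __init__(self) -> None:
        self._value: Optional[T] = None
        self._ready = False

    def complete(self, value: T) -> None:
        # In a real runtime the host would fulfil the result, not the inferlet.
        self._value = value
        self._ready = True

    def is_ready(self) -> bool:
        return self._ready

    def get(self) -> Optional[T]:
        # Mirrors `get: func() -> option<T>`; assumed to be None until ready.
        return self._value if self._ready else None


def wait_for(result: PendingResult[T], tick: Callable[[], None],
             max_ticks: int = 100) -> Optional[T]:
    """Check readiness up to max_ticks times, calling tick() between checks."""
    for _ in range(max_ticks):
        if result.is_ready():
            return result.get()
        tick()
    raise TimeoutError("result never became ready")
```

Checking readiness before calling get keeps a legitimate False value (as in synchronization-result) distinct from "not ready yet".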

Resource Management Functions

These functions operate on a queue to manage shared memory resources, enabling efficient data transfer and interoperability between inferlets.

  • allocate-resources(...): Allocates a specified number of resources of a given type and returns their pointers.
  • deallocate-resources(...): Frees previously allocated resources.
  • export-resources(...): Makes a set of resources available to other inferlets under a given name.
  • import-resources(...): Accesses resources exported by another inferlet by name.
  • get-all-exported-resources(...): Lists all resources currently exported.
  • release-exported-resources(...): Removes an export, making the resources private again.
View common WIT Definition
interface common {
    use wasi:io/poll@0.2.4.{pollable};
    type pointer = u32;

    enum priority { low, normal, high }

    resource blob {
        constructor(init: list<u8>);
        read: func(offset: u64, n: u64) -> list<u8>;
        size: func() -> u64;
    }

    resource blob-result {
        pollable: func() -> pollable;
        get: func() -> option<blob>;
    }

    resource model {
        get-name: func() -> string;
        get-traits: func() -> list<string>;
        get-description: func() -> string;
        get-prompt-template: func() -> string;
        get-stop-tokens: func() -> list<string>;
        get-service-id: func() -> u32;
        get-kv-page-size: func() -> u32;
        create-queue: func() -> queue;
    }

    resource queue {
        get-service-id: func() -> u32;
        synchronize: func() -> synchronization-result;
        set-priority: func(priority: priority);
        debug-query: func(query: string) -> debug-query-result;
    }

    resource synchronization-result {
        pollable: func() -> pollable;
        get: func() -> option<bool>;
    }

    resource debug-query-result {
        pollable: func() -> pollable;
        get: func() -> option<string>;
    }

    allocate-resources: func(queue: borrow<queue>, resource-type: u32, count: u32) -> list<pointer>;
    deallocate-resources: func(queue: borrow<queue>, resource-type: u32, ptrs: list<pointer>);
    get-all-exported-resources: func(queue: borrow<queue>, resource-type: u32) -> list<tuple<string, u32>>;
    release-exported-resources: func(queue: borrow<queue>, resource-type: u32, name: string);
    export-resources: func(queue: borrow<queue>, resource-type: u32, ptrs: list<pointer>, name: string);
    import-resources: func(queue: borrow<queue>, resource-type: u32, name: string) -> list<pointer>;
}
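
To make the export/import life cycle concrete, here is a rough in-memory model in Python. It is a sketch under stated assumptions rather than the runtime's behavior: ResourceRegistry and its method names are hypothetical, the queue and resource-type parameters are omitted, the u32 returned alongside each export name is assumed to be a pointer count, and importing an unknown name is assumed to return an empty list.

```python
from typing import Dict, List, Tuple


class ResourceRegistry:
    """Toy in-memory model of named resource exports.

    Hypothetical stand-in: the real functions also take a queue and a
    resource-type, which this sketch leaves out for brevity.
    """

    def __init__(self) -> None:
        self._next_ptr = 0
        self._exports: Dict[str, List[int]] = {}

    def allocate(self, count: int) -> List[int]:
        # Hand out `count` fresh pointer values (plain integers here).
        ptrs = list(range(self._next_ptr, self._next_ptr + count))
        self._next_ptr += count
        return ptrs

    def export(self, ptrs: List[int], name: str) -> None:
        # Publish the pointers under a name other inferlets can look up.
        self._exports[name] = list(ptrs)

    def import_(self, name: str) -> List[int]:
        # Assumption: an unknown name yields an empty list.
        return list(self._exports.get(name, []))

    def all_exported(self) -> List[Tuple[str, int]]:
        # Assumption: the u32 in each tuple is the number of pointers.
        return [(name, len(ptrs)) for name, ptrs in self._exports.items()]

    def release(self, name: str) -> None:
        # Remove the export; the pointers themselves stay allocated.
        self._exports.pop(name, None)
```

In this model one inferlet would allocate and export a set of pages under a name such as "shared-prefix" (a made-up name), and a second inferlet would call import_ with the same name to obtain the same pointer values.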

forward Interface

This interface defines the building blocks for executing a model's forward pass. It allows for detailed control over inputs (tokens, embeddings, KV cache) and outputs (sampling methods).

Resources

  • forward-pass: Represents a single forward pass operation. You configure it with inputs and sampling parameters before calling execute.
  • forward-pass-result: An async resource to poll and retrieve the results of a forward-pass, which can be token IDs or full probability distributions.

Functions

The workflow involves creating a forward-pass, configuring it with input and output functions, and then executing it.

  • create-forward-pass(...): Creates a new forward-pass object for a given queue.

Input Configuration

  • input-tokens(...): Sets the input token IDs and their positions.
  • input-embeddings(...): Sets pre-computed input embeddings.
  • kv-cache(...): Provides the KV cache pages for stateful generation.
  • attention-mask(...): Applies an attention mask to the input.
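
Because kv-cache takes a list of page pointers plus the length of the last page, a caller needs to work out how many pages a sequence occupies. The helper below is a small sketch of that arithmetic. It assumes that last-kv-page-len counts the tokens stored in the final page and that a completely full final page reports the page size rather than zero; kv_page_layout is a hypothetical name.

```python
from typing import Tuple


def kv_page_layout(num_tokens: int, page_size: int) -> Tuple[int, int]:
    """Return (pages_needed, last_page_len) for num_tokens cached tokens.

    page_size corresponds to the value reported by get-kv-page-size.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    if num_tokens <= 0:
        return (0, 0)
    pages = -(-num_tokens // page_size)  # ceiling division
    last_len = num_tokens - (pages - 1) * page_size
    return (pages, last_len)
```

With a page size of 16, a 35-token sequence occupies three pages with 3 tokens in the last one, while a 32-token sequence occupies two full pages.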

Output and Sampling Configuration

These functions define what the model should output and how to sample the resulting logits.

  • output-tokens(...): Generates output tokens using a specified temperature.
  • output-tokens-top-k(...): Samples from the k most likely tokens.
  • output-tokens-top-p(...): Samples from the smallest set of tokens whose cumulative probability exceeds p.
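  • output-tokens-min-p(...): Samples from tokens whose probability is at least min-p * max(P), where max(P) is the probability of the most likely token.
  • output-tokens-top-k-top-p(...): Samples with both a top-k and a top-p constraint.
  • output-embeddings(...): Retrieves the raw output embeddings for specified token indices.
  • output-distributions(...): Retrieves the probability distribution over the vocabulary for specified token indices, optionally limited to the top-k entries.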
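
The candidate sets described above can be illustrated with plain Python. This is a sketch of the standard top-k, top-p, and min-p definitions as worded in the list, not the runtime's sampler: the function names are hypothetical, probabilities are passed as a token-ID-to-probability dictionary, and the final random draw from the surviving candidates is left out.

```python
from typing import Dict, List


def _ranked(probs: Dict[int, float]) -> List[int]:
    # Token IDs ordered from most to least likely.
    return sorted(probs, key=lambda tok: probs[tok], reverse=True)


def top_k_filter(probs: Dict[int, float], k: int) -> List[int]:
    """Keep the k most likely tokens."""
    return _ranked(probs)[:k]


def top_p_filter(probs: Dict[int, float], p: float) -> List[int]:
    """Keep the smallest prefix whose cumulative probability exceeds p."""
    kept: List[int] = []
    total = 0.0
    for tok in _ranked(probs):
        kept.append(tok)
        total += probs[tok]
        if total > p:
            break
    return kept


def min_p_filter(probs: Dict[int, float], min_p: float) -> List[int]:
    """Keep tokens with probability of at least min_p * max(P)."""
    threshold = min_p * max(probs.values())
    return [tok for tok in _ranked(probs) if probs[tok] >= threshold]
```

For a distribution such as {0: 0.5, 1: 0.3, 2: 0.15, 3: 0.05}, top-k with k = 2 and top-p with p = 0.7 both keep tokens 0 and 1, while min-p with min-p = 0.2 (threshold 0.1) also keeps token 2.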
View forward WIT Definition
interface forward {
    use wasi:io/poll@0.2.4.{pollable};
    use common.{queue, pointer};

    resource forward-pass {
        execute: func() -> option<forward-pass-result>;
    }

    resource forward-pass-result {
        pollable: func() -> pollable;
        get-distributions: func() -> option<list<tuple<list<u32>, list<f32>>>>;
        get-tokens: func() -> option<list<u32>>;
    }

    create-forward-pass: func(queue: borrow<queue>) -> forward-pass;
    attention-mask: func(pass: borrow<forward-pass>, mask: list<list<u32>>);
    kv-cache: func(pass: borrow<forward-pass>, kv-page-ptrs: list<pointer>, last-kv-page-len: u32);
    input-embeddings: func(pass: borrow<forward-pass>, emb-ptrs: list<pointer>, positions: list<u32>);
    input-tokens: func(pass: borrow<forward-pass>, input-tokens: list<u32>, positions: list<u32>);
    output-embeddings: func(pass: borrow<forward-pass>, emb-ptrs: list<pointer>, indices: list<u32>);
    output-distributions: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, top-k: option<u32>);
    output-tokens: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32);
    output-tokens-top-k: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, top-k: u32);
    output-tokens-top-p: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, top-p: f32);
    output-tokens-min-p: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, min-p: f32);
    output-tokens-top-k-top-p: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, top-k: u32, top-p: f32);
}
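
The return type of get-distributions is a list of (token IDs, probabilities) pairs. A plausible reading, assumed here and not confirmed by the definition itself, is one pair per requested index, with the two inner lists aligned element by element. Under that assumption, a greedy pick per position could look like the following sketch (greedy_tokens is a hypothetical helper).

```python
from typing import List, Tuple

Distribution = Tuple[List[int], List[float]]


def greedy_tokens(distributions: List[Distribution]) -> List[int]:
    """Pick the most likely token ID from each (ids, probs) pair."""
    picks: List[int] = []
    for ids, probs in distributions:
        if len(ids) != len(probs) or not ids:
            raise ValueError("each distribution needs aligned, non-empty lists")
        # Index of the highest probability within this distribution.
        best = max(range(len(probs)), key=lambda i: probs[i])
        picks.append(ids[best])
    return picks
```

Working from distributions rather than from get-tokens is useful when the inferlet wants to apply its own selection rule, such as constrained decoding.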

kvs Interface

Provides a simple, persistent key-value storage mechanism. This is useful for saving state, configuration, or other data across inferlet invocations.

  • store-get(key): Retrieves the value for a given key.
  • store-set(key, value): Creates or overwrites an entry.
  • store-delete(key): Removes an entry.
  • store-exists(key): Checks if a key exists.
  • store-list-keys(): Returns all keys in the store.
View kvs WIT Definition
interface kvs {
    store-get: func(key: string) -> option<string>;
    store-set: func(key: string, value: string);
    store-delete: func(key: string);
    store-exists: func(key: string) -> bool;
    store-list-keys: func() -> list<string>;
}
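
Since both keys and values are strings, structured state has to be serialized before it is stored. The sketch below models the five functions with a dictionary and uses JSON for the encoding. Store is a hypothetical stand-in for the host-provided storage, the choice of JSON is only one option, and deleting a missing key is assumed to be a no-op.

```python
import json
from typing import Dict, List, Optional


class Store:
    """Dictionary-backed model of the kvs interface (not the real host store)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def store_get(self, key: str) -> Optional[str]:
        # Mirrors option<string>: None when the key is absent.
        return self._data.get(key)

    def store_set(self, key: str, value: str) -> None:
        self._data[key] = value  # creates or overwrites

    def store_delete(self, key: str) -> None:
        self._data.pop(key, None)  # assumption: missing keys are ignored

    def store_exists(self, key: str) -> bool:
        return key in self._data

    def store_list_keys(self) -> List[str]:
        return list(self._data)


def save_state(store: Store, key: str, state: dict) -> None:
    """Serialize a dictionary to JSON and store it under key."""
    store.store_set(key, json.dumps(state))


def load_state(store: Store, key: str) -> Optional[dict]:
    """Return the decoded state, or None if nothing was saved."""
    raw = store.store_get(key)
    return None if raw is None else json.loads(raw)
```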

message Interface

This interface handles communication between the inferlet and the outside world, supporting direct messaging, binary data transfer, and a publish-subscribe model.

Resources

  • subscription: A handle to a topic subscription, used to poll for and receive broadcast messages.
  • receive-result: An async resource for polling the result of a direct receive operation.

Functions

Direct Messaging

  • send(message) / receive(): Send and receive string-based messages to/from the host.
  • send-blob(blob) / receive-blob(): Send and receive binary data using the blob resource.

Publish/Subscribe

  • broadcast(topic, message): Publishes a message to a named topic for all subscribers.
  • subscribe(topic): Subscribes to a topic, returning a subscription handle.
View message WIT Definition
interface message {
    use wasi:io/poll@0.2.4.{pollable};
    use common.{blob, blob-result};

    resource subscription {
        pollable: func() -> pollable;
        get: func() -> option<string>;
        unsubscribe: func();
    }

    resource receive-result {
        pollable: func() -> pollable;
        get: func() -> option<string>;
    }

    send: func(message: string);
    receive: func() -> receive-result;
    send-blob: func(blob: blob);
    receive-blob: func() -> blob-result;
    broadcast: func(topic: string, message: string);
    subscribe: func(topic: string) -> subscription;
}
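
The publish/subscribe functions can be pictured with a small in-memory broker. This is an illustrative model only: Broker and the Python Subscription class are hypothetical, and the sketch assumes that each subscriber has its own ordered inbox, that get returns nothing when the inbox is empty, and that messages broadcast before a subscription exists are not delivered to it.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class Subscription:
    """Toy model of the subscription resource."""

    def __init__(self, broker: "Broker", topic: str) -> None:
        self._broker = broker
        self.topic = topic
        self._inbox: Deque[str] = deque()

    def get(self) -> Optional[str]:
        # Mirrors option<string>: None when no message is waiting.
        return self._inbox.popleft() if self._inbox else None

    def unsubscribe(self) -> None:
        self._broker._remove(self)


class Broker:
    """Stand-in for the host side of broadcast and subscribe."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Subscription]] = {}

    def subscribe(self, topic: str) -> Subscription:
        sub = Subscription(self, topic)
        self._subs.setdefault(topic, []).append(sub)
        return sub

    def broadcast(self, topic: str, message: str) -> None:
        # Deliver a copy of the message to every current subscriber.
        for sub in self._subs.get(topic, []):
            sub._inbox.append(message)

    def _remove(self, sub: Subscription) -> None:
        subs = self._subs.get(sub.topic, [])
        if sub in subs:
            subs.remove(sub)
```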

runtime Interface

The runtime interface provides access to the execution environment, allowing the inferlet to get information about itself, its arguments, and the available models.

Functions

  • get-version(): Returns the runtime version string.
  • get-instance-id(): Returns a unique ID for the current inferlet instance.
  • get-arguments(): Retrieves the command-line arguments passed to the inferlet.
  • set-return(value): Sets the final return value of the inferlet upon exit.
  • get-model(name): Retrieves a model resource by its name.
  • get-all-models(): Lists the names of all available models.
  • get-all-models-with-traits(traits): Filters models by required traits (e.g., "tokenize").
  • debug-query(query): Executes a runtime-specific debug command.
View runtime WIT Definition
interface runtime {
    use wasi:io/poll@0.2.4.{pollable};
    use common.{model, debug-query-result};

    get-version: func() -> string;
    get-instance-id: func() -> string;
    get-arguments: func() -> list<string>;
    set-return: func(value: string);
    get-model: func(name: string) -> option<model>;
    get-all-models: func() -> list<string>;
    get-all-models-with-traits: func(traits: list<string>) -> list<string>;
    debug-query: func(query: string) -> debug-query-result;
}
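
Trait filtering, as performed by get-all-models-with-traits, can be sketched as a subset test. The example assumes that a model qualifies only if it has every requested trait. The model names and all traits other than "tokenize" are invented for illustration, and models_with_traits is a hypothetical helper.

```python
from typing import Dict, List


def models_with_traits(models: Dict[str, List[str]],
                       required: List[str]) -> List[str]:
    """Return names of models whose traits include every required trait."""
    need = set(required)
    return [name for name, traits in models.items() if need <= set(traits)]


# Hypothetical catalog: model name -> list of traits.
CATALOG = {
    "model-a": ["tokenize", "input_text", "output_text"],
    "model-b": ["input_text"],
    "model-c": ["tokenize"],
}
```

An empty trait list matches every model in this sketch, which makes it equivalent to get-all-models.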

tokenize Interface

This interface provides text tokenization capabilities specific to a given model.

Resource

  • tokenizer: A resource that can convert text to token IDs and back.

Functions

  • get-tokenizer(model): Retrieves the tokenizer associated with a specific model.
  • tokenizer.tokenize(text): Converts a string into a list of token IDs.
  • tokenizer.detokenize(tokens): Converts a list of token IDs back into a string.
  • tokenizer.get-vocabs(): Returns the model's vocabulary as a pair of lists: the token IDs and the byte sequence of each token.
View tokenize WIT Definition
interface tokenize {
    use common.{model};

    get-tokenizer: func(queue: borrow<model>) -> tokenizer;

    resource tokenizer {
        tokenize: func(text: string) -> list<u32>;
        detokenize: func(tokens: list<u32>) -> string;
        get-vocabs: func() -> tuple<list<u32>, list<list<u8>>>;
    }
}
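
get-vocabs returns token IDs alongside raw byte sequences instead of strings. One reason bytes matter is that a single token may hold only part of a multi-byte UTF-8 character, so bytes should be concatenated before decoding. The sketch below assumes the two lists are parallel (entry i of one matches entry i of the other); the tiny vocabulary in the usage note and the helper names are made up for illustration.

```python
from typing import Dict, List


def build_vocab_table(ids: List[int], tokens: List[bytes]) -> Dict[int, bytes]:
    """Build an ID -> bytes lookup from the two parallel lists."""
    if len(ids) != len(tokens):
        raise ValueError("ids and tokens must have the same length")
    return dict(zip(ids, tokens))


def decode(token_ids: List[int], table: Dict[int, bytes]) -> str:
    """Join the token bytes first, then decode once as UTF-8."""
    raw = b"".join(table[tok] for tok in token_ids)
    return raw.decode("utf-8", errors="replace")
```

With a toy vocabulary of IDs [0, 1, 2, 3] mapped to b"Hel", b"lo", b"\xc3", and b"\xa9", decoding [0, 1] gives "Hello", and decoding [2, 3] gives "é" even though neither of those two tokens is valid UTF-8 on its own.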