Core WIT Definitions
The inferlet:core package defines the complete set of APIs available to an inferlet. Its imports world aggregates several specialized interfaces, each responsible for a distinct area of functionality: model interaction, runtime environment access, tokenization, messaging, and data storage.
This document provides a concise overview of each interface and its components.
package inferlet:core;

world imports {
    import common;
    import runtime;
    import forward;
    import tokenize;
    import kvs;
    import message;
}
common Interface
The common interface provides fundamental types, resources, and functions that are shared across other interfaces. It forms the foundation for interacting with models and managing resources.
Core Types
pointer: A u32 representing a memory address or resource handle.
priority: An enum (low, normal, high) for setting the execution priority of a command queue.
Resources
blob / blob-result: Used for handling raw binary data. blob-result is an async resource for polling the result of a blob receive operation.
model: Represents a specific model instance, providing methods to query its metadata, such as name, traits, prompt template, and stop tokens. It is the entry point for creating a command queue.
queue: Represents a command queue for a specific model. It is used to synchronize operations, set priority, and manage resources.
synchronization-result: An async resource to check the status of a synchronize operation on a queue.
debug-query-result: An async resource to get the string result from a debug query.
Resource Management Functions
These functions operate on a queue to manage shared memory resources, enabling efficient data transfer and interoperability between inferlets.
allocate-resources(...): Allocates a specified number of a given resource type.
deallocate-resources(...): Frees previously allocated resources.
export-resources(...): Makes a set of resources available to other inferlets under a given name.
import-resources(...): Accesses resources exported by another inferlet by name.
get-all-exported-resources(...): Lists all resources currently exported.
release-exported-resources(...): Removes an export, making the resources private again.
View common WIT Definition
interface common {
    use wasi:io/poll@0.2.4.{pollable};

    type pointer = u32;

    enum priority { low, normal, high }

    resource blob {
        constructor(init: list<u8>);
        read: func(offset: u64, n: u64) -> list<u8>;
        size: func() -> u64;
    }

    resource blob-result {
        pollable: func() -> pollable;
        get: func() -> option<blob>;
    }

    resource model {
        get-name: func() -> string;
        get-traits: func() -> list<string>;
        get-description: func() -> string;
        get-prompt-template: func() -> string;
        get-stop-tokens: func() -> list<string>;
        get-service-id: func() -> u32;
        get-kv-page-size: func() -> u32;
        create-queue: func() -> queue;
    }

    resource queue {
        get-service-id: func() -> u32;
        synchronize: func() -> synchronization-result;
        set-priority: func(priority: priority);
        debug-query: func(query: string) -> debug-query-result;
    }

    resource synchronization-result {
        pollable: func() -> pollable;
        get: func() -> option<bool>;
    }

    resource debug-query-result {
        pollable: func() -> pollable;
        get: func() -> option<string>;
    }

    allocate-resources: func(queue: borrow<queue>, resource-type: u32, count: u32) -> list<pointer>;
    deallocate-resources: func(queue: borrow<queue>, resource-type: u32, ptrs: list<pointer>);
    get-all-exported-resources: func(queue: borrow<queue>, resource-type: u32) -> list<tuple<string, u32>>;
    release-exported-resources: func(queue: borrow<queue>, resource-type: u32, name: string);
    export-resources: func(queue: borrow<queue>, resource-type: u32, ptrs: list<pointer>, name: string);
    import-resources: func(queue: borrow<queue>, resource-type: u32, name: string) -> list<pointer>;
}
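The resource management functions can be easier to follow as a lifecycle: allocate, export under a name, import elsewhere, release, deallocate. The following is a minimal Rust sketch of that lifecycle using a hypothetical in-memory MockResourcePool, not the real host bindings. It drops the queue and resource-type parameters, assumes the u32 returned by get-all-exported-resources is a pointer count, and assumes importing an unknown name yields an empty list; none of these details are confirmed by the definition above.

```rust
use std::collections::HashMap;

/// Mirrors `type pointer = u32` from the WIT definition.
type Pointer = u32;

/// Hypothetical stand-in for the host side of ONE resource type.
/// The real functions also take a `queue` and a `resource-type`.
#[derive(Default)]
struct MockResourcePool {
    next: Pointer,
    live: Vec<Pointer>,
    exports: HashMap<String, Vec<Pointer>>,
}

impl MockResourcePool {
    /// Like `allocate-resources`: hands out `count` fresh pointers.
    fn allocate(&mut self, count: u32) -> Vec<Pointer> {
        let ptrs: Vec<Pointer> = (self.next..self.next + count).collect();
        self.next += count;
        self.live.extend(&ptrs);
        ptrs
    }

    /// Like `deallocate-resources`: frees the given pointers.
    fn deallocate(&mut self, ptrs: &[Pointer]) {
        self.live.retain(|p| !ptrs.contains(p));
    }

    /// Like `export-resources`: publishes pointers under a name.
    fn export(&mut self, ptrs: &[Pointer], name: &str) {
        self.exports.insert(name.to_string(), ptrs.to_vec());
    }

    /// Like `import-resources`: looks up pointers by export name.
    /// Assumption: an unknown name yields an empty list.
    fn import(&self, name: &str) -> Vec<Pointer> {
        self.exports.get(name).cloned().unwrap_or_default()
    }

    /// Like `get-all-exported-resources`, assuming (name, count) pairs.
    fn all_exported(&self) -> Vec<(String, u32)> {
        let mut out: Vec<(String, u32)> = self
            .exports
            .iter()
            .map(|(name, ptrs)| (name.clone(), ptrs.len() as u32))
            .collect();
        out.sort();
        out
    }

    /// Like `release-exported-resources`: removes the export by name.
    fn release(&mut self, name: &str) {
        self.exports.remove(name);
    }
}
```

In this sketch, releasing an export only removes the name; the pointers stay allocated until deallocate is called, which matches the description of release as "making the resources private again".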
forward Interface
This interface defines the building blocks for executing a model's forward pass. It allows for detailed control over inputs (tokens, embeddings, KV cache) and outputs (sampling methods).
Resources
forward-pass: Represents a single forward pass operation. You configure it with inputs and sampling parameters before calling execute.
forward-pass-result: An async resource to poll and retrieve the results of a forward-pass, which can be token IDs or full probability distributions.
Functions
The workflow involves creating a forward-pass, configuring it with input and output functions, and then executing it.
create-forward-pass(...): Creates a new forward-pass object for a given queue.
Input Configuration
input-tokens(...): Sets the input token IDs and their positions.
input-embeddings(...): Sets pre-computed input embeddings.
kv-cache(...): Provides the KV cache pages for stateful generation.
attention-mask(...): Applies an attention mask to the input.
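The kv-cache function takes a list of page pointers plus a last-kv-page-len, so a caller has to work out how many pages a sequence needs. The Rust sketch below shows that arithmetic under two assumptions that the definitions do not state explicitly: pages are filled in order with only the last one partially used, and positions are absolute indices into the sequence. The helper names kv_page_layout and next_positions are hypothetical.

```rust
/// Returns (number of pages, tokens used in the last page) for a
/// sequence of `num_tokens` tokens, given the page size reported by
/// `model.get-kv-page-size`.
/// Assumption: only the final page may be partially filled.
fn kv_page_layout(num_tokens: u32, page_size: u32) -> (u32, u32) {
    assert!(page_size > 0, "page size must be positive");
    if num_tokens == 0 {
        return (0, 0);
    }
    // Ceiling division: a partial page still needs a whole page.
    let pages = (num_tokens + page_size - 1) / page_size;
    let last_len = num_tokens - (pages - 1) * page_size;
    (pages, last_len)
}

/// Positions for `n` new tokens appended after `cached` tokens.
/// Assumption: positions are absolute, zero-based sequence indices.
fn next_positions(cached: u32, n: u32) -> Vec<u32> {
    (cached..cached + n).collect()
}
```

For example, with a page size of 16, a 35-token sequence occupies three pages and the last page holds 3 tokens, so the caller would allocate three pages and pass 3 as last-kv-page-len.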
Output and Sampling Configuration
These functions define what the model should output and how to sample the resulting logits.
output-tokens(...): Generates output tokens using a specified temperature.
output-tokens-top-k(...): Samples from the k most likely tokens.
output-tokens-top-p(...): Samples from the smallest set of tokens whose cumulative probability exceeds p.
output-tokens-min-p(...): Samples from tokens whose probability is at least min-p * max(P), where max(P) is the highest token probability.
output-tokens-top-k-top-p(...): Samples with both a top-k and a top-p constraint.
output-embeddings(...): Retrieves the raw output embeddings for specified token indices.
output-distributions(...): Retrieves the probability distribution over the vocabulary, optionally limited to the top-k entries.
View forward WIT Definition
interface forward {
    use wasi:io/poll@0.2.4.{pollable};
    use common.{queue, pointer};

    resource forward-pass {
        execute: func() -> option<forward-pass-result>;
    }

    resource forward-pass-result {
        pollable: func() -> pollable;
        get-distributions: func() -> option<list<tuple<list<u32>, list<f32>>>>;
        get-tokens: func() -> option<list<u32>>;
    }

    create-forward-pass: func(queue: borrow<queue>) -> forward-pass;

    attention-mask: func(pass: borrow<forward-pass>, mask: list<list<u32>>);
    kv-cache: func(pass: borrow<forward-pass>, kv-page-ptrs: list<pointer>, last-kv-page-len: u32);
    input-embeddings: func(pass: borrow<forward-pass>, emb-ptrs: list<pointer>, positions: list<u32>);
    input-tokens: func(pass: borrow<forward-pass>, input-tokens: list<u32>, positions: list<u32>);

    output-embeddings: func(pass: borrow<forward-pass>, emb-ptrs: list<pointer>, indices: list<u32>);
    output-distributions: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, top-k: option<u32>);
    output-tokens: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32);
    output-tokens-top-k: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, top-k: u32);
    output-tokens-top-p: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, top-p: f32);
    output-tokens-min-p: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, min-p: f32);
    output-tokens-top-k-top-p: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, top-k: u32, top-p: f32);
}
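To make the difference between the sampling variants concrete, the Rust sketch below computes which token indices survive each filter for a given probability vector. It follows the textbook definitions of top-k, top-p, and min-p as summarized above; how the runtime actually implements them (tie-breaking, interaction with temperature, renormalization) is not specified here, so treat this as an illustration only. All function names are hypothetical, and temperature scaling and the final random draw are omitted.

```rust
/// Token indices sorted by descending probability.
/// Panics if `probs` contains NaN.
fn sorted_indices(probs: &[f32]) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..probs.len()).collect();
    idx.sort_by(|&a, &b| probs[b].partial_cmp(&probs[a]).unwrap());
    idx
}

/// Top-k: keep the `k` most likely tokens.
fn top_k(probs: &[f32], k: usize) -> Vec<usize> {
    sorted_indices(probs).into_iter().take(k).collect()
}

/// Top-p: keep the smallest set of most likely tokens whose
/// cumulative probability exceeds `p`.
fn top_p(probs: &[f32], p: f32) -> Vec<usize> {
    let mut kept = Vec::new();
    let mut cumulative = 0.0_f32;
    for i in sorted_indices(probs) {
        kept.push(i);
        cumulative += probs[i];
        if cumulative > p {
            break;
        }
    }
    kept
}

/// Min-p: keep tokens with probability >= ratio * max(P).
fn min_p(probs: &[f32], ratio: f32) -> Vec<usize> {
    let max = probs.iter().cloned().fold(0.0_f32, f32::max);
    (0..probs.len())
        .filter(|&i| probs[i] >= ratio * max)
        .collect()
}
```

With probabilities [0.1, 0.5, 0.05, 0.35], top-k with k = 2, top-p with p = 0.8, and min-p with a ratio of 0.5 all keep tokens 1 and 3. The filters diverge on flatter or more peaked distributions, which is why the interface exposes them separately.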
kvs Interface
Provides a simple, persistent key-value storage mechanism. This is useful for saving state, configuration, or other data across inferlet invocations.
store-get(key): Retrieves the value for a given key.
store-set(key, value): Creates or overwrites an entry.
store-delete(key): Removes an entry.
store-exists(key): Checks if a key exists.
store-list-keys(): Returns all keys in the store.
View kvs WIT Definition
interface kvs {
    store-get: func(key: string) -> option<string>;
    store-set: func(key: string, value: string);
    store-delete: func(key: string);
    store-exists: func(key: string) -> bool;
    store-list-keys: func() -> list<string>;
}
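Because both keys and values are strings, any structured state has to be serialized by the inferlet. The Rust sketch below shows one way this might look, using a hypothetical in-memory MockStore that mirrors the five kvs functions; the real store is provided by the host, and its persistence and ordering guarantees are not covered by this sketch. The bump_run_count helper and the "run-count" key are illustrative only.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for the host-provided store.
#[derive(Default)]
struct MockStore {
    entries: BTreeMap<String, String>,
}

impl MockStore {
    /// Like `store-get`: `None` when the key is absent.
    fn store_get(&self, key: &str) -> Option<String> {
        self.entries.get(key).cloned()
    }

    /// Like `store-set`: creates or overwrites an entry.
    fn store_set(&mut self, key: &str, value: &str) {
        self.entries.insert(key.to_string(), value.to_string());
    }

    /// Like `store-delete`: removes an entry if present.
    fn store_delete(&mut self, key: &str) {
        self.entries.remove(key);
    }

    /// Like `store-exists`.
    fn store_exists(&self, key: &str) -> bool {
        self.entries.contains_key(key)
    }

    /// Like `store-list-keys`; this mock returns keys in sorted order.
    fn store_list_keys(&self) -> Vec<String> {
        self.entries.keys().cloned().collect()
    }
}

/// Keeps a counter across invocations by storing it as a string.
/// A missing or unparsable value is treated as zero.
fn bump_run_count(store: &mut MockStore) -> u64 {
    let previous = store
        .store_get("run-count")
        .and_then(|v| v.parse::<u64>().ok())
        .unwrap_or(0);
    let next = previous + 1;
    store.store_set("run-count", &next.to_string());
    next
}
```

The option return type of store-get lets callers distinguish a missing key from an empty string, which is why the counter falls back to zero only when the lookup or the parse fails.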
message Interface
This interface handles communication between the inferlet and the outside world, supporting direct messaging, binary data transfer, and a publish-subscribe model.
Resources
subscription: A handle to a topic subscription, used to poll for and receive broadcast messages.
receive-result: An async resource for polling the result of a direct receive operation.
Functions
Direct Messaging
send(message) / receive(): Send and receive string-based messages to/from the host.
send-blob(blob) / receive-blob(): Send and receive binary data using the blob resource.
Publish/Subscribe
broadcast(topic, message): Publishes a message to a named topic for all subscribers.
subscribe(topic): Subscribes to a topic, returning a subscription handle.
View message WIT Definition
interface message {
    use wasi:io/poll@0.2.4.{pollable};
    use common.{blob, blob-result};

    resource subscription {
        pollable: func() -> pollable;
        get: func() -> option<string>;
        unsubscribe: func();
    }

    resource receive-result {
        pollable: func() -> pollable;
        get: func() -> option<string>;
    }

    send: func(message: string);
    receive: func() -> receive-result;
    send-blob: func(blob: blob);
    receive-blob: func() -> blob-result;
    broadcast: func(topic: string, message: string);
    subscribe: func(topic: string) -> subscription;
}
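The publish/subscribe functions can be pictured as a broker that keeps one inbox per subscriber. The Rust sketch below models that with a hypothetical in-process MockBroker; it is not how the host is implemented. It assumes that each subscriber receives its own copy of every message, that messages broadcast before a subscription exists are not delivered to it, and that get returns None when nothing is pending, mirroring the option<string> return type.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical in-process broker standing in for the host.
#[derive(Default)]
struct MockBroker {
    /// topic -> one inbox per subscriber
    topics: HashMap<String, Vec<VecDeque<String>>>,
}

/// Identifies one subscriber's inbox on a topic.
struct Subscription {
    topic: String,
    slot: usize,
}

impl MockBroker {
    /// Like `subscribe`: registers a new inbox on the topic.
    fn subscribe(&mut self, topic: &str) -> Subscription {
        let inboxes = self.topics.entry(topic.to_string()).or_default();
        inboxes.push(VecDeque::new());
        Subscription {
            topic: topic.to_string(),
            slot: inboxes.len() - 1,
        }
    }

    /// Like `broadcast`: copies the message into every inbox on the topic.
    fn broadcast(&mut self, topic: &str, message: &str) {
        if let Some(inboxes) = self.topics.get_mut(topic) {
            for inbox in inboxes.iter_mut() {
                inbox.push_back(message.to_string());
            }
        }
    }

    /// Like `subscription.get`: `None` means no message is ready yet.
    fn get(&mut self, sub: &Subscription) -> Option<String> {
        self.topics
            .get_mut(&sub.topic)?
            .get_mut(sub.slot)?
            .pop_front()
    }
}
```

In a real inferlet, a caller would typically wait on the subscription's pollable before calling get, rather than calling get in a busy loop.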
runtime Interface
The runtime interface provides access to the execution environment, allowing the inferlet to get information about itself, its arguments, and the available models.
Functions
get-version(): Returns the runtime version string.
get-instance-id(): Returns a unique ID for the current inferlet instance.
get-arguments(): Retrieves the command-line arguments passed to the inferlet.
set-return(value): Sets the final return value of the inferlet upon exit.
get-model(name): Retrieves a model resource by its name.
get-all-models(): Lists the names of all available models.
get-all-models-with-traits(traits): Filters models by required traits (e.g., "tokenize").
debug-query(query): Executes a runtime-specific debug command.
View runtime WIT Definition
interface runtime {
    use wasi:io/poll@0.2.4.{pollable};
    use common.{model, debug-query-result};

    get-version: func() -> string;
    get-instance-id: func() -> string;
    get-arguments: func() -> list<string>;
    set-return: func(value: string);
    get-model: func(name: string) -> option<model>;
    get-all-models: func() -> list<string>;
    get-all-models-with-traits: func(traits: list<string>) -> list<string>;
    debug-query: func(query: string) -> debug-query-result;
}
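As an illustration of trait-based model selection, the Rust sketch below filters a list of models by required traits. It assumes that get-all-models-with-traits returns only models that have every requested trait; the definition above does not say whether the match is "all" or "any". The ModelInfo struct and models_with_traits function are hypothetical stand-ins for data that the real runtime would supply.

```rust
/// Hypothetical snapshot of what `model.get-name` and
/// `model.get-traits` might report for one model.
struct ModelInfo {
    name: String,
    traits: Vec<String>,
}

/// Returns the names of models that list every required trait.
/// Assumption: the runtime uses "all of", not "any of", matching.
fn models_with_traits(models: &[ModelInfo], required: &[&str]) -> Vec<String> {
    models
        .iter()
        .filter(|m| {
            required
                .iter()
                .all(|r| m.traits.iter().any(|t| t == r))
        })
        .map(|m| m.name.clone())
        .collect()
}
```

Since get-model returns option<model>, a caller would still need to handle the case where a name from this list can no longer be resolved.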
tokenize Interface
This interface provides text tokenization capabilities specific to a given model.
Resource
tokenizer: A resource that can convert text to token IDs and back.
Functions
get-tokenizer(model): Retrieves the tokenizer associated with a specific model.
tokenizer.tokenize(text): Converts a string into a list of token IDs.
tokenizer.detokenize(tokens): Converts a list of token IDs back into a string.
tokenizer.get-vocabs(): Returns the model's vocabulary as two lists: the token IDs and the corresponding tokens as byte sequences.
View tokenize WIT Definition
interface tokenize {
    use common.{model};

    get-tokenizer: func(queue: borrow<model>) -> tokenizer;

    resource tokenizer {
        tokenize: func(text: string) -> list<u32>;
        detokenize: func(tokens: list<u32>) -> string;
        get-vocabs: func() -> tuple<list<u32>, list<list<u8>>>;
    }
}
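The get-vocabs return type, tuple<list<u32>, list<list<u8>>>, suggests two parallel lists: token IDs and the raw bytes of each token. The Rust sketch below builds a lookup table from such a pair and decodes a token sequence by concatenating bytes. It assumes that entry i of each list describes the same token, which the definition does not state, and the function names build_vocab and decode are hypothetical. In practice tokenizer.detokenize should be preferred; this only illustrates why tokens are exposed as bytes rather than strings.

```rust
use std::collections::HashMap;

/// Builds an id -> bytes lookup from the two lists returned by
/// `get-vocabs`, assuming the lists are parallel.
fn build_vocab(ids: Vec<u32>, tokens: Vec<Vec<u8>>) -> HashMap<u32, Vec<u8>> {
    ids.into_iter().zip(tokens).collect()
}

/// Concatenates the bytes of each token, then decodes once at the end.
/// Decoding per token would break characters that span several tokens.
/// Unknown IDs are skipped; invalid UTF-8 is replaced lossily.
fn decode(vocab: &HashMap<u32, Vec<u8>>, token_ids: &[u32]) -> String {
    let mut bytes = Vec::new();
    for id in token_ids {
        if let Some(token) = vocab.get(id) {
            bytes.extend_from_slice(token);
        }
    }
    String::from_utf8_lossy(&bytes).into_owned()
}
```

A single token may hold only part of a multi-byte UTF-8 character, so an individual token is not always a valid string on its own. Decoding after concatenation handles that case.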