Core WIT Definitions
The inferlet:core package defines the complete set of APIs available to an inferlet in a single world (named imports in the definition below). It aggregates six specialized interfaces, each responsible for a distinct area of functionality, from model interaction and runtime environment access to messaging and data storage.
This document provides a concise overview of each interface and its components.
package inferlet:core;

world imports {
    import common;
    import runtime;
    import forward;
    import tokenize;
    import kvs;
    import message;
}
common Interface
The common interface provides fundamental types, resources, and functions that are shared across the other interfaces. It forms the foundation for interacting with models and managing resources.
Core Types
pointer: A u32 representing a memory address or resource handle.
priority: An enum (low, normal, high) for setting the execution priority of a command queue.
Resources
blob / blob-result: Used for handling raw binary data. blob-result is an async resource for polling the result of a blob receive operation.
model: Represents a specific model instance, providing methods to query its metadata, such as name, traits, prompt template, and stop tokens. It is the entry point for creating a command queue.
queue: Represents a command queue for a specific model. It is used to synchronize operations, set priority, and manage resources.
synchronization-result: An async resource to check the status of a synchronize operation on a queue.
debug-query-result: An async resource to get the string result from a debug query.
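The async resources listed here share one shape: a pollable to wait on, plus a get that returns an option which stays empty until the operation has finished. The sketch below models only that contract in plain Python. AsyncResult, ready, and complete are hypothetical names invented for illustration and are not part of the inferlet bindings.

```python
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class AsyncResult(Generic[T]):
    """Hypothetical in-memory stand-in for resources such as
    synchronization-result or debug-query-result."""

    def __init__(self) -> None:
        self._value: Optional[T] = None
        self._done = False

    def ready(self) -> bool:
        # Plays the role of checking the pollable; never blocks.
        return self._done

    def get(self) -> Optional[T]:
        # Mirrors `get: func() -> option<...>`: None until a value exists.
        return self._value if self._done else None

    def complete(self, value: T) -> None:
        # Called by the simulated host when the operation finishes.
        self._value = value
        self._done = True


result: AsyncResult[bool] = AsyncResult()
before = result.get()   # operation still pending
result.complete(True)   # simulated host finishes the synchronize
after = result.get()    # value is now available
```

In a real inferlet the waiting step would presumably go through the WASI pollable rather than a flag, so the guest can suspend instead of spinning.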
Resource Management Functions
These functions operate on a queue to manage shared memory resources, enabling efficient data transfer and interoperability between inferlets.
allocate-resources(...): Allocates a specified number of a given resource type.
deallocate-resources(...): Frees previously allocated resources.
export-resources(...): Makes a set of resources available to other inferlets under a given name.
import-resources(...): Accesses resources exported by another inferlet by name.
get-all-exported-resources(...): Lists all resources currently exported.
release-exported-resources(...): Removes an export, making the resources private again.
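The lifecycle these functions describe (allocate, export under a name, import by name, release) can be sketched with a small in-memory model. This is a toy for a single resource type, not the runtime's implementation. ResourcePool and its methods are hypothetical, and the sketch assumes that the u32 returned alongside each name by get-all-exported-resources is the number of exported pointers.

```python
from typing import Dict, List, Set, Tuple


class ResourcePool:
    """Hypothetical model of one resource type behind a queue."""

    def __init__(self) -> None:
        self._next = 0
        self._live: Set[int] = set()
        self._exports: Dict[str, List[int]] = {}

    def allocate(self, count: int) -> List[int]:
        # Hand out `count` fresh pointers (plain integers here).
        ptrs = list(range(self._next, self._next + count))
        self._next += count
        self._live.update(ptrs)
        return ptrs

    def deallocate(self, ptrs: List[int]) -> None:
        self._live.difference_update(ptrs)

    def export(self, ptrs: List[int], name: str) -> None:
        # Publish the pointers under a name other inferlets can look up.
        self._exports[name] = list(ptrs)

    def import_(self, name: str) -> List[int]:
        return list(self._exports[name])

    def all_exported(self) -> List[Tuple[str, int]]:
        # Assumption: the second tuple element is a pointer count.
        return [(name, len(ptrs)) for name, ptrs in self._exports.items()]

    def release(self, name: str) -> None:
        # Remove the export; the pointers themselves stay allocated.
        del self._exports[name]


pool = ResourcePool()
pages = pool.allocate(3)
pool.export(pages, "shared-prefix")
imported = pool.import_("shared-prefix")
listing = pool.all_exported()
pool.release("shared-prefix")
after_release = pool.all_exported()
```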
View common WIT Definition

interface common {
    use wasi:io/poll@0.2.4.{pollable};

    type pointer = u32;

    enum priority { low, normal, high }

    resource blob {
        constructor(init: list<u8>);
        read: func(offset: u64, n: u64) -> list<u8>;
        size: func() -> u64;
    }

    resource blob-result {
        pollable: func() -> pollable;
        get: func() -> option<blob>;
    }

    resource model {
        get-name: func() -> string;
        get-traits: func() -> list<string>;
        get-description: func() -> string;
        get-prompt-template: func() -> string;
        get-stop-tokens: func() -> list<string>;
        get-service-id: func() -> u32;
        get-kv-page-size: func() -> u32;
        create-queue: func() -> queue;
    }

    resource queue {
        get-service-id: func() -> u32;
        synchronize: func() -> synchronization-result;
        set-priority: func(priority: priority);
        debug-query: func(query: string) -> debug-query-result;
    }

    resource synchronization-result {
        pollable: func() -> pollable;
        get: func() -> option<bool>;
    }

    resource debug-query-result {
        pollable: func() -> pollable;
        get: func() -> option<string>;
    }

    allocate-resources: func(queue: borrow<queue>, resource-type: u32, count: u32) -> list<pointer>;
    deallocate-resources: func(queue: borrow<queue>, resource-type: u32, ptrs: list<pointer>);
    get-all-exported-resources: func(queue: borrow<queue>, resource-type: u32) -> list<tuple<string, u32>>;
    release-exported-resources: func(queue: borrow<queue>, resource-type: u32, name: string);
    export-resources: func(queue: borrow<queue>, resource-type: u32, ptrs: list<pointer>, name: string);
    import-resources: func(queue: borrow<queue>, resource-type: u32, name: string) -> list<pointer>;
}
forward Interface
This interface defines the building blocks for executing a model's forward pass. It allows for detailed control over inputs (tokens, embeddings, KV cache) and outputs (sampling methods).
Resources
forward-pass: Represents a single forward pass operation. You configure it with inputs and sampling parameters before calling execute.
forward-pass-result: An async resource to poll and retrieve the results of a forward-pass, which can be token IDs or full probability distributions.
Functions
The workflow involves creating a forward-pass, configuring it with input and output functions, and then executing it.
create-forward-pass(...): Creates a new forward-pass object for a given queue.
Input Configuration
input-tokens(...): Sets the input token IDs and their positions.
input-embeddings(...): Sets pre-computed input embeddings.
kv-cache(...): Provides the KV cache pages for stateful generation.
attention-mask(...): Applies an attention mask to the input.
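kv-cache takes a list of page pointers together with a last-kv-page-len value, and model exposes get-kv-page-size. The exact semantics are not spelled out here, so the arithmetic below is a sketch under two assumptions: each page holds page-size token slots filled in order, and last-kv-page-len is the number of occupied slots in the final page. kv_layout is a hypothetical helper, not part of the API.

```python
from typing import Tuple


def kv_layout(num_tokens: int, page_size: int) -> Tuple[int, int]:
    """Return (pages needed, occupied slots in the last page).

    Assumes pages are filled front to back with `page_size` tokens each.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    if num_tokens <= 0:
        return (0, 0)
    num_pages = -(-num_tokens // page_size)  # ceiling division
    last_page_len = num_tokens - (num_pages - 1) * page_size
    return (num_pages, last_page_len)


layout_partial = kv_layout(10, 4)  # last page is partly filled
layout_full = kv_layout(8, 4)      # last page is exactly full
```

Under these assumptions, 10 tokens with a page size of 4 need three pages with two slots used in the last one, while 8 tokens fill two pages completely.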
Output and Sampling Configuration
These functions define what the model should output and how to sample the resulting logits.
output-tokens(...): Samples output tokens at a specified temperature.
output-tokens-top-k(...): Samples from the k most likely tokens.
output-tokens-top-p(...): Samples from the smallest set of tokens whose cumulative probability exceeds p.
output-tokens-min-p(...): Samples from tokens whose probability is at least min-p * max(P).
output-tokens-top-k-top-p(...): Samples with both a top-k and a top-p limit.
output-embeddings(...): Retrieves the raw output embeddings for specified token indices.
output-distributions(...): Retrieves the full probability distribution over the vocabulary.
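The three filtering rules above differ only in which tokens survive before sampling. The sketch below applies each rule to a small probability list and renormalizes the survivors. It illustrates the textbook definitions, not the runtime's sampler: temperature scaling is omitted, all function names are hypothetical, and top_p stops as soon as the cumulative probability reaches p (a common convention, assumed here).

```python
from typing import Dict, List


def _by_probability(probs: List[float]) -> List[int]:
    # Token indices ordered from most to least likely.
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)


def _renormalize(probs: List[float], keep: List[int]) -> Dict[int, float]:
    # Rescale the surviving probabilities so they sum to 1.
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def top_k(probs: List[float], k: int) -> Dict[int, float]:
    return _renormalize(probs, _by_probability(probs)[:k])


def top_p(probs: List[float], p: float) -> Dict[int, float]:
    keep: List[int] = []
    cumulative = 0.0
    for i in _by_probability(probs):
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:  # assumption: stop once p is reached
            break
    return _renormalize(probs, keep)


def min_p(probs: List[float], min_p_value: float) -> Dict[int, float]:
    # Keep tokens at least `min_p_value` times as likely as the best token.
    threshold = min_p_value * max(probs)
    keep = [i for i, q in enumerate(probs) if q >= threshold]
    return _renormalize(probs, keep)


probs = [0.5, 0.3, 0.15, 0.05]
kept_top_k = sorted(top_k(probs, 2))
kept_top_p = sorted(top_p(probs, 0.7))
kept_min_p = sorted(min_p(probs, 0.2))
```

With this distribution, top-k with k = 2 and top-p with p = 0.7 both keep the two most likely tokens, while min-p with 0.2 keeps every token whose probability is at least 0.1.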
View forward WIT Definition

interface forward {
    use wasi:io/poll@0.2.4.{pollable};
    use common.{queue, pointer};

    resource forward-pass {
        execute: func() -> option<forward-pass-result>;
    }

    resource forward-pass-result {
        pollable: func() -> pollable;
        get-distributions: func() -> option<list<tuple<list<u32>, list<f32>>>>;
        get-tokens: func() -> option<list<u32>>;
    }

    create-forward-pass: func(queue: borrow<queue>) -> forward-pass;

    attention-mask: func(pass: borrow<forward-pass>, mask: list<list<u32>>);
    kv-cache: func(pass: borrow<forward-pass>, kv-page-ptrs: list<pointer>, last-kv-page-len: u32);
    input-embeddings: func(pass: borrow<forward-pass>, emb-ptrs: list<pointer>, positions: list<u32>);
    input-tokens: func(pass: borrow<forward-pass>, input-tokens: list<u32>, positions: list<u32>);

    output-embeddings: func(pass: borrow<forward-pass>, emb-ptrs: list<pointer>, indices: list<u32>);
    output-distributions: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, top-k: option<u32>);
    output-tokens: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32);
    output-tokens-top-k: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, top-k: u32);
    output-tokens-top-p: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, top-p: f32);
    output-tokens-min-p: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, min-p: f32);
    output-tokens-top-k-top-p: func(pass: borrow<forward-pass>, indices: list<u32>, temperature: f32, top-k: u32, top-p: f32);
}
kvs Interface
Provides a simple, persistent key-value storage mechanism. This is useful for saving state, configuration, or other data across inferlet invocations.
store-get(key): Retrieves the value for a given key.
store-set(key, value): Creates or overwrites an entry.
store-delete(key): Removes an entry.
store-exists(key): Checks if a key exists.
store-list-keys(): Returns all keys in the store.
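The contract of these five functions can be sketched with a dictionary-backed model. The class is a hypothetical stand-in whose method names mirror the WIT functions. It is not persistent, and it assumes that deleting a missing key is a silent no-op, which the interface does not specify.

```python
from typing import Dict, List, Optional


class KeyValueStore:
    """Hypothetical in-memory model of the kvs interface."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def store_get(self, key: str) -> Optional[str]:
        # Mirrors option<string>: None when the key is absent.
        return self._data.get(key)

    def store_set(self, key: str, value: str) -> None:
        # Creates the entry or overwrites an existing one.
        self._data[key] = value

    def store_delete(self, key: str) -> None:
        # Assumption: deleting a missing key does nothing.
        self._data.pop(key, None)

    def store_exists(self, key: str) -> bool:
        return key in self._data

    def store_list_keys(self) -> List[str]:
        return list(self._data)


store = KeyValueStore()
store.store_set("turn", "1")
store.store_set("turn", "2")       # overwrite
current = store.store_get("turn")
missing = store.store_get("absent")
store.store_delete("turn")
exists_after_delete = store.store_exists("turn")
```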
View kvs WIT Definition

interface kvs {
    store-get: func(key: string) -> option<string>;
    store-set: func(key: string, value: string);
    store-delete: func(key: string);
    store-exists: func(key: string) -> bool;
    store-list-keys: func() -> list<string>;
}
message Interface
This interface handles communication between the inferlet and the outside world, supporting direct messaging, binary data transfer, and a publish-subscribe model.
Resources
subscription: A handle to a topic subscription, used to poll for and receive broadcast messages.
receive-result: An async resource for polling the result of a direct receive operation.
Functions
Direct Messaging
send(message) / receive(): Send and receive string-based messages to/from the host.
send-blob(blob) / receive-blob(): Send and receive binary data using the blob resource.
Publish/Subscribe
broadcast(topic, message): Publishes a message to a named topic for all subscribers.
subscribe(topic): Subscribes to a topic, returning a subscription handle.
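The publish/subscribe flow can be sketched as a broker that keeps one inbox per subscription. Broker and Subscription are hypothetical classes built only to show the semantics described above. The sketch assumes each subscriber receives its own copy of every message, in order, and that get returns nothing when the inbox is empty.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class Subscription:
    """Hypothetical handle with its own message inbox."""

    def __init__(self, broker: "Broker", topic: str) -> None:
        self._broker = broker
        self._topic = topic
        self._inbox: Deque[str] = deque()

    def get(self) -> Optional[str]:
        # Mirrors option<string>: None when nothing is waiting.
        return self._inbox.popleft() if self._inbox else None

    def unsubscribe(self) -> None:
        self._broker._remove(self._topic, self)


class Broker:
    def __init__(self) -> None:
        self._topics: Dict[str, List[Subscription]] = {}

    def subscribe(self, topic: str) -> Subscription:
        sub = Subscription(self, topic)
        self._topics.setdefault(topic, []).append(sub)
        return sub

    def broadcast(self, topic: str, message: str) -> None:
        # Deliver a copy of the message to every current subscriber.
        for sub in self._topics.get(topic, []):
            sub._inbox.append(message)

    def _remove(self, topic: str, sub: Subscription) -> None:
        subs = self._topics.get(topic, [])
        if sub in subs:
            subs.remove(sub)


broker = Broker()
first = broker.subscribe("status")
second = broker.subscribe("status")
broker.broadcast("status", "ready")
got_first = first.get()
got_second = second.get()
second.unsubscribe()
broker.broadcast("status", "done")
got_after_unsubscribe = second.get()
```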
View message WIT Definition

interface message {
    use wasi:io/poll@0.2.4.{pollable};
    use common.{blob, blob-result};

    resource subscription {
        pollable: func() -> pollable;
        get: func() -> option<string>;
        unsubscribe: func();
    }

    resource receive-result {
        pollable: func() -> pollable;
        get: func() -> option<string>;
    }

    send: func(message: string);
    receive: func() -> receive-result;
    send-blob: func(blob: blob);
    receive-blob: func() -> blob-result;
    broadcast: func(topic: string, message: string);
    subscribe: func(topic: string) -> subscription;
}
runtime Interface
The runtime interface provides access to the execution environment, allowing the inferlet to get information about itself, its arguments, and the available models.
Functions
get-version(): Returns the runtime version string.
get-instance-id(): Returns a unique ID for the current inferlet instance.
get-arguments(): Retrieves the command-line arguments passed to the inferlet.
set-return(value): Sets the final return value of the inferlet upon exit.
get-model(name): Retrieves a model resource by its name.
get-all-models(): Lists the names of all available models.
get-all-models-with-traits(traits): Filters models by required traits (e.g., "tokenize").
debug-query(query): Executes a runtime-specific debug command.
View runtime WIT Definition

interface runtime {
    use wasi:io/poll@0.2.4.{pollable};
    use common.{model, debug-query-result};

    get-version: func() -> string;
    get-instance-id: func() -> string;
    get-arguments: func() -> list<string>;
    set-return: func(value: string);
    get-model: func(name: string) -> option<model>;
    get-all-models: func() -> list<string>;
    get-all-models-with-traits: func(traits: list<string>) -> list<string>;
    debug-query: func(query: string) -> debug-query-result;
}
tokenize Interface
This interface provides text tokenization capabilities specific to a given model.
Resource
tokenizer: A resource that can convert text to token IDs and back.
Functions
get-tokenizer(model): Retrieves the tokenizer associated with a specific model.
tokenizer.tokenize(text): Converts a string into a list of token IDs.
tokenizer.detokenize(tokens): Converts a list of token IDs back into a string.
tokenizer.get-vocabs(): Returns the model's vocabulary as a list of tokens and their corresponding IDs.
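get-vocabs returns two lists, one of token IDs and one of raw token bytes. The sketch below assumes the lists are parallel (the i-th ID belongs to the i-th byte string) and shows one way to turn them into a lookup table. build_vocab and decode are hypothetical helpers. Real detokenization may apply model-specific rules beyond joining bytes, so decode is only an approximation.

```python
from typing import Dict, List, Tuple


def build_vocab(vocabs: Tuple[List[int], List[bytes]]) -> Dict[int, bytes]:
    """Map each token ID to its byte string.

    Assumes the two lists are parallel and of equal length.
    """
    ids, tokens = vocabs
    if len(ids) != len(tokens):
        raise ValueError("ID and token lists must have the same length")
    return dict(zip(ids, tokens))


def decode(vocab: Dict[int, bytes], token_ids: List[int]) -> str:
    # Join at the byte level first: one token may hold only part of a
    # multi-byte UTF-8 character.
    raw = b"".join(vocab[t] for t in token_ids)
    return raw.decode("utf-8", errors="replace")


# Toy vocabulary standing in for the result of get-vocabs.
toy_vocabs = ([0, 1, 2], [b"Hel", b"lo", b" world"])
vocab = build_vocab(toy_vocabs)
text = decode(vocab, [0, 1, 2])
```

Keeping the values as bytes rather than strings matters because the WIT signature returns list<u8> per token, which suggests tokens are not guaranteed to be valid UTF-8 on their own.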
View tokenize WIT Definition

interface tokenize {
    use common.{model};

    get-tokenizer: func(queue: borrow<model>) -> tokenizer;

    resource tokenizer {
        tokenize: func(text: string) -> list<u32>;
        detokenize: func(tokens: list<u32>) -> string;
        get-vocabs: func() -> tuple<list<u32>, list<list<u8>>>;
    }
}