
Adapters

LoRA adapters are runtime-managed overlays scoped to a model. The full lifecycle (create, open, fork, save, load, destroy) is available in all three SDKs. This page covers using an adapter at inference. Read this after The forward pass.

Running on the dummy driver?

Adapter ops are no-ops on the dummy driver: Adapter::create, load, fork, save, swap, and destroy all succeed, but inference output is identical with or without an adapter attached.

Using an adapter

Attach an adapter to a Generator or to a manual forward() pass. The attachment is scoped to that generator call or that one pass; subsequent passes run without an adapter unless you attach it again.

use inferlet::adapter::Adapter;

let adapter = Adapter::open(&model, "my-adapter").ok_or("adapter not found")?;

let text = ctx
    .generate(sampler)
    .adapter(&adapter)
    .max_tokens(256)
    .collect_text()
    .await?;

// Or on a manual forward pass:
let mut fwd = ctx.forward();
fwd.adapter(&adapter);
fwd.input_at(&tokens, &positions);
let out = fwd.execute().await?;

Adapters compose with all other forward-pass primitives. You can attach an adapter and apply a constraint and sample with TopP on the same pass.
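The per-pass scoping described above can be illustrated with a small self-contained model. This is only a sketch: the `Adapter` and `ForwardPass` types and the `run_passes` helper below are hypothetical stand-ins written for this example, not the inferlet types, and `execute` merely reports which weights a pass would use.

```rust
/// Hypothetical stand-in for an adapter handle; not the inferlet type.
struct Adapter {
    name: String,
}

/// Hypothetical stand-in for a single forward pass.
struct ForwardPass<'a> {
    adapter: Option<&'a Adapter>,
}

impl<'a> ForwardPass<'a> {
    /// Every new pass starts with no adapter attached.
    fn new() -> Self {
        ForwardPass { adapter: None }
    }

    /// Attach an adapter to this pass only.
    fn adapter(&mut self, adapter: &'a Adapter) {
        self.adapter = Some(adapter);
    }

    /// Report which weights this pass would run with.
    fn execute(self) -> String {
        match self.adapter {
            Some(a) => format!("base + {}", a.name),
            None => String::from("base"),
        }
    }
}

/// Run `n` passes, attaching the adapter only on the pass indices in `attach_on`.
fn run_passes(adapter: &Adapter, n: usize, attach_on: &[usize]) -> Vec<String> {
    (0..n)
        .map(|i| {
            let mut fwd = ForwardPass::new();
            if attach_on.contains(&i) {
                fwd.adapter(adapter);
            }
            fwd.execute()
        })
        .collect()
}
```

In this model, a pass that skips the `adapter` call falls back to the base weights, which is the reason for re-attaching on every pass that should use the adapter.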

Lifecycle

Adapters are named at the engine level. The lifecycle methods follow the same shape across SDKs.

use inferlet::adapter::Adapter;

// Create an empty adapter
let a = Adapter::create(&model, "draft-adapter")?;

// Open an existing adapter
let a = Adapter::open(&model, "draft-adapter").ok_or("not found")?;

// Fork: O(1) copy-on-write clone under a new name
let a2 = a.fork("draft-adapter-v2");

// Save weights to a file (host filesystem)
a.save("/data/adapters/draft-v1.bin")?;

// Load weights back (for example, into a freshly created adapter after a restart)
let a = Adapter::create(&model, "draft-adapter")?;
a.load("/data/adapters/draft-v1.bin")?;

// Drop the registry slot (instance method, consumes the handle)
a.destroy();

A few details that bite the first time:

  • fork(new_name) requires the new adapter's name. Adapters are registered at the engine level, so the fork needs an identity.
  • destroy() is an instance method that consumes the handle (Rust) or releases the slot (JS). Python has no explicit destroy; the slot is released when the object is garbage-collected.
  • Adapter::open returns Option<Adapter> (Rust) / Adapter | None (Python) / Adapter | undefined (JS). Check before use.
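Two of the points above, lookup that can come back empty and forking under a new name, can be sketched with a toy registry. Everything here is an assumption made for illustration: `Registry`, its methods, and the `Vec<f32>` weight buffer are hypothetical and say nothing about how the engine stores adapters. The sketch uses `Rc` so that a fork clones only a reference, and `Rc::make_mut` so that a buffer is copied on the first write while it is still shared.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical in-memory registry keyed by adapter name; not the inferlet API.
#[derive(Default)]
struct Registry {
    slots: HashMap<String, Rc<Vec<f32>>>,
}

impl Registry {
    /// Register a new adapter. Assumed here to fail if the name is taken.
    fn create(&mut self, name: &str, weights: Vec<f32>) -> Result<(), String> {
        if self.slots.contains_key(name) {
            return Err(format!("adapter '{}' already exists", name));
        }
        self.slots.insert(name.to_string(), Rc::new(weights));
        Ok(())
    }

    /// Look up by name; `None` when no adapter is registered under it.
    fn open(&self, name: &str) -> Option<Rc<Vec<f32>>> {
        self.slots.get(name).cloned()
    }

    /// Fork under a new name. Only the reference is cloned, so this is O(1).
    fn fork(&mut self, src: &str, new_name: &str) -> Result<(), String> {
        let shared = self
            .open(src)
            .ok_or_else(|| format!("adapter '{}' not found", src))?;
        if self.slots.contains_key(new_name) {
            return Err(format!("adapter '{}' already exists", new_name));
        }
        self.slots.insert(new_name.to_string(), shared);
        Ok(())
    }

    /// Write one weight. The buffer is copied first only if it is still shared.
    fn set_weight(&mut self, name: &str, index: usize, value: f32) -> Result<(), String> {
        let slot = self
            .slots
            .get_mut(name)
            .ok_or_else(|| format!("adapter '{}' not found", name))?;
        let weights = Rc::make_mut(slot);
        let w = weights
            .get_mut(index)
            .ok_or_else(|| String::from("index out of range"))?;
        *w = value;
        Ok(())
    }
}
```

In this model the fork needs a name because the registry is keyed by name, and a write to the fork leaves the original adapter's weights untouched.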

Like saved contexts, adapters persist across inferlet runs as long as the engine is up. They do not survive a server restart unless you save() to disk and load() back on startup.
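The save-then-restore pattern can be sketched as a startup check: restore from the saved file if one exists, otherwise start with an empty adapter. The code below is a self-contained model under stated assumptions. The functions `save_weights`, `load_weights`, and `restore_or_empty` are hypothetical, and the little-endian `f32` file layout is invented for the example; it is not the format that the real save() writes.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Write weights as little-endian f32 bytes (hypothetical on-disk layout).
fn save_weights(path: &Path, weights: &[f32]) -> io::Result<()> {
    let bytes: Vec<u8> = weights.iter().flat_map(|w| w.to_le_bytes()).collect();
    fs::write(path, bytes)
}

/// Read weights back, rejecting files whose length is not a multiple of 4 bytes.
fn load_weights(path: &Path) -> io::Result<Vec<f32>> {
    let bytes = fs::read(path)?;
    if bytes.len() % 4 != 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "truncated weight file",
        ));
    }
    Ok(bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect())
}

/// Startup step: restore saved weights if the file exists, else start empty.
fn restore_or_empty(path: &Path) -> io::Result<Vec<f32>> {
    if path.exists() {
        load_weights(path)
    } else {
        Ok(Vec::new())
    }
}
```

With the documented calls, the same shape would be a create followed by a load when the saved file is present, run once at startup before any inference uses the adapter.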
