# Chat and generation
The starting point. Each of these is a small inferlet that loads a model, builds a chat prompt, and streams a reply.
| Inferlet | What it shows |
|---|---|
| `helloworld` | Minimal `#[inferlet::main]` with typed JSON input/output. |
| `text-completion` | Chat-style generation via `Context::system` / `user` / `cue`, the per-step `Generator`, and `chat::Decoder`. |
| `text-completion-spec` | Same shape, but exercises the speculative-decoding path (`Speculation::Default` → backend `NGRAM`). |
| `python-example` | Text completion in Python. |
| `js-example` | Text completion in JavaScript. |
## Related guides
- Generation overview: the three-layer model that these inferlets drive.
- Generator: the multi-step loop they all use.
- Chat parser: the streaming text parser the chat-style examples feed.