Overview
ReplyToolMixin is the canonical bundled-tool shape for UIAgent. It
exposes a single LLM tool — reply(answer, ...) — that combines a
required spoken answer with the optional standard UI actions in one
turn-terminating call. Apps compose it by inheriting alongside their
UIAgent subclass:
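A minimal sketch of that composition follows. The two base classes here are local stand-ins so the snippet runs on its own; in a real app they would be imported from the library. Listing the mixin before UIAgent is an assumption based on the usual Python mixin convention, not something this page confirms.

```python
# Hypothetical stand-ins: the real UIAgent and ReplyToolMixin come from
# the library, and their constructors may take arguments not shown here.
class UIAgent:
    """Stub for the real UIAgent base class."""


class ReplyToolMixin:
    """Stub for the real mixin that contributes the reply(...) tool."""


class MyAgent(ReplyToolMixin, UIAgent):
    """App agent. The mixin is listed first (assumed ordering) so that
    its reply tool sits ahead of UIAgent in the method resolution order."""


# Inspect the resulting method resolution order.
mro_names = [cls.__name__ for cls in MyAgent.__mro__]
print(mro_names)
```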
Making answer a required argument, enforced by the API schema, keeps
the model from skipping the spoken reply: it can’t call reply(...)
without committing to what it will say.
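To illustrate how a schema can enforce this, here is a sketch of what a function-calling schema for reply might look like. The exact schema is generated by the framework and may differ; REPLY_SCHEMA and missing_required are hypothetical names used only for this example.

```python
# Illustrative shape only: the real schema is produced by the framework's
# tool discovery and is not reproduced here.
REPLY_SCHEMA = {
    "name": "reply",
    "parameters": {
        "type": "object",
        "properties": {
            "answer": {"type": "string"},
            "scroll_to": {"type": ["string", "null"]},
            "highlight": {"type": ["array", "null"], "items": {"type": "string"}},
            "select_text": {"type": ["string", "null"]},
            "fills": {"type": ["array", "null"], "items": {"type": "object"}},
            "click": {"type": ["array", "null"], "items": {"type": "string"}},
        },
        # Only the spoken answer is mandatory; every action is optional.
        "required": ["answer"],
    },
}


def missing_required(arguments: dict) -> list[str]:
    """Return the required argument names absent from a tool call."""
    required = REPLY_SCHEMA["parameters"]["required"]
    return [name for name in required if name not in arguments]


# A call that only highlights is rejected; one with an answer passes.
print(missing_required({"highlight": ["e3"]}))
print(missing_required({"answer": "Here is the save button."}))
```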
The mixin covers the canonical app shapes (pointing, reading, form-fill)
and any blend of them. Apps don’t pick a mode up front; the LLM uses
whichever fields fit the user’s request per turn, leaving the rest as
null.
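As a rough illustration of that per-turn choice, the sketch below builds three hypothetical argument payloads (pointing, reading, form-fill) and lists which optional fields each one actually uses. The ref values and the active_actions helper are invented for the example.

```python
# The five optional action fields described on this page.
OPTIONAL_FIELDS = ("scroll_to", "highlight", "select_text", "fills", "click")


def active_actions(arguments: dict) -> list[str]:
    """List the optional fields that are set (not null) in one reply call."""
    return [name for name in OPTIONAL_FIELDS if arguments.get(name) is not None]


# Hypothetical payloads; refs such as "e7" are made up.
pointing = {"answer": "The save button is right here.", "highlight": ["e7"]}
reading = {
    "answer": "This paragraph covers refunds.",
    "scroll_to": "e12",
    "select_text": "e12",
}
form_fill = {
    "answer": "I filled in your email and submitted the form.",
    "fills": [{"ref": "e4", "value": "ada@example.com"}],
    "click": ["e9"],
    "highlight": None,  # explicitly left null
}

for payload in (pointing, reading, form_fill):
    print(active_actions(payload))
```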
The host class must provide scroll_to, highlight, select_text, click,
set_input_value, and respond_to_task (UIAgent does) and must be the
target of @tool discovery on the LLM pipeline.
Apps with a different shape, such as multiple per-domain tools (play,
navigate_to_artist, etc.) where each tool is the whole turn, should skip
the mixin and write their own @tool methods. Use the action helpers
(scroll_to, highlight, select_text, click, set_input_value) on
UIAgent directly inside those tool bodies. See Choosing a tool
shape for the decision.
ReplyToolMixin
Exposes a single bundled reply(...) LLM tool. One call per turn, no
chaining; the required answer argument keeps the model from omitting the
spoken terminator.
reply
Actions run in a fixed order: scroll_to, then highlight, then
select_text, then fills, then click, then the spoken answer.
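The ordering above can be sketched as a small dispatcher. This is not the mixin's actual implementation: RecordingHost is a hypothetical stand-in for UIAgent, the helper signatures are assumed, and the real helpers may be async.

```python
class RecordingHost:
    """Hypothetical host that logs each helper call instead of acting."""

    def __init__(self) -> None:
        self.calls: list[tuple] = []

    def scroll_to(self, ref):
        self.calls.append(("scroll_to", ref))

    def highlight(self, refs):
        self.calls.append(("highlight", tuple(refs)))

    def select_text(self, ref):
        self.calls.append(("select_text", ref))

    def set_input_value(self, ref, value):
        self.calls.append(("set_input_value", ref, value))

    def click(self, ref):
        self.calls.append(("click", ref))

    def respond_to_task(self, answer):
        self.calls.append(("respond_to_task", answer))


def dispatch_reply(host, answer, scroll_to=None, highlight=None,
                   select_text=None, fills=None, click=None):
    """Run the optional actions in the documented order, then speak."""
    if scroll_to is not None:
        host.scroll_to(scroll_to)
    if highlight:
        host.highlight(highlight)
    if select_text is not None:
        host.select_text(select_text)
    for fill in fills or []:
        host.set_input_value(fill["ref"], fill["value"])
    for ref in click or []:
        host.click(ref)
    host.respond_to_task(answer)  # the spoken answer always comes last


host = RecordingHost()
# Keyword order at the call site does not affect the dispatch order.
dispatch_reply(
    host,
    "Done, I entered your email.",
    click=["e9"],
    fills=[{"ref": "e4", "value": "ada@example.com"}],
    scroll_to="e4",
)
order = [name for name, *_ in host.calls]
print(order)
```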
Visual / pointing actions
Draw the user’s attention without changing app state.
- scroll_to brings an element into view (single ref).
- highlight flashes elements briefly (list of refs). Best for short emphasis like a button or a fact.
- select_text puts the page’s text selection on an element (single ref). Best for “this paragraph” / “the section about X” so the user sees exactly what was meant. Persists until the user clicks elsewhere.
State-changing actions
Modify form / app state.
- fills writes values into inputs (list of {"ref", "value"} objects, multi-fill in one turn).
- click clicks elements (list of refs in order). Use for checkboxes, radios, submit buttons.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| params | FunctionCallParams | | Framework-provided tool invocation context. |
| answer | str | | Required. The spoken reply in plain language. One short sentence. No markdown, no symbols. |
| scroll_to | str \| None | None | Optional snapshot ref. Scrolls the element into view before speaking. |
| highlight | list[str] \| None | None | Optional list of snapshot refs. Visually pulses each element. |
| select_text | str \| None | None | Optional snapshot ref. Places the page’s text selection on that element. |
| fills | list[dict] \| None | None | Optional list of {"ref": "eN", "value": "..."} objects. Writes each value into the input at ref. |
| click | list[str] \| None | None | Optional list of snapshot refs to click in order. Use for checkboxes, radios, submit buttons. |
Examples
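A pointing turn pairs answer with one or more of the visual actions
(scroll_to, highlight, select_text).
Malformed arguments are tolerated where possible (a fills element such
as None or a bare string is dropped rather than raising) so a transient
model hiccup doesn’t leave the single-flight task lock held until the
voice-task timeout fires.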
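One way to picture that lenient handling is a normalizer that keeps only well-formed fills entries. This is a sketch, not the mixin's actual code: normalize_fills is a hypothetical helper, and the exact validation rules (for example, requiring ref to be a string) are assumptions.

```python
def normalize_fills(fills) -> list[dict]:
    """Keep only well-formed {"ref", "value"} entries and drop the rest.

    Nothing is raised for bad input, so a malformed element cannot abort
    the turn partway through.
    """
    if not isinstance(fills, list):
        return []
    clean = []
    for item in fills:
        if not isinstance(item, dict):
            continue  # e.g. None or a bare string
        ref = item.get("ref")
        if isinstance(ref, str) and "value" in item:
            clean.append({"ref": ref, "value": item["value"]})
    return clean


# Mixed input as a model might emit it on a bad turn (refs are made up).
raw = [None, "e3", {"ref": "e4", "value": "Ada"}, {"value": "no ref"}]
print(normalize_fills(raw))
print(normalize_fills(None))
```

Dropping bad entries instead of raising means the remaining actions and the spoken answer can still run, so the turn completes normally.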