UI Commands

Overview

These pydantic models describe commands that have matching default React handlers in @pipecat-ai/client-react’s standardHandlers. Apps can use them as-is, override the client handler to customize rendering, or ignore them entirely and define their own command names. The canonical home of the wire-format models is pipecat (see the RTVI standard for the on-the-wire shapes). Subagents re-exports them from pipecat_subagents.agents for convenience.

```python
from pipecat.processors.frameworks.rtvi.models import (
    Toast,
    Navigate,
    ScrollTo,
    Highlight,
    Focus,
    Click,
    SetInputValue,
    SelectText,
)
```

UIAgent.send_command(name, payload) accepts any of these pydantic models directly. BaseModel.model_dump() converts them to the plain-dict shape that travels over the wire.

```python
await self.send_command("toast", Toast(title="Saved", subtitle="Just now"))
await self.send_command("navigate", Navigate(view="settings"))
await self.send_command("scroll_to", ScrollTo(ref="e42"))
await self.send_command("click", Click(ref="e7"))
```
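
To make the plain-dict shape concrete, here is a minimal sketch of what a dumped toast payload might look like. It uses a standard-library dataclass, `ToastSketch`, as a stand-in so the snippet runs without pipecat or pydantic installed; that class and the `to_wire` helper are assumptions for illustration, not part of the library. The real model is the pydantic `Toast`, and whether unset fields travel as `None` or are omitted on the wire is not specified here.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ToastSketch:
    """Stand-in mirroring the documented Toast fields (not the real pipecat model)."""

    title: str
    subtitle: Optional[str] = None
    description: Optional[str] = None
    image_url: Optional[str] = None
    duration_ms: Optional[int] = None


def to_wire(model: ToastSketch) -> dict:
    """Convert the stand-in to a plain dict, analogous to BaseModel.model_dump()."""
    return asdict(model)


payload = to_wire(ToastSketch(title="Saved", subtitle="Just now"))
print(payload)
```

With the real models, the equivalent step would be calling `model_dump()` on a `Toast` instance.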

Toast

A transient notification surface shown on the client. Apps wire a toast renderer of their choice via useToastHandler; the SDK doesn’t ship one.

| Field | Type | Default | Description |
| ----- | ---- | ------- | ----------- |
| title | str | | Required headline. |
| subtitle | str \| None | None | Optional second line beneath the title. |
| description | str \| None | None | Optional body text. |
| image_url | str \| None | None | Optional leading image URL. |
| duration_ms | int \| None | None | Optional dismiss timer. Client default applies when None. |

Navigate

Client-side navigation to a named view. Wire into your router of choice via useNavigateHandler.

| Field | Type | Default | Description |
| ----- | ---- | ------- | ----------- |
| view | str | | App-defined view name (route, screen id, tab key, etc.). |
| params | dict \| None | None | Optional view-specific parameters. |
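
Because view is app-defined, each app decides how a view name and its params map onto a route. The sketch below shows one possible mapping, written in Python only to illustrate the logic. The `ROUTES` table, the view names in it, and the `path_for` helper are all hypothetical; a real client would do this inside its navigate handler with its own router.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical app-defined mapping from view names to route paths.
ROUTES = {
    "settings": "/settings",
    "order_detail": "/orders",
}


def path_for(view: str, params: Optional[dict] = None) -> Optional[str]:
    """Translate a Navigate payload into a path, or None for an unknown view."""
    base = ROUTES.get(view)
    if base is None:
        return None  # unknown view: this sketch treats it as "do nothing"
    if params:
        return f"{base}?{urlencode(params)}"
    return base


print(path_for("settings"))
print(path_for("order_detail", {"id": 7}))
```

Returning None for an unknown view is a design choice of this sketch, so that a mistaken view name from the agent cannot send the user to an arbitrary location.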

ScrollTo

Scroll a target UI node into view. The client resolves the target by ref first (a snapshot ref like "e42" assigned by the client), then falls back to target_id, an app-defined stable target identifier. Supply whichever you have; ref is the normal choice when acting on a node from <ui_state>.

| Field | Type | Default | Description |
| ----- | ---- | ------- | ----------- |
| ref | str \| None | None | Snapshot ref from <ui_state>. |
| target_id | str \| None | None | App-defined stable target id registered on the client. |
| behavior | str \| None | None | Optional behavior hint ("smooth" or "instant"). Clients may ignore. |
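
The resolution order described above (ref first, then target_id) can be sketched as a small lookup. This is an illustrative Python model of the client-side logic, not the actual handler code. The `resolve_target` function, the two lookup dicts, and the decision to return None when nothing resolves are all assumptions.

```python
from typing import Optional


def resolve_target(
    payload: dict,
    snapshot_nodes: dict,
    registered_targets: dict,
) -> Optional[str]:
    """Return the resolved node, trying ref first and target_id second."""
    ref = payload.get("ref")
    if ref is not None and ref in snapshot_nodes:
        return snapshot_nodes[ref]

    # ref missing or stale: fall back to the app-defined stable id.
    target_id = payload.get("target_id")
    if target_id is not None and target_id in registered_targets:
        return registered_targets[target_id]

    return None  # nothing resolved (assumed here to mean "do nothing")


# Hypothetical client state: one live snapshot ref, two registered targets.
snapshot_nodes = {"e42": "<section#pricing>"}
registered_targets = {"pricing": "<section#pricing>", "footer": "<footer>"}

print(resolve_target({"ref": "e42"}, snapshot_nodes, registered_targets))
print(resolve_target({"ref": "e99", "target_id": "footer"}, snapshot_nodes, registered_targets))
```

Sending both fields lets a command survive a stale snapshot ref, at the cost of the app having to register stable ids on the client.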

Highlight

Briefly emphasize a target UI node (flash, glow, pulse).

| Field | Type | Default | Description |
| ----- | ---- | ------- | ----------- |
| ref | str \| None | None | Snapshot ref from <ui_state>. |
| target_id | str \| None | None | App-defined stable target id registered on the client. |
| duration_ms | int \| None | None | Optional duration in ms. Client default applies when None. |

Focus

Move input focus to a target UI node.

| Field | Type | Default | Description |
| ----- | ---- | ------- | ----------- |
| ref | str \| None | None | Snapshot ref from <ui_state>. |
| target_id | str \| None | None | App-defined stable target id registered on the client. |

Click

Activate a target UI node on the client. Closes the form-fill loop for non-text inputs (checkboxes, radios) and exposes the rest of the action vocabulary (submit buttons, links, app-specific clickable nodes).

The web standard handler silently no-ops on disabled targets so the agent can’t bypass UI affordances meant to be user-controlled. For HTML <select>, prefer SetInputValue (clicking options doesn’t reliably change the selection); for custom comboboxes (ARIA listbox + popup), apps wire their own command matching the library’s interaction model.

| Field | Type | Default | Description |
| ----- | ---- | ------- | ----------- |
| ref | str \| None | None | Snapshot ref from <ui_state>. |
| target_id | str \| None | None | App-defined stable target id registered on the client. Used as a fallback when ref is not set or has gone stale. |

SetInputValue

Write a value into a text input on the client. Use this for form-filling: the agent has decided what should go into a field (clarifying answer, tax form entry, etc.) and asks the client to populate it. The web standard handler silently no-ops on disabled, readonly, and <input type="hidden"> targets so the agent can’t write into fields the user can’t.

| Field | Type | Default | Description |
| ----- | ---- | ------- | ----------- |
| value | str | "" | The text to write. |
| ref | str \| None | None | Snapshot ref from <ui_state>. Typically the ref of an editable text field. |
| target_id | str \| None | None | App-defined stable target id registered on the client. Used as a fallback when ref is not set or has gone stale. |
| replace | bool | True | When True (the default), overwrite the current value. When False, append to it. |
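
The replace flag and the no-op rules can be modeled in a few lines. The sketch below is a Python illustration of the documented behavior, not the web handler itself. `FieldSketch` and `apply_set_input_value` are hypothetical names, and returning a bool to signal whether the write happened is an assumption made so the outcome is visible.

```python
from dataclasses import dataclass


@dataclass
class FieldSketch:
    """Hypothetical stand-in for a text input on the client."""

    value: str = ""
    disabled: bool = False
    readonly: bool = False
    input_type: str = "text"


def apply_set_input_value(field: FieldSketch, value: str = "", replace: bool = True) -> bool:
    """Write into the field; return False when the write is skipped as a no-op."""
    if field.disabled or field.readonly or field.input_type == "hidden":
        return False  # the user could not edit this field, so neither may the agent
    field.value = value if replace else field.value + value
    return True


name = FieldSketch(value="Ada")
apply_set_input_value(name, " Lovelace", replace=False)  # append
print(name.value)

locked = FieldSketch(value="fixed", readonly=True)
print(apply_set_input_value(locked, "changed"))  # skipped, value stays "fixed"
```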

SelectText

Select text in the client UI so the user can see what the agent means. Mirror of the selection field surfaced in the snapshot. Use this to point the user’s attention at a specific text node or range after the agent has decided what it’s referring to. With start_offset and end_offset omitted, the entire target’s text content is selected. Web clients implement this with DOM selection APIs; native clients can map it to their platform text-selection primitives.

| Field | Type | Default | Description |
| ----- | ---- | ------- | ----------- |
| ref | str \| None | None | Snapshot ref from <ui_state>. Typically the ref of text content or an editable text field. |
| target_id | str \| None | None | App-defined stable target id registered on the client. Used as a fallback when ref is not set or has gone stale. |
| start_offset | int \| None | None | Character offset within the target’s text where the selection should start. |
| end_offset | int \| None | None | End character offset, exclusive. Same coordinate system as start_offset. |
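
The offset semantics (start inclusive, end exclusive, both omitted meaning the whole text) line up with Python slicing, which makes them easy to check. The `selected_text` helper below is a hypothetical illustration, not client code. How a client treats a payload with only one offset set is not stated above, so the defaults used here for that case (0 for a missing start, the text length for a missing end) are an assumption.

```python
from typing import Optional


def selected_text(
    text: str,
    start_offset: Optional[int] = None,
    end_offset: Optional[int] = None,
) -> str:
    """Return the substring a SelectText payload would cover in `text`."""
    start = 0 if start_offset is None else start_offset  # assumed default
    end = len(text) if end_offset is None else end_offset  # assumed default
    return text[start:end]  # end is exclusive, matching end_offset


label = "Total due: $42.00"
print(selected_text(label))          # both offsets omitted: entire text
print(selected_text(label, 11, 17))  # just the amount
```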
