mirror of
https://github.com/nesquena/hermes-webui.git
synced 2026-05-26 11:40:26 +00:00
Merge pull request #2251 into stage-355
docs(runtime): codify #1925 adapter contract and migration gates (franksong2702)
@@ -14,6 +14,10 @@
|
||||
|
||||
- **PR #2236** by @jasonjcwu — Silent failure detection in `api/streaming.py` now scans only NEW messages, not the full conversation history. Pre-fix, the `_assistant_added` check at `_run_agent_streaming` scanned all messages in `result["messages"]` (including pre-turn history); if any prior turn contained an assistant response, `_assistant_added` was `True` and the apperror SSE event was silently skipped, leaving the user staring at a blank response after a provider 401/429/rate-limit error. Fix extracts a `_has_new_assistant_reply(all_messages, prev_count)` helper that only inspects messages beyond the pre-turn history offset (`_previous_context_messages`); applied to both the main detection path and the self-heal/retry `_heal_ok` check. 15-test regression suite covering empty/short/long-history scenarios, the heal path, and the `len < prev_count` edge-case fallback. Also includes a small alignment fix to `test_issue1857_usage_overwrite.py` so the FakeAgent message shape matches what the real agent produces.
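The helper described in this entry can be sketched roughly as follows. This is a minimal illustration, not the code from `api/streaming.py`: the message shape (dicts with `role` and `content` keys) and the behaviour of the `len < prev_count` fallback (scan every message) are assumptions made for the example.

```python
from typing import Any, Mapping, Sequence


def has_new_assistant_reply(
    all_messages: Sequence[Mapping[str, Any]], prev_count: int
) -> bool:
    """Return True if an assistant message with content was added this turn."""
    # Assumed fallback: when the list is shorter than the recorded pre-turn
    # history, the offset is unusable, so inspect every message instead.
    start = prev_count if 0 <= prev_count <= len(all_messages) else 0
    return any(
        msg.get("role") == "assistant" and bool(msg.get("content"))
        for msg in all_messages[start:]
    )
```

With a two-message history and `prev_count=2`, a turn that only appends a user message returns `False`, so the error event would still be emitted.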
|
||||
|
||||
### Docs
|
||||
|
||||
- **PR #2251** by @franksong2702 (refs #1925) — Updates the Hermes run adapter RFC to codify the #1925 review direction: WebUI stays broad in product scope but becomes thin in execution ownership. The revised RFC credits Michael Lam's "protocol translator, not runtime surrogate" guardrail, defines the browser event/control contract, classifies current runtime state into runner/journal/adapter/presentation ownership, adds an acceptance-test catalog, and gates the first implementation slice to append-only journal/replay without changing `_run_agent_streaming` control flow.
|
||||
|
||||
## [v0.51.60] — 2026-05-14 — Release AJ (stage-353 — 3-PR overlapping Appearance + critical #2223 compression-rotation data-loss fix + Opus SHOULD-FIX on parent_session_id)
|
||||
|
||||
### Fixed
|
||||
|
||||
+4
-3
@@ -38,8 +38,9 @@ First-time contributor RFCs should be discussed in an issue before opening a PR.
|
||||
|
||||
## Current RFCs
|
||||
|
||||
- [`hermes-run-adapter-contract.md`](hermes-run-adapter-contract.md) — Event/control
  compatibility contract and gap matrix for moving WebUI chat runs to Hermes-owned
  runtime execution.
- [`hermes-run-adapter-contract.md`](hermes-run-adapter-contract.md) — #1925
  event/control contract, runtime-state ownership matrix, acceptance catalog,
  and reversible migration gates for moving WebUI execution behind an explicit
  adapter boundary.
- [`turn-journal.md`](turn-journal.md) — Crash-safe WebUI turn journal for
  recovering interrupted chat submissions.
|
||||
|
||||
@@ -1,134 +1,140 @@
|
||||
# Hermes Run Adapter Compatibility Contract
|
||||
# Hermes Run Adapter Contract and Migration Gates
|
||||
|
||||
- **Status:** Proposed
|
||||
- **Author:** @Michaelyklam
|
||||
- **Updated by:** @franksong2702
|
||||
- **Created:** 2026-05-11
|
||||
- **Revised:** 2026-05-14
|
||||
- **Tracking issue:** [#1925](https://github.com/nesquena/hermes-webui/issues/1925)
|
||||
|
||||
## Problem
|
||||
## Credit and Scope
|
||||
|
||||
Hermes WebUI currently gives a rich workbench experience, but browser-originated
|
||||
chat turns are still executed inside the WebUI server process. The WebUI path
|
||||
creates process-local stream state, starts background agent threads, constructs or
|
||||
reuses `AIAgent`, and owns callback queues for token, tool, reasoning, approval,
|
||||
and clarify state.
|
||||
This RFC codifies the direction discussed in #1925. It does not introduce an
|
||||
implementation. The central guardrail comes from Michael Lam's review framing:
|
||||
|
||||
The target boundary from #1925 is:
|
||||
> the adapter should be a protocol translator, not a runtime surrogate.
|
||||
|
||||
The product boundary from #1925 is:
|
||||
|
||||
> WebUI should be thin in execution ownership, not thin in product scope.
|
||||
|
||||
That means WebUI remains the full browser workbench for sessions, workspace
|
||||
files, chat rendering, tools, approvals, status, diagnostics, and controls. The
|
||||
change is that Hermes Agent must own run lifecycle, event ordering, replay,
|
||||
approvals, clarify, cancellation, and terminal state.
|
||||
files, chat rendering, tool cards, approvals, status, diagnostics, and controls.
|
||||
The change is that long-lived execution ownership should move behind an explicit
|
||||
runtime boundary instead of remaining scattered through the main WebUI request
|
||||
process.
|
||||
|
||||
This document defines the first reviewable contract for a Hermes-owned run
|
||||
adapter. It is intentionally a spec/gap matrix, not an implementation plan for a
|
||||
new WebUI runtime surrogate.
|
||||
This document is intentionally a reviewable spec and migration gate. It should be
|
||||
accepted before any implementation PR changes the streaming hot path, introduces a
|
||||
runner process, or moves cancellation / approval / clarify control flow.
|
||||
|
||||
## Problem
|
||||
|
||||
Browser-originated chat turns are still executed inside the WebUI server process.
|
||||
The current path creates process-local stream state, starts background agent
|
||||
threads, constructs or reuses `AIAgent`, and owns callback state for token, tool,
|
||||
reasoning, approval, clarify, cancellation, and terminal events.
|
||||
|
||||
That shape works, but it makes the WebUI process the owner of active runtime
|
||||
truth. Consequences include:
|
||||
|
||||
- restarting WebUI can orphan active work,
|
||||
- reconnect depends on process-local state rather than a durable run/event view,
|
||||
- cancellation and stale writeback bugs recur around ownership boundaries,
|
||||
- approvals and clarify prompts are tied to live callbacks,
|
||||
- future Hermes runtime APIs cannot be adopted cleanly because WebUI lacks a
|
||||
single adapter boundary.
|
||||
|
||||
The immediate goal is not to build a sidecar. The immediate goal is to define the
|
||||
browser contract, classify current runtime state, and gate the first reversible
|
||||
journal slice.
|
||||
|
||||
## Goals
|
||||
|
||||
- Keep the browser-facing WebUI workbench contract stable while execution moves
|
||||
out of the WebUI process.
|
||||
- Define the minimum Hermes Runtime API / IPC v0 surface WebUI needs before it
|
||||
can route new runs to Hermes-owned execution.
|
||||
- Map current WebUI-owned runtime primitives to Hermes-owned APIs, WebUI
|
||||
presentation state, or explicit temporary compatibility shims.
|
||||
- Make restart/reattach the first meaningful success criterion, not merely
|
||||
"basic chat streamed once."
|
||||
- Preserve the current rich WebUI workbench experience.
|
||||
- Make the browser-facing event/control contract explicit.
|
||||
- Classify every current runtime-owned state primitive as `runner process`,
|
||||
`journal`, `adapter API surface`, or `WebUI presentation cache`.
|
||||
- Identify future backend mapping: existing Hermes runtime API, missing Hermes
|
||||
API, or temporary WebUI compatibility shim.
|
||||
- Define acceptance tests that must survive any migration.
|
||||
- Define reversible implementation slices, starting with an append-only
|
||||
in-process event journal / replay layer.
|
||||
|
||||
## Non-goals
|
||||
|
||||
- Do not implement the adapter in this RFC.
|
||||
- Do not create a new run-manager sidecar or broker requirement.
|
||||
- Do not re-create `STREAMS`, cached `AIAgent` objects, approval queues, clarify
|
||||
queues, or cancellation flags under new names inside WebUI.
|
||||
- Do not reduce WebUI product scope. The rich workbench UX remains in WebUI.
|
||||
- Do not require every event to be durably persisted on day one if the first
|
||||
upstream runtime slice can still prove Hermes-owned execution and reconnect.
|
||||
- Do not introduce a runner process or sidecar in the first implementation slice.
|
||||
- Do not change `_run_agent_streaming` control flow in the first journal slice.
|
||||
- Do not recreate `STREAMS`, cached `AIAgent` objects, callback queues, or
|
||||
cancellation flags under new names.
|
||||
- Do not reduce WebUI product scope or move normal workbench UX out of WebUI.
|
||||
- Do not depend on Hermes Agent shipping a WebUI-specific runtime connector before
|
||||
WebUI can improve its own boundary.
|
||||
|
||||
## Ownership boundary
|
||||
## Artifact 1: Browser Event and Control Contract
|
||||
|
||||
### Hermes Agent owns
|
||||
This is the compatibility contract the browser depends on, regardless of whether
|
||||
the backend is today's in-process streaming path, an in-process journaled path, a
|
||||
future WebUI-managed runner, or a future Hermes `/v1/runs` backend.
|
||||
|
||||
- run creation and lifecycle
|
||||
- run ids and session-to-active-run mapping
|
||||
- ordered event stream and replay cursor
|
||||
- terminal run state, final result, and error metadata
|
||||
- model/provider/profile/toolset routing
|
||||
- agent execution and tool dispatch
|
||||
- command semantics and capability metadata
|
||||
- approval and clarify lifecycle
|
||||
- cancel, interrupt, queue, continue, steer, and goal control where supported
|
||||
- durable runtime/session state needed for reconnect
|
||||
The current inventory should be derived from `static/messages.js` consumers and
|
||||
SSE/event production in `api/streaming.py`. Future edits to those files should
|
||||
update this RFC or the implementation contract that replaces it.
|
||||
|
||||
### WebUI owns
|
||||
### Event Envelope
|
||||
|
||||
- browser authentication and presentation-specific session routing
|
||||
- chat layout, transcript rendering, tool cards, thinking/progress display
|
||||
- approval and clarify widgets
|
||||
- workspace/file-panel UX
|
||||
- settings/admin/diagnostics presentation
|
||||
- adapting Hermes runtime events into WebUI-compatible browser events
|
||||
- temporary compatibility shims explicitly listed in this RFC
|
||||
|
||||
## WebUI event/control compatibility contract
|
||||
|
||||
The browser-facing contract should remain stable enough that the current WebUI
|
||||
workbench can render either the legacy in-process runtime or the Hermes-owned run
|
||||
adapter during migration. These are presentation events over Hermes runtime
|
||||
truth, not a second source of truth.
|
||||
|
||||
All events should include enough metadata for idempotent rendering and
|
||||
reconnect:
|
||||
Every replayable runtime event should be representable with:
|
||||
|
||||
```json
{
  "event_id": "run_123:42",
  "seq": 42,
  "run_id": "run_123",
  "session_id": "20260511_...",
  "type": "tool.update",
  "created_at": 1778540000.0,
  "session_id": "20260514_...",
  "type": "tool.updated",
  "created_at": 1778750000.0,
  "terminal": false,
  "payload": {}
}
```
|
||||
|
||||
`event_id` may be an SSE `id:` value or an equivalent cursor token. `seq` is a
|
||||
monotonic per-run cursor. Clients may send `Last-Event-ID` or `after_seq` on
|
||||
reconnect. The runtime should treat replay as at-least-once delivery; WebUI must
|
||||
deduplicate by `run_id` + `seq` / `event_id`.
|
||||
Required semantics:
|
||||
|
||||
### Event families
|
||||
- `seq` is monotonic per run.
|
||||
- `event_id` is stable enough to use as an SSE `id:` value or equivalent cursor.
|
||||
- Reconnect supports `Last-Event-ID` or `after_seq`.
|
||||
- Replay is at-least-once; WebUI deduplicates by `run_id` + `seq` or `event_id`.
|
||||
- Terminal runs can replay their final `done`, `cancelled`, or `error` state.
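The at-least-once replay rule above implies that the consumer needs an idempotent apply step. The sketch below shows one way to do that in Python for illustration only; `RunEvent` mirrors the envelope fields, while `EventDeduplicator` and its methods are hypothetical names, not part of the RFC. It assumes events for a run arrive in `seq` order, which holds when replay starts from a cursor.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RunEvent:
    """Envelope fields from the RFC example."""
    event_id: str
    seq: int
    run_id: str
    session_id: str
    type: str
    created_at: float
    terminal: bool = False
    payload: dict[str, Any] = field(default_factory=dict)


class EventDeduplicator:
    """Tracks the highest seq applied per run so replayed events are skipped."""

    def __init__(self) -> None:
        self._last_seq: dict[str, int] = {}

    def accept(self, event: RunEvent) -> bool:
        """Return True if the event is new and should be rendered."""
        last = self._last_seq.get(event.run_id, -1)
        if event.seq <= last:
            return False  # duplicate delivered again during replay
        self._last_seq[event.run_id] = event.seq
        return True

    def cursor(self, run_id: str) -> int:
        """Value a client could send as after_seq on reconnect (-1 if none)."""
        return self._last_seq.get(run_id, -1)
```

If delivery could arrive out of order, a set of seen `seq` values per run would be needed instead of a single high-water mark.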
|
||||
|
||||
| WebUI event family | Required payload | Runtime source of truth |
|---|---|---|
| `run.started` / `status` | lifecycle state, controls available, session id, workspace/profile/model/toolset summary | Hermes run state |
| `token.delta` | assistant message id/segment id, delta text, optional content type | Hermes model output stream |
| `reasoning.delta` / `reasoning.done` | reasoning text or structured reasoning block, visibility metadata | Hermes reasoning callback/event stream |
| `progress` | concise status/progress text, optional phase/tool context | Hermes agent progress callbacks |
| `tool.started` | tool call id, tool name, sanitized arguments, start time | Hermes tool dispatch lifecycle |
| `tool.updated` | stdout/stderr/structured partial data, progress metadata | Hermes tool dispatch lifecycle |
| `tool.done` | result, exit/status, duration, error flag | Hermes tool dispatch lifecycle |
| `approval.requested` | approval id, command/action summary, risk metadata, available choices | Hermes approval queue/control plane |
| `approval.resolved` | approval id, choice, resulting status | Hermes approval queue/control plane |
| `clarify.requested` | clarify id, question, choices/input mode | Hermes clarify lifecycle |
| `clarify.resolved` | clarify id, answer metadata/status | Hermes clarify lifecycle |
| `title.updated` | title text, title source/confidence | Hermes session/title subsystem |
| `usage.updated` / `usage.final` | tokens, cost, model/provider, duration where available | Hermes usage accounting |
| `error` | stable error code, safe message, redacted diagnostic metadata, terminal flag | Hermes run terminal/error state |
| `done` | final lifecycle state, usage, terminal result/error summary, last seq | Hermes run terminal state |
|
||||
### Event Families
|
||||
|
||||
### Reconnect metadata
|
||||
| Event family | Required payload | Browser responsibility | Runtime source of truth |
|---|---|---|---|
| `run.started` / `status` | lifecycle state, controls available, session id, workspace/profile/model/toolset summary | render active state and controls | runtime run state |
| `token.delta` | assistant message id or segment id, delta text, content type | append visible assistant text | runtime model output stream |
| `reasoning.delta` / `reasoning.done` | reasoning block id, delta/final text, visibility metadata | render thinking/progress UI | runtime reasoning events |
| `progress` | concise phase/status text, optional tool context | render activity/progress text | runtime progress callbacks |
| `tool.started` | tool call id, name, sanitized arguments, start time | open/update tool card | runtime tool lifecycle |
| `tool.updated` | stdout/stderr/structured partial data, progress metadata | update tool card | runtime tool lifecycle |
| `tool.done` | result, status/exit code, duration, error flag | finalize tool card | runtime tool lifecycle |
| `approval.requested` | approval id, action summary, risk metadata, available choices | show approval widget | runtime approval state |
| `approval.resolved` | approval id, choice, resulting status | close/update approval widget | runtime approval state |
| `clarify.requested` | clarify id, question, choices/input mode | show clarify widget | runtime clarify state |
| `clarify.resolved` | clarify id, answer metadata/status | close/update clarify widget | runtime clarify state |
| `title.updated` | title text, source/confidence | update title surfaces | session/title subsystem |
| `usage.updated` / `usage.final` | tokens, cost, model/provider, duration where available | update usage surfaces | runtime usage accounting |
| `error` | stable error code, safe message, redacted diagnostics, terminal flag | render error and final state | runtime terminal/error state |
| `done` | final lifecycle state, usage, terminal result/error summary, last seq | finalize run UI | runtime terminal state |
|
||||
|
||||
### Reconnect Metadata
|
||||
|
||||
Every active or terminal run must expose:
|
||||
|
||||
- `run_id`
|
||||
- `session_id`
|
||||
- current `status`: `queued`, `running`, `awaiting_approval`,
|
||||
`awaiting_clarify`, `paused`, `cancelling`, `cancelled`, `failed`,
|
||||
`completed`, or `expired`
|
||||
- `status`: `queued`, `running`, `awaiting_approval`, `awaiting_clarify`,
|
||||
`paused`, `cancelling`, `cancelled`, `failed`, `completed`, or `expired`
|
||||
- last committed event cursor / `last_event_id`
|
||||
- terminal state and final result/error when finished
|
||||
- currently available controls
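The status vocabulary above could be modelled as an enum so that unknown values fail loudly instead of being rendered. This is a sketch only: the RFC lists the status values but does not say which ones count as terminal, so the `TERMINAL_STATUSES` set below is an assumption, and `is_terminal` is a hypothetical helper.

```python
from enum import Enum


class RunStatus(str, Enum):
    """Status values listed under Reconnect Metadata."""
    QUEUED = "queued"
    RUNNING = "running"
    AWAITING_APPROVAL = "awaiting_approval"
    AWAITING_CLARIFY = "awaiting_clarify"
    PAUSED = "paused"
    CANCELLING = "cancelling"
    CANCELLED = "cancelled"
    FAILED = "failed"
    COMPLETED = "completed"
    EXPIRED = "expired"


# Assumption: these four are treated as terminal; the RFC does not spell this out.
TERMINAL_STATUSES = frozenset(
    {RunStatus.CANCELLED, RunStatus.FAILED, RunStatus.COMPLETED, RunStatus.EXPIRED}
)


def is_terminal(status: str) -> bool:
    """Raise ValueError for an unknown status rather than guessing."""
    return RunStatus(status) in TERMINAL_STATUSES
```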
|
||||
@@ -137,179 +143,217 @@ Every active or terminal run must expose:
|
||||
|
||||
### Controls
|
||||
|
||||
| WebUI control | Required semantics | Runtime endpoint / IPC |
|
||||
| Control | Required semantics | Target owner |
|
||||
|---|---|---|
|
||||
| cancel | Request graceful cancellation of the current run; terminal event must follow | `cancel_run` / `interrupt` |
|
||||
| queue / continue | Append follow-up work to a live, paused, or resumable run/session according to Hermes semantics | `queue_or_continue` |
|
||||
| approval | Resolve a pending approval request with `allow_once`, `allow_session`, `always`, or `deny` where supported | `respond_approval` |
|
||||
| clarify | Submit answer text or selected choice for a pending clarify request | `respond_clarify` |
|
||||
| goal | Set/status/pause/resume/clear goal where Hermes exposes goal capability for this surface | command/capability API |
|
||||
| observe | Attach to live events and replay from cursor | `observe_run` |
|
||||
| status | Poll lifecycle state when SSE/WebSocket is unavailable | `get_run` |
|
||||
| observe | attach to live events and replay from cursor | adapter API surface backed by runtime/journal |
|
||||
| status | poll lifecycle state when SSE/WebSocket is unavailable | adapter API surface backed by runtime/journal |
|
||||
| cancel | request graceful cancellation; terminal event follows | runner/runtime control plane |
|
||||
| queue / continue | append follow-up work according to Hermes semantics | runner/runtime control plane |
|
||||
| approval | resolve pending approval by id with supported choices | runner/runtime control plane |
|
||||
| clarify | answer pending clarify request by id | runner/runtime control plane |
|
||||
| goal | set/status/pause/resume/clear goal where capability exists | runtime command/capability plane |
|
||||
|
||||
WebUI may keep local UI state such as which disclosure rows are expanded, but it
|
||||
must not infer or privately mutate runtime state for these controls.
|
||||
WebUI may keep presentation state such as expanded rows, selected tabs, and local
|
||||
scroll position. WebUI must not privately mutate runtime truth for these controls.
|
||||
|
||||
## Hermes Runtime API / IPC v0 minimum
|
||||
## Artifact 2: Runtime State Inventory and Classifier
|
||||
|
||||
The transport can be HTTP, stdio IPC, websocket, or another Hermes-owned local
|
||||
protocol. The key requirement is the semantic contract: Hermes owns the run id,
|
||||
lifecycle, event cursor, controls, pending human-interaction state, and terminal
|
||||
state.
|
||||
Classifications:
|
||||
|
||||
### `start_run`
|
||||
- `runner process`: should be owned by the eventual execution runner / runtime
|
||||
backend, not the main WebUI request process.
|
||||
- `journal`: should be captured in append-only durable events for replay and
|
||||
diagnostics.
|
||||
- `adapter API surface`: should be exposed through a WebUI-owned boundary that
|
||||
can later switch backend implementations.
|
||||
- `WebUI presentation cache`: may remain local because it is not execution truth.
|
||||
|
||||
Creates a Hermes-owned run.
|
||||
| Current primitive | Current legacy source of truth | Target classification | Future backend mapping | Slice 1 handling | Notes / gap |
|---|---|---|---|---|---|
| `STREAMS` / `STREAMS_LOCK` | `api.state_sync` process memory | adapter API surface + presentation fan-out | WebUI runner or future Hermes run observation API | keep live path; mirror events into journal | Must stop being authoritative for active run existence. |
| `CANCEL_FLAGS` | `api.state_sync` process memory | runner process | cancel/interrupt endpoint or runner control | no control-flow change | Final cancel state must return as a replayable event. |
| cached `AIAgent` objects / `AGENT_INSTANCES` | `api/config.py` process memory | runner process | runner-owned Hermes integration | unchanged | Moving this is deferred until after journal proof. |
| background thread lifecycle | `_run_agent_streaming` in `api/streaming.py` | runner process | runner-owned execution lifecycle | unchanged | Slice 1 must not rewrite thread/control flow. |
| token / partial text buffers | streaming callbacks and browser SSE state | journal + presentation cache | replayable runtime events | append emitted events | Browser can cache rendered state, but replay must rebuild it. |
| reasoning buffers | streaming callbacks and UI rendering state | journal + presentation cache | replayable reasoning events | append emitted events | Thinking cards must survive reconnect. |
| tool buffers / live tool calls | WebUI streaming callbacks | journal + presentation cache | replayable tool lifecycle events | append emitted events | WebUI owns rendering, not tool execution state. |
| approval callbacks / queues | live Python callbacks | runner process + adapter API surface + journal | approval state/control endpoint | journal request/resolution events only | Pending approval must eventually survive WebUI restart. |
| clarify callbacks / queues | live Python callbacks | runner process + adapter API surface + journal | clarify state/control endpoint | journal request/resolution events only | Pending clarify must eventually survive WebUI restart. |
| per-request `HERMES_HOME` env mutation lock | `api/streaming.py` / config helpers | runner process | runner/profile execution context | unchanged | Long-term runner must isolate profile env without process-global mutation. |
| session-to-active-run mapping | session JSON + active stream ids + memory | journal + adapter API surface | runtime run registry/session mapping | journal run metadata | Reopen session must discover active/completed run. |
| title generation state | WebUI callbacks/session saves | journal + presentation cache | runtime/session title event | append title events | WebUI may display title updates after event receipt. |
| usage accounting state | WebUI callbacks/session saves | journal + presentation cache | runtime usage event/source of truth | append usage events | Avoid divergent WebUI-only accounting. |
| command capability metadata | WebUI command registry + Hermes command assumptions | adapter API surface | runtime command/capability metadata | unchanged | Unknown command support should not be guessed by WebUI. |
| voice mode state | browser/UI + streaming path | presentation cache + adapter API surface | runtime input/control capability | unchanged | Acceptance tests must pin voice behavior before migration. |
| project/workspace context | WebUI session/workspace state + env mutation | adapter API surface + runner process | runtime run context | unchanged | Must preserve workspace-aware chat and project context. |
|
||||
|
||||
Input fields:
|
||||
Unclassified state is a design blocker. If an implementation slice discovers a
|
||||
runtime primitive that does not fit this table, update the RFC before landing code.
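The "unclassified state is a design blocker" rule lends itself to a mechanical check. The sketch below is illustrative only: `Ownership`, `INVENTORY`, and `unclassified` are hypothetical names, and the inventory holds just a few rows from the table above rather than the full list.

```python
from enum import Enum
from typing import Iterable, Mapping


class Ownership(Enum):
    """The four target classifications named in this artifact."""
    RUNNER_PROCESS = "runner process"
    JOURNAL = "journal"
    ADAPTER_API_SURFACE = "adapter API surface"
    PRESENTATION_CACHE = "WebUI presentation cache"


# A few rows from the inventory table, keyed by primitive name.
INVENTORY: dict[str, frozenset[Ownership]] = {
    "CANCEL_FLAGS": frozenset({Ownership.RUNNER_PROCESS}),
    "AGENT_INSTANCES": frozenset({Ownership.RUNNER_PROCESS}),
    "token / partial text buffers": frozenset(
        {Ownership.JOURNAL, Ownership.PRESENTATION_CACHE}
    ),
    "approval callbacks / queues": frozenset(
        {
            Ownership.RUNNER_PROCESS,
            Ownership.ADAPTER_API_SURFACE,
            Ownership.JOURNAL,
        }
    ),
}


def unclassified(
    primitives: Iterable[str],
    inventory: Mapping[str, frozenset[Ownership]],
) -> list[str]:
    """Return primitives with no classification; a non-empty result blocks the slice."""
    return sorted(name for name in primitives if not inventory.get(name))
```

A check like this could run in a test so that a newly discovered primitive forces an RFC update before code lands.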
|
||||
|
||||
- `session_id` or instruction to create one
|
||||
- user message / queued input
|
||||
- workspace context and attachments metadata
|
||||
- profile/provider/model/toolset hints
|
||||
- source/surface metadata, e.g. `source=webui`
|
||||
- optional command intent, e.g. `/goal` if parsed by WebUI command UI
|
||||
- idempotency key for duplicate browser submissions
|
||||
## Artifact 3: Acceptance Test Catalog
|
||||
|
||||
Output fields:
|
||||
These are the user-observable behaviors that must survive the migration. The
|
||||
catalog should become automated tests where practical. Where full automation is
|
||||
not feasible in the first slice, the PR must include the strongest practical
|
||||
diagnostic or manual validation plan.
|
||||
|
||||
- `run_id`
|
||||
- `session_id`
|
||||
- initial `status`
|
||||
- `observe` cursor / first event id
|
||||
- supported controls for this run
|
||||
| Behavior | Acceptance criterion | Why it matters | First slice that must prove it |
|---|---|---|---|
| Journal replay after refresh/reconnect | reconnect or restart after events have been journaled can replay from cursor without duplicate transcript/tool/reasoning state | proves the browser contract is replayable and duplicate-safe | journal/replay slice |
| Terminal replay | completed/failed/cancelled runs replay terminal state and do not duplicate transcript content | prevents stale spinner and duplicate-message regressions | journal/replay slice |
| Interrupted/stale run diagnostics | if WebUI restarts while execution is still owned by the WebUI process, replay shows the last journaled state and a clear interrupted/stale diagnostic instead of pretending the run kept executing | keeps slice 1 honest before a runner exists | journal/replay slice |
| Execution survives WebUI restart | active execution outlives the main WebUI process, reconnect discovers the active run, ordered replay catches up, and controls such as cancel still work | proves execution ownership actually moved out of the request process | runner/sidecar or external-runtime slice |
| Cancel during tool call | cancel emits one terminal cancelled state and no stale writeback | catches historical stream ownership races | control migration slice |
| Cancel during reasoning | partial/reasoning content is preserved cleanly and final state is not provider-error | catches cancellation classification regressions | control migration slice |
| Approval request/response | approval survives observation, browser response reaches runtime, result is replayable | approval callbacks are cross-cutting and easy to orphan | approval migration slice |
| Clarify request/response | clarify survives observation, browser response reaches runtime, result is replayable | same risk as approval, different UI/control path | clarify migration slice |
| Slash commands | `/compress`, `/branch`, `/retry`, and other supported commands keep current semantics | command behavior should not be reimplemented ad hoc | command capability slice |
| Model switch mid-session | provider/model changes route through the correct runtime context | prevents provider/source-of-truth drift | adapter control slice |
| Workspace context | run receives the session workspace and attachments context | preserves workbench value | adapter control slice |
| Multi-profile isolation | profile-specific runs write/read the correct Hermes home and memory | protects #2134-family isolation concerns | runner/profile slice |
| Queue/continue | follow-up input during live/resumable work obeys Hermes semantics | prevents parallel continuation model | control migration slice |
| Goal continuation | goal status/control survives the adapter boundary | goal logic is lifecycle-sensitive | goal capability slice |
| Voice mode | voice-originated input uses the same run/event/control contract | prevents alternate input path drift | adapter parity slice |
| Projects context | project metadata remains visible and correct across run replay | preserves session/workbench organization | adapter parity slice |
|
||||
|
||||
### `observe_run`
|
||||
## Artifact 4: Slicing Plan and Reversibility
|
||||
|
||||
Streams ordered run events, with replay from a cursor.
|
||||
### Slice 0: Spec PR
|
||||
|
||||
Required behavior:
|
||||
Scope:
|
||||
|
||||
- support `after_seq` or `Last-Event-ID`
|
||||
- emit events in monotonically increasing per-run order
|
||||
- replay terminal `error` / `done` state for completed runs
|
||||
- make duplicate delivery safe for reconnecting clients
|
||||
- preserve enough history for short WebUI restarts and browser reloads
|
||||
- this RFC update,
|
||||
- no runtime behavior change,
|
||||
- no streaming hot-path code change.
|
||||
|
||||
### `get_run`
|
||||
Revert path: revert the docs PR.
|
||||
|
||||
Returns current lifecycle state without consuming the event stream.
|
||||
### Slice 1: Append-only journal/replay beside the legacy path
|
||||
|
||||
Required fields:
|
||||
Pre-authorized only after this spec is reviewed and accepted in #1925.
|
||||
|
||||
- `run_id`, `session_id`, `status`
|
||||
- `created_at`, `updated_at`, optional `completed_at`
|
||||
- `last_seq` / `last_event_id`
|
||||
- active controls
|
||||
- pending approval/clarify summaries
|
||||
- terminal result/error summary
|
||||
- usage/model/provider/profile/toolset summary where available
|
||||
Scope:
|
||||
|
||||
### `cancel_run` / interrupt
|
||||
- add an append-only event journal alongside existing callback paths,
|
||||
- capture the event families in Artifact 1,
|
||||
- persist run metadata, cursor, terminal state, and safe diagnostic fields,
|
||||
- allow reconnect to replay from a cursor and then continue live observation,
|
||||
- keep `_run_agent_streaming` control flow unchanged,
|
||||
- keep cancellation, approval, clarify, queue, and goal behavior unchanged.
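As a rough illustration of the Slice 1 scope, an append-only journal with cursor replay might look like the sketch below. Everything here is an assumption for illustration: the `RunJournal` name, the JSON Lines file layout, `seq` starting at 1, and the choice to `fsync` on every append are not specified by the RFC.

```python
import json
import os
import time
from pathlib import Path
from typing import Any, Iterator


class RunJournal:
    """Append-only JSON Lines journal for one run (hypothetical helper)."""

    def __init__(self, path: Path, run_id: str, session_id: str) -> None:
        self.path = path
        self.run_id = run_id
        self.session_id = session_id
        # Resume numbering after a restart from what is already on disk.
        self._next_seq = max((e["seq"] for e in self.replay()), default=0) + 1

    def append(
        self, event_type: str, payload: dict[str, Any], terminal: bool = False
    ) -> dict[str, Any]:
        seq = self._next_seq
        event = {
            "event_id": f"{self.run_id}:{seq}",
            "seq": seq,
            "run_id": self.run_id,
            "session_id": self.session_id,
            "type": event_type,
            "created_at": time.time(),
            "terminal": terminal,
            "payload": payload,
        }
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(event) + "\n")
            fh.flush()
            os.fsync(fh.fileno())  # make the event durable before it is acknowledged
        self._next_seq = seq + 1
        return event

    def replay(self, after_seq: int = 0) -> Iterator[dict[str, Any]]:
        """Yield journaled events with seq greater than after_seq, in order."""
        if not self.path.exists():
            return
        with self.path.open("r", encoding="utf-8") as fh:
            for line in fh:
                try:
                    event = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn final line from an interrupted write
                if event["seq"] > after_seq:
                    yield event
```

Because the journal only mirrors events the legacy callbacks already emit, disabling the `append` call site would leave the existing streaming path untouched.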
|
||||
|
||||
Requests graceful run cancellation or interruption. Hermes owns the final state
|
||||
transition and emits a terminal event. WebUI should not directly toggle a local
|
||||
cancellation flag as the source of truth.
|
||||
Non-goals:
|
||||
|
||||
### `queue_or_continue`
|
||||
- no runner process,
|
||||
- no sidecar,
|
||||
- no adapter interface that changes control flow,
|
||||
- no replacement of `STREAMS` as the live delivery path,
|
||||
- no speculative rewrite of agent construction/caching.
|
||||
|
||||
Submits follow-up work for a live, paused, or resumable run/session. Semantics
|
||||
must match Hermes-native queue/continue behavior so WebUI does not create a
|
||||
parallel continuation model.
|
||||
Revert path:
|
||||
|
||||
### `respond_approval`
|
||||
- disable journal writes/replay behind one small integration seam,
|
||||
- retain legacy WebUI streaming path unchanged.
|
||||
|
||||
Resolves a pending approval request by id.
|
||||
Success criterion:
|
||||
|
||||
Required behavior:
|
||||
1. Start a non-trivial WebUI run.
|
||||
2. Refresh/reconnect the browser, or restart WebUI after events have already been
|
||||
journaled.
|
||||
3. Rediscover the run from journal metadata.
|
||||
4. Replay from cursor without duplicate visible transcript content.
|
||||
5. Render the same already-journaled token/reasoning/tool/status/terminal state
|
||||
the workbench would have rendered without the reconnect.
|
||||
6. If WebUI restarted while execution was still owned by the WebUI process, show
|
||||
an explicit interrupted/stale diagnostic rather than claiming the active run
|
||||
kept executing.
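Step 6 can be made concrete with a small decision function. This is a sketch under stated assumptions: the function name, the three return labels, and the `owner_alive` flag (whether the WebUI process that owned execution is still running) are all hypothetical.

```python
from typing import Any, Mapping, Sequence


def classify_replayed_run(
    events: Sequence[Mapping[str, Any]], owner_alive: bool
) -> str:
    """Decide how the workbench should present a run rebuilt from the journal."""
    if events and events[-1].get("terminal"):
        return "terminal"  # the final done/cancelled/error state was journaled
    if owner_alive:
        return "live"  # the owning process still exists; continue observing
    # The journal stops mid-run and nothing is executing it any more.
    return "interrupted"
```

In Slice 1 a WebUI restart always ends execution, so a non-terminal journal after restart maps to `interrupted` rather than to a spinner.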
|
||||
|
||||
- validate the approval belongs to the run/session
|
||||
- accept only supported choices
|
||||
- emit `approval.resolved`
|
||||
- continue, pause, or fail the run according to Hermes approval semantics
|
||||
### Slice 2: Adapter interface over the journaled legacy path
|
||||
|
||||
### `respond_clarify`
|
||||
Scope:
|
||||
|
||||
Resolves a pending clarification request by id.
|
||||
- introduce the `RuntimeAdapter` interface only after Slice 1 proves replay,
- implement the first backend as a thin facade over the still-legacy path plus
  journal,
- keep the browser event contract stable,
- keep controls routed to existing code until a later control-specific slice.
|
||||
|
||||
Required behavior:
|
||||
Revert path: switch the feature flag back to direct legacy path.
|
||||
|
||||
- validate the clarify request belongs to the run/session
- accept text or selected-choice payloads
- emit `clarify.resolved`
- continue or fail the run according to Hermes clarify semantics
|
||||
### Slice 3: Control migration
|
||||
|
||||
## Gap matrix
|
||||
Scope:
|
||||
|
||||
| Current WebUI primitive | Current role | Hermes-owned target | Temporary shim allowed? | Notes / gap |
|---|---|---|---|---|
| `STREAMS` / `STREAMS_LOCK` | Process-local live stream registry and subscriber fan-out | Hermes run registry + `observe_run` replay/fan-out | Yes, adapter may keep per-browser SSE connections only | Shim must not be the run source of truth and must survive WebUI restart by re-observing Hermes. |
| `CANCEL_FLAGS` | Local cancellation signal checked by WebUI-owned agent thread | `cancel_run` / interrupt control | No, except translating button clicks into runtime calls | Cancellation result must come back as Hermes status/events. |
| `AGENT_INSTANCES` | Cached `AIAgent` objects inside WebUI process | Hermes Agent runtime owns agent construction/reuse | No | Keeping this in the adapter would recreate the runtime surrogate. |
| Partial text buffers | Reconstruct live assistant deltas for browser reconnect/render | Hermes event log/cursor plus WebUI renderer cache | Short-lived presentation cache only | Source should be replayed token events or persisted transcript, not WebUI-only execution state. |
| Reasoning buffers | Preserve streamed reasoning/thinking text | Hermes reasoning events + replay | Short-lived presentation cache only | Replay must rebuild the same thinking cards after refresh. |
| Tool buffers / live tool calls | Render tool cards and updates | Hermes tool lifecycle events + replay | Short-lived presentation cache only | WebUI owns card rendering, not tool execution state. |
| Approval callbacks and queues | Bridge WebUI buttons to a live Python callback | Hermes pending approval state + `respond_approval` | No private callback queue | Pending approval must be discoverable after WebUI restart. |
| Clarify callbacks and queues | Bridge WebUI form to a live Python callback | Hermes pending clarify state + `respond_clarify` | No private callback queue | Pending clarify must be discoverable after WebUI restart. |
| Command capability metadata | Decide which slash commands render/execute in WebUI | Hermes command registry/capability API with owner/surface metadata | WebUI may cache metadata | Unknown commands should not be reimplemented in WebUI by default. |
| Session-to-active-run mapping | Stored implicitly in WebUI session JSON / active stream ids | Hermes session/run mapping API | WebUI may cache last seen run id | Reopen session must rediscover active/completed run from Hermes. |
| Reconnect/replay behavior | Depends on WebUI process memory and session JSON | `observe_run(after_seq)` + `get_run` terminal state | Browser SSE adapter only | First milestone must prove WebUI restart does not orphan the run. |
| Usage/title/status events | Produced by WebUI streaming callbacks | Hermes usage/title/status events and run state | WebUI formatting only | WebUI can display and persist presentation copies after events arrive. |
| Goal / queue / continue hooks | Mixed WebUI command handling and streaming callbacks | Hermes command/control plane | Only UI affordance shim | Goal support should be driven by Hermes capabilities. |
|
||||
- move cancel first,
- then approval,
- then clarify,
- then queue/continue and goal controls,
- each control gets its own acceptance tests and rollback path.
|
||||
|
||||
## Migration ladder
|
||||
Revert path: per-control feature flags or route-level fallback to legacy control
handlers.
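
One way to picture this revert path is a dispatcher that consults a per-control flag and falls back to the legacy handler whenever the flag is off or no adapter handler exists yet. This is a sketch under stated assumptions: `route_control`, the flag mapping, and the handler signatures are hypothetical and do not describe existing WebUI routes.

```python
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], dict[str, Any]]


def route_control(
    control: str,
    request: dict[str, Any],
    flags: dict[str, bool],
    legacy_handlers: dict[str, Handler],
    adapter_handlers: dict[str, Handler],
) -> dict[str, Any]:
    """Dispatch one control request, defaulting to the legacy handler.

    A control uses the adapter only when its own flag is on AND an adapter
    handler is registered, so reverting one control never affects the others.
    """
    if flags.get(control, False) and control in adapter_handlers:
        return adapter_handlers[control](request)
    return legacy_handlers[control](request)
```

Turning a single flag off restores the legacy handler for that control alone, which keeps each rollback independent of the others.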
|
||||
|
||||
1. **Inventory and contract**: keep this RFC current with the current WebUI-owned
   runtime primitives and browser event/control contract.
2. **Hermes Runtime API / IPC v0**: add or stabilize upstream Hermes primitives
   for `start_run`, `observe_run`, `get_run`, `cancel_run`, and replayable event
   cursors.
3. **Read-only observation spike**: from WebUI, observe an existing Hermes-owned
   run and adapt its events into WebUI-compatible event objects without starting
   a WebUI-owned agent thread.
4. **Feature-flagged new-run path**: route new WebUI runs to Hermes-owned
   `start_run` behind a flag while preserving the legacy path as fallback.
5. **Restart/reattach milestone**: prove a non-trivial WebUI-started run
   survives a WebUI-only restart and browser reload with ordered replay.
6. **Controls migration**: move cancel, queue/continue, approval, clarify, and
   goal controls to Hermes-owned endpoints/capabilities.
7. **Parity tests**: compare legacy and adapter event streams for synthetic
   token, reasoning, tool, approval, clarify, error, and done scenarios.
8. **Retire runtime surrogate state**: remove normal WebUI chat ownership of
   `AIAgent`, cancellation flags, callback queues, and process-local run truth
   once parity and fallback criteria are satisfied.
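
The parity-test step in the ladder above could start from a comparison like the following. It is a sketch only: the helper names and the choice of `seq` and `ts` as transport-specific fields to ignore are assumptions, and real fixtures would need to reflect the actual event shapes.

```python
from typing import Any, Iterable

# Fields assumed to differ legitimately between paths (cursor and timestamp).
TRANSPORT_FIELDS = ("seq", "ts")


def normalize(events: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop transport-specific fields so only rendered content is compared."""
    return [
        {key: value for key, value in event.items() if key not in TRANSPORT_FIELDS}
        for event in events
    ]


def streams_match(legacy: Iterable[dict[str, Any]], adapter: Iterable[dict[str, Any]]) -> bool:
    """True when both paths produce the same ordered sequence of events."""
    return normalize(legacy) == normalize(adapter)
```

Ordering is part of the comparison, so a stream that delivers the same events in a different order counts as a parity failure.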
|
||||
### Slice 4: Runner process / sidecar boundary
|
||||
|
||||
## First success criterion
|
||||
Explicitly deferred until Slice 1 has worked in production for at least one
release cycle and the adapter surface has review approval.
|
||||
|
||||
The first implementation milestone is not "basic chat streams through a new
endpoint." The first meaningful milestone is:
|
||||
Scope:
|
||||
|
||||
1. Start a non-trivial chat run from WebUI through the Hermes-owned path.
2. Restart only `hermes-webui` while the run is active.
3. Reload or reopen the browser session.
4. Rediscover the same `run_id` from Hermes using `session_id` or last known run
   metadata.
5. Replay events from the last cursor with no duplicate visible transcript
   content.
6. Render the same token/reasoning/tool/approval/clarify state the workbench
   would have rendered without the restart.
7. Cancel the run from WebUI and observe Hermes emit the terminal cancelled
   state.
|
||||
- move long-lived execution out of the main WebUI request process,
- runner owns active execution state,
- main WebUI server observes/replays through the adapter/journal,
- future Hermes CLI/Python/local API or `/v1/runs` backends can be evaluated
  behind the adapter.
|
||||
|
||||
If this works, WebUI is moving toward a protocol translator over Hermes-owned
execution instead of becoming another runtime with different variable names.
|
||||
Revert path: disable runner backend and fall back to journaled legacy backend.
|
||||
|
||||
## Open questions
|
||||
## First Meaningful Success Criteria
|
||||
|
||||
- Where should the normative Hermes Runtime API / IPC v0 spec live: in
  `NousResearch/hermes-agent`, this WebUI RFC, or both with one designated
  source of truth?
- What retention window is enough for v0 event replay: active-run memory only,
  SQLite-backed event log, or transcript-derived reconstruction plus terminal
  state?
- Should WebUI talk to Hermes over the existing API server, an embedded IPC
  channel, or a profile-local runtime socket?
- How should multiple clients observing the same run coordinate controls and
  pending approval/clarify prompts?
- Which slash commands need surface-specific capability metadata before WebUI
  can safely delegate them to Hermes?
|
||||
The first meaningful milestones are deliberately split.
|
||||
|
||||
### Journal / Replay Gate

This gate belongs to Slice 1. It does not prove active execution survives a WebUI
process restart, because execution is still owned by the WebUI process in this
slice.

It proves:

1. A WebUI run emits append-only journal events with stable cursors.
2. Browser refresh/reconnect can replay already-journaled events from cursor.
3. Terminal `done`, `error`, or `cancelled` state replays without duplicate
   transcript content.
4. Tool/reasoning/status state can be reconstructed from replayed journal events.
5. If WebUI restarts before execution ownership has moved out of process, the UI
   can show a clear interrupted/stale diagnostic for the last journaled run state.
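
The first three properties can be illustrated with a minimal in-memory sketch. Everything here is hypothetical: the `RunJournal` class, the `seq` cursor field, and the event payload shape are assumptions for illustration, and the storage format for a real Slice 1 journal is still an open question in this RFC.

```python
from dataclasses import dataclass, field
from typing import Any

# Terminal event types named by this gate.
TERMINAL_TYPES = {"done", "error", "cancelled"}


@dataclass
class RunJournal:
    """Hypothetical append-only journal for a single run (in-memory only)."""

    run_id: str
    _events: list[dict[str, Any]] = field(default_factory=list)

    def is_terminal(self) -> bool:
        """True once the last journaled event is a terminal one."""
        return bool(self._events) and self._events[-1]["type"] in TERMINAL_TYPES

    def append(self, event_type: str, payload: dict[str, Any]) -> int:
        """Append an event and return its cursor, a 1-based sequence number."""
        if self.is_terminal():
            raise ValueError("run already reached a terminal state")
        seq = len(self._events) + 1
        self._events.append({"seq": seq, "type": event_type, "payload": payload})
        return seq

    def replay(self, after_seq: int = 0) -> list[dict[str, Any]]:
        """Return copies of events with seq greater than after_seq, in order."""
        return [dict(event) for event in self._events if event["seq"] > after_seq]


# Usage: a browser that last saw cursor 2 reconnects and receives only what it missed.
journal = RunJournal("run-1")
journal.append("token", {"text": "Hel"})
last_seen = journal.append("token", {"text": "lo"})
journal.append("done", {})
missed = journal.replay(after_seq=last_seen)
```

Because cursors are assigned once at append time and never reused, replaying from the last seen cursor cannot deliver an already-rendered event a second time.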
|
||||
|
||||
### Execution-Survives-WebUI-Restart Gate

This stronger gate belongs to the runner/sidecar or external-runtime slice, not
Slice 1. It proves execution ownership has actually moved out of the main WebUI
request process:

1. Start a long-running run from WebUI.
2. Restart only `hermes-webui`.
3. Keep the active run executing outside the restarted WebUI process.
4. Reload the browser/session.
5. Rediscover the active run and replay/catch up from cursor.
6. Preserve the rendered workbench state without duplicate transcript content.
7. If the run is still active, cancellation still works.

If this works without moving runtime ownership into a new pile of process-local
globals, the architecture is moving in the right direction.
|
||||
|
||||
## Open Questions

- What exact storage format should Slice 1 use: SQLite run/event tables, JSONL,
  or a hybrid with transcript-derived checkpoints?
- How long should event replay be retained after terminal state?
- Which event fields must be redacted before journal persistence?
- Should the journal live under the WebUI state dir, the session dir, or a
  future runtime-specific subdirectory?
- What is the minimum set of synthetic event fixtures needed to compare legacy
  rendering with replay rendering?
- Which controls need route-level feature flags before migration?
- If Hermes Agent later ships a durable `/v1/runs` API, which adapter fields map
  directly and which remain WebUI presentation concerns?
||||