Files
hermes-webui/ROADMAP.md
T
nesquena-hermes 1d7344c602 release: v0.51.31 — Release H (12-PR contributor batch)
CHANGELOG, ROADMAP, TESTING refresh for v0.51.31 stage release covering
12 contributor PRs:

Added (2 PRs):
- #1956 JKJameson — persistent composer draft (server-side, cross-client)
- #1957 hermes-gimmethebeans — configurable session TTL via env + settings

Fixed (10 PRs):
- #1939 ai-ag2026 — theme-color + sw cache regression coverage
- #1941 ai-ag2026 — preserve chat scroll across final render
- #1945 franksong2702 — localize session jump controls (#1938)
- #1947 happy5318 — show same model from different custom providers
  (Co-authored-by hacker1e7 for #1874 close)
- #1949 Sanjays2402 — close #1937 endless-scroll vs Start-jump race
  with generation-token + mutex
  (Co-authored-by franksong2702 + Michaelyklam)
- #1950 franksong2702 — mute stale stopped gateway heartbeat (#1944)
- #1951 amlyczz — gate goal hook on goal-related turns (#1932)
  (Co-authored-by franksong2702 for #1946 close)
- #1953 lucky-yonug — skip provider peel for custom host:port slugs
- #1960 Michaelyklam — translate hidden-files workspace label (#1841)
- #1961 sbe27 — respect image_input_mode (#1959)

Closed in favor of canonical: #1942, #1962, #1946, #1874, #1311.

Stage-326 hotfixes (per Opus advisor):
- CRITICAL #1951 PENDING_GOAL_CONTINUATION race fix (removed finally
  discard that race-erased the marker before consumer could read it)
- #1956 composer-draft input validation (50 KB text / 50 file clamp +
  type coercion to prevent unbounded session-JSON bloat)
- #1957 SESSION_TTL constant preserved as named fallback (existing
  regression tests pin it; #1957 originally deleted it)

Tests: 5006 → 5028 (+51 net new) — 0 regressions, 142.61s runtime.
2026-05-09 18:46:25 +00:00

20 KiB
Raw Blame History

Hermes Web UI — Roadmap

Web companion to the Hermes Agent CLI. Same workflows, browser-native.

Last updated: v0.51.31 (May 9, 2026) — 5028 tests collected — Release H 12-PR contributor batch (image-mode fix + race fixes + composer drafts + locale parity + custom-provider dedup + TTL config + heartbeat polish) Test source: pytest tests/ --collect-only -q Per-version detail: see CHANGELOG.md


Status snapshot

Surface Status
Hermes CLI parity Complete — every CLI workflow has a web equivalent
Streaming + tool transparency Live tool cards, reasoning cards, approval prompts, cancel
Multi-provider model support Any provider configured in config.yaml shows in the picker
Sessions + projects + search CRUD, content search, projects, tags, archive, fork, import
Mobile + Docker + auth Hamburger nav, slide-overs, password auth, GHCR images
Auxiliary surfaces Workspace tree + edit, cron CRUD, skills CRUD, memory write, MCP server UI
Visual polish 8 themes (incl. light/system/OLED/Sienna), Mermaid, KaTeX, syntax highlighting
Native distribution macOS desktop app (universal arm64+x86_64 DMG, signed) — separate repo

Remaining gaps and forward work live in Forward Work below.


Architecture

Layer Files Status
Python server server.py (~165 lines) + api/ modules (~20k lines) Thin shell + auth middleware + business logic
HTML template static/index.html (~600 lines) Served from disk
CSS static/style.css (~3k lines) Themes, mobile responsive, KaTeX, table styles
JavaScript static/{ui,sessions,messages,workspace,panels,boot,commands,icons,i18n,login,onboarding}.js (~26k lines) 11 modules served as static files
Service worker static/sw.js Offline shell cache, version-pinned assets
Docker Dockerfile, docker-compose.yml python:3.12-slim, multi-arch (amd64+arm64), HEALTHCHECK
CI/CD .github/workflows/release.yml Auto-release + GHCR publish on tag push
Test isolation tests/_pytest_port.py Per-worktree port + state-dir derivation, no collisions

Feature parity checklist

Chat and streaming

  • Send messages, get SSE-streaming responses
  • Composer-scoped model picker (per-conversation model selection)
  • Multi-provider API support — OpenAI, Anthropic, Google, OpenRouter, xAI, GLM, DeepSeek, Mistral, MiniMax, Kimi, OpenCode, Nous Portal, custom OpenAI-compatible endpoints
  • Live custom-endpoint model discovery (Ollama, LM Studio, vLLM via /v1/models)
  • Free-form OpenRouter model name (autocomplete + custom input)
  • Tool progress shown inline via live tool cards
  • Approval card for dangerous commands (Allow once / session / always, Deny)
  • Approval polling + SSE-pushed approval events
  • Clarify dialog — agent can ask blocking clarifying questions
  • Subagent delegation cards in tool view
  • INFLIGHT guard: switch sessions mid-request without losing response
  • Session restores from localStorage on page load
  • Reconnect banner if page reloaded mid-stream
  • SSE auto-reconnect with stream replay
  • Token / cost estimate per message and per session
  • Context usage indicator (compact ring badge in composer footer)
  • Auto-compaction handling + /compact command
  • rAF-throttled token rendering (smooth, no DOM thrash)
  • Cancel / stop button in composer footer
  • Reasoning effort selector (low / medium / high / xhigh) + /reasoning
  • Pure-text streaming with crash-recovery — partial messages restored from localStorage on reload

Conversation controls

  • Copy message to clipboard (hover icon on each bubble)
  • Edit last user message and regenerate
  • Regenerate last response
  • Clear conversation (wipe messages, keep session)
  • Branch / fork conversation from any message point (#465)
  • Pure-text + tool-call streams both recover

Sessions

  • Create session (+ button or Cmd/Ctrl+K)
  • Load session (click in sidebar)
  • Delete session (hover trash, toast undo, fallback)
  • Auto-title from first user message + adaptive title refresh (configurable cadence)
  • LLM-generated titles via auxiliary route (configurable model)
  • Rename session inline (double-click, Enter saves, Escape cancels)
  • Title search (live filter)
  • Content search (full-text across all sessions)
  • Date group headers (Today / Yesterday / Earlier) with collapsible groups
  • Pin / star sessions to top
  • Duplicate session
  • Import / Export session as JSON (full messages + metadata)
  • Download as Markdown transcript
  • Tags (#tag extraction + filter chips)
  • Archive sessions (hidden by default, "Show N archived" toggle)
  • Projects / folders (chip filter bar, "Unassigned" filter)
  • Per-session profile tracking
  • Per-session toolset override (/toolsets)
  • Batch select mode (multi-select, bulk delete / move / archive)
  • CLI session bridge — read CLI sessions from state.db, import as WebUI sessions

Workspace and files

  • Add workspace with path validation (existing directory, follows symlinks)
  • Remove / rename workspace
  • Quick-switch from topbar dropdown
  • Sidebar live workspace display (name + path)
  • New sessions inherit last-used workspace
  • Browse workspace directory tree with type icons
  • Tree view with expand / collapse + lazy load (#22)
  • Breadcrumb navigation in subdirectories
  • Preview text / code (read-only)
  • Preview markdown (rendered + tables + Mermaid + KaTeX)
  • Preview images (PNG, JPG, GIF, SVG, WEBP, AVIF inline)
  • Preview PDF / SVG / audio / video / Excalidraw / CSV / JSON / YAML
  • Edit files inline (Edit button, Enter saves, Escape cancels)
  • Create / rename / delete files and folders (in current directory)
  • Drag-drop / click / clipboard paste upload
  • Archive upload (zip / tar) with extraction
  • Syntax highlighted code preview (Prism.js, language-aware)
  • File preview auto-close on directory navigation
  • Right panel resizable (drag inner edge)
  • Embedded workspace terminal (/api/terminal/{start,input,output})
  • Git branch + dirty status badge in workspace header

Cron jobs

  • List all cron jobs (Tasks sidebar tab)
  • View job details (prompt, schedule, last run, output)
  • Run / pause / resume / delete
  • Create job from UI (name, schedule, prompt, delivery target)
  • Edit job inline (full create-form parity, including skills)
  • Skill picker in create + edit forms
  • Cron run history viewer (expandable per job)
  • Cron completion alerts (toast + badge)
  • Run-status tracking with live watch mode

Skills

  • List all skills grouped by category
  • Search / filter by name, description, category
  • View full SKILL.md content
  • View skill linked files
  • Create / edit / delete skill
  • /skills slash command

Memory

  • View personal notes (MEMORY.md) rendered as markdown
  • View user profile (USER.md) rendered as markdown
  • Last-modified timestamp per section
  • Add / edit memory entries inline

Profiles

  • Multi-profile support — create, switch, delete (#28)
  • Topbar profile picker with gateway-status dots
  • Profile management panel (full CRUD)
  • Seamless switching (no server restart, refreshes models / skills / memory / cron / workspace)
  • Profile-local workspace storage
  • First-run onboarding wizard with provider config (OpenRouter / Anthropic / OpenAI / Custom)
  • In-app OAuth for Codex and Claude

Configuration

  • Settings panel (default model, default workspace, send key, theme, voice, font size)
  • Send key preference (Enter or Ctrl+Enter)
  • Password authentication (off by default)
  • Per-session toolset override
  • Personality config via config.yaml
  • Reasoning effort persistence

Notifications

  • Cron job completion alerts
  • Background agent error banner
  • Approval pending badge
  • Provider / model mismatch toast warning

Slash commands

  • Command registry + autocomplete dropdown
  • Built-ins: /help, /clear, /model, /workspace, /new, /usage, /theme, /compact, /queue, /interrupt, /steer, /goal, /btw, /reasoning, /skills, /toolsets
  • Transparent pass-through for unrecognized commands

Security

  • Password auth with signed HMAC HTTP-only cookies (24h TTL)
  • Security headers (X-Content-Type-Options, X-Frame-Options, Referrer-Policy)
  • CSRF protection (scheme-aware, port-normalized for reverse proxies)
  • PBKDF2 password hashing
  • Rate limiting on auth endpoints
  • Session ID validation
  • SSRF guard on /api/models/live, cfg_base_url, custom_providers[]
  • ENV_LOCK around env mutations
  • XSS sanitization on all rendered HTML
  • HMAC-signed signing keys (random per install)
  • Skills path-traversal guard
  • Secure cookie flags (HttpOnly, SameSite, Secure when HTTPS)
  • Error message sanitization (no stack traces in responses)
  • POST body size limit (20MB)
  • Upload path-traversal guard
  • Credential redaction in API responses
  • Profile .env secret isolation on switch
  • Auto-install gate (opt-in via HERMES_WEBUI_AUTO_INSTALL=1)

Visual / UX

  • 8 themes — Dark, Light, System (auto-sync), Slate, Solarized, Monokai, Nord, OLED, Sienna
  • 2-axis appearance model (theme + skin) for community theme contributions
  • Mermaid diagram rendering
  • KaTeX math rendering with fence-before-math fix
  • Syntax highlighting (Prism.js, language-aware, YAML newline preservation)
  • Markdown image syntax ![alt](url) and inline MEDIA: tokens render as <img>
  • Plain URL auto-linking
  • Inline markdown in table cells (bold, italic, code, links)
  • Code block copy button
  • Tool card expand / collapse toggle
  • Collapsible thinking / reasoning cards (Claude extended thinking, o3 reasoning tokens)
  • Message timestamps (subtle, full date on hover)
  • Empty composer hides send button (icon-circle with pop-in animation)
  • Pluggable Lucide SVG icons (no emoji rendering inconsistencies)
  • Composer-centric controls (v0.50.0 UI overhaul)
  • Hermes Control Center modal (centralized actions)
  • Workspace panel state machine (defaults closed, opens for browsing / preview)
  • PWA manifest + service worker (offline shell)
  • Favicon (SVG + PNG + ICO)
  • Branded onboarding wizard

Voice

  • Voice input via Web Speech API (push-to-talk dictation)
  • Hands-free voice mode (turn-based conversation, opt-in via Settings → Preferences)
  • TTS playback of responses (configurable voice, rate, pitch)

Mobile

  • Hamburger sidebar (slide-in overlay)
  • Bottom navigation bar (5-tab iOS-style)
  • Files slide-over (right panel as slide-over)
  • 44px minimum touch targets
  • Container queries on composer
  • Android Chrome compatibility fixes
  • PWA installation (manifest + icons + Android support)

Internationalization

  • 9 locales — English, Japanese, Russian, Spanish, German, Chinese (zh + zh-Hant), Portuguese, Korean, French
  • Key-parity test ensures every locale has every key
  • Right-to-left and CJK input (IME composition fixes)

Gateway integration

  • Real-time gateway sessions in sidebar (Telegram, Discord, Slack, Weixin) via SSE + DB polling
  • Cross-channel handoff dock — composer-docked flyout summarizing the live external session
  • Transcript-summary card at 10+ rounds
  • Sidebar dedup keying on per-conversation identity (distinct chats from same platform stay separate)
  • Gateway session sync skips dup / delete options for external sessions
  • LLM Gateway routing metadata display — assistant turns and session metadata show the served model/provider, failover path, and model-switch warnings when response metadata includes used_provider, used_model, or routing (#732)

MCP integration

  • MCP server management UI (System Settings → MCP Servers)
  • Add / edit / delete MCP server entries

Distribution

  • Docker support (multi-arch amd64 + arm64, HEALTHCHECK, UID/GID auto-detect)
  • Two-container Docker compose (webui + agent)
  • GHCR auto-publish on tag push
  • Subpath mount support (reverse proxy at /hermes/)
  • PWA installable from any browser
  • Native macOS app — universal Intel + Apple Silicon, signed + notarized DMG, Sparkle 2 auto-update — see hermes-webui/hermes-swift-mac repo

Forward work

Confirmed candidates (open feature requests with sprint-candidate or active interest)

Theme Tracking Why
Persistent-host stability #1458 Bootstrap fork pattern crashes under launchd / systemd — partial fix shipped (foreground mode); state.db FD leak and HTTP-unhealthy wedge remain
Free-tier OpenRouter variants visible #1426 :free tool-support filter currently hides them from the picker
macOS scroll override regression #1360 Auto-scroll sometimes overrides user scroll on the desktop app
GLM dual-use (main + auxiliary) #1291 Currently mutually exclusive; same provider can't serve both surfaces
Auto-assign session to filtered project #1468 When user is filtering by project X, new session should default to project X
Update banner "What's new?" link #1512 Surface release highlights from the update banner
Sunset legacy LMSTUDIO_API_KEY env var #1502 Tracking issue — alias stays for one minor cycle, then removed
Hermes Agent dashboard cross-link #1459 Detect a running Hermes Agent and surface link in nav
Gateway status card in Settings #1457 Current gateway-status dots only on profile picker
Insights — daily token chart + per-model breakdown #1456 Existing usage badge is per-message; need rollup view
Logs tab — view agent / errors / gateway logs #1455 Currently requires terminal access to log files
Model picker collision handling #1425 Same-name models from different providers aren't disambiguated in dropdown
"Reveal in Finder" right-click on workspace #1424 macOS desktop app convenience
Configurable session persistence timing #1406 Currently every checkpoint, want operator control
Silent credential self-heal on 401 #1401 Gateway auth.json drift should resolve without user re-auth
LLM Wiki status panel #1257 On / off toggle for Wiki integration
Lightweight in-app Canvas editing #1255 Text canvas for prompt drafting / shared notes
Provider / Model source-of-truth alignment #1240 Reconcile WebUI vs CLI vs Gateway provider resolution
Built-in SearXNG web search #1037 Lightweight search tool with on / off toggle
Subagent session relationship view #1004 Show subagent hierarchy in sidebar with expand / collapse

Backlog (deferred, listed for visibility)

  • Insights / monitoring suite — agent heartbeat + alerts (#716), quota / rate-limit display (#706), data tabs (#722), monitor dashboard concepts (#766, #721)
  • Native MCP server expose — Hermes WebUI as an MCP server for direct agent integration (#733)
  • Teams / agents management panel — editable names, roles, assignments (#719)
  • Web UI profile model alignment with Hermes runtime — design parity (#749)
  • DOM windowing / message virtualization — for sessions with hundreds of messages (#734)
  • Searchable global tool list (#697)
  • Add agent / replace model modals (#698)
  • Code execution inline cells — Jupyter-style cell rendering inside chat
  • Sharing / public conversation URLs — requires hosted backend with access control (out of scope for self-host)

Intentionally not planned

  • Full SwiftUI rewrite of the frontend — the WKWebView shell already gets 95% of native benefit
  • App Store distribution — sandboxing breaks the local server model
  • Real-time multi-user collaboration — single-user assumption throughout
  • Plugin marketplace — Hermes skills cover this surface
  • Anthropic / Claude proprietary features — Projects AI memory, Claude artifacts sync (not reproducible)

Sprint history

Per-version detail lives in CHANGELOG.md. The table below is a high-level chronology of major sprint themes; individual PR / fix detail moved to CHANGELOG to keep this file readable.

Range Theme Highlights
Sprints 16 Foundations + workspace server / static split, JS module split, workspace CRUD, file editor, message queue + INFLIGHT, isolated test environment
Sprint 7 Wave 2 core Cron / skill / memory CRUD, session content search, health endpoint, git init
Sprint 8 Daily-driver finish line Edit + regenerate, regenerate last response, clear conversation, Prism.js, queue + INFLIGHT polish
Sprints 910 Codebase health + operational polish app.js → 6 modules, server.py → api/ modules, tool card UX, background task cancel, regression tests
Sprint 11 Multi-provider models + streaming Dynamic model dropdown, smooth scroll pinning, routes extracted to api/routes.py
Sprint 12 Settings + reliability + session QoL Settings panel, SSE auto-reconnect, pin sessions, JSON import
Sprint 13 Alerts + polish Cron alerts, background error banner, session duplicate, browser tab title
Sprint 14 Visual polish + workspace ops Mermaid, message timestamps, file rename, folder create, session tags, archive
Sprint 15 Session projects + code copy Projects / folders, code copy button, tool card expand / collapse
Sprint 16 Sidebar visual polish SVG icons, action dropdown, pin indicator, project border, safe HTML rendering
Sprint 17 Workspace polish + slash commands Breadcrumb nav, slash command autocomplete, send key setting (#26)
Sprint 18 Thinking display + workspace tree File preview auto-close, thinking / reasoning cards, expandable directory tree (#22)
Sprint 19 Auth + security hardening Password auth, login page, security headers, body limit (#23)
Sprint 20 Voice input + send button Web Speech API voice, send button polish
Sprint 21 Mobile responsive + Docker Hamburger sidebar, mobile nav, slide-over files, Docker support (#21, #7)
Sprint 22 Multi-profile support Profile picker, management panel, seamless switching, per-session tracking (#28)
Sprint 23 Agentic transparency Token / cost display, subagent cards, skill picker in cron, profile-local storage
Sprint 24 Web polish rAF streaming, git detection, collapsible date groups, context ring (#80, #81, #82, #83)
Sprint 25 macOS desktop application Native Swift + WKWebView shell, universal DMG, Sparkle 2 auto-update — separate repo
Sprint 26 Pluggable themes Light / Slate / Solarized / Monokai / Nord, settings unsaved-changes guard, /theme
Sprint 27 Theme polish 30+ hardcoded colors → CSS variables, light theme final polish
Sprint 28 Security hardening Env race fix, random signing key, upload traversal, PBKDF2
Sprints 2932 Model routing + custom endpoints + reasoning Model routing by provider prefix, custom endpoint URL fix, OLED theme, top-level reasoning, message_count sync
Sprint 33 Approval card + Lucide icons Approval prompt surfaced, emoji → SVG, login CSP fix, update diagnostics
Sprint 34 v0.50.0 UI overhaul Composer-centric controls, Control Center modal, workspace state machine, collapsible date groups, rAF throttle, context ring
Sprints 3537 Onboarding + i18n + Spanish First-run wizard, OpenRouter / Anthropic / OpenAI / Custom config, Spanish locale, Docker two-container, mobile Profiles button
Sprints 3840 Session + UI polish + Sprint 40 Five-bug clean-up + sidebar timestamp + test port isolation
Sprints 4142 Renderer hardening + KaTeX + handoff Context ring live usage, renderMd link / image / code stash chain, MEDIA: image rendering, gateway handoff foundation
Sprints 43+ Continuous contributor sprints Custom providers, Russian locale, IME fixes, model-switch toast, approval queue multi-slot, profile polish, font-size CSS, contributor wave

Versioning conventions

  • Patch (v0.50.X) — small batches, contributor PR releases, hotfixes
  • Minor (v0.X.0) — sprint completion, new feature surface, architecture milestone
  • Major (v1.0.0) — declared when CLI parity + Claude parity reach steady state and the feature surface stabilizes

Per-version detail and contributor attribution live in CHANGELOG.md.