mirror of
https://github.com/nesquena/hermes-webui.git
synced 2026-05-23 02:40:15 +00:00
1d7344c602
CHANGELOG, ROADMAP, TESTING refresh for v0.51.31 stage release covering 12 contributor PRs:

Added (2 PRs):
- #1956 JKJameson — persistent composer draft (server-side, cross-client)
- #1957 hermes-gimmethebeans — configurable session TTL via env + settings

Fixed (10 PRs):
- #1939 ai-ag2026 — theme-color + sw cache regression coverage
- #1941 ai-ag2026 — preserve chat scroll across final render
- #1945 franksong2702 — localize session jump controls (#1938)
- #1947 happy5318 — show same model from different custom providers (Co-authored-by hacker1e7 for #1874 close)
- #1949 Sanjays2402 — close #1937 endless-scroll vs Start-jump race with generation-token + mutex (Co-authored-by franksong2702 + Michaelyklam)
- #1950 franksong2702 — mute stale stopped gateway heartbeat (#1944)
- #1951 amlyczz — gate goal hook on goal-related turns (#1932) (Co-authored-by franksong2702 for #1946 close)
- #1953 lucky-yonug — skip provider peel for custom host:port slugs
- #1960 Michaelyklam — translate hidden-files workspace label (#1841)
- #1961 sbe27 — respect image_input_mode (#1959)

Closed in favor of canonical: #1942, #1962, #1946, #1874, #1311.

Stage-326 hotfixes (per Opus advisor):
- CRITICAL #1951 PENDING_GOAL_CONTINUATION race fix (removed finally discard that race-erased the marker before consumer could read it)
- #1956 composer-draft input validation (50 KB text / 50 file clamp + type coercion to prevent unbounded session-JSON bloat)
- #1957 SESSION_TTL constant preserved as named fallback (existing regression tests pin it; #1957 originally deleted it)

Tests: 5006 → 5028 (+51 net new) — 0 regressions, 142.61s runtime.
Hermes Web UI — Roadmap
Web companion to the Hermes Agent CLI. Same workflows, browser-native.
Last updated: v0.51.31 (May 9, 2026) — 5028 tests collected — Release H 12-PR contributor batch (image-mode fix + race fixes + composer drafts + locale parity + custom-provider dedup + TTL config + heartbeat polish)
Test source: `pytest tests/ --collect-only -q`
Per-version detail: see CHANGELOG.md
Status snapshot
| Surface | Status |
|---|---|
| Hermes CLI parity | ✅ Complete — every CLI workflow has a web equivalent |
| Streaming + tool transparency | ✅ Live tool cards, reasoning cards, approval prompts, cancel |
| Multi-provider model support | ✅ Any provider configured in config.yaml shows in the picker |
| Sessions + projects + search | ✅ CRUD, content search, projects, tags, archive, fork, import |
| Mobile + Docker + auth | ✅ Hamburger nav, slide-overs, password auth, GHCR images |
| Auxiliary surfaces | ✅ Workspace tree + edit, cron CRUD, skills CRUD, memory write, MCP server UI |
| Visual polish | ✅ 8 themes (incl. light/system/OLED/Sienna), Mermaid, KaTeX, syntax highlighting |
| Native distribution | ✅ macOS desktop app (universal arm64+x86_64 DMG, signed) — separate repo |
Remaining gaps and forward work live in Forward Work below.
Architecture
| Layer | Files | Status |
|---|---|---|
| Python server | server.py (~165 lines) + api/ modules (~20k lines) | Thin shell + auth middleware + business logic |
| HTML template | static/index.html (~600 lines) | Served from disk |
| CSS | static/style.css (~3k lines) | Themes, mobile responsive, KaTeX, table styles |
| JavaScript | static/{ui,sessions,messages,workspace,panels,boot,commands,icons,i18n,login,onboarding}.js (~26k lines) | 11 modules served as static files |
| Service worker | static/sw.js | Offline shell cache, version-pinned assets |
| Docker | Dockerfile, docker-compose.yml | python:3.12-slim, multi-arch (amd64+arm64), HEALTHCHECK |
| CI/CD | .github/workflows/release.yml | Auto-release + GHCR publish on tag push |
| Test isolation | tests/_pytest_port.py | Per-worktree port + state-dir derivation, no collisions |
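The test-isolation row describes deriving a port and a state directory per worktree. A minimal sketch of that idea, assuming a hash of the worktree path is mapped into a fixed port range. The helper name `derive_test_env`, the base port, the range size, and the directory prefix are illustrative assumptions and are not taken from tests/_pytest_port.py:

```python
import hashlib
import tempfile
from pathlib import Path


def derive_test_env(worktree, base_port=20000, span=10000):
    """Map a worktree path to a stable (port, state_dir) pair.

    Hypothetical helper: the real tests/_pytest_port.py may work differently.
    """
    resolved = str(Path(worktree).resolve())
    digest = hashlib.sha256(resolved.encode()).hexdigest()
    # The same path always yields the same offset, so reruns reuse the port,
    # while two different worktrees are unlikely to land on the same one.
    port = base_port + int(digest[:8], 16) % span
    state_dir = Path(tempfile.gettempdir()) / f"hermes-webui-test-{digest[:12]}"
    return port, state_dir
```

A hash only makes collisions unlikely, not impossible, so a real implementation would probably also need to check that the chosen port is free.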
Feature parity checklist
Chat and streaming
- Send messages, get SSE-streaming responses
- Composer-scoped model picker (per-conversation model selection)
- Multi-provider API support — OpenAI, Anthropic, Google, OpenRouter, xAI, GLM, DeepSeek, Mistral, MiniMax, Kimi, OpenCode, Nous Portal, custom OpenAI-compatible endpoints
- Live custom-endpoint model discovery (Ollama, LM Studio, vLLM via `/v1/models`)
- Free-form OpenRouter model name (autocomplete + custom input)
- Tool progress shown inline via live tool cards
- Approval card for dangerous commands (Allow once / session / always, Deny)
- Approval polling + SSE-pushed approval events
- Clarify dialog — agent can ask blocking clarifying questions
- Subagent delegation cards in tool view
- INFLIGHT guard: switch sessions mid-request without losing response
- Session restores from localStorage on page load
- Reconnect banner if page reloaded mid-stream
- SSE auto-reconnect with stream replay
- Token / cost estimate per message and per session
- Context usage indicator (compact ring badge in composer footer)
- Auto-compaction handling + `/compact` command
- rAF-throttled token rendering (smooth, no DOM thrash)
- Cancel / stop button in composer footer
- Reasoning effort selector (low / medium / high / xhigh) + `/reasoning`
- Pure-text streaming with crash-recovery — partial messages restored from localStorage on reload
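Several items in this list rest on Server-Sent Events: streaming responses, pushed approval events, and auto-reconnect with replay. The sketch below shows how a single SSE frame is laid out on the wire according to the SSE format itself. The event name `token` and the payload shape are assumptions for illustration, not the project's actual event schema:

```python
import json


def sse_event(event, data, event_id=None):
    """Serialize one Server-Sent Events frame.

    An `id:` field lets a reconnecting browser send Last-Event-ID, which is
    the hook a server can use to replay events the client missed.
    """
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    # json.dumps never emits raw newlines, so a single data line is enough.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"
```

For example, `sse_event("token", {"text": "hi"}, "7")` yields an `id`, `event`, and `data` line followed by the blank line that ends the frame.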
Conversation controls
- Copy message to clipboard (hover icon on each bubble)
- Edit last user message and regenerate
- Regenerate last response
- Clear conversation (wipe messages, keep session)
- Branch / fork conversation from any message point (#465)
- Pure-text + tool-call streams both recover
Sessions
- Create session (+ button or Cmd/Ctrl+K)
- Load session (click in sidebar)
- Delete session (hover trash, toast undo, fallback)
- Auto-title from first user message + adaptive title refresh (configurable cadence)
- LLM-generated titles via auxiliary route (configurable model)
- Rename session inline (double-click, Enter saves, Escape cancels)
- Title search (live filter)
- Content search (full-text across all sessions)
- Date group headers (Today / Yesterday / Earlier) with collapsible groups
- Pin / star sessions to top
- Duplicate session
- Import / Export session as JSON (full messages + metadata)
- Download as Markdown transcript
- Tags (`#tag` extraction + filter chips)
- Archive sessions (hidden by default, "Show N archived" toggle)
- Projects / folders (chip filter bar, "Unassigned" filter)
- Per-session profile tracking
- Per-session toolset override (`/toolsets`)
- Batch select mode (multi-select, bulk delete / move / archive)
- CLI session bridge — read CLI sessions from state.db, import as WebUI sessions
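The tag feature in the list above pulls `#tag` tokens out of session text and turns them into filter chips. One way that extraction could look, as a sketch only: the rule that a tag must start with a letter (so issue references such as #465 are not treated as tags) and the lower-casing are assumptions, not documented behavior:

```python
import re
from typing import List

# Assumed rule: '#' not preceded by a word character, then a letter,
# then letters, digits, underscores, or hyphens.
TAG_RE = re.compile(r"(?<!\w)#([A-Za-z][\w-]*)")


def extract_tags(title: str) -> List[str]:
    """Return unique tags in first-seen order, lower-cased."""
    seen: List[str] = []
    for match in TAG_RE.finditer(title):
        tag = match.group(1).lower()
        if tag not in seen:
            seen.append(tag)
    return seen
```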
Workspace and files
- Add workspace with path validation (existing directory, follows symlinks)
- Remove / rename workspace
- Quick-switch from topbar dropdown
- Sidebar live workspace display (name + path)
- New sessions inherit last-used workspace
- Browse workspace directory tree with type icons
- Tree view with expand / collapse + lazy load (#22)
- Breadcrumb navigation in subdirectories
- Preview text / code (read-only)
- Preview markdown (rendered + tables + Mermaid + KaTeX)
- Preview images (PNG, JPG, GIF, SVG, WEBP, AVIF inline)
- Preview PDF / SVG / audio / video / Excalidraw / CSV / JSON / YAML
- Edit files inline (Edit button, Enter saves, Escape cancels)
- Create / rename / delete files and folders (in current directory)
- Drag-drop / click / clipboard paste upload
- Archive upload (zip / tar) with extraction
- Syntax highlighted code preview (Prism.js, language-aware)
- File preview auto-close on directory navigation
- Right panel resizable (drag inner edge)
- Embedded workspace terminal (`/api/terminal/{start,input,output}`)
- Git branch + dirty status badge in workspace header
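Workspace paths are validated as existing directories with symlinks followed, and file operations stay inside the workspace. A minimal sketch of both checks using `pathlib`. The function names `validate_workspace` and `resolve_inside` are hypothetical and do not come from the codebase:

```python
from pathlib import Path


def validate_workspace(raw: str) -> Path:
    """Accept only an existing directory; resolve() follows symlinks."""
    path = Path(raw).expanduser().resolve()
    if not path.is_dir():
        raise ValueError(f"not an existing directory: {raw}")
    return path


def resolve_inside(workspace: Path, relative: str) -> Path:
    """Resolve a user-supplied path and reject anything outside the workspace."""
    target = (workspace / relative).resolve()
    # After resolution, '..' segments and symlinks are gone, so a parent check
    # is enough to catch traversal attempts and absolute-path overrides.
    if target != workspace and workspace not in target.parents:
        raise ValueError("path escapes the workspace")
    return target
```

Resolving before comparing is the important design choice: comparing raw strings would miss `..` segments and symlinks that point outside the tree.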
Cron jobs
- List all cron jobs (Tasks sidebar tab)
- View job details (prompt, schedule, last run, output)
- Run / pause / resume / delete
- Create job from UI (name, schedule, prompt, delivery target)
- Edit job inline (full create-form parity, including skills)
- Skill picker in create + edit forms
- Cron run history viewer (expandable per job)
- Cron completion alerts (toast + badge)
- Run-status tracking with live watch mode
Skills
- List all skills grouped by category
- Search / filter by name, description, category
- View full SKILL.md content
- View skill linked files
- Create / edit / delete skill
- `/skills` slash command
Memory
- View personal notes (MEMORY.md) rendered as markdown
- View user profile (USER.md) rendered as markdown
- Last-modified timestamp per section
- Add / edit memory entries inline
Profiles
- Multi-profile support — create, switch, delete (#28)
- Topbar profile picker with gateway-status dots
- Profile management panel (full CRUD)
- Seamless switching (no server restart, refreshes models / skills / memory / cron / workspace)
- Profile-local workspace storage
- First-run onboarding wizard with provider config (OpenRouter / Anthropic / OpenAI / Custom)
- In-app OAuth for Codex and Claude
Configuration
- Settings panel (default model, default workspace, send key, theme, voice, font size)
- Send key preference (Enter or Ctrl+Enter)
- Password authentication (off by default)
- Per-session toolset override
- Personality config via `config.yaml`
- Reasoning effort persistence
Notifications
- Cron job completion alerts
- Background agent error banner
- Approval pending badge
- Provider / model mismatch toast warning
Slash commands
- Command registry + autocomplete dropdown
- Built-ins: `/help`, `/clear`, `/model`, `/workspace`, `/new`, `/usage`, `/theme`, `/compact`, `/queue`, `/interrupt`, `/steer`, `/goal`, `/btw`, `/reasoning`, `/skills`, `/toolsets`
- Transparent pass-through for unrecognized commands
Security
- Password auth with signed HMAC HTTP-only cookies (24h TTL)
- Security headers (X-Content-Type-Options, X-Frame-Options, Referrer-Policy)
- CSRF protection (scheme-aware, port-normalized for reverse proxies)
- PBKDF2 password hashing
- Rate limiting on auth endpoints
- Session ID validation
- SSRF guard on `/api/models/live`, `cfg_base_url`, `custom_providers[]`
- ENV_LOCK around env mutations
- XSS sanitization on all rendered HTML
- HMAC signing key (random per install)
- Skills path-traversal guard
- Secure cookie flags (HttpOnly, SameSite, Secure when HTTPS)
- Error message sanitization (no stack traces in responses)
- POST body size limit (20MB)
- Upload path-traversal guard
- Credential redaction in API responses
- Profile `.env` secret isolation on switch
- Auto-install gate (opt-in via `HERMES_WEBUI_AUTO_INSTALL=1`)
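Two items in the Security list, PBKDF2 password hashing and HMAC-signed cookies with a 24h TTL, can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the project's implementation: the cookie layout (`session_id.expires.signature`), the iteration count, and all function names are hypothetical. Only the `SESSION_TTL` name and the 24-hour value come from this document:

```python
import hashlib
import hmac
import os
import time

SESSION_TTL = 24 * 60 * 60  # seconds; the 24h TTL from the Security list


def hash_password(password, salt=None, iterations=600_000):
    """Derive a PBKDF2-HMAC-SHA256 digest; returns (salt, digest)."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, iterations=600_000):
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)  # constant-time comparison


def sign_session(session_id, key, now=None):
    """Build an assumed cookie value: session_id.expires.signature."""
    issued = time.time() if now is None else now
    payload = f"{session_id}.{int(issued + SESSION_TTL)}"
    sig = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"


def verify_session(cookie, key, now=None):
    """Return the session id if the signature is valid and not expired."""
    try:
        session_id, expires, sig = cookie.rsplit(".", 2)
        expires_at = int(expires)
    except ValueError:
        return None  # malformed cookie
    payload = f"{session_id}.{expires}"
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None  # tampered or signed with a different key
    current = time.time() if now is None else now
    return session_id if current < expires_at else None
```

The signing key would be the random per-install key mentioned in the list, for example `os.urandom(32)` generated once and stored on disk.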
Visual / UX
- 8 themes — Dark, Light, System (auto-sync), Slate, Solarized, Monokai, Nord, OLED, Sienna
- 2-axis appearance model (theme + skin) for community theme contributions
- Mermaid diagram rendering
- KaTeX math rendering with fence-before-math fix
- Syntax highlighting (Prism.js, language-aware, YAML newline preservation)
- Markdown image syntax and inline `MEDIA:` tokens render as `<img>`
- Plain URL auto-linking
- Inline markdown in table cells (bold, italic, code, links)
- Code block copy button
- Tool card expand / collapse toggle
- Collapsible thinking / reasoning cards (Claude extended thinking, o3 reasoning tokens)
- Message timestamps (subtle, full date on hover)
- Empty composer hides send button (icon-circle with pop-in animation)
- Pluggable Lucide SVG icons (no emoji rendering inconsistencies)
- Composer-centric controls (v0.50.0 UI overhaul)
- Hermes Control Center modal (centralized actions)
- Workspace panel state machine (defaults closed, opens for browsing / preview)
- PWA manifest + service worker (offline shell)
- Favicon (SVG + PNG + ICO)
- Branded onboarding wizard
Voice
- Voice input via Web Speech API (push-to-talk dictation)
- Hands-free voice mode (turn-based conversation, opt-in via Settings → Preferences)
- TTS playback of responses (configurable voice, rate, pitch)
Mobile
- Hamburger sidebar (slide-in overlay)
- Bottom navigation bar (5-tab iOS-style)
- Files slide-over (right panel as slide-over)
- 44px minimum touch targets
- Container queries on composer
- Android Chrome compatibility fixes
- PWA installation (manifest + icons + Android support)
Internationalization
- 9 locales — English, Japanese, Russian, Spanish, German, Chinese (zh + zh-Hant), Portuguese, Korean, French
- Key-parity test ensures every locale has every key
- Right-to-left and CJK input (IME composition fixes)
Gateway integration
- Real-time gateway sessions in sidebar (Telegram, Discord, Slack, Weixin) via SSE + DB polling
- Cross-channel handoff dock — composer-docked flyout summarizing the live external session
- Transcript-summary card at 10+ rounds
- Sidebar dedup keying on per-conversation identity (distinct chats from same platform stay separate)
- Gateway session sync skips dup / delete options for external sessions
- LLM Gateway routing metadata display — assistant turns and session metadata show the served model/provider, failover path, and model-switch warnings when response metadata includes `used_provider`, `used_model`, or `routing` (#732)
MCP integration
- MCP server management UI (System Settings → MCP Servers)
- Add / edit / delete MCP server entries
Distribution
- Docker support (multi-arch amd64 + arm64, HEALTHCHECK, UID/GID auto-detect)
- Two-container Docker compose (webui + agent)
- GHCR auto-publish on tag push
- Subpath mount support (reverse proxy at `/hermes/`)
- PWA installable from any browser
- Native macOS app — universal Intel + Apple Silicon, signed + notarized DMG, Sparkle 2 auto-update — see the `hermes-webui/hermes-swift-mac` repo
Forward work
Confirmed candidates (open feature requests with sprint-candidate or active interest)
| Theme | Tracking | Why |
|---|---|---|
| Persistent-host stability | #1458 | Bootstrap fork pattern crashes under launchd / systemd — partial fix shipped (foreground mode); state.db FD leak and HTTP-unhealthy wedge remain |
| Free-tier OpenRouter variants visible | #1426 | :free tool-support filter currently hides them from the picker |
| macOS scroll override regression | #1360 | Auto-scroll sometimes overrides user scroll on the desktop app |
| GLM dual-use (main + auxiliary) | #1291 | Currently mutually exclusive; same provider can't serve both surfaces |
| Auto-assign session to filtered project | #1468 | When user is filtering by project X, new session should default to project X |
| Update banner "What's new?" link | #1512 | Surface release highlights from the update banner |
| Sunset legacy LMSTUDIO_API_KEY env var | #1502 | Tracking issue — alias stays for one minor cycle, then removed |
| Hermes Agent dashboard cross-link | #1459 | Detect a running Hermes Agent and surface link in nav |
| Gateway status card in Settings | #1457 | Current gateway-status dots only on profile picker |
| Insights — daily token chart + per-model breakdown | #1456 | Existing usage badge is per-message; need rollup view |
| Logs tab — view agent / errors / gateway logs | #1455 | Currently requires terminal access to log files |
| Model picker collision handling | #1425 | Same-name models from different providers aren't disambiguated in dropdown |
| "Reveal in Finder" right-click on workspace | #1424 | macOS desktop app convenience |
| Configurable session persistence timing | #1406 | Currently every checkpoint, want operator control |
| Silent credential self-heal on 401 | #1401 | Gateway auth.json drift should resolve without user re-auth |
| LLM Wiki status panel | #1257 | On / off toggle for Wiki integration |
| Lightweight in-app Canvas editing | #1255 | Text canvas for prompt drafting / shared notes |
| Provider / Model source-of-truth alignment | #1240 | Reconcile WebUI vs CLI vs Gateway provider resolution |
| Built-in SearXNG web search | #1037 | Lightweight search tool with on / off toggle |
| Subagent session relationship view | #1004 | Show subagent hierarchy in sidebar with expand / collapse |
Backlog (deferred, listed for visibility)
- Insights / monitoring suite — agent heartbeat + alerts (#716), quota / rate-limit display (#706), data tabs (#722), monitor dashboard concepts (#766, #721)
- Native MCP server expose — Hermes WebUI as an MCP server for direct agent integration (#733)
- Teams / agents management panel — editable names, roles, assignments (#719)
- Web UI profile model alignment with Hermes runtime — design parity (#749)
- DOM windowing / message virtualization — for sessions with hundreds of messages (#734)
- Searchable global tool list (#697)
- Add agent / replace model modals (#698)
- Code execution inline cells — Jupyter-style cell rendering inside chat
- Sharing / public conversation URLs — requires hosted backend with access control (out of scope for self-host)
Intentionally not planned
- Full SwiftUI rewrite of the frontend — the WKWebView shell already gets 95% of native benefit
- App Store distribution — sandboxing breaks the local server model
- Real-time multi-user collaboration — single-user assumption throughout
- Plugin marketplace — Hermes skills cover this surface
- Anthropic / Claude proprietary features — Projects AI memory, Claude artifacts sync (not reproducible)
Sprint history
Per-version detail lives in CHANGELOG.md. The table below is a high-level chronology of major sprint themes; individual PR / fix detail moved to CHANGELOG to keep this file readable.
| Range | Theme | Highlights |
|---|---|---|
| Sprints 1–6 | Foundations + workspace | server / static split, JS module split, workspace CRUD, file editor, message queue + INFLIGHT, isolated test environment |
| Sprint 7 | Wave 2 core | Cron / skill / memory CRUD, session content search, health endpoint, git init |
| Sprint 8 | Daily-driver finish line | Edit + regenerate, regenerate last response, clear conversation, Prism.js, queue + INFLIGHT polish |
| Sprints 9–10 | Codebase health + operational polish | app.js → 6 modules, server.py → api/ modules, tool card UX, background task cancel, regression tests |
| Sprint 11 | Multi-provider models + streaming | Dynamic model dropdown, smooth scroll pinning, routes extracted to api/routes.py |
| Sprint 12 | Settings + reliability + session QoL | Settings panel, SSE auto-reconnect, pin sessions, JSON import |
| Sprint 13 | Alerts + polish | Cron alerts, background error banner, session duplicate, browser tab title |
| Sprint 14 | Visual polish + workspace ops | Mermaid, message timestamps, file rename, folder create, session tags, archive |
| Sprint 15 | Session projects + code copy | Projects / folders, code copy button, tool card expand / collapse |
| Sprint 16 | Sidebar visual polish | SVG icons, action dropdown, pin indicator, project border, safe HTML rendering |
| Sprint 17 | Workspace polish + slash commands | Breadcrumb nav, slash command autocomplete, send key setting (#26) |
| Sprint 18 | Thinking display + workspace tree | File preview auto-close, thinking / reasoning cards, expandable directory tree (#22) |
| Sprint 19 | Auth + security hardening | Password auth, login page, security headers, body limit (#23) |
| Sprint 20 | Voice input + send button | Web Speech API voice, send button polish |
| Sprint 21 | Mobile responsive + Docker | Hamburger sidebar, mobile nav, slide-over files, Docker support (#21, #7) |
| Sprint 22 | Multi-profile support | Profile picker, management panel, seamless switching, per-session tracking (#28) |
| Sprint 23 | Agentic transparency | Token / cost display, subagent cards, skill picker in cron, profile-local storage |
| Sprint 24 | Web polish | rAF streaming, git detection, collapsible date groups, context ring (#80, #81, #82, #83) |
| Sprint 25 | macOS desktop application | Native Swift + WKWebView shell, universal DMG, Sparkle 2 auto-update — separate repo |
| Sprint 26 | Pluggable themes | Light / Slate / Solarized / Monokai / Nord, settings unsaved-changes guard, /theme |
| Sprint 27 | Theme polish | 30+ hardcoded colors → CSS variables, light theme final polish |
| Sprint 28 | Security hardening | Env race fix, random signing key, upload traversal, PBKDF2 |
| Sprints 29–32 | Model routing + custom endpoints + reasoning | Model routing by provider prefix, custom endpoint URL fix, OLED theme, top-level reasoning, message_count sync |
| Sprint 33 | Approval card + Lucide icons | Approval prompt surfaced, emoji → SVG, login CSP fix, update diagnostics |
| Sprint 34 | v0.50.0 UI overhaul | Composer-centric controls, Control Center modal, workspace state machine, collapsible date groups, rAF throttle, context ring |
| Sprints 35–37 | Onboarding + i18n + Spanish | First-run wizard, OpenRouter / Anthropic / OpenAI / Custom config, Spanish locale, Docker two-container, mobile Profiles button |
| Sprints 38–40 | Session + UI polish + Sprint 40 | Five-bug clean-up + sidebar timestamp + test port isolation |
| Sprints 41–42 | Renderer hardening + KaTeX + handoff | Context ring live usage, renderMd link / image / code stash chain, MEDIA: image rendering, gateway handoff foundation |
| Sprints 43+ | Continuous contributor sprints | Custom providers, Russian locale, IME fixes, model-switch toast, approval queue multi-slot, profile polish, font-size CSS, contributor wave |
Versioning conventions
- Patch (`v0.50.X`) — small batches, contributor PR releases, hotfixes
- Minor (`v0.X.0`) — sprint completion, new feature surface, architecture milestone
- Major (`v1.0.0`) — declared when CLI parity + Claude parity reach steady state and the feature surface stabilizes
Per-version detail and contributor attribution live in CHANGELOG.md.
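The versioning conventions above can be read as a rule on the numeric fields of a release tag. A small sketch of that reading, assuming tags always have the form `vMAJOR.MINOR.PATCH`. The function name `release_kind` is illustrative:

```python
import re


def release_kind(tag: str) -> str:
    """Classify a tag as 'patch', 'minor', or 'major' by its last non-zero field."""
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"unrecognized version tag: {tag}")
    major, minor, patch = (int(part) for part in match.groups())
    if patch != 0:
        return "patch"  # e.g. contributor PR batches and hotfixes
    if minor != 0:
        return "minor"  # e.g. a sprint-completion release
    return "major"
```

Under this reading, `v0.51.31` is a patch release, `v0.50.0` a minor release, and `v1.0.0` a major release.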