Hermes Bot 3a23efd923 docs: rewrite ROADMAP.md and SPRINTS.md for v0.50.281 currency
Both files had drifted significantly from the actual current state of the
project:

ROADMAP.md previously contained:
- A ~75-row 'sprint history' table that overlapped with CHANGELOG.md
- A 'Wave 2 Core' section frozen at Sprint 7 progress
- A 'Wave 2: Full CRUD' nested section repeating the same Wave 7 items
- A 'User Requested Features' table that double-counted the same shipped issues
- A 'Feature Parity Checklist' with many unchecked boxes that were actually shipped
  (branch/fork via #465, LLM-generated session titles via auto_title_refresh_every,
  workspace git detection at api/workspace.py:719, code execution and TTS
  reclassified, etc.)

SPRINTS.md previously contained:
- 1159 lines of historical sprint plans (Sprints 11-26)
- Inline planning detail more appropriate for the private workspace
- Stale 'as of v0.50.245' header with 'next sprint Sprint 24' reference
- Track-A/B/C breakdowns from sprints already long-merged

Rewrite:

ROADMAP.md (now 397 lines, was 363):
- Status snapshot table at the top
- Architecture table reflecting current layout (api/ ~20k LOC, static/*.js ~26k LOC)
- Feature parity checklist reorganized by surface (chat / sessions / workspace /
  cron / skills / memory / profiles / config / security / visual / voice /
  mobile / i18n / gateway / MCP / distribution) with every line currently in
  master correctly checked
- 'Forward work' section split into confirmed candidates (with tracking issue
  numbers) vs deferred backlog vs intentionally not planned
- 'Sprint history' compressed to a single chronological theme table — per-version
  detail explicitly redirects to CHANGELOG.md
- Versioning conventions documented

SPRINTS.md (now 165 lines, was 1159):
- Forward-looking only — no historical sprint plans (those live in CHANGELOG.md)
- Active sprint candidates table sourced from the sprint-candidate label
- Planning principles section (phase-0 fit assessment, salvage over absorb,
  independent-review gate, per-PR release velocity, no feature creep mid-PR,
  pre-release gate)
- Sprint shape table (typical 3-7 day sprint with phases)
- Out-of-scope section centralized
- Template for new sprint plans

Also updates TESTING.md test count 3990 → 3995 to match actual pytest collect.

No private workspace info, agent infra references, or contributor stipend
content. References to the maintainer's private planning notes are
acknowledged as 'in a private workspace' without further specifics — same
disclosure pattern most open-source projects use.
2026-05-03 17:39:53 +00:00


Hermes Web UI — Sprint Planning

Forward-looking sprint plan and active queue.

Current state: v0.50.281 | 3995 tests | port 8787
Target A (CLI parity): Complete
Target B (Claude parity): ~95% (full subagent transparency UI and code-execution cells remain)

Per-version detail: CHANGELOG.md
Sprint history (chronology): ROADMAP.md


How sprints work here

A sprint is a thematic batch, usually 3-8 PRs landed together as a release. Each sprint has:

  1. Theme — single-sentence framing of what changes
  2. Items — selected work, with tracking issue / PR
  3. Out of scope — what is explicitly deferred and why
  4. Risks — known sharp edges
  5. Retro — once shipped, what we learned

External contributor PRs that don't fit a planned sprint get released individually as patch versions (v0.50.X). Sprints are reserved for coherent batches where the items reinforce each other.


Active sprint candidates

These are the issues currently labeled sprint-candidate — the inbox the next sprint plan draws from. Each item already has a confirmed root cause or design direction.

| # | Title | Type | Sprint fit |
|---|-------|------|------------|
| #1458 | Persistent-host crashes: bootstrap fork pattern, state.db FD leak, HTTP-unhealthy wedge | bug, stability | Stability sprint candidate |
| #1426 | OpenRouter free-tier :free variants invisible (tool-support filter) | bug + feat | Model picker sprint candidate |
| #1362 | In-app OAuth login for Codex and Claude (currently terminal-only) | feat, ux | Onboarding sprint candidate |
| #1360 | macOS desktop app: auto-scroll overrides user scroll (#677 regression) | bug, ux | Desktop app polish |
| #1291 | GLM mutually exclusive between main agent and auxiliary title generation | bug | Auxiliary-route sprint |

The active queue stays small — it's the next 1-2 sprints' worth of work. The broader backlog of feature requests lives in ROADMAP.md → Forward Work and on the GitHub enhancement label.
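
As a rough illustration of how the candidate inbox could be pulled from the label, the sketch below builds a GitHub REST API query for open issues tagged sprint-candidate and formats the response. The owner name is a placeholder and both helper functions are hypothetical; only the `/repos/{owner}/{repo}/issues` endpoint with its `labels`, `state`, and `per_page` parameters comes from GitHub's public API.

```python
from urllib.parse import urlencode

# Placeholder repository coordinates (assumption); substitute the real owner/repo.
OWNER = "example-owner"
REPO = "hermes-webui"


def candidate_issues_url(owner: str, repo: str, label: str = "sprint-candidate") -> str:
    """Build the GitHub REST API URL that lists open issues carrying `label`."""
    query = urlencode({"labels": label, "state": "open", "per_page": 100})
    return f"https://api.github.com/repos/{owner}/{repo}/issues?{query}"


def summarize(issues: list[dict]) -> list[str]:
    """Turn issue JSON objects into '#number title' lines, skipping pull requests."""
    # The issues endpoint also returns PRs; those objects carry a 'pull_request' key.
    return [f"#{i['number']} {i['title']}" for i in issues if "pull_request" not in i]


if __name__ == "__main__":
    # Prints the URL only; fetching it (e.g. with urllib.request) is left to the caller.
    print(candidate_issues_url(OWNER, REPO))
```

The URL builder is kept separate from any network call so the query shape can be checked offline.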


Planning principles

Phase-0 fit assessment. Every sprint candidate gets a marginal-benefit screen first: does this make the product noticeably better for real users, or does it add surface area that costs more to maintain than it earns? The screen asks five questions, covering need, shape, bloat, clutter, and scope.

Salvage over absorb. When a contributor PR is partial or scoped wrong, prefer splicing the good parts into a maintainer-side PR with Co-authored-by attribution rather than asking for multiple rebase rounds. Only absorb whole when the PR is genuinely shippable as-is.
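
For the attribution step, a minimal sketch of appending Co-authored-by trailers to a maintainer-side commit message might look like the following. The function name and the sample contributor are hypothetical; the trailer format itself (`Co-authored-by: Name <email>` in the final paragraph of the message) is the one GitHub recognizes.

```python
def with_coauthors(message: str, coauthors: list[tuple[str, str]]) -> str:
    """Append one Co-authored-by trailer per (name, email) pair to a commit message."""
    trailers = [f"Co-authored-by: {name} <{email}>" for name, email in coauthors]
    if not trailers:
        return message.rstrip() + "\n"
    # Trailers form the last paragraph, separated from the body by a blank line.
    return message.rstrip() + "\n\n" + "\n".join(trailers) + "\n"


if __name__ == "__main__":
    # Hypothetical contributor, for illustration only.
    print(with_coauthors("fix: salvage scroll handling from contributor PR",
                         [("Jane Doe", "jane@example.com")]))
```

The resulting text can be passed to `git commit -F <file>` or pasted into the commit editor.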

Independent-review gate. Self-built PRs need either (a) Opus advisor pass on the merged stage diff, or (b) independent review from a separate reviewer. High-risk batches (large LOC, security, locks, durability) get both.

Per-PR release velocity. When a single PR fixes a real bug and has a clean review, ship it as its own patch release the same day rather than waiting for a sprint batch. Friction-free is the goal — sprint batches exist for coherence, not for arbitrary grouping.

No feature creep mid-PR. If a contributor proposes a scope addition during review, file it as a separate PR and link from the original. The current PR keeps its original boundaries.

Pre-release gate (mandatory). Every release runs:

  1. pytest tests/ -q --timeout=120 clean
  2. Browser sanity check (HTTP-level API tests against a test server)
  3. Opus advisor pass on the merged stage diff with a written brief
  4. CHANGELOG.md + ROADMAP.md + TESTING.md version stamp
  5. CI green on Python 3.11 / 3.12 / 3.13

Skipping any of these requires a documented "I'm doing an override" from the maintainer.
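
The gate logic above can be sketched as a small checker: a release is clear only when all five steps have passed, or when the maintainer has written down an override. This is an illustrative model, not the project's actual tooling; the step identifiers, `GateResult`, and `evaluate_gate` are all assumed names.

```python
from dataclasses import dataclass

# Step identifiers paraphrase the five-item list above (names are illustrative).
GATE_STEPS = ("pytest", "browser_sanity", "advisor_pass", "version_stamp", "ci_green")


@dataclass
class GateResult:
    ok: bool                  # True when the release may proceed
    missing: tuple[str, ...]  # steps that did not pass, in gate order
    overridden: bool          # True when a documented override covers the gaps


def evaluate_gate(passed: set[str], override_note: str = "") -> GateResult:
    """Check which gate steps are missing and whether an override covers them."""
    missing = tuple(step for step in GATE_STEPS if step not in passed)
    # An override only counts if something is missing and the note is non-empty.
    overridden = bool(missing) and bool(override_note.strip())
    return GateResult(ok=not missing or overridden, missing=missing, overridden=overridden)


if __name__ == "__main__":
    print(evaluate_gate({"pytest", "ci_green"}))
```

Requiring a non-empty note mirrors the rule that a skipped step needs a documented override rather than a silent one.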


Sprint shape

A typical sprint runs 3-7 days end to end:

| Phase | Duration | Output |
|-------|----------|--------|
| Triage | 0.5 day | Active queue → selected items + scope notes |
| Design / spike | 0.5-1 day | Design notes for items needing them; deferrals documented |
| Build | 2-4 days | Each item on its own branch + PR for independent review |
| Review | 0.5-1 day | Maintainer + Opus advisor passes; SHOULD-FIX absorbed in-release |
| Stage + ship | 0.5 day | Stage branch, full test suite, release PR, tag, deploy, verify live |
| Hygiene | 0.5 day | Close PRs / issues, GitHub release notes, docs sync, retro |

Smaller sprints (1-2 PRs) compress to 1-2 days end to end. Single-PR releases skip the stage branch and ship via a release PR directly off the contributor's branch (or a maintainer-rebased copy if the fork doesn't grant write access).


Sprint history

Per-version detail is in CHANGELOG.md. High-level theme chronology is in ROADMAP.md → Sprint History. A few notable sprints worth highlighting:

  • Sprint 19 — auth + security hardening. First sprint that made the app safe to leave running beyond localhost.
  • Sprint 21 — mobile responsive + Docker. Two-container compose enabled the first wave of self-host deployments.
  • Sprint 22 — multi-profile support. Major CLI-parity unlock; profile switching is now seamless without server restart.
  • Sprint 25 — macOS desktop application. Native Swift + WKWebView shell, universal Intel + Apple Silicon DMG, Sparkle 2 auto-update. Lives in the separate hermes-webui/hermes-swift-mac repo.
  • Sprint 26 — pluggable themes. CSS-variable-driven 8-theme system that lets community contributors add themes as pure CSS.
  • Sprint 34 — v0.50.0 UI overhaul. Composer-centric controls, Control Center modal, workspace state machine, rAF streaming throttle.

Out of scope (across all sprints)

These are intentionally not on the roadmap. Listing them here to save planning cycles.

  • Multi-user collaboration — single-user assumption throughout the codebase. Refactoring would be a from-scratch architecture change.
  • Sharing / public conversation URLs — requires hosted backend with access control + CDN. Out of scope for self-hosted.
  • Plugin marketplace — Hermes skills already cover this surface.
  • Anthropic / Claude proprietary features — Projects AI memory, Claude artifacts sync. Not reproducible.
  • Linux / Windows native app wrappers — macOS done; demand on other platforms not yet established. Web UI works in any browser.
  • App Store distribution — sandboxing breaks the local-server model.
  • Auto-update mechanism for the Python webapp — Sparkle 2 covers the Mac app; the webapp updates via git pull + restart, which is the same as every other Python service.

Templates

When opening a new sprint plan, copy this structure:

# Sprint NN — <theme>

**Version target:** vX.Y.Z
**Theme:** <single sentence>
**Date started:** YYYY-MM-DD
**Status:** PLANNED | IN PROGRESS | COMPLETED — vX.Y.Z

## Items
| # | Issue | Title | Complexity | Files | PR |
|---|-------|-------|------------|-------|-----|

## Rationale
Why these items, why now.

## Build approach
Per-item branch + PR; or single combined branch if items are tightly coupled.

## Out of scope
What did NOT make this sprint and why.

## Known risks
Sharp edges to watch during review.

## PR Status
| Issue | PR | Status |
|-------|-----|--------|

## Retro (post-ship)
What worked, what we'd do differently.

The maintainer's planning notes for each sprint live in the workspace repo (private), not in this file. This file is the public-facing planning shape.