PR #1358 added the client-side lineage collapse helper, but
/api/sessions often did not include _lineage_root_id for the WebUI
JSON sessions visible in the sidebar. In that case the helper has no
grouping key and multiple same-title continuation rows remain visible.
This PR:
- Reads parent_session_id and end_reason from state.db.sessions for
the WebUI sidebar's session ids
- Walks the parent chain when end_reason is 'compression' or
'cli_close', producing _lineage_root_id and _compression_segment_count
- Cycle-detects via a 'seen' set
- Preserves projected lineage metadata on imported/gateway session rows
- Allows sidebar collapse to group cross-surface continuation chains
(CLI-close → WebUI continuation) while keeping non-continuation
parent rows flat
Co-authored-by: Dennis Soong <dso2ng@gmail.com>
The new _handle_clarify_sse_stream handler in #1355 holds clarify._lock and
then calls clarify.get_pending(sid) under the lock. get_pending also acquires
_lock internally — and clarify._lock is a non-reentrant threading.Lock(),
so the second acquisition deadlocks the SSE handler thread the moment any
client connects to /api/clarify/stream.
Existing tests pass because they only exercise sse_subscribe, sse_unsubscribe,
_clarify_sse_notify, and submit_pending directly — none of them invoke the
route handler. The deadlock would only manifest when a real EventSource opens
the connection.
Reproduced with a tiny harness that holds _lock and calls get_pending: the
worker thread is still blocked after a 2s timeout. With the fix, both empty
and populated queue cases complete in <1ms.
Fix: read clarify._gateway_queues / clarify._pending inline under the same
_lock acquisition, mirroring the approval SSE handler's pattern at
api/routes.py:2785-2793. No recursive lock; head-of-queue snapshot is
identical to what get_pending would have returned.
Added tests/test_pr1355_sse_handler_no_deadlock.py with three tests:
- behavioural: empty queue snapshot completes within 2s
- behavioural: populated queue snapshot returns the head entry
- source-level invariant: routes.py must not call get_clarify_pending()
inside `with _clarify_lock:` block (locks the regression in)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the 1.5s HTTP polling loop for clarify with a Server-Sent Events endpoint at /api/clarify/stream that pushes clarify events to the browser instantly. Mirrors the approval SSE pattern from v0.50.248 (#1350) including all the correctness lessons:
- Atomic subscribe + initial snapshot under clarify._lock
- _clarify_sse_notify called inside _lock for ordering guarantees (no notify-out-of-order race)
- Notify passes head=q[0].data (head-fidelity, not the just-appended entry)
- resolve_clarify also calls notify after pop so trailing clarifies surface immediately (no stuck-clarify bug)
- Empty-state notify with None,0 after pop-empty so frontend hides the card
- 30s keepalive comments, _CLIENT_DISCONNECT_ERRORS handling
- Bounded queue (maxsize=16) with silent drop on full
- Frontend: EventSource with automatic 3s HTTP polling fallback on onerror
Co-authored-by: fxd-jason <wujiachen7@gmail.com>
Session.load_metadata_only().compact() was dropping is_cli_session, source_tag, session_source, and source_label, so imported CLI/gateway sessions lost their provenance in sidebar/API payloads. Adds these to METADATA_FIELDS and Session.compact().
Co-authored-by: Dennis Soong <dso2ng@gmail.com>
- api/streaming.py SSE payload now falls back to agent.model_metadata.get_model_context_length when compressor doesn't supply context_length (mirrors the session-save fallback shipped in v0.50.247).
- api/streaming.py also falls back to s.last_prompt_tokens to avoid using the cumulative input_tokens counter.
- static/ui.js tracks rawPct separately from pct and shows '(context exceeded)' tooltip when rawPct > 100 instead of misleading '100% used (0% left)'.
- static/messages.js clears 'Uploading...' composer status after upload completes.
Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
Pre-release Opus review caught three correctness bugs in the original
PR #1350 SSE wiring beyond the snapshot/subscribe race:
A) **Notify-ordering race (MUST-FIX A):** _approval_sse_notify took _lock
only for the subscriber-list snapshot, then released it before
put_nowait. With two parallel submit_pending calls, T2's notify
could fire before T1's, leaving the UI showing pending_count=1 while
the server actually had 2 queued.
C) **Trailing approval lost (MUST-FIX C):** _handle_approval_respond
never called _approval_sse_notify after popping. With parallel
tool-call approvals (#527), a second approval queued behind the one
being responded to was invisible until the next event ever fired —
in practice, the agent thread parked on it would appear hung.
D) **Payload showed tail not head (MUST-FIX D):** payload built from
the just-appended entry instead of queue[0]. /api/approval/pending
returns the head; SSE returned the tail. Diverging contracts.
Fix:
- Split into _approval_sse_notify_locked (caller holds _lock, no
internal locking) and _approval_sse_notify (convenience wrapper).
- submit_pending: call _locked variant inside the queue-mutation lock,
passing queue_list[0] as head.
- _handle_approval_respond: call _locked variant inside the pop lock,
passing the new head (or None/0 if queue is empty).
- Restore fallback poll to 1500ms (was bumped to 3000ms; degraded-mode
parity with v0.50.247 is more important than save 1.5s of polling).
New regression tests in tests/test_pr1350_sse_notify_correctness.py:
- test_second_submit_pending_sends_head_not_tail (D)
- test_respond_to_first_pushes_second_as_new_head (C)
- test_respond_to_only_pending_pushes_empty_state (C edge)
- test_pending_count_is_monotonic_under_contention (A)
Updated test_approval_sse.py to pin the new contract:
- _approval_sse_notify_locked(session_key, head, total)
- 1500ms fallback interval
Total: 3411 tests passing.
Co-authored-by: jasonjcwu <jasonjcwu@users.noreply.github.com>
Bundles:
- #1349 fix(ui): show context indicator percentage without explicit context_length
- #1350 feat(approval): SSE long-connection for real-time approval notifications
Pre-release fixes applied:
- Inline subscribe + snapshot under a single _lock acquisition in
_handle_approval_sse_stream() to close the snapshot/subscribe race
flagged in pre-release review. A submit_pending() arriving between
the snapshot read and subscribe call would have been lost (appended
to _pending after our snapshot AND notified to subscribers before we
joined). Now atomic.
- Added tests/test_pr1350_sse_atomic_subscribe.py (4 source-level
invariants covering the atomic-lock-block guarantee).
Co-authored-by: jasonjcwu <jasonjcwu@users.noreply.github.com>
Replaces the 1.5s HTTP polling loop with a Server-Sent Events endpoint
at /api/approval/stream that pushes approval events to the browser
instantly. The backend uses a thread-safe subscriber registry
(_approval_sse_subscribers) with bounded queues to prevent memory
leaks from slow clients. Frontend uses EventSource with automatic
fallback to 3s HTTP polling on SSE error.
- Backend: subscribe/unsubscribe/notify lifecycle in api/routes.py
- New route: GET /api/approval/stream?session_id=
- submit_pending() now calls _approval_sse_notify() after queue append
- Frontend: EventSource with onerror -> _startApprovalFallbackPoll()
- 30s keepalive comments, _CLIENT_DISCONNECT_ERRORS handling
- 42 new tests (static analysis + unit + concurrency)
Co-authored-by: jasonjcwu <jasonjcwu@users.noreply.github.com>
* fix(streaming): fallback to model_metadata for context_length when compressor missing (#1318 follow-up)
PR #1318 (shipped in v0.50.246 via PR #1341 + commit a5c10d5) persisted
context_length on the session so the context-ring indicator survives
page reloads. But the writer only fired when agent.context_compressor
was present and reported a non-zero value. Fresh agents, interrupted
streams, or compressors without the attribute would still leave
s.context_length=0 — and the indicator would still show 0% on reload.
This follow-up adds a fallback that calls
agent.model_metadata.get_model_context_length(model, base_url) when the
compressor didn't populate the value. The function returns a sensible
static context window for any known model (with a 256K default for
unknown models). Wrapped in a broad try/except because older
hermes-agent builds may not expose the helper.
Sourced from PR #1344 (@jasonjcwu) — extracted into this focused
follow-up after #1344 was closed as superseded by #1341.
Adds 6 structural tests covering: import + call presence, falsy-gate,
agent.model/base_url passing, exception swallowing, save() ordering,
result assignment.
Closes the data-flow gap in #1318 for the compressor-missing case.
* test: relax pr1341 block-size assertion to accommodate the new fallback
---------
Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
Pre-release Opus review on PR #1345 (Finding #3) flagged that
get_cli_sessions() was calling ensure_cron_project() once per cron
session in the loop — N lock acquires + N disk reads of projects.json
for N cron sessions per sidebar refresh.
Hoist a per-scan lazy memoizer (_cron_pid()) so we pay the resolution
cost at most once per get_cli_sessions() call. The memoizer is local
to the function (closure) so it's naturally scoped to a single scan
and doesn't leak across calls.
Could also have made ensure_cron_project() module-memoized, but that
would need invalidation on project deletion — the per-scan cache is
simpler and correct without coordination.
Pre-release Opus + nesquena review on v0.50.246 caught that PR #1341
added the data-structure scaffolding (Session.__init__ accepts the 3
fields, save() persists them, compact() exposes them, GET /api/session
returns them) but did NOT add the writer that actually populates them.
Without a writer, the user-visible bug (context-ring shows 0% after
page reload) was NOT fixed by #1341 alone — the fields stayed None
forever because nothing wrote to s.context_length anywhere.
Adds the writer at api/streaming.py:2188 (post-merge per-turn save block,
before s.save()) so the values from agent.context_compressor land on
disk and survive page reloads.
Also moves the SSE usage payload comment to clarify that the live SSE
payload and the session-level persistence are now distinct paths
(payload below, persistence above).
Adds tests/test_pr1341_context_window_persistence.py — 6 structural +
round-trip tests covering Session __init__/save/compact, the routes
response, and the streaming.py writer placement.
Closes#1318 (the actual user-visible bug, not just the scaffolding).
Pre-release Opus review on v0.50.246 caught a SHOULD-FIX in PR #1338's
cancel_stream synthesis: the symmetric substring guard
(_pending_user in _last_content OR _last_content in _pending_user) was too
loose. Common confirmation replies ("ok", "yes", "go") in the prior turn
would match longer follow-up prompts ("ok please continue"), the synthesis
would be skipped, and the user's typed text would be lost — exactly the
data-loss bug #1298 was supposed to fix.
The fix: gate the substring check on a timestamp comparison. Only treat
the latest user turn as 'already merged by the streaming thread' if its
timestamp is at or after pending_started_at. Earlier turns whose content
happens to be a substring of the pending must not short-circuit synthesis.
Also drops the symmetric (_last_content in _pending_user) branch — that
direction was the false-positive vector. Keeps the equality and prefix
match (workspace-prefix tolerance from the streaming thread).
Adds tests/test_issue1298_cancel_and_activity.py::
test_cancel_synthesizes_when_prior_turn_content_is_substring_of_pending —
regression for the exact 'ok' → 'ok please continue' scenario.
From PR #1338. Already independently APPROVED by nesquena before being absorbed into v0.50.246.
CHANGELOG entries from this PR were dropped during squash (the v0.50.245 section is already
shipped); they will be re-added under [v0.50.246] in the release commit.
Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
release: v0.50.243
Batch release of 2 PRs.
- #1301 — fix: remove PRIMARY chip badge + add Claude Opus 4.7 label
Drops the chip-projected configured-model badge added in #1287 (chip
width 235px → 164px). Adds Claude Opus 4.7 label entries so the picker
no longer renders "Claude Opus 4 7" (missing dot).
Independently reviewed and approved by nesquena (commit c0bbd23).
- #1297 (@franksong2702) — fix: preserve cron output response snippets
Fixes#1295. /api/crons/output now preserves the ## Response section
when a large skill dump appears in the prompt section; falls back to
file tail when no marker exists.
Tests: 3254 passed, 2 skipped, 3 xpassed.
Independently reviewed and approved by nesquena (commit b262e4d).
Reverts the global assistant serif rule and removes the Calm theme that were shipped in v0.50.240 PR #1282. Pure deletion; 3252 tests passing. Override on independent review per Nathan.
When the user explicitly selects @provider:model from the picker,
_resolve_compatible_session_model() was stripping the prefix because
the hint matched the active provider (hint_matches_active=True → return bare_model, True).
This caused:
- The picker to snap back to the first duplicate entry on next render
- resolve_model_provider() to use the default provider instead of the
explicitly selected one, running the agent on the wrong backend
The hint_matches_active branch was intended for normalizing stale cross-
provider session models. But an @provider:model where the hint IS the
active provider is not stale — it is the user's deliberate selection.
Fix: return (model, False) so the full @provider:model survives to
resolve_model_provider() in config.py, which already handles it correctly.
Updates test_active_at_provider_session_model_preserved_with_hint and
adds test_issue1253_duplicate_model_id_active_provider_hint_preserved.
Closes#1253
All LRU cache operations (get, set, move_to_end, popitem) are already
protected by SESSION_AGENT_CACHE_LOCK. This addresses the reviewer's
concern about thread safety in multi-threaded ASGI servers.
Problem:
- GET /api/mcp/servers returned 404 error
- MCP servers management UI could not load server list
- Root cause: route was placed outside handle_get(), in unreachable code
Root Cause:
- The MCP servers GET route was incorrectly placed after handle_get() returned False (404)
- handle_get() function returns False at line ~1224, so any code after it won't execute
- The route was also in handle_post() area but without proper method checking
Solution:
- Moved GET /api/mcp/servers route inside handle_get() before the return False statement
- Removed the misplaced route from the old location (originally around line 1636)
- Also updated /api/profiles response format to include full profiles list
Testing:
- After restart: curl http://localhost:8787/api/mcp/servers returns {"servers": []}
- No more 404 errors
- WebUI can now properly load MCP servers list
The agent cache stores full AIAgent instances (each holding complete
conversation history) without size limit. Long-running servers with
many sessions can accumulate unbounded memory usage.
Changes:
- Replace dict with OrderedDict for LRU tracking
- Add SESSION_AGENT_CACHE_MAX = 50 limit
- Evict least-recently-used entries when cache exceeds limit
- Call move_to_end() on cache hits to maintain LRU order
This prevents memory exhaustion on servers with many active sessions.
When using custom providers with private IPs (like AxonHub on internal
networks), the SSRF protection incorrectly blocks API calls to the user's
own configured endpoint.
This fix automatically adds the model.base_url hostname to the SSRF
trusted hosts list, since it's explicitly configured by the user.
Fixes issues where /api/models and /v1/* endpoints fail silently
when using custom providers with private IPs or IPv6 addresses.
B1: fix stored XSS in MCP delete button — replace inline onclick with
data-mcp-name attribute + event delegation (panels.js)
B2: fix zip/tar-slip via startswith prefix collision — use
is_relative_to(); track actual extracted bytes instead of trusting
member.file_size (upload.py)
B3: add NVIDIA NIM endpoint to _OPENAI_COMPAT_ENDPOINTS and
_SUPPORTED_PROVIDER_SETUPS so provider is reachable (routes.py,
onboarding.py)
H1: add terminalResizeHandle element to index.html and return it from
_terminalEls() so resize-by-drag works (index.html, terminal.js)
H2: fix dead get_terminal() branch — return None for dead terminals
instead of always returning term (terminal.py)
H3: replace os.environ.copy() with a safe allowlist in PTY shell env
so API keys are not exposed inside the terminal (terminal.py)
H5: make model dedup deterministic — sort groups by provider_id
alphabetically before first-occurrence assignment (config.py)
H7: add pid regex validation before OAuth probe; constrain key_source
to a closed set of safe values (providers.py)
M8: add double-run guard for cron run-now — reject if job is already
tracked as running (routes.py)
- Add _strip_masked_values() to skip masked placeholders in PUT endpoint,
preserving the original stored secret values instead of overwriting them
- Fix transport badge to gracefully handle unknown/future transport types
with a fallback that shows the raw string
- Add TestStripMaskedValues (5 tests) for the round-trip protection logic
- Addresses reviewer feedback on secret masking semantics and transport badge
- Add GET /api/mcp/servers (list with masked secrets)
- Add PUT /api/mcp/servers/<name> (add/update stdio and http servers)
- Add DELETE /api/mcp/servers/<name> (remove server)
- MCP section in System settings with server list, add/delete form
- Auto-detect transport type (stdio vs http) from server config
- Mask sensitive values (API keys, tokens, passwords) in list response
- Uses showConfirmDialog for delete confirmation (no native confirm)
- i18n: 21 keys across 7 locales
- 21 tests (list, save, delete, mask_secrets, validation)