Files
hermes-webui/static
nesquena-hermes e68f74ac99 fix(approval): close SSE notify-ordering, head-fidelity, and trailing-approval gaps (Opus MUST-FIX A/C/D)
Pre-release Opus review caught three correctness bugs in the original
PR #1350 SSE wiring beyond the snapshot/subscribe race:

A) **Notify-ordering race (MUST-FIX A):** _approval_sse_notify took _lock
   only for the subscriber-list snapshot, then released it before
   put_nowait. With two parallel submit_pending calls, T2's notify
   could fire before T1's, leaving the UI showing pending_count=1 while
   the server actually had 2 queued.

C) **Trailing approval lost (MUST-FIX C):** _handle_approval_respond
   never called _approval_sse_notify after popping. With parallel
   tool-call approvals (#527), a second approval queued behind the one
   being responded to was invisible until the next event ever fired —
   in practice, the agent thread parked on it would appear hung.

D) **Payload showed tail not head (MUST-FIX D):** payload built from
   the just-appended entry instead of queue[0]. /api/approval/pending
   returns the head; SSE returned the tail. Diverging contracts.

Fix:
- Split into _approval_sse_notify_locked (caller holds _lock, no
  internal locking) and _approval_sse_notify (convenience wrapper).
- submit_pending: call _locked variant inside the queue-mutation lock,
  passing queue_list[0] as head.
- _handle_approval_respond: call _locked variant inside the pop lock,
  passing the new head (or None/0 if queue is empty).
- Restore fallback poll to 1500ms (was bumped to 3000ms; degraded-mode
  parity with v0.50.247 is more important than save 1.5s of polling).

New regression tests in tests/test_pr1350_sse_notify_correctness.py:
- test_second_submit_pending_sends_head_not_tail (D)
- test_respond_to_first_pushes_second_as_new_head (C)
- test_respond_to_only_pending_pushes_empty_state (C edge)
- test_pending_count_is_monotonic_under_contention (A)

Updated test_approval_sse.py to pin the new contract:
- _approval_sse_notify_locked(session_key, head, total)
- 1500ms fallback interval

Total: 3411 tests passing.

Co-authored-by: jasonjcwu <jasonjcwu@users.noreply.github.com>
2026-04-30 18:45:15 +00:00
..
2026-04-29 06:06:01 +00:00
2026-04-29 06:06:01 +00:00
2026-04-29 21:34:27 -07:00
2026-04-29 21:34:27 -07:00