Commit Graph

215 Commits

Author SHA1 Message Date
nesquena-hermes 935d9e6402 Stage 379: PR #2461
# Conflicts:
#	CHANGELOG.md
2026-05-17 23:35:18 +00:00
swftwolfzyq b2ee7e365f Merge latest origin/master into codex/workspace-prefix-display-fix 2026-05-17 23:44:16 +08:00
swftwolfzyq 3553e63a51 Merge origin/master into codex/workspace-prefix-display-fix 2026-05-17 23:39:12 +08:00
starship-s 625d8d02fd fix: preserve memory lifecycle mark ordering 2026-05-17 05:16:46 -06:00
starship-s eb70a6dc5d fix: align WebUI memory commits with CLI boundaries 2026-05-17 05:04:57 -06:00
starship-s aecad0f427 [verified] Fix WebUI memory session lifecycle commits 2026-05-17 03:30:06 -06:00
nesquena-hermes 47c210899e Stage 374: PR #2421 — fix(cache-tokens): surface provider prompt-cache read/write tokens in WebUI usage by @Michaelyklam (fixes #2419)
Co-authored-by: Michael Lam <michael@example.local>
2026-05-17 02:49:34 +00:00
nesquena-hermes 8a950cfbdd Stage 373: PR #2417 — fix(streaming): stale compaction task resume on fresh greetings (closes #2308, supersedes #2309)
Co-authored-by: Frank Song <franksong2702@gmail.com>
2026-05-17 00:22:22 +00:00
Hermes Agent b937cf3583 Stage 370: PR #2390 — Fix live progress Activity grouping by @franksong2702
# Conflicts:
#	CHANGELOG.md
2026-05-16 20:21:58 +00:00
Hermes Agent ade7401ae1 Stage 369: PR #2396 — fix(streaming): preserve session agents for credential pools by @starship-s 2026-05-16 20:03:44 +00:00
starship-s 727e3c9c8f fix(streaming): preserve session agents for credential pools 2026-05-16 13:05:25 -06:00
Frank Song 2dfe3ffb42 Fix live progress activity grouping 2026-05-16 23:37:44 +08:00
Michael Lam 962b3840e6 fix: strip historical images in text mode 2026-05-16 03:55:12 -07:00
Hermes Agent b293bf8bc5 stage-364: Opus-caught live SSE event_id fix (side-channel approach)
Replace the earlier frontend-reset approach with a backend side-channel
approach that preserves the queue (event, data) tuple shape.

Problem (Opus catch):
- Live SSE frames emitted by _sse() in api/streaming.py:2296 carried no
  'id:' field. Only journal-replay frames (via _sse_with_id) emitted IDs.
- Frontend's _lastRunJournalSeq cursor stayed at 0 during live streaming.
- Mid-stream error → reconnect-to-replay arrived with after_seq=0.
- Server replayed every journaled event from seq 1.
- assistantText (closure-scoped) had accumulated all live tokens already
  → double-rendered output.

Fix:
- api/config.py: STREAM_LAST_EVENT_ID: dict = {} module-level dict.
- api/streaming.py put(): capture journal event_id, write to
  STREAM_LAST_EVENT_ID[stream_id]. Keep queue tuple as (event, data).
- api/routes.py _handle_sse_stream: read STREAM_LAST_EVENT_ID[stream_id]
  at emit time, use _sse_with_id when set.
- api/streaming.py finally block: pop STREAM_LAST_EVENT_ID for cleanup.

Why side-channel instead of 3-tuple:
- Earlier attempt (queue tuple → (event, data, event_id)) broke 4 existing
  tests: test_cancel_interrupt, test_sprint42, test_sprint51,
  test_issue1857_usage_overwrite. These all unpack 'event, data = q.get()'.
- Frontend-reset approach (reset assistantText before replay) broke 3
  other tests: test_smooth_text_fade, test_streaming_markdown,
  test_streaming_race_fix. _wireSSE must NOT reset accumulators because
  legacy reconnect doesn't replay events; only journal-replay does.

Side-channel preserves both invariants:
- Queue contract stays (event, data) — legacy consumers unbroken.
- Frontend accumulators stay alive on _wireSSE — legacy reconnect unbroken.
- Live SSE emits 'id:' so the journal cursor advances correctly.

6 regression tests added in test_stage364_opus_live_sse_event_id.py.
1 existing test (test_run_journal_streaming_static.test_streaming_journals_sse_events_before_queue_delivery) updated to be tuple-shape-agnostic.

Test results:
- Full pytest: 5713 passed, 10 skipped, 1 xfailed, 2 xpassed, 0 failed
- Previously-failing 5 tests: ALL PASS
- 6 new regression tests: ALL PASS
2026-05-16 03:58:54 +00:00
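The side-channel approach described in b293bf8bc5 can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function names `put`, `emit_next`, `finish`, `sse`, and `sse_with_id` are simplified stand-ins, and only the `STREAM_LAST_EVENT_ID` dict and the `(event, data)` tuple shape come from the commit message.

```python
import queue

# Module-level side channel, as named in the commit message.
STREAM_LAST_EVENT_ID: dict = {}


def sse(event, data):
    # Plain SSE frame with no 'id:' field.
    return f"event: {event}\ndata: {data}\n\n"


def sse_with_id(event, data, event_id):
    # SSE frame that carries an 'id:' field so the client cursor can advance.
    return f"id: {event_id}\nevent: {event}\ndata: {data}\n\n"


def put(q, stream_id, event, data, journal_seq):
    # Record the journal sequence number out of band, then enqueue the
    # unchanged 2-tuple so existing 'event, data = q.get()' consumers still work.
    STREAM_LAST_EVENT_ID[stream_id] = journal_seq
    q.put((event, data))


def emit_next(q, stream_id):
    # Read the side channel at emit time; fall back to an id-less frame.
    event, data = q.get()
    event_id = STREAM_LAST_EVENT_ID.get(stream_id)
    if event_id is not None:
        return sse_with_id(event, data, event_id)
    return sse(event, data)


def finish(stream_id):
    # Cleanup, mirroring the 'finally' block described in the commit.
    STREAM_LAST_EVENT_ID.pop(stream_id, None)
```

In this sketch the dict holds only the most recent id per stream, so the queue contract is untouched; how the real code handles several events queued between emits is not stated in the commit message.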
Frank Song 3b96035af0 Add WebUI run event journal replay 2026-05-16 02:58:34 +00:00
Michael Lam 0e91f89ce3 fix: clear runtime fields on loaded compression snapshots 2026-05-15 17:55:35 -07:00
Hermes Agent 62e4d9b2f5 Merge pull request #2327 into stage-362
fix: use assistant name in cancel copy (dotBeeps)
2026-05-15 22:55:35 +00:00
Michael Lam 6799ec56cf test: retarget compression snapshot runtime regression 2026-05-15 15:29:28 -07:00
dot 🐶 3add6f450f fix: use assistant name in cancel copy
Replace the hardcoded Skyly cancellation wording with the configured bot_name from settings, falling back to Hermes when unset.

Keep the client-side fallback in sync by using window._botName if the session refresh after cancellation fails.

Co-authored-by: Obryn 🐉 <obryn-ai@dotbeeps.dev>
2026-05-15 16:00:30 -04:00
Hermes Agent 29d13953d6 stage-361: apply Opus SHOULD-FIX — allow _attachment_root() in _build_native_multimodal_message 2026-05-15 19:55:34 +00:00
Hermes Agent a8a27eeb7d stage-360: Opus follow-up — update _ENV_LOCK docstring to reflect narrow-lock semantics
Opus stage-360 review caught that the docstring at api/streaming.py:40-43
said 'around the entire agent run' which is no longer accurate after the
narrow-lock refactor. The lock is now held only briefly for the env-mutation
critical section; the agent runs outside the lock and the finally block
re-acquires to atomically restore env vars.

Docstring now points to both narrow-lock implementations as references:
- _run_agent_streaming at line ~2719 (the original pattern)
- profile_env_for_background_worker at api/profiles.py:715 (added stage-360)
2026-05-15 19:05:37 +00:00
Hermes Agent fb0e664a10 stage-360 maintainer fix: narrow _ENV_LOCK to env mutation only in profile_env_for_background_worker
#2299 introduced profile_env_for_background_worker() in api/profiles.py and
changed _ENV_LOCK from threading.Lock() to threading.RLock(). Both changes
were incorrect:

1. RLock masked rather than fixed the underlying deadlock. The QA
   test_env_lock_is_non_reentrant test exists precisely to enforce
   non-reentrance — RLock would let a single thread hold _ENV_LOCK across
   nested critical sections, which hides bugs while still allowing
   different-thread races.

2. The original context manager held _ENV_LOCK for the ENTIRE 'yield'
   duration, meaning the lock was held for the full background worker's
   runtime (title generation, compression, update summary — possibly
   many seconds). That blocked ALL other sessions on _ENV_LOCK, which
   the QA test_third_message_completes runtime test caught as a timeout
   on the third sequential message.

Fix: mirror the narrow-lock pattern from _run_agent_streaming:
  - Acquire _ENV_LOCK only for env mutation (set runtime_env + patch
    skill modules)
  - Release immediately, yield to worker (no lock held)
  - Reacquire in finally to restore env + skill modules

Restored _ENV_LOCK to threading.Lock(). All 20 QA tests now pass,
including test_third_message_completes (previously timing out, now 35s).
2026-05-15 17:11:45 +00:00
Hermes Agent 3b05929f1a Merge pull request #2299 into stage-360
Fix profile-scoped auxiliary routing for background workers (starship-s)
2026-05-15 16:15:39 +00:00
Hermes Agent b2ebbebf01 Merge pull request #2279 into stage-360
Fix WebUI stream completion recovery gaps (franksong2702, closes #2262 + #2168)
2026-05-15 16:15:38 +00:00
Hermes Agent 75a2464821 stage-359: apply Opus SHOULD-FIX — symmetric runtime-field clearing on snapshot load-and-mark path 2026-05-15 15:27:24 +00:00
Hermes Agent fb8b91019e Merge pull request #2295 into stage-359
fix: clear runtime fields on compression snapshots (ai-ag2026)

# Conflicts:
#	CHANGELOG.md
#	api/streaming.py
2026-05-15 15:06:35 +00:00
Hermes Agent 4826a31fbc Merge pull request #2285 into stage-359
fix: hide pre-compression snapshots from sidebar (dso2ng, refs #2230)

# Conflicts:
#	CHANGELOG.md
2026-05-15 14:55:19 +00:00
starship-s abb6057304 test(profiles): keep profile module reloads isolated 2026-05-15 04:14:09 -06:00
Frank Song cadcf983d5 Tighten silent failure shrink detection 2026-05-15 18:04:53 +08:00
starship-s 4ffecdd7c9 refactor(profiles): consolidate background profile env 2026-05-15 03:58:40 -06:00
Dennis Soong eb31b4ed1e test: tighten compression snapshot preservation coverage 2026-05-15 17:31:37 +08:00
starship-s aa1c7c24f4 fix(profiles): route background aux workers via session profile 2026-05-15 03:02:42 -06:00
ai-ag2026 3a4259476d fix: clear runtime fields on compression snapshots 2026-05-15 09:20:19 +02:00
Dennis Soong bfccdc5c94 fix: hide pre-compression snapshots from sidebar 2026-05-15 11:20:17 +08:00
Frank Song 5f9b9c02b2 Fix WebUI stream completion recovery gaps 2026-05-15 08:36:48 +08:00
fxd-jason 1e80b51560 fix: align usage-overwrite test FakeAgent with real agent message format
The FakeAgent in test_issue1857_usage_overwrite returned only 2 messages
(user + assistant) without the conversation history. The real agent always
returns the full history plus new messages. This mismatch caused the new
_has_new_assistant_reply helper (which checks only messages beyond the
pre-turn offset) to see len(result)==len(prev) and incorrectly flag the
turn as a silent failure.

Fix: prepend conversation_history to the FakeAgent's response so the
message list mirrors production behavior.
2026-05-14 14:48:08 +08:00
fxd-jason 120ec5eba2 fix: silent failure detection scans only new messages, not full history
When a provider error (401/429/rate-limit) causes the agent to return
without producing a new assistant reply, the WebUI should emit an
apperror event so the user sees an inline error. However, the detection
logic scanned ALL messages in result['messages'] — which includes the
full conversation history. If any prior turn had an assistant response,
_assistant_added would be True and the apperror would be silently
skipped, leaving the user staring at a blank response.

Extract a helper _has_new_assistant_reply(all_messages, prev_count)
that only inspects messages beyond the pre-turn history offset. Apply
it to both the main detection path and the self-heal/retry path.

Tests: 15 new cases covering history masking, empty content, whitespace,
edge-case shrinks, and multi-assistant scenarios.
2026-05-14 14:34:19 +08:00
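The helper described in 120ec5eba2 can be sketched as below. The signature follows the commit message; the body is an assumption about what "new assistant reply" means (a non-blank string `content` on an assistant message past the pre-turn offset), not the project's actual implementation.

```python
def has_new_assistant_reply(all_messages, prev_count):
    # Only look at messages appended during this turn. If the list did not
    # grow (or shrank), the slice is empty and the result is False.
    for msg in all_messages[prev_count:]:
        if msg.get("role") != "assistant":
            continue
        content = msg.get("content")
        # Assumption: empty or whitespace-only content does not count.
        if isinstance(content, str) and content.strip():
            return True
    return False
```

With this shape, an assistant message from an earlier turn can no longer mask a failed turn, because it sits before `prev_count`.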
Hermes Agent 3d34a72ee8 stage-353: apply Opus SHOULD-FIX — unconditional parent_session_id stamp on compression rotation
Opus identified that PR #2227's preservation block had two related bugs in
the parent_session_id handling:

1. During preservation save: code did
     _old_parent = s.parent_session_id
     s.parent_session_id = None
     s.save(touch_updated_at=False, skip_index=True)
     s.parent_session_id = _old_parent
   The save persisted parent=None to disk. The in-memory restoration didn't
   reach the disk copy. Result: a /branch fork session that subsequently
   compressed lost its 'Forked from X' badge on the preserved old snapshot.

2. Stamping the continuation: code did
     if not s.parent_session_id:
         s.parent_session_id = old_sid
   The 'if not' guard skipped the stamp when the session already had a
   parent_session_id from a prior fork. Result: fork-of-fork compression
   broke lineage — the continuation jumped back to the original fork parent
   instead of the just-preserved immediate predecessor snapshot.

Fix (matches Opus's recommendation):
  - Remove the parent clearing during preservation save (preserve as-is)
  - Drop the 'if not' guard; always stamp continuation to old_sid

This makes the lineage chain consistent: new → old → old.parent → ... root.
Traversal from the continuation always walks through the just-preserved
snapshot to get to its parent's parent, never jumping over the snapshot.

Two new regression tests pin both invariants:
  - test_parent_session_id_stamped_unconditionally (no 'if not' guard)
  - test_old_session_parent_preserved_during_archive_save (no parent=None)

Both pass against the fix. All 8 tests in the file pass.
2026-05-14 03:59:02 +00:00
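The lineage invariant from 3d34a72ee8 (new → old → old.parent → ... root) can be illustrated with a small model. The `Session` dataclass and the `rotate_for_compression` and `lineage` helpers are hypothetical; only the `parent_session_id` field and the unconditional stamp come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    session_id: str
    parent_session_id: Optional[str] = None


def rotate_for_compression(old, new_sid):
    # The preserved snapshot keeps its own parent untouched, and the
    # continuation is always stamped with the old session id (no 'if not' guard).
    return Session(session_id=new_sid, parent_session_id=old.session_id)


def lineage(sid, sessions):
    # Walk parent links from a session back to the root.
    chain = []
    while sid is not None:
        chain.append(sid)
        sid = sessions[sid].parent_session_id
    return chain
```

For a fork that later compresses, walking from the continuation passes through the preserved snapshot before reaching the fork's original parent.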
RØG3R L!M4 5bbf18324c fix: preserve session history during compression rotation (#2223)
The previous implementation renamed old_sid.json → new_sid.json during
context compression, destroying the only persistent copy of the full
conversation history. If the summarisation LLM call also failed, the
user was left with zero recoverable messages.

Fix:
- Remove the destructive old_path.rename(new_path) call
- Preserve old_sid.json as an immutable pre-compression archive
- Create new_sid.json as a fresh file via s.save()
- Set parent_session_id on the continuation session for lineage
- Save in-memory messages to old_sid.json if they're newer than disk

Test: test_issue2223_compression_no_rename.py (6 tests, all passing)
2026-05-14 03:02:44 +00:00
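The non-destructive rotation described in 5bbf18324c can be sketched at the file level. The function name, the JSON layout, and the `session_dir` parameter are assumptions for illustration; the step that saves newer in-memory messages to the old file is left out because the commit does not say how "newer" is determined.

```python
import json
from pathlib import Path


def rotate_session_files(session_dir, old_sid, new_sid, summary_messages):
    old_path = Path(session_dir) / f"{old_sid}.json"
    new_path = Path(session_dir) / f"{new_sid}.json"
    # No rename: the old file stays on disk as the pre-compression archive.
    continuation = {
        "session_id": new_sid,
        "parent_session_id": old_sid,  # lineage back to the archive
        "messages": summary_messages,
    }
    # The continuation is written as a fresh file.
    new_path.write_text(json.dumps(continuation))
    return old_path, new_path
```

If the summarisation step fails, the full history is still recoverable from the untouched old file.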
Frank Song 28ec3af697 fix: strip only leading user-asking wrapper line
Refs #2215 Fix B: remove the mid-response stripping hazard without losing leading multi-line wrapper cleanup.

The pattern now strips only a leading 'the user is asking' wrapper line and preserves the visible answer that follows. Add regression coverage for both the leading-wrapper and mid-response prose cases.
2026-05-14 09:14:28 +08:00
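The leading-only stripping in 28ec3af697 can be illustrated with an anchored regular expression. The pattern below is a hypothetical stand-in, not the project's actual pattern: it handles a single leading line and ignores the multi-line wrapper case the commit also mentions.

```python
import re

# Hypothetical pattern: anchored at the very start of the text, so the same
# phrase appearing mid-response is never touched.
_LEADING_WRAPPER = re.compile(r"\A\s*the user is asking[^\n]*\n+", re.IGNORECASE)


def strip_leading_wrapper(text):
    # Remove at most one leading wrapper line; keep the answer that follows.
    return _LEADING_WRAPPER.sub("", text, count=1)
```

Anchoring with `\A` is what removes the mid-response hazard: without it, ordinary prose containing the phrase could be cut.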
Frank Song dc213d47b8 fix: preserve literal thinking tags 2026-05-14 07:13:34 +08:00
Hermes Agent 7209e89ef4 stage-350: apply Opus SHOULD-FIX — tighten _partial_already_present dedup scope
Opus flagged that PR #2151's cancel-handler partial-dedup loop used a
substring check that was too broad: any short prior assistant reply
('OK', 'Here is the answer:') would dedup a longer new partial containing
it, silently dropping the partial and resurrecting the #893 data-loss bug.

Tightened to only dedup against actual prior _partial=True markers with
exact (whitespace-stripped) content match. Three new regression tests
added (short-non-partial-prefix-does-not-dedup, exact-partial-match-still-
dedups, same-content-non-partial-does-not-dedup).

10/10 partial-cancel tests pass after the fix. Also updated CHANGELOG with
the conflict-resolution notes for #2151 vs #2136 and the #2178 test-fix.
2026-05-13 21:11:01 +00:00
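The tightened dedup rule from 7209e89ef4 can be sketched as below. The function name and message shape are assumptions; the rule itself (match only prior `_partial=True` messages, by exact whitespace-stripped content) is taken from the commit message.

```python
def partial_already_present(messages, partial_text):
    target = partial_text.strip()
    for msg in messages:
        # Only prior partial markers are candidates for dedup.
        if msg.get("role") != "assistant" or not msg.get("_partial"):
            continue
        content = msg.get("content")
        # Exact match after stripping whitespace; no substring check.
        if isinstance(content, str) and content.strip() == target:
            return True
    return False
```

A short earlier reply such as 'OK' no longer suppresses a longer partial that merely contains it.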
Hermes Agent 3f851051cf Merge pull request #2151 into stage-350
fix: clarify cancelled chat turn status (Jordan-SkyLF)

Conflict resolution on api/streaming.py:4549-4567 (the cancel-handler
ownership guard). Both this PR and the already-shipped PR #2136 add a
guard at the same site against stale stream writebacks, from different
angles:

  - PR #2136 (HEAD): _stream_writeback_is_current(_cs, stream_id) — strictly
    dominates by checking the active_stream_id token equality.
  - PR #2151: 'worker won the race' check via (active_stream_id != stream_id
    and not pending_user_message), with _emit_cancel_event = False to suppress
    the terminal cancel event.

Resolution merges both: keep #2136's strictly-stronger condition for skip
detection, and adopt #2151's _emit_cancel_event = False semantic so the
cancel event isn't emitted in addition to skipping the writeback (when the
client may have already received the successful done payload).

55/55 tests pass across cancelled-turn-status + stale-stream-writeback +
the four cancel/data-loss sibling test files.
2026-05-13 20:44:44 +00:00
Hermes Agent 7150e9fe70 Merge pull request #2202 into stage-349
feat: show early session titles on chat start (Jordan-SkyLF)
2026-05-13 19:03:03 +00:00
Jordan SkyLF 0381294f1c feat: add early session provisional titles 2026-05-13 11:37:11 -07:00
MrFant 520795fdd2 fix: preserve reasoning_content in API message whitelist
Providers like Xiaomi MiMo, DeepSeek, and Kimi require reasoning_content
to be echoed back on every assistant message in multi-turn conversations
with tool calls. Omitting it causes HTTP 400: 'The reasoning_content in
the thinking mode must be passed back to the API.'

The WebUI's _sanitize_messages_for_api() strips all fields not in
_API_SAFE_MSG_KEYS before sending conversation history to the LLM API.
reasoning_content was not in this whitelist, so it was silently dropped.

The CLI path (run_agent.py) is unaffected because it has its own
_copy_reasoning_content_for_api() logic that operates on raw message
dicts without going through this filter. This is why the same session
works from CLI but fails from WebUI with HTTP 400.

The fix adds 'reasoning_content' to _API_SAFE_MSG_KEYS so the field
passes through sanitization intact.
2026-05-14 02:29:17 +08:00
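The whitelist behaviour described in 520795fdd2 can be sketched as a key filter. The set contents other than 'reasoning_content' are assumed for illustration, and `sanitize_messages_for_api` is a simplified stand-in for the real `_sanitize_messages_for_api()`.

```python
# Assumed whitelist; only 'reasoning_content' is confirmed by the commit.
_API_SAFE_MSG_KEYS = {
    "role",
    "content",
    "tool_calls",
    "tool_call_id",
    "name",
    "reasoning_content",
}


def sanitize_messages_for_api(messages):
    # Drop every key that is not explicitly whitelisted.
    return [
        {key: value for key, value in msg.items() if key in _API_SAFE_MSG_KEYS}
        for msg in messages
    ]
```

Any field missing from the set is dropped without an error, which is why the omission only surfaced as an HTTP 400 from the provider.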
Lumen Yang 3289c44fb6 fix: refresh context ring after compression 2026-05-13 14:02:28 +02:00
Frank Song 9ea4f1145d Fix stale stream exception writeback guards 2026-05-13 10:23:03 +08:00
Hermes Agent 20717a0d0a Merge pull request #2136 into stage-345
fix: guard stale stream writebacks (LumenYoung)

Prevents stale WebUI stream workers from writing old results into a session
after that session has already moved on to another stream. Adds new helper
_stream_writeback_is_current() (a token equality check against the session's
active_stream_id) and short-circuits the two finalize/cancel paths when the
worker no longer owns the session writeback.
2026-05-12 23:11:48 +00:00
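The ownership check described in 20717a0d0a can be sketched as a token comparison. The helper name follows the commit message; the session object, the `finalize` function, and its return value are hypothetical.

```python
from types import SimpleNamespace


def stream_writeback_is_current(session, stream_id):
    # The worker owns the writeback only while its stream id is still the
    # session's active stream.
    return session is not None and getattr(session, "active_stream_id", None) == stream_id


def finalize(session, stream_id, new_messages):
    # Short-circuit when a newer stream has taken over the session.
    if not stream_writeback_is_current(session, stream_id):
        return False
    session.messages.extend(new_messages)
    return True
```

A stale worker that finishes late then leaves the session untouched instead of overwriting results from the newer stream.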
Jordan SkyLF 112eadc209 fix: address cancelled turn review feedback
- classify string-only CancelledError payloads as cancelled
- centralize cancel marker substring matching
- add targeted regression coverage
2026-05-12 15:43:36 -07:00