hermes-webui

mirror of https://github.com/nesquena/hermes-webui.git synced 2026-05-25 03:00:23 +00:00

Author	SHA1	Message	Date
Hermes Agent	3f851051cf	Merge pull request #2151 into stage-350 fix: clarify cancelled chat turn status (Jordan-SkyLF) Conflict resolution on api/streaming.py:4549-4567 (the cancel-handler ownership guard). Both this PR and the already-shipped PR #2136 add a guard at the same site against stale stream writebacks, from different angles: - PR #2136 (HEAD): _stream_writeback_is_current(_cs, stream_id) — strictly dominates by checking the active_stream_id token equality. - PR #2151: 'worker won the race' check via (active_stream_id != stream_id and not pending_user_message), with _emit_cancel_event = False to suppress the terminal cancel event. Resolution merges both: keep #2136's strictly-stronger condition for skip detection, and adopt #2151's _emit_cancel_event = False semantic so the cancel event isn't emitted in addition to skipping the writeback (when client may have already received the successful done payload). 55/55 tests pass across cancelled-turn-status + stale-stream-writeback + the four cancel/data-loss sibling test files.	2026-05-13 20:44:44 +00:00
Hermes Agent	7150e9fe70	Merge pull request #2202 into stage-349 feat: show early session titles on chat start (Jordan-SkyLF)	2026-05-13 19:03:03 +00:00
Jordan SkyLF	0381294f1c	feat: add early session provisional titles	2026-05-13 11:37:11 -07:00
MrFant	520795fdd2	fix: preserve reasoning_content in API message whitelist Providers like Xiaomi MiMo, DeepSeek, and Kimi require reasoning_content to be echoed back on every assistant message in multi-turn conversations with tool calls. Omitting it causes HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' The WebUI's _sanitize_messages_for_api() strips all fields not in _API_SAFE_MSG_KEYS before sending conversation history to the LLM API. reasoning_content was not in this whitelist, so it was silently dropped. The CLI path (run_agent.py) is unaffected because it has its own _copy_reasoning_content_for_api() logic that operates on raw message dicts without going through this filter. This is why the same session works from CLI but fails from WebUI with HTTP 400. The fix adds 'reasoning_content' to _API_SAFE_MSG_KEYS so the field passes through sanitization intact.	2026-05-14 02:29:17 +08:00
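The whitelist filtering described in commit 520795fdd2 can be sketched roughly as follows. This is a minimal illustration, not the WebUI's code: only the presence of `reasoning_content` in the whitelist is confirmed by the commit message, and the other keys and the function body are assumptions.

```python
# Hypothetical whitelist: only 'reasoning_content' is confirmed by the commit
# message; the remaining keys are assumed for illustration.
API_SAFE_MSG_KEYS = {
    "role", "content", "name", "tool_calls", "tool_call_id", "reasoning_content",
}


def sanitize_messages_for_api(messages):
    """Return copies of the messages that keep only whitelisted keys."""
    return [
        {key: value for key, value in message.items() if key in API_SAFE_MSG_KEYS}
        for message in messages
    ]
```

With `reasoning_content` in the set, the field survives sanitization; any key outside the set (for example an internal timestamp) is dropped from the copy while the original message dict is left untouched.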
Lumen Yang	3289c44fb6	fix: refresh context ring after compression	2026-05-13 14:02:28 +02:00
Frank Song	9ea4f1145d	Fix stale stream exception writeback guards	2026-05-13 10:23:03 +08:00
Hermes Agent	20717a0d0a	Merge pull request #2136 into stage-345 fix: guard stale stream writebacks (LumenYoung) Prevents stale WebUI stream workers from writing old results into a session after that session has already moved on to another stream. Adds new helper _stream_writeback_is_current() (a token equality check against the session's active_stream_id) and short-circuits the two finalize/cancel paths when the worker no longer owns the session writeback.	2026-05-12 23:11:48 +00:00
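The token equality check described in commit 20717a0d0a might look something like the sketch below. The `Session` class, the `finalize_stream` function, and the unprefixed helper name are hypothetical stand-ins; only the idea of comparing the worker's stream ID against the session's `active_stream_id` comes from the commit message.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Session:
    # Hypothetical minimal session: the real object carries far more state.
    active_stream_id: Optional[str] = None
    messages: list = field(default_factory=list)


def stream_writeback_is_current(session, stream_id):
    """True only if this worker's stream still owns the session writeback."""
    return session is not None and getattr(session, "active_stream_id", None) == stream_id


def finalize_stream(session, stream_id, result):
    """Write the result back unless the session has moved on to another stream."""
    if not stream_writeback_is_current(session, stream_id):
        return False  # stale worker: skip the writeback entirely
    session.messages.append(result)
    session.active_stream_id = None
    return True
```

A worker that finishes after the session has started a newer stream sees a mismatched token and returns without touching the session.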
Jordan SkyLF	112eadc209	fix: address cancelled turn review feedback - classify string-only CancelledError payloads as cancelled - centralize cancel marker substring matching - add targeted regression coverage	2026-05-12 15:43:36 -07:00
Lumen Yang	4b57b202a0	fix: guard stale stream writebacks	2026-05-13 00:05:09 +02:00
Jordan SkyLF	e4d16e93c7	fix: clarify cancelled chat turn status	2026-05-12 13:26:49 -07:00
Hermes Agent	a06952ab00	Merge pull request #2140 into stage-344 Preserve fallback provider credential hints (closes #2133) # Conflicts: # CHANGELOG.md	2026-05-12 16:12:54 +00:00
Frank Song	76e611d49f	Preserve fallback provider credential hints	2026-05-12 20:42:55 +08:00
Michael Lam	265496782a	docs: clarify compression anchor helpers	2026-05-12 01:43:16 -07:00
nesquena-hermes	d75b59135a	stage-341: apply Opus SHOULD-FIX (it i18n + short-circuit logger.debug + docstring) Opus advisor pass on stage-341 found three surgical items: 1. static/i18n.js:it — PR #2064 branched before stage-340 landed the 'it' locale (#2067), missing 9 session_worktree keys. Mechanical mirror of en/ja position. Italian falls back to English silently without this fix. 2. api/streaming.py — PR #2107's new break short-circuit was silent in both the aux and agent title-generation paths. Added logger.debug calls before each break so production logs surface the exit shape. 3. api/streaming.py — Expanded _title_should_skip_remaining_attempts docstring to document the membership criterion explicitly (vs the implicit reasoning-only-burn case it ships with today). Future additions (llm_safety_blocked, llm_oauth_quota) have a clear inclusion test. CHANGELOG updated under the Stage-341 maintainer fixes section to mirror the stage-340 pattern. All targeted tests pass (57/57 in the affected modules).	2026-05-12 00:16:33 +00:00
nesquena-hermes	e20eb2c784	fix: skip budget-doubling title retry for reasoning-only responses (#2083) Reasoning models (Qwen3-thinking via LM Studio, DeepSeek-R1, Kimi-K2, etc.) can burn their entire output budget on hidden reasoning tokens and emit no visible content. The previous title-generation retry path classified that as llm_length and doubled the budget — but the second call produces the same shape, so the retry only doubled the GPU/credit burn. Repeated across the two prompts in _title_prompts() this came to ~3000 reasoning tokens of GPU work per new chat. On local LM Studio servers behind a custom: provider (where is_lmstudio=False means reasoning_effort: none never reaches the model) it manifested as the GPU never going idle after a prompt. Fix: - _extract_title_response: classify reasoning-bearing empty responses as llm_empty_reasoning regardless of finish_reason. The presence of reasoning_content is the diagnostic signal, not finish_reason. - _title_retry_status: drop llm_empty_reasoning from the retry set. Length-truncated responses WITHOUT reasoning still retry (those are legitimately recoverable by a larger budget). - Add _title_should_skip_remaining_attempts() and break out of the prompt-iteration loop on empty-reasoning. A second prompt against the same model would produce the same shape. - Falls through to _fallback_title_from_exchange for a local-summary title. Tests updated to invert the previous reasoning-retry assertions: - test_aux_short_circuits_on_empty_reasoning_without_retrying - test_aux_still_retries_finish_length_without_reasoning - test_agent_route_short_circuits_on_empty_reasoning_without_retrying - test_agent_route_still_retries_finish_length_without_reasoning Companion agent-side work (LM Studio classifier for custom: providers) is tracked separately on the hermes-agent side; this WebUI fix is the belt-and-braces guard so the loop stops regardless of agent classifier state. Reported by @darkopetrovic. Closes #2083.
Co-authored-by: darkopetrovic <darkopetrovic@users.noreply.github.com> (cherry picked from commit `efeae4a86e`)	2026-05-12 00:04:11 +00:00
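The classification and retry rules in commit e20eb2c784 can be approximated by the sketch below. The status names `llm_length` and `llm_empty_reasoning` come from the commit message; the `ok` and `llm_empty` statuses, the function names, and their signatures are assumptions made for illustration.

```python
def classify_title_response(content, reasoning_content, finish_reason):
    """Classify a title-generation response (hypothetical signature)."""
    if content and content.strip():
        return "ok"  # assumed status name for a usable title
    if reasoning_content:
        # Reasoning present but no visible text: finish_reason is ignored.
        return "llm_empty_reasoning"
    if finish_reason == "length":
        return "llm_length"
    return "llm_empty"  # assumed status name for any other empty response


# Only length truncation WITHOUT reasoning is worth a larger budget.
RETRYABLE_STATUSES = {"llm_length"}


def should_retry(status):
    return status in RETRYABLE_STATUSES


def should_skip_remaining_attempts(status):
    # A second prompt against the same model would produce the same shape.
    return status == "llm_empty_reasoning"
```

The key design point is that the presence of reasoning text, not `finish_reason`, decides whether a retry is pointless.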
nesquena-hermes	fd069155af	Merge PR #2062 into stage-339 feat: record turn journal lifecycle events by @ai-ag2026	2026-05-11 17:43:58 +00:00
nesquena-hermes	6a016dae6c	Merge PR #2077 into stage-338 Refactor compression anchor visibility helpers by @franksong2702	2026-05-11 17:17:25 +00:00
ai-ag2026	c864ad47af	fix: address turn journal lifecycle review	2026-05-11 17:16:43 +02:00
Frank Song	18124ced62	Refactor compression anchor visibility helpers	2026-05-11 20:56:30 +08:00
Frank Song	a0e9c06102	Fix HERMES_HOME skill cache patching	2026-05-11 19:12:02 +08:00
ai-ag2026	4b486f2860	feat: record turn journal lifecycle events	2026-05-11 09:13:25 +02:00
Frank Song	5a445e7562	Fix duplicate assistant transcript merge	2026-05-11 13:09:16 +08:00
nesquena-hermes	97b283c5a4	Merge PR #2039 into stage-335	2026-05-11 00:25:07 +00:00
ai-ag2026	2ead7daa2f	fix: expose active run lifecycle in health	2026-05-11 02:15:00 +02:00
Michael Lam	d620f4394a	fix: prewarm skill imports outside env lock	2026-05-10 15:51:49 -07:00
nesquena-hermes	2377216860	Stage 333: PR #2009 — feat(context): live status tracking during streaming by @dobby-d-elf	2026-05-10 18:16:59 +00:00
nesquena-hermes	22991fa820	Merge remote-tracking branch 'origin/master' into stage-331 # Conflicts: # CHANGELOG.md	2026-05-10 18:03:55 +00:00
nesquena-hermes	c156e5a256	Stage 331: PR #2006 — fix(compression): stamp profile on continuation session by @qxxaa	2026-05-10 17:09:21 +00:00
nesquena-hermes	9060bdb344	Stage 330: PR #2001 — fix(clarify): honor clarify.timeout config by @franksong2702	2026-05-10 17:07:37 +00:00
nesquena-hermes	7eced19463	Stage 330: PR #2000 — fix(skills): patch module-level caches on per-request profile switch by @qxxaa	2026-05-10 17:07:37 +00:00
dobby-d-elf	fecfc5f6db	fix: reanchor live context usage updates	2026-05-10 10:31:14 -06:00
dobby-d-elf	56d68b7511	fix: keep live context metering session-scoped	2026-05-10 08:20:37 -06:00
dobby-d-elf	1cf0ff01b5	feat: live context window status tracking during streaming	2026-05-10 06:51:46 -06:00
qxxaa	f665e50738	fix: stamp profile on continuation session after context compression When context compression fires, the agent rotates to a new session_id. The compression migration block correctly migrates the session lock, SESSION_AGENT_CACHE, SESSIONS dict, and the session file rename, but does not ensure s.profile is set on the continuation session. On the next request, _run_agent_streaming resolves the profile via: get_hermes_home_for_profile(getattr(s, 'profile', None)) With s.profile == None this falls back to the default profile's HERMES_HOME. Memory tool calls then read and write the wrong profile's MEMORY.md — confirmed by investigation: session 0dfefb (continuation after compression from a troubleshooting profile session) read memory at 16% / 1,184 chars with 4 entries, while the troubleshooting profile's actual state was 72-77% / 5,000+ chars. That reading could only come from the default profile's bank. Subsequent replace operations failed because the target entries existed only in the troubleshooting profile. There are two failure paths: 1. In-memory: if s.profile was None from the start (legacy session or one created before this fix), the continuation session object carries null through the current request. 2. Persistence: s.save() persists "profile": null to the continuation session's JSON file (profile is in METADATA_FIELDS, models.py ~408). On the next request, Session.load(new_sid) reads it back as null and get_hermes_home_for_profile(None) falls back to the default profile. Fix: capture _resolved_profile_name at request entry (~line 2019), immediately after profile home resolution. This is the only point where profile context is reliable: s.profile if already set, otherwise get_active_profile_name() — which at that point reads thread-local storage (_tls.profile) correctly set by the HTTP handler thread via set_request_profile(). 
Calling get_active_profile_name() at compression time instead would be unsafe: the streaming thread is a separate threading.Thread, does not inherit TLS, and the call would fall back to the process-global _active_profile which may belong to a different concurrent tab. Stamp s.profile in the compression migration block immediately after s.session_id = new_sid. Guarded by `if not s.profile` so sessions that already have a profile set are unaffected. A logger.info line records when the stamp fires, making future investigation straightforward. Fixes: memory writes bleeding into default profile after compression Reproduces: reliably on any long non-default profile session that hits the compression threshold (default: 0.80 context fill)	2026-05-10 09:57:45 +01:00
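The thread-local hazard behind commit f665e50738 is easy to reproduce with the standard library. In this sketch, `_tls.profile` mirrors the name in the commit message, while `run_demo` and `streaming_worker` are hypothetical; it shows why the profile name has to be captured on the request thread and passed to the worker explicitly.

```python
import threading

_tls = threading.local()


def run_demo():
    # The HTTP handler thread sets the request profile in thread-local storage.
    _tls.profile = "troubleshooting"
    # Capture it at request entry, while the thread-local value is reliable.
    resolved_profile_name = getattr(_tls, "profile", None)

    seen = {}

    def streaming_worker(captured_profile):
        # A new thread starts with empty thread-local state.
        seen["from_tls"] = getattr(_tls, "profile", None)
        seen["from_capture"] = captured_profile

    worker = threading.Thread(target=streaming_worker, args=(resolved_profile_name,))
    worker.start()
    worker.join()
    return seen
```

Inside the worker, the thread-local lookup yields `None`, whereas the captured argument still carries the profile name set by the request thread.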
Frank Song	1bec8070f2	fix(1833): persist compression anchor summary for reload UI	2026-05-10 16:45:16 +08:00
Frank Song	2e6b3601bd	fix(clarify): honor clarify.timeout config in webui prompts	2026-05-10 16:05:50 +08:00
qxxaa	7ee41c9b12	fix: patch skills module-level caches on per-request profile switch Per-request profile switches (process_wide=False, introduced in #1700) update os.environ['HERMES_HOME'] but skip _set_hermes_home(), which is responsible for monkeypatching module-level caches. Both tools/skills_tool.py and tools/skill_manager_tool.py set HERMES_HOME and SKILLS_DIR once at import time. When a non-default profile is active in the WebUI, os.environ['HERMES_HOME'] is correctly updated per-turn in the _ENV_LOCK block, but the module-level constants still point at the root profile. All agent-side skill operations — skills_list(), skill_view(), skill_manage() — read and write to the wrong directory. Add the same monkeypatching that _set_hermes_home() already performs (profiles.py line ~620) to the per-turn env setup block in streaming.py, covering both skills_tool and skill_manager_tool. The WebUI display half was already fixed in #1917 via _active_skills_dir() in routes.py. This patch fixes the agent-side half so the running agent resolves skills from the correct profile.	2026-05-10 09:02:49 +01:00
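The stale-constant problem in commit 7ee41c9b12 can be illustrated with a stand-in module. Everything here is hypothetical except the constant names `HERMES_HOME` and `SKILLS_DIR`: the fake module plays the role of tools/skills_tool.py, and `patch_skill_caches` is an assumed shape for the monkeypatching the commit adds.

```python
import os
import types


def make_skills_module():
    """Build a stand-in module that reads HERMES_HOME once, like an import would."""
    module = types.ModuleType("fake_skills_tool")
    module.HERMES_HOME = os.environ.get("HERMES_HOME", "/default")
    module.SKILLS_DIR = os.path.join(module.HERMES_HOME, "skills")
    return module


def patch_skill_caches(module, profile_home):
    """Re-point the module-level constants at the active profile's home."""
    module.HERMES_HOME = profile_home
    module.SKILLS_DIR = os.path.join(profile_home, "skills")
```

Updating `os.environ["HERMES_HOME"]` after the module is built changes nothing on the module; only the explicit patch moves `SKILLS_DIR` to the new profile.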
Frank Song	1e1a9481b4	fix(i18n): localize /goal runtime status strings	2026-05-10 15:21:24 +08:00
nesquena-hermes	a3af4a3c8f	fix(profile/mcp): discover MCP tools after per-session HERMES_HOME mutation Issue #1968: switching to a non-default profile in the WebUI dropdown had no effect on which MCP servers were available. Every chat session, regardless of profile, only saw the default profile's mcp_servers from ~/.hermes/config.yaml. Non-default profile MCP servers (postgres, custom stdio servers, anything in <profile>/config.yaml) never registered. Root cause: api/streaming.py:1922 called discover_mcp_tools() at the TOP of _run_agent_streaming(), about 100 lines BEFORE the per-session 'os.environ["HERMES_HOME"] = _profile_home' mutation at line 2053. discover_mcp_tools() reads ~/.hermes/config.yaml via get_hermes_home(), which uses os.environ['HERMES_HOME']. So at the call site, HERMES_HOME was still whatever the WebUI server process had at startup — the default profile, every time. Fix: relocate the discover_mcp_tools() call past the _ENV_LOCK block so get_hermes_home() resolves to the session's actual profile home. Same try/except wrapping is preserved; same idempotency semantics on already-connected servers; same lazy-import pattern. Caveat (out of scope, agent-side): _servers in tools/mcp_tool.py is a process-global Dict[str, MCPServerTask] keyed only by server name. So once profile A registers a server named e.g. 'postgres', profile B's discovery sees 'postgres' as already connected and skips it — even if B's config points at a different binary or DB. Concurrent multi-profile WebUI processes will still hit 'first profile wins per server name'. Fully fixing that requires keying _servers by (profile_home, name) upstream in hermes-agent. This PR ships layer 1 only — fixes the single-non-default-profile case (the headline symptom). Tests: tests/test_issue1968_mcp_profile_discovery.py — 4 static tests pinning the lexical ordering invariants. 
Verified mutation-safety: a proof-of-concept revert (re-adding a discover call before the HERMES_HOME mutation) makes the 'only called once' test fail. Test suite: 5047 passed, 4 skipped, 3 xpassed, 0 regressions. Closes #1968	2026-05-09 20:08:16 +00:00
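The "first profile wins per server name" caveat in commit a3af4a3c8f comes down to the choice of dictionary key. The sketch below uses a hypothetical `register_server` helper and plain dicts in place of the real `_servers` registry to contrast name-only keys with the `(profile_home, name)` keys the commit message suggests for an upstream fix.

```python
def register_server(servers, key, config):
    """Register config under key unless that key is already connected."""
    if key in servers:
        return False  # treated as already connected, so discovery skips it
    servers[key] = config
    return True
```

Keyed by name alone, a second profile's `postgres` entry is skipped even if it points at a different database. Keyed by `(profile_home, name)`, both profiles register their own server.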
nesquena-hermes	8782fd2675	fix(stage-326): apply Opus advisor critical + recommended fixes CRITICAL: #1951 PENDING_GOAL_CONTINUATION race Removes `PENDING_GOAL_CONTINUATION.discard(session_id)` from the streaming worker's `finally` cleanup block. The marker is set inside the SAME function call (line ~3328 on `goal_continue`) and the discard in the `finally` (line ~3553) almost always raced ahead of the frontend's SSE-receive → POST /api/chat/start round-trip, erasing the marker before the consumer in routes.py could read it. The consumer (`_start_chat_stream_for_session` in routes.py:6522) already discards atomically when consuming, so removing the streaming-side discard preserves single-use semantics and unblocks the goal-continuation chain. Adds tests/test_stage326_pending_goal_continuation_race.py with 5 regression guards: 1. streaming.py's finally must NOT discard PENDING_GOAL_CONTINUATION 2. routes.py consumer must check + set + discard atomically 3. PENDING_GOAL_CONTINUATION must be a set (GIL-safe single-op) 4. STREAM_GOAL_RELATED.pop must be keyed by stream_id, not session_id 5. PENDING_GOAL_CONTINUATION.add must precede the goal_continue SSE emission in source ordering HARDENING: #1956 composer-draft input validation Per Opus, the POST /api/session/draft handler accepted unbounded / arbitrary-typed text and files inputs. With the 400ms debounced auto-save firing on every keystroke, a misbehaving client could persist multi-MB strings into the session JSON. Adds: - text: coerced to str if not already; clamped to 50_000 chars - files: coerced to list if not already; clamped to 50 entries Validation runs BEFORE the session lock acquire / save. Adds tests/test_stage326_composer_draft_validation.py with 5 guards. Verdict from Opus advisor on stage-326: SHIP-WITH-FIXES. This commit applies the required + recommended fixes; #1957 hardening fixed in a prior stage commit.	2026-05-09 18:36:01 +00:00
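The composer-draft clamping described in commit 8782fd2675 could be sketched as below. The limits (50_000 characters, 50 entries) come from the commit message; the function name, the constant names, and the exact coercion rules for non-string and non-list inputs are assumptions.

```python
MAX_DRAFT_TEXT_CHARS = 50_000  # limit stated in the commit message
MAX_DRAFT_FILES = 50           # limit stated in the commit message


def validate_draft(text, files):
    """Coerce and clamp draft inputs before any lock is taken or data saved."""
    if not isinstance(text, str):
        # Assumed coercion: None becomes empty, anything else is stringified.
        text = "" if text is None else str(text)
    if not isinstance(files, list):
        # Assumed coercion: tuples become lists, other types are discarded.
        files = list(files) if isinstance(files, tuple) else []
    return text[:MAX_DRAFT_TEXT_CHARS], files[:MAX_DRAFT_FILES]
```

Running the validation before the session lock is acquired keeps oversized payloads from being held in the critical section or persisted into the session JSON.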
nesquena-hermes	4751b5ace5	Stage 326: PR #1951 — fix: only evaluate goal hook on goal-related turns (#1932) by @amlyczz	2026-05-09 18:17:20 +00:00
hermes-agent	b443e8ea5a	fix: WebUI respects image_input_mode — stop unconditionally embedding native images _build_native_multimodal_message() unconditionally embedded images as native image_url parts, bypassing the agent's image_input_mode config. Add _resolve_image_input_mode(cfg) helper mirroring the agent's decide_image_input_mode logic, and wire it into _build_native_multimodal_message with a new cfg parameter. When mode resolves to 'text' (explicit aux vision config, or image_input_mode: text), returns plain string so the agent's existing text-mode pipeline (vision_analyze) handles images. Closes #1959	2026-05-09 19:39:50 +02:00
zqy	6fd07c2af4	fix: only evaluate goal hook on goal-related turns (#1932) The goal evaluation hook was firing on every completed assistant turn when a goal was active, even for unrelated messages like "what time is it". This burned the goal budget, triggered continuation prompts that interrupted unrelated conversations, and made /goal status numbers misleading. Add STREAM_GOAL_RELATED and PENDING_GOAL_CONTINUATION flags to gate the evaluate_goal_after_turn() call in the streaming loop. Only streams started from goal kickoff (/goal <text>) or goal continuation are marked as goal-related. Normal user messages skip the hook entirely.	2026-05-09 15:08:13 +08:00
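The gating described in commit 6fd07c2af4 might be structured like the sketch below. The names `PENDING_GOAL_CONTINUATION` and `STREAM_GOAL_RELATED` come from the commit messages; the `start_stream` and `finish_stream` functions, the lock, and the `/goal ` prefix test are hypothetical simplifications.

```python
import threading

PENDING_GOAL_CONTINUATION = set()  # session IDs awaiting a goal continuation
STREAM_GOAL_RELATED = {}           # stream_id -> whether the goal hook applies
_state_lock = threading.Lock()     # hypothetical lock around the shared state


def start_stream(session_id, stream_id, message):
    """Mark the stream as goal-related only for a kickoff or a continuation."""
    with _state_lock:
        is_continuation = session_id in PENDING_GOAL_CONTINUATION
        # Consume the marker here, so it is used at most once.
        PENDING_GOAL_CONTINUATION.discard(session_id)
        goal_related = is_continuation or message.startswith("/goal ")
        STREAM_GOAL_RELATED[stream_id] = goal_related
    return goal_related


def finish_stream(stream_id):
    """Return True if the goal evaluation hook should run for this stream."""
    with _state_lock:
        return STREAM_GOAL_RELATED.pop(stream_id, False)
```

Ordinary messages never set the flag, so the evaluation hook is skipped for them, and a continuation marker is cleared by the consumer that reads it.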
Michael Lam	8e513b596b	fix: surface goal evaluation status	2026-05-08 17:12:01 +00:00
Michael Lam	0db5bc6b76	feat: add WebUI goal command support	2026-05-08 17:12:01 +00:00
nesquena-hermes	8c4c253654	Stage 322: PR #1814 — custom named provider API key resolution by @hualong1009	2026-05-08 16:55:20 +00:00
王浩生	cdbdc28f5c	fix(config): custom named provider API key resolution in WebUI - add robust custom provider credential/base_url resolver - apply fallback in streaming and routes agent init/self-heal paths - support slug normalization and config fallbacks for custom:* providers	2026-05-08 16:40:17 +00:00
Frank Song	ccdc055c36	Fix workspace prefix sentinel handling	2026-05-08 16:40:17 +00:00
nesquena-hermes	b8426d047c	Stage 321: PR #1900 — pass config overrides into context-length fallback (closes #1896)	2026-05-08 16:08:42 +00:00
nesquena-hermes	0efa75827a	fix(streaming): pass config overrides into context-length fallback (#1896) The two get_model_context_length() fallback callsites in api/streaming.py (session save + SSE usage payload) were calling the resolver with only model + base_url. When the agent's compressor reports 0 (fresh/cached/transitioning agent), resolution fell through to the 256K DEFAULT_FALLBACK even when users had set model.context_length: 1048576 in config.yaml. For LCM users on 1M-context models, the wrong window cascaded into a session-killing failure: auto-compression triggered at ~25% of the wrong value, floods of compress requests, 429s, credential pool exhaustion, fallback 429s, then 'API call failed after 3 retries'. Reported by @AvidFuturist on Discord with deepseek-v4-flash. Reproduced 5x. Both callsites now pass config_context_length, provider, and custom_providers. The resolver consults these BEFORE probing, so the config override wins. Both are wrapped in except TypeError blocks that retry with the legacy 2-arg form for older hermes-agent builds whose get_model_context_length signature pre-dates these kwargs. Tests: 7 source-string regressions guarding both call shapes, the safe config parse, the legacy fallback, and the per-profile config source. Also bumped the line-distance assertion in test_pr1341 (the test explicitly invites bumping when a new pre-save mutation block is added). Closes #1896 Co-authored-by: Hermes Agent <agent@hermes.local>	2026-05-08 16:08:42 +00:00
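The `except TypeError` fallback described in commit 0efa75827a follows a common compatibility pattern, sketched below. The keyword names match those listed in the commit message; the wrapper `resolve_context_length` and the two stand-in resolvers are hypothetical, and the real `get_model_context_length` may behave differently.

```python
def resolve_context_length(resolver, model, base_url, *, config_context_length=None,
                           provider=None, custom_providers=None):
    """Call the resolver with config overrides, falling back to the 2-arg form."""
    try:
        return resolver(
            model,
            base_url,
            config_context_length=config_context_length,
            provider=provider,
            custom_providers=custom_providers,
        )
    except TypeError:
        # Older builds reject the new kwargs. Note that a TypeError raised
        # inside a newer resolver would also land here.
        return resolver(model, base_url)


def legacy_resolver(model, base_url):
    # Stand-in for an older signature; 256 * 1024 is an assumed fallback value.
    return 256 * 1024


def modern_resolver(model, base_url, config_context_length=None, provider=None,
                    custom_providers=None):
    # Stand-in that lets the config override win over the fallback.
    return config_context_length or 256 * 1024
```

With the modern stand-in, a configured 1048576 window is returned as-is; with the legacy stand-in, the call still succeeds but the override is ignored.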


172 Commits