hermes-webui

mirror of https://github.com/nesquena/hermes-webui.git synced 2026-05-24 18:50:15 +00:00

Author	SHA1	Message	Date
Hermes Agent	a06952ab00	Merge pull request #2140 into stage-344 Preserve fallback provider credential hints (closes #2133) # Conflicts: # CHANGELOG.md	2026-05-12 16:12:54 +00:00
Frank Song	76e611d49f	Preserve fallback provider credential hints	2026-05-12 20:42:55 +08:00
Michael Lam	265496782a	docs: clarify compression anchor helpers	2026-05-12 01:43:16 -07:00
nesquena-hermes	d75b59135a	stage-341: apply Opus SHOULD-FIX (it i18n + short-circuit logger.debug + docstring) Opus advisor pass on stage-341 found three surgical items: 1. static/i18n.js:it — PR #2064 branched before stage-340 landed the 'it' locale (#2067), missing 9 session_worktree keys. Mechanical mirror of en/ja position. Italian falls back to English silently without this fix. 2. api/streaming.py — PR #2107's new break short-circuit was silent in both the aux and agent title-generation paths. Added logger.debug calls before each break so production logs surface the exit shape. 3. api/streaming.py — Expanded _title_should_skip_remaining_attempts docstring to document the membership criterion explicitly (vs the implicit reasoning-only-burn case it ships with today). Future additions (llm_safety_blocked, llm_oauth_quota) have a clear inclusion test. CHANGELOG updated under the Stage-341 maintainer fixes section to mirror the stage-340 pattern. All targeted tests pass (57/57 in the affected modules).	2026-05-12 00:16:33 +00:00
nesquena-hermes	e20eb2c784	fix: skip budget-doubling title retry for reasoning-only responses (#2083) Reasoning models (Qwen3-thinking via LM Studio, DeepSeek-R1, Kimi-K2, etc.) can burn their entire output budget on hidden reasoning tokens and emit no visible content. The previous title-generation retry path classified that as llm_length and doubled the budget — but the second call produces the same shape, so the retry only doubled the GPU/credit burn. Repeated across the two prompts in _title_prompts() this came to ~3000 reasoning tokens of GPU work per new chat. On local LM Studio servers behind a custom: provider (where is_lmstudio=False means reasoning_effort: none never reaches the model) it manifested as the GPU never going idle after a prompt. Fix: - _extract_title_response: classify reasoning-bearing empty responses as llm_empty_reasoning regardless of finish_reason. The presence of reasoning_content is the diagnostic signal, not finish_reason. - _title_retry_status: drop llm_empty_reasoning from the retry set. Length-truncated responses WITHOUT reasoning still retry (those are legitimately recoverable by a larger budget). - Add _title_should_skip_remaining_attempts() and break out of the prompt-iteration loop on empty-reasoning. A second prompt against the same model would produce the same shape. - Falls through to _fallback_title_from_exchange for a local-summary title. Tests updated to invert the previous reasoning-retry assertions: - test_aux_short_circuits_on_empty_reasoning_without_retrying - test_aux_still_retries_finish_length_without_reasoning - test_agent_route_short_circuits_on_empty_reasoning_without_retrying - test_agent_route_still_retries_finish_length_without_reasoning Companion agent-side work (LM Studio classifier for custom: providers) is tracked separately on the hermes-agent side; this WebUI fix is the belt-and-braces guard so the loop stops regardless of agent classifier state. Reported by @darkopetrovic. Closes #2083. Co-authored-by: darkopetrovic <darkopetrovic@users.noreply.github.com> (cherry picked from commit `efeae4a86e`)	2026-05-12 00:04:11 +00:00
nesquena-hermes	fd069155af	Merge PR #2062 into stage-339 feat: record turn journal lifecycle events by @ai-ag2026	2026-05-11 17:43:58 +00:00
nesquena-hermes	6a016dae6c	Merge PR #2077 into stage-338 Refactor compression anchor visibility helpers by @franksong2702	2026-05-11 17:17:25 +00:00
ai-ag2026	c864ad47af	fix: address turn journal lifecycle review	2026-05-11 17:16:43 +02:00
Frank Song	18124ced62	Refactor compression anchor visibility helpers	2026-05-11 20:56:30 +08:00
Frank Song	a0e9c06102	Fix HERMES_HOME skill cache patching	2026-05-11 19:12:02 +08:00
ai-ag2026	4b486f2860	feat: record turn journal lifecycle events	2026-05-11 09:13:25 +02:00
Frank Song	5a445e7562	Fix duplicate assistant transcript merge	2026-05-11 13:09:16 +08:00
nesquena-hermes	97b283c5a4	Merge PR #2039 into stage-335	2026-05-11 00:25:07 +00:00
ai-ag2026	2ead7daa2f	fix: expose active run lifecycle in health	2026-05-11 02:15:00 +02:00
Michael Lam	d620f4394a	fix: prewarm skill imports outside env lock	2026-05-10 15:51:49 -07:00
nesquena-hermes	2377216860	Stage 333: PR #2009 — feat(context): live status tracking during streaming by @dobby-d-elf	2026-05-10 18:16:59 +00:00
nesquena-hermes	22991fa820	Merge remote-tracking branch 'origin/master' into stage-331 # Conflicts: # CHANGELOG.md	2026-05-10 18:03:55 +00:00
nesquena-hermes	c156e5a256	Stage 331: PR #2006 — fix(compression): stamp profile on continuation session by @qxxaa	2026-05-10 17:09:21 +00:00
nesquena-hermes	9060bdb344	Stage 330: PR #2001 — fix(clarify): honor clarify.timeout config by @franksong2702	2026-05-10 17:07:37 +00:00
nesquena-hermes	7eced19463	Stage 330: PR #2000 — fix(skills): patch module-level caches on per-request profile switch by @qxxaa	2026-05-10 17:07:37 +00:00
dobby-d-elf	fecfc5f6db	fix: reanchor live context usage updates	2026-05-10 10:31:14 -06:00
dobby-d-elf	56d68b7511	fix: keep live context metering session-scoped	2026-05-10 08:20:37 -06:00
dobby-d-elf	1cf0ff01b5	feat: live context window status tracking during streaming	2026-05-10 06:51:46 -06:00
qxxaa	f665e50738	fix: stamp profile on continuation session after context compression When context compression fires, the agent rotates to a new session_id. The compression migration block correctly migrates the session lock, SESSION_AGENT_CACHE, SESSIONS dict, and the session file rename, but does not ensure s.profile is set on the continuation session. On the next request, _run_agent_streaming resolves the profile via: get_hermes_home_for_profile(getattr(s, 'profile', None)) With s.profile == None this falls back to the default profile's HERMES_HOME. Memory tool calls then read and write the wrong profile's MEMORY.md — confirmed by investigation: session 0dfefb (continuation after compression from a troubleshooting profile session) read memory at 16% / 1,184 chars with 4 entries, while the troubleshooting profile's actual state was 72-77% / 5,000+ chars. That reading could only come from the default profile's bank. Subsequent replace operations failed because the target entries existed only in the troubleshooting profile. There are two failure paths: 1. In-memory: if s.profile was None from the start (legacy session or one created before this fix), the continuation session object carries null through the current request. 2. Persistence: s.save() persists "profile": null to the continuation session's JSON file (profile is in METADATA_FIELDS, models.py ~408). On the next request, Session.load(new_sid) reads it back as null and get_hermes_home_for_profile(None) falls back to the default profile. Fix: capture _resolved_profile_name at request entry (~line 2019), immediately after profile home resolution. This is the only point where profile context is reliable: s.profile if already set, otherwise get_active_profile_name() — which at that point reads thread-local storage (_tls.profile) correctly set by the HTTP handler thread via set_request_profile(). Calling get_active_profile_name() at compression time instead would be unsafe: the streaming thread is a separate threading.Thread, does not inherit TLS, and the call would fall back to the process-global _active_profile which may belong to a different concurrent tab. Stamp s.profile in the compression migration block immediately after s.session_id = new_sid. Guarded by `if not s.profile` so sessions that already have a profile set are unaffected. A logger.info line records when the stamp fires, making future investigation straightforward. Fixes: memory writes bleeding into default profile after compression Reproduces: reliably on any long non-default profile session that hits the compression threshold (default: 0.80 context fill)	2026-05-10 09:57:45 +01:00
Frank Song	1bec8070f2	fix(1833): persist compression anchor summary for reload UI	2026-05-10 16:45:16 +08:00
Frank Song	2e6b3601bd	fix(clarify): honor clarify.timeout config in webui prompts	2026-05-10 16:05:50 +08:00
qxxaa	7ee41c9b12	fix: patch skills module-level caches on per-request profile switch Per-request profile switches (process_wide=False, introduced in #1700) update os.environ['HERMES_HOME'] but skip _set_hermes_home(), which is responsible for monkeypatching module-level caches. Both tools/skills_tool.py and tools/skill_manager_tool.py set HERMES_HOME and SKILLS_DIR once at import time. When a non-default profile is active in the WebUI, os.environ['HERMES_HOME'] is correctly updated per-turn in the _ENV_LOCK block, but the module-level constants still point at the root profile. All agent-side skill operations — skills_list(), skill_view(), skill_manage() — read and write to the wrong directory. Add the same monkeypatching that _set_hermes_home() already performs (profiles.py line ~620) to the per-turn env setup block in streaming.py, covering both skills_tool and skill_manager_tool. The WebUI display half was already fixed in #1917 via _active_skills_dir() in routes.py. This patch fixes the agent-side half so the running agent resolves skills from the correct profile.	2026-05-10 09:02:49 +01:00
Frank Song	1e1a9481b4	fix(i18n): localize /goal runtime status strings	2026-05-10 15:21:24 +08:00
nesquena-hermes	a3af4a3c8f	fix(profile/mcp): discover MCP tools after per-session HERMES_HOME mutation Issue #1968: switching to a non-default profile in the WebUI dropdown had no effect on which MCP servers were available. Every chat session, regardless of profile, only saw the default profile's mcp_servers from ~/.hermes/config.yaml. Non-default profile MCP servers (postgres, custom stdio servers, anything in <profile>/config.yaml) never registered. Root cause: api/streaming.py:1922 called discover_mcp_tools() at the TOP of _run_agent_streaming(), about 100 lines BEFORE the per-session 'os.environ["HERMES_HOME"] = _profile_home' mutation at line 2053. discover_mcp_tools() reads ~/.hermes/config.yaml via get_hermes_home(), which uses os.environ['HERMES_HOME']. So at the call site, HERMES_HOME was still whatever the WebUI server process had at startup — the default profile, every time. Fix: relocate the discover_mcp_tools() call past the _ENV_LOCK block so get_hermes_home() resolves to the session's actual profile home. Same try/except wrapping is preserved; same idempotency semantics on already-connected servers; same lazy-import pattern. Caveat (out of scope, agent-side): _servers in tools/mcp_tool.py is a process-global Dict[str, MCPServerTask] keyed only by server name. So once profile A registers a server named e.g. 'postgres', profile B's discovery sees 'postgres' as already connected and skips it — even if B's config points at a different binary or DB. Concurrent multi-profile WebUI processes will still hit 'first profile wins per server name'. Fully fixing that requires keying _servers by (profile_home, name) upstream in hermes-agent. This PR ships layer 1 only — fixes the single-non-default-profile case (the headline symptom). Tests: tests/test_issue1968_mcp_profile_discovery.py — 4 static tests pinning the lexical ordering invariants. Verified mutation-safety: a proof-of-concept revert (re-adding a discover call before the HERMES_HOME mutation) makes the 'only called once' test fail. Test suite: 5047 passed, 4 skipped, 3 xpassed, 0 regressions. Closes #1968	2026-05-09 20:08:16 +00:00
nesquena-hermes	8782fd2675	fix(stage-326): apply Opus advisor critical + recommended fixes CRITICAL: #1951 PENDING_GOAL_CONTINUATION race Removes `PENDING_GOAL_CONTINUATION.discard(session_id)` from the streaming worker's `finally` cleanup block. The marker is set inside the SAME function call (line ~3328 on `goal_continue`) and the discard in the `finally` (line ~3553) almost always raced ahead of the frontend's SSE-receive → POST /api/chat/start round-trip, erasing the marker before the consumer in routes.py could read it. The consumer (`_start_chat_stream_for_session` in routes.py:6522) already discards atomically when consuming, so removing the streaming-side discard preserves single-use semantics and unblocks the goal-continuation chain. Adds tests/test_stage326_pending_goal_continuation_race.py with 5 regression guards: 1. streaming.py's finally must NOT discard PENDING_GOAL_CONTINUATION 2. routes.py consumer must check + set + discard atomically 3. PENDING_GOAL_CONTINUATION must be a set (GIL-safe single-op) 4. STREAM_GOAL_RELATED.pop must be keyed by stream_id, not session_id 5. PENDING_GOAL_CONTINUATION.add must precede the goal_continue SSE emission in source ordering HARDENING: #1956 composer-draft input validation Per Opus, the POST /api/session/draft handler accepted unbounded / arbitrary-typed text and files inputs. With the 400ms debounced auto-save firing on every keystroke, a misbehaving client could persist multi-MB strings into the session JSON. Adds: - text: coerced to str if not already; clamped to 50_000 chars - files: coerced to list if not already; clamped to 50 entries Validation runs BEFORE the session lock acquire / save. Adds tests/test_stage326_composer_draft_validation.py with 5 guards. Verdict from Opus advisor on stage-326: SHIP-WITH-FIXES. This commit applies the required + recommended fixes; #1957 hardening fixed in a prior stage commit.	2026-05-09 18:36:01 +00:00
nesquena-hermes	4751b5ace5	Stage 326: PR #1951 — fix: only evaluate goal hook on goal-related turns (#1932) by @amlyczz	2026-05-09 18:17:20 +00:00
hermes-agent	b443e8ea5a	fix: WebUI respects image_input_mode — stop unconditionally embedding native images _build_native_multimodal_message() unconditionally embedded images as native image_url parts, bypassing the agent's image_input_mode config. Add _resolve_image_input_mode(cfg) helper mirroring the agent's decide_image_input_mode logic, and wire it into _build_native_multimodal_message with a new cfg parameter. When mode resolves to 'text' (explicit aux vision config, or image_input_mode: text), returns plain string so the agent's existing text-mode pipeline (vision_analyze) handles images. Closes #1959	2026-05-09 19:39:50 +02:00
zqy	6fd07c2af4	fix: only evaluate goal hook on goal-related turns (#1932) The goal evaluation hook was firing on every completed assistant turn when a goal was active, even for unrelated messages like "what time is it". This burned the goal budget, triggered continuation prompts that interrupted unrelated conversations, and made /goal status numbers misleading. Add STREAM_GOAL_RELATED and PENDING_GOAL_CONTINUATION flags to gate the evaluate_goal_after_turn() call in the streaming loop. Only streams started from goal kickoff (/goal <text>) or goal continuation are marked as goal-related. Normal user messages skip the hook entirely.	2026-05-09 15:08:13 +08:00
Michael Lam	8e513b596b	fix: surface goal evaluation status	2026-05-08 17:12:01 +00:00
Michael Lam	0db5bc6b76	feat: add WebUI goal command support	2026-05-08 17:12:01 +00:00
nesquena-hermes	8c4c253654	Stage 322: PR #1814 — custom named provider API key resolution by @hualong1009	2026-05-08 16:55:20 +00:00
王浩生	cdbdc28f5c	fix(config): custom named provider API key resolution in WebUI - add robust custom provider credential/base_url resolver - apply fallback in streaming and routes agent init/self-heal paths - support slug normalization and config fallbacks for custom:* providers	2026-05-08 16:40:17 +00:00
Frank Song	ccdc055c36	Fix workspace prefix sentinel handling	2026-05-08 16:40:17 +00:00
nesquena-hermes	b8426d047c	Stage 321: PR #1900 — pass config overrides into context-length fallback (closes #1896)	2026-05-08 16:08:42 +00:00
nesquena-hermes	0efa75827a	fix(streaming): pass config overrides into context-length fallback (#1896) The two get_model_context_length() fallback callsites in api/streaming.py (session save + SSE usage payload) were calling the resolver with only model + base_url. When the agent's compressor reports 0 (fresh/cached/transitioning agent), resolution fell through to the 256K DEFAULT_FALLBACK even when users had set model.context_length: 1048576 in config.yaml. For LCM users on 1M-context models, the wrong window cascaded into a session-killing failure: auto-compression triggered at ~25% of the wrong value, floods of compress requests, 429s, credential pool exhaustion, fallback 429s, then 'API call failed after 3 retries'. Reported by @AvidFuturist on Discord with deepseek-v4-flash. Reproduced 5x. Both callsites now pass config_context_length, provider, and custom_providers. The resolver consults these BEFORE probing, so the config override wins. Both are wrapped in except TypeError blocks that retry with the legacy 2-arg form for older hermes-agent builds whose get_model_context_length signature pre-dates these kwargs. Tests: 7 source-string regressions guarding both call shapes, the safe config parse, the legacy fallback, and the per-profile config source. Also bumped the line-distance assertion in test_pr1341 (the test explicitly invites bumping when a new pre-save mutation block is added). Closes #1896 Co-authored-by: Hermes Agent <agent@hermes.local>	2026-05-08 16:08:42 +00:00
nesquena-hermes	f456daa574	fix(streaming): include profile home in agent cache signature (#1897) Same-session profile switches reused cached AIAgent from previous profile, silently leaking the old persona's SOUL.md / system prompt into the new profile's turns. session_id stays stable across profile switches, and the signature didn't include the active profile home, so every signature input matched and the stale agent was returned from SESSION_AGENT_CACHE. Append _profile_home to the signature blob so profile switches force a cache miss and a fresh agent build under the new HERMES_HOME (which triggers a fresh load_soul_md() call). Tests: 3 source-string regressions guarding the signature contract, ordering, and empty-home fallback. Closes #1897 Co-authored-by: Hermes Agent <agent@hermes.local>	2026-05-08 16:08:18 +00:00
nesquena-hermes	72b077ecce	Stage 320: PR #1889 — deduplicate workspace-prefixed user turns by @ai-ag2026	2026-05-08 15:48:28 +00:00
ai-ag2026	f6d09e06ca	fix: deduplicate workspace-prefixed user turns	2026-05-08 15:37:10 +00:00
nesquena-hermes	518453545c	Stage 320: PR #1865 — interim_assistant streaming in runtime + live UI by @franksong2702	2026-05-08 15:37:09 +00:00
nesquena-hermes	035c537281	Stage 320: PR #1861 — overwrite session usage per turn by @franksong2702	2026-05-08 15:37:09 +00:00
Frank Song	c1a9d7ce79	fix: overwrite session usage per turn	2026-05-08 15:37:09 +00:00
Frank Song	82c7367cef	Add interim_assistant streaming path to WebUI	2026-05-08 15:37:09 +00:00
Michael Lam	01b9c82dc9	fix: honor configured max_turns in WebUI agents Read agent.max_turns when constructing streaming WebUI AIAgent instances, pass it as max_iterations when supported, and include it in the per-session agent cache signature so budget changes take effect. Add regression coverage for the config read, constructor kwarg, and cache key.	2026-05-08 15:37:08 +00:00
Michael Lam	e31b7e72d6	fix: show auto-compression running state	2026-05-07 18:41:13 +00:00
Michael Lam	048f1fa24e	fix: keep assistant-only stream deltas on current turn	2026-05-07 06:25:16 +00:00
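
Note on commit 8782fd2675 (composer-draft input validation): the message says the POST /api/session/draft handler now coerces text to str and clamps it to 50_000 chars, and coerces files to a list and clamps it to 50 entries, before the session lock is acquired. A minimal sketch of that coerce-and-clamp step follows. The function name sanitize_draft, the constant names, and the exact coercion rules for non-str / non-list inputs are assumptions made for illustration; only the two limits come from the commit message, and the real handler may differ.

```python
from typing import Any, List, Tuple

MAX_DRAFT_TEXT_CHARS = 50_000  # limit quoted in the commit message
MAX_DRAFT_FILES = 50           # limit quoted in the commit message


def sanitize_draft(text: Any, files: Any) -> Tuple[str, List[Any]]:
    """Hypothetical helper: coerce and clamp composer-draft inputs.

    Intended to run before any session lock is taken, so oversized or
    oddly typed payloads are never persisted.
    """
    # Assumption: None becomes an empty draft, anything else is stringified.
    if not isinstance(text, str):
        text = "" if text is None else str(text)
    # Assumption: tuples/sets are converted, other non-list types are dropped.
    if not isinstance(files, list):
        files = list(files) if isinstance(files, (tuple, set)) else []
    return text[:MAX_DRAFT_TEXT_CHARS], files[:MAX_DRAFT_FILES]
```

Clamping rather than rejecting keeps a debounced auto-save from failing on every keystroke once a draft grows past the limit; whether the actual handler clamps silently or also reports the truncation is not stated in the log.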


162 Commits