Issue #1968: switching to a non-default profile in the WebUI dropdown
had no effect on which MCP servers were available. Every chat session,
regardless of profile, only saw the default profile's mcp_servers from
~/.hermes/config.yaml. Non-default profile MCP servers (postgres, custom
stdio servers, anything in <profile>/config.yaml) never registered.
Root cause: api/streaming.py:1922 called discover_mcp_tools() at the
TOP of _run_agent_streaming(), about 100 lines BEFORE the per-session
'os.environ["HERMES_HOME"] = _profile_home' mutation at line 2053.
discover_mcp_tools() reads ~/.hermes/config.yaml via get_hermes_home(),
which uses os.environ['HERMES_HOME']. So at the call site, HERMES_HOME
was still whatever the WebUI server process had at startup — the default
profile, every time.
Fix: relocate the discover_mcp_tools() call past the _ENV_LOCK block so
get_hermes_home() resolves to the session's actual profile home. Same
try/except wrapping is preserved; same idempotency semantics on
already-connected servers; same lazy-import pattern.
Caveat (out of scope, agent-side): _servers in tools/mcp_tool.py is a
process-global Dict[str, MCPServerTask] keyed only by server name. So
once profile A registers a server named e.g. 'postgres', profile B's
discovery sees 'postgres' as already connected and skips it — even if
B's config points at a different binary or DB. Concurrent multi-profile
WebUI processes will still hit 'first profile wins per server name'.
Fully fixing that requires keying _servers by (profile_home, name)
upstream in hermes-agent. This PR ships layer 1 only — fixes the
single-non-default-profile case (the headline symptom).
Tests: tests/test_issue1968_mcp_profile_discovery.py — 4 static tests
pinning the lexical ordering invariants. Verified mutation-safety: a
proof-of-concept revert (re-adding a discover call before the
HERMES_HOME mutation) makes the 'only called once' test fail.
Test suite: 5047 passed, 4 skipped, 3 xpassed, 0 regressions.
Closes#1968
CRITICAL: #1951 PENDING_GOAL_CONTINUATION race
Removes `PENDING_GOAL_CONTINUATION.discard(session_id)` from the
streaming worker's `finally` cleanup block. The marker is set inside
the SAME function call (line ~3328 on `goal_continue`) and the discard
in the `finally` (line ~3553) almost always raced ahead of the
frontend's SSE-receive → POST /api/chat/start round-trip, erasing
the marker before the consumer in routes.py could read it. The
consumer (`_start_chat_stream_for_session` in routes.py:6522) already
discards atomically when consuming, so removing the streaming-side
discard preserves single-use semantics and unblocks the
goal-continuation chain.
Adds tests/test_stage326_pending_goal_continuation_race.py with 5
regression guards:
1. streaming.py's finally must NOT discard PENDING_GOAL_CONTINUATION
2. routes.py consumer must check + set + discard atomically
3. PENDING_GOAL_CONTINUATION must be a set (GIL-safe single-op)
4. STREAM_GOAL_RELATED.pop must be keyed by stream_id, not session_id
5. PENDING_GOAL_CONTINUATION.add must precede the goal_continue SSE
emission in source ordering
HARDENING: #1956 composer-draft input validation
Per Opus, the POST /api/session/draft handler accepted unbounded /
arbitrary-typed text and files inputs. With the 400ms debounced
auto-save firing on every keystroke, a misbehaving client could
persist multi-MB strings into the session JSON. Adds:
- text: coerced to str if not already; clamped to 50_000 chars
- files: coerced to list if not already; clamped to 50 entries
Validation runs BEFORE the session lock acquire / save.
Adds tests/test_stage326_composer_draft_validation.py with 5 guards.
Verdict from Opus advisor on stage-326: SHIP-WITH-FIXES.
This commit applies the required + recommended fixes; #1957 hardening
fixed in a prior stage commit.
PR #1957 deleted the SESSION_TTL = 86400 * 30 module-level constant in
favor of the new _resolve_session_ttl() helper. Two existing regression
tests pin the constant: test_auth_sessions.TestSessionPruning.test_session_ttl_is_24_hours
imports SESSION_TTL directly, and test_v050258_opus_followups.test_redirect_session_ttl_30_days
asserts the literal "SESSION_TTL = 86400 * 30" line is present in source
(guarding against the daily-kick-out regression from #1419).
Restore SESSION_TTL as the named fallback for _resolve_session_ttl(); the
new env-var/settings.json path is unchanged. Backwards-compatible.
Also fix the new TestSessionTtlResolution suite:
- Switch from pytest's `monkeypatch` fixture (incompatible with
unittest.TestCase subclasses) to setUp/tearDown env snapshotting
- Reconcile clamp tests with actual implementation: out-of-range env
values fall through to settings/default, not snap to bounds
- test_session_uses_dynamic_ttl now sets the env var so the dynamic
resolved value (3600s) is exercised rather than expecting the default
Verified: tests/test_auth_sessions.py + tests/test_v050258_opus_followups.py
21/21 pass.
_build_native_multimodal_message() unconditionally embedded images as
native image_url parts, bypassing the agent's image_input_mode config.
Add _resolve_image_input_mode(cfg) helper mirroring the agent's
decide_image_input_mode logic, and wire it into
_build_native_multimodal_message with a new cfg parameter.
When mode resolves to 'text' (explicit aux vision config, or
image_input_mode: text), returns plain string so the agent's
existing text-mode pipeline (vision_analyze) handles images.
Closes#1959
Add _resolve_session_ttl() with three-layer precedence:
1. HERMES_WEBUI_SESSION_TTL env var (highest priority)
2. session_ttl_seconds in settings.json
3. Default: 86400 * 30 (30 days)
Clamped to [60s, 1 year] for safety. Settings changes take effect
immediately since the function is called dynamically at each login/cookie-write.
Closes#1954
- Session.composer_draft field: {text, files} stored in session JSON
- POST+GET /api/session/draft endpoint for save/load
- loadSession: save draft before switch, restore from S.session.composer_draft
- textarea input: debounced 400ms auto-save to server
- send(): clear draft after message is sent
- lockComposerForClarify(): save draft before card locks composer
- _restoreComposerDraft: clears textarea when target has no draft, guards
against stale responses racing new session loads, exact text comparison
- Session.compact(): includes composer_draft in response
- Fix: use handler.command instead of parsed.method (ParseResult has no .method)
Co-authored-by: Minimax <noreply@minimax.io>
When multiple custom providers expose the same model ID (e.g. baidu,
huoshan, and liantong all offering glm-5.1), only the first provider's
entry was shown in the model dropdown.
Root cause (backend): used the bare model ID as the
dedup key, so the second and subsequent providers with the same model
were silently skipped.
Root cause (frontend): stripped the @provider: prefix before
comparing, so @custom:baidu:glm-5.1 and @custom:huoshan:glm-5.1 were
treated as duplicates.
Fix:
- Backend: change _seen_custom_ids key to '{slug}:{model_id}' so each
provider's models are tracked independently.
- Frontend: add _providerOf() helper and deduplicate on the composite
(normId, provider) key instead of normId alone. Bare model IDs
(without @provider: prefix) still deduplicate on normId for backward
compatibility.
model_with_provider_context can emit @custom:<host>:<port>:<model> when
model_provider is derived from an OpenAI base_url authority (e.g.
custom:10.8.0.1:8080). The colon-count heuristic meant for @custom:slug:model:free
mistook those extra colons for an over-split model ID and prepended the port
segment onto the bare model (8080:Qwen3-235B), breaking WebUI while CLI/curl
stayed correct.
Detect endpoint-style slugs (IPv4/localhost/hostname + numeric port) and skip
the peel in that case. Add regression tests for IPv4, dotted hostname,
localhost, and model_with_provider_context round-trip.
The goal evaluation hook was firing on every completed assistant turn
when a goal was active, even for unrelated messages like "what time is
it". This burned the goal budget, triggered continuation prompts that
interrupted unrelated conversations, and made /goal status numbers
misleading.
Add STREAM_GOAL_RELATED and PENDING_GOAL_CONTINUATION flags to gate
the evaluate_goal_after_turn() call in the streaming loop. Only streams
started from goal kickoff (/goal <text>) or goal continuation are
marked as goal-related. Normal user messages skip the hook entirely.
Conflict resolution: both #1928 (session jump buttons) and #1929 (endless
scroll) add their own settings/UI/i18n keys. Resolved by keeping both —
the features are independent opt-in toggles.
Maintainer review on PR #1895 flagged that mcp_server.py duplicated the
visibility model from api/routes.py:75. Move the canonical helper into
api/profiles.py (next to _is_root_profile, on which it depends) so both
api/routes.py and mcp_server.py import the same function instead of
carrying parallel definitions that could drift as the model evolves.
- api/profiles.py: + _profiles_match (verbatim from former routes.py:75-97)
- api/routes.py: replace local definition with re-export to keep all
existing _profiles_match(...) call sites resolving
without per-call-site refactors
- mcp_server.py: drop local copy, import _profiles_match alongside the
existing api.profiles imports (line 59)
- tests: + test_profiles_match_single_source_of_truth asserts
identity (mcp.module._profiles_match is api.profiles._profiles_match
is api.routes._profiles_match) so any re-introduction of
a local copy trips the test
+ test_profiles_match_input_matrix parametrize across
the (None|''|'default'|'foo') x (None|''|'default'|'foo'|'bar')
visibility matrix per maintainer suggestion
Behaviour unchanged. Zero call-site changes anywhere in api/routes.py.
Co-Authored-By: Claude (Opus 4.7) <noreply@anthropic.com>
PR #1900 patches the two get_model_context_length() fallback callsites in
api/streaming.py to pass config_context_length, provider, and
custom_providers — but a third callsite of the same shape lives at
api/routes.py:2849, in the /api/session/get path that resolves
context_length for older sessions (pre-#1318) that have context_length=0
persisted.
Same bug shape: only `(model, base_url)` were forwarded, so the resolver
fell through to the 256K DEFAULT_FALLBACK_CONTEXT even when the user had
`model.context_length: 1048576` set in config.yaml. Visible symptom: the
very first paint of a reloaded old session shows the wrong window in the
chat-toolbar indicator until a turn fires (which would then trigger the
streaming.py fallbacks fixed in this PR and overwrite with the correct
value).
Fix mirrors streaming.py: pass `config_context_length=`,
`provider=effective_provider or ""`, and `custom_providers=` from the
per-profile config (`get_config()`), with a TypeError fallback that
retries the legacy 2-arg form for older hermes-agent builds whose
get_model_context_length signature pre-dates the new kwargs.
Adds `test_routes_session_load_fallback_passes_config_overrides` to lock
the call shape — verified to fail pre-fix with the same "missing
config_context_length=" error the streaming.py tests catch.
Defense-in-depth completion of #1896 — closes the third leg of the same
bug shape.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The two get_model_context_length() fallback callsites in api/streaming.py
(session save + SSE usage payload) were calling the resolver with only
model + base_url. When the agent's compressor reports 0 (fresh/cached/
transitioning agent), resolution fell through to the 256K DEFAULT_FALLBACK
even when users had set model.context_length: 1048576 in config.yaml.
For LCM users on 1M-context models, the wrong window cascaded into a
session-killing failure: auto-compression triggered at ~25% of the wrong
value, floods of compress requests, 429s, credential pool exhaustion,
fallback 429s, then 'API call failed after 3 retries'.
Reported by @AvidFuturist on Discord with deepseek-v4-flash. Reproduced 5x.
Both callsites now pass config_context_length, provider, and
custom_providers. The resolver consults these BEFORE probing, so the
config override wins. Both are wrapped in except TypeError blocks that
retry with the legacy 2-arg form for older hermes-agent builds whose
get_model_context_length signature pre-dates these kwargs.
Tests: 7 source-string regressions guarding both call shapes, the safe
config parse, the legacy fallback, and the per-profile config source.
Also bumped the line-distance assertion in test_pr1341 (the test
explicitly invites bumping when a new pre-save mutation block is added).
Closes#1896
Co-authored-by: Hermes Agent <agent@hermes.local>
Same-session profile switches reused cached AIAgent from previous profile,
silently leaking the old persona's SOUL.md / system prompt into the new
profile's turns. session_id stays stable across profile switches, and the
signature didn't include the active profile home, so every signature input
matched and the stale agent was returned from SESSION_AGENT_CACHE.
Append _profile_home to the signature blob so profile switches force a
cache miss and a fresh agent build under the new HERMES_HOME (which
triggers a fresh load_soul_md() call).
Tests: 3 source-string regressions guarding the signature contract,
ordering, and empty-home fallback.
Closes#1897
Co-authored-by: Hermes Agent <agent@hermes.local>
Read agent.max_turns when constructing streaming WebUI AIAgent instances, pass it as max_iterations when supported, and include it in the per-session agent cache signature so budget changes take effect.
Add regression coverage for the config read, constructor kwarg, and cache key.