PR #1900 patches the two get_model_context_length() fallback callsites in
api/streaming.py to pass config_context_length, provider, and
custom_providers — but a third callsite of the same shape lives at
api/routes.py:2849, in the /api/session/get path that resolves
context_length for older sessions (pre-#1318) that have context_length=0
persisted.
Same bug shape: only `(model, base_url)` were forwarded, so the resolver
fell through to the 256K DEFAULT_FALLBACK_CONTEXT even when the user had
`model.context_length: 1048576` set in config.yaml. Visible symptom: the
very first paint of a reloaded old session shows the wrong window in the
chat-toolbar indicator until a turn fires (which would then trigger the
streaming.py fallbacks fixed in this PR and overwrite with the correct
value).
Fix mirrors streaming.py: pass `config_context_length=`,
`provider=effective_provider or ""`, and `custom_providers=` from the
per-profile config (`get_config()`), with a TypeError fallback that
retries the legacy 2-arg form for older hermes-agent builds whose
get_model_context_length signature pre-dates the new kwargs.
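The call shape described above can be sketched roughly as follows. This is an illustration, not the code in api/routes.py: `resolve_context_length`, `new_resolver`, and `legacy_resolver` are hypothetical stand-ins, and the numeric value of DEFAULT_FALLBACK_CONTEXT is assumed.

```python
DEFAULT_FALLBACK_CONTEXT = 256_000  # assumed value; the real constant lives in hermes-agent


def resolve_context_length(resolver, model, base_url, cfg, effective_provider):
    """Call the resolver with config overrides; retry the 2-arg form on TypeError."""
    model_cfg = cfg.get("model")
    if not isinstance(model_cfg, dict):
        model_cfg = {}
    try:
        # Treat a missing, zero, or unparsable value as "no override".
        config_ctx = int(model_cfg.get("context_length") or 0) or None
    except (TypeError, ValueError):
        config_ctx = None
    try:
        return resolver(
            model,
            base_url,
            config_context_length=config_ctx,
            provider=effective_provider or "",
            custom_providers=cfg.get("custom_providers"),
        )
    except TypeError:
        # Older builds: the signature pre-dates the new kwargs.
        return resolver(model, base_url)


# Hypothetical stand-ins for the new and the legacy resolver signatures.
def new_resolver(model, base_url, config_context_length=None, provider="",
                 custom_providers=None):
    return config_context_length or DEFAULT_FALLBACK_CONTEXT


def legacy_resolver(model, base_url):
    return DEFAULT_FALLBACK_CONTEXT


cfg = {"model": {"context_length": 1048576}}
print(resolve_context_length(new_resolver, "m", "http://x", cfg, None))     # 1048576
print(resolve_context_length(legacy_resolver, "m", "http://x", cfg, None))  # 256000
```

One trade-off of this shape: a TypeError raised inside a new-style resolver is also swallowed and retried with the legacy form, so the fallback favors availability over strict error reporting.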
Adds `test_routes_session_load_fallback_passes_config_overrides` to lock
the call shape — verified to fail pre-fix with the same "missing
config_context_length=" error the streaming.py tests catch.
Defense-in-depth completion of #1896 — closes the third leg of the same
bug shape.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The two get_model_context_length() fallback callsites in api/streaming.py
(session save + SSE usage payload) were calling the resolver with only
model + base_url. When the agent's compressor reports 0 (fresh/cached/
transitioning agent), resolution fell through to the 256K DEFAULT_FALLBACK
even when users had set model.context_length: 1048576 in config.yaml.
For LCM users on 1M-context models, the wrong window cascaded into a
session-killing failure: auto-compression triggered at ~25% of the wrong
value, floods of compress requests, 429s, credential pool exhaustion,
fallback 429s, then 'API call failed after 3 retries'.
Reported by @AvidFuturist on Discord with deepseek-v4-flash. Reproduced 5x.
Both callsites now pass config_context_length, provider, and
custom_providers. The resolver consults these BEFORE probing, so the
config override wins. Both are wrapped in except TypeError blocks that
retry with the legacy 2-arg form for older hermes-agent builds whose
get_model_context_length signature pre-dates these kwargs.
Tests: 7 source-string regressions guarding both call shapes, the safe
config parse, the legacy fallback, and the per-profile config source.
Also bumped the line-distance assertion in test_pr1341 (the test
explicitly invites bumping when a new pre-save mutation block is added).
Closes #1896
Co-authored-by: Hermes Agent <agent@hermes.local>
Same-session profile switches reused the cached AIAgent from the previous profile,
silently leaking the old persona's SOUL.md / system prompt into the new
profile's turns. session_id stays stable across profile switches, and the
signature didn't include the active profile home, so every signature input
matched and the stale agent was returned from SESSION_AGENT_CACHE.
Append _profile_home to the signature blob so profile switches force a
cache miss and a fresh agent build under the new HERMES_HOME (which
triggers a fresh load_soul_md() call).
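A minimal sketch of why adding the profile home to the signature forces a rebuild. The helper names, the signature inputs other than the profile home, and the use of a SHA-256 digest are assumptions for illustration; only SESSION_AGENT_CACHE and _profile_home are named in this change.

```python
import hashlib
import json

SESSION_AGENT_CACHE = {}  # session_id -> (signature, agent)


def agent_cache_signature(model, base_url, provider, profile_home):
    """Hash every input that should invalidate a cached agent when it changes."""
    # Empty-home fallback: None and "" produce the same signature.
    blob = json.dumps([model, base_url, provider, profile_home or ""])
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def get_agent(session_id, signature, build):
    """Return the cached agent only when the signature still matches."""
    cached = SESSION_AGENT_CACHE.get(session_id)
    if cached is not None and cached[0] == signature:
        return cached[1]
    agent = build()  # cache miss: build under the currently active profile
    SESSION_AGENT_CACHE[session_id] = (signature, agent)
    return agent
```

Because session_id is the cache key and stays stable across a profile switch, the signature comparison is the only thing that can reject the stale entry.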
Tests: 3 source-string regressions guarding the signature contract,
ordering, and empty-home fallback.
Closes #1897
Co-authored-by: Hermes Agent <agent@hermes.local>
Read agent.max_turns when constructing streaming WebUI AIAgent instances, pass it as max_iterations when supported, and include it in the per-session agent cache signature so budget changes take effect.
Add regression coverage for the config read, constructor kwarg, and cache key.
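One way to express "pass it as max_iterations when supported" is a signature check before building the kwargs. This is a sketch under assumptions: `build_agent_kwargs` and both agent classes are hypothetical, and the real constructor may detect support differently.

```python
import inspect


def build_agent_kwargs(agent_cls, cfg, base_kwargs):
    """Add max_iterations only when the constructor declares that parameter."""
    kwargs = dict(base_kwargs)
    agent_cfg = cfg.get("agent")
    max_turns = agent_cfg.get("max_turns") if isinstance(agent_cfg, dict) else None
    params = inspect.signature(agent_cls).parameters
    if max_turns is not None and "max_iterations" in params:
        kwargs["max_iterations"] = int(max_turns)
    return kwargs


# Hypothetical stand-ins for a newer and an older agent constructor.
class NewAgent:
    def __init__(self, model, max_iterations=90):
        self.model = model
        self.max_iterations = max_iterations


class OldAgent:
    def __init__(self, model):
        self.model = model
```

The same max_turns value would also need to be part of the cache signature, otherwise a cached agent built under the old budget keeps being returned.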
The dashboard banner 'Hermes agent is not responding' fires on every
multi-container deployment that doesn't set 'pid: "service:hermes-agent"'
in compose, because get_running_pid() relies on fcntl.flock and
os.kill(pid, 0) — both PID-namespace-scoped and invisible across container
boundaries.
Fix: when get_running_pid() returns None, fall back to a freshness check on
gateway_state.json. The gateway already writes that file on every tick with
gateway_state == 'running' and an aware ISO-8601 updated_at timestamp, so a
recent (<= 120s) timestamp is an equivalent live-process signal that needs
only a shared volume — no PID namespace, no compose workaround, no extra
HTTP probe URL.
Behavior preserved:
- In-namespace deployments still hit the PID-based path first; payload shape
unchanged (no 'reason' key) so #716 contract holds.
- Cross-container alive path adds reason='cross_container_freshness' so
support diagnostics can tell which signal succeeded.
- Stale updated_at, non-running gateway_state, malformed/naive/missing
timestamps, and timestamps far in the future all still report 'down' — the
fallback never produces a false positive.
- Same redaction rules: argv/command/executable/env/raw pid never leak.
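The freshness fallback can be sketched as below. The function name and the 5-second future tolerance are assumptions; the 120-second window, the gateway_state == 'running' check, and the requirement for a timezone-aware updated_at come from the description above.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path

FRESHNESS_WINDOW = timedelta(seconds=120)
CLOCK_SKEW_TOLERANCE = timedelta(seconds=5)  # assumed value, for illustration only


def gateway_alive_by_freshness(state_path, now=None):
    """Return True only for a running gateway with a recent, aware timestamp."""
    now = now or datetime.now(timezone.utc)
    try:
        state = json.loads(Path(state_path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False  # missing, unreadable, or malformed file
    if not isinstance(state, dict) or state.get("gateway_state") != "running":
        return False
    raw = state.get("updated_at")
    if not isinstance(raw, str):
        return False
    try:
        updated = datetime.fromisoformat(raw)
    except ValueError:
        return False
    if updated.tzinfo is None or updated.utcoffset() is None:
        return False  # naive timestamps cannot be compared safely
    age = now - updated
    # Reject stale timestamps and timestamps too far in the future.
    return -CLOCK_SKEW_TOLERANCE <= age <= FRESHNESS_WINDOW
```

Every refusal path returns False, so the fallback can only confirm liveness that the state file actually supports.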
Tests: 15 new cases in test_issue1879_cross_container_gateway_liveness.py
covering the cross-container alive path, every refusal case, clock-skew
tolerance, and backward compat with the #716 PID path. Existing #716
heartbeat tests (8) continue to pass.
Two bugs in get_available_models() conspired to duplicate the active
provider's auto-detected models under a phantom 'Custom' group whenever
custom_providers was also declared in config.yaml:
1. custom:* PIDs not in _named_custom_groups (e.g. stale slugs left from
prior configs) fell through to the auto_detected_models fallback, copying
the active provider's whole catalog into a phantom Custom: <slug> group.
Fix: continue unconditionally for ANY custom:* PID — the named-group
branch is the only legitimate population path.
2. The bare 'custom' PID, with the active provider being concrete (e.g.
ai-gateway), hit 'elif auto_detected_models: copy.deepcopy(...)' and
built a duplicate Custom group of the active provider's models with
mismatched provider prefixes. Fix: when pid == 'custom' and the active
provider is non-custom, leave models_for_group empty.
The reporter also suggested a third fix gating resolve_model_provider() on
config_provider — that's intentionally NOT applied because it conflicts with
the long-standing model-specific-override semantics covered by
test_model_resolver.py::test_custom_provider_*_routes_to_named_custom_provider
(custom_providers entries explicitly override the active provider's routing
when the user opted-in). The reporter's symptom (duplicate UI group) lives
entirely in get_available_models()'s group construction and is fully fixed
by the two changes above.
Tests: 6 new regression tests (3 in #1881 file + reuse), 774 broader
tests still green (model/provider/custom/config domain).
Per Opus pre-release verdict on PR #1843: the four handle_kanban_*
entry points declare '-> bool' but actually return True | None | False
(after PR #1843 made the False-vs-None distinction load-bearing for
the caller's '_kanban_unknown_endpoint' decision). Update the type
annotations to 'bool | None' and add a docstring on handle_kanban_get
(with cross-references on the three siblings) so a future contributor
adding a new return path doesn't accidentally produce a 0/'' value
that would silently revert the double-404 fix.
Test-only verification: kanban tests pass (49/49). Production behavior
unchanged. Cheap defensive cleanup per Nathan's standing absorb-in-release
default for ≤20-LOC documentation/type-annotation fixes.
PR #1837's new `_kanban_unknown_endpoint` wrapper was triggered for any
falsy bridge return — but `handle_kanban_*` returns `None` (not `True`)
when an inner handler calls `bad(...)` to send an error response. The
wrapper then sent a SECOND 404 on top of the bridge's response, producing
concatenated JSON bodies on the wire.
Concrete reproducer (caught by behavioural harness, not the merged tests):
GET /api/kanban/tasks/<missing-id>/log
→ '{"error":"task not found"}{"error":"unknown Kanban endpoint: GET ..."}'
This affected every `bad(...)`-shaped error path in the bridge:
- task-not-found returns from `_task_log_payload` / `_task_detail_payload`
- exception handlers for ImportError (503), LookupError (404),
ValueError (400), RuntimeError (409) across all four method handlers
- the `_handle_events_sse_stream` board-resolution failure path
The fix: distinguish an explicit `False` (truly unmatched path) from
`None` (handled, response already sent). Only `False` should trigger
the unknown-endpoint diagnostic.
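The distinction can be illustrated with a small dispatcher. All names here are hypothetical stand-ins for the bridge handlers and the wrapper; the point is only the `is False` identity check in place of a truthiness check.

```python
def route_kanban(bridge_handler, send_json):
    """Send the unknown-endpoint 404 only for an explicit False."""
    result = bridge_handler(send_json)
    if result is False:  # not `if not result`: None means "already responded"
        send_json(404, {"error": "unknown Kanban endpoint"})
    return result


def task_log_handler(send_json):
    # Matched path whose inner handler sends its own error, then returns None.
    send_json(404, {"error": "task not found"})
    return None


def unmatched_handler(send_json):
    # No route matched: nothing has been sent yet.
    return False


def collect(handler):
    """Run a handler through the router and record every body sent."""
    sent = []
    route_kanban(handler, lambda status, body: sent.append((status, body)))
    return sent


print(collect(task_log_handler))   # [(404, {'error': 'task not found'})]
print(collect(unmatched_handler))  # [(404, {'error': 'unknown Kanban endpoint'})]
```

With a plain falsy check, the first call would record two bodies, which is the concatenated-JSON symptom from the reproducer.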
Adds a regression test that exercises the task-not-found path through
`routes.handle_get` and asserts only one JSON body is on the wire.
Follow-on to #1837 (already merged into master at v0.51.20).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Note: PR #1827 was branched before v0.51.19 shipped #1812, which
introduced an initial (pure live-fetch) Codex provider card hook in
api/providers.py at the same line range. The contributor's PR was
filed AFTER #1812 shipped but their diff didn't yet account for it.
Stage 314 absorbs the contributor's intent (visible Codex cache
merge for gpt-5.3-codex-spark visibility) by replacing the v0.51.19
hook with the richer merged version directly in stage. Production
code change ≡ what the contributor's PR would have produced if
rebased onto current master. Test file + pr-media adopted verbatim.
Marker commit so the stage log makes the absorption visible.
Two in-stage fixes for v0.51.19 batch:
1) api/config.py — add resolve_alias=False param to
_resolve_configured_provider_id() and pass it from
resolve_model_provider(). The PR #1818 swap from
_resolve_provider_alias() to _resolve_configured_provider_id()
was correct for active-provider/badge surfaces but broke #1625's
local-server-provider literal-preservation contract: 'ollama' →
'custom' and 'lm-studio' → 'lmstudio' alias-collapse caused
_LOCAL_SERVER_PROVIDERS membership check to miss, breaking the
model-id full-path preservation for LM Studio/Ollama. The new
flag preserves the raw provider value when called from
resolve_model_provider, and named-custom-slug + base-url
fallback both still run unchanged.
2) tests/test_bootstrap_discover_agent.py — pin Path.home() in
_isolate_discover_agent_dir so the hard-coded
'Path.home() / .hermes / hermes-agent' / 'Path.home() /
hermes-agent' candidates in discover_agent_dir() can't pick up
the dev machine's real install. The original PR #1817 isolation
helper covered HERMES_HOME, HERMES_WEBUI_AGENT_DIR, and
REPO_ROOT but missed the Path.home() leak.
Both surfaced in the full pytest pre-release gate, were fixed in stage,
and ship in v0.51.19. Tests: full suite green.
PR #1762 fixed the rsplit grammar collision for plain @openrouter:model:free
qualifiers, but skipped the fallback whenever the provider hint started with
'custom:' on the assumption that custom providers route directly. That left
'@custom:my-key:some-model:free' broken: rsplit yields
provider='custom:my-key:some-model', bare='free' → custom guard skips the
split-fallback → returns provider='custom:my-key:some-model', model='free'.
Detect the over-split structurally instead of using a known-suffix allowlist:
custom hints carry exactly one segment after 'custom:' (constructed at
api/config.py:1363 as 'custom:' + entry_name). So any rsplit result of
'custom:<a>:<b>' with bare model '<c>' has eaten one model segment — peel
it back with a second rsplit and prepend it to the bare model.
This is robust for :free / :beta / :thinking / :preview / any future
OpenRouter suffix without an allowlist to maintain.
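A sketch of the structural check, assuming a hypothetical `split_provider_model` helper. It covers only the custom-provider over-split described here, not the plain @openrouter:model:free fallback from PR #1762.

```python
def split_provider_model(qualified):
    """Split '@provider:model' and undo an over-split of a custom hint."""
    text = qualified[1:] if qualified.startswith("@") else qualified
    provider, sep, bare = text.rpartition(":")
    if not sep:
        return "", text  # no provider hint at all
    # A custom hint is 'custom:' + entry_name, i.e. exactly one segment after
    # the prefix. Two colons in the provider mean one model segment was eaten.
    if provider.startswith("custom:") and provider.count(":") == 2:
        provider, _, eaten = provider.rpartition(":")
        bare = f"{eaten}:{bare}"
    return provider, bare


print(split_provider_model("@custom:my-key:some-model:free"))
# ('custom:my-key', 'some-model:free')
print(split_provider_model("@custom:my-key:some-model"))
# ('custom:my-key', 'some-model')
```

Because the check counts segments instead of matching suffix names, it needs no update when a new suffix appears.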
Adds 5 regression tests covering the matrix (free/beta/thinking/preview/
slashed-model). All 7 existing #1744 tests still pass; #1228 tests
unaffected.
Co-authored-by: Cake <51058514+Sanjays2402@users.noreply.github.com>
The bridge module docstring still described the API as 'deliberately
read-only' but it now exposes full CRUD (tasks, boards, comments,
links, SSE). Updated to list the supported operations.
For _board_counts_for_slug (the hot path for the board-switcher badge),
added a board_exists() early-out that mirrors the agent's own helper
in plugin_api.py (path.exists() before connect()). This avoids a
redundant init_db()+connect() schema pass per board per list refresh.
connect() already handles auto-init for fresh databases via its
needs_init check, so the extra init_db was unnecessary overhead on
the hot path that scales linearly with board count.
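The early-out can be sketched with the standard sqlite3 module. The `board_counts` helper and the `tasks(status)` schema are hypothetical; the real bridge goes through the agent's own connect() helper.

```python
import sqlite3
from pathlib import Path


def board_counts(db_path):
    """Return per-status task counts, or {} without connecting if no DB exists."""
    path = Path(db_path)
    if not path.exists():
        # Early-out: sqlite3.connect() would otherwise create an empty file.
        return {}
    conn = sqlite3.connect(str(path))
    try:
        rows = conn.execute(
            "SELECT status, COUNT(*) FROM tasks GROUP BY status"
        ).fetchall()
    finally:
        conn.close()
    return dict(rows)
```

The existence check costs one stat call per board, which is cheaper than opening a connection and running a schema pass on every list refresh.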
Tests:
- test_board_counts_returns_empty_for_nonexistent_board: verifies the
early-out (no connect() call, returns {})
- test_board_counts_returns_real_counts_for_populated_board: verifies
actual per-status counts are returned for existing boards