Commit Graph

617 Commits

Author SHA1 Message Date
nesquena-hermes b8426d047c Stage 321: PR #1900 — pass config overrides into context-length fallback (closes #1896) 2026-05-08 16:08:42 +00:00
Nathan Esquenazi 15b7b7ae12 fix(routes): pass config overrides into session-load context-length fallback
PR #1900 patches the two get_model_context_length() fallback callsites in
api/streaming.py to pass config_context_length, provider, and
custom_providers — but a third callsite of the same shape lives at
api/routes.py:2849, in the /api/session/get path that resolves
context_length for older sessions (pre-#1318) that have context_length=0
persisted.

Same bug shape: only `(model, base_url)` were forwarded, so the resolver
fell through to the 256K DEFAULT_FALLBACK_CONTEXT even when the user had
`model.context_length: 1048576` set in config.yaml. Visible symptom: the
very first paint of a reloaded old session shows the wrong window in the
chat-toolbar indicator until a turn fires (which would then trigger the
streaming.py fallbacks fixed in this PR and overwrite with the correct
value).

Fix mirrors streaming.py: pass `config_context_length=`,
`provider=effective_provider or ""`, and `custom_providers=` from the
per-profile config (`get_config()`), with a TypeError fallback that
retries the legacy 2-arg form for older hermes-agent builds whose
get_model_context_length signature pre-dates the new kwargs.

Adds `test_routes_session_load_fallback_passes_config_overrides` to lock
the call shape — verified to fail pre-fix with the same "missing
config_context_length=" error the streaming.py tests catch.

Defense-in-depth completion of #1896 — closes the third leg of the same
bug shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 16:08:42 +00:00
nesquena-hermes 0efa75827a fix(streaming): pass config overrides into context-length fallback (#1896)
The two get_model_context_length() fallback callsites in api/streaming.py
(session save + SSE usage payload) were calling the resolver with only
model + base_url. When the agent's compressor reports 0 (fresh/cached/
transitioning agent), resolution fell through to the 256K DEFAULT_FALLBACK
even when users had set model.context_length: 1048576 in config.yaml.

For LCM users on 1M-context models, the wrong window cascaded into a
session-killing failure: auto-compression triggered at ~25% of the wrong
value, floods of compress requests, 429s, credential pool exhaustion,
fallback 429s, then 'API call failed after 3 retries'.

Reported by @AvidFuturist on Discord with deepseek-v4-flash. Reproduced 5x.

Both callsites now pass config_context_length, provider, and
custom_providers. The resolver consults these BEFORE probing, so the
config override wins. Both are wrapped in except TypeError blocks that
retry with the legacy 2-arg form for older hermes-agent builds whose
get_model_context_length signature pre-dates these kwargs.

Tests: 7 source-string regressions guarding both call shapes, the safe
config parse, the legacy fallback, and the per-profile config source.
Also bumped the line-distance assertion in test_pr1341 (the test
explicitly invites bumping when a new pre-save mutation block is added).

Closes #1896

Co-authored-by: Hermes Agent <agent@hermes.local>
2026-05-08 16:08:42 +00:00
nesquena-hermes 03bb364917 Stage 321: PR #1898+#1904 — profile-home in agent cache signature + functional regression test (closes #1897) 2026-05-08 16:08:18 +00:00
nesquena-hermes f456daa574 fix(streaming): include profile home in agent cache signature (#1897)
Same-session profile switches reused cached AIAgent from previous profile,
silently leaking the old persona's SOUL.md / system prompt into the new
profile's turns. session_id stays stable across profile switches, and the
signature didn't include the active profile home, so every signature input
matched and the stale agent was returned from SESSION_AGENT_CACHE.

Append _profile_home to the signature blob so profile switches force a
cache miss and a fresh agent build under the new HERMES_HOME (which
triggers a fresh load_soul_md() call).

Tests: 3 source-string regressions guarding the signature contract,
ordering, and empty-home fallback.

Closes #1897

Co-authored-by: Hermes Agent <agent@hermes.local>
2026-05-08 16:08:18 +00:00
nesquena-hermes 681456fc11 Stage 321: PR #1903 — scope skills endpoints to active profile by @Michaelyklam 2026-05-08 16:07:49 +00:00
Michael Lam 6c4b769324 fix: scope skills endpoints to active profile 2026-05-08 16:07:49 +00:00
Michael Lam 4366daba24 fix: use root home for gateway health status 2026-05-08 16:07:48 +00:00
nesquena-hermes 72b077ecce Stage 320: PR #1889 — deduplicate workspace-prefixed user turns by @ai-ag2026 2026-05-08 15:48:28 +00:00
ai-ag2026 f6d09e06ca fix: deduplicate workspace-prefixed user turns 2026-05-08 15:37:10 +00:00
nesquena-hermes 518453545c Stage 320: PR #1865 — interim_assistant streaming in runtime + live UI by @franksong2702 2026-05-08 15:37:09 +00:00
nesquena-hermes 035c537281 Stage 320: PR #1861 — overwrite session usage per turn by @franksong2702 2026-05-08 15:37:09 +00:00
Frank Song c1a9d7ce79 fix: overwrite session usage per turn 2026-05-08 15:37:09 +00:00
Frank Song 82c7367cef Add interim_assistant streaming path to WebUI 2026-05-08 15:37:09 +00:00
nesquena-hermes 0039ae8c64 Stage 320: PR #1877 — honor configured max_turns in WebUI agents by @Michaelyklam 2026-05-08 15:37:08 +00:00
nesquena-hermes f2194f13cd Stage 320: PR #1860 — request wedge diagnostics by @franksong2702 2026-05-08 15:37:08 +00:00
Michael Lam 01b9c82dc9 fix: honor configured max_turns in WebUI agents
Read agent.max_turns when constructing streaming WebUI AIAgent instances, pass it as max_iterations when supported, and include it in the per-session agent cache signature so budget changes take effect.

Add regression coverage for the config read, constructor kwarg, and cache key.
2026-05-08 15:37:08 +00:00
Frank Song 7e2709e281 fix: add request wedge diagnostics 2026-05-08 15:37:08 +00:00
Frank Song 6808e06083 fix: isolate profile quota usage probes 2026-05-08 15:37:07 +00:00
nesquena-hermes a11cbd3ee9 Stage 319: PR #1862 — preserve local custom provider model ids by @franksong2702 2026-05-08 15:16:18 +00:00
Frank Song 414c474d97 fix: preserve local custom provider model ids 2026-05-08 15:16:18 +00:00
nesquena-hermes 1105d496e9 Stage 319: PR #1887 — cross-container gateway liveness via state-file freshness fallback by @Sanjays2402 2026-05-08 15:15:50 +00:00
Sanjay Santhanam efcfff3d7f fix(#1879): cross-container gateway liveness via state-file freshness
The dashboard banner 'Hermes agent is not responding' fires on every
multi-container deployment that doesn't set 'pid: "service:hermes-agent"'
in compose, because get_running_pid() relies on fcntl.flock and
os.kill(pid, 0) — both PID-namespace-scoped and invisible across container
boundaries.

Fix: when get_running_pid() returns None, fall back to a freshness check on
gateway_state.json. The gateway already writes that file on every tick with
gateway_state == 'running' and an aware ISO-8601 updated_at timestamp, so a
recent (<= 120s) timestamp is an equivalent live-process signal that needs
only a shared volume — no PID namespace, no compose workaround, no extra
HTTP probe URL.

Behavior preserved:
- In-namespace deployments still hit the PID-based path first; payload shape
  unchanged (no 'reason' key) so #716 contract holds.
- Cross-container alive path adds reason='cross_container_freshness' so
  support diagnostics can tell which signal succeeded.
- Stale updated_at, non-running gateway_state, malformed/naive/missing
  timestamps, and timestamps far in the future all still report 'down' — the
  fallback never produces a false positive.
- Same redaction rules: argv/command/executable/env/raw pid never leak.

Tests: 15 new cases in test_issue1879_cross_container_gateway_liveness.py
covering the cross-container alive path, every refusal case, clock-skew
tolerance, and backward compat with the #716 PID path. Existing #716
heartbeat tests (8) continue to pass.
2026-05-08 15:15:50 +00:00
Sanjay Santhanam a958c29373 fix(config): phantom Custom group when active provider is ai-gateway (#1881)
Two bugs in get_available_models() conspired to duplicate the active
provider's auto-detected models under a phantom 'Custom' group whenever
custom_providers was also declared in config.yaml:

1. custom:* PIDs not in _named_custom_groups (e.g. stale slugs left from
   prior configs) fell through to the auto_detected_models fallback, copying
   the active provider's whole catalog into a phantom Custom: <slug> group.
   Fix: continue unconditionally for ANY custom:* PID — the named-group
   branch is the only legitimate population path.

2. The bare 'custom' PID, with the active provider being concrete (e.g.
   ai-gateway), hit 'elif auto_detected_models: copy.deepcopy(...)' and
   built a duplicate Custom group of the active provider's models with
   mismatched provider prefixes. Fix: when pid == 'custom' and the active
   provider is non-custom, leave models_for_group empty.

The reporter also suggested a third fix gating resolve_model_provider() on
config_provider — that's intentionally NOT applied because it conflicts with
the long-standing model-specific-override semantics covered by
test_model_resolver.py::test_custom_provider_*_routes_to_named_custom_provider
(custom_providers entries explicitly override the active provider's routing
when the user opted-in). The reporter's symptom (duplicate UI group) lives
entirely in get_available_models()'s group construction and is fully fixed
by the two changes above.

Tests: 6 new regression tests (3 in #1881 file + reuse), 774 broader
tests still green (model/provider/custom/config domain).
2026-05-08 15:15:49 +00:00
nesquena-hermes 0dcce8e434 Stage 318: PR #1859 — fix: persist generated title refresh marker by @ai-ag2026 2026-05-08 15:01:48 +00:00
ai-ag2026 755c18bdf9 fix: persist generated title refresh marker 2026-05-08 01:36:10 +02:00
ai-ag2026 f69a81c8c3 fix: preserve pending turn during stale cleanup 2026-05-07 23:57:01 +02:00
ChaseFlorell d8612ba323 fix: add cdn.jsdelivr.net to CSP connect-src to allow xterm source map fetches
Closes #1850

Co-authored-by: Chase Florell <ChaseFlorell@users.noreply.github.com>
2026-05-07 20:42:55 +00:00
hermes-agent 5f6a55185c stage-315 absorb: document handle_kanban_* three-valued return contract
Per Opus pre-release verdict on PR #1843: the four handle_kanban_*
entry points declare '-> bool' but actually return True | None | False
(after PR #1843 made the False-vs-None distinction load-bearing for
the caller's '_kanban_unknown_endpoint' decision). Update the type
annotations to 'bool | None' and add a docstring on handle_kanban_get
(with cross-references on the three siblings) so a future contributor
adding a new return path doesn't accidentally produce a 0/'' value
that would silently revert the double-404 fix.

Test-only verification: kanban tests pass (49/49). Production behavior
unchanged. Cheap defensive cleanup per Nathan's standing absorb-in-release
default for ≤20-LOC documentation/type-annotation fixes.
2026-05-07 18:52:01 +00:00
Michael Lam 78c09e1fd9 fix: keep shell route errors html 2026-05-07 18:41:13 +00:00
nesquena-hermes 740e5412a5 Stage 315: PR #1838 — show auto-compression running state by @Michaelyklam 2026-05-07 18:41:13 +00:00
Michael Lam e31b7e72d6 fix: show auto-compression running state 2026-05-07 18:41:13 +00:00
Nathan Esquenazi f3b56d8793 fix(kanban): avoid double 404 when bridge already sent error response
PR #1837's new `_kanban_unknown_endpoint` wrapper was triggered for any
falsy bridge return — but `handle_kanban_*` returns `None` (not `True`)
when an inner handler calls `bad(...)` to send an error response. The
wrapper then sent a SECOND 404 on top of the bridge's response, producing
concatenated JSON bodies on the wire.

Concrete reproducer (caught by behavioural harness, not the merged tests):

    GET /api/kanban/tasks/<missing-id>/log
    →  '{"error":"task not found"}{"error":"unknown Kanban endpoint: GET ..."}'

This affected every `bad(...)`-shaped error path in the bridge:
- task-not-found returns from `_task_log_payload` / `_task_detail_payload`
- exception handlers for ImportError (503), LookupError (404),
  ValueError (400), RuntimeError (409) across all four method handlers
- the `_handle_events_sse_stream` board-resolution failure path

The fix: distinguish an explicit `False` (truly unmatched path) from
`None` (handled, response already sent). Only `False` should trigger
the unknown-endpoint diagnostic.

Adds a regression test that exercises the task-not-found path through
`routes.handle_get` and asserts only one JSON body is on the wire.

Follow-on to #1837 (already merged into master at v0.51.20).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:35:57 -07:00
hermes-agent 0ed63968b6 Stage 314: PR #1827 — sync Codex provider card models with picker by @Michaelyklam
Note: PR #1827 was branched before v0.51.19 shipped #1812, which
introduced an initial (pure live-fetch) Codex provider card hook in
api/providers.py at the same line range. The contributor's PR was
filed AFTER #1812 shipped but their diff didn't yet account for it.
Stage 314 absorbs the contributor's intent (visible Codex cache
merge for gpt-5.3-codex-spark visibility) by replacing the v0.51.19
hook with the richer merged version directly in stage. Production
code change ≡ what the contributor's PR would have produced if
rebased onto current master. Test file + pr-media adopted verbatim.
Marker commit so the stage log makes the absorption visible.
2026-05-07 17:58:52 +00:00
Michael Lam bb75707331 fix: surface stale Kanban client recovery 2026-05-07 17:57:09 +00:00
hermes-agent 1f702c7569 stage-313 absorb: gate _resolve_configured_provider_id alias resolution + harden bootstrap test isolation
Two in-stage fixes for v0.51.19 batch:

1) api/config.py — add resolve_alias=False param to
   _resolve_configured_provider_id() and pass it from
   resolve_model_provider(). The PR #1818 swap from
   _resolve_provider_alias() to _resolve_configured_provider_id()
   was correct for active-provider/badge surfaces but broke #1625's
   local-server-provider literal-preservation contract: 'ollama' →
   'custom' and 'lm-studio' → 'lmstudio' alias-collapse caused
   _LOCAL_SERVER_PROVIDERS membership check to miss, breaking the
   model-id full-path preservation for LM Studio/Ollama. The new
   flag preserves the raw provider value when called from
   resolve_model_provider, and named-custom-slug + base-url
   fallback both still run unchanged.

2) tests/test_bootstrap_discover_agent.py — pin Path.home() in
   _isolate_discover_agent_dir so the hard-coded
   'Path.home() / .hermes / hermes-agent' / 'Path.home() /
   hermes-agent' candidates in discover_agent_dir() can't pick up
   the dev machine's real install. The original PR #1817 isolation
   helper covered HERMES_HOME, HERMES_WEBUI_AGENT_DIR, and
   REPO_ROOT but missed the Path.home() leak.

Both surfaced on full pytest pre-release gate, fixed in stage,
ship in v0.51.19. Tests: full suite green.
2026-05-07 17:07:48 +00:00
Frank Song 8bc2677691 fix: repair file picker and html preview interactions 2026-05-07 16:59:00 +00:00
ai-ag2026 ae22a80238 fix: hide workspace metadata in user bubbles 2026-05-07 16:58:59 +00:00
ai-ag2026 7d5704c3bc fix: keep cross-surface session continuations visible 2026-05-07 16:58:39 +00:00
nesquena-hermes 5e01b00b8b Stage 313: PR #1809 — dedupe workspace-prefixed user turns after compaction by @ai-ag2026 2026-05-07 16:58:16 +00:00
ai-ag2026 256866ace6 fix: dedupe workspace-prefixed user turns after compaction 2026-05-07 16:58:16 +00:00
Frank Song f7902776d4 fix: use live Codex models in providers card 2026-05-07 16:58:16 +00:00
nesquena-hermes db7b72596e Stage 313: PR #1805 — provider account quota cards by @franksong2702 2026-05-07 16:58:15 +00:00
nesquena-hermes 6ab384618a Stage 313: PR #1818 — named custom provider routing by @franksong2702 2026-05-07 16:56:49 +00:00
Michael Lam 1192a0a766 fix: preserve inaccessible workspace entries 2026-05-07 16:56:48 +00:00
Frank Song 3ac89c2696 fix: route named custom provider model selections 2026-05-07 21:40:23 +08:00
Frank Song a6b88c8c1e feat: show account limits in provider quota 2026-05-07 17:36:04 +08:00
Michael Lam 048f1fa24e fix: keep assistant-only stream deltas on current turn 2026-05-07 06:25:16 +00:00
Sanjay Santhanam 064d14c85b fix(config): custom provider + :free/:beta/:thinking suffix mis-resolution (#1776)
PR #1762 fixed the rsplit grammar collision for plain @openrouter:model:free
qualifiers, but skipped the fallback whenever the provider hint started with
'custom:' on the assumption that custom providers route directly. That left
'@custom:my-key:some-model:free' broken: rsplit yields
provider='custom:my-key:some-model', bare='free' → custom guard skips the
split-fallback → returns provider='custom:my-key:some-model', model='free'.

Detect the over-split structurally instead of using a known-suffix allowlist:
custom hints carry exactly one segment after 'custom:' (constructed at
api/config.py:1363 as 'custom:' + entry_name). So any rsplit result of
'custom:<a>:<b>' with bare model '<c>' has eaten one model segment — peel
it back with a second rsplit and prepend it to the bare model.

This is robust for :free / :beta / :thinking / :preview / any future
OpenRouter suffix without an allowlist to maintain.

Adds 5 regression tests covering the matrix (free/beta/thinking/preview/
slashed-model). All 7 existing #1744 tests still pass; #1228 tests
unaffected.

Co-authored-by: Cake <51058514+Sanjays2402@users.noreply.github.com>
2026-05-07 06:25:16 +00:00
fxd-jason a80b7695d8 fix(kanban): update stale read-only docstring + board_exists early-out in board counts
The bridge module docstring still described the API as 'deliberately
read-only' but it now exposes full CRUD (tasks, boards, comments,
links, SSE). Updated to list the supported operations.

For _board_counts_for_slug (the hot path for the board-switcher badge),
added a board_exists() early-out that mirrors the agent's own helper
in plugin_api.py (path.exists() before connect()). This avoids a
redundant init_db()+connect() schema pass per board per list refresh.
connect() already handles auto-init for fresh databases via its
needs_init check, so the extra init_db was unnecessary overhead on
the hot path that scales linearly with board count.

Tests:
- test_board_counts_returns_empty_for_nonexistent_board: verifies the
  early-out (no connect() call, returns {})
- test_board_counts_returns_real_counts_for_populated_board: verifies
  actual per-status counts are returned for existing boards
2026-05-07 03:58:16 +00:00