Commit Graph

435 Commits

Author SHA1 Message Date
Hermes Bot 166f439eeb fix: correct issue references #1557#1558 (nesquena review feedback)
The PR title and body correctly say 'Closes #1558' but every code comment,
the test file name, error-message strings, docstrings, and the original
commit body referenced #1557 instead. Independent reviewer flagged this:

> The 17 wrong references won't auto-close issue #1558 from the commit
> message — and the test file name will be misleading for future archeology.
> Worth a one-pass s/#1557/#1558/g (and rename test file →
> test_metadata_save_wipe_1558.py) before merge so the artifacts agree
> with reality.

This commit:
- Renames tests/test_metadata_save_wipe_1557.py → test_metadata_save_wipe_1558.py
- Replaces 17 #1557 references with #1558 across:
  - tests/test_metadata_save_wipe_1558.py (7 refs)
  - api/models.py (5 refs in Session.save guard + backup safeguard comments)
  - api/routes.py (2 refs in _clear_stale_stream_state docstring + log)
  - api/session_recovery.py (3 refs)
  - server.py (3 refs in startup self-heal block)

Verified: 6/6 tests in tests/test_metadata_save_wipe_1558.py pass
with the renamed file + updated references.
2026-05-03 19:55:14 +00:00
nesquena-hermes 1d9a0cbba1 fix(P0 #1557): metadata-only Session.save() was wiping conversation history
v0.50.279 introduced api.routes._clear_stale_stream_state() (#1525) which
calls session.save() to clear stale active_stream_id/pending_* fields. The
helper is called from /api/session and /api/session/status — both of which
load the session with metadata_only=True. Session.load_metadata_only()
synthesizes a stub with messages=[] (its whole purpose: fast metadata read
without parsing the 400KB+ messages array). Session.save() unconditionally
writes self.messages to disk via os.replace(), so saving a metadata-only
stub atomically overwrites the on-disk JSON with messages=[], wiping the
entire conversation.

Production trigger: every SSE reconnect cycle after a server restart polls
/api/session/status, which fans out to _clear_stale_stream_state, which
saves the metadata-only stub. The user reported losing 1000+ message
conversations and seeing 'Reconnecting…' loops on every prompt — the
reconnect loop kept the cycle running until the conversation was empty.

Fix: three layers, defense in depth.

(1) api/models.py: load_metadata_only() now sets _loaded_metadata_only=True
    on the returned stub. Session.save() raises RuntimeError if that flag
    is set — a hard guard so any future caller making the same mistake
    cannot wipe data, only crash visibly.

(2) api/routes.py: _clear_stale_stream_state() now detects the metadata-only
    flag and re-loads the full session with metadata_only=False before
    mutating persisted state. The full-load path also runs
    _repair_stale_pending() which independently clears the stream flags,
    so the explicit clear becomes a no-op in most cases — but messages
    stay intact.

(3) api/models.py + api/session_recovery.py: every save() that would
    SHRINK the messages array (the precise failure shape of #1557) first
    snapshots the previous file to <sid>.json.bak. Server.py runs
    recover_all_sessions_on_startup() at boot — any session whose live
    JSON has fewer messages than its .bak is restored automatically.
    Idempotent on clean state. Backup overhead is zero on the normal
    grow-the-conversation path.

Reproducer (master): test_metadata_only_save_does_not_wipe_messages goes
from 1000 messages to 0 in a single save() call. After the fix, 1000
messages survive.

Tests: 6 new regression tests in tests/test_metadata_save_wipe_1557.py
covering all three layers. Full pytest: 4019 → 4025 (+6, all green).

Live verified on port 8789: write 1000-msg session with stale active_stream_id,
hit /api/session/status, /api/session — file ends with 1002 messages
(_repair_stale_pending injects an error-marker pair on full reload, harmless
existing behavior), active_stream_id cleared, pending cleared, no Reconnecting
loop.

Closes #1557.

Reported by AvidFuturist via user feedback on v0.50.282.
2026-05-03 19:45:10 +00:00
Hermes Bot c73a5eb384 Stage 283: PR #1553 — silent credential self-heal on 401 (#1401) by @bergeouss 2026-05-03 19:19:02 +00:00
Hermes Bot e4e53f9ef4 Stage 283: PR #1552 — Gateway status card in Settings (#1457) by @bergeouss 2026-05-03 19:19:02 +00:00
Hermes Bot fd6e409021 Stage 283: PR #1551 — Reveal in File Manager workspace context menu (#1424) by @bergeouss 2026-05-03 19:19:02 +00:00
Hermes Bot cee61fb1d9 Stage 283: PR #1550 — auto-assign session to filtered project (#1468) by @bergeouss 2026-05-03 19:19:02 +00:00
Hermes Bot 4daa09da7f Stage 283: PR #1549 — What's new? link in update banner (#1512) by @bergeouss 2026-05-03 19:19:02 +00:00
Hermes Bot 16c53e5bcf Stage 283: PR #1548 (augmented) — OpenRouter free-tier live fetch (#1426) by @bergeouss 2026-05-03 19:19:02 +00:00
Hermes Bot 9a7728f06b Stage 283: PR #1543 — recover pending turn after stale stream restart by @ai-ag2026 (follow-up to #1471) 2026-05-03 19:19:01 +00:00
Hermes Bot babca37ea6 Stage 283: PR #1545 — remove phantom /sw.js from PUBLIC_PATHS (#1481) by @bergeouss 2026-05-03 19:19:01 +00:00
Hermes Bot 0750da5b37 fix(models): structural OpenRouter free-tier visibility — live fetch + augment fallback (#1426)
Augments @bergeouss's PR #1548 v2 with the structural fix the issue
actually requested. The original PR added 5 hardcoded entries to
_FALLBACK_MODELS which would rot fast as OpenRouter's free-tier roster
turns over monthly.

Adds proper live-fetch logic to the OpenRouter group population so the
free-tier list stays fresh without requiring a code release every time
a new free model lands.

api/config.py:2120 — replaces the static _FALLBACK_MODELS slice with:

  1. Live curated catalog via hermes_cli.models.fetch_openrouter_models()
     — applies the tool-support filter (Kilo-Org/kilocode#9068).
  2. Free-tier live fetch — direct call to https://openrouter.ai/api/v1/models,
     filtered to free-tier-only (pricing.prompt == 0 AND pricing.completion
     == 0, OR :free suffix), bypasses the tool-support filter so newly-added
     free variants appear even before OpenRouter annotates them with tools.
     Capped at 30 entries to keep the picker usable.
  3. Defense-in-depth fallback to _FALLBACK_MODELS (which retains
     @bergeouss's hardcoded list for offline / test envs).
  4. Deduplication via seen_ids — model in both surfaces appears once.

5 new tests + 1 fixed test in tests/test_minimax_provider.py (scoped the
provider='MiniMax' assertion to direct-MiniMax routes by filtering for
'minimax/' prefix and excluding ':free' since the OpenRouter free-tier
variant minimax/minimax-m2.5:free correctly carries provider='OpenRouter').

Co-authored-by: bergeouss <[email protected]>
2026-05-03 19:18:44 +00:00
bergeouss 1c5bce92cb feat: add gateway status card to Settings → System (#1457) 2026-05-03 19:02:17 +00:00
bergeouss a085b71511 feat: add Reveal in File Manager to workspace file context menu (#1424) 2026-05-03 19:02:16 +00:00
bergeouss 0fbaafa110 feat: auto-assign project when filtering by project on new session (#1468) 2026-05-03 19:02:15 +00:00
bergeouss c94f9c70ce feat: add 'What's new?' link to update banner (#1512) 2026-05-03 19:02:14 +00:00
bergeouss f60db40133 fix: include OpenRouter free-tier models in fallback list (#1426) 2026-05-03 19:02:13 +00:00
bergeouss 8fe593fa38 feat: silent credential self-heal on 401 errors (#1401) 2026-05-03 18:32:53 +00:00
bergeouss 237010f8bd fix: remove phantom /sw.js from PUBLIC_PATHS whitelist (#1481) 2026-05-03 18:18:14 +00:00
nesquena-hermes c21e3086a2 docs: align _format_nous_label docstring examples with actual output
Per review observation on PR #1544: the docstring claimed
'Gemini 3.1 Pro Preview' and 'Nemotron 3 Super 120B A12B' but the
helper reuses _format_ollama_label's 3-letter-token rule, which
uppercases 'PRO' (and the existing rule for tokens like 'a12b'
renders 'A12b' not 'A12B'). Update the examples to match actual
behavior — labels are unchanged, only the docstring.

Pure-comment change, no behavioral effect. Test counts unchanged
(4013 passed).
2026-05-03 18:12:01 +00:00
nesquena-hermes bff8cb2b58 fix: Nous Portal full live catalog + dropdown cache invalidation on provider remove
Closes #1538, #1539. Two related dropdown-staleness bugs reported by Deor
(Discord, May 03 2026).

#1538 — Nous Portal picker showed only 4 hardcoded models
=========================================================
The Settings → Default Model picker, the composer model dropdown, the
/model slash command, and the Settings → Providers card all showed only
four Nous models (Claude Opus 4.6, Claude Sonnet 4.6, GPT-5.4 Mini, Gemini
3.1 Pro Preview) because `_PROVIDER_MODELS["nous"]` had four hardcoded
entries and `_build_available_models_uncached()` fell through to the
generic `pid in _PROVIDER_MODELS` branch.

The actual Nous Portal catalog has 30 models live — Claude Opus 4.7, GPT-5.5,
Kimi K2.6, MiniMax M2.7, Gemini 3.1 Pro/Flash, several Xiaomi/Tencent/StepFun
entries, and more.

Fix:
- New `_format_nous_label()` helper in `api/config.py` — reuses the
  `_format_ollama_label()` token rules, drops the vendor namespace, and
  appends ` (via Nous)` so labels disambiguate from same-named direct-
  provider entries (e.g. "Claude Opus 4.7" via direct Anthropic).
- New `elif pid == "nous":` branch in `_build_available_models_uncached()`
  mirroring the Ollama Cloud pattern: live-fetch through
  `hermes_cli.models.provider_model_ids("nous")`, prefix every id with
  `@nous:` (matches the existing routing convention from PR-era #854 and
  pinned in tests/test_nous_portal_routing.py), fall back to the curated
  4-entry static list when hermes_cli is unavailable.
- Same fix applied to `api/providers.py:get_providers()` — that's the
  separate code path that builds Settings → Providers card models, and
  it had the identical bug shape.

#1539 — Removed provider lingered in dropdowns until restart
============================================================
After Settings → Providers → Remove, the provider's models still appeared
in every model dropdown until the page was reloaded. The server-side
TTL cache was correctly flushed (`set_provider_key()` calls
`invalidate_models_cache()` on both add and remove) but JS-side caches
were never dropped:

- `_slashModelCache` / `_slashModelCachePromise` (commands.js) — feeds
  the `/model` slash-command suggestions.
- `_dynamicModelLabels` / `window._configuredModelBadges` (ui.js) —
  populated by `populateModelDropdown()` on app boot and profile switch.

Pre-fix, `_removeProviderKey()` only called `loadProvidersPanel()`
which refreshed the providers card list but never asked any consumer
to re-fetch /api/models.

Fix:
- `static/commands.js`: new `_invalidateSlashModelCache()` helper that
  nulls both cache slots, exposed on `window` (typeof-guarded so the
  module remains importable in headless vm contexts — needed by the
  existing tests/test_cli_only_slash_commands.py harness).
- `static/panels.js`: new `_refreshModelDropdownsAfterProviderChange()`
  helper that calls the invalidator + `populateModelDropdown()`, wrapped
  in try/catch so the providers panel update never breaks if a
  downstream module hasn't loaded yet. Both `_saveProviderKey` and
  `_removeProviderKey` invoke it (defense-in-depth: same staleness shape
  applies to the add path too).

Tests
-----
- `tests/test_issue1538_nous_live_catalog.py` (12 tests): live-fetch
  surfaces ≥20 entries, every id starts with `@nous:`, every label ends
  with ` (via Nous)`, recent flagships (Opus 4.7, GPT-5.5, Kimi K2.6,
  Gemini 3.1 Pro, MiniMax M2.7) reach the dropdown, static fallback
  works when hermes_cli raises, label formatter unit tests (vendor
  namespace stripping, variant rendering, MiniMax mixed-case), the
  curated static list and its routing invariants are preserved.
- `tests/test_issue1539_provider_removal_dropdown_invalidation.py`
  (11 tests): invalidator helper exists and clears both cache slots,
  exposed on window with typeof guard, both save and remove paths
  invoke the dropdown flush, helper calls both invalidator and
  populateModelDropdown, helper is resilient to missing modules,
  helper does not block panel refresh, server-side
  `set_provider_key → invalidate_models_cache` invariant pinned.

Verified live on port 8789: `/api/models` Nous group returns 30
models (was 4); browser `document.getElementById('modelSelect')`
exposes 30 options under the "Nous Portal" group; the dropdown-flush
helper is callable from the browser and round-trip rebuild keeps the
dropdown at 30 options.

Test counts:
- Full pytest: 4013 passed, 2 skipped, 3 xpassed, 0 failures
  (was 3990 → 4013, +23 from this PR).
- QA harness pytest: 20 passed.
- Browser API sanity: 11/11 passed.
- Agent Browser CDP: 21/23 passed (the 2 SSE liveness failures
  reproduce on master and are unrelated to this PR).
2026-05-03 18:12:01 +00:00
Manfred afaeb03532 fix: recover pending turn after stale stream restart 2026-05-03 20:00:56 +02:00
Dutch AI Agency e4d2704ce8 fix: resolve local models from configured base url 2026-05-03 17:04:46 +00:00
Hermes Bot 0cbada7228 Stage 280: PR #1404 — cross-channel messaging handoff (Frank Song, rebased onto master) 2026-05-03 16:51:34 +00:00
Frank Song c7e52084ba Harden messaging channel handoff 2026-05-03 16:35:50 +00:00
Frank Song 20ef643bb8 Add messaging session handoff summary 2026-05-03 16:35:22 +00:00
nesquena df0d904d87 fix(streaming): pass agent.reasoning_effort into WebUI agents (salvages #1531)
Spliced from #1531 by @Asunfly: take Change-1 only (the actual bug fix +
cache signature inclusion) and skip Change-2 (auxiliary title-route
extra_body change) which is a separate scope concern.

## What

Two surgical fixes in api/streaming.py:

1. Line 1820 — `_cfg.cfg.get(...)` → `_cfg.get(...)`. `get_config()` returns
   a plain dict (not a wrapper exposing `.cfg`).  The buggy line raised
   AttributeError that the surrounding try/except swallowed, so
   `_reasoning_config` was always None regardless of what `/reasoning
   <level>` had been set to.  Verified locally — `api/streaming.py:1959`
   already correctly used `_cfg.get(...)` in the same function, so the
   same `_cfg` was being read two different ways in one file.

2. Line 1888 — added `_reasoning_config or {}` to `_sig_blob`.  Without
   this, switching effort mid-session would fail to take effect because
   the per-session agent cache key would still match the old entry.
   Mirrors how `resolved_provider` / `resolved_base_url` already
   participate in the signature.

## Why splice instead of merge #1531 directly

@Asunfly force-pushed a Change-2 onto #1531 after the original review
that removes `extra_body={"reasoning": {"enabled": False}}` from
`generate_title_raw_via_aux` (the auxiliary title-generation route).
That intent is reasonable (let operator-configured `extra_body.reasoning`
flow through to the title route) but it touches a different surface and
deserves its own PR.

The narrow concern is operators who selected a reasoning-capable
auxiliary title model without explicitly setting
`reasoning.enabled=False` in the task config — pre-Change-2 the WebUI
defended against accidental reasoning on the title hot path; post-Change-2
those configs would reason on every new conversation`s title, with cost
and latency implications.

## What is NOT in this PR

- The `generate_title_raw_via_aux` extra_body refactor (Change-2 from #1531).
- The `test_does_not_override_configured_reasoning_extra_body` test (guards
  Change-2). Asunfly can re-open that as its own focused PR.

## Tests

Two new R17b/R17c regression assertions in tests/test_regressions.py:

- `test_streaming_reads_reasoning_effort_from_config_dict` — static-source
  guard: `_cfg.cfg` must not return to streaming.py
- `test_streaming_agent_cache_signature_includes_reasoning_config` —
  catches removal of `_reasoning_config` from `_sig_blob`

## Closes

- Closes #1531 (the Change-1 portion ships here; Asunfly can re-open
  Change-2 as a separate PR if desired)

Co-authored-by: Asunfly <[email protected]>
2026-05-03 16:34:25 +00:00
Hermes Bot a5e6b9dc8b Merge PR #1526 by @ai-ag2026: pass WebUI max_tokens into agent + classify OpenRouter quota phrases (refs #1524) 2026-05-03 16:06:55 +00:00
Hermes Bot 1148656370 Merge PR #1525 by @ai-ag2026: clear stale WebUI stream state proactively (refs #1471)
Merge conflict resolution: kept HEAD's `CACHE_NAME = 'hermes-shell-__WEBUI_VERSION__'` (post-#1517 rename) over PR #1525's `'hermes-shell-__CACHE_VERSION__-stale-stream-cleanup1'` manual suffix. The renamed placeholder still auto-bumps with each release through the `quote(WEBUI_VERSION, safe="")` substitution, so the manual `-stale-stream-cleanup1` suffix is no longer needed to force-update existing service workers — the natural version bump (v0.50.278 → v0.50.279) already invalidates the old cache via `caches.delete(k)` for `k !== CACHE_NAME` in the SW activate handler. No behavioral regression: the SW cache still bumps on this release, just via the canonical version-token path.

Co-authored-by: ai-ag2026 <ai-ag2026@users.noreply.github.com>
2026-05-03 16:06:42 +00:00
Hermes Bot 437eae00be Merge PR #1532 by @ai-ag2026: recover WebUI-origin state.db sessions when JSON sidecar missing (refs #1471) 2026-05-03 16:06:04 +00:00
Manfred 9c0a16fdd6 fix: recover WebUI-origin state.db sessions 2026-05-03 15:41:56 +02:00
Manfred dbb0879956 fix: pass WebUI max_tokens to agents
Read configured max_tokens from config.yaml, pass it into WebUI-created AIAgent instances when supported, and include it in the agent cache signature. Also classify OpenRouter quota phrasing such as more credits, can only afford, and fewer max_tokens.

Adds regression coverage for max_tokens propagation, cache signature isolation, and quota error classification.
2026-05-03 11:46:42 +02:00
Manfred 6bce34c27e fix: clear stale WebUI stream state
Clear persisted active_stream_id and pending runtime fields when the server no longer has the referenced live stream. Also drop browser-side INFLIGHT state when the server reports a session idle and bump the service-worker cache so the frontend fix is delivered.

Adds regression coverage for backend stale-stream cleanup, frontend inflight invalidation, and cache busting.
2026-05-03 11:46:42 +02:00
Frank Song 8f3dbe185d fix: consolidate __CACHE_VERSION__ → __WEBUI_VERSION__ (#1509)
__CACHE_VERSION__ (sw.js) and __WEBUI_VERSION__ (index.html) are
functionally identical — both resolve to quote(WEBUI_VERSION, safe='')
at request time. Two names exist for historical reasons (different files
added at different times).

Rename __CACHE_VERSION__ → __WEBUI_VERSION__ in:
- static/sw.js (CACHE_NAME + VQ constant + comment)
- api/routes.py (substitution string)
- tests/test_pwa_manifest_sw.py (all assertions)

Single canonical name. No behavior change — same ?v=vX.Y.Z query strings
on the same URLs.
2026-05-03 14:59:37 +08:00
Hermes Bot 6381ab1b8a fix(model-picker): deepcopy auto_detected_models per group to stop dedup bleed-across (#1511 root cause)
Supersedes contributor PR #1511 (lost9999), which removed the label-suffix
logic in _deduplicate_model_ids() but left the underlying shared-reference
bug intact — IDs would still be silently corrupted across provider groups,
just with cleaner-looking labels.

## Bug shape

When multiple unconfigured providers (Ollama / HuggingFace / custom
endpoints / Google Gemini CLI / Xiaomi / etc.) all fell through to the
'else' branch in api/config.py:get_models_grouped() that ends with:

    groups.append({..., "models": auto_detected_models})

every group ended up sharing the SAME list reference AND the SAME dicts
inside. When _deduplicate_model_ids() then mutated those dicts to add
@provider_id: prefixes and provider-name parentheticals, the changes were
applied to every group that referenced the same dict.

Visible symptom: user 'vishnu' reported the dropdown showing
'Deepseek V4 Flash (Xiaomi) (Ollama) (HuggingFace) (Google-Gemini-Cli)'
on every group. Hidden symptom (worse): the 'id' field collapsed to
'@xiaomi:deepseek-v4-flash' on every group too, so clicking the entry
under any group routed the request to Xiaomi.

## Fix

api/config.py:2078 — wrap auto_detected_models in copy.deepcopy() at the
groups.append site so each group gets its own independent dicts. The
existing _deduplicate_model_ids() logic is correct and unchanged; the
bug was in the assignment site, not the dedup function.

The single-parenthetical disambiguation in labels is retained because
the composer chip (composer-model-label) shows the model label without
the optgroup header context — 'Deepseek V4 Flash (Ollama)' is more
useful than ambiguous 'Deepseek V4 Flash' there.

## Tests

tests/test_issue1511_dedup_shared_reference.py — 3 new tests:
- test_groups_have_independent_model_lists: structural invariant pin
- test_unconfigured_providers_no_shared_dedup_bleed: end-to-end against
  the corrected code path; verifies each group gets its own @provider_id:
  prefix and exactly ONE provider parenthetical per disambiguated label
- test_shared_reference_pre_fix_demonstrates_corruption: documents the
  broken state that motivated the fix

Full suite: 3925 → 3928 passing (+3 new, 0 regressions).

Co-authored-by: lost9999 <56498264+lost9999@users.noreply.github.com>
2026-05-03 06:41:11 +00:00
Hermes Bot 8f58688b66 test: lock /session/static MIME-type + auth fix; drop unused import
- Add tests/test_session_static_assets.py (5 tests):
  * /session/static/style.css must return text/css (not text/html)
  * /session/static/ui.js must return application/javascript
  * /session/<id> still serves the HTML index (catch-all not weakened)
  * Path-traversal still sandboxed after prefix strip
  * /session/static/* matches /static/* auth-exemption policy
- Drop unused 'from urllib.parse import urlparse as _up' import from
  PR #1505's added block (parsed._replace already gives a usable result).

Co-authored-by: Rick Chew <rickchew@users.noreply.github.com>
2026-05-03 05:20:19 +00:00
Rick Chew 7cf2150b94 fix: serve static assets correctly under /session/* routes
When the browser loads a session page at /session/<id>, it requests
static assets relative to that path — e.g. /session/static/style.css.
The /session/* catch-all in handle_get() intercepted those requests and
returned the HTML index page (text/html), causing browsers to refuse the
stylesheet with a MIME-type mismatch error.

Two-part fix:
- routes.py: add a guard before the /session/ catch-all that strips the
  /session prefix from /session/static/* paths and delegates to
  _serve_static(), so the correct Content-Type is returned.
- auth.py: whitelist /session/static/* in check_auth() alongside
  /static/, so static assets on session pages are served without
  requiring an authenticated session (same policy as /static/).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-03 13:05:15 +08:00
Hermes Bot ba6f34488e fix(onboarding,probe): refuse HTTP redirects on probe path (reviewer-flagged on PR #1501)
SSRF defense-in-depth: `urllib.request.urlopen` follows redirects by default,
so a probe at `http://example.com/v1/models` could be redirected to
`http://internal-service:8080/admin` — surfacing internal HTTP services to
the authenticated user. The probe is already gated behind WebUI auth and the
local-network check, so the practical attack surface is 'authenticated user
enumerating internal services' (same as `curl` from their browser DevTools).
Tightening the redirect default is cheap insurance.

Implementation:

- New module-level `_NoRedirectHandler` (subclasses `urllib.request.HTTPRedirectHandler`,
  overrides `redirect_request` to return None — urllib then raises `HTTPError(3xx)`
  rather than following).
- New module-level `_PROBE_OPENER = urllib.request.build_opener(_NoRedirectHandler())`.
- `probe_provider_endpoint` switches from `urlopen(req, …)` to `_PROBE_OPENER.open(req, …)`.
- The existing `HTTPError` handler now categorizes 3xx as `unreachable` with a
  detail string mentioning 'redirect' so the user understands what happened.
  3xx does NOT get its own error code in `PROBE_ERROR_CODES` — the error
  taxonomy contract stays the same shape (frontend i18n unchanged).

Added regression test `test_probe_does_not_follow_redirects` in
`tests/test_issue1499_onboarding_probe.py`. Spins up a tiny HTTP server that
302-redirects `/v1/models` to `/different-endpoint` (which would return
`{'data': [{'id': 'should-not-see'}]}` if followed). Asserts the probe
returns `{ok: False, error: 'unreachable', status: 302, detail: …'redirect'…}`
and that the 'should-not-see' string never appears in the result.

Mutation-verified: reverting `_PROBE_OPENER.open` back to `urlopen` causes
the test to fail with "Probe followed a redirect — should have refused".

Suite delta: 3917 → 3918 passing (+1).

Reviewer-flagged in PR #1501. Per the
'reviewer-flagged-fix-in-release-not-followup' policy: <20 LOC defensive
fix, regression test path obvious, ship in this release rather than punting.
2026-05-03 03:21:22 +00:00
Hermes Bot 8f4692b8cf fix(onboarding): allow keyless setup for self-hosted providers (#1499 third sub-bug)
Pre-fix, the wizard rejected an empty api_key for every provider in
_SUPPORTED_PROVIDER_SETUPS — including lmstudio, ollama, and custom,
which run keyless on the vast majority of local installs. The agent's
LMSTUDIO_NOAUTH_PLACEHOLDER substitution at chat-time was the workaround
for the no-auth case, but the wizard side rejected the empty input first.
Users had to type random gibberish into the API key field to clear the
form — the third sub-bug from #1420 that the prior commit's PR description
explicitly punted to a follow-up.

Surfaced by Nathan during PR review: "I think it's too weird for users
to have to type a string into the API key field, right?"  Yes — and the
probe (#1499) makes the cleanest fix strictly better: we accept empty
keys, and the probe gives instant feedback ("Connected. 2 model(s)
available." for keyless servers, "401" for auth-required servers).

Backend changes
---------------

* `api/onboarding.py` — `_SUPPORTED_PROVIDER_SETUPS` gains
  `key_optional: True` for `lmstudio`, `ollama`, `custom`. Cloud
  providers (openrouter, anthropic, openai, gemini, deepseek, …)
  remain key_required.

* `apply_onboarding_setup` skips the "{env_var} is required" check
  when `key_optional` is set AND no key is supplied. No write to .env
  for the empty-key case (no `LM_API_KEY=*** placeholder lying in the
  user's .env`).

* `_status_from_runtime` reports `provider_ready=True` for key_optional
  providers based on `requires_base_url` alone, so the wizard doesn't
  refire on the next page load just because there's no api_key. Cloud
  providers still need a key for provider_ready=True.

* `_build_setup_catalog` exposes the `key_optional` flag to the frontend.

Frontend changes
----------------

* `static/onboarding.js` — new `_renderOnboardingApiKeyField()` helper.
  For key_optional providers:
    - Label: "API key (optional)"
    - Placeholder: "Leave blank for keyless servers"
    - Inline italic muted help: "Most LM Studio / Ollama / vLLM installs
      run keyless — leave this blank if your server doesn't require
      authentication. Use the Test connection button to verify."
  For cloud providers: unchanged (label "API key", standard placeholder,
  no help block).

* The api-key input also now triggers `_scheduleOnboardingProbe()` on
  oninput, so changing the key re-runs the probe — handles "the server
  rejected my empty key with 401, let me add one and retry."

* `static/i18n.js` — 3 new keys × 9 locales (canonical English in `en`,
  English fallback with `// TODO: translate` markers in the other 8).

* `static/style.css` — `.onboarding-api-key-help` rule for the muted
  italic helper paragraph.

Verified end-to-end on port 8789
--------------------------------

Spun up an isolated test server + a mock LM Studio at
`127.0.0.1:11234/v1/models`. Stepped through the wizard:

* Picked LM Studio → field label flipped to "API key (optional)",
  placeholder showed "Leave blank for keyless servers", help text
  rendered in italic muted gray below.
* Switched to Anthropic → label reverted to "API key", help text
  disappeared. Visual hierarchy correct.
* Left api_key blank, set base_url to the mock, clicked Test connection
  → green "Connected. 2 model(s) available." banner. Probe-discovered
  models populated the workspace-step dropdown.
* Continued through to the finish step. config.yaml written with
  provider/model/base_url. **`.env` does NOT exist** — no placeholder
  string written. `chat_ready: true`, `state: ready`.
* Vision tool confirmed the visual hierarchy: subtle italic help
  reads as documentation, prominent green banner pops as status.

Tests
-----

`tests/test_issue1499_keyless_onboarding.py` — 16 tests in 3 classes:

  TestKeyOptionalProviderSchema (5)
    - lmstudio / ollama / custom declare key_optional=True
    - openrouter / anthropic / openai do NOT (regression defense)
    - setup catalog exposes the flag

  TestKeylessOnboarding (6)
    - lmstudio / ollama / custom: empty api_key accepted, no .env write
    - openrouter / anthropic: empty api_key still rejected
    - lmstudio with explicit key still writes .env (regression defense)

  TestKeylessChatReady (5)
    - lmstudio / ollama: provider_ready=True with no key
    - custom: provider_ready=True with key+base_url, False without base_url
    - openrouter: provider_ready=False with no key (regression defense)
    - End-to-end get_onboarding_status reports chat_ready=True

Full suite: 3901 → 3917 passing (+16 from this commit; +22 cumulative
from the PR's earlier commit). 0 failures.

Closes #1499 (all three sub-bugs from #1420 now addressed)
2026-05-03 03:07:07 +00:00
Hermes Bot 8616033605 fix(onboarding,providers): probe LM Studio /models + align env var with agent CLI (#1499 #1500)
Addresses both #1499 (onboarding wizard never probes the configured base URL)
and #1500 (cross-tool env-var name divergence between webui and agent CLI).
Surfaced together because they're both LM-Studio onboarding bugs that pile
on top of each other — fixing only one leaves the broken UX.

#1499 — Onboarding wizard probes <base_url>/models before persisting

Pre-fix, `apply_onboarding_setup` accepted whatever `base_url` the user typed
without ever fetching `<base_url>/models`. @chwps's log timeline in #1420
showed the wizard finishing in 239ms with zero outbound HTTP — onboarding
silently persisted unreachable URLs and left users with empty model
dropdowns they had to populate by hand-editing config.yaml.

Backend:
* New `probe_provider_endpoint(provider, base_url, api_key, timeout=5.0)`
  in `api/onboarding.py`. Stdlib-only (urllib + socket — no httpx dep).
  Returns `{ok, models}` on success; `{ok: False, error: <code>, detail}`
  on failure with stable error codes the frontend can switch on:
  invalid_url, dns, connect_refused, timeout, http_4xx, http_5xx, parse,
  unreachable. 256 KB response cap and 5s timeout keep a hostile or mis-
  pointed endpoint from blocking the wizard.
* New `POST /api/onboarding/probe` route — thin JSON wrapper around the
  function above. Same local-network gate as `/api/onboarding/setup`
  because the body carries an `api_key` the user typed.
* The probe response is NEVER persisted. Only the user's typed selection
  ends up in config.yaml; the probed model list just populates the
  wizard's dropdown.
* SSRF: deliberately does NOT block private-IP ranges. The wizard is
  gated behind WebUI auth and the legitimate target IS a local LM Studio
  / Ollama / vLLM server. A "block private IPs" SSRF defense would make
  the feature useless for its primary use case.

Frontend:
* `static/onboarding.js`:
  - New `ONBOARDING.probe` state ({status, error, detail, models, probedKey}).
  - `_runOnboardingProbe()` — POSTs to /api/onboarding/probe, idempotent
    & cached on (provider, baseUrl, apiKey).
  - Debounced (400ms) on `oninput` of the base URL field.
  - Explicit "Test connection" button.
  - `nextOnboardingStep` blocks Continue at the setup step for any
    provider with `requires_base_url=True` until the probe succeeds.
    Same localized error renders inline.
* `static/i18n.js`: 13 new keys × 9 locales (canonical English in `en`,
  English fallback with `// TODO: translate` markers in the other 8 —
  same convention as v0.50.271 #1488 voice-buttons).
* `static/style.css`: probe banner + Test button styling (red-tinted
  error variant, green-tinted success variant, neutral probing state).

Verified via manual repro on port 8789:
* connect_refused → red banner, helpful "from Docker, try the host IP"
  hint, blocks Continue.
* DNS failure → red banner, "could not resolve host '...'", blocks Continue.
* Success against a mock /v1/models server → green banner, model dropdown
  populates from the probed list, Continue advances normally.

#1500 — webui env var aligned with agent CLI (LM_API_KEY)

The webui has long used `LMSTUDIO_API_KEY` for LM Studio's API key in
both onboarding and Settings detection. The agent CLI runtime
(hermes_cli/auth.py:177-183) reads `LM_API_KEY`. So a user who configured
auth on their LM Studio instance got Settings → Providers reporting
has_key=True (because webui saw its own LMSTUDIO_API_KEY) but the agent
runtime ignored the key and fell back to LMSTUDIO_NOAUTH_PLACEHOLDER →
401 against the auth-enabled LM Studio server. Masked in practice for
the no-auth majority.

Picked Option B from the issue (defer to the agent — single source of
truth) but mitigated the migration cliff by reading the legacy name as
a fallback:

* `api/onboarding.py:_SUPPORTED_PROVIDER_SETUPS["lmstudio"]`:
  - `env_var: "LM_API_KEY"` (canonical, what onboarding writes going forward).
  - `env_var_aliases: ["LMSTUDIO_API_KEY"]` (read-only fallback for
    pre-#1500 users so detection keeps working without forcing an
    .env rewrite).
* `api/onboarding.py:_provider_api_key_present` reads aliases too.
* `api/providers.py:_PROVIDER_ENV_VAR["lmstudio"] = "LM_API_KEY"`.
* `api/providers.py:_PROVIDER_ENV_VAR_ALIASES["lmstudio"] = ("LMSTUDIO_API_KEY",)`
  — new dict, used by `_provider_has_key` and `get_providers`'s
  key_source resolution. Drops in cleanly when other providers later
  rename their env vars too.

Verified:

```
before fix:  webui writes LMSTUDIO_API_KEY → agent ignores it → 401 on chat
 after fix:  webui writes LM_API_KEY → agent picks it up → chat works
             pre-#1500 .env with LMSTUDIO_API_KEY → still has_key=True in Settings
                                                  → key_source='env_file'
```

Tests

* `tests/test_issue1499_onboarding_probe.py` — 17 tests:
  3 invalid_url variants, dns, connect_refused, success (OpenAI shape),
  success (bare-list shape), http_4xx, http_5xx, parse non-JSON, parse
  wrong-shape, api_key authorization header passthrough, "probe must
  not write to config.yaml or .env", PROBE_ERROR_CODES contract pin,
  3 end-to-end route-level smoke tests against the live server fixture.
* `tests/test_issue1500_lmstudio_env_var_alignment.py` — 5 tests:
  onboarding declares LM_API_KEY canonical with LMSTUDIO_API_KEY alias,
  onboarding writes ONLY the canonical name, legacy env var still
  detected post-migration, canonical takes precedence when both are
  set, _provider_api_key_present reads aliases.
* `tests/test_issue1420_lmstudio_provider_env_var.py` — updated:
  the original 5-test #1420 suite now pins LM_API_KEY as canonical
  and LMSTUDIO_API_KEY as alias.

Full suite: 3879 → 3901 passing (+22), 0 failures.

Out of scope (explicitly NOT addressed here)

The third LM Studio onboarding sub-bug from #1420's thread — that
`apply_onboarding_setup` requires a non-empty api_key for lmstudio
even though most LM Studio installs run keyless — remains. The agent's
`LMSTUDIO_NOAUTH_PLACEHOLDER` substitution kicks in at runtime, but
the onboarding wizard rejects the empty-key case at submit. Fixing
this requires a UX decision (auto-write a sentinel? loosen the
required-key check for self-hosted providers?) and is left as a
separate follow-up.

Closes #1499
Closes #1500

Co-authored-by: chwps <106549456+chwps@users.noreply.github.com>
Co-authored-by: AdoneyGalvan <25235323+AdoneyGalvan@users.noreply.github.com>
2026-05-03 02:46:24 +00:00
Hermes Bot d3c7ac182b fix(providers): map lmstudio to LMSTUDIO_API_KEY in _PROVIDER_ENV_VAR (#1420)
After completing the onboarding wizard with the LM Studio provider, users
saw LM Studio in the model picker and could chat normally, but Settings →
Providers showed no LM Studio entry — or rendered it with has_key=False
and configurable=False even when LMSTUDIO_API_KEY was already in
~/.hermes/.env. There was no UI surface to add or update the key.

Root cause:

api/providers.py:_PROVIDER_ENV_VAR — the dict that maps each provider id
to its env-var name — is missing an "lmstudio: LMSTUDIO_API_KEY" entry.
That dict drives two things:

  1. _provider_has_key(pid) — env-var-based key detection. Returns False
     and sets key_source='none' if the pid isn't in the dict, regardless
     of what's in .env or os.environ.

  2. get_providers() line 364:
        "configurable": not is_oauth and pid in _PROVIDER_ENV_VAR,
     Without the entry, configurable=False, hiding the "Add API key"
     form in the UI.

So with no map entry, an LM Studio user with a working LMSTUDIO_API_KEY
gets has_key=False (wrong) AND no UI to fix it (wrong-er).

Same bug shape as #1410 (Ollama Cloud / local Ollama env-var collision).
The #1410 fix dropped bare "ollama" from _PROVIDER_ENV_VAR because
OLLAMA_API_KEY was shared with ollama-cloud and the runtime semantics
made the local key detection ambiguous. LMSTUDIO_API_KEY has no such
collision — it's only consumed by the lmstudio runtime.

Verified via reproduction:

  before fix: lmstudio.has_key=False, configurable=False, key_source='none'
   after fix: lmstudio.has_key=True,  configurable=True,  key_source='env_file'

5 regression tests in tests/test_issue1420_lmstudio_provider_env_var.py:

  1. _PROVIDER_ENV_VAR['lmstudio'] == 'LMSTUDIO_API_KEY'
  2. LMSTUDIO_API_KEY in env → has_key=True + configurable=True
  3. providers.lmstudio.api_key in config.yaml → has_key=True (fallback path)
  4. No env, no config → has_key=False but configurable=True (UI fix surface)
  5. LMSTUDIO_API_KEY doesn't cross-detect any other provider

Mutation-verified: reverting the map entry causes 4 of 5 tests to fail
with clear assertion messages naming the bug (the 5th — config.yaml
fallback — is independent of the env-var path and intentionally remains
green to pin that the existing path keeps working).

Scope discipline:

#1420's broader thread surfaces a sibling bug — the onboarding wizard
never probes the configured <base_url>/v1/models endpoint before
persisting (the wizard accepts unreachable URLs silently with no
model-list dropdown population). That bug is being filed separately
and is NOT addressed here. Adding a probe touches the wizard UX flow,
has timeout / error-handling implications, and warrants its own design
pass.

Closes #1420 (the "LM Studio missing from Settings" half — feature-
request half about provider catalog support is already shipped: LM
Studio has been a first-class provider in api/onboarding.py since long
before this issue).

Co-authored-by: chwps <106549456+chwps@users.noreply.github.com>
Co-authored-by: AdoneyGalvan <25235323+AdoneyGalvan@users.noreply.github.com>
2026-05-03 02:06:19 +00:00
Hermes Bot c4ea9643f9 Stage 272: PR #1492 — P0 bugfixes (tool-card args + CLI rename + scroll pinning + sw.js relative-path regression test) 2026-05-03 01:34:10 +00:00
Hermes Bot 51a87ebdc7 fix(sqlite): close state.db connections explicitly to stop FD leak in sidebar polling (#1494)
Production WebUI on macOS launchd reproduced an HTTP-unhealthy wedge after
#1483 closed the bootstrap supervisor double-fork: process alive, port
listening, every HTTP request reset by peer before a response. The reporter
(@insecurejezza) traced it to FD exhaustion — 366 open FDs on the wedged
process, 238 of them `~/.hermes/state.db`, `state.db-wal`, and `state.db-shm`.

Root cause: four sqlite callsites use `with sqlite3.connect(...) as conn:`.
Python's sqlite3 connection context manager only commits or rolls back on
exit; it does NOT close the connection. `/api/sessions` polling calls these
on every sidebar refresh, so each poll leaked one or more open state.db FDs
until the process hit macOS's soft FD limit and new sqlite3.connect() calls
inside fresh request handlers raised before any response bytes were written.

Fix: wrap each `sqlite3.connect(...)` in `contextlib.closing(...)` so the
connection is explicitly closed on scope exit, in addition to the auto-
commit / rollback semantics that `Connection.__exit__` already provides.

Callsites patched:
- api/agent_sessions.py:read_importable_agent_session_rows
- api/agent_sessions.py:read_session_lineage_metadata
- api/models.py:get_cli_session_messages
- api/models.py:delete_cli_session

Reporter's verification (post-patch, 100-request stress loop against
/api/sessions and /api/projects):

  batch=1 fd=92 state_handles=0
  batch=2 fd=92 state_handles=0
  ...
  batch=5 fd=92 state_handles=0

Pre-patch the same loop made FD count and state.db handle count climb
monotonically.

4 regression tests in tests/test_issue1494_state_db_fd_leak.py monkeypatch
sqlite3.connect with a tracking wrapper that records .close() calls and
assert every connection opened by each of the four functions is explicitly
closed. Verified to fail (catching the original bug) when the closing()
wrap is reverted: "leaked 5 of 5 sqlite connection(s) — context-manager-
only `with sqlite3.connect()` does not close. Wrap in contextlib.closing()."

This addresses Bug #2 of the umbrella issue #1458. Bug #3 (HTTP-unhealthy
wedge in the absence of FD exhaustion) remains open pending separate
diagnostic data — explicit scope discipline.

Closes #1494
Refs #1458 (Bug #2 of 3)

Co-authored-by: insecurejezza <70424851+insecurejezza@users.noreply.github.com>
2026-05-03 01:15:26 +00:00
bergeouss 24a5457471 fix: P0 bugfixes — tool-card args, sw.js path, CLI rename, scroll pinning
- #1481: Use absolute path for service worker registration to avoid
  <base> tag resolution on session pages causing JSON 404
- #1484: Fix tool-card expanded args readability — replace
  word-break:break-all with pre-wrap+break-word, add display:block
  so newlines and indentation are preserved
- #1486: Prefer WebUI JSON title over state.db title for CLI sessions,
  fixing rename-not-persisting after compression chain extension
- #1469/#1360: Add _programmaticScroll guard to distinguish
  programmatic scrolls from user scrolls, preventing the race
  condition where scrollIfPinned() re-pins after user scrolls up
2026-05-02 23:39:52 +00:00
Hermes Bot 26b332612d fix(api): add pending_user_message to Session.compact() (#1479) 2026-05-02 18:04:44 +00:00
Hermes Bot bcfd8b2eac chore(release): stamp v0.50.268 — 4-PR batch + Opus follow-ups (i18n + per-session fields + None title guard)
- CHANGELOG.md: v0.50.268 entry detailing #1395 #1450 #1462 #1476 + Opus SHOULD-FIX followups
- ROADMAP.md: bump to v0.50.268, 3800 tests collected
- TESTING.md: bump header + total to 3800

SF-1 i18n fix:
- static/i18n.js: session_meta_children key in all 10 locale blocks (en, ja, ru, es, de, zh, zh-Hant x2, pt, ko)
- static/sessions.js: 2 callsites use t(session_meta_children, childCount)

SF-2 #1462 per-session field carry-over:
- api/routes.py: duplicate now carries personality, enabled_toolsets, context_length, threshold_tokens

SF-3 #1462 None-title guard:
- api/routes.py: (session.title or "Untitled") + " (copy)"

Tests:
- tests/test_stage268_opus_followups.py: 6 regression tests pinning SF-1 + SF-2 + SF-3
- tests/test_session_duplicate.py: 2 brittle assertions widened to accept new forms

Follow-up issue filed: #1481 (PWA /sw.js whitelist vestige, Opus SF-4)
2026-05-02 17:54:58 +00:00
youzhi b804b66238 Fix session list pending message payload 2026-05-03 01:44:38 +08:00
Hermes Bot 273888df48 fix(sidebar): nest child sessions under lineage roots (#1450) 2026-05-02 17:41:05 +00:00
Hermes Bot 7c1b53258a feat(api): /api/session/duplicate endpoint for session cloning (#1462) 2026-05-02 17:41:05 +00:00
Hermes Bot 02726b9123 feat(pwa): Android PWA app installation with manifest and icons (#1476) 2026-05-02 17:41:05 +00:00
Hermes Bot 6303a30a87 Address review feedback: deepcopy independence, persist on duplicate, reset pinned/archived, 404 status
Five fixes from the May 2 2026 maintainer review:

1. messages and tool_calls now use copy.deepcopy() — prior plain assignment
   shared list refs between source and duplicate, so appending a turn to one
   mutated the other.
2. copied_session.save() called explicitly — pre-fix, the duplicate was
   in-memory only until the user sent a turn. Refreshing mid-flow lost it.
3. pinned and archived reset to False — duplicating an archived conversation
   should produce a visible (un-archived) copy.
4. Missing-session error is now status=404 (was default 400).
5. Removed redundant `import uuid` / `import time` inside the handler — both
   are already at the top of routes.py.

Test updates:

- Two existing static-grep tests widened to accept the new
  `copy.deepcopy(session.messages)` form alongside the original
  `messages=session.messages`.
- Five new static-grep regression tests pin each of the five fixes so
  reverting any single one trips a test.

All 3775 tests pass.

Co-authored-by: Alexey Dsov <AlexeyDsov@users.noreply.github.com>
2026-05-02 17:39:55 +00:00