nesquena-hermes 360463dd8e v0.50.212: model cache perf (~30s→~1ms), session switch UX, cache isolation fix (#1063)
* fix(models): disk cache now used on restart, cold path locked, 24h TTL

Root causes fixed:
- reload_config() was deleting disk cache on every server start (cfg_mtime 0.0 vs real mtime).
  Now saves old mtime before update and skips cache deletion on first-ever load.
- Cold path was running outside the lock causing thundering herd on startup.
  Now extracted to _build_available_models_uncached() helper running inside RLock.
- Disk cache was never being checked before lock acquisition.
  Now loads from disk BEFORE acquiring lock; cache hit returns without lock contention.
- Credential pool load_pool() was called per-provider per-request (~10s for zai).
  Now cached in _CREDENTIAL_POOL_CACHE with 24h TTL.

Result: /api/models returns in ~1ms on restart instead of ~30s.
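
The disk-first lookup described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `load_from_disk`, `save_to_disk`, `get_models`, the JSON snapshot layout and the `cache_path` argument are all assumptions made for the example. Only the ideas (check disk before taking the lock, compare the config mtime, 24h TTL, build inside the lock) come from the commit message.

```python
import json
import os
import threading
import time

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24h TTL, as stated in the commit message

_lock = threading.RLock()
_memory_cache = None  # hypothetical in-process cache


def load_from_disk(cache_path, cfg_mtime, now=None):
    """Return cached models, or None if the snapshot is missing, stale, or expired."""
    now = time.time() if now is None else now
    try:
        with open(cache_path, "r", encoding="utf-8") as fh:
            snap = json.load(fh)
    except (OSError, ValueError):
        return None  # no snapshot, or unreadable JSON
    if not isinstance(snap, dict):
        return None
    if snap.get("cfg_mtime") != cfg_mtime:
        return None  # config changed since the snapshot was written
    if now - snap.get("saved_at", 0.0) > CACHE_TTL_SECONDS:
        return None  # snapshot older than the TTL
    return snap.get("models")


def save_to_disk(cache_path, cfg_mtime, models, now=None):
    """Write the snapshot atomically via a temp file and os.replace."""
    now = time.time() if now is None else now
    tmp = cache_path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump({"cfg_mtime": cfg_mtime, "saved_at": now, "models": models}, fh)
    os.replace(tmp, cache_path)


def get_models(cache_path, cfg_mtime, build):
    """Check disk before acquiring the lock; build at most once inside the lock."""
    global _memory_cache
    cached = load_from_disk(cache_path, cfg_mtime)  # no lock held here
    if cached is not None:
        return cached
    with _lock:
        if _memory_cache is None:
            _memory_cache = build()  # slow cold path
            save_to_disk(cache_path, cfg_mtime, _memory_cache)
        return _memory_cache
```

With this shape, a restart that finds a valid snapshot never touches the lock or the slow builder.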

* fix(ui): block stale SSE events, cancel old stream on switch, clear pending files after send, focus textarea after switch, instant click for inactive sessions, rename session via titlebar dblclick

Key UX improvements:
- Block stale SSE responses from old sessions reaching new session DOM after switch
- Cancel in-flight streaming when switching sessions
- Clear pending files after send (prevents ghost attachments in tray)
- Auto-focus message textarea after session switch
- Instant click for inactive sessions (no loading spinner blocking)
- Double-click app titlebar to rename active session
- Persist/restore composer draft across session switches

* style: add user-select:none to session titles to prevent accidental text selection

* fix(models): prevent concurrent cold path runs with _cache_build_in_progress guard

Thread 2 was re-entering the cold path (via RLock) while Thread 1 was
still inside it, causing duplicate 10s zai load_pool() calls. The RLock
allows re-entry from the same thread, defeating the 'only one cold path'
guarantee. Now threads wait on _cache_build_cv instead of re-entering.
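
A minimal sketch of a "build in progress" guard of this kind is shown below. The names mirror the commit message loosely (`_build_in_progress`, `_build_cv`), but the function `get_models`, the 60s default and the choice to run `build()` with the lock released are assumptions for illustration. The real code may hold the lock differently.

```python
import threading

_lock = threading.RLock()
_build_cv = threading.Condition(_lock)  # condition shares the RLock
_build_in_progress = False
_cached_models = None


def get_models(build, timeout=60.0):
    """Let exactly one thread run the cold path; others wait on the condition."""
    global _build_in_progress, _cached_models
    with _build_cv:
        if _cached_models is None and _build_in_progress:
            # wait_for releases the lock while sleeping and re-checks the
            # predicate each time the thread is woken.
            _build_cv.wait_for(lambda: not _build_in_progress, timeout=timeout)
        if _cached_models is not None:
            return _cached_models
        _build_in_progress = True  # this thread becomes the builder
    try:
        models = build()  # slow work, done without holding the lock
        with _build_cv:
            _cached_models = models
        return models
    finally:
        with _build_cv:
            _build_in_progress = False
            _build_cv.notify_all()  # wake waiters on success and on failure
```

The flag, rather than the lock itself, is what serialises builders here, so lock re-entrancy can no longer let a second cold path start.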

* fix(models): add missing global declarations, move mtime check to outer scope for test

* fix(models): attach _cache_build_cv to the RLock so notify_all() is safe
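
For context, Python's `threading.Condition` raises `RuntimeError` if `notify_all()` is called by a thread that does not hold the condition's underlying lock. Passing the existing RLock to the constructor makes "holding the RLock" and "owning the condition" the same thing. The small sketch below demonstrates that behaviour; the helper names are hypothetical.

```python
import threading

lock = threading.RLock()
cv = threading.Condition(lock)  # cv uses `lock` instead of creating its own


def notify_without_lock():
    """Calling notify_all() without owning the lock raises RuntimeError."""
    try:
        cv.notify_all()
    except RuntimeError:
        return False
    return True


def notify_with_lock():
    """Holding the shared RLock is enough to notify safely."""
    with lock:
        cv.notify_all()
    return True
```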

* fix(models): evict _CREDENTIAL_POOL_CACHE entries when provider cache is invalidated

Without this, invalidate_provider_models_cache(provider_id) cleared the
models cache but left stale CredentialPool objects in _CREDENTIAL_POOL_CACHE
for up to 24h.  The next get_available_models() cold path would re-use the
stale pool instead of re-loading, meaning new credentials added by the user
wouldn't show up until the pool TTL expired.

Now evicts both provider_id and its canonical alias from the pool cache
so the next cold path re-loads from disk.

* fix(merge): restore #1024/#1025 work in static/sessions.js after rebase

The merge of master (commit 05d1ba9) resolved the static/sessions.js
conflict by keeping the contributor's version, which silently dropped
several pieces of work that had landed via PR #1024 and #1025:

  PR #1024 (session attention indicators):
    - _renderOneSession(s, isPinnedGroup=false) signature
    - body.appendChild(_renderOneSession(s, Boolean(g.isPinned)))
    - pinned-group dedup: if(s.pinned&&!isPinnedGroup) ...
    - last_message_at preference in _sessionTimestampMs
    - Right-slot attention indicator + hide-timestamp-when-attentive

  PR #1025 (session restore speed):
    - &resolve_model=0 on the loadSession metadata fetch
    - S.session._modelResolutionDeferred=true after assignment
    - _resolveSessionModelForDisplaySoon(sid) helper + invocation
    - &resolve_model=0 on the lazy full-message fetch

Restoration approach: reset sessions.js to current master, then layer
the contributor's #1060 additions on top:
  - _loadingSessionId global for stale-response discard
  - composer draft persistence on session switch (via S.composerDrafts)
  - _loadingSessionId !== sid bail-outs at every async await point
  - Cleanup _loadingSessionId = null at all exit paths

Test outcome:
  - tests/test_issue856_pinned_indicator_layout.py: 5/5 passing (previously 5 of 5 failing)
  - tests/test_session_metadata_fast_path.py: 5/5 passing (previously 3 of 5 failing)
  - tests/test_session_sidebar_relative_time.py: 5/5 passing (previously 1 of 5 failing)
  - Full suite: 2233 passed, 0 failed

* fix(models): clear _CREDENTIAL_POOL_CACHE in invalidate_models_cache

The 24h-TTL credential pool cache introduced in this PR was keyed by
provider_id only, so when a user added/changed credentials, or when
tests called invalidate_models_cache() between cases with different
auth payloads, the cached CredentialPool from the prior payload leaked
into the new run.

Two complementary fixes:
  1. invalidate_models_cache() now also clears _CREDENTIAL_POOL_CACHE
  2. invalidate_provider_models_cache(provider_id) pops just that
     provider's entry — surgical eviction for live key edits

Pinned by tests/test_credential_pool_providers.py — 23/23 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: invalidate disk cache in invalidate_models_cache(); reset _cache_build_in_progress on exception

1. invalidate_models_cache() now calls _delete_models_cache_on_disk() so that the
   on-disk snapshot at /dev/shm is removed alongside the memory cache. Without this,
   _load_models_cache_from_disk() serves a stale prior-test result immediately after
   invalidation, breaking all test_credential_pool_providers and test_model_resolver
   tests that rely on get_available_models() returning fresh mocked data.

2. Wrap _build_available_models_uncached() in try/except so _cache_build_in_progress
   is always reset (+ notify_all) even if the rebuild raises unexpectedly, preventing
   waiting threads from being stuck at wait_for() for the full 60s timeout.

3. Fix misleading comment: "avoid deadlock" → "file I/O outside the lock".

Co-authored-by: JKJameson <JKJameson@users.noreply.github.com>

* docs: v0.50.212 release notes and version bump

Model cache perf, session switch UX improvements, cache isolation fixes.

---------

Co-authored-by: Josh <josh@fyul.link>
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
Co-authored-by: JKJameson <JKJameson@users.noreply.github.com>
2026-04-25 18:24:30 -07:00