CI on Python 3.13 (clean editable install, no hermes_cli package) was still
failing the 3 lmstudio tests after the first fix attempt. Root cause: the
outer try/except in the lmstudio branch was catching ImportError from
`from hermes_cli.models import provider_model_ids`, hijacking the whole
branch and silently skipping the urlopen fallback.
Restructured into two independent tiers:
1. hermes_cli lookup in its own try/except — ImportError logs at DEBUG
and continues with lm_ids=[].
2. urlopen fallback runs unconditionally when lm_ids is empty, including
after hermes_cli import failure.
New regression test `test_lmstudio_fallback_works_when_hermes_cli_unavailable`
explicitly blocks hermes_cli via sys.meta_path and verifies the lmstudio
group still populates from the urlopen fallback. Without this test, the
CI-vs-local divergence (local env had hermes_cli installed, CI didn't)
would keep slipping through.
All 12 lmstudio-related tests pass, including the 3 #1527 tests that
broke on stage-337.
PR #2053 added worktree-backed session creation. PR #2041 (shipped in
v0.51.42) added state.db sidecar reconciliation that rebuilds a missing
<sid>.json sidecar from the canonical state.db row when the JSON file is
gone (failed save, manual rm, restore-from-backup with mismatched dirs).
The two interact silently. `_state_db_row_to_sidecar()` was hard-coding
`'workspace': ''` and never propagating the four worktree_* fields from
the row to the rebuilt sidecar dict. So a worktree-backed session that
loses its sidecar and gets rebuilt from state.db:
- loses `worktree_path` → matches the empty-session sidebar filter at
`api/models.py:1067/1107` (which spares worktree-backed empty sessions
via `not s.get('worktree_path')`) → session disappears from the
sidebar even though the worktree directory still exists on disk.
- loses `workspace` → downstream tools (terminal panels, file pickers
that use `s.workspace`) operate on empty string instead of the original
worktree path.
- always reports `message_count == 0` → contributes to the empty-session
filter even for sessions that have messages in `state.db.messages`.
Fix:
1. `_read_state_db_missing_sidecar_rows()` SELECT now includes
`workspace, worktree_path, worktree_branch, worktree_repo_root,
worktree_created_at, message_count` (each gated by
`_sql_optional_col()` so older state.db schemas without those columns
continue to work — recovery degrades gracefully rather than 500ing).
2. `_state_db_row_to_sidecar()` propagates each field. workspace comes
from the row if it's a string, otherwise '' (matching pre-fix behavior
for non-worktree sessions). message_count comes from the row if
it's an int, otherwise falls back to `len(messages)` so the rebuilt
sidecar always has a coherent count.
3 new regression tests in tests/test_state_db_worktree_recovery.py
exercise:
- worktree session with messages → all four worktree_* fields preserved.
- non-worktree session → worktree_* fields all None (no spurious
propagation), workspace=''.
- empty worktree session (the worst case) → confirms the rebuilt sidecar
does NOT match the empty-session-exempt filter, so it stays visible
in the sidebar.
Caught by Opus advisor during stage-337 review (the cross-PR interaction
between #2053 and the previously-shipped #2041 wasn't exercised by either
PR's individual test suite).
PR #1970 added a dedicated `elif pid == "lmstudio":` branch in
`get_available_models()` that fetches the live /v1/models list when the
hermes_cli helper doesn't have ids cached. The fallback path inside that
branch only looked at `cfg["providers"]["lmstudio"]["base_url"]`, missing
the historical config shape where the URL lives under `cfg["model"]`:
model:
provider: lmstudio
base_url: http://192.168.1.22:1234/v1 ← here, not under providers.lmstudio
providers:
lmstudio:
api_key: local-key
3 pre-existing tests in tests/test_issue1527_lmstudio_base_url_classification
broke on stage-337 because of this — they passed on master, failed after
the PR #1970 merge.
The simpler fix is to enhance the already-introduced `_get_provider_base_url()`
helper so it falls back to `cfg["model"]["base_url"]` when
`cfg["model"]["provider"] == provider_id`, then use the helper inside the
lmstudio branch instead of a direct lookup. This keeps the previous
behaviour (where the generic configured-provider branch handled lmstudio
via the model block) while preserving PR #1970's live-discovery additions.
Belt-and-suspenders: `_get_provider_base_url()` explicitly does NOT inherit
model.base_url for providers other than the active one — if a user's config
says `model.provider: anthropic` and they have `providers.openai` configured
without a base_url, openai must still resolve to None (use SDK default),
not to the anthropic proxy URL.
6 new regression tests in tests/test_pr1970_lmstudio_base_url_fallback.py
lock the two-location lookup, the precedence rule (explicit providers entry
wins over model fallback), trailing-slash stripping, and the negative case
(model.base_url MUST NOT leak to non-active providers).
All 51 tests in the existing model-resolver + custom-provider banks still
pass.
Caught by maintainer review on stage-337 (full pytest with the new network
isolation in place surfaced the regression that the fork-CI mock-server path
would have hidden).
The /api/media endpoint only serves files from ~/.hermes, /tmp, and the
active workspace. Power users with media in custom directories (models,
Downloads, Pictures, ComfyUI outputs) have no way to serve those files
inline without copying or symlinking.
Add MEDIA_ALLOWED_ROOTS env var — a colon-separated list of absolute
paths — that extends the allowed roots at runtime. Each entry is resolved
and validated as an existing directory before being appended. Non-existent
or invalid paths are silently skipped.
This is purely additive: the built-in security whitelist is unchanged,
and if MEDIA_ALLOWED_ROOTS is unset, behavior is identical to before.
Two concrete data-corruption vectors flagged in Opus review of PR #2041,
both fixed atomically so the new repair-safe endpoint is safe for production:
1. Shared tmp filename under concurrent calls
`tmp = target.with_suffix('.json.reconcile.tmp')` produced a fixed path
per session ID. Two simultaneous repair-safe POSTs would interleave bytes
in the same tmp file, then both rename → corrupted JSON. Now matches the
`Session.save()` convention at api/models.py:484 with a pid+tid suffix.
2. TOCTOU between target.exists() check and tmp.replace(target)
`os.replace()` overwrites unconditionally. If a concurrent Session.save()
for the same SID materialized the live sidecar in the microsecond window
between the existence check and the rename, the reconciliation would
silently overwrite a live sidecar with a (lossier) state.db reconstruction.
Switched to `os.link()` + `unlink(tmp)` which is atomic create-or-fail —
on FileExistsError we record `skipped: sidecar_appeared_during_reconcile`
and keep the live sidecar untouched.
Plus a round-trip schema-parity test: materialize a sidecar from state.db,
then load it back through `Session.load()` and assert the messages survive.
Catches future schema drift between `_state_db_row_to_sidecar()` and
`Session.__init__()`. Also adds a guard test confirming the .reconcile.tmp
suffix includes pid+tid (regression guard for hazard #1).
Tests: 23 passing across the recovery suite (was 21; +2 new in this commit).
Co-authored-by: ai-ag2026 <261867348+ai-ag2026@users.noreply.github.com>
When context compression fires, the agent rotates to a new session_id.
The compression migration block correctly migrates the session lock,
SESSION_AGENT_CACHE, SESSIONS dict, and the session file rename, but
does not ensure s.profile is set on the continuation session.
On the next request, _run_agent_streaming resolves the profile via:
get_hermes_home_for_profile(getattr(s, 'profile', None))
With s.profile == None this falls back to the default profile's
HERMES_HOME. Memory tool calls then read and write the wrong profile's
MEMORY.md — confirmed by investigation: session 0dfefb (continuation
after compression from a troubleshooting profile session) read memory
at 16% / 1,184 chars with 4 entries, while the troubleshooting profile's
actual state was 72-77% / 5,000+ chars. That reading could only come
from the default profile's bank. Subsequent replace operations failed
because the target entries existed only in the troubleshooting profile.
There are two failure paths:
1. In-memory: if s.profile was None from the start (legacy session or
one created before this fix), the continuation session object carries
null through the current request.
2. Persistence: s.save() persists "profile": null to the continuation
session's JSON file (profile is in METADATA_FIELDS, models.py ~408).
On the next request, Session.load(new_sid) reads it back as null and
get_hermes_home_for_profile(None) falls back to the default profile.
Fix: capture _resolved_profile_name at request entry (~line 2019),
immediately after profile home resolution. This is the only point where
profile context is reliable: s.profile if already set, otherwise
get_active_profile_name() — which at that point reads thread-local
storage (_tls.profile) correctly set by the HTTP handler thread via
set_request_profile(). Calling get_active_profile_name() at compression
time instead would be unsafe: the streaming thread is a separate
threading.Thread, does not inherit TLS, and the call would fall back to
the process-global _active_profile which may belong to a different
concurrent tab.
Stamp s.profile in the compression migration block immediately after
s.session_id = new_sid. Guarded by `if not s.profile` so sessions that
already have a profile set are unaffected. A logger.info line records
when the stamp fires, making future investigation straightforward.
Fixes: memory writes bleeding into default profile after compression
Reproduces: reliably on any long non-default profile session that hits
the compression threshold (default: 0.80 context fill)
Add xiaomi to _PROVIDER_DISPLAY, _PROVIDER_MODELS, and _PROVIDER_ALIASES
so the WebUI recognizes Xiaomi as a first-class provider.
Models included:
- mimo-v2.5-pro (MiMo V2.5 Pro)
- mimo-v2.5 (MiMo V2.5)
- mimo-v2-pro (MiMo V2 Pro)
- mimo-v2-omni (MiMo V2 Omni)
- mimo-v2-flash (MiMo V2 Flash)
Aliases: mimo, xiaomi-mimo -> xiaomi
The hermes-agent CLI already registers xiaomi as a provider
(hermes_cli/models.py, hermes_cli/auth.py) but the WebUI was missing
the corresponding entries, causing the model dropdown to fall back to
OpenRouter and the provider list to show 'Unsupported'.
Per-request profile switches (process_wide=False, introduced in #1700)
update os.environ['HERMES_HOME'] but skip _set_hermes_home(), which is
responsible for monkeypatching module-level caches.
Both tools/skills_tool.py and tools/skill_manager_tool.py set
HERMES_HOME and SKILLS_DIR once at import time. When a non-default
profile is active in the WebUI, os.environ['HERMES_HOME'] is correctly
updated per-turn in the _ENV_LOCK block, but the module-level
constants still point at the root profile. All agent-side skill
operations — skills_list(), skill_view(), skill_manage() — read and
write to the wrong directory.
Add the same monkeypatching that _set_hermes_home() already performs
(profiles.py line ~620) to the per-turn env setup block in
streaming.py, covering both skills_tool and skill_manager_tool.
The WebUI display half was already fixed in #1917 via
_active_skills_dir() in routes.py. This patch fixes the agent-side
half so the running agent resolves skills from the correct profile.