fix(config): preserve nvidia/ prefix on NVIDIA NIM (closes#2177)
Self-built. nesquena APPROVED with extensive end-to-end trace including
cross-tool agent CLI verification and 12-shape behavioural harness.
Move the `_PORTAL_PROVIDERS` guard in `resolve_model_provider()` to run
BEFORE the `prefix == config_provider` strip branch. The guard was added
for NVIDIA (along with the Nous portal cases in #854 / #894) but was
placed after the strip, so it never fired when `config_provider == "nvidia"`
and the model id started with `nvidia/`.
For `model_id="nvidia/nemotron-3-super-120b-a12b"`,
`config_provider="nvidia"`:
- prefix = "nvidia", bare = "nemotron-3-super-120b-a12b"
- prefix == config_provider → True → strip branch returned bare name
- `_PORTAL_PROVIDERS` guard never reached
- bare "nemotron-3-super-120b-a12b" sent to NVIDIA NIM → HTTP 404
NIM requires the full namespaced path. The fix moves the portal guard
to run first, so all portal providers (Nous, OpenCode-Zen, OpenCode-Go,
NVIDIA NIM) always preserve the full `provider/model` id regardless of
whether the prefix happens to equal the provider name.
This also closes a latent symmetric bug for the Nous case if a
`nous/<model>` id ever existed in the catalog.
Test plan:
- New `tests/test_issue2177_nvidia_prefix_preservation.py` covers:
- nvidia/nemotron-... under nvidia (the reported case)
- cross-namespace qwen/ and meta/ under nvidia (regression pin)
- every static nvidia model in `_PROVIDER_MODELS` resolves to itself
- latent nous/<model> under nous (structural ordering pin)
- non-portal providers (anthropic) still strip — fix doesn't over-correct
- Existing portal-routing suites (test_nous_portal_routing.py,
test_issue895_894_nous_prefix.py) continue to pass.
- Full test suite: 5320 passed, 4 skipped, 3 xpassed.
Reported on Discord by @vishnu (Nathan forwarded as #2177).
When a provider's 'models' config contains dicts (e.g. {"id": "x", "label": "y"})
instead of plain strings, _apply_provider_prefix() crashes with:
AttributeError: 'dict' object has no attribute 'startswith'
This happens because the list comprehension at line 3505 passes the raw dict
as the model ID. The fix extracts 'id' and 'label' from dict entries while
keeping string entries as-is.
Fixes the /api/models and /api/onboarding/status 500 errors.
CI on Python 3.13 (clean editable install, no hermes_cli package) was still
failing the 3 lmstudio tests after the first fix attempt. Root cause: the
outer try/except in the lmstudio branch was catching ImportError from
`from hermes_cli.models import provider_model_ids`, hijacking the whole
branch and silently skipping the urlopen fallback.
Restructured into two independent tiers:
1. hermes_cli lookup in its own try/except — ImportError logs at DEBUG
and continues with lm_ids=[].
2. urlopen fallback runs unconditionally when lm_ids is empty, including
after hermes_cli import failure.
New regression test `test_lmstudio_fallback_works_when_hermes_cli_unavailable`
explicitly blocks hermes_cli via sys.meta_path and verifies the lmstudio
group still populates from the urlopen fallback. Without this test, the
CI-vs-local divergence (local env had hermes_cli installed, CI didn't)
would keep slipping through.
All 12 lmstudio-related tests pass, including the 3 #1527 tests that
broke on stage-337.
PR #1970 added a dedicated `elif pid == "lmstudio":` branch in
`get_available_models()` that fetches the live /v1/models list when the
hermes_cli helper doesn't have ids cached. The fallback path inside that
branch only looked at `cfg["providers"]["lmstudio"]["base_url"]`, missing
the historical config shape where the URL lives under `cfg["model"]`:
model:
provider: lmstudio
base_url: http://192.168.1.22:1234/v1 ← here, not under providers.lmstudio
providers:
lmstudio:
api_key: local-key
3 pre-existing tests in tests/test_issue1527_lmstudio_base_url_classification
broke on stage-337 because of this — they passed on master, failed after
the PR #1970 merge.
The simpler fix is to enhance the already-introduced `_get_provider_base_url()`
helper so it falls back to `cfg["model"]["base_url"]` when
`cfg["model"]["provider"] == provider_id`, then use the helper inside the
lmstudio branch instead of a direct lookup. This keeps the previous
behaviour (where the generic configured-provider branch handled lmstudio
via the model block) while preserving PR #1970's live-discovery additions.
Belt-and-suspenders: `_get_provider_base_url()` explicitly does NOT inherit
model.base_url for providers other than the active one — if a user's config
says `model.provider: anthropic` and they have `providers.openai` configured
without a base_url, openai must still resolve to None (use SDK default),
not to the anthropic proxy URL.
6 new regression tests in tests/test_pr1970_lmstudio_base_url_fallback.py
lock the two-location lookup, the precedence rule (explicit providers entry
wins over model fallback), trailing-slash stripping, and the negative case
(model.base_url MUST NOT leak to non-active providers).
All 51 tests in the existing model-resolver + custom-provider banks still
pass.
Caught by maintainer review on stage-337 (full pytest with the new network
isolation in place surfaced the regression that the fork-CI mock-server path
would have hidden).
Add xiaomi to _PROVIDER_DISPLAY, _PROVIDER_MODELS, and _PROVIDER_ALIASES
so the WebUI recognizes Xiaomi as a first-class provider.
Models included:
- mimo-v2.5-pro (MiMo V2.5 Pro)
- mimo-v2.5 (MiMo V2.5)
- mimo-v2-pro (MiMo V2 Pro)
- mimo-v2-omni (MiMo V2 Omni)
- mimo-v2-flash (MiMo V2 Flash)
Aliases: mimo, xiaomi-mimo -> xiaomi
The hermes-agent CLI already registers xiaomi as a provider
(hermes_cli/models.py, hermes_cli/auth.py) but the WebUI was missing
the corresponding entries, causing the model dropdown to fall back to
OpenRouter and the provider list to show 'Unsupported'.
- api/config.py: resolve merge conflict, keep both _custom_slug_rest_looks_like_host_port
and new _get_provider_base_url helper. Custom providers now return their configured
base_url in resolve_model_provider(). Add 'Configured' badge for explicitly configured
providers in the models dropdown. Detect LM Studio via LM_API_KEY+LM_BASE_URL env vars.
Fetch live loaded models from LM Studio with fallback to direct HTTP requests.
- api/providers.py: fetch live LM Studio model list via hermes_cli for the providers card.
- static/style.css: add purple 'Configured' badge style.
When multiple custom providers expose the same model ID (e.g. baidu,
huoshan, and liantong all offering glm-5.1), only the first provider's
entry was shown in the model dropdown.
Root cause (backend): used the bare model ID as the
dedup key, so the second and subsequent providers with the same model
were silently skipped.
Root cause (frontend): stripped the @provider: prefix before
comparing, so @custom:baidu:glm-5.1 and @custom:huoshan:glm-5.1 were
treated as duplicates.
Fix:
- Backend: change _seen_custom_ids key to '{slug}:{model_id}' so each
provider's models are tracked independently.
- Frontend: add _providerOf() helper and deduplicate on the composite
(normId, provider) key instead of normId alone. Bare model IDs
(without @provider: prefix) still deduplicate on normId for backward
compatibility.
model_with_provider_context can emit @custom:<host>:<port>:<model> when
model_provider is derived from an OpenAI base_url authority (e.g.
custom:10.8.0.1:8080). The colon-count heuristic meant for @custom:slug:model:free
mistook those extra colons for an over-split model ID and prepended the port
segment onto the bare model (8080:Qwen3-235B), breaking WebUI while CLI/curl
stayed correct.
Detect endpoint-style slugs (IPv4/localhost/hostname + numeric port) and skip
the peel in that case. Add regression tests for IPv4, dotted hostname,
localhost, and model_with_provider_context round-trip.
The goal evaluation hook was firing on every completed assistant turn
when a goal was active, even for unrelated messages like "what time is
it". This burned the goal budget, triggered continuation prompts that
interrupted unrelated conversations, and made /goal status numbers
misleading.
Add STREAM_GOAL_RELATED and PENDING_GOAL_CONTINUATION flags to gate
the evaluate_goal_after_turn() call in the streaming loop. Only streams
started from goal kickoff (/goal <text>) or goal continuation are
marked as goal-related. Normal user messages skip the hook entirely.
Conflict resolution: both #1928 (session jump buttons) and #1929 (endless
scroll) add their own settings/UI/i18n keys. Resolved by keeping both —
the features are independent opt-in toggles.
Two bugs in get_available_models() conspired to duplicate the active
provider's auto-detected models under a phantom 'Custom' group whenever
custom_providers was also declared in config.yaml:
1. custom:* PIDs not in _named_custom_groups (e.g. stale slugs left from
prior configs) fell through to the auto_detected_models fallback, copying
the active provider's whole catalog into a phantom Custom: <slug> group.
Fix: continue unconditionally for ANY custom:* PID — the named-group
branch is the only legitimate population path.
2. The bare 'custom' PID, with the active provider being concrete (e.g.
ai-gateway), hit 'elif auto_detected_models: copy.deepcopy(...)' and
built a duplicate Custom group of the active provider's models with
mismatched provider prefixes. Fix: when pid == 'custom' and the active
provider is non-custom, leave models_for_group empty.
The reporter also suggested a third fix gating resolve_model_provider() on
config_provider — that's intentionally NOT applied because it conflicts with
the long-standing model-specific-override semantics covered by
test_model_resolver.py::test_custom_provider_*_routes_to_named_custom_provider
(custom_providers entries explicitly override the active provider's routing
when the user opted-in). The reporter's symptom (duplicate UI group) lives
entirely in get_available_models()'s group construction and is fully fixed
by the two changes above.
Tests: 6 new regression tests (3 in #1881 file + reuse), 774 broader
tests still green (model/provider/custom/config domain).
Two in-stage fixes for v0.51.19 batch:
1) api/config.py — add resolve_alias=False param to
_resolve_configured_provider_id() and pass it from
resolve_model_provider(). The PR #1818 swap from
_resolve_provider_alias() to _resolve_configured_provider_id()
was correct for active-provider/badge surfaces but broke #1625's
local-server-provider literal-preservation contract: 'ollama' →
'custom' and 'lm-studio' → 'lmstudio' alias-collapse caused
_LOCAL_SERVER_PROVIDERS membership check to miss, breaking the
model-id full-path preservation for LM Studio/Ollama. The new
flag preserves the raw provider value when called from
resolve_model_provider, and named-custom-slug + base-url
fallback both still run unchanged.
2) tests/test_bootstrap_discover_agent.py — pin Path.home() in
_isolate_discover_agent_dir so the hard-coded
'Path.home() / .hermes / hermes-agent' / 'Path.home() /
hermes-agent' candidates in discover_agent_dir() can't pick up
the dev machine's real install. The original PR #1817 isolation
helper covered HERMES_HOME, HERMES_WEBUI_AGENT_DIR, and
REPO_ROOT but missed the Path.home() leak.
Both surfaced on full pytest pre-release gate, fixed in stage,
ship in v0.51.19. Tests: full suite green.
PR #1762 fixed the rsplit grammar collision for plain @openrouter:model:free
qualifiers, but skipped the fallback whenever the provider hint started with
'custom:' on the assumption that custom providers route directly. That left
'@custom:my-key:some-model:free' broken: rsplit yields
provider='custom:my-key:some-model', bare='free' → custom guard skips the
split-fallback → returns provider='custom:my-key:some-model', model='free'.
Detect the over-split structurally instead of using a known-suffix allowlist:
custom hints carry exactly one segment after 'custom:' (constructed at
api/config.py:1363 as 'custom:' + entry_name). So any rsplit result of
'custom:<a>:<b>' with bare model '<c>' has eaten one model segment — peel
it back with a second rsplit and prepend it to the bare model.
This is robust for :free / :beta / :thinking / :preview / any future
OpenRouter suffix without an allowlist to maintain.
Adds 5 regression tests covering the matrix (free/beta/thinking/preview/
slashed-model). All 7 existing #1744 tests still pass; #1228 tests
unaffected.
Co-authored-by: Cake <51058514+Sanjays2402@users.noreply.github.com>
The previous approach of prepending 'openrouter/' to the model ID in the
catalog was incorrect — it only masked the symptom while regressing the
config_provider=openrouter codepath.
The root cause is in resolve_model_provider(): rsplit(':', 1) on
'@openrouter:tencent/hy3-preview:free' yields provider='openrouter:tencent/hy3-preview'
and model='free', because the ':free' suffix collides with the @provider:model
grammar.
Fix: after rsplit, validate that the extracted provider hint is a known
provider (in _PROVIDER_MODELS, _PROVIDER_DISPLAY, or starts with 'custom:').
If not, fall back to split(':', 1) so trailing suffixes stay attached to
the model ID.
This fixes all current and future OR models with colon-suffixed tags
(:free, :beta, :thinking, :nitro, etc.) without catalog changes.
Also adds regression tests for the affected models and edge cases.
Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>