fix: clarify cancelled chat turn status (Jordan-SkyLF)
Conflict resolution on api/streaming.py:4549-4567 (the cancel-handler
ownership guard). Both this PR and the already-shipped PR #2136 add a
guard at the same site against stale stream writebacks, from different
angles:
- PR #2136 (HEAD): _stream_writeback_is_current(_cs, stream_id) — strictly
dominates by checking the active_stream_id token equality.
- PR #2151: 'worker won the race' check via (active_stream_id != stream_id
and not pending_user_message), with _emit_cancel_event = False to suppress
the terminal cancel event.
Resolution merges both: keep #2136's strictly-stronger condition for skip
detection, and adopt #2151's _emit_cancel_event = False semantic so the
cancel event isn't emitted in addition to skipping the writeback (when
client may have already received the successful done payload).
55/55 tests pass across cancelled-turn-status + stale-stream-writeback +
the four cancel/data-loss sibling test files.
Providers like Xiaomi MiMo, DeepSeek, and Kimi require reasoning_content
to be echoed back on every assistant message in multi-turn conversations
with tool calls. Omitting it causes HTTP 400: 'The reasoning_content in
the thinking mode must be passed back to the API.'
The WebUI's _sanitize_messages_for_api() strips all fields not in
_API_SAFE_MSG_KEYS before sending conversation history to the LLM API.
reasoning_content was not in this whitelist, so it was silently dropped.
The CLI path (run_agent.py) is unaffected because it has its own
_copy_reasoning_content_for_api() logic that operates on raw message
dicts without going through this filter. This is why the same session
works from CLI but fails from WebUI with HTTP 400.
The fix adds 'reasoning_content' to _API_SAFE_MSG_KEYS so the field
passes through sanitization intact.
HMAC length: create_session() now emits a full 64-char HMAC-SHA256 hex
digest instead of the truncated 32-char form. verify_session() accepts
both lengths during a transition window so existing sessions survive the
upgrade without a forced global logout. The legacy 32-char branch can be
removed once the default 30-day session TTL has elapsed.
Secure flag: introduce _is_secure_context(handler) to encapsulate the
env-var override and heuristic. Restores the getpeercert / X-Forwarded-Proto
heuristic that was present before this refactor, keeping the env-var
override (HERMES_WEBUI_SECURE) on top for proxy deployments that need
explicit control. The bare `return False` stub that the previous commit
left in place silently broke Secure-cookie delivery for all reverse-proxy
users who never set the env var.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Opus advisor flagged that PR #2171's credential prefilter only listed
specific DB scheme prefixes and form keys, letting OAuth callback URLs,
URL userinfo, signed-URL query params bypass the hard agent redactor.
Adding the generic '://' marker restores the WebUI-as-hard-safety-boundary
contract. Plain URLs without sensitive substrings still pass through
unchanged because the redactor itself only mutates sensitive substrings.
Regression-pinned with 5 new parametric cases in test_security_redaction.py
plus 1 negative-case companion. Verified test FAILS without the fix and
PASSES with it.
Concurrent failed logins raced on _login_attempts because no lock guarded
the dict. Add _LOGIN_ATTEMPTS_LOCK and wrap both _check_login_rate() and
_record_login_attempt() with it.
Extract _load_key() to de-duplicate key file I/O. Add _pbkdf2_key() that
loads .pbkdf2_key (separate from .signing_key) so PBKDF2 and HMAC signing
no longer share a key — key reuse across cryptographic primitives is unsafe.
Update _hash_password() to use _pbkdf2_key() as its default salt, with an
optional *salt* kwarg so verify_password() can try the legacy .signing_key
salt during transparent migration. When the old hash matches, save_settings()
re-hashes with _pbkdf2_key() and _invalidate_password_hash_cache() ensures
the next request sees the upgraded hash without a restart.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fix(config): preserve nvidia/ prefix on NVIDIA NIM (closes#2177)
Self-built. nesquena APPROVED with extensive end-to-end trace including
cross-tool agent CLI verification and 12-shape behavioural harness.
Move the `_PORTAL_PROVIDERS` guard in `resolve_model_provider()` to run
BEFORE the `prefix == config_provider` strip branch. The guard was added
for NVIDIA (along with the Nous portal cases in #854 / #894) but was
placed after the strip, so it never fired when `config_provider == "nvidia"`
and the model id started with `nvidia/`.
For `model_id="nvidia/nemotron-3-super-120b-a12b"`,
`config_provider="nvidia"`:
- prefix = "nvidia", bare = "nemotron-3-super-120b-a12b"
- prefix == config_provider → True → strip branch returned bare name
- `_PORTAL_PROVIDERS` guard never reached
- bare "nemotron-3-super-120b-a12b" sent to NVIDIA NIM → HTTP 404
NIM requires the full namespaced path. The fix moves the portal guard
to run first, so all portal providers (Nous, OpenCode-Zen, OpenCode-Go,
NVIDIA NIM) always preserve the full `provider/model` id regardless of
whether the prefix happens to equal the provider name.
This also closes a latent symmetric bug for the Nous case if a
`nous/<model>` id ever existed in the catalog.
Test plan:
- New `tests/test_issue2177_nvidia_prefix_preservation.py` covers:
- nvidia/nemotron-... under nvidia (the reported case)
- cross-namespace qwen/ and meta/ under nvidia (regression pin)
- every static nvidia model in `_PROVIDER_MODELS` resolves to itself
- latent nous/<model> under nous (structural ordering pin)
- non-portal providers (anthropic) still strip — fix doesn't over-correct
- Existing portal-routing suites (test_nous_portal_routing.py,
test_issue895_894_nous_prefix.py) continue to pass.
- Full test suite: 5320 passed, 4 skipped, 3 xpassed.
Reported on Discord by @vishnu (Nathan forwarded as #2177).
When a provider's 'models' config contains dicts (e.g. {"id": "x", "label": "y"})
instead of plain strings, _apply_provider_prefix() crashes with:
AttributeError: 'dict' object has no attribute 'startswith'
This happens because the list comprehension at line 3505 passes the raw dict
as the model ID. The fix extracts 'id' and 'label' from dict entries while
keeping string entries as-is.
Fixes the /api/models and /api/onboarding/status 500 errors.
get_password_hash() computes PBKDF2-SHA256 with 600k iterations to
hash the HERMES_WEBUI_PASSWORD env var. This is called on nearly every
HTTP request via check_auth -> is_auth_enabled -> get_password_hash.
Before: ~1s of PBKDF2 per request, regardless of how many times the
same env-var value has already been hashed. A page load hitting 5+
API endpoints would burn 5+ seconds purely on password hashing.
After: compute once on first call, cache the hex result in a module-
level variable. Subsequent calls are a single global-variable read
(~50ns). The env var is immutable for the process lifetime, so there
is nothing to invalidate.
Thread-safe: double-checked locking ensures that under a burst of
concurrent requests only one thread computes PBKDF2, while the fast
path (after initialisation) requires zero locks.
Security analysis: zero regression. The hash is derived from a static
env var and a static signing key — both already readable from process
memory. Caching does not introduce any new disclosure or replay
vector. PBKDF2 is still used for the initial computation and for
verify_password() on login.
AI: deepseek/deepseek-v4-flash
fix: guard stale stream writebacks (LumenYoung)
Prevents stale WebUI stream workers from writing old results into a session
after that session has already moved on to another stream. Adds new helper
_stream_writeback_is_current() (a token equality check against the session's
active_stream_id) and short-circuits the two finalize/cancel paths when the
worker no longer owns the session writeback.
(1) compress/status no longer pops the job entry on first read of `done` payload.
Second open tab no longer sees `idle` and a stale-job toast.
(2) compress/start no longer short-circuits to a stale `done` payload when
re-invoked within the 10-minute TTL. Re-running /compress always starts
fresh, so closing-and-reopening a tab mid-compress works correctly.
Third SHOULD-FIX (#2135 cfg["model"] fallback tightening when no custom_providers
entry matches) deferred to follow-up — strictly no-worse-than-master behavior.
tests/test_sprint46.py 10/10 still passes.
#2142 (legeantbleu) added the fr locale to static/i18n.js but didn't update:
1. tests/test_issue1488_composer_voice_buttons.py: two TestComposerVoiceButtonI18n + TestVoiceModePreferenceGate LOCALES tuples needed 'fr'
2. api/routes.py: _LOGIN_LOCALE needed an 'fr' block so the login page localizes for French users (issue #1442 parity contract)
3. tests/test_login_locale_parity.py: the test asserting 'fr' falls-back-to-'en' is inverted — fr now resolves to fr, with sibling assertions for fr-FR and fr-CA
Mirrors the stage-340 fix for the it locale (PR #2067 → maintainer adds tuple entries). 46/46 i18n tests pass after fix.