Closes#1442 (server-side _LOGIN_LOCALE missing ja/pt/ko)
Closes#1443 (promote _isImeEnter helper to 6 other Safari Enter guards)
Closes#1446 (glued-bold-heading lift for LLM thinking-block output)
Closes#1447 (markdown heading visual hierarchy in chat messages)
All four issues were filed by the Opus pre-release advisor on the v0.50.264 batch
or by Cygnus via Discord (relayed by @AvidFuturist, May 1 2026). They share a
common shape — narrow, well-scoped, independent of each other, all adding
regression tests.
== #1442: _LOGIN_LOCALE parity (api/routes.py + static/i18n.js) ==
Added entries for ja/pt/ko to the server-side _LOGIN_LOCALE dict that renders
the localized login page BEFORE the JS i18n bundle loads. With v0.50.264
shipping Japanese as the 8th built-in locale, ja/pt/ko users were seeing the
English login page even with their language preference set.
While auditing static/i18n.js for English leakage, also fixed:
- ko: 10 user-facing login/sign-out/password keys still in English
- es: 3 sign-out/auth-disabled keys still in English
Tests: tests/test_login_locale_parity.py (20 tests) — pins both invariants:
(a) every locale in i18n.js LOCALES has a matching _LOGIN_LOCALE entry
(b) every locale's login-flow keys (13 of them) are translated, not English
== #1443: window._isImeEnter promotion ==
PR #1441 fixed the Safari IME-composition Enter race in the chat composer
(`#msg`) by widening the guard from `e.isComposing` to a `_isImeEnter(e)`
helper that combines three signals (isComposing || keyCode===229 ||
_imeComposing flag). Six other Enter-input handlers were left on the original
narrow guard and would still drop IME composition Enters on Safari for
Japanese/Chinese/Korean users.
Promoted the helper to `window._isImeEnter` (defined in static/boot.js) and
replaced the `e.isComposing` guards at all six sites:
- static/sessions.js: session rename, project create, project rename
- static/ui.js: app dialog (confirm/prompt), message edit, workspace rename
The state-free part of the helper (`isComposing || keyCode===229`) handles
Safari's race for any focused input without needing per-input composition
listeners — only `#msg` keeps the local `_imeComposing` flag.
Tests:
- tests/test_issue1443_ime_helper_promotion.py (9 tests) — pins each site
+ verifies no raw `e.isComposing` Enter-guards remain in sessions.js/ui.js
- tests/test_ime_composition.py — alternation regex extended to accept
the windowed helper form (loosen-test-on-shape-change pattern from
v0.50.264 reflection notes)
== #1446: glued-bold-heading lift (static/ui.js renderMd + Python mirror) ==
LLMs in thinking/reasoning mode emit "section headers" glued to the end of the
previous paragraph with no whitespace:
Para 1 text.**Heading to Para 2**
Para 2 text.**Heading to Para 3**
The renderer correctly produces inline `<strong>` per CommonMark, but it looks
like trailing emphasis on the body text rather than a section break. Cygnus
reported this as "Markdown feedback 2 of 3."
Added a single regex pre-pass in renderMd():
s.replace(/([.!?])\*\*([^*\n]{1,80})\*\*\n\n/g, '$1\n\n**$2**\n\n')
Constraints chosen to avoid false positives:
- Trigger only on `[.!?]` IMMEDIATELY before `**` (no space) — almost always
an LLM-glued heading, not intentional emphasis
- Inner text ≤80 chars, no `*` or newline (single-line only)
- Trailing `\n\n` required — preserves "this is **important** to know."
mid-paragraph emphasis untouched
- Position: after rawPreStash restore, before fence_stash restore — fenced
code blocks stay protected (their content is `\x00P` / `\x00F` tokens
when the lift runs)
Mirrored in tests/test_sprint16.py render_md() so both stay in sync.
Tests: tests/test_issue1446_glued_heading_lift.py (17 tests, 5 of which drive
the actual ui.js renderMd via node) — covers all 3 trigger forms (.!?), all 4
preserve-emphasis cases the issue spec'd, fenced/inline code protection,
chained glued headings, source-level position pin, regex shape pin.
== #1447: markdown heading visual hierarchy (static/style.css) ==
Pre-fix sizes in `.msg-body`:
h1 18px, h2 16px, h3 14px (= body), h4 13px, h5 12px, h6 11px
So h3 was indistinguishable from body and h4/h5/h6 were SMALLER than body.
Cygnus's report: "Markdown feedback 3 of 3 — Headings seem to be missing
across the board in Hermes. They're there, but all plaintext."
New sizes:
h1 24px (border-bottom) h2 20px (border-bottom) h3 17px h4 15px
h5 14px (uppercase, tracked) h6 13px (uppercase, tracked, muted)
All headings now `font-weight:700` + `color:var(--strong)` for stronger ink.
h5/h6 use uppercase + letter-spacing for "label-style" affordance instead
of being smaller-than-body.
Synced .preview-md (file preview pane) to match exactly so a markdown file
preview and a chat message render identically. Added missing h4/h5/h6 rules
to .preview-md (it only had h1-h3 before).
Updated data-font-size="small"/"large" h1-h6 overrides to scale
proportionally with the new defaults. Hierarchy preserved at all three
font-size settings.
Tests: tests/test_issue1447_heading_hierarchy.py (9 tests) — pins the size
hierarchy, the bottom borders on h1/h2, the uppercase affordance on h5/h6,
the .preview-md sync, and the small/large override scaling.
== Verification ==
pytest tests/ -q → 3748 passed (+56 new)
bash ~/WebUI/scripts/run-browser-tests.sh → 20 + 11 PASS
bash ~/WebUI/scripts/webui_qa_agent.sh 8789 → 23/23 PASS
Visual confirmation in browser at port 8789:
- Heading hierarchy clearly visible at all 6 levels
- Glued-bold lift produces separate paragraphs as designed
- window._isImeEnter accessible from any module after boot.js
- Login page renders ja/pt/ko strings correctly (curl -s /login)
REQUIRED:
- _fully_unquote_path range(3) -> range(10) — defense-in-depth so quadruple-
encoded .. is rejected by validator instead of slipping through (not
exploitable but contract violation)
- docs/EXTENSIONS.md trust-model callout moved to top of file with explicit
'don't enable in untrusted env / don't point at user-writable dir' guidance
NICE-TO-HAVE (taken since Nathan asked for all fixes big and small):
- URL list cap at _MAX_URL_LIST=32 to avoid pathological rendering
- One-shot WARNING log for rejected URLs (silent drop now visible to admin)
- One-shot WARNING log for URL list truncation
- MIME map: ttf (font/ttf), otf (font/otf), wasm (application/wasm)
5 regression tests in tests/test_pr1445_opus_followups.py pin all invariants.
Opus advisor caught a recoverable footgun in PR #1441's manual flag: if
focus is lost mid-composition (window blur or older Safari WebKit IME
quirk), compositionend may never fire and _imeComposing stays true
until the next full composition cycle. Result: Enter-to-send is
silently broken until page reload — an unrecoverable stuck state for
something that's supposed to be transient.
Add a blur listener that also resets the flag. Cheap belt-and-suspenders
against the stuck state. Adds 1 regression test pinning the listener.
(other Opus findings logged in /tmp/stage-264-brief.md as follow-up
issues: _LOGIN_LOCALE parity for ja/pt/ko, promote _isImeEnter to the
6 other Safari-affected Enter guards in sessions.js + ui.js)
The markdown fence regex /```([\s\S]*?)```/g had no line anchoring. A literal
triple backtick inside code block content (e.g. a regex with ``` in a lookbehind,
or a script that documents fences) terminated the outer fence at the wrong place.
The leaked tail then went through bold/italic/inline-code passes, eating `*`
characters as italic markers and emitting literal </strong> tags into the
rendered output.
CommonMark §4.5 requires that an opening code fence be the first non-whitespace
content of a line (up to 3 spaces of indent allowed) and that the closing fence
also start a line. This patch updates 3 sites + the Python mirror to use that
invariant:
static/ui.js:1559 renderMd() fenced-block stash (assistant messages)
static/ui.js:66 _renderUserFencedBlocks() (user messages)
static/ui.js:2599 _stripForTTS() (TTS speech pre-strip)
tests/test_sprint16.py Python mirror
Pattern: (^|\n)[ ]{0,3}```(?:([\s\S]*?)\n)?[ ]{0,3}```(?=\n|$)
The non-capturing (?:...\n)? group keeps empty fences (```\n```) working;
without it, a body+\n is required and the closing fence on the very next line
no longer matches. The lead group (^|\n) is prefixed back to the stash token
so paragraphs above don't bleed into the <pre> block.
20 regression tests in tests/test_issue1438_fence_anchoring.py cover:
- Cygnus's exact repro from Discord (May 1 2026)
- Inline ``` mid-paragraph (must not open fence)
- Partial/streaming fence with no close (must not eat content)
- Empty fences with and without language tag
- 3-space indented fences (allowed) vs 4-space (not a fence)
- Multiple adjacent blocks
- Bold/italic/inline-code surviving after a fence
- Source-level guards on all 3 patched sites + lead-prefix invariant
Empirical browser verification (live JS, on bug repro):
Before fix: </code></pre>[^\n]<em>|%%[ \t]</em>... ← truncated, italic leak
After fix: <pre><code>...```[^\n]*|%%...</code></pre> ← intact, regex preserved
Tests: 3678 passed (+20 from new test file, was 3658), 0 failures.
Reported-By: Cygnus (Discord)
Relayed-By: @AvidFuturist
Closes#1438
Fix two-layer bug where `/api/session` returned `context_length=0` for
sessions that pre-date #1318, then the frontend silently fell back to
cumulative `input_tokens` and the 128K JS default, producing nonsense
indicators like "100" capped from "890% used (context exceeded), 1.2M
/ 131.1k tokens used".
Empirical impact: 23 of 75 sessions on dev server rendered >100% before
this fix. #1356 fixed the same symptom on the live SSE path but missed
the GET /api/session load path that older sessions go through.
Two-layer fix:
1. Backend (api/routes.py:1295-1313) — resolve context_length via
agent.model_metadata.get_model_context_length() when the persisted
value is 0. Mirrors api/streaming.py:2333-2342.
2. Frontend (static/ui.js:1269) — drop the cumulative `input_tokens`
fallback. When last_prompt_tokens is missing, render "·" + "tokens
used" (existing !hasPromptTok branch) instead of computing a
percentage from the cumulative total.
10 regression tests in tests/test_issue1436_context_indicator_load_path.py
covering both layers + the empty-model edge case (avoids the 256K
default-for-unknown-model trap that get_model_context_length('') returns).
Verified live: claude-opus-4-7 session with input_tokens=5,226,479 now
renders "·" + "5.3M tokens used" instead of "100" + "3987% used".
Reported by @AvidFuturist.
Closes#1436.
Two unrelated UX bugs, both small surgical fixes with regression tests.
Issue #1432 — "+" button doesn't open new chat during streaming
================================================================
Reported by @Olyno: clicking "+" after sending a first message keeps
redirecting to the same chat instead of opening a new blank conversation,
making parallel chats impossible until the first response finishes.
Root cause:
static/boot.js:691 (and the Cmd/Ctrl+K branch at :844) had an empty-session
guard from #1171 that skipped newSession() when message_count===0:
if(S.session && (S.session.message_count||0)===0){
$('msg').focus(); closeMobileSidebar(); return;
}
But during the first user turn of a brand-new session, message_count is
still 0 server-side because the user message hasn't been merged into
s.messages yet. The guard treated that as "empty" and silently dropped
the click, blocking parallel chats for the entire stream duration.
Fix:
Tighten the predicate to also exclude in-flight state:
if(S.session
&& (S.session.message_count||0)===0
&& !S.busy
&& !S.session.active_stream_id
&& !S.session.pending_user_message){
$('msg').focus(); closeMobileSidebar(); return;
}
Same predicate applied to the Cmd/Ctrl+K handler at :844. The in-flight
signal (active_stream_id || pending_user_message) is the same one
_restoreSettledSession() in messages.js:1081 already uses to decide
whether a session is "settled" — keeping both call sites aligned.
Verified end-to-end: with S.busy=true and pending_user_message set, the
old guard returned `block=true` (= the bug), the new guard returns
`block=false` (= fixed). With a truly empty session (no busy, no pending),
both old and new guards still block — preserving #1171 behavior.
Issue #1423 — Profile name field auto-capitalizes typed values
==============================================================
Self-reported (Mac app, May 1 2026): typing `hello` into the New Profile
"Name" field shows `Hello` after blur/autofill, contradicting the
"Lowercase letters, numbers, hyphens, underscores only" hint right next
to it. The form lowercases on submit so stored data is correct, but the
displayed value during typing is misleading.
Root cause:
static/panels.js:2532 had only autocomplete="off":
<input type="text" id="profileFormName"
placeholder="..." autocomplete="off" required>
Missing three attributes that actually prevent the misbehavior:
- autocapitalize="none" — mobile keyboards (iOS Safari, Android Chrome,
WKWebView in the Mac app) auto-capitalize the first letter without it
- autocorrect="off" — Safari runs autocorrect on blur, can rewrite hello→Hello
- spellcheck="false" — desktop browsers may run spellcheck on blur
Fix:
Add the three attributes to profileFormName. Also added to
profileFormBaseUrl since URLs are similarly bad targets for
autocapitalize/autocorrect. profileFormApiKey is type="password" and
already has correct browser behavior.
Verified end-to-end against the live DOM: openProfileCreate() →
getElementById('profileFormName').getAttribute(...) returns the new
attributes correctly, with required preserved.
Tests
-----
3648 passed, 2 skipped, 3 xpassed (was 3640 — added 8 new regression tests
in test_1432_newchat_and_1423_profile_input.py).
One pre-existing test had to be widened: tests/test_mobile_layout.py
test_new_conversation_closes_mobile_sidebar grabbed only the first 500
chars of the btnNewChat handler block to scan for closeMobileSidebar.
The new comment block pushed closeMobileSidebar past that window even
though both calls are still present. Bumped the window to 1500 chars
and the shortcut-block lines from 12 to 24 to match the multi-line guard.
Closes#1432Closes#1423
Reported by @Olyno (#1432, GitHub)
Three fixes from Opus advisor review of stage-261:
1. CRITICAL: dropdown-survives-resize bug. The composerToolsetsDropdown is a
DOM sibling of composerToolsetsWrap, not a child, so CSS hiding the wrap
does not cascade-hide an open dropdown. If a user opens the dropdown at
composer-footer >= 1100px and then opens the workspace panel (or resizes
the window), the dropdown would stay open without a visible anchor.
Fixed in three places (defense-in-depth):
- resize listener: closes dropdown when chip.offsetParent === null
- _positionToolsetsDropdown: closes if chip hidden (defense-in-depth)
- toggleToolsetsDropdown: early-returns if chip hidden (defense against
future #1431 redesign code that might invoke from elsewhere)
2. MEDIUM: display:flex changed to display:block to match sibling wraps
(.composer-profile-wrap, .composer-model-wrap, .composer-reasoning-wrap
all use the natural block display).
3. Added 3 new regression tests to pin all three guards.
Refs #1431, #1433.
Replaces PR #1433 unconditional JS display:none with a CSS @container query
that shows the chip only at composer-footer widths >= 1100px. JS now clears
inline style instead of setting display:none, so the CSS responsive cascade
is the single source of truth. Also removed inline style=\"display:none\" from
index.html so the CSS base rule provides the default-hidden state.
10 regression tests pin the base hide, wide-container show, narrow-container
hide (520px container query), mobile viewport hide (640px @media), JS does
not force display:none, JS clears inline style, /api/session/toolsets and
the dropdown machinery (toggleToolsetsDropdown, _populateToolsetsDropdown)
are preserved.
Refs #1431, #1433.
Combines PR #1428 (UID/GID alignment) with a broader Docker reliability pass
that addresses recurring user reports about compose files not working.
Constituent PR:
- #1428 sunnysktsang - Align agent UID/GID with webui (fixes#1399).
Two- and three-container compose files had agent at UID 10000 (image
default) and webui at UID 1000 (WANTED_UID default), causing permission
denied on shared hermes-home volume. All services now use ${UID:-1000}.
Plus broader Docker UX overhaul:
- All 3 compose files document HERMES_SKIP_CHMOD/HERMES_HOME_MODE escape
hatches inline (the v0.50.254 fix wasn't surfaced for Docker users).
- New .env.docker.example template covering UID/GID, paths, password,
permission handling. UID/GID are uncommented with placeholder values
per Opus advisor (so macOS users don't skim past).
- New docs/docker.md - comprehensive guide: 5-min quickstart, failure
mode table with one-line fixes, bind-mount migration, multi-container
architecture diagram, macOS Docker Desktop VirtioFS note, link to
community sunnysktsang/hermes-suite all-in-one image.
- README Docker section rewritten - clearer quickstart, failure-mode
table, link to docs/docker.md. Stale /root/.hermes references removed.
Plus Opus pre-release advisor MUST-FIX:
- HERMES_HOME_MODE has DIFFERENT semantics in the WebUI vs the agent
image. WebUI: credential-file mode threshold (0640 allows group bits).
Agent: HERMES_HOME directory mode (default 0700). 0640 on a directory
has no owner-execute bit, so the agent can't traverse its own home and
bricks. My initial draft recommended HERMES_HOME_MODE=0640 in agent
service blocks - corrected to 0750 across all 4 surfaces (compose
files, .env.docker.example, docs/docker.md). 3 regression tests pin
the asymmetry.
12 regression tests total in test_v050260_docker_invariants.py.
Full suite: 3627 passed, 0 failed.
Nathan explicitly authorized merge with my own review + Opus only, no
independent review needed.
CI-only failure: test_session_db_close_is_idempotent imported hermes_state
from /home/hermes/.hermes/hermes-agent which exists locally but NOT on the
GH Actions runner that only has the WebUI repo.
Use importlib.util.find_spec to detect availability and pytest.skip when
the agent repo isn't present. The source-level pin in
test_cached_agent_reuse_closes_old_session_db catches revert of the close()
call; the runtime idempotency test is added confirmation when both repos
are co-located.
Local: 5 passed. CI: 4 passed + 1 skipped (idempotency).
PR #1421 (SessionDB WAL handle leak fix on cached-agent reuse path) had a
sibling leak at the LRU eviction site that I caught during pre-review:
api/streaming.py SESSION_AGENT_CACHE.popitem(last=False) was discarding
the evicted entry with `evicted_sid, _ = ...`. The agent's _session_db
was dropped on the floor and only released when GC eventually finalized
the agent — which on a long-running server may be never (cyclic refs,
extension types holding C handles, etc.).
Same fix shape as #1421: capture the evicted entry, call
_evicted_agent._session_db.close() explicitly. SessionDB.close() is
idempotent + thread-safe (with self._lock: if self._conn:), so the
double-close-is-benign property still holds.
5 regression tests in test_v050259_sessiondb_fd_leak.py:
- Source-level: cached-agent reuse path closes before replace
- Source-level: LRU eviction path captures + closes evicted agent
- Behavioral: SessionDB.close() is idempotent (3 calls safe)
- Behavioral: cached-agent reuse with mock — close called exactly once
- Behavioral: LRU eviction with mock — only evicted agent's DB closes
Full suite: 3615 passed, 0 failed.
Nathan explicitly authorized 'just go ahead and merge it as a small release'
since the PR is 9 LOC, focused, has Opus pre-release follow-up + tests, and
matches the empirically-confirmed leak shape (73-handle leak at EMFILE).
PR #1419 (login session TTL + redirect-back + connectivity probe) had a
real bug in the server-side ?next= construction:
quote(path, safe='/:@!$&'()*+,;=') keeps ? and & literal, so:
(a) /api/sessions?limit=50&offset=0 round-trips as /api/sessions?limit=50
— the inner & terminates the outer next= value and offset=0 leaks as
a top-level outer query the login page ignores.
(b) An attacker-controlled path with embedded &next=https://evil.com
injects a second top-level next parameter. Browsers parse first-match
(benign), Python parse_qs parses last-match (the evil URL) — the
parser-divergence is a footgun even though _safeNextPath() in login.js
rejects the actual exploit.
Fix: encode the entire path?query blob with safe='/' so ?, &, = all
percent-encode. The outer next then holds exactly one path-with-query
string the browser auto-decodes once.
6 regression tests in test_v050258_opus_followups.py pin round-trip behavior
across simple paths, single-query, multi-param queries, attacker-injection
neutralization, and the SESSION_TTL=30d constant.
Full suite: 3610 passed, 0 failed.
Opus pre-release advisor caught a 5th issue not covered by my initial
follow-up sweep, this one CRITICAL: PR #1402#493 per-session toolset
override silently no-op'd every time.
Bug: api/streaming.py:1755 called _session_meta.get('enabled_toolsets') on
the result of Session.load_metadata_only(). It returns a Session INSTANCE,
not a dict. .get() raised AttributeError, which the surrounding bare
except swallowed silently. The toolset chip in the UI saved correctly to
disk, but the streaming agent always ran with global toolsets.
Fix: use getattr(_session_meta, 'enabled_toolsets', None).
Two new regression tests:
- Source-level: forbid the .get() / [] dict-access shape.
- Runtime: Session.load_metadata_only must return a Session instance.
Full suite: 3604 passed, 0 failed.
stage-257 batch (PRs #1402 + #1415):
Opus pre-release advisor caught 4 issues in stage-257:
1. MUST-FIX (security): api/oauth.py::_write_auth_json — tmp.replace()
preserves the temp file umask (0644 default), so OAuth access/refresh
tokens landed world-readable on shared systems. Fix: tmp.chmod(0o600)
BEFORE rename, with try/except OSError that warns but does not abort.
2. SHOULD-FIX: _handle_cron_history and _handle_cron_run_detail accepted
job_id as a path component without validation. Mirrors the rollback
path-traversal vector caught in v0.50.255 (#1405). Path() / .. does NOT
normalize. New regex ^[A-Za-z0-9_-][A-Za-z0-9_.-]{0,63}$ with explicit
. / .. rejection.
3. SHOULD-FIX: _handle_cron_history int(offset)/int(limit) raised
ValueError on malformed input → confusing 500. Now try/except + clamp
to (max(0, offset), max(1, min(500, limit))).
4. NIT: same regex applied to _handle_cron_run_detail (defense-in-depth
even though path-resolve check would catch it downstream).
PR #1415 follow-up: 8 pre-existing tests in test_issue1106 and
test_custom_provider_display_name asserted bare model IDs but #1415
changes named-custom-provider IDs to @custom:NAME:model form when active
provider differs. Tests updated to use _strip_at_prefix helper to keep
checking the same invariant in the new shape.
4 regression tests in test_v050257_opus_followups.py + 8 fixed pre-existing
tests. Full suite: 3602 passed, 0 failed.
The li() helper in static/icons.js logs console.warn and returns ''
when an icon name is not in LI_PATHS. Five icon names referenced by
static/*.js were never registered, so their host elements rendered as
empty 0-size buttons / containers despite display:flex.
Five missing icons added:
- 'volume-2' — TTS speaker on every assistant message
(ui.js:3376; regression from #499; surfaced after
#1411 fixed CSS specificity in v0.50.255)
- 'chevron-up' — queue pill chevron (ui.js:2178; the '▲' fallback
only fired when li was undefined, not when it
returned '')
- 'hash' — Insights 'Messages' stat card (panels.js:883)
- 'cpu' — Insights 'Tokens' stat card (panels.js:884)
- 'dollar-sign' — Insights 'Cost' stat card (panels.js:885)
The Insights icons are a fresh regression from #1405 (v0.50.255).
Adds tests/test_issue1413_li_path_coverage.py — three tests:
1. Walk every li('NAME', ...) call across static/*.js, assert NAME
is registered in LI_PATHS. Prevents the entire class of bug.
2. Pin the five icons added by this fix so removal gets a clear
error message.
3. Pin the warn+empty-string contract of li() so the diagnostic
story in the test docstring stays accurate.
Reported by @AvidFuturist via Telegram, 2026-05-01.
Fixes#1413
Opus pre-release advisor caught 4 issues in stage-255 (#1390 + #1405):
1. MUST-FIX: api/rollback.py path-traversal — _checkpoint_root() / ws_hash /
checkpoint did NOT normalize Path() / "../escape", so an authenticated
caller could read or restore from another allowlisted workspace via
../<other-ws-hash>/<sha>. New _validate_checkpoint_id() regex-guards
with ^[A-Za-z0-9_-][A-Za-z0-9_.-]{0,63}$ and rejects . and .. literals.
Both get_checkpoint_diff and restore_checkpoint validate.
2. SHOULD-FIX: redact_session_data perf cliff — the new api_redact_enabled
toggle in #1405 called uncached load_settings() per string, recursed
across messages[] and tool_calls[]. For a 50-message session: hundreds
of disk reads per /api/session response. Now read once at the top and
thread _enabled through via private kwarg.
3. SHOULD-FIX: voice-mode wrong-session TTS — the patched autoReadLastAssistant
fires globally; if the user navigated to a different session between
sending and stream completion, TTS would speak the wrong session\\s reply.
New _voiceModeThinkingSid closure captures S.session.session_id at
thinking-time; _speakResponse bails to _startListening() on mismatch.
4. NIT: rollback._inspect_checkpoint had bare Exception in the except tuple
alongside specific catches, swallowing everything. Now (TimeoutExpired,
OSError) only.
6 regression tests in test_v050255_opus_followups.py. Full suite: 3587 passed,
2 skipped, 3 xpassed.
Two unrelated UX/Settings bugs, both small surgical fixes with regression
tests.
Issue #1409 — TTS toggle has no effect
=======================================
Reported via Discord: ticking Settings → Voice → "Text-to-Speech for
responses" did nothing. The speaker icon never appeared on assistant
messages despite the checkbox saving to localStorage correctly.
Root cause (CSS specificity collision):
static/panels.js _applyTtsEnabled() set
btn.style.display = enabled ? '' : 'none'
on every .msg-tts-btn. The '' branch removes the inline override, after
which the .msg-tts-btn { display:none; } rule from style.css re-hides the
button. Both branches left the icon hidden, so the toggle has been
silently broken since #499 first shipped the TTS feature.
Fix (body-class toggle, Option B from the issue):
- panels.js: _applyTtsEnabled now toggles body.classList('tts-enabled')
- style.css: new compound selector
body.tts-enabled .msg-tts-btn { display:inline-flex; align-items:center; }
- default-hidden rule (.msg-tts-btn{display:none;}) preserved so the icon
stays hidden by default (CSS-only state)
- boot.js paths that already call _applyTtsEnabled(localStorage…) work
unchanged — the new function applies state at the body level instead of
inline-styling individual buttons, so the rule survives renderMd()
re-renders without re-querying every button
Verified end-to-end against live server: getComputedStyle on a probe
.msg-tts-btn returns display:flex when body has tts-enabled, display:none
when it doesn't. Two regression tests in TestIssue1409TtsToggleBodyClass
explicitly check for the body-class shape and forbid the broken inline-style
pattern.
Issue #1410 — Ollama (local) shows "API key configured" when only
Ollama Cloud key is set
=================================================================
Reported via Discord: configuring Ollama Cloud lit up the local Ollama card
too. Both providers were mapped to OLLAMA_API_KEY in api/providers.py
_PROVIDER_ENV_VAR.
Root cause:
api/providers.py:47-48
"ollama": "OLLAMA_API_KEY",
"ollama-cloud": "OLLAMA_API_KEY",
_provider_has_key("ollama") found the value the user set for Ollama Cloud
and returned True. But the runtime code path in
hermes_cli/runtime_provider.py only consumes OLLAMA_API_KEY when the base
URL hostname is ollama.com (Ollama Cloud) — local Ollama is keyless by
default and reaches a custom base URL with no auth. The WebUI was
reporting "configured" for a key local Ollama doesn't even read.
Fix (Option A from the issue body, preferred):
- Drop bare "ollama" from _PROVIDER_ENV_VAR with an inline comment
explaining why
- _provider_has_key("ollama") falls through to the config.yaml branch,
which already supports providers.ollama.api_key for local users who
genuinely need to set a token
- ollama-cloud retains its OLLAMA_API_KEY mapping unchanged
Verified end-to-end against live server with OLLAMA_API_KEY=sk-cloud-key-test
in env: GET /api/providers reports has_key=True only for ollama-cloud, and
has_key=False for bare ollama. Two regression tests in
TestIssue1410OllamaEnvVarBleed cover the bleed-prevention case AND the
"local user with config.yaml api_key still reports configured" case to
guard against over-correction.
Tests
-----
3572 passed, 2 skipped, 3 xpassed (was 3567 — added 5 new regression tests).
Closes#1409Closes#1410
Reported by @AvidFuturist (Discord, May 1 2026)
- popstate handler now refuses to switch sessions mid-stream (S.busy guard)
Mirrors the same guard the cross-tab storage handler had. PR #1392 added
the popstate listener but missed this. Without it, browser Back during
a live stream silently yanks the user out of their turn.
(Opus pre-release advisor finding)
- CHANGELOG entry for v0.50.254 (4 PRs + 1 Opus follow-up)
1 regression test in test_v050254_opus_followups.py.
- Point 4 (security): _resolve_workspace now validates against known workspaces
from workspaces.json to prevent arbitrary path write via restore endpoint
- Point 5 (voice mode): bail out of voice mode on not-allowed, service-not-allowed,
and audio-capture errors instead of infinite retry loop
- Point 1 (locale coverage): added ~40 new English keys as placeholders with
TODO:translate comments in zh, zh-Hant, ko, ru, es, de, pt locales
- Point 2 (test fix): tightened test regex to anchor on branch-indicator class
to avoid collision with _sessionLineageKey helper
- Point 3 (test fix): accept both inline and parentEl variable forms for
body.appendChild pattern in pinned indicator test
All 6 previously failing tests now pass.
Three small fixes from Opus review of the merged stage diff:
1. Strip 9 orphan wiki_* i18n keys (72 lines) from PR #1342 — leaked
from a different branch, zero references outside i18n.js.
2. /branch endpoint: reject non-string session_id with explicit 400
(was raising TypeError → generic 500 from get_session()).
3. /branch endpoint: reject negative keep_count with explicit 400
(Python slice semantics on negative produces 'all but last N',
confusing fork behavior).
Plus tests/test_v050253_opus_followups.py — 3 regression tests pinning
all three fixes.
Verified: 3558 pytest passing.
Pulls in the extra commit pushed to PR #1381 after our initial absorb. Adds a
@media (max-width: 340px) block that compacts gutters (composer-wrap padding,
composer-footer gap, composer-left gap) without shrinking the 44px touch
targets. Plus its regression test.
Verified with apply --check failed but actual apply succeeded — the failure
was due to context drift from our earlier CSS specificity fix; the new lines
landed at the correct location. test_mobile_layout.py: 47 tests passing.
PR #1342's rewrite introduced `del sys.modules['api.config']`, 'api.profiles']`
anti-pattern that breaks tests/test_live_models_ttl_cache.py::test_live_models_cache_is_profile_scoped
(v0.50.252) when run after test_issue1195_*. The pattern is explicitly banned per
~/WebUI/docs/agent-memory/pytest-isolation.md — sibling tests that import api.profiles
later see the wrong (re-imported) module.
Master's version of this test passes 5/5 and uses no del sys.modules calls. The PR's
core /branch feature does NOT depend on this test rewrite — reverting it loses no
coverage of the branching feature.
Fix: gate parent_session_id emission in compact() on truthiness so
sessions without a fork link don't leak parent_session_id: None and
break the v0.50.251 lineage end_reason gating in agent_sessions.py.
The /branch endpoint sets the field on saved forks; everything else
keeps the v0.50.251 sidebar lineage path as the canonical source.
Persist session model_provider separately from model IDs so active/default provider selections like gpt-5.5 remain bare while routing through OpenAI Codex. Keep @provider:model for picker disambiguation and runtime bridging, and preserve explicit OpenRouter plus custom/proxy base_url routing.
Opus pass-2 review of v0.50.251 caught a critical regression in PR
#1375:
The cancel-partial message stored captured tool calls under the
'tool_calls' key. That key is whitelisted by _API_SAFE_MSG_KEYS so
_sanitize_messages_for_api forwarded the entries to the next-turn
LLM call. But the captured entries use the WebUI internal shape
({name, args, done, duration, is_error}) — they don't have the
OpenAI/Anthropic id + function: {name, arguments} envelope. Strict
providers (OpenAI, Anthropic, Z.AI/GLM) would 400 on the malformed
entries. Net effect: the very cancel-then-continue scenario PR
#1375 aimed to improve becomes a hard fail.
Fix:
- Rename the persisted key to '_partial_tool_calls' (underscore-
prefixed private key NOT in _API_SAFE_MSG_KEYS, so sanitize
correctly strips it).
- Update static/messages.js hasMessageToolMetadata check to also
recognize _partial_tool_calls for UI rendering.
- Update test_issue1361_cancel_data_loss.py assertion to check
_partial_tool_calls (and tool_calls as legacy fallback).
Plus 2 NIT fixes from the same Opus review:
NIT 1 (api/profiles.py:153): re.match → re.fullmatch for consistency
with other _PROFILE_ID_RE callers in the codebase. The trailing-
newline footgun ($ matches before final \n in re.match) is now
closed. Without #1373's is_dir() guard, a name like 'valid\n' would
have created a directory named 'valid\n' on Linux. Doesn't escape
<HERMES_HOME>/profiles/ via Path joining, but unintended.
NIT 2 (test_issue798.py): R19j coverage gaps — added trailing-
newline tests, length-boundary tests (64-char valid, 65-char
rejected), single-char minimum, and non-ASCII / Unicode-trick tests.
New regression test (tests/test_pr1375_partial_tool_calls_sanitize.py):
- test_partial_tool_calls_field_not_forwarded_to_llm: pins that
sanitize-for-API strips _partial_tool_calls + reasoning + does
NOT have tool_calls on a partial message
- test_legitimate_tool_calls_are_preserved_for_completed_turns:
pins that real OpenAI-shape tool_calls on completed turns survive
sanitize unchanged
Tests: 3486 passing (3484 → 3486, +2 sanitize tests).
Adds two more contributor PRs to the v0.50.251 batch per user
directive (per-PR review + Opus review for #1373; #1375 was clean
ship-on-sight).
#1375 (@bergeouss, +382 LOC, all CI green) — fixes#1361 paid-token
data loss on Stop/Cancel. Mirrors the existing STREAM_PARTIAL_TEXT
pattern from #893: adds STREAM_REASONING_TEXT and STREAM_LIVE_TOOL_CALLS
shared dicts populated during streaming and read by cancel_stream().
Also fixes the §C reasoning-only-creates-no-message gap where the
strip-thinking-blocks regex returned empty string and the if-guard
skipped the partial append. 8 regression tests covering all 3
sections plus tools+text combinations.
#1373 (@bergeouss, +105 LOC, had CI failures pre-fix) — fixes#1195
new-profile-routes-to-default. The is_dir() guard in
get_hermes_home_for_profile() caused new profiles (no session yet)
to silently route every session back to the default profile until
the directory existed on disk. Removed the guard; profile path is
now returned unconditionally.
Pre-release fix for #1373's CI failures: the change flipped two
behaviors pinned by tests in #798:
- R19c (test_get_hermes_home_for_profile_falls_back_for_missing_profile)
asserted nonexistent → base. Renamed and updated to assert the
new always-return-profile-path behavior.
- R19j (test_get_hermes_home_for_profile_rejects_path_traversal)
asserted that valid-but-nonexistent profile names → base. Updated
to assert profile-scoped path. Also updated docstring: the
_PROFILE_ID_RE regex is now the SOLE defense against path
traversal (previously is_dir() was a defense-in-depth layer);
verified each known-bad shape still returns base.
Tests: 3484 passing (3471 → 3484, +13).
When a user switched profiles and created a new session, the session
was saved to the default profile directory instead of the active
profile directory — because get_hermes_home_for_profile() silently
fell back to _DEFAULT_HERMES_HOME when the profile directory didn't
exist yet on disk.
Root cause: api/profiles.py:156 had `if profile_dir.is_dir(): return
profile_dir; return _DEFAULT_HERMES_HOME`. New profiles (no session
yet, so no dir) routed every session back to default.
Fix: remove the is_dir() guard, return the profile path
unconditionally. The profile directory is created on first use by
the agent/session layer.
5 regression tests in tests/test_issue1195_session_profile_routing.py:
existing-profile, non-existent-profile (the core fix), None, empty-
string, 'default' all return the expected path.
Co-authored-by: bergeouss <bergeouss@users.noreply.github.com>
Three distinct data-loss paths fixed:
§A — Reasoning text was accumulated in a thread-local _reasoning_text
inside _run_agent_streaming. cancel_stream() never saw it because it
went out of scope when the thread was interrupted. Now mirrored to a
new shared dict STREAM_REASONING_TEXT keyed by stream_id, populated
in on_reasoning() and the reasoning branch of on_tool(), read in
cancel_stream().
§B — Live tool calls in thread-local _live_tool_calls were similarly
invisible to cancel_stream(). Now mirrored to STREAM_LIVE_TOOL_CALLS
on tool.started + tool.completed.
§C — Reasoning-only streams produced no partial message because the
thinking-block regex strip returned empty string and the `if _stripped:`
guard skipped the append. Now appends the partial message when EITHER
content text, reasoning trace, OR tool calls exist.
Mirrors the existing STREAM_PARTIAL_TEXT pattern from #893 exactly:
same dict creation in _run_agent_streaming, same _live_config fallback
in cancel_stream, same cleanup in _periodic_checkpoint.
8 regression tests in tests/test_issue1361_cancel_data_loss.py
covering all three sections plus tools+text combinations.
Co-authored-by: bergeouss <bergeouss@users.noreply.github.com>
Opus pre-release findings on #1370 applied:
SHOULD-FIX 1: Tightened parent_session_id exposure to only emit when
the parent's end_reason is in {compression, cli_close}. Without this,
two distinct WebUI sessions sharing a non-continuation parent (e.g.
'user_stop') would get clustered by frontend's _sessionLineageKey
(which falls through to parent_session_id when _lineage_root_id is
missing) and incorrectly collapsed into a single sidebar row.
Updated assertions in:
- tests/test_session_lineage_metadata_api.py::
test_non_compression_state_db_parent_does_not_create_sidebar_lineage
- tests/test_pr1370_lineage_metadata_perf_and_orphan.py::
test_non_compression_parent_does_not_extend_lineage
SHOULD-FIX 2: Chunked the IN-clause to 500 vars to stay under
SQLITE_MAX_VARIABLE_NUMBER. Python 3.9 ships sqlite 3.31 with the
default limit of 999. A power user with 2000+ sessions in the
sidebar would hit OperationalError, the silent except-wrapper would
swallow it, and lineage collapse would never work. Added
test_in_clause_chunked_for_large_session_set with SQL interception
to lock the invariant in source.
PR addition (per user directive — Opus + my review, no second
independent review round needed for combined batch):
#1372 from @NocGeek — fix: persist manual cron run results.
Self-contained 89 LOC fix split out from the held #1352. Mirrors the
scheduled-cron path (cron/scheduler.py:1334-1364) exactly: saves
output, marks job complete, treats empty response as soft failure
with matching error string. 2 behavioral tests using sys.modules
monkeypatch to mock cron.scheduler.run_job. CI not yet attached
because branch is brand-new; ran the new tests + adjacent suites
locally — all pass.
Final test count: 3471 passing, 0 failed.
Also adds 2 more regression tests for the perf-fix invariants:
- test_in_clause_chunked_for_large_session_set
- test_two_children_sharing_non_continuation_parent_not_collapsed
Manual WebUI cron runs previously called cron.scheduler.run_job(job)
and then only cleared the in-memory running flag. That meant output
could be dropped and job metadata like last_run_at / last_status was
not updated after a manual run.
This PR matches the scheduled cron path (cron/scheduler.py:1334-1364)
exactly:
- Save manual-run output via save_job_output
- Mark manual runs complete via mark_job_run
- Treat empty final_response as a soft failure with the same error
string as the scheduled path
- Record manual-run failures in job metadata via mark_job_run(False)
- Keep _run_cron_tracked self-contained for worker-thread execution
Includes 2 behavioral regression tests using monkeypatch.setitem on
sys.modules to mock cron.scheduler.run_job + cron.jobs helpers — the
right test pattern (exercises the real _run_cron_tracked code path).
Split out from #1352 (the larger profile-aware-cron-panel PR that's
on hold) per pre-release-review feedback. Self-contained, doesn't
touch the held PR's profile-filtering scope.
Co-authored-by: NocGeek <NocGeek@users.noreply.github.com>