_gateway_root_pid_path() unconditionally returned <hermes_root>/gateway.pid.
Profile-scoped gateways (started with --profile <name> or via active_profile)
write their runtime files under <hermes_root>/profiles/<name>/ instead of the
root, so the root-level path never existed.
build_agent_health_payload() therefore always received a non-existent pid_path,
fell through to the stale root-level gateway_state.json, and returned alive=None.
This caused the cron/scheduled-jobs page to display "Gateway not configured" even
when a gateway was actively running.
Fix: after failing to find a root-level gateway.pid, fall back to the active
profile directory via get_active_hermes_home(). Root-level wins when it exists,
so deployments that do write there are unaffected. Errors from profile lookup are
swallowed and the root path is returned, preserving the previous safe default.
Adds five focused unit tests covering the new fallback, the priority rule, and
the error-handling path.
Inline fixes for 4 of 5 Opus SHOULD-FIX items before tag:
1. /api/auth/status now gates passkeys_enabled / passwordless_enabled on
_passkey_feature_flag_enabled() — when flag is off, status reports
no credentials even if passkeys.json has legacy entries. New
passkey_feature_flag field added to the response for the frontend.
2. Settings → System Passkeys block (passkeysSettingsBlock) now starts
display:none and loadPasskeys() reveals it only when the server
confirms passkey_feature_flag === true AND /api/auth/passkeys
doesn't return {disabled: true}. Stops the broken-affordance trap
where users would see Add passkey → click → 404.
3. /api/settings/save now refuses to set passwordless mode when the
passkey feature flag is off. Closes the auth-bypass path Opus flagged:
user goes passwordless while flag on → admin unsets flag → restart
serves the WebUI fully unauthenticated.
4. CHANGELOG entries added for PR #2685 (replayed-context dedup +
per-turn metering cap) and PR #2824 (Stop server affordance,
relocated to Settings) — both PRs had functional changes but no
release-notes entries. Also enriched the rate-limit detail on the
#2739 entry (30 events / 60s / 4KB body cap).
Deferred to follow-up issue (#5 in Opus review):
- Live tool metering cumulative cap across many tool calls — non-trivial
refactor of _bump_live_prompt_estimate, will be a separate PR
Per the stage-batch14 ship plan, passkey/WebAuthn support is shipped
opt-in default-off behind an explicit feature flag so deployments can
disable the entire surface (UI + endpoints + credential storage) without
needing to delete code.
Enable via either:
- HERMES_WEBUI_PASSKEY=1 environment variable, OR
- webui_passkey_enabled: true in config.yaml
With the flag off:
- are_passkeys_enabled() returns False even if credentials exist
- is_auth_enabled() falls back to password-only checking
- /login renders password-only (no passkey button)
- All 6 /api/auth/passkey/* endpoints return 404 with a clear message
- Settings → System → Passkeys section is hidden
Mirrors the #2527 notes-drawer flag shape (env-or-config, truthy parse).
Auth is high-stakes; opt-in lets us land the code while keeping default
deployments on the well-tested password-only path.
Touches: api/auth.py (new _passkey_feature_flag_enabled helper, gated
are_passkeys_enabled), api/routes.py (6 endpoint guards).
Move POST /api/shutdown routing after the CSRF check so drive-by
cross-origin requests cannot bring down a dev server with auth off.
Also replace os._exit(0) with os.kill(os.getpid(), signal.SIGINT)
so atexit handlers and pending session writes run during shutdown.
Add a power button (⏻) in the title bar that gracefully stops the
WebUI server process from the browser.
- api/routes.py: POST /api/shutdown endpoint with threaded os._exit(0)
- static/boot.js: shutdownServer() with confirm prompt, BroadcastChannel
cross-tab notification, and _showServerStopped() placeholder UI
- static/index.html: shutdown button HTML in title bar (after reload btn)
- static/style.css: .app-titlebar-shutdown styles, hover turns red
Opus pre-release advisor caught a #2762 parity gap. api/streaming.py:5078
(_run_agent_streaming worker, background thread) correctly passes
profile= to sync_session_usage post-#2827. But the SECOND production
call site at api/routes.py:9007 (_handle_chat_sync, HTTP thread) does
not. Safe TODAY (HTTP thread sets TLS correctly), but it's a
defense-in-depth gap: anyone wrapping that handler in a worker pool
later silently regresses the fix. Closes the parity gap so the
threat-model invariant holds regardless of future threading changes.
Opus pre-release advisor MUST-FIX patched inline:
- api/routes.py:7290-7308 _handle_folder_download: add Connection: close
header before end_headers() to satisfy HTTP/1.1 framing on the on-the-fly
ZIP stream. Without it, post-#2836 protocol_version bump leaves clients
hanging waiting for the next pipelined response after central-directory
bytes finish. Opus verified this is the ONLY streaming response #2836
missed — all other paths (j/t helpers, 12 hand-written responses, 8 SSE
endpoints, auth flow) are already correctly framed.
Cherry-picked via 3-way apply (rebase had failed on static/index.html
conflict when applied via rebase commit chain; 3-way of the net delta
against stage HEAD applied cleanly).
Co-authored-by: mccxj <mccxj@github.users.noreply.github.com>
Agent reviewer 'LGTM. Ship it.'
- Bug A fix: _session_field helper handles dict-vs-object snapshot in pin-limit check
- Bug B fix: removed stale client-side pinLimitReached short-circuit
- Bug C recovery: renderSessionList() on pin/unpin failure refreshes from server
Co-authored-by: franksong2702 <146128127+franksong2702@users.noreply.github.com>
nesquena APPROVED 2026-05-22. Cherry-picked onto post-v0.51.127
master via 3-way apply. Resolved api/routes.py conflict: master had
the inline correctness fix from the deep-review iteration; PR
refactors it into _metadata_only_message_summary() helper. Took the
helper AND added profile= threading (post-#2827 master adds
profile-aware state.db reads). Kept master's pre-existing
test_api_session_reload_drops_stale_cached_user_tail_after_saved_assistant
alongside the PR's new test_metadata_fast_path_matches_reconciliation_for_restamped_replays.
Co-authored-by: dobby-d-elf <dobby.the.agent@gmail.com>
MUST-FIX:
- tests/test_2735_open_in_vscode.py: bump expected open_in_vscode locale
counter from 10 to 11 (Turkish locale added in #2772). The bump fell
out of an in-rebase test edit but never got committed; tagging without
this would have shipped a failing test in the release commit.
SHOULD-FIX inline:
- api/updates.py: case-D drift in _select_apply_compare_ref. The original
#2855 fix used latest_tag in the past-tag predicate; the check side
uses current_tag (HEAD's nearest reachable tag) plus a 'behind == 0'
gate. They drift when HEAD is on an OLDER release tag with commits on
top AND a NEWER tag exists ('case D'): check correctly suggests
advancing to the newer tag, but apply fell through to origin/<branch>.
Mirror the check-side predicate exactly. Adds regression test
test_select_apply_compare_ref_case_d_older_tag_with_commits_and_newer_tag_exists.
- static/messages.js: post-await race guard in _restoreSettledSession.
stream_end without preceding 'done' enters the settlement path, awaits
/api/session, then sets _streamFinalized=true. If a late 'done' event
arrives during that await, it sees _streamFinalized still false and
double-runs the finalize. The guard returns early when done won the
race, avoiding double renderMessages() + double notification.
- server.py: CORS preflight Access-Control-Allow-Methods now includes PUT.
#2776 wired PUT into the router for /api/mcp/servers/{name} but didn't
update the OPTIONS response. Same-origin only in practice, but cosmetic
completeness for CORS-aware deployments.
Opus advisor verdict: all 5 risk areas reviewed, 1 MUST-FIX + 3 SHOULD-FIX
all addressed inline. Net: +69/-9, no new architecture, no behavior risk.
Add a complete Turkish locale to the WebUI and login page so users can
select Türkçe in Settings, with speech recognition via tr-TR.
Co-authored-by: Cursor <cursoragent@cursor.com>
Add `PATCH /api/mcp/servers/{name}` endpoint that accepts `{"enabled": bool}`,
updates `mcp_servers.<name>.enabled` in config.yaml, and calls `reload_config()`.
Mirrors the existing DELETE pattern.
Also wire the previously-defined-but-unrouted `_handle_mcp_server_delete` into
`handle_delete`, and `_handle_mcp_server_update` into a new `handle_put` +
`do_PUT` in server.py — fixing a pre-existing bug where those handlers existed
but were never reachable over HTTP.
UI: add a toggle button in each MCP server row in the system settings panel
(panels.js). Clicking it calls PATCH and reloads the list. Toggle button is
styled with `.mcp-toggle-enabled` / `.mcp-toggle-disabled` CSS classes. The
`toggle_supported` flag in the list response is now `True`.
i18n: add 5 new keys (`mcp_enable_server`, `mcp_disable_server`,
`mcp_enabled_toast`, `mcp_disabled_toast`, `mcp_toggle_failed`) to all 9
non-English locales (English values as placeholder translations).
Tests: add `TestMcpToggle` class with 7 tests covering disable, enable,
404-not-found, empty name, missing field, response payload, and URL-encoded name.
Update `test_empty_config` and visibility panel assertions to reflect
`toggle_supported: True` and the new toggle button in panels.js.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes#2846. After PR #2758 (the #2653 fix) the update check correctly
falls through to the branch comparison when HEAD has moved past the
latest `v*` tag — so the banner reports the real commit count against
`origin/<branch>`. But `_select_apply_compare_ref` was never updated to
mirror that decision: as long as any `v*` tag exists, it returns
`tags[0]`, even when HEAD is far past it.
Result for everyone running hermes-agent past `v2026.5.16` (i.e. anyone
on agent master between tagged releases):
1. Banner: `Agent (origin/main): 254 updates available` ← correct
2. User clicks Update Now
3. `_select_apply_compare_ref` picks `v2026.5.16` because tags exist
4. `git pull --ff-only origin v2026.5.16` — no-op (HEAD is already past it)
5. `_schedule_restart()` fires anyway, server bounces
6. Next check still reports 254 behind — banner reappears unchanged
`apply_force_update` had the same bug, except worse: `git reset --hard
v2026.5.16` would have actively rewound the user's checkout 254 commits.
The root cause is the same bug class as #2653 — two parallel paths
(`_check_repo_release` and `_select_apply_compare_ref`) that should make
the same decision but didn't. Pre-fix, the "is HEAD past the latest
tag?" predicate lived inline inside `_check_repo_release` only.
Fix
---
Extract `_head_is_past_latest_tag(path, current_tag)` and have both
paths consult it. When HEAD is past the latest tag:
- check path: release check returns None → branch check runs (#2653,
unchanged behaviour, just refactored)
- apply path: falls through to upstream / `origin/<branch>`, never the
stale tag (#2846, new behaviour)
Tests
-----
- `test_select_apply_compare_ref_uses_tag_when_head_is_on_tag` —
unchanged behaviour pinned: HEAD exactly on tag → advance to tag.
- `test_select_apply_compare_ref_falls_through_when_head_is_past_tag` —
the #2846 repro: HEAD = v2026.5.16 + 608 commits → advance to
`origin/main`, not the tag.
- `test_select_apply_compare_ref_no_tags_uses_upstream` — unchanged.
- `test_select_apply_compare_ref_no_tags_no_upstream_uses_default_branch`
— unchanged.
- `test_check_and_apply_paths_agree_when_head_is_past_tag` — symmetry
test, ensures the two paths can't drift apart again.
All 21 tests in `tests/test_updates.py` pass locally (16 existing + 5
new).
Refs #2846, #2653.
Fixes#2853. The `_terminal_shell_preexec_fn` added in `71d8a8fb` called
`prctl(PR_SET_PDEATHSIG, SIGTERM)` so orphaned PTY shells would die when
the WebUI process crashed. But that signal is **per-thread**, not
per-process, and WebUI runs `ThreadingHTTPServer`: every HTTP request is
handled in its own short-lived worker thread.
Flow that broke every Linux user:
1. User clicks the terminal toggle → frontend hits `POST /api/terminal/start`.
2. ThreadingHTTPServer spins up a worker thread to handle that one request.
3. The worker thread calls `subprocess.Popen(..., preexec_fn=...)`.
4. The shell calls `prctl(PR_SET_PDEATHSIG, SIGTERM)` in its preexec_fn.
Its registered "parent" is now the WebUI worker thread that called Popen.
5. The handler returns its JSON response and the worker thread exits.
6. The kernel sees the pdeathsig-parent thread has died and sends SIGTERM
to the PTY shell. The shell dies within ~10 ms of being created.
7. The reader loop sees EIO on the master FD, emits `terminal_closed`, and
the frontend writes `[terminal closed]`.
macOS users were unaffected because `libc.prctl` doesn't exist there —
`ctypes.CDLL(None)` returns a libc handle, `libc.prctl` raises
`AttributeError`, the bare-`except` swallows it, and the shell starts
with no pdeathsig configured.
Empirical verification on this Linux host (real PTY + `subprocess.Popen`
inside a `threading.Thread` that joins immediately):
with preexec_fn → proc.poll() == -15 (SIGTERM), master FD returns EIO
without preexec_fn → proc.poll() == None (alive), master FD returns "HELLO\\r\\n"
Same shell, same PTY, same threading topology as WebUI.
Fix
---
Drop the `preexec_fn` entirely. The orphan-shell-on-crash case the original
PR was navigating is rare for self-hosted single-user installs, and the
existing `atexit.register(close_all_terminals)` + explicit `close_terminal`
paths cover graceful shutdown. A future fix (option B in the issue) can
re-introduce pdeathsig pinned to a long-lived supervisor thread, but that
is a follow-up — this PR is the smallest unbricks-Linux-today change.
Tests
-----
- Invert `test_terminal_shell_uses_parent_death_signal_preexec` →
`test_terminal_shell_does_not_use_pdeathsig_preexec`: asserts
`preexec_fn` is NOT in the Popen kwargs.
- Add `test_pty_shell_survives_when_spawning_thread_exits`: spawns a
real PTY shell via `start_terminal` from a worker thread, waits for
the worker to join, asserts the shell is still alive after a half-second
grace window. This is the contract the original tests never exercised.
- Update `test_terminal_module_registers_graceful_shutdown_reaper` to
refuse re-introduction of the preexec_fn or the `libc.prctl(1, SIGTERM)`
call (treats either as a regression).
All 27 terminal-related tests pass locally.
Refs #2853
Right-click any workspace file, folder, or root now shows
'Open in VS Code' alongside the existing Reveal in File Manager action.
- POST /api/file/open-vscode: resolves path via safe_resolve, finds VS
Code via shutil.which() with fallbacks for macOS (/usr/local/bin/code,
app bundle CLI), Linux (/usr/bin/code, /snap/bin/code), and Windows
(%LOCALAPPDATA% and %PROGRAMFILES% user/system installs). Returns a
descriptive error if not found rather than a bare OS error.
- Optional vscode block in config.yaml: command (default: code),
host_path_prefix + container_path_prefix for Docker path mapping.
- i18n: open_in_vscode and open_in_vscode_failed translated in all 10
locales (it, ja, ru, es, de, zh-CN, zh-TW, pt, ko).
- 26 tests in tests/test_2735_open_in_vscode.py covering source wiring,
command resolution, i18n completeness, and live endpoint error paths.