Two low-severity follow-ups from Opus regrounding review:
1. The IPv6 unique-local fc00::/7 check was `h.startswith('fc') or
h.startswith('fd')`, which was too loose. It would also classify hostnames
that merely begin with those letters, such as 'fdsa.test', as 'local' and
silently let them through the block. Tightened to a regex match for canonical
IPv6 syntax (`f[cd][0-9a-f]{0,2}:`) so only actual IPv6 addresses
match. Same fix in both tests/conftest.py and server.py.
2. test_allow_outbound_network_fixture_unblocks was technically
self-passing: it tried to connect to a *.invalid hostname, which is
in the allow-list, so the real socket.create_connection would run
regardless of whether the fixture toggled the block. Replaced with
a public-IP-based test that actually proves the toggle works, plus
a paired test_block_is_active_outside_the_fixture sanity test that
proves the block is on without the fixture.
Both follow-ups noted by Opus advisor as 'defer-OK' but trivial fixes
so landing them in this batch.
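A minimal sketch of the difference between the two checks. The function names are hypothetical, and anchoring the quoted pattern with re.match is an assumption; the actual helpers in tests/conftest.py and server.py may differ.

```python
import re

# Pattern quoted in the commit message; anchoring at the start is assumed.
_ULA_RE = re.compile(r"f[cd][0-9a-f]{0,2}:")


def is_ula_loose(host: str) -> bool:
    """Old check: any host starting with 'fc' or 'fd' counts as unique-local."""
    h = host.lower()
    return h.startswith("fc") or h.startswith("fd")


def is_ula_strict(host: str) -> bool:
    """New check: the first hextet must look like IPv6 syntax (fcXX: / fdXX:)."""
    return _ULA_RE.match(host.lower()) is not None


for host in ("fd00::1", "fc12:3456::1", "fdsa.test"):
    print(host, is_ula_loose(host), is_ula_strict(host))
```

The loose check accepts all three hosts; the strict one rejects the hostname because no ':' follows the leading hex digits.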
Tests should not reach the public internet. Before this commit, an
accidentally-leaking outbound socket from the test_server fixture (real
TLS handshakes to Anthropic / Amazon / OpenRouter, sometimes triggered
by SDK-init paths that found a credential the credential-strip allowlist
missed) was adding 60+s of wall-time to a 100s test run and creating a
class of flaky failures.
This installs a default-deny socket-block at two layers:
1. Pytest process, via tests/conftest.py module-level monkey-patch on
socket.create_connection + socket.socket.connect. Loopback / RFC1918
private / link-local / RFC2606 reserved-TLD destinations pass through;
anything else raises OSError("hermes test network isolation: outbound
to ... blocked"). Tests that legitimately need real outbound opt back
in via the new `allow_outbound_network` fixture (no current callers).
2. Test_server subprocess (server.py), via a HERMES_WEBUI_TEST_NETWORK_BLOCK=1
environment-variable-gated guard at the top of server.py. tests/conftest.py
sets the env var on every test_server spawn. Without this, the subprocess
could make outbound connections that the pytest-side block can't see (which
is exactly what was happening: `ss -tnp` showed the server.py child with
ESTAB sockets to [2607:6bc0::10]:443).
In production the env var is unset, so the guard is a no-op.
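The default-deny guard could look roughly like the sketch below. The helper names, the reserved-suffix list, and the exact error text after "outbound to" are assumptions for illustration; note also that ipaddress's is_private covers more ranges than RFC1918 alone.

```python
import ipaddress
import os
import socket

# Assumed allow-list of reserved names (RFC 2606 TLDs plus localhost).
_RESERVED_SUFFIXES = (".invalid", ".test", ".example", ".localhost")


def is_allowed_destination(host: str) -> bool:
    """True for loopback, private, link-local, or reserved-TLD hosts."""
    h = host.strip("[]").lower()
    try:
        ip = ipaddress.ip_address(h)
    except ValueError:
        # Not an IP literal: only reserved names pass.
        return h == "localhost" or h.endswith(_RESERVED_SUFFIXES)
    # is_private is broader than RFC1918 (it also includes fc00::/7).
    return ip.is_loopback or ip.is_private or ip.is_link_local


_real_create_connection = socket.create_connection


def guarded_create_connection(address, *args, **kwargs):
    """Drop-in replacement for socket.create_connection with default deny."""
    host = address[0]
    if not is_allowed_destination(host):
        raise OSError(
            f"hermes test network isolation: outbound to {host} blocked"
        )
    return _real_create_connection(address, *args, **kwargs)


# Server-side gating: only patch when the test env var is set.
if os.environ.get("HERMES_WEBUI_TEST_NETWORK_BLOCK") == "1":
    socket.create_connection = guarded_create_connection
```

A full version would also wrap socket.socket.connect, as the commit describes; that part is omitted here for brevity.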
Companion changes:
- test_dns_resolution_failure refactored to mock socket.getaddrinfo
raising gaierror, instead of relying on a real DNS lookup of a
*.invalid hostname. The test was the one outlier that genuinely
exercised real DNS; mocking matches what every other probe-error test
in the same file already does.
- New tests/test_conftest_network_isolation.py with 9 adversarial
tests proving the block fires for public IPs (including the exact
Anthropic IPv6 and Amazon IPv4 destinations we observed leaking),
the allow-list passes loopback / RFC1918 / link-local / reserved-TLDs,
and the opt-in fixture re-enables real outbound when needed.
Test suite: 5,120 → 5,192 (+72 net new from this commit + the regression
tests in the companion commits). Wall time: 161s → 95s on the same
hardware. No remaining outbound from any test path.
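The mocked DNS-failure pattern mentioned under the companion changes might look like the sketch below. probe_host is a hypothetical stand-in for the real probe code, which is not shown in this commit.

```python
import socket
from unittest import mock


def probe_host(hostname: str) -> str:
    """Hypothetical probe: classify the outcome of resolving a hostname."""
    try:
        socket.getaddrinfo(hostname, 443)
    except socket.gaierror:
        return "dns_failure"
    return "resolved"


def test_dns_resolution_failure():
    # getaddrinfo is replaced for the duration, so no real lookup happens.
    with mock.patch("socket.getaddrinfo", side_effect=socket.gaierror("mocked")):
        assert probe_host("does-not-exist.invalid") == "dns_failure"
```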
Detect IPv6 addresses (containing ':') in QuietHTTPServer.__init__ and set address_family to AF_INET6 before socket creation, fixing EAFNOSUPPORT when binding to :: or ::1.
Also updates the loopback check to recognize ::1 and the container warning to mention :: as the IPv6 equivalent of 0.0.0.0. Documents IPv6 usage in HERMES_WEBUI_HOST env var description.
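A rough sketch of the address-family selection, assuming the server subclasses http.server.HTTPServer. The class and helper names here are illustrative, not the actual server.py code.

```python
import socket
from http.server import HTTPServer


def address_family_for(host: str) -> int:
    """Pick AF_INET6 for IPv6 literals such as '::' or '::1'."""
    return socket.AF_INET6 if ":" in host else socket.AF_INET


def is_loopback_host(host: str) -> bool:
    """Loopback check that also recognizes the IPv6 form."""
    return host in ("127.0.0.1", "localhost", "::1")


class QuietHTTPServerSketch(HTTPServer):
    def __init__(self, server_address, handler_class, bind_and_activate=True):
        # Must be set before the parent constructor creates the socket.
        self.address_family = address_family_for(server_address[0])
        super().__init__(server_address, handler_class, bind_and_activate)
```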
- Delete QuietHTTPServer.server_bind() override entirely:
TCP_KEEP* setsockopts on the listening socket are no-ops without
SO_KEEPALIVE, and SO_REUSEADDR=1 is already set by the parent class.
The actual fix lives entirely in Handler.setup().
- Restructure Handler.setup() with per-platform branches so
SO_KEEPALIVE=1 is always applied before timing params, and macOS
(TCP_KEEPALIVE) gets keepalive instead of aborting on TCP_KEEPIDLE.
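The per-platform ordering can be sketched as below. The function name and the timing values (60/10/5) are assumptions; only the ordering (SO_KEEPALIVE first, then platform-specific tuning) comes from the description above.

```python
import socket
import sys


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn keepalive on first, then tune it with whatever the OS offers."""
    # Without SO_KEEPALIVE the TCP_KEEP* options have no effect.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    if sys.platform == "darwin":
        # macOS names the idle timer TCP_KEEPALIVE instead of TCP_KEEPIDLE.
        idle_opt = getattr(socket, "TCP_KEEPALIVE", None)
    else:
        idle_opt = getattr(socket, "TCP_KEEPIDLE", None)

    tuning = [
        (idle_opt, idle),
        (getattr(socket, "TCP_KEEPINTVL", None), interval),
        (getattr(socket, "TCP_KEEPCNT", None), count),
    ]
    for opt, value in tuning:
        if opt is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
        except OSError:
            # Best effort: keepalive itself stays enabled.
            pass
```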
The PR title and body correctly say 'Closes #1558' but every code comment,
the test file name, error-message strings, docstrings, and the original
commit body referenced #1557 instead. Independent reviewer flagged this:
> The 17 wrong references won't auto-close issue #1558 from the commit
> message — and the test file name will be misleading for future archeology.
> Worth a one-pass s/#1557/#1558/g (and rename test file →
> test_metadata_save_wipe_1558.py) before merge so the artifacts agree
> with reality.
This commit:
- Renames tests/test_metadata_save_wipe_1557.py → test_metadata_save_wipe_1558.py
- Replaces 17 #1557 references with #1558 across:
- tests/test_metadata_save_wipe_1558.py (7 refs)
- api/models.py (5 refs in Session.save guard + backup safeguard comments)
- api/routes.py (2 refs in _clear_stale_stream_state docstring + log)
- api/session_recovery.py (3 refs)
- server.py (3 refs in startup self-heal block)
Verified: 6/6 tests in tests/test_metadata_save_wipe_1558.py pass
with the renamed file + updated references.
v0.50.279 introduced api.routes._clear_stale_stream_state() (#1525) which
calls session.save() to clear stale active_stream_id/pending_* fields. The
helper is called from /api/session and /api/session/status — both of which
load the session with metadata_only=True. Session.load_metadata_only()
synthesizes a stub with messages=[] (its whole purpose: fast metadata read
without parsing the 400KB+ messages array). Session.save() unconditionally
writes self.messages to disk via os.replace(), so saving a metadata-only
stub atomically overwrites the on-disk JSON with messages=[], wiping the
entire conversation.
Production trigger: every SSE reconnect cycle after a server restart polls
/api/session/status, which fans out to _clear_stale_stream_state, which
saves the metadata-only stub. The user reported losing 1000+ message
conversations and seeing 'Reconnecting…' loops on every prompt — the
reconnect loop kept the cycle running until the conversation was empty.
Fix: three layers, defense in depth.
(1) api/models.py: load_metadata_only() now sets _loaded_metadata_only=True
on the returned stub. Session.save() raises RuntimeError if that flag
is set — a hard guard so any future caller making the same mistake
cannot wipe data, only crash visibly.
(2) api/routes.py: _clear_stale_stream_state() now detects the metadata-only
flag and re-loads the full session with metadata_only=False before
mutating persisted state. The full-load path also runs
_repair_stale_pending() which independently clears the stream flags,
so the explicit clear becomes a no-op in most cases — but messages
stay intact.
(3) api/models.py + api/session_recovery.py: every save() that would
SHRINK the messages array (the precise failure shape of #1557) first
snapshots the previous file to <sid>.json.bak. Server.py runs
recover_all_sessions_on_startup() at boot — any session whose live
JSON has fewer messages than its .bak is restored automatically.
Idempotent on clean state. Backup overhead is zero on the normal
grow-the-conversation path.
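Layers (1) and (3) can be illustrated with a simplified sketch. SessionSketch and restore_if_shrunk are hypothetical stand-ins for the real Session class and recovery helper, and the file layout ({"messages": [...]}) is assumed. Unlike the real code, this sketch re-reads the previous file on every save to keep the example short.

```python
import json
import os
import shutil
import tempfile
from pathlib import Path


class SessionSketch:
    def __init__(self, path, messages, metadata_only=False):
        self.path = Path(path)
        self.messages = messages
        self._loaded_metadata_only = metadata_only

    @classmethod
    def load(cls, path, metadata_only=False):
        if metadata_only:
            # Stub with messages=[]: the messages array is never parsed.
            return cls(path, [], metadata_only=True)
        data = json.loads(Path(path).read_text())
        return cls(path, data["messages"])

    def save(self):
        # Layer 1: a metadata-only stub must never reach disk.
        if self._loaded_metadata_only:
            raise RuntimeError("refusing to save a metadata-only session stub")
        # Layer 3: snapshot the previous file before a shrinking write.
        if self.path.exists():
            previous = json.loads(self.path.read_text())
            if len(self.messages) < len(previous.get("messages", [])):
                shutil.copy2(self.path, self.path.with_name(self.path.name + ".bak"))
        tmp = self.path.with_name(self.path.name + ".tmp")
        tmp.write_text(json.dumps({"messages": self.messages}))
        os.replace(tmp, self.path)


def restore_if_shrunk(path) -> bool:
    """Restore from .bak when the live file has fewer messages."""
    path = Path(path)
    bak = path.with_name(path.name + ".bak")
    if not bak.exists() or not path.exists():
        return False
    live = json.loads(path.read_text())
    saved = json.loads(bak.read_text())
    if len(live["messages"]) < len(saved["messages"]):
        shutil.copy2(bak, path)
        return True
    return False


with tempfile.TemporaryDirectory() as d:
    path = Path(d) / "s1.json"
    path.write_text(json.dumps({"messages": list(range(1000))}))
    stub = SessionSketch.load(path, metadata_only=True)
    try:
        stub.save()
    except RuntimeError as exc:
        print("blocked:", exc)
    print(len(json.loads(path.read_text())["messages"]))
```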
Reproducer (master): test_metadata_only_save_does_not_wipe_messages goes
from 1000 messages to 0 in a single save() call. After the fix, 1000
messages survive.
Tests: 6 new regression tests in tests/test_metadata_save_wipe_1557.py
covering all three layers. Full pytest: 4019 → 4025 (+6, all green).
Live verified on port 8789: write 1000-msg session with stale active_stream_id,
hit /api/session/status, /api/session — file ends with 1002 messages
(_repair_stale_pending injects an error-marker pair on full reload, harmless
existing behavior), active_stream_id cleared, pending cleared, no Reconnecting
loop.
Closes #1557.
Reported by AvidFuturist via user feedback on v0.50.282.
* fix: dynamic version badge — read from git tag, never hardcoded
The settings panel showed v0.50.87 and the HTTP Server: header said
HermesWebUI/0.50.38 — both hardcoded strings that drift further behind
with every release because there was no mechanism to keep them in sync.
Changes:
- api/updates.py: add _run_git() (moved before _detect_webui_version),
_detect_webui_version(), and WEBUI_VERSION module constant resolved
once at import time via 'git describe --tags --always --dirty'.
Fallback chain: git → api/_version.py → 'unknown'.
- api/routes.py: inject webui_version into GET /api/settings response
so the frontend can read it without a separate API call.
- static/panels.js: loadSettingsPanel() populates .settings-version-badge
from settings.webui_version — one line after the existing api() call.
- static/index.html: replace stale hardcoded 'v0.50.87' with '—'
placeholder; JS overwrites it as soon as the settings panel opens.
- server.py: replace hardcoded 'HermesWebUI/0.50.38' server_version with
'HermesWebUI/' + WEBUI_VERSION.lstrip('v') — stays in sync automatically.
- Dockerfile: add ARG HERMES_VERSION=unknown and write api/_version.py
so Docker images (where .git is excluded) still show the correct tag.
- .github/workflows/release.yml: pass build-args: HERMES_VERSION=${{ github.ref_name }}
to the Docker build step on tag pushes.
- .gitignore: exclude api/_version.py (generated by Docker/CI, never committed).
No manual 'update the version badge' step is required going forward.
Tagging is sufficient — the badge and HTTP header update automatically.
Tests: 18 new tests in tests/test_version_badge.py covering the full
resolution chain, /api/settings injection, HTML placeholder, JS wiring,
and server.py import. 1596 tests pass total.
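The fallback chain (git, then api/_version.py, then 'unknown') might be sketched as follows. The function names, the `__version__` variable name, the regex-based file parse, and the timeout default are all assumptions rather than the actual api/updates.py code; the git runner is injectable so the chain can be exercised without a repository.

```python
import re
import subprocess
from pathlib import Path

# Assumed layout of the generated file: __version__ = "v1.2.3"
_VERSION_RE = re.compile(r"""^__version__\s*=\s*['"]([^'"]+)['"]""", re.M)


def run_git_describe(repo_dir, timeout=3.0):
    """Return `git describe` output, or None on any failure."""
    try:
        proc = subprocess.run(
            ["git", "describe", "--tags", "--always", "--dirty"],
            cwd=repo_dir, capture_output=True, text=True, timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if proc.returncode != 0:
        return None
    return proc.stdout.strip() or None


def read_version_file(path):
    """Parse a single version assignment without executing the file."""
    try:
        text = Path(path).read_text()
    except OSError:
        return None
    match = _VERSION_RE.search(text)
    return match.group(1) if match else None


def detect_version(repo_dir, version_file, git=run_git_describe):
    """git -> version file -> 'unknown'."""
    return git(repo_dir) or read_version_file(version_file) or "unknown"
```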
* fix: address review feedback on PR #790
- api/updates.py: replace exec() with regex parse for api/_version.py
(no supply-chain risk from build artifact; exec unnecessary for one assignment)
- api/updates.py: cap git describe timeout at 3s (was 10s — import-time
stall on NFS/.git would block server startup unnecessarily)
- server.py: lstrip('v') → removeprefix('v') (lstrip strips chars not prefix)
- server.py: emit bare 'HermesWebUI' when version is 'unknown' rather than
'HermesWebUI/unknown' (log aggregators expect semver-ish suffix or none)
- CHANGELOG.md: add v0.50.124 entry for this user-visible change
- tests: rename exec-error test to reflect regex behaviour; add tests for
removeprefix usage and unknown-version header guard (1598 tests total)
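A small sketch of the header logic after review, assuming Python 3.9+ for str.removeprefix. server_header is a hypothetical helper name.

```python
def server_header(version: str) -> str:
    """Build the Server header value from the detected version."""
    if version == "unknown":
        # Bare product name rather than 'HermesWebUI/unknown'.
        return "HermesWebUI"
    return "HermesWebUI/" + version.removeprefix("v")


# lstrip treats its argument as a set of characters, not a prefix.
print("vv1.0".lstrip("v"))        # every leading 'v' is removed
print("vv1.0".removeprefix("v"))  # only one leading 'v' is removed
print(server_header("v0.50.124"))
```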
---------
Co-authored-by: nesquena-hermes <hermes@nesquena.com>
- test_sprint45.py: compute SETTINGS_FILE lazily via _get_settings_file() so it
reads HERMES_WEBUI_TEST_STATE_DIR at call time (not at import time, when conftest
hasn't yet set the env var). Fixes test isolation across all 1078 tests.
- test_sprint45.py: use auth cookie in teardown when clearing password post-test.
- test_sprint45.py: remove test_synced_version_strings (checks local-patch version).
- static/i18n.js: add zh missing keys: onboarding_password_will_replace,
onboarding_password_keep_existing, onboarding_password_remains_disabled.
- server.py: revert server_version to HermesWebUI/0.50.38 (matches master).
* feat: add real-time gateway session sync (Phase 1)
- Add gateway_watcher.py: background daemon polling state.db every 5s
for gateway session changes (telegram, discord, slack, etc.)
- Extend get_cli_sessions() to include all non-webui sources
- Add SSE endpoint /api/sessions/gateway/stream for real-time push
- Add dynamic source badges (telegram=blue, discord=purple, slack=dark purple)
- Rename 'Show CLI sessions' to 'Show agent sessions'
- Wire watcher lifecycle into server start/stop
- 10 tests covering metadata, filtering, SSE, and watcher lifecycle
- Activated via the same checkbox as CLI session import
Addresses GitHub issue #272
* fix: SSE event name mismatch, TLS attribute, remove PLAN.md
- Fix critical SSE bug: frontend listened for 'gateway_session_update'
but backend sends 'sessions_changed' -- events were silently dropped
- Fix frontend field check: data.changed -> data.sessions (matches
the actual payload structure from gateway_watcher)
- Fix TLS: ssl.TLSv1_2 -> ssl.TLSVersion.TLSv1_2 (the bare attribute
does not exist, would crash TLS setup and silently fall back to HTTP)
- Remove PLAN.md: implementation plan should not be committed to repo
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: test isolation and slow-consumer sentinel in gateway sync
tests/test_gateway_sync.py:
- Fix _get_test_state_dir() path mismatch: the function was computing
HERMES_HOME/webui-mvp-test but conftest.py sets HERMES_HOME=TEST_STATE_DIR,
so state.db was written to a double-nested path the server never read.
Now uses HERMES_WEBUI_STATE_DIR first (which conftest sets directly to
TEST_STATE_DIR), fixing the 7/10 test failures in full-suite ordering.
- Fix conn cleanup: removed conn.close() from inside try blocks so the
connection stays valid for _remove_test_sessions() in the finally block.
Previously the closed conn caused ProgrammingError in finally (swallowed
by bare except), leaving ghost sessions in state.db on test failure.
api/gateway_watcher.py:
- Fix slow-consumer queue eviction: when a subscriber queue fills (>10 events)
and is removed from _subscribers, now puts a None sentinel into it so the
SSE handler unblocks and closes the connection, letting EventSource
auto-reconnect. Without this the connection stayed open but received no
further events.
* fix: test isolation — set HERMES_WEBUI_TEST_STATE_DIR in conftest
The gateway sync tests write directly to state.db and must use the same
path the test server reads from. Previously they computed the path
independently, which broke when test_auth_sessions.py set a different
HERMES_WEBUI_STATE_DIR in the test-process environment at import time.
tests/conftest.py:
- Set HERMES_WEBUI_TEST_STATE_DIR=TEST_STATE_DIR in the test process's
os.environ (via setdefault) so gateway tests can read it reliably.
Using setdefault preserves any explicit override the caller may pass.
tests/test_gateway_sync.py:
- Simplify _get_test_state_dir(): check HERMES_WEBUI_TEST_STATE_DIR first
(now reliably set by conftest), fall back to HERMES_HOME/webui-mvp-test.
Remove the workaround that tried to snapshot HERMES_HOME at import time.
Result: 658/658 tests pass in full-suite ordering (was 651 pass / 7 fail).
---------
Co-authored-by: bergeouss <bergeouss@users.noreply.github.com>
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: decode HTML entities before markdown processing + zh/zh-Hant translations (#239)
Adds decode() helper in renderMd() to fix double-escaping of HTML entities
from LLM output (e.g. &lt;code&gt; becoming &amp;lt;code&amp;gt; instead
of rendering). XSS-safe: decode runs before esc(), only 5 entity patterns.
Also adds 40+ missing zh (Simplified Chinese) translation keys and a new
zh-Hant (Traditional Chinese) locale with 163 keys.
Fix applied: removed duplicate settings_label_notifications key in both
zh and zh-Hant locales.
Fixes #240
* fix: restore custom model list discovery with config api key (#238)
get_available_models() now reads api_key from config.yaml before env vars:
1. model.api_key
2. providers.<active>.api_key / providers.custom.api_key
3. env var fallbacks (HERMES_API_KEY, OPENAI_API_KEY, etc.)
Also adds OpenAI/Python User-Agent header and a regression test covering
authenticated /v1/models discovery.
Fixes users with LM Studio / Ollama custom endpoints configured in
config.yaml whose model picker silently collapsed to the default model.
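The lookup order above can be expressed as a short sketch. resolve_api_key is a hypothetical helper operating on an already-parsed config dict; only the two environment variables named above are checked here.

```python
def resolve_api_key(config: dict, env: dict, active_provider=None):
    """Return the first API key found, config.yaml values before env vars."""
    # 1. model.api_key
    model = config.get("model") or {}
    if model.get("api_key"):
        return model["api_key"]

    # 2. providers.<active>.api_key, then providers.custom.api_key
    providers = config.get("providers") or {}
    for name in (active_provider, "custom"):
        if not name:
            continue
        key = (providers.get(name) or {}).get("api_key")
        if key:
            return key

    # 3. environment fallbacks (subset shown)
    for var in ("HERMES_API_KEY", "OPENAI_API_KEY"):
        if env.get(var):
            return env[var]
    return None
```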
* feat: Docker UID/GID matching to avoid root-owned .hermes files (#237)
Adds docker_init.bash with hermeswebuitoo/hermeswebui user pattern so
container files match the host user UID/GID. Prevents .hermes volume
mounts from being owned by root when using a non-root host user.
Configure via WANTED_UID and WANTED_GID env vars (default 1000/1000).
Readme updated with setup instructions.
Fix applied: removed duplicate WANTED_GID=1000 line in docker-compose.yml
that was overriding the ${GID:-1000} variable expansion.
* security: redact credentials from API responses and fix credential file permissions (#243)
Adds response-layer credential redaction to four response paths:
- GET /api/session — messages[], tool_calls[], and title
- GET /api/session/export — download also redacted
- SSE done event — session payload in stream
- GET /api/memory — MEMORY.md and USER.md content
Adds api/startup.py with fix_credential_permissions() at server startup.
Adds 13 tests in tests/test_security_redaction.py.
Merged with #237 container detection changes in server.py.
* fix: cancel button now interrupts agent and cleans up UI state (#244)
Wires agent.interrupt() into cancel_stream() so the backend actually
stops tool execution when the user clicks Cancel, rather than only
stopping the SSE stream while the agent keeps running.
Changes:
- api/config.py: adds AGENT_INSTANCES dict (stream_id -> AIAgent)
- api/streaming.py: stores agent in AGENT_INSTANCES after creation,
checks CANCEL_FLAGS immediately after store (race condition fix),
calls agent.interrupt() in cancel_stream(), cleans up in finally block
- static/boot.js: removes stale setStatus(cancelling) call
- static/messages.js: setBusy(false)/setStatus('') unconditionally on cancel
Race condition fix: after storing agent in AGENT_INSTANCES, immediately
checks if CANCEL_FLAGS[stream_id] is already set (cancel arrived during
agent init) and interrupts before starting. Check is inside the same
STREAMS_LOCK acquisition, making it atomic.
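The race-condition fix can be sketched as below. FakeAgent and register_agent are hypothetical, and CANCEL_FLAGS is modeled as a plain dict of booleans, which may not match the real data structure.

```python
import threading

STREAMS_LOCK = threading.Lock()
AGENT_INSTANCES = {}  # stream_id -> agent
CANCEL_FLAGS = {}     # stream_id -> bool (assumed shape)


class FakeAgent:
    """Stand-in for AIAgent with only the interrupt() hook."""

    def __init__(self):
        self.interrupted = False

    def interrupt(self):
        self.interrupted = True


def register_agent(stream_id, agent) -> bool:
    """Store the agent; return False if a cancel already arrived."""
    with STREAMS_LOCK:
        AGENT_INSTANCES[stream_id] = agent
        # Same lock acquisition as the store, so no cancel can slip between.
        if CANCEL_FLAGS.get(stream_id):
            agent.interrupt()
            return False
    return True


def cancel_stream(stream_id) -> None:
    with STREAMS_LOCK:
        CANCEL_FLAGS[stream_id] = True
        agent = AGENT_INSTANCES.get(stream_id)
        if agent is not None:
            agent.interrupt()
```

Whichever side takes the lock second sees the other's write, so the agent is interrupted regardless of arrival order.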
New test file: tests/test_cancel_interrupt.py with 6 unit tests.
* docs: v0.46.0 release notes, bump version, update test counts
---------
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
* fix: wire auto_install_agent_deps into server.py startup; add api/startup.py to ARCHITECTURE.md
* fix(tests): kill stale process on test port before server start in conftest
Stale servers left by QA harness runs (ports 8792/8793 etc.) or prior
test sessions could interfere with conftest starting its own server on
TEST_PORT (8788). If the port was already occupied, _wait_for_server
hit the wrong server and tests got unexpected 404s/500s, failing
non-deterministically — the 'conftest isolation issue' seen this session.
Fix: run fuser -k on TEST_PORT before launching the new server process,
with a 0.5s sleep for port release. The full suite now runs 571/571
reliably regardless of what other servers were previously active.
---------
Co-authored-by: Nathan Esquenazi <nesquena@gmail.com>
Add optional HTTPS support controlled by two env vars:
HERMES_WEBUI_TLS_CERT=/path/to/cert.pem
HERMES_WEBUI_TLS_KEY=/path/to/key.pem
- Wraps server socket with ssl.SSLContext (min TLSv1.2)
- Dynamic scheme detection for startup messages (http:// vs https://)
- Graceful fallback to HTTP if cert loading fails — server never crashes
due to bad TLS config, just prints a warning and continues
- Auth cookie Secure flag already set when HTTPS is detected via getpeercert
- 6 end-to-end tests: config flags, HTTPS handshake, HTTP still works,
fallback on bad paths
Addresses #191 (HTTPS support issue).
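A minimal sketch of the context setup with graceful fallback. build_tls_context is a hypothetical helper; the real server.py wiring is not shown in this commit message.

```python
import ssl


def build_tls_context(cert_path, key_path):
    """Return an SSLContext, or None to signal fallback to plain HTTP."""
    if not cert_path or not key_path:
        return None
    try:
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        ctx.minimum_version = ssl.TLSVersion.TLSv1_2
        ctx.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except (OSError, ValueError) as exc:
        # ssl.SSLError and FileNotFoundError are both OSError subclasses.
        print(f"WARNING: TLS setup failed, continuing over HTTP: {exc}")
        return None
    return ctx


ctx = build_tls_context("/nonexistent/cert.pem", "/nonexistent/key.pem")
scheme = "https" if ctx else "http"
print(scheme)
```

With a valid context, the listening socket would be wrapped via ctx.wrap_socket(sock, server_side=True) before serving.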
Set Handler.timeout = 30. Python's StreamRequestHandler.setup(), which
BaseHTTPRequestHandler inherits, calls self.connection.settimeout(timeout),
so reads on idle or slow connections raise socket.timeout after the
configured duration.
This defends against Slowloris-style attacks where a client holds
connections open indefinitely, exhausting threads in ThreadingHTTPServer.
Also recovers threads from crashed clients with hung TCP connections.
Addresses #194.
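A short sketch of the mechanism. The Handler body and read_or_timeout helper are illustrative only; the point is that settimeout() does not raise by itself, the later blocking read does.

```python
import socket
from http.server import BaseHTTPRequestHandler


class Handler(BaseHTTPRequestHandler):
    # Applied to each accepted connection during setup().
    timeout = 30

    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")


def read_or_timeout(sock, seconds):
    """Return one byte, or None if the peer stays silent too long."""
    sock.settimeout(seconds)
    try:
        return sock.recv(1)
    except socket.timeout:
        return None
```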
When `pip install --target .` is run inside the hermes-agent checkout,
third-party package directories (openai/, pydantic/, requests/, etc.)
end up alongside real Hermes source files. With the agent dir at the
front of sys.path (insert(0)), Python resolves imports from those local
directories, breaking whenever the host platform differs from the
container (e.g. macOS .so files inside a Linux image).
Fix: append agent dir to sys.path instead of prepending. This lets
site-packages resolve pip packages correctly while still allowing
Hermes-specific modules (run_agent, hermes/, etc.) to resolve since
they do not exist in site-packages.
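The ordering change amounts to something like the following sketch. add_agent_dir is a hypothetical helper and the paths in the example are placeholders.

```python
import sys


def add_agent_dir(agent_dir: str, path: list = sys.path) -> None:
    """Append (not prepend) so site-packages wins for pip-installed names."""
    if agent_dir not in path:
        path.append(agent_dir)


search_path = ["/usr/lib/python3/site-packages"]
add_agent_dir("/opt/hermes-agent", search_path)
print(search_path)
```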
Also improves verify_hermes_imports() to surface the actual exception
message in startup logs, making it much easier to diagnose why a
module failed to import.
Tracebacks exposed file paths, module names, and potentially secret
values from local variables. Now logged server-side only; clients
receive a generic error message.
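A sketch of the pattern, assuming a JSON error body. The logger name, function name, and message text are illustrative.

```python
import json
import logging

logger = logging.getLogger("hermes.webui")


def error_payload(exc: BaseException) -> bytes:
    """Log the full traceback server-side; return a generic client body."""
    logger.error("unhandled error while serving request", exc_info=exc)
    return json.dumps({"error": "Internal server error"}).encode("utf-8")
```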
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Auth system (off by default, zero friction for localhost):
- New api/auth.py module: password hashing (SHA-256 + STATE_DIR salt),
signed HMAC session cookies (24h TTL), auth middleware
- Enable via HERMES_WEBUI_PASSWORD env var or Settings panel
- Minimal dark-themed login page at /login (self-contained HTML)
- POST /api/auth/login, /api/auth/logout, GET /api/auth/status
- Settings panel: "Access Password" field + "Sign Out" button
- password_hash added to settings.json (null = auth disabled)
Security hardening:
- Security headers on all responses: X-Content-Type-Options: nosniff,
X-Frame-Options: DENY, Referrer-Policy: same-origin
- POST body size limit: 20MB cap in read_body() to prevent DoS
Closes #23. 9 new tests. Total: 304 passed, 0 regressions.
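The hashing and cookie-signing scheme described above could be sketched as follows. The function names and the cookie layout (expiry timestamp plus hex signature) are assumptions, not the actual api/auth.py format.

```python
import hashlib
import hmac
import time

SESSION_TTL = 24 * 3600  # 24h, per the description above


def hash_password(password: str, salt: bytes) -> str:
    """Salted SHA-256, mirroring the scheme named in the commit."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def sign_session(secret: bytes, now=None) -> str:
    """Cookie value: '<expiry>.<hmac>' (assumed layout)."""
    issued = time.time() if now is None else now
    payload = str(int(issued + SESSION_TTL))
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"


def verify_session(secret: bytes, cookie: str, now=None) -> bool:
    payload, _, sig = cookie.partition(".")
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids timing leaks.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    try:
        expires = int(payload)
    except ValueError:
        return False
    current = time.time() if now is None else now
    return current < expires
```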
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>