hermes-webui

gooneyryan/hermes-webui

mirror of https://github.com/nesquena/hermes-webui.git synced 2026-05-25 11:10:18 +00:00

Commit Graph


Author

SHA1

Message

Date

Nathan Esquenazi

64590cb6b9

harden(docker-smoke): catch !!ERROR/!!Exiting + tighten egg_info test

Two non-blocking observations from the review, both addressed:

1. The bad-pattern grep listed `error_exit` as a literal token, but the
   `error_exit()` function at docker_init.bash:5-10 only echoes the
   strings `"!! ERROR: "` and `"!! Exiting script (ID: $$)"` — the
   function name itself never appears in container logs. So
   `grep -E -i "error_exit"` would only fire on stray debug prints of
   the name, not on actual failures. The other patterns
   (`Failed to set (UID|GID|...)`, `groupmod: cannot`, etc.) DO catch
   real error_exit output, so this wasn't a coverage gap — just a dead
   token.

   Add `!! ERROR` and `!! Exiting script` to the bad-pattern set so the
   grep actually matches the function's output. Keep the literal
   `error_exit` token as belt-and-suspenders for any debug/echo of the
   name.

2. `test_docker_init_excludes_egg_info_during_staging` was a single
   `assert "egg-info" in src` check. That passes if any occurrence
   appears — including the explanatory comment block above the staging
   logic. A maintainer removing the `--exclude='*.egg-info'` from
   rsync but keeping the comment would slip past the test.

   Tighten to:
   - scope to the staging block (between `_stage_src=` and the
     `uv pip install` line) so comments outside that window can't
     satisfy the assertion;
   - require the literal `--exclude='*.egg-info'` rsync flag;
   - require `*.egg-info` in the block so the cp-fallback cleanup is
     also pinned;
   - additionally require `--exclude='build'`, `--exclude='dist'`,
     `--exclude='__pycache__'` so all four setuptools-touchable
     artifact dirs stay excluded.
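
The two hardening points above can be sketched in Python as follows. This is a minimal illustration, not the repository's actual smoke gate or test: the helper names are hypothetical, and the pattern set includes only the tokens quoted in this message (the other patterns it mentions are elided and not reproduced).

```python
import re

# Only the tokens quoted in the commit message; the real bad-pattern set
# contains more entries that are not reproduced here.
BAD_PATTERNS = re.compile(r"!! ERROR|!! Exiting script|error_exit", re.IGNORECASE)

# The four exclusions the tightened test is described as requiring.
REQUIRED_FLAGS = (
    "--exclude='*.egg-info'",
    "--exclude='build'",
    "--exclude='dist'",
    "--exclude='__pycache__'",
)


def find_bad_log_lines(log_text):
    """Return the container log lines that match any bad pattern."""
    return [line for line in log_text.splitlines() if BAD_PATTERNS.search(line)]


def staging_block(script_text):
    """Slice the script from `_stage_src=` to the next `uv pip install`."""
    start = script_text.index("_stage_src=")
    end = script_text.index("uv pip install", start)
    return script_text[start:end]


def missing_staging_flags(script_text):
    """List required rsync flags absent from the staging block only."""
    block = staging_block(script_text)
    return [flag for flag in REQUIRED_FLAGS if flag not in block]
```

Because the check is scoped to the staging block, a comment that merely mentions egg-info elsewhere in the script cannot satisfy it.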

Verified:
- tests/test_docker_docs_and_readonly.py — 11/11 pass.
- YAML parses cleanly via `yaml.safe_load`.
- Full suite: 5770 passed, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-17 17:34:46 -07:00

nesquena-hermes

70f371c8b9

fix(docker): stage agent source to writable build dir before pip install

The Docker smoke gate added in this same PR caught a real production
regression on its very first CI run. v0.51.84 (PR #2470) mounted
hermes-agent-src read-only on the WebUI side and widened the chown
prune to keep the read-only walk happy, but missed that the WebUI's
startup also runs:

    uv pip install "$_agent_src[all]"

against the same now-read-only mount. setuptools' egg_info step writes
hermes_agent.egg-info/ inside the source tree even under PEP 517 build
isolation (this is by design -- PEP 517 isolates the BUILD environment,
not the source tree's metadata directory). On a :ro mount this returns
EROFS, the install fails, error_exit fires, and every multi-container
deploy dies at startup. The smoke gate flagged it on both the
two-container and three-container variants.

The fix
-------
Stage the agent source into a writable build dir under /tmp BEFORE
invoking pip install, then point pip at the staged copy.

  _stage_src="/tmp/hermes-agent-build"
  rm -rf "$_stage_src" && mkdir -p "$_stage_src"
  rsync -a --exclude='*.egg-info' --exclude='build' --exclude='dist' \
        --exclude='__pycache__' --exclude='.git' \
        "$_agent_src"/ "$_stage_src"/
  uv pip install "$_stage_src[all]" ...
  rm -rf "$_stage_src"

The exclusion list matters: when setuptools sees a pre-baked *.egg-info,
build, or dist directory, it takes a timestamp-update code path that
also reads/writes inside that directory -- which itself fails on a :ro
source. Excluding them keeps the build on the fresh-build path
unconditionally.

rsync is in the production image (Dockerfile lines 41-44). For users
running custom WebUI images without rsync, the script falls back to
cp -a + post-copy rm -rf of the same artifacts.
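
For readers who want to try the staging idea outside the init script, here is a rough Python analogue of the rsync step. It is a sketch under assumptions: `stage_source` is a hypothetical helper, and the real script uses rsync (or the cp fallback), not Python. Like rsync's unanchored `--exclude`, `shutil.ignore_patterns` matches names at any depth.

```python
import shutil
import tempfile
from pathlib import Path

# Same names as the rsync exclusion list in the fix.
EXCLUDES = ("*.egg-info", "build", "dist", "__pycache__", ".git")


def stage_source(agent_src, stage_dir):
    """Copy agent_src into a fresh stage_dir, skipping build artifacts."""
    stage = Path(stage_dir)
    # Start from a clean directory, mirroring `rm -rf "$_stage_src"`.
    shutil.rmtree(stage, ignore_errors=True)
    shutil.copytree(agent_src, stage, ignore=shutil.ignore_patterns(*EXCLUDES))
    return stage


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "agent-src"
        (src / "hermes_agent.egg-info").mkdir(parents=True)
        (src / "pyproject.toml").write_text("")
        staged = stage_source(src, Path(tmp) / "hermes-agent-build")
        print(sorted(p.name for p in staged.iterdir()))
```

The source directory is only read, so the copy works even when the source is mounted read-only.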

Tests
-----
Two new source-level invariants in tests/test_docker_docs_and_readonly.py:

  test_docker_init_stages_agent_source_for_writable_install
    -- asserts _stage_src=... is declared
    -- asserts every `uv pip install ...[all]` line uses _stage_src,
       NOT raw $_agent_src

  test_docker_init_excludes_egg_info_during_staging
    -- asserts the staging path excludes *.egg-info (rsync exclude
       form or cp-fallback's explicit rm -rf both pass)
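
A source-level invariant of the first kind might look like the sketch below. The function names are hypothetical and the checks are deliberately simple line scans; the actual tests in tests/test_docker_docs_and_readonly.py may be written differently.

```python
import re


def declares_stage_src(script_text):
    """True if some line assigns `_stage_src=`."""
    return re.search(r"^\s*_stage_src=", script_text, re.MULTILINE) is not None


def raw_source_installs(script_text):
    """Return `uv pip install ...[all]` lines that do not use _stage_src."""
    offenders = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # comments cannot run an install
        if (
            "uv pip install" in stripped
            and "[all]" in stripped
            and "_stage_src" not in stripped
        ):
            offenders.append(stripped)
    return offenders
```

A test would then assert that `declares_stage_src(src)` holds and that `raw_source_installs(src)` is empty for the contents of docker_init.bash.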

These would have caught the v0.51.84 regression at the source level
had they existed at the time. The Docker runtime smoke gate remains the
durable defence for the broader class of :ro x init-script
interactions, since source-level invariants only catch what they're
written to catch.

Verification
------------
- pytest tests/test_docker_docs_and_readonly.py: 11 passed (9 existing
  + 2 new)
- pytest tests/ -q --timeout=60: 5891 passed, 6 skipped (was 5889;
  delta is exactly the 2 new tests)
- bash -n docker_init.bash: clean

Once this lands, the Docker smoke gate's two/three-container variants
should go green, completing the self-validating loop.

2026-05-18 00:21:31 +00:00

nesquena-hermes

5cc8b6c654

docs(docker): document agent-image upgrade flow + read-only WebUI source mount

The hermes-agent-src named volume in the two- and three-container compose
files is initialised from the agent image's /opt/hermes on first `up` and
Docker reuses it verbatim on every subsequent `up` — even after a fresh
`docker pull` of the agent image. This was the root cause of #1416 (the
'missing entrypoint' symptom was a stale cached volume hiding the new
image's source tree).

Changes:

- Add an 'Upgrading the agent container' section to docs/docker.md with
  the canonical `down → docker volume rm → pull → up -d` recipe, plus the
  same pointer as a comment block in both multi-container compose files
  near the volume declarations.
- Switch the WebUI's hermes-agent-src mount to `:ro` in both multi-container
  compose files. The WebUI only reads this volume to install the agent's
  Python deps at startup; mounting it read-only enforces that at the kernel
  layer and brings the actual mount mode in line with the existing
  docs/docker.md architecture diagram (which already labelled this edge as
  read-only).
- Align the workspace bind default in both multi-container compose files
  with the single-container convention — `${HERMES_WORKSPACE:-${HOME}/workspace}`
  instead of `${HERMES_WORKSPACE:-~/workspace}` — so the default resolves
  the same way across Linux, macOS, WSL2, and Docker Desktop on Windows.
- Add a 'What the multi-container setup isolates (and what it doesn't)'
  section to docs/docker.md to frame the two/three-container setups as
  process/network/resource isolation, not filesystem isolation, so users
  don't reach for multi-container expecting a trust boundary it doesn't
  provide.
- Cross-link #1416 from the Related issues section.
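
The upgrade recipe named in the first bullet can be written out as an ordered command list. This is a dry-run sketch: `upgrade_commands` is a hypothetical helper, the `docker compose` invocation form is an assumption, and the volume name on a given host may carry a Compose project prefix (check `docker volume ls` before removing anything).

```python
import shlex


def upgrade_commands(volume="hermes-agent-src", compose=("docker", "compose")):
    """Return the recipe as argv lists: down, volume rm, pull, up -d."""
    base = list(compose)
    return [
        base + ["down"],                     # stop containers so the volume is free
        ["docker", "volume", "rm", volume],  # drop the stale cached source tree
        base + ["pull"],                     # fetch the new agent image
        base + ["up", "-d"],                 # recreate; the volume is re-initialised
    ]


if __name__ == "__main__":
    # Print only; nothing is executed.
    for argv in upgrade_commands():
        print(shlex.join(argv))
```

The order matters: the volume cannot be removed while a container still uses it, and it must be gone before `up` so it is repopulated from the new image.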

Adds 9 regression tests in tests/test_docker_docs_and_readonly.py covering:
- :ro on the WebUI side of hermes-agent-src in both files
- agent side stays read-write (still needs to populate /opt/hermes on first run)
- ${HOME} (not ~) in workspace bind defaults in both files
- single-container file already uses ${HOME} (pin to prevent drift)
- docs/docker.md has the 'Upgrading the agent container' section + recipe
- compose files reference docs/docker.md + show the upgrade step inline
- docs/docker.md frames the isolation model honestly
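
The first two checks in the list above could be approximated without a YAML parser, as in this sketch. It assumes two-space indentation and short-form volume entries (`name:path[:mode]`), and `mount_modes` is a hypothetical helper, not the repository's test code.

```python
def mount_modes(compose_text, volume="hermes-agent-src"):
    """Map each service to 'ro' or 'rw' for its mount of `volume`."""
    modes = {}
    service = None
    in_services = False
    for raw in compose_text.splitlines():
        line = raw.rstrip()
        text = line.strip()
        if not text or text.startswith("#"):
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            # A new top-level key; only `services:` holds service definitions.
            in_services = text == "services:"
            service = None
        elif in_services and indent == 2 and text.endswith(":"):
            service = text[:-1]
        elif service and text.startswith("- "):
            parts = text[2:].strip().strip("\"'").split(":")
            if parts[0] == volume:
                modes[service] = "ro" if parts[-1] == "ro" else "rw"
    return modes
```

Given a compose file, a test can assert that the WebUI service maps to `"ro"` while the agent service maps to `"rw"`.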

Test suite: 42 passed (33 existing Docker tests + 9 new). No behaviour
change for users who set HERMES_WORKSPACE explicitly, and no migration is
required for existing deployments — Docker rebinds the existing volume
read-only on next `up`. Users upgrading the agent image should now follow
the documented `docker volume rm hermes-agent-src` recipe.

Closes #1416 (documented upgrade procedure) and addresses the read-only
half of the multi-container coupling concern raised on #2453.

2026-05-17 17:18:39 +00:00

3 Commits