handle_upload_extract() used Path(s.workspace) as the extraction root,
bypassing HERMES_WEBUI_ATTACHMENT_DIR entirely. Route through
_session_attachment_dir(session_id) so archives land alongside
single-file uploads and session cleanup covers them.
Add tests and CHANGELOG entry.
Ref #2247
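The routing change can be sketched as below. This is a minimal sketch, not the project's code: `session_attachment_dir` is a hypothetical stand-in for `_session_attachment_dir(session_id)`, and the per-session subdirectory layout under HERMES_WEBUI_ATTACHMENT_DIR, the temp-dir default, and the session-id check are all assumptions.

```python
import os
import tempfile
from pathlib import Path


def session_attachment_dir(session_id: str, env=os.environ) -> Path:
    """Hypothetical stand-in for _session_attachment_dir(session_id)."""
    # Assumption: HERMES_WEBUI_ATTACHMENT_DIR names the attachment root and
    # each session gets its own subdirectory beneath it. The temp-dir
    # default is illustrative only.
    root = Path(env.get("HERMES_WEBUI_ATTACHMENT_DIR", tempfile.gettempdir()))
    # Reject ids that could escape the root (path separators, "." or "..").
    if (
        not session_id
        or session_id in {".", ".."}
        or Path(session_id).name != session_id
    ):
        raise ValueError(f"unsafe session id: {session_id!r}")
    return root / session_id


def extraction_root(session_id: str, env=os.environ) -> Path:
    # Archives extract under the same per-session directory as single-file
    # uploads, so one cleanup pass over that directory covers both.
    target = session_attachment_dir(session_id, env)
    target.mkdir(parents=True, exist_ok=True)
    return target
```

Because the extraction root derives from the session directory rather than the workspace, deleting the session directory removes extracted archives too.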
Add a narrow README note for the community ARM64 Android AVF field
report: Hermes Agent + WebUI running inside a Debian 12 VM on a
mid-range Android phone with cloud-hosted inference.
The note frames the report as a compatibility signal rather than an
official support baseline or provider/model benchmark, and records
practical mobile caveats around first-install compile time, Android
tab reloads, and battery optimization.
Refs #2364
Closes nesquena/hermes-webui#2483
Co-authored-by: Frank Song <franksong2702@gmail.com>
Two non-blocking observations from the review, both addressed:
1. The bad-pattern grep listed `error_exit` as a literal token, but the
`error_exit()` function at docker_init.bash:5-10 only echoes the
strings `"!! ERROR: "` and `"!! Exiting script (ID: $$)"` — the
function name itself never appears in container logs. So
`grep -E -i "error_exit"` would only fire on stray debug prints of
the name, not on actual failures. The other patterns
(`Failed to set (UID|GID|...)`, `groupmod: cannot`, etc.) DO catch
real error_exit output, so this wasn't a coverage gap — just a dead
token.
Add `!! ERROR` and `!! Exiting script` to the bad-pattern set so the
grep actually matches the function's output. Keep the literal
`error_exit` token as belt-and-suspenders for any debug/echo of the
name.
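For illustration, the widened pattern set behaves roughly like the Python sketch below. The workflow itself uses grep. Combining the patterns into one case-insensitive regex and the helper name `find_bad_lines` are assumptions made for this example.

```python
import re

# Patterns as listed in this PR. Joining them into a single alternation
# is an assumption of this sketch, mirroring `grep -E -i`.
BAD_PATTERNS = re.compile(
    r"EROFS|Read-only file system|Traceback|PermissionError"
    r"|error_exit|!! ERROR|!! Exiting script"
    r"|groupmod: cannot|usermod: cannot"
    r"|Failed to set (UID|GID|owner|permissions|ownership)",
    re.IGNORECASE,
)


def find_bad_lines(log_text: str) -> list[str]:
    """Return every log line that matches a known-bad signature."""
    return [line for line in log_text.splitlines() if BAD_PATTERNS.search(line)]
```

With `!! ERROR` and `!! Exiting script` in the set, the two lines that `error_exit()` actually echoes are flagged, while an unrelated line such as "Failed to set locale" is not.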
2. `test_docker_init_excludes_egg_info_during_staging` was a single
`assert "egg-info" in src` check. That passes if any occurrence
appears — including the explanatory comment block above the staging
logic. A maintainer removing the `--exclude='*.egg-info'` from
rsync but keeping the comment would slip past the test.
Tighten to:
- scope to the staging block (between `_stage_src=` and the
`uv pip install` line) so comments outside that window can't
satisfy the assertion;
- require the literal `--exclude='*.egg-info'` rsync flag;
- require `*.egg-info` in the block so the cp-fallback cleanup is
also pinned;
- additionally require `--exclude='build'`, `--exclude='dist'`,
`--exclude='__pycache__'` so all four setuptools-touchable
artifact dirs stay excluded.
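A minimal sketch of the scoped check, assuming the script text is already loaded into a string. The helper names `staging_block` and `missing_excludes` are hypothetical and are not the names used in the test file.

```python
# Flags named in this commit message.
REQUIRED_FLAGS = [
    "--exclude='*.egg-info'",
    "--exclude='build'",
    "--exclude='dist'",
    "--exclude='__pycache__'",
]


def staging_block(src: str) -> str:
    """Slice the script between `_stage_src=` and the `uv pip install` line."""
    start = src.index("_stage_src=")
    end = src.index("uv pip install", start)
    return src[start:end]


def missing_excludes(src: str) -> list[str]:
    """List required rsync flags absent from the staging block."""
    block = staging_block(src)
    return [flag for flag in REQUIRED_FLAGS if flag not in block]
```

Scoping to the slice is what stops a comment above the staging logic from satisfying the assertion: text outside the window is never inspected.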
Verified:
- tests/test_docker_docs_and_readonly.py — 11/11 pass.
- YAML parses cleanly via `yaml.safe_load`.
- Full suite: 5770 passed, 0 failed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Docker smoke gate added in this same PR caught a real production
regression on its very first CI run. v0.51.84 (PR #2470) mounted
hermes-agent-src read-only on the WebUI side and widened the chown
prune to keep the read-only walk happy, but missed that the WebUI's
startup also runs:
    uv pip install "$_agent_src[all]"
against the same now-read-only mount. setuptools' egg_info step writes
hermes_agent.egg-info/ inside the source tree even under PEP 517 build
isolation (this is by design -- PEP 517 isolates the BUILD environment,
not the source tree's metadata directory). On a :ro mount this returns
EROFS, the install fails, error_exit fires, and every multi-container
deploy dies at startup. The smoke gate flagged it on both the
two-container and three-container variants.
The fix
-------
Stage the agent source into a writable build dir under /tmp BEFORE
invoking pip install, then point pip at the staged copy.
    _stage_src="/tmp/hermes-agent-build"
    rm -rf "$_stage_src" && mkdir -p "$_stage_src"
    rsync -a --exclude='*.egg-info' --exclude='build' --exclude='dist' \
        --exclude='__pycache__' --exclude='.git' \
        "$_agent_src"/ "$_stage_src"/
    uv pip install "$_stage_src[all]" ...
    rm -rf "$_stage_src"
The exclusion list matters: when setuptools sees a pre-baked *.egg-info,
build, or dist directory, it takes a timestamp-update code path that
also reads/writes inside that directory -- which itself fails on a :ro
source. Excluding them keeps the build on the fresh-build path
unconditionally.
rsync is in the production image (Dockerfile lines 41-44). For users
running custom WebUI images without rsync, the script falls back to
cp -a + post-copy rm -rf of the same artifacts.
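The same staging idea, expressed in Python purely for illustration (the real logic lives in docker_init.bash and uses rsync or cp). `stage_source` is a hypothetical helper. `shutil.ignore_patterns` matches entry names at any depth, which is close to how unanchored rsync excludes behave.

```python
import shutil
from pathlib import Path

# Same exclusion list as the rsync call in this commit.
EXCLUDES = ("*.egg-info", "build", "dist", "__pycache__", ".git")


def stage_source(agent_src: Path, stage_dir: Path) -> Path:
    """Copy a read-only source tree into a writable dir, minus build artifacts."""
    # Start from a clean staging dir so stale artifacts never leak in.
    shutil.rmtree(stage_dir, ignore_errors=True)
    shutil.copytree(
        agent_src,
        stage_dir,
        ignore=shutil.ignore_patterns(*EXCLUDES),
    )
    return stage_dir
```

The installer would then be pointed at the returned directory instead of the read-only mount, and the directory removed afterwards.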
Tests
-----
Two new source-level invariants in tests/test_docker_docs_and_readonly.py:
test_docker_init_stages_agent_source_for_writable_install
-- asserts _stage_src=... is declared
-- asserts every `uv pip install ...[all]` line uses _stage_src,
NOT raw $_agent_src
test_docker_init_excludes_egg_info_during_staging
-- asserts the staging path excludes *.egg-info (rsync exclude
form or cp-fallback's explicit rm -rf both pass)
These would have caught the v0.51.84 regression at the source level
had they existed at the time. The Docker runtime smoke gate is the
durable defence for the broader class of :ro x init-script
interactions, since source-level invariants only catch what they're
written to catch.
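The first invariant can be sketched as a line scan over the script source. This is an approximation under stated assumptions: `offending_install_lines` is a hypothetical helper, and skipping comment lines is a choice made for this sketch, not something the commit specifies.

```python
def offending_install_lines(src: str) -> list[str]:
    """Return `uv pip install ...[all]` lines that bypass the staged copy."""
    bad = []
    for line in src.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # assumption: commented-out commands are not checked
        if "uv pip install" in stripped and "[all]" in stripped:
            # The install must reference the staged copy and must not
            # reference the raw, possibly read-only, agent source.
            if "_stage_src" not in stripped or "_agent_src" in stripped:
                bad.append(stripped)
    return bad
```

A test built on this would assert that the returned list is empty for docker_init.bash.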
Verification
------------
- pytest tests/test_docker_docs_and_readonly.py: 11 passed (9 existing
+ 2 new)
- pytest tests/ -q --timeout=60: 5891 passed, 6 skipped (was 5889;
delta is exactly the 2 new tests)
- bash -n docker_init.bash: clean
Once this lands, the Docker smoke gate's two/three-container variants
should go green, completing the self-validating loop.
Closes the source-only-test gap that let v0.51.84's :ro-mount x chown -h
{} + startup regression reach review with 5800+ green pytests. Adds a
new GitHub Actions workflow .github/workflows/docker-smoke.yml that
actually runs 'docker compose up' against each compose variant.
Triggers
--------
Path-filtered on pull_request + push to master:
Dockerfile, docker_init.bash, docker-compose*.yml, .dockerignore,
.env.docker.example, .github/workflows/docker-smoke.yml itself.
Also workflow_dispatch for manual runs.
Jobs
----
1. compose-config -- preflight that 'docker compose config' parses each
of the three compose files. Cheap, fast, catches schema/interpolation
drift in parallel before any container starts.
2. smoke (matrix: single / two-container / three-container) -- for each
variant:
a. Reap any leftover hermes-smoke-* containers/volumes/networks from
prior runs (defence-in-depth on self-hosted runners; hosted runners
are fresh).
b. docker build -t ghcr.io/nesquena/hermes-webui:latest .
Critical: the multi-container compose files reference the GHCR
image. Without this retag, multi-container smoke would test the
previously-released image, NOT the PR's docker_init.bash / Dockerfile
changes. With the retag, Compose's default pull_policy=missing keeps
the local build in place and the PR is genuinely exercised.
c. mktemp -d for ephemeral HERMES_HOME + HERMES_WORKSPACE so the
runner's host filesystem is never touched.
d. docker compose up -d --wait --wait-timeout 120 (Dockerfile carries a
HEALTHCHECK so --wait blocks on 'healthy', not just 'running').
e. curl /health probe with a 30-attempt x 2s poll loop as headroom for
the multi-container variants' Python dep install phase.
f. grep startup logs for known-bad signatures:
EROFS | Read-only file system | Traceback | PermissionError |
error_exit | groupmod: cannot | usermod: cannot |
Failed to set (UID|GID|owner|permissions|ownership)
These are the exact patterns that would have flagged #2470 in real
time. Failed-to-set is anchored to specific objects to avoid false
positives on benign locale/library bootstrap warnings.
g. trap on EXIT: docker compose down -v --remove-orphans + rm -rf the
ephemeral host paths, regardless of how the job exited.
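Step e's poll loop has roughly the shape below. The workflow uses curl in a shell loop. This Python sketch takes the probe as a callable so it stays independent of any HTTP client, and the `wait_for_health` name and signature are assumptions.

```python
import time
from typing import Callable


def wait_for_health(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` up to `attempts` times, sleeping `delay_s` between tries."""
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        if attempt < attempts:
            sleep(delay_s)  # no sleep after the final failed attempt
    return False
```

With the defaults (30 attempts, 2s apart) the loop gives up after 29 sleeps, or 58 seconds plus the time spent in the probes themselves.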
Safety
------
- permissions: contents: read only -- no GITHUB_TOKEN write scope.
- Fork PRs run with no secrets (standard pull_request, not
pull_request_target).
- No host bind mounts; no ~/.hermes exposure; no network egress beyond
what compose itself needs to pull the agent image.
- timeout-minutes: 15 on the smoke job as a hard ceiling against a
hung docker build.
- Per-run COMPOSE_PROJECT name (hermes-smoke-VARIANT-RUNID-ATTEMPT)
so concurrent runs or reruns can't clobber each other.
Out of scope for v1 (per design review)
---------------------------------------
- HERMES_WEBUI_SMOKE_TEST env flag in docker_init.bash -- production-code
footgun that would let any leaked env var silently exit before
serving traffic.
- --user 60000:60000 -- incompatible with the image's root-init phase
and would skip the very chown branch we are guarding against.
- Local-runnable scripts/docker-smoke-test.sh -- defer until CI gating
ships and we see what contributors actually trip over.
- Hadolint / yamllint -- separate lint workflow, follow-up PR.
- Podman runtime smoke -- defer until a podman-specific bug ships.
Pre-merge verification
----------------------
- actionlint: clean
- YAML parse: clean (3 triggers, 2 jobs, 3-variant matrix)
- bash -n on all 6 run-blocks: clean
- pytest tests/ -q --timeout=60: 5889 passed, 6 skipped (no test impact;
workflow-only change)
- Opus design review on the brief (REVISE -> minimum scope adopted)
- Opus implementation review on this workflow (APPROVE)
Backend (api/config.py):
- resolve_model_provider(): check custom_providers for prefix match
BEFORE the config_base_url branch. Previously, providers with a
base_url set (e.g. deepseek) would catch all slash-delimited model
ids and return the config provider, preventing custom provider
routing.
- get_available_models(): include model aliases in response so the
frontend can resolve them on /model commands.
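The ordering change can be illustrated with a simplified sketch. The signature, the return shape, and the final fallthrough are assumptions made for this example and do not mirror api/config.py. Only the order of the first two checks reflects the change described above.

```python
from typing import Optional


def resolve_model_provider(
    model_id: str,
    config_provider: str,
    config_base_url: Optional[str],
    custom_providers: set,
) -> tuple:
    """Return (provider, model) for a possibly slash-delimited model id."""
    prefix, sep, rest = model_id.partition("/")
    # 1. Custom-provider prefix match runs first (the fix).
    if sep and prefix in custom_providers:
        return prefix, rest
    # 2. config_base_url branch: a configured endpoint keeps the full id
    #    under the config provider. Before the fix this ran first and
    #    swallowed every slash-delimited id.
    if sep and config_base_url:
        return config_provider, model_id
    # 3. Assumed fallthrough for this sketch: treat the prefix as the
    #    provider for slash ids, else use the config provider.
    if sep:
        return prefix, rest
    return config_provider, model_id
```

Swapping checks 1 and 2 reproduces the old behaviour: with a base_url set, `myproxy/gpt-x` would resolve to the config provider even when `myproxy` is a registered custom provider.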
Frontend (static/commands.js):
- cmdModel(): resolve aliases by fetching /api/models before fuzzy
matching the dropdown.
- Add bare-model fallback when the alias resolves to a slash-delimited
provider/model id (e.g. "deepseek/deepseek-v4-flash").
- Add cross-provider fallback: when the model is from a custom provider
not in the active provider dropdown, call /api/session/update directly
with the provider/model id and provider override.
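The frontend decision order, rendered in Python for illustration only (the real code is JavaScript in static/commands.js). `pick_model` is hypothetical, and exact matching stands in for the fuzzy dropdown match to keep the sketch short.

```python
def pick_model(query: str, aliases: dict, dropdown: list) -> tuple:
    """Decide how a /model argument is applied: (action, model_id)."""
    # Resolve the alias first (aliases would come from /api/models).
    target = aliases.get(query, query)
    if target in dropdown:
        return ("dropdown", target)
    if "/" in target:
        # Bare-model fallback: try the part after "provider/".
        bare = target.split("/", 1)[1]
        if bare in dropdown:
            return ("dropdown", bare)
        # Cross-provider fallback: not in the active provider's dropdown,
        # so the session would be updated directly with the full id.
        return ("session_update", target)
    return ("not_found", target)
```

For example, with the alias `flash` mapped to `deepseek/deepseek-v4-flash`, the result is a dropdown selection when `deepseek-v4-flash` is listed and a direct session update otherwise.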