Four code-review comments from the automated Copilot reviewer on this PR:
1. `_journal_tool_already_present` dedupe was session-wide, so a
legitimately repeated tool (e.g. the same `terminal: ls` already run in
an earlier turn) could cause the retry path to wrongly skip
materializing the recovered tool card. The helper now takes a
keyword `stream_id` argument; when supplied, a tool card whose
`_recovered_stream_id` is set AND differs from the candidate is no
longer treated as a duplicate. Untagged tool cards (live tools, or
tool cards carried over from a pre-tagging core transcript) still
match, preserving the existing 'core transcript already has this
tool, don't duplicate' invariant. Two new tests in
`TestJournalToolDedupeScoping` cover both legs of the rule.
2./3. The troubleshooting FAQ pointed at `~/.hermes/webui/sessions/session_<sid>.json`
and `~/.hermes/_run_journal/...`. The actual sidecar filename has
no `session_` prefix and the run-journal lives under the WebUI
sessions dir (`~/.hermes/webui/sessions/_run_journal/<sid>/<stream>.jsonl`,
default). Both paths fixed and an explicit note added about
`HERMES_WEBUI_STATE_DIR` overriding the state root.
4. Drop unused `json` / `queue` / `Path` imports from
`tests/test_session_lost_response_regression.py` so the file stops
carrying noise that future linting would flag.
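For reviewers, the scoping rule from item 1 can be sketched roughly as below. This is an illustration, not the helper's actual body: the function name, the `name` / `args` card keys, and the list-of-dicts transcript shape are assumptions. Only the `stream_id` keyword and the `_recovered_stream_id` tag come from the change itself.

```python
from typing import Any, Optional


def journal_tool_already_present(
    messages: list[dict[str, Any]],
    name: str,
    args: str,
    *,
    stream_id: Optional[str] = None,
) -> bool:
    """Return True if an equivalent tool card should count as a duplicate.

    Hypothetical sketch: card shape and matching keys are assumed.
    """
    for card in messages:
        if card.get("name") != name or card.get("args") != args:
            continue
        tagged = card.get("_recovered_stream_id")
        # A card recovered from a *different* stream is a legitimate repeat,
        # so it is skipped rather than treated as a duplicate.
        if stream_id is not None and tagged and tagged != stream_id:
            continue
        # Untagged cards (live tools, pre-tagging transcripts) and cards from
        # the same stream still match.
        return True
    return False
```

With no `stream_id` supplied, the sketch falls back to the old session-wide behaviour, which is what keeps existing callers unchanged.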
CHANGELOG: append an Unreleased / Fixed entry describing the user-visible
behaviour change (interrupted-turn marker now self-heals on the next
session read; gives up gracefully after 12 retries or 24h).
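The give-up rule in that entry amounts to a two-cap check, sketched below. The function and constant names are hypothetical, and the exact boundary handling (strictly fewer than 12 attempts, strictly under 24h) is an assumption; only the 12-retry and 24h figures come from the entry.

```python
from datetime import datetime, timedelta, timezone

MAX_RETRIES = 12                 # retry cap stated in the changelog entry
MAX_AGE = timedelta(hours=24)    # age cap stated in the changelog entry


def should_retry_recovery(attempts: int, first_seen: datetime, now: datetime) -> bool:
    """Return True while another self-heal attempt is still allowed.

    Hypothetical sketch: recovery stops once either cap is reached.
    """
    return attempts < MAX_RETRIES and (now - first_seen) < MAX_AGE


if __name__ == "__main__":
    marked_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(should_retry_recovery(3, marked_at, marked_at + timedelta(hours=2)))
```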
docs/troubleshooting.md: add a 'Symptom → Why → Diagnostic → Fix →
Caps → When to file a bug' entry for the
'no agent output was recovered' marker so users who hit the lost-response
shape on WSL2 / network FS can recognise it, verify the run-journal on
disk, and know that reloading the session is enough.
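To locate the run-journal on disk for the diagnostic step, a path helper along these lines may be useful. It is a sketch under assumptions: the function name is hypothetical, and it assumes `HERMES_WEBUI_STATE_DIR` replaces `~/.hermes/webui` as the state root while the `sessions/_run_journal/<sid>/<stream>.jsonl` layout stays the same.

```python
import os
from pathlib import Path


def run_journal_path(sid: str, stream: str) -> Path:
    """Build the expected run-journal path for a session and stream.

    Assumption: the env override replaces ~/.hermes/webui as the state root.
    """
    override = os.environ.get("HERMES_WEBUI_STATE_DIR")
    state_root = Path(override) if override else Path.home() / ".hermes" / "webui"
    return state_root / "sessions" / "_run_journal" / sid / f"{stream}.jsonl"


if __name__ == "__main__":
    # Placeholder ids; substitute the real session and stream ids.
    candidate = run_journal_path("SESSION_ID", "STREAM_ID")
    print(candidate, "exists" if candidate.exists() else "missing")
```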
The hermes-agent-src named volume in the two- and three-container compose
files is initialised from the agent image's /opt/hermes on first `up` and
Docker reuses it verbatim on every subsequent `up` — even after a fresh
`docker pull` of the agent image. This was the root cause of #1416 (the
'missing entrypoint' symptom was a stale cached volume hiding the new
image's source tree).
Changes:
- Add an 'Upgrading the agent container' section to docs/docker.md with
the canonical `down → docker volume rm → pull → up -d` recipe, plus the
same pointer as a comment block in both multi-container compose files
near the volume declarations.
- Switch the WebUI's hermes-agent-src mount to `:ro` in both multi-container
compose files. The WebUI only reads this volume to install the agent's
Python deps at startup; mounting it read-only enforces that at the kernel
layer and brings the actual mount mode in line with the existing
docs/docker.md architecture diagram (which already labelled this edge as
read-only).
- Align the workspace bind default in both multi-container compose files
with the single-container convention — `${HERMES_WORKSPACE:-${HOME}/workspace}`
instead of `${HERMES_WORKSPACE:-~/workspace}` — so the default resolves
the same way across Linux, macOS, WSL2, and Docker Desktop on Windows.
- Add a 'What the multi-container setup isolates (and what it doesn't)'
section to docs/docker.md to frame the two/three-container setups as
process/network/resource isolation, not filesystem isolation, so users
don't reach for multi-container expecting a trust boundary it doesn't
provide.
- Cross-link #1416 from the Related issues section.
Adds 9 regression tests in tests/test_docker_docs_and_readonly.py covering:
- :ro on the WebUI side of hermes-agent-src in both files
- agent side stays read-write (still needs to populate /opt/hermes on first run)
- ${HOME} (not ~) in workspace bind defaults in both files
- single-container file already uses ${HOME} (pin to prevent drift)
- docs/docker.md has the 'Upgrading the agent container' section + recipe
- compose files reference docs/docker.md + show the upgrade step inline
- docs/docker.md frames the isolation model honestly
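The first and third checks in that list could look roughly like the sketch below. It is not the content of tests/test_docker_docs_and_readonly.py: the helper names, the service name, and the container-side mount paths in the sample snippet are all hypothetical, and the real tests may parse the compose files differently.

```python
import re

# Hypothetical compose fragment; service name and container paths are assumed.
COMPOSE_SNIPPET = """\
services:
  hermes-webui:
    volumes:
      - hermes-agent-src:/opt/hermes:ro
      - ${HERMES_WORKSPACE:-${HOME}/workspace}:/workspace
"""


def webui_mount_is_readonly(compose_text: str) -> bool:
    """True if some hermes-agent-src mount line ends in the :ro mode."""
    pattern = r"^\s*-\s*hermes-agent-src:[^\s:]+:ro\s*$"
    return re.search(pattern, compose_text, re.MULTILINE) is not None


def workspace_default_uses_home(compose_text: str) -> bool:
    """True if the workspace default uses ${HOME} and never the ~ form."""
    return (
        "${HERMES_WORKSPACE:-${HOME}/workspace}" in compose_text
        and "${HERMES_WORKSPACE:-~/workspace}" not in compose_text
    )
```

String and regex checks keep the sketch free of a YAML dependency. A fuller test would scope the `:ro` check to the WebUI service only, since the agent side must stay read-write.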
Test suite: 42 passed (33 existing Docker tests + 9 new). No behaviour
change for users who set HERMES_WORKSPACE explicitly, and no migration is
required for existing deployments — Docker rebinds the existing volume
read-only on next `up`. Users upgrading the agent image should now follow
the documented `docker volume rm hermes-agent-src` recipe.
Closes #1416 (documented upgrade procedure) and addresses the read-only
half of the multi-container coupling concern raised on #2453.