13 Commits

Author SHA1 Message Date
Dustin ae6b6b1b72 ci(windows): make taskkill no-op when server.py already exited
The path-discovery step succeeds on the first run, but the cleanup
step exits non-zero because `taskkill /PID 5560 /T /F` returns 128
("process not found") when server.py has already exited on the mock
hermes_cli stub. That's the expected steady state for this mock-only
workflow, not a failure.

Two-line fix: reset `$global:LASTEXITCODE = 0` after the taskkill
call, and explicit `exit 0` at the end of the step so any other
external-command exit codes don't bubble up. The try/catch wrapper
didn't help because taskkill writes its diagnostic to stderr without
raising a PowerShell exception — `catch` never fired.

Run 26352805510 on this branch shows the failure shape: "OK: start.ps1
path discovery - all guards passed." in the verify step, then
"ERROR: The process '5560' not found." in the cleanup step. Path
discovery is what this workflow exists to validate; cleanup just has
to not fail the job.
2026-05-24 17:10:02 +00:00
Dustin 145a442f61 ci(windows): rework #2811 with mock hermes_cli (maintainer ask, option 1)
Per @nesquena-hermes review on #2811: hermes-agent isn't published to
PyPI, so `pip install hermes-agent` finds nothing and start.ps1's
hermes_cli guard correctly bails out — leaving the previous workflow
unable to self-validate against release/stage-batch6.

This rework adopts option 1 from the review: drop the pip install,
stub a hermes_cli/ directory with a minimal __init__.py next to the
sibling hermes-agent/ folder, then run start.ps1 for 8 seconds and
assert that none of its own Write-Error guards (no Python, no agent
dir, bad port, missing hermes_cli, missing server.py) appeared in
stderr. /health is no longer probed — the server cannot boot on a
stub, and full-boot regressions stay covered by the Linux jobs and
docker-smoke.yml.

Scope intentionally narrower than the original: this workflow
validates start.ps1's PowerShell syntax + path discovery only. The
exact bug class PR #2805 caught (WOW64 ProgramFiles redirect) would
now light up red here pre-merge, which is the reason this gate exists.

Paths filter trimmed to `start.ps1` + the workflow itself; the broader
list (requirements.txt / bootstrap.py / server.py) was inherited from
the original full-boot scoping and isn't relevant for a path-discovery-
only run.

Verification: workflow runs on this PR via its own pull_request trigger.
The first CI run on this branch IS the verification.

CHANGELOG updated under [Unreleased] with a single bullet sized to the
surrounding density.
2026-05-24 17:10:02 +00:00
Nathan Esquenazi 64590cb6b9 harden(docker-smoke): catch !!ERROR/!!Exiting + tighten egg_info test
Two non-blocking observations from the review, both addressed:

1. The bad-pattern grep listed `error_exit` as a literal token, but the
   `error_exit()` function at docker_init.bash:5-10 only echoes the
   strings `"!! ERROR: "` and `"!! Exiting script (ID: $$)"` — the
   function name itself never appears in container logs. So
   `grep -E -i "error_exit"` would only fire on stray debug prints of
   the name, not on actual failures. The other patterns
   (`Failed to set (UID|GID|...)`, `groupmod: cannot`, etc.) DO catch
   real error_exit output, so this wasn't a coverage gap — just a dead
   token.

   Add `!! ERROR` and `!! Exiting script` to the bad-pattern set so the
   grep actually matches the function's output. Keep the literal
   `error_exit` token as belt-and-suspenders for any debug/echo of the
   name.

2. `test_docker_init_excludes_egg_info_during_staging` was a single
   `assert "egg-info" in src` check. That passes if any occurrence
   appears — including the explanatory comment block above the staging
   logic. A maintainer removing the `--exclude='*.egg-info'` from
   rsync but keeping the comment would slip past the test.

   Tighten to:
   - scope to the staging block (between `_stage_src=` and the
     `uv pip install` line) so comments outside that window can't
     satisfy the assertion;
   - require the literal `--exclude='*.egg-info'` rsync flag;
   - require `*.egg-info` in the block so the cp-fallback cleanup is
     also pinned;
   - additionally require `--exclude='build'`, `--exclude='dist'`,
     `--exclude='__pycache__'` so all four setuptools-touchable
     artifact dirs stay excluded.

Verified:
- tests/test_docker_docs_and_readonly.py — 11/11 pass.
- YAML parses cleanly via `yaml.safe_load`.
- Full suite: 5770 passed, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 17:34:46 -07:00
nesquena-hermes 5b6f69c884 ci(docker): runtime smoke gate for Docker init logic
Closes the source-only-test gap that let v0.51.84's :ro-mount x chown -h
{} + startup regression reach review with 5800+ green pytests. Adds a
new GitHub Actions workflow .github/workflows/docker-smoke.yml that
actually runs 'docker compose up' against each compose variant.

Triggers
--------
Path-filtered on pull_request + push to master:
  Dockerfile, docker_init.bash, docker-compose*.yml, .dockerignore,
  .env.docker.example, .github/workflows/docker-smoke.yml itself.
Also workflow_dispatch for manual runs.

Jobs
----
1. compose-config -- preflight that 'docker compose config' parses each
   of the three compose files. Cheap, fast, catches schema/interpolation
   drift in parallel before any container starts.

2. smoke (matrix: single / two-container / three-container) -- for each
   variant:
   a. Reap any leftover hermes-smoke-* containers/volumes/networks from
      prior runs (defence-in-depth on self-hosted runners; hosted runners
      are fresh).
   b. docker build -t ghcr.io/nesquena/hermes-webui:latest .
      Critical: the multi-container compose files reference the GHCR
      image. Without this retag, multi-container smoke would test the
      previously-released image, NOT the PR's docker_init.bash / Dockerfile
      changes. With the retag, Compose's default pull_policy=missing keeps
      the local build in place and the PR is genuinely exercised.
   c. mktemp -d for ephemeral HERMES_HOME + HERMES_WORKSPACE so the
      runner's host filesystem is never touched.
   d. docker compose up -d --wait --wait-timeout 120 (Dockerfile carries a
      HEALTHCHECK so --wait blocks on 'healthy', not just 'running').
   e. curl /health probe with a 30-attempt x 2s poll loop as headroom for
      the multi-container variants' Python dep install phase.
   f. grep startup logs for known-bad signatures:
        EROFS | Read-only file system | Traceback | PermissionError |
        error_exit | groupmod: cannot | usermod: cannot |
        Failed to set (UID|GID|owner|permissions|ownership)
      These are the exact patterns that would have flagged #2470 in real
      time. Failed-to-set is anchored to specific objects to avoid false
      positives on benign locale/library bootstrap warnings.
   g. trap on EXIT: docker compose down -v --remove-orphans + rm -rf the
      ephemeral host paths, regardless of how the job exited.

Safety
------
- permissions: contents: read only -- no GITHUB_TOKEN write scope.
- Fork PRs run with no secrets (standard pull_request, not
  pull_request_target).
- No host bind mounts; no ~/.hermes exposure; no network egress beyond
  what compose itself needs to pull the agent image.
- timeout-minutes: 15 on the smoke job as a hard ceiling against a
  hung docker build.
- Per-run COMPOSE_PROJECT name (hermes-smoke-VARIANT-RUNID-ATTEMPT)
  so concurrent runs or reruns can't clobber each other.

Out of scope for v1 (per design review)
---------------------------------------
- HERMES_WEBUI_SMOKE_TEST env flag in docker_init.bash -- production-code
  footgun that would let any leaked env var silently exit before
  serving traffic.
- --user 60000:60000 -- incompatible with the image's root-init phase
  and would skip the very chown branch we are guarding against.
- Local-runnable scripts/docker-smoke-test.sh -- defer until CI gating
  ships and we see what contributors actually trip over.
- Hadolint / yamllint -- separate lint workflow, follow-up PR.
- Podman runtime smoke -- defer until a podman-specific bug ships.

Pre-merge verification
----------------------
- actionlint: clean
- YAML parse: clean (3 triggers, 2 jobs, 3-variant matrix)
- bash -n on all 6 run-blocks: clean
- pytest tests/ -q --timeout=60: 5889 passed, 6 skipped (no test impact;
  workflow-only change)
- Opus design review on the brief (REVISE -> minimum scope adopted)
- Opus implementation review on this workflow (APPROVE)
2026-05-18 00:09:41 +00:00
ai-ag2026 98c9a3de72 test: tighten CI and console hygiene
(cherry picked from commit bd9e6df71c)
2026-05-11 23:13:16 +00:00
nesquena-hermes 0590d597a3 ci: install mcp + pytest-asyncio in CI; importorskip in test_mcp_server.py
CI failed on stage-323 because:
1. mcp_server.py imports the 'mcp' package (optional runtime dep) — only
   users who actually run the MCP integration install it. CI runs with
   stdlib-only deps (pyyaml + pytest + pytest-timeout).
2. tests/test_mcp_server.py uses pytest.mark.asyncio which requires
   pytest-asyncio — not installed in CI.

Fix:
- Add pytest-asyncio to CI install line.
- Try-install mcp; if it fails (Python 3.13 wheel issues, etc.) the test
  module uses pytest.importorskip and skips cleanly without breaking the
  matrix.
- tests/test_mcp_server.py: add module-level importorskip for both 'mcp'
  and 'pytest_asyncio' as a safety net.

Local: 4947/4947 still pass after change.
2026-05-08 20:26:11 +00:00
nesquena-hermes 38e215e8f8 fix: dynamic version badge — read from git tag, never hardcoded (#790)
* fix: dynamic version badge — read from git tag, never hardcoded

The settings panel showed v0.50.87 and the HTTP Server: header said
HermesWebUI/0.50.38 — both hardcoded strings that drift further behind
with every release because there was no mechanism to keep them in sync.

Changes:
- api/updates.py: add _run_git() (moved before _detect_webui_version),
  _detect_webui_version(), and WEBUI_VERSION module constant resolved
  once at import time via 'git describe --tags --always --dirty'.
  Fallback chain: git → api/_version.py → 'unknown'.
- api/routes.py: inject webui_version into GET /api/settings response
  so the frontend can read it without a separate API call.
- static/panels.js: loadSettingsPanel() populates .settings-version-badge
  from settings.webui_version — one line after the existing api() call.
- static/index.html: replace stale hardcoded 'v0.50.87' with '—'
  placeholder; JS overwrites it as soon as the settings panel opens.
- server.py: replace hardcoded 'HermesWebUI/0.50.38' server_version with
  'HermesWebUI/' + WEBUI_VERSION.lstrip('v') — stays in sync automatically.
- Dockerfile: add ARG HERMES_VERSION=unknown and write api/_version.py
  so Docker images (where .git is excluded) still show the correct tag.
- .github/workflows/release.yml: pass build-args: HERMES_VERSION=${{ github.ref_name }}
  to the Docker build step on tag pushes.
- .gitignore: exclude api/_version.py (generated by Docker/CI, never committed).

No manual 'update the version badge' step is required going forward.
Tagging is sufficient — the badge and HTTP header update automatically.

Tests: 18 new tests in tests/test_version_badge.py covering the full
resolution chain, /api/settings injection, HTML placeholder, JS wiring,
and server.py import. 1596 tests pass total.

* fix: address review feedback on PR #790

- api/updates.py: replace exec() with regex parse for api/_version.py
  (no supply-chain risk from build artifact; exec unnecessary for one assignment)
- api/updates.py: cap git describe timeout at 3s (was 10s — import-time
  stall on NFS/.git would block server startup unnecessarily)
- server.py: lstrip('v') → removeprefix('v') (lstrip strips chars not prefix)
- server.py: emit bare 'HermesWebUI' when version is 'unknown' rather than
  'HermesWebUI/unknown' (log aggregators expect semver-ish suffix or none)
- CHANGELOG.md: add v0.50.124 entry for this user-visible change
- tests: rename exec-error test to reflect regex behaviour; add tests for
  removeprefix usage and unknown-version header guard (1598 tests total)

---------

Co-authored-by: nesquena-hermes <hermes@nesquena.com>
2026-04-20 20:36:53 -07:00
Nathan Esquenazi b12a682121 ci: add GitHub Actions workflow to run tests on PRs (#307)
Runs pytest suite across Python 3.11, 3.12, and 3.13 on ubuntu-latest.
Agent-dependent tests auto-skip via existing conftest logic.
Triggers on PRs targeting master and pushes to master.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:38:27 -07:00
Nathan Esquenazi bba9a236c3 fix(ci): use stricter semver regex for Docker tag extraction
Use v(\d+\.\d+(?:\.\d+)?) instead of v(.*) to only match real
version numbers (v0.29, v0.29.1), not arbitrary strings.
Keep latest unconditional since all tag pushes are releases.

Based on review of PR #52 approach vs #53.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 19:32:50 -07:00
Nathan Esquenazi 1a579ef9cf fix(ci): use match pattern for Docker tags to support 2-part versions
The semver pattern in docker/metadata-action requires 3-part versions
(e.g. v0.29.0). Our tags use 2-part (v0.29), causing the metadata
step to produce empty tags, which made build-push-action fail.

Fix: use type=match with regex to extract the version string directly,
plus type=raw for the latest tag unconditionally.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 19:27:41 -07:00
Nathan Esquenazi 2766314e81 fix(ci): use standard version tags for GitHub Actions
The SHA-pinned versions from the security hardening commit referenced
non-existent commit hashes, causing the workflow to fail with 'unable
to resolve action'. Switch to standard major version tags (v4, v3, v2,
v6, v5) which are the recommended approach for GitHub-maintained and
well-known actions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:09:36 -07:00
Nathan Esquenazi 4a3b9571f1 fix(ci): pin all GitHub Actions to full commit SHAs for supply chain security
Pinned all 7 third-party actions from mutable version tags to immutable
commit SHAs. Mutable tags (e.g. @v4) can be force-pushed by the action
author (or a compromised account) to inject malicious code into the workflow,
which runs with write access to the repo and GHCR registry.

Also moved 'permissions' from workflow level to job level (best practice:
scope permissions as narrowly as possible).

Pin mapping:
  actions/checkout@v4               -> @11bd71901bbe...  (v4.2.2)
  softprops/action-gh-release@v2    -> @c062e08bd532...  (v2.2.1)
  docker/setup-qemu-action@v3       -> @49b3bc8e6bdd...  (v3.2.0)
  docker/setup-buildx-action@v3     -> @c47758b77c97...  (v3.7.1)
  docker/login-action@v3            -> @9780b0c442fb...  (v3.3.0)
  docker/metadata-action@v5         -> @369eb591f429...  (v5.6.1)
  docker/build-push-action@v6       -> @ca877d9245fe...  (v6.10.0)
2026-04-03 21:02:08 +00:00
Nathan Esquenazi 6a61f36280 ci: add GitHub Actions workflow for multi-arch Docker + releases
On tag push (v*):
- Creates a GitHub Release with auto-generated release notes
- Builds multi-arch Docker image (linux/amd64, linux/arm64)
- Pushes to ghcr.io/nesquena/hermes-webui with semver tags
- Uses GitHub Actions cache for faster builds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 13:55:41 -07:00