mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-21 03:39:54 +00:00
dependabot/uv/python-dotenv-1.2.2
405 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
c1eb2dcda7 |
feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback (#24220)
* feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback
Three coordinated mitigations for the Mini Shai-Hulud worm hitting
mistralai 2.4.6 on PyPI (2026-05-12) and for the next single-package
compromise that follows.
# What this PR makes true
1. Users with the poisoned mistralai 2.4.6 in their venv get a loud
detection banner with copy-pasteable remediation steps the moment
they run hermes (and on every gateway startup).
2. One quarantined / yanked PyPI package can no longer silently demote
a fresh install to 'core only' — the installer keeps every other
extra and tells the user which tier landed.
3. Future opt-in backends (Mistral, ElevenLabs, Honcho, etc.) can
lazy-install on first use under a strict allowlist, instead of
eagerly pulling everything at install time.
# Detection: hermes_cli/security_advisories.py
- ADVISORIES catalog (one entry currently: shai-hulud-2026-05 for
mistralai==2.4.6). Adding the next one is a single dataclass.
- detect_compromised() uses importlib.metadata.version() — no pip
dependency, works in uv venvs that lack pip.
- Banner cache (~/.hermes/cache/advisory_banner_seen) rate-limits
the startup banner to once per 24h per advisory.
- Acks persisted to security.acked_advisories in config.yaml; never
re-banner after ack.
- Wired into:
* hermes doctor — runs first, prints full remediation block
* hermes doctor --ack <id> — dismisses an advisory
* cli.py interactive run() and single-query branches — short
stderr banner pointing at hermes doctor
* gateway/run.py startup — operator-visible warning in gateway.log
# Lazy-install framework: tools/lazy_deps.py
- LAZY_DEPS allowlist maps namespaced feature keys (tts.elevenlabs,
memory.honcho, provider.bedrock, etc.) to pip specs.
- ensure(feature) installs missing deps in the active venv via the
uv → pip → ensurepip ladder (matches tools_config._pip_install).
- Strict spec safety regex rejects URLs, file paths, shell metas,
pip flag injection, control chars — only PyPI-by-name accepted.
- Gated on security.allow_lazy_installs (default true) plus the
HERMES_DISABLE_LAZY_INSTALLS env var for restricted/audited envs.
- Migrated three backends as proof of pattern:
* tools/tts_tool.py — _import_elevenlabs() calls ensure first
* plugins/memory/honcho/client.py — get_honcho_client lazy-installs
* tts.mistral / stt.mistral entries pre-registered for when PyPI
restores mistralai
# Installer fallback tiers
scripts/install.sh, scripts/install.ps1, setup-hermes.sh:
- Centralised _BROKEN_EXTRAS list (currently: mistral). Edit one
array when a transitive breaks; users keep every other extra.
- New 'all minus known-broken' tier between [all] and the existing
PyPI-only-extras tier. Only kicks in when [all] fails resolve.
- All three tiers explicit: every fallback announces which tier
landed and prints a re-run hint when not on Tier 1.
- install.ps1 and install.sh both regenerate their tier specs from
the same _BROKEN_EXTRAS array so updates stay in sync.
Side effect: install.ps1 Tier 2 spec previously hardcoded 'mistral'
in its extra list — bug fixed by the refactor (mistral is filtered
out).
# Config
hermes_cli/config.py — DEFAULT_CONFIG.security gains:
- acked_advisories: [] (advisory IDs the user has dismissed)
- allow_lazy_installs: True (security gate for ensure())
No config version bump needed — both keys nest under existing
security: block, and load_config's deep-merge picks up DEFAULT_CONFIG
defaults for users with older configs.
# Tests
tests/hermes_cli/test_security_advisories.py — 23 tests covering:
- detect_compromised matches/non-matches, wildcard frozenset
- ack persistence, idempotence, blank rejection, config-failure path
- banner cache rate limiting + 24h re-banner + ack-stops-banner
- short_banner_lines / full_remediation_text / render_doctor_section /
gateway_log_message
- shipped catalog well-formedness invariant
tests/tools/test_lazy_deps.py — 40 tests covering:
- spec safety: 11 safe parametrized + 18 unsafe parametrized
- allowlist: unknown-feature rejection, namespace.name shape,
every shipped spec passes the safety regex
- security gating: config flag, env var, default, fail-open
- ensure() happy/sad paths: already-satisfied, install success,
pip stderr surfaced on failure, install-succeeds-but-still-missing
- is_available, feature_install_command
Combined: 63 new tests, all passing under scripts/run_tests.sh.
# Validation
- scripts/run_tests.sh tests/hermes_cli/test_security_advisories.py
tests/tools/test_lazy_deps.py → 63/63 passing
- scripts/run_tests.sh tests/hermes_cli/test_doctor.py
tests/hermes_cli/test_doctor_command_install.py
tests/tools/test_tts_mistral.py tests/tools/test_transcription_tools.py
tests/tools/test_transcription_dotenv_fallback.py → 165/165 passing
- scripts/run_tests.sh tests/hermes_cli/ tests/tools/ →
9191 passed, 8 pre-existing failures (verified on origin/main
before this change)
- bash -n on install.sh and setup-hermes.sh → OK
- py_compile on all modified .py files → OK
- End-to-end smoke test of detect_compromised + render_doctor_section
+ gateway_log_message with mocked installed version → produces
copy-pasteable remediation output
# Community
Full advisory + remediation steps:
website/docs/community/security-advisories/shai-hulud-mistralai-2026-05.md
Short-form post drafts (Discord, GitHub pinned issue, README banner):
scripts/community-announcement-shai-hulud.md
Refs: PR #24205 (mistral disabled), Socket Security advisory
<https://socket.dev/blog/mini-shai-hulud-worm-pypi>
* build(deps): pin every direct dep to ==X.Y.Z (no ranges)
Companion to the supply-chain advisory work: replace every >=/</~= range
in pyproject.toml's [project.dependencies] and [project.optional-dependencies]
with an exact ==X.Y.Z pin sourced from uv.lock.
Why: ranges allow PyPI to ship a fresh version of any direct dep at any
time without a code review on our side. With ranges, the malicious
mistralai 2.4.6 release would have been pulled by every fresh
'pip install -e .[all]' for the hours between upload and PyPI's
quarantine — exactly the install window we got hit on. Exact pins close
that window: the only way a new package version reaches a user is via
an intentional update on our end.
What the user-facing change is: nothing, behavior-wise. Every package
resolves to the same version it was already resolving to via uv.lock —
the pins just remove the resolver's freedom to pick a different one.
Cost: any user installing Hermes alongside another package that requires
a newer pin gets a resolver conflict. Acceptable for our isolated-venv
install path; documented in the new comment block.
Build-system requires line (setuptools>=61.0) is intentionally left
as a range — pinning the build backend would block fresh pip from
bootstrapping the build on architectures where that exact wheel isn't
available.
mistral extra (mistralai==2.3.0) is pinned but stays out of [all]
(per PR #24205). 'uv lock' regeneration will fail until PyPI restores
mistralai; lockfile regeneration is gated behind that, NOT on every PR.
LAZY_DEPS in tools/lazy_deps.py also moved to exact pins so the lazy-
install pathway can never resolve a different version than the one
declared in pyproject.toml.
Validation:
- Cross-checked all 77 pinned direct deps in pyproject.toml against
uv.lock — every pin matches the resolved version exactly.
- Cross-checked all LAZY_DEPS specs against uv.lock — same.
- 'uv pip install -e .[all] --dry-run' resolves 205 packages cleanly.
- tests/tools/test_lazy_deps.py + tests/hermes_cli/test_security_advisories.py
→ 63/63 passing (every shipped spec passes the safety regex).
- Doctor + TTS + transcription targeted suite → 146/146 passing.
* build(deps): hash-verify transitives via uv.lock; remove unresolvable [mistral] extra
You asked: 'what about the dependencies the dependencies rely on?' —
correctly noting that exact-pinning direct deps in pyproject.toml does
NOT cover the transitive graph. `pip install` and `uv pip install` both
re-resolve transitives fresh from PyPI at install time, so a compromised
transitive (e.g. `httpcore` if it got worm-poisoned tomorrow) would
still hit our users even with every direct dep exact-pinned.
# What this commit fixes
1. **Both real installer scripts now prefer `uv sync --locked` as Tier 0.**
uv.lock records SHA256 hashes for every transitive — a compromised
package with a different hash gets REJECTED. Falls through to the
existing `uv pip install` cascade if the lockfile is missing or
stale, with a loud warning that the fallback path does NOT
hash-verify transitives. Previously only `setup-hermes.sh` (the dev
path) used the lockfile; `scripts/install.sh` and `scripts/install.ps1`
(the paths fresh users actually run) skipped it.
2. **Removed the `[mistral]` extra entirely.** The `mistralai` PyPI
project is fully quarantined right now — every version returns 404,
so any pin we wrote was unresolvable, which broke `uv lock --check`
in CI. Restoration is documented in pyproject.toml as a 5-step
checklist (verify, re-add extra, re-enable in 4 modules, regenerate
lock, optionally re-add to [all]).
3. **Regenerated uv.lock.** 262 packages, mistralai/eval-type-backport/
jsonpath-python pruned. `uv lock --check` now passes.
# Defense-in-depth view
| Layer | Where | Protects against |
|----------------------------|-------------------|-------------------------------------------|
| Exact pins in pyproject | direct deps | new mistralai 2.4.6-style direct compromise |
| uv.lock + `--locked` install | transitive graph | transitive worm injection |
| Tier-0 hash-verified path | install.sh / .ps1 | actually USE the lockfile in fresh installs |
| `uv lock --check` CI gate | every PR | drift between pyproject and lockfile |
| `hermes_cli/security_advisories.py` | runtime | cleanup for users who already got hit |
The exact pinning + hash verification together close the supply-chain
gap. Without the lockfile path, exact pins alone are theater.
# Validation
- `uv lock --check` → passes (262 packages resolved, no drift).
- `bash -n` on install.sh + setup-hermes.sh → OK.
- 209/209 tests passing across new + adjacent test files
(test_lazy_deps.py, test_security_advisories.py, test_doctor.py,
test_tts_mistral.py, test_transcription_tools.py).
- TOML parse OK.
* chore: remove community announcement drafts (PR body covers it)
* build(deps): lazy-install every opt-in backend (anthropic, search, terminal, platforms, dashboard)
Extends the lazy-install framework to cover everything that's not used by
every hermes session. Base install drops from ~60 packages to 45.
Moved out of core dependencies = []:
- anthropic (only when provider=anthropic native, not via aggregators)
- exa-py, firecrawl-py, parallel-web (search backends; only when picked)
- fal-client (image gen; only when picked)
- edge-tts (default TTS but still optional)
New extras in pyproject.toml: [anthropic] [exa] [firecrawl] [parallel-web]
[fal] [edge-tts]. All added to [all].
New LAZY_DEPS entries: provider.anthropic, search.{exa,firecrawl,parallel},
tts.edge, image.fal, memory.hindsight, platform.{telegram,discord,matrix},
terminal.{modal,daytona,vercel}, tool.dashboard.
Each import site now calls ensure() before importing the SDK. Where the
module had a top-level try/except (telegram, discord, fastapi), the
graceful-fallback pattern was extended to lazy-install on first
check_*_requirements() call and re-bind module globals.
Updated test_windows_native_support.py tzdata check from snapshot
(>=2023.3 literal) to invariant (any version + win32 marker).
Validation:
- Base install: 45 packages (was ~60); 6 newly-extracted packages absent
- uv lock --check: passes (262 packages, no drift)
- 209/209 lazy_deps + advisory + doctor + tts/transcription tests passing
- py_compile clean on all 12 modified modules
|
||
|
|
7b76366552 |
feat(prompt-cache): cross-session 1h prefix cache for Claude on Anthropic / OpenRouter / Nous Portal (#23828)
Cuts input cost for first-turn Claude requests by ~85-90% on subsequent
sessions within an hour. Tools array (~13k tokens for default toolset) +
stable system prefix (~5-8k tokens) get a 1h cache_control marker; the
volatile suffix (memory, USER profile, timestamp, session id) sits in a
separate non-cached block at the end so it doesn't poison the cross-session
prefix when it changes.
Provider gate: Claude on native Anthropic (incl. OAuth subscription),
OpenRouter, and Nous Portal (which proxies to OpenRouter). All other
providers keep today's system_and_3 layout unchanged.
Layout (4 cache_control breakpoints, Anthropic max):
1. tools[-1] -> 1h (cross-session)
2. system content[0] -> 1h (cross-session, stable prefix)
3. messages[-2] -> 5m (within-session rolling)
4. messages[-1] -> 5m (within-session rolling)
Within-session rolling shrinks from 3 messages to 2 to free the breakpoint
budget. On Claude with realistic tool loadouts the long-lived tier carries
the bulk of cross-session value anyway.
System prompt is now always assembled cache-friendly: stable identity /
guidance / skills / platform hints first, then session-stable context
files (AGENTS.md, .cursorrules), then per-call volatile content. Old
single-string callers see the same logical content (same join order),
just reordered so volatile lives at the end.
Config knobs (defaults shown):
prompt_caching:
cache_ttl: "5m" # rolling-window TTL (unchanged)
long_lived_prefix: true # opt-out switch
long_lived_ttl: "1h" # cross-session prefix TTL
Live E2E (tests/agent/test_prompt_caching_live.py, gated on
OPENROUTER_API_KEY) on anthropic/claude-haiku-4.5 with default toolset:
Call 1 (cold): cache_write=13,415 cache_read=0
Call 2 (NEW agent + msg): cache_write=391 cache_read=13,025
Cross-session reuse: 97.09%
Implementation:
* agent/prompt_caching.py: new apply_anthropic_cache_control_long_lived()
+ mark_tools_for_long_lived_cache(); existing apply_anthropic_cache_control()
preserved verbatim for the fallback path.
* agent/anthropic_adapter.py: convert_tools_to_anthropic() now forwards
cache_control onto each Anthropic-format tool dict.
* run_agent.py: _build_system_prompt_parts() returns the 3-tier dict;
_build_system_prompt() joins them (backward compatible).
_supports_long_lived_anthropic_cache() policy added next to the existing
_anthropic_prompt_cache_policy() (which now also recognises Nous Portal
Claude — pre-existing gap fixed in passing).
_build_api_kwargs() resolves tools_for_api once and propagates the
marker through all four build paths (anthropic_messages, bedrock,
codex_responses, profile/legacy chat completions).
Long-lived flag plumbed into the runtime snapshot/restore + model-switch
+ fallback-promotion paths.
Tests:
* tests/agent/test_prompt_caching.py: +8 tests (TestMarkToolsForLongLivedCache,
TestApplyAnthropicCacheControlLongLived).
* tests/run_agent/test_anthropic_prompt_cache_policy.py: +9 tests
(TestSupportsLongLivedAnthropicCache matrix across 8 endpoint classes
+ a fallback-target case).
* tests/agent/test_prompt_caching_live.py: new live E2E (skipif when
OPENROUTER_API_KEY is unset; runs outside the hermetic suite).
* Targeted suites: 327/327 pass (caching/adapter/policy/builder).
* tests/agent/ + tests/run_agent/: 3992 pass, 17 skip, 1 pre-existing
flake (test_async_httpx_del_neuter::test_same_key_replaces_stale_loop_entry,
verified failing on pristine origin/main).
|
||
|
|
2ec8d2b42f |
chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)
Replace with for all literal-tuple membership tests. Set lookup is O(1) vs O(n) for tuple — consistent micro-optimization across the codebase. 608 instances fixed via `ruff --fix --unsafe-fixes`, 0 remaining. 133 files, +626/-626 (net zero). |
||
|
|
ebf2ea584a |
feat(terminal,cli): docker_extra_args + display.timestamps
Two independent opt-in QoL toggles, both off by default. terminal.docker_extra_args: - List of extra flags appended verbatim to docker run after security defaults. Useful for adding capabilities (e.g. --cap-add SETUID) or other docker run options not exposed by existing config keys. - Non-string entries are logged and skipped. - Also available via TERMINAL_DOCKER_EXTRA_ARGS='[...]' env var. display.timestamps: - Appends [HH:MM] to user input bullet and the assistant response box header. Single hub in _format_submitted_user_message_preview() covers both single-line and multi-line user previews; assistant response label gets the timestamp at box-open time. Closes #1569 (timestamps). Co-authored-by: Mibayy <Mibayy@users.noreply.github.com> |
||
|
|
228a4d11ae |
fix(config): warn loudly on YAML parse failure instead of silent default fallback (#23585)
A YAML parse error in ~/.hermes/config.yaml caused load_config() to print one line to stdout (Warning: Failed to load config: ...) and silently fall back to DEFAULT_CONFIG, dropping every user override (auxiliary providers, fallback chain, model settings). Users only noticed when downstream behavior misbehaved — see issue #23570 where a tab-indent error in the auxiliary section caused aux fallback to use OpenRouter (depleted) instead of the configured Codex/MiniMax chain. Now: log at WARNING (so 'hermes logs' surfaces it), write a prominent line to stderr, dedup on (path, mtime_ns, size) so concurrent loads don't spam, and re-warn after the user edits the file. Both call sites (raw read + merged load) route through the same helper. Refs #23570 |
||
|
|
116a1446a4 |
fix(terminal): bridge docker_env config to TERMINAL_DOCKER_ENV
Problem: terminal.docker_env set in config.yaml was silently ignored.
Docker containers never received the user-specified env vars.
Root cause: docker_env was missing from all three config→env bridging
maps (cli.py env_mappings, gateway/run.py _terminal_env_map,
hermes_cli/config.py _config_to_env_sync) and from the terminal_tool
_get_env_config() reader. _create_environment() consumed the key from
container_config correctly, but it was always {} because TERMINAL_DOCKER_ENV
was never set.
Also extend the list-serialisation branches in cli.py and gateway/run.py
to handle dict values via json.dumps (lists already used json.dumps;
plain str() on a dict produces undecodable output).
Fix:
- cli.py: add "docker_env": "TERMINAL_DOCKER_ENV" to env_mappings;
serialise dict values with json.dumps alongside existing list path
- gateway/run.py: same additions to _terminal_env_map and serialisation
- hermes_cli/config.py: add "terminal.docker_env": "TERMINAL_DOCKER_ENV"
to _config_to_env_sync so `hermes config set terminal.docker_env …`
persists to .env correctly
- tools/terminal_tool.py: add docker_env key to _get_env_config() reading
TERMINAL_DOCKER_ENV via _parse_env_var with default "{}"
Tests: add test_docker_env_is_bridged_everywhere to
tests/tools/test_terminal_config_env_sync.py — stash-verified: fails on
origin/main, passes with fix.
Fixes #20537
|
||
|
|
0bcc327cab |
docs(openrouter): document auxiliary.<task>.extra_body for OR routing and Pareto (#22844)
The plumbing for setting OpenRouter provider preferences and the Pareto Code router on auxiliary tasks already exists — auxiliary.<task>.extra_body is forwarded verbatim by call_llm() / async_call_llm(). It just wasn't documented, so users who wanted (e.g.) Pareto Code routing for compression but the strongest coder for the main agent had no way to discover the escape hatch. - hermes_cli/config.py: expand the auxiliary section header with a YAML example showing provider routing plus plugins under extra_body, and an explicit note that main-agent provider_routing / openrouter.min_coding_score do NOT propagate to aux calls (each task is independent by design) - website/docs/user-guide/configuration.md: new 'OpenRouter routing and Pareto Code for auxiliary tasks' subsection with worked example - website/docs/integrations/providers.md: cross-link from the Pareto Code Router section to the aux-side doc E2E verified that auxiliary.<task>.extra_body reaches the OpenRouter API with the configured provider routing and plugins blocks intact. |
||
|
|
c7f0aab949 |
feat(openrouter): wire Pareto Code router with min_coding_score knob (#22838)
Pick openrouter/pareto-code as your model and OpenRouter auto-routes each
request to the cheapest model meeting your coding-quality bar (ranked by
Artificial Analysis). The new openrouter.min_coding_score config key (0.0-1.0,
default 0.65) tunes the floor.
- hermes_cli/models.py: add openrouter/pareto-code to OPENROUTER_MODELS so
it shows up in the picker with a description
- hermes_cli/config.py: add openrouter.min_coding_score (default 0.65 — lands
on a mid-tier coder on the current Pareto frontier)
- plugins/model-providers/openrouter: emit extra_body.plugins =
[{id: pareto-router, min_coding_score: X}] when model is openrouter/pareto-code
AND the score is a valid float in [0.0, 1.0]
- agent/transports/chat_completions.py: same emission on the legacy flag
path (when no provider profile is loaded)
- run_agent.py: openrouter_min_coding_score kwarg + storage; plumbed into
both build_kwargs() invocations and the context-summary extra_body path
- cli.py: read openrouter.min_coding_score once at init, validate float in
[0,1], pass to AIAgent constructions (CLI + background-task paths)
- cron/scheduler.py, batch_runner.py, tools/delegate_tool.py,
tui_gateway/server.py: propagate the kwarg (mirrors providers_order
plumbing — subagents inherit, cron/batch read from config)
- tests: profile-level + transport-level coverage of the model gating,
unset/empty/out-of-range handling, and the legacy flag path
- docs: new 'OpenRouter Pareto Code Router' section in providers.md
Verified end-to-end against api.openrouter.ai: at score=0.65 we land on a
mid-tier coder, at omission we get the strongest. Score is silently dropped
on any model other than openrouter/pareto-code, so it's safe to leave set.
|
||
|
|
b9c001116e |
feat: confirm prompt for destructive slash commands (#4069) (#22687)
/clear, /new, /reset, and /undo now ask the user to confirm before discarding conversation state — three-option prompt routed through the existing tools.slash_confirm primitive. Native yes/no buttons render on Telegram, Discord, and Slack (their adapters already implement send_slash_confirm); other platforms get a text-fallback prompt and reply with /approve, /always, or /cancel. The classic prompt_toolkit CLI uses the same three-option flow via the established _prompt_text_input pattern (see _confirm_and_reload_mcp). TUI keeps its existing modal overlay (#12312). Gated by new config key approvals.destructive_slash_confirm (default true). Picking 'Always Approve' flips the gate to false so subsequent destructive commands run silently — matches the established mcp_reload_confirm UX. Out of scope: /cron remove (separate domain — scheduled jobs, not session history). Existing TUI overlay env-var (HERMES_TUI_NO_CONFIRM) left unchanged; cosmetic unification can come later. Closes #4069. |
||
|
|
cc38282b04 |
feat(cross-platform): psutil for PID/process management + Windows footgun checker
## Why
Hermes supports Linux, macOS, and native Windows, but the codebase grew up
POSIX-first and has accumulated patterns that silently break (or worse,
silently kill!) on Windows:
- `os.kill(pid, 0)` as a liveness probe — on Windows this maps to
CTRL_C_EVENT and broadcasts Ctrl+C to the target's entire console
process group (bpo-14484, open since 2012).
- `os.killpg` — doesn't exist on Windows at all (AttributeError).
- `os.setsid` / `os.getuid` / `os.geteuid` — same.
- `signal.SIGKILL` / `signal.SIGHUP` / `signal.SIGUSR1` — module-attr
errors at runtime on Windows.
- `open(path)` / `open(path, "r")` without explicit encoding= — inherits
the platform default, which is cp1252/mbcs on Windows (UTF-8 on POSIX),
causing mojibake round-tripping between hosts.
- `wmic` — removed from Windows 10 21H1+.
This commit does three things:
1. Makes `psutil` a core dependency and migrates critical callsites to it.
2. Adds a grep-based CI gate (`scripts/check-windows-footguns.py`) that
blocks new instances of any of the above patterns.
3. Fixes every existing instance in the codebase so the baseline is clean.
## What changed
### 1. psutil as a core dependency (pyproject.toml)
Added `psutil>=5.9.0,<8` to core deps. psutil is the canonical
cross-platform answer for "is this PID alive" and "kill this process
tree" — its `pid_exists()` uses `OpenProcess + GetExitCodeProcess` on
Windows (NOT a signal call), and its `Process.children(recursive=True)`
+ `.kill()` combo replaces `os.killpg()` portably.
### 2. `gateway/status.py::_pid_exists`
Rewrote to call `psutil.pid_exists()` first, falling back to the
hand-rolled ctypes `OpenProcess + WaitForSingleObject` dance on Windows
(and `os.kill(pid, 0)` on POSIX) only if psutil is somehow missing —
e.g. during the scaffold phase of a fresh install before pip finishes.
### 3. `os.killpg` migration to psutil (7 callsites, 5 files)
- `tools/code_execution_tool.py`
- `tools/process_registry.py`
- `tools/tts_tool.py`
- `tools/environments/local.py` (3 sites kept as-is, suppressed with
`# windows-footgun: ok` — the pgid semantics psutil can't replicate,
and the calls are already Windows-guarded at the outer branch)
- `gateway/platforms/whatsapp.py`
### 4. `scripts/check-windows-footguns.py` (NEW, 500 lines)
Grep-based checker with 11 rules covering every Windows cross-platform
footgun we've hit so far:
1. `os.kill(pid, 0)` — the silent killer
2. `os.setsid` without guard
3. `os.killpg` (recommends psutil)
4. `os.getuid` / `os.geteuid` / `os.getgid`
5. `os.fork`
6. `signal.SIGKILL`
7. `signal.SIGHUP/SIGUSR1/SIGUSR2/SIGALRM/SIGCHLD/SIGPIPE/SIGQUIT`
8. `subprocess` shebang script invocation
9. `wmic` without `shutil.which` guard
10. Hardcoded `~/Desktop` (OneDrive trap)
11. `asyncio.add_signal_handler` without try/except
12. `open()` without `encoding=` on text mode
Features:
- Triple-quoted-docstring aware (won't flag prose inside docstrings)
- Trailing-comment aware (won't flag mentions in `# os.kill(pid, 0)` comments)
- Guard-hint aware (skips lines with `hasattr(os, ...)`,
`shutil.which(...)`, `if platform.system() != 'Windows'`, etc.)
- Inline suppression with `# windows-footgun: ok — <reason>`
- `--list` to print all rules with fixes
- `--all` / `--diff <ref>` / staged-files (default) modes
- Scans 380 files in under 2 seconds
### 5. CI integration
A GitHub Actions workflow that runs the checker on every PR and push is
staged at `/tmp/hermes-stash/windows-footguns.yml` — not included in this
commit because the GH token on the push machine lacks `workflow` scope.
A maintainer with `workflow` permissions should add it as
`.github/workflows/windows-footguns.yml` in a follow-up. Content:
```yaml
name: Windows footgun check
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with: {python-version: "3.11"}
- run: python scripts/check-windows-footguns.py --all
```
### 6. CONTRIBUTING.md — "Cross-Platform Compatibility" expansion
Expanded from 5 to 16 rules, each with message, example, and fix.
Recommends psutil as the preferred API for PID / process-tree operations.
### 7. Baseline cleanup (91 → 0 findings)
- 14 `open()` sites → added `encoding='utf-8'` (internal logs/caches) or
`encoding='utf-8-sig'` (user-editable files that Notepad may BOM)
- 23 POSIX-only callsites in systemd helpers, pty_bridge, and plugin
tool subprocess management → annotated with
`# windows-footgun: ok — <reason>`
- 7 `os.killpg` sites → migrated to psutil (see §3 above)
## Verification
```
$ python scripts/check-windows-footguns.py --all
✓ No Windows footguns found (380 file(s) scanned).
$ python -c "from gateway.status import _pid_exists; import os
> print('self:', _pid_exists(os.getpid())); print('bogus:', _pid_exists(999999))"
self: True
bogus: False
```
Proof-of-repro that `os.kill(pid, 0)` was actually killing processes
before this fix — see commit `1cbe39914` and bpo-14484. This commit
removes the last hand-rolled ctypes path from the hot liveness-check
path and defers to the best-maintained cross-platform answer.
|
||
|
|
cbce5e93fc |
codebase: add encoding='utf-8' to all bare open() calls (PLW1514)
Closes the last Python-on-Windows UTF-8 exposure by making every
text-mode open() call explicit about its encoding.
Before: on Windows, bare open(path, 'r') defaults to the system
locale encoding (cp1252 on US-locale installs). That means reading
any config/yaml/markdown/json file with non-ASCII content either
crashes with UnicodeDecodeError or silently mis-decodes bytes.
After: all 89 affected call sites in production code now pass
encoding='utf-8' explicitly. Works identically on every platform
and every locale, no surprise behavior.
Mechanical sweep via:
ruff check --preview --extend-select PLW1514 --unsafe-fixes --fix --exclude 'tests,venv,.venv,node_modules,website,optional-skills, skills,tinker-atropos,plugins' .
All 89 fixes have the same shape: open(x) or open(x, mode) became
open(x, encoding='utf-8') or open(x, mode, encoding='utf-8'). Nothing
else changed. Every modified file still parses and the Windows/sandbox
test suite is still green (85 passed, 14 skipped, 0 failed across
tests/tools/test_code_execution_windows_env.py +
tests/tools/test_code_execution_modes.py + tests/tools/test_env_passthrough.py +
tests/test_hermes_bootstrap.py).
Scope notes:
- tests/ excluded: test fixtures can use locale encoding intentionally
(exercising edge cases). If we want to tighten tests later that's
a separate PR.
- plugins/ excluded: plugin-specific conventions may differ; plugin
authors own their code.
- optional-skills/ and skills/ excluded: skill scripts are user-authored
and we don't want to mass-edit them.
- website/ and tinker-atropos/ excluded: vendored / generated content.
46 files touched, 89 +/- lines (symmetric replacement). No behavior
change on POSIX or on Windows when the file is ASCII; bug fix on
Windows when the file contains non-ASCII.
|
||
|
|
b53bd12fe4 |
fix(windows-editor): default EDITOR=notepad so /edit and Ctrl+X Ctrl+E work
Pre-existing Windows bug surfaced while reviewing the portable-MinGit
install: prompt_toolkit's Buffer.open_in_editor() falls back to POSIX
absolute paths (/usr/bin/nano, /usr/bin/vi, /usr/bin/emacs) that don't
exist on native Windows. When neither $EDITOR nor $VISUAL is set,
Ctrl+X Ctrl+E ("open prompt in editor") and /edit both silently do
nothing on Windows — the user hits the key, nothing happens, no error.
This wasn't caused by MinGit (full Git for Windows doesn't fix it either,
because the Windows Python subprocess call resolves `/usr/bin/nano` as
`C:\usr\bin\nano`, which doesn't exist even with nano installed).
Fixes:
- hermes_cli/stdio.py::configure_windows_stdio now sets EDITOR=notepad
on Windows if neither EDITOR nor VISUAL is set. notepad.exe is in
every Windows install, works as a blocking editor (subprocess.call
waits for the window to close), and writes back to the file.
- hermes_cli/config.py (hermes config edit): reorder fallback list so
Windows tries notepad first — previously nano led the list, which
required Git Bash / WSL to be in PATH.
- Users who want VSCode / Neovim / Notepad++ can still override via
$env:EDITOR — that's checked before our default kicks in. Docstring
spells out the common overrides.
The Ink TUI (`hermes --tui`) already handled Windows correctly via
ui-tui/src/lib/editor.ts falling back to notepad.exe on win32 — this
commit brings the classic prompt_toolkit CLI into parity.
3 new tests in test_windows_native_support.py verify:
- EDITOR=notepad gets set when unset on Windows
- Explicit $EDITOR is respected
- $VISUAL is respected (not overwritten by our default)
|
||
|
|
34f7297359 | Serialize Hermes config access | ||
|
|
24d48ffb82 |
feat(kanban): add specify — auxiliary LLM fleshes out triage tasks (#21435)
* feat(kanban): add `specify` — auxiliary LLM fleshes out triage tasks
The Triage column shipped with a placeholder 'a specifier will flesh
out the spec', but the specifier itself was never built. This wires
it up as a dedicated CLI verb.
`hermes kanban specify <id>` calls the auxiliary LLM (configured under
`auxiliary.triage_specifier`) to expand a rough one-liner into a
concrete spec — tightened title plus a body with Goal / Approach /
Acceptance criteria / Out-of-scope sections — then atomically flips
`status: triage -> todo` and recomputes ready so parent-free tasks
go straight to the dispatcher on the same tick.
Surface:
hermes kanban specify <task_id> # single task
hermes kanban specify --all [--tenant T] # sweep triage column
hermes kanban specify ... --author NAME # audit-comment author
hermes kanban specify ... --json # one JSON line per task
Design choices:
- Parent gating is preserved. specify_triage_task flips to 'todo',
then recompute_ready promotes to 'ready' only when parents are
done — same rule as a normal parent-gated todo.
- No daemon, no background watcher. Every invocation is explicit —
keeps cost predictable and doesn't fight the dispatcher loop.
- Response parse is lenient: strict JSON preferred, markdown-fence
tolerated, raw-body fallback on malformed JSON so the LLM can't
strand a task in triage.
- All failure modes (no aux client, API error, task moved out of
triage mid-call) return SpecifyOutcome(ok=False, reason=...) so
--all continues past individual failures.
Changes:
hermes_cli/kanban_db.py + specify_triage_task()
hermes_cli/kanban_specify.py NEW (~220 LOC — prompt, parse, call)
hermes_cli/kanban.py + specify subcommand + _cmd_specify
hermes_cli/config.py + auxiliary.triage_specifier task slot
website/docs/user-guide/features/kanban.md specify + config notes
website/docs/reference/cli-commands.md CLI reference entry
tests/hermes_cli/test_kanban_specify_db.py NEW (10 tests)
tests/hermes_cli/test_kanban_specify.py NEW (20 tests)
Validation: 30/30 targeted tests pass. E2E: triage task -> specify ->
ends in 'ready' with events [created, specified, promoted] and the
audit comment recorded under the configured author.
* feat(kanban): wire specifier into dashboard and gateway slash
Follow-ups to the initial PR #21435 — closes the two gaps I'd left as
post-merge: dashboard button and first-class gateway surface.
Dashboard (plugins/kanban/dashboard/)
- POST /tasks/:id/specify NEW endpoint. Thin wrapper around
kanban_specify.specify_task(). Returns the CLI outcome shape
({ok, task_id, reason, new_title}); ok=false with a human reason
is a 200, not a 4xx, so the UI can render it inline without
treating 'no aux client configured' as a crash.
- Runs sync in FastAPI's threadpool because the LLM call can take
tens of seconds on reasoning models.
- Pins HERMES_KANBAN_BOARD around the specify call so the module's
argless kb.connect() lands on the right board.
- dist/index.js: doSpecify callback threaded through the drawer →
TaskDetail → StatusActions prop chain. ✨ Specify button appears
ONLY when task.status === 'triage' (elsewhere the backend would
reject anyway — hide the button to keep the action row clean).
Busy state (Specifying…) + inline success/error banner under the
button using the response.reason text.
- dist/style.css: tiny hermes-kanban-msg-ok / -err classes using
existing --color vars so themes reskin cleanly.
Gateway slash (/kanban specify)
- Already works via the existing run_slash → build_parser →
kanban_command pipeline. No code change needed — slash commands
inherit the argparse tree automatically. Added coverage:
test_run_slash_specify_end_to_end (create --triage, specify, verify
promotion + retitle) and test_run_slash_specify_help_is_reachable.
Tests
- tests/plugins/test_kanban_dashboard_plugin.py: 3 new tests for the
REST endpoint — happy path, non-triage rejection as ok=false 200,
missing aux client as ok=false 200.
- tests/hermes_cli/test_kanban_cli.py: 2 new slash-surface tests.
Docs
- website/docs/user-guide/features/kanban.md: dashboard action row
description mentions ✨ Specify + all three surfaces. REST table
gains /tasks/:id/specify. Slash examples include /kanban specify.
Validation: 340/340 targeted tests pass. E2E via TestClient: create a
triage task over REST → POST /specify with mocked aux client → task
moves to 'ready' column on /board with new title and body applied.
|
||
|
|
04193cf71c |
feat(web): add Brave Search (free tier) and DDGS search providers
Both implement WebSearchProvider via tools/web_providers/ — matching the existing SearXNG pattern (PR #5c906d702). Search-only; pair with any extract provider via web.extract_backend. - tools/web_providers/brave_free.py — Brave Search API (free tier, 2k queries/mo). Uses BRAVE_SEARCH_API_KEY as X-Subscription-Token. - tools/web_providers/ddgs.py — DuckDuckGo via the ddgs Python package. No API key; gated on package importability. - tools/web_tools.py: both backends added to _get_backend() config list and auto-detect chain (trails paid providers), _is_backend_available, web_search_tool dispatch, web_extract_tool + web_crawl_tool search-only refusals, check_web_api_key, and the __main__ diagnostic. Introduces _ddgs_package_importable() helper so tests can monkeypatch a single symbol for the ddgs availability check. - hermes_cli/tools_config.py: picker entries for both providers; ddgs gets a post_setup handler that runs `pip install ddgs`. - hermes_cli/config.py: BRAVE_SEARCH_API_KEY in OPTIONAL_ENV_VARS. - scripts/release.py: AUTHOR_MAP entry for @Abd0r. - tests: 14 new tests (brave-free) + 15 new tests (ddgs) covering provider unit behavior, backend wiring, and search-only refusals. Salvages the brave-free + ddgs portion of PR #19796. Not included: the in-line helpers in web_tools.py (replaced with provider modules to match the shipped architecture), the lynx-based extract path (these backends should refuse extract with a clear error — users pair with a real extract provider), and scripts/start-llama-server.sh (unrelated). Co-authored-by: Abd0r <223003280+Abd0r@users.noreply.github.com> |
||
|
|
af9336d575 |
feat(gateway): generic plugin hooks for env enablement + cron delivery
Widen the platform-plugin surface so plugins can self-configure from env
vars and opt into cron home-channel delivery without editing core files.
Closes the scope gap that forced every new platform (Google Chat, Teams,
IRC, future) to either touch gateway/config.py, cron/scheduler.py, and
hermes_cli/config.py or live without env-only setup.
Changes:
- gateway/platform_registry.py: two new optional PlatformEntry fields.
- env_enablement_fn: () -> Optional[dict]. Called during
_apply_env_overrides BEFORE the adapter is constructed. Returned
dict fields are merged into PlatformConfig.extra; the special
'home_channel' key (if present) becomes a proper HomeChannel
dataclass on the PlatformConfig.
- cron_deliver_env_var: name of the *_HOME_CHANNEL env var. When set,
the plugin platform is a valid cron deliver= target and cron reads
the env var to resolve the default chat/room ID.
- gateway/config.py: the existing plugin-platform enable pass at the
bottom of _apply_env_overrides now calls env_enablement_fn and seeds
extras/home_channel. No effect on plugins that don't set the new
field.
- cron/scheduler.py: _is_known_delivery_platform and
_resolve_home_env_var fall through to the registry when the platform
isn't in the hardcoded built-in sets. New _iter_home_target_platforms
helper iterates built-ins + plugin platforms for the deliver=origin
fallback.
- gateway/run.py: _home_target_env_var now consults the new resolver so
plugin-defined home channels work for non-cron call sites too.
- hermes_cli/config.py: new _inject_platform_plugin_env_vars() sibling
of _inject_profile_env_vars(). Scans plugins/platforms/*/plugin.yaml
at import time and contributes entries to OPTIONAL_ENV_VARS so
'hermes config' UI discovers them. Supports bare-string and rich-dict
requires_env entries plus a new optional_env list for non-required
vars (home channels, allowlists).
All additions are strictly opt-in. Existing plugins (IRC, Teams,
image_gen, memory) see zero behavior change until they adopt the new
fields.
|
||
|
|
69d025e4a7 |
feat(gateway): add allowed_{chats,channels,rooms} whitelist to Telegram, Mattermost, Matrix, DingTalk
Mirrors the Slack `allowed_channels` feature (PR #7401) and Discord's `allowed_channels` (PR #7044) across the remaining group-capable platforms. All five platforms (Slack + Discord + the four added here) now follow the same pattern: primary config via config.yaml, env-var fallback as an escape hatch — matching the project policy that .env is for secrets only and behavioral settings belong in config.yaml. Also fixes a duplicate `slack` key in DEFAULT_CONFIG introduced by PR #7401 (the later entry silently overwrote `allowed_channels`, `require_mention`, and `free_response_channels` at dict-literal evaluation time). Platforms added: - Telegram: `telegram.allowed_chats` (env alias: `TELEGRAM_ALLOWED_CHATS`) - Mattermost: `mattermost.allowed_channels` (env alias: `MATTERMOST_ALLOWED_CHANNELS`) - Matrix: `matrix.allowed_rooms` (env alias: `MATRIX_ALLOWED_ROOMS`) - DingTalk: `dingtalk.allowed_chats` (env alias: `DINGTALK_ALLOWED_CHATS`) Mattermost and Matrix previously had NO config.yaml bridging for any of their gating settings; this PR adds `load_gateway_config` bridges for them (Mattermost gets require_mention + free_response_channels + allowed_channels; Matrix gets allowed_rooms on top of its existing bridges for require_mention and free_response_rooms). Semantics identical everywhere: - Empty = no restriction (fully backward compatible). - Non-empty = hard whitelist: non-listed chats are silently ignored, even when the bot is @mentioned. - DMs bypass the check entirely. DEFAULT_CONFIG merges the duplicate `slack` block and adds new `mattermost` and `matrix` blocks so all gating settings surface in defaults. Not included: Feishu (has its own per-chat `chat_rules` system that covers this use case differently), WhatsApp (already has `group_allow_from` via `group_policy: allowlist`), pure-DM platforms (Signal, SMS, BlueBubbles, Yuanbao — no group concept). |
||
|
|
cd3ef685c4 | feat(slack): add allowed_channels whitelist config | ||
|
|
80717a157f |
fix(discord): route DM role-auth opt-in through config.yaml (not env var)
Per repo policy, ~/.hermes/.env is for secrets only. Guild IDs are behavioral configuration, not secrets. Replacing the DISCORD_DM_ROLE_AUTH_GUILD env var from the original fix with discord.dm_role_auth_guild in config.yaml. - New module-level _read_dm_role_auth_guild() helper reads hermes_cli.config.read_raw_config()['discord']['dm_role_auth_guild']. Fails closed on any parse error (safe default = DM role-auth off). - DEFAULT_CONFIG['discord'] gains dm_role_auth_guild: '' with a comment documenting the opt-in. - Tests patch hermes_cli.config.read_raw_config directly (via the _set_dm_role_auth_guild helper) instead of setenv/delenv. 12 tests in test_discord_roles_dm_scope pass; no env var involvement. - Docstring + module docstring + comments updated to reference discord.dm_role_auth_guild. - E2E verified with real imports across 6 scenarios: unset, int, string, garbage, zero, and (crucially) env-var-only-no-config all return None except the valid int/string cases. Env var has zero effect — policy compliance confirmed. |
||
|
|
fb1ce793e6 |
feat(security): enable secret redaction by default (#17691, #20785) (#21193)
Flip the default for HERMES_REDACT_SECRETS from off to on so the redactor already wired into send_message_tool, logs, and tool output actually runs on a fresh install. - agent/redact.py: env-var default "" → "true" - hermes_cli/config.py: DEFAULT_CONFIG security.redact_secrets True; two config-template comments rewritten - gateway/run.py + cli.py: startup log / banner warning when the user has explicitly opted out, so the downgrade is visible in agent.log and at CLI banner time - docs/reference/environment-variables.md: description reconciled - tests: flipped the default-pin, restructured the force=True regression test to explicit-false instead of unset Users who need raw credential values (redactor development) can still opt out via security.redact_secrets: false in config.yaml or HERMES_REDACT_SECRETS=false in .env. Closes #17691. Addresses #20785 (short-term output-pipeline recommendation). |
||
|
|
411cfa26e3 | fix: auto-block repeated kanban retries | ||
|
|
5c906d7026 |
feat(web): add SearXNG as a native search-only backend
Adds SearXNG as a free, self-hosted web search provider. SearXNG is a
privacy-respecting metasearch engine that requires no API key — just a
running instance and SEARXNG_URL pointing at it.
## What this adds
- `tools/web_providers/searxng.py` — `SearXNGSearchProvider` implementing
`WebSearchProvider` (search only; no extract capability)
- `_is_backend_available("searxng")` — gates on SEARXNG_URL
- `_get_backend()` — accepts "searxng" as a configured value; adds it to
auto-detect candidates (lower priority than paid services)
- `web_search_tool` — dispatches to SearXNG when it is the active backend
- `check_web_api_key()` — includes SearXNG in availability check
- `OPTIONAL_ENV_VARS["SEARXNG_URL"]` — registered with tools=["web_search"]
- `tools_config.py` — SearXNG appears in the `hermes tools` provider picker
- `nous_subscription.py` — `direct_searxng` detection, web_active / web_available
- `setup.py` — SEARXNG_URL listed in the missing-credential hint
- 23 tests covering: is_configured, happy-path search, score sorting, limit,
HTTP/request errors, _is_backend_available, _get_backend, check_web_api_key
## Config
```yaml
# Use SearXNG for search, any paid provider for extract
web:
search_backend: "searxng"
extract_backend: "firecrawl"
# Or: SearXNG as the sole backend (web_extract will use the next available)
web:
backend: "searxng"
```
SearXNG is search-only — it does not implement WebExtractProvider. Users
who only configure SEARXNG_URL get web_search available; web_extract falls
back to the next available extract provider (or is unavailable if none).
Closes #19198 (Phase 2 Task 4 — SearXNG provider)
Ref: #11562 (original SearXNG PR)
|
||
|
|
cd2cbc73b7 |
refactor(web): per-capability backend selection for search/extract split
Introduce the foundation for independently selecting web search and extract backends — enabling future combinations like SearXNG for search + Firecrawl for extract. Architecture: - tools/web_providers/base.py: WebSearchProvider and WebExtractProvider ABCs with normalized result contracts (mirrors CloudBrowserProvider) - tools/web_tools.py: _get_search_backend() and _get_extract_backend() read per-capability config keys, fall through to shared web.backend - hermes_cli/config.py: web.search_backend and web.extract_backend in DEFAULT_CONFIG (empty = inherit from web.backend) Behavioral change: - web_search_tool() now dispatches via _get_search_backend() - web_extract_tool() now dispatches via _get_extract_backend() - When per-capability keys are empty (default), behavior is identical to before — _get_search_backend() falls through to _get_backend() This is purely structural — no new backends are added. SearXNG and other search-only/extract-only providers can now be added as simple drop-in modules in follow-up PRs. 12 new tests, 49 existing tests pass with zero regressions. Ref: #19198 |
||
|
|
ad7aad251c |
feat(skills/linear): add Documents support + Python helper script (#20752)
* feat(skills/linear): add Documents support + Python helper script The bundled Linear skill (PR #1230) covered issues, projects, teams, and workflow states via curl. It had no coverage for Linear's Documents API, so fetching an RFC/doc from a linear.app URL required hand-writing GraphQL against an underdocumented schema. Adds: - Documents section in SKILL.md explaining slugId extraction from URLs, the contentState (markdown) vs contentState (ProseMirror) split, and four canonical curl examples (fetch by slugId, fetch by UUID, list recent, title-search). - scripts/linear_api.py — stdlib-only Python CLI wrapping the most common operations (whoami, list-teams, list/get/search/create/update issues, add-comment, update-status, list/get/search documents, raw GraphQL passthrough). Zero deps, reads LINEAR_API_KEY from env. Auth header quirk (personal key takes bare $LINEAR_API_KEY, no Bearer prefix) is already documented in the skill. Found during RFC review: the existing skill's lack of document support forced falling back to the browser (which hit Linear's login wall). Also fixes a schema gotcha — the Document field is `contentState`, not `contentData` (which returns 400). Tested end-to-end against the production API: python3 linear_api.py whoami python3 linear_api.py get-document 38359beef67c Both return expected payloads. * fix(skills/linear): point LINEAR_API_KEY setup to the correct page The org-level Settings > API page (/settings/api) only shows OAuth apps and workspace-member keys. Personal API keys live under Account, Security, access (/settings/account/security). Update both the setup link in config.py (shown during hermes setup) and the setup step in SKILL.md so users land on the page that can create a personal key. |
||
|
|
a0fedfbb1b |
feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709)
Replaces the per-directory shadow-repo design with a single shared shadow
git store at ~/.hermes/checkpoints/store/. Object DB is now deduplicated
across every working directory the agent has ever touched; a dozen
worktrees of the same project cost near-zero in additional disk.
Why
---
Pre-v2 design had three compounding problems that let ~/.hermes/checkpoints/
grow to multi-GB on active machines:
1. Each working directory got its own full shadow git repo — no object
dedup across projects or across worktrees of the same project.
2. _prune() was a documented no-op: max_snapshots only limited the
/rollback listing. Loose objects accumulated forever.
3. Defaults: enabled=True, auto_prune=False — users paid the disk cost
without ever asking for /rollback.
Field report on a single workstation: 847 MB across 47 shadow repos,
mostly redundant clones of the hermes-agent source tree.
Changes
-------
- tools/checkpoint_manager.py: full rewrite. Single bare store, per-project
refs (refs/hermes/<hash>), per-project indexes (store/indexes/<hash>),
per-project metadata (store/projects/<hash>.json with workdir +
created_at + last_touch). On first v2 init, any pre-v2 per-directory
shadow repos are auto-migrated into legacy-<timestamp>/ so the new
store starts clean. _prune() now actually rewrites the per-project ref
to the last max_snapshots commits and runs git gc --prune=now. New
_enforce_size_cap() drops oldest commits round-robin across projects
when the store exceeds max_total_size_mb. _drop_oversize_from_index()
filters any single file larger than max_file_size_mb out of the snapshot.
- hermes_cli/checkpoints.py: new 'hermes checkpoints' CLI
(status / list / prune / clear / clear-legacy) for managing the store
outside a session.
- hermes_cli/config.py: flipped defaults — enabled=False, max_snapshots=20,
auto_prune=True. Added max_total_size_mb=500, max_file_size_mb=10.
Tightened DEFAULT_EXCLUDES (added target/, *.so/*.dylib/*.dll,
*.mp4/*.mov, *.zip/*.tar.gz, .worktrees/, .mypy_cache/, etc.).
- run_agent.py / cli.py / gateway/run.py: thread the new kwargs through
AIAgent and the startup auto_prune hooks.
- Tests rewritten to match v2 storage while keeping backwards-compat
coverage for the pre-v2 prune path (per-directory shadow repos under
base/ are still swept correctly for anyone mid-migration).
- Docs updated: user-guide/checkpoints-and-rollback.md explains the
shared store, new defaults, migration, and the new CLI;
reference/cli-commands.md documents 'hermes checkpoints'.
E2E validated
-------------
- Legacy migration: pre-v2 shadow repos auto-archived into legacy-<ts>/.
- Object dedup: two projects with an identical shared.py blob resolve to
7 total objects in the store (v1 would have stored the blob twice).
- max_snapshots=3 actually enforced: after 6 commits, list shows 3.
- Orphan prune: deleting a project's workdir + 'hermes checkpoints prune
--retention-days 0' removes its ref, index, and metadata; GC reclaims
the objects.
- max_file_size_mb=1 excludes a 2 MB weights.bin while keeping the
tracked source code files.
- hermes checkpoints {status,prune,clear,clear-legacy} all work from the
CLI without an agent running.
Breaking / migration
--------------------
No in-place data migration — legacy per-directory shadow repos are moved
into legacy-<timestamp>/ on first run. Old /rollback history is still
accessible by inspecting the archive with git; run
'hermes checkpoints clear-legacy' to reclaim the space when ready. Users
relying on /rollback must now set checkpoints.enabled=true (or pass
--checkpoints) explicitly.
|
||
|
|
76074d9ee6 | fix(cli): recover classic CLI output after resize | ||
|
|
395dbcc873 |
feat(browser): add Lightpanda engine support with automatic Chrome fallback
Add Lightpanda as an optional browser engine for local mode.
Lightpanda is a headless browser built from scratch in Zig -- faster
navigation than Chrome with significantly less memory.
One config line to enable:
browser:
engine: lightpanda
New functions in browser_tool.py:
- _get_browser_engine() -- config/env reader with validation + caching
- _should_inject_engine() -- only inject in local non-cloud mode
- _needs_lightpanda_fallback() -- detect empty/failed LP results
- _chrome_fallback_screenshot() -- temporary Chrome session for screenshots
- Engine injection in _run_browser_command (--engine flag)
- browser_vision pre-routes screenshots to Chrome when engine=lightpanda
Config:
- browser.engine in DEFAULT_CONFIG (auto/lightpanda/chrome)
- AGENT_BROWSER_ENGINE in OPTIONAL_ENV_VARS
- /browser status shows engine info in local mode
Rebased from PR #7144 onto current main. All existing code preserved --
pure additions only (+520/-2).
25 new tests + 81 total browser tests pass (0 failures).
|
||
|
|
39f451f5ad |
fix: add Turkish locale references in config, tests, and docs
- hermes_cli/config.py: add tr to supported languages comment - locales/en.yaml: add tr to locale file list comment - tests/agent/test_i18n.py: add Turkish alias tests + explicit lang test - website/docs/user-guide/configuration.md: add tr to supported values |
||
|
|
c4b287ba53 | feat(i18n): add Ukrainian locale | ||
|
|
7b05ccddc7 | docs(bedrock): fix IAM permissions, add quickstart entry, add fallback provider, fix deployment section | ||
|
|
20a4f79ed1 |
feat: provider modules — ProviderProfile ABC, 33 providers, fetch_models, transport single-path
Introduces providers/ package — single source of truth for every inference provider. Adding a simple api-key provider now requires one providers/<name>.py file with zero edits anywhere else. What this PR ships: - providers/ package (ProviderProfile ABC + 33 profiles across 4 api_modes) - ProviderProfile declarative fields: name, api_mode, aliases, display_name, env_vars, base_url, models_url, auth_type, fallback_models, hostname, default_headers, fixed_temperature, default_max_tokens, default_aux_model - 4 overridable hooks: prepare_messages, build_extra_body, build_api_kwargs_extras, fetch_models - chat_completions.build_kwargs: profile path via _build_kwargs_from_profile, legacy flag path retained for lmstudio/tencent-tokenhub (which have session-aware reasoning probing that doesn't map cleanly to hooks yet) - run_agent.py: profile path for all registered providers; legacy path variable scoping fixed (all flags defined before branching) - Auto-wires: auth.PROVIDER_REGISTRY, models.CANONICAL_PROVIDERS, doctor health checks, config.OPTIONAL_ENV_VARS, model_metadata._URL_TO_PROVIDER - GeminiProfile: thinking_config translation (native + openai-compat nested) - New tests/providers/ (79 tests covering profile declarations, transport parity, hook overrides, e2e kwargs assembly) Deltas vs original PR (salvaged onto current main): - Added profiles: alibaba-coding-plan, azure-foundry, minimax-oauth (were added to main since original PR) - Skipped profiles: lmstudio, tencent-tokenhub stay on legacy path (their reasoning_effort probing has no clean hook equivalent yet) - Removed lmstudio alias from custom profile (it's a separate provider now) - Skipped openrouter/custom from PROVIDER_REGISTRY auto-extension (resolve_provider special-cases them; adding breaks runtime resolution) - runtime_provider: profile.api_mode only as fallback when URL detection finds nothing (was breaking minimax /v1 override) - Preserved main's legacy-path improvements: deepseek reasoning_content preserve, gemini Gemma skip, OpenRouter response caching, Anthropic 1M beta recovery, etc. - Kept agent/copilot_acp_client.py in place (rejected PR's relocation — main has 7 fixes landed since; relocation would revert them) - _API_KEY_PROVIDER_AUX_MODELS alias kept for backward compat with existing test imports Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Closes #14418 |
||
|
|
7de3c86c5a |
feat(i18n): add display.language for static message translation (zh/ja/de/es) (#20231)
* revert(gateway): remove stale-code self-check and auto-restart Removes the _detect_stale_code / _trigger_stale_code_restart mechanism introduced in #17648 and iterated in #19740. On every incoming message the gateway compared the boot-time git HEAD SHA to the current SHA on disk, and if they differed it would reply with Gateway code was updated in the background -- restarting this gateway so your next message runs on the new code. Please retry in a moment. and then kick off a graceful restart. This is unwanted behaviour: users who run a long-lived gateway and do their own ad-hoc git operations on the checkout end up with their chat interrupted and the current message dropped every time HEAD moves, with no way to opt out. If an operator really needs the old protection against stale sys.modules after "hermes update", the SIGKILL-survivor sweep in hermes update (hermes_cli/main.py, also tagged #17648) already handles the supervisor-respawn case on its own. Removed: gateway/run.py: - _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS - _read_git_head_sha(), _compute_repo_mtime() module helpers - class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha / _stale_code_restart_triggered defaults - __init__ boot-snapshot block (_boot_*, _cached_current_sha*, _repo_root_for_staleness, _stale_code_notified) - _current_git_sha_cached(), _detect_stale_code(), _trigger_stale_code_restart() methods - stale-code check + user-facing restart notice at the top of _handle_message() tests/gateway/test_stale_code_self_check.py (deleted, 412 lines) No new logic added. Zero remaining references to any removed symbol. Gateway test suite passes the same 4589 tests it passed before; the 3 pre-existing unrelated failures (discord free-channel, feishu bot admission, teams typing) are unchanged by this commit. * feat(i18n): add display.language for static message translation (zh/ja/de/es) Adds a thin-slice i18n layer covering the highest-impact static user-facing messages: the CLI dangerous-command approval prompt and a handful of gateway slash-command replies (restart-drain, goal cleared, approval expired, config read/save errors). Out of scope (stays English): agent responses, log lines, tool outputs, slash-command descriptions, error tracebacks. Infrastructure: - agent/i18n.py: catalog loader, t() helper, language resolution (HERMES_LANGUAGE env var > display.language config > en) - locales/{en,zh,ja,de,es}.yaml: ~19 translated strings per language - display.language in DEFAULT_CONFIG (hermes_cli/config.py) Tests: - tests/agent/test_i18n.py: 21 tests covering catalog parity, placeholder parity across locales, fallback behavior, env-var override, alias normalization, missing-key graceful degradation. Docs: - website/docs/user-guide/configuration.md: display.language entry plus a short section explaining scope so users don't expect agent responses to translate via this knob. |
||
|
|
645a2f482d | fix(cli): fix shortcut config conflict in hermes_cli | ||
|
|
b46b0c9888 |
fix(backup): floor pre-update backup_keep to 1 so the new backup survives
`updates.backup_keep: 0` (or any negative value) wiped the freshly-
created pre-update zip:
_prune_pre_update_backups(backup_dir, keep=0):
backups = sorted(..., reverse=True) # newest first, includes
# the zip we just wrote
for p in backups[0:]: # = all of them
p.unlink()
The wrapper in `main.py` then printed `Saved: <path>` for a file that
no longer existed (the size lookup is wrapped in `try/except OSError`
which silently degrades to "0 B"), leaving operators believing they had
a recovery point when they had none.
This is a real footgun because some config systems treat 0 as "keep
unlimited"; here it does the opposite — every backup is destroyed
right after creation.
Fix: clamp `keep` to a minimum of 1 inside `_prune_pre_update_backups`
since that helper is only invoked immediately after a fresh backup
is written. Operators who genuinely want no backups should set
`updates.pre_update_backup: false` (which gates creation entirely)
rather than relying on `backup_keep: 0`.
Also extends the `backup_keep` config docstring to spell out the floor
and point at `pre_update_backup: false` as the off-switch.
## Tests
Three regression tests added in `TestPreUpdateBackup`:
- `test_keep_zero_does_not_delete_freshly_created_backup` —
asserts the file persists after `keep=0`
- `test_keep_negative_does_not_delete_freshly_created_backup` —
same for negative values
- `test_keep_zero_still_prunes_older_backups` — proves the floor
only protects the new backup; older ones are still rotated out
Verified the new tests fail on origin/main (without the floor) and
pass with it; full `tests/hermes_cli/test_backup.py` suite green
(84 tests).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
a11aed1acc |
fix(cli): local backend CLI always uses launch directory, stops .env sync of TERMINAL_CWD (#19334)
The old CWD heuristic was fooled by: 1. TERMINAL_CWD persisted to .env by `hermes config set terminal.cwd` 2. Inherited TERMINAL_CWD from parent hermes processes 3. Only resolved when config had a placeholder value (not explicit paths) Fix: - load_cli_config() unconditionally uses os.getcwd() for local backend - TERMINAL_CWD always force-exported in CLI mode (overrides stale values) - Gateway sets _HERMES_GATEWAY=1 marker so lazy cli.py imports don't clobber - Remove terminal.cwd from config-set .env sync map (prevents re-poisoning) - Clarify setup wizard label as 'Gateway working directory' Closes #19214 |
||
|
|
457c7b76cd |
feat(openrouter): add response caching support (#19132)
Enable OpenRouter's response caching feature (beta) via X-OpenRouter-Cache
headers. When enabled, identical API requests return cached responses for
free (zero billing), reducing both latency and cost.
Configuration via config.yaml:
openrouter:
response_cache: true # default: on
response_cache_ttl: 300 # 1-86400 seconds
Changes:
- Add openrouter config section to DEFAULT_CONFIG (response_cache + TTL)
- Add build_or_headers() in auxiliary_client.py that builds attribution
headers plus optional cache headers based on config
- Replace inline _OR_HEADERS dicts with build_or_headers() at all 5 sites:
run_agent.py __init__, _apply_client_headers_for_base_url(), and
auxiliary_client.py _try_openrouter() + _to_async_client()
- Add _check_openrouter_cache_status() method to AIAgent that reads
X-OpenRouter-Cache-Status from streaming response headers and logs
HIT/MISS status
- Document in cli-config.yaml.example
- Add 28 tests (22 unit + 6 integration)
Ref: https://openrouter.ai/docs/guides/features/response-caching
|
||
|
|
5d3be898a8 |
docs(tts): mention xAI custom voice support (#18776)
Point users to xAI's custom voices feature — clone your voice in the console, paste the voice_id into tts.xai.voice_id. No code changes needed; the existing TTS pipeline already handles arbitrary voice IDs. - config.py: link to xAI custom voices docs in voice_id comment - setup.py: prompt accepts custom voice IDs during xAI TTS setup - tts.md: short section linking to xAI console and docs |
||
|
|
1dce908930 |
fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) (#18761)
* fix(gateway): config.yaml wins over .env for agent/display/timezone settings
Regression from the silent config→env bridge. The bridge at module import
time is correct for max_turns (unconditional overwrite), but every other
agent.*, display.*, timezone, and security bridge key was guarded by
'if X not in os.environ' — so a stale .env entry from an old 'hermes setup'
run would shadow the user's current config.yaml indefinitely.
Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60
in .env from an old setup, and the gateway silently capped at 60
iterations per turn. Gateway logs confirmed api_calls never exceeded 60.
Three changes:
1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*,
display.*, timezone, and security.* bridge keys. config.yaml is now
authoritative for these settings — same semantics already in place
for max_turns, terminal.*, and auxiliary.*. Also surface the bridge
failure (previously 'except Exception: pass') to stderr so operators
see bridge errors instead of silently falling back to .env.
2. gateway/run.py: INFO-log the resolved max_iterations at gateway
start so operators can verify the config→env bridge did the right
thing instead of chasing a phantom budget ceiling.
3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in
the setup wizard. config.yaml is the single source of truth. Also
clean up any stale .env entry left behind by pre-fix setups.
Regression tests in tests/gateway/test_config_env_bridge_authority.py
guard each config→env key against the 'stale .env shadows config' bug.
* fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log)
Three issues observed in production gateway.log during a rapid restart
chain on 2026-05-02, all fixed here.
1. _send_restart_notification logged unconditional success
adapter.send() catches provider errors (e.g. Telegram 'Chat not found')
and returns SendResult(success=False); it never raises. The caller
ignored the return value and always logged 'Sent restart notification
to <chat>' at INFO, producing a misleading success line directly
below the 'Failed to send Telegram message' traceback on every boot.
Now inspects result.success and logs WARNING with the error otherwise.
2. WhatsApp bridge SIGTERM on shutdown classified as fatal error
_check_managed_bridge_exit() saw the bridge's returncode -15 (our own
SIGTERM from disconnect()) and fired the full fatal-error path,
producing 'ERROR ... WhatsApp bridge process exited unexpectedly' plus
'Fatal whatsapp adapter error (whatsapp_bridge_exited)' on every
planned shutdown, immediately before the normal '✓ whatsapp
disconnected'. Adds a _shutting_down flag that disconnect() sets
before the terminate, and _check_managed_bridge_exit() returns None
for returncode in {0, -2, -15} while shutting down. OOM-kill (137)
and other non-signal exits still hit the fatal path.
3. restart_drain_timeout default 60s → 180s
On 2026-05-02 01:43:27 a user /restart fired while three agents were
mid-API-call (82s, 112s, 154s into their turns). The 60s drain budget
expired and all three were force-interrupted. 180s covers realistic
in-flight agent turns; users on very-long-reasoning models can still
raise it further via agent.restart_drain_timeout in config.yaml.
Existing explicit user values are preserved by deep-merge.
Tests
- tests/gateway/test_restart_notification.py: two new tests assert INFO
is only logged on SendResult(success=True) and WARNING with the error
string is logged on SendResult(success=False).
- tests/gateway/test_whatsapp_connect.py: parametrized test for
returncode in {0, -2, -15} proves shutdown-time exits are suppressed;
separate test proves returncode 137 (SIGKILL/OOM) still surfaces as
fatal even when _shutting_down is set.
- _check_managed_bridge_exit() reads _shutting_down via getattr-with-
default so existing _make_adapter() test helpers that bypass __init__
(pitfall #17 in AGENTS.md) keep working unmodified.
|
||
|
|
77c0bc6b13 |
fix(curator): defer first run and add --dry-run preview (#18373) (#18389)
* fix(curator): defer first run and add --dry-run preview (#18373) Curator was meant to run 7 days after install, not on the very first gateway tick. On a fresh install (no .curator_state), should_run_now() returned True immediately because last_run_at was None — so the gateway cron ticker fired Curator against a fresh skill library moments after 'hermes update'. Combined with the binary 'agent-created' provenance model (anything not bundled and not hub-installed), this consolidated hand-authored user workflow skills without consent. Changes: - should_run_now(): first observation seeds last_run_at='now' and returns False. The next real pass fires one full interval_hours later (7 days by default), matching the original design intent. - hermes curator run --dry-run: produces the same review report without applying automatic transitions OR permitting the LLM to call skill_manage / terminal mv. A DRY-RUN banner is prepended to the prompt and the caller skips apply_automatic_transitions. State is NOT advanced so a preview doesn't defer the next scheduled real pass. - hermes update: prints a one-liner on fresh installs pointing at --dry-run, pause, and the docs. Silent on steady state. - Docs: curator.md and cli-commands.md explain the deferred first-run behavior and warn that hand-written SKILL.md files share the 'agent-created' bucket, with guidance to pin or preview before the first pass. Tests: - test_first_run_defers replaces the old 'first run always eligible' assertion — same fixture, inverted expectation. - test_maybe_run_curator_defers_on_fresh_install covers the gateway tick path end-to-end. - Three new dry-run tests cover state-advance suppression, prompt banner injection, and apply_automatic_transitions skipping. Fixes #18373. * feat(curator): pre-run backup + rollback (#18373) Every real curator pass now snapshots ~/.hermes/skills/ into ~/.hermes/skills/.curator_backups/<utc-iso>/skills.tar.gz before calling apply_automatic_transitions or the LLM review. If a run consolidates or archives something the user didn't want touched, 'hermes curator rollback' restores the tree in one command. Dry-run is skipped — no mutation means no snapshot needed. Changes: - agent/curator_backup.py (new): tar.gz snapshot + safe rollback. The snapshot excludes .curator_backups/ (would recurse) and .hub/ (managed by the skills hub). Extract refuses absolute paths and .. components, and uses tarfile's filter='data' on Python 3.12+. Rollback takes a pre-rollback safety snapshot FIRST, stages the current tree into .rollback-staging-<ts>/ so the extract lands in an empty dir, and cleans the staging dir on success. A failed extract restores the staged contents. - agent/curator.py: run_curator_review() calls curator_backup. snapshot_skills(reason='pre-curator-run') before apply_automatic_ transitions. Best-effort — a failed snapshot logs at debug and the run continues (a transient disk issue shouldn't silently disable curator forever). - hermes_cli/curator.py: new 'hermes curator backup' and 'hermes curator rollback' subcommands. rollback supports --list, --id <ts>, -y. - hermes_cli/config.py: curator.backup.{enabled, keep} config block with sane defaults (enabled=true, keep=5). - Docs: curator.md gets a 'Backups and rollback' section; cli-commands .md table gets the new rows. Tests (new file tests/agent/test_curator_backup.py, 16 cases): - snapshot creates tarball + manifest with correct counts - snapshot excludes .curator_backups/ (recursion guard) and .hub/ - snapshot disabled via config returns None without creating anything - snapshot uniquifies ids within the same second (-01 suffix) - prune honors keep count, newest-first - list_backups + _resolve_backup cover newest-default and unknown-id - rollback restores a deleted skill with content intact - rollback is itself undoable — safety snapshot shows up in list_backups - rollback with no snapshots returns an error - rollback refuses tarballs with absolute paths or .. components - real curator runs take a 'pre-curator-run' snapshot; dry-runs do not All curator tests: 210 passing locally. |
||
|
|
265bd59c1d |
feat: /goal — persistent cross-turn goals (Ralph loop) (#18262)
Add a standing-goal slash command that keeps Hermes working toward a user-stated objective across turns until it is achieved, paused, or the turn budget runs out. Our take on the Ralph loop — cf. Codex CLI 0.128.0's /goal. After each turn, a lightweight auxiliary-model judge call asks 'is this goal satisfied by the assistant's last response?'. If not, and we're under the turn budget (default 20), Hermes feeds a continuation prompt back into the same session as a normal user message. Any real user message preempts the continuation loop automatically. Judge failures fail OPEN (continue) so a flaky judge never wedges progress — the turn budget is the real backstop. ### Commands - `/goal <text>` — set a standing goal (kicks off the first turn) - `/goal` or `/goal status` — show current state - `/goal pause` — pause the continuation loop - `/goal resume` — resume (resets turn counter) - `/goal clear` — drop the goal Works on both CLI and gateway platforms via the central CommandDef registry. ### Design invariants preserved - **Prompt cache**: continuation prompts are regular user-role messages appended to history. No system-prompt mutation, no toolset swap. - **Role alternation**: continuation is a user turn, never injected mid-tool-loop. - **Session persistence**: goal state lives in SessionDB.state_meta keyed by `goal:<session_id>`, so `/resume` picks it up. - **Mid-run safety**: on the gateway, `/goal status|pause|clear` are allowed mid-run (control-plane only); setting a new goal requires `/stop` first so we don't race a second continuation prompt against the current turn. ### Files - `hermes_cli/goals.py` (new, 380 lines) — GoalManager + judge + state - `hermes_cli/commands.py` — CommandDef entry - `hermes_cli/config.py` — `goals.max_turns` default - `hermes_cli/web_server.py` — dashboard category merge - `cli.py` — /goal handler + post-turn continuation hook in process_loop - `gateway/run.py` — /goal handler + post-turn continuation hook wrapping _handle_message_with_agent - `tests/hermes_cli/test_goals.py` (new, 26 tests) — judge parsing, fail-open semantics, lifecycle, persistence, budget exhaustion - `website/docs/reference/slash-commands.md` — docs entry |
||
|
|
4caad285a6 |
feat(gateway): auto-delete slash-command system notices after TTL (#18266)
Adds opt-in auto-deletion for slash-command reply messages like "New session started!", "Restarting gateway…", "Stopped.", and YOLO toggles. After the TTL elapses the gateway calls the adapter's delete_message; on platforms without a delete API (everything except Telegram today) the TTL is silently ignored and the message stays. Requested on Twitter by @charlesmcdowell — tool-call bubbles are useful real-time, but system notices clutter the thread once the agent finishes. Implementation: - EphemeralReply(str) sentinel in gateway/platforms/base.py. Subclasses str so existing 'X' in response / response.startswith(...) checks in tests and call sites keep working unchanged; isinstance() still distinguishes it for the send path. - _process_message_background and both busy-session bypass paths (in base.py) call _unwrap_ephemeral() on the handler return, send the unwrapped text, and schedule a detached delete task when the TTL > 0 AND the adapter class overrides delete_message. - display.ephemeral_system_ttl (default 0 = disabled) in DEFAULT_CONFIG. Handler can pass ttl_seconds explicitly to override. - Wrapped the highest-noise return sites: /new, /reset, /stop, /yolo on/off, /restart success + "already in progress". Draining notices and /help output left as plain strings — those are informational and users want to read them. Backward-compat: default TTL 0 → no scheduling, no behavior change for existing users. Platforms without delete_message silently no-op. |
||
|
|
fc78e708ed |
fix(update): don't crash hermes update if skill config scan fails (#18257)
`hermes update` ran the config migration (11 → 17) successfully then crashed at `agent/skill_utils.py:340` during the post-migration skill-config prompt. User @FlockonUS reported this on Twitter. Root cause: `get_missing_skill_config_vars` in hermes_cli/config.py only guarded the import of `discover_all_skill_config_vars`, not the call. Any runtime exception inside the skill scan (malformed SKILL.md, unreadable external skill dir, etc.) propagated up through `migrate_config` and aborted `hermes update` after the version bump. Wrap the call in try/except so skill-config prompting — which is a post-migration nicety — can never block the migration itself. |
||
|
|
0704589ceb | fix(agent): make tool loop guardrails warning-first | ||
|
|
e3624e00db |
fix: enforce strictly subtractive toolset filtration
Refactor tool resolution logic in model_tools.py to ensure that disabled_toolsets are always subtracted at the end, preventing composite toolsets (e.g. 'browser') from implicitly enabling tools that should be hidden. - Added 'disabled_toolsets' to DEFAULT_CONFIG in hermes_cli/config.py - Updated HermesCLI in cli.py to load and propagate disabled toolsets to AIAgent - Implemented robust two-phase resolution (additive then subtractive) in model_tools.py |
||
|
|
c868425467 |
feat(kanban): durable multi-profile collaboration board (#17805)
Salvage of PR #16100 onto current main (after emozilla's #17514 fix that unblocks plugin Pydantic body validation). History preserved on the standing `feat/kanban-standing` branch; this squashes the 22 iterative commits into one clean landing. What this lands: - SQLite kernel (hermes_cli/kanban_db.py) — durable task board with tasks, task_links, task_runs, task_comments, task_events, kanban_notify_subs tables. WAL mode, atomic claim via CAS, tenant-namespaced, skills JSON array per task, max-runtime timeouts, worker heartbeats, idempotency keys, circuit breaker on repeated spawn failures, crash detection via /proc/<pid>/status, run history preserved across attempts. - Dispatcher — runs inside the gateway by default (`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims stale claims, promotes ready tasks, spawns `hermes -p <assignee> chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK + HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker` plus any per-task skills. Health telemetry warns on stuck ready queue. - Structured tool surface (tools/kanban_tools.py) — 7 tools (kanban_show, kanban_complete, kanban_block, kanban_heartbeat, kanban_comment, kanban_create, kanban_link). Gated on HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal sessions. - System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE) injected only when kanban tools are active. - Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board UI: triage/todo/ready/running/blocked/done columns, drag-drop, inline create, task drawer with markdown, comments, run history, dependency editor, bulk ops, lanes-by-profile grouping, WS-driven live refresh. Matches active dashboard theme via CSS variables. - CLI — `hermes kanban init|create|list|show|assign|link|unlink| claim|comment|complete|block|unblock|archive|tail|dispatch|context| init|gc|watch|stats|notify|log|heartbeat|runs|assignees` + `/kanban` slash in-session. - Worker + orchestrator skills (skills/devops/kanban-worker + kanban-orchestrator) — pattern library for good summary/metadata shapes, retry diagnostics, block-reason examples, fan-out patterns. - Per-task force-loaded skills — `--skill <name>` (repeatable), stored as JSON, threaded through to dispatcher argv as one `--skills X` pair per skill alongside the built-in kanban-worker. Dashboard + CLI + tool parity. - Deprecation of standalone `hermes kanban daemon` — stub exits 2 with migration guidance; `--force` escape hatch for headless hosts. - Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md) with 11 dashboard screenshots walking through four user stories (Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker). - Tests (251 passing): kernel schema + migration + CAS atomicity, dispatcher logic, circuit breaker, crash detection, max-runtime timeouts, claim lifecycle, tenant isolation, idempotency keys, per- task skills round-trip + validation + dispatcher argv, tool surface (7 tools × round-trip + error paths), dashboard REST (CRUD + bulk + links + warnings), gateway-embedded dispatcher (config gate, env override, graceful shutdown), CLI deprecation stub, migration from legacy schemas. Gateway integration: - GatewayRunner._kanban_dispatcher_watcher — new asyncio background task, symmetric with _kanban_notifier_watcher. Runs dispatch_once via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0 env override for debugging. - Config: new `kanban` section in DEFAULT_CONFIG with `dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`. Additive — no \_config_version bump needed. Forward-compat: - workflow_template_id / current_step_key columns on tasks (v1 writes NULL; v2 will use them for routing). - task_runs holds claim machinery (claim_lock, claim_expires, worker_pid, last_heartbeat_at) so multi-attempt history is first- class from day one. Closes #16102. Co-authored-by: emozilla <emozilla@nousresearch.com> |
||
|
|
e8e5985ce6 |
fix(curator): seed defaults on update, create logs/curator dir, defer fire import (#17927)
Three fixes bundled for curator reliability on existing installs and broken/partial installs: 1. run_agent.py: defer `import fire` into the __main__ block. `fire` is only used by `fire.Fire(main)` when running run_agent.py directly as a CLI — it is NOT needed for library usage. Importing it at module top made `from run_agent import AIAgent` from a daemon thread (e.g. the curator's forked review agent) crash with ModuleNotFoundError on broken/partial installs where `fire` isn't present. 2. hermes_cli/config.py: add version 22 → 23 migration that writes the `curator` + `auxiliary.curator` sections to config.yaml with their defaults, only filling keys the user hasn't overridden. Existing configs from before PR #16049 / the April 2026 `auxiliary.curator` unification had neither section on disk, so users couldn't see or edit the settings in their config.yaml (runtime deep-merge papered over it at read time, but the file never reflected reality). 3. hermes_cli/config.py: `ensure_hermes_home()` now pre-creates `~/.hermes/logs/curator/` alongside cron/sessions/logs/memories on every CLI launch. Managed-mode (NixOS) variant mkdir's it defensively after the activation-script existence checks, since the activation script may not know about this subpath. 4. agent/curator.py: `_reports_root()` mkdir's the dir at call time as belt-and-suspenders for entry paths that bypass both ensure_hermes_home() and the v23 migration (gateway-only installs, bare library use). E2E validated in isolated HERMES_HOME: fresh install gets full defaults seeded; partial-override config keeps user's `enabled: false` and custom `interval_hours` while filling the missing keys; re-running the migration is a no-op. |
||
|
|
b50bc13ef9 |
fix(config): preserve YAML lists in hermes config set (#17876)
_set_nested unconditionally replaced any non-dict value with an empty dict when walking the dotted path, which silently destroyed list-typed config nodes the moment someone set a value with a numeric index (e.g. 'hermes config set custom_providers.0.api_key NEW'). Any sibling entries and any fields inside the targeted entry that the user didn't write were lost. Fix: - _set_nested now detects list nodes and navigates by numeric index, and preserves both dicts AND lists at intermediate positions (scalars are still replaced so bare-scalar -> nested overrides keep working). - set_config_value drops its duplicated navigation logic and calls _set_nested instead -- single source of truth for the rules. Regression tests (tests/hermes_cli/test_set_config_value.py): - test_indexed_set_preserves_sibling_list_entries -- exact #17876 repro - test_indexed_set_preserves_non_targeted_fields -- inner-dict fields survive - test_deeper_nesting_through_list -- dict -> list -> dict -> scalar path 35/35 existing + new tests pass. E2E-verified with the issue's repro against a real on-disk config.yaml -- list stays a list, entry 0 updated, entry 1 intact. Closes #17876 |
||
|
|
0dd373ec43 |
fix(context): honor model.context_length for Ollama num_ctx and all display paths
When a user sets model.context_length in config.yaml, the value was only used for Hermes' internal compression decisions (context_compressor) but NOT for Ollama's num_ctx parameter. Ollama auto-detects context from GGUF metadata (often 256K+) and allocates that much VRAM regardless of the user's config — causing OOM on smaller GPUs like the P100 (16GB). Root cause: two separate context values existed independently: - context_compressor.context_length = config value (e.g. 65536) ✓ - _ollama_num_ctx = GGUF metadata value (e.g. 256000) ✗ ignored config Changes: 1. Cap Ollama num_ctx to config context_length (run_agent.py) When model.context_length is explicitly set and no explicit ollama_num_ctx override exists, cap the auto-detected GGUF value to the user's context_length. This is the core fix — it prevents Ollama from allocating more VRAM than the user budgeted. 2. Pass config_context_length through all secondary call sites Several paths called get_model_context_length() without the config override, falling through to the 256K default fallback: - cli.py: @-reference expansion and /model switch display - gateway/run.py: @-reference expansion and /model switch display - tui_gateway/server.py: @-reference expansion - hermes_cli/model_switch.py: resolve_display_context_length() 3. Normalize root-level context_length in config (hermes_cli/config.py) _normalize_root_model_keys() now migrates root-level context_length into the model section, matching existing behavior for provider and base_url. Users who wrote `context_length: 65536` at the YAML root instead of under `model:` had it silently ignored. 4. Fix misleading comments (agent/model_metadata.py) DEFAULT_FALLBACK_CONTEXT is 256K (CONTEXT_PROBE_TIERS[0]), not 128K as two comments stated. Tests: 3 new tests for root-level context_length normalization. All existing context_length tests pass (96 tests). |
||
|
|
8d302e37a8 |
feat(tts): add Piper as a native local TTS provider (closes #8508) (#17885)
Piper (OHF-Voice/piper1-gpl) is a fast, local neural TTS engine from the
Home Assistant project that supports 44 languages with zero API keys.
Adds it as a native built-in provider alongside edge/neutts/kittentts,
installable via 'hermes tools' with one keystroke.
What ships:
- New 'piper' built-in provider in tools/tts_tool.py
- Lazy import via _import_piper()
- Module-level voice cache keyed on (model_path, use_cuda) so switching
voices doesn't invalidate older cached voices
- _resolve_piper_voice_path() accepts either an absolute .onnx path or a
voice name (auto-downloaded on first use via 'python -m
piper.download_voices --download-dir <cache>')
- Voice cache at ~/.hermes/cache/piper-voices/ (profile-aware via
get_hermes_dir)
- Optional SynthesisConfig knobs: length_scale, noise_scale,
noise_w_scale, volume, normalize_audio, use_cuda — passed through
only when configured, so older piper-tts versions aren't broken
- WAV output then ffmpeg conversion path (same as neutts/kittentts) so
Telegram voice bubbles work when ffmpeg is present
- Piper added to BUILTIN_TTS_PROVIDERS so a user's
tts.providers.piper.command cannot shadow the native provider
(regression test included)
- 'hermes tools' wizard entry
- Piper appears under Voice and TTS as local free, with
'pip install piper-tts' auto-install via post_setup handler
- Prints voice-catalog URL and default-voice info after install
- config.yaml defaults
- tts.piper.voice defaults to en_US-lessac-medium
- Commented advanced knobs for discoverability
- Docs
- New 'Piper (local, 44 languages)' section in features/tts.md
explaining install path, voice switching, pre-downloaded voices,
and advanced knobs
- Piper listed in the ten-provider table and ffmpeg table
- Custom-command-providers section updated to drop the Piper example
(now native) and add a piper-custom example for users with their own
trained .onnx models
- overview.md bumps provider count to ten
- Tests (tests/tools/test_tts_piper.py, 16 tests)
- Registration (BUILTIN_TTS_PROVIDERS, PROVIDER_MAX_TEXT_LENGTH)
- _resolve_piper_voice_path across every branch: direct .onnx path,
cached voice name, fresh download with correct CLI args, download
failure, successful-exit-but-missing-files, empty voice to default
- _generate_piper_tts: loads voice once, reuses cache, voice-name
download wiring, advanced knobs flow through SynthesisConfig
- text_to_speech_tool end-to-end dispatch and missing-package error
- check_tts_requirements: piper availability toggles the return value
- Regression guard: piper cannot be shadowed by a command provider
with the same name
- Pre-existing test_tts_mistral test broadened to mock the new
piper/kittentts/command-provider checks (otherwise it false-passes
when piper is installed in the test venv)
E2E verification (live):
Actual pip install piper-tts, config piper + en_US-lessac-low,
text_to_speech_tool call, voice auto-downloaded from HuggingFace,
WAV synthesized, ffmpeg-converted to Ogg/Opus. Second call hits the
cache (~60ms). Cache dir populated with .onnx and .onnx.json.
This caught a real bug during development: the first pass used '-d' as
the download-dir flag; the actual piper.download_voices CLI wants
'--download-dir'. Fixed before PR opened.
|
||
|
|
0da968e521 |
fix(curator): unify under auxiliary.curator (hermes model, dashboard) (#17868)
Voscko reported curator.auxiliary.provider/model was advertised in the
docs but ignored — the review fork read only model.provider/default. The
narrow fix would wire the one-off key through, but that leaves curator
as a parallel system: not in `hermes model` → auxiliary picker, not in
the dashboard Models tab, missing per-task base_url/api_key/timeout/
extra_body.
Unify curator with the rest of the aux task system so `hermes model`
and the dashboard configure it like every other aux task.
Four sources of truth updated:
- hermes_cli/config.py — add 'curator' slot to DEFAULT_CONFIG.auxiliary
(timeout=600 since reviews run long), drop the one-off curator.auxiliary
block from DEFAULT_CONFIG.curator.
- hermes_cli/main.py — add ('curator', 'Curator', 'skill-usage review pass')
to _AUX_TASKS so the CLI picker offers it.
- hermes_cli/web_server.py — add 'curator' to _AUX_TASK_SLOTS so the
dashboard REST endpoint accepts it.
- web/src/pages/ModelsPage.tsx — add Curator entry so the dashboard
Models tab renders the task.
agent/curator.py _resolve_review_model() now reads auxiliary.curator
first (canonical), falls back to legacy curator.auxiliary (with an info
log asking users to migrate), then falls back to the main chat model.
Pre-unification users keep working.
Docs updated: docs/user-guide/features/curator.md now points at
`hermes model` → auxiliary → Curator and the dashboard Models tab.
Tests: 6 unit tests on _resolve_review_model (auto default, canonical
slot honored, partial override fallback, legacy fallback with
deprecation log assertion, new-wins-over-legacy, empty-config safety)
plus a cross-registry test that curator is wired into all four sources
of truth. test_aux_tasks_keys_all_exist_in_default_config already
covers the DEFAULT_CONFIG ↔ _AUX_TASKS invariant.
Reported by Voscko on Discord.
|