mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-21 03:39:54 +00:00

Files

T

Teknium eacce70a35 docs: comprehensive 2-week sweep of feature/PR coverage gaps (#28497 )

Catch the website docs up to two weeks of merged work (May 4 – May 18, 2026,
roughly 1,080 PRs). The audit found ~50 user-visible features that had landed
in code with no docs footprint, plus a handful of stale pages. This PR closes
every gap the scan turned up.

New pages
- user-guide/features/deliverable-mode.md — extension list, agent triggers,
  kanban_complete artifacts pattern, [[as_document]] override (PR #27813).
- developer-guide/web-search-provider-plugin.md — authoring guide modeled on
  image-gen-provider-plugin, covering brave_free / ddgs / etc. (PR #25448).

Providers / auth
- Rename "Alibaba Cloud" → "Qwen Cloud (Alibaba DashScope)" everywhere the
  display label shows up; provider id stays `alibaba` (PR #24835).
- Document OAuth refresh-token quarantine for xAI / MiniMax / Codex (PRs
  #28116 / #28118 / #28119).
- Document Nous JWT minting from refresh token + invalid-refresh quarantine
  + cross-profile shared token store (PRs #27663 / #19712).
- Add `## Microsoft Entra ID authentication (keyless)` section to
  azure-foundry guide — DefaultAzureCredential, RBAC, OpenAI + Anthropic
  routing details (PR #28101 / #9df9816da).
- Custom providers `api_mode` is now prompted-and-persisted, not just URL
  autodetected (PR #25068).
- Delegation honours `api_mode` + auto-detects anthropic_messages base URLs
  (PR #26824).
- `x_search` auto-enables when xAI credentials are present (PR #27376).
- Add `xAI Grok OAuth (SuperGrok)` row to providers headline table (PR
  #26534).
- NVIDIA NIM billing-origin header is set automatically (PR #26585).

Windows / installer
- `install.ps1`: document `-Commit <sha>` and `-Tag <v>` pin params plus
  the BOM-strip / git-retry hardening (PR #28169).
- Document Hermes Desktop thin installer + first-launch bootstrap (PR
  #27822).
- Document `dep_ensure` Windows bootstrap (PR #27845).
- Document install-method auto-detection (pip / git / homebrew / nixos) and
  the matching update command (PR #27843).

Gateway / messaging
- `/platform list|pause|resume` full description + circuit-breaker
  semantics (PR #26600).
- Slack / Matrix / Mattermost get parallel `allowed_channels` /
  `allowed_rooms` allowlist sections matching Telegram/Discord/DingTalk
  (PR #21251).
- Discord `allow_any_attachment` + `max_attachment_bytes` (config and env
  vars) (PR #27245).
- Discord clarify-choice button rendering (PR #25485).
- Telegram `guest_mode` @mention bypass for allowlisted groups (PR
  #22759).
- Telegram `notifications` mode (`important` vs `all`) (PR #22793).
- `[[as_document]]` skill / response directive for forcing
  document-style media delivery (PR #21210).

CLI / TUI
- `/new [name]` argument (PR #19637).
- `/subgoal` user-supplied criteria appended to `/goal` (PR #25449).
- `/exit --delete` flag confirmation prompts for destructive slash
  commands (PR #22687).
- Status-bar additions: ▶ N background indicator (PR #27175), context
  compression count (PR #21218), YOLO mode banner+statusbar warning (PR
  #26238).
- `display.timestamps` + `docker_extra_args` config keys (PR #23599).
- TUI collapsible startup banner sections (PR #20625).
- `HERMES_SESSION_ID` exported to tool subprocesses (PR #23847).

i18n
- Refresh display.language locale list from 8 → 16 (en, zh, zh-hant, ja,
  de, es, fr, tr, uk, af, ko, it, ga, pt, ru, hu) — matches
  `agent/i18n.py:SUPPORTED_LANGUAGES`.

Tools / features
- `vision_analyze` native-pixel passthrough for vision-capable callers,
  with auxiliary text-describer fallback (PR #22955).
- `session_search` rewrite to the single-shape tool (discovery / scroll /
  browse modes) (PRs #27590 / #27840).
- Clarify MCP transport scope: client supports stdio + SSE; embedded
  `hermes mcp serve` is stdio-only (PR #21227).
- Web search backends table: add Brave Search (free tier) and DDGS rows
  (PR #21337).
- ACP session-scoped edit auto-approval modes (PR #27862).
- Curator rename map in the user-visible per-run summary (PR #22910).
- Prompt caching feature page reference in features/overview.md — Claude
  cross-session 1-hour prefix cache on native Anthropic / OpenRouter /
  Nous Portal (PR #23828).
- Cron per-job profile parameter (PR #28124).
- `--no-skills` flag for `hermes profile create` (PR #20986).

Build
- Verified with `npm run build` in `website/`; both `en` and `zh-Hans`
  locales compile. Remaining broken-link/anchor warnings are pre-existing
  (`rl-training.md` from learning-path / overview; the
  zh-Hans translation lag the docs skill already calls out).

2026-05-18 23:55:25 -07:00

10 KiB

Raw Permalink Blame History

sidebar_position, title, description

sidebar_position	title	description
16	Persistent Goals	Set a standing goal and let Hermes keep working across turns until it's done. Our take on the Ralph loop.

Persistent Goals (`/goal`)

/goal gives Hermes a standing objective that survives across turns. After every turn a lightweight judge model checks whether the goal is satisfied by the assistant's last response. If not, Hermes automatically feeds a continuation prompt back into the same session and keeps working — until the goal is achieved, you pause or clear it, or the turn budget runs out.

It's our take on the Ralph loop, directly inspired by Codex CLI 0.128.0's /goal by Eric Traut (OpenAI). The core idea — keep a goal alive across turns and don't stop until it's achieved — is theirs. The implementation here is independent and adapted to Hermes' architecture.

When to use it

Use /goal for tasks where you want Hermes to iterate on its own without you re-prompting every turn:

"Fix every lint error in src/ and verify ruff check passes"
"Port feature X from repo Y, including tests, and get CI green"
"Investigate why session IDs sometimes drift on mid-run compression and write up a report"
"Build a small CLI to rename files by their EXIF dates, then test it against the photos/ folder"

Tasks where the agent does one turn and stops don't need /goal. Tasks where you'd otherwise have to say "keep going" three times are where this shines.

Quick start

/goal Fix every failing test in tests/hermes_cli/ and make sure scripts/run_tests.sh passes for that directory

What you'll see:

Goal accepted — ⊙ Goal set (20-turn budget): <your goal>
Turn 1 runs — Hermes starts working as if you'd sent the goal as a normal message.
Judge runs — after the turn, the judge model decides done or continue.
Loop fires if needed — if continue, you'll see ↻ Continuing toward goal (1/20): <judge's reason> and Hermes takes the next step automatically.
Terminates — eventually you see either ✓ Goal achieved: <reason> or ⏸ Goal paused — N/20 turns used.

Commands

Command	What it does
`/goal <text>`	Set (or replace) the standing goal. Kicks off the first turn immediately so you don't need to send a separate message.
`/goal` or `/goal status`	Show the current goal, its status, and turns used.
`/goal pause`	Stop the auto-continuation loop without clearing the goal.
`/goal resume`	Resume the loop (resets the turn counter back to zero).
`/goal clear`	Drop the goal entirely.

Works identically on the CLI and every gateway platform (Telegram, Discord, Slack, Matrix, Signal, WhatsApp, SMS, iMessage, Webhook, API server, and the web dashboard).

Adding criteria mid-goal: `/subgoal`

While a goal is active you can append extra acceptance criteria with /subgoal <text> without resetting the loop. Each call adds one numbered item to the goal's subgoal list; the continuation prompt the agent sees on the next turn includes the original goal plus an "Additional criteria the user added mid-loop" block, and the judge prompt is rewritten so the verdict must consider every subgoal — the goal isn't marked done until the original objective and every subgoal are met.

Command	What it does
`/subgoal <text>`	Append a new criterion to the active goal. Requires an active `/goal`.
`/subgoal` (no args)	Show the current numbered subgoal list.
`/subgoal remove <N>`	Remove the Nth subgoal (1-based).
`/subgoal clear`	Drop every subgoal but keep the original goal intact.

Subgoals are persisted alongside the goal in SessionDB.state_meta, so they survive /resume. Setting a new /goal <text> replaces the goal and clears the subgoal list; /goal clear does the same.

Use this when you start a loop ("fix the failing tests") and notice partway through that you also want it to "and add a regression test for the bug you just patched" — /subgoal add a regression test tightens the success criteria without breaking the running loop.

Behavior details

The judge

After every turn, Hermes calls an auxiliary model with:

The standing goal text
The agent's most recent final response (last ~4 KB of text)
A system prompt telling the judge to reply with strict JSON: {"done": <bool>, "reason": "<one-sentence rationale>"}

The judge is deliberately conservative: it marks a goal done only when the response explicitly confirms the goal is complete, when the final deliverable is clearly produced, or when the goal is unachievable/blocked (treated as DONE with a block reason so we don't burn budget on impossible tasks).

Fail-open semantics

If the judge errors (network blip, malformed response, unavailable aux client), Hermes treats the verdict as continue — a broken judge never wedges progress. The turn budget is the real backstop.

Turn budget

Default is 20 continuation turns (goals.max_turns in config.yaml). When the budget is hit, Hermes auto-pauses and tells you exactly how to proceed:

⏸ Goal paused — 20/20 turns used. Use /goal resume to keep going, or /goal clear to stop.

/goal resume resets the counter to zero, so you can keep going in measured chunks.

User messages always preempt

Any real message you send while a goal is active takes priority over the continuation loop. On the CLI your message lands in _pending_input ahead of the queued continuation; on the gateway it goes through the adapter FIFO the same way. The judge runs again after your turn — so if your message happens to complete the goal, the judge will catch it and stop.

Mid-run safety (gateway)

While an agent is already running, /goal status, /goal pause, and /goal clear are safe to run — they only touch control-plane state and don't interrupt the current turn. Setting a new goal mid-run (/goal <new text>) is rejected with a message telling you to /stop first, so the old continuation can't race the new one.

Persistence

Goal state lives in SessionDB.state_meta keyed by goal:<session_id>. That means /resume picks up right where you left off — set a goal, close your laptop, come back tomorrow, /resume, and the goal is still standing exactly as you left it (active, paused, or done).

Prompt cache

The continuation prompt is a plain user-role message appended to history. It does not mutate the system prompt, swap toolsets, or touch the conversation in any way that invalidates Hermes' prompt cache. Running a 20-turn goal costs the same cache-wise as 20 turns of normal conversation.

Configuration

Add to ~/.hermes/config.yaml:

goals:
  # Max continuation turns before Hermes auto-pauses and asks you to
  # /goal resume. Default 20. Lower this if you want tighter loops;
  # raise it for long-running refactors.
  max_turns: 20

Choosing the judge model

The judge uses the goal_judge auxiliary task. By default it resolves to your main model (see Auxiliary Models). If you want to route the judge to a cheap fast model to keep costs down, add an override:

auxiliary:
  goal_judge:
    provider: openrouter
    model: google/gemini-3-flash-preview

The judge call is small (~200 output tokens) and runs once per turn, so a cheap fast model is usually the right call.

Example walkthrough

You: /goal Create four files /tmp/note_{1..4}.txt, one per turn, each containing its number as text

  ⊙ Goal set (20-turn budget): Create four files /tmp/note_{1..4}.txt, one per turn, each containing its number as text

Hermes: Creating /tmp/note_1.txt now.
  💻 echo "1" > /tmp/note_1.txt   (0.1s)
  I've created /tmp/note_1.txt with the content "1". I'll continue with the remaining files on the next turn as you specified.

  ↻ Continuing toward goal (1/20): Only 1 of 4 files has been created; 3 files remain.

Hermes: [Continuing toward your standing goal]
  💻 echo "2" > /tmp/note_2.txt   (0.1s)
  Created /tmp/note_2.txt. Two more to go.

  ↻ Continuing toward goal (2/20): 2 of 4 files created; 2 remain.

Hermes: [Continuing toward your standing goal]
  💻 echo "3" > /tmp/note_3.txt   (0.1s)
  Created /tmp/note_3.txt.

  ↻ Continuing toward goal (3/20): 3 of 4 files created; 1 remains.

Hermes: [Continuing toward your standing goal]
  💻 echo "4" > /tmp/note_4.txt   (0.1s)
  All four files have been created: /tmp/note_1.txt through /tmp/note_4.txt, each containing its number.

  ✓ Goal achieved: All four files were created with the specified content, completing the goal.

You: _

Four turns, one /goal invocation, zero "keep going" prompts from you.

When the judge gets it wrong

No judge is perfect. Two failure modes to watch for:

False negative — judge says continue when the goal is actually done. The turn budget catches this. You'll see ⏸ Goal paused and can /goal clear or just send a new message.

False positive — judge says done when work remains. You'll see ✓ Goal achieved but you know better. Send a follow-up message to continue, or re-set the goal more precisely: /goal <more specific text>. The judge's system prompt is deliberately conservative to make false positives rarer than false negatives.

If you find a judge verdict unconvincing, the reason text in the ↻ Continuing toward goal or ✓ Goal achieved line tells you exactly what the judge saw. That's usually enough to diagnose whether the goal text was ambiguous or the model's response was.

Attribution

/goal is Hermes' take on the Ralph loop pattern. The user-facing design — keep a goal alive across turns, don't stop until it's achieved, with create/pause/resume/clear controls — was popularised and shipped in Codex CLI 0.128.0 by Eric Traut on OpenAI's Codex team. Our implementation is independent (central CommandDef registry, SessionDB.state_meta persistence, auxiliary-client judge, adapter-FIFO continuation on the gateway side) but the idea is theirs. Credit where credit's due.

10 KiB Raw Permalink Blame History

Persistent Goals (/goal)