mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-21 03:39:54 +00:00
2dec7604e206f0397cb5c8804f542dee405aba62
134 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
5d1f350784 | fix(cli): preserve cron asterisks in strip mode | ||
|
|
2ef501e1f5 |
feat(cli): add /update slash command to CLI and TUI (#23854)
* feat: add /update slash command to CLI and TUI * test(cli): add Python tests for /update slash command Co-authored-by: Cursor <cursoragent@cursor.com> * fix(cli): address Copilot review for /update slash command Route classic CLI /update through prompt_toolkit modal confirmation and defer relaunch to the main-thread cleanup path after app.exit(). Tighten Y/n semantics, add Python wrapper and catalog coverage tests, and assert /update stays visible in the TUI command catalog. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(cli): address review feedback on /update command - Replace raw input() with _prompt_text_input_modal in _handle_update_command to avoid EOF/hang/keystroke-leak races with prompt_toolkit's stdin ownership - Fix confirmation logic: only proceed on recognized affirmative aliases (y/yes/1/ok); cancel on everything else including empty string, typos, and unrecognized input — matches all other [Y/n] prompts in the codebase - Route relaunch through main-thread shutdown path: set _pending_relaunch and return False from process_command so process_loop triggers app.exit(); run() then calls relaunch() after prompt_toolkit has restored terminal modes and after cleanup — safe on both POSIX (execvp) and Windows (subprocess+exit) - Fix misleading docstring in test_update_command.py: the Vitest only covers the TypeScript slash handler that emits code 42, not the Python wrapper branch that acts on it - Rewrite tests to use SimpleNamespace pattern (like test_destructive_slash_confirm) so _prompt_text_input_modal can be stubbed directly - Add Python test for _launch_tui exit-code-42 → relaunch branch in main.py Agent-Logs-Url: https://github.com/NousResearch/hermes-agent/sessions/f6da68cf-e7b1-4b7a-aed6-3d4b0f523bdb Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com> * fix(cli): polish test fixtures for /update command - Remove unused _prompt_text_input from SimpleNamespace stub - Use pytest.fail sentinel in managed-install guard test to catch unexpected modal invocations Agent-Logs-Url: https://github.com/NousResearch/hermes-agent/sessions/f6da68cf-e7b1-4b7a-aed6-3d4b0f523bdb Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com> * chore: re-trigger CI after Copilot review fixes Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: austinpickett <260188+austinpickett@users.noreply.github.com> |
||
|
|
5fba236644 |
chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)
Six days after #23937 (608 fixes) the codebase had accumulated 241 new PLR6201 violations. Same mechanical `x in (...)` → `x in {...}` fix, same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the two are semantically equivalent for hashable scalar membership tests. All 241 instances fixed via `ruff check --select PLR6201 --fix --unsafe-fixes`, zero remaining. Every changed value is a hashable scalar (str/int/None/enum/signal); no risk of unhashable runtime errors. No behavior change. Test plan: - 119 files changed, +244/-244 (net zero) — exactly one-line edits - `ruff check` clean afterward - Compile checks pass on the largest touched files (cli.py, run_agent.py, gateway/run.py, gateway/platforms/discord.py, model_tools.py) - Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/ tests/tools/: 18187 passed, 59 pre-existing failures (verified against origin/main with the same shape — identical failure count, identical category — all xdist test-order flakes unrelated to this change) Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)). |
||
|
|
226cee43d9 |
feat(cli): show ▶ N indicator in status bar when /background tasks are running (#27175)
Surface live background-task count in the prompt_toolkit status bar so users can see at a glance that a /background task exists and is running — no need to ask the agent about it (the agent has no visibility into bg sessions by design). - _get_status_bar_snapshot now reports active_background_tasks from len() of the live _background_tasks dict (entries are removed in the task thread's finally block, so this reflects truly-running tasks) - Indicator shown only on medium (<76) and wide (>=76) tiers; narrow (<52) stays minimal since it's already cramped - No invalidate plumbing needed: status bar fragments are pulled via lambda on every redraw, and the bg thread already calls _app.invalidate() on exit Refs #8568 |
||
|
|
fc03c95da1 |
feat(cli): add /exit --delete flag to remove session on quit (#27101)
Port from google-gemini/gemini-cli#19332. Users can now exit with '/exit --delete' (or '/quit --delete', '/exit -d') to permanently remove the current session's SQLite history plus on-disk transcripts (*.json / *.jsonl / request_dump_*) in one shot. Useful for privacy-sensitive workflows and one-off interactions where leaving a session recording behind is undesirable. Implementation: - New HermesCLI._delete_session_on_exit one-shot flag (defaults False). - process_command() parses --delete / -d after /exit or /quit and arms the flag. Unknown args print a hint and keep the CLI running (prevents typos like '/exit -delete' from accidentally exiting). - Shutdown path calls SessionDB.delete_session(session_id, sessions_dir=...) right after end_session() when the flag is set. That API already existed for 'hermes sessions delete' and handles both SQLite removal (orphaning child sessions so FK constraints hold) and on-disk file cleanup. - /quit CommandDef now advertises '[--delete]' in args_hint so /help and CLI autocomplete surface it. Tests: tests/cli/test_exit_delete_session.py (12 cases covering both aliases, case insensitivity, whitespace, short form, unknown-arg rejection, and registry metadata). E2E-verified with isolated HERMES_HOME: session row deleted, all three transcript/request-dump files removed, second delete_session call correctly returns False. |
||
|
|
965ae7fa97 |
revert(cli): drop scrollback box width clamp (#25975), restore full-width borders (#26163)
#25975 (salvaging #24403) clamped decorative scrollback Panels and streaming box rules to `max(32, min(width, 56))` as a defense against terminal-emulator reflow when columns shrink. On any modern wide terminal this made the response/reasoning borders look stubby — 56 cols inside a 200-col viewport. #26137 (salvaging #25981, by @OutThisLife) landed a more fundamental fix: prompt_toolkit's `_output_screen_diff` is monkey-patched so its reserve-vertical-space cursor move no longer pushes chrome into scrollback at all. With that in place, the clamp is no longer load-bearing for the chrome-into-scrollback class of bugs — the remaining risk is purely cosmetic reflow of *already stamped* Panel borders during an aggressive column shrink, which we now accept as a tradeoff for restoring proper full-width rendering. Changes: - `_scrollback_box_width()` returns `max(32, width)` (just the floor, no upper cap). All 10 call sites stay valid. - Updated `test_scrollback_box_width_caps_to_resize_safe_value` to the new `test_scrollback_box_width_returns_viewport_width` asserting full-width passthrough above the 32-col floor. Floor of 32 is kept so `'─' * (w - 2)` math stays positive on tiny terminals. Refs #18449 #19280 #22976 (the original reflow class) and #25975 (the clamp this reverts). |
||
|
|
cbd1f8e4be |
test(cli): cover light-mode detection + SkinConfig.get_color remap
Adds 16 unit tests covering the light/dark terminal detection path introduced in the previous commit: - Env override priority (HERMES_LIGHT, HERMES_TUI_LIGHT, HERMES_TUI_THEME, HERMES_TUI_BACKGROUND, COLORFGBG) - Detection cache stickiness - _maybe_remap_for_light_mode() no-op in dark mode - Known dark-mode color remap (#FFF8DC -> #1A1A1A etc) - Case-insensitive lookup - Unknown color passthrough - Status-bar paired colors (#C0C0C0, #888888, #555555, #8B8682) are intentionally NOT remapped — regression guard for the patch-11 fix, since remapping them would produce dark-on-dark on the status bar's navy bg - SkinConfig.get_color() wrapper is installed and idempotent - SkinConfig.get_color() does remap in light mode and passes through in dark mode We don't try to fake an OSC 11 reply — that path is exercised end-to-end in real Terminal.app; the env-override path covers the algorithmic logic. |
||
|
|
d6c488f2dc |
fix(cli): wire /sessions slash command in the classic CLI
The 'sessions' command has been registered in the central command registry since #20805 (May 2025) and surfaces in /help and tab-completion, but the classic CLI's process_command() never had an elif branch for it. The canonical name fell through and printed 'Unknown command: sessions'. The TUI side was wired up correctly via the SessionPicker overlay; only the legacy CLI was missing the dispatch. Adds _handle_sessions_command() which mirrors /resume's no-arg behavior inline (the CLI has no overlay primitive equivalent to the TUI picker): - /sessions and /sessions list → print the recent-sessions table - /sessions <id_or_title> → delegates to _handle_resume_command Includes regression tests covering the dispatcher wiring (the original bug) plus the three handler branches. |
||
|
|
2844c888f1 |
fix(cli): clamp scrollback box widths + suppress status bar after resize (#25975)
When the terminal shrinks, already-printed box-drawing rules (response, reasoning, streaming TTS, background-task Panels) reflow into multiple narrower rows — visible as duplicated horizontal separators / ghost lines in scrollback. Similarly, prompt_toolkit redraws a fresh status bar on SIGWINCH on top of one the terminal just reflowed, producing double-bar artifacts on column shrink. Two surgical changes: 1. Decorative scrollback boxes now use a new `HermesCLI._scrollback_box_width()` helper that clamps to `max(32, min(width, 56))`. The live TUI footer is unaffected and still uses the full width. Covers: streaming response box (open + close), reasoning box (open + close, both streaming and post-stream paths), streaming-TTS box close, final-response Rich Panel, and the background-task Rich Panel. 2. `_recover_after_resize()` now also sets a new `_status_bar_suppressed_after_resize` flag so the dynamic status bar and both input separator rules stay hidden until the next user input. The flag is cleared in the process loop the moment the user submits their next prompt, restoring chrome cleanly. Tests: - New `test_input_rules_hide_after_resize_until_next_input` covers the flag's effect on rule heights. - New `test_scrollback_box_width_caps_to_resize_safe_value` covers the helper at floor / cap / mid-range / overflow. - Existing resize-recovery test extended to assert the flag flips. Refs: #18449 #19280 #22976 Salvage of #24403. Co-authored-by: Szymonclawd <szymonclawd@mac.home> |
||
|
|
ac64d0c2ca | fix: preserve ansi output history on resize replay | ||
|
|
06c6c1f0f2 | fix(cli): batch resize history replay | ||
|
|
e2b2d48610 |
fix(cli): preserve startup banner on terminal resize
Recover from SIGWINCH without clearing the physical screen or scrollback buffer. The startup banner and tool summary are printed before prompt_toolkit owns the live chrome, so they live in normal terminal scrollback. Calling erase_screen() + \x1b[3J] on every resize removed that UI permanently — _replay_output_history cannot reconstruct it because the banner was never added to _OUTPUT_HISTORY. Instead, just reset prompt_toolkit's renderer cache and invalidate so the next incremental redraw starts from a clean slate, then let the original on_resize handler recalculate layout for the new terminal size. This matches the behaviour of bash/zsh/fish on SIGWINCH. Fixes NousResearch/hermes-agent#22999 |
||
|
|
6f2d1c88b7 |
feat(custom): prompt and persist explicit api_mode for custom providers
Adds an explicit API compatibility mode prompt to the `hermes model -> custom`
flow so Codex-compatible third-party endpoints (and any other non-default
backend whose URL doesn't match the existing heuristics in
`_detect_api_mode_for_url`) can be selected explicitly instead of silently
falling back to chat_completions.
Choices: Auto-detect / chat_completions / codex_responses / anthropic_messages.
Persists `api_mode` to:
- `model.api_mode` (active session config)
- the matching `custom_providers[*]` entry (so re-activating the named
provider next time replays the same transport)
Salvaged from PR #6125 onto current main: kept the new prompt and the
`_save_custom_provider(api_mode=...)` plumbing; the named-custom flow
already extracts and applies `api_mode` from the saved entry on current
main so those changes are preserved as-is. Test fixtures updated for the
new prompt and the existing display-name prompt.
Co-authored-by: littlewwwhite <1095245867@qq.com>
|
||
|
|
a34998ee2f | fix(cli): parse positional insights days | ||
|
|
1d00716754 |
fix(cli,tui): align CJK / wide-char markdown tables (#23863)
CJK and emoji glyphs render as two terminal cells but JS String#length and the model's own padding count them as one, so any markdown table with Chinese / Japanese / Korean cells drifts right per row when a real terminal renders it. Both surfaces fix this with a display-cell width measurement (wcswidth on the Python side, stringWidth on the TUI side). Changes: - agent/markdown_tables.py: new helper. realign_markdown_tables(text) detects markdown table blocks (header + |---| divider) and rewrites the row padding using wcwidth.wcswidth so every pipe and dash lines up across rows. No-op on text without tables. - cli.py: hook the helper into _render_final_assistant_content for strip / render modes (raw passes through untouched), and into the streaming line emitter so live token-by-token rendering also produces aligned tables. A small two-buffer state machine in _emit_stream_text holds table rows until the block ends, then flushes them through the realigner so all rows pad to a single per-column width. - ui-tui/src/components/markdown.tsx: renderTable now uses stringWidth (Bun.stringWidth fast path + East-Asian-width-aware fallback, already memoised in @hermes/ink) instead of UTF-16 String#length for both column-width measurement and per-cell padding. Drops the comment that documented the bug as a deliberate limitation. Validation: - New tests/agent/test_markdown_tables.py (11): every rebuilt block shares pipe column offsets across rows for pure CJK, mixed CJK+emoji, ragged-row, and multi-table inputs. - Updated tests/cli/test_cli_markdown_rendering.py: the existing strip-mode test asserted exact whitespace; rewritten to assert the alignment contract (cell content survives + every rendered row shares pipe offsets). - New ui-tui markdown.test.ts case (1): rendered column-2 start offset is identical for the header + every body row, including the CJK row that drifted before the fix. - Live: hermes chat -q with the user-reported screenshot prompt now produces a perfectly aligned table on the wire (header, divider, 4 body rows including '通义千问', all pipes at identical columns). |
||
|
|
054f568578 | fix: use TUI modal for slash confirmations | ||
|
|
3e7145e0bb |
revert: roll back /goal checklist + /subgoal feature stack (#23813)
* Revert "fix(goals): force judge to use tool calls instead of JSON-text replies (#23547)" This reverts commit |
||
|
|
28b4fe6007 |
test: stabilize quick-command redaction test against xdist ordering
agent.redact._REDACT_ENABLED is snapshotted at import time from HERMES_REDACT_SECRETS env. Under xdist a prior test in the same worker can flip it, so test_exec_command_output_is_redacted was order-dependent. Pin it via monkeypatch like test_terminal_output_transform_still_runs_strip_and_redact does. |
||
|
|
f6736ced81 |
fix(security): sanitize env and redact output in quick commands + remove write-only _pending_messages
1. Quick command exec ran in the gateway process's full environment without env sanitization or output redaction. A quick command like "env" or "printenv" would leak all API keys, OAuth tokens, and bot credentials to the messaging user. Fix: apply _sanitize_subprocess_env() before exec and redact_sensitive_text() on output before returning. 2. GatewayRunner._pending_messages was written on every interrupt (lines 1331-1334) but never read or consumed anywhere. The actual interrupt delivery uses adapter._pending_messages (a separate dict). Removed the write-only accumulation to prevent unbounded growth. |
||
|
|
404640a2b7 |
feat(goals): /goal checklist + /subgoal user controls (#23456)
* feat(goals): /goal checklist + /subgoal user controls
Two-phase judge for /goal — Phase A decomposes the goal into a detailed
checklist on first turn; Phase B evaluates each pending item harshly
against the agent's most recent response. The goal completes only when
every item is in a terminal status (completed or impossible). Adds
/subgoal so the user can append, complete, mark impossible, undo,
remove, or clear items the judge missed or got wrong.
Mechanics:
- GoalState gains `checklist` and `decomposed` fields, both backwards
compatible (old state_meta rows load unchanged).
- Phase A: aux call writes a harsh, exhaustive checklist; biased toward
more items not fewer. Falls through to legacy freeform judge when
decompose fails.
- Phase B: judge gets the checklist + last-response snippet + path to
a per-session conversation dump at <HERMES_HOME>/goals/<sid>.json.
A bounded read_file tool (max 5 calls per turn, restricted to that
one file) lets the judge inspect history when the snippet is
ambiguous. Stickiness in code: terminal items are frozen, only the
user can revert via /subgoal undo.
- Continuation prompt shows checklist progress when non-empty;
reverts to old prompt when empty.
- Status line shows M/N done counts.
CLI + gateway + TUI gateway all pass the agent reference into
evaluate_after_turn so the dump can be written. Gateway-side
/subgoal is allowed mid-run since it only modifies the checklist
the judge consults at turn boundaries.
Tests: 24 new cases — backcompat round-trip, Phase A decompose,
Phase B updates + new_items + stickiness, user override flows,
conversation dump (incl. unsafe-sid sanitization), judge read_file
restriction. Existing freeform-mode tests updated to patch the
renamed `judge_goal_freeform` and skip Phase A explicitly.
* fix(goals): off-by-one in judge index, message-list plumbing, prompt tuning
Three live-test findings from running /goal end-to-end against
gemini-3-flash-preview as the judge:
1. Off-by-one bug — the judge sees the checklist rendered with 1-based
indices ('1. [ ] foo, 2. [ ] bar') but the apply layer indexed
state.checklist as 0-based. Result: every judge update landed on
the wrong item, evidence got attached to neighbouring rows, and
the genuine 'first pending' item (usually #1) never got marked.
Fix: convert 1 → 0 in _parse_evaluate_response. Also tightened the
user prompt to call out the 1-based scheme explicitly. New tests
cover the parser conversion + an end-to-end fake-judge round-trip.
2. Conversation dump never happened — _extract_agent_messages tried
common AIAgent attribute names (.messages, .conversation_history,
etc.) but AIAgent doesn't expose the message list as an instance
attribute; it lives inside run_conversation()'s scope. Result: the
judge's read_file tool always saw history_path=unavailable. Fix:
added an explicit messages= kwarg to evaluate_after_turn that all
three call sites (CLI, gateway, TUI gateway) now pass directly.
Agent-attribute extraction kept as back-compat fallback.
3. Prompt was too harsh on simple goals. The original 'be HARSH,
default to leaving items pending' wording made the judge refuse
to mark 'file exists' completed even after the agent ran ls,
test -f, os.path.isfile, and find — burning the entire 8-turn
budget on a fizzbuzz task. Softened to 'strict but not absurd'
with explicit guidance on what counts as evidence and a directive
not to require re-proving items already established earlier.
Re-tested live with the same fizzbuzz goal: now terminates in 2
turns with all 8 checklist items correctly attributed to their
own evidence. /subgoal user-action flow (add / complete / undo /
impossible) verified live as well.
|
||
|
|
c5f1f863ac |
fix(cli): drive _prompt_text_input directly when off main thread (#23454)
Slash commands (/clear, /new, /undo, /reload-mcp) are dispatched from the process_loop daemon thread. prompt_toolkit.run_in_terminal returns a coroutine that only the main-thread event loop can drive, so calling it from a daemon thread orphans the coroutine — the input prompt never renders and user keystrokes leak into the composer instead of the confirmation prompt (issue #23185). Mirror the thread-aware guard already in _run_curses_picker: when off the main thread, fall back to a direct input() call. Also wrap run_in_terminal in try/except so WSL / Warp / other emulators that silently drop the scheduled coroutine fall back to input() too. Tests: tests/cli/test_prompt_text_input_thread_safety.py covers main thread (run_in_terminal path), daemon thread (direct input fallback), no-app, run_in_terminal-raises, and EOF handling. |
||
|
|
85383c6363 | fix(cli): preserve config comments on setting writes | ||
|
|
c7f0aab949 |
feat(openrouter): wire Pareto Code router with min_coding_score knob (#22838)
Pick openrouter/pareto-code as your model and OpenRouter auto-routes each
request to the cheapest model meeting your coding-quality bar (ranked by
Artificial Analysis). The new openrouter.min_coding_score config key (0.0-1.0,
default 0.65) tunes the floor.
- hermes_cli/models.py: add openrouter/pareto-code to OPENROUTER_MODELS so
it shows up in the picker with a description
- hermes_cli/config.py: add openrouter.min_coding_score (default 0.65 — lands
on a mid-tier coder on the current Pareto frontier)
- plugins/model-providers/openrouter: emit extra_body.plugins =
[{id: pareto-router, min_coding_score: X}] when model is openrouter/pareto-code
AND the score is a valid float in [0.0, 1.0]
- agent/transports/chat_completions.py: same emission on the legacy flag
path (when no provider profile is loaded)
- run_agent.py: openrouter_min_coding_score kwarg + storage; plumbed into
both build_kwargs() invocations and the context-summary extra_body path
- cli.py: read openrouter.min_coding_score once at init, validate float in
[0,1], pass to AIAgent constructions (CLI + background-task paths)
- cron/scheduler.py, batch_runner.py, tools/delegate_tool.py,
tui_gateway/server.py: propagate the kwarg (mirrors providers_order
plumbing — subagents inherit, cron/batch read from config)
- tests: profile-level + transport-level coverage of the model gating,
unset/empty/out-of-range handling, and the legacy flag path
- docs: new 'OpenRouter Pareto Code Router' section in providers.md
Verified end-to-end against api.openrouter.ai: at score=0.65 we land on a
mid-tier coder, at omission we get the strongest. Score is silently dropped
on any model other than openrouter/pareto-code, so it's safe to leave set.
|
||
|
|
70bc52e408 |
fix(cli): make Ctrl+Enter insert newline on WSL/SSH/Windows Terminal (#22777)
Native Windows, WSL, SSH sessions, and Windows Terminal all send Ctrl+Enter as bare LF (c-j). Hermes was binding c-j as submit on every POSIX platform, so Ctrl+Enter submitted instead of inserting a newline on those terminals. Reported in #22379. Add _preserve_ctrl_enter_newline() predicate that detects the environments where Ctrl+Enter must produce a newline (sys.platform == 'win32', SSH_CONNECTION/SSH_CLIENT/SSH_TTY env, WT_SESSION, WSL_DISTRO_NAME, /proc/version 'microsoft' marker). Gate the c-j-as-submit binding off in those environments and gate the c-j-as-newline handler on. Local POSIX TTYs without those markers (docker exec, plain ssh from a Mac) keep c-j as submit so plain Enter still works on thin PTYs. Add install_ctrl_enter_alias() in hermes_cli/pt_input_extras.py mapping the three CSI-u / modifyOtherKeys variants of Ctrl+Enter ('\x1b[13;5u', '\x1b[27;5;13~', '\x1b[27;5;13u') to the (Escape, ControlM) tuple Alt+Enter produces. This lets Kitty / mintty / xterm-with-modifyOtherKeys users over SSH get a Ctrl+Enter newline through the existing Alt+Enter handler. 9 new tests + extended existing test_lf_enter_binds_to_submit_handler_posix to cover bare-local vs SSH branches. Closes #22379. |
||
|
|
b9c001116e |
feat: confirm prompt for destructive slash commands (#4069) (#22687)
/clear, /new, /reset, and /undo now ask the user to confirm before discarding conversation state — three-option prompt routed through the existing tools.slash_confirm primitive. Native yes/no buttons render on Telegram, Discord, and Slack (their adapters already implement send_slash_confirm); other platforms get a text-fallback prompt and reply with /approve, /always, or /cancel. The classic prompt_toolkit CLI uses the same three-option flow via the established _prompt_text_input pattern (see _confirm_and_reload_mcp). TUI keeps its existing modal overlay (#12312). Gated by new config key approvals.destructive_slash_confirm (default true). Picking 'Always Approve' flips the gate to false so subsequent destructive commands run silently — matches the established mcp_reload_confirm UX. Out of scope: /cron remove (separate domain — scheduled jobs, not session history). Existing TUI overlay env-var (HERMES_TUI_NO_CONFIRM) left unchanged; cosmetic unification can come later. Closes #4069. |
||
|
|
f5b635f6ab |
feat(cli): recognise Shift+Enter as a newline key
Closes #5346. Most terminals send the same byte sequence for `Enter` and `Shift+Enter` by default, so the application can't tell them apart — this is a terminal protocol limitation, not something Hermes can paper over. But terminals that implement the Kitty keyboard protocol (Kitty / foot / WezTerm / Ghostty by default; iTerm2 / Alacritty / VS Code terminal / Warp once the protocol is enabled) DO emit a distinct sequence for `Shift+Enter`: - `\x1b[13;2u` — Kitty / CSI-u, modifier=2 - `\x1b[27;2;13~` — xterm modifyOtherKeys=2 Stock prompt_toolkit doesn't have the CSI-u sequence in its `ANSI_SEQUENCES` table at all, and it maps the modifyOtherKeys variant to plain `Keys.ControlM` (Enter) — i.e. it strips the Shift modifier, which is the bug users actually hit on iTerm2 and friends. This PR adds `hermes_cli/pt_input_extras.install_shift_enter_alias()`, called once at CLI startup from `cli.py`, which inserts/overwrites those sequences in `ANSI_SEQUENCES` so they decode to `(Keys.Escape, Keys.ControlM)` — the same key tuple `Alt+Enter` produces. The existing Alt+Enter newline handler (`@kb.add('escape', 'enter')` in `cli.py`) then fires unchanged, so there is no new keybinding to register and no behavioral change for terminals that don't emit the distinct sequences. Files ===== * `hermes_cli/pt_input_extras.py` — new module hosting the helper. Lives outside `cli.py` so it's importable in tests without dragging in the full CLI runtime (which depends on `fire`, `rich`, etc.). * `cli.py` — calls `install_shift_enter_alias()` once at module import. Wrapped in try/except so prompt_toolkit version drift can't break CLI startup. * `tests/cli/test_cli_shift_enter_newline.py` — 6 tests: - registration of all three byte sequences - overwrite of stock prompt_toolkit's broken modifyOtherKeys mapping - idempotency - parser equivalence: CSI-u Shift+Enter == Alt+Enter - parser equivalence: modifyOtherKeys Shift+Enter == Alt+Enter - plain Enter remains a single key (submit), distinct from the two-key Alt+Enter / Shift+Enter tuple * `website/docs/user-guide/cli.md` — keybinding table updated; new "Shift+Enter compatibility" subsection with a per-terminal status table noting macOS Terminal / stock Windows Terminal cannot distinguish the keystroke at the protocol level. * `website/docs/getting-started/quickstart.md`, `website/docs/guides/tips.md` — short mention pointing readers at the full compatibility note in `cli.md`. Tested ====== pytest tests/cli/test_cli_shift_enter_newline.py # 6 passed Live-tested by triggering `\x1b[13;2u` against the running Vt100Parser (see test). Not exercised in a real terminal end-to-end because that requires a Kitty-protocol-capable host; the test exercises the parser path that drives the live terminal too. |
||
|
|
d1838041e5 |
feat: Ctrl+Enter inserts newline on Windows Terminal
Windows Terminal intercepts Alt+Enter for its fullscreen shortcut, leaving Windows users with no Enter-involving way to insert a newline in the Hermes prompt. Fix it by reclaiming c-j on Windows only: - _bind_prompt_submit_keys now binds c-j (LF) to submit only on POSIX, where thin PTYs (docker exec, some SSH configs) deliver Enter as LF. On Windows plain Enter is always c-m, so c-j is free. - Windows-only prompt binding: c-j inserts a newline. Windows Terminal sends Ctrl+Enter as LF, so the user-facing keystroke is Ctrl+Enter — no terminal settings changes required. - Alt+Enter binding unchanged; still works on mac/Linux/WSL. - Test TestPromptToolkitTerminalCompatibility::test_lf_enter_binds_to_submit_handler split into platform-aware assertions for POSIX vs win32. - Fixed the Ctrl+J claim in hermes_cli/tips.py (was wrong before this commit even on POSIX) to point Windows users at Ctrl+Enter. Tradeoff: on Windows, raw Ctrl+J (without Enter) also inserts a newline, since WT collapses Ctrl+Enter and Ctrl+J to the same c-j keycode. No conflicting Hermes binding existed for Ctrl+J, so this is a harmless side effect. |
||
|
|
674fad1483 |
fix(goals): Ctrl+C during /goal loop auto-pauses the goal (#21888)
Reported: Ctrl+C during an active /goal loop felt like it did nothing — the agent would interrupt the current turn, then immediately queue another continuation and keep going until the session ended or the 20-turn budget ran out. Root cause: cli.py's _maybe_continue_goal_after_turn() ran in the finally: block around self.chat(...) unconditionally. Whether the turn completed normally, got interrupted, or returned an empty string, the judge ran on whatever was in conversation_history and — because the judge is fail-open — a "continue" verdict pushed another CONTINUATION_PROMPT onto _pending_input. Ctrl+C was invisible to the hook. Fix: - chat() now captures result['interrupted'] onto self._last_turn_interrupted (resets to False at entry so early-returns don't leak prior state). - _maybe_continue_goal_after_turn() checks the flag first: on interrupt, auto-pause via mgr.pause(reason='user-interrupted (Ctrl+C)') and print a one-liner pointing the user at /goal resume or /goal clear. No judge call, no continuation enqueued. - Also added an empty-response guard that mirrors gateway/run.py's _handle_message logic (empty reply → transient failure → skip judging so we don't trip the consecutive-parse-failures backstop unnecessarily). The goal stays in the DB as paused, so /goal resume recovers it after the user has sorted out whatever made them cancel. /goal clear still works as before for a full stop. Tests: tests/cli/test_cli_goal_interrupt.py covers: - interrupted turn pauses + doesn't queue + judge is NOT called - paused goal is resumable - empty / whitespace / missing assistant reply skips judging - healthy turn still enqueues continuation / marks done - chat() resets _last_turn_interrupted at entry (anti-leak guard) All 55 existing goal tests still pass. |
||
|
|
f5a232af84 |
refactor: replace 'cmp' text with 🗜️ emoji in status bar
Address review feedback to use the clamp emoji (��️) instead of the plain text 'cmp' prefix for the compression count indicator. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> |
||
|
|
103e11926f |
feat(cli): show context compression count in status bar
Display the number of context compressions in the CLI status bar when compressions > 0, helping users understand conversation compression pressure during long sessions. - Wide layout (>=76 cols): shows 'cmp N' between context percent and duration - Medium layout (52-75 cols): shows 'cmp N' between percent and duration - Narrow layout (<52 cols): omitted to save space - Color-coded: dim for 1-4, warn for 5-9, bad for 10+ - Hidden when zero to keep the bar clean for new sessions Closes #18564 |
||
|
|
5044e1cbf1 | fix(cli): submit LF enter in thin PTYs (#20896) | ||
|
|
906881c38b |
fix(cli): catch OSError in _resolve_attachment_path to prevent ENAMETOOLONG dropping long slash commands
When the user pastes a long slash command like \`/goal <long prose>\` into
\`hermes chat\`, the input flows into \`_detect_file_drop()\`, whose
\`starts_like_path\` prefilter accepts anything starting with \`/\` and
forwards it to \`_resolve_attachment_path()\`. That helper calls
\`Path.exists()\` which invokes \`os.stat()\`, which raises
\`OSError(errno=ENAMETOOLONG)\` — 63 on macOS, 36 on Linux — when the
candidate exceeds NAME_MAX (typically 255 bytes).
The OSError propagates up to the broad \`except Exception\` in
\`process_loop\` (cli.py:11798), gets logged at WARNING level, and the
user's input is silently dropped. From the user's POV the chat prompt
hangs — the only signal is in agent.log:
WARNING cli: process_loop unhandled error (msg may be lost):
[Errno 63] File name too long: "/goal Drive the space board..."
This affects any slash command with prose-length arguments — \`/goal\`
in particular but also \`/skill\`, \`/cron\`, custom user commands.
Fix: wrap the \`exists()\`/\`is_file()\` calls in try/except OSError so
structurally-invalid path candidates cleanly return None. The slash-
command dispatch path downstream (cli.py:11718) then handles the
input correctly.
Tests: two new regression cases in test_cli_file_drop.py cover the
original \`/goal\` reproducer and a synthetic long path. All 35 file-
drop tests pass.
Reproducer (without the fix):
python -c "from cli import _detect_file_drop;
_detect_file_drop('/goal ' + 'a'*300)"
→ OSError: [Errno 63] File name too long
|
||
|
|
76074d9ee6 | fix(cli): recover classic CLI output after resize | ||
|
|
e45df2e81e | fix(ui): reduce status-line jitter while scrolling | ||
|
|
4e6f51167d | fix(cli): fall back on invalid HERMES_MAX_ITERATIONS | ||
|
|
9e0ef2a1bc |
test: pin per-turn reasoning extraction semantics
Covers four scenarios for the reasoning-box extraction loop: - simple turn with reasoning - simple turn with no reasoning - tool-calling turn where reasoning lives on the tool-call step - prior turn had reasoning, current turn does not (the stale-display bug the fix exists for) - tool-calling turn where reasoning lives on BOTH steps (latest wins) - empty-string reasoning treated as missing Also updates the four inline replica loops in tests/cli/test_reasoning_command.py to match the new turn-boundary shape so the test file reflects production semantics. |
||
|
|
aacf36e943 | fix(cli): persist manual compress handoff | ||
|
|
20428f5e60 |
fix(tui): respect voice.record_key config (supersedes #19028, #19339) (#19835)
* fix(tui): respect voice.record_key config instead of hardcoded Ctrl+B
Classic CLI loaded ``voice.record_key`` from config.yaml and bound the
prompt-toolkit handler dynamically (``cli.py`` paths). The new TUI hard-
coded ``Ctrl+B`` everywhere — ``isVoiceToggleKey`` (input handler),
``/voice status`` ("Record key: Ctrl+B"), and ``/voice on`` ("Ctrl+B to
start/stop recording"). A user who set ``voice.record_key: ctrl+o``
(or any other key) saw the documented config silently ignored — only
Ctrl+B worked, the displayed shortcut lied about it.
Wire the configured key end to end through the existing channels:
* **Backend** (``tui_gateway/server.py``): ``voice.toggle`` action=status
AND action=on/off responses now include ``record_key``, sourced from
``config.get('voice', {}).get('record_key', 'ctrl+b')``.
* **Backend types** (``ui-tui/src/gatewayTypes.ts``): ``ConfigFullResponse``
now exposes ``config.voice.record_key`` and ``VoiceToggleResponse``
carries ``record_key`` so the TUI can both bind and display it.
* **Frontend parser/formatter** (``ui-tui/src/lib/platform.ts``):
``parseVoiceRecordKey()`` accepts ``ctrl+b`` / ``alt+r`` / ``cmd+space``
and the common aliases (``option``, ``cmd``, ``win``, …); falls back to
the documented Ctrl+B for empty / multi-character / malformed input so
a typo never silently disables the shortcut. ``formatVoiceRecordKey()``
renders for status text. ``isVoiceToggleKey`` now takes a parsed
``ParsedVoiceRecordKey`` argument; the hardcoded ``ch === 'b'`` is
gone. Default arg keeps existing call sites back-compat.
* **Hydration** (``ui-tui/src/app/useConfigSync.ts``,
``useMainApp.ts``): startup ``config.get full`` already runs; extract
``cfg.voice.record_key`` from it, parse, push into a new
``voiceRecordKey`` state, and forward to the input handler ctx
(``InputHandlerContext.voice.recordKey``). Mtime-poll path also
re-applies the parsed key so a hand-edit of config.yaml takes effect
the next tick — matches existing behaviour for display options.
* **Input handler** (``ui-tui/src/app/useInputHandlers.ts``):
``isVoiceToggleKey(key, ch, voice.recordKey)`` so the configured
binding fires.
* **Slash command** (``ui-tui/src/app/slash/commands/session.ts``):
``/voice status`` and ``/voice on`` use ``formatVoiceRecordKey`` on
the response's ``record_key`` instead of the hardcoded label.
Tests:
* ``parseVoiceRecordKey`` covers ctrl/alt/cmd/super aliases, multi-char
rejection, and empty fallback.
* ``formatVoiceRecordKey`` covers the doc examples (``Ctrl+B``,
``Ctrl+O``, ``Alt+R``, ``Cmd+B``).
* ``isVoiceToggleKey`` regression: ``ctrl+o`` configured → only ``o``
matches, not ``b``; ``alt+r`` matches both alt-bit and meta-bit
encodings (terminal protocol parity); omitted-arg call still binds
Ctrl+B for back-compat.
Full TUI suite (555 tests) passes; ``tsc --noEmit`` clean.
Fixes #18994
Co-authored-by: asheriif <ahmedsherif95@gmail.com>
* fix(tui): support named-key tokens in voice.record_key (space, enter, …)
Reviewer caught that the round-1 parser in #18994 rejected every
multi-character token, so a config value like ``ctrl+space`` (which the
CLI happily binds via prompt_toolkit's ``c-space`` rewrite in
``cli.py``) silently fell back to the documented Ctrl+B default —
re-introducing the same false-shortcut bug the PR was meant to fix,
just at a different surface.
Add explicit named-key support that mirrors what the CLI accepts:
* ``space`` (alias: ``spc``) → matches ``ch === ' '``
* ``enter`` (alias: ``return``, ``ret``) → matches ``key.return``
* ``tab`` → matches ``key.tab``
* ``escape`` (alias: ``esc``) → matches ``key.escape``
* ``backspace`` (alias: ``bs``) → matches ``key.backspace``
* ``delete`` (alias: ``del``) → matches ``key.delete``
``ParsedVoiceRecordKey`` gains an optional ``named`` field; ``ch``
holds either a single char (back-compat) or the canonical named token,
and the runtime matcher dispatches on ``named`` before checking the
modifier shape. Aliases collapse to one canonical name so
``ctrl+esc`` and ``ctrl+escape`` behave identically.
Unrecognised multi-character tokens (e.g. ``ctrl+spcae`` typo, or
unsupported keys like ``ctrl+f5``) still fall back to the Ctrl+B
default rather than silently disabling the binding — keeps the "typo
never silently kills the shortcut" guarantee.
Tests:
* ``parseVoiceRecordKey`` parametrised over every named token + each
alias variant.
* New ``isVoiceToggleKey`` cases for space (ch-based match), enter
(``key.return``), tab, escape, backspace, delete, including
modifier-mismatch negatives.
* ``formatVoiceRecordKey`` renders named keys in title case
(``Ctrl+Space``, ``Ctrl+Enter``).
* Existing fall-back-to-Ctrl+B contract preserved for empty input
AND unrecognised multi-char tokens.
Full TUI suite: 559/559 pass; ``tsc --noEmit`` clean.
Refs #18994 (round-1 review feedback)
Co-authored-by: asheriif <ahmedsherif95@gmail.com>
* test(tui): assert voice.toggle returns configured record_key
Salvage the backend regression from #19339 — asserts ``voice.toggle``
action=on AND action=status responses carry the configured
``voice.record_key`` end-to-end through ``_load_cfg()``. Keeps the
CLI→TUI parity contract visible in the Python test suite alongside
the existing frontend parser/matcher/formatter coverage from #19028.
* fix(tui): address Copilot review on #19835 voice.record_key wiring
Five tightenings on the parser + matcher + hydration surface, all
caught by the Copilot review on the PR — each one turns a silent
false-fire or display/binding skew into a deterministic behaviour.
* **isVoiceToggleKey ctrl branch was too permissive for named keys.**
The doc-default macOS Cmd+B muscle-memory fallback
(``isActionMod(key)`` on top of ``key.ctrl``) fired for every
configured key, so bare Esc — which hermes-ink reports with
``key.meta`` on some macOS terminals — triggered ``ctrl+escape``,
and Alt+Space / Alt+Tab triggered ``ctrl+space`` / ``ctrl+tab``.
Gate the fallback to the literal ``ctrl+b`` binding so any custom
chord requires the real Ctrl bit.
* **Alt branch guarded against Ctrl/Cmd co-press.** Without this,
Ctrl+Alt+<letter> and Cmd+Alt+<letter> also fired ``alt+<letter>``.
* **Dropped the ``meta`` modifier variant and its alias.** In
hermes-ink ``key.meta`` is Alt on xterm-style terminals and Cmd on
legacy macOS ones, so a literal ``meta+b`` config displayed as
``Cmd+B`` while matching Alt+B — exactly the kind of false
shortcut the PR was meant to remove. ``cmd`` / ``command`` now
collapse onto ``super`` (kitty-style ``key.super``, with a macOS
``key.meta`` fallback) and render as ``Cmd+B``. Unknown modifier
tokens fall back to the documented Ctrl+B default rather than
silently coercing to Ctrl.
* **Slash-command display/binding skew.** ``/voice status`` and
``/voice on`` rendered from the fresh gateway ``record_key``
response, but ``useInputHandlers()`` still bound the old key
until the next 5s mtime poll. Thread ``setVoiceRecordKey``
through ``SlashHandlerContext.voice`` and push the parsed spec
into frontend state on every response so text and binding stay
consistent.
* **Test coverage for the two paths Copilot flagged.** Added
vitest coverage for (a) the three-case ``/voice`` slash output
in ``createSlashHandler.test.ts`` and (b) the
``applyDisplay → voice.record_key`` hydration + omit-setter
back-compat paths in ``useConfigSync.test.ts``. Plus regression
cases for every false-fire scenario above.
Suite: 575/575 green, tsc --noEmit clean.
* fix(tui): address Copilot round-2 review on #19835
Three tightenings on the surface introduced in the round-1 fix:
* **``/voice tts`` reset custom bindings to Ctrl+B.** The ``tts`` branch
of ``voice.toggle`` omitted ``record_key`` from its response, so the
frontend's ``r.record_key ?? 'ctrl+b'`` coerced a user's custom
binding back to the default on every TTS toggle. Two-sided fix:
the backend now includes ``record_key`` on the ``tts`` branch (parity
with ``status``/``on``/``off``), and the slash handler only pushes
frontend state when the response actually carries ``record_key`` —
belt-and-suspenders against any future branch forgetting to include
it.
* **``super+b`` / ``win+b`` / ``cmd+b`` displayed "Cmd+B" on Linux and
Windows.** ``formatVoiceRecordKey`` rendered ``mod === 'super'`` as
``Cmd`` universally, which told non-mac users the wrong modifier to
press even though ``isVoiceToggleKey`` matched the right event bits.
Gate the label to ``isMac`` so non-mac renders ``Super+B``.
* **``control+b`` / ``ctrl + b`` lost the macOS Cmd+B fallback.**
``_isDefaultVoiceKey`` keyed off ``parsed.raw`` — so
semantically-equal aliases of the documented default dropped into
the strict branch even though they bind Ctrl+B. Compare on the
parsed spec (mod + ch + named) instead.
Coverage added: Linux ``Super+B`` rendering (and macOS ``Cmd+B``),
``control+b`` / ``ctrl + b`` accepting the Cmd+B fallback on darwin,
``/voice tts`` without ``record_key`` not clobbering cached binding,
and a backend regression asserting every ``voice.toggle`` branch
carries the configured key.
Suite: 579/579 TUI vitest green, 2/2 backend voice tests green,
tsc --noEmit clean.
* fix(tui): address Copilot round-3 review on #19835
Three classes of robustness issue caught on the second pass — all
revolve around malformed YAML tipping ``parseVoiceRecordKey`` or
``_voice_record_key`` into a crash instead of the documented
fallback.
* **Parser crashed on non-string YAML scalars.** ``config.get full``
returns raw ``yaml.safe_load`` output, so ``voice.record_key: 1``
or ``voice.record_key: true`` in a hand-edited config would hit
``.trim()`` on a number/bool and throw, breaking startup and
every mtime re-apply. Accept ``unknown`` at the signature, guard
with ``typeof raw !== 'string'``, and fall back to the default.
* **Backend blew up on non-dict ``voice:``.** Same YAML hazard on
the gateway side: ``voice: true`` / ``voice: cmd+b`` left
``_load_cfg().get("voice")`` as a bool/str, so ``.get("record_key")``
raised AttributeError and took every ``voice.toggle`` branch down
with it. Centralised the lookup in a single
``_voice_record_key()`` helper that ``isinstance``-guards both
``voice`` and ``record_key`` and falls back to ``ctrl+b``.
* **Multi-modifier chords silently dropped extras.** The previous
validator only checked the first modifier token, so ``ctrl+alt+r``
silently parsed as ``ctrl+r`` and ``cmd+ctrl+b`` as ``super+b`` —
a typo bound a different shortcut than the user configured.
Reject multi-modifier spellings outright; the classic CLI only
supports single-modifier bindings via prompt_toolkit's ``c-x`` /
``a-x`` rewrite, so this matches CLI parity.
Coverage added:
* ``parseVoiceRecordKey`` fallback on ``1`` / ``true`` / ``null`` /
``undefined`` / ``{}``.
* ``parseVoiceRecordKey`` fallback on ``ctrl+alt+r`` /
``cmd+ctrl+b`` / ``alt+ctrl+space``.
* ``test_voice_toggle_handles_non_dict_voice_cfg`` exercises
every non-dict ``voice:`` shape (bool, str, None, int, list) and
asserts each falls back to ``record_key: 'ctrl+b'``.
Suite: 581/581 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.
* fix(tui): address Copilot round-4 review on #19835
Four final corners of the voice.record_key surface:
* **Bare-char configs silently coerced to ``ctrl+<key>``.** A config
like ``voice.record_key: o`` / ``space`` / ``escape`` fell through
to the default ``mod = 'ctrl'`` and silently bound Ctrl+O, while
the classic CLI's prompt_toolkit would bind the raw key (no
rewrite) — so the two runtimes silently disagreed on what "o"
means. Require an explicit modifier; bare-char configs fall back
to the documented Ctrl+B default.
* **Reserved ctrl+<letter> bindings would never fire.**
``useInputHandlers()`` intercepts ``ctrl+c`` (interrupt),
``ctrl+d`` (quit), and ``ctrl+l`` (clear screen) before the voice
check runs, so those configs would be advertised in /voice
status but the advertised shortcut never actually triggers
push-to-talk. Added ``_RESERVED_CTRL_CHARS`` at parse time so
the user gets the documented default instead of a dead shortcut.
(``alt+c``, ``cmd+l``, etc. are not intercepted and stay usable.)
* **``_load_cfg()`` root itself may be a non-dict.**
``_voice_record_key()`` isinstance-guarded the ``voice`` subkey
but not the root — a malformed config.yaml that collapsed to a
scalar/list at the top level (``config.yaml: true`` or ``[]``)
would still raise on ``.get("voice")``. Added the top-level
guard too so every malformed shape falls back to ``ctrl+b``.
* **Stale header comment on ``isVoiceToggleKey``.** The doc-comment
still claimed "On macOS we additionally accept the platform
action modifier (Cmd) for the configured letter" even though the
implementation gates the Cmd fallback to the documented default
only. Rewrote to match.
Coverage added:
* ``parseVoiceRecordKey`` fallback on bare chars (``o``, ``b``,
``space``, ``escape``).
* ``parseVoiceRecordKey`` fallback on ``ctrl+c`` / ``ctrl+d`` /
``ctrl+l``; positive case for ``alt+c`` / ``cmd+l`` still usable.
* Backend ``test_voice_toggle_handles_non_dict_voice_cfg`` now
exercises 5 non-dict shapes at the YAML root too.
Suite: 583/583 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.
* fix(tui): address Copilot round-5 review on #19835
Three follow-ups on the voice matcher's modifier + shift discipline:
* **``super`` branch falsely fired on Alt+<key> / bare Esc on macOS.**
``isVoiceToggleKey`` accepted ``isMac && key.meta`` as a Cmd
fallback for the ``super`` modifier — but hermes-ink sets
``key.meta`` for plain Alt/Option AND for bare Escape on some
macOS terminals. A ``cmd+b`` config silently fired on Alt+B;
``cmd+space`` on Alt+Space; ``cmd+escape`` on bare Esc. Drop the
fallback and require the literal ``key.super`` bit. Legacy-
terminal users who need Cmd should upgrade to a kitty-protocol
terminal or bind ``alt+X`` explicitly.
* **Shift bit was never checked.** The parser rejects multi-
modifier configs like ``ctrl+shift+tab``, but the runtime
matcher didn't check ``key.shift`` — so ``ctrl+tab`` also fired
on Ctrl+Shift+Tab and ``alt+enter`` on Alt+Shift+Enter.
Early-return on ``key.shift === true`` so the runtime only fires
the exact chord the user configured.
* **Test leaked ``HERMES_VOICE=1`` into later tests.**
``voice.toggle`` action=on writes to ``os.environ`` directly
(CLI parity, runtime-only flag); ``test_voice_toggle_returns_
configured_record_key`` dispatched action=on without letting
monkeypatch take ownership of the var first. Any later test
that read voice mode in the same Python process could inherit a
stale enabled state. Added ``monkeypatch.setenv("HERMES_VOICE",
"0")`` up front so monkeypatch restores the original value at
teardown.
Coverage added:
* ``cmd+b`` / ``cmd+space`` / ``cmd+escape`` do NOT fire on
``key.meta``-only events on darwin.
* ``ctrl+tab`` / ``alt+enter`` / ``ctrl+o`` reject matches when
``key.shift`` is held; sanity cases without Shift still fire.
Suite: 585/585 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.
* fix(tui): address Copilot round-6 review on #19835
Three classes of modifier-discipline tightening + one config-surface
honesty fix:
* **Default ``ctrl+b`` Cmd fallback leaked Alt+B.** The default's
macOS Cmd+B muscle-memory path used ``isActionMod(key)``, which
returns ``key.meta || key.super`` on darwin. hermes-ink also
reports plain Alt as ``key.meta``, so Alt+B silently fired the
default binding. Replaced with strict ``isMac && key.super ===
true`` — kitty-style Cmd+B still works, Alt+B correctly
rejected. Legacy-terminal mac users (Terminal.app without
CSI-u) now get raw Ctrl+B only; the documented default still
works everywhere.
* **ctrl / super branches accepted extra modifier bits.** The
parser rejects multi-modifier configs like ``ctrl+alt+o``, but
the runtime matcher was permissive — ``ctrl+o`` fired on
Ctrl+Alt+O / Ctrl+Cmd+O, and ``super+b`` fired on Cmd+Alt+B /
Ctrl+Cmd+B. Added strict ``!key.alt && !key.meta && key.super
!== true`` on ctrl, and ``!key.ctrl && !key.alt && !key.meta``
on super, so the runtime only fires the exact chord the parser
would let you configure.
* **Dropped ``cmd`` / ``command`` aliases.** They parsed to
``super`` and rendered as ``Cmd+X``, but legacy macOS terminals
report Cmd as ``key.meta`` (same signal as Alt), so a
``cmd+o`` config was advertised as working but never actually
fired on Terminal.app-without-CSI-u. That recreated the
"displayed shortcut does not work" problem this PR was meant to
remove. Users who want the platform action modifier spell it
``super`` / ``win`` — that matches the unambiguous ``key.super``
bit, and kitty-style macOS terminals render it as ``Cmd+X`` via
platform-aware formatter.
Coverage updated:
* Default ctrl+b no longer fires on Alt+B via ``key.meta`` leak;
raw Ctrl+B and kitty-style Cmd+B still fire.
* ``ctrl+o`` rejects Ctrl+Alt+O / Ctrl+Cmd+O / Ctrl+Meta+O chords.
* ``super+b`` rejects Cmd+Alt+B / Cmd+Meta+B / Ctrl+Cmd+B chords.
* ``cmd+b`` / ``command+b`` / ``meta+b`` all fall back to the
documented default at parse time (joined the ambiguous-mac-mod
rejection class).
* Round-2 expectations that asserted ``cmd+b`` parsed as super
and accepted ``key.meta`` on darwin updated to reflect the new
stricter contract.
Suite: 588/588 TUI vitest green, 3/3 backend voice tests green,
tsc --noEmit clean.
* fix(tui): address Copilot follow-up on wire typing + escape precedence
Two follow-ups from the latest Copilot pass:
* **Config wire typing honesty (`gatewayTypes.ts`)**
`config.get full` forwards raw `yaml.safe_load()` output, so
`voice.record_key` can be any scalar/container when hand-edited.
Typing it as `string` suggests a normalized contract that the
backend does not guarantee and makes unsafe callers more likely.
Change `ConfigVoiceConfig.record_key` to `unknown` with an
explicit comment that callers must normalize at runtime.
* **Escape-based voice bindings were swallowed before voice check**
`useInputHandlers()` handled `key.escape` for queue-edit cancel and
selection clear before `isVoiceToggleKey(...)`, so configured
`ctrl+escape` / `alt+escape` / `super+escape` chords were advertised
but never toggled recording in those UI states.
Add an early escape+voice check before generic Esc handlers so
escape-based voice bindings win when configured, while plain Esc
behavior remains unchanged.
Also updated PR #19835 description text to remove stale cmd/command
alias claims and match the current parser contract.
* fix(tui): pass configured voice shortcut through TextInput layer
Thread the live parsed voiceRecordKey into TextInput so configured voice.record_key chords bubble to useInputHandlers instead of being consumed as editor input. This removes the last hardcoded Ctrl+B pass-through in the composer path while preserving existing global control chord behavior.
* fix(tui): require explicit alt bit for escape-based alt chords
Hermes-ink reports bare Escape as meta=true+escape=true on some terminals, so a configured alt+escape binding was firing on bare Esc. Require an explicit key.alt bit when the configured named key is escape so plain Esc stays plain Esc; kitty-style alt+escape still fires.
* fix(tui): harden voice.record + TextInput paste + super-mod reserved list
Three round-7 Copilot follow-ups on #19835:
- voice.record start handler used _load_cfg().get('voice', {}).get(...) without
shape checks, so malformed YAML (bool/scalar/list) returned 5025 instead of
using VAD defaults. Centralized _voice_cfg_dict() helper and type-guarded
silence_threshold/silence_duration with numeric fallbacks.
- TextInput pass-through check moved above paste/copy handling so configured
voice chords (ctrl+v / alt+v / cmd+v) beat the composer's paste/copy
defaults.
- parser now also rejects super+{c,d,l,v} — on macOS those are
copy/exit/clear/paste and would be advertised in /voice status but never
actually toggle recording.
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix(tui): round-8 Copilot review — allow ctrl+x, gate super reservations to macOS, preserve voice key on transient RPC failure
Three round-8 Copilot follow-ups on #19835:
- Revert ctrl+x addition to _RESERVED_CTRL_CHARS (landed via Copilot Autofix
commit
|
||
|
|
d89e7a3cd4 |
fix(anthropic): restrict fast mode to Opus 4.6 (Anthropic API contract)
Per https://platform.claude.com/docs/en/build-with-claude/fast-mode: "Fast mode is currently supported on Opus 4.6 only. Sending speed: fast with an unsupported model returns an error." Pre-fix, _is_anthropic_fast_model() returned True for any claude-* model, so /fast on Opus 4.7 (or Sonnet/Haiku) would persist agent.service_tier=fast in config.yaml and the adapter would inject extra_body["speed"] = "fast" on every subsequent request. Opus 4.7 returns: HTTP 400: 'claude-opus-4-7' does not support the `speed` parameter. This wedged sessions across model upgrades (a user who ran /fast on Opus 4.6 and later switched the default model to 4.7 hit a hard 400 on every turn until they manually edited config.yaml). Changes: - _is_anthropic_fast_model: gate on "opus-4-6" / "opus-4.6" only - anthropic_adapter: add _supports_fast_mode predicate as defensive guard so stale request_overrides on an unsupported model are dropped silently instead of 400'ing - Tests: flip the assertions that mirrored the bug (Sonnet/Haiku/Opus 4.7 asserting fast-mode support) to match the documented API contract |
||
|
|
026a5e47df | fix(cli): preserve Windows hidden-dir paths in markdown | ||
|
|
5b6d413476 |
fix(cli,gateway): surface title errors from /new <name>
The contributor's PR silently swallowed ValueError from SessionDB.set_session_title() with bare except Exception: pass. Users typing /new <title> with an already-in-use title got an untitled session and no feedback. Changes: - cli.py: catch ValueError from both sanitize_title() and set_session_title(); print the error and mark the session untitled in the banner (never echo the rejected title back). - gateway/run.py: append a warning note to the reset reply on title rejection; reflect the accepted title in the header. - Add regression tests for the duplicate-title path in CLI and gateway. Also map exx@example.com -> @exxmen in scripts/release.py. |
||
|
|
f720751d79 |
feat(cli,gateway): /new accepts optional session name argument
Allow users to start a fresh session and immediately set its title by
passing a name to /new (or /reset):
/new Refactor auth module
Changes:
- hermes_cli/commands.py: add args_hint='[name]' to /new command
- cli.py: parse title argument in process_command(), pass to new_session()
- cli.py: new_session() accepts title=None, sets title via SessionDB
- gateway/run.py: _handle_reset_command() parses title, sets on new entry
- gateway/session.py: reset_session() accepts optional display_name
- tests: add test_new_session_with_title, test_reset_command_with_title,
test_new_command_in_help_output
All 36 affected tests pass.
|
||
|
|
e3461e0b2a |
fix(cli): remove dead 'q' check from quit command resolution
The 'q' alias is defined for 'queue' command in commands.py:93.
The hardcoded 'q' in cli.py:5910 was dead code - resolve_command('q')
returns the queue CommandDef, so canonical would never be 'q'.
Removes the misleading check without changing any behavior:
- /quit and /exit still exit (defined aliases)
- /q still maps to queue (as intended)
|
||
|
|
a11aed1acc |
fix(cli): local backend CLI always uses launch directory, stops .env sync of TERMINAL_CWD (#19334)
The old CWD heuristic was fooled by: 1. TERMINAL_CWD persisted to .env by `hermes config set terminal.cwd` 2. Inherited TERMINAL_CWD from parent hermes processes 3. Only resolved when config had a placeholder value (not explicit paths) Fix: - load_cli_config() unconditionally uses os.getcwd() for local backend - TERMINAL_CWD always force-exported in CLI mode (overrides stale values) - Gateway sets _HERMES_GATEWAY=1 marker so lazy cli.py imports don't clobber - Remove terminal.cwd from config-set .env sync map (prevents re-poisoning) - Clarify setup wizard label as 'Gateway working directory' Closes #19214 |
||
|
|
167b5648ea |
Revert "fix(cli): CLI/TUI on local backend always uses launch directory, ignores terminal.cwd (#19242)" (#19329)
This reverts commit
|
||
|
|
9eaddfafa3 |
fix(cli): CLI/TUI on local backend always uses launch directory, ignores terminal.cwd (#19242)
CLI/TUI sessions on the local backend now unconditionally use os.getcwd() as the working directory. The terminal.cwd config value is only consumed by gateway/cron/delegation modes (where there's no shell to cd from). Previously, 'hermes setup' would write an absolute path (e.g. $HOME) into terminal.cwd which then pinned the CLI to that directory regardless of where the user launched hermes from. This was a silent foot-gun — the user's 'cd' was being ignored. Changes: 1. cli.py: Restructured CWD resolution — if TERMINAL_CWD is not already set by the gateway, and the backend is local, always use os.getcwd(). Config terminal.cwd is irrelevant for interactive CLI/TUI sessions. 2. setup.py: Moved the cwd prompt from setup_terminal_backend() to setup_gateway(). It now only appears when configuring messaging platforms and is labeled 'Gateway working directory'. 3. Tests: Rewrote test_cwd_env_respect.py to validate the new behavior: explicit config paths are ignored for CLI, gateway pre-set values are preserved, non-local backends keep their config paths. 4. Docs: Updated configuration.md, profiles.md, and environment-variables.md to clarify that terminal.cwd only affects gateway/cron mode on local backend. Closes #19214 |
||
|
|
f0dc919f92 |
fix(compression): include system prompt + tool schemas in token estimates (#18265)
The user-visible /compress banner and the post-compression last_prompt_tokens writeback both counted only the raw message transcript (chars/4). With a 15KB system prompt and 30 tool schemas (~26KB), a 4-message transcript that looks like ~45 tokens to the transcript-only estimator is really ~10.5K tokens of request pressure — a 234x gap. Two user-facing consequences: - Banner shows 'Compressing … (~45 tokens)…' while compression is actually firing on 10K+ tokens of real pressure, confusing users about why compression triggered (reported by @codecovenant on X; #6217). - Post-compression last_prompt_tokens writeback omits tool schemas, so the next should_compress() check compares real usage against a stale underestimate — compression triggers late, potentially past the model's context limit on small-context models (#14695). Swap estimate_messages_tokens_rough() for estimate_request_tokens_rough() at every user-visible banner and at the post-compression writeback. estimate_request_tokens_rough() already existed for exactly this purpose and includes system prompt + tool schemas. Touched call sites: - run_agent.py: post-compression last_prompt_tokens writeback, post-tool call should_compress() fallback when provider usage is missing - cli.py: /compress banner + summary - gateway/run.py: gateway /compress banner + summary - tui_gateway/server.py: TUI /compress status + summary - acp_adapter/server.py: ACP /compact before/after Left intentionally alone: - Session-hygiene fallback and the 'no agent' /status path in gateway/run.py — no agent instance is in scope to query for system prompt/tools, and the existing 30-50% overestimate wobble on hygiene is safety-accepted. - Verbose-mode 'Request size' logging — informational only, already counts system prompt via api_messages[0]. Also relabels the feedback line from 'Rough transcript estimate' to 'Approx request size' so the metric label matches what it actually measures. Credits: diagnoses from @devilardis (#14695) and @Jackten (#6217); user report @codecovenant on X (2026-04-30). Closes #14695 Closes #6217 |
||
|
|
80a676658c |
fix(cli): surface self-improvement review summaries from bg thread
When the self-improvement background review fires after a turn, it runs in a bg thread and emits a ' 💾 <summary>' line to announce what it saved to memory or skills. Two problems made this invisible to users even when the review successfully modified a skill: 1. The print went through `_cprint` (prompt_toolkit's print_formatted_text) on a bg thread while the CLI's PromptSession was live. Direct print_formatted_text races with the input-area redraw and the line can land behind/above the prompt, scrolled off without the user seeing it. 2. The message said only '💾 Skill created.' / '💾 Memory updated' with no indication that the self-improvement loop was the one doing this. Users who did catch the line couldn't tell the background review from some other agent action. Fixes: - `_cprint` now detects when it's called from a non-app thread with a running prompt_toolkit Application, and routes through `run_in_terminal` via `loop.call_soon_threadsafe`. That pauses the input, prints the line above the prompt, and redraws — the normal prompt_toolkit contract for bg-thread output. Direct-print fallback preserved for the no-app / same-thread / import-error paths. Affects every bg-thread emission, not just the review summary (curator summaries and auxiliary failure prints benefit too). - The summary now reads ' 💾 Self-improvement review: <summary>' in both the CLI and the gateway `background_review_callback` path, so the origin is unambiguous. Tests: - New `tests/cli/test_cprint_bg_thread.py` covers all five routing branches (no app, app-not-running, cross-thread schedule, same-thread direct, app-loop-attribute-error, import-error). - New case in `tests/run_agent/test_background_review.py` asserts the attributed prefix shows up in both `_safe_print` and `background_review_callback`. Live E2E: exercised _cprint from a bg thread inside a real Application event loop; confirmed get_app_or_none() sees the app, call_soon_threadsafe schedules run_in_terminal, and the inner _pt_print runs. |
||
|
|
285e9efb3f |
Merge pull request #17701 from NousResearch/bb/mouse-mode-self-heal
fix(cli): recover leaked mouse tracking terminal state |
||
|
|
0dd373ec43 |
fix(context): honor model.context_length for Ollama num_ctx and all display paths
When a user sets model.context_length in config.yaml, the value was only used for Hermes' internal compression decisions (context_compressor) but NOT for Ollama's num_ctx parameter. Ollama auto-detects context from GGUF metadata (often 256K+) and allocates that much VRAM regardless of the user's config — causing OOM on smaller GPUs like the P100 (16GB). Root cause: two separate context values existed independently: - context_compressor.context_length = config value (e.g. 65536) ✓ - _ollama_num_ctx = GGUF metadata value (e.g. 256000) ✗ ignored config Changes: 1. Cap Ollama num_ctx to config context_length (run_agent.py) When model.context_length is explicitly set and no explicit ollama_num_ctx override exists, cap the auto-detected GGUF value to the user's context_length. This is the core fix — it prevents Ollama from allocating more VRAM than the user budgeted. 2. Pass config_context_length through all secondary call sites Several paths called get_model_context_length() without the config override, falling through to the 256K default fallback: - cli.py: @-reference expansion and /model switch display - gateway/run.py: @-reference expansion and /model switch display - tui_gateway/server.py: @-reference expansion - hermes_cli/model_switch.py: resolve_display_context_length() 3. Normalize root-level context_length in config (hermes_cli/config.py) _normalize_root_model_keys() now migrates root-level context_length into the model section, matching existing behavior for provider and base_url. Users who wrote `context_length: 65536` at the YAML root instead of under `model:` had it silently ignored. 4. Fix misleading comments (agent/model_metadata.py) DEFAULT_FALLBACK_CONTEXT is 256K (CONTEXT_PROBE_TIERS[0]), not 128K as two comments stated. Tests: 3 new tests for root-level context_length normalization. All existing context_length tests pass (96 tests). |