hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-21 03:39:54 +00:00

Teknium 49bd89dc78 fix(codex-runtime): correct protocol field names found via live e2e test

Three real bugs caught only by running a turn end-to-end against codex
0.130.0 with a real ChatGPT subscription. Unit tests passed because they
asserted on our own (incorrect) wire shapes; the wire format from
codex-rs/app-server-protocol/src/protocol/v2/* is the source of truth and
my initial reading of the README was incomplete.

Bug 1: thread/start.permissions wire format

Was sending {"profileId": "workspace-write"}.
Real format per PermissionProfileSelectionParams enum (tagged union):
  {"type": "profile", "id": "workspace-write"}
AND requires the experimentalApi capability declared during initialize.
AND requires a matching [permissions] table in ~/.codex/config.toml or
codex fails the request with 'default_permissions requires a [permissions]
table'.
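
The shape difference above can be sketched in a few lines. This is only an illustration: the two payloads are the ones quoted above, while `missing_tag_error` is a hypothetical stand-in for the check a tagged-union deserializer performs, not codex's actual code.

```python
import json
from typing import Optional

# Payload shapes quoted in the commit message above.
OLD_PERMISSIONS = {"profileId": "workspace-write"}              # what was being sent
NEW_PERMISSIONS = {"type": "profile", "id": "workspace-write"}  # tagged-union form


def missing_tag_error(permissions: dict) -> Optional[str]:
    """Hypothetical stand-in: a tagged union is rejected when its 'type' tag is absent."""
    if "type" not in permissions:
        return "missing field type"
    return None


print(missing_tag_error(OLD_PERMISSIONS))  # missing field type
print(missing_tag_error(NEW_PERMISSIONS))  # None
print(json.dumps(NEW_PERMISSIONS))
```

A unit test that asserts on OLD_PERMISSIONS passes just as readily as one that asserts on NEW_PERMISSIONS, which is why only a live turn surfaced the mismatch.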

Fix: stop overriding permissions on thread/start. Codex picks its default
profile (read-only unless user configures otherwise), which matches what
codex CLI users expect — they configure their default permission profile
in ~/.codex/config.toml the standard way. Trying to be clever about
profile selection broke every turn we tested.

Live error before fix: 'Invalid request: missing field type' on every
turn/start, even though our turn/start payload was correct. The field
codex was complaining about was inside the permissions sub-object we
shouldn't have been sending.

Bug 2: server-request method names

Was matching 'execCommandApproval' and 'applyPatchApproval'.
Real names per common.rs ServerRequest enum:
  item/commandExecution/requestApproval
  item/fileChange/requestApproval
  item/permissions/requestApproval (new third method)

Fix: match the documented names. Added handler for
item/permissions/requestApproval that always declines — codex sometimes
asks to escalate permissions mid-turn and silent acceptance would surprise
users.
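
A minimal sketch of the dispatch described above, under stated assumptions: the three method names come from this message, and -32601 is the standard JSON-RPC "Method not found" code. The function name, the `ask_user` callback, and the `{"id": ..., "result": {"decision": ...}}` reply envelope are hypothetical and may not match the real handler or the real wire shape.

```python
from typing import Any, Callable, Dict

METHOD_NOT_FOUND = -32601  # JSON-RPC 2.0 "Method not found"

# Method names as listed in the commit message.
APPROVAL_METHODS = (
    "item/commandExecution/requestApproval",
    "item/fileChange/requestApproval",
)
PERMISSIONS_METHOD = "item/permissions/requestApproval"


def handle_server_request(
    request: Dict[str, Any],
    ask_user: Callable[[str], str],
) -> Dict[str, Any]:
    """Route a server request to an approval decision (reply envelope is assumed)."""
    method = request.get("method")
    reply: Dict[str, Any] = {"id": request.get("id")}
    if method in APPROVAL_METHODS:
        # Defer to the user-facing prompt for command and file-change approvals.
        reply["result"] = {"decision": ask_user(method)}
    elif method == PERMISSIONS_METHOD:
        # Mid-turn permission escalation is always declined.
        reply["result"] = {"decision": "decline"}
    else:
        reply["error"] = {
            "code": METHOD_NOT_FOUND,
            "message": f"Unknown codex server request: {method}",
        }
    return reply
```

With the old names in the match table, every real approval request fell through to the last branch, which is the -32601 reply described in the live symptom.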

Live symptom before fix: agent.log showed
'Unknown codex server request: item/commandExecution/requestApproval'
and codex stalled because we replied with -32601 (unsupported method)
instead of an approval decision. The agent reported back 'The write
command was rejected' even though Hermes never showed the user an
approval prompt.

Bug 3: approval decision values

Was sending decision strings 'approved'/'approvedForSession'/'denied'.
Real values per CommandExecutionApprovalDecision enum (camelCase):
  accept, acceptForSession, decline, cancel
(also AcceptWithExecpolicyAmendment and ApplyNetworkPolicyAmendment
variants we don't currently use).

Fix: rename _approval_choice_to_codex_decision return values; update
auto_approve_* fallbacks; update fail-closed default from 'denied' to
'decline'. Test mapping table updated to match.
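
The renamed mapping with its fail-closed default might look roughly like the sketch below. The decision strings are the ones listed above. The choice keys ("once", "session", "deny") and the function name are hypothetical placeholders, since the actual Hermes choice identifiers are not shown in this message.

```python
# Hypothetical user-choice keys mapped to the decision values listed above.
_CHOICE_TO_DECISION = {
    "once": "accept",
    "session": "acceptForSession",
    "deny": "decline",
}


def approval_choice_to_codex_decision(choice: str) -> str:
    """Map a user choice to a decision string; unknown input fails closed."""
    return _CHOICE_TO_DECISION.get(choice, "decline")


print(approval_choice_to_codex_decision("once"))     # accept
print(approval_choice_to_codex_decision("session"))  # acceptForSession
print(approval_choice_to_codex_decision("typo"))     # decline
```

Defaulting to 'decline' means an unrecognized or missing choice can never turn into an approval.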

Live test verified after fixes:
  $ hermes (with model.openai_runtime: codex_app_server)
  > Run the shell command: echo hermes-codex-livetest > .../proof.txt
    then read it back

  Approval prompt fired with 'Codex requests exec in <cwd>'.
  User chose 'Allow once'. Codex executed the command, wrote the file,
  read it back. Final response: 'Read back from proof.txt:
  hermes-codex-livetest'. File contents on disk match.

agent.log confirms:
  codex app-server thread started: id=019e200e profile=workspace-write
                                    cwd=/tmp/hermes-codex-livetest/workspace

All 20 session tests still green after wire-format updates.

2026-05-12 23:41:53 -07:00

transports

fix(codex-runtime): correct protocol field names found via live e2e test

2026-05-12 23:41:53 -07:00

__init__.py

test: add unit tests for 8 modules (batch 2)

2026-02-26 13:54:20 +03:00

test_anthropic_adapter.py

fix: avoid unsupported anthropic context beta by default

2026-05-07 05:43:20 -07:00

test_anthropic_keychain.py

fix: re-auth on stale OAuth token; read Claude Code credentials from macOS Keychain

2026-04-24 07:14:00 -07:00

test_arcee_trinity_overrides.py

test(arcee): cover Trinity Large Thinking temperature + compression overrides

2026-05-05 17:23:45 -07:00

test_auxiliary_client_anthropic_custom.py

fix(anthropic): complete third-party Anthropic-compatible provider support (#12846)

2026-04-19 22:43:09 -07:00

test_auxiliary_client.py

fix(auxiliary): evict async wrappers on poisoned client (follow-up to #23482)

2026-05-11 11:13:20 -07:00

test_auxiliary_config_bridge.py

fix(tests): pin UTF-8 encoding when reading source files on Windows

2026-05-09 02:47:28 -07:00

test_auxiliary_main_first.py

fix(copilot): send vision header for Copilot vision requests

2026-04-27 08:35:50 -07:00

test_auxiliary_named_custom_providers.py

fix(fallback): let custom_providers shadow built-in aliases

2026-04-30 20:18:44 -07:00

test_auxiliary_transport_autodetect.py

fix(auxiliary): auto-detect Anthropic Messages transport for all aux clients (#17027)

2026-04-28 06:50:14 -07:00

test_bedrock_1m_context.py

test: remove 50 stale/broken tests to unblock CI (#22098)

2026-05-08 14:55:40 -07:00

test_bedrock_adapter.py

fix(bedrock): preserve reasoningContent across converse normalization

2026-05-07 05:17:16 -07:00

test_bedrock_integration.py

fix(agent): handle aws_sdk auth type in resolve_provider_client

2026-04-24 07:26:07 -07:00

test_codex_cloudflare_headers.py

fix(aux): remove hardcoded Codex fallback model, drop Codex from auto chain (#17765)

2026-04-29 23:23:50 -07:00

test_compress_focus.py

fix: resolve CI test failures — add missing functions, fix stale tests (#9483)

2026-04-14 01:43:45 -07:00

test_compressor_image_tokens.py

feat(image-input): native multimodal routing based on model vision capability (#16506)

2026-04-27 06:27:59 -07:00

test_context_compressor_summary_continuity.py

fix(compression): preserve iterative summary continuity

2026-05-05 04:42:44 -07:00

test_context_compressor.py

fix(context_compressor): treat streaming premature-close as transient error

2026-05-09 17:52:51 -07:00

test_context_engine.py

feat: wire context engine plugin slot into agent and plugin system

2026-04-10 19:15:50 -07:00

test_context_references.py

fix(agent): fall back when rg is blocked for @folder references

2026-04-20 01:56:41 -07:00

test_copilot_acp_client.py

fix(ci): recover 38 failing tests on main (#17642)

2026-04-29 20:05:32 -07:00

test_credential_pool_routing.py

refactor: remove smart_model_routing feature (#12732)

2026-04-19 18:12:55 -07:00

test_credential_pool.py

fix(auth): shorten credential 401 cooldown

2026-05-07 06:15:33 -07:00

test_crossloop_client_cache.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_curator_activity.py

fix: use skill activity in curator status

2026-04-30 10:31:47 -07:00

test_curator_backup.py

fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731)

2026-05-02 01:29:57 -07:00

test_curator_classification.py

feat(curator): hint at hermes curator pin in the rename block (#23212)

2026-05-10 06:44:53 -07:00

test_curator_reports.py

fix(curator): rewrite cron job skill refs after consolidation (#18253)

2026-04-30 23:04:50 -07:00

test_curator.py

fix(skills): keep manual skills out of curator

2026-05-04 02:19:28 -07:00

test_deepseek_anthropic_thinking.py

test(anthropic): regression guard for DeepSeek /anthropic thinking replay

2026-04-29 08:10:29 -07:00

test_direct_provider_url_detection.py

fix: restrict provider URL detection to exact hostname matches

2026-04-20 22:14:29 -07:00

test_display_emoji.py

feat(tools): centralize tool emoji metadata in registry + skin integration

2026-03-15 20:21:21 -07:00

test_display.py

fix(cli): honor positive tool preview length

2026-05-07 05:26:28 -07:00

test_error_classifier.py

fix(error_classifier): classify generic-typed timeout messages as transient (carve-out of #22664)

2026-05-09 17:54:07 -07:00

test_external_skills_dirs_cache.py

perf(cli): cut ~19s from 'hermes' cold start (skills cache + lazy Feishu + no Nous HTTP) (#22138)

2026-05-08 16:39:32 -07:00

test_external_skills.py

feat(skills): support external skill directories via config (#3678)

2026-03-29 00:33:30 -07:00

test_gemini_cloudcode.py

fix(gemini): assign unique stream indices to parallel tool calls

2026-04-20 02:10:53 -07:00

test_gemini_fast_fallback.py

Prefer fallback for Gemini CloudCode rate limits

2026-05-05 10:14:48 -07:00

test_gemini_free_tier_gate.py

feat(gemini): block free-tier keys at setup + surface guidance on 429 (#15100)

2026-04-24 04:46:17 -07:00

test_gemini_native_adapter.py

fix(gemini): fail fast on missing API key + surface it in hermes dump (#15133)

2026-04-24 05:35:17 -07:00

test_gemini_schema.py

fix(gemini): drop integer/number/boolean enums from tool schemas (#15082)

2026-04-24 03:40:00 -07:00

test_i18n.py

feat(i18n): localize all gateway commands + web dashboard, add 8 new locales (16 total) (#22914)

2026-05-10 07:14:14 -07:00

test_image_gen_registry.py

feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)

2026-04-21 21:30:10 -07:00

test_image_routing.py

fix(image-routing): sniff magic bytes for image MIME, ignore misleading suffix

2026-05-07 05:58:11 -07:00

test_insights.py

test: stop testing mutable data — convert change-detectors to invariants (#13363)

2026-04-20 23:20:33 -07:00

test_kimi_coding_anthropic_thinking.py

fix(anthropic): broaden Kimi thinking-suppression to custom endpoints (#17455)

2026-04-29 06:35:42 -07:00

test_local_stream_timeout.py

fix(agent): recognize Tailscale CGNAT (100.64.0.0/10) as local for Ollama timeouts

2026-04-22 14:46:10 -07:00

test_markdown_tables.py

fix(cli): vertical fallback for markdown tables wider than terminal (#23948)

2026-05-11 16:49:13 -07:00

test_memory_provider.py

fix(memory): add write origin metadata

2026-04-24 14:37:55 -07:00

test_memory_session_switch.py

feat(hindsight): probe API for update_mode='append' support, dedupe across processes

2026-05-05 15:09:59 -07:00

test_memory_user_id.py

feat(hindsight): richer session-scoped retain metadata

2026-04-22 05:27:10 -07:00

test_minimax_auxiliary_url.py

fix: provider/model resolution — salvage 4 PRs + MiniMax aux URL fix (#5983)

2026-04-07 22:23:28 -07:00

test_minimax_provider.py

feat: provider modules — ProviderProfile ABC, 33 providers, fetch_models, transport single-path

2026-05-05 13:40:01 -07:00

test_model_metadata_local_ctx.py

fix(tui): show correct context length

2026-04-28 12:27:36 -07:00

test_model_metadata_ssl.py

fix(auth): honor SSL CA env vars across httpx + requests callsites

2026-04-24 03:00:33 -07:00

test_model_metadata.py

fix(model-metadata): set codex-spark fallback context to 128k

2026-05-09 23:17:25 -07:00

test_models_dev.py

perf(models_dev): cache-first lookup, skip network when disk cache is fresh (#22808)

2026-05-09 13:32:38 -07:00

test_moonshot_schema.py

fix(moonshot): also strip nullable/enum after anyOf collapse

2026-04-30 23:14:31 -07:00

test_nous_rate_guard.py

fix(nous): don't trip cross-session rate breaker on upstream-capacity 429s (#15898)

2026-04-26 04:53:42 -07:00

test_onboarding.py

docs(onboarding): lead OpenClaw residue banner with migrate, warn that cleanup breaks OpenClaw (#17507)

2026-04-29 08:08:36 -07:00

test_openrouter_response_cache.py

fix(openrouter): use canonical X-Title attribution header

2026-05-05 10:13:34 -07:00

test_plugin_llm.py

feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194)

2026-05-10 07:09:28 -07:00

test_prompt_builder.py

fix(webui): add platform hint for MEDIA rendering

2026-05-09 02:22:40 -07:00

test_prompt_caching_live.py

feat(prompt-cache): cross-session 1h prefix cache for Claude on Anthropic / OpenRouter / Nous Portal (#23828)

2026-05-11 11:14:56 -07:00

test_prompt_caching.py

feat(prompt-cache): cross-session 1h prefix cache for Claude on Anthropic / OpenRouter / Nous Portal (#23828)

2026-05-11 11:14:56 -07:00

test_proxy_and_url_validation.py

fix(agent): normalize socks:// env proxies for httpx/anthropic

2026-04-21 05:52:46 -07:00

test_rate_limit_tracker.py

feat: capture provider rate limit headers and show in /usage (#6541)

2026-04-09 03:43:14 -07:00

test_redact.py

feat: replace kimi-k2.5 with kimi-k2.6 on OpenRouter and Nous Portal (#13148)

2026-04-20 11:49:54 -07:00

test_shell_hooks_consent.py

fix(shell_hooks): parse hooks_auto_accept as strict bool/string, not bool() (#16322)

2026-04-26 20:48:35 -07:00

test_shell_hooks.py

feat: shell hooks — wire shell scripts as Hermes hook callbacks

2026-04-20 20:53:51 -07:00

test_skill_commands_reload.py

refactor(reload-skills): queue note for next turn, drop cache invalidation + agent tool

2026-04-29 21:07:47 -07:00

test_skill_commands.py

test(skills): cover additional rescan paths in skill_commands cache (#14536)

2026-05-07 04:59:43 -07:00

test_skill_utils.py

test(skill_utils): add regression tests for non-dict metadata in extract_skill_conditions

2026-04-30 20:37:15 -07:00

test_streaming_context_scrubber.py

style: trim verbose comment blocks added by previous commit

2026-04-27 12:37:33 -07:00

test_subagent_progress.py

feat(delegate): orchestrator role and configurable spawn depth (default flat)

2026-04-21 14:23:45 -07:00

test_subagent_stop_hook.py

feat: shell hooks — wire shell scripts as Hermes hook callbacks

2026-04-20 20:53:51 -07:00

test_subdirectory_hints.py

fix(agent): catch PermissionError in subdirectory hint discovery

2026-04-09 03:10:30 -07:00

test_think_scrubber.py

fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) (#20184)

2026-05-05 04:33:38 -07:00

test_title_generator.py

fix: improve telegram topic mode setup

2026-05-04 12:07:17 -07:00

test_tool_guardrails.py

fix(agent): make tool loop guardrails warning-first

2026-04-30 20:43:15 -07:00

test_unsupported_parameter_retry.py

test: remove 50 stale/broken tests to unblock CI (#22098)

2026-05-08 14:55:40 -07:00

test_unsupported_temperature_retry.py

refactor(memory): remove flush_memories entirely (#15696)

2026-04-25 08:21:14 -07:00

test_usage_pricing.py

fix(usage): read top-level Anthropic cache fields from OAI-compatible proxies

2026-04-22 17:40:49 -07:00

test_vision_resolved_args.py

fix(vision): preserve explicit provider auth with custom base_url

2026-05-04 05:05:43 -07:00