hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-21 03:39:54 +00:00

Files

T

Teknium1 93a0fe6495 feat(codex-runtime): wire codex_app_server runtime into AIAgent

The integration commit. AIAgent.run_conversation() now early-returns to a
new helper _run_codex_app_server_turn() when self.api_mode ==
'codex_app_server', bypassing the chat_completions tool loop entirely.

Three small surgical edits to run_agent.py (~105 LOC total):

1. Line ~1204 (constructor api_mode validation set):
   Add 'codex_app_server' so an explicit api_mode='codex_app_server'
   passed to AIAgent() isn't silently rewritten to 'chat_completions'.

2. Line ~12048 (run_conversation, just before the while loop):
   Early-return to _run_codex_app_server_turn() when self.api_mode is
   'codex_app_server'. Placed AFTER all standard pre-loop setup —
   logging context, session DB, surrogate sanitization, _user_turn_count
   and _turns_since_memory increments, _ext_prefetch_cache, memory
   manager on_turn_start — so behavior outside the model-call loop is
   identical between paths. Default Hermes flow is unchanged when the
   flag is off.

3. End-of-class (line ~15497):
   New method _run_codex_app_server_turn(). Lazy-instantiates one
   CodexAppServerSession per AIAgent (reused across turns), runs the
   turn, splices projected_messages into messages, increments
   _iters_since_skill by tool_iterations (since the chat_completions
   loop normally does that per iteration), fires
   _spawn_background_review on the same cadence as the default path.

Counter accounting:

  _turns_since_memory  ← already incremented at run_conversation:11817
                         (gated on memory store configured) — codex
                         helper does NOT touch it (would double-count).
  _user_turn_count     ← already incremented at run_conversation:11793
                         — codex helper does NOT touch it.
  _iters_since_skill   ← incremented in the chat_completions loop per
                         tool iteration. Codex helper increments by
                         turn.tool_iterations since the loop is bypassed.

User message:

  ALREADY appended to messages by run_conversation pre-loop (line 11823)
  before the early-return reaches us. Helper does NOT append again.
  Regression test test_user_message_not_duplicated guards this.

Approval callback wiring:

  Lazy-fetches tools.terminal_tool._get_approval_callback at session
  spawn time, passes to CodexAppServerSession. CLI threads with
  prompt_toolkit get interactive approvals; gateway/cron contexts get
  the codex-side fail-closed deny.

Error path:

  Codex session exceptions become a 'partial' result with completed=False
  and a final_response that explicitly tells the user how to switch back:
  'Codex app-server turn failed: ... Fall back to default runtime with
  /codex-runtime auto.' Same return-dict shape as the chat_completions
  path so all callers (gateway, CLI, batch_runner, ACP) work unchanged.

9 new integration tests in tests/run_agent/test_codex_app_server_integration.py:
  - api_mode='codex_app_server' is accepted on AIAgent construction
  - run_conversation returns the expected codex shape
    (final_response, codex_thread_id, codex_turn_id, completed, partial)
  - Projected messages are spliced into messages list
  - _iters_since_skill ticks per tool iteration
  - _user_turn_count delegated to standard flow (not double-counted)
  - User message appears exactly once (regression guard)
  - _spawn_background_review IS invoked (memory/skill review keeps working)
  - chat.completions.create is NEVER called (loop fully bypassed)
  - Session exception → partial result with /codex-runtime auto hint
  - Interrupted turn → partial result with error preserved

Adjacent test runs confirm no regressions:
  - tests/run_agent/test_memory_nudge_counter_hydration.py: green
  - tests/run_agent/test_background_review.py: green
  - tests/run_agent/test_fallback_model.py: green
  - tests/agent/transports/: 249/249 green

Still missing for full feature: /codex-runtime slash command, plugin
migration helper, docs page, live e2e test gated on codex binary. Those
are the remaining followup commits.

2026-05-12 10:26:26 -07:00

__init__.py

…

conftest.py

test: speed up slow tests (backoff + subprocess + IMDS network) (#11797 )

2026-04-17 14:21:22 -07:00

test_413_compression.py

fix(agent): surface preflight compression status

2026-05-04 01:41:51 -07:00

test_860_dedup.py

fix: lazy session creation — defer DB row until first message (#18370 )

2026-05-01 18:39:12 +05:30

test_1630_context_overflow_loop.py

fix(tests): make AIAgent constructor calls self-contained (#11755 )

2026-04-17 12:32:03 -07:00

test_agent_guardrails.py

fix(agent): include name field on every role:tool message for Gemini compatibility (#16478 )

2026-05-04 05:06:33 -07:00

test_agent_loop_tool_calling.py

…

test_agent_loop_vllm.py

…

test_agent_loop.py

…

test_anthropic_error_handling.py

feat(providers): extend request_timeout_seconds to all client paths

2026-04-19 11:23:00 -07:00

test_anthropic_prompt_cache_policy.py

fix(cache): route Nous Portal Qwen through Portal-Claude cache pathway (#24151 )

2026-05-11 21:04:55 -07:00

test_anthropic_third_party_oauth_guard.py

fix(anthropic): complete third-party Anthropic-compatible provider support (#12846 )

2026-04-19 22:43:09 -07:00

test_anthropic_truncation_continuation.py

refactor: remove _nr_to_assistant_message shim + fix flush_memories guard

2026-04-23 02:30:05 -07:00

test_api_max_retries_config.py

feat(agent): make API retry count configurable via agent.api_max_retries (#14730 )

2026-04-23 13:59:32 -07:00

test_async_httpx_del_neuter.py

fix(copilot): send vision header for Copilot vision requests

2026-04-27 08:35:50 -07:00

test_background_review_summary.py

fix(agent): exclude prior-history tool messages from background review summary

2026-04-24 03:10:19 -07:00

test_background_review_toolset_restriction.py

fix(ci): stabilize main test suite regressions (#17660 )

2026-04-29 23:18:55 -07:00

test_background_review.py

fix(cli): surface self-improvement review summaries from bg thread

2026-04-30 14:07:22 -07:00

test_codex_app_server_integration.py

feat(codex-runtime): wire codex_app_server runtime into AIAgent

2026-05-12 10:26:26 -07:00

test_codex_multimodal_tool_result.py

feat(vision): vision_analyze returns pixels to vision-capable models, not aux text (#22955 )

2026-05-09 21:06:19 -07:00

test_commit_memory_session_context_engine.py

fix(agent): notify context engine on commit_memory_session (#22764 )

2026-05-09 12:28:42 -07:00

test_compress_focus_plugin_fallback.py

refactor(memory): remove flush_memories entirely (#15696 )

2026-04-25 08:21:14 -07:00

test_compression_boundary_hook.py

fix: signal compression boundary to context engine

2026-04-26 19:07:18 -07:00

test_compression_boundary.py

…

test_compression_feasibility.py

refactor(memory): remove flush_memories entirely (#15696 )

2026-04-25 08:21:14 -07:00

test_compression_persistence.py

fix(tests): make AIAgent constructor calls self-contained (#11755 )

2026-04-17 12:32:03 -07:00

test_compression_trigger_excludes_reasoning.py

fix(compression): exclude completion tokens from compression trigger (#12026 )

2026-04-20 05:12:10 -07:00

test_compressor_fallback_update.py

…

test_concurrent_interrupt.py

test: remove 50 stale/broken tests to unblock CI (#22098 )

2026-05-08 14:55:40 -07:00

test_context_token_tracking.py

feat(providers): extend request_timeout_seconds to all client paths

2026-04-19 11:23:00 -07:00

test_copilot_native_vision_headers.py

fix(copilot): mark native image requests as vision

2026-04-27 08:35:50 -07:00

test_create_openai_client_kwargs_isolation.py

fix(tests): make AIAgent constructor calls self-contained (#11755 )

2026-04-17 12:32:03 -07:00

test_create_openai_client_proxy_env.py

test(proxy): regression tests for NO_PROXY bypass on keepalive client

2026-04-24 03:04:42 -07:00

test_create_openai_client_reuse.py

fix(tests): make AIAgent constructor calls self-contained (#11755 )

2026-04-17 12:32:03 -07:00

test_deepseek_reasoning_content_echo.py

fix(deepseek): use non-empty reasoning_content placeholder for V4 Pro thinking mode

2026-04-30 23:04:23 -07:00

test_deepseek_v4_thinking_live.py

fix(deepseek): preserve v4 reasoning_content on replay

2026-04-30 11:18:39 -07:00

test_dict_tool_call_args.py

fix(tests): fix 78 CI test failures and remove dead test (#9036 )

2026-04-13 10:50:24 -07:00

test_empty_response_recovery_persistence.py

fix(run_agent): break permanent empty-response loop from orphan tool-tail (#21385 )

2026-05-07 08:35:10 -07:00

test_exit_cleanup_interrupt.py

test: speed up slow tests (backoff + subprocess + IMDS network) (#11797 )

2026-04-17 14:21:22 -07:00

test_fallback_model.py

fix(fallback): resolve api_key_env in fallback chain entries (carve-out of #22665 )

2026-05-09 17:53:56 -07:00

test_image_rejection_fallback.py

fix(agent): catch ChatGPT-account Codex data-URL rejection so images are stripped instead of cascading to compression (#23602 )

2026-05-11 07:37:22 -07:00

test_image_shrink_recovery.py

feat(image-input): native multimodal routing based on model vision capability (#16506 )

2026-04-27 06:27:59 -07:00

test_init_fallback_on_exhausted_pool.py

fix(agent): try fallback providers at init when primary credential pool is exhausted (#17929 )

2026-05-02 02:09:46 -07:00

test_interactive_interrupt.py

…

test_interrupt_propagation.py

test: stop testing mutable data — convert change-detectors to invariants (#13363 )

2026-04-20 23:20:33 -07:00

test_invalid_context_length_warning.py

fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation

2026-04-15 22:05:21 -07:00

test_iteration_budget_race.py

fix(run_agent): acquire lock in IterationBudget.used property

2026-05-04 12:37:28 -07:00

test_jsondecodeerror_retryable.py

fix(agent): retry on json.JSONDecodeError instead of treating it as a local validation error (#15107 )

2026-04-24 05:02:58 -07:00

test_last_reasoning_per_turn.py

test: pin per-turn reasoning extraction semantics

2026-05-05 05:00:05 -07:00

test_long_context_tier_429.py

…

test_materialize_data_url_cleanup.py

fix(misc): three small defensive fixes from PR #1974

2026-05-10 22:28:01 -07:00

test_memory_nudge_counter_hydration.py

fix(agent): hydrate memory-nudge counters from conversation_history (#22774 )

2026-05-09 12:48:03 -07:00

test_memory_provider_init.py

fix(memory): keep Honcho provider opt-in

2026-04-18 22:50:55 -07:00

test_memory_sync_interrupted.py

feat(memory): notify providers on mid-process session_id rotation (#17409 )

2026-04-29 04:57:22 -07:00

test_message_sequence_repair.py

fix(run_agent): break permanent empty-response loop from orphan tool-tail (#21385 )

2026-05-07 08:35:10 -07:00

test_openai_client_lifecycle.py

…

test_percentage_clamp.py

…

test_plugin_context_engine_init.py

fix(tests): make AIAgent constructor calls self-contained (#11755 )

2026-04-17 12:32:03 -07:00

test_primary_runtime_restore.py

fix(agent): only set rate-limit cooldown when leaving primary; add tests

2026-04-24 05:35:43 -07:00

test_provider_attribution_headers.py

refactor(gmi): move User-Agent to profile.default_headers

2026-05-08 03:22:11 -07:00

test_provider_fallback.py

fix(fallback): skip chain entries matching current provider/model/base_url (#22780 )

2026-05-09 12:48:19 -07:00

test_provider_parity.py

fix(aux): remove hardcoded Codex fallback model, drop Codex from auto chain (#17765 )

2026-04-29 23:23:50 -07:00

test_real_interrupt_subagent.py

fix(tests): fix 78 CI test failures and remove dead test (#9036 )

2026-04-13 10:50:24 -07:00

test_redirect_stdout_issue.py

…

test_repair_tool_call_arguments.py

fix(run_agent): handle unescaped control chars in tool_call arguments (#15356 )

2026-04-24 15:06:41 -07:00

test_repair_tool_call_name.py

fix(agent): repair CamelCase + _tool suffix tool-call emissions (#15124 )

2026-04-24 05:32:08 -07:00

test_review_prompt_class_first.py

fix(review): tell background reviewer not to capture transient env failures as skills (#23004 )

2026-05-09 22:51:25 -07:00

test_run_agent_codex_responses.py

fix(memory): drop scrub from interim commentary + final response

2026-04-27 12:37:33 -07:00

test_run_agent_multimodal_prologue.py

refactor: unify transport dispatch + collapse normalize shims

2026-04-22 18:34:25 -07:00

test_run_agent.py

fix(kanban): call kanban_block on iteration-budget exhaustion to prevent protocol violation

2026-05-11 06:44:58 -07:00

test_sequential_chats_live.py

test: regression guards for the keepalive/transport bug class (#10933 ) (#11266 )

2026-04-16 16:36:33 -07:00

test_session_id_env.py

feat: expose HERMES_SESSION_ID to agent tools via ContextVar + env (#23847 )

2026-05-12 00:16:45 +05:30

test_session_meta_filtering.py

…

test_session_reset_fix.py

…

test_steer.py

refactor(steer): simplify injection marker to 'User guidance:' prefix (#13340 )

2026-04-20 22:18:49 -07:00

test_stream_drop_logging.py

feat(stream-retry): add upstream + timing diagnostics to drop log (#23005 )

2026-05-09 22:49:35 -07:00

test_stream_interrupt_retry.py

fix: /stop now immediately aborts streaming retry loop

2026-04-25 09:51:39 -07:00

test_streaming_tool_call_repair.py

fix: repair malformed tool call args in streaming assembly before flagging as truncated

2026-04-24 15:03:07 -07:00

test_streaming.py

fix(copilot-acp): disable streaming path for CopilotACPClient

2026-04-28 11:33:07 -07:00

test_strict_api_validation.py

…

test_strip_reasoning_tags_cli.py

fix(display): strip standalone tool-call XML tags from visible text

2026-04-22 18:12:42 -07:00

test_switch_model_context.py

…

test_switch_model_fallback_prune.py

fix(agent): default missing fallback chain on switch

2026-04-24 05:35:43 -07:00

test_thinking_only_sanitizer.py

fix(agent): drop thinking-only assistant turns before provider call (#16959 )

2026-04-28 03:50:51 -07:00

test_token_persistence_non_cli.py

fix: make session search initialize session db

2026-05-09 14:36:58 -07:00

test_tool_arg_coercion.py

fix(tools): wrap bare scalars in single-element list for array-typed args

2026-05-04 05:00:37 -07:00

test_tool_call_args_sanitizer.py

fix(agent): include name field on every role:tool message for Gemini compatibility (#16478 )

2026-05-04 05:06:33 -07:00

test_tool_call_guardrail_runtime.py

fix(agent): make tool loop guardrails warning-first

2026-04-30 20:43:15 -07:00

test_tool_executor_contextvar_propagation.py

fix(agent): propagate ContextVars to concurrent tool worker threads (#18123 )

2026-04-30 16:26:26 -07:00

test_unicode_ascii_codec.py

fix: always retry on ASCII codec UnicodeEncodeError — don't gate on per-component sanitization

2026-04-15 15:03:28 -07:00

test_vision_aware_preprocessing.py

feat(image-input): native multimodal routing based on model vision capability (#16506 )

2026-04-27 06:27:59 -07:00