hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-21 03:39:54 +00:00

Files

Teknium 6cb9917c73 perf(compression): defer feasibility check to first compression attempt (#28957)

`AIAgent.__init__` was eagerly calling
`_check_compression_model_feasibility()` which probes the auxiliary
provider chain and runs `get_model_context_length()` (potentially
network-bound) to decide whether the configured auxiliary model can
fit a full compression-threshold window. That cost ~440ms cold on
every agent construction.

Most `chat -q` invocations finish in 1-5 seconds and never accumulate
enough context to trip the compression threshold, so the feasibility
check is pure overhead. The result is also only consumed when
compression actually fires (the function adjusts the live threshold
downward if the aux model can't fit; absent that mutation, the gate
in `conversation_loop.py:442` would never fire anyway).

Defer to first `compress_context()` call via
`agent._compression_feasibility_checked` sentinel. Runs at most once
per agent lifetime, just before the first compression pass. The
warning storage (`_compression_warning`) and gateway replay
machinery are unchanged: the warning still goes to status_callback
on the first turn that actually needs compression.
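
The deferral described above can be sketched roughly as follows. This is
a minimal illustration, not the actual `AIAgent` code: the class name
`LazyFeasibilityAgent`, the `probe` callable, the `probe_calls` counter
and the placeholder compression step are all hypothetical. Only the
names `_compression_feasibility_checked`, `_compression_warning`,
`_check_compression_model_feasibility()` and `compress_context()` come
from the commit message. Setting the sentinel before running the check
is an assumption made here to keep the "at most once" guarantee even if
the probe raises.

```python
from typing import Callable, List, Optional


class LazyFeasibilityAgent:
    """Hypothetical stand-in for AIAgent; only the deferral pattern is shown."""

    def __init__(self, probe: Callable[[], Optional[str]]) -> None:
        # `probe` stands in for the expensive provider-chain / context-length
        # lookup. It is NOT called during construction.
        self._probe = probe
        self._compression_feasibility_checked = False
        self._compression_warning: Optional[str] = None
        self.probe_calls = 0  # illustration only: counts how often the probe ran

    def _check_compression_model_feasibility(self) -> None:
        # Run the expensive probe and store any warning it produces.
        self.probe_calls += 1
        self._compression_warning = self._probe()

    def compress_context(self, messages: List[str]) -> List[str]:
        # Lazy one-shot check: runs just before the first compression pass.
        if not self._compression_feasibility_checked:
            # Assumption: flip the sentinel first so a failing probe
            # is not retried on every later compression.
            self._compression_feasibility_checked = True
            self._check_compression_model_feasibility()
        # Placeholder "compression": keep only the last two messages.
        return messages[-2:]
```

With this shape, constructing the agent costs nothing extra, and a
session that never reaches `compress_context()` never pays for the
probe.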

E2E timing (chat -q 'hi', 3 runs each):
                BEFORE   AFTER    delta
  median wall   2.03s    1.86s    -8% (-169ms)
  min wall      1.92s    1.63s    -15% (-293ms)

Real cold-start observation (synthetic 31-turn agent loop): identical
behavior since feasibility check fires once on first compression and
caches. No semantic difference for sessions that DO compress.

UX trade-off: users with broken auxiliary-provider config no longer
see the warning at session start. They see it when compression first
fires — which is exactly when it matters. For users with working
config (the vast majority), the warning never fires anyway, so the
deferral is invisible.

Tests:
- tests/run_agent/test_compression_feasibility.py — 16/16 pass
  (the one test that asserted call-at-init was updated to drive the
  lazy check explicitly via agent._check_compression_model_feasibility())
- Live tmux session: 2-turn conversation + tool call completes clean,
  zero errors in agent.log

2026-05-19 17:27:17 -07:00

lsp

chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)

2026-05-17 02:29:41 -07:00

transports

fix(codex): allow kanban worker board writes

2026-05-17 11:50:43 -07:00

__init__.py

Refactor Terminal and AIAgent cleanup

2026-02-21 22:31:43 -08:00

account_usage.py

chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)

2026-05-11 11:13:25 -07:00

agent_init.py

perf(compression): defer feasibility check to first compression attempt (#28957)

2026-05-19 17:27:17 -07:00

agent_runtime_helpers.py

fix(agent): set tool_name on tool-result messages at construction time

2026-05-19 20:49:11 +01:00

anthropic_adapter.py

feat(azure-foundry): add Microsoft Entra ID auth

2026-05-18 10:14:38 -07:00

async_utils.py

fix(async): close unscheduled coroutines in all threadsafe bridges (#26584)

2026-05-15 14:00:01 -07:00

auxiliary_client.py

fix(xai-oauth): pin inference base_url to x.ai origin (#28952)

2026-05-19 14:51:21 -07:00

azure_identity_adapter.py

feat(azure-foundry): add Microsoft Entra ID auth

2026-05-18 10:14:38 -07:00

background_review.py

feat(bg-review): add bundled/pinned skill protection rules to review prompts (#27644)

2026-05-18 20:02:22 -07:00

bedrock_adapter.py

chore(deps): lazy-install boto3/botocore for bedrock adapter

2026-05-17 02:31:18 -07:00

browser_provider.py

fix(browser): self-review pass — dead-import, log levels, future-proofing

2026-05-17 04:04:15 -07:00

browser_registry.py

fix(browser): self-review pass — dead-import, log levels, future-proofing

2026-05-17 04:04:15 -07:00

chat_completion_helpers.py

fix(xai-responses): strip enum values containing '/' from tool schemas

2026-05-18 10:37:35 -07:00

codex_responses_adapter.py

fix(xai-oauth): recover from prelude SSE errors, gate reasoning replay, surface entitlement 403s (#26644)

2026-05-15 16:35:12 -07:00

codex_runtime.py

fix(xai): surface provider 'error' SSE frame in Codex fallback stream (#27184)

2026-05-16 23:41:09 -07:00

context_compressor.py

fix(compress): make abort-on-summary-failure opt-in via config flag (#28117)

2026-05-18 10:28:20 -07:00

context_engine.py

fix(compression): keep default protect_first_n at 3 + align ABC

2026-05-13 22:25:16 -07:00

context_references.py

fix(agent): fall back when rg is blocked for @folder references

2026-04-20 01:56:41 -07:00

conversation_compression.py

perf(compression): defer feasibility check to first compression attempt (#28957)

2026-05-19 17:27:17 -07:00

conversation_loop.py

fix: wrap _pool_may_recover_from_rate_limit call through run_agent namespace

2026-05-18 20:04:57 -07:00

copilot_acp_client.py

fix: guard yaml.safe_load, flock unlock, TOCTOU races, and atomic writes

2026-05-19 00:12:41 -07:00

credential_pool.py

fix(codex-oauth): quarantine terminal refresh errors so dead tokens are not replayed across sessions

2026-05-18 10:31:40 -07:00

credential_sources.py

feat(xai-oauth): add xAI Grok OAuth (SuperGrok Subscription) provider

2026-05-15 12:11:32 -07:00

curator_backup.py

fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731)

2026-05-02 01:29:57 -07:00

curator.py

feat(curator): hint at hermes curator pin in the rename block (#23212)

2026-05-10 06:44:53 -07:00

display.py

chore: remove Atropos RL environments and tinker-atropos integration (#26106)

2026-05-15 10:36:38 +05:30

error_classifier.py

fix(error_classifier): classify xAI Grok entitlement SSE errors as auth

2026-05-18 10:24:13 -07:00

file_safety.py

fix(security): apply file safety to copilot acp fs

2026-04-21 01:31:58 -07:00

gemini_cloudcode_adapter.py

fix(agent/gemini-cloudcode): seed delta defaults for reasoning-only stream chunks

2026-05-14 08:03:56 -07:00

gemini_native_adapter.py

fix(auxiliary): evict async wrappers on poisoned client (follow-up to #23482)

2026-05-11 11:13:20 -07:00

gemini_schema.py

chore: remove unused imports and dead locals (ruff F401, F841) (#17010)

2026-04-28 06:46:45 -07:00

google_code_assist.py

chore: remove unused imports and dead locals (ruff F401, F841) (#17010)

2026-04-28 06:46:45 -07:00

google_oauth.py

fix(google_oauth): close TOCTOU window when saving credentials

2026-05-04 03:16:19 -07:00

i18n.py

feat(i18n): localize all gateway commands + web dashboard, add 8 new locales (16 total) (#22914)

2026-05-10 07:14:14 -07:00

image_gen_provider.py

feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)

2026-04-21 21:30:10 -07:00

image_gen_registry.py

fix(plugins): filter resolution by is_available() in web + image_gen registries

2026-05-13 22:31:28 -07:00

image_routing.py

chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)

2026-05-11 11:13:25 -07:00

insights.py

Merge branch 'main' into feat/dashboard-skill-analytics

2026-04-20 05:25:49 -07:00

iteration_budget.py

refactor(run_agent): extract OpenAI proxy, safe stdio, IterationBudget

2026-05-16 17:59:32 -07:00

lmstudio_reasoning.py

feat(agent): add lmstudio integration

2026-04-28 12:27:36 -07:00

manual_compression_feedback.py

fix(compression): include system prompt + tool schemas in token estimates (#18265)

2026-04-30 23:03:54 -07:00

markdown_tables.py

fix(cli): vertical fallback for markdown tables wider than terminal (#23948)

2026-05-11 16:49:13 -07:00

memory_manager.py

🐛 fix(memory): require newline after context tag

2026-05-18 10:53:08 -07:00

memory_provider.py

docs(agent): remove stale BuiltinMemoryProvider references from memory module docstrings

2026-05-05 13:33:49 -07:00

message_sanitization.py

refactor(run_agent): extract message sanitization to agent/message_sanitization.py

2026-05-16 17:41:09 -07:00

model_metadata.py

fix(metadata): qwen3.6-plus has a 1M context window (#27008)

2026-05-17 02:31:18 -07:00

models_dev.py

feat: add NovitaAI as LLM provider

2026-05-13 23:51:15 -07:00

moonshot_schema.py

fix(moonshot): strip $ref siblings and collapse tuple items in tool schemas (#27104)

2026-05-16 13:02:19 -07:00

nous_rate_guard.py

codebase: add encoding='utf-8' to all bare open() calls (PLW1514)

2026-05-08 14:27:40 -07:00

onboarding.py

docs(onboarding): lead OpenClaw residue banner with migrate, warn that cleanup breaks OpenClaw (#17507)

2026-04-29 08:08:36 -07:00

plugin_llm.py

feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194)

2026-05-10 07:09:28 -07:00

portal_tags.py

feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779)

2026-05-12 20:49:20 -07:00

process_bootstrap.py

refactor(run_agent): extract OpenAI proxy, safe stdio, IterationBudget

2026-05-16 17:59:32 -07:00

prompt_builder.py

fix(kanban): stale reclaim must not tick failure counter (#28680)

2026-05-19 03:15:18 -07:00

prompt_caching.py

fix(cache): kill long-lived prefix layout — system prompt is now byte-static within a session (#24778)

2026-05-12 20:46:04 -07:00

rate_limit_tracker.py

refactor: remove dead code — 1,784 lines across 77 files (#9180)

2026-04-13 16:32:04 -07:00

redact.py

perf(agent-loop): cut 47% of per-conversation function calls via 3 targeted hot-path optimizations (#28866)

2026-05-19 14:25:10 -07:00

retry_utils.py

feat(agent): add jittered retry backoff

2026-04-08 00:41:36 -07:00

shell_hooks.py

fix: guard yaml.safe_load, flock unlock, TOCTOU races, and atomic writes

2026-05-19 00:12:41 -07:00

skill_bundles.py

feat(skills): add skill bundles — alias /<name> loads multiple skills (#28373)

2026-05-18 21:38:05 -07:00

skill_commands.py

fix(skills): load symlinked skill slash commands

2026-05-18 00:34:29 -07:00

skill_preprocessing.py

fix: treat inline-shell timeout guard as timeout

2026-05-18 19:36:04 -07:00

skill_utils.py

perf(cli): cut ~19s from 'hermes' cold start (skills cache + lazy Feishu + no Nous HTTP) (#22138)

2026-05-08 16:39:32 -07:00

stream_diag.py

refactor(run_agent): extract stream diagnostics to agent/stream_diag.py

2026-05-16 18:28:17 -07:00

subdirectory_hints.py

fix(agent): catch PermissionError in subdirectory hint discovery

2026-04-09 03:10:30 -07:00

system_prompt.py

perf(prompt): cache kanban worker guidance at session init

2026-05-18 20:56:44 -07:00

think_scrubber.py

fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) (#20184)

2026-05-05 04:33:38 -07:00

title_generator.py

fix: improve telegram topic mode setup

2026-05-04 12:07:17 -07:00

tool_dispatch_helpers.py

fix(agent): set tool_name on tool-result messages at construction time

2026-05-19 20:49:11 +01:00

tool_executor.py

fix(agent): set tool_name on tool-result messages at construction time

2026-05-19 20:49:11 +01:00

tool_guardrails.py

fix: add recovery hints to loop guard warnings

2026-05-19 00:12:12 -07:00

tool_result_classification.py

fix: classify landed file mutations with diagnostics

2026-05-13 06:46:23 -07:00

trajectory.py

Refactor Terminal and AIAgent cleanup

2026-02-21 22:31:43 -08:00

usage_pricing.py

fix(pricing): add deepseek-v4-pro to official docs pricing table

2026-05-12 16:32:57 -07:00

video_gen_provider.py

feat(video_gen): unified video_generate tool with pluggable provider backends (#25126)

2026-05-13 16:39:41 -07:00

video_gen_registry.py

feat(video_gen): unified video_generate tool with pluggable provider backends (#25126)

2026-05-13 16:39:41 -07:00

web_search_provider.py

fix(web): align _LEGACY_PREFERENCE with legacy 7-provider order + doc cleanup

2026-05-13 22:31:28 -07:00

web_search_registry.py

fix(web): align _LEGACY_PREFERENCE with legacy 7-provider order + doc cleanup

2026-05-13 22:31:28 -07:00