hermes-agent

mirror of https://github.com/NousResearch/hermes-agent.git synced 2026-05-21 03:39:54 +00:00

Author	SHA1	Message	Date
kshitijk4poor	c74ff2c8ef	fix(browser): self-review pass — dead-import, log levels, future-proofing Addresses findings from two self-review passes pre-merge. First pass (3-agent parallel review): 1. plugins/browser/browser_use/provider.py: drop the ``_ = managed_nous_tools_enabled`` dead-import-hider in _get_config_or_none(). The import was actively misleading — the helper IS used in _get_config() (separate method, separate import), not here. The "keep static analysis happy" comment was wrong about what the helper does in this scope. 2. agent/browser_provider.py: drop ``pragma: no cover`` from is_configured() / provider_name() backward-compat aliases. They ARE covered by ``TestLegacyAbcAliases`` — the pragma would have masked future regressions. 3. tools/browser_tool.py: refactor _is_legacy_provider_registry_overridden() to compare against a module-frozen _DEFAULT_PROVIDER_REGISTRY snapshot instead of hardcoded set of 3 keys. Future maintainers adding a 4th built-in provider now just extend _PROVIDER_REGISTRY; the override detection adapts automatically. Previously the hardcoded ``set(...) != {"browserbase", "browser-use", "firecrawl"}`` would flip True forever on any 4-key registry, silently routing every install onto the legacy fixture path. 4. tools/browser_tool.py: when explicit ``browser.cloud_provider`` is set but the registry has no matching plugin (typo, uninstalled plugin, discovery failure), emit a WARNING with actionable text instead of silently falling through to auto-detect. Legacy code surfaced a typed credentials error via direct class instantiation; this log restores the signal in the post-migration path. 5. agent/browser_registry.py: trim the triple-redundant _LEGACY_PREFERENCE documentation. Module docstring + 13-line block-comment + 5-line inline comment was repeating the same point. Kept the docstring and trimmed the block-comment to 5 lines. 6. agent/browser_registry.py: upgrade is_available()-raised logging from DEBUG to WARNING with exc_info=True. A provider's availability check throwing is unusual enough that users debugging "no cloud provider" need the traceback in logs. 7. tests/plugins/browser/check_parity_vs_main.py: drop dead top-level imports (os, shutil, tempfile — only referenced inside the SUBPROCESS_SCRIPT string literal that runs in a child process). Second pass (architecture + claim-verification review): 8. tools/browser_tool.py: rewrite the inline comment in _get_cloud_provider auto-detect branch. Prior text claimed it "routes through the plugin registry's legacy preference walk so third-party plugins still get a chance to be selected when they're explicitly configured" — false on both counts. The branch uses module-level legacy class aliases (BrowserUseProvider / BrowserbaseProvider) directly; third-party plugins are intentionally reachable only via explicit ``browser.cloud_provider``. Corrected comment now matches behaviour and cross-references _LEGACY_PREFERENCE for the firecrawl gate rationale. 9. tools/browser_tool.py + tests/tools/test_managed_browserbase_and_modal.py: drop the unused ``get_active_browser_provider as _registry_get_active_browser_provider`` alias from the ``from agent.browser_registry import ...`` block. It was never referenced; matching test-stub line in the agent.browser_registry SimpleNamespace also dropped. ``get_provider`` is still imported (used by the explicit-config dispatch path at line 535). 10. plugins/browser/firecrawl/provider.py: align emergency_cleanup() with the early-guard pattern used in browserbase + browser_use plugins. Previously firecrawl tried the DELETE and relied on ``_headers()`` raising ValueError to trip a "missing credentials" warning; same final outcome but a different control flow that read like a bug to a maintainer skimming the three modules. Now: if is_available() is False, log+return early — identical shape to the other two providers. Verification: 54/54 unit tests + 13/13 parity scenarios still pass.	2026-05-17 04:04:15 -07:00
kshitijk4poor	1bb6f03724	fix(browser): ensure plugin discovery before registry lookup; parity harness Two changes that go together: 1. tools/browser_tool.py — add _ensure_browser_plugins_loaded() and call it from _get_cloud_provider() before consulting the registry. Normally model_tools triggers discover_plugins() as an import side-effect, but _get_cloud_provider() can be reached from contexts that haven't gone through model_tools (standalone scripts, certain unit-test paths, the new parity-sweep harness). Without the defensive call, the registry is empty and _registry_get_browser_provider() returns None — silently downgrading users to local mode when they explicitly configured a cloud provider with no credentials yet. The behavior-parity sweep below caught this as 4 scenario regressions (explicit-X-no-creds for all 3 providers, and explicit-firecrawl-with-creds). 2. tests/plugins/browser/check_parity_vs_main.py — subprocess harness that pins one Python invocation to origin/main and one to this PR's worktree via sys.path.insert(), runs _get_cloud_provider() across a 13-scenario config matrix, and diffs the reduced shape tuple (is_local, provider_name, is_available). Provider_name pulls from provider.provider_name() which is the legacy CloudBrowserProvider API and remains as a backward-compat alias on the new BrowserProvider ABC, so the comparison is apples-to-apples regardless of class identity. Final result: PARITY OK across 13 scenarios. The four observable config/credential matrices that exercise the dispatcher all match origin/main bit-for-bit: - no-config + no-env → local - explicit local + any env → local - explicit BB / BU / FC + no creds → provider returned with is_available()==False (so dispatcher surfaces typed credentials error; matches main exactly) - explicit BB / BU / FC + creds → provider returned with is_available()==True - no-config + BU creds → Browser Use - no-config + BB creds → Browserbase - no-config + both → Browser Use (legacy walk first hit) - no-config + FC only → local (firecrawl NOT in legacy walk) - no-config + FC + BB → Browserbase (legacy walk skips firecrawl) Per the dev skill's "behavior-parity for refactor PRs" rule — without this subprocess sweep, 31/31 unit tests pass while the production code path is silently broken for users who type `browser.cloud_provider: browserbase` and run a single browser command without prior model_tools import. Caught + fixed before push.	2026-05-17 04:04:15 -07:00
kshitijk4poor	250caebeb1	refactor(browser): delete tools/browser_providers/ directory; migrate tests The four files in tools/browser_providers/ (base.py, browserbase.py, browser_use.py, firecrawl.py) have been migrated into plugins/browser/<vendor>/provider.py over the previous commits. No in-tree code references them anymore — the legacy class names (BrowserbaseProvider / BrowserUseProvider / FirecrawlProvider) are re-exported from tools.browser_tool as aliases to the plugin classes, so existing test patches keep working. Updates tests/tools/test_managed_browserbase_and_modal.py: - Adds _load_plugin_module() helper next to _load_tool_module(). - Reroutes five _load_tool_module('tools.browser_providers.X', ...) calls to _load_plugin_module('plugins.browser.X.provider', ...). - Renames BrowserbaseProvider/BrowserUseProvider -> the new plugin class names (BrowserbaseBrowserProvider / BrowserUseBrowserProvider). - Updates is_configured() -> is_available() on the one assertion that cared about the rename (the others stay on is_configured() via the BrowserProvider ABC's backward-compat alias). Net diff: -630 / +39 lines (tests + dead-code deletion). Verified 23/23 tests in test_browser_cloud_*.py + test_managed_browserbase_and_modal.py still pass. Closes the file-tree mismatch portion of #25214. Remaining work: new plugin-level test coverage under tests/plugins/browser/, behaviour parity subprocess sweep vs origin/main, and full tests/tools/ regression sweep before opening the PR.	2026-05-17 04:04:15 -07:00
kshitijk4poor	40fde853fa	refactor(browser): dispatch _get_cloud_provider through agent.browser_registry Switches tools.browser_tool's cloud-provider lookup from the hardcoded _PROVIDER_REGISTRY class-instantiation pattern to the agent.browser_registry singleton registry that plugins self-populate. Changes: - tools/browser_tool.py top imports: pull BrowserProvider from agent.browser_provider (re-exported as CloudBrowserProvider for legacy callers) and the three provider classes from plugins/browser/<vendor>/. Legacy class names (BrowserbaseProvider, BrowserUseProvider, FirecrawlProvider) remain on tools.browser_tool as re-export shims so existing test patches (monkeypatch.setattr(browser_tool, 'BrowserUseProvider', ...)) keep working. - _get_cloud_provider() now consults agent.browser_registry.get_provider() for explicit-config lookups. The auto-detect fallback still uses BrowserUseProvider() / BrowserbaseProvider() at the module level so the cache-policy test fixtures (which patch those names) keep driving the function. Test-time _PROVIDER_REGISTRY overrides are detected by class identity and routed through the legacy factory-call path. - agent/browser_provider.py: BrowserProvider grows is_configured() and provider_name() as thin backward-compat aliases for the legacy CloudBrowserProvider API. Subclasses MUST implement is_available() and name; the aliases delegate. This keeps ~6 caller sites in browser_tool.py working without churning them. - tests/tools/test_managed_browserbase_and_modal.py: _install_fake_tools_package grows stubs for agent.browser_provider / agent.browser_registry / plugins.browser.<vendor>.provider so the test's spec-loader path (sys.modules-reset + reload-tool-from-disk) can satisfy tools.browser_tool's top-level imports. Verified: all 23 existing tests in test_browser_cloud_*.py + test_managed_browserbase_and_modal.py still pass post-cutover. The legacy tools/browser_providers/ directory is NOT yet deleted; several tests still _load_tool_module() those files via spec_from_file_location. The deletion + test-path updates land in a later commit.	2026-05-17 04:04:15 -07:00
kshitij	5fba236644	chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355 ) Six days after #23937 (608 fixes) the codebase had accumulated 241 new PLR6201 violations. Same mechanical `x in (...)` → `x in {...}` fix, same zero-risk profile: set lookup is O(1) vs O(n) for tuple and the two are semantically equivalent for hashable scalar membership tests. All 241 instances fixed via `ruff check --select PLR6201 --fix --unsafe-fixes`, zero remaining. Every changed value is a hashable scalar (str/int/None/enum/signal); no risk of unhashable runtime errors. No behavior change. Test plan: - 119 files changed, +244/-244 (net zero) — exactly one-line edits - `ruff check` clean afterward - Compile checks pass on the largest touched files (cli.py, run_agent.py, gateway/run.py, gateway/platforms/discord.py, model_tools.py) - Subset broad test run on tests/gateway/ tests/hermes_cli/ tests/agent/ tests/tools/: 18187 passed, 59 pre-existing failures (verified against origin/main with the same shape — identical failure count, identical category — all xdist test-order flakes unrelated to this change) Follows the same template as PR #23937 ([tracker: #23972](https://github.com/NousResearch/hermes-agent/issues/23972)).	2026-05-17 02:29:41 -07:00
EloquentBrush0x	ad00777f04	fix(mcp-oauth): print SSH tunnel hint in _redirect_handler When Hermes runs on a remote host over SSH, MCP OAuth loopback flows silently fail: the OAuth provider redirects the user's browser to http://127.0.0.1:<port>/callback, which reaches the callback server on the remote machine — not the local machine where the browser is running. _redirect_handler already detected SSH (via _can_open_browser) and printed "Headless environment detected — open the URL manually." but gave no guidance on how to actually reach the callback server. Users got silent timeouts or "Could not establish connection" errors. This is the same bug fixed for xAI-oauth and Spotify in #26592, which added _print_loopback_ssh_hint() in hermes_cli/auth.py. mcp_oauth.py uses the identical loopback callback pattern (http://127.0.0.1:<port>/callback via _configure_callback_port / _wait_for_callback) but was missing the hint. Fix: when SSH_CLIENT or SSH_TTY is set and _oauth_port is available, print the ssh -N -L port-forward command and the OAuth-over-SSH guide URL to stderr, consistent with the rest of _redirect_handler's output. Tests: 4 new cases in TestRedirectHandlerSshHint covering SSH_CLIENT, SSH_TTY, local session (no hint), and missing _oauth_port (no hint).	2026-05-17 02:29:37 -07:00
Grogger	8bf09455dc	fix(windows): suppress console window flash on subprocess spawns Add creationflags=CREATE_NO_WINDOW to every Windows Popen call across the terminal, process registry, code execution, and kanban worker subsystems. Prevents visible CMD windows from flashing on the user's desktop during agent operation. Also adds the _IS_WINDOWS module constant to kanban_db.py where it was missing, for consistency with the other patched files. 5 Popen sites across 4 files: - tools/environments/local.py (terminal foreground spawn) - tools/process_registry.py (background process spawn) - tools/code_execution_tool.py (sandbox + interpreter probe) - hermes_cli/kanban_db.py (kanban worker spawn)	2026-05-16 23:05:27 -07:00
flooryyyy	7d09bb1915	fix(delegate): tool_trace false-positive error detection for short outputs	2026-05-16 22:54:22 -07:00
0xchainer	60531889d5	fix: remove unused import and hoist module-level constant - Remove unused from tools/tts_tool.py (dead code) - Move _BUILTIN_DELIVER_PLATFORMS set from send() method to module scope in gateway/platforms/webhook.py to avoid reallocation on every call	2026-05-16 22:49:54 -07:00
Teknium	fb05f5d4b5	fix(mcp): validate remote URLs up-front with a clear error (#27105 ) Port from anomalyco/opencode#25019 ("fix: handle invalid mcp urls"). Previously: a typo in `config.yaml` (missing scheme, wrong scheme, empty string, non-string value) slipped past `_is_http()` and hit `httpx.URL(url)` or `streamablehttp_client(url, ...)` deep in the transport layer. That raised a generic exception which went through the reconnect-backoff loop, so a bad URL caused _MAX_INITIAL_CONNECT_RETRIES attempts with doubling backoff — about a minute of pointless retries plus an opaque error — before the server was marked failed. Now: we validate the URL once, at the top of `run()`, before entering the retry loop. A malformed URL raises `InvalidMcpUrlError` (a `ValueError` subclass) with a message that names the offending server and explains exactly what was wrong. `_ready` is set and `_error` is populated, so `start()` re-raises and the server shows up as failed in `hermes mcp list` without any backoff burn. Validation rules: - Must be a string (rejects None, dict, int) - Must be non-empty (rejects '' and whitespace-only) - Scheme must be http or https (rejects file://, ws://, stdio://) - Must have a non-empty host (rejects http:///, http://:8080) Tests (21 new cases in tests/tools/test_mcp_invalid_url.py): - TestValidUrlsAccepted: http, https, IPv6, ports, paths, query strings - TestInvalidUrlsRejected: every rejection path above + clear error text - TestErrorIsValueError: downstream code catching ValueError still works E2E verified: a misconfigured server with `url: not-a-valid-url` now fails in <0.001s with the clear error, instead of minutes of retries. Doesn't touch stdio servers (they use `command`, not `url`) — the validator only fires when `_is_http()` returns True.	2026-05-16 13:06:56 -07:00
Teknium	d725407c56	security(deps): bump aiohttp, anthropic, cryptography to CVE-fixed versions (#26830 ) Closes #10695. Picks up the still-vulnerable Python pins on current main: - aiohttp 3.13.3 -> 3.13.4 (messaging, slack, homeassistant, sms extras + lazy_deps platform.slack) — CVE-2026-34513 (DNS cache exhaustion), CVE-2026-34518 (cookie/proxy-auth leak on cross-origin redirect, relevant for the gateway since it handles OAuth tokens), CVE-2026-34519 (response reason injection), CVE-2026-34520 (null bytes in headers), CVE-2026-34525 (multiple Host headers). - anthropic 0.86.0 -> 0.87.0 (anthropic extra + lazy_deps provider.anthropic) — CVE-2026-34450 (memory tool files created mode 0o666), CVE-2026-34452 (path-traversal in async local-filesystem memory tool). Not directly exploitable since hermes-agent doesn't use the SDK's filesystem memory tool, but the SDK is bumped for hygiene. - cryptography pinned explicitly at 46.0.7 in core dependencies — CVE-2026-39892 (buffer overflow on non-contiguous buffers). Previously came in transitively via PyJWT[crypto]; the explicit floor keeps the WeCom/Weixin crypto paths from drifting below the fix. curl-cffi from the original issue is no longer in pyproject.toml or uv.lock, so no action needed there. uv.lock regenerated cleanly; only aiohttp / anthropic / cryptography moved. Credit: original issue + scoping by @shaun0927 (#10695, #10701). Floor analysis and packaging-surface audit by @gnanirahulnutakki (#10784), adapted to current main's exact-pin style. Co-authored-by: shaun0927 <shaun0927@users.noreply.github.com> Co-authored-by: Gnani Rahul Nutakki <gnanirahulnutakki@users.noreply.github.com>	2026-05-16 01:25:25 -07:00
Teknium	6ba35ec336	Inspired by Claude Code: tighten dangerous-command detection (#26829 ) Port three hardening patches from Claude Code 2.1.113's expanded deny rules to hermes' detect_dangerous_command() pattern list. 1. macOS /private/{etc,var,tmp,home} system paths /etc, /var, /tmp, /home are symlinks to /private/<name> on macOS. A write to /private/etc/sudoers works identically to /etc/sudoers but bypassed the plain /etc/ pattern check. Extracted a shared _SYSTEM_CONFIG_PATH fragment so /etc/ and the /private/ mirror stay in sync across redirect / tee / cp / mv / install / sed -i patterns. 2. killall -9 / -KILL / -SIGKILL / -s KILL / -r <regex> Parallel to the existing pkill -9 pattern. killall -9 against non-hermes processes was previously unprotected, and killall -r can sweep unrelated processes matching a regex. 3. find -execdir rm Same destructive effect as find -exec rm but ran in each match's directory. The previous pattern required a literal '-exec ' so -execdir slipped through. Guarded by 32 new test cases in 4 test classes: - TestMacOSPrivateSystemPaths (11 cases) - TestKillallKillSignals (9 cases) - TestFindExecdir (4 cases) - TestEtcPatternsUnaffectedByRefactor (6 regression guards on the existing /etc/ coverage after the _SYSTEM_CONFIG_PATH refactor) Inspiration: https://github.com/anthropics/claude-code/releases (Claude Code 2.1.113, April 17 2026 - "Enhanced deny rules" and "Dangerous path protection")	2026-05-16 01:24:25 -07:00
Teknium	395e9dd9e2	feat: add supports_parallel_tool_calls for MCP servers (#26825 ) Port from openai/codex#17667: MCP servers can now opt-in to parallel tool execution by setting supports_parallel_tool_calls: true in their config. This allows tools from the same server to run concurrently within a single tool-call batch, matching the behavior already available for built-in tools like web_search and read_file. Previously all MCP tools were forced sequential because they weren't in the _PARALLEL_SAFE_TOOLS set. Now _should_parallelize_tool_batch checks is_mcp_tool_parallel_safe() which looks up the server's config flag. Config example: mcp_servers: docs: command: "docs-server" supports_parallel_tool_calls: true Changes: - tools/mcp_tool.py: Track parallel-safe servers in _parallel_safe_servers set, populated during register_mcp_servers(). Add is_mcp_tool_parallel_safe() public API. - run_agent.py: Add _is_mcp_tool_parallel_safe() lazy-import wrapper. Update _should_parallelize_tool_batch() to check MCP tools against server config. - 11 new tests covering the feature end-to-end. - Updated MCP docs and config reference.	2026-05-16 01:04:28 -07:00
Teknium	c445f48b78	fix(delegation): honor api_mode + auto-detect anthropic_messages URLs (#26824 ) Subagent delegation hardcoded api_mode='chat_completions' for any delegation.base_url that didn't match three specific hostnames (chatgpt.com, api.anthropic.com, api.kimi.com/coding), and never read delegation.api_mode from config. Azure AI Foundry's https://foundry.services.ai.azure.com/anthropic endpoint fell through and got chat_completions, causing 404s on every delegate_task call. The main agent already handles this correctly via the shared _detect_api_mode_for_url() helper (anything ending in /anthropic → anthropic_messages); delegation reimplemented its own narrower check. Reuse the shared detector and honor an explicit delegation.api_mode when set so users can also force the transport on non-standard endpoints the URL heuristic can't classify. Fixes #10213. Co-authored-by: HiddenPuppy <HiddenPuppy@users.noreply.github.com>	2026-05-16 01:00:27 -07:00
Teknium	74d0b392e7	feat(x_search): gated X (Twitter) search tool with OAuth-or-API-key auth (#26763 ) * feat(x_search): gated X (Twitter) search tool with OAuth-or-API-key auth Salvages tools/x_search_tool.py from the closed PR #10786 (originally by @Jaaneek) and reworks its credential resolution so the tool registers when EITHER xAI credential path is available: * XAI_API_KEY (paid xAI API key) is set in ~/.hermes/.env or the env, OR * The user is signed in via xAI Grok OAuth — SuperGrok subscription — i.e. hermes auth add xai-oauth has been run Both paths route through xAI's built-in x_search Responses tool at https://api.x.ai/v1/responses. When both credentials exist OAuth wins, matching tools/xai_http.py's existing preference order (uses SuperGrok quota instead of paid API spend). The check_fn calls resolve_xai_http_credentials() which auto-refreshes the OAuth access token if it's within the refresh skew window, so a True return means the bearer is fetchable AND non-empty. Wiring - tools/x_search_tool.py — new tool, ~370 LOC. Schema gated by check_fn, bearer resolved per-call so revoked OAuth surfaces a clean tool_error rather than an HTTP 401. - toolsets.py — "x_search" toolset def. NOT added to _HERMES_CORE_TOOLS; users opt in via hermes tools. - hermes_cli/tools_config.py — CONFIGURABLE_TOOLSETS entry + TOOL_CATEGORIES block with two provider options (OAuth + API key) sharing the existing xai_grok post_setup hook for credential bootstrap. - hermes_cli/config.py — DEFAULT_CONFIG["x_search"] with model / timeout_seconds / retries. Additive nested key; no version bump. - tests/tools/test_x_search_tool.py — 13 tests covering HTTP shape, handle validation, citation extraction, 4xx/5xx/timeout handling, and the full credential-resolution matrix (OAuth-only, API-key-only, both-set, neither-set, resolver-raises, config overrides, registry registration). - website/docs/guides/xai-grok-oauth.md — adds X Search to the direct-to-xAI tools section with off-by-default note. - website/docs/user-guide/features/tools.md — new row in the tools table. Off by default — users enable via `hermes tools` → 🐦 X (Twitter) Search. Schema only appears to the model when xAI credentials are configured. Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com> * docs(x_search): add dedicated feature page + reference entries - website/docs/user-guide/features/x-search.md (new) — full feature walkthrough: authentication, enablement, configuration, parameters, returned fields, example, troubleshooting, see-also links. - website/docs/reference/tools-reference.md — new "x_search" toolset section with parameter docs and credential gating note. - website/docs/reference/toolsets-reference.md — new row in the toolset catalog table. - website/sidebars.ts — wires the new feature page under Media & Web, after web-search. --------- Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>	2026-05-16 00:58:27 -07:00
Teknium	627f8a5f1d	security: sanitize tool error strings before injecting into model context (#26823 ) Adds _sanitize_tool_error() in model_tools and routes both error paths through it: registry.dispatch's try/except (the primary path for tool exceptions) and handle_function_call's outer except (defense in depth). Stripping targets structural framing tokens that the model itself can react to even though json.dumps already handles wire-layer escaping: XML role tags (tool_call, function_call, result, response, output, input, system, assistant, user), CDATA sections, and markdown code fences. Caps message body at 2000 chars and wraps with [TOOL_ERROR] prefix. Defense-in-depth: a tool exception carrying '<tool_call>...' won't break message framing (json escapes it), but the model still reads those tokens and they nudge it toward role-confusion framing. Ported from ironclaw#1639 (one piece of #3838's three-feature scout). The truncated-tool-call (#1632) and empty-response-recovery (#1677, #1720) pieces are skipped because main now implements both far more thoroughly (run_agent.py L8147/L12209/L13012 for truncation retry + length rewrite; L4500/L15090+ for empty-response scaffolding stripper, multi-stage nudge, fallback model activation).	2026-05-16 00:57:39 -07:00
Teknium	016c772e7f	feat(plugins): tool override flag for replacing built-in tools (closes #11049 ) (#26759 ) Plugins can now replace a built-in tool by passing override=True to ctx.register_tool(). Without it, the registry rejects any registration that would shadow an existing tool from a different toolset (unchanged default behavior). Unlocks the use case from #11049: drop-in replacement of browser/web backends without forking core. Composes with the existing pre_tool_call hook for runtime interception of any implementation. The override is audit-logged at INFO so it surfaces in agent.log.	2026-05-15 22:12:57 -07:00
Teknium	c5dc9700eb	fix(windows): silence tirith-unavailable banner + skip install/spawn attempts on unsupported platforms (#26718 ) Tirith ships no Windows binary, so on every Windows CLI startup users saw a scary 'tirith security scanner enabled but not available' banner they could not act on. The banner suggested degraded security; in reality pattern-matching guards still run and the message was pure noise. Fix: - New public is_platform_supported() helper in tools/tirith_security.py that returns False when _detect_target() doesn't resolve (Windows, any non-x86_64/aarch64 arch). - ensure_installed(), _resolve_tirith_path(), and check_command_security() short-circuit on unsupported platforms: cache _resolved_path = _INSTALL_FAILED with reason 'unsupported_platform', skip PATH probes, skip the background download thread, skip the disk failure marker, and return allow with an empty summary from check_command_security so the spawn loop never fires. - Explicit user-configured tirith_path is still honored everywhere (a user who built tirith themselves under WSL keeps that path). - CLI banner in cli.py gated on is_platform_supported() — fires only on platforms where tirith should work but isn't installed. - Docs note tirith's supported-platform list and point Windows users at WSL. Tests: tests/tools/test_tirith_security.py +8 tests covering Linux x86_64, Darwin arm64, Windows, and unknown-arch verdicts plus the silent ensure_installed / check_command_security / _resolve_tirith_path fast-paths and the explicit-path override. test_tirith_security.py 75 passed (8 new + 67 pre-existing) test_command_guards.py 19 passed	2026-05-15 20:29:28 -07:00
teknium1	4aec25bc44	fix(windows): stop spamming cwd-missing + tirith-spawn warnings on every terminal call Two log-spam fixes surfaced by a Windows user (Git Bash + Python 3.11.9): 1. LocalEnvironment cwd warn spam ============================ Git Bash's `pwd -P` emits paths like `/c/Users/x`. The base-class `_extract_cwd_from_output` was assigning this verbatim to `self.cwd` without validation, then `_resolve_safe_cwd`'s `os.path.isdir(/c/...)` returned False on Windows, triggering: LocalEnvironment cwd '/c/Users/NVIDIA' is missing on disk; falling back to '/' so terminal commands keep working. ...on every terminal call. The pre-existing Windows-path translation inside `_run_bash` ran AFTER the safe-cwd check, so it could never prevent the warning. Fix: - New `_msys_to_windows_path` helper (idempotent, no-op off Windows). - `_resolve_safe_cwd` normalizes before `isdir`, so a valid MSYS path is recognized as the real directory it points at. - `LocalEnvironment._update_cwd` and a new override of `_extract_cwd_from_output` translate + validate before mutating `self.cwd`. Stale / non-existent marker paths roll back to the previous cwd instead of clobbering it. - The fallback warning still fires when the directory really is gone (deletion-recovery scenario from #17558 still covered). 2. tirith spawn-failed warn spam ============================= When tirith isn't installed (background install in flight, or marked failed for the day) and the configured path stays as the bare string `tirith`, every `subprocess.run([tirith_path, ...])` raises OSError and logged: tirith spawn failed: [WinError 2] The system cannot find the file specified ...on every command. fail_open=True means behaviour is correct, but the log noise is severe. Fix: - `_warn_once(key, ...)` thread-safe dedupe helper. - Three hot-path warnings (`tirith path resolved to None`, `tirith spawn failed: ...`, `tirith timed out after Ns`) now log once per (exception class, errno) / timeout-value / path-none key. - Dedupe set is cleared on `_clear_install_failed` so a successful install lets a subsequent failure surface again. Tests ===== - `tests/tools/test_local_env_windows_msys.py`: 12 tests covering the MSYS→Windows translator, the resolve fast-path, update_cwd validation, and extract_cwd_from_output rollback. - `tests/tools/test_tirith_security.py`: 4 new dedupe tests (15 spawn failures → 1 log line; distinct exc types → 2 lines; timeout dedupe; path-None dedupe). Targeted runs: test_local_env_windows_msys.py 12 passed test_local_env_cwd_recovery.py 7 passed (pre-existing, no regressions) test_tirith_security.py 67 passed (63 pre-existing + 4 new) test_base_environment + local_* 37 passed (no regressions) test_local_env_blocklist + neighbours 114 passed Reported via Hermes log capture: 19× cwd warnings + 15× tirith warnings in a single short session.	2026-05-15 16:25:31 -07:00
sprmn24	7fee1f61eb	fix(memory): eliminate TOCTOU race in Windows file lock creation On Windows (msvcrt path), _file_lock() first checked if the lock file existed and wrote it with write_text(), then opened it with open('r+'). Between these two calls, another process could delete the file causing open('r+') to raise FileNotFoundError — uncaught, leaving memory writes to proceed without holding the lock, risking data corruption. Replace the three-line sequence with a single open('a+', ...) call which atomically creates the file if missing or opens it if it exists, closing the TOCTOU window entirely. The existing fd.seek(0) before msvcrt.locking() is preserved and sufficient for correct lock byte positioning. Root cause: TOCTOU between lock_path.write_text() and open('r+') Impact: concurrent memory writes on Windows could corrupt MEMORY.md	2026-05-15 15:28:18 -07:00
teknium1	6068363311	fix(delegate): guard heartbeat join against unstarted thread Pairs with the prior commit (start() now inside the try block). If threading.Thread.start() itself raises (OS thread exhaustion under heavy delegation fanout), the finally would call .join() on a never-started thread, which raises RuntimeError("cannot join thread before it is started") — trading one rare bug for another. Thread.ident is None until start() succeeds, so gate the join on it.	2026-05-15 15:09:55 -07:00
sprmn24	2d7182f72c	fix(delegate): move heartbeat thread start inside try block to prevent orphan _heartbeat_thread.start() was called before the try/finally block that contains _heartbeat_stop.set(). If _register_subagent() or any code between .start() and try: raised an exception, the finally block would never run — leaving the heartbeat thread as an orphan that continues calling _touch_activity() on the parent agent, incorrectly resetting gateway timeout counters. Move _heartbeat_thread.start() to be the first statement inside the try block so the finally block always reaches _heartbeat_stop.set() regardless of how the child run completes or fails. Root cause: heartbeat start outside try/finally scope Impact: orphan heartbeat thread incorrectly resets parent gateway timeouts	2026-05-15 15:09:55 -07:00
alt-glitch	c57709a3d6	feat: wire ensure_dependency into TUI and browser tool call sites Before: missing node → hard exit; missing browser → FileNotFoundError. After: both try ensure_dependency() first, which prompts interactively and delegates installation to install.sh --ensure. ripgrep and ffmpeg already degrade gracefully (grep fallback, skip conversion) so they don't need wiring. Also documents the design rationale in dep_ensure.py: detection and prompting live in Python (portable, instant, UX-integrated); only the actual installation delegates to install.sh (1900 lines of battle-tested OS/package-manager logic).	2026-05-15 14:45:43 -07:00
alt-glitch	259ae846c8	feat: add ensure_dependency() wrapper + ship install.sh in wheel Includes paired change: browser tool now searches ~/.hermes/node_modules/.bin/ for agent-browser installed via install.sh --ensure browser.	2026-05-15 14:45:43 -07:00
Teknium	4e89c53082	fix(async): close unscheduled coroutines in all threadsafe bridges (#26584 ) Wraps every sync->async coroutine-scheduling site in the codebase with a new agent.async_utils.safe_schedule_threadsafe() helper that closes the coroutine on scheduling failure (closed loop, shutdown race, etc.) instead of leaking it as 'coroutine was never awaited' RuntimeWarnings plus reference leaks. 22 production call sites migrated across the codebase: - acp_adapter/events.py, acp_adapter/permissions.py - agent/lsp/manager.py - cron/scheduler.py (media + text delivery paths) - gateway/platforms/feishu.py (5 sites, via existing _submit_on_loop helper which now delegates to safe_schedule_threadsafe) - gateway/run.py (10 sites: telegram rename, agent:step hook, status callback, interim+bg-review, clarify send, exec-approval button+text, temp-bubble cleanup, channel-directory refresh) - plugins/memory/hindsight, plugins/platforms/google_chat - tools/browser_supervisor.py (3), browser_cdp_tool.py, computer_use/cua_backend.py, slash_confirm.py - tools/environments/modal.py (_AsyncWorker) - tools/mcp_tool.py (2 + 8 _run_on_mcp_loop callers converted to factory-style so the coroutine is never constructed on a dead loop) - tui_gateway/ws.py Tests: new tests/agent/test_async_utils.py covers helper behavior under live loop, dead loop, None loop, and scheduling exceptions. Regression tests added at three PR-original sites (acp events, acp permissions, mcp loop runner) mirroring contributor's intent. Live-tested end-to-end: - Helper stress test: 1500 schedules across live/dead/race scenarios, zero leaked coroutines - Race exercised: 5000 schedules with loop killed mid-flight, 100 ok / 4900 None returns, zero leaks - hermes chat -q with terminal tool call (exercises step_callback bridge) - MCP probe against failing subprocess servers + factory path - Real gateway daemon boot + SIGINT shutdown across multiple platform adapter inits - WSTransport 100 live + 50 dead-loop writes - Cron delivery path live + dead loop Salvages PR #2657 — adopts contributor's intent over a much wider site list and a single centralized helper instead of inline try/except at each site. 3 of the original PR's 6 sites no longer exist on main (environments/patches.py deleted, DingTalk refactored to native async); the equivalent fix lives in tools/environments/modal.py instead. Co-authored-by: JithendraNara <jithendranaidunara@gmail.com>	2026-05-15 14:00:01 -07:00
teknium1	931caf2b2d	fix(env-flags): widen truthy-only session env checks to sibling sites Build on @aydnOktay's cronjob fix by routing the cronjob check through the shared 'env_var_enabled' helper in utils.py (same truthy set: 1/true/yes/on) and applying the same semantics to the 8 sibling call sites that read HERMES_INTERACTIVE / HERMES_GATEWAY_SESSION / HERMES_EXEC_ASK / HERMES_CRON_SESSION with bare os.getenv() truthy checks: - tools/approval.py: _is_gateway_approval_context (2), check_command_safety (2), check_all_command_guards (3) -- 7 sites total - tools/terminal_tool.py: _handle_sudo_failure, sudo password prompt -- 2 sites - tools/skills_tool.py: _is_gateway_surface -- 1 site Without this, a user who exports HERMES_INTERACTIVE=0 in their shell still gets interactive sudo prompts, approval prompts, and gateway skill-install paths -- only the cronjob tool was hardened. Now all consumers agree on the same false-like values. Also drops the duplicate _is_truthy_env helper from cronjob_tools.py in favour of the existing canonical utils.env_var_enabled. Tests: extend the parametrized regression coverage to all three session env vars (HERMES_INTERACTIVE / HERMES_GATEWAY_SESSION / HERMES_EXEC_ASK) symmetrically. tests/tools/test_cronjob_tools.py: 60/60 pass; tests/tools/{approval,terminal_tool,skills_tool, cron_approval_mode,hardline_blocklist}.py: 378/378 pass.	2026-05-15 12:35:07 -07:00
aydnOktay	734aa0f367	fix(cronjob): require explicit truthy session env values	2026-05-15 12:35:07 -07:00
Jaaneek	e13c1b8060	fix(xai-http): preserve ~/.hermes/.env fallback and XAI_STT_BASE_URL precedence The new resolve_xai_http_credentials() resolver was using os.getenv() for the XAI_API_KEY/XAI_BASE_URL fallback path, which dropped the ~/.hermes/.env contract guarded by PR #17140 / #17163. Users with XAI_API_KEY in dotenv only would see "No xAI credentials found" even though the key was configured. Separately, _transcribe_xai started consulting creds["base_url"] (which always returns at least the default https://api.x.ai/v1) ahead of the public XAI_STT_BASE_URL env override, so the per-tool override stopped working. - tools/xai_http.py: add module-level get_env_value() wrapper that reads ~/.hermes/.env first (via hermes_cli.config.get_env_value), then os.environ. Resolver uses it for the API-key/base-url fallback. - tools/transcription_tools.py: restore precedence so XAI_STT_BASE_URL wins over creds["base_url"]. - tests/tools/test_transcription_dotenv_fallback.py + tests/tools/test_tts_dotenv_fallback.py: repoint the per-call-site patches at the new resolution point (tools.xai_http.get_env_value). The end-to-end regression-guard test (which patches load_env) is unchanged and still passes.	2026-05-15 12:11:32 -07:00
Jaaneek	b62c997973	feat(xai-oauth): add xAI Grok OAuth (SuperGrok Subscription) provider Adds a new authentication provider that lets SuperGrok subscribers sign in to Hermes with their xAI account via the standard OAuth 2.0 PKCE loopback flow, instead of pasting a raw API key from console.x.ai. Highlights ---------- * OAuth 2.0 PKCE loopback login against accounts.x.ai with discovery, state/nonce, and a strict CORS-origin allowlist on the callback. * Authorize URL carries `plan=generic` (required for non-allowlisted loopback clients) and `referrer=hermes-agent` for best-effort attribution in xAI's OAuth server logs. * Token storage in `auth.json` with file-locked atomic writes; JWT `exp`-based expiry detection with skew; refresh-token rotation synced both ways between the singleton store and the credential pool so multi-process / multi-profile setups don't tear each other's refresh tokens. * Reactive 401 retry: on a 401 from the xAI Responses API, the agent refreshes the token, swaps it back into `self.api_key`, and retries the call once. Guarded against silent account swaps when the active key was sourced from a different (manual) pool entry. * Auxiliary tasks (curator, vision, embeddings, etc.) route through a dedicated xAI Responses-mode auxiliary client instead of falling back to OpenRouter billing. * Direct HTTP tools (`tools/xai_http.py`, transcription, TTS, image-gen plugin) resolve credentials through a unified runtime → singleton → env-var fallback chain so xai-oauth users get them for free. * `hermes auth add xai-oauth` and `hermes auth remove xai-oauth N` are wired through the standard auth-commands surface; remove cleans up the singleton loopback_pkce entry so it doesn't silently reinstate. * `hermes model` provider picker shows "xAI Grok OAuth (SuperGrok Subscription)" and the model-flow falls back to pool credentials when the singleton is missing. Hardening --------- * Discovery and refresh responses validate the returned `token_endpoint` host against the same `.x.ai` allowlist as the authorization endpoint, blocking MITM persistence of a hostile endpoint. Discovery / refresh / token-exchange `response.json()` calls are wrapped to raise typed `AuthError` on malformed bodies (captive portals, proxy error pages) instead of leaking JSONDecodeError tracebacks. * `prompt_cache_key` is routed through `extra_body` on the codex transport (sending it as a top-level kwarg trips xAI's SDK with a TypeError). * Credential-pool sync-back preserves `active_provider` so refreshing an OAuth entry doesn't silently flip the active provider out from under the running agent. Testing ------- * New `tests/hermes_cli/test_auth_xai_oauth_provider.py` (~63 tests) covers JWT expiry, OAuth URL params (plan + referrer), CORS origins, redirect URI validation, singleton↔pool sync, concurrency races, refresh error paths, runtime resolution, and malformed-JSON guards. * Extended `test_credential_pool.py`, `test_codex_transport.py`, and `test_run_agent_codex_responses.py` cover the pool sync-back, `extra_body` routing, and 401 reactive refresh paths. * 165 tests passing on this branch via `scripts/run_tests.sh`.	2026-05-15 12:11:32 -07:00
Siddharth Balyan	d5416284f1	fix(tui): autonomous background process completion notifications (#26071 ) (#26327 ) * feat(process-registry): add format_process_notification shared helper * feat(process-registry): add drain_notifications method * refactor(cli): use shared drain_notifications and format_process_notification * feat(tui): add background notification poller for completion_queue * feat(tui): wire notification poller into session init/finalize * refactor(tui): add post-turn drain using shared helper as safety net	2026-05-15 19:31:00 +05:30
nidhi-singh02	13c72fb486	fix(tools): wrap browser provider network calls with error handling Wrap requests.post() in create_session() for browser_use, browserbase, and firecrawl providers with requests.RequestException handling. Connection timeouts and DNS resolution failures now surface as clean RuntimeError messages instead of raw requests exception tracebacks. Browser Use managed-gateway mode preserves raw exception propagation so the existing idempotency-key retry semantics keep working. Closes #2746 Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-15 01:53:06 -07:00
aydnOktay	6af9942327	fix(url-safety): allow only http and https schemes	2026-05-15 01:52:48 -07:00
Nidhi Singh	eacb398f75	fix(tools): add return_exceptions to asyncio.gather in web_tools Three asyncio.gather() calls in tools/web_tools.py ran without return_exceptions=True. A single failing task (e.g. LLM rate limit on one URL) would raise out of gather() and discard every other successfully fetched/summarized result. Pass return_exceptions=True and filter BaseException entries with a warning log before unpacking. Affects: - chunk summarization gather (large web_extract pages) - firecrawl per-result LLM post-processing - tavily crawl per-result LLM post-processing Closes #2744	2026-05-15 01:50:41 -07:00
Animesh Mishra	55f3262e78	fix(mcp): pre-compile env-var regex and unify interpolation Remove redundant inner `import re` and regex recompilation on every call in _interpolate_env_vars. Add module-level _ENV_VAR_PATTERN compiled once. Replace the separate _interpolate_value() in mcp_config.py (which used \w+ and would silently fail on env vars containing hyphens or dots) with the shared _ENV_VAR_PATTERN from mcp_tool.py. Remove now-unused import re.	2026-05-15 01:43:54 -07:00
buntingszn	6682f91b80	feat(cron): support name-based lookup for job operations Cron mutation operations (run/pause/resume/remove) and 'hermes cron edit' now accept a job name in addition to the hex ID, with case-insensitive matching. Before this, 'hermes cron run my_job_name' died with 'Job with ID my_job_name not found' and forced the user to look up the hex ID first. The original PR matched by name but silently picked the first match when two jobs shared a name. This version refuses to act on an ambiguous name and surfaces every matching job (id, name, schedule, next_run_at) so the caller can pick a specific ID. - cron/jobs.py: - get_job() stays ID-only (preserves existing call-site semantics for web_server/api_server/curator/scheduler/test code that always passes real IDs). - resolve_job_ref() is the new name-or-ID resolver, used by pause/ resume/trigger/remove_job. Exact ID match wins over a name match even if a different job's name happens to equal that ID. Ambiguous name match raises AmbiguousJobReference with all candidate IDs. - tools/cronjob_tools.py: dispatch site uses resolve_job_ref, surfaces ambiguous matches as a structured error with the matching IDs. - hermes_cli/cron.py: 'cron edit' uses resolve_job_ref so editing by name works and ambiguous names are reported with IDs. - tests/cron/test_jobs.py: new TestResolveJobRef covering ID match, case-insensitive name match, ID-wins-over-name, ambiguous refusal, and that pause/resume/trigger/remove all refuse on ambiguity. Closes #2627	2026-05-15 01:36:03 -07:00
Teknium	9329e06696	feat(image-gen): actionable setup message when no FAL backend is reachable (#26222 ) When the in-tree FAL path has no API key (and no managed gateway), the handler used to return a bare 'FAL_KEY environment variable not set' error. Users had no idea where to get a key, that a managed Nous gateway exists, or that plugin-registered providers are an option. Now `image_generate_tool` returns a structured multi-line message: - signup link (https://fal.ai) - managed-gateway status (if Nous tools are enabled) - pointer to `hermes tools` / `hermes plugins list` for alternate backends, so users on a stale `image_gen.provider` know where to look The schema is untouched — `check_fn` still gates the tool out of the schema when no backend is reachable at startup, consistent with every other conditional tool. This patch fixes the call-time failure modes: managed-gateway 5xx, plugin provider disappearing mid-session, etc. Inspired by #2546 / @Mibayy. The PR was ~5700 commits stale against the new plugin-aware image_gen architecture, so this is a forward port of the actionable-error idea rather than a cherry-pick. Closes #2543 Co-authored-by: Mibayy <mibayy@users.noreply.github.com>	2026-05-15 01:33:13 -07:00
kshitijk4poor	e0e4856d46	feat(skills-hub): add huggingface/skills as trusted default tap (#2549 ) Adds Hugging Face's official skill catalog to the default GitHub taps and classifies it as a trusted source alongside openai/skills and anthropics/skills. - tools/skills_guard.py: huggingface/skills -> TRUSTED_REPOS - tools/skills_hub.py: GitHubSource.DEFAULT_TAPS += huggingface/skills (skills/) - website/docs: list it under default taps + trusted-source examples Closes #2549. Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>	2026-05-15 01:25:33 -07:00
teknium1	bcca5ed34d	fix(deps): pin brotlicffi so aiohttp can decode Discord's Brotli attachments Discord's CDN serves attachments with Content-Encoding: br. aiohttp's compression_utils tries 'import brotlicffi as brotli' first and falls back to google's Brotli, but Brotli<1.2.0's Decompressor.process() is 1-arg while aiohttp calls it with 2 args (data, max_length). Result: every .txt/.md/.doc uploaded to a Discord-gateway session fails to decode at att.read() with 'Can not decode content-encoding: br' / 'TypeError: process() takes exactly 1 argument (2 given)', the agent never sees the bytes, and falls back to filesystem guessing. Pin brotlicffi==1.2.0.1 in both surfaces: - tools/lazy_deps.py 'platform.discord' tuple: Discord users on the lazy-install path get it on first discord.py import. - pyproject.toml [messaging] extra: users who explicitly install hermes-agent[messaging] (skipping the lazy path) get it eagerly. brotlicffi wins aiohttp's import race regardless of what else is installed (try brotlicffi / except: import brotli), so existing setups that already pulled google's Brotli transitively don't change behavior beyond the bug fix. ~1.5 MB wheel, manylinux/macOS/Windows coverage. E2E verified: round-trip decode of Brotli-compressed payload via aiohttp.compression_utils.brotli succeeds with brotlicffi pinned; same test against Brotli==1.1.0 alone reproduces the reported TypeError. Credit to @Korkyzer for the original diagnosis and fix shape in #15744; the lazy-deps gating layer was added on top to keep brotlicffi out of the install path for users who don't run a Discord gateway. Fixes #12511. Closes #15744. Co-authored-by: Korky <korkyzer@gmail.com>	2026-05-14 22:36:46 -07:00
Siddharth Balyan	5af672c753	chore: remove Atropos RL environments and tinker-atropos integration (#26106 ) * chore: remove Atropos RL environments, tools, tests, skill, and tinker-atropos submodule Delete: - environments/ (43 files — base env, agent loop, tool call parsers, benchmarks) - rl_cli.py (standalone RL training CLI) - tools/rl_training_tool.py (all 10 rl_* tools) - tests: test_rl_training_tool, test_tool_call_parsers, test_managed_server_tool_support, test_agent_loop, test_agent_loop_vllm, test_agent_loop_tool_calling, test_terminalbench2_env_security - optional-skills/mlops/hermes-atropos-environments/ - tinker-atropos git submodule + .gitmodules * chore: remove RL/Atropos references from Python source - toolsets.py: remove rl toolset block + update comment - model_tools.py: remove rl_tools group + update async bridging comment - hermes_cli/tools_config.py: remove RL display entry, _DEFAULT_OFF_TOOLSETS, setup block, and rl_training post-setup handler - tools/budget_config.py: remove RL environment reference in docstring - tests/test_model_tools.py: remove rl_tools from expected groups - tests/run_agent/test_streaming_tool_call_repair.py: fix stale cross-reference * chore: remove rl/yc-bench extras and tinker-atropos refs from pyproject.toml - Remove rl extra (atroposlib, tinker, fastapi, uvicorn, wandb) - Remove yc-bench extra - Remove rl_cli from py-modules - Remove [tool.ty.src] exclude for tinker-atropos - Remove [tool.ruff] exclude for tinker-atropos - Regenerate uv.lock * chore: remove tinker-atropos from install/setup scripts - setup-hermes.sh: remove entire tinker-atropos submodule install block - scripts/install.sh: remove both tinker-atropos blocks (Termux + standard) - scripts/install.ps1: remove tinker-atropos block - nix/hermes-agent.nix: remove tinker-atropos pip install line * chore: remove RL references from cli-config.yaml.example * docs: remove Atropos/RL references from README, CONTRIBUTING, AGENTS.md * docs: remove RL/Atropos references from website - Delete: environments.md, rl-training.md, mlops-hermes-atropos-environments.md - sidebars.ts: remove rl-training and environments sidebar entries - optional-skills-catalog.md: remove hermes-atropos-environments row - tools-reference.md: remove entire rl toolset section - toolsets-reference.md: remove rl row + update example - integrations/index.md: remove RL Training bullet - architecture.md: remove environments/ from tree + RL section - contributing.md: remove tinker-atropos setup - updating.md: remove tinker-atropos install + stale submodule update * chore: remove remaining RL/Atropos stragglers - hermes_cli/config.py: remove TINKER_API_KEY + WANDB_API_KEY env var defs - hermes_cli/doctor.py: remove Submodules check section (tinker-atropos) - hermes_cli/setup.py: remove RL Training status check - hermes_cli/status.py: remove Tinker + WandB from API key status display - agent/display.py: remove both rl_* tool preview/activity blocks - website/docs: remove RL references from providers.md + env-variables.md - tests: remove TINKER_API_KEY from conftest, set_config_value, setup_script * chore: remove RL training section from .env.example	2026-05-15 10:36:38 +05:30
teknium1	4695d2716f	fix(browser): honor pre-set AGENT_BROWSER_ARGS and document the bypass Follow-up to the sandbox-bypass env-var fix: - Update the opt-out gate so a user-provided AGENT_BROWSER_ARGS is also respected, not just the legacy AGENT_BROWSER_CHROME_FLAGS. Previously the gate only checked the broken legacy var, so a user who pre-set AGENT_BROWSER_ARGS would still get clobbered by Hermes's auto-injection. - Document AGENT_BROWSER_ARGS in .env.example, the browser feature page, and the env var reference, with notes about the auto-injection on AppArmor-restricted systems (Ubuntu 23.10+, DGX Spark, containers). - Add Anadi Jaggia to AUTHOR_MAP.	2026-05-14 19:02:17 -07:00
Anadi Jaggia	8ed2ef6f46	fix(browser): use correct env var for --no-sandbox bypass AGENT_BROWSER_CHROME_FLAGS is not read by agent-browser CLI. The correct env var is AGENT_BROWSER_ARGS, with comma-separated values. This fixes Chrome 'No usable sandbox' crash on Ubuntu 23.10+ systems where AppArmor restricts unprivileged user namespaces. The detection logic was correct but the fix used the wrong environment variable name and space-separated instead of comma-separated args.	2026-05-14 19:02:17 -07:00
Teknium	19071529f6	fix(lsp): shift baseline diagnostics into post-edit coordinates (#25978 ) Pre-existing diagnostics below an edit point used to surface as 'LSP diagnostics introduced by this edit' whenever the edit deleted or inserted lines. The delta-filter key included the diagnostic's range, so the same logical error reported at a different line in the post-edit snapshot looked like a brand new diagnostic. Concrete case: deleting 14 lines in cli.py caused Pyright errors at lines 9873, 10590, 12413, 13004 (unrelated to the edit) to be reported as introduced by it. Fix: build a piecewise-linear line-shift map (via difflib's SequenceMatcher) from pre and post content, and remap baseline diagnostics into post-edit coordinates before the set-difference. Diagnostics in deleted regions drop out cleanly; diagnostics below the edit shift by the right amount; diagnostics above are untouched. The strict (range-aware) equality key stays — so a genuinely new instance of an identical error class at a different line still surfaces as new. Pieces: - agent/lsp/range_shift.py — build_line_shift, shift_diagnostic_range, shift_baseline. Pure functions, no LSP state. - agent/lsp/manager.py — LSPService.get_diagnostics_sync gains an optional line_shift kwarg; baseline is shift_baseline'd before computing the seen-set. _diag_key keeps the strict range key. - tools/file_operations.py — write_file captures pre_content for any LSP-handled extension (not just LINTERS_INPROC) and passes pre/post to _maybe_lsp_diagnostics, which builds the shift map. - New _lsp_handles_extension helper guards the pre_content read. Trade-offs preserved: - Genuinely new same-class errors at different lines still surface (content-only key would have swallowed them). - Pre-existing errors at unshifted positions still get filtered (covered by the strict-key path with no shift). - Best-effort: when pre_content can't be captured (file didn't exist, permissions), the unshifted comparison still catches most pre-existing errors; the edge case it misses is a new file with a non-empty baseline, which is structurally impossible.	2026-05-14 15:56:07 -07:00
Teknium	72b5dd8658	fix(update): refresh lazy-installed backends on hermes update (#25766 ) Pyproject's [all] extra was slimmed down in May 2026 — ~20 optional backends moved to tools/lazy_deps.py and only install on first use. hermes update runs uv pip install -e .[all] which doesn't touch any of them, so pin bumps in LAZY_DEPS (CVE response, transitive fixes) were silently ignored on already-activated backends. Two changes: 1. _is_satisfied() now parses the spec and checks the installed version against the constraint via packaging.specifiers. Previously it returned True the moment the package name was importable, which made ensure() a name-presence gate rather than a version-pin gate. 2. New active_features() / refresh_active_features() pair: lists every feature with at least one of its packages currently installed, then re-runs ensure() on each. Refresh is invoked at the end of _cmd_update_impl, right after the [all] install completes. Cold backends (never activated) stay quiet — no churn for them. Output during update is one summary block: → Refreshing 4 active lazy backend(s)... ↑ 1 refreshed: provider.anthropic ✓ 3 already current or ⚠ memory.honcho failed to refresh: <pip stderr> Failures never raise out of update — backends keep their previously- installed version and we tell the user to rerun once upstream is fixed. security.allow_lazy_installs=false is honored: features get marked "skipped" with the reason shown. Tests: 18 new unit tests covering version-aware satisfaction (exact pin, range, extras blocks, missing package, malformed spec), active feature discovery, and refresh status reporting. All 61 lazy_deps tests pass.	2026-05-14 08:03:40 -07:00
wesleysimplicio	364ddd45e8	fix(terminal): prevent safety filter false positives on keywords inside quoted strings The _foreground_background_guidance() function matched background-wrapper keywords (nohup/disown/setsid) anywhere in the command text, including inside quoted strings, Python -c code, commit messages, and PR body text. Two-layer fix: 1. Strip single-quoted, double-quoted, and backtick-quoted content before pattern matching via _strip_quotes() helper. 2. Tighten the regex to only match keywords at command-start positions (after ^, ;, &, &&, \|\|, or $() — not mid-argument. Both layers are needed: quote stripping handles the common case of keywords in string literals, and the position-aware regex handles unquoted cases like 'export FOO=setsid' (word boundary match, wrong position). Fixes #20064	2026-05-14 08:02:01 -07:00
AsoTora	1247ff2dca	fix: stop retrying initial MCP auth failures	2026-05-14 07:58:43 -07:00
fu576	f0e46c5e9e	fix: do not inherit api_mode when delegating across providers Cross-provider delegation (e.g. MiniMax parent → DeepSeek child) must not inherit the parent's api_mode, because each provider uses a different API surface: MiniMax uses 'anthropic_messages' while DeepSeek uses 'chat_completions'. Inheriting the wrong mode causes 404 errors. When the effective provider differs from the parent's provider, derive api_mode from the target provider's defaults instead (None triggers re-derivation). Refs: Bug #20558, PR #20563	2026-05-13 23:12:57 -07:00
kshitijk4poor	4ca5e72444	fix(web): preserve top-level error envelope on unconfigured systems Surfaced by local E2E behavior-parity testing of PR vs origin/main: the plugin-migrated dispatchers were quietly changing the error envelope shape returned to function-calling models on unconfigured systems. Two findings, both from per-result error wrapping bleeding into the pre-flight configuration error path: 1. search: ``firecrawl.search()`` caught the ``ValueError("Web tools are not configured...")`` from ``_get_firecrawl_client()`` and returned it as ``{"success": False, "error": ...}``, losing the legacy ``{"error": "Error searching web: ..."}`` envelope that ``tool_error()`` emits on main. Models that special-case the ``error`` key still detect the failure, but the prefix is part of the legacy contract some users rely on. 2. crawl: ``firecrawl.crawl()`` caught the same pre-flight ``ValueError`` and wrapped it as a per-page error inside ``results[0]``. Main short-circuits on ``check_firecrawl_api_key()`` BEFORE dispatching, so its unconfigured response is ``{"success": False, "error": "web_crawl requires Firecrawl..."}`` at the top level. The PR's per-page burying hid the failure inside ``results[]`` where models that check ``result.get("error")`` would miss it. Fix: - ``plugins/web/firecrawl/provider.py``: pull ``_get_firecrawl_client()`` outside the broad ``try`` in ``search()``. Pre-flight ``ValueError`` / ``ImportError`` propagate to the dispatcher's top-level exception handler. In-flight SDK errors still get wrapped as ``{"success": False, ...}``. - ``tools/web_tools.py``: mirror main's upstream availability gate in ``web_crawl_tool``. When the resolved crawl provider is ``is_available()==False``, short-circuit BEFORE dispatching with the same top-level error shape main emits. - ``tests/tools/test_web_providers.py``: 2 regression tests (``TestUnconfiguredErrorEnvelopeParity``) lock in the behavior so future plugin work can't undo this. Verified via local subprocess-based parity test (14/14 scenarios match origin/main shape exactly) and full 210/210 web test suite green.	2026-05-13 22:31:28 -07:00
kshitijk4poor	21e3a863bb	feat(web): firecrawl plugin natively supports crawl; delete legacy inline path The web-provider migration originally left firecrawl crawl as the only provider-specific code remaining inline in tools/web_tools.py (~250 lines of Firecrawl-specific crawl orchestration that didn't fit the plugin's existing surface). This commit closes that gap. What this adds -------------- 1. plugins/web/firecrawl/provider.py: implement async ``crawl(url, **kwargs)`` - Accepts the same kwargs as the dispatcher passes to any crawl provider (``instructions``, ``depth``, ``limit``); Firecrawl's /crawl endpoint ignores ``instructions`` and ``depth`` so we log and drop with a clear info message. - Wraps the sync SDK ``crawl()`` call in asyncio.to_thread so the gateway event loop isn't blocked on a multi-page crawl. - Preserves the response-shape normalization across pydantic / typed-object / dict variants that the legacy inline code did. - Preserves per-page website-policy re-check (catches blocked redirects after the SDK returns). - Returns the same {"results": [...]} shape so the dispatcher's shared LLM-summarization post-processing path works unchanged. - Sets supports_crawl() to True so the dispatcher routes through the plugin instead of the legacy fallthrough. 2. tools/web_tools.py: delete the entire legacy firecrawl crawl block that used to run after "No registered provider supports crawl" — ~270 lines including: - check_firecrawl_api_key gate + typed error - inline SSRF + website-policy seed-URL gate (dispatcher already does this) - Firecrawl client setup with crawl_params - 100+ lines of pydantic/dict/typed-object normalization - Per-page LLM-processing loop (kept in the dispatcher's shared post-processing path; that's where it always belonged) - trimming + base64 image cleanup (still done in the dispatcher's shared path) Replaced with a single typed-error branch when no crawl-capable provider is available: "web_crawl has no available backend. Set FIRECRAWL_API_KEY (or FIRECRAWL_API_URL for self-hosted), or set TAVILY_API_KEY for Tavily." Test updates ------------ - tests/tools/test_website_policy.py: - test_web_crawl_short_circuits_blocked_url: dispatcher seed-URL gate still runs on web_tools.check_website_access (no change to that patch), but the firecrawl client lockdown moved to the plugin module — patch firecrawl_provider._get_firecrawl_client instead of web_tools._get_firecrawl_client. The dispatcher short-circuits before the plugin runs, so the test still passes. - test_web_crawl_blocks_redirected_final_url: patch the per-page policy gate at plugins.web.firecrawl.provider.check_website_access (where it now runs) AND on web_tools (where the seed-URL gate still runs). Patch firecrawl_provider._get_firecrawl_client for the FakeCrawlClient injection. Both checks flow through the same fake_check function. - tests/plugins/web/test_web_search_provider_plugins.py: - Update parametrized capability-flag spec: firecrawl supports_crawl is now True. - Add test_firecrawl_crawl_returns_error_dict_when_unconfigured — verifies inspect.iscoroutinefunction(p.crawl) is True and that the async crawl returns a per-page error dict (not a raise) when FIRECRAWL_API_KEY is missing. Verified -------- - 218/218 web tests pass (was 173, +44 plugin tests + 1 new firecrawl crawl test from this commit = 218 with the test deduplication). - Compile-clean (py_compile passes on both files). - Provider capabilities matrix confirmed end-to-end: name search extract crawl async-extract? async-crawl? firecrawl True True True True True tavily True True True False False Both crawl-capable providers exercise the dispatcher's inspect.iscoroutinefunction async-or-sync detection. Net diff -------- - tools/web_tools.py: -254 lines (legacy inline crawl gone) - plugins/web/firecrawl/provider.py: +185 lines (crawl method) - test_website_policy.py: +14/-9 lines (patch locations) - test_web_search_provider_plugins.py: +22/-1 lines (capability flag + new firecrawl crawl test) - Total: -32 net LoC; tools/web_tools.py is now 1509 lines (was 1763 before this commit, 2227 before the migration started).	2026-05-13 22:31:28 -07:00
kshitijk4poor	39b4ebfcea	refactor(web): delete legacy tools/web_providers/ directory + migrate ABC tests Removes the legacy in-tree provider scaffolding that PR #25182 fully replaced with the plugin architecture: tools/web_providers/__init__.py (6 lines) tools/web_providers/base.py (89 lines — old ABCs) tools/web_providers/ARCHITECTURE.md (73 lines — old design doc) These were the staging-ground ABCs and provider modules that the plugin migration absorbed. All seven web providers now implement the single :class:`agent.web_search_provider.WebSearchProvider` ABC and live under ``plugins/web/<vendor>/``. Nothing else in the tree imports ``tools.web_providers`` — verified via grep before deletion. Test migration (tests/tools/test_web_providers.py) -------------------------------------------------- Rewrote ``TestWebProviderABCs`` to test the new unified ABC at :mod:`agent.web_search_provider`: - test_cannot_instantiate_abc_directly — abstract ``name`` + ``is_available`` - test_concrete_search_only_provider_works — exercise default ``supports_extract=False`` / ``supports_crawl=False`` flags - test_concrete_multi_capability_provider_works — exercise all three capabilities, async extract supported (declared sync here for simplicity; real plugins like parallel + firecrawl use async) - test_search_only_provider_skips_extract_and_crawl — verify ``supports_*()`` flags default to False so search-only providers don't have to implement extract() or crawl() The 9 other tests in the file (per-capability backend selection, DEFAULT_CONFIG merge, dispatcher routing) test public helpers in ``tools.web_tools`` that still exist and pass unchanged. agent/web_search_provider.py docstring updated to reflect that the legacy ABCs no longer exist; the response-shape contract is preserved bit-for-bit so external consumers see no behavioral change. Net diff -------- - tools/web_providers/ removed (-168 lines) - tests/tools/test_web_providers.py rewritten ABC section (+78/-30 net, same coverage, new API) - agent/web_search_provider.py docstring (-3/+5 lines) Verified -------- - 173/173 targeted web tests pass - 12/12 ABC contract tests pass with the new interface - No remaining grep hits for ``tools.web_providers`` outside of intentional historical references in plugin docstrings.	2026-05-13 22:31:28 -07:00
kshitijk4poor	748f3e016b	refactor(web): delete inline vendor helpers, re-export from plugins Removes ~580 lines of dead code from tools/web_tools.py that were superseded by the plugin migration but kept around in the cutover commit to keep the diff focused. Replaces them with thin re-export shims so existing tests and external callers that reach for the legacy ``tools.web_tools.<name>`` paths continue to work transparently. Deleted from tools/web_tools.py -------------------------------- - Lazy Firecrawl SDK proxy (_load_firecrawl_cls, _FirecrawlProxy, _FIRECRAWL_CLS_CACHE, the Firecrawl singleton) - Firecrawl client section (_get_direct_firecrawl_config, _get_firecrawl_gateway_url, _is_tool_gateway_ready, _has_direct_firecrawl_config, _raise_web_backend_configuration_error, _firecrawl_backend_help_suffix, _get_firecrawl_client) - Parallel client section (_get_parallel_client, _get_async_parallel_client, _parallel_client, _async_parallel_client) - Tavily client section (_TAVILY_BASE_URL, _tavily_request, _normalize_tavily_search_results, _normalize_tavily_documents) - Generic SDK normalizers (_to_plain_object, _normalize_result_list, _extract_web_search_results, _extract_scrape_payload) - Exa client section (_get_exa_client, _exa_client, _exa_search, _exa_extract) - Parallel helpers (_parallel_search, _parallel_extract) - Duplicate inline check_firecrawl_api_key Net: tools/web_tools.py drops from 2227 → 1613 lines (-614 lines). Re-exports added at top of tools/web_tools.py --------------------------------------------- - From plugins.web.firecrawl.provider: Firecrawl, _FirecrawlProxy, _FIRECRAWL_CLS_CACHE, _load_firecrawl_cls, _get_direct_firecrawl_config, _get_firecrawl_gateway_url, _is_tool_gateway_ready, _has_direct_firecrawl_config, _firecrawl_backend_help_suffix, _raise_web_backend_configuration_error, _get_firecrawl_client, _to_plain_object, _normalize_result_list, _extract_web_search_results, _extract_scrape_payload, check_firecrawl_api_key - From plugins.web.tavily.provider: _tavily_request, _normalize_tavily_search_results, _normalize_tavily_documents - From plugins.web.parallel.provider: _get_parallel_client, _get_async_parallel_client - From plugins.web.exa.provider: _get_exa_client Plus retained module-level imports for backward-compat with tests: - httpx (tests patch tools.web_tools.httpx for tavily request mocking) - build_vendor_gateway_url, _read_nous_access_token, resolve_managed_tool_gateway, managed_nous_tools_enabled, prefers_gateway (tests patch tools.web_tools.<name>) Plugin indirection pattern (key technique) ------------------------------------------ For functions inside the firecrawl/parallel/exa plugins to honor unit-test patches that target ``tools.web_tools.<name>``, the plugin implementations now do ``import tools.web_tools as _wt`` at call time and read helper names through that module (``_wt._read_nous_access_token``, ``_wt.Firecrawl``, ``_wt.prefers_gateway``, etc.). This makes the existing test patches transparently reach the plugin code without any test changes. The cached client globals (_firecrawl_client, _firecrawl_client_config, _parallel_client, _async_parallel_client, _exa_client) also now live on tools.web_tools so existing test setup_method handlers that reset ``tools.web_tools._<vendor>_client = None`` between cases keep working. The plugins read/write the cache via getattr/setattr on the web_tools module. Verified -------- - 173/173 targeted web tests pass: test_web_providers.py, test_web_providers_brave_free.py, test_web_providers_ddgs.py, test_web_providers_searxng.py, test_web_tools_config.py, test_web_tools_tavily.py, test_website_policy.py, test_config_null_guard.py - Compile-clean (py_compile.compile passes) - All inline implementations now exist in exactly one place (plugins.web.<vendor>.provider) Follow-up clean-up ------------------ - Drop _WEB_PLUGIN_SKIPLIST + hardcoded TOOL_CATEGORIES["web"] rows (next commit) - Delete tools/web_providers/ directory entirely - Add tests/plugins/web/ coverage - Full tests/tools/ + tests/gateway/ regression sweep before promoting PR	2026-05-13 22:31:28 -07:00

1 2 3 4 5 ...

1376 Commits