mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-21 03:39:54 +00:00
9c626ef8ea8bc190f9a339991c8de26ce4528bb5
122 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
a01c1f7305 | fix: kanban button | ||
|
|
dfe512c58d |
fix(paths): route achievements plugin + profile-tui through HERMES_HOME
Four callsites hardcoded Path.home() / '.hermes' with no HERMES_HOME
check, breaking Docker deployments and profile isolation (hermes -p):
- plugins/hermes-achievements/dashboard/plugin_api.py:
state_path(), snapshot_path(), checkpoint_path() bare-literal paths
- scripts/profile-tui.py:
DEFAULT_STATE_DB and DEFAULT_LOG defaults ignored HERMES_HOME
- hermes_cli/slack_cli.py:
except-Exception fallback for slack-manifest.json dump
- optional-skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py:
--target argparse default
Use get_hermes_home() (with an ImportError shim for the standalone
scripts) or 'os.environ.get("HERMES_HOME") or str(Path.home()/".hermes")'
where importing hermes_constants is impractical.
E2E-verified: with HERMES_HOME=/tmp/x all three achievements paths and
both profile-tui defaults route under /tmp/x.
Salvaged from #18068 (original scope was broader mechanical cleanup
claiming 23 callsites were buggy; most were already respecting
HERMES_HOME via os.environ.get(key, default) — only these 4 had no env
check at all). Credit: @web-dev0521.
|
||
|
|
ec4cb16a29 |
fix(honcho): guard _peers_cache and _sessions_cache reads under _cache_lock
_get_peer() and _get_or_create_honcho_session() accessed _peers_cache and _sessions_cache without holding _cache_lock, while other paths in the same class use the lock consistently. Under concurrent tool calls or prefetch threads, this can produce stale reads or lost cache updates. Wrap both unguarded cache read sites in _cache_lock. Network calls (honcho.peer() and honcho.session()) remain outside the lock to avoid holding it during I/O. |
||
|
|
bea2562fc4 |
fix(honcho): replace raw int() config parsing with safe helper
Three int() calls in HonchoClient.from_global_config() parsed dialecticMaxChars, messageMaxChars, and dialecticMaxInputChars directly without guards. A malformed value in honcho.json would raise ValueError and abort provider initialization entirely. Add _parse_int_config() helper following the existing _parse_context_tokens() pattern, and replace all three raw int() calls with it. |
||
|
|
624057fce6 |
feat(teams): set User-Agent to Hermes via 2.0.0 client option
microsoft-teams-apps 2.0.0 added the `client` option to AppOptions, accepting a ClientOptions instance. Use it to set the User-Agent header to "Hermes" on all outgoing HTTP requests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
c868425467 |
feat(kanban): durable multi-profile collaboration board (#17805)
Salvage of PR #16100 onto current main (after emozilla's #17514 fix that unblocks plugin Pydantic body validation). History preserved on the standing `feat/kanban-standing` branch; this squashes the 22 iterative commits into one clean landing. What this lands: - SQLite kernel (hermes_cli/kanban_db.py) — durable task board with tasks, task_links, task_runs, task_comments, task_events, kanban_notify_subs tables. WAL mode, atomic claim via CAS, tenant-namespaced, skills JSON array per task, max-runtime timeouts, worker heartbeats, idempotency keys, circuit breaker on repeated spawn failures, crash detection via /proc/<pid>/status, run history preserved across attempts. - Dispatcher — runs inside the gateway by default (`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims stale claims, promotes ready tasks, spawns `hermes -p <assignee> chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK + HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker` plus any per-task skills. Health telemetry warns on stuck ready queue. - Structured tool surface (tools/kanban_tools.py) — 7 tools (kanban_show, kanban_complete, kanban_block, kanban_heartbeat, kanban_comment, kanban_create, kanban_link). Gated on HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal sessions. - System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE) injected only when kanban tools are active. - Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board UI: triage/todo/ready/running/blocked/done columns, drag-drop, inline create, task drawer with markdown, comments, run history, dependency editor, bulk ops, lanes-by-profile grouping, WS-driven live refresh. Matches active dashboard theme via CSS variables. - CLI — `hermes kanban init|create|list|show|assign|link|unlink| claim|comment|complete|block|unblock|archive|tail|dispatch|context| init|gc|watch|stats|notify|log|heartbeat|runs|assignees` + `/kanban` slash in-session. - Worker + orchestrator skills (skills/devops/kanban-worker + kanban-orchestrator) — pattern library for good summary/metadata shapes, retry diagnostics, block-reason examples, fan-out patterns. - Per-task force-loaded skills — `--skill <name>` (repeatable), stored as JSON, threaded through to dispatcher argv as one `--skills X` pair per skill alongside the built-in kanban-worker. Dashboard + CLI + tool parity. - Deprecation of standalone `hermes kanban daemon` — stub exits 2 with migration guidance; `--force` escape hatch for headless hosts. - Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md) with 11 dashboard screenshots walking through four user stories (Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker). - Tests (251 passing): kernel schema + migration + CAS atomicity, dispatcher logic, circuit breaker, crash detection, max-runtime timeouts, claim lifecycle, tenant isolation, idempotency keys, per- task skills round-trip + validation + dispatcher argv, tool surface (7 tools × round-trip + error paths), dashboard REST (CRUD + bulk + links + warnings), gateway-embedded dispatcher (config gate, env override, graceful shutdown), CLI deprecation stub, migration from legacy schemas. Gateway integration: - GatewayRunner._kanban_dispatcher_watcher — new asyncio background task, symmetric with _kanban_notifier_watcher. Runs dispatch_once via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0 env override for debugging. - Config: new `kanban` section in DEFAULT_CONFIG with `dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`. Additive — no \_config_version bump needed. Forward-compat: - workflow_template_id / current_step_key columns on tasks (v1 writes NULL; v2 will use them for routing). - task_runs holds claim machinery (claim_lock, claim_expires, worker_pid, last_heartbeat_at) so multi-attempt history is first- class from day one. Closes #16102. Co-authored-by: emozilla <emozilla@nousresearch.com> |
||
|
|
5d253e65b7 |
fix(openviking): pre-check fs/stat to route file URIs before hitting directory-only endpoints
Adds a deterministic pre-check on top of htsh's exception-based fallback: before calling /content/abstract or /content/overview on a non-pseudo URI, probe /api/v1/fs/stat. If the server says the URI is a file, route straight to /content/read instead of eating a failing 500 round-trip. This is the same idea pty819 and chennest independently landed in PRs #12757 and #12937 — merged here on top of htsh's broader fix so we keep pseudo-URI normalization and v0.3.3 browse-shape handling while avoiding the slow exception path on servers that return a raised 500 every time. The exception fallback from #5886 stays in place for environments where fs/stat is unavailable or returns an unfamiliar shape. Also credits pty819, chennest, and htsh in AUTHOR_MAP so future release notes attribute them correctly. |
||
|
|
10e43edc09 |
fix(openviking): fallback summary reads to content/read for file URIs
OpenViking returns 500 for /content/abstract and /content/overview when URI points to mem_*.md files. Add resilient fallback to /content/read for non-pseudo summary file URIs while preserving pseudo summary normalization. Also add regression tests for fallback behavior. |
||
|
|
97a851bf97 |
fix(openviking): normalize summary pseudo-URIs to prevent v0.3.3 500s
OpenViking v0.3.3 expects directory URIs for abstract/overview reads. Passing pseudo-files like /.overview.md and /.abstract.md to /api/v1/content/overview|abstract triggers HTTP 500. This change normalizes those pseudo-URIs to their parent directory for abstract/overview requests, preserves full reads, and hardens parsing for wrapped/unwrapped result payloads and fs list response shapes. |
||
|
|
e23bb18dac |
fix(teams): rewrite interactive_setup to use teams CLI flow
Replace the Azure portal credential prompts with the teams CLI workflow: install @microsoft/teams.cli, run teams app create, paste the output credentials. Matches the setup docs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
45780edbbf |
feat(teams): keep card body visible after approval button click
Pass cmd/desc in button action data so the card response can reconstruct the original body. Clicking a button now replaces only the actions with a status line, keeping the command and reason text visible. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
39b0bc377c |
fix(teams): override send_image_file for local image attachments
The gateway calls send_image_file() for locally cached images (e.g. from image_gen tools). Without this override the base class falls back to sending the file path as plain text. Delegate to send_image() which already handles base64 encoding local paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
ca5bebef00 |
fix(teams): send images as attachments instead of markdown links
Teams doesn't render markdown image syntax. Send images using the SDK's Attachment API instead — base64 data URI for local files, direct URL for remote images. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
b3137d758c |
feat(teams): add Microsoft Teams platform adapter as a plugin
Hello! I am the maintainer of the microsoft-teams-apps Python SDK and I built this Teams adapter to integrate Microsoft Teams into Hermes. Adds a `plugins/platforms/teams` platform plugin using the new PlatformRegistry system from #17751. The adapter self-registers via `register(ctx)` — no hardcoding in run.py, toolsets.py, or any other core file. Key features: - Supports personal DMs, group chats, and channel posts - Adaptive Card approval prompts with in-place button replacement (Allow Once / Allow Session / Always Allow / Deny) - aiohttp webhook server bridged from the Teams SDK to avoid the fastapi/uvicorn dependency - ConversationReference caching for correct proactive sends in non-DM chats - `interactive_setup()` for `hermes gateway setup` integration - `platform_hint` for LLM context (Teams markdown subset) - 34 tests covering adapter init, send, message handling, and plugin registration Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
62a5d7207d |
feat(plugins): bundle hermes-achievements + scan full session history (#17754)
* feat(plugins): bundle hermes-achievements, scan full session history Ships @PCinkusz's hermes-achievements dashboard plugin (https://github.com/PCinkusz/hermes-achievements) as a bundled plugin at plugins/hermes-achievements/ and fixes a bug in the scan path that made the plugin only see the first 200 sessions — making lifetime badges (50k tool calls, 75k errors, etc.) unreachable on long-running installs. Changes: - plugins/hermes-achievements/: vendor v0.3.1 verbatim (manifest, dist/, plugin_api.py, tests, docs, README). - plugins/hermes-achievements/dashboard/plugin_api.py: * scan_sessions(): limit=None now scans ALL sessions via SQLite LIMIT -1. Previously capped at 200, so users with 8000+ sessions saw ~2% of their history. * evaluate_all(): first-ever scans run in a background thread so the dashboard request path never blocks. Stale snapshots serve immediately while a background refresh runs. force=True still blocks synchronously for manual /rescan. * _build_pending_snapshot(), _start_background_scan(), _run_scan_and_update_cache(): supporting plumbing + idempotent thread spawn. - tests/plugins/test_achievements_plugin.py: new tests covering the 200-cap regression, the background-scan first-run flow, stale-serve-plus-background-refresh, forced sync rescan, and scan-thread idempotency. - website/docs/user-guide/features/built-in-plugins.md: lists hermes-achievements in the bundled-plugins table and documents API endpoints, state files, and performance characteristics. E2E validated against a real 8564-session ~6.4GB state.db: * Cold scan: 13m 19s (one-time, backgrounded — UI never blocks) * Warm rescan: 1.47s (8563/8564 sessions reused from checkpoint cache) * 57/60 achievements unlocked, 3 discovered — aggregates like total_tool_calls=259958, total_errors=164213, skill_events=368243 correctly surface lifetime badges that the 200-cap made unreachable. Original credit: @PCinkusz (MIT-licensed). Upstream repo remains the staging ground for new badges; this bundle keeps the dashboard feature parity with Hermes core changes. * feat(achievements): publish partial snapshots during cold scan Previously a cold scan on a large session DB (13min on 8564 sessions) showed zero badges for the entire duration, then every badge at once when the scan completed. A dashboard refresh mid-scan was indistinguishable from a fresh install with no history. Now the scanner publishes a partial snapshot to _SNAPSHOT_CACHE every 250 sessions, so each refresh during a cold scan surfaces more badges incrementally. Mechanism: - scan_sessions() takes an optional progress_callback fired every progress_every sessions with (sessions_so_far, scanned, total). - _compute_from_scan() is extracted from compute_all() and gains an is_partial flag that skips writing to state.json — we don't want to record unlocked_at based on a half-complete aggregate that a later session might rebalance. - _run_scan_and_update_cache() installs a publisher callback that builds a partial snapshot, marks it mode='in_progress', and writes it to the cache with age=0 so the UI keeps polling /scan-status and picks up the final snapshot when the scan completes. - Manual /rescan (force=True) disables partial publishing — the caller is blocking on the final result anyway. E2E against real 8564-session state.db (polled cache every 10s): t=10s: cache empty t=20s: 250/8564 scanned, 35 unlocked, 25 discovered t=40s: 500/8564 scanned, 42 unlocked, 18 discovered t=60s: 1000/8564 scanned, 49 unlocked, 11 discovered ... Tests: 9/9 pass (2 new — partial snapshot publication + no-persist-on-partial). Upstream unittest suite: 10/10 pass. * feat(achievements): in-progress scan banner with live % progress Previously the dashboard showed zero badges silently during long cold scans (13min on 8564 sessions). The backend was publishing partial snapshots every 250 sessions, but the bundled UI didn't surface any indicator that a scan was running — it just rendered the main page with whatever counts were currently published and no way for the user to know more progress was coming. UI changes (dist/index.js, dist/style.css): - Added a scan-in-progress banner rendered between the hero and stats when scan_meta.mode is 'pending' or 'in_progress'. Shows: BUILDING ACHIEVEMENT PROFILE… Scanned 1,750 of 8,564 sessions · 20%. Badges unlock as more history streams in. with a pulsing teal indicator and a filling teal/cyan progress bar. Disappears the moment the backend flips to 'full' or 'incremental'. - Added an auto-poller via useEffect — while scanInFlight is true the page re-fetches /achievements every 4s WITHOUT toggling the loading skeleton, so unlock counts tick up visibly without the user refreshing. The effect cleans itself up when the scan finishes. - Added refresh() (re-fetch, no loading flip) alongside the existing load() (full reload, used by the Rescan button). Attribution preserved: - Added a header comment to index.js crediting @PCinkusz (https://github.com/PCinkusz/hermes-achievements, MIT) as the original author, noting the banner is a layered addition on top of the original dist bundle. - Matching header comment in style.css, flagging the new .ha-scan-banner* rules as the local addition. Live-verified end to end: - Spun up `hermes dashboard --port 9229 --no-open` against a fresh HERMES_HOME symlinked to the real 8564-session state.db. - Opened /achievements in a browser, confirmed the banner renders with live progress: 'Scanned 1,000 of 8,564 sessions · 11%' → updates to '1,250 ... · 14%' → '1,750 ... · 20%' without user interaction, matching the backend's partial publications. - Stats row simultaneously climbed from 35 → 49 → 53 unlocked as more history streamed in. - Vision analysis of the rendered page confirms the banner styling matches the rest of the dashboard (dark card bg, teal accent, same small-caps typography, pulsing indicator reusing ha-pulse keyframes). |
||
|
|
4d363499db |
feat(plugins): bundled platform plugins auto-load by default
Platform plugins shipped in-repo under plugins/platforms/ should be
available out of the box — users shouldn't have to add 'irc-platform'
to plugins.enabled before they can pick IRC from the gateway setup menu.
Adds a new ``kind: platform`` plugin type that mirrors the existing
``kind: backend`` auto-load semantics:
- Bundled (shipped in the hermes-agent repo): auto-load unconditionally.
- User-installed (~/.hermes/plugins/): still opt-in via plugins.enabled
so untrusted code doesn't silently run.
Changes:
* hermes_cli/plugins.py: add 'platform' to _VALID_PLUGIN_KINDS, document
the new kind in the PluginManifest docstring, extend the bundled auto-
load rule from 'backend only' to 'backend or platform'.
* plugins/platforms/irc/plugin.yaml: declare kind: platform.
* hermes_cli/gateway.py: remove the now-redundant
_load_bundled_platform_plugins_for_enumeration() helper and the
_enable_plugin_for_platform() helper. The setup menu's _all_platforms()
just calls discover_plugins() and reads the registry — bundled
platforms are already loaded at that point. Drops the 'needs_enable'
flag and the 'plugin disabled — select to enable' status string.
* hermes_cli/setup.py: relax the "gateway is configured" detector used
during OpenClaw migration. Switching to _platform_status() in an
earlier commit tightened the check to require an exact "configured"
match, dropping platforms whose status is "enabled, not paired",
"partially configured", "configured + E2EE", etc. Now any non-"not
configured" status counts — the user has already started setup there
and we shouldn't force the section to rerun.
* tests/hermes_cli/test_setup_irc.py: drop the TestIRCPluginDisabledFlow
class and test_configure_platform_enables_disabled_plugin_first — the
no-longer-existent flow they were testing.
* tests/hermes_cli/test_setup_openclaw_migration.py: patch both
setup.get_env_value and gateway.get_env_value in the 4 gateway-section
tests that reach _platform_status() through the unified setup flow;
switch WHATSAPP_ENABLED to the literal "true" in the registry-parity
test so WhatsApp's value-shape validator matches.
Verified via fresh-install smoke (empty plugins.enabled, no env vars):
IRC plugin loads, Platform('irc') resolves, _all_platforms() lists IRC
with status 'not configured'. 160 targeted tests pass.
|
||
|
|
868bc1c242 |
feat(irc): add interactive setup
feat(gateway): refine Platform._missing_ and platform-connected dispatch Restricts plugin-name acceptance to bundled plugin scan + registry (no arbitrary string -> enum-pollution), pulls per-platform connectivity checks into a _PLATFORM_CONNECTED_CHECKERS lambda map with a clean _is_platform_connected method, and adds tests covering the checker map, plugin platform interface, and IRC setup wizard. |
||
|
|
1f1608067c |
feat(gateway): unify setup flows, load platforms dynamically from registry
Merge the two gateway setup paths (hermes setup gateway + hermes gateway setup) to use a single _unified_platforms() list that merges built-in _PLATFORMS with dynamically registered plugin entries from platform_registry. - Add setup_fn field to PlatformEntry for plugin setup flows - _unified_platforms() merges built-ins with registry entries by key - setup_gateway() now uses unified list instead of hardcoded _GATEWAY_PLATFORMS tuple list - gateway_setup() uses same unified list, plugin entries appear alongside built-ins with no [plugin] suffix - _platform_status() handles plugin platforms via registry check_fn - Plugin platforms with setup_fn get called directly; plugins without get a generic env-var display fallback IRC and other plugin platforms now appear automatically in the setup menu when registered via platform_registry.register(). feat(gateway): surface disabled platform plugins in setup and auto-enable on select Platform plugins under plugins/platforms/* (IRC, etc.) were gated behind plugins.enabled, so `hermes gateway setup` wouldn't list them until the user ran `hermes plugins enable <name>` first. Now the setup menu always surfaces them as "plugin disabled — select to enable", and picking one adds it to plugins.enabled before running its setup flow. Along the way, unify the two gateway setup flows so `hermes setup gateway` and `hermes gateway setup` both read from the same platform list (built-in _PLATFORMS + platform_registry entries), dispatch through a single _configure_platform() helper, and share _platform_status(). Deletes the dead bespoke wrappers in setup.py (_setup_whatsapp, _setup_weixin, _setup_email, etc.) that duplicated logic now covered by the registry path or _setup_standard_platform. Also: - PlatformEntry gains a plugin_name field so the registry knows which plugin owns each entry (required for auto-enable). - PluginContext.register_platform auto-stamps plugin_name from the manifest so plugins don't have to pass it explicitly. - PluginManager now scans plugins/platforms/* as its own category root, one level below the bundled plugin scan. - Fix IRC plugin discovery: rename PLUGIN.yaml → plugin.yaml (the scanner is case-sensitive) and add the missing __init__.py that _load_directory_module requires. |
||
|
|
e464cde58f |
feat: final platform plugin parity — webhook delivery, platform hints, docs
Closes remaining functional gaps and adds documentation.
webhook.py: Cross-platform delivery now checks the plugin registry
for unknown platform names instead of hardcoding 15 names in a tuple.
Plugin platforms can receive webhook-routed deliveries.
prompt_builder: Platform hints (system prompt LLM guidance) now fall
back to the plugin registry's platform_hint field. Plugin platforms
can tell the LLM 'you're on IRC, no markdown.'
PlatformEntry: Added platform_hint field for LLM guidance injection.
IRC adapter: Added acquire_scoped_lock/release_scoped_lock in
connect/disconnect to prevent two profiles from using the same IRC
identity. Added platform_hint for IRC-specific LLM guidance.
Removed dead token-empty-warning extension for plugin platforms
(plugin adapters handle their own env vars via check_fn).
website/docs/developer-guide/adding-platform-adapters.md:
- Added 'Plugin Path (Recommended)' section with full code examples,
PLUGIN.yaml template, config.yaml examples, and a table showing all
18 integration points the plugin system handles automatically
- Renamed built-in checklist to clarify it's for core contributors
gateway/platforms/ADDING_A_PLATFORM.md:
- Added Plugin Path section pointing to the reference implementation
and full docs guide
- Clarified built-in path is for core contributors only
|
||
|
|
2e20f6ae2d |
feat: complete plugin platform parity — all 12 integration points
Extends the platform plugin interface from Phase 1 to cover every touchpoint where built-in platforms have hardcoded behavior. - allowed_users_env / allow_all_env: per-platform auth env vars - max_message_length: smart-chunking for send_message tool - pii_safe: session PII redaction flag - emoji: CLI/gateway display - allow_update_command: /update access control send_message tool (tools/send_message_tool.py): - Replaced hardcoded platform_map dict with Platform() call - Added _send_via_adapter() for plugin platforms — routes through live gateway adapter when available - Registry-aware max message length for smart chunking Cron delivery (cron/scheduler.py): - Replaced hardcoded 15-entry platform_map with Platform() call - Plugin platforms now work as cron delivery targets User authorization (gateway/run.py _is_user_authorized): - Registry fallback: checks PlatformEntry.allowed_users_env and allow_all_env when platform not in hardcoded maps - Plugin platforms get per-platform auth support _UPDATE_ALLOWED_PLATFORMS: checks registry allow_update_command flag Channel directory: includes plugin platforms in session enumeration Orphaned config warning: descriptive message when plugin platform is in config but no plugin registered it Gateway weakref: _gateway_runner_ref for cross-module adapter access hermes status: shows plugin platforms with (plugin) tag hermes gateway setup: plugin platforms appear in menu with setup hints hermes_cli/platforms.py: get_all_platforms() merges with registry, platform_label() falls back to registry for plugin names - 8 new tests (extended fields, cron resolution, platforms merge) - Updated 3 tests for new Platform() based resolution - 2829 passed, 24 pre-existing failures, zero new failures |
||
|
|
8f144fe36b |
feat: pluggable platform adapter registry + IRC reference implementation
Adds a platform adapter plugin interface so anyone can create new gateway
platforms (IRC, Viber, Line, etc.) as drop-in plugins without modifying
core gateway code.
- PlatformEntry dataclass: name, label, adapter_factory, check_fn,
validate_config, required_env, install_hint, source
- PlatformRegistry singleton with register/unregister/create_adapter
- _create_adapter() in gateway/run.py checks registry first, falls
through to existing if/elif chain for built-in platforms
- Platform._missing_() accepts unknown string values, creating cached
pseudo-members so Platform('irc') is Platform('irc') holds true
- GatewayConfig.from_dict() now parses plugin platform names from
config.yaml without rejecting them
- get_connected_platforms() delegates to registry for unknown platforms
- PluginContext.register_platform() for plugin authors
- Mirrors the existing register_tool() / register_hook() pattern
- Full async IRC adapter using stdlib asyncio (zero external deps)
- Connects via TLS, handles PING/PONG, nick collision, NickServ auth
- Channel messages require addressing (nick: msg), DMs always dispatch
- Markdown stripping for IRC-clean output, message splitting for
512-byte line limit
- Config via config.yaml extra dict or IRC_* env vars
- Platform enum dynamic members (identity stability, case normalization)
- PlatformRegistry (register, unregister, create, validation, factory)
- GatewayConfig integration (from_dict parsing, get_connected_platforms)
- IRC adapter (init, send, protocol parsing, markdown, requirements)
No existing platform adapters were migrated — the if/elif chain is
untouched. This is Phase 1: prove the interface with a real plugin.
|
||
|
|
0a5ee01e48 |
fix(hindsight): route flush-on-switch through writer queue, not raw thread
Follow-up to the cherry-picked PR #17447. The original flush spawned a bare threading.Thread for the buffer-flush path, overwriting self._sync_thread — which is aliased to the long-lived writer thread. Two consequences: 1. No serialization with the writer queue. If old-session retains were still queued in _retain_queue, the flush ran concurrently with the writer and both threads could call aretain_batch against the same document_id. 2. The pre-spawn 'self._sync_thread.join(timeout=5.0)' tried to join the long-lived writer, which never exits, so the join was a no-op that just timed out — never actually serialized anything. Fix: enqueue the flush closure on _retain_queue via _ensure_writer + put(). Natural FIFO ordering behind any pending retains, no new thread, no broken join. Shutdown-aware so it doesn't enqueue after teardown. Tests updated to drain via _retain_queue.join() instead of the stale _sync_thread.join(). Added regression guard test_flush_serializes_behind_pending_retains_via_writer_queue that blocks the writer mid-retain to prove the flush waits in FIFO behind the old retain. Also seeds _retain_queue / _shutting_down / stubbed _ensure_writer on the bare-object test helper in test_memory_session_switch.py so that path doesn't blow up under the new queue-enqueue. tests/plugins/memory/test_hindsight_provider.py + tests/agent/test_memory_session_switch.py: 103/103 passing. |
||
|
|
c38dac742b |
fix(hindsight): flush buffered turns and drop stale prefetch on session switch
Two data-loss / leak gaps in HindsightMemoryProvider.on_session_switch introduced by #17409. 1. Buffered turns silently lost when retain_every_n_turns > 1. on_session_switch unconditionally cleared _session_turns without flushing. Users who batched every N>1 turns and switched mid-batch (/reset, /new, /resume, /branch, or context compression) had those buffered turns disappear. Same data-loss class as the shutdown race, different lifecycle event. Note commit_memory_session() -> on_session_end() runs *before* on_session_switch on /reset, but Hindsight doesn't implement on_session_end so the buffer survives that step and dies at clear time. /resume, /branch, and compression skip commit_memory_session entirely so an on_session_end impl wouldn't help them anyway. Fix: snapshot the old _session_id, _document_id, _parent_session_id, _turn_index, and _session_turns; spawn one final retain that lands under the OLD document_id; then rotate state. Metadata is built synchronously against the old self._* so session_id / lineage tags on the flushed item all reference the prior session consistently. 2. Stale _prefetch_result leaks across switch. If queue_prefetch ran in the old session and the result hadn't been consumed by prefetch() yet, on_session_switch left the cached recall text in place. The next session's first prefetch() call would return text mined from the prior session's bank/query. Fix: join any in-flight _prefetch_thread (3s bounded — matches shutdown()), then clear _prefetch_result under _prefetch_lock before rotating session_id. Tests ----- - tests/plugins/memory/test_hindsight_provider.py (TestSessionSwitchBufferFlush): - buffered turns flushed under OLD document_id with OLD lineage tags - empty buffer => no spurious retain - _prefetch_result cleared on switch - in-flight prefetch thread is awaited before clear (no race) - tests/agent/test_memory_session_switch.py: factory extended to seed the attrs the new flush path reads (_retain_source, _platform, _bank_id, prefetch state, etc.) and stub _run_hindsight_operation so existing switch-state assertions keep passing without network setup. |
||
|
|
0565497dcc |
fix(hindsight): drain retain queue cleanly on shutdown
The plugin used to spawn one daemon thread per sync_turn() to do the
aretain_batch network write. On CLI exit, that pattern raced interpreter
shutdown — the last retain could reach aiohttp after asyncio's
"cannot schedule new futures" guard had fired, producing noisy logs and
silently losing the final unsaved turn:
WARNING ... Hindsight sync failed: cannot schedule new futures after
interpreter shutdown
ERROR asyncio: Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x...>
Switch to a single-writer model: each provider owns one long-lived
writer thread plus a queue. sync_turn() snapshots state and enqueues a
job; the writer drains sequentially. Once shutdown() is called:
- new sync_turn() / queue_prefetch() calls are dropped, not enqueued
- a sentinel wakes the writer so it finishes in-flight work
- shutdown joins the writer (10s) before nulling the client
Also register an idempotent atexit hook from the first sync_turn(), so
exit paths that don't go through MemoryManager.shutdown_all() (Ctrl-C,
abrupt exit) still get a chance to drain.
Tests: keep _sync_thread as a legacy alias to the writer, swap join()
calls to _retain_queue.join() (canonical wait-for-drain), add a new
TestShutdownRace suite covering single-writer reuse, post-shutdown drop,
queue draining, and shutdown idempotency.
|
||
|
|
13683c0842 |
feat(memory): notify providers on mid-process session_id rotation (#17409)
Fixes #6672 Memory providers now receive on_session_switch() whenever AIAgent.session_id rotates mid-process — /resume, /branch, /reset, /new, and context compression. Before this, providers that cached per-session state in initialize() (Hindsight's _session_id, _document_id, accumulated _session_turns, _turn_counter) kept writing into the old session's record after the agent had moved on. MemoryProvider ABC ------------------ - New optional hook on_session_switch(new_session_id, *, parent_session_id='', reset=False, **kwargs) with no-op default for backward compat. reset=True signals /reset or /new — providers should flush accumulated per-session buffers. reset=False for /resume, /branch, compression where the logical conversation continues. MemoryManager ------------- - on_session_switch() fans the hook out to every registered provider. Isolated try/except per provider — one bad provider can't block others. - Empty/None new_session_id is a no-op to avoid corrupting provider state during shutdown paths. run_agent.py ------------ - _sync_external_memory_for_turn now passes session_id=self.session_id into sync_all() and queue_prefetch_all(). Providers with defensive session_id updates in sync_turn (Hindsight already had this at plugins/memory/hindsight/__init__.py:1199) now actually receive the current id. - Compression block at ~L8884 already notified the context engine of the rollover; now also calls _memory_manager.on_session_switch(reason='compression'). cli.py ------ - new_session() fires reset=True, reason='new_session' so providers flush buffers. - _handle_resume_command fires reset=False, reason='resume' with the previous session as parent_session_id. - _handle_branch_command fires reset=False, reason='branch' with the parent session_id already captured for the DB parent link. gateway/run.py -------------- - _handle_resume_command now evicts the cached AIAgent, mirroring /branch and /reset. The next message rebuilds a fresh agent whose memory provider initialize() runs with the correct session_id — matches the pattern the gateway already uses for provider state cross-session transitions. Hindsight reference implementation ---------------------------------- - plugins/memory/hindsight/__init__.py adds on_session_switch that: updates _session_id, mints a fresh _document_id (prevents vectorize-io/hindsight#1303 overwrite), and clears _session_turns / _turn_counter / _turn_index so in-flight batches don't flush under the new document id. parent_session_id only overwritten when provided (avoids clobbering on a bare switch). Tests ----- - tests/agent/test_memory_session_switch.py: new dedicated file. ABC default no-op, manager fan-out, failure isolation, empty-id no-op, session_id propagation through sync_all/queue_prefetch_all, Hindsight state transitions for every reset/non-reset case, parent preservation. - tests/cli/test_branch_command.py: new test verifying /branch fires the hook with correct parent_session_id + reset=False + reason. - tests/gateway/test_resume_command.py: new test verifying /resume evicts the cached agent. - tests/run_agent/test_memory_sync_interrupted.py: updated existing assertions to account for the session_id kwarg on sync_all and queue_prefetch_all. E2E verified (real imports, tmp HERMES_HOME): - /resume: session_id updates, doc_id fresh, buffers cleared, parent set - /branch: session_id forks, parent links to original - /new: reset=True clears accumulated state - compression: reason='compression' propagated, lineage preserved - Empty id: no-op, state preserved - Legacy provider without on_session_switch: no crash Reported by @nicoloboschi (Hindsight maintainer); related scope-widening comment by @kidonng extending coverage to compression. |
||
|
|
059980727a |
refactor(config): migrate remaining 33 cfg_get call sites (#17311)
Completes the cfg_get migration started in PR #17304. Covers the remaining hermes_cli/ and plugins/ config-access sites that the first PR intentionally left opportunistic. Migrated (33 sites across 14 files): hermes_cli/setup.py 13 sites (terminal.*, agent.*, display.*, compression.*, tts.*) hermes_cli/tools_config.py 7 sites (tts.*, browser.*, web.*, platform_toolsets.*) hermes_cli/plugins_cmd.py 3 sites (plugins.*, memory.*, context.*) plugins/memory/honcho/cli.py 3 sites (hosts.*) hermes_cli/web_server.py 1 site (dashboard.*) hermes_cli/skills_config.py 1 site (platform_disabled) hermes_cli/plugins.py 1 site (plugins.disabled) hermes_cli/status.py 1 site (terminal.backend) hermes_cli/mcp_config.py 1 site (mcp_servers.*) hermes_cli/webhook.py 1 site (platforms.webhook) plugins/memory/__init__.py 1 site (memory.provider) plugins/memory/hindsight/ 1 site (banks.hermes) plugins/memory/holographic/ 1 site (plugins.hermes-memory-store) run_agent.py 1 site (auxiliary.compression) The helper supports non-literal keys too, so e.g. cfg.get('hosts', {}).get(HOST, {}) becomes cfg_get(cfg, 'hosts', HOST, default={}) Migration bugs caught and fixed during this PR: 1. An AST-based batch rewrite naïvely captured the first word token in a chain, which corrupted 'self._config.get(...).get(...)' into 'self.cfg_get(_config, ...)' (dropping 'self.', creating a broken method call). Plugins/memory/hindsight caught it via its test suite. Fixed manually to 'cfg_get(self._config, ...)'. 2. Import-extension heuristic rewrote multi-line parenthesized imports ('from X import (\n A,\n B,\n)') as 'from X import cfg_get, (' — syntactically broken. Fixed by inserting cfg_get as the first name inside the parentheses. Combined with PR #17304, the cfg_get migration now covers: PR #17304 (first batch): 20 sites in tools/ + gateway/ PR #17317 (this one): 33 sites in hermes_cli/ + plugins/ + run_agent.py Total: 53 sites migrated. Remaining ~8 sites are either: - Function-call chains (e.g. '_load_stt_config().get(...).get(...)') that would need double-evaluation or a local binding to migrate cleanly — intentionally deferred. - JSON response-navigation (e.g. 'response_data.get('data',{}).get('web')) which is unrelated to config access and shouldn't use cfg_get. Verified: - 412/412 tests/plugins/ pass (including the hindsight test that caught the self.X regex bug before commit) - 3181/3189 tests/hermes_cli/ pass (8 pre-existing failures on main, verified by git-stash comparison) - Live 'hermes status' and 'hermes config' render correctly (exercise the migrated terminal.backend, tts.provider, browser.cloud_provider, compression.threshold, display.tool_progress sites) - Live 'hermes chat': 1 turn + /quit, zero errors in 11-line log window No semantic changes — cfg_get was already proven to be a 1:1 match for the original .get("X",{}).get("Y",default) pattern in PR #17304. |
||
|
|
42cc905c13 |
feat(plugins): add bundled observability/langfuse plugin
Opt-in Langfuse tracing for Hermes conversations — LLM calls, tool usage, usage/cost breakdown per span. Hooks into pre/post_api_request, pre/post_llm_call, pre/post_tool_call. SDK is optional; missing SDK or credentials renders the plugin inert. Salvaged from PR #16845 by @kshitijk4poor, who wrote the plugin (~875 LOC, 6 hooks, Langfuse usage-details/cost-details normalization, read_file payload summarization). Salvage scope (why this isn't PR #16845 as-authored): - Lives at plugins/observability/langfuse/ (standalone kind, opt-in via plugins.enabled) instead of a new parallel optional-plugins/ directory. Standalone bundled plugins are already opt-in — only their plugin.yaml is scanned at startup; the Python module is not imported unless the user enables it. The premise of optional-plugins/ (avoid import cost for users who don't want it) is already solved by the existing plugin system. - Dropped the triple activation gate (plugins.enabled + plugins.langfuse.enabled + HERMES_LANGFUSE_ENABLED). The Hermes plugin system's own enable/disable is authoritative; runtime credentials gate whether the hook actually traces. - Rewrote _is_enabled() → cached _get_langfuse() with an _INIT_FAILED sentinel. The original called hermes_cli.config.load_config() from every hook invocation (full yaml parse + deep merge + env expansion on every pre/post_tool_call, potentially 100+ times per turn). The cached version reads env once and returns the cached client or None on every subsequent call with zero further work. - hermes tools → Langfuse Observability post-setup adds observability/langfuse to plugins.enabled directly (via _save_enabled_set) instead of going through an install-copy flow. Enable: hermes tools # interactive hermes plugins enable observability/langfuse # manual Required env (set by `hermes tools` or in ~/.hermes/.env): HERMES_LANGFUSE_PUBLIC_KEY HERMES_LANGFUSE_SECRET_KEY HERMES_LANGFUSE_BASE_URL # optional Co-authored-by: kshitijk4poor <kshitijk4poor@gmail.com> |
||
|
|
894e0b935b |
feat(honcho): explain why when honcho_profile returns an empty card
Closed PR #5137 addressed the retrieval path (peer cards via get_card() instead of the session-scoped lookup that returned empty for per-session messaging flows) — that architectural fix is already in main as _fetch_peer_card / _fetch_peer_context. What never got fixed is the user-visible side: honcho_profile returning a flat 'No profile facts available yet.' leaves the model to guess at why. The model then often surfaces it to the user as a cryptic error. Adds a diagnostic hint next to the existing 'result' message, enumerating the likely causes in rough order of frequency: 1. Observation disabled for this peer (user_observe_me/others off) 2. Peer card hasn't accumulated yet (fresh peer / dialectic cadence hasn't fired enough turns — cards build over time) 3. Generic fallback: self-hosted Honcho < 3.x lacks peer cards The hint also suggests alternative tools (honcho_reasoning / honcho_search) so the model can route around the empty card rather than giving up. Schema description updated so the model knows the hint field exists and that an empty card is NOT an error state. 7 tests cover the hint paths: warmup, observation-disabled for user + ai, generic fallback, populated card still returns plain result (no hint), alternative-tool suggestion present. |
||
|
|
5883df5574 |
fix(honcho): keep legacy schemeless baseUrl configs working
The scheme-validation commit (e77a3f2c) was too strict: a user with
legacy ''baseUrl: localhost:8000'' (no ''http://'' prefix) in their
''~/.honcho/config.json'' would get ''No API key configured'' from the
CLI after that change, even though their setup worked before.
urlparse on a schemeless host:port treats the host segment as the
scheme and leaves netloc empty, so the http/https check rejected it.
Falls back to a lenient check for schemeless strings that look like
hosts: contain '.' or ':', aren't a boolean/null literal, aren't pure
digits. The SDK still rejects truly malformed URLs at connect time
with a clearer error than ours.
Three new tests: legacy schemeless hosts accepted; obvious garbage
literals (''true'', ''null'', ''12345'') still rejected. Reviewer
noted concern #1: schemeless regression for self-hosters with old
configs.
|
||
|
|
cd276eef78 |
compat(honcho): accept metadata kwarg on on_memory_write ABC bump
main's
|
||
|
|
02ab255a0d |
style(honcho): hoist hashlib import; validate baseUrl scheme before 'local' sentinel
Two small follow-ups to the PR review: - Hoist hashlib import from _enforce_session_id_limit() to module top. stdlib imports are free after first cache, but keeping all imports at module top matches the rest of the codebase. - _resolve_api_key now URL-parses baseUrl and requires http/https + non-empty netloc before returning the 'local' sentinel. A typo like baseUrl: 'true' (or bare 'localhost') no longer silently passes the credential guard; the CLI correctly reports 'not configured'. Three new tests cover the new validation (garbage strings, non-http schemes, valid https). |
||
|
|
5d349ea857 |
fix(honcho): hold RLock across new_session's get_or_create to close race
new_session() was popping the old cached session, releasing the lock, calling get_or_create, then re-acquiring the lock to insert. A concurrent caller could observe the empty-cache window and race-create its own session, producing two divergent session objects for the same key. _cache_lock is an RLock, so nested reacquisition inside get_or_create is safe. Hold it across the whole pop/create/insert sequence. Follow-up to #13510 (@hekaru-agent). |
||
|
|
82205276c1 |
fix(plugins/memory/honcho): default Honcho SDK HTTP timeout to 30s
When no explicit timeout is configured (HonchoClientConfig.timeout, honcho.timeout / requestTimeout, or HONCHO_TIMEOUT), get_honcho_client previously constructed the SDK with no timeout kwarg, letting the underlying httpx client hang indefinitely if the Honcho backend became unreachable mid-request. This is a silent-failure hazard on the post-response path of run_conversation: the memory_manager.sync_all() / queue_prefetch_all() calls fire after the agent has already generated its final reply, so a stalled Honcho request blocks run_conversation from returning. The gateway never logs "response ready" and never delivers the response to the platform (Telegram, etc.), even though the text is already saved to the session file. Repro: unplug the network or block app.honcho.dev mid-turn after the model has produced its final message. Without this change, _run_agent never returns. With it, the call aborts after 30s, run_conversation returns, and the gateway delivers the response (Honcho sync failure is logged and swallowed as before). The default applies only when nothing is configured, so any deployment that has explicitly set timeout / HONCHO_TIMEOUT / honcho.timeout / honcho.requestTimeout keeps its existing value. Self-hosted deployments that genuinely need a longer ceiling can still override via any of those knobs. |
||
|
|
36d6b643f6 |
fix(honcho): CLI credential guard rejects self-hosted baseUrl configs
_resolve_api_key() only checks for apiKey / HONCHO_API_KEY, so all CLI subcommands (identity --show, status, migrate, etc.) bail with "No API key configured" on self-hosted instances that use baseUrl without an API key. Return "local" when baseUrl or HONCHO_BASE_URL is set, matching the client.py behavior that already handles this case for the SDK. Tested on: macOS, self-hosted Honcho (Docker, localhost:8000). |
||
|
|
5d36871d92 | Fix Honcho HOME-aware global config fallback | ||
|
|
f1ba4014e1 | fix: harden memory-context leak boundaries | ||
|
|
dad0217450 |
fix(honcho): thread-safe session cache via RLock
Wraps _session_cache mutations in threading.RLock. Without this, concurrent gateway sessions (e.g., Telegram + Discord hitting Honcho at the same time) can race on the cache and silently lose conclusions or memory writes. Adopted from #13510 by @hekaru-agent; the off-topic cron/jobs.py cleanup hunk from that PR is dropped here for scope isolation. Resolved a small conflict with the pinPeerName guard (kept both). |
||
|
|
cd1c4812ab |
fix(honcho): truncate resolve_session_name output to Honcho's 100-char limit (#13868)
Gateway session keys (Matrix "!room:server" + thread event IDs, Telegram supergroup reply chains, Slack thread IDs with long workspace prefixes) can exceed Honcho's 100-character session ID limit after sanitization. Every Honcho API call for those sessions then 400s with "session_id too long". Add a helper that enforces the 100-char limit after sanitization: short keys (the common case) short-circuit unchanged; over-limit keys keep a prefix and append a deterministic `-<8 hex>` SHA-256 suffix over the original key so two long keys sharing a leading segment can't collide onto the same truncated ID. Adds 7 regression tests in tests/honcho_plugin/test_client.py covering short / exact-limit / long / deterministic / collision-resistant / allowlist-preserving / hash-suffix-present cases. |
||
|
|
326c9daa69 |
fix(honcho): require strict True for pin_peer_name to survive MagicMock configs (#15162)
CI caught that ``test_session_manager_prefers_runtime_user_id_over_config_peer_name`` in ``tests/agent/test_memory_user_id.py`` failed after this branch: that test passes a ``MagicMock`` for ``config``, where ``mock.pin_peer_name`` silently returns another ``MagicMock`` — truthy by default. My ``getattr(..., "pin_peer_name", False)`` fallback was supposed to guard against callers that haven't added the new attr, but MagicMock *does* have the attr — it just returns a live mock for it. Tightened the gate to ``getattr(..., False) is True``. Real configs built via ``HonchoClientConfig.from_global_config`` always yield a proper boolean, so strict equality matches the pinned case and rejects both the unset-attr fallback and MagicMock stand-ins. Added a comment explaining why ``is True`` is intentional, not paranoid. Also tightened the ``peer_name`` existence check to ``getattr(..., None)`` so a MagicMock with ``peer_name`` left at its default (also truthy) doesn't spuriously enable pinning either. Verified against both the new ``test_pin_peer_name.py`` suite (13/13 pass) and the previously-failing ``TestHonchoUserIdScoping`` (3/3 pass). Zero behaviour change for real ``HonchoClientConfig`` values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
d03c6fcc45 |
fix(honcho): pinPeerName opt-in keeps memory unified across platforms (#14984)
When a gateway drives Hermes (Telegram, Discord, Slack, ...), it passes the
platform-native user ID as ``runtime_user_peer_name`` into the Honcho
session manager. That ID wins over ``peer_name`` in ``honcho.json``, so a
single user who connects over three platforms ends up as three separate
Honcho peers — one per platform — with fragmented memory and no cross-
platform context continuity.
For multi-user bots this is correct (and must not change): each user gets
their own peer scope. For the vast majority of personal Hermes deployments
the configured ``peer_name`` is an unambiguous identity, though, so the
reporter asked for an opt-in knob that pins the user peer to that value.
Fix: new ``pinPeerName`` boolean on the host config, default ``false``.
When ``true`` AND ``peerName`` is set, the configured peer_name beats the
gateway's runtime identity; every other resolution case is unchanged.
honcho.json:
{
"peerName": "Igor",
"hosts": {
"hermes": { "pinPeerName": true }
}
}
session.py (resolution order, pinned case):
runtime_user_peer_name → skipped (opt-in flag active)
config.peer_name → WINS "Igor"
session-key fallback → unreached
Parsing follows the same host-block-overrides-root pattern as every other
flag in HonchoClientConfig.from_global_config (``_resolve_bool`` helper).
Tests (tests/honcho_plugin/test_pin_peer_name.py — 13 cases, 5 groups):
- Config parsing: default, root true, host-block true, host overrides
root, explicit false.
- Peer resolution: runtime wins by default (regression guard for multi-
user bots), config wins when pinned, pin-without-peer_name is a no-op
(prevents silent peer-id collapse to session-key fallback), CLI path
where runtime is absent, deepest fallback intact, assistant peer
untouched by the flag.
- Cross-platform unification: Telegram UID + Discord snowflake collapse
to one peer when pinned; negative control confirms two distinct
runtime IDs still produce two peers when unpinned.
244 honcho_plugin tests pass, 3 pre-existing skips, zero regressions.
Defensive detail: session.py uses ``getattr(self._config, "pin_peer_name",
False)`` so callers building partial config objects (several test fixtures
across the codebase do this) don't break if they haven't updated yet.
Runtime cost: one attr lookup per new session.
Closes #14984
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
df3c9593f8 |
feat(plugins): google_meet \u2014 join, transcribe, speak, follow up (#16364)
* feat(plugins): google_meet — bundled plugin for join+transcribe Meet calls v1 shipping transcribe-only. Spawns headless Chromium via Playwright, joins an explicit https://meet.google.com/ URL, enables live captions, and scrapes them into a transcript file the agent can read across turns. The agent then has the meeting content in context and can do followup work (send recap, file issues, schedule followups) with its regular tools. Surface: - Tools: meet_join, meet_status, meet_transcript, meet_leave, meet_say (meet_say is a v1 stub — returns not-implemented; v2 will wire realtime duplex audio via OpenAI Realtime / Gemini Live + BlackHole / PulseAudio null-sink.) - CLI: hermes meet setup | auth | join | status | transcript | stop - Lifecycle: on_session_end auto-leaves any still-running bot. Safety: - URL regex rejects anything that isn't https://meet.google.com/... - No calendar scanning, no auto-dial, no auto-consent announcement. - Single active meeting per install; a second meet_join leaves the first. - Platform-gated to Linux + macOS (Windows audio routing for v2 untested). - Opt-in: standalone plugin, user must add 'google_meet' to plugins.enabled in config.yaml. Zero core changes. Plugin uses existing register_tool / register_cli_command / register_hook surfaces. 21 new unit tests cover the URL safety gate, transcript dedup + status round-trip, process-manager refusals/start/stop paths, tool-handler JSON shape under each branch, session-end cleanup, and platform-gated register(). * feat(plugins/google_meet): v2 realtime audio + v3 remote node host v2 \u2014 agent speaks in-meeting audio_bridge.py: PulseAudio null-sink (Linux) + BlackHole probe (macOS). On Linux we load pactl module-null-sink + module-virtual-source, track module ids for teardown; Chrome gets PULSE_SOURCE=<virt src> env so its fake mic reads what we write to the sink. macOS just probes BlackHole 2ch and returns its device name \u2014 the plugin refuses to switch the user's default audio input (that would surprise them). realtime/openai_client.py: sync WebSocket client for the OpenAI Realtime API. RealtimeSession.speak(text) sends conversation.item.create + response.create, accumulates response.audio.delta PCM bytes, appends them to a file. RealtimeSpeaker runs a JSONL-queue loop consuming meet_say calls. 'websockets' is an optional dep imported lazily. meet_bot.py: when HERMES_MEET_MODE=realtime, provisions AudioBridge, starts RealtimeSession + speaker thread, spawns paplay to pump PCM into the null-sink, then cleans everything up on SIGTERM. If any realtime setup step fails, falls back cleanly to transcribe mode with an error flagged in status.json. process_manager.enqueue_say(): writes a JSONL line to say_queue.jsonl; refuses when no active meeting or active meeting is transcribe-only. tools.meet_say: real implementation; requires active mode='realtime'. meet_join: adds mode='transcribe'|'realtime' param. v3 \u2014 remote node host node/protocol.py: JSON envelope (type, id, token, payload) + validate. node/registry.py: $HERMES_HOME/workspace/meetings/nodes.json, with resolve() auto-selecting the sole registered node when name is None. node/server.py: NodeServer \u2014 websockets.serve, bearer-token auth, dispatches start_bot/stop/status/transcript/say/ping onto the local process_manager. Token auto-generated + persisted on first run. node/client.py: NodeClient \u2014 short-lived sync WS per RPC, raises RuntimeError on error envelopes, clean API matching the server. node/cli.py: 'hermes meet node {run,list,approve,remove,status,ping}' subtree; wired into the main meet CLI by cli.py so 'hermes meet node' Just Works. tools.py: every meet_* tool accepts node='<name>'|'auto'; when set, routes through NodeClient to the remote bot instead of running locally. Unknown node \u2192 clear 'no registered meet node matches ...' error. cli.py: 'hermes meet join --node my-mac --mode realtime' and 'hermes meet say "..." --node my-mac' route to the node; 'hermes meet node approve <name> <url> <token>' registers one. Tests 21 v1 tests updated (meet_say is no longer a stub; active-record now carries mode). 20 new audio_bridge + realtime tests. 42 new node tests (protocol/registry/server/client/cli). 17 new v1/v2/v3 integration tests at the plugin level covering enqueue_say edge cases, env var passthrough, mode validation, node routing (known/unknown/auto/ambiguous), and argparse wiring for `hermes meet say` + `hermes meet node` + --mode/--node flags. Total: 100 plugin tests + 58 plugin-system tests = 158 passing. E2E verified on Linux with fresh HERMES_HOME: plugin loads, 5 tools register, on_session_end hook wires, 'hermes meet' CLI tree wires including the node subtree, NodeRegistry round-trips, meet_join routes correctly to NodeClient under node='my-mac' with mode='realtime', enqueue_say accepts realtime/rejects transcribe, argparse parses every new flag cleanly. Zero changes to core. All new code lives under plugins/google_meet/. * feat(plugins/google_meet): auto-install, admission detect, mac PCM pump, barge-in, richer status Ready-for-live-test follow-up on PR #16364. Five additions that matter for the first live run on a real Meet, in priority order: 1. hermes meet install [--realtime] [--yes] pip install playwright websockets + python -m playwright install chromium --realtime: installs platform audio deps (pulseaudio-utils on Linux via sudo apt, blackhole-2ch + ffmpeg on macOS via brew). Prompts before sudo/brew unless --yes. Refuses on Windows. Refuses to auto-flip the macOS default input — user still selects BlackHole in System Settings (deliberate; surprise audio rerouting is worse than a manual step). 2. Admission detection _detect_admission(page): Leave-button visible OR caption region attached OR participants list present → we're in-call. _detect_denied(page): 'You can\'t join this video call' / 'You were removed' / 'No one responded to your request' → bail out. HERMES_MEET_LOBBY_TIMEOUT (default 300s) caps how long we sit in the lobby before giving up. in_call stays False until admitted. Status surfaces leaveReason: duration_expired | lobby_timeout | denied | page_closed. 3. macOS PCM pump ffmpeg reads speaker.pcm (24kHz s16le mono) and writes to the BlackHole AVFoundation output via -f audiotoolbox -audio_device_index <N>. _mac_audio_device_index() probes ffmpeg -f avfoundation -list_devices true to resolve 'BlackHole 2ch' → numeric index. Falls back to index 0 on probe failure. Linux paplay pump unchanged. 4. Richer status dict _BotState now tracks realtime, realtimeReady, realtimeDevice, audioBytesOut, lastAudioOutAt, lastBargeInAt, joinAttemptedAt, leaveReason. RealtimeSession.audio_bytes_out / last_audio_out_at counters fold into the status file once a second so meet_status() can show the agent's voice activity in near-real-time. 5. Barge-in RealtimeSession.cancel_response() sends type='response.cancel' over the same WS (lock-guarded so it's safe to call from the caption thread while speak() is reading frames). Handles response.cancelled as a terminal frame type. _looks_like_human_speaker() gates triggers so the bot's own name, 'You', 'Unknown', and blanks don't self-cancel. Called from the caption drain loop: when a new caption arrives attributed to a real participant while rt.session exists, we fire cancel_response() and stamp lastBargeInAt. Tests: 20 new unit tests across _BotState telemetry, barge-in gating, admission/denied probe error handling, cancel_response with and without a connected WS, and `hermes meet install` CLI wiring (flag parsing + end-to-end subprocess.run verification + Linux-already-installed fast path). Total 171 passing across all google_meet test files + the plugin-system regression suite. E2E verified on Linux: plugin loads, all 5 tools register, `hermes meet install --realtime --yes` parses, fresh-bot status.json has every new telemetry key, cancel_response on a disconnected session returns False without raising, barge-in helper gates the bot's own name correctly. Still out of scope (for a future PR, not blocking live test): mic → Realtime duplex (the agent listening to meeting audio via WebRTC), node-host TLS/pairing UX, Windows audio, Meet create+Twilio. Docs updated: SKILL.md now lists the installer subcommand, lobby timeout, barge-in caveat, and the full status-dict reference table. README.md quick-start uses hermes meet install. |
||
|
|
64a497bfa9 | fix(hindsight): preserve setup config on blank input | ||
|
|
0ba6471dd1 | fix: recover hindsight embedded daemon after idle shutdown | ||
|
|
18beb69b49 |
fix(memory): close embedded Hindsight async client cleanly
HindsightEmbedded.close() delegates to its sync client.close(). When Hermes created/used that client on the shared async loop, closing it from the main thread raises 'attached to a different loop' before aiohttp releases the session — so the ClientSession / TCPConnector leak past provider teardown. Close the embedded inner async client on the shared loop first via _run_sync(inner_client.aclose()), then let the wrapper's sync close() do its daemon/UI bookkeeping. Salvage of #14605: test placement rebased — appended TestShutdown class after TestSharedEventLoopLifecycle (which landed on main after the PR was written). Original author attribution preserved. |
||
|
|
ea01bdcebe |
refactor(memory): remove flush_memories entirely (#15696)
The AIAgent.flush_memories pre-compression save, the gateway _flush_memories_for_session, and everything feeding them are obsolete now that the background memory/skill review handles persistent memory extraction. Problems with flush_memories: - Pre-dates the background review loop. It was the only memory-save path when introduced; the background review now fires every 10 user turns on CLI and gateway alike, which is far more frequent than compression or session reset ever triggered flush. - Blocking and synchronous. Pre-compression flush ran on the live agent before compression, blocking the user-visible response. - Cache-breaking. Flush built a temporary conversation prefix (system prompt + memory-only tool list) that diverged from the live conversation's cached prefix, invalidating prompt caching. The gateway variant spawned a fresh AIAgent with its own clean prompt for each finalized session — still cache-breaking, just in a different process. - Redundant. Background review runs in the live conversation's session context, gets the same content, writes to the same memory store, and doesn't break the cache. Everything flush_memories claimed to preserve is already covered. What this removes: - AIAgent.flush_memories() method (~248 LOC in run_agent.py) - Pre-compression flush call in _compress_context - flush_memories call sites in cli.py (/new + exit) - GatewayRunner._flush_memories_for_session + _async_flush_memories (and the 3 call sites: session expiry watcher, /new, /resume) - 'flush_memories' entry from DEFAULT_CONFIG auxiliary tasks, hermes tools UI task list, auxiliary_client docstrings - _memory_flush_min_turns config + init - #15631's headroom-deduction math in _check_compression_model_feasibility (headroom was only needed because flush dragged the full main-agent system prompt along; the compression summariser sends a single user-role prompt so new_threshold = aux_context is safe again) - The dedicated test files and assertions that exercised flush-specific paths What this renames (with read-time backcompat on sessions.json): - SessionEntry.memory_flushed -> SessionEntry.expiry_finalized. The session-expiry watcher still uses the flag to avoid re-running finalize/eviction on the same expired session; the new name reflects what it now actually gates. from_dict() reads 'expiry_finalized' first, falls back to the legacy 'memory_flushed' key so existing sessions.json files upgrade seamlessly. Supersedes #15631 and #15638. Tested: 383 targeted tests pass across run_agent/, agent/, cli/, and gateway/ session-boundary suites. No behavior regressions — background memory review continues to handle persistent memory extraction on both CLI and gateway. |
||
|
|
af22421e87 |
feat(dashboard): page-scoped plugin slots for built-in pages (#15658)
* fix(terminal): three-layer defense against watch_patterns notification spam Background processes that stack notify_on_complete=True with watch_patterns can flood the user with duplicate, delayed notifications — matches deliver asynchronously via the completion queue and continue arriving minutes after the process has exited. The docstring warning against this (PR #12113) has proven insufficient; agents still misuse the combination. Three layered defenses, each sufficient on its own: 1. Mutual exclusion (terminal_tool.py): When both flags are set on a background process, drop watch_patterns with a warning. notify_on_complete wins because 'let me know when it's done' is the more useful signal and fires exactly once. Extracted as _resolve_notification_flag_conflict() so the rule is testable in isolation. 2. Suppress-after-exit (process_registry.py): _check_watch_patterns() now bails the moment session.exited is True. Post-exit chunks (buffered reads draining after the process is gone) no longer produce notifications. This is the fix flagged as future work in session 20260418_020302_79881c. 3. Global circuit breaker (process_registry.py): Per-session rate limits don't catch the sibling-flood case — N concurrent processes can each stay under 8/10s and still collectively spam. New WATCH_GLOBAL_MAX_PER_WINDOW=15 cap trips a 30-second cooldown across ALL sessions, emits a single watch_overflow_tripped event, silently counts dropped events, and emits a watch_overflow_released summary when the cooldown ends. Also updates the tool schema + docstring to document the new behavior. Tests: 8 new tests covering all three fixes (suppress-after-exit x2, mutual-exclusion resolver x4, global breaker trip/cooldown/release x2). All 60 tests across test_watch_patterns.py, test_notify_on_complete.py, test_terminal_tool.py pass. Real-world trigger: self-inflicted in session 20260425_051924 — three concurrent hermes-sweeper review subprocesses each set watch_patterns= ['failed validation', 'errored'] AND notify_on_complete=True, then iterated over multiple items, producing enough matches per process to defeat the per-session cap while staying under the global cap that didn't yet exist. * fix(terminal): aggressive 1-per-15s watch_patterns rate limit + strike-3 promotion Per Teknium's direction, the watch_patterns rate limit is now much more aggressive and self-healing. ## New rule — per session - HARD cap: 1 watch-match notification per 15 seconds per process. - Any match arriving inside the cooldown window is dropped and counts as ONE strike for that window (many drops in the same window still = 1 strike). - After 3 consecutive strike windows, watch_patterns is permanently disabled for the session and the session is auto-promoted to notify_on_complete semantics — exactly one notification when the process actually exits. - A cooldown window that expires with zero drops resets the consecutive strike counter — healthy cadence is forgiven. ## Schema + docstring rewritten The tool schema description now gives the model explicit guidance: - notify_on_complete is 'the right choice for almost every long-running task' - watch_patterns is for RARE one-shot signals on LONG-LIVED processes - Do NOT use watch_patterns with loops/batch jobs — error patterns fire every iteration and will hit the strike limit fast - Mutual exclusion is stated on both parameter descriptions - 1/15s cooldown and 3-strike promotion are stated in the watch_patterns description so the model sees the contract every turn ## Removed - WATCH_MAX_PER_WINDOW (8/10s) and WATCH_OVERLOAD_KILL_SECONDS (45) — the new 1/15s limit subsumes both; keeping them would double-count. - _watch_window_hits / _watch_window_start / _watch_overload_since fields on ProcessSession. Replaced by _watch_last_emit_at / _watch_cooldown_until / _watch_strike_candidate / _watch_consecutive_strikes. ## Kept - Global circuit breaker across all sessions (15/10s → 30s cooldown) as a secondary safety net for concurrent siblings. Still valuable when 20 short-lived processes each fire once — none individually violates the per-session limit. - Suppress-after-exit guard. - Mutual exclusion resolver at the tool entry point. ## Tests - 6 new tests in TestPerSessionRateLimit covering: first match delivers, second in cooldown suppressed, multi-drop = single strike, 3 strikes disables + promotes, clean window resets counter, suppressed count carried to next emit. - Global circuit breaker tests rewritten to use fresh sessions instead of hacking removed per-window fields. - 50/50 watch_patterns + notify_on_complete tests pass. - 60/60 including test_terminal_tool.py pass. * feat(dashboard): page-scoped plugin slots for built-in pages Dashboard plugins can now inject components into specific built-in pages (Sessions, Analytics, Logs, Cron, Skills, Config, Env, Docs, Chat) without overriding the whole route. Previously, plugins could only: 1. Add new tabs (tab.path) 2. Replace whole built-in pages (tab.override) 3. Inject into global shell slots (header-*, footer-*, pre-main, ...) None of those let a plugin add a banner, card, or widget to an existing page. The new <page>:top / <page>:bottom slots close that gap, reusing the existing registerSlot() API. Changes - web/src/plugins/slots.ts: 18 new KNOWN_SLOT_NAMES entries (sessions:top, sessions:bottom, analytics:top, ..., chat:bottom), grouped under "Shell-wide" vs "Page-scoped" in the docblock - web/src/pages/*: each built-in page now renders <PluginSlot name="<page>:top" /> as the first child of its outer wrapper and <PluginSlot name="<page>:bottom" /> as the last child -- zero visual cost when no plugin registers - plugins/example-dashboard: registers a demo banner into sessions:top via registerSlot(), with matching slots entry in the manifest -- so freshly-setup users can see what page-scoped slots look like without writing any plugin code - website/docs: new "Page-scoped slots" table in the plugin authoring guide, with a worked example - tests/hermes_cli/test_web_server.py: round-trip test for colon-bearing slot names (sessions:top, analytics:bottom, ...) Validation - npm run build: clean (tsc -b + vite build, 2761 modules) - scripts/run_tests.sh tests/hermes_cli/test_web_server.py::TestDashboardPluginManifestExtensions: 5/5 pass |
||
|
|
8d12fb1e6b |
refactor(spotify): convert to built-in bundled plugin under plugins/spotify (#15174)
Moves the Spotify integration from tools/ into plugins/spotify/,
matching the existing pattern established by plugins/image_gen/ for
third-party service integrations.
Why:
- tools/ should be reserved for foundational capabilities (terminal,
read_file, web_search, etc.). tools/providers/ was a one-off
directory created solely for spotify_client.py.
- plugins/ is already the home for image_gen backends, memory
providers, context engines, and standalone hook-based plugins.
Spotify is a third-party service integration and belongs alongside
those, not in tools/.
- Future service integrations (eventually: Deezer, Apple Music, etc.)
now have a pattern to copy.
Changes:
- tools/spotify_tool.py → plugins/spotify/tools.py (handlers + schemas)
- tools/providers/spotify_client.py → plugins/spotify/client.py
- tools/providers/ removed (was only used for Spotify)
- New plugins/spotify/__init__.py with register(ctx) calling
ctx.register_tool() × 7. The handler/check_fn wiring is unchanged.
- New plugins/spotify/plugin.yaml (kind: backend, bundled, auto-load).
- tests/tools/test_spotify_client.py: import paths updated.
tools_config fix — _DEFAULT_OFF_TOOLSETS now wins over plugin auto-enable:
- _get_platform_tools() previously auto-enabled unknown plugin
toolsets for new platforms. That was fine for image_gen (which has
no toolset of its own) but bad for Spotify, which explicitly
requires opt-in (don't ship 7 tool schemas to users who don't use
it). Added a check: if a plugin toolset is in _DEFAULT_OFF_TOOLSETS,
it stays off until the user picks it in 'hermes tools'.
Pre-existing test bug fix:
- tests/hermes_cli/test_plugins.py::test_list_returns_sorted
asserted names were sorted, but list_plugins() sorts by key
(path-derived, e.g. image_gen/openai). With only image_gen plugins
bundled, name and key order happened to agree. Adding plugins/spotify
broke that coincidence (spotify sorts between openai-codex and xai
by name but after xai by key). Updated test to assert key order,
which is what the code actually documents.
Validation:
- scripts/run_tests.sh tests/hermes_cli/test_plugins.py \
tests/hermes_cli/test_tools_config.py \
tests/hermes_cli/test_spotify_auth.py \
tests/tools/test_spotify_client.py \
tests/tools/test_registry.py
→ 143 passed
- E2E plugin load: 'spotify' appears in loaded plugins, all 7 tools
register into the spotify toolset, check_fn gating intact.
|
||
|
|
edff2fbe7e |
feat(hindsight): optional bank_id_template for per-agent / per-user banks
Adds an optional bank_id_template config that derives the bank name at
initialize() time from runtime context. Existing users with a static
bank_id keep the current behavior (template is empty by default).
Supported placeholders:
{profile} — active Hermes profile (agent_identity kwarg)
{workspace} — Hermes workspace (agent_workspace kwarg)
{platform} — cli, telegram, discord, etc.
{user} — platform user id (gateway sessions)
{session} — session id
Unsafe characters in placeholder values are sanitized, and empty
placeholders collapse cleanly (e.g. "hermes-{user}" with no user
becomes "hermes"). If the template renders empty, the static bank_id
is used as a fallback.
Common uses:
bank_id_template: hermes-{profile} # isolate per Hermes profile
bank_id_template: {workspace}-{profile} # workspace + profile scoping
bank_id_template: hermes-{user} # per-user banks for gateway
|
||
|
|
f9c6c5ab84 |
fix(hindsight): scope document_id per process to avoid resume overwrite (#6602)
Reusing session_id as document_id caused data loss on /resume: when
the session is loaded again, _session_turns starts empty and the next
retain replaces the entire previously stored content.
Now each process lifecycle gets its own document_id formed as
{session_id}-{startup_timestamp}, so:
- Same session, same process: turns accumulate into one document (existing behavior)
- Resume (new process, same session): writes a new document, old one preserved
- Forks: child process gets its own document; parent's doc is untouched
Also adds session lineage tags so all processes for the same session
(or its parent) can still be filtered together via recall:
- session:<session_id> on every retain
- parent:<parent_session_id> when initialized with parent_session_id
Closes #6602
|
||
|
|
f1ba2f0c0b |
fix(hindsight): use configured timeout in _run_sync for all async operations
The previous commit added HINDSIGHT_TIMEOUT as a configurable env var, but _run_sync still used the hardcoded _DEFAULT_TIMEOUT (120s). All async operations (recall, retain, reflect, aclose) now go through an instance method that uses self._timeout, so the configured value is actually applied. Also: added backward-compatible alias comment for the module-level function. |