mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-21 03:39:54 +00:00
88ee58f7d2
Follow-up to #28452. detect_stale_running() was calling _record_task_failure() on every reclaim, which ticked the consecutive_failures counter. With the default failure_limit=2, two legitimately long-running tasks (>4 h without explicit heartbeat) would auto-block via the spawn-failure circuit breaker — even though no worker actually failed. Stale reclaim is dispatcher-side absence-of-heartbeat detection, not a worker fault. Removed the _record_task_failure() call; the 'stale' event in task_events is still the audit surface, but the failure counter is now reserved for spawn_failed / timed_out / crashed (real failures). Also documents the heartbeat requirement: - KANBAN_GUIDANCE in agent/prompt_builder.py now states the rule ('call kanban_heartbeat at least once an hour for tasks running longer than 1 hour') so workers learn the contract. - kanban.md adds the stale event row to the events table and flags the heartbeat requirement in the worker lifecycle list. New regression test: test_detect_stale_does_not_tick_failure_counter locks in the new behaviour.
Website
This website is built using Docusaurus, a modern static website generator.
Installation
yarn
Local Development
yarn start
This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.
Build
yarn build
This command generates static content into the build directory and can be served using any static contents hosting service.
Deployment
Using SSH:
USE_SSH=true yarn deploy
Not using SSH:
GIT_USER=<Your GitHub username> yarn deploy
If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the gh-pages branch.
Diagram Linting
CI runs ascii-guard to lint docs for ASCII box diagrams. Use Mermaid (````mermaid`) or plain lists/tables instead of ASCII boxes to avoid CI failures.