mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-21 03:39:54 +00:00
docs: add kanban codex lane skill
@@ -0,0 +1,277 @@
---
name: kanban-codex-lane
description: Use when a Hermes Kanban worker wants to run Codex CLI as an isolated implementation lane while Hermes keeps ownership of task lifecycle, reconciliation, testing, and handoff.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
  hermes:
    tags: [kanban, codex, worktrees, autonomous-agents, prediction-market-bot]
    related_skills: [kanban-worker, codex, hermes-agent]
---

# Kanban Codex Lane

## Overview

This skill defines the lightweight Hermes+Codex dual-lane convention for Kanban workers. Hermes is always the task owner: it calls `kanban_show`, decides whether Codex is appropriate, creates or selects an isolated workspace, starts and monitors Codex, reconciles any diff, runs verification, and writes the final `kanban_complete` or `kanban_block` handoff. Codex is an input lane only. Codex output is not a task completion signal, not a trusted reviewer, and not allowed to write durable Kanban state directly.

The convention exists so a Hermes worker can use Codex for bounded implementation help without changing the dispatcher. The dispatcher must still spawn Hermes workers. A worker may optionally spawn Codex inside its own run, then accept, partially accept, or reject the lane after independent review and tests.

## When to Use

Use the Codex lane when all of these are true:

- The Kanban task is a coding, refactor, documentation, test, or mechanical migration task with clear acceptance criteria.
- A bounded diff can be evaluated by Hermes in one run.
- The repo can be copied or checked out in an isolated git worktree/branch.
- Hermes can run the relevant tests itself after Codex exits.
- The prompt can state all safety constraints and files that must not change.

Do not use the Codex lane when any of these are true:

- The task requires human judgment that is not already captured in the Kanban body.
- The worker lacks repo access, Codex auth, or time to reconcile the result.
- The change touches secrets, credential stores, private user data, or production order-entry systems.
- A small direct edit is faster and safer than spawning another agent.
- The task is research-only and should produce a written handoff rather than a diff.
- The worker would be tempted to mark Done based only on Codex self-report.

## Ownership Rules

1. Hermes owns the Kanban lifecycle. Codex must never call `kanban_complete`, `kanban_block`, `kanban_create`, gateway messaging, or any Hermes board CLI as a substitute for the worker.
2. Hermes owns final acceptance. Treat Codex commits/diffs as untrusted patches until reviewed and verified.
3. Hermes owns test execution. Codex may run tests, but those runs are advisory; repeat required verification from Hermes with the repo's canonical wrapper.
4. Hermes owns safety. If Codex changes safety boundaries, risk gates, live trading behavior, or secrets handling, reject the lane even if tests pass.
5. Hermes owns cleanup. Kill stuck Codex processes and remove temporary worktrees when they are no longer needed.

## Required Worktree and Branch Pattern

Never run Codex directly in a shared dirty checkout. Use a branch/worktree name that ties the lane to the Kanban task and keeps untrusted edits isolated.

Recommended variables:

```bash
TASK_ID="${HERMES_KANBAN_TASK:-t_manual}"
REPO="/path/to/repo"
BASE="$(git -C "$REPO" rev-parse --abbrev-ref HEAD)"
SAFE_TASK="$(printf '%s' "$TASK_ID" | tr -cd '[:alnum:]_-')"
BRANCH="codex/${SAFE_TASK}/$(date -u +%Y%m%d%H%M%S)"
WORKTREE="/tmp/${SAFE_TASK}-codex-lane"
```
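The naming scheme above can also be reproduced outside the shell, for example when a worker wants to compute the names before calling any git command. The following is a minimal sketch, not part of Hermes; `lane_names` is a hypothetical helper, and the regular expression assumes `tr` runs in an ASCII locale.

```python
import re
from datetime import datetime, timezone


def lane_names(task_id: str, now: datetime) -> dict:
    """Derive lane names the same way the shell variables above do (hypothetical helper)."""
    # Mirror `tr -cd '[:alnum:]_-'`: keep only ASCII letters, digits, "_" and "-".
    safe_task = re.sub(r"[^A-Za-z0-9_-]", "", task_id)
    # Mirror `date -u +%Y%m%d%H%M%S`.
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return {
        "safe_task": safe_task,
        "branch": f"codex/{safe_task}/{stamp}",
        "worktree": f"/tmp/{safe_task}-codex-lane",
    }


names = lane_names(
    "t_caa69668; rm -rf /",
    datetime(2026, 5, 8, 10, 0, 0, tzinfo=timezone.utc),
)
print(names["branch"])  # codex/t_caa69668rm-rf/20260508100000
```

Sanitizing the task id before it reaches a branch name or a path keeps shell metacharacters and path separators out of both.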

Create the isolated lane:

```bash
git -C "$REPO" fetch --all --prune
git -C "$REPO" worktree add -b "$BRANCH" "$WORKTREE" "$BASE"
git -C "$WORKTREE" status --short --branch
```

If the current Kanban workspace is already an isolated git worktree created for this task, you may create a sibling Codex branch inside it only if `git status --short` is clean except for intentional Hermes edits. Otherwise create a separate temporary worktree and cherry-pick or copy accepted commits back after reconciliation.

Cleanup after reconciliation:

```bash
git -C "$REPO" worktree remove "$WORKTREE"
git -C "$REPO" branch -D "$BRANCH" # only after accepted commits were copied/cherry-picked or intentionally rejected
```

Keep the worktree if it is needed as an artifact for review; record it in `codex_lane.artifacts` and mention it in the handoff.

## Codex Capability Checks

Run these before spawning Codex. Missing Codex is a normal reason to skip the lane, not a task blocker if Hermes can do the task directly.

```bash
command -v codex
codex --version
codex features list | grep -i goals || true
```

If `/goal` support is required, enable or launch with the feature flag only after checking availability:

```bash
codex features enable goals || true
codex --enable goals --version
```

Authentication can be via `OPENAI_API_KEY` or the Codex CLI OAuth state (often `~/.codex/auth.json`). Do not print token files. A missing `OPENAI_API_KEY` is not proof that auth is unavailable.
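
The binary and auth checks can be combined into one pre-flight probe. This is a sketch under assumptions: `codex_lane_available` is a hypothetical helper, the auth file location is the commonly seen `~/.codex/auth.json` mentioned above, and the file's existence is only a hint that auth may work, not proof.

```python
import os
import shutil
from pathlib import Path


def codex_lane_available(env=None, home=None, which=shutil.which) -> dict:
    """Report whether the Codex binary and some form of auth appear to be present."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    binary = which("codex")  # same question as `command -v codex`
    has_api_key = bool(env.get("OPENAI_API_KEY"))
    # Only test for existence; never open or print the token file.
    has_oauth_state = (home / ".codex" / "auth.json").is_file()
    auth_hint = has_api_key or has_oauth_state
    return {
        "binary": binary,
        "auth_hint": auth_hint,
        "usable": binary is not None and auth_hint,
    }
```

When `usable` is false, skip the lane and record the reason in the metadata instead of blocking the task.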

## Mode Selection

Use `codex exec` for bounded one-shot edits where Codex should exit on its own:

```python
terminal(
    # Double quotes let the shell expand $(cat ...); single quotes would pass that text literally.
    command='codex exec --full-auto "$(cat /tmp/codex_prompt.md)"',
    workdir=WORKTREE,
    background=True,
    pty=True,
    notify_on_complete=True,
)
```

Use Codex `/goal` only for broader multi-step work that benefits from durable objective tracking. Launch interactively in a PTY/tmux session or with `codex --enable goals` if the feature is disabled by default. Keep the goal objective self-contained: repo path, task id, safety constraints, allowed scope, acceptance criteria, tests, and commit expectations.

Example `/goal` objective text to paste into Codex:

```text
/goal Work in this repository only: <WORKTREE>. Task: <TASK_ID> <TITLE>.
Hermes owns the Kanban lifecycle; do not call Hermes kanban tools or messaging.
Create small commits on branch <BRANCH>. Follow the PMB safety constraints in the prompt.
Run the requested verification commands and report exact outputs. Stop after producing a diff and summary.
```

Do not use `--yolo` for prediction-market-bot or safety-sensitive repos. Prefer `--full-auto` inside the isolated worktree, then rely on Hermes reconciliation.

## Prompt Construction

Use the linked template at `templates/pmb-codex-lane-prompt.md` for prediction-market-bot work. For other repos, keep the same structure and replace the PMB-specific safety block with repo-specific invariants.

Every Codex prompt must include:

- `task_id`, title, and full Kanban acceptance criteria.
- Repo path, worktree path, branch name, and allowed file scope.
- Explicit statement: Hermes owns Kanban lifecycle; Codex is an input lane only.
- Required output: concise summary, files changed, commits, tests run, and known risks.
- Prohibited actions: secrets access, external messaging, board mutation, unrelated refactors, dependency upgrades unless required.
- Verification commands Codex may run and commands Hermes will run afterward.

For PMB, include these mandatory safety constraints verbatim:

```text
PMB safety constraints:
- live-SIM is paper-only; do not add or enable live REST order entry.
- Never use market orders.
- Do not add execution crossing or bypass price/risk checks.
- Do not fake passive fills, fills, PnL, order states, or reconciliation evidence.
- Do not weaken risk gates, limits, kill switches, or fail-closed behavior.
- Keep research/selection outside the C++ hot path unless explicitly requested.
- Do not read, print, write, or require secrets/tokens/credentials.
```
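Because the constraints must appear verbatim, a worker could check the assembled prompt before launching Codex. The sketch below is illustrative only; `missing_safety_lines` is a hypothetical helper and it uses a plain substring match, so reworded or re-wrapped constraints count as missing.

```python
PMB_SAFETY_LINES = [
    "live-SIM is paper-only; do not add or enable live REST order entry.",
    "Never use market orders.",
    "Do not add execution crossing or bypass price/risk checks.",
    "Do not fake passive fills, fills, PnL, order states, or reconciliation evidence.",
    "Do not weaken risk gates, limits, kill switches, or fail-closed behavior.",
    "Keep research/selection outside the C++ hot path unless explicitly requested.",
    "Do not read, print, write, or require secrets/tokens/credentials.",
]


def missing_safety_lines(prompt: str) -> list:
    """Return the safety constraints that do not appear verbatim in the prompt."""
    return [line for line in PMB_SAFETY_LINES if line not in prompt]


prompt = "PMB safety constraints:\n" + "\n".join("- " + s for s in PMB_SAFETY_LINES)
print(missing_safety_lines(prompt))  # []
```

A non-empty result is a lane setup failure: fix the prompt rather than starting Codex.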

## Monitoring, Timeout, and Kill Behavior

Start long Codex lanes in the background with PTY and completion notification:

```python
result = terminal(
    # Double quotes let the shell expand $(cat ...); single quotes would pass that text literally.
    command='codex exec --full-auto "$(cat /tmp/codex_prompt.md)"',
    workdir=WORKTREE,
    background=True,
    pty=True,
    notify_on_complete=True,
)
session_id = result["session_id"]
```

Monitor without interfering:

```python
process(action="poll", session_id=session_id)
process(action="log", session_id=session_id, limit=200)
process(action="wait", session_id=session_id, timeout=300)
```

Send a Kanban heartbeat every few minutes for lanes longer than two minutes, e.g. `kanban_heartbeat(note="Codex lane running in <WORKTREE>; waiting for tests/diff")`.
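
One way to combine waiting and heartbeats is to wait in short slices and send a heartbeat after each slice that ends without the lane exiting. This is a sketch only: `wait_with_heartbeat` is hypothetical, and `wait_slice` and `heartbeat` are caller-supplied wrappers around `process(action="wait", ...)` and `kanban_heartbeat(...)`, whose return shapes are not specified here.

```python
import time


def wait_with_heartbeat(wait_slice, heartbeat, budget_s, slice_s=120.0, clock=time.monotonic):
    """Wait for a lane in slices; return True if it exited within budget_s seconds.

    wait_slice(timeout) must return True once the lane has exited (assumed wrapper).
    heartbeat(note) is called after every slice that ends with the lane still running.
    """
    deadline = clock() + budget_s
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            return False  # budget exhausted; the caller decides whether to kill
        if wait_slice(min(slice_s, remaining)):
            return True
        heartbeat("Codex lane still running; waiting for tests/diff")
```

Injecting the two callables keeps the loop testable without a live Codex process.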

Kill conditions:

- No useful output for the task's remaining runtime budget.
- Codex requests secrets, production credentials, or external permissions.
- Codex attempts to modify files outside the worktree.
- Codex starts unrelated rewrites or dependency churn.
- Codex is still running near the worker timeout and no safe partial artifact exists.
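
The kill conditions can be read as a single decision function. The sketch below is one interpretation, not Hermes behavior: `LaneObservation`, `kill_reason`, and the 120-second `shutdown_margin_s` are hypothetical, and "no useful output for the remaining budget" is taken to mean the lane has been silent for at least as long as the time left.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LaneObservation:
    seconds_since_output: float
    remaining_budget_s: float
    requested_secrets: bool = False
    wrote_outside_worktree: bool = False
    unrelated_churn: bool = False
    safe_partial_artifact: bool = False


def kill_reason(obs: LaneObservation, shutdown_margin_s: float = 120.0) -> Optional[str]:
    """Return a reason to kill the lane, or None to keep waiting."""
    if obs.requested_secrets:
        return "requested secrets, production credentials, or external permissions"
    if obs.wrote_outside_worktree:
        return "attempted to modify files outside the worktree"
    if obs.unrelated_churn:
        return "started unrelated rewrites or dependency churn"
    if obs.seconds_since_output >= obs.remaining_budget_s:
        return "no useful output for the remaining runtime budget"
    if obs.remaining_budget_s <= shutdown_margin_s and not obs.safe_partial_artifact:
        return "still running near the worker timeout with no safe partial artifact"
    return None
```

The returned string can be reused as the `rejected_reason` so the kill decision is always recorded.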

Kill command:

```python
process(action="kill", session_id=session_id)
```

After a kill, inspect `git status --short`, preserve useful patches only if safe, and record `codex_lane.result: timed_out` or `rejected` with a concrete `rejected_reason`.

## Reconciliation Checklist

Hermes must perform this checklist before accepting any Codex lane result:

- [ ] `git -C <WORKTREE> status --short --branch` shows only expected files.
- [ ] `git -C <WORKTREE> diff --stat <BASE>` and the full `git -C <WORKTREE> diff <BASE>` were reviewed by Hermes. Diffing against the base covers changes Codex already committed as well as uncommitted edits; a bare `git diff` shows only the latter.
- [ ] No secrets, credentials, generated caches, unrelated data, or local artifacts are included.
- [ ] PMB safety constraints were preserved: no live REST order entry, no market orders, no execution crossing, no fake passive fills/PnL, no risk-gate weakening, no secrets.
- [ ] Codex commits are small enough to cherry-pick or squash cleanly.
- [ ] Hermes ran the canonical tests itself, using `scripts/run_tests.sh` for Hermes Agent or the repo's documented wrapper for other repos.
- [ ] Any Codex-run tests are listed separately from Hermes-run tests.
- [ ] Accepted commits/diffs were applied to the Hermes-owned workspace/branch.
- [ ] Rejected or partial work has a concrete reason and artifact path if useful.
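
The first checklist item can be partly automated by comparing the short status output against the allowed scope. This is a rough sketch: `unexpected_paths` is a hypothetical helper, and it does not decode the quoting git applies to unusual file names, so `git status --porcelain -z` would be the more robust input for real use.

```python
def unexpected_paths(status_output: str, allowed_prefixes) -> list:
    """List paths from `git status --short --branch` output that fall outside the allowed scope."""
    unexpected = []
    for line in status_output.splitlines():
        if not line or line.startswith("## "):
            continue  # skip blank lines and the branch header added by --branch
        entry = line[3:]  # drop the two status columns and the separating space
        # Renames are shown as "old -> new"; check both sides.
        for path in entry.split(" -> "):
            path = path.strip('"')
            in_scope = any(
                path == prefix or path.startswith(prefix.rstrip("/") + "/")
                for prefix in allowed_prefixes
            )
            if not in_scope:
                unexpected.append(path)
    return unexpected


sample = (
    "## codex/t_1/20260508100000\n"
    " M skills/a/SKILL.md\n"
    "?? .env\n"
    "R  docs/old.md -> skills/a/new.md\n"
)
print(unexpected_paths(sample, ["skills/a", "tests"]))  # ['.env', 'docs/old.md']
```

Any non-empty result still needs a human-style review by Hermes; the check only narrows where to look.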

Acceptance outcomes:

- `accepted`: Codex diff/commits were reviewed, applied, and verified.
- `partial`: Some Codex work was accepted after edits or cherry-picks; rejected parts are documented.
- `rejected`: No Codex changes were accepted; reason is documented.
- `timed_out`: Codex exceeded the lane budget; useful artifacts may or may not exist.
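
The four outcomes can be derived mechanically from what happened during reconciliation. The sketch below is an assumption about precedence, not a documented rule: `lane_result` is hypothetical and treats a timeout as taking priority even when some artifacts were salvaged.

```python
def lane_result(timed_out: bool, accepted_commits: list, dropped_or_edited: bool) -> str:
    """Map reconciliation facts to one of the documented outcome strings."""
    if timed_out:
        return "timed_out"  # assumed to take precedence over any salvaged work
    if not accepted_commits:
        return "rejected"
    # Anything accepted only after edits, or with parts left out, counts as partial.
    return "partial" if dropped_or_edited else "accepted"


print(lane_result(False, ["abc123"], dropped_or_edited=True))  # partial
```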

## kanban_complete Metadata Schema

Include this object under `metadata.codex_lane` for every task where the lane was considered. If Codex was not used, set `used: false` and explain why in `rejected_reason` or a sibling `notes` field.

```json
{
  "codex_lane": {
    "used": true,
    "mode": "exec | goal | skipped",
    "worktree": "/absolute/path/to/codex/worktree",
    "branch": "codex/t_caa69668/20260508100000",
    "command": "codex exec --full-auto ...",
    "result": "accepted | rejected | partial | timed_out",
    "accepted_commits": ["<sha1>", "<sha2>"],
    "rejected_reason": "empty when fully accepted; otherwise concrete reason",
    "tests_run": [
      {"command": "scripts/run_tests.sh tests/tools/test_x.py", "exit_code": 0, "owner": "hermes"},
      {"command": "codex-reported: npm test", "exit_code": 0, "owner": "codex"}
    ],
    "artifacts": ["/absolute/path/to/log-or-patch"]
  }
}
```

For tasks that intentionally skip Codex:

```json
{
  "codex_lane": {
    "used": false,
    "mode": "skipped",
    "worktree": null,
    "branch": null,
    "command": null,
    "result": "rejected",
    "accepted_commits": [],
    "rejected_reason": "Direct Hermes edit was smaller and safer than spawning Codex.",
    "tests_run": [],
    "artifacts": []
  }
}
```
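A worker could sanity-check the object before calling `kanban_complete`. The sketch below is not an official validator: `validate_codex_lane` is hypothetical, and the cross-field rules (a reason is required unless fully accepted, `used: false` implies `mode: skipped`, test owners are `hermes` or `codex`) are inferred from the two examples above rather than stated by the schema.

```python
REQUIRED_KEYS = {
    "used", "mode", "worktree", "branch", "command", "result",
    "accepted_commits", "rejected_reason", "tests_run", "artifacts",
}
MODES = {"exec", "goal", "skipped"}
RESULTS = {"accepted", "rejected", "partial", "timed_out"}


def validate_codex_lane(metadata: dict) -> list:
    """Return a list of problems with metadata["codex_lane"]; an empty list means it looks fine."""
    lane = metadata.get("codex_lane")
    if not isinstance(lane, dict):
        return ["metadata.codex_lane is missing or not an object"]
    problems = []
    missing = sorted(REQUIRED_KEYS - lane.keys())
    if missing:
        problems.append("missing keys: " + ", ".join(missing))
    if lane.get("mode") not in MODES:
        problems.append("mode must be one of exec, goal, skipped")
    if lane.get("result") not in RESULTS:
        problems.append("result must be one of accepted, rejected, partial, timed_out")
    # Inferred rule: anything short of full acceptance needs a concrete reason.
    if lane.get("result") != "accepted" and not lane.get("rejected_reason"):
        problems.append("rejected_reason is required unless result is accepted")
    # Inferred rule: an unused lane is recorded as skipped.
    if lane.get("used") is False and lane.get("mode") != "skipped":
        problems.append("used is false but mode is not skipped")
    for test in lane.get("tests_run") or []:
        if test.get("owner") not in {"hermes", "codex"}:
            problems.append("tests_run owner must be hermes or codex")
    return problems
```

Keeping Hermes-run and Codex-run tests distinguishable through `owner` preserves the rule that Codex test runs are advisory.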

## Common Pitfalls

1. Treating Codex self-report as verification. Always inspect the diff and rerun tests from Hermes.
2. Running Codex in the user's dirty main checkout. Always isolate in a worktree/branch.
3. Letting Codex own Kanban. Codex may summarize progress, but Hermes writes board state.
4. Forgetting PMB safety invariants in the prompt. Missing safety text is a lane setup failure.
5. Using `/goal` for quick edits. Prefer `codex exec` unless durable multi-step continuation is needed.
6. Killing a stuck lane without recording why. `rejected_reason` must explain the decision.
7. Accepting broad unrelated cleanup because tests pass. Reject or cherry-pick only the scoped changes.

## Verification Checklist

- [ ] Codex was skipped or started only after `command -v codex`, `codex --version`, and optional goals feature checks.
- [ ] Codex ran only in an isolated worktree/branch.
- [ ] Prompt included task scope, ownership rules, PMB safety constraints when applicable, and verification commands.
- [ ] Hermes reviewed `git diff` and safety-sensitive files.
- [ ] Hermes ran canonical tests independently.
- [ ] `kanban_complete.metadata.codex_lane` follows the schema above.
- [ ] Temporary processes and unnecessary worktrees were cleaned up.
@@ -0,0 +1,57 @@
# PMB Codex Lane Prompt Template

Use this template when a Hermes Kanban worker chooses to run Codex as an implementation lane for prediction-market-bot. Fill every bracketed field before launching Codex. Do not include secrets.

```text
You are Codex CLI running as an input lane for a Hermes Kanban worker.

Ownership:
- Hermes owns the Kanban task lifecycle, final review, test verification, and handoff.
- You are an implementation lane only. Do not call Hermes kanban tools, Hermes CLI board commands, messaging gateways, or external notification tools.
- Produce a scoped diff/commits and a concise report; do not mark any task complete.

Task:
- task_id: [KANBAN_TASK_ID]
- title: [KANBAN_TITLE]
- acceptance criteria:
[PASTE_ACCEPTANCE_CRITERIA]

Repository and isolation:
- repo: [REPO_PATH]
- worktree: [CODEX_WORKTREE_PATH]
- branch: [CODEX_BRANCH]
- allowed files/scope: [ALLOWED_FILES_OR_DIRECTORIES]
- forbidden files/scope: [FORBIDDEN_FILES_OR_DIRECTORIES]

PMB safety constraints:
- live-SIM is paper-only; do not add or enable live REST order entry.
- Never use market orders.
- Do not add execution crossing or bypass price/risk checks.
- Do not fake passive fills, fills, PnL, order states, or reconciliation evidence.
- Do not weaken risk gates, limits, kill switches, or fail-closed behavior.
- Keep research/selection outside the C++ hot path unless explicitly requested.
- Do not read, print, write, or require secrets/tokens/credentials.

Implementation constraints:
- Follow existing project conventions and style.
- Keep diffs small and reviewable.
- Do not perform unrelated refactors, dependency upgrades, formatting sweeps, or generated-file churn.
- If a requirement is unsafe or ambiguous, stop and report the blocker instead of guessing.
- Commit only if asked by the Hermes worker; if committing, use small commits with clear subjects.

Verification you may run:
- [COMMAND_1]
- [COMMAND_2]

Verification Hermes will rerun independently:
- [HERMES_COMMAND_1]
- [HERMES_COMMAND_2]

Required final report:
- Summary of changes.
- Files changed.
- Commit SHAs, if any.
- Tests/commands run with exit codes.
- Safety constraints checked.
- Known risks or incomplete items.
```
@@ -0,0 +1,98 @@
"""Regression coverage for the bundled Kanban Codex lane skill."""

import json
from pathlib import Path

from tools import skills_tool
from tools.skill_manager_tool import _validate_frontmatter


REPO_ROOT = Path(__file__).resolve().parents[2]
SKILL_DIR = REPO_ROOT / "skills" / "autonomous-ai-agents" / "kanban-codex-lane"
SKILL_MD = SKILL_DIR / "SKILL.md"
TEMPLATE = SKILL_DIR / "templates" / "pmb-codex-lane-prompt.md"


def _skill_text() -> str:
    return SKILL_MD.read_text(encoding="utf-8")


def test_kanban_codex_lane_skill_frontmatter_is_valid():
    content = _skill_text()

    assert _validate_frontmatter(content) is None
    assert "name: kanban-codex-lane" in content
    assert "description: Use when" in content


def test_kanban_codex_lane_skill_is_discoverable_with_template(monkeypatch, tmp_path):
    local_skills = tmp_path / "skills"
    local_skills.mkdir()
    bundled_skills = REPO_ROOT / "skills"

    monkeypatch.setattr(skills_tool, "SKILLS_DIR", local_skills)
    monkeypatch.setattr(
        "agent.skill_utils.get_external_skills_dirs",
        lambda: [bundled_skills],
    )

    listed = json.loads(skills_tool.skills_list("autonomous-ai-agents"))
    assert listed["success"] is True
    assert any(skill["name"] == "kanban-codex-lane" for skill in listed["skills"])

    viewed = json.loads(skills_tool.skill_view("kanban-codex-lane"))
    assert viewed["success"] is True
    assert viewed["path"].endswith("kanban-codex-lane/SKILL.md")
    assert viewed["linked_files"]["templates"] == ["templates/pmb-codex-lane-prompt.md"]

    template = json.loads(
        skills_tool.skill_view(
            "kanban-codex-lane",
            file_path="templates/pmb-codex-lane-prompt.md",
        )
    )
    assert template["success"] is True
    assert "PMB safety constraints" in template["content"]


def test_kanban_codex_lane_documents_required_contracts():
    content = _skill_text()
    template = TEMPLATE.read_text(encoding="utf-8")

    required_skill_phrases = [
        "Hermes is always the task owner",
        "Codex is an input lane only",
        "git -C \"$REPO\" worktree add -b \"$BRANCH\" \"$WORKTREE\" \"$BASE\"",
        "codex --version",
        "codex features list | grep -i goals || true",
        "codex exec --full-auto",
        "/goal Work in this repository only",
        "process(action=\"kill\", session_id=session_id)",
        "scripts/run_tests.sh",
        '"codex_lane"',
        '"used"',
        '"mode"',
        '"worktree"',
        '"branch"',
        '"command"',
        '"result"',
        '"accepted_commits"',
        '"rejected_reason"',
        '"tests_run"',
        '"artifacts"',
        "accepted | rejected | partial | timed_out",
    ]
    for phrase in required_skill_phrases:
        assert phrase in content

    required_safety_phrases = [
        "live-SIM is paper-only; do not add or enable live REST order entry",
        "Never use market orders",
        "Do not add execution crossing",
        "Do not fake passive fills",
        "Do not weaken risk gates",
        "Do not read, print, write, or require secrets/tokens/credentials",
    ]
    for phrase in required_safety_phrases:
        assert phrase in content
        assert phrase in template