diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md index 6b865018..c9834c36 100644 --- a/ARCHITECTURE.md +++ b/ARCHITECTURE.md @@ -122,6 +122,9 @@ Environment variables controlling behavior: HERMES_WEBUI_DEFAULT_MODEL Optional model override; unset means provider default HERMES_WEBUI_PASSWORD Optional: enable password auth (off by default) HERMES_WEBUI_SKIP_ONBOARDING Optional: bypass the first-run onboarding wizard + HERMES_PREFILL_MESSAGES_FILE Optional JSON message list for browser-turn prefill context + HERMES_WEBUI_PREFILL_MESSAGES_SCRIPT Optional command that prints JSON messages or text prefill context + HERMES_WEBUI_PREFILL_MESSAGES_SCRIPT_TIMEOUT Optional script timeout in seconds (default 5, max 30) HERMES_HOME Base directory for Hermes state (~/.hermes by default) Test isolation environment variables (set by conftest.py): diff --git a/CHANGELOG.md b/CHANGELOG.md index f019b3bb..848479ab 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,6 +3,17 @@ ## [Unreleased] +## [v0.51.141] — 2026-05-26 — Release DM (stage-batch23 — 4-PR second hold-bucket pass) + +### Added + +- WebUI can now opt into a `webui_prefill_messages_script` / `HERMES_WEBUI_PREFILL_MESSAGES_SCRIPT` hook for dynamic browser-turn prefill context from local notes or recall systems. The script output is capped at 256 KiB, normalized to ephemeral prefill messages, and browser status still hides message bodies while redacting script errors. +- Added a read-only WebUI/CLI session source switch in the chat sidebar when agent session sync is enabled. WebUI conversations stay in the default list, while imported CLI/agent sessions are surfaced under a separate `CLI sessions` tab with counts so large CLI histories do not clutter the normal conversation list. (Refs #2351) + +### Fixed + +- Compact tool activity now keeps visible interim assistant progress in the live Session timeline instead of making that progress effectively collapsed-only inside Activity details. 
The interim assistant stream path creates and flushes a visible assistant segment before resetting for later tool/compression activity. + ## [v0.51.140] — 2026-05-26 — Release DL (stage-batch22 — 5-PR hold-bucket reassessment) ### Fixed @@ -113,6 +124,7 @@ ## [v0.51.132] — 2026-05-24 — Release DD (stage-batch14 — 4-PR replayed-context + interrupted-response + shutdown affordance + passkey opt-in) ### Added +- **Cursor ACP provider integration** — Add `cursor-acp` to the WebUI model picker and route slash model IDs (for example `cursor/composer-2.5`) through explicit `@cursor-acp:` provider hints so they do not fall through to the configured default HTTP provider. - **PR #2859** by @AJV20 — Optional passkey/WebAuthn sign-in for password-protected WebUI instances. Authenticated users can register/remove passkeys from Settings -> System, and `/login` shows a passwordless sign-in button only after a passkey exists. Password auth remains the default-off bootstrap and recovery path. **Opt-in default-off behind `HERMES_WEBUI_PASSKEY=1` env var or `webui_passkey_enabled: true` config flag** — when disabled, the UI block hides, all 6 `/api/auth/passkey/*` endpoints return 404, and `is_auth_enabled()` ignores any pre-existing credential file so the auth posture cannot silently flip if the flag is unset later. @@ -120,6 +132,10 @@ ### Fixed +### Fixed +- **Reasoning effort chip visibility** — `/api/reasoning` now accepts `model` and `provider` query params and returns `supported_efforts` so the composer chip hides for models without configurable reasoning levels (for example Cursor Composer) while remaining available for models like GPT-5.5. Model picker changes now re-sync the chip after the session model/provider update instead of querying with stale session state. 
Composer dropdown selections now pass the provider id into `selectModelFromDropdown()` so duplicate bare model ids (for example `gpt-5.5` under OpenAI Codex vs OpenRouter) no longer fall back to the profile default provider when refreshing the chip. +- **Cursor ACP routing and new-chat defaults** — New conversations now carry the visible composer picker selection into `POST /api/session/new`, persist model changes before a session exists, and evict cached session agents when the model/provider changes mid-session. + - **PR #2685** by @LumenYoung — Prevent replayed context in chat reconciliation and metering. When a WebUI session is recovered (e.g., after a process restart, network drop, or browser reload), the sidebar/`state.db` reconciliation logic walks the sidecar transcript in order and only skips rows that can actually be aligned with the remaining sidecar context. The prior set-membership check was too broad: a legitimate fresh message that happened to share a key with any older repeated short message in the sidecar was mis-classified as already-seen and dropped from the replay, leading to lost context and inconsistent metering. Also caps the per-turn live-tool-prompt token estimate at 12,000 to prevent unbounded growth on bursts of large tool reads before exact provider accounting overrides. - **PR #2739** by @ai-ag2026 — Clarify `Response interrupted` recovery markers so they report that the live response stream stopped instead of asserting that the WebUI process restarted. The recovery path now records distinct interruption causes for real process restarts, stream/run split-brain, and lost worker bookkeeping; browser-side SSE transport failures show a separate `Connection interrupted` message, client-side `BrokenPipeError` disconnects no longer get logged as server 500s, and chat/gateway SSE errors emit rate-limited (30 events / 60s / 4KB body cap), sanitized client diagnostics to `/api/client-events/log` for future root-cause checks. 
The stream-status `terminal_state` value for lost-worker bookkeeping changes from `stale-from-restart` to `lost-worker-bookkeeping`, matching the new non-restart wording. diff --git a/README.md b/README.md index dbde203e..46912630 100644 --- a/README.md +++ b/README.md @@ -121,6 +121,44 @@ For self-hosted VM or homelab installs, `ctl.sh` wraps the common daemon lifecyc `ctl.sh start` runs the bootstrap in foreground/no-browser mode behind the daemon wrapper, writes logs to `~/.hermes/webui.log`, and respects `.env` plus inline overrides such as `HERMES_WEBUI_HOST=0.0.0.0 ./ctl.sh start`. +### Optional session recall prefill + +WebUI can attach ephemeral prefill messages to new browser-originated +agent turns. This is useful when a deployment already has a local recall or +router script for Joplin, Obsidian, Notion, llm-wiki, or another third-party +notes source and wants browser chat to know where durable context lives. + +Prefer a compact router-style prefill (for example, "Joplin has the durable +project context; use the available notes/search tools before answering +detail-dependent questions") instead of dumping the full note corpus into every +new browser session. The prefill should point the agent toward retrieval; the +notes/search tools should provide the specific facts on demand. + +Static JSON remains supported through `prefill_messages_file` or +`HERMES_PREFILL_MESSAGES_FILE`. For dynamic recall, opt in explicitly with a +WebUI-specific script hook: + +```yaml +webui_prefill_messages_script: + - python3 + - /path/to/notes_recall.py +webui_prefill_messages_script_timeout: 5 +``` + +or: + +```bash +HERMES_WEBUI_PREFILL_MESSAGES_SCRIPT="python3 /path/to/notes_recall.py" \ +HERMES_WEBUI_PREFILL_MESSAGES_SCRIPT_TIMEOUT=5 \ +./ctl.sh restart +``` + +The script may print either an OpenAI-style JSON message list, a JSON object with +a `messages` list, or plain text; plain text is wrapped as one `system` prefill +message. 
Script output is capped at 256 KiB before parsing. The browser only +receives a compact status event (`source`, `label`, message count, and redacted +errors), never the prefill message bodies. + The bootstrap will: 1. Detect Hermes Agent and, if missing, attempt the official installer (`curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash`). diff --git a/api/config.py b/api/config.py index 9e582e40..91544913 100644 --- a/api/config.py +++ b/api/config.py @@ -709,6 +709,7 @@ _PROVIDER_DISPLAY = { "openai-codex": "OpenAI Codex", "xai-oauth": "xAI Grok OAuth", "copilot": "GitHub Copilot", + "cursor-acp": "Cursor ACP", "zai": "Z.AI / GLM", "kimi-coding": "Kimi / Moonshot", "deepseek": "DeepSeek", @@ -1131,6 +1132,13 @@ _PROVIDER_MODELS = { {"id": "claude-sonnet-4.6", "label": "Claude Sonnet 4.6"}, {"id": "gemini-3-flash-preview", "label": "Gemini 3 Flash Preview"}, ], + # Cursor ACP — models served via Cursor CLI agent acp + "cursor-acp": [ + {"id": "cursor/composer-2.5", "label": "Composer 2.5"}, + {"id": "cursor/composer-2", "label": "Composer 2"}, + {"id": "cursor/default", "label": "Default"}, + {"id": "cursor-acp", "label": "Cursor ACP"}, + ], # OpenCode Zen — curated models via opencode.ai/zen (pay-as-you-go credits) "opencode-zen": [ {"id": "gpt-5.4-pro", "label": "GPT-5.4 Pro"}, @@ -1987,6 +1995,12 @@ def resolve_custom_provider_connection(provider_id: str) -> tuple[str | None, st return None, None +# Subprocess ACP transports (Cursor/Copilot CLI). Model IDs often contain '/' +# but must still route via explicit @provider:model so they do not fall through +# to the configured default HTTP provider (e.g. openai-codex). +_ACP_SUBPROCESS_PROVIDERS = frozenset({"cursor-acp", "copilot-acp"}) + + def model_with_provider_context(model_id: str, model_provider: str | None = None) -> str: """Return the model string to pass to ``resolve_model_provider()``. 
@@ -2006,6 +2020,11 @@ def model_with_provider_context(model_id: str, model_provider: str | None = None if isinstance(model_cfg, dict): config_provider = str(model_cfg.get("provider") or "").strip().lower() + # ACP subprocess providers always need the explicit hint — their slash IDs + # are not OpenRouter paths and must not inherit config_provider routing. + if provider in _ACP_SUBPROCESS_PROVIDERS: + return f"@{provider}:{model}" + # If the selected provider is already the configured provider, leaving the # model bare preserves provider-specific base_url/proxy settings. if provider == config_provider: @@ -2069,7 +2088,121 @@ def parse_reasoning_effort(effort): return None -def get_reasoning_status() -> dict: +def _strip_provider_hint_for_reasoning(model_id: str) -> str: + """Remove WebUI routing hints before provider-specific capability lookup.""" + model = str(model_id or "").strip() + if model.startswith("@") and ":" in model: + return model.split(":", 1)[1] + return model + + +def _heuristic_reasoning_efforts(model_id: str, provider_id: str) -> list[str]: + """Fallback when hermes_cli is unavailable.""" + model = _strip_provider_hint_for_reasoning(model_id).lower() + provider = _resolve_provider_alias(str(provider_id or "").strip().lower()) + if not model or provider in {"cursor-acp", "copilot-acp"}: + return [] + bare = model.rsplit("/", 1)[-1] + if provider == "openai-codex" and bare.startswith(("gpt-5", "o1", "o3", "o4")): + if bare.startswith(("o1", "o3", "o4")): + return ["low", "medium", "high"] + return list(VALID_REASONING_EFFORTS) + if provider in {"copilot", "github-copilot"}: + if bare.startswith(("gpt-5", "o1", "o3", "o4")): + if bare.startswith(("o1", "o3", "o4")): + return ["low", "medium", "high"] + return list(VALID_REASONING_EFFORTS) + prefixes = ( + "deepseek/", + "anthropic/", + "openai/", + "x-ai/", + "google/gemini-2", + "google/gemma-4", + "qwen/qwen3", + "tencent/hy3-preview", + "xiaomi/", + ) + if any(model.startswith(prefix) for prefix 
in prefixes): + return list(VALID_REASONING_EFFORTS) + return [] + + +def resolve_model_reasoning_efforts( + model_id: str | None = None, + provider_id: str | None = None, + base_url: str | None = None, +) -> list[str]: + """Return supported reasoning-effort levels for *model_id*, or [] if none.""" + model = str(model_id or "").strip() + if not model: + return [] + + provider = str(provider_id or "").strip().lower() if provider_id else "" + resolved_base_url = str(base_url or "").strip() or None + if not provider: + try: + _, provider, resolved_base_url = resolve_model_provider(model) + except Exception: + provider = str((cfg.get("model") or {}).get("provider") or "").strip().lower() + + provider = _resolve_provider_alias(provider) + if provider in {"cursor-acp", "copilot-acp"}: + return [] + + try: + from hermes_cli.models import ( + github_model_reasoning_efforts, + lmstudio_model_reasoning_options, + ) + except Exception: + return _heuristic_reasoning_efforts(model, provider) + + hinted_model = _strip_provider_hint_for_reasoning(model) + if provider in {"copilot", "github-copilot"}: + return github_model_reasoning_efforts(hinted_model) + + if provider == "openai-codex": + bare = hinted_model.rsplit("/", 1)[-1] + return github_model_reasoning_efforts(bare) + + if provider == "lmstudio": + probe_base = resolved_base_url or _get_provider_base_url(provider) + opts = lmstudio_model_reasoning_options(model, probe_base) + normalized = [str(opt).strip().lower() for opt in opts if str(opt).strip()] + if not normalized or set(normalized).issubset({"off"}): + return [] + level_opts = [opt for opt in normalized if opt in VALID_REASONING_EFFORTS] + if level_opts: + return list(dict.fromkeys(level_opts)) + if set(normalized).issubset({"off", "on"}): + return [] + return [] + + model_lower = model.lower() + prefixes = ( + "deepseek/", + "anthropic/", + "openai/", + "x-ai/", + "google/gemini-2", + "google/gemma-4", + "qwen/qwen3", + "tencent/hy3-preview", + "xiaomi/", + ) + if 
any(model_lower.startswith(prefix) for prefix in prefixes): + return list(VALID_REASONING_EFFORTS) + + return [] + + +def get_reasoning_status( + *, + model_id: str | None = None, + provider_id: str | None = None, + base_url: str | None = None, +) -> dict: """Return current reasoning configuration from the active profile's config.yaml — the same source of truth the CLI reads from. @@ -2082,10 +2215,17 @@ def get_reasoning_status() -> dict: agent_cfg = config_data.get("agent") or {} show_raw = display_cfg.get("show_reasoning") if isinstance(display_cfg, dict) else None effort_raw = agent_cfg.get("reasoning_effort") if isinstance(agent_cfg, dict) else None + supported_efforts = resolve_model_reasoning_efforts( + model_id, + provider_id=provider_id, + base_url=base_url, + ) return { # Match CLI default (True if unset in config.yaml) "show_reasoning": bool(show_raw) if isinstance(show_raw, bool) else True, "reasoning_effort": str(effort_raw or "").strip().lower(), + "supported_efforts": supported_efforts, + "supports_reasoning_effort": bool(supported_efforts), } diff --git a/api/routes.py b/api/routes.py index f955988a..a8d32521 100644 --- a/api/routes.py +++ b/api/routes.py @@ -4019,7 +4019,18 @@ def handle_get(handler, parsed) -> bool: # Current reasoning config (shared source of truth with the CLI — # reads display.show_reasoning and agent.reasoning_effort from # the active profile's config.yaml). 
- return j(handler, get_reasoning_status()) + query = parse_qs(parsed.query) + model_id = (query.get("model", [""])[0] or "").strip() or None + provider_id = (query.get("provider", [""])[0] or "").strip() or None + base_url = (query.get("base_url", [""])[0] or "").strip() or None + return j( + handler, + get_reasoning_status( + model_id=model_id, + provider_id=provider_id, + base_url=base_url, + ), + ) if parsed.path == "/api/onboarding/status": return j(handler, get_onboarding_status()) @@ -5416,6 +5427,9 @@ def handle_post(handler, parsed) -> bool: ) s.threshold_tokens = 0 s.last_prompt_tokens = 0 + from api.config import _evict_session_agent + + _evict_session_agent(body["session_id"]) s.save() if str(old_ws or "") != str(new_ws or ""): try: diff --git a/api/streaming.py b/api/streaming.py index 9abaf85e..52d93af1 100644 --- a/api/streaming.py +++ b/api/streaming.py @@ -10,7 +10,9 @@ import mimetypes import os import queue import re +import shlex import sys +import subprocess import threading import time import traceback @@ -285,29 +287,117 @@ def _resolve_prefill_path(raw: str) -> Path: return path +_PREFILL_SCRIPT_OUTPUT_LIMIT = 262_144 + + +def _prefill_not_configured() -> dict: + return {"status": "not_configured", "source": "none", "label": "", "messages": [], "message_count": 0} + + +def _load_prefill_messages_file(file_raw: str, *, source: str = "file", status: str = "loaded") -> dict: + path = _resolve_prefill_path(file_raw) + label = path.name or "prefill file" + if not path.exists(): + return {"status": "error", "source": source, "label": label, "messages": [], "message_count": 0, "error": "prefill file not found"} + try: + messages = _valid_prefill_messages(json.loads(path.read_text(encoding="utf-8"))) + return {"status": status, "source": source, "label": label, "messages": messages, "message_count": len(messages)} + except Exception as exc: + return {"status": "error", "source": source, "label": label, "messages": [], "message_count": 0, "error": 
_redact_prefill_status_text(str(exc))} + + +def _prefill_script_timeout(config_data: dict) -> float: + raw = os.getenv("HERMES_WEBUI_PREFILL_MESSAGES_SCRIPT_TIMEOUT", "") or str(config_data.get("webui_prefill_messages_script_timeout") or "") + try: + return max(0.1, min(float(raw or 5), 30.0)) + except Exception: + return 5.0 + + +def _prefill_script_command(raw) -> list[str]: + if isinstance(raw, (list, tuple)): + return [str(part) for part in raw if str(part)] + parts = shlex.split(str(raw or "")) + if not parts: + return [] + # A single script path mirrors prefill_messages_file path resolution. More + # complex commands keep their argv untouched so admins can pass arguments. + if len(parts) == 1: + parts[0] = str(_resolve_prefill_path(parts[0])) + return parts + + +def _messages_from_prefill_script_output(text: str) -> list[dict]: + stripped = str(text or "").strip() + if not stripped: + return [] + try: + payload = json.loads(stripped) + except Exception: + payload = None + if isinstance(payload, dict): + payload = payload.get("messages") + messages = _valid_prefill_messages(payload) + if messages: + return messages + return [{"role": "system", "content": stripped}] + + +def _load_prefill_messages_script(config_data: dict) -> dict: + script_raw = os.getenv("HERMES_WEBUI_PREFILL_MESSAGES_SCRIPT", "") or config_data.get("webui_prefill_messages_script") + if not script_raw: + return _prefill_not_configured() + command = _prefill_script_command(script_raw) + label = Path(command[0]).name if command else "prefill script" + if not command: + return {"status": "error", "source": "script", "label": label, "messages": [], "message_count": 0, "error": "prefill script is empty"} + try: + proc = subprocess.run( + command, + text=True, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + timeout=_prefill_script_timeout(config_data), + check=False, + ) + except subprocess.TimeoutExpired: + return {"status": "error", "source": "script", "label": label, "messages": [], 
"message_count": 0, "error": "prefill script timed out"} + except Exception as exc: + return {"status": "error", "source": "script", "label": label, "messages": [], "message_count": 0, "error": _redact_prefill_status_text(str(exc))} + if proc.returncode != 0: + err = _redact_prefill_status_text(proc.stderr or proc.stdout or f"prefill script exited {proc.returncode}") + return {"status": "error", "source": "script", "label": label, "messages": [], "message_count": 0, "error": err} + if len(proc.stdout.encode("utf-8")) > _PREFILL_SCRIPT_OUTPUT_LIMIT: + return { + "status": "error", + "source": "script", + "label": label, + "messages": [], + "message_count": 0, + "error": f"prefill script output exceeded {_PREFILL_SCRIPT_OUTPUT_LIMIT} bytes", + } + messages = _messages_from_prefill_script_output(proc.stdout) + return {"status": "loaded", "source": "script", "label": label, "messages": messages, "message_count": len(messages)} + + def _load_webui_prefill_context( config_data: Optional[dict] = None, ) -> dict: """Load configured WebUI session prefill messages. - Supports the same bounded JSON-file shape used by Hermes Agent. WebUI does - not execute a configured prefill script here; session recall that requires - code execution should go through the normal MCP/tool path instead of an - always-on per-turn subprocess before SSE starts. + Supports the same bounded JSON-file shape used by Hermes Agent. WebUI also + supports its own explicitly opt-in script hook so admins can bridge Joplin, + Obsidian, Notion, llm-wiki, or another local notes source into ephemeral + turn context without baking any one note provider into the WebUI. 
""" cfg = config_data if isinstance(config_data, dict) else get_config() + script_context = _load_prefill_messages_script(cfg) + if script_context.get("status") != "not_configured": + return script_context file_raw = os.getenv("HERMES_PREFILL_MESSAGES_FILE", "") or str(cfg.get("prefill_messages_file") or "") if file_raw: - path = _resolve_prefill_path(file_raw) - label = path.name or "prefill file" - if not path.exists(): - return {"status": "error", "source": "file", "label": label, "messages": [], "message_count": 0, "error": "prefill file not found"} - try: - messages = _valid_prefill_messages(json.loads(path.read_text(encoding="utf-8"))) - return {"status": "loaded", "source": "file", "label": label, "messages": messages, "message_count": len(messages)} - except Exception as exc: - return {"status": "error", "source": "file", "label": label, "messages": [], "message_count": 0, "error": _redact_prefill_status_text(str(exc))} - return {"status": "not_configured", "source": "none", "label": "", "messages": [], "message_count": 0} + return _load_prefill_messages_file(file_raw) + return _prefill_not_configured() def _public_prefill_context_status(prefill_context: dict) -> dict: diff --git a/docs/UIUX-GUIDE.md b/docs/UIUX-GUIDE.md index 78a728f3..63f66f9c 100644 --- a/docs/UIUX-GUIDE.md +++ b/docs/UIUX-GUIDE.md @@ -74,6 +74,11 @@ terse, for example `Activity: 4 tools`, and should not duplicate the thinking area, list every tool name in the summary, or add redundant trailing count badges. +Visible interim assistant progress is part of the live conversation timeline, +not raw debug detail. Compact Activity may collapse tool arguments, long tool +results, and low-level reasoning detail, but it must not make concise +user-visible progress text available only inside a collapsed disclosure. 
+ The existing two-stage proposal in `docs/ui-ux/two-stage-proposal.html` records a compatible direction for long turns: live work can be grouped as a worklog, then settled history can collapse while the final answer reads as the calm diff --git a/docs/pr-media/2351/after-source-tabs.png b/docs/pr-media/2351/after-source-tabs.png new file mode 100644 index 00000000..2279fd30 Binary files /dev/null and b/docs/pr-media/2351/after-source-tabs.png differ diff --git a/docs/pr-media/2351/before-cli-mixed.png b/docs/pr-media/2351/before-cli-mixed.png new file mode 100644 index 00000000..2279fd30 Binary files /dev/null and b/docs/pr-media/2351/before-cli-mixed.png differ diff --git a/docs/rfcs/webui-run-state-consistency-contract.md b/docs/rfcs/webui-run-state-consistency-contract.md index b3329a25..9fa365cd 100644 --- a/docs/rfcs/webui-run-state-consistency-contract.md +++ b/docs/rfcs/webui-run-state-consistency-contract.md @@ -82,6 +82,9 @@ while WebUI still has multiple overlapping state stores. browser-facing timeline renderer as live SSE events so recovery does not downgrade a structured Thinking / progress / tool / compression turn into a separate flattened presentation. + Visible interim assistant progress must remain visible timeline content; a + compact Activity disclosure may summarize adjacent tool/debug detail, but it + must not be the only place where the user can see emitted progress text. 6. **Compression is not current intent.** Automatic compression summaries and reference cards are recovery/handoff material. 
They must not be treated as a new user request, active-turn content, or the default visible explanation for diff --git a/static/boot.js b/static/boot.js index eaadc8cd..7bcec46e 100644 --- a/static/boot.js +++ b/static/boot.js @@ -1022,7 +1022,6 @@ function _applySessionContextMetadataUpdate(data){ } $('modelSelect').onchange=async()=>{ - if(!S.session)return; const selectedModel=$('modelSelect').value; const modelState=(typeof _modelStateForSelect==='function') ? _modelStateForSelect($('modelSelect'),selectedModel) @@ -1030,10 +1029,16 @@ $('modelSelect').onchange=async()=>{ if(typeof closeModelDropdown==='function') closeModelDropdown(); if(typeof _writePersistedModelState==='function') _writePersistedModelState(modelState.model,modelState.model_provider); else try{localStorage.setItem('hermes-webui-model',modelState.model)}catch{} + if(!S.session){ + if(typeof syncModelChip==='function') syncModelChip(); + if(typeof syncReasoningChip==='function') syncReasoningChip(); + return; + } if(typeof _rememberPendingSessionModel==='function') _rememberPendingSessionModel(S.session.session_id,modelState.model,modelState.model_provider); S.session.model=modelState.model; S.session.model_provider=modelState.model_provider||null; if(typeof syncModelChip==='function') syncModelChip(); + if(typeof syncReasoningChip==='function') syncReasoningChip(); syncTopbar(); // Clarify scope: composer model changes are session-local, not the global default. if(typeof showToast==='function'){ diff --git a/static/commands.js b/static/commands.js index f7705546..73966304 100644 --- a/static/commands.js +++ b/static/commands.js @@ -1141,7 +1141,8 @@ function cmdReasoning(args){ } if(!arg){ // Status — read from the same config.yaml keys the CLI uses. 
- api('/api/reasoning').then(function(st){showToast(_fmtStatus(st));}) + const q=(typeof _reasoningEffortQuery==='function')?_reasoningEffortQuery():''; + api('/api/reasoning'+q).then(function(st){showToast(_fmtStatus(st));}) .catch(function(){showToast(BRAIN+' /reasoning — status unavailable');}); return true; } @@ -1168,7 +1169,7 @@ function cmdReasoning(args){ .then(function(st){ const eff=(st && st.reasoning_effort)||arg; showToast(BRAIN+' Reasoning effort: '+eff+' (saved; applies to next turn)'); - if(typeof _applyReasoningChip==='function') _applyReasoningChip(eff); + if(typeof _applyReasoningChip==='function') _applyReasoningChip(eff, st||{}); }) .catch(function(e){ showToast(BRAIN+' Failed to set effort: '+(e && e.message ? e.message : arg)); diff --git a/static/messages.js b/static/messages.js index 82174640..86245d57 100644 --- a/static/messages.js +++ b/static/messages.js @@ -1379,9 +1379,10 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){ }; step(); } - function _flushPendingSegmentRender(){ - if(!assistantBody||!_renderPending) return; - _cancelAnimationFramePendingStreamRender(); + function _flushPendingSegmentRender(options={}){ + const force=!!(options&&options.force); + if(!assistantBody||(!force&&!_renderPending)) return; + if(_renderPending) _cancelAnimationFramePendingStreamRender(); const displayText=segmentStart===0 ? 
_parseStreamState().displayText : _stripXmlToolCalls(assistantText.slice(segmentStart)); @@ -1539,8 +1540,9 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){ if(typeof updateThinking==='function') updateThinking(_liveThinkingText()); else appendThinking(_liveThinkingText()); } - _flushPendingSegmentRender(); ensureAssistantRow(true); + _flushPendingSegmentRender({force:true}); + if(typeof closeCurrentLiveActivityGroup==='function') closeCurrentLiveActivityGroup(); _resetAssistantSegment(); _scheduleRender(); }); @@ -1592,7 +1594,7 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){ // Reset the live assistant row reference so that any text tokens arriving // after this tool call create a NEW segment appended below the tool card, // rather than updating the old segment that sits above it in the DOM. - _flushPendingSegmentRender(); + _flushPendingSegmentRender({force:true}); _freshSegment=true; _smdEndParser(); _resetAssistantSegment(); diff --git a/static/sessions.js b/static/sessions.js index 7f25ae0b..0c101cff 100644 --- a/static/sessions.js +++ b/static/sessions.js @@ -470,6 +470,20 @@ async function newSession(flash, options={}){ if(S.session&&S.session.session_id) reqBody.prev_session_id=S.session.session_id; if(options&&options.worktree) reqBody.worktree=true; if(_activeProject&&_activeProject!==NO_PROJECT_FILTER) reqBody.project_id=_activeProject; + // Carry the visible picker selection into the new session. Without this, + // /api/session/new falls back to config.yaml defaults (e.g. gpt-5.5) even + // when the user already chose cursor/composer-2.5 in the composer chip. 
+ const modelSelForNew=$('modelSelect'); + let newModelState=null; + if(modelSelForNew&&modelSelForNew.value&&typeof _modelStateForSelect==='function'){ + newModelState=_modelStateForSelect(modelSelForNew,modelSelForNew.value); + }else if(typeof _readPersistedModelState==='function'){ + newModelState=_readPersistedModelState(); + } + if(newModelState&&newModelState.model){ + reqBody.model=newModelState.model; + reqBody.model_provider=newModelState.model_provider||null; + } const data=await api('/api/session/new',{method:'POST',body:JSON.stringify(reqBody)}); S.session=data.session;S.messages=data.session.messages||[]; S.lastUsage={...(data.session.last_usage||{})}; @@ -892,6 +906,29 @@ function _isCliSession(session) { return session.is_cli_session === true; } +function _sessionSourceLabel(filter, count) { + const n = Number(count) || 0; + return filter === 'cli' ? `CLI sessions (${n})` : `WebUI sessions (${n})`; +} + +function _setSessionSourceFilter(filter) { + const next = filter === 'cli' ? 
'cli' : 'webui'; + if (_sessionSourceFilter === next) return; + _sessionSourceFilter = next; + _activeProject = null; + _selectedSessions.clear(); + _sessionSelectMode = false; + try { localStorage.setItem('hermes-session-source-filter', next); } catch (_e) {} + renderSessionListFromCache(); +} + +function _restoreSessionSourceFilter() { + try { + const raw = localStorage.getItem('hermes-session-source-filter'); + if (raw === 'cli' || raw === 'webui') _sessionSourceFilter = raw; + } catch (_e) {} +} + function _normalizeMessageForCliImportComparison(message) { if (!message || typeof message !== 'object') return message; const clone = { ...message }; @@ -1537,6 +1574,8 @@ const NO_PROJECT_FILTER = '__none__'; let _activeProject = null; // project_id filter (null = show all, NO_PROJECT_FILTER = unassigned only) let _showAllProfiles = false; // false = filter to active profile only let _otherProfileCount = 0; // count of sessions from other profiles (server-reported) +let _sessionSourceFilter = 'webui'; // 'webui' keeps WebUI chats separate from read-only CLI sessions +_restoreSessionSourceFilter(); let _sessionActionMenu = null; let _sessionActionAnchor = null; let _sessionActionSessionId = null; @@ -3224,6 +3263,14 @@ function renderSessionListFromCache(){ (activeSidForSidebar&&s.session_id===activeSidForSidebar) || (S.session&&s.session_id===S.session.session_id&&(S.session.message_count||0)>0) ); + const webuiSessionCount = withMessages.filter(s=>!_isCliSession(s)).length; + const cliSessionCount = withMessages.filter(s=>_isCliSession(s)).length; + if(_sessionSourceFilter==='cli' && !window._showCliSessions && cliSessionCount===0){ + _sessionSourceFilter='webui'; + } + const sourceFiltered = _sessionSourceFilter==='cli' + ? 
withMessages.filter(s=>_isCliSession(s)) + : withMessages.filter(s=>!_isCliSession(s)); // The server is authoritative for profile scoping (#1611): it filters by // active profile when no query param is set, and returns the aggregate when // we send ?all_profiles=1. The renamed-root cross-alias (a row tagged @@ -3231,7 +3278,7 @@ function renderSessionListFromCache(){ // in _profiles_match, and a strict-equality client filter would reject those // rows incorrectly. So we trust the wire data and skip the redundant client // filter entirely. - const profileFiltered=withMessages; + const profileFiltered=sourceFiltered; // Filter by active project. NO_PROJECT_FILTER sentinel asks for sessions // with no project_id; otherwise filter to the matching project_id, or // pass through when no filter is active. @@ -3271,6 +3318,21 @@ function renderSessionListFromCache(){ list.appendChild(batchBar); if(_sessionSelectMode&&_selectedSessions.size>0){batchBar.style.display='flex';_renderBatchActionBar();} else{batchBar.style.display='none';} + if(window._showCliSessions || cliSessionCount>0){ + const sourceTabs=document.createElement('div'); + sourceTabs.className='session-source-tabs'; + for(const filter of ['webui','cli']){ + const count=filter==='cli'?cliSessionCount:webuiSessionCount; + const btn=document.createElement('button'); + btn.type='button'; + btn.className='session-source-tab'+(_sessionSourceFilter===filter?' active':''); + btn.textContent=_sessionSourceLabel(filter,count); + btn.setAttribute('aria-pressed', _sessionSourceFilter===filter?'true':'false'); + btn.onclick=()=>_setSessionSourceFilter(filter); + sourceTabs.appendChild(btn); + } + list.appendChild(sourceTabs); + } // Project filter bar — show when there are real projects OR there are // unassigned sessions (so the Unassigned chip has something to filter to). 
   const hasUnprojected=profileFiltered.some(s=>!s.project_id);
@@ -3353,9 +3415,14 @@ function renderSessionListFromCache(){
     list.appendChild(toggle);
   }
   // Empty state for active project filter
-  if(_activeProject&&sessions.length===0){
+  if(_sessionSourceFilter==='cli'&&sessions.length===0){
     const empty=document.createElement('div');
-    empty.style.cssText='padding:20px 14px;color:var(--muted);font-size:12px;text-align:center;opacity:.7;';
+    empty.className='session-empty-note';
+    empty.textContent=window._showCliSessions?'No CLI sessions found.':'Enable Show agent sessions in Settings to list CLI sessions here.';
+    list.appendChild(empty);
+  } else if(_activeProject&&sessions.length===0){
+    const empty=document.createElement('div');
+    empty.className='session-empty-note';
     empty.textContent=_activeProject===NO_PROJECT_FILTER?'No unassigned sessions.':'No sessions in this project yet.';
     list.appendChild(empty);
   }
diff --git a/static/style.css b/static/style.css
index 5afc85eb..7e607f9f 100644
--- a/static/style.css
+++ b/static/style.css
@@ -3353,6 +3353,11 @@ main.main.showing-logs > #mainLogs{display:flex;}
 .mermaid-rendered svg{max-width:100%;height:auto;}
 
 /* ── Session projects ── */
+.session-source-tabs{display:flex;gap:4px;padding:4px 10px 8px;flex-shrink:0;}
+.session-source-tab{flex:1;min-width:0;border:1px solid var(--border2);border-radius:10px;background:var(--input-bg);color:var(--muted);font-size:10px;font-weight:700;line-height:1.2;padding:5px 6px;cursor:pointer;white-space:nowrap;overflow:hidden;text-overflow:ellipsis;transition:background .15s,color .15s,border-color .15s;}
+.session-source-tab:hover{background:rgba(255,255,255,.08);color:var(--text);}
+.session-source-tab.active{background:var(--accent-bg);color:var(--accent-text);border-color:var(--accent-bg);}
+.session-empty-note{padding:20px 14px;color:var(--muted);font-size:12px;text-align:center;opacity:.7;}
 .project-bar{display:flex;gap:4px;padding:4px 10px
8px;flex-wrap:wrap;align-items:center;flex-shrink:0;}
 .project-chip{font-size:10px;font-weight:600;padding:3px 8px;border-radius:12px;cursor:pointer;border:1px solid var(--border2);background:var(--input-bg);color:var(--muted);transition:all .15s;white-space:nowrap;display:inline-flex;align-items:center;gap:4px;}
 .project-chip:hover{background:rgba(255,255,255,.08);color:var(--text);}
diff --git a/static/ui.js b/static/ui.js
index dc5595db..8ce7745f 100644
--- a/static/ui.js
+++ b/static/ui.js
@@ -1067,11 +1067,14 @@ function _ensureModelOptionInDropdown(modelId, sel, preferredProviderId){
   if(!modelId||!sel) return null;
   const applied=_applyModelToDropdown(modelId,sel,preferredProviderId);
   if(applied) return applied;
+  const value=modelId;
   const opt=document.createElement('option');
   opt.value=modelId;
   opt.textContent=typeof getModelLabel==='function'?getModelLabel(modelId):modelId;
   opt.dataset.custom='1';
-  const provider=preferredProviderId||_providerFromModelValue(modelId)||'';
+  const badge=(window._configuredModelBadges||{})[value];
+  if(badge&&badge.provider) opt.dataset.provider=badge.provider;
+  const provider=preferredProviderId||(badge&&badge.provider)||_providerFromModelValue(modelId)||'';
   if(provider) opt.dataset.provider=provider;
   sel.appendChild(opt);
   sel.value=modelId;
@@ -1578,7 +1581,7 @@ function renderModelDropdown(){
     }
     const badgeHtml=m.badge?`${esc(badgeLabel)}`:'';
     row.innerHTML=`