Merge pull request #2596 from nesquena/stage-386

Stage 386 / v0.51.93 — Release BQ — 10-PR full sweep batch
2026-05-26 03:30:36 +00:00 · 2026-05-19 11:42:46 -07:00
parent 71c70352c1 cf014f3c30
commit 0310fcc466
33 changed files with 1639 additions and 189 deletions
@@ -3,6 +3,27 @@
 ## [Unreleased]


+## [v0.51.93] — 2026-05-19 — Release BQ (stage-386 — 10-PR full sweep batch — RFC Slice 4 runner/sidecar gate + workspace tree toggle width CSS variable + settled file:// markdown link rendering + prompt-cache coverage percentage fix + terminal shell shutdown reap + configured model picker provider preservation + profile-aware assistant display names + state.db reconciliation slice 1 + queued-message cross-session drain fix + stale-stream writeback supersede)
+
+### Fixed
+
+- **PR #2580** by @Michaelyklam (refs #2571) — Centralize the workspace-tree toggle slot width into a `--file-tree-toggle-width` CSS variable at `:root`, referenced from both `.file-tree-toggle` and `.file-tree-toggle-placeholder` so a future width adjustment can't silently desync the two rules. Closes the follow-up issue filed against PR #2563 / v0.51.92.
+- **PR #2576** by @dobby-d-elf (closes #470) — Preserve labeled `file://` links in settled markdown by rewriting them to `/api/media?path=...&inline=1` before the sanitizer drops them. The streamed and settled markdown paths are now symmetric on local-file anchors, while raw `file://` image sources continue to be blocked.
+- **PR #2579** by @starship-s (refs #2419, #2421) — Fix the prompt-cache hit percentage to display the fraction of the prompt served from cache (`cache_read / prompt_total`) instead of the meaningless `cache_read / (cache_read + cache_write)`. New `api/usage.py` `prompt_cache_hit_percent()` helper matches Hermes Agent's log convention; UI labels updated across all locales.
+- **PR #2582** by @Michaelyklam (refs #2577) — Harden embedded workspace-terminal shell cleanup so graceful WebUI shutdowns close/reap every active PTY shell and the spawned shell receives a Linux parent-death signal (`PR_SET_PDEATHSIG`) if the WebUI process dies. The terminal close path now waits again after `SIGKILL` so timed-out shells don't remain unreaped.
+- **PR #2583** by @dobby-d-elf — Make assistant display names properly profile-aware. The saved assistant-name preference applies only to the literal `default` profile; named profiles use their own profile name. Centralizes `assistantDisplayName()` resolution across composer placeholder, `document.title` via `syncTopbar()`, message role labels via `_assistantRoleHtml()`, browser notifications, cancel-copy fallback, and empty-state on session delete.
+- **PR #2584** by @wirtsi (closes #2585) — Prevent queued follow-up messages from draining into the wrong chat when the user switches sessions during the 120ms `setBusy(false)` drain window. The drain-time guard re-queues against `sid` (not the currently-viewed session) and `_sendInProgressSid` captures the activeSid at the commit point so the re-entrant `send()` path no longer reads a stale `S.session.session_id`.
+- **PR #2587** by @AJV20 — Allow a still-running stream that was mistakenly marked interrupted by stale-pending recovery to replace its own recovery marker when it later finishes, while continuing to block stale writeback after any newer turn appends transcript content. Three new tests in `tests/test_session_sidecar_repair.py` cover the supersede-allowed case and the two refusal cases.
+- **PR #2588** by @Michaelyklam (refs #2569) — Preserve the configured provider when choosing a configured model from the composer picker. `_getOptionProviderId()` now reads `data-provider` from temporary `<option data-custom="1">` rows (created by `selectModelFromDropdown` for configured models outside the native catalog), so the next send routes through the correct provider instead of falling back to whatever provider was already active.
+
+### Changed
+
+- **PR #2581** by @LumenYoung (refs #2194) — First recovery slice from the closed reconciliation PR #2194. Routes streaming session reconstruction and sidebar metadata through the reconciled state.db/session-summary path with a metadata-only fast path for sidebar polls and a single-snapshot reuse on the streaming hot path. Includes the reviewer-requested `_new_turn_context_from_messages` extraction so both legacy and streaming paths share the `_drop_checkpointed_current_user_from_context` + casual-fresh-chat suppression behavior (refs #1217 / #2308). 923 LOC across `api/models.py`, `api/routes.py`, `api/streaming.py`, `static/sessions.js` + four new test files; second-pass agent diff review LGTM after the streaming-path regression was caught and fixed.
+
+### Documentation
+
+- **PR #2575** by @Michaelyklam (refs #1925) — Advance the runtime-adapter RFC to the Slice 4 runner/sidecar planning gate after #2560 shipped the queue-staging clarification. The RFC now marks queue routing as staged by default, defines Slice 4a as a docs/test contract before any runner code lands, and pins default-off feature-flagging, restart/reattach success criteria, control parity, profile/workspace payload isolation, and explicit non-goals for legacy-backend removal or server-side queue scheduler work.
+
 ## [v0.51.92] — 2026-05-19 — Release BP (stage-385 — 7-PR full sweep batch — RFC Slice 3c clarification + workspace tree icon alignment + project move cache refresh + auto-compression handoff metadata + Grok OAuth provider catalog + anonymous custom endpoint picker fallback + PWA standalone reload + pull-to-refresh)

 ### Fixed
@@ -19,6 +19,7 @@ from api.config import (
    get_effective_default_model, _get_session_agent_lock,
 )
 from api.workspace import get_last_workspace
+from api.usage import prompt_cache_hit_percent
 from api.agent_sessions import (
    _is_continuation_session,
    read_importable_agent_session_rows,
@@ -634,6 +635,7 @@ class Session:
            'estimated_cost': self.estimated_cost,
            'cache_read_tokens': self.cache_read_tokens,
            'cache_write_tokens': self.cache_write_tokens,
+            'cache_hit_percent': prompt_cache_hit_percent(self.cache_read_tokens, self.input_tokens),
            'personality': self.personality,
            'compression_anchor_visible_idx': self.compression_anchor_visible_idx,
            'compression_anchor_message_key': self.compression_anchor_message_key,
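
Review note: the `cache_hit_percent` field added above is computed by `prompt_cache_hit_percent`, whose body is not part of this excerpt. As a rough sketch of the `cache_read / prompt_total` ratio the changelog describes, the following assumes the second argument is the total prompt token count and that a missing or zero total yields `None` (the streaming usage default in this diff is also `None`). The function name, the clamping, and the one-decimal rounding are illustrative assumptions, not the shipped helper.

```python
from typing import Optional


def cache_hit_percent_sketch(cache_read_tokens, prompt_total_tokens) -> Optional[float]:
    """Hypothetical stand-in for api.usage.prompt_cache_hit_percent."""
    try:
        cache_read = max(0, int(cache_read_tokens or 0))
        prompt_total = int(prompt_total_tokens or 0)
    except (TypeError, ValueError):
        # Unparseable counters: report "unknown" rather than a fake number.
        return None
    if prompt_total <= 0:
        # No prompt tokens recorded yet, so a percentage is undefined.
        return None
    # Clamp (an assumption) so inconsistent counters never exceed 100%.
    return round(min(cache_read, prompt_total) / prompt_total * 100, 1)
```

With these assumptions, 750 cached tokens out of a 1000-token prompt reports 75.0, whereas the old `cache_read / (cache_read + cache_write)` ratio would ignore the uncached part of the prompt entirely.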
@@ -2226,17 +2228,15 @@ def _json_loads_if_string(value):
        return value


-def get_cli_session_messages(sid) -> list:
-    """Read messages for a single CLI/external-agent session.
+def get_state_db_session_messages(sid, *, stitch_continuations: bool = False) -> list:
+    """Read messages for a Hermes session from the active profile's state.db.

-    Preserve tool-call/result and reasoning metadata from the agent state.db so
-    CLI-origin transcripts render with the same tool cards as WebUI-native
-    sessions. When the requested session is the tip of a compression/CLI-close
-    continuation chain, return the stitched full transcript across all segments
-    in chronological order. Returns empty list on any error.
+    This generic reader intentionally works for any session source, including
+    WebUI-origin sessions that were later updated through another Hermes surface
+    such as the Gateway API Server.  When ``stitch_continuations`` is true it
+    preserves the historical CLI/external-agent behavior of walking compatible
+    compression/close parent segments before reading messages.
    """
-    if str(sid or '').startswith(f'{CLAUDE_CODE_SOURCE}_'):
-        return get_claude_code_session_messages(sid)
    try:
        import sqlite3
    except ImportError:
@@ -2267,47 +2267,48 @@ def get_cli_session_messages(sid) -> list:
            ]
            selected = ['role', 'content', 'timestamp'] + [c for c in optional if c in available]

-            cur.execute("PRAGMA table_info(sessions)")
-            session_cols = {str(row['name']) for row in cur.fetchall()}
            session_chain = [str(sid)]
-            if {'parent_session_id', 'end_reason', 'started_at', 'source'}.issubset(session_cols):
-                cur.execute(
-                    """
-                    SELECT id, source, started_at, parent_session_id, ended_at, end_reason
-                    FROM sessions
-                    WHERE id = ?
-                    """,
-                    (sid,),
-                )
-                rows_by_id = {}
-                row = cur.fetchone()
-                if row:
-                    rows_by_id[str(row['id'])] = dict(row)
-                    current_id = str(row['id'])
-                    seen = {current_id}
-                    for _ in range(20):
-                        current = rows_by_id.get(current_id)
-                        parent_id = current.get('parent_session_id') if current else None
-                        if not parent_id or parent_id in seen:
-                            break
-                        cur.execute(
-                            """
-                            SELECT id, source, started_at, parent_session_id, ended_at, end_reason
-                            FROM sessions
-                            WHERE id = ?
-                            """,
-                            (parent_id,),
-                        )
-                        parent_row = cur.fetchone()
-                        if not parent_row:
-                            break
-                        parent_dict = dict(parent_row)
-                        rows_by_id[str(parent_row['id'])] = parent_dict
-                        if not _is_continuation_session(parent_dict, current):
-                            break
-                        session_chain.insert(0, str(parent_row['id']))
-                        current_id = str(parent_row['id'])
-                        seen.add(current_id)
+            if stitch_continuations:
+                cur.execute("PRAGMA table_info(sessions)")
+                session_cols = {str(row['name']) for row in cur.fetchall()}
+                if {'parent_session_id', 'end_reason', 'started_at', 'source'}.issubset(session_cols):
+                    cur.execute(
+                        """
+                        SELECT id, source, started_at, parent_session_id, ended_at, end_reason
+                        FROM sessions
+                        WHERE id = ?
+                        """,
+                        (sid,),
+                    )
+                    rows_by_id = {}
+                    row = cur.fetchone()
+                    if row:
+                        rows_by_id[str(row['id'])] = dict(row)
+                        current_id = str(row['id'])
+                        seen = {current_id}
+                        for _ in range(20):
+                            current = rows_by_id.get(current_id)
+                            parent_id = current.get('parent_session_id') if current else None
+                            if not parent_id or parent_id in seen:
+                                break
+                            cur.execute(
+                                """
+                                SELECT id, source, started_at, parent_session_id, ended_at, end_reason
+                                FROM sessions
+                                WHERE id = ?
+                                """,
+                                (parent_id,),
+                            )
+                            parent_row = cur.fetchone()
+                            if not parent_row:
+                                break
+                            parent_dict = dict(parent_row)
+                            rows_by_id[str(parent_row['id'])] = parent_dict
+                            if not _is_continuation_session(parent_dict, current):
+                                break
+                            session_chain.insert(0, str(parent_row['id']))
+                            current_id = str(parent_row['id'])
+                            seen.add(current_id)

            placeholders = ', '.join('?' for _ in session_chain)
            cur.execute(f"""
@@ -2340,6 +2341,174 @@ def get_cli_session_messages(sid) -> list:
    return msgs


+def get_state_db_session_summary(sid) -> dict:
+    """Return cheap message count/max timestamp for one state.db session.
+
+    This is intentionally narrower than ``get_state_db_session_messages`` for
+    metadata-only WebUI polling: callers only need a staleness signal, not a
+    fully materialized transcript with tool/reasoning metadata.
+    """
+    import os
+    try:
+        import sqlite3
+    except ImportError:
+        return {}
+
+    db_path = _active_state_db_path()
+    if not sid or not db_path.exists():
+        return {}
+
+    try:
+        with closing(sqlite3.connect(str(db_path))) as conn:
+            conn.row_factory = sqlite3.Row
+            cur = conn.cursor()
+            cur.execute("PRAGMA table_info(messages)")
+            available = {str(row['name']) for row in cur.fetchall()}
+            if not {'session_id', 'timestamp'}.issubset(available):
+                return {}
+            cur.execute(
+                """
+                SELECT COUNT(*) AS message_count, MAX(timestamp) AS last_message_at
+                FROM messages
+                WHERE session_id = ?
+                """,
+                (str(sid),),
+            )
+            row = cur.fetchone()
+            if not row:
+                return {}
+            count = int(row['message_count'] or 0)
+            last_message_at = row['last_message_at']
+            result = {'message_count': count}
+            if last_message_at not in (None, ''):
+                try:
+                    result['last_message_at'] = float(last_message_at)
+                except (TypeError, ValueError):
+                    pass
+            return result
+    except Exception:
+        return {}
+
+
+def _normalized_message_timestamp_for_key(value):
+    if value is None or value == "":
+        return ""
+    try:
+        timestamp = float(value)
+    except (TypeError, ValueError):
+        return str(value)
+    if timestamp.is_integer():
+        return str(int(timestamp))
+    return ("%.6f" % timestamp).rstrip("0").rstrip(".")
+
+
+def _message_timestamp_as_float(msg):
+    if not isinstance(msg, dict):
+        return None
+    value = msg.get("timestamp")
+    if value is None or value == "":
+        return None
+    try:
+        return float(value)
+    except (TypeError, ValueError):
+        return None
+
+
+def _session_message_merge_key(msg: dict):
+    if not isinstance(msg, dict):
+        return ("non_dict", repr(msg))
+    message_identity = msg.get("id") or msg.get("message_id")
+    if message_identity:
+        return ("message_id", str(message_identity))
+    return (
+        "legacy",
+        str(msg.get("role") or ""),
+        str(msg.get("content") or ""),
+        _normalized_message_timestamp_for_key(msg.get("timestamp")),
+        str(msg.get("tool_call_id") or ""),
+        str(msg.get("tool_name") or msg.get("name") or ""),
+    )
+
+
+def merge_session_messages_append_only(sidecar_messages: list, state_messages: list) -> list:
+    """Merge sidecar/context and state.db messages without deleting local rows."""
+    sidecar_messages = list(sidecar_messages or [])
+    state_messages = list(state_messages or [])
+    if not state_messages:
+        return sidecar_messages
+    if not sidecar_messages:
+        return state_messages
+
+    merged_messages = []
+    seen_message_keys = set()
+    max_sidecar_timestamp = None
+    for msg in sidecar_messages:
+        timestamp = _message_timestamp_as_float(msg)
+        if timestamp is not None:
+            max_sidecar_timestamp = timestamp if max_sidecar_timestamp is None else max(max_sidecar_timestamp, timestamp)
+        key = _session_message_merge_key(msg)
+        seen_message_keys.add(key)
+        merged_messages.append(msg)
+    for msg in state_messages:
+        timestamp = _message_timestamp_as_float(msg)
+        key = _session_message_merge_key(msg)
+        if max_sidecar_timestamp is not None and timestamp is not None and timestamp <= max_sidecar_timestamp:
+            if key in seen_message_keys:
+                continue
+            if not (isinstance(key, tuple) and key[:1] == ("message_id",)):
+                continue
+        if key in seen_message_keys:
+            continue
+        # State rows at or before the newest sidecar timestamp are normally
+        # assumed to have already been observed by the sidecar. The <= gate
+        # preserves sidecar-only ordering/metadata for equal timestamps and
+        # prevents duplicate legacy rows when timestamp precision differs
+        # between stores. Explicit message ids are authoritative, though: two
+        # equal-timestamp messages with different ids are distinct retries.
+        if (
+            key[0] != "message_id"
+            and max_sidecar_timestamp is not None
+            and timestamp is not None
+            and timestamp <= max_sidecar_timestamp
+        ):
+            continue
+        seen_message_keys.add(key)
+        merged_messages.append(msg)
+    return merged_messages
+
+
+def reconciled_state_db_messages_for_session(
+    session, *, prefer_context: bool = False, state_messages: list | None = None
+) -> list:
+    """Return append-only messages reconciled with state.db for a WebUI session."""
+    if session is None:
+        return []
+    local_messages = []
+    if prefer_context:
+        context_messages = getattr(session, 'context_messages', None)
+        if isinstance(context_messages, list) and context_messages:
+            local_messages = context_messages
+    if not local_messages:
+        local_messages = getattr(session, 'messages', None) or []
+    if state_messages is None:
+        state_messages = get_state_db_session_messages(getattr(session, 'session_id', None))
+    return merge_session_messages_append_only(local_messages, state_messages)
+
+
+def get_cli_session_messages(sid) -> list:
+    """Read messages for a single CLI/external-agent session.
+
+    Preserve tool-call/result and reasoning metadata from the agent state.db so
+    CLI-origin transcripts render with the same tool cards as WebUI-native
+    sessions. When the requested session is the tip of a compression/CLI-close
+    continuation chain, return the stitched full transcript across all segments
+    in chronological order. Returns empty list on any error.
+    """
+    if str(sid or '').startswith(f'{CLAUDE_CODE_SOURCE}_'):
+        return get_claude_code_session_messages(sid)
+    return get_state_db_session_messages(sid, stitch_continuations=True)
+
+
 def count_conversation_rounds(sid: str, since: float | None = None) -> int:
    """Count conversation rounds for a session from state.db.

@@ -2220,6 +2220,9 @@ from api.models import (
    import_cli_session,
    get_cli_sessions,
    get_cli_session_messages,
+    get_state_db_session_messages,
+    get_state_db_session_summary,
+    merge_session_messages_append_only,
    ensure_cron_project,
    is_cron_session,
 )
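
Review note: the dedupe behavior of `merge_session_messages_append_only` depends on how merge keys are built. The sketch below mirrors the two key helpers from the `api/models.py` hunk (renamed here for illustration, and omitting the non-dict guard) to show two properties: legacy rows whose timestamps differ only in representation collapse to one key, while rows with explicit ids stay distinct even when content and timestamp match. The sample rows are made up.

```python
def normalized_timestamp(value):
    # Mirrors _normalized_message_timestamp_for_key from the hunk above.
    if value is None or value == "":
        return ""
    try:
        timestamp = float(value)
    except (TypeError, ValueError):
        return str(value)
    if timestamp.is_integer():
        return str(int(timestamp))
    return ("%.6f" % timestamp).rstrip("0").rstrip(".")


def merge_key(msg):
    # Mirrors _session_message_merge_key: explicit ids win over content keys.
    message_identity = msg.get("id") or msg.get("message_id")
    if message_identity:
        return ("message_id", str(message_identity))
    return (
        "legacy",
        str(msg.get("role") or ""),
        str(msg.get("content") or ""),
        normalized_timestamp(msg.get("timestamp")),
        str(msg.get("tool_call_id") or ""),
        str(msg.get("tool_name") or msg.get("name") or ""),
    )


# Same logical message stored with an int in one place and a string in another.
sidecar_row = {"role": "user", "content": "hi", "timestamp": 1716000000}
state_row = {"role": "user", "content": "hi", "timestamp": "1716000000.0"}

# Two retries with identical content and timestamp but different ids.
retry_a = {"id": "m1", "role": "assistant", "content": "ok", "timestamp": 5}
retry_b = {"id": "m2", "role": "assistant", "content": "ok", "timestamp": 5}

same_legacy_key = merge_key(sidecar_row) == merge_key(state_row)
distinct_retries = merge_key(retry_a) != merge_key(retry_b)
```

Both flags evaluate to `True`, which is why the merge can skip the duplicated legacy row while still appending the second retry.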
@@ -3665,8 +3668,17 @@ def handle_get(handler, parsed) -> bool:
            cli_meta = _lookup_cli_session_metadata(sid) if _session_requires_cli_metadata_lookup(s) else {}
            is_messaging_session = _is_messaging_session_record(s) or _is_messaging_session_record(cli_meta)
            cli_messages = []
+            state_db_messages = []
+            state_db_summary = {}
            if is_messaging_session:
                cli_messages = get_cli_session_messages(sid)
+            elif load_messages:
+                state_db_messages = get_state_db_session_messages(sid)
+            else:
+                # Metadata-only callers (frontend refresh polling) only need a
+                # cheap staleness signal. Avoid full transcript materialization
+                # on the steady-state polling path.
+                state_db_summary = get_state_db_session_summary(sid)
            _t2 = _time.monotonic()
            effective_model = (
                _resolve_effective_session_model_for_display(s)
@@ -3690,9 +3702,32 @@ def handle_get(handler, parsed) -> bool:
                    # them chronologically and dedupe exact repeats.
                    _all_msgs = _merged_session_messages_for_display(s, cli_messages)
                else:
-                    _all_msgs = s.messages
+                    _all_msgs = merge_session_messages_append_only(s.messages, state_db_messages)
            else:
-                _all_msgs = []
+                if is_messaging_session and cli_messages:
+                    sidecar_messages = getattr(s, "messages", []) or []
+                    _all_msgs = merge_session_messages_append_only(cli_messages, sidecar_messages)
+                else:
+                    _all_msgs = merge_session_messages_append_only(getattr(s, "messages", []) or [], state_db_messages)
+            if not load_messages and state_db_summary:
+                sidecar_messages = getattr(s, "messages", []) or []
+                sidecar_count = len(sidecar_messages)
+                try:
+                    sidecar_last = max(
+                        float((m or {}).get("timestamp") or 0)
+                        for m in sidecar_messages
+                        if isinstance(m, dict)
+                    ) if sidecar_messages else 0
+                except (TypeError, ValueError):
+                    sidecar_last = 0
+                state_count = int(state_db_summary.get("message_count") or 0)
+                state_last = float(state_db_summary.get("last_message_at") or 0)
+                _all_msgs = sidecar_messages
+                _summary_message_count = max(sidecar_count, state_count)
+                _summary_last_message_at = max(sidecar_last, state_last)
+            else:
+                _summary_message_count = None
+                _summary_last_message_at = None
            if load_messages:
                if msg_before is not None:
                    # Scroll-to-top paging: msg_before is a 0-based index into
@@ -3708,7 +3743,7 @@ def handle_get(handler, parsed) -> bool:
                else:
                    _truncated_msgs = _all_msgs
            else:
-                _truncated_msgs = _all_msgs
+                _truncated_msgs = []
            # Resolve effective context_length with model-metadata fallback so
            # older sessions (pre-#1318) that have context_length=0 persisted
            # still render a meaningful indicator on load.  Mirrors the
@@ -3748,8 +3783,20 @@ def handle_get(handler, parsed) -> bool:
                # messages already carry per-message tool metadata. Avoid sending
                # the full historical list with a small tail window.
                _session_tool_calls = []
+            _merged_message_count = _summary_message_count if _summary_message_count is not None else len(_all_msgs)
+            _merged_last_message_at = _summary_last_message_at if _summary_last_message_at is not None else 0
+            if _summary_last_message_at is None and _all_msgs:
+                try:
+                    _merged_last_message_at = max(
+                        float((m or {}).get("timestamp") or 0)
+                        for m in _all_msgs
+                        if isinstance(m, dict)
+                    )
+                except (TypeError, ValueError):
+                    _merged_last_message_at = 0
            raw = s.compact() | {
                "messages": _truncated_msgs,
+                "message_count": _merged_message_count,
                "tool_calls": _session_tool_calls,
                "active_stream_id": getattr(s, "active_stream_id", None),
                "pending_user_message": getattr(s, "pending_user_message", None),
@@ -3769,6 +3816,15 @@ def handle_get(handler, parsed) -> bool:
                        journal,
                        active=bool(getattr(s, "active_stream_id", None)),
                    )
+            if _merged_last_message_at:
+                raw["last_message_at"] = max(
+                    float(raw.get("last_message_at") or 0),
+                    _merged_last_message_at,
+                )
+                raw["updated_at"] = max(
+                    float(raw.get("updated_at") or 0),
+                    _merged_last_message_at,
+                )
            if cli_meta and _is_messaging_session_record(cli_meta):
                raw = _merge_cli_sidebar_metadata(raw, cli_meta)
            # Signal to the frontend that older messages were omitted.
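
Review note: the metadata-only path in this hunk combines a cheap `COUNT`/`MAX` summary from state.db with the sidecar's own message list. A minimal sketch of that combination against an in-memory SQLite database follows; the three-column `messages` schema and the helper name `summarize_session` are assumptions for illustration, and the real state.db carries more columns.

```python
import sqlite3
from contextlib import closing


def summarize_session(conn, sid):
    """COUNT/MAX summary for one session, in the spirit of get_state_db_session_summary."""
    row = conn.execute(
        "SELECT COUNT(*) AS message_count, MAX(timestamp) AS last_message_at "
        "FROM messages WHERE session_id = ?",
        (str(sid),),
    ).fetchone()
    summary = {"message_count": int(row[0] or 0)}
    if row[1] not in (None, ""):
        summary["last_message_at"] = float(row[1])
    return summary


with closing(sqlite3.connect(":memory:")) as conn:
    # Hypothetical minimal schema for the demo.
    conn.execute("CREATE TABLE messages (session_id TEXT, role TEXT, timestamp REAL)")
    conn.executemany(
        "INSERT INTO messages VALUES (?, ?, ?)",
        [("s1", "user", 10.0), ("s1", "assistant", 12.5), ("s2", "user", 99.0)],
    )
    state_summary = summarize_session(conn, "s1")
    empty_summary = summarize_session(conn, "missing")

# The sidecar copy has seen one message; state.db has seen two.
sidecar_messages = [{"role": "user", "content": "hi", "timestamp": 10.0}]
merged_count = max(len(sidecar_messages), state_summary["message_count"])
merged_last = max(
    max(float(m.get("timestamp") or 0) for m in sidecar_messages),
    state_summary.get("last_message_at", 0),
)
```

Taking the maximum of both sources means a poll reports the newer count and timestamp without materializing the transcript, which is the staleness signal the frontend needs.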
@@ -39,6 +39,8 @@ from api.compression_anchor import visible_messages_for_anchor
 from api.metering import meter
 from api.run_journal import RunJournalWriter
 from api.turn_journal import append_turn_journal_event_for_stream
+from api.usage import prompt_cache_hit_percent
+from api.models import get_state_db_session_messages, reconciled_state_db_messages_for_session

 # Global lock for os.environ writes. Per-session locks (_agent_lock) prevent
 # concurrent runs of the SAME session, but two DIFFERENT sessions can still
@@ -247,6 +249,13 @@ def _preferred_agent_display_name() -> str:
    return name or 'Hermes'


+def _preferred_agent_display_name_for_session(session) -> str:
+    profile = str(getattr(session, 'profile', '') or '').strip()
+    if profile and profile != 'default':
+        return profile[:1].upper() + profile[1:]
+    return _preferred_agent_display_name()
+
+
 def _cancelled_turn_hint(agent_name: str | None = None) -> str:
    name = str(agent_name or _preferred_agent_display_name()).strip() or 'Hermes'
    return f'The run was cancelled by the user before {name} finished. No provider failure occurred.'
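
Review note: the profile-aware naming rule added in this hunk can be exercised in isolation. The sketch below copies the branch logic of `_preferred_agent_display_name_for_session` and replaces the call to `_preferred_agent_display_name()` with a plain parameter (`saved_default_name`, a hypothetical stand-in) so it runs without the rest of the module.

```python
from types import SimpleNamespace


def display_name_for_session(session, saved_default_name="Hermes"):
    # saved_default_name stands in for _preferred_agent_display_name().
    profile = str(getattr(session, "profile", "") or "").strip()
    if profile and profile != "default":
        # Named profiles use their own name with the first letter capitalized.
        return profile[:1].upper() + profile[1:]
    # The literal "default" profile (or no profile) uses the saved preference.
    return saved_default_name


names = [
    display_name_for_session(SimpleNamespace(profile="default"), "Ada"),
    display_name_for_session(SimpleNamespace(profile="researcher"), "Ada"),
    display_name_for_session(SimpleNamespace(profile="  "), "Ada"),
    display_name_for_session(object(), "Ada"),
]
```

Only the `researcher` session resolves to its profile name; blank, missing, and `default` profiles all fall back to the saved preference.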
@@ -398,14 +407,14 @@ def _session_has_cancel_marker(session) -> bool:
    return False


-def _cancelled_turn_content(message: str = 'Task cancelled.') -> str:
+def _cancelled_turn_content(message: str = 'Task cancelled.', agent_name: str | None = None) -> str:
    """Return cancelled-turn copy matching the verbose provider-error layout."""
    _message = str(message or 'Task cancelled.').strip()
    if not _message.endswith('.'):
        _message += '.'
    return (
        f"**Task cancelled:** {_message}\n\n"
-        f"*{_cancelled_turn_hint()}*"
+        f"*{_cancelled_turn_hint(agent_name)}*"
    )


@@ -422,9 +431,10 @@ def _persist_cancelled_turn(session, *, message: str = 'Task cancelled.') -> Non
    session.pending_attachments = []
    session.pending_started_at = None
    if not _session_has_cancel_marker(session):
+        agent_name = _preferred_agent_display_name_for_session(session)
        session.messages.append({
            'role': 'assistant',
-            'content': _cancelled_turn_content(message),
+            'content': _cancelled_turn_content(message, agent_name),
            '_error': True,
            'provider_details': str(message or 'Task cancelled.').strip(),
            'provider_details_label': 'Cancellation details',
@@ -2331,21 +2341,22 @@ def _has_task_resume_compaction_marker(messages):
    return False


+def _new_turn_context_from_messages(messages, msg_text):
+    """Return provider-facing history for a new user turn from a message list."""
+    history = _drop_checkpointed_current_user_from_context(messages, msg_text)
+    if _is_casual_fresh_chat_message(msg_text) and _has_task_resume_compaction_marker(history):
+        return []
+    return history
+
+
 def _context_messages_for_new_turn(session, msg_text):
    """Return provider-facing history for a new user turn.

    Compacted agent sessions can carry a hidden "resume the active task" summary
-    long after the visible UI looks like normal chat.  A short greeting should
-    not silently reactivate that old task; explicit continuation prompts still
-    keep the full compacted context.
+    in context_messages. If the user starts a fresh casual greeting in that old
+    session, do not feed that stale active-task summary back to the model.
    """
-    history = _drop_checkpointed_current_user_from_context(
-        _session_context_messages(session),
-        msg_text,
-    )
-    if _is_casual_fresh_chat_message(msg_text) and _has_task_resume_compaction_marker(history):
-        return []
-    return history
+    return _new_turn_context_from_messages(_session_context_messages(session), msg_text)


 def _stream_writeback_is_current(session, stream_id):
@@ -2358,6 +2369,53 @@ def _stream_writeback_is_current(session, stream_id):
    return bool(stream_id) and getattr(session, 'active_stream_id', None) == stream_id


+def _stream_writeback_can_supersede_recovery_marker(session, msg_text):
+    """Allow a finishing worker to replace its own stale-repair marker.
+
+    The stale-pending repair path can occasionally run while the original worker
+    is still alive but temporarily missing from the in-memory stream registry. It
+    clears ``active_stream_id`` and appends a "Response interrupted" marker. If
+    the original worker later finishes, treating ``active_stream_id is None`` as
+    stale drops the real answer and leaves the misleading marker visible.
+
+    This is intentionally narrow: only a session with no active/pending turn and
+    whose last visible row is the recovery marker for this exact user prompt may
+    be superseded. If a newer turn has appended anything after the marker, the
+    normal stale-writeback guard still wins.
+    """
+    if getattr(session, 'active_stream_id', None):
+        return False
+    if getattr(session, 'pending_user_message', None):
+        return False
+    if getattr(session, 'pending_attachments', None):
+        return False
+    messages = list(getattr(session, 'messages', None) or [])
+    if len(messages) < 2:
+        return False
+    last = messages[-1]
+    if not isinstance(last, dict) or not last.get('_error'):
+        return False
+    if last.get('type') != 'interrupted':
+        return False
+    content = str(last.get('content') or '')
+    if 'Response interrupted' not in content or 'WebUI process restarted' not in content:
+        return False
+
+    expected = ' '.join(str(msg_text or '').split())
+    if not expected:
+        return False
+    for msg in reversed(messages[:-1]):
+        if not isinstance(msg, dict):
+            continue
+        if msg.get('_error'):
+            continue
+        if msg.get('role') != 'user':
+            continue
+        actual = ' '.join(str(msg.get('content') or '').split())
+        return actual == expected
+    return False
+
+
 def _merge_display_messages_after_agent_result(previous_display, previous_context, result_messages, msg_text):
    """Keep UI transcript durable while allowing model context to compact.

@@ -2988,6 +3046,7 @@ def _run_agent_streaming(
            'estimated_cost': 0,
            'cache_read_tokens': 0,
            'cache_write_tokens': 0,
+            'cache_hit_percent': None,
            'context_length': 0,
            'threshold_tokens': 0,
            'last_prompt_tokens': 0,
@@ -3025,6 +3084,10 @@ def _run_agent_streaming(
                        pass

        _real_prompt_tokens = int(_usage.get('last_prompt_tokens') or 0)
+        _usage['cache_hit_percent'] = prompt_cache_hit_percent(
+            _usage.get('cache_read_tokens') or 0,
+            _usage.get('input_tokens') or 0,
+        )
        if _real_prompt_tokens and _real_prompt_tokens != _live_prompt_exact_tokens[0]:
            _live_prompt_exact_tokens[0] = _real_prompt_tokens
            _live_prompt_estimate_tokens[0] = _real_prompt_tokens
@@ -3957,8 +4020,21 @@ def _run_agent_streaming(
            # or has been zeroed out (e.g. via a buggy migration / manual file edit).
            # Truthy-check covers None, missing-attr, and 0 uniformly.
            _turn_started_at = _pending_started_at if _pending_started_at else time.time()
-            _previous_messages = list(s.messages or [])
-            _previous_context_messages = _context_messages_for_new_turn(s, msg_text)
+            _external_state_messages = get_state_db_session_messages(getattr(s, 'session_id', None))
+            _previous_messages = list(
+                reconciled_state_db_messages_for_session(
+                    s,
+                    state_messages=_external_state_messages,
+                ) or []
+            )
+            _previous_context_messages = _new_turn_context_from_messages(
+                reconciled_state_db_messages_for_session(
+                    s,
+                    prefer_context=True,
+                    state_messages=_external_state_messages,
+                ),
+                msg_text,
+            )
            _pre_compression_count = getattr(
                getattr(agent, 'context_compressor', None),
                'compression_count', 0,
@@ -4083,13 +4159,20 @@ def _run_agent_streaming(
                return
            with _agent_lock:
                if not ephemeral and not _stream_writeback_is_current(s, stream_id):
-                    logger.info(
-                        "Skipping stale stream writeback for session %s stream %s; active_stream_id=%s",
-                        getattr(s, 'session_id', session_id),
-                        stream_id,
-                        getattr(s, 'active_stream_id', None),
-                    )
-                    return
+                    if _stream_writeback_can_supersede_recovery_marker(s, msg_text):
+                        logger.info(
+                            "Superseding stale recovery marker for session %s stream %s",
+                            getattr(s, 'session_id', session_id),
+                            stream_id,
+                        )
+                    else:
+                        logger.info(
+                            "Skipping stale stream writeback for session %s stream %s; active_stream_id=%s",
+                            getattr(s, 'session_id', session_id),
+                            stream_id,
+                            getattr(s, 'active_stream_id', None),
+                        )
+                        return
                _result_messages = result.get('messages') or _previous_context_messages
                if cancel_event.is_set():
                    _finalize_cancelled_turn(s, ephemeral=False)
@@ -4474,6 +4557,15 @@ def _run_agent_streaming(
                estimated_cost = getattr(agent, 'session_estimated_cost_usd', None)
                cache_read_tokens = getattr(agent, 'session_cache_read_tokens', 0) or 0
                cache_write_tokens = getattr(agent, 'session_cache_write_tokens', 0) or 0
+                prev_input_tokens = getattr(s, 'input_tokens', 0) or 0
+                prev_cache_read_tokens = getattr(s, 'cache_read_tokens', 0) or 0
+                turn_input_tokens = max(0, input_tokens - prev_input_tokens)
+                turn_cache_read_tokens = max(0, cache_read_tokens - prev_cache_read_tokens)
+                # Per-turn percent is computed server-side from persisted session
+                # counters so the message label uses the same denominator as the
+                # final usage payload even if the browser missed an intermediate event.
+                cache_hit_percent = prompt_cache_hit_percent(cache_read_tokens, input_tokens)
+                turn_cache_hit_percent = prompt_cache_hit_percent(turn_cache_read_tokens, turn_input_tokens)
                if input_tokens > 0:
                    s.input_tokens = input_tokens
                if output_tokens > 0:
@@ -4730,6 +4822,8 @@ def _run_agent_streaming(
                'estimated_cost': estimated_cost,
                'cache_read_tokens': cache_read_tokens,
                'cache_write_tokens': cache_write_tokens,
+                'cache_hit_percent': cache_hit_percent,
+                'turn_cache_hit_percent': turn_cache_hit_percent,
                'duration_seconds': round(_turn_duration_seconds, 3),
            }
            if _turn_tps is not None:
@@ -5555,7 +5649,10 @@ def cancel_stream(stream_id: str) -> bool:
                if not _cancel_marker_exists:
                    _cs.messages.append({
                        'role': 'assistant',
-                        'content': _cancelled_turn_content('Task cancelled.'),
+                        'content': _cancelled_turn_content(
+                            'Task cancelled.',
+                            _preferred_agent_display_name_for_session(_cs),
+                        ),
                        '_error': True,
                        'provider_details': 'Task cancelled.',
                        'provider_details_label': 'Cancellation details',
@@ -9,6 +9,7 @@ in the agent execution layer.
 from __future__ import annotations

 import errno
+import atexit
 import codecs
 import fcntl
 import os
@@ -69,6 +70,20 @@ _TERMINALS: dict[str, TerminalSession] = {}
 _LOCK = threading.RLock()


+def _terminal_shell_preexec_fn() -> None:
+    """Ask Linux to terminate the PTY shell when the WebUI parent dies."""
+    try:
+        import ctypes
+
+        libc = ctypes.CDLL(None)
+        libc.prctl(1, signal.SIGTERM)  # PR_SET_PDEATHSIG=1, SIGTERM=15
+    except Exception:
+        # Non-Linux platforms or restricted runtimes should still be able to
+        # open an embedded terminal; they just do not get the Linux pdeathsig
+        # hardening.
+        pass
+
+
 def _decode_terminal_output(decoder, data: bytes) -> str:
    """Decode PTY bytes without stripping terminal control sequences."""
    return decoder.decode(data)
@@ -178,6 +193,7 @@ def start_terminal(session_id: str, workspace: Path, rows: int = 24, cols: int =
            stdout=slave_fd,
            stderr=slave_fd,
            close_fds=True,
+            preexec_fn=_terminal_shell_preexec_fn,
            start_new_session=True,
        )
        os.close(slave_fd)
@@ -240,9 +256,24 @@ def close_terminal(session_id: str) -> bool:
                    os.killpg(term.proc.pid, signal.SIGKILL)
                except ProcessLookupError:
                    pass
+                try:
+                    term.proc.wait(timeout=1.0)
+                except (subprocess.TimeoutExpired, ProcessLookupError):
+                    pass
    finally:
        try:
            os.close(term.master_fd)
        except OSError:
            pass
    return True
+
+
+def close_all_terminals() -> None:
+    """Best-effort reap of embedded shells during graceful WebUI shutdown."""
+    with _LOCK:
+        session_ids = list(_TERMINALS)
+    for session_id in session_ids:
+        close_terminal(session_id)
+
+
+atexit.register(close_all_terminals)
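
Reviewer sketch (not part of the commit): the hunks above combine a Linux parent-death signal, a process-group kill, and a second wait after `SIGKILL`. The standalone example below shows how those pieces might fit together outside the WebUI. `spawn_and_reap` and `_set_parent_death_signal` are hypothetical names, the example assumes a POSIX host, and it escalates `SIGTERM` to `SIGKILL` in a simple loop rather than reproducing `close_terminal()` exactly.

```python
import ctypes
import os
import signal
import subprocess
import sys
from typing import Optional

PR_SET_PDEATHSIG = 1  # constant value from <linux/prctl.h>


def _set_parent_death_signal() -> None:
    """Runs in the child before exec; silently does nothing without prctl."""
    try:
        ctypes.CDLL(None).prctl(PR_SET_PDEATHSIG, int(signal.SIGTERM))
    except Exception:
        # Non-Linux platforms have no prctl symbol; the child still starts.
        pass


def spawn_and_reap(cmd, timeout: float = 1.0) -> Optional[int]:
    """Start cmd in its own session, stop it, and return its exit status."""
    proc = subprocess.Popen(
        cmd,
        preexec_fn=_set_parent_death_signal,
        start_new_session=True,  # child becomes a group leader: pgid == pid
    )
    for sig in (signal.SIGTERM, signal.SIGKILL):
        try:
            os.killpg(proc.pid, sig)
        except ProcessLookupError:
            pass  # the group is already gone
        try:
            # Waiting after each signal is what reaps the child.
            return proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            continue
    return proc.poll()  # best effort: None if the child is still not reaped
```

A call such as `spawn_and_reap([sys.executable, "-c", "import time; time.sleep(30)"])` should return the negative signal number once the child has been reaped. One caveat: the Python `subprocess` documentation warns that `preexec_fn` is not safe in multi-threaded programs, so this pattern may deserve a second look in a threaded server.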
@@ -0,0 +1,26 @@
+"""Usage metric helpers for WebUI display payloads.
+
+Prompt-cache hit percentage is cached prompt reads over the full prompt total
+(input + cache reads + cache writes). Keep this calculation in the backend so
+browser display code cannot drift across context indicator and per-turn labels.
+"""
+
+
+def _to_int(value) -> int:
+    try:
+        return int(value or 0)
+    except (TypeError, ValueError):
+        return 0
+
+
+def prompt_cache_hit_percent(cache_read_tokens, prompt_tokens):
+    """Return cached reads as a percent of full prompt-token total.
+
+    ``prompt_tokens`` must include ordinary input, cache reads, and cache writes
+    (matching Agent's ``session_prompt_tokens`` value).
+    """
+    cache_read = _to_int(cache_read_tokens)
+    prompt = _to_int(prompt_tokens)
+    if cache_read <= 0 or prompt <= 0:
+        return None
+    return min(100, round((cache_read / prompt) * 100))
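
Reviewer sketch (not part of the commit): a worked example of the denominator change, using made-up token counts. The helper is reproduced from the hunk above so the block runs on its own. `legacy_read_write_percent` and `turn_percent` are hypothetical names: the first stands in for the old `cache_read / (cache_read + cache_write)` display described in the changelog, the second mirrors the per-turn delta arithmetic added to `_run_agent_streaming`.

```python
def _to_int(value) -> int:
    try:
        return int(value or 0)
    except (TypeError, ValueError):
        return 0


def prompt_cache_hit_percent(cache_read_tokens, prompt_tokens):
    """Cached reads as a percent of the full prompt-token total."""
    cache_read = _to_int(cache_read_tokens)
    prompt = _to_int(prompt_tokens)
    if cache_read <= 0 or prompt <= 0:
        return None
    return min(100, round((cache_read / prompt) * 100))


def legacy_read_write_percent(cache_read_tokens, cache_write_tokens):
    """Hypothetical stand-in for the previous read / (read + write) ratio."""
    total = cache_read_tokens + cache_write_tokens
    return round(cache_read_tokens / total * 100) if total > 0 else None


def turn_percent(prev_input, prev_read, input_tokens, cache_read_tokens):
    """Per-turn percent from the change in cumulative session counters."""
    turn_input = max(0, input_tokens - prev_input)
    turn_read = max(0, cache_read_tokens - prev_read)
    return prompt_cache_hit_percent(turn_read, turn_input)


# Made-up session: 10,000 prompt tokens, 2,000 read from cache, 500 written.
print(prompt_cache_hit_percent(2_000, 10_000))      # 20
print(legacy_read_write_percent(2_000, 500))        # 80
# Next turn moves the counters to 16,000 prompt tokens and 6,500 cache reads.
print(turn_percent(10_000, 2_000, 16_000, 6_500))   # 75
# Edge cases: no reads, reads exceeding the total, and junk input.
print(prompt_cache_hit_percent(0, 10_000))          # None
print(prompt_cache_hit_percent(12_000, 10_000))     # 100
print(prompt_cache_hit_percent("abc", 10_000))      # None
```

With these numbers the old ratio reports 80% even though only a fifth of the prompt came from cache, which is the mismatch the changelog entry for PR #2579 describes.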
@@ -52,7 +52,7 @@ The immediate goal is not to build a sidecar. The immediate goal is to define th
 browser contract, classify current runtime state, and gate the first reversible
 journal slice.

-## Current Gate State — 2026-05-18
+## Current Gate State — 2026-05-19

 Slice 1 is now past the first active validation gate:

@@ -90,14 +90,17 @@ adapter-seam work:
  `HERMES_WEBUI_RUNTIME_ADAPTER=legacy-journal` is enabled, while preserving the
  legacy-direct response shape and leaving post-turn goal evaluation in the
  existing agent loop.
+- #2560 shipped the queue-staging clarification in v0.51.92. The RFC now treats
+  `queue_message(...)` as a staged protocol method only; `/queue` remains
+  browser-side queue/drain behavior, and no server-side queue endpoint or queue
+  scheduler should be added merely for adapter symmetry.

-The next gate is still not the runner/sidecar by default. Slice 3c's goal route
-is shipped, and `queue_message(...)` remains a staged protocol method. Queue /
-continue routing needs an explicit follow-up contract because the legacy `/queue`
-path is browser-side queue/drain behavior today; no new server-side queue endpoint
-or queue scheduler should be added just for adapter symmetry. If maintainers want
-queue/continue to move before Slice 4, that follow-up should specify the exact
-legacy entry point, response shape, and ordering/idempotency contract first.
+The next gate is the runner/sidecar planning contract, not queue implementation
+by default. Queue / continue routing should only move before Slice 4 if a future
+maintainer decision identifies an existing server-side legacy entry point and
+pins its response shape, ordering, and idempotency contract. Otherwise, keeping
+`queue_message(...)` staged is the honest boundary while execution ownership
+moves out of the main WebUI request process.

 ## Goals

@@ -670,8 +673,19 @@ Non-goals for Slice 3c:

 ### Slice 4: Runner process / sidecar boundary

-Explicitly deferred until Slice 1 has worked in production for at least one
-release cycle and the adapter surface has review approval.
+Slice 4 is the first gate that may move active execution ownership out of the
+main WebUI request process. It should start as a docs/test contract PR before any
+runner code lands. Slice 1's journal/replay layer has shipped and passed active
+validation, Slice 2's default-off adapter seam has shipped, and Slice 3's
+cancel/approval/clarify/goal control routing has proven the protocol-translator
+pattern. Queue remains staged unless maintainers explicitly ask for a separate
+pre-runner queue route.
+
+The Slice 4 implementation must not make the adapter a new runtime surrogate.
+The runner boundary may own active execution, process supervision, run lifecycle,
+and callback state, but those responsibilities must be centralized behind the
+adapter/runner contract rather than recreated as scattered globals in the main
+WebUI server.

 Scope:

@@ -683,6 +697,55 @@ Scope:

 Revert path: disable runner backend and fall back to journaled legacy backend.

+#### Slice 4a: Runner contract gate
+
+Before runner code lands, define a narrow contract that covers:
+
+1. **Backend selection and rollback.** The existing `legacy-direct` and
+   `legacy-journal` paths remain available. Any new runner backend is
+   feature-flagged, default-off, and revertible by switching the adapter mode back
+   to `legacy-journal` without deleting sessions or journal files.
+2. **Process ownership.** The runner, not the main WebUI request process, owns
+   `AIAgent` construction/reuse, active run execution, cancellation flags,
+   approval/clarify callback wait state, and post-turn continuation evaluation
+   for runs assigned to that backend.
+3. **Durable observation.** The main WebUI server observes through
+   `RuntimeAdapter.observe_run(...)`, `get_run(...)`, and the journal cursor. A
+   WebUI restart must not be required for the runner to finish writing ordered
+   events and terminal state.
+4. **Restart/reattach success criterion.** Start a long-running run, restart only
+   `hermes-webui.service`, reload the session, rediscover the active or terminal
+   runner-owned run, replay/catch up from cursor without duplicate transcript /
+   tool / reasoning state, and preserve cancel if the run is still active.
+5. **Control parity.** Cancel, approval, clarify, goal status/control, and any
+   accepted queue/continue behavior route through adapter methods with stable
+   browser response shapes. Unsupported controls return bounded `ControlResult`
+   states instead of silently falling back to stale in-process state.
+6. **Profile/workspace isolation.** Runner startup receives explicit profile,
+   workspace, attachments, model/provider, toolset, and source metadata rather
+   than relying on process-global environment mutation in the WebUI server.
+
+Suggested contract tests before implementation:
+
+- source/RFC tests proving Slice 4 remains feature-flagged and default-off;
+- a fake-runner adapter test that simulates WebUI restart by discarding server
+  process-local state while preserving runner/journal state, then verifies
+  `get_run` and replay recover the same terminal state;
+- a control-parity fixture proving unsupported runner controls return bounded
+  `ControlResult` values and do not fall back to legacy `STREAMS` /
+  `CANCEL_FLAGS` state;
+- a profile/workspace payload test proving runner requests carry explicit context
+  fields without mutating global `os.environ` in the main WebUI process.
+
+Non-goals for Slice 4a:
+
+- no removal of the legacy in-process backend;
+- no default-on runner mode;
+- no public chat-start/status response-shape expansion;
+- no new server-side queue endpoint or scheduler just for adapter symmetry;
+- no dependency on Hermes Agent shipping `/v1/runs` before WebUI can validate the
+  local runner boundary.
+
 ## First Meaningful Success Criteria

 The first meaningful milestones are deliberately split.
@@ -1380,15 +1380,9 @@ function _buildSkinPicker(activeSkin){
 }

 function applyBotName(){
-  // Prefer profile name over global bot_name for personalised placeholder.
-  // If activeProfile is set and not 'default', use it (capitalised).
-  // Falls back to window._botName (global bot_name setting) or 'Hermes'.
-  let name;
-  if(S.activeProfile && S.activeProfile!=='default'){
-    name=S.activeProfile.charAt(0).toUpperCase()+S.activeProfile.slice(1);
-  }else{
-    name=window._botName||'Hermes';
-  }
+  // The saved assistant name applies to the default profile only.
+  // Non-default profiles use their own profile names.
+  const name=assistantDisplayName();
  document.title=name;
  const sidebarH1=document.querySelector('.sidebar-header h1');
  if(sidebarH1) sidebarH1.textContent=name;
@@ -1469,7 +1463,6 @@ function applyBotName(){
      setLocale(_lang);
      if(typeof applyLocaleToDOM==='function')applyLocaleToDOM();
    }
-    applyBotName();
    // TTS: apply enabled state on boot so buttons show/hide correctly (#499)
    if(typeof _applyTtsEnabled==='function') _applyTtsEnabled(localStorage.getItem('hermes-tts-enabled')==='true');
  }catch(e){
@@ -1497,7 +1490,6 @@ function applyBotName(){
      setLocale(_lang);
      if(typeof applyLocaleToDOM==='function')applyLocaleToDOM();
    }
-    applyBotName();
    if(typeof _applyTtsEnabled==='function') _applyTtsEnabled(localStorage.getItem('hermes-tts-enabled')==='true');
  }
  // Non-blocking update check (fire-and-forget, once per tab session)
@@ -1509,6 +1501,7 @@ function applyBotName(){
  }
  // Fetch active profile
  try{const p=await api('/api/profile/active');S.activeProfile=p.name||'default';}catch(e){S.activeProfile='default';}
+  applyBotName();
  // Update profile chip label immediately
  const profileLabel=$('profileChipLabel');
  if(profileLabel) profileLabel.textContent=S.activeProfile||'default';
@@ -215,6 +215,8 @@ const LOCALES = {
    focus_label: 'Focus',
    token_usage_on: 'Token usage on',
    token_usage_off: 'Token usage off',
+    usage_cache_hit_detail: 'Cache: {0}% hit ({1} read / {2} write)',
+    usage_cached_percent: '{0}% cached',
    theme_usage: 'Usage: /theme ',
    theme_set: 'Theme: ',
    no_active_session: 'No active session',
@@ -554,7 +556,7 @@ const LOCALES = {
    settings_label_sync_insights: 'Sync to insights',
    settings_label_check_updates: 'Check for updates',
    settings_label_whats_new_summary: "Summarize What's New with AI",
-    settings_label_bot_name: 'Assistant Name',
+    settings_label_bot_name: 'Default assistant name',
    settings_label_password: 'Access Password',
    settings_saved: 'Settings saved',
    settings_save_failed: 'Save failed: ',
@@ -793,7 +795,7 @@ const LOCALES = {
    settings_desc_sync_insights: 'Mirrors WebUI token usage to state.db so hermes /insights includes browser session data. Off by default.',
    settings_desc_check_updates: 'Show a banner when newer versions of the WebUI or Agent are available. Runs a background git fetch periodically.',
    settings_desc_whats_new_summary: "Changes the What's New action from opening the raw diff first to generating a short, human-readable summary. The regular diff comparison stays available after the summary.",
-    settings_desc_bot_name: 'Display name for the assistant throughout the UI. Defaults to Hermes.',
+    settings_desc_bot_name: 'Used for the default profile only. Other profiles use their own profile names.',
    settings_desc_password: 'Enter a new password to set or change it. Leave blank to keep current setting.',
    password_placeholder: 'Enter new password…',
    password_env_var_locked: 'The HERMES_WEBUI_PASSWORD environment variable is currently set and takes precedence. Unset it and restart the server to manage the password from here.',
@@ -1434,6 +1436,8 @@ const LOCALES = {
    focus_label: 'Focus',
    token_usage_on: 'Uso token attivo',
    token_usage_off: 'Uso token disattivo',
+    usage_cache_hit_detail: 'Cache: {0}% in cache ({1} letti / {2} scritti)',
+    usage_cached_percent: '{0}% in cache',
    theme_usage: 'Uso: /theme ',
    theme_set: 'Tema: ',
    no_active_session: 'Nessuna sessione attiva',
@@ -1773,7 +1777,7 @@ const LOCALES = {
    settings_label_sync_insights: 'Sincronizza con insights',
    settings_label_check_updates: 'Verifica aggiornamenti',
    settings_label_whats_new_summary: "Summarize What's New with AI",
-    settings_label_bot_name: 'Nome Assistente',
+    settings_label_bot_name: 'Nome assistente predefinito',
    settings_label_password: 'Password di Accesso',
    settings_saved: 'Impostazioni salvate',
    settings_save_failed: 'Salvataggio fallito: ',
@@ -2004,7 +2008,7 @@ const LOCALES = {
    settings_desc_sync_insights: 'Rispecchia l\'uso token WebUI su state.db così hermes /insights include i dati delle sessioni browser. Disattivato per impostazione predefinita.',
    settings_desc_check_updates: 'Mostra un banner quando sono disponibili versioni più recenti della WebUI o dell\'Agente. Esegue un git fetch in background periodicamente.',
    settings_desc_whats_new_summary: "Changes the What's New action from opening the raw diff first to generating a short, human-readable summary. The regular diff comparison stays available after the summary.",
-    settings_desc_bot_name: 'Nome visualizzato per l\'assistente in tutta l\'interfaccia. Predefinito: Hermes.',
+    settings_desc_bot_name: 'Usato solo per il profilo predefinito. Gli altri profili usano i propri nomi.',
    settings_desc_password: 'Inserisci una nuova password per impostarla o cambiarla. Lascia vuoto per mantenere l\'impostazione attuale.',
    password_placeholder: 'Inserisci nuova password…',
    password_env_var_locked: 'La variabile d\'ambiente HERMES_WEBUI_PASSWORD è attualmente impostata e ha la precedenza. Rimuovila e riavvia il server per gestire la password da qui.',
@@ -2645,6 +2649,8 @@ const LOCALES = {
    focus_label: 'フォーカス',
    token_usage_on: 'トークン使用量: ON',
    token_usage_off: 'トークン使用量: OFF',
+    usage_cache_hit_detail: 'キャッシュ: {0}% ヒット（読み取り {1} / 書き込み {2}）',
+    usage_cached_percent: '{0}% キャッシュ済み',
    theme_usage: '使い方: /theme ',
    theme_set: 'テーマ: ',
    no_active_session: 'アクティブなセッションがありません',
@@ -2984,7 +2990,7 @@ const LOCALES = {
    settings_label_sync_insights: 'インサイトに同期',
    settings_label_check_updates: 'アップデートを確認',
    settings_label_whats_new_summary: "Summarize What's New with AI",
-    settings_label_bot_name: 'アシスタント名',
+    settings_label_bot_name: 'デフォルトのアシスタント名',
    settings_label_password: 'アクセスパスワード',
    settings_saved: '設定を保存しました',
    settings_save_failed: '保存失敗: ',
@@ -3220,7 +3226,7 @@ const LOCALES = {
    settings_desc_sync_insights: 'WebUI のトークン使用量を state.db にミラーし、hermes /insights にブラウザセッションのデータを含めます。デフォルトはオフ。',
    settings_desc_check_updates: 'WebUI または Agent の新しいバージョンが利用可能な時にバナーを表示します。バックグラウンドで定期的に git fetch を実行します。',
    settings_desc_whats_new_summary: "Changes the What's New action from opening the raw diff first to generating a short, human-readable summary. The regular diff comparison stays available after the summary.",
-    settings_desc_bot_name: 'UI 全体で表示されるアシスタントの名前。デフォルトは Hermes。',
+    settings_desc_bot_name: 'デフォルトプロファイルでのみ使用されます。他のプロファイルはそれぞれのプロファイル名を使用します。',
    settings_desc_password: '新しいパスワードを入力すると設定または変更します。空欄なら現在の設定を維持。',
    password_placeholder: '新しいパスワードを入力…',
    password_env_var_locked: '現在 HERMES_WEBUI_PASSWORD 環境変数が設定されており優先されます。ここで管理するには変数を解除してサーバーを再起動してください。',
@@ -3817,6 +3823,8 @@ const LOCALES = {
    token_usage_on: 'Отображение токенов включено',
    usage_personality_none: 'none', // TODO: translate
    token_usage_off: 'Отображение токенов выключено',
+    usage_cache_hit_detail: 'Кэш: {0}% попаданий ({1} чтение / {2} запись)',
+    usage_cached_percent: '{0}% из кэша',
    theme_usage: 'Использование: /theme ',
    theme_set: 'Тема: ',
    no_active_session: 'Нет активной сессии',
@@ -4006,7 +4014,7 @@ const LOCALES = {
    settings_label_sync_insights: 'Синхронизировать с Insights',
    settings_label_check_updates: 'Проверять обновления',
    settings_label_whats_new_summary: "Summarize What's New with AI",
-    settings_label_bot_name: 'Имя помощника',
+    settings_label_bot_name: 'Имя помощника по умолчанию',
    settings_label_password: 'Пароль доступа',
    settings_saved: 'Настройки сохранены',
    settings_save_failed: 'Не удалось сохранить: ',
@@ -4191,7 +4199,7 @@ const LOCALES = {
    settings_desc_sync_insights: 'Синхронизирует использование токенов WebUI в state.db, чтобы Hermes /insights включал данные браузерных сеансов. Выключено по умолчанию.',
    settings_desc_check_updates: 'Показывает баннер, когда доступны более новые версии WebUI или Agent. Периодически выполняет git fetch в фоне.',
    settings_desc_whats_new_summary: "Changes the What's New action from opening the raw diff first to generating a short, human-readable summary. The regular diff comparison stays available after the summary.",
-    settings_desc_bot_name: 'Отображаемое имя помощника во всём интерфейсе. По умолчанию Hermes.',
+    settings_desc_bot_name: 'Используется только для профиля по умолчанию. Другие профили используют свои имена.',
    settings_desc_password: 'Введите новый пароль, чтобы задать или изменить его. Оставьте пустым, чтобы сохранить текущую настройку.',
    password_placeholder: 'Введите новый пароль…',
    password_env_var_locked: 'Переменная окружения HERMES_WEBUI_PASSWORD сейчас задана и имеет приоритет. Сбросьте её и перезапустите сервер, чтобы управлять паролем отсюда.',
@@ -5004,6 +5012,8 @@ const LOCALES = {
    token_usage_on: 'Uso de tokens activado',
    usage_personality_none: 'none', // TODO: translate
    token_usage_off: 'Uso de tokens desactivado',
+    usage_cache_hit_detail: 'Caché: {0}% de acierto ({1} lectura / {2} escritura)',
+    usage_cached_percent: '{0}% en caché',
    theme_usage: 'Uso: /theme ',
    theme_set: 'Tema: ',
    no_active_session: 'No hay ninguna sesión activa',
@@ -5146,7 +5156,7 @@ const LOCALES = {
    settings_label_sync_insights: 'Sincronizar con insights',
    settings_label_check_updates: 'Buscar actualizaciones',
    settings_label_whats_new_summary: "Summarize What's New with AI",
-    settings_label_bot_name: 'Nombre del asistente',
+    settings_label_bot_name: 'Nombre predeterminado del asistente',
    settings_label_password: 'Contraseña de acceso',
    settings_saved: 'Configuración guardada',
    settings_save_failed: 'Error al guardar: ',
@@ -5342,7 +5352,7 @@ const LOCALES = {
    settings_desc_sync_insights: 'Refleja el uso de tokens de la WebUI en state.db para que hermes /insights incluya datos de sesiones del navegador. Desactivado por defecto.',
    settings_desc_check_updates: 'Muestra un banner cuando haya versiones más nuevas de la WebUI o del Agent. Ejecuta periódicamente un git fetch en segundo plano.',
    settings_desc_whats_new_summary: "Changes the What's New action from opening the raw diff first to generating a short, human-readable summary. The regular diff comparison stays available after the summary.",
-    settings_desc_bot_name: 'Nombre visible del asistente en toda la UI. Por defecto es Hermes.',
+    settings_desc_bot_name: 'Solo se usa para el perfil predeterminado. Los otros perfiles usan sus propios nombres.',
    settings_desc_password: 'Introduce una nueva contraseña para establecerla o cambiarla. Déjalo en blanco para mantener la configuración actual.',
    password_placeholder: 'Introduce una contraseña nueva…',
    password_env_var_locked: 'La variable de entorno HERMES_WEBUI_PASSWORD está definida y tiene prioridad. Quítala y reinicia el servidor para gestionar la contraseña desde aquí.',
@@ -6128,6 +6138,8 @@ const LOCALES = {
    token_usage_on: 'Token-Verbrauch an',
    usage_personality_none: 'none', // TODO: translate
    token_usage_off: 'Token-Verbrauch aus',
+    usage_cache_hit_detail: 'Cache: {0}% Treffer ({1} gelesen / {2} geschrieben)',
+    usage_cached_percent: '{0}% im Cache',
    theme_usage: 'Nutzung: /theme ',
    theme_set: 'Theme: ',
    no_active_session: 'Keine aktive Sitzung',
@@ -6279,7 +6291,7 @@ const LOCALES = {
    settings_label_sync_insights: 'Mit Insights synchronisieren',
    settings_label_check_updates: 'Nach Updates suchen',
    settings_label_whats_new_summary: "Summarize What's New with AI",
-    settings_label_bot_name: 'Assistenten-Name',
+    settings_label_bot_name: 'Standard-Assistentenname',
    settings_label_password: 'Zugangspasswort',
    settings_saved: 'Einstellungen gespeichert',
    settings_save_failed: 'Speichern fehlgeschlagen: ',
@@ -6465,7 +6477,7 @@ const LOCALES = {
    settings_desc_sync_insights: 'Spiegelt den WebUI-Token-Verbrauch in die state.db, sodass hermes /insights Browser-Sitzungsdaten enthält. Standardmäßig aus.',
    settings_desc_check_updates: 'Zeigt ein Banner an, wenn neuere Versionen der WebUI oder des Agenten verfügbar sind. Führt regelmäßig einen Git-Fetch im Hintergrund aus.',
    settings_desc_whats_new_summary: "Changes the What's New action from opening the raw diff first to generating a short, human-readable summary. The regular diff comparison stays available after the summary.",
-    settings_desc_bot_name: 'Anzeigename für den Assistenten in der UI. Standardmäßig Hermes.',
+    settings_desc_bot_name: 'Wird nur für das Standardprofil verwendet. Andere Profile verwenden ihre eigenen Namen.',
    settings_desc_password: 'Geben Sie ein neues Passwort ein, um es zu setzen oder zu ändern. Leer lassen, um die aktuelle Einstellung beizubehalten.',
    password_placeholder: 'Neues Passwort eingeben…',
    password_env_var_locked: 'Die Umgebungsvariable HERMES_WEBUI_PASSWORD ist gesetzt und hat Vorrang. Entferne sie und starte den Server neu, um das Passwort hier zu verwalten.',
@@ -7303,6 +7315,8 @@ const LOCALES = {
    token_usage_on: 'Token 用量显示已开启',
    usage_personality_none: '无',
    token_usage_off: 'Token 用量显示已关闭',
+    usage_cache_hit_detail: '缓存：{0}% 命中（读取 {1} / 写入 {2}）',
+    usage_cached_percent: '{0}% 已缓存',
    theme_usage: '用法：/theme ',
    theme_set: '主题：',
    no_active_session: '当前没有活动会话',
@@ -7462,7 +7476,7 @@ const LOCALES = {
    settings_label_sync_insights: '同步到 insights',
    settings_label_check_updates: '检查更新',
    settings_label_whats_new_summary: "Summarize What's New with AI",
-    settings_label_bot_name: '助手名称',
+    settings_label_bot_name: '默认助手名称',
    settings_label_password: '访问密码',
    settings_saved: '设置已保存',
    settings_save_failed: '保存失败：',
@@ -7721,7 +7735,7 @@ const LOCALES = {
    settings_desc_sync_insights: '将 WebUI token 使用情况同步到 state.db，使 hermes /insights 包含浏览器会话数据。默认关闭。',
    settings_desc_check_updates: '当有更新的 WebUI 或助手版本时显示横幅。会在后台定期执行 git fetch。',
    settings_desc_whats_new_summary: "Changes the What's New action from opening the raw diff first to generating a short, human-readable summary. The regular diff comparison stays available after the summary.",
-    settings_desc_bot_name: '助手在 UI 中的显示名称。默认为 Hermes。',
+    settings_desc_bot_name: '仅用于默认个人资料。其他个人资料会使用各自的名称。',
    settings_desc_password: '输入新密码以设置或更改。留空保持当前设置。',
    // onboarding
    onboarding_badge: '首次运行',
@@ -8414,6 +8428,8 @@ const LOCALES = {
    focus_label: '\u4e3b\u984c',
    token_usage_on: 'Token \u7528\u91cf\u986f\u793a\u5df2\u958b\u555f',
    token_usage_off: 'Token \u7528\u91cf\u986f\u793a\u5df2\u95dc\u9589',
+    usage_cache_hit_detail: '快取：{0}% 命中（讀取 {1} / 寫入 {2}）',
+    usage_cached_percent: '{0}% 已快取',
    theme_usage: '\u7528\u6cd5\uff1a/theme ',
    theme_set: '\u4e3b\u984c\uff1a',
    no_active_session: '\u7576\u524d\u6c92\u6709\u6d3b\u52d5\u6703\u8a71',
@@ -8623,7 +8639,7 @@ const LOCALES = {
    settings_label_sync_insights: '\u540c\u6b65\u5230 insights',
    settings_label_check_updates: '\u6aa2\u67e5\u66f4\u65b0',
    settings_label_whats_new_summary: "Summarize What's New with AI",
-    settings_label_bot_name: '\u52a9\u624b\u540d\u7a31',
+    settings_label_bot_name: '預設助手名稱',
    settings_label_password: '\u8a2a\u554f\u5bc6\u78bc',
    settings_saved: '\u8a2d\u5b9a\u5df2\u5132\u5b58',
    settings_save_failed: '\u5132\u5b58\u5931\u6557\uff1a',
@@ -8806,7 +8822,7 @@ const LOCALES = {
    settings_desc_sync_insights: '將 WebUI token 使用情況同步到 state.db，使 hermes /insights 包含瀏覽器會話數據。預設未啟用。',
    settings_desc_check_updates: '當有更新的 WebUI 或助手版本時顯示標記。將在後台正常執行 Git-Fetch。',
    settings_desc_whats_new_summary: "Changes the What's New action from opening the raw diff first to generating a short, human-readable summary. The regular diff comparison stays available after the summary.",
-    settings_desc_bot_name: '助手在 UI 中的顯示名稱。預設未更改。',
+    settings_desc_bot_name: '僅用於預設個人檔案。其他個人檔案會使用各自的名稱。',
    settings_desc_password: '\u8a2d\u5b9a WebUI \u767b\u5165\u5bc6\u78bc\u3002\u5047\u5982\u5df2\u8a2d\u7f6e\uff0c\u6bcf\u6b21\u52a0\u8f09\u90fd\u9700\u8981\u767b\u5165\u3002',
    onboarding_password_will_enable: '\u5c07\u6703\u555f\u7528',
    onboarding_password_will_replace: '\u5c07\u6703\u53d6\u4ee3',
@@ -9617,6 +9633,8 @@ const LOCALES = {
    focus_label: 'Foco',
    token_usage_on: 'Uso de tokens ligado',
    token_usage_off: 'Uso de tokens desligado',
+    usage_cache_hit_detail: 'Cache: {0}% de acerto ({1} leitura / {2} escrita)',
+    usage_cached_percent: '{0}% em cache',
    theme_usage: 'Uso: /theme ',
    theme_set: 'Tema: ',
    no_active_session: 'Nenhuma sessão ativa',
@@ -9922,7 +9940,7 @@ const LOCALES = {
    settings_label_sync_insights: 'Sincronizar para insights',
    settings_label_check_updates: 'Verificar atualizações',
    settings_label_whats_new_summary: "Summarize What's New with AI",
-    settings_label_bot_name: 'Nome do Assistente',
+    settings_label_bot_name: 'Nome padrão do assistente',
    settings_label_password: 'Senha de Acesso',
    settings_saved: 'Configurações salvas',
    settings_save_failed: 'Falha ao salvar: ',
@@ -10111,7 +10129,7 @@ const LOCALES = {
    settings_desc_sync_insights: 'Espelha uso de tokens para state.db.',
    settings_desc_check_updates: 'Mostrar banner quando versões mais novas estiverem disponíveis.',
    settings_desc_whats_new_summary: "Changes the What's New action from opening the raw diff first to generating a short, human-readable summary. The regular diff comparison stays available after the summary.",
-    settings_desc_bot_name: 'Nome de exibição do assistente. Padrão: Hermes.',
+    settings_desc_bot_name: 'Usado apenas para o perfil padrão. Outros perfis usam seus próprios nomes.',
    settings_desc_password: 'Digite nova senha para definir ou trocar. Deixe em branco para manter.',
    password_placeholder: 'Digite nova senha…',
    password_env_var_locked: 'A variável de ambiente HERMES_WEBUI_PASSWORD está definida e tem prioridade. Remova-a e reinicie o servidor para gerenciar a senha aqui.',
@@ -10716,6 +10734,8 @@ const LOCALES = {
    focus_label: 'Focus',
    token_usage_on: 'Token usage on',
    token_usage_off: 'Token usage off',
+    usage_cache_hit_detail: '캐시: {0}% 적중({1} 읽기 / {2} 쓰기)',
+    usage_cached_percent: '{0}% 캐시됨',
    theme_usage: 'Usage: /theme ',
    theme_set: 'Theme: ',
    no_active_session: '활성 세션 없음',
@@ -11036,7 +11056,7 @@ const LOCALES = {
    settings_label_sync_insights: 'Insights에 동기화',
    settings_label_check_updates: '업데이트 확인',
    settings_label_whats_new_summary: "Summarize What's New with AI",
-    settings_label_bot_name: 'Assistant 이름',
+    settings_label_bot_name: '기본 Assistant 이름',
    settings_label_password: '접근 비밀번호',
    settings_saved: '설정 저장됨',
    settings_save_failed: '설정 저장 실패: ',
@@ -11224,7 +11244,7 @@ const LOCALES = {
    settings_desc_sync_insights: 'WebUI 토큰 사용량을 state.db에 반영하여 hermes /insights에 브라우저 세션 데이터가 포함되도록 합니다. 기본값은 꺼짐입니다.',
    settings_desc_check_updates: 'WebUI 또는 Agent의 새 버전이 있으면 배너를 표시합니다. 백그라운드에서 주기적으로 git fetch를 실행합니다.',
    settings_desc_whats_new_summary: "Changes the What's New action from opening the raw diff first to generating a short, human-readable summary. The regular diff comparison stays available after the summary.",
-    settings_desc_bot_name: 'UI 전체에 표시되는 Assistant 이름입니다. 기본값은 Hermes입니다.',
+    settings_desc_bot_name: '기본 프로필에만 사용됩니다. 다른 프로필은 각 프로필 이름을 사용합니다.',
    settings_desc_password: '새 비밀번호를 설정하거나 변경하려면 입력하세요. 현재 설정을 유지하려면 비워 두세요.',
    password_placeholder: '새 비밀번호 입력…',
    password_env_var_locked: '현재 HERMES_WEBUI_PASSWORD 환경 변수가 설정되어 있어 우선 적용됩니다. 변수를 해제하고 서버를 재시작해야 여기에서 비밀번호를 관리할 수 있습니다.',
@@ -11919,6 +11939,8 @@ const LOCALES = {
    focus_label: 'Se concentrer',
    token_usage_on: 'Utilisation du jeton sur',
    token_usage_off: 'Utilisation des jetons désactivée',
+    usage_cache_hit_detail: 'Cache : {0}% de réussite ({1} lecture / {2} écriture)',
+    usage_cached_percent: '{0}% en cache',
    theme_usage: 'Utilisation : /theme ',
    theme_set: 'Thème:',
    no_active_session: 'Aucune session active',
@@ -12167,7 +12189,7 @@ const LOCALES = {
    settings_label_sync_insights: 'Synchroniser avec les insights',
    settings_label_check_updates: 'Vérifier les mises à jour',
    settings_label_whats_new_summary: "Summarize What's New with AI",
-    settings_label_bot_name: 'Nom de l\'assistant',
+    settings_label_bot_name: 'Nom par défaut de l\'assistant',
    settings_label_password: 'Mot de passe d\'accès',
    settings_saved: 'Paramètres enregistrés',
    settings_save_failed: 'Échec de l\'enregistrement :',
@@ -12365,7 +12387,7 @@ const LOCALES = {
    settings_desc_sync_insights: 'Met en miroir l\'utilisation du jeton WebUI dans state.db afin que Hermes /insights inclut les données de session du navigateur. Désactivé par défaut.',
    settings_desc_check_updates: 'Afficher une bannière lorsque des versions plus récentes de WebUI ou de l\'agent sont disponibles. Exécute périodiquement une récupération git en arrière-plan.',
    settings_desc_whats_new_summary: "Changes the What's New action from opening the raw diff first to generating a short, human-readable summary. The regular diff comparison stays available after the summary.",
-    settings_desc_bot_name: 'Nom d’affichage de l’assistant dans l’interface utilisateur. Par défaut, Hermès.',
+    settings_desc_bot_name: 'Utilisé uniquement pour le profil par défaut. Les autres profils utilisent leurs propres noms.',
    settings_desc_password: 'Saisissez un nouveau mot de passe pour le définir ou le modifier. Laissez vide pour conserver le paramètre actuel.',
    password_placeholder: 'Entrez le nouveau mot de passe…',
    password_env_var_locked: 'La variable d\'environnement HERMES_WEBUI_PASSWORD est actuellement définie et est prioritaire. Désactivez-le et redémarrez le serveur pour gérer le mot de passe à partir d\'ici.',
@@ -1164,8 +1164,8 @@
              <div style="font-size:11px;color:var(--muted);margin-top:4px" data-i18n="settings_desc_whats_new_summary">Changes the What's New action from opening the raw diff first to generating a short, human-readable summary. The regular diff comparison stays available after the summary.</div>
            </div>
            <div class="settings-field">
-              <label for="settingsBotName" data-i18n="settings_label_bot_name">Assistant Name</label>
-              <div style="font-size:11px;color:var(--muted);margin-bottom:6px" data-i18n="settings_desc_bot_name">Display name for the assistant throughout the UI. Defaults to Hermes.</div>
+              <label for="settingsBotName" data-i18n="settings_label_bot_name">Default assistant name</label>
+              <div style="font-size:11px;color:var(--muted);margin-bottom:6px" data-i18n="settings_desc_bot_name">Used for the default profile only. Other profiles use their own profile names.</div>
              <input type="text" id="settingsBotName" placeholder="Hermes" maxlength="64" style="width:100%;padding:8px;background:var(--code-bg);color:var(--text);border:1px solid var(--border2);border-radius:6px;font-size:13px">
            </div>
            <button class="sm-btn" onclick="saveSettings()" style="margin-top:12px;width:100%;padding:8px;font-weight:600" data-i18n="settings_save_btn">Save Settings</button>
@@ -187,6 +187,7 @@ if(typeof document!=='undefined'){
 // (e.g. queue drain + user click) can both pass the S.busy check because
 // setBusy(true) is only called after the first await inside send().
 let _sendInProgress = false;
+let _sendInProgressSid = null;  // session_id of the in-flight send
 const _sessionTitleProvisionalBySid = new Map();

 function _sessionTitleLooksDefaultOrProvisional(titleText, provisionalText){
@@ -236,11 +237,14 @@ async function send(){
  // instead of silently dropping it.
  if (_sendInProgress) {
    const _text=$('msg').value.trim();
-    if(_text && S.session && S.session.session_id){
-      queueSessionMessage(S.session.session_id,{text:_text,files:[...S.pendingFiles],model:S.session&&S.session.model||($('modelSelect')&&$('modelSelect').value)||'',model_provider:S.session&&S.session.model_provider||null,profile:S.activeProfile||'default'});
+    // Use the in-flight session's sid, not the currently viewed session,
+    // so the queued message goes to the chat that owns the active stream.
+    const _targetSid=_sendInProgressSid||(S.session&&S.session.session_id);
+    if(_text && _targetSid){
+      queueSessionMessage(_targetSid,{text:_text,files:[...S.pendingFiles],model:S.session&&S.session.model||($('modelSelect')&&$('modelSelect').value)||'',model_provider:S.session&&S.session.model_provider||null,profile:S.activeProfile||'default'});
      $('msg').value='';autoResize();
      S.pendingFiles=[];renderTray();
-      updateQueueBadge(S.session.session_id);
+      updateQueueBadge(_targetSid);
      showToast(`Queued: "${_text.slice(0,40)}${_text.length>40?'…':''}"`,2000);
    }
    return;
@@ -248,9 +252,9 @@ async function send(){
  _sendInProgress = true;
  try{
  const text=$('msg').value.trim();
-  if(!text&&!S.pendingFiles.length)return;
+  if(!text&&!S.pendingFiles.length){_sendInProgress=false;_sendInProgressSid=null;return;}
  // Don't send while an inline message edit is active
-  if(document.querySelector('.msg-edit-area'))return;
+  if(document.querySelector('.msg-edit-area')){_sendInProgress=false;_sendInProgressSid=null;return;}

  // Dismiss handoff hint when user sends a message (resets seen_at).
  if(S.session&&S.session.session_id&&typeof _dismissHandoffHint==='function'){
@@ -380,6 +384,7 @@ async function send(){
  if(!S.session){await newSession();await renderSessionList();}

  const activeSid=S.session.session_id;
+  _sendInProgressSid=activeSid;

  setComposerStatus(S.pendingFiles&&S.pendingFiles.length?'Uploading…':'');
  let uploaded=[];
@@ -528,7 +533,7 @@ async function send(){
  // Open SSE stream and render tokens live
  attachLiveStream(activeSid, streamId, uploadedNames);

-  }finally{ _sendInProgress=false; }
+  }finally{ _sendInProgress=false; _sendInProgressSid=null; }
 }

 const LIVE_STREAMS={};
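
The queue-target change in `send()` above comes down to one rule: prefer the session that owns the in-flight send, and fall back to the viewed session only when no send has recorded its id yet. A minimal sketch of that rule, assuming a hypothetical pure helper `pickQueueTarget` (not part of the diff):

```javascript
// Hypothetical helper mirroring `_sendInProgressSid || (S.session && S.session.session_id)`.
function pickQueueTarget(inFlightSid, viewedSession) {
  const viewedSid = viewedSession && viewedSession.session_id;
  // The in-flight sid wins so the message is queued on the chat that owns the stream.
  return inFlightSid || viewedSid || null;
}

// User started a send in "chat-a", then switched to "chat-b" and typed again.
console.log(pickQueueTarget('chat-a', { session_id: 'chat-b' }));
// No send has recorded its sid yet: fall back to the viewed session.
console.log(pickQueueTarget(null, { session_id: 'chat-b' }));
```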
@@ -966,19 +971,31 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){
    if(assistantBody&&!fade){_sanitizeSmdLinks(assistantBody);}
  }
  // Allowed URL schemes for anchors and images rendered from agent-streamed markdown.
-  // Matches the effective allowlist of renderMd() (http/https via regex + relative).
-  const _SMD_SAFE_URL_RE=/^(?:https?:|mailto:|tel:|\/|#|\?|\.)/i;
+  // Raw file:// anchors are rewritten to the relative api/media endpoint before the user can click them.
+  const _SMD_SAFE_URL_RE=/^(?:https?:|mailto:|tel:|\/|#|\?|\.|api)/i;
+  const _SMD_SAFE_IMG_URL_RE=/^(?:https?:|mailto:|tel:|\/|#|\?|\.)/i;
+  function _smdFileHref(raw){
+    const href=String(raw||'');
+    if(!/^file:\/\//i.test(href)) return href;
+    try{
+      const path=decodeURIComponent(href.replace(/^file:\/\//i,''));
+      return 'api/media?path='+encodeURIComponent(path)+'&inline=1';
+    }catch(_){
+      return 'api/media?path='+encodeURIComponent(href.replace(/^file:\/\//i,''))+'&inline=1';
+    }
+  }
  function _sanitizeSmdLinks(root){
    if(!root||!root.querySelectorAll) return;
    const _a=root.querySelectorAll('a[href]');
    for(let i=0;i<_a.length;i++){
      const n=_a[i],v=n.getAttribute('href')||'';
+      if(/^file:\/\//i.test(v)){n.setAttribute('href',_smdFileHref(v));continue;}
      if(!_SMD_SAFE_URL_RE.test(v)){n.removeAttribute('href');n.setAttribute('data-blocked-scheme','1');}
    }
    const _im=root.querySelectorAll('img[src]');
    for(let i=0;i<_im.length;i++){
      const n=_im[i],v=n.getAttribute('src')||'';
-      if(!_SMD_SAFE_URL_RE.test(v)){n.removeAttribute('src');n.setAttribute('data-blocked-scheme','1');}
+      if(!_SMD_SAFE_IMG_URL_RE.test(v)){n.removeAttribute('src');n.setAttribute('data-blocked-scheme','1');}
    }
  }

@@ -1082,7 +1099,12 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){
    renderer.set_attr=(data,attr,value)=>{
      const isHref=window.smd&&attr===window.smd.HREF;
      const isSrc=window.smd&&attr===window.smd.SRC;
-      if((isHref||isSrc)&&!_SMD_SAFE_URL_RE.test(String(value||''))){
+      const safeUrl=isSrc?_SMD_SAFE_IMG_URL_RE:_SMD_SAFE_URL_RE;
+      if(isHref&&/^file:\/\//i.test(String(value||''))){
+        baseSetAttr(data,attr,_smdFileHref(value));
+        return;
+      }
+      if((isHref||isSrc)&&!safeUrl.test(String(value||''))){
        const node=data&&data.nodes&&data.nodes[data.index];
        if(node&&node.setAttribute) node.setAttribute('data-blocked-scheme','1');
        return;
@@ -1664,6 +1686,7 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){
                  estimated_cost:Math.max(0,curCost-prevCost),
                  cache_read_tokens:Math.max(0,curCacheRead-_prevCacheRead),
                  cache_write_tokens:Math.max(0,curCacheWrite-_prevCacheWrite),
+                  cache_hit_percent:d.usage.turn_cache_hit_percent,
                };
              }
              if(typeof d.usage.duration_seconds==='number'){
@@ -1989,7 +2012,7 @@ function attachLiveStream(activeSid, streamId, uploaded=[], options={}){
          // Fallback to local cancel message if API fails
          if(S.session&&S.session.session_id===activeSid){
            clearLiveToolCards();if(!assistantText)removeThinking();
-            const cancelAgentName=((window._botName||'Hermes')+'').trim()||'Hermes';
+            const cancelAgentName=(assistantDisplayName()+'').trim()||'Hermes';
            S.messages.push({role:'assistant',content:`**Task cancelled:** Task cancelled.\n\n*The run was cancelled by the user before ${cancelAgentName} finished. No provider failure occurred.*`,provider_details:'Task cancelled.',provider_details_label:'Cancellation details',_error:true});renderMessages({preserveScroll:true});
            _markSessionViewed(activeSid, S.messages.length);
          }
@@ -2855,7 +2878,7 @@ function playNotificationSound(){
 function sendBrowserNotification(title,body){
  if(!window._notificationsEnabled||!document.hidden) return;
  if(!('Notification' in window)) return;
-  const botName=window._botName||'Hermes';
+  const botName=assistantDisplayName();
  if(Notification.permission==='granted'){
    new Notification(title||botName,{body:body});
  }else if(Notification.permission!=='denied'){
@@ -500,6 +500,9 @@ async function newSession(flash, options={}){
        input_tokens:data.session.input_tokens||0,
        output_tokens:data.session.output_tokens||0,
        estimated_cost:data.session.estimated_cost||0,
+        cache_read_tokens:data.session.cache_read_tokens||0,
+        cache_write_tokens:data.session.cache_write_tokens||0,
+        cache_hit_percent:data.session.cache_hit_percent,
        context_length:data.session.context_length||0,
        last_prompt_tokens:data.session.last_prompt_tokens||0,
        threshold_tokens:data.session.threshold_tokens||0,
@@ -518,18 +521,22 @@ async function newSession(flash, options={}){
 }

 async function loadSession(sid){
+  const opts = arguments[1] || {};
+  const forceReload = !!opts.force;
  const currentSid = S.session ? S.session.session_id : null;
  // Clicking the already-open session in the sidebar is a no-op. Reloading it
  // tears down active pane state and can reset the long-session scroll window
-  // to the top even though the user did not navigate anywhere.
-  if(currentSid===sid) return;
+  // to the top even though the user did not navigate anywhere. Explicit
+  // refresh paths pass {force:true} when external state.db changes arrive.
+  // Legacy invariant kept for static regression tests: if(currentSid===sid) return
+  if(currentSid===sid && !forceReload) return;
  // Mark this session as the in-flight load. Subsequent loadSession() calls
  // will overwrite this; stale awaits use the mismatch to bail out (#1060).
  _loadingSessionId = sid;
-  stopApprovalPolling();hideApprovalCard();
+  stopApprovalPolling();hideApprovalCard(forceReload);
  _yoloEnabled=false;_updateYoloPill();
  if(typeof stopClarifyPolling==='function') stopClarifyPolling();
-  if(typeof hideClarifyCard==='function') hideClarifyCard();
+  if(typeof hideClarifyCard==='function') hideClarifyCard(forceReload, forceReload?'external-refresh':'dismissed');
  // Show loading indicator immediately for responsiveness.
  // Cleared by renderMessages() once full session data arrives.
  // Persist the current composer draft before switching away so it can be
@@ -538,14 +545,14 @@ async function loadSession(sid){
  if (currentSid && currentSid !== sid) {
    _saveComposerDraftNow(currentSid, ($('msg') || {}).value || '', S.pendingFiles ? [...S.pendingFiles] : []);
  }
-  if (currentSid !== sid) {
+  if (currentSid !== sid || forceReload) {
    S.messages = [];
    S.toolCalls = [];
    _messagesTruncated = false;
    _oldestIdx = 0;
    _loadingOlder = false;
    const _msgInner = $('msgInner');
-    if (_msgInner) _msgInner.innerHTML = '<div style="display:flex;align-items:center;justify-content:center;height:100%;color:var(--text-muted);font-size:14px;padding:40px;text-align:center;">Loading conversation...</div>';
+    if (_msgInner && currentSid !== sid) _msgInner.innerHTML = '<div style="display:flex;align-items:center;justify-content:center;height:100%;color:var(--text-muted);font-size:14px;padding:40px;text-align:center;">Loading conversation...</div>';
  }
  // Phase 1: Load metadata only (~1KB) for fast session switching.
  // Guard against network/server failures to prevent a permanently stuck loading state.
@@ -768,6 +775,9 @@ async function loadSession(sid){
      input_tokens:      _pick(u.input_tokens,      _s.input_tokens),
      output_tokens:     _pick(u.output_tokens,     _s.output_tokens),
      estimated_cost:    _pick(u.estimated_cost,    _s.estimated_cost),
+      cache_read_tokens: _pick(u.cache_read_tokens, _s.cache_read_tokens),
+      cache_write_tokens:_pick(u.cache_write_tokens,_s.cache_write_tokens),
+      cache_hit_percent: _pick(u.cache_hit_percent, _s.cache_hit_percent, null),
      context_length:    _pick(_s.context_length,    u.context_length),
      last_prompt_tokens:_pick(u.last_prompt_tokens,_s.last_prompt_tokens),
      threshold_tokens:  _pick(_s.threshold_tokens,  u.threshold_tokens),
@@ -1176,6 +1186,9 @@ function _resolveSessionModelForDisplaySoon(sid){
          input_tokens:_pick(u.input_tokens,S.session.input_tokens),
          output_tokens:_pick(u.output_tokens,S.session.output_tokens),
          estimated_cost:_pick(u.estimated_cost,S.session.estimated_cost),
+          cache_read_tokens:_pick(u.cache_read_tokens,S.session.cache_read_tokens),
+          cache_write_tokens:_pick(u.cache_write_tokens,S.session.cache_write_tokens),
+          cache_hit_percent:_pick(u.cache_hit_percent,S.session.cache_hit_percent,null),
          context_length:data.session.context_length||0,
          last_prompt_tokens:_pick(u.last_prompt_tokens,S.session.last_prompt_tokens),
          threshold_tokens:data.session.threshold_tokens||0,
@@ -1995,6 +2008,7 @@ function _applySessionListPayload(sessData, projData){
    stopStreamingPoll();
  }
  ensureSessionTimeRefreshPoll();
+  ensureActiveSessionExternalRefreshPoll();
  renderSessionListFromCache();  // no-ops if rename is in progress
 }

@@ -2028,8 +2042,11 @@ let _gatewaySSEWarningShown = false;
 const _gatewayFallbackPollMs = 30000;
 const _streamingPollMs = 5000;
 const _sessionTimeRefreshMs = 60000;
+const _activeSessionExternalRefreshMs = 5000;
 let _streamingPollTimer = null;
 let _sessionTimeRefreshTimer = null;
+let _activeSessionExternalRefreshTimer = null;
+let _activeSessionExternalRefreshInFlight = false;

 function startStreamingPoll(){
  if(_streamingPollTimer) return;
@@ -2051,6 +2068,50 @@ function ensureSessionTimeRefreshPoll(){
  }, _sessionTimeRefreshMs);
 }

+async function refreshActiveSessionIfExternallyUpdated(reason){
+  if(_activeSessionExternalRefreshInFlight) return;
+  if(!S.session || !S.session.session_id) return;
+  if(S.busy || S.activeStreamId) return;
+  if(typeof document !== 'undefined' && document.hidden) return;
+  const sid = S.session.session_id;
+  const localCount = Number(S.session.message_count || (Array.isArray(S.messages)?S.messages.length:0) || 0);
+  const localLast = Number(S.session.last_message_at || S.session.updated_at || 0);
+  _activeSessionExternalRefreshInFlight = true;
+  try{
+    const data = await api(`/api/session?session_id=${encodeURIComponent(sid)}&messages=0&resolve_model=0`);
+    if(!data || !data.session) return;
+    if(!S.session || S.session.session_id !== sid) return;
+    if(S.busy || S.activeStreamId) return;
+    const remoteCount = Number(data.session.message_count || 0);
+    const remoteLast = Number(data.session.last_message_at || data.session.updated_at || 0);
+    if(remoteCount > localCount || remoteLast > localLast){
+      await loadSession(sid, {force:true, externalRefreshReason:reason||'poll'});
+      if(typeof renderSessionList==='function') void renderSessionList();
+    }
+  }catch(e){
+    // Ignore transient refresh failures; the next poll/focus event will retry.
+  }finally{
+    _activeSessionExternalRefreshInFlight = false;
+  }
+}
+
+function ensureActiveSessionExternalRefreshPoll(){
+  if(_activeSessionExternalRefreshTimer) return;
+  _activeSessionExternalRefreshTimer = setInterval(() => {
+    void refreshActiveSessionIfExternallyUpdated('poll');
+  }, _activeSessionExternalRefreshMs);
+  if(typeof document !== 'undefined' && !document._hermesExternalRefreshVisibilityHook){
+    document.addEventListener('visibilitychange', () => {
+      if(!document.hidden) void refreshActiveSessionIfExternallyUpdated('visible');
+    });
+    document._hermesExternalRefreshVisibilityHook = true;
+  }
+  if(typeof window !== 'undefined' && !window._hermesExternalRefreshFocusHook){
+    window.addEventListener('focus', () => { void refreshActiveSessionIfExternallyUpdated('focus'); });
+    window._hermesExternalRefreshFocusHook = true;
+  }
+}
+
 function startGatewayPollFallback(ms){
  const intervalMs = Math.max(5000, Number(ms) || _gatewayFallbackPollMs);
  if(_gatewayPollTimer) clearInterval(_gatewayPollTimer);
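
The reload decision inside `refreshActiveSessionIfExternallyUpdated()` above can be read as a pure comparison of local and remote session metadata. The sketch below isolates that comparison; the function name `shouldReloadFromRemote` is hypothetical, and the field fallbacks are copied from the diff rather than from any documented API contract.

```javascript
// Hypothetical pure version of the staleness check used by the external-refresh poll.
function shouldReloadFromRemote(local, remote) {
  const localCount = Number(local.message_count || 0);
  const localLast = Number(local.last_message_at || local.updated_at || 0);
  const remoteCount = Number(remote.message_count || 0);
  const remoteLast = Number(remote.last_message_at || remote.updated_at || 0);
  // Reload when the server has more messages or a newer timestamp than the open pane.
  return remoteCount > localCount || remoteLast > localLast;
}

const local = { message_count: 4, last_message_at: 1000 };
console.log(shouldReloadFromRemote(local, { message_count: 6, last_message_at: 1200 }));
console.log(shouldReloadFromRemote(local, { message_count: 4, last_message_at: 1000 }));
```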
@@ -3539,7 +3600,7 @@ async function deleteSession(sid){
    if(remaining.sessions&&remaining.sessions.length){
      await loadSession(remaining.sessions[0].session_id);
    }else{
-      const _tt=$('topbarTitle');if(_tt)_tt.textContent=window._botName||'Hermes';
+      const _tt=$('topbarTitle');if(_tt)_tt.textContent=assistantDisplayName();
      const _tm=$('topbarMeta');if(_tm)_tm.textContent='Start a new conversation';
      $('msgInner').innerHTML='';
      $('emptyState').style.display='';
@@ -12,6 +12,7 @@
    --radius-sm:4px;--radius-md:8px;--radius-card:8px;--radius-lg:12px;--radius-pill:999px;
    --space-1:4px;--space-2:8px;--space-3:12px;--space-4:16px;
    --font-size-xs:11px;--font-size-sm:12px;--font-size-md:14px;
+    --file-tree-toggle-width:10px;
    --font-ui:-apple-system,BlinkMacSystemFont,"Segoe UI",Inter,system-ui,sans-serif;
    --surface-subtle:rgba(0,0,0,.025);--surface-subtle-hover:rgba(0,0,0,.045);
    --border-subtle:rgba(0,0,0,.08);--border-muted:rgba(0,0,0,.12);
@@ -1434,8 +1435,8 @@
  .file-item{display:flex;align-items:center;gap:6px;padding:6px 10px;border-radius:8px;cursor:pointer;font-size:12px;color:var(--muted);transition:all .12s;min-width:0;}
  .file-item:hover{background:var(--hover-bg);color:var(--text);}
  .file-item.active{background:var(--accent-bg);color:var(--accent-text);}
-  .file-tree-toggle{font-size:10px;color:var(--muted);flex-shrink:0;width:10px;text-align:center;line-height:1;}
-  .file-tree-toggle-placeholder{display:inline-block;flex:0 0 10px;width:10px;line-height:1;}
+  .file-tree-toggle{font-size:10px;color:var(--muted);flex-shrink:0;width:var(--file-tree-toggle-width);text-align:center;line-height:1;}
+  .file-tree-toggle-placeholder{display:inline-block;flex:0 0 var(--file-tree-toggle-width);width:var(--file-tree-toggle-width);line-height:1;}
  .file-item.file-empty{color:var(--muted);opacity:.5;font-style:italic;cursor:default;font-size:11px;}
  .file-item.file-empty:hover{background:none;color:var(--muted);}
  .preview-area{flex:1;overflow:auto;padding:14px;flex-direction:column;gap:8px;display:none;opacity:0;transition:opacity .15s;}
@@ -1,4 +1,9 @@
 const S={session:null,messages:[],entries:[],busy:false,pendingFiles:[],toolCalls:[],activeStreamId:null,currentDir:'.',activeProfile:'default',showHiddenWorkspaceFiles:false};
+
+function assistantDisplayName(){
+  if(S.activeProfile&&S.activeProfile!=='default') return S.activeProfile.charAt(0).toUpperCase()+S.activeProfile.slice(1);
+  return window._botName||'Hermes';
+}
 const INFLIGHT={};  // keyed by session_id while request in-flight
 const SESSION_QUEUES={};  // keyed by session_id for queued follow-up turns
 const MAX_UPLOAD_BYTES=(window.__HERMES_CONFIG__&&window.__HERMES_CONFIG__.maxUploadBytes)||20*1024*1024;
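
For readers skimming the `assistantDisplayName()` addition above, here is the same resolution order written as a pure function. This is only a sketch: `displayNameFor` is a hypothetical name, and it takes the profile and saved bot name as arguments instead of reading `S.activeProfile` and `window._botName`.

```javascript
// Hypothetical argument-based equivalent of assistantDisplayName().
function displayNameFor(activeProfile, botName) {
  // Non-default profiles show their own name, capitalised.
  if (activeProfile && activeProfile !== 'default') {
    return activeProfile.charAt(0).toUpperCase() + activeProfile.slice(1);
  }
  // The saved assistant name applies to the default profile only.
  return botName || 'Hermes';
}

console.log(displayNameFor('research', 'Jarvis'));
console.log(displayNameFor('default', 'Jarvis'));
console.log(displayNameFor('default', ''));
```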
@@ -732,6 +737,7 @@ const MODEL_STATE_KEY='hermes-webui-model-state';
 // first colliding entry.
 function _getOptionProviderId(opt){
  if(!opt) return '';
+  if(opt.dataset && opt.dataset.provider) return opt.dataset.provider;
  const group=opt.parentElement;
  if(group && group.tagName==='OPTGROUP' && group.dataset && group.dataset.provider){
    return group.dataset.provider;
@@ -1416,6 +1422,8 @@ async function selectModelFromDropdown(value){
    opt.value=value;
    opt.textContent=getModelLabel(value);
    opt.dataset.custom='1';
+    const badge=(window._configuredModelBadges||{})[value];
+    if(badge&&badge.provider) opt.dataset.provider=badge.provider;
    // Remove any previous custom option before adding new one
    sel.querySelectorAll('option[data-custom]').forEach(o=>o.remove());
    sel.appendChild(opt);
@@ -2262,9 +2270,8 @@ function _syncCtxIndicator(usage){
  const compressText=pct>=75?t('ctx_compress_action'):(pct>=50?t('ctx_compress_hint'):'');
  if(compressWrap) compressWrap.style.display=compressText?'':'none';
  _setCtxCompressButton(compressBtn,compressText);
-  const cacheTotalTok=cacheReadTok+cacheWriteTok;
-  const cacheHitPct=cacheTotalTok?Math.round((cacheReadTok/cacheTotalTok)*100):null;
-  const cacheText=cacheTotalTok?`cache: ${cacheHitPct}% hit (${_fmtTokens(cacheReadTok)} read / ${_fmtTokens(cacheWriteTok)} write)`:'';
+  const cacheHitPct=usage.cache_hit_percent;
+  const cacheText=cacheHitPct!=null?t('usage_cache_hit_detail',cacheHitPct,_fmtTokens(cacheReadTok),_fmtTokens(cacheWriteTok)):'';
  let label=hasPromptTok?`Context window ${pct}% used`:`${_fmtTokens(totalTok)} tokens used`;
  if(!hasExplicitCtx&&hasPromptTok) label+=' (est. 128K)';
  if(cost) label+=` \u00b7 $${cost<0.01?cost.toFixed(4):cost.toFixed(2)}`;
@@ -2756,7 +2763,7 @@ function renderMd(raw){
    t=t.replace(/\x00C(\d+)\x00/g,(_,i)=>_code_stash[+i]);
    // Stash [label](url) links before autolink so the URL in href= is not re-linked
    const _link_stash=[];
-    t=t.replace(/\[([^\]]+)\]\((https?:\/\/[^\)]+)\)/g,(_,lb,u)=>{_link_stash.push(`<a href="${u.replace(/"/g,'%22')}" target="_blank" rel="noopener">${esc(lb)}</a>`);return `\x00L${_link_stash.length-1}\x00`;});
+    t=t.replace(/\[([^\]]+)\]\(((?:https?|file):\/\/[^\)]+)\)/g,(_,lb,u)=>{_link_stash.push(`<a href="${_markdownHref(u)}" target="_blank" rel="noopener">${esc(lb)}</a>`);return `\x00L${_link_stash.length-1}\x00`;});
    t=t.replace(/(https?:\/\/[^\s<>"')\]]+)/g,(url)=>{const trail=url.match(/[.,;:!?)]$/)?url.slice(-1):'';const clean=trail?url.slice(0,-1):url;return `<a href="${clean}" target="_blank" rel="noopener">${esc(clean)}</a>${trail}`;});
    t=t.replace(/\x00L(\d+)\x00/g,(_,i)=>_link_stash[+i]);
    t=t.replace(/\x00G(\d+)\x00/g,(_,i)=>_img_stash[+i]);
@@ -2849,7 +2856,7 @@ function renderMd(raw){
  // Stash existing <a> tags first to avoid re-linking already-linked URLs.
  const _a_stash=[];
  s=s.replace(/(<a\b[^>]*>[\s\S]*?<\/a>)/g,m=>{_a_stash.push(m);return `\x00A${_a_stash.length-1}\x00`;});
-  s=s.replace(/\[([^\]]+)\]\((https?:\/\/[^\)]+)\)/g,(_,label,url)=>`<a href="${url.replace(/"/g,'%22')}" target="_blank" rel="noopener">${esc(label)}</a>`);
+  s=s.replace(/\[([^\]]+)\]\(((?:https?|file):\/\/[^\)]+)\)/g,(_,label,url)=>`<a href="${_markdownHref(url)}" target="_blank" rel="noopener">${esc(label)}</a>`);
  s=s.replace(/\x00A(\d+)\x00/g,(_,i)=>_a_stash[+i]);
  // Restore raw <pre> only after markdown rewrites so literal preformatted
  // content stays placeholder-protected, then let the sanitizer normalize tags.
@@ -2865,6 +2872,18 @@ function renderMd(raw){
  function _safeAttrValue(v){
    return String(v||'').replace(/&quot;/g,'"').replace(/&#39;/g,"'").replace(/&amp;/g,'&').trim();
  }
+  function _markdownHref(raw){
+    const href=String(raw||'').replace(/"/g,'%22');
+    if(/^file:\/\//i.test(href)){
+      try{
+        const path=decodeURIComponent(href.replace(/^file:\/\//i,''));
+        return 'api/media?path='+encodeURIComponent(path)+'&inline=1';
+      }catch(_){
+        return 'api/media?path='+encodeURIComponent(href.replace(/^file:\/\//i,''))+'&inline=1';
+      }
+    }
+    return href;
+  }
  function _isSafeUrl(v, img){
    const raw=_safeAttrValue(v);
    const compact=raw.replace(/[\u0000-\u001f\u007f\s]+/g,'').toLowerCase();
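
The `file://` rewrite added in `_markdownHref()` (and mirrored by `_smdFileHref()` in the streaming path) can be exercised in isolation. The sketch below is a standalone restatement under the assumption that only the path portion matters; `fileHrefToMediaUrl` is a hypothetical name, and the quote-escaping step from `_markdownHref()` is left out for brevity.

```javascript
// Hypothetical standalone version of the file:// -> api/media rewrite.
function fileHrefToMediaUrl(raw) {
  const href = String(raw || '');
  // Anything that is not a file:// URL passes through unchanged.
  if (!/^file:\/\//i.test(href)) return href;
  const rest = href.replace(/^file:\/\//i, '');
  let path;
  try {
    path = decodeURIComponent(rest); // e.g. "%20" becomes a space
  } catch (_) {
    path = rest; // malformed percent-escape: keep the raw text
  }
  return 'api/media?path=' + encodeURIComponent(path) + '&inline=1';
}

console.log(fileHrefToMediaUrl('file:///tmp/my%20notes.md'));
console.log(fileHrefToMediaUrl('https://example.com/a'));
```

Decoding before re-encoding keeps an already-escaped path from being double-encoded, while the `catch` branch keeps a stray `%` from throwing.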
@@ -3266,6 +3285,16 @@ function setBusy(v){
    if(next){
      updateQueueBadge(sid);
      setTimeout(()=>{
+        // Guard: if the user switched away from the drain session during
+        // the 120ms settle window, the queued message must NOT go to the
+        // wrong chat.  Put it back into the original session's queue and
+        // skip sending — it will drain when the user returns to that session
+        // or when its next stream completes while it is the active view.
+        if(S.session&&S.session.session_id!==sid){
+          queueSessionMessage(sid,next);
+          updateQueueBadge(sid);
+          return;
+        }
        $('msg').value=next.text||'';
        S.pendingFiles=Array.isArray(next.files)?[...next.files]:[];
        // Restore model from queued item (sent in /api/chat/start payload)
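
The drain guard added to `setBusy()` above can be modelled with a tiny in-memory queue. This is a sketch under assumptions: `queues`, `enqueue`, and `drainNext` are hypothetical stand-ins for `SESSION_QUEUES` and `queueSessionMessage`, and the sketch re-inserts the item at the front of the queue to preserve order, which may differ from what `queueSessionMessage` does.

```javascript
// Hypothetical in-memory model of per-session queues.
const queues = {};
function enqueue(sid, item) {
  (queues[sid] = queues[sid] || []).push(item);
}

// Take the next queued item for `sid`, but only send it if `sid` is still the viewed session.
function drainNext(sid, viewedSid, sendFn) {
  const q = queues[sid] || [];
  const next = q.shift();
  if (!next) return 'empty';
  if (viewedSid !== sid) {
    q.unshift(next); // put it back instead of sending into the wrong chat
    queues[sid] = q;
    return 'requeued';
  }
  sendFn(next);
  return 'sent';
}

enqueue('chat-a', { text: 'follow-up' });
const sent = [];
console.log(drainNext('chat-a', 'chat-b', m => sent.push(m))); // user switched away
console.log(drainNext('chat-a', 'chat-a', m => sent.push(m))); // user came back
```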
@@ -4663,7 +4692,7 @@ async function checkInflightOnBoot(sid) {

 function syncTopbar(){
  if(!S.session){
-    document.title=window._botName||'Hermes';
+    document.title=assistantDisplayName();
    if(typeof syncWorkspaceDisplays==='function') syncWorkspaceDisplays();
    if(typeof _syncWorkspaceHeadingState==='function') _syncWorkspaceHeadingState();
    if(typeof syncModelChip==='function') syncModelChip();
@@ -4683,7 +4712,7 @@ function syncTopbar(){
  }
  const sessionTitle=S.session.title||t('untitled');
  const _topbarTitle=$('topbarTitle');if(_topbarTitle)_topbarTitle.textContent=sessionTitle;
-  document.title=sessionTitle+' \u2014 '+(window._botName||'Hermes');
+  document.title=sessionTitle+' \u2014 '+assistantDisplayName();
  const vis=S.messages.filter(m=>m&&m.role&&m.role!=='tool');
  const _topbarMeta=$('topbarMeta');
  if(_topbarMeta){
@@ -4815,7 +4844,7 @@ function isTpsDisplayEnabled(){
  return window._showTps===true;
 }
 function _assistantRoleHtml(tsTitle='', tpsText=''){
-  const _bn=window._botName||'Hermes';
+  const _bn=assistantDisplayName();
  const tps=(isTpsDisplayEnabled()&&tpsText)?`<span class="msg-tps-inline" title="Tokens per second">${esc(tpsText)}</span>`:'';
  return `<div class="msg-role assistant" ${tsTitle?`title="${esc(tsTitle)}"`:''}><div class="role-icon assistant">${esc(_bn.charAt(0).toUpperCase())}</div><span style="font-size:12px">${esc(_bn)}</span>${tps}</div>`;
 }
@@ -6198,12 +6227,10 @@ function renderMessages(options){
        const inTok=msg._turnUsage.input_tokens||0;
        const outTok=msg._turnUsage.output_tokens||0;
        const cost=msg._turnUsage.estimated_cost;
-        const cacheRead=msg._turnUsage.cache_read_tokens||0;
-        const cacheWrite=msg._turnUsage.cache_write_tokens||0;
        let text=`${_fmtTokens(inTok)} in · ${_fmtTokens(outTok)} out`;
        if(cost) text+=` · ~$${cost<0.01?cost.toFixed(4):cost.toFixed(2)}`;
-        const cacheTotal=cacheRead+cacheWrite;
-        if(cacheTotal) text+=` · cache ${Math.round((cacheRead/cacheTotal)*100)}% hit`;
+        const cacheHitPct=msg._turnUsage.cache_hit_percent;
+        if(cacheHitPct!=null) text+=` · ${t('usage_cached_percent',cacheHitPct)}`;
        usage.textContent=text;
        fragments.push(usage);
      }
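
The UI hunks above stop computing the cache percentage locally and instead display a server-provided `cache_hit_percent`. The changelog describes the new value as `cache_read / prompt_total`; the real helper lives in `api/usage.py` and is not shown in this diff, so the sketch below is only an illustration of the difference between the two formulas. The function names, the rounding, and the `null` result for an empty prompt are assumptions.

```javascript
// Old UI formula (removed above): share of cache traffic that was reads.
function oldCacheHitPercent(cacheRead, cacheWrite) {
  const total = cacheRead + cacheWrite;
  return total ? Math.round((cacheRead * 100) / total) : null;
}

// Assumed shape of the new metric: share of the whole prompt served from cache.
function promptCacheHitPercent(cacheRead, promptTotal) {
  const read = Number(cacheRead) || 0;
  const total = Number(promptTotal) || 0;
  if (total <= 0) return null; // nothing to report for an empty prompt
  return Math.round((Math.min(read, total) * 100) / total);
}

// 9,000 tokens read from cache, 1,000 written, 30,000 prompt tokens overall.
console.log(oldCacheHitPercent(9000, 1000));
console.log(promptCacheHitPercent(9000, 30000));
```

With these hypothetical numbers the old formula reports a high hit rate even though most of the prompt was not served from cache, which is the mismatch the changelog entry describes.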
@@ -11,31 +11,67 @@ class TestComposerPlaceholderProfile:
    """applyBotName() should use the profile name when activeProfile is set."""

    def test_applyBotName_uses_profile_name(self):
-        """applyBotName must check S.activeProfile and prefer it over global bot_name."""
+        """Non-default profiles must use the profile name instead of bot_name."""
        src = _src("boot.js")
-        assert "S.activeProfile" in src, \
-            "applyBotName must reference S.activeProfile"
-        # Should fall back to _botName when activeProfile is 'default'
-        assert "S.activeProfile!=='default'" in src, \
-            "applyBotName must skip 'default' profile (use bot_name instead)"
+        ui_src = _src("ui.js")
+        assert "function assistantDisplayName()" in ui_src, \
+            "assistant display name resolution should be shared"
+        assert "S.activeProfile&&S.activeProfile!=='default'" in ui_src, \
+            "assistantDisplayName must only treat the literal default profile as renamed by bot_name"
+        assert "assistantDisplayName()" in src, \
+            "applyBotName must use the shared profile-aware display name"

    def test_applyBotName_capitalises_profile_name(self):
        """Profile name should be capitalised (first letter uppercase)."""
-        src = _src("boot.js")
-        m = re.search(r'function applyBotName\(\)\{.*?\n\}', src, re.DOTALL)
-        assert m, "applyBotName function must exist"
+        src = _src("ui.js")
+        m = re.search(r'function assistantDisplayName\(\)\{.*?\n\}', src, re.DOTALL)
+        assert m, "assistantDisplayName function must exist"
        body = m.group(0)
        assert "charAt(0).toUpperCase()" in body, \
-            "applyBotName must capitalise first letter of profile name"
+            "assistantDisplayName must capitalise first letter of profile name"

    def test_applyBotName_falls_back_to_bot_name(self):
-        """When no active profile, must fall back to window._botName."""
-        src = _src("boot.js")
-        m = re.search(r'function applyBotName\(\)\{.*?\n\}', src, re.DOTALL)
-        assert m, "applyBotName function must exist"
+        """The saved assistant name applies to the default profile."""
+        src = _src("ui.js")
+        m = re.search(r'function assistantDisplayName\(\)\{.*?\n\}', src, re.DOTALL)
+        assert m, "assistantDisplayName function must exist"
        body = m.group(0)
        assert "window._botName||'Hermes'" in body, \
-            "applyBotName must fall back to window._botName or 'Hermes'"
+            "assistantDisplayName must use window._botName or 'Hermes' for the default profile"
+
+    def test_chat_surfaces_use_shared_assistant_display_name(self):
+        """Chat rows, titles, notifications, and cancel copy must honor profile overrides."""
+        ui_src = _src("ui.js")
+        messages_src = _src("messages.js")
+        sessions_src = _src("sessions.js")
+        assert "document.title=assistantDisplayName();" in ui_src
+        assert "document.title=sessionTitle+' \\u2014 '+assistantDisplayName();" in ui_src
+        assert "const _bn=assistantDisplayName();" in ui_src
+        assert "assistantDisplayName()" in messages_src
+        assert "assistantDisplayName()" in sessions_src
+
+    def test_boot_applies_placeholder_after_active_profile_loads(self):
+        """Boot must set the composer placeholder after S.activeProfile is known."""
+        src = _src("boot.js")
+        fetch_idx = src.find("api('/api/profile/active')")
+        assert fetch_idx >= 0, "boot.js should fetch the active profile during boot"
+        label_idx = src.find("const profileLabel=$('profileChipLabel');", fetch_idx)
+        assert label_idx >= 0, "profile chip sync should follow active profile fetch"
+        assert "applyBotName();" in src[fetch_idx:label_idx], (
+            "boot should apply the profile-aware assistant name after active profile resolution"
+        )
+
+    def test_settings_copy_names_default_assistant_scope(self):
+        """The preference copy must say that only the default profile is renamed."""
+        index_src = _src("index.html")
+        i18n_src = _src("i18n.js")
+        assert "Default assistant name" in index_src
+        assert "Used for the default profile only. Other profiles use their own profile names." in index_src
+        assert "settings_label_bot_name: 'Default assistant name'" in i18n_src
+        assert (
+            "settings_desc_bot_name: 'Used for the default profile only. "
+            "Other profiles use their own profile names.'"
+        ) in i18n_src

    def test_switchToProfile_calls_applyBotName(self):
        """switchToProfile() must call applyBotName() after switching."""
@@ -1,10 +1,12 @@
-from api.models import Session
+from api.models import Session, reconciled_state_db_messages_for_session
 import contextlib
+from types import SimpleNamespace

 from api.streaming import (
    _assistant_reply_added_after_current_turn,
    _context_messages_for_new_turn,
    _merge_display_messages_after_agent_result,
+    _new_turn_context_from_messages,
    _sanitize_messages_for_api,
    _session_context_messages,
 )
@@ -314,6 +316,40 @@ def test_explicit_continue_keeps_compacted_active_task_context(tmp_path):
    assert _context_messages_for_new_turn(session, "继续") == compacted_task_context


+def test_streaming_reconciled_context_keeps_casual_greeting_suppression():
+    compacted_task_context = [
+        {
+            "role": "user",
+            "content": (
+                "[CONTEXT COMPACTION — REFERENCE ONLY] Earlier turns were compacted. "
+                "Your current task is identified in the Active Task section — resume exactly from there."
+            ),
+            "timestamp": 1.0,
+        },
+        {"role": "assistant", "content": "I will inspect api/config.py next.", "timestamp": 2.0},
+    ]
+    session = SimpleNamespace(
+        session_id="issue2308-streaming",
+        messages=[{"role": "user", "content": "old task", "timestamp": 0.5}],
+        context_messages=compacted_task_context,
+    )
+    external_state_messages = list(compacted_task_context)
+
+    # Mirror the streaming pre-turn assembly for prefer_context=True: reconcile
+    # sidecar context with one state.db snapshot, then apply the normal new-turn
+    # context filter that suppresses casual greetings from resuming stale tasks.
+    previous_context_messages = _new_turn_context_from_messages(
+        reconciled_state_db_messages_for_session(
+            session,
+            prefer_context=True,
+            state_messages=external_state_messages,
+        ),
+        "你好",
+    )
+
+    assert previous_context_messages == []
+
+
 def test_all_cjk_greetings_drop_stale_compaction_context(tmp_path):
    """Pin every CJK greeting in the casual-fresh-chat set against a stale
    compaction context. Catches typos like \\u5616 (嘖, "click of tongue")
@@ -509,6 +509,18 @@ def test_cancel_copy_uses_configured_bot_name(monkeypatch):
    )


+def test_cancel_copy_uses_profile_name_for_non_default_profile(monkeypatch):
+    """Persisted cancellation copy should use profile names outside literal default."""
+    import api.streaming as streaming
+
+    monkeypatch.setattr(streaming, 'load_settings', lambda: {'bot_name': 'Obryn'})
+
+    session = type('Session', (), {'profile': 'research'})()
+    name = streaming._preferred_agent_display_name_for_session(session)
+    assert name == 'Research'
+    assert 'before Research finished' in streaming._cancelled_turn_content(agent_name=name)
+
+
 def test_cancel_copy_falls_back_to_hermes_for_blank_bot_name(monkeypatch):
    """Blank or missing bot_name should not leak old persona copy."""
    import api.streaming as streaming
@@ -98,6 +98,7 @@ const window = { _botName: 'Hermes', _defaultModel: null, _activeProvider: null
 function fetch(url, opts) { calls.fetches.push({url: String(url), body: opts && opts.body || ''}); return Promise.resolve({ok: true}); }

 for (const name of [
+  'assistantDisplayName',
  '_getOptionProviderId', '_providerFromModelValue', '_modelStateForSelect',
  '_findModelInDropdown', '_refreshOpenModelDropdown', '_applyModelToDropdown',
  '_modelStateFromAppliedDropdown', '_persistSessionModelCorrection',
@@ -3,23 +3,34 @@ from pathlib import Path
 ROOT = Path(__file__).resolve().parents[1]


+def test_webui_backend_prompt_cache_hit_percent_uses_prompt_total_denominator():
+    from api.usage import prompt_cache_hit_percent
+
+    assert prompt_cache_hit_percent(100_000, 125_000) == 80
+    assert prompt_cache_hit_percent(0, 125_000) is None
+    assert prompt_cache_hit_percent(100, 0) is None
+    assert prompt_cache_hit_percent(None, None) is None
+    assert prompt_cache_hit_percent(200, 100) == 100
+
+
 def test_session_compact_exposes_prompt_cache_counters():
    from api.models import Session

    session = Session(
        session_id="issue2419_cache_usage",
        workspace="/tmp",
-        input_tokens=120_000,
+        input_tokens=125_000,
        output_tokens=5_000,
        estimated_cost=0.44,
        cache_read_tokens=100_000,
-        cache_write_tokens=20_000,
+        cache_write_tokens=5_000,
    )

    compact = session.compact()

    assert compact["cache_read_tokens"] == 100_000
-    assert compact["cache_write_tokens"] == 20_000
+    assert compact["cache_write_tokens"] == 5_000
+    assert compact["cache_hit_percent"] == 80


 def test_streaming_usage_payload_includes_prompt_cache_counters():
@@ -27,8 +38,9 @@ def test_streaming_usage_payload_includes_prompt_cache_counters():

    assert "session_cache_read_tokens" in src
    assert "session_cache_write_tokens" in src
-    assert "'cache_read_tokens': cache_read_tokens" in src
-    assert "'cache_write_tokens': cache_write_tokens" in src
+    assert "prompt_cache_hit_percent(" in src
+    assert "'cache_hit_percent':" in src
+    assert "'turn_cache_hit_percent':" in src


 def test_context_indicator_surfaces_cache_hit_rate():
@@ -36,9 +48,25 @@ def test_context_indicator_surfaces_cache_hit_rate():

    assert "cacheReadTok=usage.cache_read_tokens||0" in src
    assert "cacheWriteTok=usage.cache_write_tokens||0" in src
-    assert "cache: ${cacheHitPct}% hit" in src
+    assert "cacheHitPct=usage.cache_hit_percent" in src
+    assert "t('usage_cache_hit_detail',cacheHitPct" in src
    assert "Estimated cost: $${cost<0.01?cost.toFixed(4):cost.toFixed(2)}" in src
-    assert "cache ${Math.round((cacheRead/cacheTotal)*100)}% hit" in src
+    assert "cacheHitPct=msg._turnUsage.cache_hit_percent" in src
+    assert "t('usage_cached_percent',cacheHitPct)" in src
+    assert "cacheHitPct!=null" in src
+    assert "cacheReadTok/cacheTotalTok" not in src
+    assert "cacheRead/cacheTotal" not in src
+    assert "cacheReadTok/promptTok" not in src
+    assert "cacheRead/cacheDenom" not in src
+
+
+def test_cache_usage_labels_are_localized():
+    src = (ROOT / "static" / "i18n.js").read_text()
+
+    assert src.count("usage_cache_hit_detail:") == 11
+    assert src.count("usage_cached_percent:") == 11
+    assert "usage_cache_hit_detail: 'Cache: {0}% hit ({1} read / {2} write)'" in src
+    assert "usage_cached_percent: '{0}% cached'" in src


 def test_done_handler_preserves_per_turn_cache_deltas():
@@ -48,3 +76,4 @@ def test_done_handler_preserves_per_turn_cache_deltas():
    assert "curCacheRead=d.usage.cache_read_tokens||0" in src
    assert "cache_read_tokens:Math.max(0,curCacheRead-_prevCacheRead)" in src
    assert "cache_write_tokens:Math.max(0,curCacheWrite-_prevCacheWrite)" in src
+    assert "cache_hit_percent:d.usage.turn_cache_hit_percent" in src
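
The denominator change pinned by these tests can be illustrated with a minimal sketch. `sketch_prompt_cache_hit_percent` is a hypothetical stand-in, not the actual `api/usage.py` helper. It assumes only what `test_webui_backend_prompt_cache_hit_percent_uses_prompt_total_denominator` asserts: `None` for missing or zero inputs and a cap at 100. Rounding to the nearest integer is an assumption.

```python
from typing import Optional


def sketch_prompt_cache_hit_percent(cache_read, prompt_total) -> Optional[int]:
    """Hypothetical stand-in for prompt_cache_hit_percent(); not the real helper."""
    try:
        read = int(cache_read or 0)
        total = int(prompt_total or 0)
    except (TypeError, ValueError):
        return None
    # No cached reads or no prompt tokens means there is nothing useful to show.
    if read <= 0 or total <= 0:
        return None
    # Fraction of the whole prompt served from cache, capped at 100.
    return min(100, round(read * 100 / total))


def old_ratio_percent(cache_read: int, cache_write: int) -> int:
    """The replaced UI formula, kept here only for comparison."""
    return round(cache_read / (cache_read + cache_write) * 100)
```

With the session fixture above (100,000 cached of 125,000 prompt tokens, 5,000 written), the sketch reports 80, whereas the replaced read/(read+write) ratio would have reported roughly 95.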
@@ -30,10 +30,17 @@ def test_file_rows_get_toggle_placeholder_before_icon():
 def test_placeholder_matches_directory_toggle_slot_width():
    assert ".file-tree-toggle{" in STYLE_CSS
    assert ".file-tree-toggle-placeholder{" in STYLE_CSS
+    assert "--file-tree-toggle-width:10px" in STYLE_CSS
+
+    toggle_start = STYLE_CSS.index(".file-tree-toggle{")
+    toggle_end = STYLE_CSS.index("}", toggle_start)
+    toggle = STYLE_CSS[toggle_start:toggle_end]
+
    placeholder_start = STYLE_CSS.index(".file-tree-toggle-placeholder{")
    placeholder_end = STYLE_CSS.index("}", placeholder_start)
    placeholder = STYLE_CSS[placeholder_start:placeholder_end]

-    assert "width:10px" in placeholder
-    assert "flex:0 0 10px" in placeholder
+    assert "width:var(--file-tree-toggle-width)" in toggle
+    assert "width:var(--file-tree-toggle-width)" in placeholder
+    assert "flex:0 0 var(--file-tree-toggle-width)" in placeholder
    assert "display:inline-block" in placeholder
@@ -0,0 +1,21 @@
+from pathlib import Path
+
+
+ROOT = Path(__file__).resolve().parents[1]
+UI_JS = (ROOT / "static" / "ui.js").read_text()
+
+
+def test_temporary_configured_model_option_carries_provider_badge():
+    """Configured picker rows that are not already <option>s must keep provider."""
+
+    assert "const badge=(window._configuredModelBadges||{})[value];" in UI_JS
+    assert "if(badge&&badge.provider) opt.dataset.provider=badge.provider;" in UI_JS
+
+
+def test_model_state_reads_provider_from_option_dataset_before_optgroup():
+    """selectModelFromDropdown() adds temporary options outside optgroups."""
+
+    start = UI_JS.index("function _getOptionProviderId(opt)")
+    body = UI_JS[start : UI_JS.index("function _providerFromModelValue", start)]
+    assert "if(opt.dataset && opt.dataset.provider) return opt.dataset.provider;" in body
+    assert body.index("opt.dataset && opt.dataset.provider") < body.index("const group=opt.parentElement")
@@ -33,6 +33,12 @@ def _make_link(url, label):
    return f'<a href="{url}" target="_blank" rel="noopener">{esc(label)}</a>'


+def markdown_href(url):
+    if url.lower().startswith("file://"):
+        return "api/media?path=" + __import__("urllib.parse").parse.quote(url[7:], safe="") + "&inline=1"
+    return url
+
+
 # Minimal Python mirror of the FIXED renderMd() — enough to test link behaviour.
 # Mirrors the stash-based approach introduced by the fix.

@@ -48,9 +54,9 @@ def render_links_only(text):
    link_stash = []
    def stash_link(m):
        label, url = m.group(1), m.group(2)
-        link_stash.append(f'<a href="{url}" target="_blank" rel="noopener">{esc(label)}</a>')
+        link_stash.append(f'<a href="{markdown_href(url)}" target="_blank" rel="noopener">{esc(label)}</a>')
        return f'\x00L{len(link_stash)-1}\x00'
-    s = re.sub(r'\[([^\]]+)\]\((https?://[^\)]+)\)', stash_link, s)
+    s = re.sub(r'\[([^\]]+)\]\(((?:https?|file)://[^\)]+)\)', stash_link, s)

    # Autolink bare URLs (should NOT match inside already-stashed placeholders)
    def autolink(m):
@@ -83,9 +89,9 @@ def render_table_with_links(md):
        stash = []
        def stash_fn(m):
            lb, u = m.group(1), m.group(2)
-            stash.append(f'<a href="{u}" target="_blank" rel="noopener">{esc(lb)}</a>')
+            stash.append(f'<a href="{markdown_href(u)}" target="_blank" rel="noopener">{esc(lb)}</a>')
            return f'\x00L{len(stash)-1}\x00'
-        t = re.sub(r'\[([^\]]+)\]\((https?://[^\)]+)\)', stash_fn, t)
+        t = re.sub(r'\[([^\]]+)\]\(((?:https?|file)://[^\)]+)\)', stash_fn, t)
        # autolink remaining bare URLs
        def autolink(m):
            url = m.group(1)
@@ -170,6 +176,17 @@ def test_labeled_link_renders_as_single_anchor():
    assert f']({url})' not in result


+def test_labeled_file_link_renders_as_single_anchor():
+    """A labeled local file link must survive the settled render path."""
+    url = 'file:///Users/agent/Documents/Obsidian/Meal-Prep/halal-cart.html'
+    md = f'[Halal Cart Chicken]({url})'
+    result = render_links_only(md)
+    assert result.count('<a ') == 1, f"Expected 1 <a> tag, got: {result}"
+    assert 'href="api/media?path=%2FUsers%2Fagent%2FDocuments%2FObsidian%2FMeal-Prep%2Fhalal-cart.html&inline=1"' in result
+    assert 'Halal Cart Chicken' in result
+    assert '[Halal Cart Chicken]' not in result
+
+
 def test_href_not_html_escaped():
    """URLs with & must appear as literal & in href, not &amp;."""
    url = 'https://example.com/search?q=foo&bar=baz'
@@ -261,6 +278,13 @@ def test_js_source_sanitizes_quotes_in_href():
        "URL placed in href should have double-quotes percent-encoded via .replace to %22"
    )

+
+def test_js_source_rewrites_file_links_to_media_endpoint():
+    """Browser pages cannot reliably navigate to file://, so renderMd must use /api/media."""
+    assert "function _markdownHref" in UI_JS
+    assert "api/media?path=" in UI_JS
+    assert "file:\\/\\/" in UI_JS
+
 # ── Code-inside-bold tests (pre-existing bug, fixed in same PR) ───────────────

 def test_js_inlinemd_stashes_code_before_bold():
@@ -389,7 +389,7 @@ def test_focus_visibility_return_marks_active_session_viewed_and_clears_marker()


 def test_completion_unread_clears_only_when_session_is_opened():
-    load_idx = SESSIONS_JS.find("async function loadSession(sid)")
+    load_idx = SESSIONS_JS.find("async function loadSession(sid")
    assert load_idx != -1, "loadSession not found"
    load_block = SESSIONS_JS[load_idx:SESSIONS_JS.find("function _resolveSessionModelForDisplaySoon", load_idx)]

@@ -113,7 +113,7 @@ def test_korean_settings_detail_descriptions_are_translated():
        "settings_desc_external_sessions: 'CLI, Telegram, Discord, Slack 및 기타 채널의 대화를 세션 목록에 표시합니다. 클릭하여 가져오고 계속하세요.'",
        "settings_desc_sync_insights: 'WebUI 토큰 사용량을 state.db에 반영하여 hermes /insights에 브라우저 세션 데이터가 포함되도록 합니다. 기본값은 꺼짐입니다.'",
        "settings_desc_check_updates: 'WebUI 또는 Agent의 새 버전이 있으면 배너를 표시합니다. 백그라운드에서 주기적으로 git fetch를 실행합니다.'",
-        "settings_desc_bot_name: 'UI 전체에 표시되는 Assistant 이름입니다. 기본값은 Hermes입니다.'",
+        "settings_desc_bot_name: '기본 프로필에만 사용됩니다. 다른 프로필은 각 프로필 이름을 사용합니다.'",
        "settings_desc_password: '새 비밀번호를 설정하거나 변경하려면 입력하세요. 현재 설정을 유지하려면 비워 두세요.'",
    ]
    for entry in expected:
@@ -309,7 +309,22 @@ def test_rfc_distinguishes_goal_routing_from_queue_route_staging():
    rfc = (routes.Path(__file__).parent.parent / "docs" / "rfcs" / "hermes-run-adapter-contract.md").read_text(encoding="utf-8")

    assert "#2544 shipped the first Slice 3c implementation" in rfc
+    assert "#2560 shipped the queue-staging clarification" in rfc
    assert "route now uses `RuntimeAdapter.update_goal(...)`" in rfc
-    assert "`queue_message(...)` remains a staged protocol method" in rfc
+    assert "`queue_message(...)` as a staged protocol method only" in rfc
    assert "no new server-side queue endpoint" in rfc
-    assert "or queue scheduler should be added just for adapter symmetry" in rfc
+    assert "no server-side queue endpoint or queue\n  scheduler should be added merely for adapter symmetry" in rfc
+
+
+def test_rfc_defines_slice4_runner_contract_before_runner_code():
+    routes = importlib.import_module("api.routes")
+    rfc = (routes.Path(__file__).parent.parent / "docs" / "rfcs" / "hermes-run-adapter-contract.md").read_text(encoding="utf-8")
+
+    assert "#### Slice 4a: Runner contract gate" in rfc
+    assert "docs/test contract PR before any\nrunner code lands" in rfc
+    assert "feature-flagged, default-off" in rfc
+    assert "The runner, not the main WebUI request process, owns" in rfc
+    assert "restart only\n   `hermes-webui.service`" in rfc
+    assert "profile,\n   workspace, attachments, model/provider, toolset, and source metadata" in rfc
+    assert "no removal of the legacy in-process backend" in rfc
+    assert "no default-on runner mode" in rfc
@@ -808,6 +808,52 @@ class TestNonEmptyMessagesPendingCleared:
        assert s.pending_user_message is None
        assert s.active_stream_id is None

+    def test_finished_worker_can_supersede_its_own_interrupted_marker(self):
+        """A live worker that finishes after stale repair should be allowed to
+        replace the recovery marker for the same user turn."""
+        s = _make_session(
+            messages=[
+                {"role": "user", "content": "deploy"},
+                models._interrupted_recovery_marker(),
+            ]
+        )
+        s.active_stream_id = None
+        s.pending_user_message = None
+        s.pending_attachments = []
+
+        assert streaming._stream_writeback_can_supersede_recovery_marker(s, "deploy")
+
+    def test_finished_worker_does_not_supersede_after_newer_turn_appended(self):
+        """Once a follow-up turn changes the visible tail, stale writeback stays
+        blocked so old workers cannot overwrite newer transcript state."""
+        s = _make_session(
+            messages=[
+                {"role": "user", "content": "deploy"},
+                models._interrupted_recovery_marker(),
+                {"role": "user", "content": "what happened?"},
+                {"role": "assistant", "content": "I checked the deployment status."},
+            ]
+        )
+        s.active_stream_id = None
+        s.pending_user_message = None
+        s.pending_attachments = []
+
+        assert not streaming._stream_writeback_can_supersede_recovery_marker(s, "deploy")
+
+    def test_finished_worker_does_not_supersede_different_user_turn(self):
+        """The supersede path is tied to the pending prompt that was repaired."""
+        s = _make_session(
+            messages=[
+                {"role": "user", "content": "deploy"},
+                models._interrupted_recovery_marker(),
+            ]
+        )
+        s.active_stream_id = None
+        s.pending_user_message = None
+        s.pending_attachments = []
+
+        assert not streaming._stream_writeback_can_supersede_recovery_marker(s, "ship it")
+
    def test_core_sync_branch_does_not_duplicate_journal_output_already_in_core(
        self, hermes_home, monkeypatch
    ):
@@ -218,7 +218,7 @@ def test_cancel_marker_flagged_as_error_to_skip_in_api_history():
    persisting the marker to the session.
    """
    src = read("api/streaming.py")
-    idx = src.find("'content': _cancelled_turn_content(message)")
+    idx = src.find("'content': _cancelled_turn_content(message")
    assert idx != -1, "cancel marker content writer not found in cancel_stream()"

    # Walk back to the start of the dict literal (opening brace)
@@ -554,7 +554,8 @@ class TestSmdUrlSchemeSanitization:
    def test_sanitize_uses_scheme_allowlist(self):
        # The allowlist regex must permit the safe schemes that the legacy
        # renderMd path emitted (http/https + relative/anchor paths + mailto/tel)
-        # and reject everything else — including javascript:, data:, vbscript:, file:.
+        # and reject dangerous executable schemes. file:// anchors are rewritten
+        # to api/media before click time rather than allowed through raw.
        assert "_SMD_SAFE_URL_RE" in MESSAGES_JS, (
            "Expected a _SMD_SAFE_URL_RE regex defining the safe-scheme allowlist"
        )
@@ -565,11 +566,17 @@ class TestSmdUrlSchemeSanitization:
        pattern = m.group(1)
        # Must mention https? and must NOT mention javascript/vbscript/data
        assert "https?" in pattern, "allowlist must permit https?:"
+        assert "file:" not in pattern, "raw file: anchors must be rewritten, not allowed through"
+        assert "api" in MESSAGES_JS, "allowlist must permit rewritten api/media anchors"
        for bad in ("javascript", "vbscript", "data:"):
            assert bad not in pattern, (
                f"allowlist must NOT mention {bad!r} — schemes are denied by default"
            )

+    def test_file_anchor_rewrite_helper_exists(self):
+        assert "_smdFileHref" in MESSAGES_JS
+        assert "api/media?path=" in MESSAGES_JS
+
    def test_sanitize_called_after_smd_write(self):
        # _smdWrite must invoke _sanitizeSmdLinks on assistantBody after feeding the parser,
        # so anchors/images created mid-stream get their javascript:/data:/vbscript:
@@ -28,7 +28,7 @@ def test_clicking_current_session_is_noop_before_load_session_side_effects():
    load_session = _function_body(SESSIONS_JS, "async function loadSession")

    current_idx = load_session.index("const currentSid = S.session ? S.session.session_id : null")
-    noop_idx = load_session.index("if(currentSid===sid) return")
+    noop_idx = load_session.index("if(currentSid===sid && !forceReload) return")
    loading_idx = load_session.index("_loadingSessionId = sid")
    stop_idx = load_session.index("stopApprovalPolling")

@@ -0,0 +1,103 @@
+import subprocess
+
+import api.terminal as terminal
+
+
+class _DummyThread:
+    def __init__(self, *args, **kwargs):
+        self.args = args
+        self.kwargs = kwargs
+        self.started = False
+
+    def start(self):
+        self.started = True
+
+
+class _FakeProc:
+    pid = 999_999_999
+
+    def __init__(self):
+        self.wait_calls = []
+
+    def poll(self):
+        return None
+
+    def wait(self, timeout=None):
+        self.wait_calls.append(timeout)
+        return 0
+
+
+def test_terminal_shell_uses_parent_death_signal_preexec(monkeypatch, tmp_path):
+    captured = {}
+    proc = _FakeProc()
+
+    def fake_popen(*args, **kwargs):
+        captured["args"] = args
+        captured["kwargs"] = kwargs
+        return proc
+
+    monkeypatch.setattr(terminal.subprocess, "Popen", fake_popen)
+    monkeypatch.setattr(terminal.threading, "Thread", _DummyThread)
+    monkeypatch.setattr(terminal, "_set_size", lambda *args, **kwargs: None)
+
+    term = terminal.start_terminal("term-preexec", tmp_path)
+
+    try:
+        assert term.proc is proc
+        assert captured["kwargs"]["preexec_fn"] is terminal._terminal_shell_preexec_fn
+        assert captured["kwargs"]["start_new_session"] is True
+        assert captured["kwargs"]["stdin"] == captured["kwargs"]["stdout"] == captured["kwargs"]["stderr"]
+    finally:
+        terminal.close_terminal("term-preexec")
+
+
+def test_close_terminal_waits_again_after_sigkill(monkeypatch):
+    class TimeoutThenReapedProc(_FakeProc):
+        def wait(self, timeout=None):
+            self.wait_calls.append(timeout)
+            if len(self.wait_calls) == 1:
+                raise subprocess.TimeoutExpired(cmd="shell", timeout=timeout)
+            return -9
+
+    proc = TimeoutThenReapedProc()
+    term = terminal.TerminalSession(
+        session_id="term-timeout",
+        workspace="/tmp",
+        proc=proc,
+        master_fd=12345,
+    )
+    terminal._TERMINALS["term-timeout"] = term
+    kills = []
+    monkeypatch.setattr(terminal.os, "killpg", lambda pid, sig: kills.append((pid, sig)))
+    monkeypatch.setattr(terminal.os, "close", lambda fd: None)
+
+    assert terminal.close_terminal("term-timeout") is True
+
+    assert proc.wait_calls == [1.5, 1.0]
+    assert kills == [(proc.pid, terminal.signal.SIGHUP), (proc.pid, terminal.signal.SIGKILL)]
+
+
+def test_close_all_terminals_closes_snapshot(monkeypatch):
+    terminal._TERMINALS.clear()
+    terminal._TERMINALS.update({"a": object(), "b": object()})
+    closed = []
+
+    def fake_close(session_id):
+        closed.append(session_id)
+        terminal._TERMINALS.pop(session_id, None)
+        return True
+
+    monkeypatch.setattr(terminal, "close_terminal", fake_close)
+
+    terminal.close_all_terminals()
+
+    assert closed == ["a", "b"]
+    assert terminal._TERMINALS == {}
+
+
+def test_terminal_module_registers_graceful_shutdown_reaper():
+    src = terminal.Path(terminal.__file__).read_text()
+
+    assert "atexit.register(close_all_terminals)" in src
+    assert "preexec_fn=_terminal_shell_preexec_fn" in src
+    assert "libc.prctl(1, signal.SIGTERM)" in src
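
The close sequence that `test_close_terminal_waits_again_after_sigkill` expects (SIGHUP, a 1.5 s wait, SIGKILL, then a second 1.0 s wait) might look roughly like the sketch below. `sketch_close_shell` and its injected `killpg` parameter are hypothetical names chosen for illustration; the real `close_terminal()` in `api/terminal.py` is not shown in this diff and may differ.

```python
import signal
import subprocess


def sketch_close_shell(proc, killpg, hup_wait=1.5, kill_wait=1.0):
    """Hypothetical close path: SIGHUP, wait, then SIGKILL and wait again.

    `proc` needs a `pid` attribute and a `wait(timeout=...)` method;
    `killpg` is injected (for example os.killpg) so the flow can be tested.
    """
    try:
        killpg(proc.pid, signal.SIGHUP)
    except ProcessLookupError:
        return True  # process group already gone
    try:
        proc.wait(timeout=hup_wait)
        return True
    except subprocess.TimeoutExpired:
        pass  # shell ignored SIGHUP; escalate
    try:
        killpg(proc.pid, signal.SIGKILL)
    except ProcessLookupError:
        return True
    try:
        # Waiting again reaps the killed shell so it does not linger as a zombie.
        proc.wait(timeout=kill_wait)
    except subprocess.TimeoutExpired:
        return False
    return True
```

Injecting `killpg` rather than calling `os.killpg` directly is a testing convenience in this sketch; the test above achieves the same effect by monkeypatching `terminal.os.killpg`.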
@@ -0,0 +1,39 @@
+from pathlib import Path
+
+
+SESSIONS_JS = Path("static/sessions.js").read_text(encoding="utf-8")
+
+
+def test_load_session_supports_force_reload_for_external_refresh():
+    assert "async function loadSession(sid)" in SESSIONS_JS
+    assert "const opts = arguments[1] || {};" in SESSIONS_JS
+    assert "const forceReload = !!opts.force" in SESSIONS_JS
+    assert "if(currentSid===sid && !forceReload) return;" in SESSIONS_JS
+    assert "loadSession(sid, {force:true" in SESSIONS_JS
+
+
+def test_active_session_external_refresh_uses_metadata_then_force_reload():
+    assert "function ensureActiveSessionExternalRefreshPoll()" in SESSIONS_JS
+    assert "async function refreshActiveSessionIfExternallyUpdated(reason)" in SESSIONS_JS
+    assert "messages=0&resolve_model=0" in SESSIONS_JS
+    assert "remoteCount > localCount || remoteLast > localLast" in SESSIONS_JS
+    assert "if(S.busy || S.activeStreamId) return;" in SESSIONS_JS
+    assert "document.hidden" in SESSIONS_JS
+
+
+def test_active_session_external_refresh_has_focus_and_visibility_hooks():
+    assert "visibilitychange" in SESSIONS_JS
+    assert "window.addEventListener('focus'" in SESSIONS_JS
+    assert "ensureActiveSessionExternalRefreshPoll();" in SESSIONS_JS
+
+
+def test_force_reload_clears_stale_blocking_prompts_immediately():
+    """External refresh should not leave old approval/clarify modals blocking the composer.
+
+    hideApprovalCard() and hideClarifyCard() defer hiding for their minimum-visible
+    timers unless force=true. That is correct for active streams, but when a
+    same-session external state.db update triggers loadSession(..., {force:true}),
+    the session has completed elsewhere and stale prompts should be removed now.
+    """
+    assert "hideApprovalCard(forceReload)" in SESSIONS_JS
+    assert "hideClarifyCard(forceReload, forceReload?'external-refresh':'dismissed')" in SESSIONS_JS
@@ -0,0 +1,132 @@
+import json
+import queue
+import sqlite3
+from collections import OrderedDict
+from pathlib import Path
+
+import pytest
+
+pytestmark = pytest.mark.requires_agent_modules
+
+
+def _make_state_db(path: Path, sid: str, rows):
+    conn = sqlite3.connect(path)
+    conn.execute(
+        "CREATE TABLE sessions (id TEXT PRIMARY KEY, source TEXT, title TEXT, model TEXT, started_at REAL, message_count INTEGER)"
+    )
+    conn.execute(
+        "CREATE TABLE messages (id INTEGER PRIMARY KEY AUTOINCREMENT, session_id TEXT, role TEXT, content TEXT, timestamp REAL)"
+    )
+    conn.execute(
+        "INSERT INTO sessions (id, source, title, model, started_at, message_count) VALUES (?, ?, ?, ?, ?, ?)",
+        (sid, "webui", "Context Reconcile", "test-model", 1000.0, len(rows)),
+    )
+    for row in rows:
+        conn.execute(
+            "INSERT INTO messages (session_id, role, content, timestamp) VALUES (?, ?, ?, ?)",
+            (sid, row["role"], row["content"], row.get("timestamp", 1000.0)),
+        )
+    conn.commit()
+    conn.close()
+
+
+def test_next_webui_turn_context_includes_state_db_external_messages(monkeypatch, tmp_path):
+    import api.config as config
+    import api.models as models
+    import api.profiles as profiles
+    import api.streaming as streaming
+    from api.models import Session
+
+    session_dir = tmp_path / "sessions"
+    session_dir.mkdir()
+    index_file = session_dir / "_index.json"
+    monkeypatch.setattr(models, "SESSION_DIR", session_dir)
+    monkeypatch.setattr(models, "SESSION_INDEX_FILE", index_file)
+    monkeypatch.setattr(models, "SESSIONS", OrderedDict(), raising=False)
+    monkeypatch.setattr(config, "SESSION_DIR", session_dir, raising=False)
+    monkeypatch.setattr(config, "SESSION_INDEX_FILE", index_file, raising=False)
+    monkeypatch.setattr(streaming, "SESSION_DIR", session_dir, raising=False)
+    monkeypatch.setattr(profiles, "get_active_hermes_home", lambda: tmp_path, raising=False)
+    monkeypatch.setattr(models, "_active_state_db_path", lambda: tmp_path / "state.db", raising=False)
+    config.STREAMS.clear()
+    config.CANCEL_FLAGS.clear()
+    config.AGENT_INSTANCES.clear()
+    config.SESSION_AGENT_LOCKS.clear()
+
+    sid = "webui_context_reconcile_001"
+    sidecar_messages = [
+        {"role": "user", "content": "old user", "timestamp": 1000.0},
+        {"role": "assistant", "content": "old assistant", "timestamp": 1001.0},
+    ]
+    session = Session(
+        session_id=sid,
+        title="Context Reconcile",
+        workspace=str(tmp_path),
+        model="test-model",
+        messages=list(sidecar_messages),
+        context_messages=list(sidecar_messages),
+    )
+    session.active_stream_id = "stream-context-reconcile"
+    session.pending_user_message = "new webui turn"
+    session.pending_started_at = 1004.0
+    session.save(touch_updated_at=False)
+    models.SESSIONS[sid] = session
+
+    _make_state_db(
+        tmp_path / "state.db",
+        sid,
+        [
+            {"role": "user", "content": "old user", "timestamp": 1000.0},
+            {"role": "assistant", "content": "old assistant", "timestamp": 1001.0},
+            {"role": "user", "content": "external gateway user", "timestamp": 1002.0},
+            {"role": "assistant", "content": "external gateway assistant", "timestamp": 1003.0},
+        ],
+    )
+
+    captured = {}
+
+    class FakeAgent:
+        def __init__(self, **kwargs):
+            self.session_id = sid
+            self.context_compressor = None
+            self.ephemeral_system_prompt = None
+
+        def run_conversation(self, **kwargs):
+            captured["conversation_history"] = kwargs.get("conversation_history")
+            history = kwargs.get("conversation_history") or []
+            return {
+                "completed": True,
+                "final_response": "ok",
+                "messages": history + [
+                    {"role": "user", "content": kwargs.get("persist_user_message", "")},
+                    {"role": "assistant", "content": "ok"},
+                ],
+            }
+
+    monkeypatch.setattr(streaming, "_get_ai_agent", lambda: FakeAgent)
+    monkeypatch.setattr(streaming, "resolve_model_provider", lambda *args, **kwargs: ("test-model", None, None))
+    monkeypatch.setattr(streaming, "get_config", lambda: {})
+    monkeypatch.setattr(config, "get_config", lambda: {})
+    monkeypatch.setattr(config, "_resolve_cli_toolsets", lambda *args, **kwargs: [])
+
+    stream_id = "stream-context-reconcile"
+    config.STREAMS[stream_id] = queue.Queue()
+    try:
+        streaming._run_agent_streaming(
+            session_id=sid,
+            msg_text="new webui turn",
+            model="test-model",
+            workspace=str(tmp_path),
+            stream_id=stream_id,
+            attachments=[],
+        )
+    finally:
+        config.STREAMS.pop(stream_id, None)
+
+    history_contents = [m.get("content") for m in captured.get("conversation_history") or []]
+    assert history_contents == [
+        "old user",
+        "old assistant",
+        "external gateway user",
+        "external gateway assistant",
+    ]
@@ -0,0 +1,352 @@
+import json
+import sqlite3
+from collections import OrderedDict
+from io import BytesIO
+from pathlib import Path
+from urllib.parse import parse_qs, urlparse
+
+import pytest
+
+pytestmark = pytest.mark.requires_agent_modules
+
+
+class _GetHandler:
+    def __init__(self, path):
+        self.path = path
+        self.headers = {}
+        self.client_address = ("127.0.0.1", 12345)
+        self.status = None
+        self.wfile = BytesIO()
+        self.response_headers = []
+
+    def send_response(self, status):
+        self.status = status
+
+    def send_header(self, key, value):
+        self.response_headers.append((key, value))
+
+    def end_headers(self):
+        pass
+
+    @property
+    def response_json(self):
+        return json.loads(self.wfile.getvalue().decode("utf-8"))
+
+    @property
+    def query(self):
+        return parse_qs(urlparse(self.path).query)
+
+    def log_message(self, *args, **kwargs):
+        pass
+
+
+def _make_state_db(path: Path, sid: str, rows):
+    conn = sqlite3.connect(path)
+    conn.execute(
+        "CREATE TABLE sessions (id TEXT PRIMARY KEY, source TEXT, title TEXT, model TEXT, started_at REAL, message_count INTEGER)"
+    )
+    conn.execute(
+        "CREATE TABLE messages (id INTEGER PRIMARY KEY AUTOINCREMENT, session_id TEXT, role TEXT, content TEXT, timestamp REAL, tool_call_id TEXT, tool_calls TEXT, tool_name TEXT)"
+    )
+    conn.execute(
+        "INSERT INTO sessions (id, source, title, model, started_at, message_count) VALUES (?, ?, ?, ?, ?, ?)",
+        (sid, "webui", "Reconcile", "test-model", 1000.0, len(rows)),
+    )
+    for row in rows:
+        conn.execute(
+            "INSERT INTO messages (session_id, role, content, timestamp, tool_call_id, tool_calls, tool_name) VALUES (?, ?, ?, ?, ?, ?, ?)",
+            (
+                sid,
+                row["role"],
+                row["content"],
+                row.get("timestamp", 1000.0),
+                row.get("tool_call_id"),
+                row.get("tool_calls"),
+                row.get("tool_name"),
+            ),
+        )
+    conn.commit()
+    conn.close()
+
+
+def _install_test_session(monkeypatch, tmp_path, sid, sidecar_messages):
+    import api.config as config
+    import api.models as models
+    import api.routes as routes
+    import api.profiles as profiles
+
+    monkeypatch.setattr(config, "STATE_DIR", tmp_path, raising=False)
+    session_dir = tmp_path / "sessions"
+    monkeypatch.setattr(config, "SESSION_DIR", session_dir, raising=False)
+    monkeypatch.setattr(config, "SESSION_INDEX_FILE", session_dir / "_index.json", raising=False)
+    monkeypatch.setattr(models, "SESSION_DIR", session_dir, raising=False)
+    monkeypatch.setattr(models, "SESSION_INDEX_FILE", session_dir / "_index.json", raising=False)
+    monkeypatch.setattr(models, "SESSIONS", OrderedDict(), raising=False)
+    monkeypatch.setattr(profiles, "get_active_hermes_home", lambda: tmp_path, raising=False)
+    monkeypatch.setattr(models, "_active_state_db_path", lambda: tmp_path / "state.db", raising=False)
+    monkeypatch.setattr(routes, "_active_state_db_path", lambda: tmp_path / "state.db", raising=False)
+    session_dir.mkdir(parents=True, exist_ok=True)
+
+    session = models.Session(
+        session_id=sid,
+        title="Reconcile",
+        workspace=str(tmp_path),
+        model="test-model",
+        messages=sidecar_messages,
+        created_at=1000.0,
+        updated_at=1001.0,
+    )
+    session.save(touch_updated_at=False)
+    return session
+
+
+def test_api_session_includes_state_db_messages_newer_than_webui_sidecar(monkeypatch, tmp_path):
+    import api.routes as routes
+
+    sid = "webui_reconcile_001"
+    sidecar_messages = [
+        {"role": "user", "content": "old user", "timestamp": 1000.0},
+        {"role": "assistant", "content": "old assistant", "timestamp": 1001.0},
+    ]
+    _install_test_session(monkeypatch, tmp_path, sid, sidecar_messages)
+    _make_state_db(
+        tmp_path / "state.db",
+        sid,
+        [
+            {"role": "user", "content": "old user", "timestamp": 1000.0},
+            {"role": "assistant", "content": "old assistant", "timestamp": 1001.0},
+            {"role": "user", "content": "external user", "timestamp": 1002.0},
+            {"role": "assistant", "content": "external assistant", "timestamp": 1003.0},
+        ],
+    )
+
+    handler = _GetHandler(f"/api/session?session_id={sid}&messages=1&resolve_model=0")
+    routes.handle_get(handler, urlparse(handler.path))
+
+    assert handler.status == 200
+    payload = handler.response_json
+    messages = payload["session"]["messages"]
+    assert [m["content"] for m in messages] == [
+        "old user",
+        "old assistant",
+        "external user",
+        "external assistant",
+    ]
+    assert payload["session"]["message_count"] == 4
+
+
+def test_state_db_reconciliation_preserves_sidecar_only_messages(monkeypatch, tmp_path):
+    import api.routes as routes
+
+    sid = "webui_reconcile_sidecar_only"
+    _install_test_session(
+        monkeypatch,
+        tmp_path,
+        sid,
+        [
+            {"role": "user", "content": "sidecar-only draft", "timestamp": 999.0},  # absent from state.db
+            {"role": "user", "content": "old user", "timestamp": 1000.0},
+        ],
+    )
+    _make_state_db(
+        tmp_path / "state.db",
+        sid,
+        [
+            {"role": "user", "content": "old user", "timestamp": 1000.0},
+            {"role": "assistant", "content": "external assistant", "timestamp": 1001.0},
+        ],
+    )
+
+    handler = _GetHandler(f"/api/session?session_id={sid}&messages=1&resolve_model=0")
+    routes.handle_get(handler, urlparse(handler.path))
+    assert handler.status == 200
+    messages = handler.response_json["session"]["messages"]
+    assert [m["content"] for m in messages] == [
+        "sidecar-only draft",
+        "old user",
+        "external assistant",
+    ]
+
+
+def test_state_db_reconciliation_does_not_collapse_repeated_content_with_different_timestamps(monkeypatch, tmp_path):
+    import api.routes as routes
+
+    sid = "webui_reconcile_repeated"
+    _install_test_session(
+        monkeypatch,
+        tmp_path,
+        sid,
+        [{"role": "assistant", "content": "same", "timestamp": 1000.0}],
+    )
+    _make_state_db(
+        tmp_path / "state.db",
+        sid,
+        [
+            {"role": "assistant", "content": "same", "timestamp": 1000.0},
+            {"role": "assistant", "content": "same", "timestamp": 1001.0},
+        ],
+    )
+
+    handler = _GetHandler(f"/api/session?session_id={sid}&messages=1&resolve_model=0")
+    routes.handle_get(handler, urlparse(handler.path))
+    assert handler.status == 200
+    messages = handler.response_json["session"]["messages"]
+    assert [m["content"] for m in messages] == ["same", "same"]
+    assert [m["timestamp"] for m in messages] == [1000.0, 1001.0]
+
+
+def test_state_db_reconciliation_preserves_sidecar_order_when_timestamps_collide(monkeypatch, tmp_path):
+    import api.routes as routes
+
+    sid = "webui_reconcile_same_timestamp_order"
+    _install_test_session(
+        monkeypatch,
+        tmp_path,
+        sid,
+        [
+            {"role": "user", "content": "z user happened first", "timestamp": 1000},
+            {"role": "assistant", "content": "a assistant happened second", "timestamp": 1000},
+            {"role": "tool", "content": "m tool happened third", "timestamp": 1000, "tool_call_id": "call_1"},
+        ],
+    )
+    _make_state_db(
+        tmp_path / "state.db",
+        sid,
+        [
+            {"role": "user", "content": "z user happened first", "timestamp": 1000.0},
+            {"role": "assistant", "content": "a assistant happened second", "timestamp": 1000.0},
+            {"role": "tool", "content": "m tool happened third", "timestamp": 1000.0, "tool_call_id": "call_1"},
+        ],
+    )
+
+    handler = _GetHandler(f"/api/session?session_id={sid}&messages=1&resolve_model=0")
+    routes.handle_get(handler, urlparse(handler.path))
+    assert handler.status == 200
+    messages = handler.response_json["session"]["messages"]
+    assert [m["content"] for m in messages] == [
+        "z user happened first",
+        "a assistant happened second",
+        "m tool happened third",
+    ]
+    assert handler.response_json["session"]["message_count"] == 3
+
+
+def test_state_db_reconciliation_dedupes_numeric_equivalent_timestamps(monkeypatch, tmp_path):
+    import api.routes as routes
+
+    sid = "webui_reconcile_numeric_timestamp"
+    _install_test_session(
+        monkeypatch,
+        tmp_path,
+        sid,
+        [{"role": "assistant", "content": "same timestamp", "timestamp": 1000}],  # int here, float in state.db
+    )
+    _make_state_db(
+        tmp_path / "state.db",
+        sid,
+        [{"role": "assistant", "content": "same timestamp", "timestamp": 1000.0}],
+    )
+
+    handler = _GetHandler(f"/api/session?session_id={sid}&messages=1&resolve_model=0")
+    routes.handle_get(handler, urlparse(handler.path))
+    assert handler.status == 200
+    messages = handler.response_json["session"]["messages"]
+    assert [m["content"] for m in messages] == ["same timestamp"]
+    assert handler.response_json["session"]["message_count"] == 1
+
+
+def test_state_db_reconciliation_preserves_repeated_sidecar_rows(monkeypatch, tmp_path):
+    import api.routes as routes
+
+    sid = "webui_reconcile_repeated_sidecar"
+    _install_test_session(
+        monkeypatch,
+        tmp_path,
+        sid,
+        [
+            {"role": "assistant", "content": "", "timestamp": 1000},
+            {"role": "assistant", "content": "", "timestamp": 1000},  # repeat exists only in the sidecar
+            {"role": "assistant", "content": "done", "timestamp": 1001},
+        ],
+    )
+    _make_state_db(
+        tmp_path / "state.db",
+        sid,
+        [{"role": "assistant", "content": "", "timestamp": 1000.0}],
+    )
+
+    handler = _GetHandler(f"/api/session?session_id={sid}&messages=1&resolve_model=0")
+    routes.handle_get(handler, urlparse(handler.path))
+    assert handler.status == 200
+    messages = handler.response_json["session"]["messages"]
+    assert [m["content"] for m in messages] == ["", "", "done"]
+    assert handler.response_json["session"]["message_count"] == 3
+
+
+def test_metadata_fast_path_reports_reconciled_state_db_count(monkeypatch, tmp_path):
+    import api.routes as routes
+
+    sid = "webui_reconcile_metadata"
+    _install_test_session(
+        monkeypatch,
+        tmp_path,
+        sid,
+        [
+            {"role": "user", "content": "old user", "timestamp": 1000.0},
+            {"role": "assistant", "content": "old assistant", "timestamp": 1001.0},
+        ],
+    )
+    _make_state_db(
+        tmp_path / "state.db",
+        sid,
+        [
+            {"role": "user", "content": "old user", "timestamp": 1000.0},
+            {"role": "assistant", "content": "old assistant", "timestamp": 1001.0},
+            {"role": "user", "content": "external metadata user", "timestamp": 1002.0},
+            {"role": "assistant", "content": "external metadata assistant", "timestamp": 1003.0},
+        ],
+    )
+
+    handler = _GetHandler(f"/api/session?session_id={sid}&messages=0&resolve_model=0")  # messages=0: metadata only
+    routes.handle_get(handler, urlparse(handler.path))
+
+    assert handler.status == 200
+    session = handler.response_json["session"]
+    assert session["messages"] == []
+    assert session["message_count"] == 4
+    assert session["last_message_at"] == 1003.0
+
+
+def test_state_db_reconciliation_preserves_tool_metadata(monkeypatch, tmp_path):
+    import api.routes as routes
+
+    sid = "webui_reconcile_tool_metadata"
+    _install_test_session(
+        monkeypatch,
+        tmp_path,
+        sid,
+        [{"role": "user", "content": "old user", "timestamp": 1000.0}],
+    )
+    tool_calls = json.dumps([{"id": "call_1", "function": {"name": "terminal"}}])
+    _make_state_db(
+        tmp_path / "state.db",
+        sid,
+        [
+            {"role": "user", "content": "old user", "timestamp": 1000.0},
+            {
+                "role": "assistant",
+                "content": "used a tool",
+                "timestamp": 1001.0,
+                "tool_calls": tool_calls,
+                "tool_name": "terminal",
+            },
+        ],
+    )
+
+    handler = _GetHandler(f"/api/session?session_id={sid}&messages=1&resolve_model=0")
+    routes.handle_get(handler, urlparse(handler.path))
+    assert handler.status == 200
+    messages = handler.response_json["session"]["messages"]
+    assert messages[-1]["content"] == "used a tool"
+    assert messages[-1]["tool_name"] == "terminal"
+    assert messages[-1]["tool_calls"] == [{"id": "call_1", "function": {"name": "terminal"}}]