4 Commits

Author SHA1 Message Date
nesquena-hermes 71ba863ce5 fix(terminal): drop PR_SET_PDEATHSIG preexec_fn that killed every Linux shell
Fixes #2853. The `_terminal_shell_preexec_fn` added in `71d8a8fb` called
`prctl(PR_SET_PDEATHSIG, SIGTERM)` so orphaned PTY shells would die when
the WebUI process crashed. But that signal is **per-thread**, not
per-process, and WebUI runs `ThreadingHTTPServer`: every HTTP request is
handled in its own short-lived worker thread.

Flow that broke every Linux user:

1. User clicks the terminal toggle → frontend hits `POST /api/terminal/start`.
2. ThreadingHTTPServer spins up a worker thread to handle that one request.
3. The worker thread calls `subprocess.Popen(..., preexec_fn=...)`.
4. The shell calls `prctl(PR_SET_PDEATHSIG, SIGTERM)` in its preexec_fn.
   Its registered "parent" is now the WebUI worker thread that called Popen.
5. The handler returns its JSON response and the worker thread exits.
6. The kernel sees the pdeathsig-parent thread has died and sends SIGTERM
   to the PTY shell. The shell dies within ~10 ms of being created.
7. The reader loop sees EIO on the master FD, emits `terminal_closed`, and
   the frontend writes `[terminal closed]`.

macOS users were unaffected because `libc.prctl` doesn't exist there —
`ctypes.CDLL(None)` returns a libc handle, `libc.prctl` raises
`AttributeError`, the bare-`except` swallows it, and the shell starts
with no pdeathsig configured.

Empirical verification on this Linux host (real PTY + `subprocess.Popen`
inside a `threading.Thread` that joins immediately):

  with    preexec_fn → proc.poll() == -15 (SIGTERM), master FD returns EIO
  without preexec_fn → proc.poll() == None (alive), master FD returns "HELLO\\r\\n"

Same shell, same PTY, same threading topology as WebUI.

Fix
---

Drop the `preexec_fn` entirely. The orphan-shell-on-crash case the original
PR was navigating is rare for self-hosted single-user installs, and the
existing `atexit.register(close_all_terminals)` + explicit `close_terminal`
paths cover graceful shutdown. A future fix (option B in the issue) can
re-introduce pdeathsig pinned to a long-lived supervisor thread, but that
is a follow-up — this PR is the smallest unbricks-Linux-today change.

Tests
-----

- Invert `test_terminal_shell_uses_parent_death_signal_preexec` →
  `test_terminal_shell_does_not_use_pdeathsig_preexec`: asserts
  `preexec_fn` is NOT in the Popen kwargs.
- Add `test_pty_shell_survives_when_spawning_thread_exits`: spawns a
  real PTY shell via `start_terminal` from a worker thread, waits for
  the worker to join, asserts the shell is still alive after a half-second
  grace window. This is the contract the original tests never exercised.
- Update `test_terminal_module_registers_graceful_shutdown_reaper` to
  refuse re-introduction of the preexec_fn or the `libc.prctl(1, SIGTERM)`
  call (treats either as a regression).

All 27 terminal-related tests pass locally.

Refs #2853
2026-05-24 17:13:31 +00:00
Michael Lam 71d8a8fb1b fix: reap terminal shells on shutdown 2026-05-19 04:57:51 -07:00
Hermes Agent 867f2a3f81 absorb: address Opus review findings (security + correctness)
B1: fix stored XSS in MCP delete button — replace inline onclick with
    data-mcp-name attribute + event delegation (panels.js)
B2: fix zip/tar-slip via startswith prefix collision — use
    is_relative_to(); track actual extracted bytes instead of trusting
    member.file_size (upload.py)
B3: add NVIDIA NIM endpoint to _OPENAI_COMPAT_ENDPOINTS and
    _SUPPORTED_PROVIDER_SETUPS so provider is reachable (routes.py,
    onboarding.py)
H1: add terminalResizeHandle element to index.html and return it from
    _terminalEls() so resize-by-drag works (index.html, terminal.js)
H2: fix dead get_terminal() branch — return None for dead terminals
    instead of always returning term (terminal.py)
H3: replace os.environ.copy() with a safe allowlist in PTY shell env
    so API keys are not exposed inside the terminal (terminal.py)
H5: make model dedup deterministic — sort groups by provider_id
    alphabetically before first-occurrence assignment (config.py)
H7: add pid regex validation before OAuth probe; constrain key_source
    to a closed set of safe values (providers.py)
M8: add double-run guard for cron run-now — reject if job is already
    tracked as running (routes.py)
2026-04-29 05:06:34 +00:00
Frank Song 60a4cb057e Add embedded workspace terminal 2026-04-29 04:35:11 +00:00