Fixes#2853. The `_terminal_shell_preexec_fn` added in `71d8a8fb` called
`prctl(PR_SET_PDEATHSIG, SIGTERM)` so orphaned PTY shells would die when
the WebUI process crashed. But that signal is **per-thread**, not
per-process, and WebUI runs `ThreadingHTTPServer`: every HTTP request is
handled in its own short-lived worker thread.
Flow that broke every Linux user:
1. User clicks the terminal toggle → frontend hits `POST /api/terminal/start`.
2. ThreadingHTTPServer spins up a worker thread to handle that one request.
3. The worker thread calls `subprocess.Popen(..., preexec_fn=...)`.
4. The shell calls `prctl(PR_SET_PDEATHSIG, SIGTERM)` in its preexec_fn.
Its registered "parent" is now the WebUI worker thread that called Popen.
5. The handler returns its JSON response and the worker thread exits.
6. The kernel sees the pdeathsig-parent thread has died and sends SIGTERM
to the PTY shell. The shell dies within ~10 ms of being created.
7. The reader loop sees EIO on the master FD, emits `terminal_closed`, and
the frontend writes `[terminal closed]`.
macOS users were unaffected because `libc.prctl` doesn't exist there —
`ctypes.CDLL(None)` returns a libc handle, `libc.prctl` raises
`AttributeError`, the bare-`except` swallows it, and the shell starts
with no pdeathsig configured.
Empirical verification on this Linux host (real PTY + `subprocess.Popen`
inside a `threading.Thread` that joins immediately):
with preexec_fn → proc.poll() == -15 (SIGTERM), master FD returns EIO
without preexec_fn → proc.poll() == None (alive), master FD returns "HELLO\\r\\n"
Same shell, same PTY, same threading topology as WebUI.
Fix
---
Drop the `preexec_fn` entirely. The orphan-shell-on-crash case the original
PR was navigating is rare for self-hosted single-user installs, and the
existing `atexit.register(close_all_terminals)` + explicit `close_terminal`
paths cover graceful shutdown. A future fix (option B in the issue) can
re-introduce pdeathsig pinned to a long-lived supervisor thread, but that
is a follow-up — this PR is the smallest unbricks-Linux-today change.
Tests
-----
- Invert `test_terminal_shell_uses_parent_death_signal_preexec` →
`test_terminal_shell_does_not_use_pdeathsig_preexec`: asserts
`preexec_fn` is NOT in the Popen kwargs.
- Add `test_pty_shell_survives_when_spawning_thread_exits`: spawns a
real PTY shell via `start_terminal` from a worker thread, waits for
the worker to join, asserts the shell is still alive after a half-second
grace window. This is the contract the original tests never exercised.
- Update `test_terminal_module_registers_graceful_shutdown_reaper` to
refuse re-introduction of the preexec_fn or the `libc.prctl(1, SIGTERM)`
call (treats either as a regression).
All 27 terminal-related tests pass locally.
Refs #2853
B1: fix stored XSS in MCP delete button — replace inline onclick with
data-mcp-name attribute + event delegation (panels.js)
B2: fix zip/tar-slip via startswith prefix collision — use
is_relative_to(); track actual extracted bytes instead of trusting
member.file_size (upload.py)
B3: add NVIDIA NIM endpoint to _OPENAI_COMPAT_ENDPOINTS and
_SUPPORTED_PROVIDER_SETUPS so provider is reachable (routes.py,
onboarding.py)
H1: add terminalResizeHandle element to index.html and return it from
_terminalEls() so resize-by-drag works (index.html, terminal.js)
H2: fix dead get_terminal() branch — return None for dead terminals
instead of always returning term (terminal.py)
H3: replace os.environ.copy() with a safe allowlist in PTY shell env
so API keys are not exposed inside the terminal (terminal.py)
H5: make model dedup deterministic — sort groups by provider_id
alphabetically before first-occurrence assignment (config.py)
H7: add pid regex validation before OAuth probe; constrain key_source
to a closed set of safe values (providers.py)
M8: add double-run guard for cron run-now — reject if job is already
tracked as running (routes.py)