mirror of
https://github.com/nesquena/hermes-webui.git
synced 2026-05-25 19:20:16 +00:00
71ba863ce5
Fixes #2853. The `_terminal_shell_preexec_fn` added in `71d8a8fb` called `prctl(PR_SET_PDEATHSIG, SIGTERM)` so orphaned PTY shells would die when the WebUI process crashed. But that signal is **per-thread**, not per-process, and WebUI runs `ThreadingHTTPServer`: every HTTP request is handled in its own short-lived worker thread. Flow that broke every Linux user: 1. User clicks the terminal toggle → frontend hits `POST /api/terminal/start`. 2. ThreadingHTTPServer spins up a worker thread to handle that one request. 3. The worker thread calls `subprocess.Popen(..., preexec_fn=...)`. 4. The shell calls `prctl(PR_SET_PDEATHSIG, SIGTERM)` in its preexec_fn. Its registered "parent" is now the WebUI worker thread that called Popen. 5. The handler returns its JSON response and the worker thread exits. 6. The kernel sees the pdeathsig-parent thread has died and sends SIGTERM to the PTY shell. The shell dies within ~10 ms of being created. 7. The reader loop sees EIO on the master FD, emits `terminal_closed`, and the frontend writes `[terminal closed]`. macOS users were unaffected because `libc.prctl` doesn't exist there — `ctypes.CDLL(None)` returns a libc handle, `libc.prctl` raises `AttributeError`, the bare-`except` swallows it, and the shell starts with no pdeathsig configured. Empirical verification on this Linux host (real PTY + `subprocess.Popen` inside a `threading.Thread` that joins immediately): with preexec_fn → proc.poll() == -15 (SIGTERM), master FD returns EIO without preexec_fn → proc.poll() == None (alive), master FD returns "HELLO\\r\\n" Same shell, same PTY, same threading topology as WebUI. Fix --- Drop the `preexec_fn` entirely. The orphan-shell-on-crash case the original PR was navigating is rare for self-hosted single-user installs, and the existing `atexit.register(close_all_terminals)` + explicit `close_terminal` paths cover graceful shutdown. A future fix (option B in the issue) can re-introduce pdeathsig pinned to a long-lived supervisor thread, but that is a follow-up — this PR is the smallest unbricks-Linux-today change. Tests ----- - Invert `test_terminal_shell_uses_parent_death_signal_preexec` → `test_terminal_shell_does_not_use_pdeathsig_preexec`: asserts `preexec_fn` is NOT in the Popen kwargs. - Add `test_pty_shell_survives_when_spawning_thread_exits`: spawns a real PTY shell via `start_terminal` from a worker thread, waits for the worker to join, asserts the shell is still alive after a half-second grace window. This is the contract the original tests never exercised. - Update `test_terminal_module_registers_graceful_shutdown_reaper` to refuse re-introduction of the preexec_fn or the `libc.prctl(1, SIGTERM)` call (treats either as a regression). All 27 terminal-related tests pass locally. Refs #2853