mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-21 03:39:54 +00:00
55d6a1636b
Closes #25249 (and supersedes PR #25260) in spirit.

Two bugs in the streaming chat-completions path caused provider timeout
configuration to be silently ignored:

1. Hardcoded connect/pool timeout. The httpx.Timeout for streaming calls
   used hardcoded connect=30.0 and pool=30.0 regardless of the user's
   providers.<id>.request_timeout_seconds config. If the custom provider
   (e.g. Ollama) was unreachable, the call always waited exactly 30s
   before failing, ignoring any configured timeout.

   Fix: use min(_base_timeout, 60.0) for connect and pool when a provider
   timeout is configured, falling back to 30.0 otherwise. The 60s cap
   addresses review feedback: the TCP handshake shouldn't wait as long as
   the inference timeout, because connect/pool cover the connection
   layer, not model latency.

2. Streaming stale-stream detector ignored provider config. The stale
   detector read only HERMES_STREAM_STALE_TIMEOUT (env default 180s). The
   providers.<id>.stale_timeout_seconds key (correctly used in the
   non-streaming path) was never consulted.

   Fix: check get_provider_stale_timeout(provider, model) first, then
   fall back to the env var. This aligns the streaming path with the
   non-streaming path's priority chain (config > env > default).

Salvage shape diverged from PR #25260: the function moved to
agent/chat_completion_helpers.py, and the contributor's two commits
(initial fix + 60s-cap review follow-up) are squashed into one final
commit applied at the new location.

Original diagnosis, fix shape, AND the 60s-cap review response are from
@zccyman in PR #25260; credited via Co-authored-by.

Co-authored-by: zccyman <16263913+zccyman@users.noreply.github.com>
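
Illustrative sketch (not the actual diff): the two resolution rules described above can be written as small pure functions. The function names, constant names, and the way the provider values are passed in are hypothetical. Only the numbers (30.0, 60.0, 180s), the HERMES_STREAM_STALE_TIMEOUT variable, and the config > env > default order come from this message.

```python
import os
from typing import Mapping, Optional

# Values taken from the commit message; the constant names are hypothetical.
DEFAULT_CONNECT_POOL_TIMEOUT = 30.0  # used when no provider timeout is configured
CONNECT_POOL_CAP = 60.0              # connection layer never waits longer than this
DEFAULT_STALE_TIMEOUT = 180.0        # default for HERMES_STREAM_STALE_TIMEOUT


def connect_pool_timeout(base_timeout: Optional[float]) -> float:
    """Resolve the connect/pool timeout for a streaming call.

    base_timeout stands in for the provider's request_timeout_seconds,
    or None when the provider has no timeout configured.
    """
    if base_timeout is None:
        return DEFAULT_CONNECT_POOL_TIMEOUT
    # Honour a short configured timeout, but cap long inference timeouts.
    return min(base_timeout, CONNECT_POOL_CAP)


def stale_timeout(
    provider_stale_timeout: Optional[float],
    env: Mapping[str, str] = os.environ,
) -> float:
    """Resolve the stale-stream timeout: config > env > default.

    provider_stale_timeout stands in for the result of
    get_provider_stale_timeout(provider, model), assumed here to be None
    when the provider config has no stale_timeout_seconds key.
    """
    if provider_stale_timeout is not None:
        return provider_stale_timeout
    return float(env.get("HERMES_STREAM_STALE_TIMEOUT", DEFAULT_STALE_TIMEOUT))
```

Under these assumptions, a provider configured with a 5s request timeout gives up on an unreachable host after 5s instead of 30s, while a 600s inference timeout still yields a 60s connect/pool timeout. The resolved value would then be passed as the connect= and pool= arguments of httpx.Timeout.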