mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-21 03:39:54 +00:00
45d860d424
The prior implementation routed download_to_file through the shared _request() path, which uses httpx.AsyncClient.request() inside a context manager that closes before aiter_bytes() iterates. The body was read into memory first and the chunked write loop replayed it from buffer. On small test payloads this was invisible; on real Teams meeting recordings (hundreds of MB) it would force the full artifact into RAM per download. Rewrites download_to_file to open its own AsyncClient and use client.stream(), keeping the context open across the aiter_bytes iteration so the body is actually streamed chunk-by-chunk to disk. Retry/token-refresh/Retry-After semantics are preserved by handling them inline on the stream path. Partial .part files are cleaned up on transport errors and on exhausted retries. Adds three tests: large-payload streaming verifies the chunk loop runs multiple times (discriminator: 512 KiB at chunk_size=65536 yields 8 chunks under streaming, 1 under buffering), transient-5xx retry recovers after a single retry, and exhausted-retry cleans up the partial file.