mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-21 03:39:54 +00:00
0301787653
Two changes:

1. _PROVIDER_VISION_MODELS: add a 'nous' -> 'xiaomi/mimo-v2-omni' entry so the vision auto-detect chain picks the correct multimodal model.
2. resolve_provider_client: detect when the requested model is a vision model (from _PROVIDER_VISION_MODELS or known vision model names) and pass vision=True to _try_nous().

Previously, resolve_provider_client() always called _try_nous() without vision=True, so the call returned the default text model (gemini-3-flash-preview or mimo-v2-pro) instead of the vision-capable mimo-v2-omni. The _try_nous() function already handled free-tier vision correctly, but the resolve_provider_client() path (used by the auto-detect vision chain) never signaled that a vision task was in progress.

Verified: xiaomi/mimo-v2-omni returns HTTP 200 with image inputs on the Nous inference API. google/gemini-3-flash-preview returns 404 with images.