mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-05-21 03:39:54 +00:00
91eef6255e
Kimi-k2.6 (which supports 262K context) was incorrectly resolved as 32K,
tripping the 64K minimum-context guard and preventing use of the model on
Ollama Cloud and Kimi Coding / Moonshot providers.
Three fixes in the context-length resolution chain:
1. Ollama Cloud native /api/show query: new _query_ollama_api_show()
queries the Ollama native API for authoritative GGUF model_info
context_length. For hosted Ollama, prefers model_info over num_ctx
since users can't set their own num_ctx on Cloud. Added at step 5e
in get_model_context_length(), before the models.dev fallback.
2. models.dev :cloud/-cloud suffix fallback: lookup_models_dev_context()
now also tries appending :cloud and -cloud suffixes when the bare
model name doesn't match. models.dev stores 'kimi-k2.6:cloud' but
users and the live API use bare 'kimi-k2.6'.
3. Kimi-family 32K guard: after the OpenRouter metadata step, reject
exactly 32768 for Kimi-named models (kimi-*, moonshot*) and fall
through to hardcoded defaults ('kimi': 262144). OpenRouter reports
32768 for moonshotai/kimi-k2.6 but the model actually supports 262K.
Narrow filter — only 32768, only Kimi-family — becomes dead code
when OpenRouter updates its metadata.
---