Files
hermes-agent/tests/hermes_cli/test_openai_codex_model_validation_fallback.py
kshitijk4poor 44cdf555a8 fix(codex-spark): defensive 128k entry in DEFAULT_CONTEXT_LENGTHS + clarify validation test docstring
Two follow-ups from self-review:

1. Add gpt-5.3-codex-spark to DEFAULT_CONTEXT_LENGTHS at 128k. The
   primary resolution path for Spark goes through provider='openai-codex'
   → _CODEX_OAUTH_CONTEXT_FALLBACK (already correct). But if any future
   code path resolves Spark's context with a different provider (custom
   proxy, generic fallthrough), the longest-substring-first lookup in
   step 8 would match 'gpt-5' and report 400k, which is wrong by ~3x.
   Adding the explicit override is a cheap defensive correctness fix
   matching how gpt-5.4-mini and gpt-5.4-nano already shadow the generic
   gpt-5 entry.

2. Update test_openai_codex_model_validation_fallback.py docstring. The
   bug it was originally written for (gpt-5.3-codex-spark missing from
   listing) is now resolved by this PR's catalog restoration. The test
   still validly exercises the soft-accept code path for any future
   entitlement-gated Codex slug that ships before Hermes catalogs it,
   but the framing was stale — clarified.
2026-05-09 23:17:25 -07:00

65 lines
2.5 KiB
Python

"""Regression tests for OpenAI Codex model validation when the listing lags behind
actually usable backend model IDs.
The bug originally reported in #16172: `/model` and `switch_model()` rejected
`gpt-5.3-codex-spark` because the curated listing omitted it, even though direct
runtime calls succeeded. PR #19729 fixed this by soft-accepting unknown-but-
plausible Codex slugs with a warning, and this test pins the soft-accept
behavior so it doesn't regress.
Note: gpt-5.3-codex-spark itself is now in the curated catalog (PR #22991),
so the real-world Spark request takes the `recognized=True` fast path. This
test still uses Spark as the example slug but explicitly mocks
``provider_model_ids`` to omit it, exercising the soft-accept path generically
for any future entitlement-gated Codex slug that ships before Hermes catalogs
it.
"""
from unittest.mock import patch
from hermes_cli.model_switch import switch_model
from hermes_cli.models import validate_requested_model
def test_openai_codex_unknown_but_plausible_model_is_accepted_with_warning():
"""If the Codex listing is incomplete, `/model` should soft-accept the model
with a warning instead of hard-rejecting it.
"""
with patch(
"hermes_cli.models.provider_model_ids",
return_value=["gpt-5.5", "gpt-5.4", "gpt-5.3-codex"],
):
result = validate_requested_model("gpt-5.3-codex-spark", "openai-codex")
assert result["accepted"] is True
assert result["persist"] is True
assert result["recognized"] is False
assert "gpt-5.3-codex-spark" in result["message"]
assert "OpenAI Codex model listing" in result["message"]
assert "Similar models" in result["message"]
assert "gpt-5.3-codex" in result["message"]
def test_switch_model_allows_openai_codex_model_missing_from_listing():
"""switch_model() should succeed for Codex models that the runtime accepts
even when the listing has not caught up yet.
"""
with patch(
"hermes_cli.models.provider_model_ids",
return_value=["gpt-5.5", "gpt-5.4", "gpt-5.3-codex"],
):
result = switch_model(
"gpt-5.3-codex-spark",
current_provider="openai-codex",
current_model="gpt-5.4",
current_base_url="",
current_api_key="",
user_providers=None,
)
assert result.success is True
assert result.new_model == "gpt-5.3-codex-spark"
assert result.target_provider == "openai-codex"
assert result.warning_message
assert "OpenAI Codex model listing" in result.warning_message