mirror of
https://github.com/nesquena/hermes-webui.git
synced 2026-05-25 03:00:23 +00:00
1a7eaf518f
v0.50.284 shipped startup self-heal in api/session_recovery.py that
crashed on the very first JSON file it scanned in the production
session directory. Verified live on the prod server immediately after
the v0.50.284 deploy:
[recovery] startup recovery failed: 'list' object has no attribute 'get'
Root cause: the production session dir contains _index.json — a
top-level LIST of session metadata dicts (not a dict). _msg_count()
did data.get('messages') which raises AttributeError on a list.
The broad except Exception in server.py's startup hook swallowed the
error and the recovery silently no-op'd for every user — defeating
the entire purpose of the v0.50.284 release.
Fix is three small defensive changes:
1. _msg_count() — added isinstance(data, dict) guard. Non-dict-shaped
JSON files now return -1 (the harmless 'unknown count' sentinel)
instead of raising AttributeError.
2. recover_all_sessions_on_startup() — skips any file whose name starts
with '_' (the existing project convention for non-session metadata
files like _index.json). These are convention-marked as system
files, not session payloads.
3. recover_all_sessions_on_startup() — wraps recover_session(path) in
try/except Exception so a single malformed file can't break recovery
for the rest. Logs and continues.
2 new regression tests:
- test_recover_all_sessions_on_startup_skips_non_session_index_json
- test_msg_count_returns_neg1_for_non_dict_top_level
4026 → 4028 tests passing (+2).
Net effect: any user wiped between v0.50.279 and v0.50.284 deploys
whose session has a .bak shadow will now get auto-recovered on first
launch of v0.50.285, as v0.50.284's release notes promised.
Closes #1558 (follow-up — the original P0 was closed by v0.50.284 but
the recovery half didn't actually run in production).