Timeout Regression Diagnostic Report

Date: 2026-04-19
Plan: docs/superpowers/plans/2026-04-19-timeout-regression-diagnostic-plan.md
Follow-up input: tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json

Scope

This diagnostic handled only the three timeout records from the structured fail-closed improvement follow-up sweep.

No analyzer or generator logic was changed.

No execution board state was updated.

Frozen Timeout Inputs

| Scene id | Scene | Type | Previous reconciled status | Follow-up status |
| --- | --- | --- | --- | --- |
| sweep-015-scene | ???? | persistent timeout | source-unreadable | source-unreadable |
| sweep-025-scene | ?????????? | persistent timeout | source-unreadable | source-unreadable |
| sweep-040-scene | ????? | regression timeout | fail-closed-known | source-unreadable |

Source Diagnostics

| Scene id | File count | HTML files | JS files | Total bytes |
| --- | --- | --- | --- | --- |
| sweep-015-scene | 93 | 10 | 21 | 96,922,420 |
| sweep-025-scene | 137 | 51 | 38 | 11,274,750 |
| sweep-040-scene | 50 | 2 | 21 | 5,037,507 |

Interpretation:

  1. sweep-015-scene is the largest source set by total bytes and contains many zip artifacts.
  2. sweep-025-scene is not the largest by bytes, but it has the highest combined HTML and JavaScript file count.
  3. sweep-040-scene is materially smaller than the two persistent timeout records, so its regression does not look like a pure source-scale problem.
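The three observations can be checked directly against the Source Diagnostics figures. The sketch below only re-derives them from the tabulated numbers; the dictionary layout and variable names are illustrative and are not part of the diagnostic tooling.

```python
# Figures copied from the Source Diagnostics table in this report.
sources = {
    "sweep-015-scene": {"files": 93, "html": 10, "js": 21, "bytes": 96_922_420},
    "sweep-025-scene": {"files": 137, "html": 51, "js": 38, "bytes": 11_274_750},
    "sweep-040-scene": {"files": 50, "html": 2, "js": 21, "bytes": 5_037_507},
}

# Observation 1: largest source set by total bytes.
largest_by_bytes = max(sources, key=lambda s: sources[s]["bytes"])

# Observation 2: highest combined HTML + JavaScript file count.
html_js = {scene: row["html"] + row["js"] for scene, row in sources.items()}
most_html_js = max(html_js, key=html_js.get)

# Observation 3: the regression record is the smallest by total bytes.
smallest_by_bytes = min(sources, key=lambda s: sources[s]["bytes"])

print(largest_by_bytes)   # sweep-015-scene
print(html_js)            # {'sweep-015-scene': 31, 'sweep-025-scene': 89, 'sweep-040-scene': 23}
print(smallest_by_bytes)  # sweep-040-scene
```

sweep-040-scene is the smallest record on every size measure in the table (bytes, file count, and combined HTML and JS count), which is consistent with the reading that its regression is not a pure source-scale problem.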

Diagnostic Rerun

A bounded diagnostic rerun was executed for each timeout record with a 90s timeout budget.

| Scene id | Elapsed seconds | Exit code | Timed out | Generation report | Result |
| --- | --- | --- | --- | --- | --- |
| sweep-015-scene | 74.76 | 0 | false | present | readiness A |
| sweep-025-scene | 49.03 | 0 | false | present | readiness A |
| sweep-040-scene | 45.91 | 1 | false | present | fail-closed, readiness C |
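The report does not show the command used for the rerun. As a minimal sketch of what a bounded rerun could look like, assuming the generator is started as a subprocess, the hypothetical helper `bounded_rerun` below enforces a wall-clock budget and records the elapsed time, exit code, and timed-out flag that appear in the table. The command passed in is a stand-in, not the real generator invocation.

```python
import subprocess
import sys
import time


def bounded_rerun(cmd, budget_seconds=90.0):
    """Run cmd under a wall-clock budget and record the outcome.

    Hypothetical helper: the real diagnostic harness is not shown in
    this report.
    """
    start = time.monotonic()
    try:
        completed = subprocess.run(
            cmd, capture_output=True, text=True, timeout=budget_seconds
        )
        exit_code, timed_out = completed.returncode, False
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        exit_code, timed_out = None, True
    elapsed = round(time.monotonic() - start, 2)
    return {
        "elapsed_seconds": elapsed,
        "exit_code": exit_code,
        "timed_out": timed_out,
    }


# Stand-in command that exits with code 1, like the sweep-040-scene rerun.
result = bounded_rerun([sys.executable, "-c", "raise SystemExit(1)"])
print(result["exit_code"], result["timed_out"])  # 1 False
```

Keeping the timed-out flag separate from the exit code is what lets a non-zero exit within budget (as for sweep-040-scene) be told apart from a genuine timeout.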

Final Diagnostic Labels

| Scene id | Final label | Secondary labels |
| --- | --- | --- |
| sweep-015-scene | timeout-rerun-pass | large-total-source, zip-heavy-source |
| sweep-025-scene | timeout-rerun-pass | source-scan-heavy, high-html-js-count |
| sweep-040-scene | timeout-rerun-fail-closed | regression-timeout, budget-sensitive-timeout |
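The labelling rule itself is not spelled out in this report. One rule that reproduces the three final labels from the rerun outcomes is sketched below. It is inferred from these three rows only, and the function name and the two fallback labels are hypothetical.

```python
def final_label(exit_code, timed_out, result):
    """Map a bounded-rerun outcome to a final diagnostic label.

    Assumed rule, inferred from the three records in this report.
    """
    if timed_out:
        # Hypothetical label: no record in this report hit this branch.
        return "timeout-rerun-timeout"
    if exit_code == 0:
        return "timeout-rerun-pass"
    if "fail-closed" in result:
        return "timeout-rerun-fail-closed"
    # Hypothetical label for a non-zero exit without a fail-closed report.
    return "timeout-rerun-error"


# Outcomes copied from the Diagnostic Rerun section.
reruns = {
    "sweep-015-scene": (0, False, "readiness A"),
    "sweep-025-scene": (0, False, "readiness A"),
    "sweep-040-scene": (1, False, "fail-closed, readiness C"),
}
labels = {scene: final_label(*outcome) for scene, outcome in reruns.items()}
print(labels["sweep-040-scene"])  # timeout-rerun-fail-closed
```

The secondary labels (for example zip-heavy-source) depend on source characteristics rather than on the rerun outcome, so they are left out of this sketch.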

Conclusions

  1. The two persistent timeout records are not hard failures. Under a bounded 90s diagnostic rerun, both completed successfully.
  2. sweep-040-scene is the only real regression timeout. Under the same 90s diagnostic rerun, it resolved into a structured fail-closed result instead of timing out.
  3. The current timeout bucket is therefore mixed:
    • two records are budget-sensitive successful runs
    • one record is a budget-sensitive regression that should really be treated as a structured fail-closed case after rerun
  4. Timeout implementation should not be the first next step. Rerun hygiene and timeout-budget policy offer higher value, so that scenes like sweep-040-scene are not miscounted as unreadable when they can resolve into a concrete fail-closed result.
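To make the rerun-hygiene idea concrete, here is a minimal sketch of one possible policy. It assumes a timed-out record keeps its sweep status only when a bounded rerun is missing or also times out, and otherwise takes the rerun result. The function `reconcile_timeout` and the record layout are hypothetical. The sketch also computes how much of the 90s budget each rerun left unused.

```python
BUDGET_SECONDS = 90.0  # rerun budget stated in this report


def reconcile_timeout(sweep_status, rerun):
    """Return the status to count for a record that timed out in the sweep.

    Hypothetical policy sketch. rerun is None when no bounded rerun was
    performed, otherwise a dict with "timed_out" and "result" keys.
    """
    if rerun is None or rerun["timed_out"]:
        return sweep_status  # nothing better is known
    return rerun["result"]


# Rerun outcomes copied from the Diagnostic Rerun section.
reruns = {
    "sweep-015-scene": {"elapsed": 74.76, "timed_out": False, "result": "readiness A"},
    "sweep-025-scene": {"elapsed": 49.03, "timed_out": False, "result": "readiness A"},
    "sweep-040-scene": {"elapsed": 45.91, "timed_out": False, "result": "fail-closed, readiness C"},
}

for scene, rerun in reruns.items():
    status = reconcile_timeout("source-unreadable", rerun)
    headroom = round(BUDGET_SECONDS - rerun["elapsed"], 2)
    print(scene, status, headroom)
```

Under this assumed policy none of the three records would be counted as source-unreadable. sweep-015-scene finishes with roughly 15 seconds of headroom, the least of the three, so it is the record most exposed to any tighter budget.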

Boundaries Preserved

This diagnostic did not:

  1. change analyzer or generator code
  2. update scene_execution_board_2026-04-18.json
  3. promote any scene
  4. rerun the full 102 sweep
  5. start an implementation correction plan