# Timeout Regression Diagnostic Report
> Date: 2026-04-19
> Plan: `docs/superpowers/plans/2026-04-19-timeout-regression-diagnostic-plan.md`
> Follow-up input: `tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json`
## Scope
This diagnostic covered only the three timeout records from the structured fail-closed improvement follow-up sweep.
No analyzer or generator logic was changed.
No execution board state was updated.
## Frozen Timeout Inputs
| Scene id | Scene | Type | Previous reconciled status | Follow-up status |
| --- | --- | --- | --- | --- |
| `sweep-015-scene` | `????` | persistent timeout | `source-unreadable` | `source-unreadable` |
| `sweep-025-scene` | `??????????` | persistent timeout | `source-unreadable` | `source-unreadable` |
| `sweep-040-scene` | `?????` | regression timeout | `fail-closed-known` | `source-unreadable` |
## Source Diagnostics
| Scene id | File count | HTML | JS | Total bytes |
| --- | ---: | ---: | ---: | ---: |
| `sweep-015-scene` | 93 | 10 | 21 | 96,922,420 |
| `sweep-025-scene` | 137 | 51 | 38 | 11,274,750 |
| `sweep-040-scene` | 50 | 2 | 21 | 5,037,507 |
Interpretation:
1. `sweep-015-scene` is the largest source set by total bytes and contains many zip artifacts.
2. `sweep-025-scene` is not the largest by bytes, but it has the highest combined HTML and JavaScript file count.
3. `sweep-040-scene` has a materially smaller source set than either persistent timeout record (5,037,507 bytes, less than half of `sweep-025-scene`), so its regression does not look like a pure source-scale problem.
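
The columns in the source diagnostics table can be reproduced with a short directory walk. The following is a minimal sketch, not the tool used for this report: `summarize_source` is a hypothetical helper, and classifying files by extension (`.html`/`.htm`, `.js`) is an assumption about how the HTML and JS counts were derived.

```python
from pathlib import Path


def summarize_source(root):
    """Count files, HTML files, JS files, and total bytes under root.

    Hypothetical helper: files are classified by extension only, which is
    an assumption about how the report's counts were produced.
    """
    summary = {"file_count": 0, "html": 0, "js": 0, "total_bytes": 0}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue  # skip directories
        summary["file_count"] += 1
        summary["total_bytes"] += path.stat().st_size
        suffix = path.suffix.lower()
        if suffix in {".html", ".htm"}:
            summary["html"] += 1
        elif suffix == ".js":
            summary["js"] += 1
    return summary
```

Comparing `html + js` against `file_count`, alongside `total_bytes`, gives the two scale signals used in the interpretation above.
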
## Diagnostic Rerun
A bounded diagnostic rerun was executed for each timeout record with a `90s` timeout budget.
| Scene id | Elapsed seconds | Exit code | Timed out | Generation report | Result |
| --- | ---: | ---: | --- | --- | --- |
| `sweep-015-scene` | 74.76 | 0 | `false` | present | readiness `A` |
| `sweep-025-scene` | 49.03 | 0 | `false` | present | readiness `A` |
| `sweep-040-scene` | 45.91 | 1 | `false` | present | fail-closed, readiness `C` |
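
A bounded rerun of this kind can be sketched with the standard library. This is an illustration under assumptions, not the harness that produced the table: `run_bounded` and its return keys are hypothetical names, and the command to run for a scene is left to the caller because the report does not state it.

```python
import subprocess
import time


def run_bounded(cmd, budget_seconds=90.0):
    """Run cmd with a wall-clock budget and record the rerun outcome.

    Hypothetical helper: the return keys mirror the table columns
    (elapsed seconds, exit code, timed out) but are not the report's schema.
    """
    start = time.monotonic()
    try:
        completed = subprocess.run(cmd, timeout=budget_seconds, capture_output=True)
        exit_code, timed_out = completed.returncode, False
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so no exit code is recorded.
        exit_code, timed_out = None, True
    elapsed = round(time.monotonic() - start, 2)
    return {"elapsed_seconds": elapsed, "exit_code": exit_code, "timed_out": timed_out}
```

Keeping `timed_out` separate from `exit_code` matters here: `sweep-040-scene` exited with code `1` inside the budget, which is a different outcome from exhausting the budget.
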
## Final Diagnostic Labels
| Scene id | Final label | Secondary labels |
| --- | --- | --- |
| `sweep-015-scene` | `timeout-rerun-pass` | `large-total-source`, `zip-heavy-source` |
| `sweep-025-scene` | `timeout-rerun-pass` | `source-scan-heavy`, `high-html-js-count` |
| `sweep-040-scene` | `timeout-rerun-fail-closed` | `regression-timeout`, `budget-sensitive-timeout` |
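
The final labels follow from the rerun outcomes. A minimal sketch of that mapping, assuming a hypothetical `final_label` function: only `timeout-rerun-pass` and `timeout-rerun-fail-closed` appear in this report, and the other two return values are placeholders invented for completeness.

```python
def final_label(timed_out, exit_code, fail_closed):
    """Map one rerun outcome to a final diagnostic label (illustrative only)."""
    if timed_out:
        return "timeout-rerun-timeout"  # hypothetical label, not used in this report
    if fail_closed:
        return "timeout-rerun-fail-closed"
    if exit_code == 0:
        return "timeout-rerun-pass"
    return "timeout-rerun-error"  # hypothetical label, not used in this report


# Rerun outcomes copied from the Diagnostic Rerun table.
RERUNS = {
    "sweep-015-scene": {"timed_out": False, "exit_code": 0, "fail_closed": False},
    "sweep-025-scene": {"timed_out": False, "exit_code": 0, "fail_closed": False},
    "sweep-040-scene": {"timed_out": False, "exit_code": 1, "fail_closed": True},
}

LABELS = {scene_id: final_label(**outcome) for scene_id, outcome in RERUNS.items()}
```

The secondary labels are not derived here, since they depend on the source diagnostics and the earlier sweep history rather than on the rerun alone.
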
## Conclusions
1. The two persistent timeout records are not hard failures. Under a bounded `90s` diagnostic rerun, both completed with exit code `0` and readiness `A`, in 74.76 and 49.03 seconds respectively.
2. `sweep-040-scene` is the only real regression timeout. Under the same `90s` diagnostic rerun, it resolved into a structured fail-closed result instead of timing out.
3. The current timeout bucket is therefore mixed:
   - two records are budget-sensitive successful runs
   - one record is a budget-sensitive regression that should be treated as a structured fail-closed case after rerun
4. A timeout implementation change should not come first. The higher-value next step is rerun hygiene and a timeout-budget policy, so that scenes like `sweep-040-scene` are not miscounted as `source-unreadable` when they can resolve into a concrete fail-closed result.
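
One possible shape for such a rerun policy is sketched below. It is an assumption-laden illustration, not a proposed implementation: `resolve_timeout_status` is a hypothetical name, `rerun` is a hypothetical callable supplied by the caller, and the only status strings taken from this report are `source-unreadable` and `fail-closed-known`.

```python
def resolve_timeout_status(scene_id, rerun, extended_budget_seconds=90.0):
    """Rerun a timed-out scene once before recording it as unreadable.

    rerun is a hypothetical callable (scene_id, budget_seconds) -> dict with
    the keys "timed_out" (bool) and "status" (str).
    """
    result = rerun(scene_id, extended_budget_seconds)
    if result["timed_out"]:
        # Still over budget after the extended rerun: keep the timeout status.
        return "source-unreadable"
    # The scene resolved within the extended budget: record the concrete result.
    return result["status"]
```

Under this sketch, a scene that fails closed on rerun is recorded with its fail-closed status, and only scenes that exhaust the extended budget stay in the `source-unreadable` bucket.
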
## Boundaries Preserved
This diagnostic did not:
1. change analyzer or generator code
2. update `scene_execution_board_2026-04-18.json`
3. promote any scene
4. rerun the full `102` sweep
5. start an implementation correction plan