Timeout Regression Diagnostic Design
- Date: 2026-04-19
- Status: Draft
- Upstream Plan: docs/superpowers/plans/2026-04-19-structured-fail-closed-improvement-roadmap-plan.md
- Upstream Follow-up: tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json
Intent
Diagnose the three timeout records visible after the structured fail-closed improvement follow-up sweep.
This design is diagnostic-only. It does not change analyzer or generator logic, promote scenes, update the execution board, or treat a longer rerun success as a validated pass.
Problem Statement
The structured fail-closed improvement follow-up sweep produced:
| Status | Count |
|---|---|
| auto-pass | 48 |
| fail-closed-known | 47 |
| adjudicated-valid-host-bridge | 4 |
| source-unreadable | 3 |
The three source-unreadable records are timeout records:
| Scene id | Scene | Timeout type |
|---|---|---|
| sweep-015-scene | 任务报表 (Task Report) | persistent timeout |
| sweep-025-scene | 力禾动环系统巡视记录 (Lihe power and environment system inspection record) | persistent timeout |
| sweep-040-scene | 嘉峪关日报 (Jiayuguan Daily Report) | new regression timeout |
sweep-040-scene is the most important record because it regressed from fail-closed-known in the reconciled baseline to source-unreadable in the follow-up sweep.
Scope
In scope:
- identify the three timeout records from the follow-up sweep
- collect source directory diagnostics
- run bounded diagnostic reruns with longer timeout budgets
- classify each timeout into a secondary timeout reason
- publish diagnostic JSON and report
Out of scope:
- analyzer or generator implementation changes
- readiness gate changes
- execution board updates
- scene promotion
- family baseline changes
- handling the remaining 47 structured fail-closed records
- handling the 4 adjudicated host-bridge records
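The bounded diagnostic reruns listed in scope could be sketched roughly as follows. This is a minimal illustration, not the project's actual runner: the function name `bounded_rerun`, the returned key names, and the 2000-character stderr tail are all assumptions made for this example.

```python
import subprocess
import time


def bounded_rerun(command, timeout_seconds):
    """Run a command under a hard time budget and record rerun evidence.

    Hypothetical helper: the key names below are illustrative only.
    """
    started = time.monotonic()
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            timeout=timeout_seconds,
            check=False,  # a non-zero exit is evidence, not an exception
        )
        exit_code, timed_out = completed.returncode, False
        stderr = completed.stderr
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child before raising, so no process leaks.
        exit_code, timed_out = None, True
        stderr = exc.stderr or b""
    elapsed = time.monotonic() - started
    return {
        "exit_code": exit_code,
        "elapsed_seconds": round(elapsed, 3),
        "timed_out": timed_out,
        # errors="replace" keeps undecodable stderr bytes from raising.
        "stderr_tail": stderr.decode("utf-8", errors="replace")[-2000:],
    }
```

Keeping the timeout inside the helper means a rerun that finishes under a longer budget is still recorded only as evidence (exit code, elapsed time, timed-out flag), in line with the intent that a longer rerun success is not a validated pass.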
Diagnostic Labels
Each timeout must receive exactly one final diagnostic label:
- timeout-rerun-pass
- timeout-rerun-fail-closed
- timeout-large-source
- timeout-command-hang
- timeout-nondeterministic
- timeout-source-scan-heavy
- timeout-unknown
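One way to enforce the "exactly one final label" rule is a closed enumeration plus a small validator. The sketch below assumes a hypothetical record field named `final_diagnostic_label`; the design does not fix a field name, only the seven label values.

```python
from enum import Enum


class FinalLabel(str, Enum):
    """The seven final diagnostic labels named in this design."""

    RERUN_PASS = "timeout-rerun-pass"
    RERUN_FAIL_CLOSED = "timeout-rerun-fail-closed"
    LARGE_SOURCE = "timeout-large-source"
    COMMAND_HANG = "timeout-command-hang"
    NONDETERMINISTIC = "timeout-nondeterministic"
    SOURCE_SCAN_HEAVY = "timeout-source-scan-heavy"
    UNKNOWN = "timeout-unknown"


def validate_final_label(record):
    """Return the record's single final label, or raise ValueError.

    `final_diagnostic_label` is an assumed field name for illustration.
    """
    value = record.get("final_diagnostic_label")
    if not isinstance(value, str):
        # Rejects a missing label and a list of several labels alike.
        raise ValueError(f"expected exactly one label string, got {value!r}")
    return FinalLabel(value)  # raises ValueError for an unknown label
```

Storing the label as a single string rather than a list makes "more than one final label" unrepresentable in a valid record.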
Secondary labels may be attached for:
- large file count
- large total size
- many HTML or JS files
- generated report present after rerun
- stderr decode noise
- elapsed time near budget
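The secondary labels above could be derived mechanically from the collected evidence. The following is only a sketch: every threshold, every label string, and every evidence key is a hypothetical choice made for this example, since the design names the conditions but not their cutoffs or spellings.

```python
# Hypothetical thresholds; the design does not specify any.
LARGE_FILE_COUNT = 5_000
LARGE_TOTAL_BYTES = 500 * 1024 * 1024
MANY_HTML_JS_FILES = 1_000
NEAR_BUDGET_RATIO = 0.9


def secondary_labels(evidence, budget_seconds):
    """Return the secondary labels that apply to one timeout record.

    `evidence` is a plain dict with assumed key names (illustrative only).
    """
    labels = []
    if evidence["file_count"] >= LARGE_FILE_COUNT:
        labels.append("large-file-count")
    if evidence["total_source_bytes"] >= LARGE_TOTAL_BYTES:
        labels.append("large-total-size")
    if evidence["html_file_count"] + evidence["js_file_count"] >= MANY_HTML_JS_FILES:
        labels.append("many-html-or-js-files")
    if evidence.get("generation_report_path"):
        labels.append("generated-report-present-after-rerun")
    # U+FFFD appears when stderr bytes were decoded with errors="replace".
    if "\ufffd" in evidence.get("stderr_text", ""):
        labels.append("stderr-decode-noise")
    if evidence["elapsed_seconds"] >= NEAR_BUDGET_RATIO * budget_seconds:
        labels.append("elapsed-time-near-budget")
    return labels
```

Unlike the final label, several secondary labels may apply to the same record, so the function returns a list.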
Required Evidence
For each timeout record, collect:
- scene id
- scene name
- source directory
- previous reconciled status
- follow-up status
- file count
- total source bytes
- HTML file count
- JavaScript file count
- largest files
- diagnostic rerun exit code
- diagnostic rerun elapsed seconds
- diagnostic rerun timed out flag
- generation report path if produced
- generation status if produced
- final diagnostic label
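The source directory portion of this evidence (file count, total bytes, HTML and JavaScript counts, largest files) can be gathered with a single directory walk. The sketch below is an assumption-laden illustration: the function name, the returned key names, the default of five largest files, and the choice of `.html`/`.htm` and `.js`/`.mjs` as the counted extensions are not specified by this design.

```python
import os
from pathlib import Path

HTML_SUFFIXES = {".html", ".htm"}  # assumed definition of "HTML file"
JS_SUFFIXES = {".js", ".mjs"}      # assumed definition of "JavaScript file"


def collect_source_stats(source_dir, top_n=5):
    """Walk a source directory once and summarize its files."""
    files = []
    for root, _dirs, names in os.walk(source_dir):
        for name in names:
            path = Path(root) / name
            try:
                size = path.stat().st_size
            except OSError:
                continue  # unreadable or vanished file: skip it
            files.append((size, path))

    largest = sorted(files, key=lambda item: item[0], reverse=True)[:top_n]
    return {
        "file_count": len(files),
        "total_source_bytes": sum(size for size, _ in files),
        "html_file_count": sum(1 for _, p in files if p.suffix.lower() in HTML_SUFFIXES),
        "js_file_count": sum(1 for _, p in files if p.suffix.lower() in JS_SUFFIXES),
        "largest_files": [{"path": str(p), "bytes": size} for size, p in largest],
    }
```

Because the walk only reads file metadata, it stays diagnostic-only and does not touch analyzer or generator state.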
Output
Diagnostic output:
tests/fixtures/generated_scene/timeout_regression_diagnostic_2026-04-19.json
Report output:
docs/superpowers/reports/2026-04-19-timeout-regression-diagnostic-report.md
Success Criteria
- exactly three timeout records are diagnosed
- sweep-040-scene is explicitly marked as the regression timeout
- the two persistent timeout records remain distinguishable from the regression timeout
- each record has one final diagnostic label
- no implementation changes are made
- no execution board state is updated
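The record-level criteria above lend themselves to an automated check over the diagnostic JSON. The sketch below assumes hypothetical field names (`scene_id`, `timeout_type`, `final_diagnostic_label`) and hypothetical `timeout_type` values (`"persistent"`, `"regression"`), since the design does not define the JSON schema. It does not cover the last two criteria, which would need a repository-level check.

```python
ALLOWED_FINAL_LABELS = {
    "timeout-rerun-pass",
    "timeout-rerun-fail-closed",
    "timeout-large-source",
    "timeout-command-hang",
    "timeout-nondeterministic",
    "timeout-source-scan-heavy",
    "timeout-unknown",
}


def check_success_criteria(records):
    """Return a list of problems; an empty list means the checks passed.

    Field names and timeout_type values are assumptions for illustration.
    """
    problems = []
    if len(records) != 3:
        problems.append(f"expected 3 timeout records, found {len(records)}")

    by_id = {r.get("scene_id"): r for r in records}
    regression = by_id.get("sweep-040-scene")
    if regression is None or regression.get("timeout_type") != "regression":
        problems.append("sweep-040-scene is not marked as the regression timeout")

    persistent = [r for r in records if r.get("timeout_type") == "persistent"]
    if len(persistent) != 2:
        problems.append(f"expected 2 persistent timeouts, found {len(persistent)}")

    for r in records:
        if r.get("final_diagnostic_label") not in ALLOWED_FINAL_LABELS:
            problems.append(f"{r.get('scene_id')}: missing or unknown final label")
    return problems
```

Returning a list of problems rather than raising on the first failure lets the report show every unmet criterion in one pass.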