claw/docs/superpowers/specs/2026-04-19-timeout-regression-diagnostic-design.md

Timeout Regression Diagnostic Design

Date: 2026-04-19
Status: Draft
Upstream Plan: docs/superpowers/plans/2026-04-19-structured-fail-closed-improvement-roadmap-plan.md
Upstream Follow-up: tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json

Intent

Diagnose the three timeout records visible after the structured fail-closed improvement follow-up sweep.

This design is diagnostic-only. It does not change analyzer or generator logic, promote scenes, or update the execution board. A rerun that succeeds only under a longer timeout budget is not treated as a validated pass.

Problem Statement

The structured fail-closed improvement follow-up sweep produced:

| Status | Count |
| --- | --- |
| auto-pass | 48 |
| fail-closed-known | 47 |
| adjudicated-valid-host-bridge | 4 |
| source-unreadable | 3 |

The three source-unreadable records are timeout records:

| Scene id | Scene | Timeout type |
| --- | --- | --- |
| sweep-015-scene | 任务报表 | persistent timeout |
| sweep-025-scene | 力禾动环系统巡视记录 | persistent timeout |
| sweep-040-scene | 嘉峪关日报 | new regression timeout |

sweep-040-scene is the most important record because it regressed from fail-closed-known in the reconciled baseline to source-unreadable in the follow-up sweep.
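
The split between persistent and regression timeouts can be sketched as a comparison of the two sweep statuses. This is a minimal Python sketch, not the project's implementation: the function name `timeout_type` is hypothetical, and it assumes that a persistent timeout is one that was already source-unreadable in the reconciled baseline, which this design does not state explicitly.

```python
from typing import Optional


def timeout_type(previous_status: str, followup_status: str) -> Optional[str]:
    """Return the timeout type for one record, or None if it is not a timeout record."""
    # In this sweep, every source-unreadable record is a timeout record.
    if followup_status != "source-unreadable":
        return None
    # Assumption: "persistent" means the baseline status was already source-unreadable.
    if previous_status == "source-unreadable":
        return "persistent timeout"
    # Any other baseline status means the timeout appeared in the follow-up sweep.
    return "new regression timeout"
```

Under that assumption, sweep-040-scene (fail-closed-known in the baseline, source-unreadable in the follow-up) is the only record that maps to "new regression timeout".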

Scope

In scope:

  1. identify the three timeout records from the follow-up sweep
  2. collect source directory diagnostics
  3. run bounded diagnostic reruns with longer timeout budgets
  4. classify each timeout under a final diagnostic label that gives the secondary reason for its source-unreadable status
  5. publish diagnostic JSON and report
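
A bounded diagnostic rerun (step 3) could look roughly like the following. This is a sketch under stated assumptions: the helper name `bounded_rerun`, the result keys, and the command passed to it are all hypothetical, since this design does not name the actual rerun command. Only Python's standard `subprocess.run` timeout behavior is relied on.

```python
import subprocess
import sys
import time


def bounded_rerun(command, budget_seconds):
    """Run one command under a hard time budget and record the rerun evidence."""
    started = time.monotonic()
    try:
        completed = subprocess.run(command, capture_output=True, timeout=budget_seconds)
        exit_code = completed.returncode
        timed_out = False
        stderr = completed.stderr
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child before raising, so no process is left behind.
        exit_code = None
        timed_out = True
        stderr = exc.stderr or b""
    elapsed = time.monotonic() - started
    return {
        "exit_code": exit_code,
        "elapsed_seconds": round(elapsed, 3),
        "timed_out": timed_out,
        # errors="replace" keeps undecodable stderr bytes from aborting the diagnostic.
        "stderr": stderr.decode("utf-8", errors="replace"),
    }


# Placeholder command; the real rerun command is not specified in this design.
demo_result = bounded_rerun([sys.executable, "-c", "print('ok')"], budget_seconds=30)
```

Recording `timed_out` separately from `exit_code` keeps a rerun that exhausted its budget distinguishable from one that exited with a failure.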

Out of scope:

  1. analyzer or generator implementation changes
  2. readiness gate changes
  3. execution board updates
  4. scene promotion
  5. family baseline changes
  6. handling the remaining 47 structured fail-closed records
  7. handling the 4 adjudicated host-bridge records

Diagnostic Labels

Each timeout must receive exactly one final diagnostic label:

  1. timeout-rerun-pass
  2. timeout-rerun-fail-closed
  3. timeout-large-source
  4. timeout-command-hang
  5. timeout-nondeterministic
  6. timeout-source-scan-heavy
  7. timeout-unknown

Secondary labels may be attached for:

  1. large file count
  2. large total size
  3. many HTML or JS files
  4. generated report present after rerun
  5. stderr decode noise
  6. elapsed time near budget
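
The "exactly one final label" rule can be checked mechanically. The sketch below is illustrative only: `validate_labels` is a hypothetical helper, and the secondary labels are stored as the plain phrases listed above because this design does not define identifiers for them.

```python
FINAL_LABELS = frozenset({
    "timeout-rerun-pass",
    "timeout-rerun-fail-closed",
    "timeout-large-source",
    "timeout-command-hang",
    "timeout-nondeterministic",
    "timeout-source-scan-heavy",
    "timeout-unknown",
})

# Assumption: secondary labels are recorded using the phrases from the list above.
SECONDARY_LABELS = frozenset({
    "large file count",
    "large total size",
    "many HTML or JS files",
    "generated report present after rerun",
    "stderr decode noise",
    "elapsed time near budget",
})


def validate_labels(final_labels, secondary_labels=()):
    """Return a list of problems; an empty list means the labels are acceptable."""
    problems = []
    if len(final_labels) != 1:
        problems.append(f"expected exactly one final label, got {len(final_labels)}")
    for label in final_labels:
        if label not in FINAL_LABELS:
            problems.append(f"unknown final label: {label}")
    for label in secondary_labels:
        if label not in SECONDARY_LABELS:
            problems.append(f"unknown secondary label: {label}")
    return problems
```

Returning a list of problems rather than raising lets a diagnostic run report every labeling issue for a record at once.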

Required Evidence

For each timeout record, collect:

  1. scene id
  2. scene name
  3. source directory
  4. previous reconciled status
  5. follow-up status
  6. file count
  7. total source bytes
  8. HTML file count
  9. JavaScript file count
  10. largest files
  11. diagnostic rerun exit code
  12. diagnostic rerun elapsed seconds
  13. diagnostic rerun timed out flag
  14. generation report path if produced
  15. generation status if produced
  16. final diagnostic label
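
Items 6 through 10 are source directory statistics that can be gathered in a single directory walk. The following is a minimal sketch, assuming `.html`/`.htm` and `.js` extensions identify HTML and JavaScript files; the function name `collect_source_stats`, the result keys, and the `top_n` default are hypothetical.

```python
import tempfile
from pathlib import Path


def collect_source_stats(source_dir, top_n=5):
    """Collect file counts, total size, and the largest files under source_dir."""
    root = Path(source_dir)
    files = [p for p in root.rglob("*") if p.is_file()]
    sizes = {p: p.stat().st_size for p in files}
    # Sort by size descending, then by path, so ties come out in a stable order.
    largest = sorted(files, key=lambda p: (-sizes[p], p.as_posix()))[:top_n]
    return {
        "file_count": len(files),
        "total_source_bytes": sum(sizes.values()),
        "html_file_count": sum(p.suffix.lower() in {".html", ".htm"} for p in files),
        "js_file_count": sum(p.suffix.lower() == ".js" for p in files),
        "largest_files": [
            {"path": p.relative_to(root).as_posix(), "bytes": sizes[p]} for p in largest
        ],
    }


# Small demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "index.html").write_text("<html></html>", encoding="utf-8")
    (demo_root / "js").mkdir()
    (demo_root / "js" / "app.js").write_text("console.log(1);", encoding="utf-8")
    (demo_root / "data.bin").write_bytes(b"\0" * 1024)
    demo_stats = collect_source_stats(demo_root, top_n=2)
```

Paths in `largest_files` are stored relative to the source directory so the evidence stays comparable across machines.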

Output

Diagnostic output:

tests/fixtures/generated_scene/timeout_regression_diagnostic_2026-04-19.json

Report output:

docs/superpowers/reports/2026-04-19-timeout-regression-diagnostic-report.md
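
Because the scene names contain non-ASCII characters, the diagnostic JSON is easier to review if it is written as UTF-8 without escaping. A sketch, assuming a top-level `records` key and the field names shown, none of which are fixed by this design:

```python
import json
import tempfile
from pathlib import Path


def write_diagnostic(records, output_path):
    """Write diagnostic records as readable UTF-8 JSON."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps scene names such as 嘉峪关日报 readable in the file.
    text = json.dumps({"records": records}, ensure_ascii=False, indent=2)
    path.write_text(text + "\n", encoding="utf-8")


# Demonstration against a temporary directory rather than the real fixture path.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "timeout_regression_diagnostic_2026-04-19.json"
    write_diagnostic([{"scene_id": "sweep-040-scene", "scene_name": "嘉峪关日报"}], demo_path)
    demo_text = demo_path.read_text(encoding="utf-8")
    demo_loaded = json.loads(demo_text)
```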

Success Criteria

  1. exactly three timeout records are diagnosed
  2. sweep-040-scene is explicitly marked as the regression timeout
  3. the two persistent timeout records remain distinguishable from the regression timeout
  4. each record has one final diagnostic label
  5. no implementation changes are made
  6. no execution board state is updated
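
Criteria 1 through 4 can be checked against the diagnostic records themselves; criteria 5 and 6 concern repository state and are not covered here. The sketch below assumes each record is a dict with hypothetical keys `scene_id`, `timeout_type`, and `final_label`.

```python
def check_success_criteria(records):
    """Return a list of unmet criteria; an empty list means criteria 1-4 hold."""
    problems = []
    # Criterion 1: exactly three timeout records are diagnosed.
    if len(records) != 3:
        problems.append(f"expected 3 records, got {len(records)}")
    # Criterion 2: sweep-040-scene is the one regression timeout.
    regression = [r.get("scene_id") for r in records
                  if r.get("timeout_type") == "new regression timeout"]
    if regression != ["sweep-040-scene"]:
        problems.append(f"unexpected regression records: {regression}")
    # Criterion 3: the two persistent timeouts stay distinguishable from the regression.
    persistent = [r for r in records if r.get("timeout_type") == "persistent timeout"]
    if len(persistent) != 2:
        problems.append(f"expected 2 persistent records, got {len(persistent)}")
    # Criterion 4: each record carries one final diagnostic label.
    for r in records:
        label = r.get("final_label")
        if not isinstance(label, str) or not label:
            problems.append(f"{r.get('scene_id')}: missing final label")
    return problems
```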