179 lines
4.7 KiB
Markdown
179 lines
4.7 KiB
Markdown
# Timeout Regression Diagnostic Plan
|
|
|
|
> Date: 2026-04-19
|
|
> Status: Draft
|
|
> Upstream Design: `docs/superpowers/specs/2026-04-19-timeout-regression-diagnostic-design.md`
|
|
> Upstream Follow-up: `tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json`
|
|
|
|
## Plan Intent
|
|
|
|
Run a bounded diagnostic for the three timeout records after the structured fail-closed improvement follow-up sweep.
|
|
|
|
This plan only diagnoses timeout behavior. It does not implement fixes.
|
|
|
|
## Scope Guardrails
|
|
|
|
1. do not modify `src/generated_scene/analyzer.rs`
|
|
2. do not modify `src/generated_scene/generator.rs`
|
|
3. do not update `scene_execution_board_2026-04-18.json`
|
|
4. do not promote scenes
|
|
5. do not add family baselines
|
|
6. do not handle the remaining structured fail-closed records
|
|
7. do not handle adjudicated host-bridge records
|
|
8. do not treat diagnostic rerun success as validated scene pass
|
|
|
|
## Fixed Input
|
|
|
|
The fixed input is:
|
|
|
|
`tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json`
|
|
|
|
Only records with `followupStatus = source-unreadable` and reason `generator timeout after 45s` enter this plan.
|
|
|
|
Expected fixed set:
|
|
|
|
| Scene id | Scene | Type |
|
|
| --- | --- | --- |
|
|
| `sweep-015-scene` | `任务报表` | persistent timeout |
|
|
| `sweep-025-scene` | `力禾动环系统巡视记录` | persistent timeout |
|
|
| `sweep-040-scene` | `嘉峪关日报` | regression timeout |
|
|
|
|
## Phase 0: Freeze Timeout Inputs
|
|
|
|
### Objective
|
|
|
|
Freeze the exact timeout set before diagnostics.
|
|
|
|
### Tasks
|
|
|
|
1. read the follow-up sweep JSON
|
|
2. filter `source-unreadable` timeout records
|
|
3. verify the count is exactly `3`
|
|
4. identify `sweep-040-scene` as the regression timeout
|
|
|
|
### Deliverables
|
|
|
|
1. frozen timeout input list
|
|
|
|
### Acceptance Criteria
|
|
|
|
1. exactly `3` timeout records enter diagnostics
|
|
2. no non-timeout record enters diagnostics
|
|
|
|
## Phase 1: Source Directory Diagnostics
|
|
|
|
### Objective
|
|
|
|
Determine whether timeout records are likely caused by source scale or source structure.
|
|
|
|
### Tasks
|
|
|
|
1. inspect each source directory
|
|
2. count all files
|
|
3. count HTML files
|
|
4. count JavaScript files
|
|
5. compute total source bytes
|
|
6. record the largest files
|
|
|
|
### Deliverables
|
|
|
|
1. per-scene source diagnostics in JSON
|
|
|
|
### Acceptance Criteria
|
|
|
|
1. all `3` timeout records have source diagnostics
|
|
2. missing directories are reported explicitly
|
|
|
|
## Phase 2: Bounded Diagnostic Rerun
|
|
|
|
### Objective
|
|
|
|
Check whether each timeout completes under a longer diagnostic budget.
|
|
|
|
### Tasks
|
|
|
|
1. rerun each timeout scene with a diagnostic timeout budget
|
|
2. write output under `examples/timeout_regression_diagnostic_2026-04-19`
|
|
3. capture exit code
|
|
4. capture elapsed seconds
|
|
5. record whether a `generation-report.json` is produced
|
|
6. do not update any execution status based on the result
|
|
|
|
### Deliverables
|
|
|
|
1. diagnostic rerun result per timeout scene
|
|
|
|
### Acceptance Criteria
|
|
|
|
1. each timeout has exactly one diagnostic rerun result
|
|
2. rerun success is marked only as diagnostic evidence
|
|
3. rerun failure is categorized, not fixed
|
|
|
|
## Phase 3: Timeout Labeling
|
|
|
|
### Objective
|
|
|
|
Assign each timeout one final diagnostic label.
|
|
|
|
### Tasks
|
|
|
|
1. assign one primary diagnostic label:
|
|
- `timeout-rerun-pass`
|
|
- `timeout-rerun-fail-closed`
|
|
- `timeout-large-source`
|
|
- `timeout-command-hang`
|
|
- `timeout-nondeterministic`
|
|
- `timeout-source-scan-heavy`
|
|
- `timeout-unknown`
|
|
2. attach secondary labels when useful
|
|
3. distinguish persistent timeouts from regression timeout
|
|
|
|
### Deliverables
|
|
|
|
1. labeled timeout diagnostic JSON
|
|
|
|
### Acceptance Criteria
|
|
|
|
1. all `3` records have exactly one primary diagnostic label
|
|
2. `sweep-040-scene` remains clearly identified as the regression timeout
|
|
|
|
## Phase 4: Diagnostic Report
|
|
|
|
### Objective
|
|
|
|
Publish diagnostic results without starting implementation.
|
|
|
|
### Tasks
|
|
|
|
1. write `tests/fixtures/generated_scene/timeout_regression_diagnostic_2026-04-19.json`
|
|
2. write `docs/superpowers/reports/2026-04-19-timeout-regression-diagnostic-report.md`
|
|
3. summarize whether the next step should be timeout implementation, rerun hygiene, or no action
|
|
|
|
### Deliverables
|
|
|
|
1. `tests/fixtures/generated_scene/timeout_regression_diagnostic_2026-04-19.json`
|
|
2. `docs/superpowers/reports/2026-04-19-timeout-regression-diagnostic-report.md`
|
|
|
|
### Acceptance Criteria
|
|
|
|
1. diagnostic output exists
|
|
2. report exists
|
|
3. no implementation changes are made
|
|
4. no execution board update is made
|
|
|
|
## Completion Criteria
|
|
|
|
This plan is complete when:
|
|
|
|
1. the three timeout records are frozen
|
|
2. each has source diagnostics
|
|
3. each has one diagnostic rerun result
|
|
4. each has one final diagnostic label
|
|
5. JSON and report are published
|
|
|
|
## Stop Statement
|
|
|
|
Stop after publishing the timeout diagnostic JSON and report.
|
|
|
|
Do not start timeout implementation or status promotion inside this plan.
|