feat: add generated scene skill platform hardening
@@ -0,0 +1,99 @@
# Timeout Budget and Rerun Hygiene Design

> Date: 2026-04-19
> Status: Draft
> Upstream Diagnostic: `docs/superpowers/reports/2026-04-19-timeout-regression-diagnostic-report.md`

## Intent

Prevent budget-sensitive scenes from being miscounted as `source-unreadable` when they can resolve into:

1. `executed-pass`
2. structured `fail-closed`

under a bounded rerun budget.

This design does not attempt to improve scene understanding. It only changes timeout handling and rerun classification hygiene.

## Problem Statement

The timeout regression diagnostic produced:

| Scene id | Diagnostic label | Actual behavior under `90s` |
| --- | --- | --- |
| `sweep-015-scene` | `timeout-rerun-pass` | completed successfully |
| `sweep-025-scene` | `timeout-rerun-pass` | completed successfully |
| `sweep-040-scene` | `timeout-rerun-fail-closed` | resolved into structured fail-closed |

This means the current fixed `45s` budget is too coarse for a subset of scenes. It collapses:

1. budget-sensitive success
2. budget-sensitive fail-closed
3. true unreadable or hanging cases

into the same `source-unreadable` bucket.

## Scope

In scope:

1. define a bounded timeout-budget policy
2. define when a diagnostic rerun is allowed
3. define how rerun results should be classified
4. define output JSON and report for timeout hygiene verification

Out of scope:

1. analyzer logic changes
2. generator contract recovery changes
3. scene promotion
4. execution board updates
5. full `102` sweep improvement work
6. timeout implementation unrelated to rerun hygiene

## Policy

### Primary Sweep Budget

The initial sweep still runs with the fixed primary budget (currently `45s`).

### Secondary Diagnostic Budget

When a scene ends with:

1. `source-unreadable`
2. reason `generator timeout after 45s`

it becomes eligible for one bounded rerun under a secondary timeout budget.
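
The eligibility rule above can be sketched as a small predicate. This is a minimal illustration, not the actual implementation: the `SweepResult` record, its field names, the `rerun_count` guard, and the exact-match comparison on the reason string are all assumptions made for the example.

```python
from dataclasses import dataclass

# Reason string quoted from the policy text; matching it exactly is an assumption.
PRIMARY_TIMEOUT_REASON = "generator timeout after 45s"


@dataclass(frozen=True)
class SweepResult:
    """Hypothetical shape of a primary sweep result (field names are assumed)."""

    scene_id: str
    status: str
    reason: str
    rerun_count: int = 0  # how many diagnostic reruns this scene has already had


def is_rerun_eligible(result: SweepResult) -> bool:
    """Return True only for primary-timeout scenes that have not been rerun yet."""
    return (
        result.status == "source-unreadable"
        and result.reason == PRIMARY_TIMEOUT_REASON
        and result.rerun_count == 0  # "one bounded rerun": never rerun twice
    )
```

Keeping the check as a pure function makes the "one rerun only" bound easy to test separately from whatever actually executes the rerun.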

### Rerun Result Mapping

A bounded rerun may only map to:

1. `timeout-rerun-pass`
2. `timeout-rerun-fail-closed`
3. `timeout-rerun-timeout`
4. `timeout-rerun-error`

These are hygiene classifications, not promoted scene statuses.
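
One way to keep the mapping closed over exactly these four labels is an enum plus a total classification function. The sketch below is illustrative only: the `classify_rerun` name, its parameters, and the choice to send every unrecognized outcome to `timeout-rerun-error` are assumptions, not part of the design.

```python
from enum import Enum
from typing import Optional


class RerunLabel(str, Enum):
    """The four hygiene labels a bounded rerun may map to."""

    PASS = "timeout-rerun-pass"
    FAIL_CLOSED = "timeout-rerun-fail-closed"
    TIMEOUT = "timeout-rerun-timeout"
    ERROR = "timeout-rerun-error"


def classify_rerun(timed_out: bool, status: Optional[str]) -> RerunLabel:
    """Map a rerun outcome to a hygiene label (hypothetical helper).

    `timed_out` says whether the secondary budget was exhausted; `status` is
    the status the rerun produced, or None if it produced none.
    """
    if timed_out:
        return RerunLabel.TIMEOUT
    if status == "executed-pass":
        return RerunLabel.PASS
    if status == "fail-closed":
        return RerunLabel.FAIL_CLOSED
    # Assumption: anything else (crash, missing or unknown status) is an error.
    return RerunLabel.ERROR
```

Because the function always returns a `RerunLabel`, a rerun can never leak a canonical scene status such as `executed-pass` into the hygiene layer.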

### Board and Promotion Boundary

Even when a rerun succeeds:

1. do not update `scene_execution_board_2026-04-18.json`
2. do not convert the scene to promoted status
3. do not silently merge the rerun result into canonical scene status

## Output

1. `tests/fixtures/generated_scene/timeout_budget_rerun_hygiene_2026-04-19.json`
2. `docs/superpowers/reports/2026-04-19-timeout-budget-rerun-hygiene-report.md`
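
The design does not specify the JSON schema, so the following is only a sketch of what a hygiene record could look like. Every key name (`kind`, `scenes`, `canonical_status`, `hygiene_label`, `label_counts`) and the `build_hygiene_report` helper are hypothetical. The scene ids and labels are the three rows reported by the upstream diagnostic. The canonical status stays `source-unreadable` alongside the hygiene label, so the rerun result is recorded without being merged into it.

```python
import json
from collections import Counter


def build_hygiene_report(rerun_labels: dict[str, str]) -> dict:
    """Build a hygiene report from a scene_id -> hygiene label mapping.

    Hypothetical schema: the canonical status is carried through unchanged
    and the rerun label is stored next to it, never in place of it.
    """
    counts = Counter(rerun_labels.values())
    return {
        "kind": "timeout-budget-rerun-hygiene",  # assumed discriminator field
        "scenes": [
            {
                "scene_id": scene_id,
                "canonical_status": "source-unreadable",  # not overwritten
                "hygiene_label": label,
            }
            for scene_id, label in sorted(rerun_labels.items())
        ],
        "label_counts": dict(sorted(counts.items())),
    }


# Labels taken from the upstream diagnostic table.
report = build_hygiene_report(
    {
        "sweep-015-scene": "timeout-rerun-pass",
        "sweep-025-scene": "timeout-rerun-pass",
        "sweep-040-scene": "timeout-rerun-fail-closed",
    }
)
print(json.dumps(report, indent=2))
```

The per-label counts give the report a direct way to show that rerun-pass, rerun-fail-closed, and true timeout cases are no longer collapsed into one bucket.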

## Success Criteria

1. timeout scenes are no longer treated as a single unreadable bucket in the hygiene layer
2. rerun-pass and rerun-fail-closed are distinguishable
3. true timeout cases remain distinguishable
4. no analyzer or generator implementation changes are made
5. no execution board updates are made