# Structured Fail-Closed Improvement Roadmap Plan
> Date: 2026-04-19
> Status: Draft
> Upstream Spec: `docs/superpowers/specs/2026-04-19-structured-fail-closed-improvement-roadmap-design.md`
> Upstream Reconciliation: `tests/fixtures/generated_scene/full_sweep_status_reconciliation_2026-04-19.json`
## Plan Intent
Coordinate the next improvement cycle for the `48` structured fail-closed records from the reconciled `102`-scene sweep.
This is a roadmap-level plan. It intentionally starts with inventory and gap taxonomy before any implementation correction.
## Baseline
Current reconciled status of the `102`-scene sweep:
| Status | Count |
| --- | ---: |
| `auto-pass` | 48 |
| `fail-closed-known` | 48 |
| `adjudicated-valid-host-bridge` | 4 |
| `source-unreadable` | 2 |
Fail-closed distribution:
| Inferred archetype | Count |
| --- | ---: |
| `paginated_enrichment` | 35 |
| `local_doc_pipeline` | 5 |
| `multi_mode_request` | 4 |
| `single_request_enrichment` | 2 |
| `host_bridge_workflow` | 1 |
| `page_state_eval` | 1 |
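
The two tables above should agree with each other: the status counts should sum to `102`, and the archetype counts should sum to the `fail-closed-known` count. A minimal sketch of that cross-check, with the numbers copied from the tables (the function name is hypothetical and not part of the project):

```python
# Counts copied verbatim from the two baseline tables in this plan.
STATUS_COUNTS = {
    "auto-pass": 48,
    "fail-closed-known": 48,
    "adjudicated-valid-host-bridge": 4,
    "source-unreadable": 2,
}
FAIL_CLOSED_BY_ARCHETYPE = {
    "paginated_enrichment": 35,
    "local_doc_pipeline": 5,
    "multi_mode_request": 4,
    "single_request_enrichment": 2,
    "host_bridge_workflow": 1,
    "page_state_eval": 1,
}


def baseline_is_consistent(status_counts, archetype_counts, total=102):
    """Return True when both tables agree with each other and the sweep size."""
    return (
        sum(status_counts.values()) == total
        and sum(archetype_counts.values()) == status_counts["fail-closed-known"]
    )
```
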
## Scope Guardrails
1. do not add new scene families
2. do not start `G4/G5`
3. do not implement login recovery
4. do not implement full host runtime transport
5. do not implement local document attachment runtime
6. do not update `scene_execution_board_2026-04-18.json`
7. do not promote scenes directly from dry-run or follow-up results
8. do not reopen `adjudicated-valid-host-bridge` records
9. do not handle the `2` timeout (`source-unreadable`) records in this roadmap
10. do not loosen readiness gates to increase pass count
## Workstreams
1. `WS1` Fail-Closed Inventory and Gap Taxonomy
2. `WS2` G3 Paginated Enrichment Recovery
3. `WS3` Small-Bucket Recovery
4. `WS4` Bootstrap Isolation
5. `WS5` Follow-Up Sweep and Reporting
## Phase 0: Freeze Structured Fail-Closed Baseline
### Objective
Freeze the `48` fail-closed records as the only implementation-analysis input.
### Tasks
1. read `full_sweep_status_reconciliation_2026-04-19.json`
2. verify total scene count is `102`
3. verify `fail-closed-known = 48`
4. verify `adjudicated-valid-host-bridge = 4`
5. verify `source-unreadable = 2`
6. extract only records with `reconciledStatus = fail-closed-known`
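
The tasks above could be sketched as follows. Only the `reconciledStatus` field and the expected counts come from this plan. The top-level `"records"` key and both function names are assumptions made for illustration, so adjust them to the real layout of the reconciliation file.

```python
import json
from collections import Counter

# Expected counts taken from the Baseline section of this plan.
EXPECTED = {
    "fail-closed-known": 48,
    "adjudicated-valid-host-bridge": 4,
    "source-unreadable": 2,
}


def load_records(path):
    """Load sweep records from the reconciliation file."""
    # ASSUMPTION: the file is a JSON object with a top-level "records" list.
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)["records"]


def freeze_fail_closed(records, expected_total=102):
    """Validate the baseline counts, then return only fail-closed records."""
    if len(records) != expected_total:
        raise ValueError(f"expected {expected_total} scenes, found {len(records)}")
    counts = Counter(r["reconciledStatus"] for r in records)
    for status, want in EXPECTED.items():
        if counts[status] != want:
            raise ValueError(f"{status}: expected {want}, found {counts[status]}")
    # Everything except fail-closed-known is excluded from this roadmap.
    return [r for r in records if r["reconciledStatus"] == "fail-closed-known"]
```

Failing loudly on any count mismatch keeps the frozen input from drifting silently if the reconciliation file is regenerated.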
### Deliverables
1. frozen fail-closed input list
2. baseline validation summary
### Acceptance Criteria
1. exactly `48` records enter this roadmap
2. route-adjudicated (`adjudicated-valid-host-bridge`) records are excluded
3. timeout (`source-unreadable`) records are excluded
## Phase 1: Build Fail-Closed Inventory and Gap Taxonomy
### Objective
Split the `48` records into actionable missing-contract buckets.
### Tasks
1. inspect each fail-closed record
2. assign exactly one primary missing-contract label:
   - `main_request_missing`
   - `pagination_plan_missing`
   - `enrichment_request_missing`
   - `join_key_missing`
   - `export_plan_missing`
   - `mode_matrix_missing`
   - `mode_request_contract_missing`
   - `single_request_enrichment_contract_missing`
   - `host_bridge_contract_missing`
   - `local_doc_contract_missing`
   - `bootstrap_target_unresolved`
   - `mixed_or_ambiguous_contract_gap`
3. attach secondary labels when useful
4. group by inferred archetype and primary label
5. identify top repeated recoverable patterns
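
One way to enforce the single-primary-label rule and produce the grouping is sketched below. The label vocabulary is taken from the list above. The entry field names `primaryLabel` and `inferredArchetype` are hypothetical, since the inventory schema is not defined in this plan.

```python
from collections import Counter

# Label vocabulary copied from the task list in this phase.
PRIMARY_LABELS = frozenset({
    "main_request_missing",
    "pagination_plan_missing",
    "enrichment_request_missing",
    "join_key_missing",
    "export_plan_missing",
    "mode_matrix_missing",
    "mode_request_contract_missing",
    "single_request_enrichment_contract_missing",
    "host_bridge_contract_missing",
    "local_doc_contract_missing",
    "bootstrap_target_unresolved",
    "mixed_or_ambiguous_contract_gap",
})


def validate_entry(entry):
    """Raise unless the entry carries exactly one known primary label."""
    # ASSUMPTION: the primary label is stored as a single string field.
    primary = entry.get("primaryLabel")
    if not isinstance(primary, str) or primary not in PRIMARY_LABELS:
        raise ValueError(f"invalid primary label: {primary!r}")


def group_counts(entries):
    """Count entries per (archetype, primary label), most repeated first."""
    for entry in entries:
        validate_entry(entry)
    counts = Counter(
        (e["inferredArchetype"], e["primaryLabel"]) for e in entries
    )
    return counts.most_common()
```

Storing the primary label as a single string rather than a list makes "exactly one" hold by construction, so validation only has to check the vocabulary.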
### Deliverables
1. `tests/fixtures/generated_scene/structured_fail_closed_inventory_2026-04-19.json`
2. `docs/superpowers/reports/2026-04-19-structured-fail-closed-inventory-report.md`
### Acceptance Criteria
1. all `48` records have exactly one primary label
2. the `35` `paginated_enrichment` records are explicitly split
3. no implementation is performed in this phase
## Phase 2: G3 Paginated Enrichment Recovery Slice
### Objective
Improve the largest bucket only when Phase 1 identifies repeated recoverable G3 patterns.
### Tasks
1. select only `paginated_enrichment` records from the inventory
2. prioritize repeated primary labels in this order:
   - `main_request_missing`
   - `pagination_plan_missing`
   - `enrichment_request_missing`
   - `join_key_missing`
   - `export_plan_missing`
3. define bounded recovery rules for the top repeated pattern
4. implement only traceable evidence recovery
5. add regression tests for the recovered pattern
6. keep the canonical `G3` and real-sample `G3` tests passing
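
The selection in tasks 1 to 3 can be read in more than one way. The sketch below takes one reading: walk the fixed priority order and pick the first label that repeats among the `paginated_enrichment` records. The `min_repeats=2` threshold and the field names `primaryLabel` and `inferredArchetype` are assumptions, not something this plan specifies.

```python
from collections import Counter

# Priority order copied from task 2 of this phase.
G3_PRIORITY = [
    "main_request_missing",
    "pagination_plan_missing",
    "enrichment_request_missing",
    "join_key_missing",
    "export_plan_missing",
]


def pick_recovery_target(entries, min_repeats=2):
    """Return the first priority label that repeats in the G3 bucket, or None."""
    counts = Counter(
        e["primaryLabel"]
        for e in entries
        if e["inferredArchetype"] == "paginated_enrichment"
    )
    for label in G3_PRIORITY:
        # ASSUMPTION: "repeated" means at least min_repeats records share the label.
        if counts[label] >= min_repeats:
            return label
    return None  # nothing repeats: the G3 slice should be explicitly deferred
```

A `None` result corresponds to the case where Phase 1 finds no repeated recoverable pattern, in which case the slice is deferred rather than forced.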
### Deliverables
1. G3 recovery implementation if evidence supports it
2. regression tests for the recovered pattern
3. G3 recovery report
### Acceptance Criteria
1. no scene-name hardcoding
2. no gate relaxation
3. recovered fields are traceable to source evidence
4. existing `G3` canonical and real-sample tests pass
## Phase 3: Small-Bucket Recovery Slice
### Objective
Handle smaller buckets only after the G3 slice is complete or explicitly deferred.
### Tasks
1. inspect `local_doc_pipeline = 5`
2. inspect `multi_mode_request = 4`
3. inspect `single_request_enrichment = 2`
4. inspect `host_bridge_workflow = 1`
5. choose at most one bounded non-G3 recovery slice
6. preserve existing real-sample passes for `G1-E`, `G2`, `G6`, `G7`
### Deliverables
1. small-bucket recovery decision report
2. optional bounded implementation and tests
### Acceptance Criteria
1. only one small-bucket slice is implemented in this roadmap
2. no `G8` attachment/local document runtime is started
3. no full host runtime transport is started
## Phase 4: Bootstrap Target Isolation
### Objective
Keep the single `page_state_eval + bootstrap_target` record separate.
### Tasks
1. identify the bootstrap target record
2. preserve it as a separate future input
3. do not implement login recovery
4. produce a bootstrap isolation note
### Deliverables
1. bootstrap isolation note
### Acceptance Criteria
1. bootstrap target does not pollute G3 or small-bucket recovery
2. no login or bootstrap auto-recovery is implemented
## Phase 5: Follow-Up Sweep and Coverage Delta
### Objective
Measure the impact of bounded recovery work.
### Tasks
1. rerun the fixed `102` scene sweep
2. produce a new follow-up result
3. compare against the reconciled baseline:
   - auto-pass delta
   - fail-closed-known delta
   - actionable coverage delta
   - timeout count
   - adjudicated host-bridge count
4. publish the coverage delta report
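
A per-status delta between the baseline and the follow-up could be computed roughly as below. The `sceneId` key is a hypothetical identifier field, and `reconciledStatus` is assumed to be present in the follow-up result as well. "Actionable coverage" is not defined in this plan, so the sketch leaves it out rather than guessing a formula.

```python
from collections import Counter


def status_delta(baseline, followup, id_key="sceneId", status_key="reconciledStatus"):
    """Return {status: followup_count - baseline_count} for a fixed scene set."""
    # ASSUMPTION: each record carries a stable identifier under id_key.
    before_ids = {r[id_key] for r in baseline}
    after_ids = {r[id_key] for r in followup}
    if before_ids != after_ids:
        raise ValueError("scene set changed between baseline and follow-up")
    before = Counter(r[status_key] for r in baseline)
    after = Counter(r[status_key] for r in followup)
    # Counter returns 0 for missing keys, so statuses that appear or vanish are handled.
    return {s: after[s] - before[s] for s in sorted(set(before) | set(after))}
```

Rejecting any change in the scene set up front means a delta can never come from adding or dropping scenes, only from status changes on the same scenes.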
### Deliverables
1. `tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json`
2. `docs/superpowers/reports/2026-04-19-structured-fail-closed-improvement-coverage-delta-report.md`
3. `docs/superpowers/reports/2026-04-19-structured-fail-closed-improvement-roadmap-closure-report.md`
### Acceptance Criteria
1. scene set remains exactly `102`
2. improvements are measured, not assumed
3. execution board remains unchanged
4. the fail-closed count drops only when contract gaps are closed or are isolated more specifically
## Milestone Order
The order is fixed:
1. Phase 0: freeze fail-closed baseline
2. Phase 1: build inventory and taxonomy
3. Phase 2: G3 recovery slice
4. Phase 3: small-bucket recovery slice
5. Phase 4: bootstrap target isolation
6. Phase 5: follow-up sweep and delta
Do not start implementation before Phase 1 is complete.
Do not start small-bucket recovery before the G3 slice is completed or explicitly deferred with reasons.
## Completion Criteria
This roadmap is complete when:
1. all `48` structured fail-closed records are inventoried and labeled
2. the `35` G3 records are split into actionable contract-gap groups
3. at least the highest-value repeated recoverable pattern is either implemented or explicitly deferred
4. small buckets are inspected and at most one bounded slice is implemented
5. the bootstrap target remains isolated
6. a follow-up sweep quantifies coverage delta
7. no new family is introduced
## Stop Statement
Stop after the follow-up sweep, delta report, and closure report.
Do not automatically update the execution board or start another roadmap inside this plan.