# Structured Fail-Closed Improvement Roadmap Plan

> Date: 2026-04-19
> Status: Draft
> Upstream Spec: `docs/superpowers/specs/2026-04-19-structured-fail-closed-improvement-roadmap-design.md`
> Upstream Reconciliation: `tests/fixtures/generated_scene/full_sweep_status_reconciliation_2026-04-19.json`

## Plan Intent

Coordinate the next improvement cycle for the `48` structured fail-closed records from the reconciled `102`-scene sweep.

This is a roadmap-level plan. It intentionally starts with inventory and gap taxonomy before any implementation work.

## Baseline

Current reconciled `102` status:

| Status | Count |
| --- | ---: |
| `auto-pass` | 48 |
| `fail-closed-known` | 48 |
| `adjudicated-valid-host-bridge` | 4 |
| `source-unreadable` | 2 |

Fail-closed distribution:

| Inferred archetype | Count |
| --- | ---: |
| `paginated_enrichment` | 35 |
| `local_doc_pipeline` | 5 |
| `multi_mode_request` | 4 |
| `single_request_enrichment` | 2 |
| `host_bridge_workflow` | 1 |
| `page_state_eval` | 1 |

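
The two baseline tables can be cross-checked with simple arithmetic. The sketch below copies the counts from this plan; the constant and function names are illustrative only and do not refer to existing project code.

```python
# Counts copied from the two baseline tables in this plan.
STATUS_COUNTS = {
    "auto-pass": 48,
    "fail-closed-known": 48,
    "adjudicated-valid-host-bridge": 4,
    "source-unreadable": 2,
}

FAIL_CLOSED_BY_ARCHETYPE = {
    "paginated_enrichment": 35,
    "local_doc_pipeline": 5,
    "multi_mode_request": 4,
    "single_request_enrichment": 2,
    "host_bridge_workflow": 1,
    "page_state_eval": 1,
}


def baseline_is_consistent(total=102):
    """Return True when the status counts sum to the sweep size and the
    archetype distribution sums to the fail-closed-known count."""
    status_total_ok = sum(STATUS_COUNTS.values()) == total
    distribution_ok = (
        sum(FAIL_CLOSED_BY_ARCHETYPE.values())
        == STATUS_COUNTS["fail-closed-known"]
    )
    return status_total_ok and distribution_ok
```

With the numbers above, the status counts sum to `102` and the archetype counts sum to `48`, so both tables agree.
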
## Scope Guardrails

1. do not add new scene families
2. do not start `G4/G5`
3. do not implement login recovery
4. do not implement full host runtime transport
5. do not implement local document attachment runtime
6. do not update `scene_execution_board_2026-04-18.json`
7. do not promote scenes directly from dry-run or follow-up results
8. do not reopen `adjudicated-valid-host-bridge` records
9. do not handle the `2` timeout records in this roadmap
10. do not loosen readiness gates to increase pass count

## Workstreams

1. `WS1` Fail-Closed Inventory and Gap Taxonomy
2. `WS2` G3 Paginated Enrichment Recovery
3. `WS3` Small-Bucket Recovery
4. `WS4` Bootstrap Isolation
5. `WS5` Follow-Up Sweep and Reporting

## Phase 0: Freeze Structured Fail-Closed Baseline

### Objective

Freeze the `48` fail-closed records as the only implementation-analysis input.

### Tasks

1. read `full_sweep_status_reconciliation_2026-04-19.json`
2. verify total scene count is `102`
3. verify `fail-closed-known = 48`
4. verify `adjudicated-valid-host-bridge = 4`
5. verify `source-unreadable = 2`
6. extract only records with `reconciledStatus = fail-closed-known`

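
The tasks above could be sketched roughly as follows. Only the `reconciledStatus` field and the expected counts come from this plan. The file layout (a bare JSON list, or an object with a `records` list) and the function names `load_records` and `freeze_fail_closed` are assumptions made for illustration.

```python
import json
from collections import Counter

EXPECTED_TOTAL = 102
EXPECTED_COUNTS = {
    "fail-closed-known": 48,
    "adjudicated-valid-host-bridge": 4,
    "source-unreadable": 2,
}


def load_records(path):
    """Load sweep records from the reconciliation file.

    Assumption: the file is either a bare JSON list of records or an
    object holding them under a "records" key. Adjust to the real layout.
    """
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    return data if isinstance(data, list) else data["records"]


def freeze_fail_closed(records):
    """Verify the baseline counts, then return only fail-closed-known records."""
    if len(records) != EXPECTED_TOTAL:
        raise ValueError(f"expected {EXPECTED_TOTAL} scenes, found {len(records)}")
    counts = Counter(record["reconciledStatus"] for record in records)
    for status, expected in EXPECTED_COUNTS.items():
        if counts[status] != expected:
            raise ValueError(f"{status}: expected {expected}, found {counts[status]}")
    return [r for r in records if r["reconciledStatus"] == "fail-closed-known"]
```

Raising on any count mismatch keeps the freeze step itself fail-closed: a drifted baseline stops the roadmap instead of silently producing a different input list.
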
### Deliverables

1. frozen fail-closed input list
2. baseline validation summary

### Acceptance Criteria

1. exactly `48` records enter this roadmap
2. route-adjudicated records are excluded
3. timeout records are excluded

## Phase 1: Build Fail-Closed Inventory and Gap Taxonomy

### Objective

Split the `48` records into actionable missing-contract buckets.

### Tasks

1. inspect each fail-closed record
2. assign exactly one primary missing-contract label:
   - `main_request_missing`
   - `pagination_plan_missing`
   - `enrichment_request_missing`
   - `join_key_missing`
   - `export_plan_missing`
   - `mode_matrix_missing`
   - `mode_request_contract_missing`
   - `single_request_enrichment_contract_missing`
   - `host_bridge_contract_missing`
   - `local_doc_contract_missing`
   - `bootstrap_target_unresolved`
   - `mixed_or_ambiguous_contract_gap`
3. attach secondary labels when useful
4. group by inferred archetype and primary label
5. identify top repeated recoverable patterns

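
A minimal sketch of the labeling check and grouping described above. The label vocabulary is taken from this plan. The entry field names `primaryLabel` and `inferredArchetype` are hypothetical, since the inventory schema is not defined here.

```python
from collections import Counter

# Primary label vocabulary from the Phase 1 task list.
PRIMARY_LABELS = frozenset({
    "main_request_missing",
    "pagination_plan_missing",
    "enrichment_request_missing",
    "join_key_missing",
    "export_plan_missing",
    "mode_matrix_missing",
    "mode_request_contract_missing",
    "single_request_enrichment_contract_missing",
    "host_bridge_contract_missing",
    "local_doc_contract_missing",
    "bootstrap_target_unresolved",
    "mixed_or_ambiguous_contract_gap",
})


def validate_primary_labels(entries):
    """Raise if any entry lacks a single known primary label.

    Assumption: each entry stores its primary label as one string under
    the hypothetical key "primaryLabel".
    """
    for index, entry in enumerate(entries):
        label = entry.get("primaryLabel")
        if not isinstance(label, str) or label not in PRIMARY_LABELS:
            raise ValueError(f"entry {index}: invalid primary label {label!r}")


def repeated_patterns(entries, min_repeats=2):
    """Count (archetype, primary label) groups and keep those seen at
    least `min_repeats` times, most frequent first."""
    groups = Counter(
        (entry["inferredArchetype"], entry["primaryLabel"]) for entry in entries
    )
    return [(key, n) for key, n in groups.most_common() if n >= min_repeats]
```

Storing the primary label as a single string rather than a list makes the "exactly one primary label" criterion hold by construction; secondary labels would live in a separate field.
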
### Deliverables

1. `tests/fixtures/generated_scene/structured_fail_closed_inventory_2026-04-19.json`
2. `docs/superpowers/reports/2026-04-19-structured-fail-closed-inventory-report.md`

### Acceptance Criteria

1. all `48` records have exactly one primary label
2. the `35` `paginated_enrichment` records are explicitly split
3. no implementation is performed in this phase

## Phase 2: G3 Paginated Enrichment Recovery Slice

### Objective

Improve the largest bucket only when Phase 1 identifies repeated recoverable G3 patterns.

### Tasks

1. select only `paginated_enrichment` records from the inventory
2. prioritize repeated primary labels in this order:
   - `main_request_missing`
   - `pagination_plan_missing`
   - `enrichment_request_missing`
   - `join_key_missing`
   - `export_plan_missing`
3. define bounded recovery rules for the top repeated pattern
4. implement only traceable evidence recovery
5. add regression tests for the recovered pattern
6. preserve canonical `G3` and real-sample `G3` pass

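
One possible reading of the selection in tasks 1 and 2 is sketched below. It assumes "repeated" means a label occurring at least twice, and that the listed order takes precedence over raw frequency; the plan does not state either explicitly. The field names `primaryLabel` and `inferredArchetype` are hypothetical.

```python
from collections import Counter

# Priority order from the Phase 2 task list.
G3_PRIORITY = [
    "main_request_missing",
    "pagination_plan_missing",
    "enrichment_request_missing",
    "join_key_missing",
    "export_plan_missing",
]


def pick_g3_target(entries, min_repeats=2):
    """Return the first label in priority order that repeats among
    paginated_enrichment entries, or None when nothing repeats."""
    counts = Counter(
        entry["primaryLabel"]
        for entry in entries
        if entry["inferredArchetype"] == "paginated_enrichment"
    )
    for label in G3_PRIORITY:
        if counts[label] >= min_repeats:
            return label
    return None
```

A `None` result corresponds to the case where Phase 1 found no repeated recoverable G3 pattern, in which case the slice would be explicitly deferred rather than implemented.
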
### Deliverables

1. G3 recovery implementation if evidence supports it
2. regression tests for the recovered pattern
3. G3 recovery report

### Acceptance Criteria

1. no scene-name hardcoding
2. no gate relaxation
3. recovered fields are traceable to source evidence
4. existing `G3` canonical and real-sample tests pass

## Phase 3: Small-Bucket Recovery Slice

### Objective

Handle smaller buckets only after the G3 slice is complete or explicitly deferred.

### Tasks

1. inspect `local_doc_pipeline = 5`
2. inspect `multi_mode_request = 4`
3. inspect `single_request_enrichment = 2`
4. inspect `host_bridge_workflow = 1`
5. choose at most one bounded non-G3 recovery slice
6. preserve existing real-sample passes for `G1-E`, `G2`, `G6`, `G7`

### Deliverables

1. small-bucket recovery decision report
2. optional bounded implementation and tests

### Acceptance Criteria

1. only one small-bucket slice is implemented in this roadmap
2. no `G8` attachment/local document runtime is started
3. no full host runtime transport is started

## Phase 4: Bootstrap Target Isolation

### Objective

Keep the single `page_state_eval + bootstrap_target` record separate.

### Tasks

1. identify the bootstrap target record
2. preserve it as a separate future input
3. do not implement login recovery
4. produce bootstrap isolation note

### Deliverables

1. bootstrap isolation note

### Acceptance Criteria

1. bootstrap target does not pollute G3 or small-bucket recovery
2. no login or bootstrap auto-recovery is implemented

## Phase 5: Follow-Up Sweep and Coverage Delta

### Objective

Measure the impact of bounded recovery work.

### Tasks

1. rerun the fixed `102` scene sweep
2. produce a new follow-up result
3. compare against the reconciled baseline:
   - auto-pass delta
   - fail-closed-known delta
   - actionable coverage delta
   - timeout count
   - adjudicated host-bridge count
4. publish coverage delta report

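
The comparison in task 3 could be computed along these lines. This sketch assumes the follow-up result carries the same `reconciledStatus` field as the baseline, which this plan does not confirm. It reports per-status deltas only; "actionable coverage" is left out because the plan does not define how it is derived.

```python
from collections import Counter

SCENE_COUNT = 102


def status_counts(records):
    """Count records per reconciledStatus value."""
    return Counter(record["reconciledStatus"] for record in records)


def coverage_delta(baseline, followup):
    """Return follow-up minus baseline counts for every status seen in
    either sweep, after checking that both cover exactly 102 scenes."""
    for name, records in (("baseline", baseline), ("follow-up", followup)):
        if len(records) != SCENE_COUNT:
            raise ValueError(f"{name}: expected {SCENE_COUNT} scenes, found {len(records)}")
    before = status_counts(baseline)
    after = status_counts(followup)
    return {
        status: after[status] - before[status]
        for status in sorted(set(before) | set(after))
    }
```

A status whose delta is `0`, such as an unchanged adjudicated host-bridge count, still appears in the result, so the report shows what stayed fixed as well as what moved.
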
### Deliverables

1. `tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json`
2. `docs/superpowers/reports/2026-04-19-structured-fail-closed-improvement-coverage-delta-report.md`
3. `docs/superpowers/reports/2026-04-19-structured-fail-closed-improvement-roadmap-closure-report.md`

### Acceptance Criteria

1. scene set remains exactly `102`
2. improvements are measured, not assumed
3. execution board remains unchanged
4. fail-closed count only drops when contracts close or become more specifically isolated

## Milestone Order

The order is fixed:

1. Phase 0: freeze fail-closed baseline
2. Phase 1: build inventory and taxonomy
3. Phase 2: G3 recovery slice
4. Phase 3: small-bucket recovery slice
5. Phase 4: bootstrap target isolation
6. Phase 5: follow-up sweep and delta

Do not start implementation before Phase 1 is complete.

Do not start small-bucket recovery before the G3 slice is completed or explicitly deferred with reasons.

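
The ordering rules above can be expressed as a small gate. This is a sketch under two assumptions that go slightly beyond the text: phases run strictly in sequence, and only Phase 2 (the G3 slice) may be skipped through an explicit deferral with a non-empty reason. The function name `may_start` is illustrative.

```python
DEFERRABLE_PHASE = 2  # the G3 recovery slice


def may_start(phase, done, deferred=None):
    """Return True when every earlier phase is done, or is the G3 slice
    deferred with a non-empty reason.

    done:     set of completed phase numbers (0 to 5)
    deferred: optional mapping of phase number to deferral reason
    """
    deferred = deferred or {}
    for earlier in range(phase):
        if earlier in done:
            continue
        reason = deferred.get(earlier, "")
        if earlier == DEFERRABLE_PHASE and reason.strip():
            continue
        return False
    return True
```

For example, `may_start(3, {0, 1})` is `False`, while `may_start(3, {0, 1}, {2: "no repeated recoverable pattern"})` is `True`.
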
## Completion Criteria

This roadmap is complete when:

1. all `48` structured fail-closed records are inventoried and labeled
2. the `35` G3 records are split into actionable contract-gap groups
3. at least the highest-value repeated recoverable pattern is either implemented or explicitly deferred
4. small buckets are inspected and at most one bounded slice is implemented
5. the bootstrap target remains isolated
6. a follow-up sweep quantifies coverage delta
7. no new family is introduced

## Stop Statement

Stop after the follow-up sweep, delta report, and closure report.

Do not automatically update the execution board or start another roadmap inside this plan.