# Scene Skill 102 Full Coverage Framework Design

> Date: 2026-04-19
> Status: Draft
> Upstream Roadmap: `docs/superpowers/plans/2026-04-17-scene-skill-60-to-90-roadmap-plan.md`
> Upstream Reconciliation: `tests/fixtures/generated_scene/full_sweep_status_reconciliation_2026-04-19.json`
> Upstream Follow-up: `tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json`
> Upstream Timeout Hygiene: `tests/fixtures/generated_scene/timeout_rerun_hygiene_integration_2026-04-19.json`

## Intent

Provide the single post-roadmap framework design for driving the current sgClaw scene-to-skill pipeline from partial `102` scene coverage to full bounded `102` scene coverage.

This design is intentionally broader than the bounded micro-plans used so far. It defines:

1. the current actual state of the `102` scene set
2. what is still missing before `100%` coverage can be claimed
3. the layered framework that all future changes must fit into
4. the fixed route order for future implementation work
5. the stop rules that prevent the project from drifting into unbounded plan recursion

This design is meant to become the single parent framework for later bounded plans.

## Current State

### Raw Current State

From the latest integrated assets:

| Status | Count |
| --- | ---: |
| `auto-pass` | 48 |
| `fail-closed-known` | 47 |
| `adjudicated-valid-host-bridge` | 4 |
| raw `source-unreadable` | 3 |
| Total | 102 |

### Timeout Hygiene Overlay

The timeout hygiene layer shows that the raw `3` timeout records are not all hard unreadable records:

| Hygiene-aware timeout interpretation | Count |
| --- | ---: |
| `timeout-as-pass-candidate` | 2 |
| `timeout-as-fail-closed-candidate` | 1 |
| `timeout-still-unreadable` | 0 |
| `timeout-rerun-error` | 0 |

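
The relationship between the raw counts and the hygiene overlay can be sketched as follows. This is a minimal illustration, not the project's sweep tooling: the variable names and the `apply_timeout_overlay` helper are hypothetical, and only the counts and status labels come from the two tables above. The overlay only reinterprets the three raw `source-unreadable` records; it does not move any of them into `auto-pass`.

```python
from collections import Counter

# Counts copied from the tables above; variable names are illustrative.
RAW_STATUS = Counter({
    "auto-pass": 48,
    "fail-closed-known": 47,
    "adjudicated-valid-host-bridge": 4,
    "source-unreadable": 3,
})

TIMEOUT_OVERLAY = Counter({
    "timeout-as-pass-candidate": 2,
    "timeout-as-fail-closed-candidate": 1,
    "timeout-still-unreadable": 0,
    "timeout-rerun-error": 0,
})


def apply_timeout_overlay(raw: Counter, overlay: Counter) -> Counter:
    """Replace the raw timeout bucket with its hygiene-aware breakdown.

    Candidates stay in their own buckets: a rerun success is an
    interpretation, not a promotion to auto-pass.
    """
    if sum(overlay.values()) != raw["source-unreadable"]:
        raise ValueError("overlay must account for every raw timeout record")
    view = Counter(raw)
    del view["source-unreadable"]
    for bucket, count in overlay.items():
        view[bucket] = count  # keep zero-count buckets visible in reports
    return view


integrated = apply_timeout_overlay(RAW_STATUS, TIMEOUT_OVERLAY)
```

The integrated view still sums to `102` scenes, and the `auto-pass` count stays at `48`.
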
### Interpretation

This means the framework has already reached these milestones:

1. there are no `unsupported-family` scenes in the current `102` sweep
2. there are no unresolved route conflicts left in the current `102` sweep
3. the remaining gap is no longer “framework cannot classify this scene”
4. the remaining gap is “contract does not close” or “timeout budget/hygiene distorts the raw reading”

## What Is Still Missing Before 100% Coverage

`100%` coverage does not mean all `102` scenes must become direct `auto-pass`.

For this framework, `100% bounded coverage` means:

1. every scene is classified into a supported framework path
2. every non-pass result is one of:
   - structured fail-closed with a named blocker
   - valid host-bridge workflow adjudication
   - hygiene-aware timeout interpretation
3. there are no unresolved buckets such as:
   - unsupported family
   - unresolved route conflict
   - opaque no-report failure
   - unexplained timeout

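
One way to read this definition is as a per-scene predicate. The sketch below is illustrative only: the `SceneResult` record and its field names are assumptions, not the project's report schema. The status strings are the ones used elsewhere in this document.

```python
from dataclasses import dataclass
from typing import Optional

# Non-pass outcomes that this definition accepts as covered.
RESOLVED_NON_PASS = {
    "fail-closed-known",
    "adjudicated-valid-host-bridge",
    "timeout-as-pass-candidate",
    "timeout-as-fail-closed-candidate",
}


@dataclass
class SceneResult:
    """Hypothetical per-scene record; field names are assumptions."""
    scene_id: str
    archetype: Optional[str]        # None means no supported framework path
    status: str
    blocker: Optional[str] = None   # named blocker for fail-closed results


def is_bounded_covered(scene: SceneResult) -> bool:
    """Apply the three conditions of bounded coverage to one scene."""
    if scene.archetype is None:
        return False  # unsupported family
    if scene.status == "auto-pass":
        return True
    if scene.status == "fail-closed-known":
        return scene.blocker is not None  # fail-closed must name its blocker
    # Anything else must be a recognised non-pass outcome; opaque
    # failures and unexplained timeouts fall through to False.
    return scene.status in RESOLVED_NON_PASS
```

A fail-closed scene without a named blocker is deliberately treated as not covered, which matches the "structured fail-closed with a named blocker" wording.
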
Under that definition, the remaining gaps are:

### Missing Gap A: Structured Contract Closure

There are still `47` structured fail-closed records.

Current distribution:

| Archetype | Count |
| --- | ---: |
| `paginated_enrichment` | 34 |
| `local_doc_pipeline` | 5 |
| `multi_mode_request` | 4 |
| `single_request_enrichment` | 2 |
| `host_bridge_workflow` | 1 |
| `page_state_eval` | 1 |

This is the largest remaining implementation gap.

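
As a quick arithmetic check on the distribution above, the following sketch confirms that the buckets sum to `47` and shows how dominant the largest bucket is. The dictionary name is illustrative; the counts are copied from the table.

```python
# Fail-closed counts per archetype, copied from the table above.
FAIL_CLOSED_BY_ARCHETYPE = {
    "paginated_enrichment": 34,
    "local_doc_pipeline": 5,
    "multi_mode_request": 4,
    "single_request_enrichment": 2,
    "host_bridge_workflow": 1,
    "page_state_eval": 1,
}

total = sum(FAIL_CLOSED_BY_ARCHETYPE.values())
largest = max(FAIL_CLOSED_BY_ARCHETYPE, key=FAIL_CLOSED_BY_ARCHETYPE.get)
shares = {name: count / total for name, count in FAIL_CLOSED_BY_ARCHETYPE.items()}

# Print buckets from largest to smallest share.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:28s} {share:6.1%}")
```

`paginated_enrichment` accounts for 34 of the 47 records, roughly 72% of the fail-closed set.
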
### Missing Gap B: Timeout Hygiene Integration into Main Reporting

The timeout hygiene layer now exists, but it is still a reporting-side overlay. It has not yet been folded into the primary current-state narrative used by later roadmap decisions.

### Missing Gap C: Current-State Overlay vs Execution Board

The project intentionally did not update `scene_execution_board_2026-04-18.json` during these bounded plans. That decision was correct, but it means the official board lags behind the latest integrated view.

### Missing Gap D: Promotion Policy

The project still lacks a single parent rule that says when a structured fail-closed scene may be promoted into a stronger scene-level coverage status from any of these states:

1. fail-closed
2. fail-closed with stronger evidence
3. bounded rerun pass candidate

## Framework Layers

All future work must land in exactly one of these layers.

### Layer A: Source Scan and Budget Layer

Purpose:

1. source directory size handling
2. file filtering
3. timeout budget policy
4. rerun hygiene

Owned concerns:

1. source scan volume
2. timeout policy
3. rerun interpretation

Must not own:

1. archetype routing
2. contract closure logic
3. scene promotion

Primary code area:

1. `src/generated_scene/analyzer.rs`
2. reporting JSON and sweep scripts

### Layer B: Archetype Routing Layer

Purpose:

1. decide the correct framework path:
   - `single_request_table`
   - `single_request_enrichment`
   - `multi_mode_request`
   - `paginated_enrichment`
   - `host_bridge_workflow`
   - `multi_endpoint_inventory`
   - `local_doc_pipeline`

Owned concerns:

1. route precedence
2. mixed-evidence routing boundaries
3. route adjudication support

Must not own:

1. timeout policy
2. contract synthesis beyond routing evidence
3. board reconciliation

Primary code area:

1. `src/generated_scene/analyzer.rs`

### Layer C: Contract Recovery Layer

Purpose:

Recover the minimum business contract fields needed by each supported archetype.

Owned concerns:

1. request contract recovery
2. response contract recovery
3. pagination plan recovery
4. enrichment request recovery
5. join key recovery
6. export plan recovery
7. mode matrix recovery

Must not own:

1. timeout policy
2. execution board updates
3. status promotion

Primary code area:

1. `src/generated_scene/generator.rs`
2. `src/generated_scene/ir.rs`

### Layer D: Structured Fail-Closed and Reporting Layer

Purpose:

Make every incomplete scene fail in an explainable and structured way.

Owned concerns:

1. readiness-before-report classification
2. blocker naming
3. `contractSnapshot`
4. generation-report completeness

Must not own:

1. route preference
2. source scan budget
3. promotion policy

Primary code area:

1. `src/generated_scene/generator.rs`
2. reporting assets under `tests/fixtures/generated_scene/`

### Layer E: Sweep, Reconciliation, and Coverage Layer

Purpose:

Measure the whole `102` scene set, reconcile multiple interpretation layers, and report trustworthy coverage.

Owned concerns:

1. full sweep outputs
2. route adjudication overlay
3. timeout hygiene overlay
4. integrated coverage reporting
5. board reconciliation planning

Must not own:

1. analyzer implementation changes
2. generator implementation changes

Primary assets:

1. `tests/fixtures/generated_scene/*full_sweep*`
2. `tests/fixtures/generated_scene/*reconciliation*`
3. `tests/fixtures/generated_scene/*timeout*hygiene*`
4. `docs/superpowers/reports/*coverage*`

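
The owned and must-not-own lists for Layers A to E can be encoded as data and used to check whether a proposed change fits the layer it claims. The sketch below is one possible encoding, not existing project code: the `Layer` type, the `check_plan` helper, and the snake_case concern identifiers are hypothetical paraphrases of the lists above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    title: str
    owns: frozenset
    must_not_own: frozenset


def _layer(title, owns, must_not_own):
    return Layer(title, frozenset(owns), frozenset(must_not_own))


# Concern identifiers paraphrase the lists above; they are illustrative.
LAYERS = {
    "A": _layer("Source Scan and Budget",
                ["source_scan_volume", "timeout_policy", "rerun_interpretation"],
                ["archetype_routing", "contract_closure", "scene_promotion"]),
    "B": _layer("Archetype Routing",
                ["route_precedence", "mixed_evidence_routing", "route_adjudication"],
                ["timeout_policy", "contract_synthesis", "board_reconciliation"]),
    "C": _layer("Contract Recovery",
                ["request_contract", "response_contract", "pagination_plan",
                 "enrichment_request", "join_key", "export_plan", "mode_matrix"],
                ["timeout_policy", "board_updates", "status_promotion"]),
    "D": _layer("Structured Fail-Closed and Reporting",
                ["readiness_classification", "blocker_naming",
                 "contract_snapshot", "report_completeness"],
                ["route_preference", "source_scan_budget", "promotion_policy"]),
    "E": _layer("Sweep, Reconciliation, and Coverage",
                ["full_sweep_outputs", "route_adjudication_overlay",
                 "timeout_hygiene_overlay", "coverage_reporting",
                 "board_reconciliation_planning"],
                ["analyzer_changes", "generator_changes"]),
}


def check_plan(layer_key: str, concerns: set) -> list:
    """Return the violations for a plan that claims exactly one layer."""
    layer = LAYERS[layer_key]
    problems = []
    for concern in sorted(concerns):
        if concern in layer.must_not_own:
            problems.append(f"{concern}: forbidden in Layer {layer_key}")
        elif concern not in layer.owns:
            problems.append(f"{concern}: not owned by Layer {layer_key}")
    return problems
```

For example, `check_plan("C", {"export_plan", "timeout_policy"})` reports `timeout_policy` as forbidden, because timeout policy belongs to Layer A. Note that source files alone do not identify a layer: `analyzer.rs` serves both Layer A and Layer B, so the check is keyed on concerns rather than paths.
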
## Coverage Definitions

This framework uses four explicit coverage concepts.

### Coverage 1: Direct Pass Coverage

Scenes with direct `auto-pass`.

Current count: `48 / 102`

### Coverage 2: Framework-Resolved Coverage

Scenes in one of:

1. `auto-pass`
2. `adjudicated-valid-host-bridge`
3. structured `fail-closed-known`
4. hygiene-aware timeout interpretation

This is the best measure of whether the framework has “caught” the scene set.

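
The first two coverage figures follow directly from the Current State tables. The sketch below works through the arithmetic, assuming that all three raw timeout records count as hygiene-aware interpretations (the overlay table lists none as still unreadable). Variable names are illustrative.

```python
TOTAL_SCENES = 102

# Counts taken from the Current State tables in this document.
direct_pass = 48
adjudicated_host_bridge = 4
fail_closed_known = 47
hygiene_interpreted_timeouts = 2 + 1  # pass candidates + fail-closed candidate

framework_resolved = (direct_pass + adjudicated_host_bridge
                      + fail_closed_known + hygiene_interpreted_timeouts)

direct_pass_coverage = direct_pass / TOTAL_SCENES
framework_resolved_coverage = framework_resolved / TOTAL_SCENES

print(f"direct pass:        {direct_pass}/{TOTAL_SCENES} = {direct_pass_coverage:.1%}")
print(f"framework-resolved: {framework_resolved}/{TOTAL_SCENES} = {framework_resolved_coverage:.1%}")
```

Under that assumption, direct pass coverage is about 47% while framework-resolved coverage already reaches 102 of 102, which is why the two metrics are tracked separately.
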
### Coverage 3: Promotion Coverage

Scenes already represented as promoted or boundary family assets in current project assets.

This is lower than framework-resolved coverage because promotion is intentionally conservative.

### Coverage 4: Real-Sample Execution Coverage

Scenes that have actual selected and executed real-sample validation records.

This is the strictest coverage metric.

## Fixed Route Order for Future Work

Future work must follow this order.

### Route 1: Finish Layer E Hygiene Integration

Goal:

Make sweep and reconciliation reporting hygiene-aware by default.

This route is nearly finished and should be closed first.

### Route 2: `G3 / paginated_enrichment` Contract Closure

Goal:

Work down the largest remaining structured fail-closed bucket.

Why second (first among the contract-closure routes):

1. largest bucket by count
2. most important for closing the remaining `102` gap
3. already split into repeated missing-contract patterns

Expected sub-order:

1. `enrichment_request_missing`
2. `export_plan_missing`
3. then any remaining `join_key` or runtime-scope style gaps

### Route 3: `G2 / multi_mode_request` Small-Bucket Closure

Goal:

Close the remaining `4` multi-mode structured fail-closed records.

Why third:

1. clear archetype
2. relatively small bucket
3. mainline family already has a real-sample pass anchor

### Route 4: `G1-E / single_request_enrichment` Small-Bucket Closure

Goal:

Close the remaining `2` G1-E structured fail-closed records.

Why fourth:

1. smallest mainline bucket
2. framework anchor already exists
3. lower leverage than G3 and G2

### Route 5: Decide on `local_doc_pipeline` and `host_bridge_workflow`

Goal:

Handle the remaining boundary-family fail-closed records only after the mainline buckets are reduced.

This route must not start before Routes 2–4 have completed or been explicitly deferred.

### Route 6: Reconciliation and Board Promotion Policy

Goal:

Define how stronger framework-resolved statuses can update the execution board without over-promoting scenes.

This must be done only after contract-closure routes have produced stable deltas.

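
The route order can be expressed as a simple gate. The sketch below is a simplification under a stated assumption: it requires every earlier route to be completed or explicitly deferred before a later one starts. The document states that condition outright only for Route 5 (relative to Routes 2–4), and Route 6 additionally depends on "stable deltas", which this sketch does not model. `RouteState` and `may_start` are hypothetical names.

```python
from enum import Enum


class RouteState(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    DEFERRED = "deferred"  # must be an explicit, recorded decision


ROUTE_ORDER = [1, 2, 3, 4, 5, 6]
SETTLED = {RouteState.COMPLETED, RouteState.DEFERRED}


def may_start(route: int, states: dict) -> bool:
    """Return True if `route` may start given the other routes' states.

    Simplifying assumption: every earlier route must be completed or
    explicitly deferred. Routes absent from `states` count as pending.
    """
    earlier = ROUTE_ORDER[:ROUTE_ORDER.index(route)]
    return all(states.get(r, RouteState.PENDING) in SETTLED for r in earlier)
```

With Routes 1 and 2 completed and Route 3 still in progress, `may_start(5, ...)` returns `False`; it becomes `True` once Route 3 is deferred and Route 4 is completed.
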
## What Future Plans Must Contain

Every later bounded implementation plan must explicitly declare:

1. which framework layer it belongs to
2. which route from this design it belongs to
3. which code modules it is allowed to touch
4. which code modules it must not touch
5. how it protects current real-sample and canonical passes
6. what exact delta it expects to produce in the `102` scene state

If a future plan cannot answer those six items, it is out of framework and should not start.

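
The six required declarations lend themselves to a small checklist structure. The following is a hypothetical sketch, not an existing project schema: `PlanDeclaration`, its field names, and `missing_declarations` are assumptions introduced for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class PlanDeclaration:
    """Hypothetical record of the six required declarations."""
    layer: str = ""                 # 1. framework layer
    route: str = ""                 # 2. route from this design
    allowed_modules: tuple = ()     # 3. modules the plan may touch
    forbidden_modules: tuple = ()   # 4. modules the plan must not touch
    pass_protection: str = ""       # 5. how current passes are protected
    expected_delta: str = ""        # 6. expected change in the 102 scene state


def missing_declarations(plan: PlanDeclaration) -> list:
    """Names of declarations left empty; a non-empty result means do not start."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name)]
```

A plan that only names its layer would come back with five missing items and, by the rule above, should not start.
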
## Stop Rules

The framework forbids:

1. starting a new micro-plan that only renames a narrower semantics problem without moving toward completing a route
2. treating timeout rerun success as promotion
3. updating execution board state inside a diagnostic plan
4. opening `G4/G5` before the current structured fail-closed mainline is reduced
5. using prompt-only tuning as a substitute for contract recovery

## What 100% Looks Like

This framework considers `100% bounded coverage` achieved when:

1. `unsupported-family = 0`
2. `missing-source = 0`
3. `misclassified-unresolved = 0`
4. `timeout-still-unreadable = 0`
5. every remaining non-pass scene is structured and attributable to a supported framework path
6. execution board and reconciliation reporting can express the current scene state without ambiguity

This is different from `100% auto-pass`.

`100% auto-pass` is not the immediate target.

`100% bounded framework coverage` is the immediate target.

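
The four zero-count conditions in the list above are mechanical and could be checked against a sweep summary. The sketch below assumes a plain dictionary of counters keyed by the names used in this document; the `bounded_coverage_blockers` helper is hypothetical. Conditions 5 and 6 are qualitative and are not covered by it.

```python
# Counters that must all be zero, as listed above.
ZERO_REQUIRED = (
    "unsupported-family",
    "missing-source",
    "misclassified-unresolved",
    "timeout-still-unreadable",
)


def bounded_coverage_blockers(counts: dict) -> list:
    """Return the counters that still block a 100% bounded coverage claim.

    A counter missing from `counts` is treated as unknown and therefore
    blocking, so an incomplete report cannot pass silently.
    """
    return [name for name in ZERO_REQUIRED if counts.get(name) != 0]
```

An empty result means the numeric conditions hold; it says nothing about the `auto-pass` count, which is intentionally not part of this target.
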