Scene Skill 102 Full Coverage Framework Design

Date: 2026-04-19
Status: Draft
Upstream Roadmap: docs/superpowers/plans/2026-04-17-scene-skill-60-to-90-roadmap-plan.md
Upstream Reconciliation: tests/fixtures/generated_scene/full_sweep_status_reconciliation_2026-04-19.json
Upstream Follow-up: tests/fixtures/generated_scene/structured_fail_closed_improvement_followup_2026-04-19.json
Upstream Timeout Hygiene: tests/fixtures/generated_scene/timeout_rerun_hygiene_integration_2026-04-19.json

Intent

Provide the single post-roadmap framework design for driving the current sgClaw scene-to-skill pipeline from partial 102 scene coverage to full bounded 102 scene coverage.

This design is intentionally broader than the bounded micro-plans used so far. It defines:

  1. the current actual state of the 102 scene set
  2. what is still missing before 100% coverage can be claimed
  3. the layered framework that all future changes must fit into
  4. the fixed route order for future implementation work
  5. the stop rules that prevent the project from drifting into unbounded plan recursion

This design is meant to become the single parent framework for later bounded plans.

Current State

Raw Current State

From the latest integrated assets:

| Status | Count |
| --- | --- |
| auto-pass | 48 |
| fail-closed-known | 47 |
| adjudicated-valid-host-bridge | 4 |
| raw source-unreadable | 3 |
| Total | 102 |

Timeout Hygiene Overlay

The timeout hygiene layer shows that the 3 raw source-unreadable (timeout) records are not all hard-unreadable records:

| Hygiene-aware timeout interpretation | Count |
| --- | --- |
| timeout-as-pass-candidate | 2 |
| timeout-as-fail-closed-candidate | 1 |
| timeout-still-unreadable | 0 |
| timeout-rerun-error | 0 |
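
The two tables above should reconcile: the raw counts should sum to the 102 scenes, and the hygiene overlay should re-label each raw unreadable record exactly once. The following is a minimal sketch of that consistency check. All type and function names (`RawStatusCounts`, `TimeoutOverlay`, `check_overlay`, `current_state`) are hypothetical and are not taken from the sgClaw code base; only the counts come from this document.

```rust
/// Raw status counts from the sweep (hypothetical type, illustrative field names).
#[derive(Debug, Clone, Copy)]
pub struct RawStatusCounts {
    pub auto_pass: u32,
    pub fail_closed_known: u32,
    pub adjudicated_valid_host_bridge: u32,
    pub raw_source_unreadable: u32,
}

/// Hygiene-aware reinterpretation of the raw timeout records (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub struct TimeoutOverlay {
    pub pass_candidate: u32,
    pub fail_closed_candidate: u32,
    pub still_unreadable: u32,
    pub rerun_error: u32,
}

impl RawStatusCounts {
    pub fn total(&self) -> u32 {
        self.auto_pass
            + self.fail_closed_known
            + self.adjudicated_valid_host_bridge
            + self.raw_source_unreadable
    }
}

impl TimeoutOverlay {
    pub fn total(&self) -> u32 {
        self.pass_candidate + self.fail_closed_candidate + self.still_unreadable + self.rerun_error
    }
}

/// Counts copied from the two tables in this section.
pub fn current_state() -> (RawStatusCounts, TimeoutOverlay) {
    (
        RawStatusCounts {
            auto_pass: 48,
            fail_closed_known: 47,
            adjudicated_valid_host_bridge: 4,
            raw_source_unreadable: 3,
        },
        TimeoutOverlay {
            pass_candidate: 2,
            fail_closed_candidate: 1,
            still_unreadable: 0,
            rerun_error: 0,
        },
    )
}

/// Checks that the raw counts cover the whole sweep and that the overlay
/// accounts for every raw unreadable record exactly once.
pub fn check_overlay(
    raw: &RawStatusCounts,
    overlay: &TimeoutOverlay,
    expected_scenes: u32,
) -> Result<(), String> {
    if raw.total() != expected_scenes {
        return Err(format!("raw total {} != expected {}", raw.total(), expected_scenes));
    }
    if overlay.total() != raw.raw_source_unreadable {
        return Err(format!(
            "overlay total {} != raw unreadable {}",
            overlay.total(),
            raw.raw_source_unreadable
        ));
    }
    Ok(())
}
```

A sweep script could run a check of this shape before publishing an integrated view, so that an overlay that drops or double-counts a timeout record is caught early.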

Interpretation

This means the framework has already reached these milestones:

  1. there are no unsupported-family scenes in the current 102 sweep
  2. there are no unresolved route conflicts left in the current 102 sweep
  3. the remaining gap is no longer “framework cannot classify this scene”
  4. the remaining gap is “contract does not close” or “timeout budget/hygiene distorts the raw reading”

What Is Still Missing Before 100% Coverage

100% coverage does not mean all 102 scenes must become direct auto-pass.

For this framework, 100% bounded coverage means:

  1. every scene is classified into a supported framework path
  2. every non-pass result is one of:
    • structured fail-closed with named blocker
    • valid host-bridge workflow adjudication
    • hygiene-aware timeout interpretation
  3. there are no unresolved buckets like:
    • unsupported family
    • unresolved route conflict
    • opaque no-report failure
    • unexplained timeout
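
The definition above can be read as a per-scene predicate. The sketch below is one possible encoding, not the project's actual data model: the `SceneOutcome` enum and its variant names are hypothetical, and the rule that a fail-closed record needs a non-empty blocker name is an interpretation of "structured fail-closed with named blocker".

```rust
/// Possible outcomes for one scene (hypothetical enum; variants mirror the lists above).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SceneOutcome {
    AutoPass,
    /// Structured fail-closed; `blocker` is the named blocker.
    StructuredFailClosed { blocker: String },
    HostBridgeAdjudicated,
    HygieneAwareTimeout,
    // Unresolved buckets that must be empty for 100% bounded coverage.
    UnsupportedFamily,
    UnresolvedRouteConflict,
    OpaqueNoReportFailure,
    UnexplainedTimeout,
}

impl SceneOutcome {
    /// True when the scene counts toward bounded coverage.
    pub fn is_bounded_covered(&self) -> bool {
        match self {
            SceneOutcome::AutoPass
            | SceneOutcome::HostBridgeAdjudicated
            | SceneOutcome::HygieneAwareTimeout => true,
            // Assumption: a fail-closed record without a blocker name does not count.
            SceneOutcome::StructuredFailClosed { blocker } => !blocker.trim().is_empty(),
            SceneOutcome::UnsupportedFamily
            | SceneOutcome::UnresolvedRouteConflict
            | SceneOutcome::OpaqueNoReportFailure
            | SceneOutcome::UnexplainedTimeout => false,
        }
    }
}

/// True when every scene in the sweep is bounded-covered.
pub fn all_bounded_covered(outcomes: &[SceneOutcome]) -> bool {
    outcomes.iter().all(|outcome| outcome.is_bounded_covered())
}
```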

Under that definition, the missing gap is:

Missing Gap A: Structured Contract Closure

There are still 47 structured fail-closed records.

Current distribution:

| Archetype | Count |
| --- | --- |
| paginated_enrichment | 34 |
| local_doc_pipeline | 5 |
| multi_mode_request | 4 |
| single_request_enrichment | 2 |
| host_bridge_workflow | 1 |
| page_state_eval | 1 |

This is the largest remaining implementation gap.
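
As a quick cross-check, the distribution should sum to the 47 fail-closed records, and the largest bucket should be the one that dominates the remaining work. A small sketch follows; the constant and helper names are hypothetical, and the counts are copied from the distribution above.

```rust
/// Structured fail-closed records per archetype, copied from the distribution above.
pub const FAIL_CLOSED_BUCKETS: [(&str, u32); 6] = [
    ("paginated_enrichment", 34),
    ("local_doc_pipeline", 5),
    ("multi_mode_request", 4),
    ("single_request_enrichment", 2),
    ("host_bridge_workflow", 1),
    ("page_state_eval", 1),
];

/// Sum of all bucket counts.
pub fn bucket_total(buckets: &[(&str, u32)]) -> u32 {
    buckets.iter().map(|(_, count)| *count).sum()
}

/// The bucket with the highest count, or `None` for an empty slice.
pub fn largest_bucket<'a>(buckets: &[(&'a str, u32)]) -> Option<(&'a str, u32)> {
    buckets.iter().copied().max_by_key(|(_, count)| *count)
}
```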

Missing Gap B: Timeout Hygiene Integration into Main Reporting

The timeout hygiene layer now exists, but it is still a reporting-side overlay. It has not yet been folded into the primary current-state narrative used by later roadmap decisions.

Missing Gap C: Current-State Overlay vs Execution Board

The project intentionally did not update scene_execution_board_2026-04-18.json during these bounded plans. That is correct, but it means the official board is still behind the latest integrated view.

Missing Gap D: Promotion Policy

The project still lacks a single parent rule that says when a structured fail-closed scene may be promoted from:

  1. fail-closed
  2. fail-closed with stronger evidence
  3. bounded rerun pass candidate

into a stronger scene-level coverage status.

Framework Layers

All future work must land in exactly one of these layers.

Layer A: Source Scan and Budget Layer

Purpose:

  1. source directory size handling
  2. file filtering
  3. timeout budget policy
  4. rerun hygiene

Owned concerns:

  1. source scan volume
  2. timeout policy
  3. rerun interpretation

Must not own:

  1. archetype routing
  2. contract closure logic
  3. scene promotion

Primary code area:

  1. src/generated_scene/analyzer.rs
  2. reporting JSON and sweep scripts

Layer B: Archetype Routing Layer

Purpose:

  1. decide the correct framework path:
    • single_request_table
    • single_request_enrichment
    • multi_mode_request
    • paginated_enrichment
    • host_bridge_workflow
    • multi_endpoint_inventory
    • local_doc_pipeline

Owned concerns:

  1. route precedence
  2. mixed-evidence routing boundaries
  3. route adjudication support

Must not own:

  1. timeout policy
  2. contract synthesis beyond routing evidence
  3. board reconciliation

Primary code area:

  1. src/generated_scene/analyzer.rs

Layer C: Contract Recovery Layer

Purpose:

Recover the minimum business contract fields needed by each supported archetype.

Owned concerns:

  1. request contract recovery
  2. response contract recovery
  3. pagination plan recovery
  4. enrichment request recovery
  5. join key recovery
  6. export plan recovery
  7. mode matrix recovery

Must not own:

  1. timeout policy
  2. execution board updates
  3. status promotion

Primary code area:

  1. src/generated_scene/generator.rs
  2. src/generated_scene/ir.rs

Layer D: Structured Fail-Closed and Reporting Layer

Purpose:

Make every incomplete scene fail in an explainable and structured way.

Owned concerns:

  1. readiness-before-report classification
  2. blocker naming
  3. contractSnapshot
  4. generation-report completeness

Must not own:

  1. route preference
  2. source scan budget
  3. promotion policy

Primary code area:

  1. src/generated_scene/generator.rs
  2. reporting assets under tests/fixtures/generated_scene/

Layer E: Sweep, Reconciliation, and Coverage Layer

Purpose:

Measure the whole 102 scene set, reconcile multiple interpretation layers, and report trustworthy coverage.

Owned concerns:

  1. full sweep outputs
  2. route adjudication overlay
  3. timeout hygiene overlay
  4. integrated coverage reporting
  5. board reconciliation planning

Must not own:

  1. analyzer implementation changes
  2. generator implementation changes

Primary assets:

  1. tests/fixtures/generated_scene/*full_sweep*
  2. tests/fixtures/generated_scene/*reconciliation*
  3. tests/fixtures/generated_scene/*timeout*hygiene*
  4. docs/superpowers/reports/*coverage*
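
The layer boundaries above could be enforced mechanically by checking the files a plan touches against its declared layer. The sketch below is a simplification under stated assumptions: the `Layer` enum and `out_of_layer_paths` helper are hypothetical, the glob patterns listed for Layer E are reduced to directory prefixes, and the unnamed "reporting JSON and sweep scripts" of Layer A are left out because this document does not give their paths.

```rust
/// The five framework layers (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Layer {
    SourceScanBudget,    // Layer A
    ArchetypeRouting,    // Layer B
    ContractRecovery,    // Layer C
    FailClosedReporting, // Layer D
    SweepReconciliation, // Layer E
}

impl Layer {
    /// Path prefixes taken from each layer's primary code area or assets.
    /// Assumption: prefix matching is a coarse stand-in for the listed globs.
    pub fn allowed_prefixes(self) -> &'static [&'static str] {
        match self {
            Layer::SourceScanBudget | Layer::ArchetypeRouting => {
                &["src/generated_scene/analyzer.rs"]
            }
            Layer::ContractRecovery => {
                &["src/generated_scene/generator.rs", "src/generated_scene/ir.rs"]
            }
            Layer::FailClosedReporting => {
                &["src/generated_scene/generator.rs", "tests/fixtures/generated_scene/"]
            }
            Layer::SweepReconciliation => {
                &["tests/fixtures/generated_scene/", "docs/superpowers/reports/"]
            }
        }
    }
}

/// Returns the touched paths that fall outside the declared layer.
pub fn out_of_layer_paths<'a>(layer: Layer, touched: &[&'a str]) -> Vec<&'a str> {
    let allowed = layer.allowed_prefixes();
    touched
        .iter()
        .copied()
        .filter(|path| !allowed.iter().any(|prefix| path.starts_with(*prefix)))
        .collect()
}
```

For example, a Layer C plan that also edits `src/generated_scene/analyzer.rs` would have that path reported back, which signals that the plan spans more than one layer.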

Coverage Definitions

This framework uses four explicit coverage concepts.

Coverage 1: Direct Pass Coverage

Scenes with direct auto-pass.

Current count:

48 / 102

Coverage 2: Framework-Resolved Coverage

Scenes in one of:

  1. auto-pass
  2. adjudicated-valid-host-bridge
  3. structured fail-closed-known
  4. hygiene-aware timeout interpretation

This is the best measure of whether the framework has “caught” the scene set.

Coverage 3: Promotion Coverage

Scenes already represented as promoted or boundary family assets in current project assets.

This is lower than framework-resolved coverage because promotion is intentionally conservative.

Coverage 4: Real-Sample Execution Coverage

Scenes that have actual selected and executed real-sample validation records.

This is the strictest coverage metric.
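
The first two coverage concepts can be computed directly from the status counts. The sketch below uses hypothetical names (`Coverage`, `direct_pass`, `framework_resolved`). How many timeout records count as "hygiene-aware timeout interpretation" is passed in as a parameter, because this document does not state that number explicitly.

```rust
/// A covered/total pair (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Coverage {
    pub covered: u32,
    pub total: u32,
}

impl Coverage {
    /// Coverage as a percentage; 0.0 when the total is zero.
    pub fn percent(&self) -> f64 {
        if self.total == 0 {
            0.0
        } else {
            100.0 * self.covered as f64 / self.total as f64
        }
    }
}

/// Coverage 1: scenes with direct auto-pass.
pub fn direct_pass(auto_pass: u32, total: u32) -> Coverage {
    Coverage { covered: auto_pass, total }
}

/// Coverage 2: scenes in any framework-resolved status.
pub fn framework_resolved(
    auto_pass: u32,
    host_bridge: u32,
    fail_closed_known: u32,
    hygiene_interpreted_timeouts: u32,
    total: u32,
) -> Coverage {
    Coverage {
        covered: auto_pass + host_bridge + fail_closed_known + hygiene_interpreted_timeouts,
        total,
    }
}
```

With the counts from Current State, `direct_pass(48, 102)` gives 48 of 102. If all 3 timeout records are treated as hygiene-interpreted (an assumption based on the overlay showing 0 still-unreadable and 0 rerun-error records), `framework_resolved(48, 4, 47, 3, 102)` gives 102 of 102.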

Fixed Route Order for Future Work

Future work must follow this order.

Route 1: Finish Layer E Hygiene Integration

Goal:

Make sweep and reconciliation reporting hygiene-aware by default.

This route is nearly finished and should be closed first.

Route 2: G3 / paginated_enrichment Contract Closure

Goal:

Work down the largest remaining structured fail-closed bucket.

Why second:

  1. largest bucket by count
  2. most important for closing the remaining 102 gap
  3. already split into repeated missing-contract patterns

Expected sub-order:

  1. enrichment_request_missing
  2. export_plan_missing
  3. then any remaining join_key or runtime-scope style gaps

Route 3: G2 / multi_mode_request Small-Bucket Closure

Goal:

Close the remaining 4 multi-mode structured fail-closed records.

Why third:

  1. clear archetype
  2. relatively small bucket
  3. mainline family already has a real-sample pass anchor

Route 4: G1-E / single_request_enrichment Small-Bucket Closure

Goal:

Close the remaining 2 G1-E structured fail-closed records.

Why fourth:

  1. smallest mainline bucket
  2. framework anchor already exists
  3. lower leverage than G3 and G2

Route 5: Decide on local_doc_pipeline and host_bridge_workflow

Goal:

Handle the remaining boundary-family fail-closed records only after the mainline buckets are reduced.

This route must not start before Routes 2-4 have completed or been explicitly deferred.

Route 6: Reconciliation and Board Promotion Policy

Goal:

Define how stronger framework-resolved statuses can update the execution board without over-promoting scenes.

This must be done only after contract-closure routes have produced stable deltas.
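
The fixed order can be expressed as a gating check. The sketch below takes the strictest reading, in which a route may start only once every earlier route is completed or explicitly deferred. That reading is an assumption: this document states the rule explicitly only for Route 5, and Route 6's "stable deltas" condition is approximated here by completion. `RouteState`, `is_settled`, and `can_start` are hypothetical names.

```rust
/// Progress state of one route (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RouteState {
    NotStarted,
    InProgress,
    Completed,
    Deferred,
}

/// A route no longer blocks later routes once it is completed or explicitly deferred.
pub fn is_settled(state: RouteState) -> bool {
    matches!(state, RouteState::Completed | RouteState::Deferred)
}

/// `states[i]` holds the state of Route `i + 1`.
/// Returns true when `route` (1-based) exists and all earlier routes are settled.
pub fn can_start(route: usize, states: &[RouteState]) -> bool {
    if route == 0 || route > states.len() {
        return false;
    }
    states[..route - 1].iter().all(|state| is_settled(*state))
}
```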

What Future Plans Must Contain

Every later bounded implementation plan must explicitly declare:

  1. which framework layer it belongs to
  2. which route from this design it belongs to
  3. which code modules it is allowed to touch
  4. which code modules it must not touch
  5. how it protects current real-sample and canonical passes
  6. what exact delta it expects to produce in the 102 scene state

If a future plan cannot answer those six items, it is out of framework and should not start.
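
The six required declarations lend themselves to a simple admission check. The following sketch shows one possible shape. The `PlanDeclaration` struct and its field names are hypothetical, and this document does not prescribe a storage format for plans.

```rust
/// The six items every bounded plan must declare (hypothetical struct).
#[derive(Debug, Default, Clone)]
pub struct PlanDeclaration {
    pub layer: Option<String>,
    pub route: Option<u8>,
    pub allowed_modules: Vec<String>,
    pub forbidden_modules: Vec<String>,
    pub pass_protection: Option<String>,
    pub expected_delta: Option<String>,
}

/// Returns the names of the required items that the plan leaves unanswered.
pub fn missing_items(plan: &PlanDeclaration) -> Vec<&'static str> {
    let mut missing = Vec::new();
    if plan.layer.is_none() {
        missing.push("framework layer");
    }
    if plan.route.is_none() {
        missing.push("route");
    }
    if plan.allowed_modules.is_empty() {
        missing.push("allowed code modules");
    }
    if plan.forbidden_modules.is_empty() {
        missing.push("forbidden code modules");
    }
    if plan.pass_protection.is_none() {
        missing.push("protection of current passes");
    }
    if plan.expected_delta.is_none() {
        missing.push("expected delta in the 102 scene state");
    }
    missing
}

/// A plan is in framework only when all six items are answered.
pub fn is_in_framework(plan: &PlanDeclaration) -> bool {
    missing_items(plan).is_empty()
}
```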

Stop Rules

The framework forbids:

  1. starting a new micro-plan that only renames a narrower semantics problem without moving toward a route completion
  2. treating timeout rerun success as promotion
  3. updating execution board state inside a diagnostic plan
  4. opening G4/G5 before the current structured fail-closed mainline is reduced
  5. using prompt-only tuning as a substitute for contract recovery

What 100% Looks Like

This framework considers 100% bounded coverage achieved when:

  1. unsupported-family = 0
  2. missing-source = 0
  3. misclassified-unresolved = 0
  4. timeout-still-unreadable = 0
  5. every remaining non-pass scene is structured and attributable to a supported framework path
  6. execution board and reconciliation reporting can express the current scene state without ambiguity

This is different from 100% auto-pass.

100% auto-pass is not the immediate target.

100% bounded framework coverage is the immediate target.