Generated Scene Source-First Runtime Semantics Ledger Design

Date: 2026-04-20
Status: Draft

Parent roadmap:

  • docs/superpowers/plans/2026-04-20-generated-scene-source-first-runtime-semantics-hardening-plan.md

Upstream scan:

  • docs/superpowers/plans/2026-04-20-generated-scene-source-evidence-cross-scan-plan.md

Intent

Define the second bounded child step of the source-first runtime semantics hardening roadmap:

merge source-side evidence with generated-skill evidence into a full 102-scene runtime-semantics ledger

This design is still analysis-only. It does not modify src/, generated skills, validation assets, or execution-board state.

Objective

For every scene in the current 102-scene set:

  1. merge source-side evidence from the completed cross-scan
  2. compare that evidence against current generated skill manifests and references
  3. assign one or more canonical runtime-semantics gap classes
  4. assign a bounded riskLevel
  5. distinguish between:
    • reusable generator-level rule gap
    • runtime-only residual
  6. publish a source-first runtime-semantics ledger that becomes the only valid input for later hardening-route design

Fixed Gap Taxonomy

The ledger must continue using the five gap classes already anchored by sweep-030-scene:

  1. invocation_alias_gap
  2. dictionary_recovery_gap
  3. parameter_default_semantics_gap
  4. resolver_to_request_mapping_gap
  5. runtime_url_semantics_gap

No additional gap class should be invented inside this ledger stage unless the evidence is clearly outside these five and cannot be expressed as a subtype.
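
One way to keep the taxonomy closed during the ledger build is to validate every gap label against a fixed enumeration. The following is a minimal sketch, not part of this design: the names `GapClass` and `parse_gaps` are hypothetical, and only the five string values come from the list above.

```python
from enum import Enum


class GapClass(str, Enum):
    """The five gap classes named in the fixed taxonomy."""
    INVOCATION_ALIAS = "invocation_alias_gap"
    DICTIONARY_RECOVERY = "dictionary_recovery_gap"
    PARAMETER_DEFAULT_SEMANTICS = "parameter_default_semantics_gap"
    RESOLVER_TO_REQUEST_MAPPING = "resolver_to_request_mapping_gap"
    RUNTIME_URL_SEMANTICS = "runtime_url_semantics_gap"


def parse_gaps(raw_labels):
    """Convert raw labels to GapClass members (hypothetical helper).

    Raises ValueError for any label outside the five fixed classes, so a
    new class cannot enter the ledger silently.
    """
    allowed = {g.value for g in GapClass}
    unknown = [label for label in raw_labels if label not in allowed]
    if unknown:
        raise ValueError(f"labels outside the fixed taxonomy: {unknown}")
    return [GapClass(label) for label in raw_labels]
```

Rejecting unknown labels outright, rather than collecting them, forces the "cannot be expressed as a subtype" decision to be made explicitly by a reviewer.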

Scope

In scope:

  1. the completed source cross-scan asset
  2. the current final generated skills under examples/scene_skill_102_final_materialization_2026-04-19/skills
  3. current deterministic invocation readiness assets
  4. current natural-language parameter readiness assets
  5. current parameter dictionary normalization assets
  6. source-to-generated comparison for all 102 scenes
  7. JSON ledger + human-readable report

Out of scope:

  1. any change in src/
  2. any skill manifest or script edit
  3. any rematerialization
  4. any validation rerun
  5. any inner-network execution

Required Comparisons

The ledger stage must compare source evidence with generated output along these axes.

1. Invocation alias comparison

Check whether source-side operator wording, labels, route names, or titles imply broader natural-language coverage than the current generated include_keywords.
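
The alias check above can be approximated as a set difference between source-side wording and the generated keywords. This is a sketch under stated assumptions: `missing_aliases` is a hypothetical helper, and matching by case-insensitive, whitespace-normalized equality is a simplification that a real comparison may need to loosen.

```python
def missing_aliases(source_terms, include_keywords):
    """Return source-side terms that no generated keyword covers.

    Assumption: a term counts as covered only when it equals a keyword
    after lowercasing and collapsing whitespace.
    """
    def normalize(text):
        return " ".join(text.lower().split())

    known = {normalize(keyword) for keyword in include_keywords}
    seen = set()
    missing = []
    for term in source_terms:
        key = normalize(term)
        if key and key not in known and key not in seen:
            seen.add(key)
            missing.append(term.strip())
    return missing
```

A non-empty result is evidence for `invocation_alias_gap`; an empty result does not prove alignment, since synonyms are not detected by exact matching.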

2. Dictionary comparison

Check whether source-side dictionaries, trees, or option arrays imply a richer entity dictionary than the generated references/*dictionary*.json assets currently expose.
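
The dictionary comparison can be sketched as a coverage measurement. The helper name `dictionary_gap` and the flat list-of-values input are assumptions; real source-side trees would first need to be flattened into comparable values.

```python
def dictionary_gap(source_values, generated_values):
    """Compare a source-side dictionary with a generated one.

    Returns the source values absent from the generated dictionary and
    the fraction of source values that the generated dictionary covers.
    """
    source = set(source_values)
    generated = set(generated_values)
    missing = sorted(source - generated)
    coverage = 1.0 if not source else len(source & generated) / len(source)
    return {"missing": missing, "coverage": coverage}
```

A coverage well below 1.0 would match the "starter-subset" pattern, where the generated dictionary holds only part of what the source exposes.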

3. Parameter default semantics comparison

Check whether source-side date / period / mode initialization implies a default-value policy that the generated manifest or resolver metadata does not currently preserve.

4. Resolver-to-request mapping comparison

Check whether source-side request field names differ from generated resolver output names and whether the generated skill currently encodes an explicit mapping.
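
The mapping check can be illustrated as follows. This is a sketch only: `unmapped_request_fields` and the shape of `explicit_mapping` (resolver output name to request field name) are assumptions, not the generated skill's actual format.

```python
def unmapped_request_fields(request_fields, resolver_outputs, explicit_mapping):
    """Return source request fields that no resolver output reaches.

    A resolver output reaches a request field either through an entry in
    explicit_mapping or, when no entry exists, by having the same name.
    """
    reachable = set()
    for name in resolver_outputs:
        reachable.add(explicit_mapping.get(name, name))
    return sorted(set(request_fields) - reachable)
```

For example, with hypothetical names, a source request field `orgCode` and a resolver output `org_code` are flagged unless the skill encodes `{"org_code": "orgCode"}`.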

5. Runtime URL comparison

Check whether source-side evidence implies multiple URL roles:

  1. app entry URL
  2. module route URL
  3. API endpoint URL
  4. runtime browser context URL

and whether the generated skill currently collapses those roles into a single ambiguous target.
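
The role-collapse condition can be expressed as a small predicate. The sketch below is illustrative: `UrlRole` and `collapses_url_roles` are hypothetical names, and treating "collapse" as "the source shows more than one distinct URL while the generated skill exposes at most one target" is an assumed reading of the rule.

```python
from enum import Enum


class UrlRole(Enum):
    """The four URL roles listed above."""
    APP_ENTRY = "app_entry"
    MODULE_ROUTE = "module_route"
    API_ENDPOINT = "api_endpoint"
    RUNTIME_CONTEXT = "runtime_browser_context"


def collapses_url_roles(source_urls, generated_targets):
    """Return True when distinct source URL roles map to one generated target.

    source_urls: dict mapping UrlRole to the URL found in source evidence.
    generated_targets: list of URLs the generated skill exposes.
    """
    distinct_source = set(source_urls.values())
    distinct_generated = set(generated_targets)
    return len(distinct_source) > 1 and len(distinct_generated) <= 1
```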

Ledger Schema

Each scene record in the runtime-semantics ledger should include:

  1. sceneId
  2. sceneName
  3. sourceDir
  4. archetype
  5. readiness
  6. riskLevel
  7. gaps
  8. generatorLevelGap
  9. runtimeOnlyResidual
  10. recommendedFixRoutes
  11. sourceEvidenceSummary
  12. generatedEvidenceSummary
  13. comparisonNotes
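
As an illustration of the record shape, the thirteen fields can be written as a dataclass. The field names come from the list above; the field types and defaults are assumptions, since this design does not fix them.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class SceneLedgerRecord:
    """One scene entry in the ledger (types are assumed, not specified)."""
    sceneId: str
    sceneName: str
    sourceDir: str
    archetype: str
    readiness: str
    riskLevel: str                      # "high", "medium", or "low"
    gaps: list                          # gap class labels from the fixed taxonomy
    generatorLevelGap: bool
    runtimeOnlyResidual: bool
    recommendedFixRoutes: list = field(default_factory=list)
    sourceEvidenceSummary: str = ""
    generatedEvidenceSummary: str = ""
    comparisonNotes: str = ""


def to_json(record):
    """Serialize a record with keys in the schema's field order."""
    return json.dumps(asdict(record), ensure_ascii=False, indent=2)
```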

Risk-Level Rules

The ledger should use bounded, reproducible risk levels:

high

Use when the scene has strong source evidence for one or more gap classes and the current generated skill visibly lacks equivalent semantics.

medium

Use when the scene has source evidence for one or more gap classes, but current generated output appears partially aligned or the mismatch is plausible rather than explicit.

low

Use when source evidence exists but generated output already appears materially aligned, or when the residual is likely runtime-only rather than generator-level.
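
To make the three rules reproducible, they can be reduced to a small decision function. This is a sketch under assumptions: the input vocabularies (`"strong"`/`"present"` and `"missing"`/`"partial"`/`"aligned"`) are hypothetical encodings of the wording above, and checking the high rule before the low rule is an assumed tie-break.

```python
def assign_risk_level(evidence_strength, alignment, runtime_only_likely=False):
    """Map comparison findings to "high", "medium", or "low".

    evidence_strength: "strong" or "present" (assumed vocabulary).
    alignment: "missing", "partial", or "aligned" (assumed vocabulary).
    runtime_only_likely: True when the residual looks runtime-only.
    """
    if evidence_strength not in ("strong", "present"):
        raise ValueError(f"unknown evidence_strength: {evidence_strength!r}")
    if alignment not in ("missing", "partial", "aligned"):
        raise ValueError(f"unknown alignment: {alignment!r}")

    # high: strong source evidence and the generated skill visibly lacks it.
    if evidence_strength == "strong" and alignment == "missing":
        return "high"
    # low: output already materially aligned, or residual likely runtime-only.
    if alignment == "aligned" or runtime_only_likely:
        return "low"
    # medium: partial alignment, or a plausible rather than explicit mismatch.
    return "medium"
```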

Generator-Level vs Runtime-Only

The ledger must classify whether a scene's residuals should later drive generator hardening or should remain runtime-only.

generatorLevelGap = true

Use when source evidence proves the generated skill is missing semantics that should be recoverable during generation.

runtimeOnlyResidual = true

Use when the remaining risk is primarily:

  1. login / session
  2. host runtime behavior
  3. local-doc / host-bridge environment
  4. inner-network-only execution context

and not a generation-semantic omission.

These two flags are not mutually exclusive. When both are set on the same scene, the ledger must explain why.
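
The explanation requirement can be checked mechanically. The sketch below assumes the explanation is recorded in `comparisonNotes`, which this design does not state; `check_flag_explanation` is a hypothetical helper.

```python
def check_flag_explanation(record):
    """Return a list of problems for one ledger record (a dict).

    Assumption: when both flags are true, the reason is expected in the
    record's comparisonNotes field.
    """
    problems = []
    both_set = bool(record.get("generatorLevelGap")) and bool(
        record.get("runtimeOnlyResidual")
    )
    if both_set and not record.get("comparisonNotes", "").strip():
        problems.append(
            f"{record.get('sceneId', '<unknown>')}: both flags set without explanation"
        )
    return problems
```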

Inputs

Primary inputs:

  1. tests/fixtures/generated_scene/generated_scene_source_evidence_cross_scan_2026-04-20.json
  2. examples/scene_skill_102_final_materialization_2026-04-19/skills
  3. tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_readiness_after_keyword_refinement_2026-04-20.json
  4. tests/fixtures/generated_scene/scene_skill_102_natural_language_parameter_readiness_2026-04-20.json
  5. tests/fixtures/generated_scene/scene_skill_102_parameter_dictionary_template_normalization_2026-04-20.json

Anchor runtime findings:

  1. the confirmed sweep-030-scene inner-network findings:
    • alias mismatch
    • starter-subset org dictionary
    • page-semantic default period behavior
    • request-field mismatch
    • runtime context URL ambiguity

Output Artifacts

JSON

  • tests/fixtures/generated_scene/generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json

Report

  • docs/superpowers/reports/2026-04-20-generated-scene-source-first-runtime-semantics-ledger-report.md

The report must answer:

  1. how many scenes are rated high, medium, and low
  2. how many scenes carry each gap class
  3. how many scenes appear to require generator-level fixes
  4. how many scenes look runtime-only
  5. which route clusters are likely to yield the highest reuse
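
The five report questions map onto simple aggregations over the ledger records. This is a minimal sketch, assuming records are dicts with the schema's field names; counting a scene as "runtime-only" when `runtimeOnlyResidual` is true and `generatorLevelGap` is false, and ranking route clusters by how many scenes recommend them, are both assumed interpretations.

```python
from collections import Counter


def summarize_ledger(records):
    """Aggregate ledger records into the counts the report must state."""
    risk = Counter(r["riskLevel"] for r in records)
    gaps = Counter(g for r in records for g in r["gaps"])
    routes = Counter(
        route for r in records for route in r.get("recommendedFixRoutes", [])
    )
    return {
        "byRiskLevel": {level: risk.get(level, 0) for level in ("high", "medium", "low")},
        "byGapClass": dict(gaps),
        "generatorLevel": sum(1 for r in records if r["generatorLevelGap"]),
        # Assumed reading: runtime-only means no generator-level gap at all.
        "runtimeOnly": sum(
            1 for r in records
            if r["runtimeOnlyResidual"] and not r["generatorLevelGap"]
        ),
        # Routes recommended by the most scenes first.
        "routeClusters": routes.most_common(),
    }
```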

Acceptance Criteria

This design is complete when:

  1. it defines a full-scene ledger stage rather than scene-by-scene notes
  2. it binds the ledger to the fixed five-gap taxonomy
  3. it defines how source evidence and generated evidence are compared
  4. it defines riskLevel, generatorLevelGap, and runtimeOnlyResidual
  5. it remains analysis-only

Stop Statement

Stop after publishing this ledger design and its child plan.

Do not execute the ledger build inside this design.