Generated Scene Source-First Runtime Semantics Hardening Plan

Date: 2026-04-20
Status: Draft
Parent design: docs/superpowers/specs/2026-04-20-generated-scene-source-first-runtime-semantics-hardening-design.md

Plan Intent

Replace the weaker generated-skill-first analysis path with a stronger source-first roadmap:

  1. scan all 102 original source scenes
  2. detect scenes that can reproduce the same runtime-semantics defect classes exposed by sweep-030-scene
  3. convert those findings into rule-level hardening routes
  4. require full 102-scene rematerialization after rule changes
  5. refresh the full validation stack after rematerialization

Why This Plan Exists

The project goal is not merely to document gaps after they have already caused failures in inner-network testing.

The goal is to prevent the same class of defect from reappearing across the remaining source scenes.

Therefore, this plan is driven by original source-scene evidence, not by generated skill artifacts alone.

Fixed Inputs

  1. Original source root:
    • D:/desk/智能体资料/全量业务场景/一平台场景
  2. Current final generated skills:
    • examples/scene_skill_102_final_materialization_2026-04-19/skills
  3. Current 102-skill materialization manifest
  4. Current invocation / parameter readiness assets
  5. sweep-030-scene inner-network runtime findings

Scope Guardrails

Allowed:

  1. scan all 102 original source-scene directories
  2. compare source evidence against current generated skills
  3. produce risk ledgers, reports, and downstream bounded plans

Forbidden in this parent plan:

  1. implementation changes in src/
  2. skill manifest edits
  3. rematerialization execution (deferred to a downstream plan)
  4. validation reruns (deferred to a downstream plan)
  5. inner-network patching as a substitute for source-first analysis

Workstreams

  1. WS1 Source Evidence Scan
  2. WS2 Runtime-Semantics Risk Ledger
  3. WS3 Rule Hardening Route Design
  4. WS4 Full Rematerialization and Validation Refresh Planning

Phase 0: Freeze Parent Scope

Objective

Make this the new parent roadmap for generated-scene runtime semantics hardening.

Tasks

  1. freeze the five gap classes
  2. freeze the source-first principle
  3. freeze rematerialization as a required downstream step

Acceptance

  1. future work must start from source-scene evidence
  2. future fixes must be rule-level before scene-level

Phase 1: Full 102 Source Cross-Scan

Objective

Systematically scan the original 102 source scenes for high-signal evidence related to the five runtime-semantics gap classes.

Required scan targets

  1. dictionary / enum / tree files
  2. default parameter logic
  3. request payload field names
  4. runtime URL candidates
  5. operator-facing wording and alias sources

Tasks

  1. map each scene id to its original source directory
  2. run a bounded evidence scan over all 102 source directories
  3. tag source-side evidence flags per scene
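
The three tasks above can be sketched as a small bounded scan. This is a minimal illustration only: the flag names, the filename and content hints, and the per-scene bounds are assumptions made for the example and are not defined by this plan; the child cross-scan plan would fix the real ones.

```python
from pathlib import Path

# Hypothetical flag names and match hints; the plan does not define them.
EVIDENCE_HINTS = {
    "hasDictionaryFile": ("dict", "enum", "tree"),
    "hasDefaultParamLogic": ("default",),
    "hasRuntimeUrlCandidate": ("http://", "https://"),
}
MAX_FILES_PER_SCENE = 200    # assumed bound on files inspected per scene
MAX_BYTES_PER_FILE = 64_000  # assumed bound on bytes read per file


def scan_scene(scene_dir: Path) -> dict[str, bool]:
    """Return one boolean evidence flag per hint group for one scene directory."""
    flags = {name: False for name in EVIDENCE_HINTS}
    files = sorted(p for p in scene_dir.rglob("*") if p.is_file())
    for path in files[:MAX_FILES_PER_SCENE]:
        with path.open("rb") as fh:
            raw = fh.read(MAX_BYTES_PER_FILE)
        text = raw.decode("utf-8", errors="ignore").lower()
        # Match against both the file name and the (truncated) file content.
        haystack = path.name.lower() + "\n" + text
        for name, hints in EVIDENCE_HINTS.items():
            if any(hint in haystack for hint in hints):
                flags[name] = True
    return flags


def scan_all(scene_dirs: dict[str, Path]) -> dict[str, dict[str, bool]]:
    """Map each scene id to the evidence flags found in its source directory."""
    return {scene_id: scan_scene(d) for scene_id, d in scene_dirs.items()}
```

The scene-id-to-directory mapping is passed in explicitly so that the scan itself stays independent of how the source root is laid out. Substring hints are deliberately crude; they only mark scenes for closer review.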

Deliverables

  1. source evidence scan JSON
  2. source evidence scan report

Acceptance

  1. all 102 scenes have source evidence flags
  2. sweep-030-scene is validated as anchor evidence
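
The acceptance criteria above could be checked mechanically along the following lines. This sketch assumes the scan output is a mapping from scene id to a dictionary of boolean flags, and it interprets "validated as anchor evidence" as "the anchor scene has at least one positive flag"; both are assumptions, not definitions from this plan.

```python
def check_scan_coverage(scan, expected_ids, anchor_id="sweep-030-scene"):
    """Return a list of problems; an empty list means the check passed.

    scan: assumed shape {scene_id: {flag_name: bool}}.
    """
    problems = []
    # The anchor scene is always treated as expected.
    expected = set(expected_ids) | {anchor_id}
    for scene_id in sorted(expected - set(scan)):
        problems.append(f"{scene_id}: no evidence flags recorded")
    for scene_id in sorted(set(scan) - expected):
        problems.append(f"{scene_id}: not in the expected scene list")
    anchor_flags = scan.get(anchor_id)
    if anchor_flags is not None and not any(anchor_flags.values()):
        problems.append(f"{anchor_id}: anchor scene has no positive evidence flag")
    return problems
```

In practice `expected_ids` would hold the 102 scene ids taken from the materialization manifest.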

Phase 2: Build the Source-First Runtime Semantics Ledger

Objective

Merge source-side evidence with generated-skill evidence into a full runtime-semantics risk ledger.

Tasks

  1. compare source evidence with generated manifests and references
  2. assign gap classes per scene
  3. assign risk level per scene
  4. distinguish:
    • generator-level rule gap
    • runtime-only residual

Deliverables

  1. generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json
  2. source-first runtime semantics report

Acceptance

  1. all 102 scenes are represented
  2. each scene has gaps, riskLevel, and recommendedFixRoutes
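
A ledger that meets these acceptance criteria could be validated with a check like the one below. The field names gaps, riskLevel, and recommendedFixRoutes come from this plan; the top-level shape (a mapping from scene id to entry) and the risk-level vocabulary are assumptions for illustration.

```python
REQUIRED_FIELDS = ("gaps", "riskLevel", "recommendedFixRoutes")
RISK_LEVELS = {"high", "medium", "low"}  # assumed vocabulary, not fixed by the plan


def validate_ledger(ledger, expected_count=102):
    """Return a list of validation errors for an assumed {scene_id: entry} ledger."""
    errors = []
    if len(ledger) != expected_count:
        errors.append(f"expected {expected_count} scenes, found {len(ledger)}")
    for scene_id, entry in sorted(ledger.items()):
        for field in REQUIRED_FIELDS:
            if field not in entry:
                errors.append(f"{scene_id}: missing {field}")
        if "riskLevel" in entry and entry["riskLevel"] not in RISK_LEVELS:
            errors.append(f"{scene_id}: unknown riskLevel {entry['riskLevel']!r}")
    return errors
```

An entry with no gaps would still carry all three fields, with empty lists for gaps and recommendedFixRoutes, so that "no gap found" is distinguishable from "scene not analyzed".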

Phase 3: Convert Ledger into Rule-Hardening Routes

Objective

Turn the source-first ledger into bounded implementation routes that modify reusable generation rules rather than scene-specific patches.

Candidate hardening routes

  1. alias generation hardening
  2. embedded dictionary extraction hardening
  3. parameter default semantics recovery hardening
  4. resolver-to-request mapping hardening
  5. runtime URL classification hardening

Tasks

  1. count scenes affected by each route
  2. prioritize routes by coverage gain and reuse
  3. define bounded implementation slices for the top routes
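
The first two tasks amount to a frequency count over the ledger. The sketch below ranks routes by the number of scenes that recommend them, which stands in for "coverage gain and reuse"; a real prioritization may weigh risk level as well. The ledger shape and the route identifiers used in the usage example are hypothetical.

```python
from collections import Counter


def rank_routes(ledger):
    """Return (route, affected_scene_count) pairs, most widely reused first.

    ledger: assumed shape {scene_id: {"recommendedFixRoutes": [route, ...]}}.
    """
    counts = Counter()
    for entry in ledger.values():
        # set() ensures a route listed twice for one scene is counted once.
        counts.update(set(entry.get("recommendedFixRoutes", [])))
    # Sort by descending count, then by name so ties are deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

For example, with three scenes where two recommend a hypothetical "alias-generation" route and one recommends "runtime-url-classification", the ranking places "alias-generation" first with a count of 2.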

Deliverables

  1. child-plan sequence for runtime semantics hardening
  2. bounded route plans for top reusable fixes

Acceptance

  1. no route is scene-name hardcoded
  2. route priority is based on 102-scene reuse, not anecdotal debugging order
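
The first acceptance criterion can be approximated by a lint-style check that searches rule sources for literal scene ids. This is a rough sketch under the assumption that rule sources are available as text and that scene ids are distinctive strings such as sweep-030-scene; it would not catch hardcoding expressed indirectly (for example, through a scene-specific directory name).

```python
import re


def find_hardcoded_scene_ids(rule_sources, scene_ids):
    """Return sorted (source_name, scene_id) pairs where a scene id appears literally.

    rule_sources: assumed shape {source_name: source_text}.
    """
    hits = []
    for name, text in rule_sources.items():
        for scene_id in scene_ids:
            # Require non-identifier characters on both sides so that
            # "sweep-030-scene" does not match inside "sweep-030-scene-extra".
            pattern = rf"(?<![\w-]){re.escape(scene_id)}(?![\w-])"
            if re.search(pattern, text):
                hits.append((name, scene_id))
    return sorted(hits)
```

An empty result is necessary but not sufficient evidence that a route is rule-level; review of the route plan itself is still required.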

Phase 4: Require Full 102 Rematerialization

Objective

Ensure that hardened rules are propagated into the final generated skill inventory.

Tasks

  1. define full 102-scene rematerialization as mandatory after route implementation
  2. define materialization outputs that must be refreshed
  3. define how the canonical final skill bundle is replaced

Deliverables

  1. full rematerialization refresh plan

Acceptance

  1. no runtime-semantics hardening route may be considered complete without rematerialization

Phase 5: Require Validation Refresh

Objective

Refresh downstream validation after rematerialization so improved rules are measured end-to-end.

Required refresh layers

  1. deterministic invocation readiness
  2. natural-language parameter readiness
  3. static validation
  4. direct mock execution
  5. pseudo-production handoff refresh

Deliverables

  1. validation refresh plan

Acceptance

  1. the new final 102-skill bundle is revalidated before more inner-network testing

Immediate Next Output

This parent plan should immediately lead to a new bounded child plan:

  • 2026-04-20-generated-scene-source-evidence-cross-scan-plan.md

That child plan should perform the actual source cross-scan over the 102 original scenes.

Stop Statement

Stop after publishing this parent plan and its design.

Do not execute the source cross-scan or implementation inside this plan.