claw/docs/superpowers/plans/2026-04-20-generated-scene-source-first-runtime-semantics-hardening-plan.md

# Generated Scene Source-First Runtime Semantics Hardening Plan

> Date: 2026-04-20
> Status: Draft
> Parent design: `docs/superpowers/specs/2026-04-20-generated-scene-source-first-runtime-semantics-hardening-design.md`

## Plan Intent

Replace the weaker generated-skill-first analysis path with a stronger source-first roadmap:

1. scan all 102 original source scenes
2. detect scenes that can reproduce the same runtime-semantics defect classes exposed by `sweep-030-scene`
3. convert those findings into rule-level hardening routes
4. require full 102-scene rematerialization after rule changes
5. refresh the full validation stack after rematerialization

## Why This Plan Exists

The goal of this project is not to document gaps only after they have already surfaced as failures in inner-network testing.

The goal is to prevent the same class of defect from reappearing across the remaining source scenes.

Therefore, this plan is driven by original source-scene evidence, not by generated skill artifacts alone.

## Fixed Inputs

1. Original source root:
   - `D:/desk/智能体资料/全量业务场景/一平台场景`
2. Current final generated skills:
   - `examples/scene_skill_102_final_materialization_2026-04-19/skills`
3. Current 102-skill materialization manifest
4. Current invocation / parameter readiness assets
5. `sweep-030-scene` inner-network runtime findings

## Scope Guardrails

Allowed:

1. scan all 102 original source-scene directories
2. compare source evidence against current generated skills
3. produce risk ledgers, reports, and downstream bounded plans

Forbidden in this parent plan:

1. implementation changes in `src/`
2. skill manifest edits
3. rematerialization execution (deferred to downstream plans)
4. validation reruns (deferred to downstream plans)
5. inner-network patching as a substitute for source-first analysis

## Workstreams

1. `WS1` Source Evidence Scan
2. `WS2` Runtime-Semantics Risk Ledger
3. `WS3` Rule Hardening Route Design
4. `WS4` Full Rematerialization and Validation Refresh Planning

## Phase 0: Freeze Parent Scope

### Objective

Make this the new parent roadmap for generated-scene runtime semantics hardening.

### Tasks

1. freeze the five gap classes
2. freeze the source-first principle
3. freeze rematerialization as a required downstream step

### Acceptance

1. future work must start from source-scene evidence
2. future fixes must be rule-level before scene-level

## Phase 1: Full 102 Source Cross-Scan

### Objective

Systematically scan the original 102 source scenes for high-signal evidence related to the five runtime-semantics gap classes.

### Required scan targets

1. dictionary / enum / tree files
2. default parameter logic
3. request payload field names
4. runtime URL candidates
5. operator-facing wording and alias sources

### Tasks

1. map each scene id to its original source directory
2. run a bounded evidence scan over all 102 source directories
3. tag source-side evidence flags per scene
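
The three tasks above can be sketched as a small bounded scanner. This is a minimal illustration only, not the project's scanner: the flag names, the file-name hints, the size and file-count bounds, and the shape of the scene-id-to-directory mapping are all assumptions introduced here, and only two of the five scan targets are covered.

```python
from pathlib import Path

# Hypothetical flag names and hints; the plan does not define them.
# Matching is naive substring matching, good enough for a first pass only.
DICTIONARY_NAME_HINTS = ("dict", "enum", "tree")
URL_TEXT_HINTS = ("http://", "https://")


def scan_scene(scene_dir, max_files=200, max_bytes=1_000_000):
    """Return evidence flags for one source-scene directory.

    The scan is bounded: at most max_files files are visited, and files
    larger than max_bytes are checked by name only, never read.
    """
    flags = {"has_dictionary_files": False, "has_url_candidates": False}
    files = [p for p in sorted(scene_dir.rglob("*")) if p.is_file()][:max_files]
    for path in files:
        name = path.name.lower()
        if any(hint in name for hint in DICTIONARY_NAME_HINTS):
            flags["has_dictionary_files"] = True
        if path.stat().st_size > max_bytes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        if any(hint in text for hint in URL_TEXT_HINTS):
            flags["has_url_candidates"] = True
    return flags


def scan_all(source_root, scene_dirs):
    """Scan every scene; scene_dirs maps scene id -> directory name under source_root."""
    result = {}
    for scene_id, dir_name in scene_dirs.items():
        scene_dir = Path(source_root) / dir_name
        if not scene_dir.is_dir():
            # Record the gap instead of failing, so every scene id gets an entry.
            result[scene_id] = {"missing_source": True}
            continue
        result[scene_id] = {"missing_source": False, **scan_scene(scene_dir)}
    return result
```

Recording a `missing_source` entry rather than raising keeps one record per scene id, which makes it easy to confirm that all 102 scenes were covered.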

### Deliverables

1. source evidence scan JSON
2. source evidence scan report

### Acceptance

1. all 102 scenes have source evidence flags
2. `sweep-030-scene` is validated as anchor evidence

## Phase 2: Build the Source-First Runtime Semantics Ledger

### Objective

Merge source-side evidence with generated-skill evidence into a full runtime-semantics risk ledger.

### Tasks

1. compare source evidence with generated manifests and references
2. assign gap classes per scene
3. assign risk level per scene
4. distinguish:
   - generator-level rule gap
   - runtime-only residual

### Deliverables

1. `generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json`
2. source-first runtime semantics report

### Acceptance

1. all 102 scenes are represented
2. each scene has `gaps`, `riskLevel`, and `recommendedFixRoutes`
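
Both acceptance criteria can be checked mechanically. The sketch below assumes the ledger JSON is a single object keyed by scene id; that layout is an assumption made for illustration, and only the three field names come from this plan.

```python
# Field names taken from the acceptance criteria above.
REQUIRED_FIELDS = ("gaps", "riskLevel", "recommendedFixRoutes")


def check_ledger(ledger, expected_scene_count=102):
    """Return a list of human-readable problems; an empty list means the ledger passes.

    Assumes `ledger` is a dict mapping scene id -> entry dict (hypothetical layout).
    """
    problems = []
    if len(ledger) != expected_scene_count:
        problems.append(
            f"expected {expected_scene_count} scenes, found {len(ledger)}"
        )
    for scene_id, entry in sorted(ledger.items()):
        for field in REQUIRED_FIELDS:
            if field not in entry:
                problems.append(f"{scene_id}: missing {field}")
    return problems
```

In practice the ledger would be loaded with `json.load` from the Phase 2 deliverable before being passed to `check_ledger`.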

## Phase 3: Convert Ledger into Rule-Hardening Routes

### Objective

Turn the source-first ledger into bounded implementation routes that modify reusable generation rules rather than scene-specific patches.

### Candidate hardening routes

1. alias generation hardening
2. embedded dictionary extraction hardening
3. parameter default semantics recovery hardening
4. resolver-to-request mapping hardening
5. runtime URL classification hardening

### Tasks

1. count scenes affected by each route
2. prioritize routes by coverage gain and reuse
3. define bounded implementation slices for the top routes
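
Counting and ranking routes by scene coverage could look like the following sketch. It assumes the same hypothetical ledger layout (a dict keyed by scene id whose entries carry a `recommendedFixRoutes` list), and it ranks by scene count alone; weighing coverage against reuse or implementation cost is left open here.

```python
from collections import Counter


def rank_routes(ledger):
    """Return (route, scene_count) pairs, most widely needed route first.

    Assumes `ledger` maps scene id -> entry dict with a `recommendedFixRoutes`
    list; the route identifiers themselves are whatever the ledger uses.
    """
    counts = Counter()
    for entry in ledger.values():
        # set() ensures a route listed twice for one scene is counted once.
        counts.update(set(entry.get("recommendedFixRoutes", [])))
    # Sort by descending scene count, then by route name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Because the ranking is derived from the full ledger, it reflects reuse across all scenes rather than the order in which defects happened to be debugged.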

### Deliverables

1. child-plan sequence for runtime semantics hardening
2. bounded route plans for top reusable fixes

### Acceptance

1. no route is scene-name hardcoded
2. route priority is based on 102-scene reuse, not anecdotal debugging order

## Phase 4: Require Full 102 Rematerialization

### Objective

Ensure that hardened rules are propagated into the final generated skill inventory.

### Tasks

1. define full 102 rematerialization as mandatory after route implementation
2. define materialization outputs that must be refreshed
3. define how the canonical final skill bundle is replaced

### Deliverables

1. full rematerialization refresh plan

### Acceptance

1. no runtime-semantics hardening route may be considered complete without rematerialization

## Phase 5: Require Validation Refresh

### Objective

Refresh downstream validation after rematerialization so improved rules are measured end-to-end.

### Required refresh layers

1. deterministic invocation readiness
2. natural-language parameter readiness
3. static validation
4. direct mock execution
5. pseudo-production handoff refresh

### Deliverables

1. validation refresh plan

### Acceptance

1. the new final 102-skill bundle is revalidated before more inner-network testing

## Immediate Next Output

This parent plan should immediately lead to a new bounded child plan:

- `2026-04-20-generated-scene-source-evidence-cross-scan-plan.md`

That child plan should perform the actual source cross-scan over the 102 original scenes.

## Stop Statement

Stop after publishing this parent plan and its design.

Do not execute the source cross-scan or any implementation work inside this plan.