# Generated Scene Source-First Runtime Semantics Hardening Plan
> Date: 2026-04-20
> Status: Draft
> Parent design: `docs/superpowers/specs/2026-04-20-generated-scene-source-first-runtime-semantics-hardening-design.md`
## Plan Intent
Replace the weaker generated-skill-first analysis path with a stronger source-first roadmap:
1. scan all 102 original source scenes
2. detect scenes that can reproduce the same runtime-semantics defect classes exposed by `sweep-030-scene`
3. convert those findings into rule-level hardening routes
4. require full 102-scene rematerialization after rule changes
5. refresh the full validation stack after rematerialization
## Why This Plan Exists
The project goal is not to describe gaps only after they have already surfaced as failures in inner-network testing.
The goal is to prevent the same class of defect from reappearing across the remaining source scenes.
Therefore this plan is driven by original source-scene evidence, not generated skill artifacts alone.
## Fixed Inputs
1. Original source root:
- `D:/desk/智能体资料/全量业务场景/一平台场景`
2. Current final generated skills:
- `examples/scene_skill_102_final_materialization_2026-04-19/skills`
3. Current 102-skill materialization manifest
4. Current invocation / parameter readiness assets
5. `sweep-030-scene` inner-network runtime findings
## Scope Guardrails
Allowed:
1. scan all 102 original source-scene directories
2. compare source evidence against current generated skills
3. produce risk ledgers, reports, and downstream bounded plans
Forbidden in this parent plan:
1. implementation changes in `src/`
2. skill manifest edits
3. rematerialization execution
4. validation reruns
5. inner-network patching as a substitute for source-first analysis
## Workstreams
1. `WS1` Source Evidence Scan
2. `WS2` Runtime-Semantics Risk Ledger
3. `WS3` Rule Hardening Route Design
4. `WS4` Full Rematerialization and Validation Refresh Planning
## Phase 0: Freeze Parent Scope
### Objective
Make this the new parent roadmap for generated-scene runtime semantics hardening.
### Tasks
1. freeze the five gap classes
2. freeze the source-first principle
3. freeze rematerialization as a required downstream step
### Acceptance
1. future work must start from source-scene evidence
2. future fixes must be rule-level before scene-level
## Phase 1: Full 102 Source Cross-Scan
### Objective
Systematically scan the original 102 source scenes for high-signal evidence related to the five runtime-semantics gap classes.
### Required scan targets
1. dictionary / enum / tree files
2. default parameter logic
3. request payload field names
4. runtime URL candidates
5. operator-facing wording and alias sources
### Tasks
1. map each scene id to its original source directory
2. run a bounded evidence scan over all 102 source directories
3. tag source-side evidence flags per scene
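As a rough illustration of the tasks above, the sketch below tags one boolean flag per required scan target for each scene directory. It is a minimal sketch under stated assumptions, not the scanner this plan will produce: the flag names, the regular expressions, and the per-file byte bound are all hypothetical, and the scene-id-to-directory mapping is assumed to be supplied by task 1.

```python
import re
from pathlib import Path

# Hypothetical patterns, one per required scan target. Real patterns would be
# derived from inspecting the source scenes, not from this sketch.
EVIDENCE_PATTERNS = {
    "hasDictionaryOrEnum": re.compile(r"\b(dict|enum|tree)\w*", re.IGNORECASE),
    "hasDefaultParamLogic": re.compile(r"\bdefault\w*", re.IGNORECASE),
    "hasRequestPayloadFields": re.compile(r"\b(payload|params|data)\s*[:=]", re.IGNORECASE),
    "hasRuntimeUrlCandidate": re.compile(r"https?://|/api/", re.IGNORECASE),
    "hasOperatorWording": re.compile(r"\b(label|title|placeholder)\s*[:=]", re.IGNORECASE),
}

# Assumed bound: only the head of each file is read, which keeps the scan bounded.
MAX_BYTES_PER_FILE = 256 * 1024


def scan_scene_dir(scene_dir: Path) -> dict[str, bool]:
    """Return one evidence flag per scan target for a single scene directory."""
    flags = {name: False for name in EVIDENCE_PATTERNS}
    for path in sorted(scene_dir.rglob("*")):
        if not path.is_file():
            continue
        with path.open("rb") as handle:
            text = handle.read(MAX_BYTES_PER_FILE).decode("utf-8", errors="ignore")
        for name, pattern in EVIDENCE_PATTERNS.items():
            if not flags[name] and pattern.search(text):
                flags[name] = True
    return flags


def scan_all(scene_dirs: dict[str, Path]) -> dict[str, dict[str, bool]]:
    """Scan every mapped scene; keys are scene ids, values are flag dicts."""
    return {scene_id: scan_scene_dir(d) for scene_id, d in sorted(scene_dirs.items())}
```

The result of `scan_all` could be serialized with `json.dumps` to form the scan JSON deliverable. Regex flags of this kind only indicate where to look; they do not confirm a defect.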
### Deliverables
1. source evidence scan JSON
2. source evidence scan report
### Acceptance
1. all 102 scenes have source evidence flags
2. `sweep-030-scene` is validated as anchor evidence
## Phase 2: Build the Source-First Runtime Semantics Ledger
### Objective
Merge source-side evidence with generated-skill evidence into a full runtime-semantics risk ledger.
### Tasks
1. compare source evidence with generated manifests and references
2. assign gap classes per scene
3. assign risk level per scene
4. distinguish:
- generator-level rule gap
- runtime-only residual
### Deliverables
1. `generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json`
2. source-first runtime semantics report
### Acceptance
1. all 102 scenes are represented
2. each scene has `gaps`, `riskLevel`, and `recommendedFixRoutes`
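The two acceptance checks above can be expressed as a small validation pass. This is a sketch only: it assumes the ledger JSON loads into a mapping from scene id to entry, which the plan does not specify, and `check_ledger` is a hypothetical helper name. The three field names are taken from the acceptance criteria.

```python
REQUIRED_FIELDS = ("gaps", "riskLevel", "recommendedFixRoutes")
EXPECTED_SCENE_COUNT = 102


def check_ledger(ledger: dict[str, dict], expected_count: int = EXPECTED_SCENE_COUNT) -> list[str]:
    """Return a list of human-readable problems; an empty list means acceptance passes."""
    problems = []
    if len(ledger) != expected_count:
        problems.append(f"expected {expected_count} scenes, found {len(ledger)}")
    for scene_id, entry in sorted(ledger.items()):
        missing = [field for field in REQUIRED_FIELDS if field not in entry]
        if missing:
            problems.append(f"{scene_id}: missing {', '.join(missing)}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a report show every incomplete scene in a single run.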
## Phase 3: Convert Ledger into Rule-Hardening Routes
### Objective
Turn the source-first ledger into bounded implementation routes that modify reusable generation rules rather than scene-specific patches.
### Candidate hardening routes
1. alias generation hardening
2. embedded dictionary extraction hardening
3. parameter default semantics recovery hardening
4. resolver-to-request mapping hardening
5. runtime URL classification hardening
### Tasks
1. count scenes affected by each route
2. prioritize routes by coverage gain and reuse
3. define bounded implementation slices for the top routes
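Tasks 1 and 2 can be approximated by counting how many ledger entries recommend each route. The sketch below assumes the same scene-id-to-entry ledger shape and treats `recommendedFixRoutes` as a list of route names; both are assumptions, and `rank_routes` is a hypothetical helper. It ranks by affected-scene count only, so it does not capture the "reuse" dimension of prioritization.

```python
from collections import Counter


def rank_routes(ledger: dict[str, dict]) -> list[tuple[str, int]]:
    """Rank routes by number of affected scenes, ties broken by route name."""
    counts: Counter[str] = Counter()
    for entry in ledger.values():
        # A set ensures a scene counts once per route even if the route is listed twice.
        counts.update(set(entry.get("recommendedFixRoutes", [])))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

The deterministic tie-break keeps the ranking stable across runs, which matters if the output feeds the child-plan sequence.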
### Deliverables
1. child-plan sequence for runtime semantics hardening
2. bounded route plans for top reusable fixes
### Acceptance
1. no route is scene-name hardcoded
2. route priority is based on 102-scene reuse, not anecdotal debugging order
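One possible guard for the first acceptance criterion is a literal search of rule source text for known scene ids. This is a coarse sketch under assumptions: `find_hardcoded_scene_ids` is a hypothetical helper, the rule source is assumed to be available as plain text, and a substring match can neither detect indirect hardcoding nor rule out false positives in comments or test fixtures.

```python
def find_hardcoded_scene_ids(rule_source: str, scene_ids: list[str]) -> list[str]:
    """Return the scene ids that appear literally in the given rule source text."""
    return sorted(scene_id for scene_id in set(scene_ids) if scene_id in rule_source)
```

For example, `find_hardcoded_scene_ids('if scene == "sweep-030-scene": ...', ["sweep-030-scene"])` returns `["sweep-030-scene"]`, which would flag that rule for review.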
## Phase 4: Require Full 102 Rematerialization
### Objective
Ensure that hardened rules are propagated into the final generated skill inventory.
### Tasks
1. define full 102 rematerialization as mandatory after route implementation
2. define materialization outputs that must be refreshed
3. define how the canonical final skill bundle is replaced
### Deliverables
1. full rematerialization refresh plan
### Acceptance
1. no runtime-semantics hardening route may be considered complete without rematerialization
## Phase 5: Require Validation Refresh
### Objective
Refresh downstream validation after rematerialization so improved rules are measured end-to-end.
### Required refresh layers
1. deterministic invocation readiness
2. natural-language parameter readiness
3. static validation
4. direct mock execution
5. pseudo-production handoff refresh
### Deliverables
1. validation refresh plan
### Acceptance
1. the new final 102-skill bundle is revalidated before more inner-network testing
## Immediate Next Output
This parent plan should immediately lead to a new bounded child plan:
- `2026-04-20-generated-scene-source-evidence-cross-scan-plan.md`
That child plan should perform the actual source cross-scan over the 102 original scenes.
## Stop Statement
Stop after publishing this parent plan and its design.
Do not execute the source cross-scan or implementation inside this plan.