# Generated Scene Source-First Runtime Semantics Hardening Plan

> Date: 2026-04-20
> Status: Draft
> Parent design: `docs/superpowers/specs/2026-04-20-generated-scene-source-first-runtime-semantics-hardening-design.md`

## Plan Intent

Replace the weaker generated-skill-first analysis path with a stronger source-first roadmap:

1. scan all 102 original source scenes
2. detect scenes that can reproduce the same runtime-semantics defect classes exposed by `sweep-030-scene`
3. convert those findings into rule-level hardening routes
4. require full 102-scene rematerialization after rule changes
5. refresh the full validation stack after rematerialization

## Why This Plan Exists

The project goal is not to describe gaps only after they have surfaced as failures in inner-network testing.

The goal is to prevent the same class of defect from reappearing across the remaining source scenes.

Therefore, this plan is driven by original source-scene evidence, not by generated skill artifacts alone.

## Fixed Inputs

1. Original source root:
   - `D:/desk/智能体资料/全量业务场景/一平台场景`
2. Current final generated skills:
   - `examples/scene_skill_102_final_materialization_2026-04-19/skills`
3. Current 102-skill materialization manifest
4. Current invocation / parameter readiness assets
5. `sweep-030-scene` inner-network runtime findings

## Scope Guardrails

Allowed:

1. scan all 102 original source-scene directories
2. compare source evidence against current generated skills
3. produce risk ledgers, reports, and downstream bounded plans

Forbidden in this parent plan:

1. implementation changes in `src/`
2. skill manifest edits
3. rematerialization execution (deferred to a downstream step)
4. validation reruns (deferred until after rematerialization)
5. inner-network patching as a substitute for source-first analysis

## Workstreams

1. `WS1` Source Evidence Scan
2. `WS2` Runtime-Semantics Risk Ledger
3. `WS3` Rule Hardening Route Design
4. `WS4` Full Rematerialization and Validation Refresh Planning

## Phase 0: Freeze Parent Scope

### Objective

Make this the new parent roadmap for generated-scene runtime semantics hardening.

### Tasks

1. freeze the five gap classes
2. freeze the source-first principle
3. freeze rematerialization as a required downstream step

### Acceptance

1. future work must start from source-scene evidence
2. future fixes must be rule-level before scene-level

## Phase 1: Full 102 Source Cross-Scan

### Objective

Systematically scan the original 102 source scenes for high-signal evidence related to the five runtime-semantics gap classes.

### Required scan targets

1. dictionary / enum / tree files
2. default parameter logic
3. request payload field names
4. runtime URL candidates
5. operator-facing wording and alias sources

### Tasks

1. map each scene id to its original source directory
2. run a bounded evidence scan over all 102 source directories
3. tag source-side evidence flags per scene

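The three tasks above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's scanner: the flag names, the regular expressions, the size bounds, and the `scene_dirs` mapping are all hypothetical, and a real scan would derive its targets from the five gap classes in the parent design.

```python
import re
from pathlib import Path

# Hypothetical evidence flags and patterns (assumptions, not project rules).
NAME_PATTERNS = {
    "dictionary_file": re.compile(r"dict|enum|tree", re.IGNORECASE),
}
CONTENT_PATTERNS = {
    "runtime_url_candidate": re.compile(r"https?://[^\s\"']+"),
    "default_parameter_logic": re.compile(r"\bdefault\w*\b", re.IGNORECASE),
}

# Hypothetical bounds that keep the scan cheap and predictable per scene.
MAX_FILES_PER_SCENE = 200
MAX_BYTES_PER_FILE = 256_000


def scan_scene(scene_dir: Path) -> dict:
    """Return one boolean evidence flag per pattern for a single scene directory."""
    flags = {name: False for name in (*NAME_PATTERNS, *CONTENT_PATTERNS)}
    files = sorted(p for p in scene_dir.rglob("*") if p.is_file())[:MAX_FILES_PER_SCENE]
    for path in files:
        # File-name evidence, e.g. a file that looks like a dictionary or enum table
        for flag, pattern in NAME_PATTERNS.items():
            if pattern.search(path.name):
                flags[flag] = True
        # Content evidence, read with a byte cap so large files stay bounded
        with path.open("rb") as fh:
            text = fh.read(MAX_BYTES_PER_FILE).decode("utf-8", errors="ignore")
        for flag, pattern in CONTENT_PATTERNS.items():
            if pattern.search(text):
                flags[flag] = True
    return flags


def scan_all(scene_dirs: dict) -> dict:
    """Map each scene id to its evidence flags; a missing directory fails loudly."""
    result = {}
    for scene_id, scene_dir in scene_dirs.items():
        if not scene_dir.is_dir():
            raise FileNotFoundError(f"{scene_id}: source directory not found: {scene_dir}")
        result[scene_id] = scan_scene(scene_dir)
    return result
```

A caller would pass a mapping from scene id to source directory and could serialize the result as the source evidence scan JSON, for example with `json.dumps(scan_all(scene_dirs), indent=2)`. Raising on a missing directory, rather than skipping it, is a deliberate choice in this sketch so that a scene cannot silently drop out of the 102-scene coverage.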

### Deliverables

1. source evidence scan JSON
2. source evidence scan report

### Acceptance

1. all 102 scenes have source evidence flags
2. `sweep-030-scene` is validated as anchor evidence

## Phase 2: Build the Source-First Runtime Semantics Ledger

### Objective

Merge source-side evidence with generated-skill evidence into a full runtime-semantics risk ledger.

### Tasks

1. compare source evidence with generated manifests and references
2. assign gap classes per scene
3. assign risk level per scene
4. distinguish:
   - generator-level rule gap
   - runtime-only residual

### Deliverables

1. `generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json`
2. source-first runtime semantics report

### Acceptance

1. all 102 scenes are represented
2. each scene has `gaps`, `riskLevel`, and `recommendedFixRoutes`

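The two acceptance criteria above lend themselves to a mechanical check. The sketch below assumes the ledger JSON is an object keyed by scene id, which this plan does not specify; only the field names `gaps`, `riskLevel`, and `recommendedFixRoutes` come from the plan. The scene ids and field values in the example are hypothetical.

```python
import json

# Field names taken from the Phase 2 acceptance criteria.
REQUIRED_FIELDS = ("gaps", "riskLevel", "recommendedFixRoutes")


def check_ledger(ledger: dict, expected_scene_ids: set) -> list:
    """Return a list of acceptance failures; an empty list means the ledger passes."""
    problems = []
    # Criterion 1: every expected scene is represented
    for scene_id in sorted(expected_scene_ids - set(ledger)):
        problems.append(f"{scene_id}: missing from ledger")
    # Criterion 2: every entry carries the three required fields
    for scene_id, entry in sorted(ledger.items()):
        for field in REQUIRED_FIELDS:
            if field not in entry:
                problems.append(f"{scene_id}: missing field {field}")
    return problems


# Hypothetical two-entry ledger; the real ledger would cover all 102 scenes.
example = json.loads("""
{
  "scene-a": {"gaps": ["alias"], "riskLevel": "high",
              "recommendedFixRoutes": ["alias generation hardening"]},
  "scene-b": {"gaps": [], "riskLevel": "low"}
}
""")
failures = check_ledger(example, {"scene-a", "scene-b", "scene-c"})
for line in failures:
    print(line)
```

Returning a list of failures instead of raising on the first one is a design choice for this sketch: it lets a single run report every scene that still blocks acceptance.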

## Phase 3: Convert Ledger into Rule-Hardening Routes

### Objective

Turn the source-first ledger into bounded implementation routes that modify reusable generation rules rather than scene-specific patches.

### Candidate hardening routes

1. alias generation hardening
2. embedded dictionary extraction hardening
3. parameter default semantics recovery hardening
4. resolver-to-request mapping hardening
5. runtime URL classification hardening

### Tasks

1. count scenes affected by each route
2. prioritize routes by coverage gain and reuse
3. define bounded implementation slices for the top routes

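The first two tasks above could be approximated by counting how many scenes recommend each route. The sketch below is an assumption-laden illustration: it reuses the hypothetical scene-id-keyed ledger shape, and it treats the affected-scene count as a stand-in for "coverage gain and reuse", which the plan may define more precisely elsewhere.

```python
from collections import Counter


def rank_routes(ledger: dict) -> list:
    """Return (route, affected_scene_count) pairs, most widely reused route first."""
    counts = Counter()
    for entry in ledger.values():
        # set() so a route listed twice for one scene is still counted once
        counts.update(set(entry.get("recommendedFixRoutes", [])))
    # Sort by descending scene count, then by name for a stable order on ties
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical ledger fragment keyed by scene id
ledger = {
    "scene-a": {"recommendedFixRoutes": ["alias generation hardening",
                                         "runtime URL classification hardening"]},
    "scene-b": {"recommendedFixRoutes": ["alias generation hardening"]},
    "scene-c": {"recommendedFixRoutes": ["embedded dictionary extraction hardening"]},
}
ranking = rank_routes(ledger)
for route, scene_count in ranking:
    print(f"{scene_count:3d}  {route}")
```

Because the ranking is computed only from route names aggregated across scenes, it never keys on an individual scene name, which is in keeping with the requirement that priority follow 102-scene reuse rather than debugging order.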

### Deliverables

1. child-plan sequence for runtime semantics hardening
2. bounded route plans for top reusable fixes

### Acceptance

1. no route is scene-name hardcoded
2. route priority is based on 102-scene reuse, not anecdotal debugging order

## Phase 4: Require Full 102 Rematerialization

### Objective

Ensure that hardened rules are propagated into the final generated skill inventory.

### Tasks

1. define full 102 rematerialization as mandatory after route implementation
2. define materialization outputs that must be refreshed
3. define how the canonical final skill bundle is replaced

### Deliverables

1. full rematerialization refresh plan

### Acceptance

1. no runtime-semantics hardening route may be considered complete without rematerialization

## Phase 5: Require Validation Refresh

### Objective

Refresh downstream validation after rematerialization so improved rules are measured end-to-end.

### Required refresh layers

1. deterministic invocation readiness
2. natural-language parameter readiness
3. static validation
4. direct mock execution
5. pseudo-production handoff refresh

### Acceptance-ready deliverables

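The Phase 4 and Phase 5 acceptance rules can be read together as a gate in front of further inner-network testing. The sketch below is one possible way to express that gate; the layer identifiers and the function name are hypothetical, and how each layer is actually run or recorded is outside what this plan states.

```python
# Layer names mirror the "Required refresh layers" list; the identifiers are hypothetical.
REQUIRED_LAYERS = (
    "deterministic_invocation_readiness",
    "natural_language_parameter_readiness",
    "static_validation",
    "direct_mock_execution",
    "pseudo_production_handoff",
)


def inner_network_blockers(rematerialized: bool, refreshed_layers: set) -> list:
    """List what still blocks further inner-network testing; empty means clear to proceed."""
    blockers = []
    if not rematerialized:
        blockers.append("full 102-scene rematerialization has not run")
    for layer in REQUIRED_LAYERS:
        if layer not in refreshed_layers:
            blockers.append(f"validation layer not refreshed: {layer}")
    return blockers


# Hypothetical state: rematerialized, but only two layers refreshed so far
for blocker in inner_network_blockers(True, {"static_validation", "direct_mock_execution"}):
    print(blocker)
```

Keeping the layers in a single ordered tuple means the gate and any report built from it list the layers in the same order as the plan.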

## Immediate Next Output

This parent plan should immediately lead to a new bounded child plan:

- `2026-04-20-generated-scene-source-evidence-cross-scan-plan.md`

That child plan should perform the actual source cross-scan over the 102 original scenes.

## Stop Statement

Stop after publishing this parent plan and its design.

Do not execute the source cross-scan or implementation inside this plan.