Generated Scene Source-First Runtime Semantics Hardening Design
Date: 2026-04-20
Status: Draft
Supersedes: docs/superpowers/specs/2026-04-20-generated-scene-runtime-semantics-gap-analysis-design.md
Upstream Parent: docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md
Upstream Materialization: tests/fixtures/generated_scene/scene_skill_102_final_materialization_manifest_2026-04-19.json
Intent
Define the next parent roadmap for generated_scene after framework closure has already been achieved.
The purpose is no longer:
- whether the 102 scenes can be generated into skills
That has already been proven.
The purpose is now:
- scan the original 102 source scenes for runtime-semantics evidence
- identify all scenes that can reproduce the same class of divergence exposed by sweep-030-scene
- harden analyzer / generator / manifest rules at the rule level rather than scene-by-scene
- regenerate the full 102 skill set from the hardened rules
- rerun validation assets so future inner-network execution does not rediscover the same class of defects one scene at a time
This design deliberately moves from a weak generated-skill-first analysis to a stronger source-first analysis and regeneration program.
Why the Previous Analysis Was Not Enough
The superseded analysis-only design focused mainly on the already-generated skill assets.
That is insufficient for the actual project goal, because the goal is not simply to describe gaps that already surfaced in generated skills. The goal is to:
- proactively find other source scenes with the same latent runtime-semantics risks as sweep-030-scene
- correct the generation rules once
- regenerate the full 102-scene bundle
- avoid repeated inner-network rediscovery of the same class of defects
Therefore the correct parent approach must be source-first.
Anchor Problem Family
sweep-030-scene / 台区线损大数据-月_周累计线损率统计分析 exposed five reusable gap classes:
- invocation_alias_gap
- dictionary_recovery_gap
- parameter_default_semantics_gap
- resolver_to_request_mapping_gap
- runtime_url_semantics_gap
The roadmapping problem is no longer “fix sweep-030”.
It is:
find every source scene in the current 102 set that can reproduce one or more of these five gap classes, then harden generation rules and rematerialize the whole set
Source-First Principle
For this roadmap, the original source scenes are the primary truth.
Generated skills are secondary, derived artifacts used for comparison.
This means:
- risk discovery starts from original source-scene files, not from generated output alone
- generated skills are used to measure what is missing compared with source evidence
- implementation targets rule-level recovery, not scene-name patching
- the roadmap is incomplete until the full 102 skills are regenerated from hardened rules
Scope
In scope:
- Scan the original 102 source-scene directories under:
D:/desk/智能体资料/全量业务场景/一平台场景
- Cross-map each source scene to the current final generated skill
- Detect source-side evidence for the five runtime-semantics gap classes
- Produce a full risk ledger for all 102 scenes
- Define the bounded implementation routes required to harden generation rules
- Define the required full rematerialization and validation refresh after rule changes
Out of scope:
- Inner-network execution itself
- Login / credential handling
- Host-bridge runtime hardening outside current generated-scene semantics
- Scene-by-scene ad hoc inner-network patching as the primary method
Problem Restatement
The repository already reached:
- 102 / 102 framework auto-pass
- 102 / 102 materialized skills
- deterministic invocation readiness
- full direct mock pass
But sweep-030-scene proved that generated skills can still diverge from original scene runtime semantics in ways that only surface when actually invoked in a browser-attached environment.
The project cannot sustainably close that gap by waiting for each scene to fail in inner-network execution.
The missing capability is:
source-first runtime semantics extraction and rule hardening
Runtime-Semantics Gap Taxonomy
The five anchor gap classes remain the canonical taxonomy.
1. invocation_alias_gap
The original scene affords natural operator phrasing, but the generated deterministic manifest is too narrow.
2. dictionary_recovery_gap
The original scene contains embedded dictionaries, trees, or option structures, but the generated skill only restores a starter subset or no dictionary.
3. parameter_default_semantics_gap
The original page supplies default time / mode / org semantics, but the generated skill initially treats the parameter as explicitly required.
4. resolver_to_request_mapping_gap
The generated resolver output names are not the actual request payload field names used by the original page.
5. runtime_url_semantics_gap
The generated skill does not properly separate:
- app-entry URL
- module-route URL
- API endpoint URL
- runtime browser context URL
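One way to keep the four URL kinds apart is to give each its own field rather than a single `url` string. The following is a minimal sketch under that assumption; the `RuntimeUrls` class and its field and method names are hypothetical and do not come from the existing generator.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class RuntimeUrls:
    """Hypothetical container: one field per URL kind listed above."""

    app_entry: Optional[str] = None
    module_route: Optional[str] = None
    api_endpoint: Optional[str] = None
    browser_context: Optional[str] = None

    def missing_kinds(self) -> list[str]:
        """Return the URL kinds for which no value was recovered."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def conflated_kinds(self) -> list[tuple[str, str]]:
        """Return pairs of URL kinds that carry the exact same value."""
        present = [
            (f.name, getattr(self, f.name))
            for f in fields(self)
            if getattr(self, f.name) is not None
        ]
        return [
            (a, b)
            for i, (a, url_a) in enumerate(present)
            for b, url_b in present[i + 1:]
            if url_a == url_b
        ]
```

A skill whose `conflated_kinds()` result is non-empty, or whose `missing_kinds()` result includes a kind the source scene clearly defines, would be a candidate for runtime_url_semantics_gap.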
New Required Source-Side Scan
The new parent roadmap must explicitly scan the original source scenes for high-signal evidence.
Evidence families to scan
- Dictionary files
  - city.js
  - dict.js
  - enum.js
  - options*.js
  - tree / option / label-code-value arrays
- Default-parameter semantics
  - moment(
  - dayjs(
  - month/week defaulting
  - implicit query payload initialization
- Request payload semantics
  - $.ajax
  - fetch
  - contentType
  - data
  - request body field names
- Runtime URL semantics
  - app entry URLs
  - module route URLs
  - menu navigation targets
  - bootstrap candidates
- Invocation alias evidence
  - titles
  - menu labels
  - button text
  - route names
  - report names
  - operator-facing wording
Required output of the scan
For each source scene:
- whether embedded dictionaries exist
- whether page defaults exist
- whether request-field aliasing exists
- whether multiple URL kinds exist
- whether natural alias variation is likely
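A first-pass scan for some of these signals could be as simple as file-name and call-site matching. The sketch below is illustrative only: the `SceneEvidence` record, the `scan_scene` function, and the regular expressions are assumptions, and it covers only the dictionary, default, request-call, and URL-literal signals. Request-field aliasing and alias variation need a comparison against the generated skill and are not attempted here.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatch

# File-name patterns taken from the "Dictionary files" evidence family.
DICTIONARY_FILE_PATTERNS = ("city.js", "dict.js", "enum.js", "options*.js")
# Call-site markers from the default-parameter and request-payload families.
DEFAULT_MARKERS = re.compile(r"\b(?:moment|dayjs)\s*\(")
REQUEST_MARKERS = re.compile(r"\$\.ajax\s*\(|\bfetch\s*\(")
# Assumption: every distinct absolute http(s) literal is one URL candidate.
URL_LITERAL = re.compile(r"https?://[^\s'\"]+")


@dataclass
class SceneEvidence:
    """Hypothetical per-scene scan record."""

    has_embedded_dictionaries: bool
    has_page_defaults: bool
    has_request_calls: bool
    url_candidates: int


def scan_scene(files: dict[str, str]) -> SceneEvidence:
    """Scan one scene, given a mapping of bare file name to file text."""
    text = "\n".join(files.values())
    return SceneEvidence(
        has_embedded_dictionaries=any(
            fnmatch(name.lower(), pattern)
            for name in files
            for pattern in DICTIONARY_FILE_PATTERNS
        ),
        has_page_defaults=bool(DEFAULT_MARKERS.search(text)),
        has_request_calls=bool(REQUEST_MARKERS.search(text)),
        url_candidates=len(set(URL_LITERAL.findall(text))),
    )
```

Text matching of this kind will produce false positives and misses, so its output is better treated as a prompt for closer inspection than as a final ledger value.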
Work Product Hierarchy
The roadmap should produce three layers of output.
Layer 1: Source-Side Risk Ledger
A full 102-scene ledger that starts from original source evidence.
Layer 2: Rule-Hardening Route Map
A route map that groups scenes by reusable rule fixes rather than by scene name.
Layer 3: Rematerialization + Validation Refresh Plan
A controlled plan for regenerating all 102 skills and refreshing validation assets after the rule changes land.
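The step from Layer 1 to Layer 2 amounts to inverting the ledger: instead of listing gaps per scene, list scenes per gap class. A minimal sketch follows, assuming a hypothetical ledger entry shape with `scene_id` and `gaps` keys; the real ledger schema is not defined in this design.

```python
# The five canonical gap classes named in this design.
GAP_CLASSES = (
    "invocation_alias_gap",
    "dictionary_recovery_gap",
    "parameter_default_semantics_gap",
    "resolver_to_request_mapping_gap",
    "runtime_url_semantics_gap",
)


def build_route_map(ledger: list[dict]) -> dict[str, list[str]]:
    """Group scene ids by gap class so each group maps to one reusable rule fix."""
    routes: dict[str, list[str]] = {gap: [] for gap in GAP_CLASSES}
    for entry in ledger:
        for gap in entry["gaps"]:
            if gap not in routes:
                # Reject labels outside the canonical taxonomy.
                raise ValueError(f"unknown gap class: {gap}")
            routes[gap].append(entry["scene_id"])
    return routes
```

Grouping this way keeps the unit of work a rule fix shared by many scenes, which is what distinguishes the route map from scene-name patching.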
Core Routes
The source-first roadmap must be split into these fixed routes:
Route A: Source Cross-Scan and Evidence Ledger
Goal:
Build a full 102-scene source-first runtime-semantics risk inventory.
Route B: Rule-Level Hardening Design
Goal:
Translate the source-first gaps into rule-level changes for analyzer/generator/manifest output.
Primary targets:
- alias generation
- dictionary extraction
- parameter default recovery
- resolver-to-request field mapping
- runtime URL classification
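As an illustration of what a rule-level change might look like for one of these targets, resolver-to-request field mapping can be expressed as an explicit rename table applied to resolver output. This is a sketch only: the `to_request_payload` function and the field names in the usage lines are hypothetical and are not taken from any source scene.

```python
def to_request_payload(
    resolved: dict[str, object], field_map: dict[str, str]
) -> dict[str, object]:
    """Rename resolver outputs to the payload field names the original page sends."""
    unmapped = sorted(set(resolved) - set(field_map))
    if unmapped:
        # Fail loudly instead of sending a field name the page never used.
        raise KeyError(f"resolver outputs without a request field: {unmapped}")
    return {field_map[name]: value for name, value in resolved.items()}


# Hypothetical field names, for illustration only.
payload = to_request_payload(
    {"org_code": "0101", "stat_month": "2026-03"},
    {"org_code": "orgNo", "stat_month": "statMonth"},
)
```

Raising on unmapped names is a deliberate choice in this sketch: a silent pass-through would reproduce the resolver_to_request_mapping_gap rather than surface it.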
Route C: Bounded Implementation Slices
Goal:
Implement the rule-level hardening in bounded slices organized by reusable fix route, not by single scene.
Route D: Full 102 Rematerialization
Goal:
Regenerate all 102 skills after hardening so the new rules actually propagate to the released skill bundle.
Route E: Validation Refresh
Goal:
Refresh:
- deterministic invocation readiness
- parameter readiness
- static validation
- direct mock execution
- offline / pseudo-production handoff assets
Inputs
Primary source inventory:
D:/desk/智能体资料/全量业务场景/一平台场景
Primary generated comparison inventory:
examples/scene_skill_102_final_materialization_2026-04-19/skills
Supporting assets:
- tests/fixtures/generated_scene/scene_skill_102_final_materialization_manifest_2026-04-19.json
- tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_readiness_after_keyword_refinement_2026-04-20.json
- tests/fixtures/generated_scene/scene_skill_102_natural_language_parameter_readiness_2026-04-20.json
- tests/fixtures/generated_scene/scene_skill_102_parameter_dictionary_template_normalization_2026-04-20.json
Deliverables
1. Source-first risk ledger
tests/fixtures/generated_scene/generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json
2. Source-first analysis report
docs/superpowers/reports/2026-04-20-generated-scene-source-first-runtime-semantics-report.md
3. Rule-hardening roadmap outputs
Not implemented in this design, but this design must define the bounded next plans that follow the ledger.
Acceptance Criteria
This design is successful when:
- it explicitly requires source-scene cross-scan over the full 102 set
- it no longer relies on generated-skill-only inspection as the main discovery method
- it makes full rematerialization a required downstream step
- it treats sweep-030-scene as an anchor case, not a one-off patch
- it defines a route from source scan to rule hardening to regeneration
Stop Rule
Stop after publishing the parent design and parent plan.
Do not begin source scanning or implementation inside this design document.