# Generated Scene Runtime Semantics Gap Analysis Design

> Status: Superseded by `docs/superpowers/specs/2026-04-20-generated-scene-source-first-runtime-semantics-hardening-design.md`

## Objective

Produce a bounded, implementation-free analysis of runtime semantics gaps across the final 102 generated scene skills, using `sweep-030-scene` as the anchor case that exposed five concrete gap classes during inner-network validation.

This design does **not** modify analyzer, generator, runtime, skill manifests, or execution assets. It only defines how to analyze and classify the gaps that remain between:

- `generated_scene` framework-level success
- real inner-network invocation / execution equivalence

## Anchor Case

The anchor case is:

- `sweep-030-scene / 台区线损大数据-月_周累计线损率统计分析`

Inner-network debugging exposed the following gap classes:

1. `invocation_alias_gap`
2. `dictionary_recovery_gap`
3. `parameter_default_semantics_gap`
4. `resolver_to_request_mapping_gap`
5. `runtime_url_semantics_gap`

The analysis generalizes these five classes across the full 102-scene final materialization set.
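
As a rough illustration of how the five classes can be carried as tags on a scene, here is a minimal Python sketch. Only the five class identifiers and the anchor scene id come from this design; the `GapClass` enum and the `ANCHOR_CASE_GAPS` mapping are hypothetical names introduced for illustration.

```python
from enum import Enum


class GapClass(str, Enum):
    """The five gap classes exposed by the anchor case (identifiers from this design)."""

    INVOCATION_ALIAS = "invocation_alias_gap"
    DICTIONARY_RECOVERY = "dictionary_recovery_gap"
    PARAMETER_DEFAULT_SEMANTICS = "parameter_default_semantics_gap"
    RESOLVER_TO_REQUEST_MAPPING = "resolver_to_request_mapping_gap"
    RUNTIME_URL_SEMANTICS = "runtime_url_semantics_gap"


# Hypothetical tag set: the anchor case exhibited all five classes.
ANCHOR_CASE_GAPS = {"sweep-030-scene": [g.value for g in GapClass]}
```

Using a string-valued enum keeps the tags directly serializable to JSON while still rejecting misspelled class names at lookup time.
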
## Scope

In scope:

- Analyze the final 102 generated skills under:
  - `examples/scene_skill_102_final_materialization_2026-04-19/skills`
- Inspect:
  - `scene.toml`
  - `SKILL.toml`
  - `references/generation-report.json`
  - `references/org-dictionary.json` where present
  - generated browser scripts where needed for request mapping evidence
- Compare generated assets against source-scene evidence when required to validate dictionary and runtime-url semantics
- Produce a 102-scene gap inventory and summary report

Out of scope:

- Any code change in `src/`
- Any edit to generated skill packages
- Any update to execution board / official board
- Any new pseudo-production execution
- Any new inner-network fix for a specific scene

## Problem Statement

The repository has already reached:

- `102 / 102` framework auto-pass
- `102 / 102` final materialized skills
- deterministic invocation readiness

But `sweep-030-scene` demonstrated that generated skills can still diverge from real runtime semantics in ways not captured by framework-level closure:

- user phrasing differs from canonical scene name
- source scene contains complete org dictionaries not fully recovered into the generated skill
- source page defaults dates / periods while generated invocation initially required explicit period values
- resolver outputs and request field names do not align 1:1
- runtime context URL semantics differ from module-route URL semantics

Therefore the next bounded step is analysis, not implementation.

## Gap Taxonomy

Each scene may be tagged with zero or more of the following gap classes:

### 1. `invocation_alias_gap`

Definition:

- Natural operator phrasing is likely not covered by current deterministic `include_keywords`

Indicators:

- Deterministic keywords only contain canonical scene title
- Scene title includes punctuation / separators / compound mode phrases
- Existing reports already required alias normalization

### 2. `dictionary_recovery_gap`

Definition:

- Source scene contains embedded dictionaries / trees / option arrays, but generated skill only carries a starter subset or no dictionary at all

Indicators:

- Source contains files like `city.js`, `dict.js`, `enum.js`, `options.js`
- Source JS includes tree/option structures with labels/codes/children
- Generated `references/org-dictionary.json` is empty or much smaller than source evidence

### 3. `parameter_default_semantics_gap`

Definition:

- Source page applies default values (date, period, mode, range, org) when user omits them, but generated skill currently treats them as required or unresolved

Indicators:

- Source contains `moment()` / date defaulting / initial query payloads
- Generated parameter readiness previously required explicit user input

### 4. `resolver_to_request_mapping_gap`

Definition:

- Resolved semantic parameters do not align directly with actual request field names or payload layout used by the source page

Indicators:

- Resolver outputs `org_code` while request uses `orgno`, or analogous mismatches
- Generated request template uses placeholders not directly populated by resolver outputs
- Source request payload structure differs from generated request mapping

### 5. `runtime_url_semantics_gap`

Definition:

- Generated skill does not clearly distinguish between app-entry URL, module-route URL, and API endpoint URL for runtime binding

Indicators:

- `scene.toml` only stores one `bootstrap.target_url`
- Inner-network execution shows app-entry URL succeeds while module-route URL fails, or vice versa
- Generation report contains both an app entry and a deeper route candidate

## Inputs

Primary inputs:

- `examples/scene_skill_102_final_materialization_2026-04-19/skills`
- `tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_readiness_after_keyword_refinement_2026-04-20.json`
- `tests/fixtures/generated_scene/scene_skill_102_natural_language_parameter_readiness_2026-04-20.json`
- `tests/fixtures/generated_scene/scene_skill_102_parameter_dictionary_template_normalization_2026-04-20.json`

Anchor-case source evidence:

- `D:/desk/智能体资料/全量业务场景/一平台场景/台区线损大数据-月_周累计线损率统计分析`

## Output Artifacts

### 1. JSON inventory

- `tests/fixtures/generated_scene/generated_scene_runtime_semantics_gap_analysis_2026-04-20.json`

Required structure:

- top-level summary counts by gap class
- per-scene records
- per-risk-bucket grouping

Each scene record should include:

- `sceneId`
- `sceneName`
- `archetype`
- `riskLevel`
- `gaps`
- `evidence`
- `recommendedFixRoutes`

### 2. Human-readable report

- `docs/superpowers/reports/2026-04-20-generated-scene-runtime-semantics-gap-analysis-report.md`

The report must answer:

1. How many scenes likely have each gap type
2. Which families / archetypes are most affected
3. Which gaps are generator-level
4. Which gaps are runtime-only and should not be pushed back into generation
5. Which next implementation routes should be prioritized

## Risk Buckets

Scenes should be grouped into:

- `high`: multi-parameter or runtime-sensitive scenes where inner-network invocation is likely to diverge without further hardening
- `medium`: scenes with likely alias / dictionary / default-semantics issues but lower execution sensitivity
- `low`: scenes with no immediate evidence of these five gap classes

## Acceptance Criteria

This analysis is complete when:

1. All 102 final materialized scenes have a runtime-semantics record
2. `sweep-030-scene` is explicitly analyzed under all applicable gap classes
3. Summary counts exist for all five gap classes
4. Dictionary recovery gap is supported by direct source-vs-generated evidence for the anchor case
5. The report recommends next implementation routes without changing code

## Stop Statement

Stop after publishing the JSON inventory and report. Do not open implementation work from this design.
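
For reference, the inventory shape under Output Artifacts and the mechanical acceptance criteria (1 to 3) could be checked along the following lines. This is a sketch under stated assumptions, not part of the design: the top-level keys `summary`, `scenes`, and `riskBuckets`, the helper names, and the types of `evidence` and `recommendedFixRoutes` are hypothetical; only the per-scene field names, gap class identifiers, and risk levels come from this document.

```python
from collections import Counter
from dataclasses import asdict, dataclass, field

GAP_CLASSES = (
    "invocation_alias_gap",
    "dictionary_recovery_gap",
    "parameter_default_semantics_gap",
    "resolver_to_request_mapping_gap",
    "runtime_url_semantics_gap",
)
RISK_LEVELS = ("high", "medium", "low")


@dataclass
class SceneRecord:
    # Field names follow the per-scene record list in this design.
    sceneId: str
    sceneName: str
    archetype: str
    riskLevel: str
    gaps: list = field(default_factory=list)
    evidence: dict = field(default_factory=dict)  # assumed shape: gap class -> notes
    recommendedFixRoutes: list = field(default_factory=list)


def build_inventory(records):
    """Assemble summary counts, per-scene records, and risk-bucket grouping."""
    counts = Counter(gap for r in records for gap in r.gaps)
    return {
        # Every gap class gets a count, including zero.
        "summary": {gap: counts.get(gap, 0) for gap in GAP_CLASSES},
        "scenes": [asdict(r) for r in records],
        "riskBuckets": {
            level: [r.sceneId for r in records if r.riskLevel == level]
            for level in RISK_LEVELS
        },
    }


def unmet_criteria(inventory, expected_scene_count=102, anchor_id="sweep-030-scene"):
    """Return a list of problems for acceptance criteria 1 to 3; empty means all pass."""
    problems = []
    scene_ids = {s["sceneId"] for s in inventory["scenes"]}
    if len(scene_ids) != expected_scene_count:
        problems.append(f"expected {expected_scene_count} scenes, found {len(scene_ids)}")
    if anchor_id not in scene_ids:
        problems.append(f"anchor case {anchor_id} has no record")
    missing = [gap for gap in GAP_CLASSES if gap not in inventory["summary"]]
    if missing:
        problems.append(f"summary counts missing for: {missing}")
    return problems


# Usage with a single illustrative record (values are placeholders, not findings).
anchor = SceneRecord(
    sceneId="sweep-030-scene",
    sceneName="台区线损大数据-月_周累计线损率统计分析",
    archetype="placeholder-archetype",  # hypothetical value
    riskLevel="high",  # illustrative only
    gaps=list(GAP_CLASSES),
)
inventory = build_inventory([anchor])
print(unmet_criteria(inventory, expected_scene_count=1))
```

Criteria 4 and 5 depend on source-versus-generated evidence and report content, so they are left to manual review in this sketch.
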