# Generated Scene Runtime Semantics Gap Analysis Design

> Status: Superseded by `docs/superpowers/specs/2026-04-20-generated-scene-source-first-runtime-semantics-hardening-design.md`

## Objective

Produce a bounded, implementation-free analysis of runtime semantics gaps across the final 102 generated scene skills, using `sweep-030-scene` as the anchor case that exposed five concrete gap classes during inner-network validation.

This design does **not** modify the analyzer, generator, runtime, skill manifests, or execution assets. It only defines how to analyze and classify the gaps that remain between:

- `generated_scene` framework-level success
- real inner-network invocation / execution equivalence

## Anchor Case

The anchor case is:

- `sweep-030-scene / 台区线损大数据-月_周累计线损率统计分析` (roughly, "station-area line loss big data: monthly / weekly cumulative line loss rate statistical analysis")

Inner-network debugging exposed the following gap classes:

1. `invocation_alias_gap`
2. `dictionary_recovery_gap`
3. `parameter_default_semantics_gap`
4. `resolver_to_request_mapping_gap`
5. `runtime_url_semantics_gap`

The analysis generalizes these five classes across the full 102-scene final materialization set.

## Scope

In scope:

- Analyze the final 102 generated skills under:
  - `examples/scene_skill_102_final_materialization_2026-04-19/skills`
- Inspect:
  - `scene.toml`
  - `SKILL.toml`
  - `references/generation-report.json`
  - `references/org-dictionary.json` where present
  - generated browser scripts where needed for request mapping evidence
- Compare generated assets against source-scene evidence when required to validate dictionary and runtime-url semantics
- Produce a 102-scene gap inventory and summary report

Out of scope:

- Any code change in `src/`
- Any edit to generated skill packages
- Any update to execution board / official board
- Any new pseudo-production execution
- Any new inner-network fix for a specific scene

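
The in-scope inspection step can be sketched as a read-only walk over the skills directory. This is a minimal sketch, assuming each generated skill lives in its own subdirectory of the skills root; the function name `collect_skill_assets` is hypothetical and only the four file paths come from this design.

```python
from pathlib import Path

# Files this design lists for inspection, relative to one skill directory.
INSPECTED_FILES = (
    "scene.toml",
    "SKILL.toml",
    "references/generation-report.json",
    "references/org-dictionary.json",
)


def collect_skill_assets(skills_root: Path) -> dict[str, dict[str, bool]]:
    """Map each skill directory name to the presence of each inspected file.

    Assumption: one subdirectory per skill. Nothing is written or modified.
    """
    assets = {}
    for skill_dir in sorted(p for p in skills_root.iterdir() if p.is_dir()):
        assets[skill_dir.name] = {
            rel: (skill_dir / rel).is_file() for rel in INSPECTED_FILES
        }
    return assets
```

Because the walk only reads the filesystem, it stays within the out-of-scope constraint that generated skill packages are never edited.
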
## Problem Statement

The repository has already reached:

- `102 / 102` framework auto-pass
- `102 / 102` final materialized skills
- deterministic invocation readiness

But `sweep-030-scene` demonstrated that generated skills can still diverge from real runtime semantics in ways not captured by framework-level closure:

- user phrasing differs from the canonical scene name
- the source scene contains complete org dictionaries that are not fully recovered into the generated skill
- the source page defaults dates / periods, while generated invocation initially required explicit period values
- resolver outputs and request field names do not align 1:1
- runtime context URL semantics differ from module-route URL semantics

Therefore the next bounded step is analysis, not implementation.

## Gap Taxonomy

Each scene may be tagged with zero or more of the following gap classes:

### 1. `invocation_alias_gap`

Definition:

- Natural operator phrasing is likely not covered by the current deterministic `include_keywords`

Indicators:

- Deterministic keywords only contain the canonical scene title
- Scene title includes punctuation / separators / compound mode phrases
- Existing reports already required alias normalization

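
The first two indicators above can be checked mechanically. The following is a minimal sketch, assuming the scene title and its `include_keywords` have already been loaded as plain strings; the function name, the indicator labels, and the separator set are hypothetical and not part of the existing tooling.

```python
import re

# Assumed separator set; extend it if scene titles use other punctuation.
SEPARATORS = re.compile(r"[-_/()（）、\s]")


def alias_gap_indicators(scene_name: str, include_keywords: list[str]) -> list[str]:
    """Return the names of the alias-gap indicators that fire for one scene."""
    indicators = []
    keywords = {k.strip() for k in include_keywords if k.strip()}
    # No keyword other than the canonical title (or no keyword at all).
    if keywords <= {scene_name}:
        indicators.append("keywords_only_canonical_title")
    # Titles with separators are rarely typed verbatim by operators.
    if SEPARATORS.search(scene_name):
        indicators.append("title_has_separators")
    return indicators
```

The third indicator (reports that already required alias normalization) depends on report contents and is left to manual review in this sketch.
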
### 2. `dictionary_recovery_gap`

Definition:

- Source scene contains embedded dictionaries / trees / option arrays, but the generated skill only carries a starter subset or no dictionary at all

Indicators:

- Source contains files like `city.js`, `dict.js`, `enum.js`, `options.js`
- Source JS includes tree/option structures with labels/codes/children
- Generated `references/org-dictionary.json` is empty or much smaller than source evidence

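
One way to quantify "much smaller than source evidence" is to compare node counts. This is a minimal sketch, assuming both the source option tree and the generated dictionary have been parsed into nested lists of dicts that use a `children` key; that schema and both function names are assumptions, since this design does not fix the layout of `org-dictionary.json`.

```python
def count_nodes(tree) -> int:
    """Count dict nodes in a nested list/dict option tree."""
    if isinstance(tree, list):
        return sum(count_nodes(item) for item in tree)
    if isinstance(tree, dict):
        # Assumed schema: child options live under the "children" key.
        return 1 + count_nodes(tree.get("children", []))
    return 0


def dictionary_recovery_ratio(source_tree, generated_tree) -> float:
    """Fraction of source nodes present (by count) in the generated dictionary."""
    source_total = count_nodes(source_tree)
    if source_total == 0:
        return 1.0  # Nothing to recover, so no gap by this measure.
    return count_nodes(generated_tree) / source_total
```

A ratio well below 1.0 suggests a starter subset; the cutoff that separates "subset" from "complete" would need to be chosen during the analysis.
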
### 3. `parameter_default_semantics_gap`

Definition:

- Source page applies default values (date, period, mode, range, org) when the user omits them, but the generated skill currently treats them as required or unresolved

Indicators:

- Source contains `moment()` / date defaulting / initial query payloads
- Generated parameter readiness previously required explicit user input

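
The first indicator lends itself to a simple text scan of the source JS. This is a rough sketch, assuming the source script is available as a string; the `moment()` pattern comes from the indicator above, while the `new Date(` pattern and the function name are illustrative assumptions.

```python
import re

# Patterns that hint at client-side date defaulting in source JS.
DEFAULT_HINTS = {
    "moment_call": re.compile(r"\bmoment\s*\("),
    "new_date": re.compile(r"\bnew\s+Date\s*\("),  # assumed additional hint
}


def find_default_hints(js_source: str) -> list[str]:
    """Return the names of the default-value hints found in one JS source."""
    return [name for name, pattern in DEFAULT_HINTS.items() if pattern.search(js_source)]
```

A hit only shows that the page computes dates somewhere; confirming that the value is used as a query default still requires reading the initial query payload.
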
### 4. `resolver_to_request_mapping_gap`

Definition:

- Resolved semantic parameters do not align directly with the actual request field names or payload layout used by the source page

Indicators:

- Resolver outputs `org_code` while the request uses `orgno`, or analogous mismatches
- Generated request template uses placeholders not directly populated by resolver outputs
- Source request payload structure differs from generated request mapping

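
The placeholder indicator can be expressed as a set difference. This is a minimal sketch, assuming request templates mark placeholders as `{{name}}`; that syntax and the function name are assumptions, and the real generated templates may use a different convention.

```python
import re

# Assumed placeholder syntax: {{name}}
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def unmapped_placeholders(request_template: str, resolver_outputs: set[str]) -> set[str]:
    """Return template placeholders that no resolver output populates directly."""
    return set(PLACEHOLDER.findall(request_template)) - resolver_outputs
```

For the `org_code` / `orgno` mismatch named above, a template containing `{{orgno}}` checked against resolver outputs `{"org_code", "period"}` would report `orgno` as unmapped.
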
### 5. `runtime_url_semantics_gap`

Definition:

- Generated skill does not clearly distinguish between app-entry URL, module-route URL, and API endpoint URL for runtime binding

Indicators:

- `scene.toml` only stores one `bootstrap.target_url`
- Inner-network execution shows the app-entry URL succeeds while the module-route URL fails, or vice versa
- Generation report contains both an app entry and a deeper route candidate

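
To make the three URL roles concrete, the sketch below classifies a URL by shape alone. The rules are illustrative assumptions (an `api` path segment marks an endpoint; a hash fragment or a multi-segment path marks a module route) and would need to be checked against real scene URLs before being relied on.

```python
from urllib.parse import urlsplit


def classify_url(url: str) -> str:
    """Classify a URL as "api_endpoint", "module_route", or "app_entry"."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if "api" in segments:
        return "api_endpoint"
    # Hash-routed single-page apps keep the module route in the fragment.
    if parts.fragment.strip("/") or len(segments) > 1:
        return "module_route"
    return "app_entry"
```

A scene whose single `bootstrap.target_url` classifies as a module route, while its generation report also lists an app entry, would be a candidate for this gap class.
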
## Inputs

Primary inputs:

- `examples/scene_skill_102_final_materialization_2026-04-19/skills`
- `tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_readiness_after_keyword_refinement_2026-04-20.json`
- `tests/fixtures/generated_scene/scene_skill_102_natural_language_parameter_readiness_2026-04-20.json`
- `tests/fixtures/generated_scene/scene_skill_102_parameter_dictionary_template_normalization_2026-04-20.json`

Anchor-case source evidence:

- `D:/desk/智能体资料/全量业务场景/一平台场景/台区线损大数据-月_周累计线损率统计分析`

## Output Artifacts

### 1. JSON inventory

- `tests/fixtures/generated_scene/generated_scene_runtime_semantics_gap_analysis_2026-04-20.json`

Required structure:

- top-level summary counts by gap class
- per-scene records
- per-risk-bucket grouping

Each scene record should include:

- `sceneId`
- `sceneName`
- `archetype`
- `riskLevel`
- `gaps`
- `evidence`
- `recommendedFixRoutes`

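
The required structure can be sketched as a small data model. The seven record field names come from the list above; the field types, the top-level keys `summary`, `scenes`, and `riskBuckets`, and the function name are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import asdict, dataclass, field

GAP_CLASSES = (
    "invocation_alias_gap",
    "dictionary_recovery_gap",
    "parameter_default_semantics_gap",
    "resolver_to_request_mapping_gap",
    "runtime_url_semantics_gap",
)


@dataclass
class SceneRecord:
    sceneId: str
    sceneName: str
    archetype: str
    riskLevel: str  # "high", "medium", or "low"
    gaps: list[str] = field(default_factory=list)
    # Assumed shape: gap class -> list of evidence notes.
    evidence: dict[str, list[str]] = field(default_factory=dict)
    recommendedFixRoutes: list[str] = field(default_factory=list)


def build_inventory(records: list[SceneRecord]) -> dict:
    """Assemble summary counts, per-scene records, and risk-bucket grouping."""
    counts = Counter(gap for record in records for gap in record.gaps)
    buckets = {"high": [], "medium": [], "low": []}
    for record in records:
        buckets[record.riskLevel].append(record.sceneId)
    return {
        "summary": {gap: counts[gap] for gap in GAP_CLASSES},
        "scenes": [asdict(record) for record in records],
        "riskBuckets": buckets,
    }
```

The result contains only plain dicts, lists, strings, and integers, so it can be written out with `json.dump`. Reporting every gap class in `summary`, including those with a zero count, keeps the summary shape stable across runs.
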
### 2. Human-readable report

- `docs/superpowers/reports/2026-04-20-generated-scene-runtime-semantics-gap-analysis-report.md`

The report must answer:

1. How many scenes likely have each gap type?
2. Which families / archetypes are most affected?
3. Which gaps are generator-level?
4. Which gaps are runtime-only and should not be pushed back into generation?
5. Which next implementation routes should be prioritized?

## Risk Buckets

Scenes should be grouped into:

- `high`: multi-parameter or runtime-sensitive scenes where inner-network invocation is likely to diverge without further hardening
- `medium`: scenes with likely alias / dictionary / default-semantics issues but lower execution sensitivity
- `low`: scenes with no immediate evidence of these five gap classes

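
The bucket definitions above could be applied with a rule like the following. This is one possible reading, not a fixed rule: treating `resolver_to_request_mapping_gap` and `runtime_url_semantics_gap` as the "runtime-sensitive" classes, and treating more than one parameter as "multi-parameter", are both assumptions of this sketch.

```python
# Assumed set of gap classes that count as runtime-sensitive.
RUNTIME_SENSITIVE = {
    "resolver_to_request_mapping_gap",
    "runtime_url_semantics_gap",
}


def assign_risk_level(gaps: set[str], parameter_count: int) -> str:
    """Assign "high", "medium", or "low" from a scene's gaps and parameter count."""
    if not gaps:
        return "low"  # No immediate evidence of any of the five gap classes.
    if gaps & RUNTIME_SENSITIVE or parameter_count > 1:
        return "high"
    return "medium"
```

Keeping the rule in one small function makes the bucket boundaries easy to revise once the per-scene evidence has been collected.
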
## Acceptance Criteria

This analysis is complete when:

1. All 102 final materialized scenes have a runtime-semantics record
2. `sweep-030-scene` is explicitly analyzed under all applicable gap classes
3. Summary counts exist for all five gap classes
4. Dictionary recovery gap is supported by direct source-vs-generated evidence for the anchor case
5. The report recommends next implementation routes without changing code

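
Criteria 1 through 4 can be checked against the JSON inventory automatically. This is a minimal sketch, assuming the inventory has been loaded into a dict with top-level `summary` and `scenes` keys and that evidence is keyed by gap class; those layout details and the function name are assumptions. Criterion 5 concerns the report and stays a manual check.

```python
GAP_CLASSES = (
    "invocation_alias_gap",
    "dictionary_recovery_gap",
    "parameter_default_semantics_gap",
    "resolver_to_request_mapping_gap",
    "runtime_url_semantics_gap",
)
ANCHOR_ID = "sweep-030-scene"
EXPECTED_SCENES = 102


def check_acceptance(inventory: dict) -> list[str]:
    """Return a list of failed acceptance checks; an empty list means all pass."""
    failures = []
    scenes = inventory.get("scenes", [])
    if len(scenes) != EXPECTED_SCENES:
        failures.append(f"expected {EXPECTED_SCENES} scene records, found {len(scenes)}")
    anchor = next((s for s in scenes if s.get("sceneId") == ANCHOR_ID), None)
    if anchor is None or not anchor.get("gaps"):
        failures.append(f"{ANCHOR_ID} is missing or has no gap classes recorded")
    elif not anchor.get("evidence", {}).get("dictionary_recovery_gap"):
        failures.append(f"{ANCHOR_ID} lacks dictionary_recovery_gap evidence")
    missing = [g for g in GAP_CLASSES if g not in inventory.get("summary", {})]
    if missing:
        failures.append("summary counts missing for: " + ", ".join(missing))
    return failures
```

Returning every failure instead of stopping at the first one lets a single run show how far the inventory is from complete.
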
## Stop Statement

Stop after publishing the JSON inventory and report.

Do not open implementation work from this design.