木炎 956f0c2b68 feat: add generated scene skill platform hardening

2026-04-21 23:19:06 +08:00

Generated Scene Source-First Runtime Semantics Hardening Design

Date: 2026-04-20

Status: Draft

Supersedes: docs/superpowers/specs/2026-04-20-generated-scene-runtime-semantics-gap-analysis-design.md

Upstream Parent: docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md

Upstream Materialization: tests/fixtures/generated_scene/scene_skill_102_final_materialization_manifest_2026-04-19.json

Intent

Define the next parent roadmap for generated_scene, now that framework closure has been achieved.

The purpose is no longer:

to show that the 102 scenes can be generated into skills

That has already been proven.

The purpose is now:

- scan the original 102 source scenes for runtime-semantics evidence
- identify all scenes that can reproduce the same class of divergence exposed by sweep-030-scene
- harden analyzer / generator / manifest rules at the rule level rather than scene-by-scene
- regenerate the full 102 skill set from the hardened rules
- rerun validation assets so future inner-network execution does not rediscover the same class of defects one scene at a time

This design deliberately moves from a weak generated-skill-first analysis to a stronger source-first analysis and regeneration program.

Why the Previous Analysis Was Not Enough

The superseded analysis-only design focused mainly on the already-generated skill assets.

That is insufficient for the actual project goal, because the goal is not simply to describe gaps that already surfaced in generated skills. The goal is to:

- proactively find other source scenes with the same latent runtime-semantics risks as sweep-030-scene
- correct the generation rules once
- regenerate the full 102-scene bundle
- avoid repeated inner-network rediscovery of the same class of defects

Therefore the correct parent approach must be source-first.

Anchor Problem Family

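sweep-030-scene / 台区线损大数据-月_周累计线损率统计分析 (roughly: transformer-district line-loss big data, monthly/weekly cumulative line-loss rate statistical analysis) exposed five reusable gap classes: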

- invocation_alias_gap
- dictionary_recovery_gap
- parameter_default_semantics_gap
- resolver_to_request_mapping_gap
- runtime_url_semantics_gap

The roadmapping problem is no longer “fix sweep-030”.

It is:

find every source scene in the current 102 set that can reproduce one or more of these five gap classes, then harden generation rules and rematerialize the whole set

Source-First Principle

For this roadmap, the original source scenes are the primary truth.

Generated skills are secondary, derived artifacts used for comparison.

This means:

- risk discovery starts from original source-scene files, not from generated output alone
- generated skills are used to measure what is missing compared with source evidence
- implementation targets rule-level recovery, not scene-name patching
- the roadmap is incomplete until the full 102 skills are regenerated from hardened rules

Scope

In scope:

- Scan the original 102 source-scene directories under:
  - D:/desk/智能体资料/全量业务场景/一平台场景
- Cross-map each source scene to the current final generated skill
- Detect source-side evidence for the five runtime-semantics gap classes
- Produce a full risk ledger for all 102 scenes
- Define the bounded implementation routes required to harden generation rules
- Define the required full rematerialization and validation refresh after rule changes

Out of scope:

- Inner-network execution itself
- Login / credential handling
- Host-bridge runtime hardening outside current generated-scene semantics
- Scene-by-scene ad hoc inner-network patching as the primary method

Problem Restatement

The repository has already reached:

- 102 / 102 framework auto-pass
- 102 / 102 materialized skills
- deterministic invocation readiness
- full direct mock pass

But sweep-030-scene proved that generated skills can still diverge from original scene runtime semantics in ways that only surface when actually invoked in a browser-attached environment.

The project cannot sustainably close that gap by waiting for each scene to fail in inner-network execution.

The missing capability is:

source-first runtime semantics extraction and rule hardening

Runtime-Semantics Gap Taxonomy

The five anchor gap classes remain the canonical taxonomy.

1. `invocation_alias_gap`

The original scene affords natural operator phrasing, but the generated deterministic manifest is too narrow.
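As a rough illustration of this gap, the sketch below checks an operator utterance against a manifest's trigger phrases. The function name, the alias lists, and the use of case-insensitive substring matching are all assumptions made for illustration; the actual manifest matching logic is not described in this design.

```python
def matches_any_alias(utterance: str, aliases: list[str]) -> bool:
    """Return True if any alias occurs in the utterance (case-insensitive substring match)."""
    normalized = utterance.casefold()
    return any(alias.casefold() in normalized for alias in aliases)


# Hypothetical alias sets: a narrow manifest versus one widened from source evidence.
narrow = ["monthly line loss rate statistics"]
widened = narrow + ["line loss report", "weekly line loss"]

utterance = "Show me the weekly line loss for my district"
print(matches_any_alias(utterance, narrow))   # False: the narrow manifest misses it
print(matches_any_alias(utterance, widened))  # True: a source-derived alias catches it
```

The point of the comparison is that the widened list would come from source evidence such as titles and menu labels, not from hand-tuning per scene.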

2. `dictionary_recovery_gap`

The original scene contains embedded dictionaries, trees, or option structures, but the generated skill only restores a starter subset or no dictionary.
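One way to quantify this gap is a coverage ratio between the codes found in the source dictionary and the codes carried by the generated skill. The sketch below is a minimal illustration; the function name and the sample codes are hypothetical and not taken from any actual scene.

```python
def dictionary_coverage(source_codes: set[str], generated_codes: set[str]) -> float:
    """Fraction of source dictionary codes that the generated skill also carries."""
    if not source_codes:
        return 1.0  # nothing to recover, so nothing is missing
    return len(source_codes & generated_codes) / len(source_codes)


# Hypothetical codes: four entries in a source dictionary, one restored in the skill.
source = {"A01", "A02", "A03", "A04"}
starter_subset = {"A01"}
print(dictionary_coverage(source, starter_subset))  # 0.25
```

A ratio below 1.0 would flag the scene for the dictionary-extraction route; a ratio of 0.0 corresponds to the "no dictionary" case.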

3. `parameter_default_semantics_gap`

The original page supplies default time / mode / org semantics, but the generated skill initially treats the parameter as explicitly required.
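The sketch below shows what recovering a page default could look like for a month parameter. It assumes, purely for illustration, that the source page defaults to the previous calendar month; the function names and that default rule are hypothetical and would have to be confirmed from the source scene.

```python
from datetime import date
from typing import Optional


def previous_month(today: date) -> str:
    """Return the month before `today` as YYYY-MM, rolling over the year in January."""
    if today.month == 1:
        year, month = today.year - 1, 12
    else:
        year, month = today.year, today.month - 1
    return f"{year:04d}-{month:02d}"


def resolve_stat_month(explicit: Optional[str], today: date) -> str:
    """Prefer the operator's value; otherwise fall back to the assumed page default."""
    return explicit if explicit else previous_month(today)


print(resolve_stat_month(None, date(2026, 4, 20)))       # 2026-03
print(resolve_stat_month("2026-01", date(2026, 4, 20)))  # 2026-01
```

With a rule like this, the parameter becomes optional in the generated skill instead of blocking invocation when the operator omits it.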

4. `resolver_to_request_mapping_gap`

The generated resolver output names do not match the actual request payload field names used by the original page.
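A minimal sketch of an explicit resolver-to-request mapping step follows. The field names (`org_code`, `orgNo`, `period`, `statMonth`) are invented for illustration and do not come from any source scene; the idea is only that the rename table is recovered from source evidence and applied in one place.

```python
def to_request_payload(resolved: dict, field_map: dict) -> dict:
    """Rename resolver output keys to the request field names the original page sends."""
    missing = [name for name in field_map if name not in resolved]
    if missing:
        raise KeyError(f"resolver output lacks fields: {missing}")
    return {
        request_name: resolved[resolver_name]
        for resolver_name, request_name in field_map.items()
    }


# Hypothetical mapping from resolver names to page request field names.
field_map = {"org_code": "orgNo", "period": "statMonth"}
resolved = {"org_code": "A01", "period": "2026-03"}
print(to_request_payload(resolved, field_map))
# {'orgNo': 'A01', 'statMonth': '2026-03'}
```

Failing loudly on a missing resolver field is a design choice here: a silent omission would reproduce the same divergence at request time.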

5. `runtime_url_semantics_gap`

The generated skill does not properly separate:

- app-entry URL
- module-route URL
- API endpoint URL
- runtime browser context URL
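The four URL kinds above could be kept apart with an explicit type instead of a single `url` field. The sketch below is one possible shape; the class names, field names, and the `example.invalid` URLs are assumptions for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class UrlKind(Enum):
    APP_ENTRY = "app_entry"
    MODULE_ROUTE = "module_route"
    API_ENDPOINT = "api_endpoint"
    RUNTIME_CONTEXT = "runtime_context"


@dataclass(frozen=True)
class SceneUrls:
    app_entry: str
    module_route: str
    api_endpoint: str
    runtime_context: str

    def by_kind(self, kind: UrlKind) -> str:
        """Look up the URL stored for one kind."""
        return getattr(self, kind.value)

    def collapsed_kinds(self) -> list[tuple[UrlKind, UrlKind]]:
        """Pairs of kinds that share one URL, a hint that they were not separated."""
        kinds = list(UrlKind)
        return [
            (a, b)
            for i, a in enumerate(kinds)
            for b in kinds[i + 1:]
            if self.by_kind(a) == self.by_kind(b)
        ]


# Hypothetical scene where the API endpoint was mistakenly reused as the module route.
urls = SceneUrls(
    app_entry="https://example.invalid/app",
    module_route="https://example.invalid/api/report/query",
    api_endpoint="https://example.invalid/api/report/query",
    runtime_context="https://example.invalid/app#/report",
)
print(urls.collapsed_kinds())
```

A non-empty result from `collapsed_kinds()` does not prove a defect, since two kinds can legitimately coincide, but it marks the scene for review under this gap class.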

New Required Source-Side Scan

The new parent roadmap must explicitly scan the original source scenes for high-signal evidence.

Evidence families to scan

- Dictionary files
  - city.js
  - dict.js
  - enum.js
  - options*.js
  - tree / option / label-code-value arrays
- Default-parameter semantics
  - moment(
  - dayjs(
  - month/week defaulting
  - implicit query payload initialization
- Request payload semantics
  - $.ajax
  - fetch
  - contentType
  - data
  - request body field names
- Runtime URL semantics
  - app entry URLs
  - module route URLs
  - menu navigation targets
  - bootstrap candidates
- Invocation alias evidence
  - titles
  - menu labels
  - button text
  - route names
  - report names
  - operator-facing wording
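The evidence families above lend themselves to a pattern-based first pass over source files. The sketch below is a hedged illustration only: the regular expressions are hypothetical starting points chosen from the signals listed, not the patterns an actual scanner uses, and a real scan would need to handle file encodings and minified code.

```python
import re

# Hypothetical signal patterns, one list per evidence family.
EVIDENCE_PATTERNS = {
    "dictionary": [r"\b(?:city|dict|enum|options\w*)\.js\b", r"\bchildren\s*:"],
    "parameter_default": [r"\bmoment\(", r"\bdayjs\("],
    "request_payload": [r"\$\.ajax\b", r"\bfetch\(", r"\bcontentType\b"],
    "runtime_url": [r"https?://[^\s'\"]+", r"\blocation\.href\b"],
    "invocation_alias": [r"<title>[^<]+</title>", r"<button[^>]*>[^<]+</button>"],
}


def scan_text(text: str) -> dict[str, list[str]]:
    """Return, per evidence family, the matched snippets found in one file's text."""
    hits: dict[str, list[str]] = {}
    for family, patterns in EVIDENCE_PATTERNS.items():
        found = []
        for pattern in patterns:
            found.extend(m.group(0) for m in re.finditer(pattern, text))
        if found:
            hits[family] = found
    return hits


# Hypothetical source snippet, not taken from any real scene.
sample = "var m = moment().format('YYYY-MM'); $.ajax({url: '/api/x', contentType: 'application/json'});"
print(scan_text(sample))
# {'parameter_default': ['moment('], 'request_payload': ['$.ajax', 'contentType']}
```

Pattern hits are evidence, not verdicts: a match only says the scene deserves a closer look for that family.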

Required output of the scan

For each source scene:

- whether embedded dictionaries exist
- whether page defaults exist
- whether request-field aliasing exists
- whether multiple URL kinds exist
- whether natural alias variation is likely
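The five per-scene answers above map naturally onto one ledger record per scene. The sketch below shows one possible record shape; the class name, field names, and the flag-to-gap mapping are assumptions for illustration and do not describe the actual ledger schema.

```python
import json
from dataclasses import asdict, dataclass

# Assumed mapping from each scan flag to the gap class it signals.
GAP_BY_FLAG = {
    "has_embedded_dictionaries": "dictionary_recovery_gap",
    "has_page_defaults": "parameter_default_semantics_gap",
    "has_request_field_aliasing": "resolver_to_request_mapping_gap",
    "has_multiple_url_kinds": "runtime_url_semantics_gap",
    "alias_variation_likely": "invocation_alias_gap",
}


@dataclass
class SceneRiskEntry:
    scene_id: str
    has_embedded_dictionaries: bool = False
    has_page_defaults: bool = False
    has_request_field_aliasing: bool = False
    has_multiple_url_kinds: bool = False
    alias_variation_likely: bool = False

    def gap_classes(self) -> list[str]:
        """Gap classes this scene can reproduce, derived from the scan flags."""
        return [gap for flag, gap in GAP_BY_FLAG.items() if getattr(self, flag)]


# Hypothetical scene id and flags.
entry = SceneRiskEntry("example-scene", has_embedded_dictionaries=True, has_page_defaults=True)
print(entry.gap_classes())
print(json.dumps(asdict(entry), indent=2))
```

Keeping the flags as plain booleans makes the ledger easy to serialize to JSON and to diff between scan runs.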

Work Product Hierarchy

The roadmap should produce three layers of output.

Layer 1: Source-Side Risk Ledger

A full 102-scene ledger that starts from original source evidence.

Layer 2: Rule-Hardening Route Map

A route map that groups scenes by reusable rule fixes rather than by scene name.
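Grouping by fix route amounts to inverting the ledger: from scene-to-gaps into gap-to-scenes. The sketch below assumes a simplified ledger shape (a dict from scene id to gap-class names) with hypothetical scene ids.

```python
from collections import defaultdict


def build_route_map(ledger: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert a scene-to-gaps ledger into a gap-to-scenes route map."""
    routes: dict[str, list[str]] = defaultdict(list)
    for scene_id, gaps in ledger.items():
        for gap in gaps:
            routes[gap].append(scene_id)
    return {gap: sorted(scenes) for gap, scenes in sorted(routes.items())}


# Hypothetical ledger with three scenes.
ledger = {
    "scene-b": ["dictionary_recovery_gap"],
    "scene-a": ["dictionary_recovery_gap", "runtime_url_semantics_gap"],
    "scene-c": [],
}
print(build_route_map(ledger))
# {'dictionary_recovery_gap': ['scene-a', 'scene-b'], 'runtime_url_semantics_gap': ['scene-a']}
```

Scenes with no detected gaps drop out of the route map, and each remaining key corresponds to one reusable rule fix.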

Layer 3: Rematerialization + Validation Refresh Plan

A controlled plan for regenerating all 102 skills and refreshing validation assets after the rule changes land.

Core Routes

The source-first roadmap must be split into these fixed routes:

Route A: Source Cross-Scan and Evidence Ledger

Goal:

Build a full 102-scene source-first runtime-semantics risk inventory.

Route B: Rule-Level Hardening Design

Goal:

Translate the source-first gaps into rule-level changes for analyzer/generator/manifest output.

Primary targets:

- alias generation
- dictionary extraction
- parameter default recovery
- resolver-to-request field mapping
- runtime URL classification

Route C: Bounded Implementation Slices

Goal:

Implement the rule-level hardening in bounded slices organized by reusable fix route, not by single scene.

Route D: Full 102 Rematerialization

Goal:

Regenerate all 102 skills after hardening so the new rules actually propagate to the released skill bundle.

Route E: Validation Refresh

Goal:

Refresh:

- deterministic invocation readiness
- parameter readiness
- static validation
- direct mock execution
- offline / pseudo-production handoff assets

Inputs

Primary source inventory:

D:/desk/智能体资料/全量业务场景/一平台场景

Primary generated comparison inventory:

examples/scene_skill_102_final_materialization_2026-04-19/skills

Supporting assets:

- tests/fixtures/generated_scene/scene_skill_102_final_materialization_manifest_2026-04-19.json
- tests/fixtures/generated_scene/scene_skill_102_deterministic_invocation_readiness_after_keyword_refinement_2026-04-20.json
- tests/fixtures/generated_scene/scene_skill_102_natural_language_parameter_readiness_2026-04-20.json
- tests/fixtures/generated_scene/scene_skill_102_parameter_dictionary_template_normalization_2026-04-20.json

Deliverables

1. Source-first risk ledger

tests/fixtures/generated_scene/generated_scene_source_first_runtime_semantics_ledger_2026-04-20.json

2. Source-first analysis report

docs/superpowers/reports/2026-04-20-generated-scene-source-first-runtime-semantics-report.md

3. Rule-hardening roadmap outputs

These outputs are not implemented in this design. The design must, however, define the bounded follow-on plans that build on the ledger.

Acceptance Criteria

This design is successful when:

- it explicitly requires source-scene cross-scan over the full 102 set
- it no longer relies on generated-skill-only inspection as the main discovery method
- it makes full rematerialization a required downstream step
- it treats sweep-030-scene as an anchor case, not a one-off patch
- it defines a route from source scan to rule hardening to regeneration

Stop Rule

Stop after publishing the parent design and parent plan.

Do not begin source scanning or implementation inside this design document.
