claw/docs/superpowers/specs/2026-04-15-generated-scene-skill-platform-design.md

# Generated Scene Skill Platform Design

**Goal:** Evolve `sgClaw` from one-off business-scene integrations into a platform that can generate, register, and invoke staged scene skills through a generic runtime path, while keeping v1 implementation strictly limited to report/collection-oriented `browser_script` scenes.

**Status:** Brainstorming direction approved; ready for formal specification.

---

## Decision Summary

1. `sgClaw` should become a scene-skill platform, not a growing set of per-scene Rust branches.
2. V1 should support only report/collection-oriented `browser_script` scenes generated from existing scenario directories.
3. The generated output must include both the staged skill package and a platform registration manifest so that new scenes can be discovered and invoked with minimal or zero per-scene Rust changes.
4. For the near-term intranet deployment, deterministic mode remains the explicit `。。。` suffix path; v1 invocation requires no model.
5. The design must preserve the existing main architecture, stay close to the current `browser_script` and artifact pipeline, and avoid platform changes that drift into a general workflow engine.
6. The implementation should happen on a new branch copied from `ws`, not directly on the current `ws` branch.
7. The generator and runtime must be separated by explicit contracts so the generator can later be extracted into a standalone project.
8. The platform design must turn the full `tq-lineloss-report` lessons learned into durable documentation and generator input rules so future generated skills do not repeat the same mistakes.

---

## Hard Constraints

### 1. Extensibility is mandatory

The platform must support future extension without forcing redesign of the core contracts. The design must leave clean seams for:

- additional scene types
- additional deterministic matchers
- additional parameter resolver types
- additional tool invokers beyond `browser_script`
- future LLM semantic routing on top of the same registered scene contracts
- future extraction of the generator into a separate project

### 2. Stay on the main line

The core objective is:

- generate staged scene skills from existing scenario directories
- register them automatically
- invoke them through a generic deterministic runtime path

The design must not drift into:

- a full low-code workflow engine
- a general browser RPA authoring platform
- a full login/session orchestration platform in v1
- a broad runtime rewrite unrelated to generated scene skill support

### 3. Preserve the current architecture theme

The design should reuse and generalize the parts of `sgClaw` that already look platform-like:

- skills discovery/loading
- `browser_script` execution seams
- artifact interpretation
- export/postprocess seams
- bootstrap target resolution seams

It must avoid large theme-breaking rewrites of the runtime unless a generic platform seam truly requires them.

### 4. Execution branch strategy

This work is large enough that implementation should not land directly on the active `ws` branch. The future implementation plan must explicitly require:

- start from the current `ws` branch state
- create a new branch copied from `ws`
- perform platform conversion there
- preserve `ws` as the stable reference baseline during the migration

### 5. Generator extraction must remain possible

The generator should not be tightly coupled to `claw-new` internals. The boundary between runtime and generator must be a stable package/manifest contract so the generator can later move into a separate project without redesigning registered scene skills.

### 6. `tq-lineloss-report` lessons learned must become first-class inputs

The design must require a durable lessons-learned document derived from the full `tq-lineloss-report` path, including deterministic routing, canonical parameterization, bootstrap targets, pipe/ws differences, timeout chains, artifact contracts, and Rust-side export constraints.

This document is not an appendix. It is a required generator-design input and future template hardening source.

The document must be split into two layers so it remains enforceable instead of becoming loose prose:

- a structured machine-consumable lessons artifact that generator templates can read or reference deterministically
- a human-oriented narrative/analysis document explaining the why, trade-offs, and debugging history behind those lessons

### 7. Use the superpowers process end-to-end

This design must be carried through the superpowers flow:

- brainstorming
- formal spec
- review loop
- user review
- implementation planning

### 8. Think through the details before implementation

The spec must make the critical details explicit now so execution does not discover foundational contract problems halfway through.

---

## Why This Platform Exists

The current line-loss integration proves that `sgClaw` can support a staged business scene, but it also exposes the current architecture problem:

- the staged skill package exists and is useful
- the `browser_script` execution seam exists and is useful
- the runtime has some generic pieces already
- but deterministic routing, parameter normalization, bootstrap target selection, and scene-specific invocation are still too tied to one-off Rust code

Examples visible in the current code:

- `src/compat/deterministic_submit.rs` hardcodes the line-loss suffix route, target URL, host, scene matcher, org resolver, and period resolver
- `src/service/server.rs:453` already has a more general bootstrap-target seam, but it still delegates deterministic planning to scene-specific logic
- `src/compat/direct_skill_runtime.rs:148` already knows how to resolve and execute a `browser_script` tool from the skills directory, which is a strong existing platform primitive
- `src/runtime/engine.rs:232` already has multi-directory runtime skill loading and browser-surface-aware filtering, which is another platform primitive

The design goal is to promote the reusable parts into a stable platform and move scene-specific behavior into generated packages plus scene manifests.

---

## V1 Scope

### In scope

V1 is strictly limited to report/collection-oriented `browser_script` scenes generated from existing scenario directories.

That means:

- input source is an existing scenario directory containing page assets and business JS logic
- generated output is a staged skill package plus a platform registration manifest
- runtime invocation uses deterministic `。。。` routing only
- execution reuses the existing `browser_script` invocation chain
- output is a structured report artifact plus optional generic report postprocessing such as local XLSX export/open

### Out of scope

V1 does **not** include:

- generic action/authoring scenes such as navigation, form filling, publishing, or editor automation
- arbitrary multi-step workflow orchestration
- session/login orchestration as a generic platform capability
- non-`browser_script` tool generation
- full LLM semantic scene routing implementation
- a universal low-code engine

### Spec-level future seams

The spec **must** define extension interfaces for future use, but those extensions are not part of v1 implementation:

- matcher extension seam for future LLM semantic selection
- resolver extension seam for more complex domain parsing
- invoker extension seam for new tool kinds
- artifact interpreter extension seam for non-report results
- postprocessor extension seam beyond report export/open
- generator packaging seam for future project extraction

---

## Platform Architecture

The recommended platform has five units.

### 1. Scene Source Analyzer

Input:

- an existing scenario directory
- typical source artifacts such as `index.html`, `js/*`, business requests, export calls, state dictionaries, and target pages

Responsibility:

- inspect source structure and collect candidate scene metadata
- identify the likely business page URL/domain
- identify likely collection mode (report/collection in v1)
- extract request-shape hints, output table hints, export/report-log hints, and page dependencies
- record uncertainty instead of guessing when source evidence is incomplete

This unit is analysis-only. It does not perform runtime registration or invocation.

### 2. Skill Generator

Input:

- analyzed scene source description
- generator templates
- lessons-learned rules derived from existing scenes such as `tq-lineloss-report`

Output:

- staged skill package
- platform registration manifest
- generated references and contract docs

Generated package contents for v1:

- `SKILL.toml`
- `SKILL.md`
- `references/collection-flow.md`
- `references/data-quality.md`
- `scripts/*.js`
- `scripts/*.test.js`
- optional scene snapshot assets
- `scene.toml`

The generator is responsible for producing complete registration-ready output, not just scaffolding files.

### 3. Scene Registry Loader

Responsibility:

- scan staged skill directories
- locate `scene.toml`
- validate scene registration contracts
- register scenes into a unified runtime registry

This removes the long-term need for per-scene Rust wiring.

The existing runtime already has useful loading primitives in `src/runtime/engine.rs:361` and skill-dir normalization in `src/compat/config_adapter.rs:90`. V1 should build on those instead of replacing them.
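
As a rough illustration of the scan step, the sketch below collects `scene.toml` files from one skills root using only the standard library. The function name `find_scene_manifests` and the layout it assumes (one package directory per scene directly under the root) are assumptions for illustration; the real loader would presumably sit on top of the existing loading primitives named above.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns the `scene.toml` path of every immediate subdirectory of `root`
/// that contains one, sorted so the result does not depend on directory order.
/// (Hypothetical helper; the one-directory-per-scene layout is an assumption.)
fn find_scene_manifests(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut manifests = Vec::new();
    for entry in fs::read_dir(root)? {
        let candidate = entry?.path().join("scene.toml");
        // Packages without a manifest are plain skills, not registered scenes.
        if candidate.is_file() {
            manifests.push(candidate);
        }
    }
    manifests.sort();
    Ok(manifests)
}
```

Sorting the result keeps later registration and error reporting independent of file-system enumeration order, which matters once duplicate detection and dispatch precedence are tested.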

### 4. Generic Deterministic Dispatcher

Responsibility:

- activate only when the raw instruction ends with `。。。`
- iterate registered scenes, not hardcoded scene branches
- evaluate deterministic match rules declared in `scene.toml`
- resolve required canonical parameters using platform resolver types
- produce either:
  - mismatch / unsupported-scene prompt
  - missing/ambiguous parameter prompt
  - executable scene invocation plan

#### Multi-match and precedence rules

Extensibility means multiple registered scenes may match the same deterministic request. The platform must define this explicitly instead of allowing hidden first-match behavior.

Design rules:

- deterministic dispatch must score candidate scenes through declared match signals rather than raw file-load order
- higher-confidence signals may include page URL/title context, explicit include/exclude keyword fit, and resolver success for required parameters
- plain keyword overlap alone is not sufficient justification for silently choosing one scene when another remains plausible
- if two or more scenes remain materially plausible after deterministic scoring and required-parameter evaluation, the dispatcher must fail closed with an explicit ambiguity prompt rather than guessing
- the future implementation plan must lock the scoring and tie-break order in tests
- bootstrap/page-context signals are allowed to participate in disambiguation, but they must be declared and explainable

This keeps the system extensible without turning new scenes into routing contradictions.

Together with the scene registry, this dispatcher should replace the scene-specific logic currently concentrated in `src/compat/deterministic_submit.rs`.
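
The fail-closed selection rule can be sketched as follows. This is a minimal illustration, not the planned implementation: `DispatchOutcome`, `select_scene`, the integer scores, and the `margin` threshold are all hypothetical, and the actual scoring signals and tie-break order are left to the implementation plan and its tests.

```rust
/// Outcome of deterministic dispatch; names are illustrative, not existing sgClaw types.
#[derive(Debug, PartialEq)]
enum DispatchOutcome {
    NotDeterministic,       // no `。。。` suffix: deterministic mode stays inactive
    Mismatch,               // suffix present, but no registered scene is plausible
    Ambiguous(Vec<String>), // several scenes remain plausible: fail closed
    Selected(String),       // exactly one plausible scene
}

/// `scored` holds (scene id, score) pairs; a score of 0 means "did not match".
/// `margin` is an assumed gap under which two scenes count as equally plausible.
fn select_scene(instruction: &str, scored: &[(&str, u32)], margin: u32) -> DispatchOutcome {
    if !instruction.trim_end().ends_with("。。。") {
        return DispatchOutcome::NotDeterministic;
    }
    let best = scored.iter().map(|(_, s)| *s).max().unwrap_or(0);
    if best == 0 {
        return DispatchOutcome::Mismatch;
    }
    let mut plausible: Vec<String> = scored
        .iter()
        .filter(|(_, s)| *s > 0 && best - *s <= margin)
        .map(|(id, _)| id.to_string())
        .collect();
    plausible.sort(); // reported order must not depend on load order
    if plausible.len() == 1 {
        DispatchOutcome::Selected(plausible.remove(0))
    } else {
        DispatchOutcome::Ambiguous(plausible)
    }
}
```

Under these assumptions, two scenes scoring 5 and 4 with `margin = 1` yield `Ambiguous` regardless of which manifest was loaded first, while scores of 5 and 2 select the first scene outright.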

### 5. Generic Execution Pipeline

Responsibility:

- invoke the resolved tool through the existing `browser_script` seam
- reuse bootstrap target resolution
- interpret the artifact according to the registered artifact contract
- run generic report postprocessing such as Rust-side XLSX export
- keep business-specific interpretation out of the platform core

The strong requirement is to preserve the already-validated common path in:

- `src/compat/direct_skill_runtime.rs`
- `src/compat/browser_script_skill_tool.rs`
- the existing report-artifact and export seams

---

## Scene Registration Contract

The central platform contract is a per-scene registration manifest, named `scene.toml` in this design.

### Why a separate manifest is needed

`SKILL.toml` describes tools. It does not fully describe:

- deterministic routing rules
- scene identity
- platform parameter resolution contracts
- bootstrap target rules
- artifact interpretation rules
- generic postprocessing declarations

Without this manifest, the generator would only create files while the runtime would still need scene-specific Rust changes.

### Manifest responsibilities

Each generated scene manifest must declare:

1. scene identity and runtime entrypoint
2. bootstrap/page context requirements
3. deterministic matching rules
4. parameter schema and resolver mapping
5. execution contract
6. artifact contract
7. postprocess contract
8. schema/version metadata sufficient for long-term generator/runtime evolution

### Manifest versioning and registry rules

The manifest contract must be explicit and versioned from the start.

Required rules:

- every `scene.toml` must declare a manifest schema version independent from the scene version
- the runtime must validate schema compatibility before registration
- scene registration must require globally unique `scene.id` values across all loaded scene roots
- duplicate scene IDs must fail registration deterministically rather than silently overriding an earlier scene
- the default design rule is fail-fast with a clear error that names both conflicting manifest locations; the future implementation plan must confirm this duplicate policy and cover it with tests
- manifest evolution must prefer additive compatibility where possible so a future standalone generator can target the same runtime contract intentionally rather than by coincidence

This versioned contract is part of the extraction seam: it is what allows the runtime and a future standalone generator to evolve without private coupling.
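
A minimal sketch of these registration rules, assuming the manifest has already been parsed: `SceneRegistry`, `RegisterError`, and the numeric `SUPPORTED_SCHEMA` constant are hypothetical names, and treating the schema version as an integer is a simplification of the string value a manifest would carry.

```rust
use std::collections::HashMap;

/// The only manifest schema version this hypothetical runtime accepts.
const SUPPORTED_SCHEMA: u32 = 1;

#[derive(Debug, PartialEq)]
enum RegisterError {
    UnsupportedSchema { path: String, found: u32 },
    DuplicateSceneId { id: String, first: String, second: String },
}

/// Maps each registered `scene.id` to the manifest path that declared it.
#[derive(Default)]
struct SceneRegistry {
    by_id: HashMap<String, String>,
}

impl SceneRegistry {
    fn register(&mut self, id: &str, schema: u32, path: &str) -> Result<(), RegisterError> {
        // Validate schema compatibility before touching the registry.
        if schema != SUPPORTED_SCHEMA {
            return Err(RegisterError::UnsupportedSchema { path: path.to_string(), found: schema });
        }
        // Fail fast on duplicates and name both manifest locations.
        if let Some(first) = self.by_id.get(id) {
            return Err(RegisterError::DuplicateSceneId {
                id: id.to_string(),
                first: first.clone(),
                second: path.to_string(),
            });
        }
        self.by_id.insert(id.to_string(), path.to_string());
        Ok(())
    }
}
```

Because a duplicate returns an error instead of overwriting the earlier entry, the registry content never depends on which scene root was scanned last.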

### Recommended manifest shape

```toml
[scene]
id = "tq-lineloss-report"
skill = "tq-lineloss-report"
tool = "collect_lineloss"
kind = "browser_script"
version = "0.1.0"
category = "report_collection"

[manifest]
schema_version = "1"

[bootstrap]
expected_domain = "20.76.57.61"
target_url = "http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor"
page_title_keywords = ["线损"]
requires_target_page = true

[deterministic]
suffix = "。。。"
include_keywords = ["线损", "月累计", "周累计"]
exclude_keywords = ["知乎"]

[[params]]
name = "org"
resolver = "dictionary_entity"
required = true
prompt_missing = "已命中台区线损报表技能，但缺少供电单位。"
prompt_ambiguous = "已命中台区线损报表技能，但供电单位存在歧义，请补充更完整名称。"

[params.resolver_config]
dictionary_ref = "references/org-dictionary.json"
output_label_field = "org_label"
output_code_field = "org_code"

[[params]]
name = "period"
resolver = "month_week_period"
required = true
prompt_missing = "已命中台区线损报表技能，但缺少统计周期。"
prompt_ambiguous = "已命中台区线损报表技能，但统计周期存在歧义，请补充更明确表达。"

[artifact]
type = "report-artifact"
success_status = ["ok", "partial", "empty"]
failure_status = ["blocked", "error"]

[postprocess]
exporter = "xlsx_report"
auto_open = "excel"
```

### Design rule

`scene.toml` declares behavior. It does not contain business JS code.

- business collection logic stays in `scripts/*.js`
- platform match/resolver selection stays in the manifest
- generic runtime execution stays in the platform

---

## Platform-Provided Generic Capabilities

The platform should expose a small, explicit set of reusable capability types.

### 1. Scene Matchers

V1 deterministic matcher types should stay simple and declarative:

- include keywords
- exclude keywords
- required suffix
- optional page URL/title constraints

This is enough for v1 report scenes and avoids overbuilding NLP into deterministic mode.
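
To show how little logic these matcher types need, here is a sketch of evaluating one `[deterministic]` rule against a raw instruction. `MatchRule` and `keyword_hits` are hypothetical names, and returning the raw count of include-keyword hits as a match signal is an assumption rather than a decided scoring scheme.

```rust
/// Declarative match rule mirroring the `[deterministic]` manifest table.
struct MatchRule<'a> {
    suffix: &'a str,
    include_keywords: &'a [&'a str],
    exclude_keywords: &'a [&'a str],
}

/// Returns the number of include keywords found, or `None` when the rule does not
/// apply (missing suffix, an exclude keyword present, or no include keyword present).
fn keyword_hits(rule: &MatchRule, instruction: &str) -> Option<usize> {
    // The suffix is required; strip it so it never counts as keyword text.
    let body = instruction.trim_end().strip_suffix(rule.suffix)?;
    if rule.exclude_keywords.iter().any(|k| body.contains(*k)) {
        return None;
    }
    let hits = rule.include_keywords.iter().filter(|k| body.contains(**k)).count();
    if hits == 0 { None } else { Some(hits) }
}
```

With the line-loss keywords from the manifest example, an instruction mentioning both `线损` and `月累计` before the suffix would report two hits, and any instruction containing `知乎` would be rejected outright.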

Future seam:

- add a semantic matcher interface for model-based routing later without changing the rest of the scene contract

### 2. Parameter Resolvers

The platform should provide reusable resolver types instead of scene-specific branches.

Recommended v1 resolver types:

- `dictionary_entity`
  - maps aliases to canonical label/code pairs using scene-provided dictionary data
- `month_week_period`
  - parses month/week intent into canonical time payloads
- `fixed_enum`
  - maps deterministic text options into fixed internal values
- `literal_passthrough`
  - preserves an already explicit literal value

Design rule:

If a new scene needs a new resolver **type**, add a reusable platform capability. Do not add a scene-specific Rust branch.
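
As one example of a reusable resolver type, a `dictionary_entity` resolver could look roughly like the sketch below. The `Resolution`, `Entity`, and `resolve_entity` names are hypothetical, the dictionary is passed in memory rather than loaded from the scene's `dictionary_ref`, and plain substring matching is a simplifying assumption.

```rust
/// Result of resolving one parameter; the dispatcher would turn the last two
/// variants into the manifest's `prompt_missing` / `prompt_ambiguous` texts.
#[derive(Debug, PartialEq)]
enum Resolution {
    Resolved { label: String, code: String },
    Missing,
    Ambiguous(Vec<String>),
}

/// One dictionary entry: canonical label, canonical code, and accepted aliases.
struct Entity<'a> {
    label: &'a str,
    code: &'a str,
    aliases: &'a [&'a str],
}

/// An entity is a candidate when its label or any alias occurs in the text.
/// Exactly one candidate resolves; zero or several never fall back to a guess.
fn resolve_entity(text: &str, dictionary: &[Entity]) -> Resolution {
    let hits: Vec<&Entity> = dictionary
        .iter()
        .filter(|e| text.contains(e.label) || e.aliases.iter().any(|a| text.contains(*a)))
        .collect();
    match hits.as_slice() {
        [] => Resolution::Missing,
        [only] => Resolution::Resolved {
            label: only.label.to_string(),
            code: only.code.to_string(),
        },
        many => Resolution::Ambiguous(many.iter().map(|e| e.label.to_string()).collect()),
    }
}
```

The resolver contains no scene knowledge: a new scene supplies a different dictionary, not a new Rust branch. Substring matching would report ambiguity when one label is contained in another, so a real resolver would likely need a longest-match or exact-alias rule on top of this.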

### 3. Bootstrap Resolvers

The platform must be able to produce:

- `expected_domain`
- `target_url`
- page-context validation hints

These should come from registration metadata, not from per-scene hardcoded constants.

This generalizes the existing bootstrap-target seam already present in `src/service/server.rs:453`.
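
A page-context check driven purely by registration metadata might be sketched like this. `Bootstrap`, `host_of`, and `page_matches` are hypothetical names; the host extraction is a deliberately small std-only simplification (no userinfo or IPv6 handling), and requiring every title keyword to be present is an assumption the manifest contract would need to state.

```rust
/// Bootstrap metadata as it would be read from the `[bootstrap]` manifest table.
struct Bootstrap<'a> {
    expected_domain: &'a str,
    page_title_keywords: &'a [&'a str],
}

/// Extracts the host from an `http://` or `https://` URL without external crates.
fn host_of(url: &str) -> Option<&str> {
    let rest = url.strip_prefix("http://").or_else(|| url.strip_prefix("https://"))?;
    // Everything before the first `/` is the authority; drop an optional port.
    let authority = rest.split('/').next()?;
    authority.split(':').next()
}

/// Returns true when the current page matches the declared domain and title keywords.
fn page_matches(b: &Bootstrap, page_url: &str, page_title: &str) -> bool {
    host_of(page_url) == Some(b.expected_domain)
        && b.page_title_keywords.iter().all(|k| page_title.contains(*k))
}
```

Nothing in this check names a scene, so the same function would serve every registered manifest.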

### 4. Tool Invokers

V1 supports one invoker type only:

- `browser_script`

This is intentionally narrow. It keeps the platform close to the existing main architecture and avoids broad redesign.

Future seam:

- later add invokers for other tool kinds without changing scene registration concepts

### 5. Artifact Interpreters and Postprocessors

V1 should provide generic handling for report-style results:

- `report-artifact` interpreter
- `xlsx_report` exporter/postprocessor
- open-after-export policies

The platform should not know about line-loss business fields specifically. It should only know the generic artifact contract.
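
The generic side of artifact interpretation can be as small as classifying the returned status against the lists a scene declares. In this sketch `ArtifactContract`, `ArtifactOutcome`, and `interpret_status` are hypothetical names, and treating an undeclared status as a contract violation is an assumption in the spirit of the fail-closed rules elsewhere in this design.

```rust
/// Status lists as declared in the `[artifact]` manifest table.
struct ArtifactContract<'a> {
    success_status: &'a [&'a str],
    failure_status: &'a [&'a str],
}

#[derive(Debug, PartialEq)]
enum ArtifactOutcome {
    Success,
    Failure,
    ContractViolation, // status not declared by the scene: reported, never guessed
}

/// Classifies an artifact status using only what the manifest declares.
fn interpret_status(contract: &ArtifactContract, status: &str) -> ArtifactOutcome {
    if contract.success_status.iter().any(|s| *s == status) {
        ArtifactOutcome::Success
    } else if contract.failure_status.iter().any(|s| *s == status) {
        ArtifactOutcome::Failure
    } else {
        ArtifactOutcome::ContractViolation
    }
}
```

The interpreter sees status strings only; row fields and business meaning stay inside the generated skill package.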

---

## Generated Skill Package Contract

V1 generated scenes must follow a predictable staged package shape.

### Required generated files

- `SKILL.toml`
- `SKILL.md`
- `scene.toml`
- `references/collection-flow.md`
- `references/data-quality.md`
- `scripts/<entry>.js`
- `scripts/<entry>.test.js`
- optional support assets such as scene snapshots

### V1 generated skill assumptions

Generated report/collection skills must:

- accept normalized canonical args only
- validate expected page context before collection
- avoid re-parsing raw user language inside the script
- return one structured artifact object
- keep page/API collection logic inside the script
- leave generic interpretation/export policy to the platform where possible

### Separation rule

The generated skill package owns:

- page inspection
- page-side state usage
- page/API calls
- row normalization
- scene-local docs and references

The platform owns:

- scene discovery and registration
- deterministic scene selection
- canonical parameter resolution using generic resolver types
- tool invocation
- artifact interpretation
- generic postprocessing

---

## Migration Path from `tq-lineloss` One-Off to Platform Sample

The current line-loss implementation should not be discarded. It should become the first migration sample and platform proof point.

### Why line-loss is the right sample

It already exercised most of the hard problems:

- deterministic routing via `。。。`
- canonical org resolution
- canonical month/week resolution
- staged `browser_script` packaging
- bootstrap target selection
- report artifact shaping
- local export needs
- pipe/ws transport differences
- real browser/runtime timeout and callback-host issues

### Phase A: Extract generic registry and invocation seams

First, add:

- scene registry loader
- manifest reader/validator
- generic deterministic dispatch planning

while preserving the existing `browser_script` execution seam.

### Phase B: Convert `tq-lineloss` into the first manifest-driven scene

Move line-loss specific declarations out of hardcoded Rust branches and into registration data:

- scene identity
- target URL and expected domain
- deterministic scene match rules
- resolver mapping
- artifact/postprocess declarations

Keep the business collection script in the skill package.

### Phase C: Build the generator on top of the stabilized contract

Once line-loss runs through the manifest-driven platform path, define generator templates that produce the same contracts automatically from scenario directories.

### Phase D: Add future semantic routing later

When model access is available, layer semantic routing onto the same registered scene contracts.

The LLM should eventually help with:

- selecting a scene
- filling unresolved parameters

But it should not replace the registered execution contract.

---

## Generator Extraction Boundary

The design must support eventually moving the generator out of `sgClaw`.

### Required extraction seam

The generator and runtime should communicate only through generated artifacts and contracts:

- staged skill package layout
- `scene.toml`
- any scene-local dictionaries/reference data

### Consequence

The runtime must not depend on internal generator implementation details.

This means:

- do not let the runtime call generator internals directly
- do not let the generator rely on private runtime types as its only output format
- keep manifest and package contracts explicit and versionable

This is what makes later extraction into a separate repository practical.

---

## `tq-lineloss-report` Lessons-Learned Document Requirement

The platform design requires a dedicated lessons-learned document based on the full `tq-lineloss-report` implementation and debugging path.

### Why this document is required

The line-loss path uncovered issues that a naive generator would recreate immediately.

These include:

- deterministic routing and prompt semantics
- strict canonical org/period normalization
- no hidden page-default fallback
- target URL / expected-domain / bootstrap-target contracts
- `browser_script` target URL requirements
- artifact shape discipline
- the need for Rust-side XLSX export, because browser-side export to localhost can fail under remote-page origin constraints
- pipe vs ws differences
- callback-host and helper bootstrap timeout risks
- real-world service-console runtime validation gaps

### Required sections

The lessons document should at minimum cover:

1. source-scene assumptions that must be surfaced explicitly
2. deterministic routing pitfalls
3. canonical parameterization pitfalls
4. bootstrap target and page-context pitfalls
5. execution transport pitfalls (pipe/ws)
6. artifact and export pitfalls
7. testing pitfalls
8. manual runtime validation pitfalls
9. what should become generator template rules
10. what should remain scene-specific manual work

### Required format and location

The design requires both lessons artifacts (the human-oriented narrative and the structured rules file) to live in a stable, versioned location under the project docs so future plans and a future standalone generator can depend on them intentionally.

Recommended shape:

- `docs/superpowers/references/tq-lineloss-lessons-learned.md`
  - human-oriented narrative and rationale
- `docs/superpowers/references/tq-lineloss-lessons-learned.toml`
  - structured generator input rules

The TOML artifact should be organized as reusable rule sections such as:

- deterministic routing rules
- canonical parameter rules
- bootstrap/target-url rules
- artifact/postprocess rules
- validation/test checklist rules

The generator should consume the structured TOML rules as template constraints or generation-time validation inputs, while the Markdown document remains the explainability companion for human reviewers.

### How it should be used

Together, the two artifacts become:

- template hardening input for the generator
- a checklist for reviewing generated scenes
- a planning artifact for deciding which pieces can be automated safely

---

## Existing Code Surfaces to Reuse

The design should explicitly build on these current platform-adjacent surfaces rather than replacing them wholesale.

### Skills discovery and loading

- `src/runtime/engine.rs:232` load skills for surface from configurable directories
- `src/runtime/engine.rs:361` load runtime skills across multiple roots
- `src/compat/config_adapter.rs:90` skill-dir normalization

### Generic `browser_script` execution

- `src/compat/direct_skill_runtime.rs:91` raw output execution helper
- `src/compat/direct_skill_runtime.rs:148` tool resolution from staged skills
- `src/compat/browser_script_skill_tool.rs` script loading/wrapping/invocation pipeline

### Bootstrap target resolution seam

- `src/service/server.rs:453` submit bootstrap target resolution

### Current one-off deterministic branch that should be generalized

- `src/compat/deterministic_submit.rs`

The line-loss-specific pieces in that file are the main migration targets for platform conversion.

---

## Failure Semantics

The platform must preserve explicit, business-safe failure semantics.

### Deterministic mismatch

If the request ends with `。。。` but no registered scene matches, the runtime must return an explicit deterministic mismatch response.

### Missing / ambiguous parameters

If a registered scene matches but required parameters cannot be resolved uniquely, the runtime must prompt rather than guess.

### Execution failure

Execution failures should be interpreted according to the registered artifact contract and generic report semantics, not through per-scene special cases in the platform core.

### Design rule

The platform should never silently recover by using page defaults when the scene contract requires canonical inputs.

---

## Verification Requirements for the Future Implementation Plan

The future implementation plan must verify:

1. registry loading from generated scene manifests
2. deterministic dispatch through registered scenes instead of per-scene branches
3. manifest-driven bootstrap target selection
4. manifest-driven parameter resolver dispatch
5. generic `browser_script` invocation of generated scenes
6. generic report artifact interpretation
7. generic XLSX postprocessing compatibility
8. unchanged behavior for existing non-scene core flows outside v1 scope
9. migration of `tq-lineloss` from hardcoded branch to manifest-driven sample
10. branch strategy based on a new branch copied from `ws`
11. lessons-learned document completeness and reuse as generator input
12. separation seam sufficient for future generator extraction

---

## Out of Scope for the V1 Implementation Plan

The future implementation plan should explicitly avoid:

- generic login/session capability as a first-class v1 platform subsystem
- full semantic routing implementation with models
- generalized action workflows
- a full scene DSL runtime
- direct implementation of multiple non-report scene kinds
- replacing the validated core `browser_script` execution path with a new protocol
- broad architectural rewrites unrelated to generated scene skill support

---

## Recommended First Implementation Slice

The most stable first slice is:

1. create the scene manifest contract and validator
2. build a registry loader over existing staged skill directories
3. generalize deterministic dispatch to use registered scenes
4. migrate `tq-lineloss` into the first manifest-driven scene
5. document all line-loss lessons learned
6. only then build the scenario-directory-to-skill generator

This keeps the platform grounded in a working runtime contract before the generator is asked to automate against it.

---

## Final Recommendation

Build `sgClaw` into a generated scene skill platform by separating it into:

- a generic runtime platform that discovers, matches, resolves, invokes, and postprocesses scenes using manifest-driven contracts
- a scenario-directory-to-skill generator that emits staged skill packages and scene registration manifests

Implement v1 only for report/collection-oriented `browser_script` scenes, keep deterministic invocation on the explicit `。。。` suffix, migrate `tq-lineloss` into the first manifest-driven sample, and preserve a clean extraction seam so the generator can later become its own project.