claw/docs/superpowers/plans/2026-04-17-scene-skill-compiler-plan.md
Scene Skill Compiler Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Upgrade sg_scene_generate from a scene-metadata extractor plus template filler into a reusable scene skill compiler that understands workflow semantics, classifies scene archetypes, and generates runnable skills for both tq-lineloss-report-style and marketing-zero-consumer-report-style internal scenes.

Architecture: Introduce a unified Scene IR, switch extraction to a hybrid deterministic-plus-LLM pipeline, route generation by workflowArchetype, align runtime resolver contracts, and add readiness gates so users can tell whether a generated skill is safe to trial on the internal network.

Tech Stack: Rust, Node.js, HTML/CSS/JavaScript, serde_json, OpenAI-compatible LLM API


Scope Check

This plan implements the design in:

  • docs/superpowers/specs/2026-04-17-scene-skill-compiler-design.md

This plan builds on the existing generator work already described in:

  • docs/superpowers/specs/2026-04-17-llm-driven-skill-generation-design.md
  • docs/superpowers/specs/2026-04-17-enhanced-llm-extraction-schema-design.md
  • docs/superpowers/specs/2026-04-17-progressive-template-enhancement-design.md
  • docs/superpowers/specs/2026-04-16-multi-scene-kind-generator-design.md

This plan does not attempt to solve:

  • full login and authentication reconstruction
  • all historical scene patterns in one pass
  • 100% no-touch generation without human review

File Map

Core generator pipeline

| File | Action | Purpose |
| --- | --- | --- |
| frontend/scene-generator/llm-client.js | Modify | Replace truncation-only extraction with chunked workflow-aware extraction and Scene IR schema output |
| frontend/scene-generator/generator-runner.js | Modify | Add deterministic scene scanning, key-fragment selection, and IR support |
| frontend/scene-generator/server.js | Modify | Expose analysis, preview, readiness, and generation endpoints for Scene IR |
| frontend/scene-generator/sg_scene_generator.html | Modify | Show extraction preview, archetype classification, bootstrap, risks, and readiness |

Rust backend

| File | Action | Purpose |
| --- | --- | --- |
| src/generated_scene/analyzer.rs | Modify | Add deterministic extraction helpers and archetype support |
| src/generated_scene/generator.rs | Modify | Route generation by archetype and compile from Scene IR instead of ad hoc fields |
| src/generated_scene/ir.rs | Create | Define unified Scene IR structs and serde contracts |
| src/bin/sg_scene_generate.rs | Modify | Accept Scene IR JSON or file input and pass it into generator |
| src/compat/scene_platform/resolvers.rs | Modify | Align runtime parameter resolution with generated contracts |

Tests and fixtures

| File | Action | Purpose |
| --- | --- | --- |
| tests/scene_generator_test.rs | Modify | Cover new analysis, archetype classification, and generation routing |
| tests/generated_scene_* or related fixtures | Modify/Create | Add representative fixtures for single-request, multi-mode, and paginated-enrichment scenes |

Scope Guardrails

  • Do not break existing --scene-id, --scene-name, or --scene-kind compatibility.
  • Do not require all scenes to provide complete metadata in HTML meta tags.
  • Do not force the runtime to support new resolver contracts unless generation is updated to gate incompatible output.
  • Do not assume all report scenes share org + period params.
  • Do not silently generate low-confidence skills as if they were runnable.

Task 1: Fix Current Hard Failures Before Compiler Refactor

Files:

  • Modify: frontend/scene-generator/llm-client.js
  • Modify: frontend/scene-generator/generator-runner.js
  • Modify: frontend/scene-generator/server.js
  • Modify: frontend/scene-generator/sg_scene_generator.html
  • Modify: src/generated_scene/generator.rs

Goal: Stop the most obvious wrong outputs that currently make generated skills fail on the internal network even before the full compiler architecture lands.

  • Step 1: Remove report-scene hardcoded parameter assumptions

Audit scene.toml generation in src/generated_scene/generator.rs and remove default injection of generic report params such as:

  • fixed org
  • fixed period
  • default dictionary entity for a specific city
  • generic page title keywords like ["报表", "线损"] ("report", "line loss")

Expected result: generated params come from extracted scene semantics or are omitted when not confidently known.

  • Step 2: Rework bootstrap source priority

Change bootstrap derivation so expected_domain and target_url are resolved using this order:

  1. explicit deep extraction result
  2. deterministic extraction from business entry points
  3. HTML meta tags if trustworthy
  4. fallback empty with warning

Explicitly prevent script-host URLs such as static JS includes from becoming the business domain by mistake.
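The four-level priority and the script-host guard can be sketched as follows; the function name, field shape, and host regex are illustrative assumptions, not the actual implementation:

```javascript
// Sketch of the bootstrap priority chain. Candidates are tried in order;
// script-host URLs are rejected so a static JS include can never become
// the business domain by mistake. SCRIPT_HOST_RE is illustrative only.
const SCRIPT_HOST_RE = /\.(js|css)(\?|$)|cdn\.|static\./i;

function resolveBootstrap({ deepExtraction, entryPoints, metaTags }) {
  const candidates = [
    { source: "deep_extraction", value: deepExtraction },   // 1. explicit deep extraction
    { source: "entry_points", value: entryPoints },          // 2. deterministic entry points
    { source: "meta_tags", value: metaTags },                // 3. HTML meta tags if trustworthy
  ];
  for (const { source, value } of candidates) {
    if (value && value.targetUrl && !SCRIPT_HOST_RE.test(value.targetUrl)) {
      return { source, ...value };
    }
  }
  // 4. fallback empty with warning
  return { source: "none", expectedDomain: "", targetUrl: "", warning: "bootstrap could not be resolved" };
}
```

A candidate that only carries a script-host URL falls through to the next source instead of winning by position.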

  • Step 3: Replace naive truncation with chunked extraction input

Update frontend/scene-generator/llm-client.js and frontend/scene-generator/generator-runner.js so they no longer send only the first 15000/3000 characters. Replace with:

  1. directory tree summary
  2. index.html chunking
  3. URL-bearing fragments
  4. request-construction fragments
  5. branching logic fragments
  6. export-related fragments
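A minimal sketch of the fragment-based input builder replacing truncation; the regexes and file shape are illustrative assumptions:

```javascript
// Instead of slicing the first N characters, build the LLM input from a
// directory summary plus labeled fragments. Rules here are illustrative.
const FRAGMENT_RULES = [
  { kind: "url", re: /https?:\/\/[^\s"']+|["'][^"']*\.do["']/g },
  { kind: "request", re: /\$\.ajax\([^)]*\)|fetch\([^)]*\)/g },
  { kind: "branch", re: /if\s*\([^)]*(mode|type)[^)]*\)/gi },
  { kind: "export", re: /export\w*\s*\(/g },
];

function buildExtractionInput(files) {
  const tree = files.map((f) => f.path).join("\n"); // directory tree summary
  const fragments = [];
  for (const file of files) {
    for (const { kind, re } of FRAGMENT_RULES) {
      for (const match of file.content.match(re) || []) {
        fragments.push({ file: file.path, kind, text: match });
      }
    }
  }
  return { tree, fragments };
}
```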

  • Step 4: Add analysis preview and risk banner in Web UI

Update frontend/scene-generator/sg_scene_generator.html and frontend/scene-generator/server.js to preview:

  • detected archetype

  • bootstrap

  • key endpoints

  • extracted params

  • workflow steps

  • confidence and risk notes

  • Step 5: Verify with marketing and tq reference scenes

Run local analysis against the two reference scenes and confirm:

  • marketing-zero-consumer-report no longer resolves the wrong domain

  • tq-lineloss-report still identifies mode-related structures

  • generated preview no longer shows generic hardcoded report params

  • Step 6: Commit

git add frontend/scene-generator/llm-client.js frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html src/generated_scene/generator.rs
git commit -m "fix(generator): remove hardcoded report defaults and improve bootstrap extraction"

Task 2: Introduce Unified Scene IR

Files:

  • Create: src/generated_scene/ir.rs
  • Modify: src/generated_scene/generator.rs
  • Modify: src/bin/sg_scene_generate.rs
  • Modify: frontend/scene-generator/server.js
  • Modify: frontend/scene-generator/llm-client.js

Goal: Introduce a single intermediate representation that all extraction and compilation stages use.

  • Step 1: Add Rust Scene IR structs

Create src/generated_scene/ir.rs with serde-enabled structs for:

  • SceneIr
  • BootstrapIr
  • ParamIr
  • ModeIr
  • WorkflowStepIr
  • ArtifactContractIr
  • NormalizeRulesIr
  • ReadinessIr
  • EvidenceIr

Minimum top-level fields:

{
  "sceneId": "",
  "sceneName": "",
  "sceneKind": "",
  "workflowArchetype": "",
  "bootstrap": {},
  "params": [],
  "modes": [],
  "workflowSteps": [],
  "requestTemplate": {},
  "responsePath": "",
  "normalizeRules": {},
  "artifactContract": {},
  "validationHints": {},
  "evidence": []
}

  • Step 2: Wire Scene IR into generator entrypoints

Update src/bin/sg_scene_generate.rs to accept either:

  • --scene-info-json upgraded to the new IR contract, or
  • a new --scene-ir-json / --scene-ir-file parameter

Keep backward compatibility by translating old scene info into partial IR where needed.
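One possible shape for the backward-compatibility shim, mapping legacy scene-info fields (the names mirror those mentioned elsewhere in this plan) into a partial IR; all field handling here is an illustrative assumption:

```javascript
// Translate legacy --scene-info-json input into a partial Scene IR so both
// input styles flow through the same compiler path.
function legacySceneInfoToIr(info) {
  return {
    sceneId: info.sceneId || "",
    sceneName: info.sceneName || "",
    sceneKind: info.sceneKind || "",
    workflowArchetype: "", // unknown for legacy input; the classifier fills it later
    bootstrap: { expectedDomain: info.expectedDomain || "", targetUrl: info.targetUrl || "" },
    params: (info.staticParams || []).map((p) => ({
      name: p.name,
      resolver: "literal_passthrough", // legacy static params carry no resolver semantics
      value: p.value,
    })),
    modes: [],
    workflowSteps: [],
    requestTemplate: info.requestTemplate || {},
    responsePath: info.responsePath || "",
    normalizeRules: {},
    artifactContract: {},
    validationHints: {},
    evidence: [{ source: "legacy_scene_info" }],
  };
}
```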

  • Step 3: Refactor generator to compile from IR

Update src/generated_scene/generator.rs so its internal interfaces no longer depend directly on loosely grouped fields like expectedDomain, staticParams, and columnDefs; it should compile from the unified SceneIr instead.

  • Step 4: Update Node server to pass IR through generation

Modify frontend/scene-generator/server.js so analyze endpoints return IR-shaped JSON and generate endpoints pass the same structure into Rust without flattening.

  • Step 5: Verify serde and CLI compatibility

Run:

cargo check
node --check frontend/scene-generator/server.js
node --check frontend/scene-generator/llm-client.js

Expected: Rust and Node compile cleanly with the new IR contract.

  • Step 6: Commit

git add src/generated_scene/ir.rs src/generated_scene/generator.rs src/bin/sg_scene_generate.rs frontend/scene-generator/server.js frontend/scene-generator/llm-client.js
git commit -m "feat(generator): introduce unified scene ir for analysis and compilation"

Task 3: Build Hybrid Extraction Pipeline

Files:

  • Modify: src/generated_scene/analyzer.rs
  • Modify: frontend/scene-generator/generator-runner.js
  • Modify: frontend/scene-generator/llm-client.js
  • Modify: frontend/scene-generator/server.js

Goal: Split extraction into deterministic signal collection plus LLM semantic completion.

  • Step 1: Implement deterministic extraction helpers

Add helper logic in src/generated_scene/analyzer.rs or adjacent extraction code to detect:

  • URLs and request methods

  • contentType

  • request payload builders

  • pagination variables such as page, rows, pageSize

  • branch variables such as period_mode, reportType

  • entry methods

  • export methods

  • obvious filter expressions such as charge !== 0
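The deterministic detectors could look roughly like this (the real helpers are planned for analyzer.rs in Rust; this JS mirror and all of its regexes are illustrative only):

```javascript
// Deterministic signal detection over raw scene source. Each rule collects
// literal evidence the LLM stage does not get to overrule.
const SIGNALS = {
  pagination: /\b(page|rows|pageSize|pageNum)\b\s*[:=]/g,
  branch: /\b(period_mode|reportType|mode)\b\s*[:=]/g,
  filter: /\b\w+\s*!==?\s*0\b/g,                 // e.g. charge !== 0
  method: /\btype\s*:\s*["'](GET|POST)["']/gi,   // jQuery-style request method
};

function detectSignals(source) {
  const found = {};
  for (const [kind, re] of Object.entries(SIGNALS)) {
    const hits = source.match(re) || [];
    if (hits.length) found[kind] = hits;
  }
  return found;
}
```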

  • Step 2: Create key-fragment selection in Node runner

Update frontend/scene-generator/generator-runner.js to extract and package:

  • directory summary
  • URL fragments
  • branch fragments
  • request-body fragments
  • response normalization fragments
  • export fragments

for LLM analysis.

  • Step 3: Redesign LLM prompt for workflow understanding

Update frontend/scene-generator/llm-client.js so the prompt explicitly asks for:

  • workflowArchetype

  • bootstrap

  • params

  • modes

  • workflowSteps

  • requestTemplate

  • responsePath

  • normalizeRules

  • artifactContract

  • confidence

  • uncertainties
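A sketch of how the prompt could be derived from that same field list so prompt and parser stay in sync; the wording and fragment shape are illustrative assumptions:

```javascript
// The field list doubles as the JSON contract the model must fill.
const IR_FIELDS = [
  "workflowArchetype", "bootstrap", "params", "modes", "workflowSteps",
  "requestTemplate", "responsePath", "normalizeRules", "artifactContract",
  "confidence", "uncertainties",
];

function buildWorkflowPrompt(fragments) {
  return [
    "Analyze the following scene fragments and return ONLY a JSON object",
    `with exactly these keys: ${IR_FIELDS.join(", ")}.`,
    "If a field cannot be determined, use an empty value and record the reason under uncertainties.",
    "",
    ...fragments.map((f) => `[${f.kind}] ${f.text}`),
  ].join("\n");
}
```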

  • Step 4: Merge deterministic and LLM results

Implement merge logic in frontend/scene-generator/server.js or a dedicated helper:

  • deterministic extraction wins for hard facts

  • LLM fills missing semantics

  • conflicts are surfaced in preview as warnings
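The merge rules above can be sketched as follows (the shape of both inputs is assumed):

```javascript
// Deterministic facts win, the LLM fills gaps, and disagreements become
// preview warnings instead of silent overwrites.
function mergeExtraction(deterministic, llm) {
  const merged = { ...llm };
  const warnings = [];
  for (const [key, hardValue] of Object.entries(deterministic)) {
    if (hardValue === undefined || hardValue === null || hardValue === "") continue;
    const llmValue = llm[key];
    if (llmValue !== undefined && JSON.stringify(llmValue) !== JSON.stringify(hardValue)) {
      warnings.push(`conflict on ${key}: deterministic=${JSON.stringify(hardValue)} llm=${JSON.stringify(llmValue)}`);
    }
    merged[key] = hardValue; // deterministic extraction wins for hard facts
  }
  return { merged, warnings };
}
```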

  • Step 5: Verify against reference workflows

Check that:

  • marketing-zero-consumer-report emits workflow steps including paginate, secondary_request, filter, and export

  • tq-lineloss-report emits modes, defaultMode, and modeSwitchField

  • Step 6: Commit

git add src/generated_scene/analyzer.rs frontend/scene-generator/generator-runner.js frontend/scene-generator/llm-client.js frontend/scene-generator/server.js
git commit -m "feat(generator): add hybrid deterministic and llm workflow extraction"

Task 4: Add Workflow Archetype Classification

Files:

  • Modify: src/generated_scene/analyzer.rs
  • Modify: src/generated_scene/ir.rs
  • Modify: frontend/scene-generator/server.js
  • Modify: frontend/scene-generator/sg_scene_generator.html

Goal: Reliably classify scenes so the correct compiler path is chosen.

  • Step 1: Add archetype enum support

Define and support these initial archetypes:

  • single_request_table

  • multi_mode_request

  • paginated_enrichment

  • page_state_eval

  • Step 2: Implement classification rules

Classification logic should prefer:

  1. multi_mode_request when explicit mode-switch branching exists
  2. paginated_enrichment when paginated list fetch plus secondary requests are detected
  3. page_state_eval when page-state judgment dominates
  4. single_request_table as fallback with lower confidence
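A minimal sketch of this priority order; the signal names and confidence values are hypothetical:

```javascript
// Classification prefers the most specific evidence and falls back to
// single_request_table with lower confidence.
function classifyArchetype(signals) {
  if (signals.modeSwitchBranching) {
    return { archetype: "multi_mode_request", confidence: 0.9 };
  }
  if (signals.paginatedList && signals.secondaryRequests) {
    return { archetype: "paginated_enrichment", confidence: 0.85 };
  }
  if (signals.pageStateChecks) {
    return { archetype: "page_state_eval", confidence: 0.7 };
  }
  return { archetype: "single_request_table", confidence: 0.5 }; // fallback, lower confidence
}
```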

  • Step 3: Expose classification confidence

Add confidence and evidence fields to the preview payload so UI can show why a scene was classified into an archetype.

  • Step 4: Add manual override support in UI

Allow users to override archetype in frontend/scene-generator/sg_scene_generator.html before final generation, but preserve the original detected result and confidence.

  • Step 5: Verify reference classifications

Expected:

  • marketing-zero-consumer-report => paginated_enrichment

  • tq-lineloss-report => multi_mode_request

  • Step 6: Commit

git add src/generated_scene/analyzer.rs src/generated_scene/ir.rs frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html
git commit -m "feat(generator): classify scenes by workflow archetype with confidence"

Task 5: Split Generator Into Archetype Compilers

Files:

  • Modify: src/generated_scene/generator.rs
  • Optionally create: src/generated_scene/compiler_single_request.rs
  • Optionally create: src/generated_scene/compiler_multi_mode.rs
  • Optionally create: src/generated_scene/compiler_paginated_enrichment.rs
  • Optionally create: src/generated_scene/compiler_page_state.rs

Goal: Replace the single generic report template with explicit compiler paths.

  • Step 1: Add compiler routing by archetype

Update src/generated_scene/generator.rs so generation dispatches on workflowArchetype.

  • Step 2: Implement single_request_table compiler

Generate:

  • minimal scene.toml

  • direct request browser script

  • artifact output for simple table/list data

  • Step 3: Implement multi_mode_request compiler

Generate:

  • mode detection
  • mode-specific request builders
  • mode-specific column definitions
  • mode-specific response extraction
  • unified artifact output

Reference target: tq-lineloss-report

  • Step 4: Implement paginated_enrichment compiler

Generate:

  • paginated list loop
  • per-item or batched secondary requests
  • aggregation and transform steps
  • business filters
  • final artifact or export output

Reference target: marketing-zero-consumer-report

  • Step 5: Implement page_state_eval compiler

Generate:

  • state-check script skeleton

  • light artifact semantics for monitoring or status checks

  • Step 6: Verify generated outputs by archetype

Validate that generated scripts no longer:

  • define multiple API endpoints but use only the first

  • collapse mode-aware scenes into one request body

  • flatten paginated enrichment scenes into one-step normalization

  • Step 7: Commit

git add src/generated_scene/generator.rs src/generated_scene/compiler_*.rs
git commit -m "feat(generator): split scene generation into workflow archetype compilers"

Task 6: Align Runtime Resolver Contracts

Files:

  • Modify: src/compat/scene_platform/resolvers.rs
  • Modify: src/generated_scene/generator.rs
  • Modify: src/generated_scene/ir.rs

Goal: Ensure generated parameter contracts are either executable by the runtime or explicitly flagged as unsupported.

  • Step 1: Audit current resolver coverage

Document which current contracts are already supported, including:

  • dictionary_entity

  • month_week_period

  • fixed_enum

  • literal_passthrough

  • Step 2: Add missing resolver types or gate them

Choose one of these paths per parameter type:

  1. implement new runtime resolver support
  2. downgrade generation to an existing supported resolver
  3. block generation with explicit readiness warning

Recommended additions:

  • mode_enum

  • date_range

  • org_tree

  • page_size

  • hidden_static

  • derived_param
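The per-parameter gating decision might be sketched as follows; the downgrade mapping is an illustrative assumption, not a mandated policy:

```javascript
// Each generated param is checked against the runtime's resolver set and
// either passes, downgrades to a supported resolver, or blocks generation.
const SUPPORTED_RESOLVERS = new Set([
  "dictionary_entity", "month_week_period", "fixed_enum", "literal_passthrough",
]);
const DOWNGRADES = { mode_enum: "fixed_enum", hidden_static: "literal_passthrough" };

function gateParam(param) {
  if (SUPPORTED_RESOLVERS.has(param.resolver)) {
    return { ...param, status: "supported" };
  }
  if (DOWNGRADES[param.resolver]) {
    return { ...param, resolver: DOWNGRADES[param.resolver], status: "downgraded" };
  }
  return { ...param, status: "blocked", warning: `resolver ${param.resolver} not supported by runtime` };
}
```

A blocked param would surface as a readiness warning rather than vanishing silently.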

  • Step 3: Reflect runtime compatibility in generated metadata

Generated output should clearly indicate:

  • supported params

  • unresolved params

  • manual-completion requirements

  • Step 4: Add tests for resolver alignment

Extend tests to ensure a generated skill cannot claim runnable readiness when its params require unsupported resolver behavior.

  • Step 5: Commit

git add src/compat/scene_platform/resolvers.rs src/generated_scene/generator.rs src/generated_scene/ir.rs tests/scene_generator_test.rs
git commit -m "feat(runtime): align generated scene contracts with resolver support"

Task 7: Add Readiness Gates And Generation Report

Files:

  • Modify: frontend/scene-generator/server.js
  • Modify: frontend/scene-generator/sg_scene_generator.html
  • Modify: src/generated_scene/ir.rs
  • Modify: src/generated_scene/generator.rs

Goal: Make generation output self-describing so users know whether a skill is ready for internal-network trial.

  • Step 1: Add static readiness checks

Implement checks for:

  • entrypoint detection

  • request-chain completeness

  • bootstrap plausibility

  • param/runtime compatibility

  • archetype compiler completeness

  • Step 2: Add readiness levels

Define:

  • A = ready for direct internal-network trial

  • B = structurally correct, human review recommended

  • C = draft only, manual completion required
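A sketch of the grading gate, assuming boolean results from the Step 1 checks; which failures force grade C is an illustrative choice:

```javascript
// Map static-check results onto the A/B/C readiness levels defined above.
const CHECKS = ["entrypoint", "requestChain", "bootstrap", "paramCompat", "compilerComplete"];

function gradeReadiness(results) {
  const failed = CHECKS.filter((c) => !results[c]);
  if (failed.length === 0) return { level: "A", failed };
  // structural failures (no entrypoint, incomplete compiler) force a draft grade
  if (failed.includes("entrypoint") || failed.includes("compilerComplete")) {
    return { level: "C", failed };
  }
  return { level: "B", failed };
}
```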

  • Step 3: Generate human-readable report

Each analysis or generation result should include:

  • archetype

  • confidence

  • key evidence

  • detected risks

  • missing pieces

  • readiness level

  • Step 4: Display readiness in Web UI

Show the readiness grade before generation and after generation, with explicit warnings for internal-network execution risk.

  • Step 5: Verify readiness outcomes

Expected baseline:

  • tq-lineloss-report should reach A or high-confidence B

  • marketing-zero-consumer-report should not be labeled runnable unless pagination and secondary-request logic are correctly represented

  • Step 6: Commit

git add frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html src/generated_scene/ir.rs src/generated_scene/generator.rs
git commit -m "feat(generator): add readiness grading and generation risk reporting"

Task 8: Add Regression Coverage For Reference Scenes

Files:

  • Modify: tests/scene_generator_test.rs
  • Create/Modify: scene generator fixtures as needed

Goal: Lock in the two reference scenes as ongoing regression cases.

  • Step 1: Add marketing classification fixture coverage

Test that the marketing source scene is classified as paginated_enrichment and contains evidence for:

  • paginated list request

  • secondary request

  • filter rule

  • export step

  • Step 2: Add tq classification fixture coverage

Test that the tq source scene is classified as multi_mode_request and contains evidence for:

  • month mode

  • week mode

  • distinct request templates

  • distinct column definitions

  • Step 3: Add generation-shape assertions

Assert that generated outputs differ by archetype and do not collapse to a single generic template shape.

  • Step 4: Run verification

cargo test --test scene_generator_test -- --nocapture

Expected: both reference cases pass and guard against regression.

  • Step 5: Commit

git add tests/scene_generator_test.rs tests/fixtures
git commit -m "test(generator): add regression coverage for marketing and tq reference scenes"

Delivery Sequence

Recommended implementation order:

  1. Task 1: hard failure fixes
  2. Task 2: Scene IR
  3. Task 3: hybrid extraction
  4. Task 4: archetype classification
  5. Task 5: compiler split
  6. Task 6: resolver alignment
  7. Task 7: readiness gates
  8. Task 8: regression coverage

Rationale:

  • Task 1 stops current bad outputs early.
  • Tasks 2 to 5 establish the new compiler backbone.
  • Tasks 6 and 7 prevent false claims of runnability.
  • Task 8 locks the new architecture against regression.

Verification Strategy

Static Verification

  • cargo check
  • cargo test --test scene_generator_test -- --nocapture
  • node --check frontend/scene-generator/llm-client.js
  • node --check frontend/scene-generator/generator-runner.js
  • node --check frontend/scene-generator/server.js

Functional Verification

For marketing-zero-consumer-report:

  • detected as paginated_enrichment
  • bootstrap resolves to business domain, not static script host
  • generated workflow includes pagination and secondary requests
  • generation is not labeled runnable if those steps are missing

For tq-lineloss-report:

  • detected as multi_mode_request
  • month and week logic remain distinct
  • request templates and column definitions are mode-specific

UI Verification

Confirm the scene generator UI now shows:

  • detected archetype
  • confidence
  • bootstrap
  • key params
  • readiness grade
  • risk notes

Acceptance Criteria

This plan is complete when all of the following are true:

  1. sg_scene_generate consumes a unified Scene IR.
  2. The analysis pipeline can distinguish at least single_request_table, multi_mode_request, paginated_enrichment, and page_state_eval.
  3. tq-lineloss-report is generated through the multi-mode compiler path.
  4. marketing-zero-consumer-report is generated through the paginated-enrichment compiler path.
  5. Generated scene.toml no longer injects unrelated default org/period assumptions.
  6. Bootstrap resolution no longer mistakes external script hosts for business target domains.
  7. Runtime resolver compatibility is explicit, not implicit.
  8. Generation results include readiness grading and risk reporting before internal-network trial.