claw/docs/superpowers/plans/2026-04-17-scene-skill-compiler-plan.md
Scene Skill Compiler Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Upgrade sg_scene_generate from a scene-metadata extractor plus template filler into a reusable scene skill compiler that understands workflow semantics, classifies scene archetypes, and generates runnable skills for both tq-lineloss-report-style and marketing-zero-consumer-report-style internal scenes.

Architecture: Introduce a unified Scene IR, switch extraction to a hybrid deterministic-plus-LLM pipeline, route generation by workflowArchetype, align runtime resolver contracts, and add readiness gates so users can tell whether a generated skill is safe to trial on the internal network.

Tech Stack: Rust, Node.js, HTML/CSS/JavaScript, serde_json, OpenAI-compatible LLM API


Scope Check

This plan implements the design in:

  • docs/superpowers/specs/2026-04-17-scene-skill-compiler-design.md

This plan builds on the existing generator work already described in:

  • docs/superpowers/specs/2026-04-17-llm-driven-skill-generation-design.md
  • docs/superpowers/specs/2026-04-17-enhanced-llm-extraction-schema-design.md
  • docs/superpowers/specs/2026-04-17-progressive-template-enhancement-design.md
  • docs/superpowers/specs/2026-04-16-multi-scene-kind-generator-design.md

This plan does not attempt to solve:

  • full login and authentication reconstruction
  • all historical scene patterns in one pass
  • 100% no-touch generation without human review

File Map

Core generator pipeline

| File | Action | Purpose |
| --- | --- | --- |
| frontend/scene-generator/llm-client.js | Modify | Replace truncation-only extraction with chunked workflow-aware extraction and Scene IR schema output |
| frontend/scene-generator/generator-runner.js | Modify | Add deterministic scene scanning, key-fragment selection, and IR support |
| frontend/scene-generator/server.js | Modify | Expose analysis, preview, readiness, and generation endpoints for Scene IR |
| frontend/scene-generator/sg_scene_generator.html | Modify | Show extraction preview, archetype classification, bootstrap, risks, and readiness |

Rust backend

| File | Action | Purpose |
| --- | --- | --- |
| src/generated_scene/analyzer.rs | Modify | Add deterministic extraction helpers and archetype support |
| src/generated_scene/generator.rs | Modify | Route generation by archetype and compile from Scene IR instead of ad hoc fields |
| src/generated_scene/ir.rs | Create | Define unified Scene IR structs and serde contracts |
| src/bin/sg_scene_generate.rs | Modify | Accept Scene IR JSON or file input and pass it into generator |
| src/compat/scene_platform/resolvers.rs | Modify | Align runtime parameter resolution with generated contracts |

Tests and fixtures

| File | Action | Purpose |
| --- | --- | --- |
| tests/scene_generator_test.rs | Modify | Cover new analysis, archetype classification, and generation routing |
| tests/generated_scene_* or related fixtures | Modify/Create | Add representative fixtures for single-request, multi-mode, and paginated-enrichment scenes |

Scope Guardrails

  • Do not break existing --scene-id, --scene-name, or --scene-kind compatibility.
  • Do not require all scenes to provide complete metadata in HTML meta tags.
  • Do not force the runtime to support new resolver contracts unless generation is updated to gate incompatible output.
  • Do not assume all report scenes share org + period params.
  • Do not silently generate low-confidence skills as if they were runnable.

Task 1: Fix Current Hard Failures Before Compiler Refactor

Files:

  • Modify: frontend/scene-generator/llm-client.js
  • Modify: frontend/scene-generator/generator-runner.js
  • Modify: frontend/scene-generator/server.js
  • Modify: frontend/scene-generator/sg_scene_generator.html
  • Modify: src/generated_scene/generator.rs

Goal: Stop the most obvious wrong outputs that currently make generated skills fail on the internal network even before the full compiler architecture lands.

  • Step 1: Remove report-scene hardcoded parameter assumptions

Audit scene.toml generation in src/generated_scene/generator.rs and remove default injection of generic report params such as:

  • fixed org
  • fixed period
  • default dictionary entity for a specific city
  • generic page title keywords like ["报表", "线损"] ("report", "line loss")

Expected result: generated params come from extracted scene semantics or are omitted when not confidently known.

  • Step 2: Rework bootstrap source priority

Change bootstrap derivation so expected_domain and target_url are resolved using this order:

  1. explicit deep extraction result
  2. deterministic extraction from business entry points
  3. HTML meta tags if trustworthy
  4. fallback empty with warning

Explicitly prevent script-host URLs such as static JS includes from becoming the business domain by mistake.
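The four-level priority and the script-host guard can be sketched as follows; the function name, field shape, and host regex are illustrative assumptions, not the actual implementation:

```javascript
// Sketch of the bootstrap priority chain. Candidates are tried in order;
// script-host URLs are rejected so a static JS include can never become
// the business domain by mistake. SCRIPT_HOST_RE is illustrative only.
const SCRIPT_HOST_RE = /\.(js|css)(\?|$)|cdn\.|static\./i;

function resolveBootstrap({ deepExtraction, entryPoints, metaTags }) {
  const candidates = [
    { source: "deep_extraction", value: deepExtraction },   // 1. explicit deep extraction
    { source: "entry_points", value: entryPoints },          // 2. deterministic entry points
    { source: "meta_tags", value: metaTags },                // 3. HTML meta tags if trustworthy
  ];
  for (const { source, value } of candidates) {
    if (value && value.targetUrl && !SCRIPT_HOST_RE.test(value.targetUrl)) {
      return { source, ...value };
    }
  }
  // 4. fallback empty with warning
  return { source: "none", expectedDomain: "", targetUrl: "", warning: "bootstrap could not be resolved" };
}
```

A candidate that only carries a script-host URL falls through to the next source instead of winning by position.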

  • Step 3: Replace naive truncation with chunked extraction input

Update frontend/scene-generator/llm-client.js and frontend/scene-generator/generator-runner.js so they no longer send only the first 15000/3000 characters. Replace with:

  1. directory tree summary
  2. index.html chunking
  3. URL-bearing fragments
  4. request-construction fragments
  5. branching logic fragments
  6. export-related fragments
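A minimal sketch of the fragment-based input builder replacing truncation; the regexes and file shape are illustrative assumptions:

```javascript
// Instead of slicing the first N characters, build the LLM input from a
// directory summary plus labeled fragments. Rules here are illustrative.
const FRAGMENT_RULES = [
  { kind: "url", re: /https?:\/\/[^\s"']+|["'][^"']*\.do["']/g },
  { kind: "request", re: /\$\.ajax\([^)]*\)|fetch\([^)]*\)/g },
  { kind: "branch", re: /if\s*\([^)]*(mode|type)[^)]*\)/gi },
  { kind: "export", re: /export\w*\s*\(/g },
];

function buildExtractionInput(files) {
  const tree = files.map((f) => f.path).join("\n"); // directory tree summary
  const fragments = [];
  for (const file of files) {
    for (const { kind, re } of FRAGMENT_RULES) {
      for (const match of file.content.match(re) || []) {
        fragments.push({ file: file.path, kind, text: match });
      }
    }
  }
  return { tree, fragments };
}
```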

  • Step 4: Add analysis preview and risk banner in Web UI

Update frontend/scene-generator/sg_scene_generator.html and frontend/scene-generator/server.js to preview:

  • detected archetype

  • bootstrap

  • key endpoints

  • extracted params

  • workflow steps

  • confidence and risk notes

  • Step 5: Verify with marketing and tq reference scenes

Run local analysis against the two reference scenes and confirm:

  • marketing-zero-consumer-report no longer resolves the wrong domain

  • tq-lineloss-report still identifies mode-related structures

  • generated preview no longer shows generic hardcoded report params

  • Step 6: Commit

git add frontend/scene-generator/llm-client.js frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html src/generated_scene/generator.rs
git commit -m "fix(generator): remove hardcoded report defaults and improve bootstrap extraction"

Task 2: Introduce Unified Scene IR

Files:

  • Create: src/generated_scene/ir.rs
  • Modify: src/generated_scene/generator.rs
  • Modify: src/bin/sg_scene_generate.rs
  • Modify: frontend/scene-generator/server.js
  • Modify: frontend/scene-generator/llm-client.js

Goal: Introduce a single intermediate representation that all extraction and compilation stages use.

  • Step 1: Add Rust Scene IR structs

Create src/generated_scene/ir.rs with serde-enabled structs for:

  • SceneIr
  • BootstrapIr
  • ParamIr
  • ModeIr
  • WorkflowStepIr
  • ArtifactContractIr
  • NormalizeRulesIr
  • ReadinessIr
  • EvidenceIr

Minimum top-level fields:

{
  "sceneId": "",
  "sceneName": "",
  "sceneKind": "",
  "workflowArchetype": "",
  "bootstrap": {},
  "params": [],
  "modes": [],
  "workflowSteps": [],
  "requestTemplate": {},
  "responsePath": "",
  "normalizeRules": {},
  "artifactContract": {},
  "validationHints": {},
  "evidence": []
}

  • Step 2: Wire Scene IR into generator entrypoints

Update src/bin/sg_scene_generate.rs to accept either:

  • --scene-info-json upgraded to the new IR contract, or
  • a new --scene-ir-json / --scene-ir-file parameter

Keep backward compatibility by translating old scene info into partial IR where needed.
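One possible shape for the backward-compatibility shim, mapping legacy scene-info fields (the names mirror those mentioned elsewhere in this plan) into a partial IR; all field handling here is an illustrative assumption:

```javascript
// Translate legacy --scene-info-json input into a partial Scene IR so both
// input styles flow through the same compiler path.
function legacySceneInfoToIr(info) {
  return {
    sceneId: info.sceneId || "",
    sceneName: info.sceneName || "",
    sceneKind: info.sceneKind || "",
    workflowArchetype: "", // unknown for legacy input; the classifier fills it later
    bootstrap: { expectedDomain: info.expectedDomain || "", targetUrl: info.targetUrl || "" },
    params: (info.staticParams || []).map((p) => ({
      name: p.name,
      resolver: "literal_passthrough", // legacy static params carry no resolver semantics
      value: p.value,
    })),
    modes: [],
    workflowSteps: [],
    requestTemplate: info.requestTemplate || {},
    responsePath: info.responsePath || "",
    normalizeRules: {},
    artifactContract: {},
    validationHints: {},
    evidence: [{ source: "legacy_scene_info" }],
  };
}
```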

  • Step 3: Refactor generator to compile from IR

Update src/generated_scene/generator.rs so its internal interfaces no longer depend directly on loosely grouped fields like expectedDomain, staticParams, and columnDefs; it should compile from the unified SceneIr instead.

  • Step 4: Update Node server to pass IR through generation

Modify frontend/scene-generator/server.js so analyze endpoints return IR-shaped JSON and generate endpoints pass the same structure into Rust without flattening.

  • Step 5: Verify serde and CLI compatibility

Run:

cargo check
node --check frontend/scene-generator/server.js
node --check frontend/scene-generator/llm-client.js

Expected: Rust and Node compile cleanly with the new IR contract.

  • Step 6: Commit

git add src/generated_scene/ir.rs src/generated_scene/generator.rs src/bin/sg_scene_generate.rs frontend/scene-generator/server.js frontend/scene-generator/llm-client.js
git commit -m "feat(generator): introduce unified scene ir for analysis and compilation"

Task 3: Build Hybrid Extraction Pipeline

Files:

  • Modify: src/generated_scene/analyzer.rs
  • Modify: frontend/scene-generator/generator-runner.js
  • Modify: frontend/scene-generator/llm-client.js
  • Modify: frontend/scene-generator/server.js

Goal: Split extraction into deterministic signal collection plus LLM semantic completion.

  • Step 1: Implement deterministic extraction helpers

Add helper logic in src/generated_scene/analyzer.rs or adjacent extraction code to detect:

  • URLs and request methods

  • contentType

  • request payload builders

  • pagination variables such as page, rows, pageSize

  • branch variables such as period_mode, reportType

  • entry methods

  • export methods

  • obvious filter expressions such as charge !== 0
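The deterministic detectors could look roughly like this (the real helpers are planned for analyzer.rs in Rust; this JS mirror and all of its regexes are illustrative only):

```javascript
// Deterministic signal detection over raw scene source. Each rule collects
// literal evidence the LLM stage does not get to overrule.
const SIGNALS = {
  pagination: /\b(page|rows|pageSize|pageNum)\b\s*[:=]/g,
  branch: /\b(period_mode|reportType|mode)\b\s*[:=]/g,
  filter: /\b\w+\s*!==?\s*0\b/g,                 // e.g. charge !== 0
  method: /\btype\s*:\s*["'](GET|POST)["']/gi,   // jQuery-style request method
};

function detectSignals(source) {
  const found = {};
  for (const [kind, re] of Object.entries(SIGNALS)) {
    const hits = source.match(re) || [];
    if (hits.length) found[kind] = hits;
  }
  return found;
}
```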

  • Step 2: Create key-fragment selection in Node runner

Update frontend/scene-generator/generator-runner.js to extract and package:

  • directory summary
  • URL fragments
  • branch fragments
  • request-body fragments
  • response normalization fragments
  • export fragments

for LLM analysis.

  • Step 3: Redesign LLM prompt for workflow understanding

Update frontend/scene-generator/llm-client.js so the prompt explicitly asks for:

  • workflowArchetype

  • bootstrap

  • params

  • modes

  • workflowSteps

  • requestTemplate

  • responsePath

  • normalizeRules

  • artifactContract

  • confidence

  • uncertainties
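A sketch of how the prompt could be derived from that same field list so prompt and parser stay in sync; the wording and fragment shape are illustrative assumptions:

```javascript
// The field list doubles as the JSON contract the model must fill.
const IR_FIELDS = [
  "workflowArchetype", "bootstrap", "params", "modes", "workflowSteps",
  "requestTemplate", "responsePath", "normalizeRules", "artifactContract",
  "confidence", "uncertainties",
];

function buildWorkflowPrompt(fragments) {
  return [
    "Analyze the following scene fragments and return ONLY a JSON object",
    `with exactly these keys: ${IR_FIELDS.join(", ")}.`,
    "If a field cannot be determined, use an empty value and record the reason under uncertainties.",
    "",
    ...fragments.map((f) => `[${f.kind}] ${f.text}`),
  ].join("\n");
}
```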

  • Step 4: Merge deterministic and LLM results

Implement merge logic in frontend/scene-generator/server.js or a dedicated helper:

  • deterministic extraction wins for hard facts

  • LLM fills missing semantics

  • conflicts are surfaced in preview as warnings
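The merge rules above can be sketched as follows (the shape of both inputs is assumed):

```javascript
// Deterministic facts win, the LLM fills gaps, and disagreements become
// preview warnings instead of silent overwrites.
function mergeExtraction(deterministic, llm) {
  const merged = { ...llm };
  const warnings = [];
  for (const [key, hardValue] of Object.entries(deterministic)) {
    if (hardValue === undefined || hardValue === null || hardValue === "") continue;
    const llmValue = llm[key];
    if (llmValue !== undefined && JSON.stringify(llmValue) !== JSON.stringify(hardValue)) {
      warnings.push(`conflict on ${key}: deterministic=${JSON.stringify(hardValue)} llm=${JSON.stringify(llmValue)}`);
    }
    merged[key] = hardValue; // deterministic extraction wins for hard facts
  }
  return { merged, warnings };
}
```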

  • Step 5: Verify against reference workflows

Check that:

  • marketing-zero-consumer-report emits workflow steps including paginate, secondary_request, filter, and export

  • tq-lineloss-report emits modes, defaultMode, and modeSwitchField

  • Step 6: Commit

git add src/generated_scene/analyzer.rs frontend/scene-generator/generator-runner.js frontend/scene-generator/llm-client.js frontend/scene-generator/server.js
git commit -m "feat(generator): add hybrid deterministic and llm workflow extraction"

Task 4: Add Workflow Archetype Classification

Files:

  • Modify: src/generated_scene/analyzer.rs
  • Modify: src/generated_scene/ir.rs
  • Modify: frontend/scene-generator/server.js
  • Modify: frontend/scene-generator/sg_scene_generator.html

Goal: Reliably classify scenes so the correct compiler path is chosen.

  • Step 1: Add archetype enum support

Define and support these initial archetypes:

  • single_request_table

  • multi_mode_request

  • paginated_enrichment

  • page_state_eval

  • Step 2: Implement classification rules

Classification logic should prefer:

  1. multi_mode_request when explicit mode-switch branching exists
  2. paginated_enrichment when paginated list fetch plus secondary requests are detected
  3. page_state_eval when page-state judgment dominates
  4. single_request_table as fallback with lower confidence
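A minimal sketch of this priority order; the signal names and confidence values are hypothetical:

```javascript
// Classification prefers the most specific evidence and falls back to
// single_request_table with lower confidence.
function classifyArchetype(signals) {
  if (signals.modeSwitchBranching) {
    return { archetype: "multi_mode_request", confidence: 0.9 };
  }
  if (signals.paginatedList && signals.secondaryRequests) {
    return { archetype: "paginated_enrichment", confidence: 0.85 };
  }
  if (signals.pageStateChecks) {
    return { archetype: "page_state_eval", confidence: 0.7 };
  }
  return { archetype: "single_request_table", confidence: 0.5 }; // fallback, lower confidence
}
```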

  • Step 3: Expose classification confidence

Add confidence and evidence fields to the preview payload so UI can show why a scene was classified into an archetype.

  • Step 4: Add manual override support in UI

Allow users to override archetype in frontend/scene-generator/sg_scene_generator.html before final generation, but preserve the original detected result and confidence.

  • Step 5: Verify reference classifications

Expected:

  • marketing-zero-consumer-report => paginated_enrichment

  • tq-lineloss-report => multi_mode_request

  • Step 6: Commit

git add src/generated_scene/analyzer.rs src/generated_scene/ir.rs frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html
git commit -m "feat(generator): classify scenes by workflow archetype with confidence"

Task 5: Split Generator Into Archetype Compilers

Files:

  • Modify: src/generated_scene/generator.rs
  • Optionally create: src/generated_scene/compiler_single_request.rs
  • Optionally create: src/generated_scene/compiler_multi_mode.rs
  • Optionally create: src/generated_scene/compiler_paginated_enrichment.rs
  • Optionally create: src/generated_scene/compiler_page_state.rs

Goal: Replace the single generic report template with explicit compiler paths.

  • Step 1: Add compiler routing by archetype

Update src/generated_scene/generator.rs so generation dispatches on workflowArchetype.

  • Step 2: Implement single_request_table compiler

Generate:

  • minimal scene.toml

  • direct request browser script

  • artifact output for simple table/list data

  • Step 3: Implement multi_mode_request compiler

Generate:

  • mode detection
  • mode-specific request builders
  • mode-specific column definitions
  • mode-specific response extraction
  • unified artifact output

Reference target: tq-lineloss-report

  • Step 4: Implement paginated_enrichment compiler

Generate:

  • paginated list loop
  • per-item or batched secondary requests
  • aggregation and transform steps
  • business filters
  • final artifact or export output

Reference target: marketing-zero-consumer-report

  • Step 5: Implement page_state_eval compiler

Generate:

  • state-check script skeleton

  • light artifact semantics for monitoring or status checks

  • Step 6: Verify generated outputs by archetype

Validate that generated scripts no longer:

  • define multiple API endpoints but use only the first

  • collapse mode-aware scenes into one request body

  • flatten paginated enrichment scenes into one-step normalization

  • Step 7: Commit

git add src/generated_scene/generator.rs src/generated_scene/compiler_*.rs
git commit -m "feat(generator): split scene generation into workflow archetype compilers"

Task 6: Align Runtime Resolver Contracts

Files:

  • Modify: src/compat/scene_platform/resolvers.rs
  • Modify: src/generated_scene/generator.rs
  • Modify: src/generated_scene/ir.rs

Goal: Ensure generated parameter contracts are either executable by the runtime or explicitly flagged as unsupported.

  • Step 1: Audit current resolver coverage

Document which current contracts are already supported, including:

  • dictionary_entity

  • month_week_period

  • fixed_enum

  • literal_passthrough

  • Step 2: Add missing resolver types or gate them

Choose one of these paths per parameter type:

  1. implement new runtime resolver support
  2. downgrade generation to an existing supported resolver
  3. block generation with explicit readiness warning

Recommended additions:

  • mode_enum

  • date_range

  • org_tree

  • page_size

  • hidden_static

  • derived_param
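The per-parameter gating decision might be sketched as follows; the downgrade mapping is an illustrative assumption, not a mandated policy:

```javascript
// Each generated param is checked against the runtime's resolver set and
// either passes, downgrades to a supported resolver, or blocks generation.
const SUPPORTED_RESOLVERS = new Set([
  "dictionary_entity", "month_week_period", "fixed_enum", "literal_passthrough",
]);
const DOWNGRADES = { mode_enum: "fixed_enum", hidden_static: "literal_passthrough" };

function gateParam(param) {
  if (SUPPORTED_RESOLVERS.has(param.resolver)) {
    return { ...param, status: "supported" };
  }
  if (DOWNGRADES[param.resolver]) {
    return { ...param, resolver: DOWNGRADES[param.resolver], status: "downgraded" };
  }
  return { ...param, status: "blocked", warning: `resolver ${param.resolver} not supported by runtime` };
}
```

A blocked param would surface as a readiness warning rather than vanishing silently.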

  • Step 3: Reflect runtime compatibility in generated metadata

Generated output should clearly indicate:

  • supported params

  • unresolved params

  • manual-completion requirements

  • Step 4: Add tests for resolver alignment

Extend tests to ensure a generated skill cannot claim runnable readiness when its params require unsupported resolver behavior.

  • Step 5: Commit

git add src/compat/scene_platform/resolvers.rs src/generated_scene/generator.rs src/generated_scene/ir.rs tests/scene_generator_test.rs
git commit -m "feat(runtime): align generated scene contracts with resolver support"

Task 7: Add Readiness Gates And Generation Report

Files:

  • Modify: frontend/scene-generator/server.js
  • Modify: frontend/scene-generator/sg_scene_generator.html
  • Modify: src/generated_scene/ir.rs
  • Modify: src/generated_scene/generator.rs

Goal: Make generation output self-describing so users know whether a skill is ready for internal-network trial.

  • Step 1: Add static readiness checks

Implement checks for:

  • entrypoint detection

  • request-chain completeness

  • bootstrap plausibility

  • param/runtime compatibility

  • archetype compiler completeness

  • Step 2: Add readiness levels

Define:

  • A = ready for direct internal-network trial

  • B = structurally correct, human review recommended

  • C = draft only, manual completion required
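A sketch of the grading gate, assuming boolean results from the Step 1 checks; which failures force grade C is an illustrative choice:

```javascript
// Map static-check results onto the A/B/C readiness levels defined above.
const CHECKS = ["entrypoint", "requestChain", "bootstrap", "paramCompat", "compilerComplete"];

function gradeReadiness(results) {
  const failed = CHECKS.filter((c) => !results[c]);
  if (failed.length === 0) return { level: "A", failed };
  // structural failures (no entrypoint, incomplete compiler) force a draft grade
  if (failed.includes("entrypoint") || failed.includes("compilerComplete")) {
    return { level: "C", failed };
  }
  return { level: "B", failed };
}
```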

  • Step 3: Generate human-readable report

Each analysis or generation result should include:

  • archetype

  • confidence

  • key evidence

  • detected risks

  • missing pieces

  • readiness level

  • Step 4: Display readiness in Web UI

Show the readiness grade before generation and after generation, with explicit warnings for internal-network execution risk.

  • Step 5: Verify readiness outcomes

Expected baseline:

  • tq-lineloss-report should reach A or high-confidence B

  • marketing-zero-consumer-report should not be labeled runnable unless pagination and secondary-request logic are correctly represented

  • Step 6: Commit

git add frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html src/generated_scene/ir.rs src/generated_scene/generator.rs
git commit -m "feat(generator): add readiness grading and generation risk reporting"

Task 8: Add Regression Coverage For Reference Scenes

Files:

  • Modify: tests/scene_generator_test.rs
  • Create/Modify: scene generator fixtures as needed

Goal: Lock in the two reference scenes as ongoing regression cases.

  • Step 1: Add marketing classification fixture coverage

Test that the marketing source scene is classified as paginated_enrichment and contains evidence for:

  • paginated list request

  • secondary request

  • filter rule

  • export step

  • Step 2: Add tq classification fixture coverage

Test that the tq source scene is classified as multi_mode_request and contains evidence for:

  • month mode

  • week mode

  • distinct request templates

  • distinct column definitions

  • Step 3: Add generation-shape assertions

Assert that generated outputs differ by archetype and do not collapse to a single generic template shape.

  • Step 4: Run verification

cargo test --test scene_generator_test -- --nocapture

Expected: both reference cases pass and guard against regression.

  • Step 5: Commit

git add tests/scene_generator_test.rs tests/fixtures
git commit -m "test(generator): add regression coverage for marketing and tq reference scenes"

Delivery Sequence

Recommended implementation order:

  1. Task 1: hard failure fixes
  2. Task 2: Scene IR
  3. Task 3: hybrid extraction
  4. Task 4: archetype classification
  5. Task 5: compiler split
  6. Task 6: resolver alignment
  7. Task 7: readiness gates
  8. Task 8: regression coverage

Rationale:

  • Task 1 stops current bad outputs early.
  • Tasks 2 to 5 establish the new compiler backbone.
  • Tasks 6 and 7 prevent false claims of runnability.
  • Task 8 locks the new architecture against regression.

Verification Strategy

Static Verification

  • cargo check
  • cargo test --test scene_generator_test -- --nocapture
  • node --check frontend/scene-generator/llm-client.js
  • node --check frontend/scene-generator/generator-runner.js
  • node --check frontend/scene-generator/server.js

Functional Verification

For marketing-zero-consumer-report:

  • detected as paginated_enrichment
  • bootstrap resolves to business domain, not static script host
  • generated workflow includes pagination and secondary requests
  • generation is not labeled runnable if those steps are missing

For tq-lineloss-report:

  • detected as multi_mode_request
  • month and week logic remain distinct
  • request templates and column definitions are mode-specific

UI Verification

Confirm the scene generator UI now shows:

  • detected archetype
  • confidence
  • bootstrap
  • key params
  • readiness grade
  • risk notes

Acceptance Criteria

This plan is complete when all of the following are true:

  1. sg_scene_generate consumes a unified Scene IR.
  2. The analysis pipeline can distinguish at least single_request_table, multi_mode_request, paginated_enrichment, and page_state_eval.
  3. tq-lineloss-report is generated through the multi-mode compiler path.
  4. marketing-zero-consumer-report is generated through the paginated-enrichment compiler path.
  5. Generated scene.toml no longer injects unrelated default org/period assumptions.
  6. Bootstrap resolution no longer mistakes external script hosts for business target domains.
  7. Runtime resolver compatibility is explicit, not implicit.
  8. Generation results include readiness grading and risk reporting before internal-network trial.