Generated Scene Rectification Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Rectify the generated-scene pipeline so it stops emitting false-positive runnable skills for complex internal scenes, specifically by fixing sceneId degeneration, bootstrap pollution, incomplete workflow reconstruction, and readiness fail-open behavior.
Architecture: Keep the current Scene IR pipeline, but add four hard control chains around it: naming validation, bootstrap evidence stratification, workflow evidence reconstruction, and readiness gating. Generation must fail closed whenever these chains are incomplete.
Tech Stack: Rust, Node.js, HTML/CSS/JavaScript, serde_json, OpenAI-compatible LLM API
Scope Check
This plan implements the design in:
docs/superpowers/specs/2026-04-17-generated-scene-rectification-design.md
This plan builds on the existing generated-scene foundation already described in:
docs/superpowers/specs/2026-04-17-scene-skill-compiler-design.md
docs/superpowers/specs/2026-04-17-llm-driven-skill-generation-design.md
docs/superpowers/specs/2026-04-17-enhanced-llm-extraction-schema-design.md
This plan does not attempt to solve:
- login or authentication recovery
- Chromium host integration or browser embedding changes
- full runtime resolver expansion beyond what this rectification needs
- arbitrary historical scene compatibility outside the reference regression cases
File Map
Frontend scene generator
| File | Action | Purpose |
|---|---|---|
| frontend/scene-generator/generator-runner.js | Modify | Implement naming fallback control, URL evidence stratification, workflow evidence cleanup, and pre-generation gate inputs |
| frontend/scene-generator/llm-client.js | Modify | Tighten sceneId semantic constraints and reject low-entropy LLM naming output |
| frontend/scene-generator/server.js | Modify | Aggregate readiness gates, block unsafe generation, and return rectification diagnostics |
| frontend/scene-generator/sg_scene_generator.html | Modify | Show invalid sceneId, bootstrap role breakdown, workflow evidence completeness, and generation block reasons |
Rust generated-scene pipeline
| File | Action | Purpose |
|---|---|---|
| src/generated_scene/analyzer.rs | Modify | Add endpoint denoising, evidence role typing, and stricter archetype preconditions |
| src/generated_scene/ir.rs | Modify | Extend IR to carry candidate roles, gate states, and workflow evidence completeness |
| src/generated_scene/generator.rs | Modify | Prevent compiler routing when gates fail and surface fail-closed diagnostics |
Tests and fixtures
| File | Action | Purpose |
|---|---|---|
| tests/scene_generator_test.rs | Modify | Cover naming, bootstrap, workflow, and readiness regression cases |
| tests/scene_generator_html_test.rs | Modify | Cover HTML/UI risk and blocking output |
| tests/fixtures/generated_scene/paginated_enrichment/* | Modify | Preserve marketing-like reference coverage |
| tests/fixtures/generated_scene/multi_mode/* | Modify | Preserve tq-like multi-mode coverage |
| Additional fixture files as needed | Create | Add low-entropy naming and localhost-pollution regression inputs |
Scope Guardrails
- Do not broaden this work into a generic scene-generator redesign.
- Do not remove the existing Scene IR structure; extend and constrain it.
- Do not let localhost or helper/export endpoints participate in bootstrap selection.
- Do not silently coerce invalid sceneId values into accepted ids.
- Do not route into paginated_enrichment unless its minimum workflow evidence is complete.
- Do not emit a default runnable skill when any rectification gate fails.
Task 1: Rectify Naming Chain
Files:
- Modify: frontend/scene-generator/generator-runner.js
- Modify: frontend/scene-generator/llm-client.js
- Modify: frontend/scene-generator/server.js
- Modify: src/generated_scene/ir.rs
Goal: Stop Chinese-source scenes from degrading into low-information ids such as 2-0, and turn sceneId into a validated business identifier instead of a raw slug fallback.
- Step 1: Classify sceneId candidate sources
Define explicit candidate tiers for sceneId:
- LLM semantic business id
- deterministic keyword-derived id
- controlled alias/transliteration fallback
- invalid fallback candidate
Expected result: the pipeline can explain where the chosen id came from.
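The candidate tiers above can be sketched as an ordered enum, so the chosen id always carries its provenance. This is a minimal illustration in Rust; the type and function names (SceneIdSource, SceneIdCandidate, choose_candidate) are hypothetical and do not come from the existing codebase, and the real tiering may well live on the Node.js side.

```rust
/// Where a sceneId candidate came from, declared from most to least trusted.
/// Variant names are illustrative, not taken from the codebase.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum SceneIdSource {
    LlmSemantic,     // LLM semantic business id
    KeywordDerived,  // deterministic keyword-derived id
    AliasFallback,   // controlled alias/transliteration fallback
    InvalidFallback, // invalid fallback candidate (for example a raw slug)
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SceneIdCandidate {
    pub id: String,
    pub source: SceneIdSource,
}

/// Pick the candidate from the most trusted tier.
/// When several candidates share a tier, the earliest one is kept.
pub fn choose_candidate(candidates: &[SceneIdCandidate]) -> Option<&SceneIdCandidate> {
    candidates.iter().min_by_key(|c| c.source)
}
```

Because the winning candidate keeps its source field, the server and UI can report the tier without recomputing it. A candidate from the InvalidFallback tier can still win when nothing better exists, so validation must run after selection.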
- Step 2: Add low-entropy sceneId validation
Implement shared validation rules that reject ids which are:
- numeric-only or numeric-dominant
- too short to be business-readable
- generic placeholders such as scene or report
- semantically detached from the extracted sceneName
Expected result: ids like 2-0, 1-0, scene, report are blocked.
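A rough sketch of the first three rejection rules follows. The minimum length and the placeholder list are assumed values chosen for illustration, and the "semantically detached from sceneName" rule is deliberately left out because it needs the extracted name and probably an LLM or keyword comparison.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum SceneIdRejection {
    NumericDominant,
    TooShort,
    GenericPlaceholder,
}

/// Hypothetical thresholds; tune them against the real regression fixtures.
const MIN_LEN: usize = 4;
const GENERIC_IDS: [&str; 2] = ["scene", "report"];

pub fn validate_scene_id(id: &str) -> Result<(), SceneIdRejection> {
    let letters = id.chars().filter(|c| c.is_alphabetic()).count();
    let digits = id.chars().filter(|c| c.is_ascii_digit()).count();
    // Numeric-only or numeric-dominant: at least as many digits as letters.
    if digits > 0 && digits >= letters {
        return Err(SceneIdRejection::NumericDominant);
    }
    // Too short to be business-readable.
    if id.chars().count() < MIN_LEN {
        return Err(SceneIdRejection::TooShort);
    }
    // Generic placeholders carry no business meaning.
    let lower = id.to_ascii_lowercase();
    if GENERIC_IDS.iter().any(|g| *g == lower) {
        return Err(SceneIdRejection::GenericPlaceholder);
    }
    Ok(())
}
```

Returning a typed rejection rather than a bare boolean lets the caller surface the invalidation reason directly in diagnostics.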
- Step 3: Fail closed on invalid sceneId
Update generation flow so invalid sceneId produces:
- invalid_scene_id gate failure
- readiness downgrade
- analysis/report output only unless explicitly overridden later by a separate approved flow
Expected result: invalid ids never create a formal generated skill directory by default.
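One way to make this fail-closed behavior hard to bypass is to encode it in the return type, so a skill directory can only be produced from the passing branch. The enum and function below are hypothetical names used for illustration only.

```rust
/// What the generator is allowed to emit. Names are hypothetical.
#[derive(Debug, PartialEq, Eq)]
pub enum GenerationOutcome {
    /// The naming gate passed: a formal skill directory may be written.
    SkillPackage { scene_id: String },
    /// The naming gate failed: only analysis/report output is produced.
    AnalysisOnly { failed_gates: Vec<String> },
}

pub fn outcome_for_scene_id(scene_id: &str, id_is_valid: bool) -> GenerationOutcome {
    if id_is_valid {
        GenerationOutcome::SkillPackage { scene_id: scene_id.to_string() }
    } else {
        // Fail closed: the invalid id is not carried into any output path.
        GenerationOutcome::AnalysisOnly {
            failed_gates: vec!["invalid_scene_id".to_string()],
        }
    }
}
```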
- Step 4: Surface naming diagnostics in server/UI
Return and display:
- chosen sceneId
- candidate source
- validation result
- invalidation reason if blocked
- Step 5: Add regression tests
Cover at least:
- Chinese source name that previously degraded to 2-0
- valid semantic id chosen over slug fallback
- invalid low-entropy id blocked from generation
- Step 6: Commit
git add frontend/scene-generator/generator-runner.js frontend/scene-generator/llm-client.js frontend/scene-generator/server.js src/generated_scene/ir.rs tests/scene_generator_test.rs
git commit -m "fix(generator): block degenerate generated scene ids"
Task 2: Rectify Bootstrap Chain
Files:
- Modify: frontend/scene-generator/generator-runner.js
- Modify: frontend/scene-generator/server.js
- Modify: src/generated_scene/analyzer.rs
- Modify: src/generated_scene/ir.rs
Goal: Separate business bootstrap candidates from localhost/export/helper URLs so internal-network entry domains resolve correctly.
- Step 1: Add URL evidence role stratification
Classify URL candidates into:
- business_entry
- business_api
- gateway_api
- export_service
- local_helper
- static_asset
- template_noise
Expected result: every URL candidate is typed before bootstrap selection.
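A minimal classifier sketch is shown below. The role names mirror the list above, but every substring rule (such as "/gateway/" or "/api/") is an assumed heuristic for illustration, not the analyzer's actual logic. Strings that are not clearly http(s) URLs fall through to template_noise rather than to a business role, which keeps the classifier fail-closed.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UrlRole {
    BusinessEntry,
    BusinessApi,
    GatewayApi,
    ExportService,
    LocalHelper,
    StaticAsset,
    TemplateNoise,
}

/// Heuristic classifier; the substring rules are assumptions for illustration.
pub fn classify_url(url: &str) -> UrlRole {
    let lower = url.to_ascii_lowercase();
    if lower.contains("${") || lower.contains("{{") || lower.contains("%s") {
        UrlRole::TemplateNoise
    } else if lower.contains("localhost") || lower.contains("127.0.0.1") {
        UrlRole::LocalHelper
    } else if lower.ends_with(".js") || lower.ends_with(".css") {
        UrlRole::StaticAsset
    } else if lower.contains("export") {
        UrlRole::ExportService
    } else if lower.contains("/gateway/") {
        UrlRole::GatewayApi
    } else if lower.contains("/api/") {
        UrlRole::BusinessApi
    } else if lower.starts_with("http://") || lower.starts_with("https://") {
        UrlRole::BusinessEntry
    } else {
        // Anything that is not clearly a URL is treated as noise.
        UrlRole::TemplateNoise
    }
}
```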
- Step 2: Add deterministic localhost and noise rejection
Ensure that:
- localhost
- 127.0.0.1
- SurfaceServices
- ReportServices
- .js / .css assets
- template placeholders and format strings
are routed away from bootstrap candidates.
Expected result: helper/export/static/template strings can remain as evidence but can never win bootstrap.
- Step 3: Redefine bootstrap resolution order
Bootstrap selection may only consume:
- business_entry
- business_api
- gateway_api
When only helper/noise roles exist, set bootstrap to unresolved and downgrade readiness.
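The resolution rule can be sketched as a ranking in which only the three allowed roles have a rank at all. Treating the listed order (business_entry, then business_api, then gateway_api) as a priority order is an assumption; the function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UrlRole {
    BusinessEntry, BusinessApi, GatewayApi,
    ExportService, LocalHelper, StaticAsset, TemplateNoise,
}

/// Lower rank wins; `None` means the role may never become the bootstrap.
fn bootstrap_rank(role: UrlRole) -> Option<u8> {
    match role {
        UrlRole::BusinessEntry => Some(0),
        UrlRole::BusinessApi => Some(1),
        UrlRole::GatewayApi => Some(2),
        _ => None,
    }
}

/// Returns the winning URL, or `None` when the bootstrap is unresolved.
pub fn select_bootstrap<'a>(candidates: &[(&'a str, UrlRole)]) -> Option<&'a str> {
    candidates
        .iter()
        .filter_map(|(url, role)| bootstrap_rank(*role).map(|rank| (rank, *url)))
        .min_by_key(|(rank, _)| *rank)
        .map(|(_, url)| url)
}
```

A `None` result is the unresolved state: the caller should downgrade readiness rather than fall back to a helper URL.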
- Step 4: Preserve export/helper evidence separately
Retain localhost/export endpoints as downstream evidence for workflow/reporting, but isolate them from expectedDomain and targetUrl.
- Step 5: Add regression tests
Cover at least:
- marketing-like source choosing yx.gs.sgcc.com.cn over localhost
- mixed business + gateway scene preserving business target page
- scene with only localhost/noise ending in unresolved bootstrap
- Step 6: Commit
git add frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js src/generated_scene/analyzer.rs src/generated_scene/ir.rs tests/scene_generator_test.rs
git commit -m "fix(generator): stratify bootstrap evidence and exclude localhost"
Task 3: Rectify Workflow Chain
Files:
- Modify: frontend/scene-generator/generator-runner.js
- Modify: frontend/scene-generator/server.js
- Modify: src/generated_scene/analyzer.rs
- Modify: src/generated_scene/ir.rs
- Modify: src/generated_scene/generator.rs
Goal: Reconstruct workflow from request-chain evidence instead of generic field names, so paginated_enrichment is only emitted when its true workflow exists.
- Step 1: Split workflow evidence into typed layers
Represent workflow evidence as:
- request evidence
- pagination evidence
- secondary request evidence
- post-process evidence
Expected result: archetype decisions operate on structured workflow signals instead of a flat endpoint list.
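As an illustration of the typed layers, the sketch below files each piece of evidence into its own bucket instead of appending to one flat list. The struct, enum, and field names are hypothetical and only show the shape the IR extension might take.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EvidenceLayer {
    Request,
    Pagination,
    SecondaryRequest,
    PostProcess,
}

/// Typed workflow evidence; field names are illustrative only.
#[derive(Debug, Default)]
pub struct WorkflowEvidence {
    pub requests: Vec<String>,
    pub pagination: Vec<String>,
    pub secondary_requests: Vec<String>,
    pub post_process: Vec<String>,
}

impl WorkflowEvidence {
    /// File one piece of evidence under its layer.
    pub fn record(&mut self, layer: EvidenceLayer, detail: &str) {
        let bucket = match layer {
            EvidenceLayer::Request => &mut self.requests,
            EvidenceLayer::Pagination => &mut self.pagination,
            EvidenceLayer::SecondaryRequest => &mut self.secondary_requests,
            EvidenceLayer::PostProcess => &mut self.post_process,
        };
        bucket.push(detail.to_string());
    }
}
```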
- Step 2: Denoise endpoint and method evidence
Normalize and filter out:
- ${apiUrl}
- template placeholders
- exception strings
- log text fragments
- localhost export endpoints
Expected result: workflow reconstruction only consumes business-relevant requests.
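A possible shape for the denoising pass is sketched here. The markers it checks (placeholder syntax, the word "exception", embedded whitespace, local hosts) are assumed heuristics, and real log or exception fragments may need stricter rules.

```rust
/// Returns true when a raw endpoint string should be dropped before
/// workflow reconstruction. The markers are assumed heuristics.
pub fn is_noise_endpoint(raw: &str) -> bool {
    let s = raw.trim();
    let lower = s.to_ascii_lowercase();
    s.is_empty()
        || s.contains("${")               // unresolved template placeholders
        || s.contains("{{")
        || lower.contains("exception")    // exception strings
        || s.contains(' ')                // log text fragments, not endpoints
        || lower.contains("localhost")    // local export/helper endpoints
        || lower.contains("127.0.0.1")
}

/// Trim, drop noise, and de-duplicate while keeping first-seen order.
pub fn denoise_endpoints(raw: &[&str]) -> Vec<String> {
    let mut kept: Vec<String> = Vec::new();
    for item in raw {
        let trimmed = item.trim();
        if !is_noise_endpoint(trimmed) && !kept.iter().any(|k| k == trimmed) {
            kept.push(trimmed.to_string());
        }
    }
    kept
}
```

Dropped strings can still be retained elsewhere as evidence; this pass only decides what workflow reconstruction is allowed to consume.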
- Step 3: Tighten archetype routing rules
Require paginated_enrichment to have at minimum:
- one main list request
- one pagination variable set
- one secondary request or explicit per-item enrichment function
- one post-process action among filter, transform, export
If only part of this exists, preserve it as candidate evidence but do not route into the compiler.
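The minimum-evidence rule can be sketched as a check that returns the missing pieces instead of a bare boolean, so partial evidence stays reportable. The struct and the piece labels are hypothetical.

```rust
/// Counts of evidence found per requirement. Field names are hypothetical.
#[derive(Debug, Default, Clone, Copy)]
pub struct PaginatedEvidence {
    pub main_list_requests: usize,
    pub pagination_variable_sets: usize,
    pub secondary_requests: usize,
    pub per_item_enrichment_fns: usize,
    pub post_process_actions: usize, // filter, transform, or export
}

/// Lists the missing pieces; an empty list means routing is allowed.
pub fn missing_for_paginated_enrichment(ev: &PaginatedEvidence) -> Vec<&'static str> {
    let mut missing = Vec::new();
    if ev.main_list_requests == 0 {
        missing.push("main_list_request");
    }
    if ev.pagination_variable_sets == 0 {
        missing.push("pagination");
    }
    // Either a secondary request or an explicit per-item enrichment function.
    if ev.secondary_requests == 0 && ev.per_item_enrichment_fns == 0 {
        missing.push("secondary_request");
    }
    if ev.post_process_actions == 0 {
        missing.push("post_process");
    }
    missing
}
```

The returned labels could feed the "missing workflow pieces" display directly.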
- Step 4: Narrow multi_mode detection
Allow multi_mode_request only when mode switching materially changes at least one of:
- request body
- endpoint shape
- response path
- column definition
Expected result: generic type/tab/mode/status fields alone no longer misclassify marketing-like scenes.
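In code, the narrowed rule amounts to requiring both a mode field and at least one material divergence. The sketch below uses hypothetical names and assumes the divergence flags are computed earlier by the analyzer.

```rust
/// What a mode switch was observed to change. Field names are hypothetical.
#[derive(Debug, Default, Clone, Copy)]
pub struct ModeDivergence {
    pub request_body: bool,
    pub endpoint_shape: bool,
    pub response_path: bool,
    pub column_definition: bool,
}

/// A mode-like field alone is not enough; switching modes must change
/// at least one of the four observable aspects.
pub fn qualifies_as_multi_mode(has_mode_field: bool, d: &ModeDivergence) -> bool {
    has_mode_field
        && (d.request_body || d.endpoint_shape || d.response_path || d.column_definition)
}
```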
- Step 5: Block compiler routing on incomplete workflow
Update generator-side routing so incomplete evidence cannot produce a formal paginated_enrichment skill package.
- Step 6: Add regression tests
Cover at least:
- marketing-like scene must expose paginate + secondary_request + post-process evidence
- generic mode fields without real mode divergence must not force multi_mode_request
- noisy endpoint lists must still reconstruct the correct business request chain
- Step 7: Commit
git add frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js src/generated_scene/analyzer.rs src/generated_scene/ir.rs src/generated_scene/generator.rs tests/scene_generator_test.rs
git commit -m "fix(generator): require complete workflow evidence before archetype routing"
Task 4: Rectify Readiness Chain
Files:
- Modify: frontend/scene-generator/server.js
- Modify: frontend/scene-generator/sg_scene_generator.html
- Modify: src/generated_scene/ir.rs
- Modify: src/generated_scene/generator.rs
- Modify: tests/scene_generator_html_test.rs
Goal: Turn readiness into a hard gate that distinguishes analysis output from runnable skill output.
- Step 1: Add explicit rectification gates
Track at minimum:
- scene_id_valid
- bootstrap_resolved
- workflow_complete_for_archetype
- runtime_contract_compatible
Expected result: readiness is derived from named gates rather than from a loose score alone.
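A minimal sketch of the named gates as a struct follows. The gate names come from the list above; the struct and method names are hypothetical. Deriving Default makes every gate start as failed, so a gate that is never evaluated cannot pass by accident.

```rust
/// Named rectification gates. `Default` yields all-false (fail closed).
#[derive(Debug, Default, Clone, Copy)]
pub struct RectificationGates {
    pub scene_id_valid: bool,
    pub bootstrap_resolved: bool,
    pub workflow_complete_for_archetype: bool,
    pub runtime_contract_compatible: bool,
}

impl RectificationGates {
    /// Names of the gates that did not pass, in declaration order.
    pub fn failed(&self) -> Vec<&'static str> {
        let gates = [
            ("scene_id_valid", self.scene_id_valid),
            ("bootstrap_resolved", self.bootstrap_resolved),
            ("workflow_complete_for_archetype", self.workflow_complete_for_archetype),
            ("runtime_contract_compatible", self.runtime_contract_compatible),
        ];
        gates.iter().filter(|g| !g.1).map(|g| g.0).collect()
    }

    pub fn all_pass(&self) -> bool {
        self.failed().is_empty()
    }
}
```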
- Step 2: Enforce fail-closed readiness rules
Require:
- all core gates pass for readiness A or B
- any core gate failure forces readiness C
- generation endpoint blocks runnable output on gate failure
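These rules can be sketched as a small derivation in which gates decide whether C is forced and a score only separates A from B. The plan does not say how A and B are distinguished, so the score threshold below is purely an assumption, as are the function names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Readiness {
    A,
    B,
    C,
}

/// Assumed cut-off separating A from B; the plan does not define one.
const GRADE_A_MIN_SCORE: u8 = 80;

/// Gates decide whether C is forced; the score can never rescue a failed gate.
pub fn derive_readiness(core_gates: &[bool], score: u8) -> Readiness {
    // An empty gate list counts as failure so missing checks cannot pass.
    let all_pass = !core_gates.is_empty() && core_gates.iter().all(|g| *g);
    if !all_pass {
        Readiness::C
    } else if score >= GRADE_A_MIN_SCORE {
        Readiness::A
    } else {
        Readiness::B
    }
}

/// The generation endpoint would consult this before emitting a skill.
pub fn may_emit_runnable_skill(readiness: Readiness) -> bool {
    readiness != Readiness::C
}
```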
- Step 3: Separate analysis result from generation result
When gates fail, allow:
- analysis preview
- evidence report
- block reasons
But do not default to:
- full skill emission
- compiler success messaging
- Step 4: Expose readiness breakdown in UI
Display:
- gate names
- pass/fail state
- missing workflow pieces
- bootstrap resolution reason
- invalid sceneId reason
- Step 5: Add regression tests
Cover at least:
- invalid sceneId forcing readiness C
- unresolved bootstrap forcing readiness C
- incomplete paginated workflow forcing readiness C
- fully valid reference fixture remaining eligible for generation
- Step 6: Commit
git add frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html src/generated_scene/ir.rs src/generated_scene/generator.rs tests/scene_generator_html_test.rs tests/scene_generator_test.rs
git commit -m "fix(generator): enforce readiness fail-closed gating"
Task 5: Reference Regression Verification
Files:
- Modify: tests/scene_generator_test.rs
- Modify: tests/scene_generator_html_test.rs
- Modify/Create: relevant fixtures under tests/fixtures/generated_scene/
Goal: Lock the rectification against the two reference scene families and ensure future changes do not reintroduce the same false positives.
- Step 1: Regress marketing-like fixture
Verify the marketing reference path now satisfies:
- non-degenerate sceneId
- bootstrap rooted in yx.gs.sgcc.com.cn family
- workflow includes paginate
- workflow includes secondary_request
- readiness does not pass if any of the above are missing
- Step 2: Regress tq-like fixture
Verify the tq reference path still satisfies:
- stable semantic sceneId
- valid non-localhost bootstrap
- genuine multi_mode_request detection
- no downgrade caused by the stricter marketing rectification rules
- Step 3: Run verification commands
Run:
cargo check
cargo test --test scene_generator_test -- --nocapture
cargo test --test scene_generator_html_test -- --nocapture
node --check frontend/scene-generator/llm-client.js
node --check frontend/scene-generator/generator-runner.js
node --check frontend/scene-generator/server.js
Expected result: rectification passes both Rust and Node validation plus regression coverage.
- Step 4: Record outcomes in generated reports if needed
If the implementation emits readiness or analysis JSON reports, ensure the test fixtures assert the key blocked/passed states directly.
- Step 5: Commit
git add tests/scene_generator_test.rs tests/scene_generator_html_test.rs tests/fixtures/generated_scene frontend/scene-generator/llm-client.js frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js src/generated_scene/analyzer.rs src/generated_scene/ir.rs src/generated_scene/generator.rs
git commit -m "test(generator): lock generated scene rectification regressions"
Acceptance Criteria
This plan is complete when all of the following are true:
- Chinese-source scene names no longer degrade into low-entropy ids like 2-0.
- localhost, 127.0.0.1, export services, and helper URLs no longer compete for bootstrap resolution.
- paginated_enrichment routing only occurs when pagination, secondary request, and post-process evidence are all present.
- Incomplete evidence paths fail closed with explicit readiness gate failures instead of generating false-positive runnable skills.
- The marketing-like and tq-like reference scenes both remain covered by automated regression tests.
Rollback Strategy
If this rectification causes unacceptable regressions:
- Revert the latest rectification task commit only, not unrelated generated-scene work.
- Keep the previous Scene IR and compiler structure intact.
- Preserve newly added fixtures and tests where possible, then relax only the specific gate or classifier that caused the regression.
Notes For Executors
- Implement this plan strictly in order: naming, bootstrap, workflow, readiness, verification.
- Do not skip ahead to UI polish before the gating logic is in place.
- Do not add speculative resolver or login work under this plan.
- Any need for user override or forced draft generation must be handled as a separate follow-up spec, not smuggled into this rectification plan.