Generated Scene Rectification Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Rectify the generated-scene pipeline so it stops emitting false-positive runnable skills for complex internal scenes, specifically by fixing sceneId degeneration, bootstrap pollution, incomplete workflow reconstruction, and readiness fail-open behavior.

Architecture: Keep the current Scene IR pipeline, but add four hard control chains around it: naming validation, bootstrap evidence stratification, workflow evidence reconstruction, and readiness gating. Generation must fail closed whenever these chains are incomplete.

Tech Stack: Rust, Node.js, HTML/CSS/JavaScript, serde_json, OpenAI-compatible LLM API


Scope Check

This plan implements the design in:

  • docs/superpowers/specs/2026-04-17-generated-scene-rectification-design.md

This plan builds on the existing generated-scene foundation already described in:

  • docs/superpowers/specs/2026-04-17-scene-skill-compiler-design.md
  • docs/superpowers/specs/2026-04-17-llm-driven-skill-generation-design.md
  • docs/superpowers/specs/2026-04-17-enhanced-llm-extraction-schema-design.md

This plan does not attempt to solve:

  • login or authentication recovery
  • Chromium host integration or browser embedding changes
  • full runtime resolver expansion beyond what this rectification needs
  • arbitrary historical scene compatibility outside the reference regression cases

File Map

Frontend scene generator

| File | Action | Purpose |
| --- | --- | --- |
| frontend/scene-generator/generator-runner.js | Modify | Implement naming fallback control, URL evidence stratification, workflow evidence cleanup, and pre-generation gate inputs |
| frontend/scene-generator/llm-client.js | Modify | Tighten sceneId semantic constraints and reject low-entropy LLM naming output |
| frontend/scene-generator/server.js | Modify | Aggregate readiness gates, block unsafe generation, and return rectification diagnostics |
| frontend/scene-generator/sg_scene_generator.html | Modify | Show invalid sceneId, bootstrap role breakdown, workflow evidence completeness, and generation block reasons |

Rust generated-scene pipeline

| File | Action | Purpose |
| --- | --- | --- |
| src/generated_scene/analyzer.rs | Modify | Add endpoint denoising, evidence role typing, and stricter archetype preconditions |
| src/generated_scene/ir.rs | Modify | Extend IR to carry candidate roles, gate states, and workflow evidence completeness |
| src/generated_scene/generator.rs | Modify | Prevent compiler routing when gates fail and surface fail-closed diagnostics |

Tests and fixtures

| File | Action | Purpose |
| --- | --- | --- |
| tests/scene_generator_test.rs | Modify | Cover naming, bootstrap, workflow, and readiness regression cases |
| tests/scene_generator_html_test.rs | Modify | Cover HTML/UI risk and blocking output |
| tests/fixtures/generated_scene/paginated_enrichment/* | Modify | Preserve marketing-like reference coverage |
| tests/fixtures/generated_scene/multi_mode/* | Modify | Preserve tq-like multi-mode coverage |
| Additional fixture files as needed | Create | Add low-entropy naming and localhost-pollution regression inputs |

Scope Guardrails

  • Do not broaden this work into a generic scene-generator redesign.
  • Do not remove the existing Scene IR structure; extend and constrain it.
  • Do not let localhost or helper/export endpoints participate in bootstrap selection.
  • Do not silently coerce invalid sceneId values into accepted ids.
  • Do not route into paginated_enrichment unless its minimum workflow evidence is complete.
  • Do not emit a default runnable skill when any rectification gate fails.

Task 1: Rectify Naming Chain

Files:

  • Modify: frontend/scene-generator/generator-runner.js
  • Modify: frontend/scene-generator/llm-client.js
  • Modify: frontend/scene-generator/server.js
  • Modify: src/generated_scene/ir.rs
  • Modify: tests/scene_generator_test.rs

Goal: Stop Chinese-source scenes from degrading into low-information ids such as 2-0, and turn sceneId into a validated business identifier instead of a raw slug fallback.

  • Step 1: Classify sceneId candidate sources

Define explicit candidate tiers for sceneId:

  1. LLM semantic business id
  2. deterministic keyword-derived id
  3. controlled alias/transliteration fallback
  4. invalid fallback candidate

Expected result: the pipeline can explain where the chosen id came from.
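The four tiers can be modeled as an ordered enum so that the chosen id always carries its provenance. The following is a minimal sketch, not the actual contents of src/generated_scene/ir.rs: the type names `SceneIdSource` and `SceneIdCandidate` and the function `pick_candidate` are hypothetical.

```rust
/// Where a sceneId candidate came from, ordered from most to least trusted.
/// Deriving `Ord` makes earlier variants compare as smaller.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum SceneIdSource {
    LlmSemantic,          // tier 1: LLM semantic business id
    KeywordDerived,       // tier 2: deterministic keyword-derived id
    AliasTransliteration, // tier 3: controlled alias/transliteration fallback
    InvalidFallback,      // tier 4: invalid fallback candidate
}

/// A candidate id together with the tier that produced it (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
struct SceneIdCandidate {
    id: String,
    source: SceneIdSource,
}

/// Returns the candidate from the most trusted tier present.
/// On a tie, `min_by_key` keeps the first candidate in the slice.
fn pick_candidate(candidates: &[SceneIdCandidate]) -> Option<&SceneIdCandidate> {
    candidates.iter().min_by_key(|c| c.source)
}
```

Keeping the tier-4 candidate in the list, rather than dropping it early, lets later validation report that the only available id was an invalid fallback.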

  • Step 2: Add low-entropy sceneId validation

Implement shared validation rules that reject ids which are:

  • numeric-only or numeric-dominant
  • too short to be business-readable
  • generic placeholders such as scene or report
  • semantically detached from the extracted sceneName

Expected result: ids like 2-0, 1-0, scene, report are blocked.
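The first three rejection rules are mechanical and can be sketched as a pure function. This is an illustration under stated assumptions: the threshold `MIN_LETTERS`, the "digits at least as numerous as letters" reading of numeric-dominant, and the names `SceneIdRejection` and `validate_scene_id` are all hypothetical. The fourth rule (semantic detachment from sceneName) needs the extracted name and is deliberately left out.

```rust
/// Reasons a sceneId is rejected (illustrative names).
#[derive(Debug, PartialEq, Eq)]
enum SceneIdRejection {
    NumericDominant,
    TooShort,
    GenericPlaceholder,
}

// Assumed values; tune them against the real regression fixtures.
const MIN_LETTERS: usize = 4;
const GENERIC_IDS: [&str; 2] = ["scene", "report"];

/// Checks the mechanical low-entropy rules. Does not check semantic
/// attachment to sceneName.
fn validate_scene_id(id: &str) -> Result<(), SceneIdRejection> {
    let letters = id.chars().filter(|c| c.is_ascii_alphabetic()).count();
    let digits = id.chars().filter(|c| c.is_ascii_digit()).count();

    // Numeric-only or numeric-dominant: digits are at least as numerous as letters.
    if digits > 0 && digits >= letters {
        return Err(SceneIdRejection::NumericDominant);
    }
    // Too few letters to be business-readable.
    if letters < MIN_LETTERS {
        return Err(SceneIdRejection::TooShort);
    }
    // Generic placeholders carry no business meaning.
    let lowered = id.to_ascii_lowercase();
    if GENERIC_IDS.contains(&lowered.as_str()) {
        return Err(SceneIdRejection::GenericPlaceholder);
    }
    Ok(())
}
```

Returning a typed rejection instead of a boolean gives the server and UI a ready-made invalidation reason to display.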

  • Step 3: Fail closed on invalid sceneId

Update generation flow so invalid sceneId produces:

  • invalid_scene_id gate failure
  • readiness downgrade
  • analysis/report output only unless explicitly overridden later by a separate approved flow

Expected result: invalid ids never create a formal generated skill directory by default.

  • Step 4: Surface naming diagnostics in server/UI

Return and display:

  • chosen sceneId

  • candidate source

  • validation result

  • invalidation reason if blocked

  • Step 5: Add regression tests

Cover at least:

  • Chinese source name that previously degraded to 2-0

  • valid semantic id chosen over slug fallback

  • invalid low-entropy id blocked from generation

  • Step 6: Commit

git add frontend/scene-generator/generator-runner.js frontend/scene-generator/llm-client.js frontend/scene-generator/server.js src/generated_scene/ir.rs tests/scene_generator_test.rs
git commit -m "fix(generator): block degenerate generated scene ids"

Task 2: Rectify Bootstrap Chain

Files:

  • Modify: frontend/scene-generator/generator-runner.js
  • Modify: frontend/scene-generator/server.js
  • Modify: src/generated_scene/analyzer.rs
  • Modify: src/generated_scene/ir.rs
  • Modify: tests/scene_generator_test.rs

Goal: Separate business bootstrap candidates from localhost/export/helper URLs so internal-network entry domains resolve correctly.

  • Step 1: Add URL evidence role stratification

Classify URL candidates into:

  • business_entry
  • business_api
  • gateway_api
  • export_service
  • local_helper
  • static_asset
  • template_noise

Expected result: every URL candidate is typed before bootstrap selection.

  • Step 2: Add deterministic localhost and noise rejection

Ensure that:

  • localhost
  • 127.0.0.1
  • SurfaceServices
  • ReportServices
  • .js / .css assets
  • template placeholders and format strings

are routed away from bootstrap candidates.

Expected result: helper/export/static/template strings can remain as evidence but can never win bootstrap.
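Steps 1 and 2 together amount to a deterministic classifier that runs before any bootstrap decision. The sketch below shows one possible shape. Several details are assumptions rather than facts from the codebase: mapping ReportServices to `export_service` and SurfaceServices to `local_helper`, the `/gateway/` and `/api/` path heuristics, and the function names.

```rust
/// URL evidence roles from Step 1.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UrlRole {
    BusinessEntry,
    BusinessApi,
    GatewayApi,
    ExportService,
    LocalHelper,
    StaticAsset,
    TemplateNoise,
}

/// Types a raw URL candidate. Noise checks run first so that a noisy
/// string can never fall through to a business role.
fn classify_url(raw: &str) -> UrlRole {
    let url = raw.trim().to_ascii_lowercase();

    // Template placeholders and format strings.
    if url.contains("${") || url.contains("{{") || url.contains("%s") {
        return UrlRole::TemplateNoise;
    }
    // Anything that is not an absolute http(s) URL is treated as noise here.
    if !(url.starts_with("http://") || url.starts_with("https://")) {
        return UrlRole::TemplateNoise;
    }
    // Static assets: inspect the part before any query string or fragment.
    let path = url.split(|c: char| c == '?' || c == '#').next().unwrap_or("");
    if path.ends_with(".js") || path.ends_with(".css") {
        return UrlRole::StaticAsset;
    }
    // Assumed mapping: ReportServices is an export service.
    if url.contains("reportservices") {
        return UrlRole::ExportService;
    }
    // Assumed mapping: loopback hosts and SurfaceServices are local helpers.
    if url.contains("localhost") || url.contains("127.0.0.1") || url.contains("surfaceservices") {
        return UrlRole::LocalHelper;
    }
    // Assumed path heuristics for the remaining business URLs.
    if url.contains("/gateway/") {
        return UrlRole::GatewayApi;
    }
    if url.contains("/api/") {
        return UrlRole::BusinessApi;
    }
    UrlRole::BusinessEntry
}

/// Only these three roles may take part in bootstrap selection.
fn is_bootstrap_eligible(role: UrlRole) -> bool {
    matches!(role, UrlRole::BusinessEntry | UrlRole::BusinessApi | UrlRole::GatewayApi)
}
```

Rejected candidates keep their role, so they stay available as evidence even though `is_bootstrap_eligible` excludes them.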

  • Step 3: Redefine bootstrap resolution order

Bootstrap selection may only consume the following roles, in priority order:

  1. business_entry
  2. business_api
  3. gateway_api

When only helper/noise roles exist, set bootstrap to unresolved and downgrade readiness.
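The resolution order can be expressed as a priority lookup over already-typed candidates. This is a sketch under assumptions: the `Bootstrap` enum, `priority`, and `resolve_bootstrap` are hypothetical names, and candidates are assumed to arrive as `(url, role)` pairs.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UrlRole {
    BusinessEntry,
    BusinessApi,
    GatewayApi,
    ExportService,
    LocalHelper,
    StaticAsset,
    TemplateNoise,
}

/// Result of bootstrap selection (hypothetical shape).
#[derive(Debug, PartialEq, Eq)]
enum Bootstrap<'a> {
    Resolved { url: &'a str, role: UrlRole },
    Unresolved,
}

/// Lower number wins. Roles outside the allowed set have no priority.
fn priority(role: UrlRole) -> Option<u8> {
    match role {
        UrlRole::BusinessEntry => Some(0),
        UrlRole::BusinessApi => Some(1),
        UrlRole::GatewayApi => Some(2),
        _ => None,
    }
}

/// Picks the best eligible candidate, or `Unresolved` when only
/// helper/noise roles exist.
fn resolve_bootstrap<'a>(candidates: &[(&'a str, UrlRole)]) -> Bootstrap<'a> {
    candidates
        .iter()
        .filter_map(|&(url, role)| priority(role).map(|p| (p, url, role)))
        .min_by_key(|&(p, _, _)| p)
        .map(|(_, url, role)| Bootstrap::Resolved { url, role })
        .unwrap_or(Bootstrap::Unresolved)
}
```

An explicit `Unresolved` variant, rather than an empty string or a fallback URL, makes it hard for downstream code to treat a missing bootstrap as a valid one.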

  • Step 4: Preserve export/helper evidence separately

Retain localhost/export endpoints as downstream evidence for workflow/reporting, but isolate them from expectedDomain and targetUrl.

  • Step 5: Add regression tests

Cover at least:

  • marketing-like source choosing yx.gs.sgcc.com.cn over localhost

  • mixed business + gateway scene preserving business target page

  • scene with only localhost/noise ending in unresolved bootstrap

  • Step 6: Commit

git add frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js src/generated_scene/analyzer.rs src/generated_scene/ir.rs tests/scene_generator_test.rs
git commit -m "fix(generator): stratify bootstrap evidence and exclude localhost"

Task 3: Rectify Workflow Chain

Files:

  • Modify: frontend/scene-generator/generator-runner.js
  • Modify: frontend/scene-generator/server.js
  • Modify: src/generated_scene/analyzer.rs
  • Modify: src/generated_scene/ir.rs
  • Modify: src/generated_scene/generator.rs
  • Modify: tests/scene_generator_test.rs

Goal: Reconstruct workflow from request-chain evidence instead of generic field names, so paginated_enrichment is only emitted when its true workflow exists.

  • Step 1: Split workflow evidence into typed layers

Represent workflow evidence as:

  • request evidence
  • pagination evidence
  • secondary request evidence
  • post-process evidence

Expected result: archetype decisions operate on structured workflow signals instead of a flat endpoint list.

  • Step 2: Denoise endpoint and method evidence

Normalize and filter out:

  • ${apiUrl}
  • template placeholders
  • exception strings
  • log text fragments
  • localhost export endpoints

Expected result: workflow reconstruction only consumes business-relevant requests.
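A denoising pass along these lines could run before workflow reconstruction. The heuristics are assumptions for illustration only: a business request is taken to be an absolute http(s) URL or a root-relative path, and any string containing whitespace is treated as log or exception text. The function names are hypothetical.

```rust
/// Heuristic filter for a single endpoint candidate.
fn is_business_endpoint(raw: &str) -> bool {
    let s = raw.trim();
    let lower = s.to_ascii_lowercase();

    // Must look like a request target at all.
    let looks_like_request =
        lower.starts_with("http://") || lower.starts_with("https://") || lower.starts_with('/');
    // Template placeholders such as ${apiUrl} or format strings.
    let has_template = s.contains("${") || s.contains("{{") || s.contains("%s");
    // Exception strings and log fragments usually contain whitespace.
    let has_whitespace = s.chars().any(char::is_whitespace);
    // Localhost export/helper endpoints are kept elsewhere as evidence.
    let is_local = lower.contains("localhost") || lower.contains("127.0.0.1");

    looks_like_request && !has_template && !has_whitespace && !is_local
}

/// Normalizes (trims), filters, and de-duplicates endpoint candidates,
/// preserving first-seen order.
fn denoise_endpoints<'a>(raw: &[&'a str]) -> Vec<&'a str> {
    let mut kept: Vec<&'a str> = Vec::new();
    for &candidate in raw {
        let trimmed = candidate.trim();
        if is_business_endpoint(trimmed) && !kept.contains(&trimmed) {
            kept.push(trimmed);
        }
    }
    kept
}
```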

  • Step 3: Tighten archetype routing rules

Require paginated_enrichment to have at minimum:

  1. one main list request
  2. one pagination variable set
  3. one secondary request or explicit per-item enrichment function
  4. one post-process action among filter, transform, export

If only part of this exists, preserve it as candidate evidence but do not route into the compiler.
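The four preconditions can be checked in one place that also reports what is missing, so partial evidence is preserved without being routed. A minimal sketch follows; the `WorkflowEvidence` struct, its field names, and the missing-piece labels are hypothetical and not taken from the existing IR.

```rust
/// Post-process actions named in the routing rule.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PostProcess {
    Filter,
    Transform,
    Export,
}

/// Workflow evidence collected for one scene (hypothetical shape).
#[derive(Debug, Default, Clone)]
struct WorkflowEvidence {
    main_list_request: bool,
    pagination_variables: bool,
    secondary_request: bool,
    per_item_enrichment_fn: bool,
    post_process: Vec<PostProcess>,
}

/// Lists every precondition that paginated_enrichment still lacks.
fn missing_for_paginated_enrichment(ev: &WorkflowEvidence) -> Vec<&'static str> {
    let mut missing = Vec::new();
    if !ev.main_list_request {
        missing.push("main_list_request");
    }
    if !ev.pagination_variables {
        missing.push("pagination_variables");
    }
    // Either a secondary request or an explicit per-item enrichment function.
    if !(ev.secondary_request || ev.per_item_enrichment_fn) {
        missing.push("secondary_request_or_enrichment_fn");
    }
    if ev.post_process.is_empty() {
        missing.push("post_process_action");
    }
    missing
}

/// Routing is allowed only when nothing is missing.
fn can_route_paginated_enrichment(ev: &WorkflowEvidence) -> bool {
    missing_for_paginated_enrichment(ev).is_empty()
}
```

The list of missing pieces can feed the "missing workflow pieces" display in the readiness breakdown.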

  • Step 4: Narrow multi_mode detection

Allow multi_mode_request only when mode switching materially changes at least one of:

  • request body
  • endpoint shape
  • response path
  • column definition

Expected result: generic type/tab/mode/status fields alone no longer misclassify marketing-like scenes.
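One way to make "materially changes" testable is to capture, per mode, the four aspects listed above and compare them. The sketch below assumes each aspect has already been normalized to a comparable value; `ModeProfile` and both function names are hypothetical.

```rust
/// The request-relevant aspects of one mode (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
struct ModeProfile {
    request_body: String,
    endpoint_shape: String,
    response_path: String,
    columns: Vec<String>,
}

/// True when at least one of the four aspects differs between two modes.
fn diverges(a: &ModeProfile, b: &ModeProfile) -> bool {
    a.request_body != b.request_body
        || a.endpoint_shape != b.endpoint_shape
        || a.response_path != b.response_path
        || a.columns != b.columns
}

/// multi_mode_request is justified only when some mode diverges from the first.
/// Zero or one mode can never qualify.
fn has_material_mode_divergence(modes: &[ModeProfile]) -> bool {
    match modes.split_first() {
        Some((first, rest)) => rest.iter().any(|m| diverges(first, m)),
        None => false,
    }
}
```

Under this check, a scene whose tab or status field exists but leaves all four aspects unchanged reports no divergence.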

  • Step 5: Block compiler routing on incomplete workflow

Update generator-side routing so incomplete evidence cannot produce a formal paginated_enrichment skill package.

  • Step 6: Add regression tests

Cover at least:

  • marketing-like scene must expose paginate + secondary_request + post-process evidence

  • generic mode fields without real mode divergence must not force multi_mode_request

  • noisy endpoint lists must still reconstruct the correct business request chain

  • Step 7: Commit

git add frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js src/generated_scene/analyzer.rs src/generated_scene/ir.rs src/generated_scene/generator.rs tests/scene_generator_test.rs
git commit -m "fix(generator): require complete workflow evidence before archetype routing"

Task 4: Rectify Readiness Chain

Files:

  • Modify: frontend/scene-generator/server.js
  • Modify: frontend/scene-generator/sg_scene_generator.html
  • Modify: src/generated_scene/ir.rs
  • Modify: src/generated_scene/generator.rs
  • Modify: tests/scene_generator_html_test.rs
  • Modify: tests/scene_generator_test.rs

Goal: Turn readiness into a hard gate that distinguishes analysis output from runnable skill output.

  • Step 1: Add explicit rectification gates

Track at minimum:

  • scene_id_valid
  • bootstrap_resolved
  • workflow_complete_for_archetype
  • runtime_contract_compatible

Expected result: readiness is derived from named gates rather than a loose score only.

  • Step 2: Enforce fail-closed readiness rules

Require:

  • every core gate must pass before readiness can be A or B

  • any core gate failure forces readiness C

  • generation endpoint blocks runnable output on gate failure
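The gate set and the fail-closed rule can be sketched as follows. The gate names come from Step 1. The `score` parameter and the threshold of 80 used to separate A from B are assumptions, since the plan does not define how A and B differ.

```rust
/// The four core rectification gates from Step 1.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Gates {
    scene_id_valid: bool,
    bootstrap_resolved: bool,
    workflow_complete_for_archetype: bool,
    runtime_contract_compatible: bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Readiness {
    A,
    B,
    C,
}

impl Gates {
    /// Names of the gates that did not pass, in declaration order.
    fn failed(&self) -> Vec<&'static str> {
        let checks = [
            ("scene_id_valid", self.scene_id_valid),
            ("bootstrap_resolved", self.bootstrap_resolved),
            ("workflow_complete_for_archetype", self.workflow_complete_for_archetype),
            ("runtime_contract_compatible", self.runtime_contract_compatible),
        ];
        checks
            .iter()
            .filter(|(_, passed)| !*passed)
            .map(|(name, _)| *name)
            .collect()
    }
}

/// Any failed gate forces C regardless of score.
/// Assumed: with all gates passing, a score of 80 or more yields A, else B.
fn readiness(gates: &Gates, score: u8) -> Readiness {
    if !gates.failed().is_empty() {
        return Readiness::C;
    }
    if score >= 80 {
        Readiness::A
    } else {
        Readiness::B
    }
}
```

Checking the gates before the score means a high score cannot compensate for a failed gate, which is the fail-open behavior this task removes.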

  • Step 3: Separate analysis result from generation result

When gates fail, allow:

  • analysis preview
  • evidence report
  • block reasons

But do not default to:

  • full skill emission

  • compiler success messaging
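Modeling the two result kinds as distinct variants keeps a blocked run from being reported as a success by accident. This is a hypothetical sketch: `GenerationOutcome` and `decide_outcome` are illustrative names, and the real server response may carry more fields (analysis preview, evidence report).

```rust
/// What the generation endpoint returns (hypothetical shape).
#[derive(Debug, PartialEq, Eq)]
enum GenerationOutcome {
    /// All gates passed: a formal skill package may be emitted.
    SkillEmitted { scene_id: String },
    /// At least one gate failed: only analysis artifacts and block reasons.
    AnalysisOnly { block_reasons: Vec<String> },
}

/// Maps the failed-gate list to an outcome. Never emits a skill on failure.
fn decide_outcome(scene_id: &str, failed_gates: &[&str]) -> GenerationOutcome {
    if failed_gates.is_empty() {
        GenerationOutcome::SkillEmitted {
            scene_id: scene_id.to_string(),
        }
    } else {
        GenerationOutcome::AnalysisOnly {
            block_reasons: failed_gates
                .iter()
                .map(|gate| format!("gate failed: {}", gate))
                .collect(),
        }
    }
}
```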

  • Step 4: Expose readiness breakdown in UI

Display:

  • gate names

  • pass/fail state

  • missing workflow pieces

  • bootstrap resolution reason

  • invalid sceneId reason

  • Step 5: Add regression tests

Cover at least:

  • invalid sceneId forcing readiness C

  • unresolved bootstrap forcing readiness C

  • incomplete paginated workflow forcing readiness C

  • fully valid reference fixture remaining eligible for generation

  • Step 6: Commit

git add frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html src/generated_scene/ir.rs src/generated_scene/generator.rs tests/scene_generator_html_test.rs tests/scene_generator_test.rs
git commit -m "fix(generator): enforce readiness fail-closed gating"

Task 5: Reference Regression Verification

Files:

  • Modify: tests/scene_generator_test.rs
  • Modify: tests/scene_generator_html_test.rs
  • Modify/Create: relevant fixtures under tests/fixtures/generated_scene/

Goal: Lock the rectification against the two reference scene families and ensure future changes do not reintroduce the same false positives.

  • Step 1: Regress marketing-like fixture

Verify the marketing reference path now satisfies:

  • non-degenerate sceneId

  • bootstrap rooted in yx.gs.sgcc.com.cn family

  • workflow includes paginate

  • workflow includes secondary_request

  • readiness does not pass if any of the above are missing

  • Step 2: Regress tq-like fixture

Verify the tq reference path still satisfies:

  • stable semantic sceneId

  • valid non-localhost bootstrap

  • genuine multi_mode_request detection

  • no downgrade caused by the stricter marketing rectification rules

  • Step 3: Run verification commands

Run:

cargo check
cargo test --test scene_generator_test -- --nocapture
cargo test --test scene_generator_html_test -- --nocapture
node --check frontend/scene-generator/llm-client.js
node --check frontend/scene-generator/generator-runner.js
node --check frontend/scene-generator/server.js

Expected result: rectification passes both Rust and Node validation plus regression coverage.

  • Step 4: Record outcomes in generated reports if needed

If the implementation emits readiness or analysis JSON reports, ensure the test fixtures assert the key blocked/passed states directly.

  • Step 5: Commit

git add tests/scene_generator_test.rs tests/scene_generator_html_test.rs tests/fixtures/generated_scene frontend/scene-generator/llm-client.js frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js src/generated_scene/analyzer.rs src/generated_scene/ir.rs src/generated_scene/generator.rs
git commit -m "test(generator): lock generated scene rectification regressions"

Acceptance Criteria

This plan is complete when all of the following are true:

  1. Chinese-source scene names no longer degrade into low-entropy ids like 2-0.
  2. localhost, 127.0.0.1, export services, and helper URLs no longer compete for bootstrap resolution.
  3. paginated_enrichment routing only occurs when a main list request, pagination, a secondary request (or explicit per-item enrichment function), and post-process evidence are all present.
  4. Incomplete evidence paths fail closed with explicit readiness gate failures instead of generating false-positive runnable skills.
  5. The marketing-like and tq-like reference scenes both remain covered by automated regression tests.

Rollback Strategy

If this rectification causes unacceptable regressions:

  1. Revert the latest rectification task commit only, not unrelated generated-scene work.
  2. Keep the previous Scene IR and compiler structure intact.
  3. Preserve newly added fixtures and tests where possible, then relax only the specific gate or classifier that caused the regression.

Notes For Executors

  • Implement this plan strictly in order: naming, bootstrap, workflow, readiness, verification.
  • Do not skip ahead to UI polish before the gating logic is in place.
  • Do not add speculative resolver or login work under this plan.
  • Any need for user override or forced draft generation must be handled as a separate follow-up spec, not smuggled into this rectification plan.