claw/docs/superpowers/plans/2026-04-15-generated-scene-skill-platform-plan.md

# Generated Scene Skill Platform Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add a manifest-driven generated-scene platform that discovers staged report/collection `browser_script` scenes, routes deterministic `。。。` requests through generic registry/resolver logic, migrates `tq-lineloss-report` off one-off Rust branches, and ships a first in-repo generator that outputs registration-ready scene packages with minimal or zero per-scene Rust changes.

**Architecture:** Keep the existing submit branch shape in `src/agent/task_runner.rs`, but replace the line-loss-specific deterministic branch with a thin adapter over a generic scene registry, deterministic dispatcher, generic report-artifact interpreter, and generic XLSX postprocess path. Keep the generator separate from runtime internals by making `scene.toml` plus the lessons-learned TOML the only stable generator/runtime contract; generator code lives in its own module and binary, while runtime code stays under the existing `compat` submit/bootstrap seams.

**Tech Stack:** Rust 2021, `serde`, `serde_json`, `toml`, existing `browser_script` runtime and callback-host/browser-backend seams, `node:test` for staged JS, Cargo integration tests, filesystem-based package generation.

---

## Execution Context

- Branch from the repo's current ws baseline branch, which is `feature/claw-ws` in this checkout today. Do **not** implement on that branch directly; create a new feature branch from its HEAD.
- Do **not** create a worktree unless the user explicitly asks. Branch isolation is required; worktree isolation is not.
- Keep `skillsDir` as the existing single resolved path. The new scene registry must scan inside that one resolved skills root instead of adding array-style scene roots or a second config field.
- For this branch's automated tests and real smokes, use a repo-local `skillsDir` override that points at `examples/generated_scene_platform`. That still preserves the single-root contract because the runtime scans one resolved root whose `skills/` child contains the committed sample package.
- Put the new runtime registration manifest at `<skill-root>/scene.toml`. Keep existing `skill_staging/scenes/*/scene.json` files for legacy staging/UI metadata and do **not** move runtime dispatch policy back into `scene.json`.
- Keep every required deliverable for this plan inside the current `claw-new` repo so the branch can be built, tested, and committed independently. The first committed sample package should live under `examples/generated_scene_platform/skills/`; publishing the same package into any external skills/staging repo is a separate follow-up, not part of this branch.
- V1 scope is locked to `category = "report_collection"`, `kind = "browser_script"`, `artifact.type = "report-artifact"`. Unsupported scene types must fail fast instead of partially working.
- Deterministic invocation remains exact-suffix-only: only raw instructions ending with the exact `。。。` suffix enter the scene dispatcher.
- Never use hidden page defaults for required canonical parameters. Missing org, missing month/week mode, or missing period must prompt and stop.
- Do **not** add a generic login/session subsystem in this plan.
- Preserve current non-platform flows: Zhihu/LLM, configured `directSubmitSkill`, and ordinary browser-attached orchestration must remain behaviorally unchanged unless an explicit regression test says otherwise.
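
The exact-suffix gate and the v1 scope lock in the list above could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: `strip_deterministic_suffix`, `check_v1_scope`, and the error strings are hypothetical names, and matching the raw text without trimming it first is an assumption of the sketch.

```rust
/// The exact deterministic suffix from this plan (three ideographic full stops).
const DETERMINISTIC_SUFFIX: &str = "。。。";

/// Hypothetical helper: returns the instruction body when the raw instruction
/// ends with the exact suffix, or `None` so the caller can fall through to the
/// ordinary (non-deterministic) paths. Assumption: the raw text is matched
/// as-is, so trailing whitespace after the suffix does not qualify.
pub fn strip_deterministic_suffix(raw_instruction: &str) -> Option<&str> {
    raw_instruction
        .strip_suffix(DETERMINISTIC_SUFFIX)
        .map(str::trim_end)
}

/// Hypothetical helper: fails fast for anything outside the locked v1 scope.
pub fn check_v1_scope(category: &str, kind: &str, artifact_type: &str) -> Result<(), String> {
    if category != "report_collection" {
        return Err(format!("unsupported scene category: {category}"));
    }
    if kind != "browser_script" {
        return Err(format!("unsupported scene kind: {kind}"));
    }
    if artifact_type != "report-artifact" {
        return Err(format!("unsupported artifact type: {artifact_type}"));
    }
    Ok(())
}
```

Returning `Option<&str>` keeps the gate allocation-free and makes the "not deterministic" case impossible to confuse with an empty instruction body.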

## File Map

### Core runtime and contract files

- Create: `src/scene_contract/mod.rs`
  - shared serializable manifest contract used by both runtime and generator
- Create: `src/scene_contract/manifest.rs`
  - `scene.toml` schema types, schema-version validation helpers, artifact/postprocess enums
- Create: `src/compat/scene_platform/mod.rs`
  - exports the registry, dispatch, and resolver units
- Create: `src/compat/scene_platform/registry.rs`
  - scans the single resolved `skillsDir`, loads `<skill-root>/scene.toml`, validates duplicates and runtime compatibility
- Create: `src/compat/scene_platform/dispatch.rs`
  - deterministic candidate scoring, ambiguity fail-closed behavior, canonical param resolution, executable scene plan creation
- Create: `src/compat/scene_platform/resolvers.rs`
  - reusable resolver types for `dictionary_entity`, `month_week_period`, `fixed_enum`, and `literal_passthrough`
- Create: `src/compat/report_artifact.rs`
  - generic report-artifact parsing, status mapping, summary building, and export-readiness helpers
- Create: `src/compat/report_xlsx_export.rs`
  - generic XLSX exporter for any `report-artifact` with `column_defs`/`columns` + `rows`
- Modify: `src/lib.rs`
  - export new shared/runtime/generator modules and any CLI helpers needed by tests
- Modify: `src/compat/mod.rs`
  - export the new scene-platform and report-artifact modules
- Modify: `src/compat/deterministic_submit.rs`
  - keep the public API shape, but make it registry/manifest-driven instead of line-loss-hardcoded
- Modify: `src/compat/direct_skill_runtime.rs`
  - reuse the generic report-artifact interpreter so direct-submit and scene-submit summarize/status-map the same way
- Modify: `src/agent/task_runner.rs`
  - keep branch order, but call the new registry-backed deterministic planner before ordinary orchestration/LLM
- Modify: `src/service/server.rs`
  - keep bootstrap precedence shape, but let deterministic plans source `target_url` / `expected_domain` from scene manifests instead of hardcoded constants

### Generator files

- Create: `src/generated_scene/mod.rs`
  - generator entrypoints shared by tests and CLI
- Create: `src/generated_scene/analyzer.rs`
  - source directory inspection for v1 report/collection `browser_script` scenes
- Create: `src/generated_scene/generator.rs`
  - template rendering and package writing into an output staging root
- Create: `src/generated_scene/lessons.rs`
  - loads and validates `tq-lineloss-lessons-learned.toml` as generation constraints
- Create: `src/bin/sg_scene_generate.rs`
  - CLI entry for `sgClaw`'s in-repo scene generator capability

### In-repo sample package and reference assets

- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/scene.toml`
  - first committed manifest-driven sample scene package used by runtime and generator tests in this repo
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/references/org-dictionary.json`
  - external dictionary data for the `dictionary_entity` resolver fixture
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/SKILL.toml`
  - committed sample browser-script tool contract aligned with the manifest-driven runtime
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/SKILL.md`
  - committed sample documentation for canonical args, artifact contract, and runtime expectations
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.js`
  - committed sample collection script with generic-platform artifact fields
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.test.js`
  - committed JS contract tests for canonical args and artifact shape
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/references/data-quality.md`
  - committed sample data-quality notes aligned with manifest-driven output rules
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/references/collection-flow.md`
  - committed sample bootstrap/collection-flow notes
- Create: `tests/fixtures/scene_source/tq_lineloss/index.html`
  - hermetic in-repo source fixture for required analyzer/generator smoke coverage
- Create: `tests/fixtures/scene_source/tq_lineloss/js/collect.js`
  - hermetic in-repo source fixture JS for analyzer/generator smoke coverage

### Repo-local runtime discovery path for validation

- Use `examples/generated_scene_platform` as the repo-local `skillsDir` override root during tests and manual smokes.
- The runtime still scans one resolved root only; it just resolves that root to `examples/generated_scene_platform`, whose `skills/` child contains the committed sample package.
- Add or reuse a tiny repo-local config fixture such as `tmp/generated_scene_platform_sgclaw_config.json` or an equivalent test helper so the validation steps all point at the same reproducible `skillsDir`.
- Do not require external staging repos to make the manifest-driven runtime discoverable during this branch.
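
As a rough sketch of the single-root discovery described above, the scan can be limited to the immediate children of one directory. `find_scene_manifests` is a hypothetical name, and the sketch assumes the caller has already resolved `skillsDir` down to the directory whose children are skill roots; how that resolution happens is not shown here.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: lists `<skill-root>/scene.toml` for every immediate
/// child directory of `skill_parent`. Skills without a manifest are skipped,
/// and nothing outside this one directory is scanned.
pub fn find_scene_manifests(skill_parent: &Path) -> io::Result<Vec<PathBuf>> {
    let mut manifests = Vec::new();
    for entry in fs::read_dir(skill_parent)? {
        let skill_root = entry?.path();
        if !skill_root.is_dir() {
            continue;
        }
        let manifest = skill_root.join("scene.toml");
        if manifest.is_file() {
            manifests.push(manifest);
        }
    }
    // Sort so registry order (and any error message) is reproducible.
    manifests.sort();
    Ok(manifests)
}
```

Sorting the result is a design choice of the sketch: `read_dir` order is platform-dependent, and a stable order keeps duplicate-id errors and test expectations deterministic.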

### External publish target kept out of scope for this branch

- Do not modify external paths like `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/...` in this plan.
- If the user later wants the generated sample published into that external staging repo, do it as a separate follow-up after this branch is green.

### Tests and fixtures

- Create: `tests/scene_registry_test.rs`
  - manifest loading, duplicate detection, schema validation, tool compatibility checks
- Create: `tests/report_artifact_postprocess_test.rs`
  - generic report-artifact parsing and XLSX postprocess coverage
- Create: `tests/generated_scene_lessons_test.rs`
  - lessons-TOML shape and required-rule coverage
- Create: `tests/scene_generator_test.rs`
  - analyzer + generator integration coverage using hermetic fixtures
- Create: `tests/fixtures/generated_scene/report_collection/index.html`
  - supported v1 report-scene fixture
- Create: `tests/fixtures/generated_scene/report_collection/js/report.js`
  - supported fixture source hints for analyzer tests
- Create: `tests/fixtures/generated_scene/non_report/index.html`
  - unsupported fixture proving fail-fast behavior
- Modify: `tests/deterministic_submit_test.rs`
  - migrate from hardcoded line-loss expectations to registry-driven deterministic behavior
- Modify: `tests/agent_runtime_test.rs`
  - keep direct-submit behavior intact while sharing generic report-artifact summaries
- Modify: `tests/service_task_flow_test.rs`
  - task-runner/bootstrap regressions for manifest-driven deterministic scenes
- Modify: `tests/service_ws_session_test.rs`
  - callback-host bootstrap target regression for manifest-driven deterministic submit when the browser-ws path is active

### Legacy files to delete only after green verification proves they are unused

- Delete: `src/compat/tq_lineloss/org_units.rs`
- Delete: `src/compat/tq_lineloss/org_resolver.rs`
- Delete: `src/compat/tq_lineloss/period_resolver.rs`
- Delete or reduce to a compatibility shim only if still needed: `src/compat/lineloss_xlsx_export.rs`

---

### Task 1: Create the implementation branch and lock the layout boundaries

**Files:**
- Verify only

- [ ] **Step 1: Switch to the ws baseline branch and create a new platform branch**

Run:

```bash
git switch feature/claw-ws
git switch -c feature/generated-scene-skill-platform
```

Expected: `git status -sb` shows a clean new branch rooted at the current ws baseline, not `feature/claw-ws` itself.

- [ ] **Step 2: Verify the current single-root skills layout before coding**

Run:

```bash
cargo test --test compat_config_test ws_cleanup_resolves_single_configured_skills_dir -- --nocapture
```

Expected: PASS, proving the repo still uses one resolved `skillsDir` path and the platform work must build on that instead of introducing array-style roots.

- [ ] **Step 3: Write down the two non-negotiable layout decisions in the first registry test scaffold**

The very first red test file (`tests/scene_registry_test.rs`) must assume:

```rust
// runtime manifest location:
let manifest_path = skill_root.join("scene.toml");
assert!(manifest_path.exists());

// legacy scene.json stays outside runtime dispatch ownership:
assert_eq!(manifest_path.file_name().and_then(|name| name.to_str()), Some("scene.toml"));
assert!(!manifest_path.components().any(|part| part.as_os_str() == "skill_staging"));
```

This prevents the implementation from drifting back toward `scene.json` routing or multi-root config.

---

### Task 2: Add the shared `scene.toml` contract and registry loader

**Files:**
- Create: `src/scene_contract/mod.rs`
- Create: `src/scene_contract/manifest.rs`
- Create: `src/compat/scene_platform/mod.rs`
- Create: `src/compat/scene_platform/registry.rs`
- Modify: `src/lib.rs`
- Modify: `src/compat/mod.rs`
- Create: `tests/scene_registry_test.rs`

- [ ] **Step 1: Write the failing registry tests first**

Add `tests/scene_registry_test.rs` with focused red cases like:

```rust
#[test]
fn registry_loads_scene_manifest_from_skill_root() {
    let skill_root = temp_skill_with_scene_manifest(r#"
[scene]
id = "tq-lineloss-report"
skill = "tq-lineloss-report"
tool = "collect_lineloss"
kind = "browser_script"
version = "0.1.0"
category = "report_collection"

[manifest]
schema_version = "1"

[bootstrap]
expected_domain = "20.76.57.61"
target_url = "http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor"
requires_target_page = true

[artifact]
type = "report-artifact"
success_status = ["ok", "partial", "empty"]
failure_status = ["blocked", "error"]
"#);

    let registry = load_scene_registry(skill_root.parent().unwrap()).unwrap();
    assert_eq!(registry.len(), 1);
    assert_eq!(registry[0].manifest.scene.id, "tq-lineloss-report");
}

#[test]
fn registry_rejects_duplicate_scene_ids_with_both_paths_in_error() { /* two skills, same scene.id */ }

#[test]
fn registry_rejects_unknown_manifest_schema_version() { /* schema_version = "999" */ }

#[test]
fn registry_rejects_non_browser_script_scene_tool_in_v1() { /* kind = "shell" should fail */ }

#[test]
fn registry_ignores_skills_without_scene_toml() { /* ordinary skills still load elsewhere */ }
```

- [ ] **Step 2: Run the registry test file and verify it fails**

Run:

```bash
cargo test --test scene_registry_test -- --nocapture
```

Expected: FAIL because `scene.toml` types and registry loading do not exist yet.

- [ ] **Step 3: Implement the serializable manifest contract and the single-root registry loader**

Implement the minimal contract and loader needed to satisfy the tests:

```rust
use std::path::{Path, PathBuf};

use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct SceneManifest {
    pub scene: SceneSection,
    pub manifest: ManifestSection,
    pub bootstrap: BootstrapSection,
    // The Step 1 fixture omits `[deterministic]` and `[[params]]`, so both need
    // serde defaults; this requires `DeterministicSection: Default`.
    #[serde(default)]
    pub deterministic: DeterministicSection,
    #[serde(default)]
    pub params: Vec<SceneParam>,
    pub artifact: ArtifactSection,
    pub postprocess: Option<PostprocessSection>,
}

#[derive(Debug, Clone)]
pub struct SceneRegistryEntry {
    pub manifest: SceneManifest,
    pub skill_root: PathBuf,
}

pub fn load_scene_registry(skills_dir: &Path) -> Result<Vec<SceneRegistryEntry>, SceneRegistryError> {
    // iterate immediate skill dirs under the already-resolved single skillsDir
    // look for <skill-root>/scene.toml only
    // parse and validate schema version
    // verify scene.id uniqueness across the loaded root
    // verify manifest.scene.skill matches the containing skill package
    // verify referenced tool exists in SKILL.toml and is browser_script in v1
    todo!("fill in during this step")
}
```

Rules to lock now:
- `schema_version = "1"` is the only accepted version in v1
- duplicate `scene.id` is a hard error and must report both manifest paths
- manifest loading must not add a second config key or a hardcoded `skill_staging/scenes` scan
- `scene.toml` is runtime-owned; `scene.json` stays legacy-only
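
The first two rules can be illustrated with a small, self-contained check. This is only a sketch under stated assumptions: `LoadedManifest`, `RegistryCheckError`, and `check_registry` are hypothetical names that stand in for whatever the real `SceneRegistryError` ends up carrying.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical minimal view of an already-parsed manifest.
pub struct LoadedManifest {
    pub scene_id: String,
    pub schema_version: String,
    pub manifest_path: PathBuf,
}

/// Hypothetical error type; the duplicate variant carries both manifest paths.
#[derive(Debug, PartialEq)]
pub enum RegistryCheckError {
    UnsupportedSchemaVersion { manifest: PathBuf, found: String },
    DuplicateSceneId { scene_id: String, first: PathBuf, second: PathBuf },
}

/// Rejects any schema version other than "1" and any repeated `scene.id`.
pub fn check_registry(entries: &[LoadedManifest]) -> Result<(), RegistryCheckError> {
    let mut seen: HashMap<&str, &Path> = HashMap::new();
    for entry in entries {
        if entry.schema_version != "1" {
            return Err(RegistryCheckError::UnsupportedSchemaVersion {
                manifest: entry.manifest_path.clone(),
                found: entry.schema_version.clone(),
            });
        }
        // `insert` returns the previously stored path when the id was already seen.
        if let Some(first) = seen.insert(entry.scene_id.as_str(), entry.manifest_path.as_path()) {
            return Err(RegistryCheckError::DuplicateSceneId {
                scene_id: entry.scene_id.clone(),
                first: first.to_path_buf(),
                second: entry.manifest_path.clone(),
            });
        }
    }
    Ok(())
}
```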

- [ ] **Step 4: Re-run the registry tests and verify they pass**

Run:

```bash
cargo test --test scene_registry_test -- --nocapture
```

Expected: PASS.

- [ ] **Step 5: Commit the contract and registry slice**

Run:

```bash
git add src/lib.rs src/scene_contract/mod.rs src/scene_contract/manifest.rs src/compat/mod.rs src/compat/scene_platform/mod.rs src/compat/scene_platform/registry.rs tests/scene_registry_test.rs
git commit -m "feat: add scene manifest registry"
```

Expected: one commit that introduces the stable runtime/generator contract and registry loader.

---

### Task 3: Generalize deterministic dispatch and reusable parameter resolvers

**Files:**
- Create: `src/compat/scene_platform/dispatch.rs`
- Create: `src/compat/scene_platform/resolvers.rs`
- Modify: `src/compat/deterministic_submit.rs`
- Modify: `tests/deterministic_submit_test.rs`

- [ ] **Step 1: Replace the line-loss-only deterministic tests with registry-backed red tests**

Extend `tests/deterministic_submit_test.rs` with registry-backed red cases built from temp fixture manifests under a temporary skills root. Do **not** depend on the committed sample package from Task 6 yet; Task 3 must stay hermetic and independently runnable. Add failing cases such as:

```rust
#[test]
fn deterministic_submit_uses_registry_backed_scene_plan() {
    let decision = decide_deterministic_submit(
        "兰州公司 台区线损大数据 月累计线损率统计分析 2026-03。。。",
        None,
        None,
    );

    match decision {
        DeterministicSubmitDecision::Execute(plan) => {
            assert_eq!(plan.scene_id, "tq-lineloss-report");
            assert_eq!(plan.tool_name, "tq-lineloss-report.collect_lineloss");
            assert_eq!(plan.expected_domain, "20.76.57.61");
            assert_eq!(plan.target_url, "http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor");
        }
        other => panic!("expected execute plan, got {other:?}"),
    }
}

#[test]
fn deterministic_submit_fails_closed_on_scene_ambiguity() { /* two plausible scene.toml entries -> Prompt */ }

#[test]
fn deterministic_submit_prompts_for_missing_period_instead_of_defaulting() {
    let decision = decide_deterministic_submit("兰州公司 台区线损大数据 月累计线损率统计分析。。。", None, None);
    assert!(matches!(decision, DeterministicSubmitDecision::Prompt { .. }));
}

#[test]
fn deterministic_submit_uses_page_context_to_break_ties_before_keyword_only_match() { /* page_url/title beats keyword overlap */ }

#[test]
fn zhihu_without_suffix_remains_not_deterministic() {
    assert!(matches!(
        decide_deterministic_submit("打开知乎热榜", Some("https://www.zhihu.com/hot"), Some("知乎热榜")),
        DeterministicSubmitDecision::NotDeterministic
    ));
}
```

Also invert the current default-period expectations. `兰州公司 月累计。。。` and `兰州公司 周累计。。。` must now prompt instead of executing.
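
A minimal sketch of the inverted behavior follows. It is deliberately narrow: `resolve_month_week_period`, `PeriodOutcome`, and the English prompt strings are hypothetical, the input is assumed to be the instruction body with the suffix already stripped, and only a `YYYY-MM` month token is parsed. Week syntax is not defined in this plan, so the sketch always prompts for week mode rather than guessing a format.

```rust
/// Hypothetical prompt texts; the real prompts come from `scene.toml`.
pub const PROMPT_MISSING_MODE: &str = "missing month/week mode";
pub const PROMPT_CONTRADICTORY: &str = "contradictory month/week intent";
pub const PROMPT_MISSING_PERIOD: &str = "missing period";

#[derive(Debug, PartialEq)]
pub enum PeriodOutcome {
    Resolved { mode: &'static str, value: String },
    Prompt(&'static str),
}

/// True for tokens shaped exactly like `YYYY-MM` with a month of 01..=12.
fn is_year_month(token: &str) -> bool {
    let b = token.as_bytes();
    b.len() == 7
        && b[..4].iter().all(u8::is_ascii_digit)
        && b[4] == b'-'
        && b[5..].iter().all(u8::is_ascii_digit)
        && (1..=12).contains(&((b[5] - b'0') * 10 + (b[6] - b'0')))
}

/// Never substitutes a default period: anything missing becomes a prompt.
pub fn resolve_month_week_period(body: &str) -> PeriodOutcome {
    let mode = match (body.contains("月累计"), body.contains("周累计")) {
        (true, true) => return PeriodOutcome::Prompt(PROMPT_CONTRADICTORY),
        (false, false) => return PeriodOutcome::Prompt(PROMPT_MISSING_MODE),
        (true, false) => "month",
        (false, true) => "week",
    };
    if mode == "week" {
        // Assumption: week parsing is out of scope for this sketch.
        return PeriodOutcome::Prompt(PROMPT_MISSING_PERIOD);
    }
    match body.split_whitespace().find(|token| is_year_month(token)) {
        Some(value) => PeriodOutcome::Resolved { mode, value: value.to_string() },
        None => PeriodOutcome::Prompt(PROMPT_MISSING_PERIOD),
    }
}
```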

- [ ] **Step 2: Run the targeted deterministic tests and verify they fail**

Run:

```bash
cargo test --test deterministic_submit_test -- --nocapture
```

Expected: FAIL because the current implementation is still hardcoded to line-loss constants and still defaults missing month/week periods.

- [ ] **Step 3: Implement reusable resolver types and a registry-backed dispatcher**

Implement the generic deterministic planner in the new scene-platform modules, then make `src/compat/deterministic_submit.rs` a thin adapter over it.

Required implementation shape:

```rust
pub enum ResolverKind {
    DictionaryEntity,
    MonthWeekPeriod,
    FixedEnum,
    LiteralPassthrough,
}

pub struct SceneExecutionPlan {
    pub scene_id: String,
    pub instruction: String,
    pub tool_name: String,
    pub expected_domain: String,
    pub target_url: String,
    pub args: Map<String, Value>,
    pub success_statuses: Vec<String>,
    pub failure_statuses: Vec<String>,
    pub postprocess: Option<PostprocessSection>,
}

pub fn plan_deterministic_scene(
    raw_instruction: &str,
    page_url: Option<&str>,
    page_title: Option<&str>,
    skills_dir: &Path,
) -> Result<DeterministicSubmitDecision, SceneDispatchError> {
    // exact suffix gate
    // load registry from the single skillsDir
    // score candidate scenes using include/exclude keywords + page context + required-param resolution
    // if multiple remain plausible -> fail closed with explicit ambiguity prompt
    // resolve params using generic resolver kinds
    // build executable SceneExecutionPlan with manifest bootstrap + tool + canonical args
}
```
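
The scoring and fail-closed steps in the skeleton above might look roughly like this. Everything here is an assumption made for illustration: the `SceneCandidate` and `Selection` types, the use of the page title as the only page-context signal, and the weight of 10 that lets page context outrank keyword overlap are not specified by this plan.

```rust
/// Hypothetical view of the manifest fields that matter for scoring.
pub struct SceneCandidate<'a> {
    pub scene_id: &'a str,
    pub include_keywords: &'a [&'a str],
    pub exclude_keywords: &'a [&'a str],
    pub page_title_keywords: &'a [&'a str],
}

#[derive(Debug, PartialEq)]
pub enum Selection<'a> {
    NoMatch,
    Selected(&'a str),
    Ambiguous(Vec<&'a str>),
}

/// Counts how many keywords occur in `text`.
fn hits(text: &str, keywords: &[&str]) -> u32 {
    keywords.iter().filter(|k| text.contains(**k)).count() as u32
}

/// Returns `None` when the candidate is excluded or has no keyword overlap.
fn score(candidate: &SceneCandidate<'_>, instruction: &str, page_title: Option<&str>) -> Option<u32> {
    if hits(instruction, candidate.exclude_keywords) > 0 {
        return None;
    }
    let keyword_hits = hits(instruction, candidate.include_keywords);
    if keyword_hits == 0 {
        return None;
    }
    let page_hits = page_title.map_or(0, |title| hits(title, candidate.page_title_keywords));
    // Hypothetical weight: one page-context hit outranks several keyword hits.
    Some(keyword_hits + 10 * page_hits)
}

/// Picks the single best candidate, or fails closed when the top score is tied.
pub fn select_scene<'a>(
    candidates: &[SceneCandidate<'a>],
    instruction: &str,
    page_title: Option<&str>,
) -> Selection<'a> {
    let scored: Vec<(&'a str, u32)> = candidates
        .iter()
        .filter_map(|c| score(c, instruction, page_title).map(|s| (c.scene_id, s)))
        .collect();
    let Some(best) = scored.iter().map(|(_, s)| *s).max() else {
        return Selection::NoMatch;
    };
    let mut top: Vec<&'a str> = scored.iter().filter(|(_, s)| *s == best).map(|(id, _)| *id).collect();
    match top.len() {
        1 => Selection::Selected(top.remove(0)),
        _ => Selection::Ambiguous(top),
    }
}
```

An `Ambiguous` result would map to a prompt rather than an execution plan, which is what keeps the dispatcher fail-closed.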

Resolver rules to lock now:
- `dictionary_entity` reads external dictionary data such as `references/org-dictionary.json`; no hardcoded org list in Rust after migration
- `month_week_period` returns explicit prompts for missing mode, missing period, contradictory month/week intent, or week-without-year
- `fixed_enum` and `literal_passthrough` exist now so the manifest contract is extensible, even if line-loss is the only v1 user
- if a new scene needs a new resolver **type**, add a reusable resolver, not a scene-specific `if scene_id == ...` branch
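
To make the `dictionary_entity` rule concrete, here is a hedged sketch of a resolver that works purely from dictionary data passed in by the caller. The `DictionaryEntry` shape (label, code, aliases) and the substring-matching strategy are assumptions; the actual layout of `references/org-dictionary.json` is not defined in this section, and loading the JSON is left out so the example stays dependency-free.

```rust
/// Hypothetical in-memory form of one dictionary row.
pub struct DictionaryEntry {
    pub label: String,
    pub code: String,
    pub aliases: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum EntityOutcome {
    Resolved { label: String, code: String },
    /// No entry matched: the caller should emit `prompt_missing`.
    Missing,
    /// Several entries matched: the caller should emit `prompt_ambiguous`.
    Ambiguous(Vec<String>),
}

/// Matches an entry when the instruction body contains its label or any alias.
pub fn resolve_dictionary_entity(body: &str, dictionary: &[DictionaryEntry]) -> EntityOutcome {
    let matches: Vec<&DictionaryEntry> = dictionary
        .iter()
        .filter(|entry| {
            body.contains(entry.label.as_str())
                || entry.aliases.iter().any(|alias| body.contains(alias.as_str()))
        })
        .collect();
    match matches.as_slice() {
        [] => EntityOutcome::Missing,
        [only] => EntityOutcome::Resolved {
            label: only.label.clone(),
            code: only.code.clone(),
        },
        many => EntityOutcome::Ambiguous(many.iter().map(|entry| entry.label.clone()).collect()),
    }
}
```

Because the resolver only sees the slice it is given, adding or renaming an org becomes a data change in the dictionary file rather than a Rust change.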

- [ ] **Step 4: Re-run the deterministic tests and verify they pass**

Run:

```bash
cargo test --test deterministic_submit_test -- --nocapture
```

Expected: PASS, including the new no-default-period behavior and ambiguity fail-closed coverage.

- [ ] **Step 5: Commit the registry-driven deterministic slice**

Run:

```bash
git add src/compat/deterministic_submit.rs src/compat/scene_platform/dispatch.rs src/compat/scene_platform/resolvers.rs tests/deterministic_submit_test.rs
git commit -m "feat: add registry-driven deterministic scene dispatch"
```

Expected: one commit that removes one-off line-loss decision ownership from the deterministic planner.

---

### Task 4: Add a generic report-artifact interpreter and XLSX postprocess path

**Files:**
- Create: `src/compat/report_artifact.rs`
- Create: `src/compat/report_xlsx_export.rs`
- Modify: `src/compat/direct_skill_runtime.rs`
- Modify: `src/compat/deterministic_submit.rs`
- Create: `tests/report_artifact_postprocess_test.rs`
- Modify: `tests/agent_runtime_test.rs`

- [ ] **Step 1: Write the red tests for generic report-artifact handling**

Add `tests/report_artifact_postprocess_test.rs` and the minimum `tests/agent_runtime_test.rs` extensions needed to prove the platform no longer depends on line-loss-specific Rust export logic:

```rust
#[test]
fn report_artifact_postprocess_exports_xlsx_for_ok_or_partial_scene() {
    let artifact = serde_json::json!({
        "type": "report-artifact",
        "report_name": "tq-lineloss-report",
        "status": "partial",
        "columns": ["ORG_NAME", "LINE_LOSS_RATE"],
        "column_defs": [["ORG_NAME", "供电单位"], ["LINE_LOSS_RATE", "综合线损率(%)"]],
        "rows": [{"ORG_NAME": "国网兰州供电公司", "LINE_LOSS_RATE": "1.23"}],
        "counts": {"rows": 1},
        "partial_reasons": ["report_log_failed"]
    });

    // Argument shapes follow the Step 3 signature: Option<&PostprocessSection> and &Path.
    let outcome = interpret_report_artifact_and_postprocess(&artifact, Some(&report_postprocess_xlsx()), &temp_workspace()).unwrap();
    assert!(outcome.success);
    assert!(outcome.summary.contains("status=partial"));
    assert!(outcome.summary.contains("detail_rows=1"));
    assert!(outcome.summary.contains("export_path="));
}

#[test]
fn report_artifact_postprocess_skips_export_for_blocked_or_error_scene() { /* no xlsx path */ }

#[test]
fn direct_submit_and_scene_submit_share_the_same_report_summary_contract() { /* direct_skill_runtime + deterministic path both use same summary builder */ }
```

- [ ] **Step 2: Run the focused report-artifact tests and verify they fail**

Run:

```bash
cargo test --test report_artifact_postprocess_test -- --nocapture
cargo test --test agent_runtime_test submit_task_treats_partial_report_artifact_as_success_with_warning_summary -- --nocapture
```

Expected: FAIL because the generic interpreter/exporter does not exist yet and deterministic line-loss export is still special-cased.

- [ ] **Step 3: Implement the shared parser, summary builder, and generic XLSX exporter**

Implement a reusable path that both deterministic scenes and configured direct-submit skills can call:

```rust
pub struct ParsedReportArtifact {
    pub report_name: String,
    pub status: String,
    pub columns: Vec<String>,
    pub column_defs: Vec<(String, String)>,
    pub rows: Vec<Map<String, Value>>,
    pub counts: ReportCounts,
    pub partial_reasons: Vec<String>,
}

pub fn interpret_report_artifact_and_postprocess(
    artifact_json: &Value,
    postprocess: Option<&PostprocessSection>,
    workspace_root: &Path,
) -> Result<DirectSubmitOutcome, PipeError> {
    // parse report-artifact generically
    // map ok/partial/empty => success=true
    // map blocked/error => success=false
    // if postprocess.exporter == Some("xlsx_report") and status is exportable, write xlsx under workspace_root/out
    // if postprocess.auto_open == Some("excel"), reuse existing open-export helper
}
```

Rules:
- export logic must read `column_defs` when present, else fall back to `columns`
- do not keep line-loss-only column-name assumptions in Rust
- keep direct-submit behavior unchanged for non-artifact string outputs
- keep `blocked` / `error` as failures even when the artifact still contains rows
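
Two of these rules are easy to show in isolation. The sketch below is illustrative only: `StatusClass`, `classify_status`, and `export_headers` are hypothetical names, and treating a status that appears in neither manifest list as `Unknown` is an assumption about how unexpected values should surface.

```rust
#[derive(Debug, PartialEq)]
pub enum StatusClass {
    Success,
    Failure,
    Unknown,
}

/// Classifies an artifact status against the manifest's status lists.
/// The failure list is checked first, so `blocked` / `error` can never be
/// promoted to success, whatever else the artifact contains.
pub fn classify_status(status: &str, success_status: &[&str], failure_status: &[&str]) -> StatusClass {
    if failure_status.iter().any(|s| *s == status) {
        StatusClass::Failure
    } else if success_status.iter().any(|s| *s == status) {
        StatusClass::Success
    } else {
        StatusClass::Unknown
    }
}

/// Returns `(key, header)` pairs for the export: `column_defs` when present,
/// otherwise each entry of `columns` used as both key and header text.
pub fn export_headers(column_defs: &[(String, String)], columns: &[String]) -> Vec<(String, String)> {
    if !column_defs.is_empty() {
        column_defs.to_vec()
    } else {
        columns.iter().map(|c| (c.clone(), c.clone())).collect()
    }
}
```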

- [ ] **Step 4: Re-run the focused tests and verify they pass**

Run:

```bash
cargo test --test report_artifact_postprocess_test -- --nocapture
cargo test --test agent_runtime_test submit_task_treats_partial_report_artifact_as_success_with_warning_summary -- --nocapture
cargo test --test agent_runtime_test submit_task_treats_blocked_report_artifact_as_failure -- --nocapture
```

Expected: PASS.

- [ ] **Step 5: Commit the generic artifact/postprocess slice**

Run:

```bash
git add src/compat/report_artifact.rs src/compat/report_xlsx_export.rs src/compat/direct_skill_runtime.rs src/compat/deterministic_submit.rs tests/report_artifact_postprocess_test.rs tests/agent_runtime_test.rs
git commit -m "refactor: share generic report artifact postprocess"
```

Expected: one commit that removes the need for per-scene Rust export logic.

---

### Task 5: Wire manifest-driven scenes into submit and bootstrap without regressing other flows

**Files:**
- Modify: `src/agent/task_runner.rs`
- Modify: `src/service/server.rs`
- Modify: `tests/service_task_flow_test.rs`
- Modify: `tests/service_ws_session_test.rs`
- Modify: `tests/agent_runtime_test.rs`

- [ ] **Step 1: Add the failing submit/bootstrap regression tests**

Add focused tests that lock branch order and bootstrap behavior:

```rust
#[test]
fn submit_task_routes_suffix_instruction_through_manifest_scene_before_llm() {
    // no provider call should happen when deterministic scene planning succeeds or prompts
}

#[test]
fn resolve_submit_bootstrap_target_prefers_manifest_scene_target_for_deterministic_scene() {
    let request = SubmitTaskRequest {
        instruction: "兰州公司 台区线损大数据 月累计线损率统计分析 2026-03。。。".to_string(),
        conversation_id: None,
        messages: vec![],
        page_url: None,
        page_title: None,
    };
    let target = resolve_submit_bootstrap_target(&request, workspace_root, &settings);
    assert_eq!(target.request_url, "http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor");
    assert_eq!(target.expected_domain.as_deref(), Some("20.76.57.61"));
}

#[test]
fn zhihu_without_suffix_keeps_existing_non_scene_path() { /* ordinary path unchanged */ }
```

For the browser-ws/callback-host path, add one regression in `tests/service_ws_session_test.rs` proving the first bootstrap/open target comes from `scene.toml` when a deterministic scene plan exists.

- [ ] **Step 2: Run the focused integration tests and verify they fail**

Run:

```bash
cargo test --test service_task_flow_test -- --nocapture
cargo test --test service_ws_session_test callback_host -- --nocapture
```

Expected: FAIL because the submit/bootstrap path still depends on the old deterministic line-loss branch shape.

- [ ] **Step 3: Implement the minimal wiring changes only where the branch already exists**

Implementation targets:
- keep the current submit branch order in `src/agent/task_runner.rs`
- keep `resolve_submit_bootstrap_target(...)` precedence in `src/service/server.rs`
- replace the old hardcoded deterministic plan source with the new manifest-backed planner
- keep configured `directSubmitSkill` and ordinary LLM/browser orchestration behavior untouched

The resulting branch order must still be:

```rust
// 1. registry-backed deterministic scene (exact suffix only)
// 2. ordinary primary orchestration path
// 3. configured directSubmitSkill
// 4. compat LLM/runtime path
```

- [ ] **Step 4: Re-run the focused integration tests and verify they pass**

Run:

```bash
cargo test --test service_task_flow_test -- --nocapture
cargo test --test service_ws_session_test callback_host -- --nocapture
cargo test --test agent_runtime_test -- --nocapture
```

Expected: PASS, with no regression to the ordinary direct-submit or Zhihu paths.

- [ ] **Step 5: Commit the submit/bootstrap integration slice**

Run:

```bash
git add src/agent/task_runner.rs src/service/server.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs tests/agent_runtime_test.rs
git commit -m "refactor: wire manifest scenes into submit bootstrap"
```

Expected: one commit that changes wiring only at the existing seams.

---

### Task 6: Add the first manifest-driven `tq-lineloss-report` sample package inside this repo

**Files:**
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/scene.toml`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/references/org-dictionary.json`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/SKILL.toml`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/SKILL.md`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.js`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.test.js`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/references/data-quality.md`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/references/collection-flow.md`
- Modify: `tests/deterministic_submit_test.rs`
- Modify: `tests/scene_registry_test.rs`

- [ ] **Step 1: Add the failing line-loss manifest and runtime-contract checks**

Create the `scene.toml` shape in the in-repo sample package first and lock the migration expectations:

```toml
[scene]
id = "tq-lineloss-report"
skill = "tq-lineloss-report"
tool = "collect_lineloss"
kind = "browser_script"
version = "0.1.0"
category = "report_collection"

[manifest]
schema_version = "1"

[bootstrap]
expected_domain = "20.76.57.61"
target_url = "http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor"
page_title_keywords = ["线损"]
requires_target_page = true

[deterministic]
suffix = "。。。"
include_keywords = ["线损", "月累计", "周累计", "统计分析"]
exclude_keywords = ["知乎"]

[[params]]
name = "org"
resolver = "dictionary_entity"
required = true
prompt_missing = "已命中台区线损报表技能，但缺少供电单位。"
prompt_ambiguous = "已命中台区线损报表技能，但供电单位存在歧义，请补充更完整名称。"

[params.resolver_config]
dictionary_ref = "references/org-dictionary.json"
output_label_field = "org_label"
output_code_field = "org_code"

[[params]]
name = "period"
resolver = "month_week_period"
required = true
prompt_missing = "已命中台区线损报表技能，但缺少统计周期。"
prompt_ambiguous = "已命中台区线损报表技能，但统计周期存在歧义，请补充更明确表达。"

[artifact]
type = "report-artifact"
success_status = ["ok", "partial", "empty"]
failure_status = ["blocked", "error"]

[postprocess]
exporter = "xlsx_report"
auto_open = "excel"
```

Also add a red JS assertion in the committed sample package proving the script returns `column_defs` and never re-parses raw natural-language org/period text:

```javascript
test('buildBrowserEntrypointResult keeps canonical args and generic export fields only', async () => {
  const artifact = await buildBrowserEntrypointResult({
    expected_domain: '20.76.57.61',
    org_label: '国网兰州供电公司',
    org_code: '62401',
    period_mode: 'month',
    period_mode_code: '1',
    period_value: '2026-03',
    period_payload: { fdate: '2026-03' },
    instruction: '兰州公司 月累计 2026-03'
  }, fakeDeps);

  assert.equal(artifact.org.code, '62401');
  assert.ok(Array.isArray(artifact.column_defs));
  assert.equal(JSON.stringify(artifact).includes('兰州公司 月累计 2026-03'), false);
});
```

- [ ] **Step 2: Run the targeted line-loss tests and verify they fail**

Run:

```bash
cargo test --test deterministic_submit_test -- --nocapture
node "examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
```

Expected: FAIL because the runtime is not yet manifest-driven and the committed sample package does not yet expose the final manifest/dictionary/export contract.

- [ ] **Step 3: Implement the sample-scene migration without adding per-scene Rust branches**

Required actions:
- add `scene.toml` under the in-repo sample skill root and use the same layout the generator will emit
- make tests and service-smoke config resolve `skillsDir` to `examples/generated_scene_platform` so the registry can discover the committed sample package without any external repo copy step
- export the current org unit data into `references/org-dictionary.json` and make the resolver read that file instead of a Rust hardcoded list
- update `collect_lineloss.js` so the returned `report-artifact` includes generic-platform fields needed by `report_xlsx_export.rs`
- keep collection logic in JS; do **not** move line-loss business semantics back into Rust
- write `SKILL.toml` / `SKILL.md` / references docs into the sample package to describe canonical args and the manifest-driven contract
- keep any external staging-repo publish step out of scope for this branch; this task only commits the in-repo sample package
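
The dictionary-backed org resolution required by the actions above could look roughly like the following. This is a minimal sketch, not the actual resolver: `OrgEntry`, its `aliases` field, `OrgResolution`, and `resolve_org` are hypothetical names, the real `references/org-dictionary.json` shape may differ, and JSON loading is left out so the example stays dependency-free.

```rust
/// Hypothetical in-memory form of one `references/org-dictionary.json` entry.
/// The `aliases` field is an assumption made for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct OrgEntry {
    pub label: String,
    pub code: String,
    pub aliases: Vec<String>,
}

/// Outcome of resolving the `org` param. The `Missing` and `Ambiguous`
/// variants correspond to the manifest's `prompt_missing` and
/// `prompt_ambiguous` messages.
#[derive(Debug, PartialEq)]
pub enum OrgResolution {
    Resolved { org_label: String, org_code: String },
    Missing,
    Ambiguous(Vec<String>),
}

/// Assumption: an entry matches when the request text contains its label
/// or any of its aliases. Zero matches is "missing", more than one is
/// "ambiguous"; neither case ever falls back to a default org.
pub fn resolve_org(text: &str, dictionary: &[OrgEntry]) -> OrgResolution {
    let hits: Vec<&OrgEntry> = dictionary
        .iter()
        .filter(|entry| {
            text.contains(entry.label.as_str())
                || entry.aliases.iter().any(|alias| text.contains(alias.as_str()))
        })
        .collect();

    match hits.as_slice() {
        [] => OrgResolution::Missing,
        [only] => OrgResolution::Resolved {
            org_label: only.label.clone(),
            org_code: only.code.clone(),
        },
        many => OrgResolution::Ambiguous(many.iter().map(|e| e.label.clone()).collect()),
    }
}
```

Because the dictionary is passed in as data, adding or correcting an org unit would only touch the JSON file in the skill package, which is the point of moving the list out of Rust.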

- [ ] **Step 4: Re-run the line-loss tests and verify they pass**

Run:

```bash
cargo test --test deterministic_submit_test -- --nocapture
node "examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
```

Expected: PASS, including the new missing-period prompt behavior and the new manifest-driven sample-package shape.
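
The missing-period behavior checked here can be illustrated with a small explicit-period resolver. This is a sketch under stated assumptions: `PeriodResolution` and `resolve_explicit_month` are hypothetical names, only the `YYYY-MM` month form seen in the sample test (`2026-03`) is handled, and week periods are deliberately left out because their format is not specified in this plan.

```rust
/// Result of looking for an explicit month period in the request text.
#[derive(Debug, PartialEq)]
pub enum PeriodResolution {
    /// Canonical month value, e.g. `period_value = "2026-03"`.
    Month { period_value: String },
    /// No explicit period found: the runtime should prompt, never default.
    Missing,
}

/// Finds the first `YYYY-MM` token whose month is in 01..=12.
fn find_year_month(text: &str) -> Option<String> {
    let chars: Vec<char> = text.chars().collect();
    chars.windows(7).find_map(|w| {
        // Shape check: four digits, a dash, two digits.
        let shaped = w
            .iter()
            .enumerate()
            .all(|(i, c)| if i == 4 { *c == '-' } else { c.is_ascii_digit() });
        if !shaped {
            return None;
        }
        let month: u32 = w[5..].iter().collect::<String>().parse().ok()?;
        if (1..=12).contains(&month) {
            Some(w.iter().collect::<String>())
        } else {
            None
        }
    })
}

/// Resolves an explicit month period, or reports that it is missing.
pub fn resolve_explicit_month(text: &str) -> PeriodResolution {
    match find_year_month(text) {
        Some(period_value) => PeriodResolution::Month { period_value },
        None => PeriodResolution::Missing,
    }
}
```

The important property is that `Missing` is a first-class outcome: nothing in the resolver substitutes the current month or a page default.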

- [ ] **Step 5: Commit the line-loss sample migration**

Run:

```bash
git add examples/generated_scene_platform/skills/tq-lineloss-report tests/deterministic_submit_test.rs tests/scene_registry_test.rs
git commit -m "feat: add manifest-driven lineloss sample package"
```

Expected: one commit that adds the first committed manifest-driven sample package and updates runtime expectations around it.

---

### Task 7: Write the required `tq-lineloss` lessons-learned artifacts and load them as generator rules

**Files:**
- Create: `docs/superpowers/references/tq-lineloss-lessons-learned.md`
- Create: `docs/superpowers/references/tq-lineloss-lessons-learned.toml`
- Create: `tests/generated_scene_lessons_test.rs`
- Create: `src/generated_scene/mod.rs`
- Create: `src/generated_scene/lessons.rs`
- Modify: `src/lib.rs`

- [ ] **Step 1: Write the failing lessons-rules test before the docs**

Add `tests/generated_scene_lessons_test.rs` that requires all mandatory structured rule sections to exist. In the same red step, wire the empty `src/generated_scene/mod.rs` and `src/lib.rs` exports needed so this test fails on missing implementation/data, not on missing module visibility:

```rust
#[test]
fn lineloss_lessons_toml_declares_required_generator_rules() {
    let lessons = load_generation_lessons("docs/superpowers/references/tq-lineloss-lessons-learned.toml").unwrap();

    assert!(lessons.routing.require_exact_suffix);
    assert!(lessons.routing.unsupported_scene_fail_closed);
    assert!(lessons.canonical_params.require_explicit_period);
    assert!(lessons.bootstrap.require_expected_domain);
    assert!(lessons.bootstrap.require_target_url);
    assert!(lessons.artifact.require_report_artifact);
    assert!(lessons.validation.require_pipe_and_ws_checks);
    assert!(lessons.validation.require_manual_service_console_smoke);
}
```

- [ ] **Step 2: Run the lessons test and verify it fails**

Run:

```bash
cargo test --test generated_scene_lessons_test -- --nocapture
```

Expected: FAIL because the lessons loader and TOML file do not exist yet.

- [ ] **Step 3: Implement the loader and write both lessons artifacts**

Implement the loader and complete the minimal module wiring (`src/generated_scene/mod.rs`, `src/lib.rs`) in this task so `cargo test --test generated_scene_lessons_test` is buildable before Task 8. Use a TOML shape explicit enough for generator enforcement, for example:

```toml
[routing]
require_exact_suffix = true
unsupported_scene_fail_closed = true
ambiguity_fail_closed = true

[canonical_params]
require_dictionary_entity_for_org = true
require_explicit_period = true
forbid_hidden_page_defaults = true

[bootstrap]
require_expected_domain = true
require_target_url = true
prefer_page_context_when_present = true

[artifact]
require_report_artifact = true
require_column_defs_for_export = true
rust_side_xlsx_export_when_postprocess_xlsx = true

[validation]
require_pipe_and_ws_checks = true
require_manual_service_console_smoke = true
require_callback_host_timeout_notes = true
```

The Markdown companion must explain the why behind those rules: deterministic routing pitfalls, canonical parameter pitfalls, bootstrap target pitfalls, pipe/ws differences, callback-host timeout lessons, and Rust-side export constraints.
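
To make the "required rules" idea concrete, here is a dependency-free sketch of the check the lessons test performs. It is not the planned loader: the real `load_generation_lessons` would presumably deserialize with `toml` and `serde` into typed structs. `REQUIRED_RULES`, `enabled_rules`, and `missing_required_rules` are hypothetical names, and the parser only understands the flat `[section]` plus boolean subset shown in the example TOML.

```rust
use std::collections::HashSet;

/// Rule flags asserted by `lineloss_lessons_toml_declares_required_generator_rules`.
pub const REQUIRED_RULES: [&str; 8] = [
    "routing.require_exact_suffix",
    "routing.unsupported_scene_fail_closed",
    "canonical_params.require_explicit_period",
    "bootstrap.require_expected_domain",
    "bootstrap.require_target_url",
    "artifact.require_report_artifact",
    "validation.require_pipe_and_ws_checks",
    "validation.require_manual_service_console_smoke",
];

/// Collects `section.key` for every `key = true` line.
/// Handles only flat `[section]` headers and boolean values, not general TOML.
pub fn enabled_rules(toml_text: &str) -> HashSet<String> {
    let mut section = String::new();
    let mut enabled = HashSet::new();
    for raw in toml_text.lines() {
        let line = raw.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some(name) = line.strip_prefix('[').and_then(|rest| rest.strip_suffix(']')) {
            section = name.trim().to_string();
        } else if let Some((key, value)) = line.split_once('=') {
            if value.trim() == "true" {
                enabled.insert(format!("{}.{}", section, key.trim()));
            }
        }
    }
    enabled
}

/// Returns the required rules that are absent or not set to `true`.
pub fn missing_required_rules(toml_text: &str) -> Vec<&'static str> {
    let enabled = enabled_rules(toml_text);
    REQUIRED_RULES
        .iter()
        .copied()
        .filter(|rule| !enabled.contains(*rule))
        .collect()
}
```

A generator that treats a non-empty `missing_required_rules` result as a hard error would refuse to emit packages from a weakened lessons file, which matches the intent of making the lessons a required input.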

- [ ] **Step 4: Re-run the lessons tests and verify they pass**

Run:

```bash
cargo test --test generated_scene_lessons_test -- --nocapture
```

Expected: PASS.

- [ ] **Step 5: Commit the lessons artifacts and loader**

Run:

```bash
git add docs/superpowers/references/tq-lineloss-lessons-learned.md docs/superpowers/references/tq-lineloss-lessons-learned.toml src/generated_scene/mod.rs src/generated_scene/lessons.rs src/lib.rs tests/generated_scene_lessons_test.rs
git commit -m "docs: add lineloss generation lessons"
```

Expected: one commit that makes the line-loss lessons machine-consumable and reviewable.

---

### Task 8: Build the v1 source analyzer, package generator, and CLI entry

**Files:**
- Create: `src/generated_scene/analyzer.rs`
- Create: `src/generated_scene/generator.rs`
- Create: `src/bin/sg_scene_generate.rs`
- Modify: `src/generated_scene/mod.rs`
- Modify: `src/lib.rs`
- Create: `tests/scene_generator_test.rs`
- Create: `tests/fixtures/generated_scene/report_collection/index.html`
- Create: `tests/fixtures/generated_scene/report_collection/js/report.js`
- Create: `tests/fixtures/generated_scene/non_report/index.html`
- Create: `tests/fixtures/scene_source/tq_lineloss/index.html`
- Create: `tests/fixtures/scene_source/tq_lineloss/js/collect.js`

- [ ] **Step 1: Add the failing analyzer/generator tests with hermetic fixtures**

Create fixture-backed tests like:

```rust
#[test]
fn analyzer_classifies_supported_report_collection_source() {
    let analysis = analyze_scene_source(Path::new("tests/fixtures/generated_scene/report_collection")).unwrap();
    assert_eq!(analysis.scene_kind, SceneKind::ReportCollection);
    assert_eq!(analysis.tool_kind, ToolKind::BrowserScript);
    assert!(analysis.bootstrap.target_url.is_some());
    assert!(analysis.collection_entry_script.is_some());
}

#[test]
fn generator_writes_registration_ready_package_with_scene_toml() {
    let output_root = tempdir();
    generate_scene_package(GenerateSceneRequest {
        source_dir: PathBuf::from("tests/fixtures/generated_scene/report_collection"),
        scene_id: "sample-report-scene".to_string(),
        scene_name: "示例报表场景".to_string(),
        output_root: output_root.path().to_path_buf(),
        lessons_path: PathBuf::from("docs/superpowers/references/tq-lineloss-lessons-learned.toml"),
    }).unwrap();

    assert!(output_root.path().join("skills/sample-report-scene/SKILL.toml").exists());
    assert!(output_root.path().join("skills/sample-report-scene/scene.toml").exists());
    assert!(output_root.path().join("skills/sample-report-scene/scripts/collect_sample_report_scene.js").exists());
    assert!(output_root.path().join("skills/sample-report-scene/scripts/collect_sample_report_scene.test.js").exists());
}

#[test]
fn generator_rejects_non_report_source_with_explicit_reason() {
    let err = analyze_scene_source(Path::new("tests/fixtures/generated_scene/non_report")).unwrap_err();
    assert!(err.to_string().contains("report/collection browser_script only"));
}
```

- [ ] **Step 2: Run the generator tests and verify they fail**

Run:

```bash
cargo test --test scene_generator_test -- --nocapture
```

Expected: FAIL because the analyzer, generator, fixtures, and CLI do not exist yet.

- [ ] **Step 3: Implement the analyzer, generator, CLI, and the source fixtures used by final smoke**

Implementation rules:
- create the generator test fixtures under `tests/fixtures/generated_scene/*`
- create the hermetic source-smoke fixtures under `tests/fixtures/scene_source/tq_lineloss/*` so Task 9 can run without any external scenario directory
- analyzer must refuse unsupported/non-report scenes explicitly instead of generating broken packages
- generator must emit `scene.toml` inside the generated skill root
- generator must use `tq-lineloss-lessons-learned.toml` as a required input so the same hardening rules apply to future scenes
- generator/runtime coupling must stay at the file-contract level only
- CLI should use an explicit parser, no new heavy dependency

Suggested CLI shape:

```bash
cargo run --bin sg_scene_generate -- \
  --source-dir <scenario-dir> \
  --scene-id <scene-id> \
  --scene-name <display-name> \
  --output-root <skill-staging-root> \
  --lessons docs/superpowers/references/tq-lineloss-lessons-learned.toml
```
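
One way to satisfy the "explicit parser, no new heavy dependency" rule for the flags above is a small hand-rolled loop over the argument list. This is a sketch only: `CliArgs`, `parse_args`, and `require` are hypothetical names, and the error wording is illustrative.

```rust
use std::path::PathBuf;

/// Parsed form of the suggested `sg_scene_generate` flags.
#[derive(Debug, PartialEq)]
pub struct CliArgs {
    pub source_dir: PathBuf,
    pub scene_id: String,
    pub scene_name: String,
    pub output_root: PathBuf,
    pub lessons: PathBuf,
}

/// Turns a missing flag into a readable error.
fn require(value: Option<String>, flag: &str) -> Result<String, String> {
    value.ok_or_else(|| format!("missing required flag {}", flag))
}

/// Parses `--flag value` pairs; unknown flags and dangling flags are errors.
pub fn parse_args<I>(args: I) -> Result<CliArgs, String>
where
    I: IntoIterator<Item = String>,
{
    let mut source_dir: Option<String> = None;
    let mut scene_id: Option<String> = None;
    let mut scene_name: Option<String> = None;
    let mut output_root: Option<String> = None;
    let mut lessons: Option<String> = None;

    let mut iter = args.into_iter();
    while let Some(flag) = iter.next() {
        // Pick the slot this flag fills, rejecting anything unrecognized.
        let slot = match flag.as_str() {
            "--source-dir" => &mut source_dir,
            "--scene-id" => &mut scene_id,
            "--scene-name" => &mut scene_name,
            "--output-root" => &mut output_root,
            "--lessons" => &mut lessons,
            other => return Err(format!("unknown argument: {}", other)),
        };
        let value = iter
            .next()
            .ok_or_else(|| format!("missing value for {}", flag))?;
        *slot = Some(value);
    }

    Ok(CliArgs {
        source_dir: PathBuf::from(require(source_dir, "--source-dir")?),
        scene_id: require(scene_id, "--scene-id")?,
        scene_name: require(scene_name, "--scene-name")?,
        output_root: PathBuf::from(require(output_root, "--output-root")?),
        lessons: PathBuf::from(require(lessons, "--lessons")?),
    })
}
```

In the binary this would typically be called as `parse_args(std::env::args().skip(1))`, with an `Err` printed to stderr followed by a non-zero exit.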

Expected outputs under `<output-root>`:
- `skills/<scene-id>/SKILL.toml`
- `skills/<scene-id>/SKILL.md`
- `skills/<scene-id>/scene.toml`
- `skills/<scene-id>/references/*.md`
- `skills/<scene-id>/scripts/*.js`
- `skills/<scene-id>/scripts/*.test.js`
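
The fixed-name part of that layout can be expressed as a small completeness check. This is a sketch, assuming the `collect_<scene_id with underscores>` script naming that the generator test in this task uses for `sample-report-scene`; the hand-migrated sample keeps `collect_lineloss.js`, so that naming is not universal. `expected_package_files` and `missing_package_files` are hypothetical helpers, and the `references/*.md` glob is not checked.

```rust
use std::path::{Path, PathBuf};

/// Lists the fixed-name files a generated package is expected to contain.
/// Assumption: script names follow `collect_<scene_id with '-' replaced by '_'>`.
pub fn expected_package_files(output_root: &Path, scene_id: &str) -> Vec<PathBuf> {
    let skill_root = output_root.join("skills").join(scene_id);
    let script_stem = format!("collect_{}", scene_id.replace('-', "_"));
    vec![
        skill_root.join("SKILL.toml"),
        skill_root.join("SKILL.md"),
        skill_root.join("scene.toml"),
        skill_root.join("scripts").join(format!("{}.js", script_stem)),
        skill_root.join("scripts").join(format!("{}.test.js", script_stem)),
    ]
}

/// Returns the expected files that do not exist on disk.
pub fn missing_package_files(output_root: &Path, scene_id: &str) -> Vec<PathBuf> {
    expected_package_files(output_root, scene_id)
        .into_iter()
        .filter(|path| !path.exists())
        .collect()
}
```

A check like this could back both the generator test and the hermetic smoke in Task 9, so "complete package" means the same thing in both places.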

- [ ] **Step 4: Re-run the generator tests and verify they pass**

Run:

```bash
cargo test --test scene_generator_test -- --nocapture
```

Expected: PASS.

- [ ] **Step 5: Commit the generator slice**

Run:

```bash
git add src/lib.rs src/generated_scene/mod.rs src/generated_scene/analyzer.rs src/generated_scene/generator.rs src/bin/sg_scene_generate.rs tests/scene_generator_test.rs tests/fixtures/generated_scene tests/fixtures/scene_source/tq_lineloss
git commit -m "feat: add generated scene package generator"
```

Expected: one commit that adds the in-repo v1 generator capability.

---

### Task 9: Run the final verification sweep, smoke the real runtime, and remove unused one-off scene code

**Files:**
- Delete if unused after green verification: `src/compat/tq_lineloss/org_units.rs`
- Delete if unused after green verification: `src/compat/tq_lineloss/org_resolver.rs`
- Delete if unused after green verification: `src/compat/tq_lineloss/period_resolver.rs`
- Delete or reduce to shim only if unused after green verification: `src/compat/lineloss_xlsx_export.rs`
- Modify: `src/compat/mod.rs`
- Modify: `src/lib.rs`

- [ ] **Step 1: Remove only the legacy one-off files that are provably unused**

Before deleting anything, prove the new path covers the old responsibilities:

```bash
cargo test --test deterministic_submit_test -- --nocapture
cargo test --test scene_registry_test -- --nocapture
cargo test --test report_artifact_postprocess_test -- --nocapture
```

Then delete the old line-loss-only resolver/export files only if the tests above stay green and a repo-wide search (`Grep`) finds no remaining references to them.

- [ ] **Step 2: Run the full automated verification sweep**

Run:

```bash
node "examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
cargo test --test scene_registry_test -- --nocapture
cargo test --test deterministic_submit_test -- --nocapture
cargo test --test report_artifact_postprocess_test -- --nocapture
cargo test --test generated_scene_lessons_test -- --nocapture
cargo test --test scene_generator_test -- --nocapture
cargo test --test agent_runtime_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test service_ws_session_test callback_host -- --nocapture
cargo test --test compat_runtime_test -- --nocapture
cargo test --test compat_config_test -- --nocapture
cargo build --bin sgclaw --bin sg_claw --bin sg_scene_generate
```

Expected: PASS.

- [ ] **Step 3: Run the required hermetic generator smoke and keep the real external source smoke optional**

Run the required in-repo smoke first:

```bash
tmp_out="$(mktemp -d)"
cargo run --bin sg_scene_generate -- \
  --source-dir tests/fixtures/scene_source/tq_lineloss \
  --scene-id tq-lineloss-report \
  --scene-name "台区线损月周累计线损率统计分析" \
  --output-root "$tmp_out" \
  --lessons docs/superpowers/references/tq-lineloss-lessons-learned.toml
```

Expected: generator emits a complete package into `$tmp_out` using only in-repo fixtures.

Optional manual follow-up after the required smoke is green:
- if the external scenario directory is available on the implementer's machine, re-run the same command against the real source tree for additional confidence
- if it is unavailable, do **not** block the branch on that machine-specific path

- [ ] **Step 4: Run the real service-console smoke checks with `sg_claw.exe` semantics in mind**

Manual verification checklist:
- write or reuse a repo-local `sgclaw_config.json` whose `skillsDir` points to `examples/generated_scene_platform`
- rebuild and run `sg_claw`/`sg_claw.exe` with that config so the runtime-scanned skills root is reproducible
- on the real line-loss page, submit `兰州公司 台区线损大数据 月累计线损率统计分析 2026-03。。。`
- confirm the request bootstraps the manifest `target_url`, uses the manifest `expected_domain`, and returns the line-loss report artifact through the generic scene runtime
- submit `兰州公司 台区线损大数据 月累计线损率统计分析。。。` and confirm the runtime prompts for missing period instead of defaulting
- submit `打开知乎热榜` and confirm the ordinary Zhihu path still behaves as before
- submit `打开知乎热榜。。。` and confirm the deterministic runtime fails closed with the unsupported-scene prompt instead of falling into the Zhihu path

- [ ] **Step 5: Commit the cleanup + verified platform state**

Run:

```bash
git add src/compat/mod.rs src/lib.rs src/compat src/generated_scene src/scene_contract docs/superpowers/references tests examples/generated_scene_platform
git commit -m "feat: add generated scene skill platform"
```

Expected: one final commit after the full automated and manual verification passes.

---

## Verification Checklist

### Registry and manifest contract

```bash
cargo test --test scene_registry_test -- --nocapture
```

Expected:
- `scene.toml` loads from the skill root
- only `schema_version = "1"` passes
- duplicate `scene.id` fails with both manifest paths in the error
- non-`browser_script` or non-`report_collection` v1 scenes are rejected cleanly
- the registry still scans exactly one resolved `skillsDir`
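
The registry expectations above can be sketched as a validation pass over already-parsed manifest headers. This is an illustration only: `ManifestHeader` and `validate_manifests` are hypothetical names, TOML parsing and directory scanning are omitted, and the error wording is not taken from the real implementation.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical subset of a parsed `scene.toml` that the registry validates.
#[derive(Debug, Clone)]
pub struct ManifestHeader {
    pub path: PathBuf,
    pub scene_id: String,
    pub schema_version: String,
    pub kind: String,
    pub category: String,
}

/// Rejects unsupported schema versions, non-v1 scene kinds, and duplicate ids.
/// A duplicate-id error names both manifest paths.
pub fn validate_manifests(manifests: &[ManifestHeader]) -> Result<(), String> {
    let mut seen: HashMap<&str, &Path> = HashMap::new();
    for m in manifests {
        if m.schema_version != "1" {
            return Err(format!(
                "{}: unsupported schema_version {:?}",
                m.path.display(),
                m.schema_version
            ));
        }
        if m.kind != "browser_script" || m.category != "report_collection" {
            return Err(format!(
                "{}: v1 supports report_collection browser_script scenes only",
                m.path.display()
            ));
        }
        // `insert` returns the previously stored path when the id was already seen.
        if let Some(first) = seen.insert(m.scene_id.as_str(), m.path.as_path()) {
            return Err(format!(
                "duplicate scene id {:?}: {} and {}",
                m.scene_id,
                first.display(),
                m.path.display()
            ));
        }
    }
    Ok(())
}
```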

### Deterministic routing contract

```bash
cargo test --test deterministic_submit_test -- --nocapture
```

Expected:
- only an exact `。。。` suffix triggers deterministic routing
- no-suffix behavior is unchanged
- unsupported suffix-scene requests fail closed
- multi-match ambiguity fails closed
- a missing org, mode, or period produces a prompt instead of a silent default
- page context may improve scoring but cannot cause silent guessing on unresolved ambiguity
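
The routing outcomes in this checklist can be summarized as a four-way decision. The sketch below makes several assumptions that the plan does not pin down: a scene is a candidate when the request body contains at least one `include_keywords` entry and no `exclude_keywords` entry, "exact suffix" means the instruction ends with `。。。` with no trailing characters, and page-context scoring is ignored. `SceneRule`, `Route`, and `route` are hypothetical names.

```rust
/// Suffix that marks a deterministic request.
pub const DETERMINISTIC_SUFFIX: &str = "。。。";

/// Hypothetical subset of a manifest's `[deterministic]` section.
pub struct SceneRule {
    pub scene_id: String,
    pub include_keywords: Vec<String>,
    pub exclude_keywords: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum Route {
    /// No suffix: the ordinary (non-deterministic) path handles the request.
    NotDeterministic,
    /// Exactly one scene matched.
    Scene(String),
    /// Suffix present but no scene matched: fail closed.
    Unsupported,
    /// Suffix present and several scenes matched: fail closed.
    Ambiguous(Vec<String>),
}

/// Assumption: any include keyword hit plus zero exclude hits makes a candidate.
pub fn route(instruction: &str, scenes: &[SceneRule]) -> Route {
    let body = match instruction.strip_suffix(DETERMINISTIC_SUFFIX) {
        Some(body) => body,
        None => return Route::NotDeterministic,
    };

    let matched: Vec<&SceneRule> = scenes
        .iter()
        .filter(|scene| {
            scene.include_keywords.iter().any(|k| body.contains(k.as_str()))
                && !scene.exclude_keywords.iter().any(|k| body.contains(k.as_str()))
        })
        .collect();

    match matched.as_slice() {
        [] => Route::Unsupported,
        [only] => Route::Scene(only.scene_id.clone()),
        many => Route::Ambiguous(many.iter().map(|s| s.scene_id.clone()).collect()),
    }
}
```

Keeping `Unsupported` and `Ambiguous` as explicit variants makes the fail-closed cases impossible to confuse with the no-suffix path, which must keep its existing behavior.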

### Generic report-artifact handling

```bash
cargo test --test report_artifact_postprocess_test -- --nocapture
cargo test --test agent_runtime_test -- --nocapture
```

Expected:
- `ok` / `partial` / `empty` map to success
- `blocked` / `error` map to failure
- generic XLSX export works from artifact fields, not line-loss-only Rust code
- configured `directSubmitSkill` keeps working on the shared artifact interpreter
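
The status mapping above is driven by the manifest's `[artifact]` lists rather than by scene-specific code. A minimal sketch follows, assuming that a status found in neither list is treated as a failure (a fail-closed choice that this plan does not state explicitly); `ArtifactOutcome` and `classify_status` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
pub enum ArtifactOutcome {
    Success,
    Failure,
}

/// Classifies an artifact `status` using the manifest's status lists.
/// Assumption: a status listed in neither list counts as a failure.
pub fn classify_status(
    status: &str,
    success_status: &[&str],
    failure_status: &[&str],
) -> ArtifactOutcome {
    if failure_status.contains(&status) {
        ArtifactOutcome::Failure
    } else if success_status.contains(&status) {
        ArtifactOutcome::Success
    } else {
        // Unknown status: fail closed (assumption).
        ArtifactOutcome::Failure
    }
}
```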

### Service submit/bootstrap path

```bash
cargo test --test service_task_flow_test -- --nocapture
cargo test --test service_ws_session_test callback_host -- --nocapture
```

Expected:
- deterministic manifest scenes route before LLM
- bootstrap target resolution uses manifest `target_url` / `expected_domain`
- callback-host/browser-ws paths still receive the correct request URL
- non-deterministic Zhihu and direct-submit flows remain intact

### Generator and lessons

```bash
cargo test --test generated_scene_lessons_test -- --nocapture
cargo test --test scene_generator_test -- --nocapture
cargo build --bin sg_scene_generate
```

Expected:
- lessons TOML contains all required routing/param/bootstrap/artifact/validation rules
- analyzer only accepts v1 report/collection browser-script fixtures
- generator writes a complete package with `scene.toml` and JS test scaffold
- generator/runtime share only the explicit file contract, not hidden Rust internals

### Real runtime smoke

Manual checklist:
- `sg_claw.exe` / service console can still run the line-loss deterministic path
- missing-period deterministic line-loss requests prompt instead of defaulting
- plain Zhihu requests still avoid the scene platform
- suffixed unsupported requests fail closed
- line-loss export still opens through the generic postprocess path when configured

---

## Notes For The Engineer

- The paired approved spec is `docs/superpowers/specs/2026-04-15-generated-scene-skill-platform-design.md`.
- The current repo branch name for the ws baseline is `feature/claw-ws`, even though the design prose says `ws`.
- Do **not** reintroduce the old scene-registry experiment that was explicitly cleaned off the ws branch. This plan deliberately keeps the new runtime under `compat` with a shared serializable contract rather than reviving the deleted scene-only branch structure.
- Keep `scene.toml` inside each skill package root. The separate `skill_staging/scenes/*/scene.json` tree remains legacy metadata only in this plan.
- Keep the generator extractable by holding the boundary at `scene.toml`, generated package layout, and lessons TOML rules. Avoid runtime code that reaches into generator-only internals.
- If a real scenario directory does not fit the v1 report/collection/browser-script envelope, the analyzer/generator must refuse it explicitly instead of emitting a half-valid package.
- Do **not** add a generic login/session platform here. Capture that need in docs if discovered, but keep it out of this implementation slice.