# Generated Scene Skill Platform Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add a manifest-driven generated-scene platform that discovers staged report/collection `browser_script` scenes, routes deterministic `。。。` requests through generic registry/resolver logic, migrates `tq-lineloss-report` off one-off Rust branches, and ships a first in-repo generator that outputs registration-ready scene packages with minimal or zero per-scene Rust changes.

**Architecture:** Keep the existing submit branch shape in `src/agent/task_runner.rs`, but replace the line-loss-specific deterministic branch with a thin adapter over a generic scene registry, deterministic dispatcher, generic report-artifact interpreter, and generic XLSX postprocess path. Keep the generator separate from runtime internals by making `scene.toml` plus the lessons-learned TOML the only stable generator/runtime contract; generator code lives in its own module and binary, while runtime code stays under the existing `compat` submit/bootstrap seams.

**Tech Stack:** Rust 2021, `serde`, `serde_json`, `toml`, existing `browser_script` runtime and callback-host/browser-backend seams, `node:test` for staged JS, Cargo integration tests, filesystem-based package generation.

---

## Execution Context

- Branch from the repo's current ws baseline branch, which is `feature/claw-ws` in this checkout today. Do **not** implement on that branch directly; create a new feature branch from its HEAD.
- Do **not** create a worktree unless the user explicitly asks. Branch isolation is required; worktree isolation is not.
- Keep `skillsDir` as the existing single resolved path. The new scene registry must scan inside that one resolved skills root instead of adding array-style scene roots or a second config field.
- For this branch's automated tests and real smokes, use a repo-local `skillsDir` override that points at `examples/generated_scene_platform`. That still preserves the single-root contract because the runtime scans one resolved root whose `skills/` child contains the committed sample package.
- Put the new runtime registration manifest at `<skill_root>/scene.toml`. Keep existing `skill_staging/scenes/*/scene.json` files for legacy staging/UI metadata and do **not** move runtime dispatch policy back into `scene.json`.
- Keep every required deliverable for this plan inside the current `claw-new` repo so the branch can be built, tested, and committed independently. The first committed sample package should live under `examples/generated_scene_platform/skills/`; publishing the same package into any external skills/staging repo is a separate follow-up, not part of this branch.
- V1 scope is locked to `category = "report_collection"`, `kind = "browser_script"`, `artifact.type = "report-artifact"`. Unsupported scene types must fail fast instead of partially working.
- Deterministic invocation remains exact-suffix-only: only raw instructions ending with the exact `。。。` suffix enter the scene dispatcher.
- Never use hidden page defaults for required canonical parameters. Missing org, missing month/week mode, or missing period must prompt and stop.
- Do **not** add a generic login/session subsystem in this plan.
- Preserve current non-platform flows: Zhihu/LLM, configured `directSubmitSkill`, and ordinary browser-attached orchestration must remain behaviorally unchanged unless an explicit regression test says otherwise.
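The exact-suffix rule above can be sketched as a small gate. This is a minimal illustration, not the planned implementation: the helper name `strip_deterministic_suffix` is hypothetical, and treating an empty body as non-deterministic is an assumption the plan does not state.

```rust
/// Suffix that marks a deterministic scene request (three ideographic full stops).
const DETERMINISTIC_SUFFIX: &str = "。。。";

/// Returns the instruction body when `raw` ends with the exact suffix,
/// or `None` when the request must stay on the ordinary path.
/// Hypothetical helper: the name and the empty-body policy are assumptions.
pub fn strip_deterministic_suffix(raw: &str) -> Option<&str> {
    // No trimming before the check: the raw instruction must end with the suffix.
    let body = raw.strip_suffix(DETERMINISTIC_SUFFIX)?.trim();
    if body.is_empty() {
        None
    } else {
        Some(body)
    }
}
```

Because the check runs on the raw string, an ASCII `...` ending or trailing whitespace after the suffix does not enter the scene dispatcher.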
## File Map

### Core runtime and contract files

- Create: `src/scene_contract/mod.rs` - shared serializable manifest contract used by both runtime and generator
- Create: `src/scene_contract/manifest.rs` - `scene.toml` schema types, schema-version validation helpers, artifact/postprocess enums
- Create: `src/compat/scene_platform/mod.rs` - exports the registry, dispatch, and resolver units
- Create: `src/compat/scene_platform/registry.rs` - scans the single resolved `skillsDir`, loads `<skill_root>/scene.toml`, validates duplicates and runtime compatibility
- Create: `src/compat/scene_platform/dispatch.rs` - deterministic candidate scoring, ambiguity fail-closed behavior, canonical param resolution, executable scene plan creation
- Create: `src/compat/scene_platform/resolvers.rs` - reusable resolver types for `dictionary_entity`, `month_week_period`, `fixed_enum`, and `literal_passthrough`
- Create: `src/compat/report_artifact.rs` - generic report-artifact parsing, status mapping, summary building, and export-readiness helpers
- Create: `src/compat/report_xlsx_export.rs` - generic XLSX exporter for any `report-artifact` with `column_defs`/`columns` + `rows`
- Modify: `src/lib.rs` - export new shared/runtime/generator modules and any CLI helpers needed by tests
- Modify: `src/compat/mod.rs` - export the new scene-platform and report-artifact modules
- Modify: `src/compat/deterministic_submit.rs` - keep the public API shape, but make it registry/manifest-driven instead of line-loss-hardcoded
- Modify: `src/compat/direct_skill_runtime.rs` - reuse the generic report-artifact interpreter so direct-submit and scene-submit summarize/status-map the same way
- Modify: `src/agent/task_runner.rs` - keep branch order, but call the new registry-backed deterministic planner before ordinary orchestration/LLM
- Modify: `src/service/server.rs` - keep bootstrap precedence shape, but let deterministic plans source `target_url` / `expected_domain` from scene manifests instead of hardcoded constants

### Generator files

- Create: `src/generated_scene/mod.rs` - generator entrypoints shared by tests and CLI
- Create: `src/generated_scene/analyzer.rs` - source directory inspection for v1 report/collection `browser_script` scenes
- Create: `src/generated_scene/generator.rs` - template rendering and package writing into an output staging root
- Create: `src/generated_scene/lessons.rs` - loads and validates `tq-lineloss-lessons-learned.toml` as generation constraints
- Create: `src/bin/sg_scene_generate.rs` - CLI entry for `sgClaw`'s in-repo scene generator capability

### In-repo sample package and reference assets

- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/scene.toml` - first committed manifest-driven sample scene package used by runtime and generator tests in this repo
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/references/org-dictionary.json` - external dictionary data for the `dictionary_entity` resolver fixture
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/SKILL.toml` - committed sample browser-script tool contract aligned with the manifest-driven runtime
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/SKILL.md` - committed sample documentation for canonical args, artifact contract, and runtime expectations
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.js` - committed sample collection script with generic-platform artifact fields
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.test.js` - committed JS contract tests for canonical args and artifact shape
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/references/data-quality.md` - committed sample data-quality notes aligned with manifest-driven output rules
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/references/collection-flow.md` - committed sample bootstrap/collection-flow notes
- Create: `tests/fixtures/scene_source/tq_lineloss/index.html` - hermetic in-repo source fixture for required analyzer/generator smoke coverage
- Create: `tests/fixtures/scene_source/tq_lineloss/js/collect.js` - hermetic in-repo source fixture JS for analyzer/generator smoke coverage

### Repo-local runtime discovery path for validation

- Use `examples/generated_scene_platform` as the repo-local `skillsDir` override root during tests and manual smokes.
- The runtime still scans one resolved root only; it just resolves that root to `examples/generated_scene_platform`, whose `skills/` child contains the committed sample package.
- Add or reuse a tiny repo-local config fixture such as `tmp/generated_scene_platform_sgclaw_config.json` or an equivalent test helper so the validation steps all point at the same reproducible `skillsDir`.
- Do not require external staging repos to make the manifest-driven runtime discoverable during this branch.

### External publish target kept out of scope for this branch

- Do not modify external paths like `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/...` in this plan.
- If the user later wants the generated sample published into that external staging repo, do it as a separate follow-up after this branch is green.
### Platform-reference files

### Tests and fixtures

- Create: `tests/scene_registry_test.rs` - manifest loading, duplicate detection, schema validation, tool compatibility checks
- Create: `tests/report_artifact_postprocess_test.rs` - generic report-artifact parsing and XLSX postprocess coverage
- Create: `tests/generated_scene_lessons_test.rs` - lessons-TOML shape and required-rule coverage
- Create: `tests/scene_generator_test.rs` - analyzer + generator integration coverage using hermetic fixtures
- Create: `tests/fixtures/generated_scene/report_collection/index.html` - supported v1 report-scene fixture
- Create: `tests/fixtures/generated_scene/report_collection/js/report.js` - supported fixture source hints for analyzer tests
- Create: `tests/fixtures/generated_scene/non_report/index.html` - unsupported fixture proving fail-fast behavior
- Modify: `tests/deterministic_submit_test.rs` - migrate from hardcoded line-loss expectations to registry-driven deterministic behavior
- Modify: `tests/agent_runtime_test.rs` - keep direct-submit behavior intact while sharing generic report-artifact summaries
- Modify: `tests/service_task_flow_test.rs` - task-runner/bootstrap regressions for manifest-driven deterministic scenes
- Modify: `tests/service_ws_session_test.rs` - callback-host bootstrap target regression for manifest-driven deterministic submit when the browser-ws path is active

### Legacy files to delete only after green verification proves they are unused

- Delete: `src/compat/tq_lineloss/org_units.rs`
- Delete: `src/compat/tq_lineloss/org_resolver.rs`
- Delete: `src/compat/tq_lineloss/period_resolver.rs`
- Delete or reduce to a compatibility shim only if still needed: `src/compat/lineloss_xlsx_export.rs`

---

### Task 1: Create the implementation branch and lock the layout boundaries

**Files:**
- Verify only

- [ ] **Step 1: Switch to the ws baseline branch and create a new platform branch**

Run:

```bash
git switch feature/claw-ws
git switch -c feature/generated-scene-skill-platform
```

Expected: `git status -sb` shows a clean new branch rooted at the current ws baseline, not `feature/claw-ws` itself.

- [ ] **Step 2: Verify the current single-root skills layout before coding**

Run:

```bash
cargo test --test compat_config_test ws_cleanup_resolves_single_configured_skills_dir -- --nocapture
```

Expected: PASS, proving the repo still uses one resolved `skillsDir` path and the platform work must build on that instead of introducing array-style roots.

- [ ] **Step 3: Write down the two non-negotiable layout decisions in the first registry test scaffold**

The very first red test file (`tests/scene_registry_test.rs`) must assume:

```rust
// runtime manifest location:
let manifest_path = skill_root.join("scene.toml");

// legacy scene.json stays outside runtime dispatch ownership:
assert!(skill_root.join("scene.toml").exists());
assert!(!manifest_path.ends_with("skill_staging/scenes/.../scene.json"));
```

This prevents the implementation from drifting back toward `scene.json` routing or multi-root config.
---

### Task 2: Add the shared `scene.toml` contract and registry loader

**Files:**
- Create: `src/scene_contract/mod.rs`
- Create: `src/scene_contract/manifest.rs`
- Create: `src/compat/scene_platform/mod.rs`
- Create: `src/compat/scene_platform/registry.rs`
- Modify: `src/lib.rs`
- Modify: `src/compat/mod.rs`
- Create: `tests/scene_registry_test.rs`

- [ ] **Step 1: Write the failing registry tests first**

Add `tests/scene_registry_test.rs` with focused red cases like:

```rust
#[test]
fn registry_loads_scene_manifest_from_skill_root() {
    let skill_root = temp_skill_with_scene_manifest(r#"
[scene]
id = "tq-lineloss-report"
skill = "tq-lineloss-report"
tool = "collect_lineloss"
kind = "browser_script"
version = "0.1.0"
category = "report_collection"

[manifest]
schema_version = "1"

[bootstrap]
expected_domain = "20.76.57.61"
target_url = "http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor"
requires_target_page = true

[artifact]
type = "report-artifact"
success_status = ["ok", "partial", "empty"]
failure_status = ["blocked", "error"]
"#);

    let registry = load_scene_registry(skill_root.parent().unwrap()).unwrap();
    assert_eq!(registry.len(), 1);
    assert_eq!(registry[0].manifest.scene.id, "tq-lineloss-report");
}

#[test]
fn registry_rejects_duplicate_scene_ids_with_both_paths_in_error() { /* two skills, same scene.id */ }

#[test]
fn registry_rejects_unknown_manifest_schema_version() { /* schema_version = "999" */ }

#[test]
fn registry_rejects_non_browser_script_scene_tool_in_v1() { /* kind = "shell" should fail */ }

#[test]
fn registry_ignores_skills_without_scene_toml() { /* ordinary skills still load elsewhere */ }
```

- [ ] **Step 2: Run the registry test file and verify it fails**

Run:

```bash
cargo test --test scene_registry_test -- --nocapture
```

Expected: FAIL because `scene.toml` types and registry loading do not exist yet.
- [ ] **Step 3: Implement the serializable manifest contract and the single-root registry loader**

Implement the minimal contract and loader needed to satisfy the tests:

```rust
use std::path::{Path, PathBuf};

use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct SceneManifest {
    pub scene: SceneSection,
    pub manifest: ManifestSection,
    pub bootstrap: BootstrapSection,
    pub deterministic: DeterministicSection,
    // one entry per [[params]] table; `ParamSection` is an assumed type name
    pub params: Vec<ParamSection>,
    pub artifact: ArtifactSection,
    pub postprocess: Option<PostprocessSection>,
}

#[derive(Debug, Clone)]
pub struct SceneRegistryEntry {
    pub manifest: SceneManifest,
    pub skill_root: PathBuf,
}

pub fn load_scene_registry(skills_dir: &Path) -> Result<Vec<SceneRegistryEntry>, SceneRegistryError> {
    // iterate immediate skill dirs under the already-resolved single skillsDir
    // look for <skill_root>/scene.toml only
    // parse and validate schema version
    // verify scene.id uniqueness across the loaded root
    // verify manifest.scene.skill matches the containing skill package
    // verify referenced tool exists in SKILL.toml and is browser_script in v1
    todo!()
}
```

Rules to lock now:

- `schema_version = "1"` is the only accepted version in v1
- duplicate `scene.id` is a hard error and must report both manifest paths
- manifest loading must not add a second config key or a hardcoded `skill_staging/scenes` scan
- `scene.toml` is runtime-owned; `scene.json` stays legacy-only

- [ ] **Step 4: Re-run the registry tests and verify they pass**

Run:

```bash
cargo test --test scene_registry_test -- --nocapture
```

Expected: PASS.

- [ ] **Step 5: Commit the contract and registry slice**

Run:

```bash
git add src/lib.rs src/scene_contract/mod.rs src/scene_contract/manifest.rs src/compat/mod.rs src/compat/scene_platform/mod.rs src/compat/scene_platform/registry.rs tests/scene_registry_test.rs
git commit -m "feat: add scene manifest registry"
```

Expected: one commit that introduces the stable runtime/generator contract and registry loader.
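The duplicate-`scene.id` rule from Task 2 (hard error, both manifest paths reported) might look roughly like the sketch below. `DuplicateSceneId` and `check_unique_scene_ids` are hypothetical names chosen for illustration, and the input is simplified to `(scene_id, manifest_path)` pairs rather than full registry entries.

```rust
use std::collections::HashMap;
use std::fmt;
use std::path::{Path, PathBuf};

/// Hypothetical error carrying both manifest paths that declare the same scene id.
#[derive(Debug, PartialEq)]
pub struct DuplicateSceneId {
    pub scene_id: String,
    pub first: PathBuf,
    pub second: PathBuf,
}

impl fmt::Display for DuplicateSceneId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "duplicate scene.id `{}`: {} and {}",
            self.scene_id,
            self.first.display(),
            self.second.display()
        )
    }
}

/// Checks `(scene_id, manifest_path)` pairs and fails on the first repeated id.
pub fn check_unique_scene_ids(entries: &[(String, PathBuf)]) -> Result<(), DuplicateSceneId> {
    let mut seen: HashMap<&str, &Path> = HashMap::new();
    for (scene_id, manifest_path) in entries {
        // `insert` returns the previously stored path when the id was already seen.
        if let Some(first) = seen.insert(scene_id.as_str(), manifest_path.as_path()) {
            return Err(DuplicateSceneId {
                scene_id: scene_id.clone(),
                first: first.to_path_buf(),
                second: manifest_path.clone(),
            });
        }
    }
    Ok(())
}
```

Keeping both paths in the error type, rather than only in a formatted string, lets `registry_rejects_duplicate_scene_ids_with_both_paths_in_error` assert on either form.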
---

### Task 3: Generalize deterministic dispatch and reusable parameter resolvers

**Files:**
- Create: `src/compat/scene_platform/dispatch.rs`
- Create: `src/compat/scene_platform/resolvers.rs`
- Modify: `src/compat/deterministic_submit.rs`
- Modify: `tests/deterministic_submit_test.rs`

- [ ] **Step 1: Replace the line-loss-only deterministic tests with registry-backed red tests**

Extend `tests/deterministic_submit_test.rs` with registry-backed red cases built from temp fixture manifests under a temporary skills root. Do **not** depend on the committed sample package from Task 6 yet; Task 3 must stay hermetic and independently runnable.

Add failing cases such as:

```rust
#[test]
fn deterministic_submit_uses_registry_backed_scene_plan() {
    let decision = decide_deterministic_submit(
        "兰州公司 台区线损大数据 月累计线损率统计分析 2026-03。。。",
        None,
        None,
    );

    match decision {
        DeterministicSubmitDecision::Execute(plan) => {
            assert_eq!(plan.scene_id, "tq-lineloss-report");
            assert_eq!(plan.tool_name, "tq-lineloss-report.collect_lineloss");
            assert_eq!(plan.expected_domain, "20.76.57.61");
            assert_eq!(plan.target_url, "http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor");
        }
        other => panic!("expected execute plan, got {other:?}"),
    }
}

#[test]
fn deterministic_submit_fails_closed_on_scene_ambiguity() { /* two plausible scene.toml entries -> Prompt */ }

#[test]
fn deterministic_submit_prompts_for_missing_period_instead_of_defaulting() {
    let decision = decide_deterministic_submit("兰州公司 台区线损大数据 月累计线损率统计分析。。。", None, None);
    assert!(matches!(decision, DeterministicSubmitDecision::Prompt { .. }));
}

#[test]
fn deterministic_submit_uses_page_context_to_break_ties_before_keyword_only_match() { /* page_url/title beats keyword overlap */ }

#[test]
fn zhihu_without_suffix_remains_not_deterministic() {
    assert!(matches!(
        decide_deterministic_submit("打开知乎热榜", Some("https://www.zhihu.com/hot"), Some("知乎热榜")),
        DeterministicSubmitDecision::NotDeterministic
    ));
}
```

Also invert the current default-period expectations. `兰州公司 月累计。。。` and `兰州公司 周累计。。。` must now prompt instead of executing.

- [ ] **Step 2: Run the targeted deterministic tests and verify they fail**

Run:

```bash
cargo test --test deterministic_submit_test -- --nocapture
```

Expected: FAIL because the current implementation is still hardcoded to line-loss constants and still defaults missing month/week periods.

- [ ] **Step 3: Implement reusable resolver types and a registry-backed dispatcher**

Implement the generic deterministic planner in the new scene-platform modules, then make `src/compat/deterministic_submit.rs` a thin adapter over it.
Required implementation shape:

```rust
use std::path::Path;

use serde_json::{Map, Value};

pub enum ResolverKind {
    DictionaryEntity,
    MonthWeekPeriod,
    FixedEnum,
    LiteralPassthrough,
}

pub struct SceneExecutionPlan {
    pub scene_id: String,
    pub instruction: String,
    pub tool_name: String,
    pub expected_domain: String,
    pub target_url: String,
    pub args: Map<String, Value>,
    pub success_statuses: Vec<String>,
    pub failure_statuses: Vec<String>,
    pub postprocess: Option<PostprocessSection>,
}

// `ScenePlanError` is a placeholder name for whatever type carries prompts and hard errors.
pub fn plan_deterministic_scene(
    raw_instruction: &str,
    page_url: Option<&str>,
    page_title: Option<&str>,
    skills_dir: &Path,
) -> Result<SceneExecutionPlan, ScenePlanError> {
    // exact suffix gate
    // load registry from the single skillsDir
    // score candidate scenes using include/exclude keywords + page context + required-param resolution
    // if multiple remain plausible -> fail closed with explicit ambiguity prompt
    // resolve params using generic resolver kinds
    // build executable SceneExecutionPlan with manifest bootstrap + tool + canonical args
    todo!()
}
```

Resolver rules to lock now:

- `dictionary_entity` reads external dictionary data such as `references/org-dictionary.json`; no hardcoded org list in Rust after migration
- `month_week_period` returns explicit prompts for missing mode, missing period, contradictory month/week intent, or week-without-year
- `fixed_enum` and `literal_passthrough` exist now so the manifest contract is extensible, even if line-loss is the only v1 user
- if a new scene needs a new resolver **type**, add a reusable resolver, not a scene-specific `if scene_id == ...` branch

- [ ] **Step 4: Re-run the deterministic tests and verify they pass**

Run:

```bash
cargo test --test deterministic_submit_test -- --nocapture
```

Expected: PASS, including the new no-default-period behavior and ambiguity fail-closed coverage.
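As an illustration of the `month_week_period` prompt rules from Step 3, here is a deliberately simplified sketch. It assumes the suffix has already been stripped, that mode is signalled by the literal keywords `月累计` / `周累计`, and that periods appear as whitespace-separated `YYYY-MM` or `YYYY-Www` tokens; the week token format, the type names, and the prompt strings are all assumptions, not the plan's contract.

```rust
/// Hypothetical resolver outcome: a canonical period or an explicit prompt.
#[derive(Debug, PartialEq)]
pub enum PeriodResolution {
    Resolved { mode: &'static str, value: String },
    Prompt(&'static str),
}

fn all_digits(bytes: &[u8]) -> bool {
    !bytes.is_empty() && bytes.iter().all(|b| b.is_ascii_digit())
}

/// Matches `YYYY-MM` with a month in 1..=12.
fn is_year_month(token: &str) -> bool {
    let b = token.as_bytes();
    b.len() == 7 && b[4] == b'-' && all_digits(&b[..4]) && all_digits(&b[5..])
        && matches!(token[5..].parse::<u8>(), Ok(1..=12))
}

/// Matches `YYYY-Www` with a week in 1..=53 (assumed token format).
fn is_year_week(token: &str) -> bool {
    let b = token.as_bytes();
    b.len() == 8 && b[4] == b'-' && b[5] == b'W' && all_digits(&b[..4]) && all_digits(&b[6..])
        && matches!(token[6..].parse::<u8>(), Ok(1..=53))
}

/// Matches a week number without a year, such as `W12`.
fn is_bare_week(token: &str) -> bool {
    token.strip_prefix('W').map_or(false, |rest| all_digits(rest.as_bytes()))
}

pub fn resolve_month_week_period(text: &str) -> PeriodResolution {
    let mode = match (text.contains("月累计"), text.contains("周累计")) {
        (true, true) => return PeriodResolution::Prompt("month and week intent conflict"),
        (false, false) => return PeriodResolution::Prompt("missing month/week mode"),
        (true, false) => "month",
        (false, true) => "week",
    };
    let matches_mode: fn(&str) -> bool = if mode == "month" { is_year_month } else { is_year_week };
    match text.split_whitespace().find(|t| matches_mode(t)) {
        Some(value) => PeriodResolution::Resolved { mode, value: value.to_string() },
        // Never default a missing period: prompt and stop instead.
        None if mode == "week" && text.split_whitespace().any(is_bare_week) => {
            PeriodResolution::Prompt("week number needs a year")
        }
        None => PeriodResolution::Prompt("missing period"),
    }
}
```

Every non-resolved branch returns a prompt, so no path silently falls back to a page default.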
- [ ] **Step 5: Commit the registry-driven deterministic slice**

Run:

```bash
git add src/compat/deterministic_submit.rs src/compat/scene_platform/dispatch.rs src/compat/scene_platform/resolvers.rs tests/deterministic_submit_test.rs
git commit -m "feat: add registry-driven deterministic scene dispatch"
```

Expected: one commit that removes one-off line-loss decision ownership from the deterministic planner.

---

### Task 4: Add a generic report-artifact interpreter and XLSX postprocess path

**Files:**
- Create: `src/compat/report_artifact.rs`
- Create: `src/compat/report_xlsx_export.rs`
- Modify: `src/compat/direct_skill_runtime.rs`
- Modify: `src/compat/deterministic_submit.rs`
- Create: `tests/report_artifact_postprocess_test.rs`
- Modify: `tests/agent_runtime_test.rs`

- [ ] **Step 1: Write the red tests for generic report-artifact handling**

Add `tests/report_artifact_postprocess_test.rs` and the minimum `tests/agent_runtime_test.rs` extensions needed to prove the platform no longer depends on line-loss-specific Rust export logic:

```rust
#[test]
fn report_artifact_postprocess_exports_xlsx_for_ok_or_partial_scene() {
    let artifact = serde_json::json!({
        "type": "report-artifact",
        "report_name": "tq-lineloss-report",
        "status": "partial",
        "columns": ["ORG_NAME", "LINE_LOSS_RATE"],
        "column_defs": [["ORG_NAME", "供电单位"], ["LINE_LOSS_RATE", "综合线损率(%)"]],
        "rows": [{"ORG_NAME": "国网兰州供电公司", "LINE_LOSS_RATE": "1.23"}],
        "counts": {"rows": 1},
        "partial_reasons": ["report_log_failed"]
    });

    let outcome = interpret_report_artifact_and_postprocess(&artifact, report_postprocess_xlsx(), temp_workspace()).unwrap();
    assert!(outcome.success);
    assert!(outcome.summary.contains("status=partial"));
    assert!(outcome.summary.contains("detail_rows=1"));
    assert!(outcome.summary.contains("export_path="));
}

#[test]
fn report_artifact_postprocess_skips_export_for_blocked_or_error_scene() { /* no xlsx path */ }

#[test]
fn direct_submit_and_scene_submit_share_the_same_report_summary_contract() { /* direct_skill_runtime + deterministic path both use same summary builder */ }
```

- [ ] **Step 2: Run the focused report-artifact tests and verify they fail**

Run:

```bash
cargo test --test report_artifact_postprocess_test -- --nocapture
cargo test --test agent_runtime_test submit_task_treats_partial_report_artifact_as_success_with_warning_summary -- --nocapture
```

Expected: FAIL because the generic interpreter/exporter does not exist yet and deterministic line-loss export is still special-cased.

- [ ] **Step 3: Implement the shared parser, summary builder, and generic XLSX exporter**

Implement a reusable path that both deterministic scenes and configured direct-submit skills can call:

```rust
use std::path::Path;

use serde_json::{Map, Value};

pub struct ParsedReportArtifact {
    pub report_name: String,
    pub status: String,
    pub columns: Vec<String>,
    pub column_defs: Vec<(String, String)>,
    pub rows: Vec<Map<String, Value>>,
    pub counts: ReportCounts,
    pub partial_reasons: Vec<String>,
}

// `ReportArtifactOutcome` and `ReportArtifactError` are placeholder names; the tests
// only require an outcome exposing `success` and `summary`.
pub fn interpret_report_artifact_and_postprocess(
    artifact_json: &Value,
    postprocess: Option<&PostprocessSection>,
    workspace_root: &Path,
) -> Result<ReportArtifactOutcome, ReportArtifactError> {
    // parse report-artifact generically
    // map ok/partial/empty => success=true
    // map blocked/error => success=false
    // if postprocess.exporter == Some("xlsx_report") and status is exportable, write xlsx under workspace_root/out
    // if postprocess.auto_open == Some("excel"), reuse existing open-export helper
    todo!()
}
```

Rules:

- export logic must read `column_defs` when present, else fall back to `columns`
- do not keep line-loss-only column-name assumptions in Rust
- keep direct-submit behavior unchanged for non-artifact string outputs
- keep `blocked` / `error` as failures even if rows happen to be present late in the artifact

- [ ] **Step 4: Re-run the focused tests and verify they pass**

Run:

```bash
cargo test --test report_artifact_postprocess_test -- --nocapture
cargo test --test agent_runtime_test submit_task_treats_partial_report_artifact_as_success_with_warning_summary -- --nocapture
cargo test --test agent_runtime_test submit_task_treats_blocked_report_artifact_as_failure -- --nocapture
```

Expected: PASS.

- [ ] **Step 5: Commit the generic artifact/postprocess slice**

Run:

```bash
git add src/compat/report_artifact.rs src/compat/report_xlsx_export.rs src/compat/direct_skill_runtime.rs src/compat/deterministic_submit.rs tests/report_artifact_postprocess_test.rs tests/agent_runtime_test.rs
git commit -m "refactor: share generic report artifact postprocess"
```

Expected: one commit that removes the need for per-scene Rust export logic.

---

### Task 5: Wire manifest-driven scenes into submit and bootstrap without regressing other flows

**Files:**
- Modify: `src/agent/task_runner.rs`
- Modify: `src/service/server.rs`
- Modify: `tests/service_task_flow_test.rs`
- Modify: `tests/service_ws_session_test.rs`
- Modify: `tests/agent_runtime_test.rs`

- [ ] **Step 1: Add the failing submit/bootstrap regression tests**

Add focused tests that lock branch order and bootstrap behavior:

```rust
#[test]
fn submit_task_routes_suffix_instruction_through_manifest_scene_before_llm() {
    // no provider call should happen when deterministic scene planning succeeds or prompts
}

#[test]
fn resolve_submit_bootstrap_target_prefers_manifest_scene_target_for_deterministic_scene() {
    let request = SubmitTaskRequest {
        instruction: "兰州公司 台区线损大数据 月累计线损率统计分析 2026-03。。。".to_string(),
        conversation_id: None,
        messages: vec![],
        page_url: None,
        page_title: None,
    };

    let target = resolve_submit_bootstrap_target(&request, workspace_root, &settings);
    assert_eq!(target.request_url, "http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor");
    assert_eq!(target.expected_domain.as_deref(), Some("20.76.57.61"));
}

#[test]
fn zhihu_without_suffix_keeps_existing_non_scene_path() { /* ordinary path unchanged */ }
```

For the browser-ws/callback-host path, add one regression in `tests/service_ws_session_test.rs` proving the first bootstrap/open target comes from `scene.toml` when a deterministic scene plan exists.
- [ ] **Step 2: Run the focused integration tests and verify they fail**

Run:

```bash
cargo test --test service_task_flow_test -- --nocapture
cargo test --test service_ws_session_test callback_host -- --nocapture
```

Expected: FAIL because the submit/bootstrap path still depends on the old deterministic line-loss branch shape.

- [ ] **Step 3: Implement the minimal wiring changes only where the branch already exists**

Implementation targets:

- keep the current submit branch order in `src/agent/task_runner.rs`
- keep `resolve_submit_bootstrap_target(...)` precedence in `src/service/server.rs`
- replace the old hardcoded deterministic plan source with the new manifest-backed planner
- keep configured `directSubmitSkill` and ordinary LLM/browser orchestration behavior untouched

The resulting branch order must still be:

```rust
// 1. registry-backed deterministic scene (exact suffix only)
// 2. ordinary primary orchestration path
// 3. configured directSubmitSkill
// 4. compat LLM/runtime path
```

- [ ] **Step 4: Re-run the focused integration tests and verify they pass**

Run:

```bash
cargo test --test service_task_flow_test -- --nocapture
cargo test --test service_ws_session_test callback_host -- --nocapture
cargo test --test agent_runtime_test -- --nocapture
```

Expected: PASS, with no regression to the ordinary direct-submit or Zhihu paths.

- [ ] **Step 5: Commit the submit/bootstrap integration slice**

Run:

```bash
git add src/agent/task_runner.rs src/service/server.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs tests/agent_runtime_test.rs
git commit -m "refactor: wire manifest scenes into submit bootstrap"
```

Expected: one commit that changes wiring only at the existing seams.
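The branch order that Task 5 preserves can be pictured with a small routing sketch. All names here (`DeterministicOutcome`, `SubmitRoute`, `choose_submit_route`) are hypothetical, and the two boolean inputs merely stand in for whatever checks the real task runner performs for branches 2 and 3.

```rust
/// Simplified outcome of the registry-backed deterministic planner.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DeterministicOutcome {
    Execute,
    Prompt,
    NotDeterministic,
}

/// Which submit branch ends up handling the request.
#[derive(Debug, PartialEq)]
pub enum SubmitRoute {
    DeterministicScene,
    DeterministicPrompt,
    PrimaryOrchestration,
    DirectSubmitSkill,
    CompatLlm,
}

/// Hypothetical helper mirroring the four-step branch order.
pub fn choose_submit_route(
    deterministic: DeterministicOutcome,
    primary_orchestration_applies: bool,
    direct_submit_skill_configured: bool,
) -> SubmitRoute {
    match deterministic {
        // 1. Execute and Prompt both end routing here, so no provider call happens.
        DeterministicOutcome::Execute => return SubmitRoute::DeterministicScene,
        DeterministicOutcome::Prompt => return SubmitRoute::DeterministicPrompt,
        DeterministicOutcome::NotDeterministic => {}
    }
    if primary_orchestration_applies {
        // 2. ordinary primary orchestration path
        SubmitRoute::PrimaryOrchestration
    } else if direct_submit_skill_configured {
        // 3. configured directSubmitSkill
        SubmitRoute::DirectSubmitSkill
    } else {
        // 4. compat LLM/runtime path
        SubmitRoute::CompatLlm
    }
}
```

The point of the sketch is that a deterministic prompt is terminal: only `NotDeterministic` falls through to the later branches.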
---

### Task 6: Add the first manifest-driven `tq-lineloss-report` sample package inside this repo

**Files:**
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/scene.toml`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/references/org-dictionary.json`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/SKILL.toml`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/SKILL.md`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.js`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.test.js`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/references/data-quality.md`
- Create: `examples/generated_scene_platform/skills/tq-lineloss-report/references/collection-flow.md`
- Modify: `tests/deterministic_submit_test.rs`
- Modify: `tests/scene_registry_test.rs`

- [ ] **Step 1: Add the failing line-loss manifest and runtime-contract checks**

Create the `scene.toml` shape in the in-repo sample package first and lock the migration expectations:

```toml
[scene]
id = "tq-lineloss-report"
skill = "tq-lineloss-report"
tool = "collect_lineloss"
kind = "browser_script"
version = "0.1.0"
category = "report_collection"

[manifest]
schema_version = "1"

[bootstrap]
expected_domain = "20.76.57.61"
target_url = "http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor"
page_title_keywords = ["线损"]
requires_target_page = true

[deterministic]
suffix = "。。。"
include_keywords = ["线损", "月累计", "周累计", "统计分析"]
exclude_keywords = ["知乎"]

[[params]]
name = "org"
resolver = "dictionary_entity"
required = true
prompt_missing = "已命中台区线损报表技能,但缺少供电单位。"
prompt_ambiguous = "已命中台区线损报表技能,但供电单位存在歧义,请补充更完整名称。"

[params.resolver_config]
dictionary_ref = "references/org-dictionary.json"
output_label_field = "org_label"
output_code_field = "org_code"

[[params]]
name = "period"
resolver = "month_week_period"
required = true
prompt_missing = "已命中台区线损报表技能,但缺少统计周期。"
prompt_ambiguous = "已命中台区线损报表技能,但统计周期存在歧义,请补充更明确表达。"

[artifact]
type = "report-artifact"
success_status = ["ok", "partial", "empty"]
failure_status = ["blocked", "error"]

[postprocess]
exporter = "xlsx_report"
auto_open = "excel"
```

Also add a red JS assertion in the committed sample package proving the script returns `column_defs` and never re-parses raw natural-language org/period text:

```javascript
test('buildBrowserEntrypointResult keeps canonical args and generic export fields only', async () => {
  const artifact = await buildBrowserEntrypointResult({
    expected_domain: '20.76.57.61',
    org_label: '国网兰州供电公司',
    org_code: '62401',
    period_mode: 'month',
    period_mode_code: '1',
    period_value: '2026-03',
    period_payload: { fdate: '2026-03' },
    instruction: '兰州公司 月累计 2026-03'
  }, fakeDeps);

  assert.equal(artifact.org.code, '62401');
  assert.ok(Array.isArray(artifact.column_defs));
  assert.equal(JSON.stringify(artifact).includes('兰州公司 月累计 2026-03'), false);
});
```

- [ ] **Step 2: Run the targeted line-loss tests and verify they fail**

Run:

```bash
cargo test --test deterministic_submit_test -- --nocapture
node "examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
```

Expected: FAIL because the runtime is not yet manifest-driven and the committed sample package does not yet expose the final manifest/dictionary/export contract.
- [ ] **Step 3: Implement the sample-scene migration without adding per-scene Rust branches**

Required actions:

- add `scene.toml` under the in-repo sample skill root and use the same layout the generator will emit
- make tests and service-smoke config resolve `skillsDir` to `examples/generated_scene_platform` so the registry can discover the committed sample package without any external repo copy step
- export the current org unit data into `references/org-dictionary.json` and make the resolver read that file instead of a Rust hardcoded list
- update `collect_lineloss.js` so the returned `report-artifact` includes generic-platform fields needed by `report_xlsx_export.rs`
- keep collection logic in JS; do **not** move line-loss business semantics back into Rust
- write `SKILL.toml` / `SKILL.md` / references docs into the sample package to describe canonical args and the manifest-driven contract
- keep any external staging-repo publish step out of scope for this branch; this task only commits the in-repo sample package

- [ ] **Step 4: Re-run the line-loss tests and verify they pass**

Run:

```bash
cargo test --test deterministic_submit_test -- --nocapture
node "examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
```

Expected: PASS, including the new missing-period prompt behavior and the new manifest-driven sample-package shape.

- [ ] **Step 5: Commit the line-loss sample migration**

Run:

```bash
git add examples/generated_scene_platform/skills/tq-lineloss-report tests/deterministic_submit_test.rs tests/scene_registry_test.rs
git commit -m "feat: add manifest-driven lineloss sample package"
```

Expected: one commit that adds the first committed manifest-driven sample package and updates runtime expectations around it.
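The `[artifact]` status lists in the sample manifest could drive success mapping along the following lines. This is a sketch under assumptions: `ArtifactVerdict` and `classify_artifact_status` are hypothetical names, and treating a status that appears in neither list as a failure is an assumed fail-closed policy, not something the plan specifies.

```rust
/// Hypothetical verdict derived from manifest-declared status lists.
#[derive(Debug, PartialEq)]
pub enum ArtifactVerdict {
    Success,
    Failure,
}

/// Maps an artifact `status` to a verdict using the manifest's
/// `success_status` and `failure_status` lists.
pub fn classify_artifact_status(
    status: &str,
    success_status: &[&str],
    failure_status: &[&str],
) -> ArtifactVerdict {
    // Failure statuses are checked first so `blocked` / `error` stay failures
    // no matter what else the artifact contains.
    if failure_status.contains(&status) {
        return ArtifactVerdict::Failure;
    }
    if success_status.contains(&status) {
        ArtifactVerdict::Success
    } else {
        // Assumption: an undeclared status fails closed.
        ArtifactVerdict::Failure
    }
}
```

Reading both lists from `scene.toml` keeps the mapping out of per-scene Rust branches, which is the migration goal of this task.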

---

### Task 7: Write the required `tq-lineloss` lessons-learned artifacts and load them as generator rules

**Files:**
- Create: `docs/superpowers/references/tq-lineloss-lessons-learned.md`
- Create: `docs/superpowers/references/tq-lineloss-lessons-learned.toml`
- Create: `tests/generated_scene_lessons_test.rs`
- Create: `src/generated_scene/mod.rs`
- Create: `src/generated_scene/lessons.rs`
- Modify: `src/lib.rs`

- [ ] **Step 1: Write the failing lessons-rules test before the docs**

Add `tests/generated_scene_lessons_test.rs` that requires all mandatory structured rule sections to exist. In the same red step, wire the empty `src/generated_scene/mod.rs` and `src/lib.rs` exports needed so this test fails on missing implementation/data, not on missing module visibility:

```rust
#[test]
fn lineloss_lessons_toml_declares_required_generator_rules() {
    let lessons = load_generation_lessons("docs/superpowers/references/tq-lineloss-lessons-learned.toml").unwrap();
    assert!(lessons.routing.require_exact_suffix);
    assert!(lessons.routing.unsupported_scene_fail_closed);
    assert!(lessons.canonical_params.require_explicit_period);
    assert!(lessons.bootstrap.require_expected_domain);
    assert!(lessons.bootstrap.require_target_url);
    assert!(lessons.artifact.require_report_artifact);
    assert!(lessons.validation.require_pipe_and_ws_checks);
    assert!(lessons.validation.require_manual_service_console_smoke);
}
```

- [ ] **Step 2: Run the lessons test and verify it fails**

Run:

```bash
cargo test --test generated_scene_lessons_test -- --nocapture
```

Expected: FAIL because the lessons loader and TOML file do not exist yet.

- [ ] **Step 3: Implement the loader and write both lessons artifacts**

Implement the loader and complete the minimal module wiring (`src/generated_scene/mod.rs`, `src/lib.rs`) in this task so `cargo test --test generated_scene_lessons_test` is buildable before Task 8.

Use a TOML shape explicit enough for generator enforcement, for example:

```toml
[routing]
require_exact_suffix = true
unsupported_scene_fail_closed = true
ambiguity_fail_closed = true

[canonical_params]
require_dictionary_entity_for_org = true
require_explicit_period = true
forbid_hidden_page_defaults = true

[bootstrap]
require_expected_domain = true
require_target_url = true
prefer_page_context_when_present = true

[artifact]
require_report_artifact = true
require_column_defs_for_export = true
rust_side_xlsx_export_when_postprocess_xlsx = true

[validation]
require_pipe_and_ws_checks = true
require_manual_service_console_smoke = true
require_callback_host_timeout_notes = true
```

The Markdown companion must explain the why behind those rules: deterministic routing pitfalls, canonical parameter pitfalls, bootstrap target pitfalls, pipe/ws differences, callback-host timeout lessons, and Rust-side export constraints.

- [ ] **Step 4: Re-run the lessons tests and verify they pass**

Run:

```bash
cargo test --test generated_scene_lessons_test -- --nocapture
```

Expected: PASS.

- [ ] **Step 5: Commit the lessons artifacts and loader**

Run:

```bash
git add docs/superpowers/references/tq-lineloss-lessons-learned.md docs/superpowers/references/tq-lineloss-lessons-learned.toml src/generated_scene/mod.rs src/generated_scene/lessons.rs src/lib.rs tests/generated_scene_lessons_test.rs
git commit -m "docs: add lineloss generation lessons"
```

Expected: one commit that makes the line-loss lessons machine-consumable and reviewable.
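
Because the generator takes the lessons TOML as a required input, it helps to have one place that reports which mandatory rules are switched off. The following is a minimal sketch under stated assumptions: it mirrors only the eight flags asserted in the Step 1 test, the struct and function names (`GenerationLessons`, `missing_required_rules`, and the per-section structs) are hypothetical, and the TOML deserialization step (presumably `serde` plus `toml` in the real loader) is omitted so the block stays std-only.

```rust
/// Hypothetical mirrors of the lessons TOML sections (subset of fields only).
#[derive(Debug, Default, Clone)]
pub struct RoutingRules {
    pub require_exact_suffix: bool,
    pub unsupported_scene_fail_closed: bool,
}

#[derive(Debug, Default, Clone)]
pub struct CanonicalParamRules {
    pub require_explicit_period: bool,
}

#[derive(Debug, Default, Clone)]
pub struct BootstrapRules {
    pub require_expected_domain: bool,
    pub require_target_url: bool,
}

#[derive(Debug, Default, Clone)]
pub struct ArtifactRules {
    pub require_report_artifact: bool,
}

#[derive(Debug, Default, Clone)]
pub struct ValidationRules {
    pub require_pipe_and_ws_checks: bool,
    pub require_manual_service_console_smoke: bool,
}

#[derive(Debug, Default, Clone)]
pub struct GenerationLessons {
    pub routing: RoutingRules,
    pub canonical_params: CanonicalParamRules,
    pub bootstrap: BootstrapRules,
    pub artifact: ArtifactRules,
    pub validation: ValidationRules,
}

/// Returns the dotted names of every mandatory rule that is not enabled.
/// An empty result means the lessons file satisfies the required set.
pub fn missing_required_rules(lessons: &GenerationLessons) -> Vec<&'static str> {
    let checks: [(&'static str, bool); 8] = [
        ("routing.require_exact_suffix", lessons.routing.require_exact_suffix),
        ("routing.unsupported_scene_fail_closed", lessons.routing.unsupported_scene_fail_closed),
        ("canonical_params.require_explicit_period", lessons.canonical_params.require_explicit_period),
        ("bootstrap.require_expected_domain", lessons.bootstrap.require_expected_domain),
        ("bootstrap.require_target_url", lessons.bootstrap.require_target_url),
        ("artifact.require_report_artifact", lessons.artifact.require_report_artifact),
        ("validation.require_pipe_and_ws_checks", lessons.validation.require_pipe_and_ws_checks),
        ("validation.require_manual_service_console_smoke", lessons.validation.require_manual_service_console_smoke),
    ];
    checks
        .iter()
        .filter(|(_, enabled)| !*enabled)
        .map(|(name, _)| *name)
        .collect()
}
```

A generator entry point could call `missing_required_rules` right after loading the file and refuse to emit a package when the list is non-empty, which keeps the failure message specific to the rule that was dropped.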

---

### Task 8: Build the v1 source analyzer, package generator, and CLI entry

**Files:**
- Create: `src/generated_scene/analyzer.rs`
- Create: `src/generated_scene/generator.rs`
- Create: `src/bin/sg_scene_generate.rs`
- Modify: `src/generated_scene/mod.rs`
- Modify: `src/lib.rs`
- Create: `tests/scene_generator_test.rs`
- Create: `tests/fixtures/generated_scene/report_collection/index.html`
- Create: `tests/fixtures/generated_scene/report_collection/js/report.js`
- Create: `tests/fixtures/generated_scene/non_report/index.html`
- Create: `tests/fixtures/scene_source/tq_lineloss/index.html`
- Create: `tests/fixtures/scene_source/tq_lineloss/js/collect.js`

- [ ] **Step 1: Add the failing analyzer/generator tests with hermetic fixtures**

Create fixture-backed tests like:

```rust
#[test]
fn analyzer_classifies_supported_report_collection_source() {
    let analysis = analyze_scene_source(Path::new("tests/fixtures/generated_scene/report_collection")).unwrap();
    assert_eq!(analysis.scene_kind, SceneKind::ReportCollection);
    assert_eq!(analysis.tool_kind, ToolKind::BrowserScript);
    assert!(analysis.bootstrap.target_url.is_some());
    assert!(analysis.collection_entry_script.is_some());
}

#[test]
fn generator_writes_registration_ready_package_with_scene_toml() {
    let output_root = tempdir();
    generate_scene_package(GenerateSceneRequest {
        source_dir: PathBuf::from("tests/fixtures/generated_scene/report_collection"),
        scene_id: "sample-report-scene".to_string(),
        scene_name: "示例报表场景".to_string(),
        output_root: output_root.path().to_path_buf(),
        lessons_path: PathBuf::from("docs/superpowers/references/tq-lineloss-lessons-learned.toml"),
    }).unwrap();
    assert!(output_root.path().join("skills/sample-report-scene/SKILL.toml").exists());
    assert!(output_root.path().join("skills/sample-report-scene/scene.toml").exists());
    assert!(output_root.path().join("skills/sample-report-scene/scripts/collect_sample_report_scene.js").exists());
    assert!(output_root.path().join("skills/sample-report-scene/scripts/collect_sample_report_scene.test.js").exists());
}

#[test]
fn generator_rejects_non_report_source_with_explicit_reason() {
    let err = analyze_scene_source(Path::new("tests/fixtures/generated_scene/non_report")).unwrap_err();
    assert!(err.to_string().contains("report/collection browser_script only"));
}
```

- [ ] **Step 2: Run the generator tests and verify they fail**

Run:

```bash
cargo test --test scene_generator_test -- --nocapture
```

Expected: FAIL because the analyzer, generator, fixtures, and CLI do not exist yet.

- [ ] **Step 3: Implement the analyzer, generator, CLI, and the source fixtures used by final smoke**

Implementation rules:

- create the generator test fixtures under `tests/fixtures/generated_scene/*`
- create the hermetic source-smoke fixtures under `tests/fixtures/scene_source/tq_lineloss/*` so Task 9 can run without any external scenario directory
- analyzer must refuse unsupported/non-report scenes explicitly instead of generating broken packages
- generator must emit `scene.toml` inside the generated skill root
- generator must use `tq-lineloss-lessons-learned.toml` as a required input so the same hardening rules apply to future scenes
- generator/runtime coupling must stay at the file-contract level only
- CLI should use an explicit parser, no new heavy dependency

Suggested CLI shape:

```bash
cargo run --bin sg_scene_generate -- \
  --source-dir \
  --scene-id \
  --scene-name \
  --output-root \
  --lessons docs/superpowers/references/tq-lineloss-lessons-learned.toml
```

Expected outputs under ``:

- `skills//SKILL.toml`
- `skills//SKILL.md`
- `skills//scene.toml`
- `skills//references/*.md`
- `skills//scripts/*.js`
- `skills//scripts/*.test.js`

- [ ] **Step 4: Re-run the generator tests and verify they pass**

Run:

```bash
cargo test --test scene_generator_test -- --nocapture
```

Expected: PASS.
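
Step 3 asks for an explicit parser with no new heavy dependency. One way that could look, using only the standard library, is sketched below. The flag names come from the suggested CLI shape; the `CliArgs` struct, the `parse_cli_args` and `require` functions, and the exact error strings are hypothetical and not part of the plan.

```rust
use std::path::PathBuf;

/// Hypothetical parsed form of the `sg_scene_generate` flags.
#[derive(Debug, PartialEq, Eq)]
pub struct CliArgs {
    pub source_dir: PathBuf,
    pub scene_id: String,
    pub scene_name: String,
    pub output_root: PathBuf,
    pub lessons: PathBuf,
}

/// Turns a missing flag into an explicit error instead of a silent default.
fn require(name: &str, value: Option<String>) -> Result<String, String> {
    value.ok_or_else(|| format!("missing required argument: {}", name))
}

/// Parses `--flag value` pairs; unknown, duplicate, or valueless flags are errors.
pub fn parse_cli_args<I>(args: I) -> Result<CliArgs, String>
where
    I: IntoIterator<Item = String>,
{
    let mut source_dir = None;
    let mut scene_id = None;
    let mut scene_name = None;
    let mut output_root = None;
    let mut lessons = None;

    let mut iter = args.into_iter();
    while let Some(flag) = iter.next() {
        // Pick the slot this flag fills; anything unrecognized is rejected.
        let slot = match flag.as_str() {
            "--source-dir" => &mut source_dir,
            "--scene-id" => &mut scene_id,
            "--scene-name" => &mut scene_name,
            "--output-root" => &mut output_root,
            "--lessons" => &mut lessons,
            other => return Err(format!("unknown argument: {}", other)),
        };
        let value = iter
            .next()
            .ok_or_else(|| format!("missing value for {}", flag))?;
        if slot.replace(value).is_some() {
            return Err(format!("duplicate argument: {}", flag));
        }
    }

    Ok(CliArgs {
        source_dir: PathBuf::from(require("--source-dir", source_dir)?),
        scene_id: require("--scene-id", scene_id)?,
        scene_name: require("--scene-name", scene_name)?,
        output_root: PathBuf::from(require("--output-root", output_root)?),
        lessons: PathBuf::from(require("--lessons", lessons)?),
    })
}
```

In the binary this would typically be called as `parse_cli_args(std::env::args().skip(1))`, with any `Err` printed to stderr before exiting non-zero. Treating every flag as required matches the plan's rule that the lessons file is a mandatory generator input.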

- [ ] **Step 5: Commit the generator slice**

Run:

```bash
git add src/lib.rs src/generated_scene/mod.rs src/generated_scene/analyzer.rs src/generated_scene/generator.rs src/bin/sg_scene_generate.rs tests/scene_generator_test.rs tests/fixtures/generated_scene tests/fixtures/scene_source/tq_lineloss
git commit -m "feat: add generated scene package generator"
```

Expected: one commit that adds the in-repo v1 generator capability.

---

### Task 9: Run the final verification sweep, smoke the real runtime, and remove unused one-off scene code

**Files:**
- Delete if unused after green verification: `src/compat/tq_lineloss/org_units.rs`
- Delete if unused after green verification: `src/compat/tq_lineloss/org_resolver.rs`
- Delete if unused after green verification: `src/compat/tq_lineloss/period_resolver.rs`
- Delete or reduce to shim only if unused after green verification: `src/compat/lineloss_xlsx_export.rs`
- Modify: `src/compat/mod.rs`
- Modify: `src/lib.rs`

- [ ] **Step 1: Remove only the legacy one-off files that are provably unused**

Before deleting anything, prove the new path covers the old responsibilities:

```bash
cargo test --test deterministic_submit_test -- --nocapture
cargo test --test scene_registry_test -- --nocapture
cargo test --test report_artifact_postprocess_test -- --nocapture
```

Then delete the old line-loss-only resolver/export files only if `cargo test` and `Grep` show they are no longer referenced.

- [ ] **Step 2: Run the full automated verification sweep**

Run:

```bash
node "examples/generated_scene_platform/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
cargo test --test scene_registry_test -- --nocapture
cargo test --test deterministic_submit_test -- --nocapture
cargo test --test report_artifact_postprocess_test -- --nocapture
cargo test --test generated_scene_lessons_test -- --nocapture
cargo test --test scene_generator_test -- --nocapture
cargo test --test agent_runtime_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test service_ws_session_test callback_host -- --nocapture
cargo test --test compat_runtime_test -- --nocapture
cargo test --test compat_config_test -- --nocapture
cargo build --bin sgclaw --bin sg_claw --bin sg_scene_generate
```

Expected: PASS.

- [ ] **Step 3: Run the required hermetic generator smoke and keep the real external source smoke optional**

Run the required in-repo smoke first:

```bash
tmp_out="$(mktemp -d)"
cargo run --bin sg_scene_generate -- \
  --source-dir tests/fixtures/scene_source/tq_lineloss \
  --scene-id tq-lineloss-report \
  --scene-name "台区线损月周累计线损率统计分析" \
  --output-root "$tmp_out" \
  --lessons docs/superpowers/references/tq-lineloss-lessons-learned.toml
```

Expected: generator emits a complete package into `$tmp_out` using only in-repo fixtures.

Optional manual follow-up after the required smoke is green:

- if the external scenario directory is available on the implementer's machine, re-run the same command against the real source tree for additional confidence
- if it is unavailable, do **not** block the branch on that machine-specific path

- [ ] **Step 4: Run the real service-console smoke checks with `sg_claw.exe` semantics in mind**

Manual verification checklist:

- write or reuse a repo-local `sgclaw_config.json` whose `skillsDir` points to `examples/generated_scene_platform`
- rebuild and run `sg_claw`/`sg_claw.exe` with that config so the runtime-scanned skills root is reproducible
- on the real line-loss page, submit `兰州公司 台区线损大数据 月累计线损率统计分析 2026-03。。。`
- confirm the request bootstraps the manifest `target_url`, uses the manifest `expected_domain`, and returns the line-loss report artifact through the generic scene runtime
- submit `兰州公司 台区线损大数据 月累计线损率统计分析。。。` and confirm the runtime prompts for missing period instead of defaulting
- submit `打开知乎热榜` and confirm the ordinary Zhihu path still behaves as before
- submit `打开知乎热榜。。。` and confirm the deterministic runtime fails closed with the unsupported-scene prompt instead of falling into the Zhihu path

- [ ] **Step 5: Commit the cleanup + verified platform state**

Run:

```bash
git add src/compat/mod.rs src/lib.rs src/compat src/generated_scene src/scene_contract docs/superpowers/references tests examples/generated_scene_platform
git commit -m "feat: add generated scene skill platform"
```

Expected: one final commit after the full automated and manual verification passes.
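
The suffix cases in the Step 4 smoke checklist reduce to one routing decision: no exact suffix means the request stays on its existing path, while a suffixed request must resolve to exactly one scene or fail closed. The sketch below illustrates that decision only. The `Route` and `SceneEntry` types and the `route` function are hypothetical, and the substring keyword match is a stand-in assumption, since the plan does not describe how the real registry scores candidate scenes.

```rust
/// The deterministic suffix from the plan: three ideographic full stops.
pub const DETERMINISTIC_SUFFIX: &str = "。。。";

#[derive(Debug, PartialEq, Eq)]
pub enum Route<'a> {
    /// No exact suffix: leave the request on the existing non-platform path.
    Passthrough,
    /// Suffix present and exactly one scene matched.
    Scene { scene_id: &'a str, body: &'a str },
    /// Suffix present but nothing matched: fail closed with a prompt.
    Unsupported,
    /// Suffix present and more than one scene matched: fail closed.
    Ambiguous,
}

/// Hypothetical registry entry; keyword matching is an assumption of this sketch.
pub struct SceneEntry<'a> {
    pub id: &'a str,
    pub keywords: &'a [&'a str],
}

/// Decides where a raw instruction goes. The suffix check is exact:
/// trailing whitespace after the suffix means the request is not deterministic.
pub fn route<'a>(raw: &'a str, scenes: &'a [SceneEntry<'a>]) -> Route<'a> {
    let body = match raw.strip_suffix(DETERMINISTIC_SUFFIX) {
        Some(body) => body,
        None => return Route::Passthrough,
    };

    // Collect at most two matches; that is enough to tell one from many.
    let mut matches = scenes
        .iter()
        .filter(|scene| scene.keywords.iter().any(|kw| body.contains(*kw)));

    match (matches.next(), matches.next()) {
        (Some(scene), None) => Route::Scene { scene_id: scene.id, body },
        (None, _) => Route::Unsupported,
        (Some(_), Some(_)) => Route::Ambiguous,
    }
}
```

Note that a matched scene is only the first gate: canonical parameter resolution (org, mode, period) still has to succeed afterwards, and a missing period must prompt rather than default.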

---

## Verification Checklist

### Registry and manifest contract

```bash
cargo test --test scene_registry_test -- --nocapture
```

Expected:

- `scene.toml` loads from the skill root
- only `schema_version = "1"` passes
- duplicate `scene.id` fails with both manifest paths in the error
- non-`browser_script` or non-`report_collection` v1 scenes are rejected cleanly
- the registry still scans exactly one resolved `skillsDir`

### Deterministic routing contract

```bash
cargo test --test deterministic_submit_test -- --nocapture
```

Expected:

- exact `。。。` suffix only
- no-suffix behavior unchanged
- unsupported suffix-scene requests fail closed
- multi-match ambiguity fails closed
- missing org/mode/period prompt instead of defaulting
- page context may improve scoring but cannot cause silent guessing on unresolved ambiguity

### Generic report-artifact handling

```bash
cargo test --test report_artifact_postprocess_test -- --nocapture
cargo test --test agent_runtime_test -- --nocapture
```

Expected:

- `ok` / `partial` / `empty` map to success
- `blocked` / `error` map to failure
- generic XLSX export works from artifact fields, not line-loss-only Rust code
- configured `directSubmitSkill` keeps working on the shared artifact interpreter

### Service submit/bootstrap path

```bash
cargo test --test service_task_flow_test -- --nocapture
cargo test --test service_ws_session_test callback_host -- --nocapture
```

Expected:

- deterministic manifest scenes route before LLM
- bootstrap target resolution uses manifest `target_url` / `expected_domain`
- callback-host/browser-ws paths still receive the correct request URL
- non-deterministic Zhihu and direct-submit flows remain intact

### Generator and lessons

```bash
cargo test --test generated_scene_lessons_test -- --nocapture
cargo test --test scene_generator_test -- --nocapture
cargo build --bin sg_scene_generate
```

Expected:

- lessons TOML contains all required routing/param/bootstrap/artifact/validation rules
- analyzer only accepts v1 report/collection browser-script fixtures
- generator writes a complete package with `scene.toml` and JS test scaffold
- generator/runtime share only the explicit file contract, not hidden Rust internals

### Real runtime smoke

Manual checklist:

- `sg_claw.exe` / service console can still run the line-loss deterministic path
- missing-period deterministic line-loss requests prompt instead of defaulting
- plain Zhihu requests still avoid the scene platform
- suffixed unsupported requests fail closed
- line-loss export still opens through the generic postprocess path when configured

---

## Notes For The Engineer

- The paired approved spec is `docs/superpowers/specs/2026-04-15-generated-scene-skill-platform-design.md`.
- The current repo branch name for the ws baseline is `feature/claw-ws`, even though the design prose says `ws`.
- Do **not** reintroduce the old scene-registry experiment that was explicitly cleaned off the ws branch. This plan deliberately keeps the new runtime under `compat` and a shared serializable contract instead of reviving the deleted scene-only branch structure blindly.
- Keep `scene.toml` inside each skill package root. The separate `skill_staging/scenes/*/scene.json` tree remains legacy metadata only in this plan.
- Keep the generator extractable by holding the boundary at `scene.toml`, generated package layout, and lessons TOML rules. Avoid runtime code that reaches into generator-only internals.
- If a real scenario directory does not fit the v1 report/collection/browser-script envelope, the analyzer/generator must refuse it explicitly instead of emitting a half-valid package.
- Do **not** add a generic login/session platform here. Capture that need in docs if discovered, but keep it out of this implementation slice.