50 Commits

Author SHA1 Message Date
木炎
6fee4e2083 fix(generator): handle PageStateEval with no endpoints, optimize clones
- Guard against empty modes falling through to compile_multi_mode_request
  by falling back to compile_simple_request_script for edge cases
- Optimize ensure_modes_populated to clone only when needed
- Reduces dead-code warnings from 3 to 2

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 18:33:09 +08:00
木炎
46005d9718 fix: add legacy comment to compile_page_state_eval 2026-04-17 18:30:41 +08:00
木炎
d1711a3db3 feat(generator): unify all scene types through multi-mode path
Single-mode and page-state-eval scenes now get auto-wrapped into a
default mode and compiled through compile_multi_mode_request. This
eliminates the old browser_script_with_business_logic code path and
ensures all scenes get responsePath extraction, requestTemplate, and
contentType support.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 18:28:47 +08:00
木炎
b1647cd865 docs: add detailed implementation plan for scene generator quality improvement
8 tasks across 3 phases with exact file paths, step-by-step instructions,
test commands, and commit messages for each task.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 18:19:37 +08:00
木炎
1c964c3e70 docs: add scene generator quality improvement design spec
🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 18:13:45 +08:00
木炎
a6aa18c6d9 fix: replace eval() with safe template resolver and add empty MODES guard
Issue #1 (Critical): Replaced eval() in buildModeRequest with a safe
resolveTemplateValue function that only supports args.fieldName and
args['fieldName'] patterns, eliminating XSS/injection vulnerability.

Issue #2 (Critical): Added defensive guard in detectMode that throws a
clear error when MODES array is empty, preventing undefined property access.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 13:20:55 +08:00
木炎
7420af8f31 fix: add serde rename for NormalizeRulesJson fields to match JS camelCase
The NormalizeRulesJson struct was serializing required_fields and filter_null
as snake_case, but the JavaScript template accesses them as requiredFields
and filterNull. Add #[serde(rename = "requiredFields")] and #[serde(rename = "filterNull")]
to fix the systematic snake_case vs camelCase mismatch in multi-mode templates.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 13:16:14 +08:00
木炎
42eb716b7e feat(generator): add multi-mode template for mode-aware script generation
Add browser_script_with_modes function that generates JavaScript with:
- detectMode() to select mode based on args
- buildModeRequest() with content-type handling (JSON/form-urlencoded)
- normalizeModeRows() with validation rules
- queryModeData() with jQuery + fetch dual HTTP client

Modify browser_script() to check for modes first before falling back
to business_logic or skeleton templates.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 13:10:21 +08:00
木炎
a325add167 fix(llm): add defaults for multi-mode fields in analyzeSceneDeep
🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 13:07:10 +08:00
木炎
d95b8aaf26 feat(llm): enhance DEEP_SYSTEM_PROMPT for multi-mode detection
🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 13:02:36 +08:00
木炎
847f2c62c6 feat(generator): add multi-mode schema structs for enhanced LLM extraction
Add ModeConditionJson, NormalizeRulesJson, ApiEndpointEnhancedJson, and
ModeConfigJson structs to support multi-mode business logic in generated
scripts. Enhance SceneInfoJson with modes, default_mode, and mode_switch_field.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 12:56:46 +08:00
木炎
ea9147defb docs: add implementation plan for enhanced LLM extraction schema 2026-04-17 12:53:36 +08:00
木炎
5ff6e05911 docs: add enhanced LLM extraction schema design for multi-mode business logic 2026-04-17 12:51:09 +08:00
木炎
eead8f7802 feat(generator): enhance buildBrowserEntrypointResult with validation flow
- Add page context validation step with deps.validatePageContext
- Change validation failure status from 'error' to 'blocked'
- Add row normalization partial detection
- Use buildArtifact for all return paths consistently
- Add COLUMNS and REPORT_NAME constants for buildArtifact
- Export determineArtifactStatus, COLUMNS, REPORT_NAME in module.exports

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 12:28:35 +08:00
木炎
d123ee0aca feat(generator): enhance buildArtifact with determineArtifactStatus integration
- Change function signature to accept destructured object parameter
- Integrate determineArtifactStatus for automatic status calculation
- Use REPORT_NAME and COLUMNS constants (to be defined in Task 6)
- Add proper filtering for partial_reasons (exclude api_/validation_ prefixes)
- Add deduplication for reasons array
- Update call site in buildBrowserEntrypointResult to use new signature

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 12:24:17 +08:00
木炎
d26d96ef64 feat(generator): add determineArtifactStatus for complete status determination
Add a helper function to determine artifact status based on business
logic conditions:
- blocked: blockedReason exists (validation failed, wrong page context)
- error: fatalError exists (API call failed)
- partial: reasons array non-empty (some data issues)
- empty: no rows returned
- ok: rows exist with no issues

This is Task 3 of 8 in the progressive template enhancement.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 12:21:13 +08:00
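The status rules listed in the commit above can be sketched as a small decision function. This is a minimal Rust illustration only: the real `determineArtifactStatus` helper lives in the generated JavaScript template, the `StatusInputs` struct and its field names are hypothetical, and the sketch assumes the bullet order in the commit message is also the evaluation order.

```rust
/// Status values named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArtifactStatus {
    Blocked,
    Error,
    Partial,
    Empty,
    Ok,
}

/// Hypothetical inputs; names mirror the commit message, not the real JS signature.
pub struct StatusInputs<'a> {
    pub blocked_reason: Option<&'a str>,
    pub fatal_error: Option<&'a str>,
    pub reasons: &'a [&'a str],
    pub row_count: usize,
}

/// Checks the conditions in the listed order: blocked, error, partial, empty, ok.
pub fn determine_artifact_status(inputs: &StatusInputs<'_>) -> ArtifactStatus {
    if inputs.blocked_reason.is_some() {
        ArtifactStatus::Blocked
    } else if inputs.fatal_error.is_some() {
        ArtifactStatus::Error
    } else if !inputs.reasons.is_empty() {
        ArtifactStatus::Partial
    } else if inputs.row_count == 0 {
        ArtifactStatus::Empty
    } else {
        ArtifactStatus::Ok
    }
}
```

Under that assumed ordering, a blocked reason wins even when a fatal error or partial reasons are also present.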
木炎
d996b511f6 feat(generator): add jQuery + fetch dual HTTP client support in template
Enhance defaultDeps with:
- validatePageContext: Checks hostname against expected_domain for page context validation
- queryData: Dual HTTP client with jQuery $.ajax as primary and fetch as fallback
  - jQuery preferred for internal pages that typically have it
  - Clear error message if neither client is available

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 12:19:02 +08:00
木炎
b8d2eb9faa fix(generator): use endpoint.url directly in buildRequest to fix URL construction bug
The previous implementation used `new URL(endpoint.url, window.location.origin)`
which incorrectly constructed URLs based on the current page's origin. This broke
when endpoint URLs were already complete URLs (e.g., http://20.76.57.61:18080/...).

The fix uses endpoint.url directly since LLM extraction already provides complete
URLs. Also changed from GET with query params to POST with JSON body to match
typical API expectations.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 11:34:16 +08:00
木炎
4167639231 docs: add progressive template enhancement implementation plan 2026-04-17 11:26:34 +08:00
木炎
78a36a73b4 docs: add progressive template enhancement design spec 2026-04-17 11:16:11 +08:00
木炎
74c42af717 fix: handle apiEndpoints/columnDefs objects in preview, add string type checks 2026-04-17 10:55:58 +08:00
木炎
bb15d14749 feat(runner): pass sceneInfoJson to Rust CLI for enhanced template rendering 2026-04-17 10:39:36 +08:00
木炎
7289cc5779 feat(ui): add deep extraction preview panel with API/column/static-params display 2026-04-17 10:37:50 +08:00
木炎
689abf08ec feat(rust): add --scene-info-json parameter for LLM extraction results
- Add SceneInfoJson struct with serde rename for camelCase compatibility
- Add ApiEndpointJson and BusinessLogicJson support structs
- Add scene_info_json field to GenerateSceneRequest (backward compatible)
- Rename browser_script to browser_script_skeleton
- Add new browser_script that delegates based on scene_info presence
- Add browser_script_with_business_logic for enhanced script generation
- Update CLI to accept --scene-info-json parameter
- Update usage string to document new parameter

Generated with [Qoder][https://qoder.com]
2026-04-17 10:27:27 +08:00
木炎
2ffb42c181 feat(server): add /analyze-deep endpoint for deep extraction
🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 10:20:04 +08:00
木炎
614e9a3a45 feat(generator-runner): read index.html in readDirectory()
🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 10:16:17 +08:00
木炎
517ac6bf39 feat(llm-client): add deep extraction with apiEndpoints, staticParams, columnDefs
Add analyzeSceneDeep() function that extracts complete SceneInfo from
index.html content including API endpoints, static parameters, column
definitions, and business logic. Also adds http module import to support
non-HTTPS LLM endpoints.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 10:13:29 +08:00
木炎
dd7b3c582a docs: add LLM-driven skill generation design spec
Design for enhancing sg_scene_generate to produce complete,
runnable skill packages with:

- Deep LLM extraction from index.html (API endpoints, params, columns)
- Enhanced Rust template rendering with business logic
- Web UI preview of extracted results

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 09:51:59 +08:00
木炎
f268668713 feat: add folder picker and target_url input to Web UI
- Add /select-folder and /select-file APIs using PowerShell dialogs
- Add --target-url parameter to CLI for explicit target URL override
- Redesign Web UI with folder browse buttons for all path inputs
- Add target_url optional input field for specifying target page URL
- Auto-fill scene-id from selected folder name

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 00:23:09 +08:00
木炎
ce072c2ebe feat: auto-extract expected_domain from external script URLs
When HTML has no sgclaw-expected-domain meta tag, analyzer now scans
for external script URLs (http:// or https://) and extracts the
domain (host:port) as expected_domain.

Example:
  <script src="http://25.215.213.128:18080/a_js/YPTAPI.js"></script>
  → expected_domain = "25.215.213.128:18080"

This reduces manual editing required for third-party scenes.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 00:14:05 +08:00
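The fallback described in the commit above could look roughly like the following. This is a hedged sketch, not the analyzer's actual code: the function names are hypothetical, it uses plain string scanning rather than an HTML parser, and it matches `src=` naively (so attributes such as `data-src=` would also match).

```rust
/// Returns the `host[:port]` of the first absolute `<script src="...">` URL, if any.
/// Hypothetical helper; the real analyzer may parse the HTML differently.
pub fn expected_domain_from_scripts(html: &str) -> Option<String> {
    let mut rest = html;
    while let Some(tag_start) = rest.find("<script") {
        let after_tag = &rest[tag_start..];
        // Look only at the opening tag, up to (not including) the first '>'.
        let tag_end = after_tag.find('>').unwrap_or(after_tag.len());
        if let Some(domain) = domain_from_tag(&after_tag[..tag_end]) {
            return Some(domain);
        }
        rest = &after_tag[tag_end..];
    }
    None
}

/// Extracts the authority part of an http(s) `src` attribute inside one opening tag.
fn domain_from_tag(tag: &str) -> Option<String> {
    let src_pos = tag.find("src=")?;
    let value = tag[src_pos + 4..].trim_start_matches(|c: char| c == '"' || c == '\'');
    let without_scheme = value
        .strip_prefix("http://")
        .or_else(|| value.strip_prefix("https://"))?;
    // The authority ends at the first path, query, fragment, quote, or space character.
    let end = without_scheme
        .find(|c: char| matches!(c, '/' | '"' | '\'' | '?' | '#' | ' '))
        .unwrap_or(without_scheme.len());
    let authority = &without_scheme[..end];
    if authority.is_empty() {
        None
    } else {
        Some(authority.to_string())
    }
}
```

Relative script paths are skipped, so only scripts loaded from an explicit `http://` or `https://` origin contribute a domain.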
木炎
464f18c672 fix: remove mandatory validation for ReportCollection fields
Remove the validation that required target_url, expected_domain, and
entry_script meta tags for ReportCollection scenes. Third-party scenes
do not have these meta tags, so the generator now uses empty values
that users can manually edit in scene.toml.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 00:09:12 +08:00
木炎
b5131c858a feat: add scene kind dropdown to Web UI
Add scene type selector to the generator form:
- CSS styles for select element matching existing input style
- Dropdown with "报表收集类" (report_collection) and "监测类" (monitoring) options
- Pass sceneKind to /generate API endpoint

🤖 Generated with [Qoder][https://qoder.com]
2026-04-17 00:02:17 +08:00
木炎
2e69fa7239 feat: pass sceneKind from /generate request to generator
🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 23:58:57 +08:00
木炎
f84e11c631 feat: add sceneKind param to generator-runner
Pass sceneKind to sg_scene_generate CLI when specified,
allowing generation of different scene kinds (report_collection,
monitoring, etc.).

🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 23:57:30 +08:00
木炎
73edf1e5cf feat: add monitoring template support to generator
- Add scene_kind field to GenerateSceneRequest
- Update generate_scene_package to use analyze_scene_source_with_hint
- Implement scene_toml_monitoring for simplified monitoring scene manifests
- Add scene_toml router to dispatch to different templates based on SceneKind
- Add --scene-kind CLI option to sg_scene_generate binary
- Add generator_emits_monitoring_template test
- Create monitoring fixture with index.html (no meta tag, tests hint param)

Monitoring templates have simplified scene.toml without org/period resolvers.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 23:51:11 +08:00
木炎
87cee36173 fix: delegate analyze_scene_source to analyze_scene_source_with_hint
Remove duplicate implementation in analyze_scene_source() and replace
with simple delegation to analyze_scene_source_with_hint(source_dir, None).

This eliminates ~50 lines of duplicated logic that could drift apart
from the main implementation. Updated the test to verify the new
behavior where sources without scene-kind meta tag default to
ReportCollection instead of being rejected.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 23:42:46 +08:00
木炎
67fe17302e feat: add SceneKind::Monitoring and scene_kind_hint param to analyzer
- Add SceneKind::Monitoring enum variant with from_str/as_str helpers
- Add analyze_scene_source_with_hint function accepting optional scene kind hint
- User hint takes priority over meta tag, defaults to ReportCollection
- ReportCollection requires target_url, expected_domain, entry_script
- Monitoring type has optional fields
- Add test cases for hint parameter behavior
- Update non_report fixture with required meta tags for ReportCollection

🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 23:33:24 +08:00
木炎
45b54ab007 docs: add multi-scene-kind generator implementation plan
7 tasks covering:
- Task 1: Extend SceneKind enum and analyzer function
- Task 2: Add multi-template support to generator
- Task 3: Add --scene-kind CLI parameter
- Task 4-6: Pass sceneKind through Node.js stack to Web UI
- Task 7: E2E testing and verification

🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 23:26:01 +08:00
木炎
af8f261b79 docs: add multi-scene-kind generator design spec
Design for extending sg_scene_generate to support multiple scene kinds
(report_collection, monitoring) with manual type selection in Web UI,
relaxing the requirement for meta tags in third-party scene directories.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 23:22:08 +08:00
木炎
f168f9f375 fix: auto-find correct sgclaw_config.json in parent directory
Add parent sgClaw directory to config search candidates so Node.js
server automatically finds the correct config with DeepSeek API
endpoint instead of the test config in claw-new directory.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 22:47:39 +08:00
木炎
23845413c5 fix: patch path traversal and baseUrl normalization in scene generator
- server.js: sanitize static file paths to prevent directory traversal
  (GET /../../sgclaw_config.json would expose API key)
- config-loader.js: fix normalizeBaseUrl to strip /v1 before appending,
  preventing double /v1 for non-standard base URLs

🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 22:30:35 +08:00
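Both fixes in the commit above follow common patterns that can be illustrated in a few lines. The sketch below is written in Rust purely for illustration; the real code is in `server.js` and `config-loader.js`, and the function names, the `/srv/www` root in the usage note, and the exact normalization rules are assumptions rather than the project's implementation.

```rust
use std::path::{Component, Path, PathBuf};

/// Strips trailing slashes and any existing `/v1` suffix, then appends `/v1` exactly once.
pub fn normalize_base_url(raw: &str) -> String {
    let trimmed = raw.trim().trim_end_matches('/');
    let base = trimmed.strip_suffix("/v1").unwrap_or(trimmed);
    format!("{base}/v1")
}

/// Maps a request path onto `root`, rejecting anything that could escape it.
/// Returns `None` for `..`, absolute, or prefixed components.
pub fn resolve_static_path(root: &Path, request_path: &str) -> Option<PathBuf> {
    let relative = request_path.trim_start_matches('/');
    let mut resolved = root.to_path_buf();
    for component in Path::new(relative).components() {
        match component {
            Component::Normal(part) => resolved.push(part),
            Component::CurDir => {}
            // ParentDir, RootDir, and Prefix are all treated as traversal attempts.
            _ => return None,
        }
    }
    Some(resolved)
}
```

With a root of `/srv/www`, a request for `/../../sgclaw_config.json` resolves to `None`, while `/index.html` resolves inside the root.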
木炎
ea6be128e7 feat: scene skill generator — complete implementation
Adds a web-based UI for generating scene skill packages:
- Node.js HTTP server (zero npm dependencies) on port 3210
- HTML page with glass-morphism UI, dual-panel layout, settings modal
- LLM-powered scene-id/scene-name auto-extraction from directory contents
- Real-time SSE progress streaming during skill generation
- Spawns sg_scene_generate CLI with configurable parameters
- Windows-compatible startup scripts (serve.sh + serve.cmd)
- Rust integration tests for server files and HTML structure

Architecture:
  Browser (HTML/JS) → Node.js server → LLM API + cargo run → sg_scene_generate

Files:
  frontend/scene-generator/{server.js,config-loader.js,llm-client.js,generator-runner.js,sg_scene_generator.html,serve.sh,serve.cmd}
  tests/{scene_generator_server_test.rs,scene_generator_html_test.rs,scene_generator_llm_test.js}
  docs/superpowers/{plans,specs}/2026-04-16-scene-skill-generator*

🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 22:27:41 +08:00
木炎
6c1865eb1c feat: add serve.sh and serve.cmd startup scripts for Scene Skill Generator
🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 22:25:34 +08:00
木炎
d00086a70b feat: add sg_scene_generator.html with dual-panel UI and settings modal
Add self-contained HTML page for the Scene Skill Generator frontend:
- Dual-column glass-morphism layout matching service-console style
- Left sidebar: status card, sourceDir input with analyze button,
  sceneId/sceneName inputs, settings button, generate button
- Right panel: streaming log display with SSE event rendering
- Settings modal: outputRoot, lessons, llmBaseUrl, llmModel fields
- JavaScript: connects to http://127.0.0.1:3210, implements analyze()
  via fetch POST /analyze, generate() via fetch POST /generate with
  SSE stream reading, settings modal open/close
- Rust test verifying HTML file exists and contains required elements

🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 22:23:33 +08:00
木炎
e7a4179513 feat: add server.js with /health, /analyze, /generate routes 2026-04-16 22:18:32 +08:00
木炎
15d4b0dcc1 feat: add generator-runner.js for cargo subprocess + SSE streaming 2026-04-16 22:15:52 +08:00
木炎
294426ced9 feat: add llm-client.js with prompt builder and JSON extractor
🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 22:13:11 +08:00
木炎
ead9ea76fa feat: add config-loader.js and initial server test
Add config-loader.js module for loading sgclaw_config.json credentials
and resolving project root directory. Add initial Rust source-guard test
to verify server file paths exist.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-16 22:10:48 +08:00
木炎
e8d7d6b796 test: lock request URL resolution precedence
Align the service task flow callback-host regression with the hidden helper close/open bootstrap sequence uncovered during final request-url verification.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 20:27:09 +08:00
木炎
bd83d92480 refactor(service): unify submit bootstrap target resolution
Use page context, deterministic plans, and direct-skill metadata as the service-owned bootstrap target precedence so callback-host startup no longer relies on line-loss text matching or the old request-url helper.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 20:23:24 +08:00
37 changed files with 10755 additions and 52 deletions

Cargo.lock generated

@@ -2339,6 +2339,15 @@ dependencies = [
"serde_core",
]
[[package]]
name = "serde_spanned"
version = "0.6.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bf41e0cfaf7226dca15e8197172c295a782857fcb97fad1808a166870dee75a3"
dependencies = [
"serde",
]
[[package]]
name = "serde_spanned"
version = "1.1.1"
@@ -2377,6 +2386,7 @@ dependencies = [
"sha2",
"thiserror 1.0.69",
"tokio",
"toml 0.8.23",
"tungstenite 0.29.0",
"uuid",
"zeroclawlabs",
@@ -2740,6 +2750,18 @@ dependencies = [
"tokio",
]
[[package]]
name = "toml"
version = "0.8.23"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dc1beb996b9d83529a9e75c17a1686767d148d70663143c7854d8b4a09ced362"
dependencies = [
"serde",
"serde_spanned 0.6.9",
"toml_datetime 0.6.11",
"toml_edit",
]
[[package]]
name = "toml"
version = "1.1.2+spec-1.1.0"
@@ -2748,13 +2770,22 @@ checksum = "81f3d15e84cbcd896376e6730314d59fb5a87f31e4b038454184435cd57defee"
dependencies = [
"indexmap",
"serde_core",
-"serde_spanned",
-"toml_datetime",
+"serde_spanned 1.1.1",
+"toml_datetime 1.1.1+spec-1.1.0",
"toml_parser",
"toml_writer",
"winnow 1.0.1",
]
[[package]]
name = "toml_datetime"
version = "0.6.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "22cddaf88f4fbc13c51aebbf5f8eceb5c7c5a9da2ac40a13519eb5b0a0e8f11c"
dependencies = [
"serde",
]
[[package]]
name = "toml_datetime"
version = "1.1.1+spec-1.1.0"
@@ -2764,6 +2795,20 @@ dependencies = [
"serde_core",
]
[[package]]
name = "toml_edit"
version = "0.22.27"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "41fe8c660ae4257887cf66394862d21dbca4a6ddd26f04a3560410406a2f819a"
dependencies = [
"indexmap",
"serde",
"serde_spanned 0.6.9",
"toml_datetime 0.6.11",
"toml_write",
"winnow 0.7.15",
]
[[package]]
name = "toml_parser"
version = "1.1.2+spec-1.1.0"
@@ -2773,6 +2818,12 @@ dependencies = [
"winnow 1.0.1",
]
[[package]]
name = "toml_write"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5d99f8c9a7727884afe522e9bd5edbfc91a3312b36a77b5fb8926e4c31a41801"
[[package]]
name = "toml_writer"
version = "1.1.1+spec-1.1.0"
@@ -3444,6 +3495,15 @@ dependencies = [
"memchr",
]
[[package]]
name = "winnow"
version = "0.7.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "df79d97927682d2fd8adb29682d1140b343be4ac0f08fd68b7765d9c059d3945"
dependencies = [
"memchr",
]
[[package]]
name = "winnow"
version = "1.0.1"
@@ -3637,7 +3697,7 @@ dependencies = [
"tokio-stream",
"tokio-tungstenite 0.29.0",
"tokio-util",
-"toml",
+"toml 1.1.2+spec-1.1.0",
"tower",
"tower-http",
"tracing",


@@ -17,6 +17,7 @@ serde_json = "1"
sha2 = "0.10"
thiserror = "1"
tokio = { version = "1", default-features = false, features = ["rt-multi-thread", "macros"] }
toml = "0.8"
tungstenite = "0.29"
uuid = { version = "1", features = ["v4"] }
zip = { version = "0.6.6", default-features = false, features = ["deflate"] }


@@ -0,0 +1,810 @@
# Multi-Scene-Kind Generator Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
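**Goal:** Extend `sg_scene_generate` to support multiple scene kinds, letting users choose the scene kind (report collection or monitoring) manually in the Web UI instead of relying on meta tags in third-party scene directories.
**Architecture:** Relax the meta validation in analyzer.rs so that meta tags become optional; add a `--scene-kind` parameter to the CLI; have generator.rs pick a template based on the scene kind; add a scene-kind dropdown to the Web UI.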
**Tech Stack:** Rust, Node.js, HTML/CSS/JS
---
## File Map
### Core Rust files (backend)
- **Modify:** `src/generated_scene/analyzer.rs`: relax meta validation, add `SceneKind::Monitoring`, and add a `scene_kind_hint` parameter to the function signature
- **Modify:** `src/generated_scene/generator.rs`: multi-template support, routing to a template function based on `SceneKind`
- **Modify:** `src/bin/sg_scene_generate.rs`: add the `--scene-kind` CLI parameter
### Frontend files (Web UI)
- **Modify:** `frontend/scene-generator/sg_scene_generator.html`: add a scene-kind dropdown
- **Modify:** `frontend/scene-generator/server.js`: pass the `sceneKind` parameter through the `/generate` endpoint
- **Modify:** `frontend/scene-generator/generator-runner.js`: add a `sceneKind` parameter to `runGenerator`
### Test files
- **Modify:** `tests/scene_generator_test.rs`: add monitoring scene tests
- **Create:** `tests/fixtures/generated_scene/monitoring/index.html`: monitoring fixture
---
### Task 1: Extend the SceneKind enum and the analyzer function signature
**Files:**
- Modify: `src/generated_scene/analyzer.rs:1-127`
- Test: `tests/scene_generator_test.rs`
- [ ] **Step 1: Write failing tests: the analyzer accepts a scene_kind_hint parameter**
Modify `tests/scene_generator_test.rs` and add these tests:
```rust
#[test]
fn analyzer_accepts_missing_meta_with_scene_kind_hint() {
    // The non_report fixture has no scene-kind meta tag.
    let analysis = analyze_scene_source_with_hint(
        Path::new("tests/fixtures/generated_scene/non_report"),
        Some(SceneKind::ReportCollection),
    )
    .unwrap();
    // Should succeed, taking the kind from the hint.
    assert_eq!(analysis.scene_kind, SceneKind::ReportCollection);
}

#[test]
fn analyzer_uses_hint_when_meta_missing() {
    let analysis = analyze_scene_source_with_hint(
        Path::new("tests/fixtures/generated_scene/non_report"),
        Some(SceneKind::Monitoring),
    )
    .unwrap();
    assert_eq!(analysis.scene_kind, SceneKind::Monitoring);
}

#[test]
fn analyzer_uses_meta_when_present_and_no_hint() {
    // The report_collection fixture has the correct meta tags.
    let analysis = analyze_scene_source_with_hint(
        Path::new("tests/fixtures/generated_scene/report_collection"),
        None,
    )
    .unwrap();
    assert_eq!(analysis.scene_kind, SceneKind::ReportCollection);
}

#[test]
fn analyzer_hint_overrides_meta() {
    // The user's choice takes priority over the meta tag.
    let analysis = analyze_scene_source_with_hint(
        Path::new("tests/fixtures/generated_scene/report_collection"),
        Some(SceneKind::Monitoring),
    )
    .unwrap();
    assert_eq!(analysis.scene_kind, SceneKind::Monitoring);
}
```
- [ ] **Step 2: Run the tests and confirm they fail**
Run:
```bash
cargo test --test scene_generator_test -- --nocapture
```
Expected: FAIL, because the `analyze_scene_source_with_hint` function does not exist yet.
- [ ] **Step 3: Implement the SceneKind::Monitoring enum variant**
Modify `src/generated_scene/analyzer.rs` to extend the enum:
```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SceneKind {
    ReportCollection,
    Monitoring,
}

impl SceneKind {
    /// Parses the value used in meta tags and on the CLI.
    pub fn from_str(s: &str) -> Option<Self> {
        match s {
            "report_collection" => Some(Self::ReportCollection),
            "monitoring" => Some(Self::Monitoring),
            _ => None,
        }
    }

    /// Returns the canonical string form of the kind.
    pub fn as_str(&self) -> &'static str {
        match self {
            Self::ReportCollection => "report_collection",
            Self::Monitoring => "monitoring",
        }
    }
}
```
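The inherent `from_str` above works, but it shadows the name of the standard `FromStr` trait method. As an alternative sketch (not part of the plan), the same mapping can be expressed through `std::str::FromStr` and `std::fmt::Display`, which enables `"monitoring".parse::<SceneKind>()` and `to_string()`. The `ParseSceneKindError` type is a hypothetical name introduced only for this example.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SceneKind {
    ReportCollection,
    Monitoring,
}

/// Hypothetical error type carrying the rejected input.
#[derive(Debug, PartialEq, Eq)]
pub struct ParseSceneKindError(pub String);

impl FromStr for SceneKind {
    type Err = ParseSceneKindError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "report_collection" => Ok(Self::ReportCollection),
            "monitoring" => Ok(Self::Monitoring),
            other => Err(ParseSceneKindError(other.to_string())),
        }
    }
}

impl fmt::Display for SceneKind {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            Self::ReportCollection => "report_collection",
            Self::Monitoring => "monitoring",
        })
    }
}
```

The trade-off is that callers receive a `Result` instead of an `Option`, so the call sites shown in this plan would need `.ok()` or explicit error handling.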
- [ ] **Step 4: Implement the new function that takes a hint parameter**
Add a new function to `src/generated_scene/analyzer.rs`:
```rust
pub fn analyze_scene_source_with_hint(
    source_dir: &Path,
    scene_kind_hint: Option<SceneKind>,
) -> Result<SceneSourceAnalysis, AnalyzeSceneError> {
    let index_path = source_dir.join("index.html");
    let html = fs::read_to_string(&index_path).map_err(|err| {
        AnalyzeSceneError::new(format!(
            "failed to read scene source {}: {err}",
            index_path.display()
        ))
    })?;

    // Read the kinds from meta tags (optional).
    let meta_scene_kind = meta_content(&html, "sgclaw-scene-kind");
    // Read but not used yet; the underscore prefix avoids an unused-variable warning.
    let _meta_tool_kind = meta_content(&html, "sgclaw-tool-kind");

    // The user hint takes priority over the meta tag; the default is ReportCollection.
    let scene_kind = scene_kind_hint
        .or_else(|| meta_scene_kind.as_deref().and_then(SceneKind::from_str))
        .unwrap_or(SceneKind::ReportCollection);

    // tool_kind is fixed to BrowserScript (V1 supports only this kind).
    let tool_kind = ToolKind::BrowserScript;

    // If the meta tag declares a kind, reject values that are not recognized.
    if let Some(meta) = meta_scene_kind.as_deref() {
        if SceneKind::from_str(meta).is_none() {
            return Err(AnalyzeSceneError::new(format!(
                "unknown sgclaw-scene-kind: {}",
                meta
            )));
        }
    }

    let target_url = meta_content(&html, "sgclaw-target-url");
    let expected_domain = meta_content(&html, "sgclaw-expected-domain");
    let entry_script = meta_content(&html, "sgclaw-entry-script");

    // report_collection scenes must declare target_url, expected_domain, and entry_script.
    // For monitoring scenes these fields are optional (a simplified template is generated).
    if scene_kind == SceneKind::ReportCollection {
        if target_url.as_deref().unwrap_or_default().trim().is_empty()
            || expected_domain
                .as_deref()
                .unwrap_or_default()
                .trim()
                .is_empty()
            || entry_script
                .as_deref()
                .unwrap_or_default()
                .trim()
                .is_empty()
        {
            return Err(AnalyzeSceneError::new(
                "report_collection scene source must declare target url, expected domain, and entry script",
            ));
        }
    }

    Ok(SceneSourceAnalysis {
        scene_kind,
        tool_kind,
        bootstrap: BootstrapAnalysis {
            target_url,
            expected_domain,
        },
        collection_entry_script: entry_script,
        source_dir: source_dir.to_path_buf(),
    })
}

// Keep the original function signature so existing callers continue to work.
pub fn analyze_scene_source(source_dir: &Path) -> Result<SceneSourceAnalysis, AnalyzeSceneError> {
    analyze_scene_source_with_hint(source_dir, None)
}
```
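The kind-resolution rule in this step (hint first, then meta tag, then the `ReportCollection` default, with unknown meta values rejected) can be isolated into a small standalone function for reasoning or unit testing. The following is a minimal sketch; `resolve_scene_kind` and `parse_kind` are hypothetical names, and the `String` error stands in for `AnalyzeSceneError`.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SceneKind {
    ReportCollection,
    Monitoring,
}

fn parse_kind(s: &str) -> Option<SceneKind> {
    match s {
        "report_collection" => Some(SceneKind::ReportCollection),
        "monitoring" => Some(SceneKind::Monitoring),
        _ => None,
    }
}

/// Resolves the effective scene kind: hint > meta tag > default.
/// An unrecognized meta value is an error even when a hint is supplied,
/// mirroring the validation step in the function above.
pub fn resolve_scene_kind(
    hint: Option<SceneKind>,
    meta: Option<&str>,
) -> Result<SceneKind, String> {
    let meta_kind = match meta {
        Some(raw) => Some(
            parse_kind(raw).ok_or_else(|| format!("unknown sgclaw-scene-kind: {raw}"))?,
        ),
        None => None,
    };
    Ok(hint.or(meta_kind).unwrap_or(SceneKind::ReportCollection))
}
```

Keeping this rule separate from file I/O makes the four test cases from Step 1 expressible without fixtures.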
- [ ] **Step 5: Run the tests and confirm they pass**
Run:
```bash
cargo test --test scene_generator_test -- --nocapture
```
Expected: PASS
- [ ] **Step 6: Commit the analyzer changes**
Run:
```bash
git add src/generated_scene/analyzer.rs tests/scene_generator_test.rs
git commit -m "feat: add SceneKind::Monitoring and scene_kind_hint param to analyzer"
```
---
### Task 2: Update the generator to support multiple templates
**Files:**
- Modify: `src/generated_scene/generator.rs:1-204`
- [ ] **Step 1: Write a failing test: the generator emits the monitoring template**
Modify `tests/scene_generator_test.rs` and add this test:
```rust
#[test]
fn generator_emits_monitoring_template() {
    let output_root = temp_workspace("sgclaw-monitoring-generator");
    generate_scene_package(GenerateSceneRequest {
        source_dir: PathBuf::from("tests/fixtures/generated_scene/monitoring"),
        scene_id: "sample-monitor-scene".to_string(),
        scene_name: "示例监测场景".to_string(),
        scene_kind: Some(SceneKind::Monitoring),
        output_root: output_root.clone(),
        lessons_path: PathBuf::from("docs/superpowers/references/tq-lineloss-lessons-learned.toml"),
    })
    .unwrap();

    let skill_root = output_root.join("skills/sample-monitor-scene");
    assert!(skill_root.join("SKILL.toml").exists());
    assert!(skill_root.join("scene.toml").exists());

    let generated_manifest = fs::read_to_string(skill_root.join("scene.toml")).unwrap();
    assert!(generated_manifest.contains("category = \"monitoring\""));
    // Monitoring scenes should not have org/period resolvers.
    assert!(!generated_manifest.contains("resolver = \"dictionary_entity\""));
}
```
- [ ] **Step 2: Run the tests and confirm they fail**
Run:
```bash
cargo test --test scene_generator_test -- --nocapture
```
Expected: FAIL, because `GenerateSceneRequest` has no `scene_kind` field yet.
- [ ] **Step 3: Add a scene_kind field to GenerateSceneRequest**
Modify `src/generated_scene/generator.rs`:
```rust
#[derive(Debug, Clone)]
pub struct GenerateSceneRequest {
    pub source_dir: PathBuf,
    pub scene_id: String,
    pub scene_name: String,
    pub scene_kind: Option<SceneKind>, // new field
    pub output_root: PathBuf,
    pub lessons_path: PathBuf,
}
```
- [ ] **Step 4: Update generate_scene_package to use the new analyzer function**
Modify `src/generated_scene/generator.rs`:
```rust
use crate::generated_scene::analyzer::{analyze_scene_source_with_hint, AnalyzeSceneError, SceneKind};

pub fn generate_scene_package(
    request: GenerateSceneRequest,
) -> Result<PathBuf, GenerateSceneError> {
    let analysis = analyze_scene_source_with_hint(&request.source_dir, request.scene_kind.clone())?;
    // ... rest of the function unchanged
```
- [ ] **Step 5: Implement the monitoring template function**
Add to `src/generated_scene/generator.rs`:
```rust
fn scene_toml_monitoring(
    request: &GenerateSceneRequest,
    analysis: &SceneSourceAnalysis,
    tool_name: &str,
) -> String {
    let expected_domain = analysis.bootstrap.expected_domain.as_deref().unwrap_or("");
    let target_url = analysis.bootstrap.target_url.as_deref().unwrap_or("");
    // The TOML comments embedded in the template say that the params and
    // postprocess sections are left empty for the user to edit by hand.
    format!(
        "[scene]\nid = \"{}\"\nskill = \"{}\"\ntool = \"{}\"\nkind = \"browser_script\"\nversion = \"0.1.0\"\ncategory = \"monitoring\"\n\n[manifest]\nschema_version = \"1\"\n\n[bootstrap]\nexpected_domain = \"{}\"\ntarget_url = \"{}\"\nrequires_target_page = true\n\n[deterministic]\nsuffix = \"。。。\"\ninclude_keywords = [\"{}\"]\nexclude_keywords = []\n\n# 参数部分留空,用户手动编辑\n# [[params]]\n# name = \"xxx\"\n# resolver = \"literal_passthrough\"\n\n[artifact]\ntype = \"monitoring-status\"\nsuccess_status = [\"ok\", \"running\"]\nfailure_status = [\"error\", \"timeout\"]\n\n# 后处理留空,用户手动编辑\n",
        request.scene_id,
        request.scene_id,
        tool_name,
        expected_domain,
        target_url,
        request.scene_name
    )
}
```
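The template interpolates `scene_id`, `scene_name`, and the URLs directly into double-quoted TOML strings, so a value containing `"` or `\` would yield an invalid `scene.toml`. One way to guard against that, sketched here as an assumption rather than something the plan specifies, is a small escaping helper applied to each value before formatting. `toml_escape` is a hypothetical name.

```rust
/// Escapes a value for use inside a TOML basic (double-quoted) string.
/// Hypothetical helper; not part of the plan's generator code.
pub fn toml_escape(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for ch in value.chars() {
        match ch {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX escape form.
            c if c.is_control() => out.push_str(&format!("\\u{:04X}", c as u32)),
            c => out.push(c),
        }
    }
    out
}
```

A call site would then pass `toml_escape(&request.scene_name)` instead of `request.scene_name`. Non-ASCII text such as Chinese scene names passes through unchanged.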
- [ ] **Step 6: Make scene_toml route to the matching template**
Modify the `scene_toml` function in `src/generated_scene/generator.rs`:
```rust
fn scene_toml(
    request: &GenerateSceneRequest,
    analysis: &SceneSourceAnalysis,
    tool_name: &str,
) -> String {
    match analysis.scene_kind {
        SceneKind::ReportCollection => scene_toml_report_collection(request, analysis, tool_name),
        SceneKind::Monitoring => scene_toml_monitoring(request, analysis, tool_name),
    }
}

fn scene_toml_report_collection(
    request: &GenerateSceneRequest,
    analysis: &SceneSourceAnalysis,
    tool_name: &str,
) -> String {
    let expected_domain = analysis.bootstrap.expected_domain.as_deref().unwrap_or("");
    let target_url = analysis.bootstrap.target_url.as_deref().unwrap_or("");
    // The existing report_collection template code.
    format!(
        "[scene]\nid = \"{}\"\nskill = \"{}\"\ntool = \"{}\"\nkind = \"browser_script\"\nversion = \"0.1.0\"\ncategory = \"report_collection\"\n\n[manifest]\nschema_version = \"1\"\n\n[bootstrap]\nexpected_domain = \"{}\"\ntarget_url = \"{}\"\npage_title_keywords = [\"报表\", \"线损\"]\nrequires_target_page = true\n\n[deterministic]\nsuffix = \"。。。\"\ninclude_keywords = [\"{}\", \"报表\", \"统计\"]\nexclude_keywords = [\"知乎\"]\n\n[[params]]\nname = \"org\"\nresolver = \"dictionary_entity\"\nrequired = true\nprompt_missing = \"已命中{},但缺少供电单位。\"\nprompt_ambiguous = \"已命中{},但供电单位存在歧义。\"\n\n[params.resolver_config]\ndictionary_ref = \"references/org-dictionary.json\"\noutput_label_field = \"org_label\"\noutput_code_field = \"org_code\"\n\n[[params]]\nname = \"period\"\nresolver = \"month_week_period\"\nrequired = true\nprompt_missing = \"已命中{},但缺少统计周期。\"\nprompt_ambiguous = \"已命中{},但统计周期存在歧义。\"\n\n[artifact]\ntype = \"report-artifact\"\nsuccess_status = [\"ok\", \"partial\", \"empty\"]\nfailure_status = [\"blocked\", \"error\"]\n\n[postprocess]\nexporter = \"xlsx_report\"\nauto_open = \"excel\"\n",
        request.scene_id,
        request.scene_id,
        tool_name,
        expected_domain,
        target_url,
        request.scene_name,
        request.scene_name,
        request.scene_name,
        request.scene_name,
        request.scene_name
    )
}
```
- [ ] **Step 7: Create the monitoring fixture**
Create `tests/fixtures/generated_scene/monitoring/index.html`:
```html
<!doctype html>
<html>
<head>
  <meta charset="utf-8">
  <title>设备监测状态</title>
  <!-- Note: there is no sgclaw-scene-kind meta tag, so this fixture exercises the hint parameter -->
</head>
<body>
  <main>
    <h1>设备监测状态</h1>
    <div id="monitor-status">running</div>
  </main>
</body>
</html>
```
- [ ] **Step 8: Run the tests and confirm they pass**
Run:
```bash
cargo test --test scene_generator_test -- --nocapture
```
Expected: PASS
- [ ] **Step 9: Commit the generator changes**
Run:
```bash
git add src/generated_scene/generator.rs tests/scene_generator_test.rs tests/fixtures/generated_scene/monitoring
git commit -m "feat: add monitoring template support to generator"
```
---
### Task 3: Add the --scene-kind parameter to the CLI
**Files:**
- Modify: `src/bin/sg_scene_generate.rs:1-82`
- [ ] **Step 1: Add a scene_kind field to the CliArgs struct**
Modify `src/bin/sg_scene_generate.rs`:
```rust
use sgclaw::generated_scene::analyzer::SceneKind;
struct CliArgs {
source_dir: PathBuf,
scene_id: String,
scene_name: String,
scene_kind: Option<SceneKind>, // new
output_root: PathBuf,
lessons_path: PathBuf,
}
```
- [ ] **Step 2: Make parse_args handle the --scene-kind parameter**
Modify `src/bin/sg_scene_generate.rs`:
```rust
fn parse_args(args: impl Iterator<Item = String>) -> Result<CliArgs, String> {
let mut source_dir = None;
let mut scene_id = None;
let mut scene_name = None;
let mut scene_kind = None; // new
let mut output_root = None;
let mut lessons_path = None;
let mut pending_flag: Option<String> = None;
for arg in args {
if let Some(flag) = pending_flag.take() {
match flag.as_str() {
"--source-dir" => source_dir = Some(PathBuf::from(arg)),
"--scene-id" => scene_id = Some(arg),
"--scene-name" => scene_name = Some(arg),
"--scene-kind" => {
scene_kind = Some(SceneKind::from_str(&arg).ok_or_else(|| {
format!("invalid scene-kind: {}, expected report_collection or monitoring", arg)
})?);
}
"--output-root" => output_root = Some(PathBuf::from(arg)),
"--lessons" => lessons_path = Some(PathBuf::from(arg)),
_ => return Err(format!("unsupported argument {flag}")),
}
continue;
}
match arg.as_str() {
"--source-dir" | "--scene-id" | "--scene-name" | "--scene-kind" | "--output-root" | "--lessons" => {
pending_flag = Some(arg);
}
"--help" | "-h" => return Err(usage()),
_ => return Err(format!("unsupported argument {arg}\n{}", usage())),
}
}
if let Some(flag) = pending_flag {
return Err(format!("missing value for {flag}"));
}
Ok(CliArgs {
source_dir: source_dir.ok_or_else(usage)?,
scene_id: scene_id.ok_or_else(usage)?,
scene_name: scene_name.ok_or_else(usage)?,
scene_kind, // optional; stays None when the flag is absent
output_root: output_root.ok_or_else(usage)?,
lessons_path: lessons_path.ok_or_else(usage)?,
})
}
```
- [ ] **Step 3: Pass scene_kind through in the run function**
Modify `src/bin/sg_scene_generate.rs`:
```rust
fn run() -> Result<(), String> {
let args = parse_args(env::args().skip(1))?;
let skill_root = generate_scene_package(GenerateSceneRequest {
source_dir: args.source_dir,
scene_id: args.scene_id,
scene_name: args.scene_name,
scene_kind: args.scene_kind, // new
output_root: args.output_root,
lessons_path: args.lessons_path,
})
.map_err(|err| err.to_string())?;
println!("generated scene package: {}", skill_root.display());
Ok(())
}
```
- [ ] **Step 4: Update the usage function**
Modify `src/bin/sg_scene_generate.rs`:
```rust
fn usage() -> String {
"usage: sg_scene_generate --source-dir <scenario-dir> --scene-id <scene-id> --scene-name <display-name> [--scene-kind <report_collection|monitoring>] --output-root <skill-staging-root> --lessons <lessons-toml>".to_string()
}
```
- [ ] **Step 5: Build the binary and confirm it compiles**
Run:
```bash
cargo build --bin sg_scene_generate
```
Expected: the build succeeds.
- [ ] **Step 6: Test the CLI manually**
Run:
```bash
cargo run --bin sg_scene_generate -- --source-dir tests/fixtures/generated_scene/monitoring --scene-id test-monitor --scene-name "测试监测" --scene-kind monitoring --output-root ./tmp_test --lessons docs/superpowers/references/tq-lineloss-lessons-learned.toml
```
Expected: generation succeeds and scene.toml contains `category = "monitoring"`.
- [ ] **Step 7: Commit the CLI changes**
Run:
```bash
git add src/bin/sg_scene_generate.rs
git commit -m "feat: add --scene-kind CLI param to sg_scene_generate"
```
---
### Task 4: Pass sceneKind through the Node.js generator-runner
**Files:**
- Modify: `frontend/scene-generator/generator-runner.js:1-175`
- [ ] **Step 1: Update the runGenerator parameters and the args array**
Modify `frontend/scene-generator/generator-runner.js`:
```javascript
function runGenerator(params, sseWriter, projectRoot) {
const { sourceDir, sceneId, sceneName, sceneKind, outputRoot, lessons } = params;
const normalize = (p) => p.replace(/\\/g, "/");
const args = [
"run",
"--bin",
"sg_scene_generate",
"--",
"--source-dir",
normalize(sourceDir),
"--scene-id",
sceneId,
"--scene-name",
sceneName,
];
// Add the flag only when sceneKind is explicitly set (otherwise the default, report_collection, applies)
if (sceneKind) {
args.push("--scene-kind", sceneKind);
}
args.push(
"--output-root",
normalize(outputRoot),
"--lessons",
normalize(lessons)
);
// ... the rest of the function is unchanged
```
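The argument assembly above can be isolated into a small pure function, which makes the optional flag easy to check without spawning `cargo`. This is a minimal sketch, not the project's code: `buildGeneratorArgs` is a hypothetical helper name, and it assumes the same flag order as the snippet above.

```javascript
// Hypothetical helper (not in generator-runner.js): builds the `cargo run`
// argument list for sg_scene_generate from the request parameters.
function buildGeneratorArgs({ sourceDir, sceneId, sceneName, sceneKind, outputRoot, lessons }) {
  // Windows paths are normalized to forward slashes, as in runGenerator.
  const normalize = (p) => p.replace(/\\/g, "/");
  const args = [
    "run", "--bin", "sg_scene_generate", "--",
    "--source-dir", normalize(sourceDir),
    "--scene-id", sceneId,
    "--scene-name", sceneName,
  ];
  // The flag is omitted entirely when no scene kind was chosen.
  if (sceneKind) {
    args.push("--scene-kind", sceneKind);
  }
  args.push("--output-root", normalize(outputRoot), "--lessons", normalize(lessons));
  return args;
}
```

Keeping the flag out of the list when `sceneKind` is empty means older callers that never send the field still produce the same command line as before.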
- [ ] **Step 2: Commit the generator-runner changes**
Run:
```bash
git add frontend/scene-generator/generator-runner.js
git commit -m "feat: add sceneKind param to generator-runner"
```
---
### Task 5: Pass sceneKind through the Node.js server
**Files:**
- Modify: `frontend/scene-generator/server.js:119-154`
- [ ] **Step 1: Destructure sceneKind in handleGenerate**
Modify `frontend/scene-generator/server.js`:
```javascript
async function handleGenerate(req, res) {
let body;
try {
body = await parseBody(req);
} catch {
res.writeHead(400, { "Content-Type": "application/json" });
res.end(JSON.stringify({ error: "Invalid JSON body" }));
return;
}
const { sourceDir, sceneId, sceneName, sceneKind, outputRoot, lessons } = body;
if (!sourceDir || !sceneId || !sceneName || !outputRoot || !lessons) {
res.writeHead(400, { "Content-Type": "application/json" });
res.end(
JSON.stringify({
error:
"All fields required: sourceDir, sceneId, sceneName, outputRoot, lessons",
})
);
return;
}
const sseWriter = initSSE(res);
try {
await runGenerator(
{ sourceDir, sceneId, sceneName, sceneKind, outputRoot, lessons }, // now includes sceneKind
sseWriter,
config.projectRoot
);
} catch (err) {
writeSSE(sseWriter, "error", { message: `Server error: ${err.message}` });
}
sseWriter.end();
}
```
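The handler above forwards `sceneKind` unchecked, so an unknown value is only rejected later by the CLI. A possible early check is sketched below; `validateGenerateBody` and `ALLOWED_SCENE_KINDS` are hypothetical names, and the allowed values are taken from the CLI usage string.

```javascript
// Hypothetical request validator (not in server.js). The two allowed kinds
// mirror the values accepted by the --scene-kind CLI flag.
const ALLOWED_SCENE_KINDS = ["report_collection", "monitoring"];

function validateGenerateBody(body) {
  const required = ["sourceDir", "sceneId", "sceneName", "outputRoot", "lessons"];
  const missing = required.filter((key) => !body || !body[key]);
  if (missing.length > 0) {
    return { ok: false, error: `Missing fields: ${missing.join(", ")}` };
  }
  // sceneKind stays optional; only a non-empty, unknown value is an error.
  const kind = body.sceneKind;
  if (kind && !ALLOWED_SCENE_KINDS.includes(kind)) {
    return { ok: false, error: `Invalid sceneKind: ${kind}` };
  }
  return { ok: true };
}
```

Returning a 400 from this check would surface a typo in `sceneKind` before the SSE stream is opened, rather than as a CLI failure mid-stream.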
- [ ] **Step 2: Commit the server changes**
Run:
```bash
git add frontend/scene-generator/server.js
git commit -m "feat: pass sceneKind from /generate request to generator"
```
---
### Task 6: Add a scene type dropdown to the Web UI
**Files:**
- Modify: `frontend/scene-generator/sg_scene_generator.html`
- [ ] **Step 1: Add the scene type dropdown to the HTML**
In the form area of `sg_scene_generator.html`, add the following after the scene-name input:
```html
<div class="form-group">
<label for="sceneKind">场景类型</label>
<select id="sceneKind">
<option value="report_collection" selected>报表收集类</option>
<option value="monitoring">监测类</option>
</select>
<span class="hint">报表类:查询数据导出 Excel;监测类:定时检查状态</span>
</div>
```
- [ ] **Step 2: Make generate() read sceneKind**
Modify the `generate()` function in `sg_scene_generator.html`:
```javascript
async function generate() {
const sourceDir = document.getElementById('sourceDir').value.trim();
const sceneId = document.getElementById('sceneId').value.trim();
const sceneName = document.getElementById('sceneName').value.trim();
const sceneKind = document.getElementById('sceneKind').value; // new
const outputRoot = document.getElementById('outputRoot').value.trim();
const lessons = document.getElementById('lessons').value.trim();
// ... validation logic unchanged
const response = await fetch('/generate', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
sourceDir,
sceneId,
sceneName,
sceneKind, // new
outputRoot,
lessons
})
});
// ... the rest of the function is unchanged
}
```
- [ ] **Step 3: Commit the HTML changes**
Run:
```bash
git add frontend/scene-generator/sg_scene_generator.html
git commit -m "feat: add sceneKind dropdown to Web UI"
```
---
### Task 7: End-to-end testing and final verification
**Files:**
- Verify only
- [ ] **Step 1: Run all Rust tests**
Run:
```bash
cargo test --test scene_generator_test -- --nocapture
cargo test --test scene_registry_test -- --nocapture
```
Expected: PASS
- [ ] **Step 2: Restart the Node.js server**
Run:
```bash
cd frontend/scene-generator && node server.js
```
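Expected: the server starts successfully.
- [ ] **Step 3: Manually test a report-type scene in the Web UI**
1. Open `http://127.0.0.1:3210/`
2. Enter the scene path `D:\desk\智能体资料\场景\营销2.0零度户报表数据生成`
3. Set the scene type to "报表收集类" (report collection)
4. Click "分析" (Analyze) and wait for the LLM to extract scene-id/scene-name
5. Click "生成 Skill" (Generate Skill) and wait for generation to finish
6. Inspect the generated files in the output directory
Expected: generation succeeds and scene.toml contains `category = "report_collection"`.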
- [ ] **Step 4: Commit the final verification**
Run:
```bash
git add -A
git status
```
Confirm that no uncommitted changes remain.
---
## Verification Checklist
### Rust layer
```bash
cargo test --test scene_generator_test -- --nocapture
cargo build --bin sg_scene_generate
```
Expected:
- `analyze_scene_source_with_hint` accepts an optional `SceneKind` parameter
- `GenerateSceneRequest` includes a `scene_kind` field
- The generator emits a different template for each scene kind
- The CLI supports the `--scene-kind` parameter
### Node.js layer
```bash
node frontend/scene-generator/server.js
```
Expected:
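- The `/generate` endpoint accepts a `sceneKind` parameter
- `runGenerator` passes the parameter through to the CLI correctly
### Web UI layer
Manual checks:
- The scene type dropdown renders correctly
- Selecting the report type generates `category = "report_collection"`
- Selecting the monitoring type generates `category = "monitoring"`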
---
## Notes For The Engineer
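- The companion spec is `docs/superpowers/specs/2026-04-16-multi-scene-kind-generator-design.md`
- A user-selected `scene_kind_hint` takes precedence over the meta tag
- The monitoring template is a simplified version; users must edit the parameters section by hand
- V1 does not change the runtime validation logic in `registry.rs`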
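
The precedence rule in these notes (an explicit user choice wins over the page's `sgclaw-scene-kind` meta tag) can be illustrated in a few lines. The real logic lives in the Rust analyzer; this is only a JavaScript sketch with hypothetical names, and the final fallback to `report_collection` is an assumption based on the default mentioned in the generator-runner comment.

```javascript
// Illustration only: the actual resolution happens in the Rust analyzer.
// DEFAULT_SCENE_KIND is an assumed fallback, not confirmed by the plan.
const DEFAULT_SCENE_KIND = "report_collection";

function resolveSceneKind(userHint, metaTagKind) {
  // 1. explicit user hint, 2. meta tag found in the page, 3. assumed default
  return userHint || metaTagKind || DEFAULT_SCENE_KIND;
}
```

With this ordering, a fixture that carries no meta tag, like the monitoring fixture used in the tests, is classified purely by the hint.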

@@ -0,0 +1,839 @@
# Enhanced LLM Extraction Schema - Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Enhance the LLM extraction schema to support multi-mode business logic, enabling automatic generation of scripts like tq-lineloss-report that switch between month/week modes.
**Architecture:** Extend existing `SceneInfoJson` in Rust with new mode-related structs. Enhance LLM prompt in `llm-client.js` to detect multi-mode patterns. Add new template function `browser_script_with_modes()` for generating mode-aware JavaScript.
**Tech Stack:** Rust (serde_json), JavaScript (Node.js), LLM API
---
## File Structure
| File | Action | Purpose |
|------|--------|---------|
| `src/generated_scene/generator.rs` | Modify | Add mode-related schema structs and multi-mode template |
| `frontend/scene-generator/llm-client.js` | Modify | Enhance DEEP_SYSTEM_PROMPT for mode detection |
| `frontend/scene-generator/server.js` | Modify | Handle enhanced schema in deep analysis endpoint |
---
### Task 1: Add Rust Schema Structs for Multi-Mode Support
**Files:**
- Modify: `src/generated_scene/generator.rs` (after line 21)
**Goal:** Add new Rust structs to parse the enhanced JSON schema with modes support.
- [ ] **Step 1: Add ModeConditionJson struct**
Add after `ApiEndpointJson` struct (line 21):
```rust
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct ModeConditionJson {
pub field: String,
#[serde(default = "default_equals")]
pub operator: String,
pub value: serde_json::Value,
}
fn default_equals() -> String {
"equals".to_string()
}
```
- [ ] **Step 2: Add NormalizeRulesJson struct**
Add after `ModeConditionJson`:
```rust
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct NormalizeRulesJson {
#[serde(rename = "type", default = "default_validate_all")]
pub rules_type: String,
#[serde(rename = "requiredFields", default)]
pub required_fields: Vec<String>,
#[serde(rename = "filterNull", default = "default_true")]
pub filter_null: bool,
}
fn default_validate_all() -> String {
"validate_all_columns".to_string()
}
fn default_true() -> bool {
true
}
```
- [ ] **Step 3: Add ModeConfigJson struct**
Add after `NormalizeRulesJson`:
```rust
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct ModeConfigJson {
pub name: String,
#[serde(default)]
pub label: Option<String>,
pub condition: ModeConditionJson,
#[serde(rename = "apiEndpoint")]
pub api_endpoint: ApiEndpointEnhancedJson,
#[serde(rename = "columnDefs", default)]
pub column_defs: Vec<(String, String)>,
#[serde(rename = "requestTemplate", default)]
pub request_template: Option<serde_json::Value>,
#[serde(rename = "normalizeRules", default)]
pub normalize_rules: Option<NormalizeRulesJson>,
#[serde(rename = "responsePath", default)]
pub response_path: Option<String>,
}
```
- [ ] **Step 4: Add ApiEndpointEnhancedJson struct**
Add before `ModeConfigJson`:
```rust
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct ApiEndpointEnhancedJson {
pub name: String,
pub url: String,
#[serde(default)]
pub method: String,
#[serde(rename = "contentType", default)]
pub content_type: Option<String>,
#[serde(default)]
pub description: Option<String>,
}
```
- [ ] **Step 5: Enhance SceneInfoJson struct**
Modify `SceneInfoJson` to add mode fields (add after line 54, before the closing brace):
```rust
// Multi-mode support (new fields)
#[serde(default)]
pub modes: Vec<ModeConfigJson>,
#[serde(rename = "defaultMode", default)]
pub default_mode: Option<String>,
#[serde(rename = "modeSwitchField", default)]
pub mode_switch_field: Option<String>,
```
- [ ] **Step 6: Verify the changes**
Run `cargo check` to verify:
```bash
cargo check
```
Expected: No compilation errors.
- [ ] **Step 7: Commit**
```bash
git add src/generated_scene/generator.rs
git commit -m "feat(generator): add multi-mode schema structs for enhanced LLM extraction"
```
---
### Task 2: Enhance LLM Extraction Prompt
**Files:**
- Modify: `frontend/scene-generator/llm-client.js` (lines 16-46)
**Goal:** Enhance `DEEP_SYSTEM_PROMPT` to instruct LLM to detect multi-mode business logic.
- [ ] **Step 1: Replace DEEP_SYSTEM_PROMPT with enhanced version**
Replace the entire `DEEP_SYSTEM_PROMPT` constant (lines 16-46):
```javascript
const DEEP_SYSTEM_PROMPT = `你是一个场景代码分析专家。分析场景源码,提取关键业务信息。
## 分析目标
1. **多模式识别** (关键):
- 查找条件分支逻辑 (if/switch) 中基于 period_mode、reportType 等字段的分支
- 识别不同分支对应的 API 端点、列定义、请求格式
- 如果发现多模式,使用 modes 数组格式输出
2. **API 端点**: 识别所有 HTTP 请求地址 (URL, method, contentType, 用途)
- 从 $.ajax/fetch 调用中提取 contentType
- 检测请求格式: application/json 或 application/x-www-form-urlencoded
3. **请求模板**: 识别请求参数结构
- 提取硬编码的分页参数 (rows, page, sidx, sord)
- 识别模板变量如 \${args.org_code}
4. **数据归一化**: 识别数据处理规则
- 查找数据渲染/表格填充逻辑
- 检测数据验证条件 (哪些字段不能为空)
5. **响应路径**: 识别数据在响应中的位置
- 如 response.content 或 response.data
## 输出格式
### 单模式场景 (无 modes 数组):
{
"sceneId": "string",
"sceneName": "string",
"sceneKind": "report_collection | monitoring",
"expectedDomain": "string",
"targetUrl": "string",
"apiEndpoints": [{"name": "", "url": "", "method": "POST"}],
"staticParams": {"key": "value"},
"columnDefs": [["fieldName", "中文列名"]]
}
### 多模式场景 (有 modes 数组):
{
"sceneId": "tq-lineloss-report",
"sceneName": "台区线损报表",
"sceneKind": "report_collection",
"modes": [
{
"name": "month",
"label": "月度报表",
"condition": {"field": "period_mode", "operator": "equals", "value": "month"},
"apiEndpoint": {
"name": "月度线损查询",
"url": "http://...",
"method": "POST",
"contentType": "application/x-www-form-urlencoded"
},
"columnDefs": [["ORG_NAME", "供电单位"], ...],
"requestTemplate": {"orgno": "\${args.org_code}", "rows": 1000, "page": 1},
"normalizeRules": {"type": "validate_all_columns", "filterNull": true},
"responsePath": "content"
},
{
"name": "week",
"label": "周报表",
"condition": {"field": "period_mode", "operator": "equals", "value": "week"},
"apiEndpoint": {...},
"columnDefs": [...],
...
}
],
"defaultMode": "month",
"modeSwitchField": "period_mode"
}
**重要**: 如果发现代码中有基于 period_mode 的 if/switch 分支,必须使用多模式格式输出!`;
```
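Because the prompt allows two output shapes, it can help to sanity-check the LLM's JSON before handing it to the generator. The sketch below is an assumption about how such a check might look; `checkSceneInfo` is a hypothetical helper, and the field names are the ones used in the prompt's examples.

```javascript
// Hypothetical validator for the LLM's scene-info JSON (not part of llm-client.js).
function checkSceneInfo(info) {
  const problems = [];
  const modes = Array.isArray(info && info.modes) ? info.modes : [];
  if (modes.length === 0) {
    // Single-mode format: no modes array, nothing mode-specific to verify.
    return { kind: "single", problems };
  }
  modes.forEach((mode, i) => {
    if (!mode.name) problems.push(`modes[${i}].name is missing`);
    if (!mode.condition || !mode.condition.field) {
      problems.push(`modes[${i}].condition.field is missing`);
    }
    if (!mode.apiEndpoint || !mode.apiEndpoint.url) {
      problems.push(`modes[${i}].apiEndpoint.url is missing`);
    }
  });
  // defaultMode, when present, should name one of the declared modes.
  const names = modes.map((m) => m.name);
  if (info.defaultMode && !names.includes(info.defaultMode)) {
    problems.push(`defaultMode "${info.defaultMode}" does not match any mode name`);
  }
  return { kind: "multi", problems };
}
```

A non-empty `problems` list could be shown in the preview panel so the user can re-run the analysis instead of generating a broken script.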
- [ ] **Step 2: Verify JavaScript syntax**
```bash
node --check frontend/scene-generator/llm-client.js
```
Expected: No syntax errors.
- [ ] **Step 3: Commit**
```bash
git add frontend/scene-generator/llm-client.js
git commit -m "feat(llm): enhance DEEP_SYSTEM_PROMPT for multi-mode detection"
```
---
### Task 3: Implement Multi-Mode Template in Rust
**Files:**
- Modify: `src/generated_scene/generator.rs` (add new function after `browser_script_with_business_logic`)
**Goal:** Add a new template function that generates mode-aware JavaScript.
- [ ] **Step 1: Add browser_script_with_modes function**
Add after `browser_script_with_business_logic` function (after line 476):
```rust
fn browser_script_with_modes(scene_id: &str, scene_info: &SceneInfoJson) -> String {
let modes_json = serde_json::to_string_pretty(&scene_info.modes).unwrap_or_else(|_| "[]".to_string());
let default_mode = scene_info.default_mode.as_deref().unwrap_or("month");
let mode_switch_field = scene_info.mode_switch_field.as_deref().unwrap_or("period_mode");
format!(r#"const REPORT_NAME = '{scene_id}';
const MODES = {modes_json};
const DEFAULT_MODE = '{default_mode}';
const MODE_SWITCH_FIELD = '{mode_switch_field}';
function normalizePayload(payload) {{
if (typeof payload === 'string') {{
try {{ return JSON.parse(payload); }} catch (_) {{ return {{}}; }}
}}
return payload && typeof payload === 'object' ? payload : {{}};
}}
function validateArgs(args) {{
const errors = [];
if (!args.org_code) errors.push('Missing org_code');
if (!args.period_value) errors.push('Missing period_value');
return {{ valid: errors.length === 0, errors }};
}}
function detectMode(args) {{
if (MODES.length === 0) throw new Error('MODES is empty: no mode configuration was generated');
const modeValue = args[MODE_SWITCH_FIELD] || DEFAULT_MODE;
return MODES.find(m => m.condition.value === modeValue) || MODES[0];
}}
function buildModeRequest(args, mode) {{
const endpoint = mode.apiEndpoint;
const template = mode.requestTemplate || {{}};
const contentType = endpoint.contentType || 'application/json';
const url = endpoint.url;
const method = endpoint.method || 'POST';
let body;
if (contentType === 'application/x-www-form-urlencoded') {{
body = {{ ...template }};
for (const [key, value] of Object.entries(body)) {{
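if (typeof value === 'string' && value.startsWith('${{') && value.endsWith('}}')) {{
// Resolve only args.fieldName or args['fieldName']; never eval() template text
const match = value.slice(2, -1).trim().match(/^args(?:\.([A-Za-z_$][\w$]*)|\[['"]([^'"]+)['"]\])$/);
const field = match ? (match[1] || match[2]) : null;
body[key] = field ? args[field] : args.org_code;
}}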
}}
body.orgno = args.org_code;
}} else {{
body = JSON.stringify({{ ...template, ...args }});
}}
return {{ url, method, headers: {{ 'Content-Type': contentType }}, body }};
}}
function normalizeModeRows(data, mode) {{
const rules = mode.normalizeRules || {{ type: 'validate_all_columns', filterNull: true }};
const columns = mode.columnDefs.map(([key]) => key);
if (!Array.isArray(data)) return [];
return data.map(row => {{
const result = {{}};
for (const key of columns) {{
const v = row[key];
result[key] = (v === null || v === undefined || v === '') ? '' : String(v).trim();
}}
return result;
}}).filter(row => {{
if (!rules.filterNull) return true;
if (rules.type === 'validate_required' && rules.requiredFields) {{
return rules.requiredFields.every(f => row[f] !== '');
}}
return columns.every(k => row[k] !== '');
}});
}}
function determineArtifactStatus({{ blockedReason = '', fatalError = '', reasons = [], rows = [] }}) {{
if (blockedReason) return 'blocked';
if (fatalError) return 'error';
if (reasons.length > 0) return 'partial';
if (!rows.length) return 'empty';
return 'ok';
}}
function buildArtifact({{ status, blockedReason = '', fatalError = '', reasons = [], rows = [], args, columnDefs, columns }}) {{
return {{
type: 'report-artifact',
report_name: REPORT_NAME,
status: status || determineArtifactStatus({{ blockedReason, fatalError, reasons, rows }}),
period: {{
mode: args.period_mode,
mode_code: args.period_mode_code,
value: args.period_value,
payload: normalizePayload(args.period_payload)
}},
org: {{ label: args.org_label, code: args.org_code }},
column_defs: columnDefs || [],
columns: columns || [],
rows,
counts: {{ detail_rows: rows.length }},
partial_reasons: reasons.filter(r => r && !r.startsWith('api_') && !r.startsWith('validation_')),
reasons: Array.from(new Set(reasons.filter(Boolean)))
}};
}}
const defaultDeps = {{
validatePageContext(args) {{
const host = (globalThis.location?.hostname || '').trim();
const expected = (args.expected_domain || '').trim();
if (!host) return {{ ok: false, reason: 'page_context_unavailable' }};
if (host !== expected) return {{ ok: false, reason: 'page_context_mismatch' }};
return {{ ok: true }};
}},
async queryModeData(args, mode) {{
const endpoint = mode.apiEndpoint;
const request = buildModeRequest(args, mode);
const contentType = endpoint.contentType || 'application/json';
// Prefer jQuery
if (typeof $ !== 'undefined' && typeof $.ajax === 'function') {{
return new Promise((resolve, reject) => {{
$.ajax({{
url: request.url,
type: request.method,
data: request.body,
contentType: contentType,
dataType: 'json',
success: resolve,
error: (xhr, status, err) => reject(new Error(
`API failed (${{xhr.status}}): ${{err}} | body=${{(xhr.responseText || '').substring(0, 200)}}`
))
}});
}});
}}
// Fallback: fetch
if (typeof fetch === 'function') {{
const response = await fetch(request.url, {{
method: request.method,
headers: request.headers,
body: request.method !== 'GET' ? (typeof request.body === 'string' ? request.body : new URLSearchParams(request.body).toString()) : undefined
}});
if (!response.ok) {{
const text = await response.text().catch(() => '');
throw new Error(`HTTP ${{response.status}}: ${{text.substring(0, 200)}}`);
}}
return response.json();
}}
throw new Error('No HTTP client available (need jQuery or fetch)');
}}
}};
async function buildBrowserEntrypointResult(args, deps = defaultDeps) {{
// 1. Parameter validation
const validation = validateArgs(args);
if (!validation.valid) {{
const mode = detectMode(args);
return buildArtifact({{
status: 'blocked',
blockedReason: 'validation_failed',
reasons: validation.errors,
rows: [],
args,
columnDefs: mode.columnDefs,
columns: mode.columnDefs.map(([key]) => key)
}});
}}
// 2. Page context validation
const pageValidation = typeof deps.validatePageContext === 'function'
? deps.validatePageContext(args)
: {{ ok: true }};
if (!pageValidation?.ok) {{
const mode = detectMode(args);
return buildArtifact({{
status: 'blocked',
blockedReason: pageValidation?.reason || 'page_context_mismatch',
reasons: [pageValidation?.reason || 'page_context_mismatch'],
rows: [],
args,
columnDefs: mode.columnDefs,
columns: mode.columnDefs.map(([key]) => key)
}});
}}
// 3. Detect mode
const mode = detectMode(args);
// 4. Data fetching
const reasons = [];
let rawData = null;
try {{
rawData = await (deps.queryModeData ? deps.queryModeData(args, mode) : Promise.resolve([]));
}} catch (error) {{
return buildArtifact({{
status: 'error',
fatalError: error.message,
reasons: ['api_query_failed:' + error.message],
rows: [],
args,
columnDefs: mode.columnDefs,
columns: mode.columnDefs.map(([key]) => key)
}});
}}
// 5. Extract response data
const responsePath = mode.responsePath || '';
let data = rawData;
if (responsePath && rawData) {{
data = rawData[responsePath] || rawData;
}}
// 6. Row normalization
const rows = normalizeModeRows(data, mode);
if (rows.length === 0 && Array.isArray(data) && data.length > 0) {{
reasons.push('row_normalization_partial');
}}
// 7. Build artifact
return buildArtifact({{
reasons,
rows,
args,
columnDefs: mode.columnDefs,
columns: mode.columnDefs.map(([key]) => key)
}});
}}
if (typeof module !== 'undefined') {{
module.exports = {{ buildBrowserEntrypointResult, normalizePayload, validateArgs, detectMode, buildModeRequest, normalizeModeRows, buildArtifact, determineArtifactStatus, MODES, REPORT_NAME }};
}}
if (typeof args !== 'undefined') {{
return buildBrowserEntrypointResult(args);
}}
"#, scene_id = scene_id, modes_json = modes_json, default_mode = default_mode, mode_switch_field = mode_switch_field)
}
```
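The `requestTemplate` placeholders such as `${args.org_code}` can be resolved without evaluating template text as code. The following is a minimal sketch, assuming only the `args.fieldName` and `args['fieldName']` forms need support; `resolveTemplateValue` and `resolveTemplate` are illustrative names, and unknown fields resolve to an empty string here by choice.

```javascript
// Sketch of an eval-free placeholder resolver. Only `${args.name}` and
// `${args['name']}` are recognized; anything else is returned unchanged.
function resolveTemplateValue(value, args) {
  if (typeof value !== "string") return value; // numbers etc. pass through
  const match = value.match(
    /^\$\{\s*args(?:\.([A-Za-z_$][\w$]*)|\[\s*['"]([^'"]+)['"]\s*\])\s*\}$/
  );
  if (!match) return value; // not a supported placeholder: keep the literal
  const field = match[1] !== undefined ? match[1] : match[2];
  // Own properties only, so inherited names like "constructor" never leak.
  return Object.prototype.hasOwnProperty.call(args, field) ? args[field] : "";
}

function resolveTemplate(template, args) {
  return Object.fromEntries(
    Object.entries(template || {}).map(([key, value]) => [key, resolveTemplateValue(value, args)])
  );
}
```

Since the template comes from LLM output, treating it as data rather than code keeps a malformed or hostile placeholder from running in the page.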
- [ ] **Step 2: Modify browser_script function to use multi-mode template**
Replace the `browser_script` function (lines 270-277):
```rust
fn browser_script(scene_id: &str, analysis: &SceneSourceAnalysis, scene_info: Option<&SceneInfoJson>) -> String {
match scene_info {
Some(info) if !info.modes.is_empty() => {
browser_script_with_modes(scene_id, info)
}
Some(info) if !info.api_endpoints.is_empty() || !info.column_defs.is_empty() => {
browser_script_with_business_logic(scene_id, analysis, info)
}
_ => browser_script_skeleton(scene_id, analysis),
}
}
```
- [ ] **Step 3: Verify compilation**
```bash
cargo check
```
Expected: No errors.
- [ ] **Step 4: Commit**
```bash
git add src/generated_scene/generator.rs
git commit -m "feat(generator): add multi-mode template for mode-aware script generation"
```
---
### Task 4: Add Unit Tests for Schema Parsing
**Files:**
- Create: `src/generated_scene/generator_test.rs`
**Goal:** Add tests to verify the enhanced schema parses correctly.
- [ ] **Step 1: Create test file**
Create `src/generated_scene/generator_test.rs`:
```rust
#[cfg(test)]
mod tests {
use super::super::*; // `super` is generator_test; the schema structs live one level up, in generator
#[test]
fn test_parse_mode_condition() {
let json = r#"{"field": "period_mode", "operator": "equals", "value": "month"}"#;
let condition: ModeConditionJson = serde_json::from_str(json).unwrap();
assert_eq!(condition.field, "period_mode");
assert_eq!(condition.operator, "equals");
assert_eq!(condition.value.as_str().unwrap(), "month");
}
#[test]
fn test_parse_normalize_rules() {
let json = r#"{"type": "validate_required", "requiredFields": ["ORG_NAME"], "filterNull": true}"#;
let rules: NormalizeRulesJson = serde_json::from_str(json).unwrap();
assert_eq!(rules.rules_type, "validate_required");
assert_eq!(rules.required_fields, vec!["ORG_NAME"]);
assert!(rules.filter_null);
}
#[test]
fn test_parse_mode_config() {
let json = r#"{
"name": "month",
"label": "月度报表",
"condition": {"field": "period_mode", "operator": "equals", "value": "month"},
"apiEndpoint": {"name": "test", "url": "http://example.com", "method": "POST"},
"columnDefs": [["ORG_NAME", "供电单位"]],
"responsePath": "content"
}"#;
let mode: ModeConfigJson = serde_json::from_str(json).unwrap();
assert_eq!(mode.name, "month");
assert_eq!(mode.column_defs.len(), 1);
assert_eq!(mode.response_path, Some("content".to_string()));
}
#[test]
fn test_parse_scene_info_with_modes() {
let json = r#"{
"sceneId": "test-report",
"sceneName": "测试报表",
"sceneKind": "report_collection",
"modes": [
{"name": "month", "condition": {"field": "period_mode", "value": "month"}, "apiEndpoint": {"name": "m", "url": "http://a"}, "columnDefs": []},
{"name": "week", "condition": {"field": "period_mode", "value": "week"}, "apiEndpoint": {"name": "w", "url": "http://b"}, "columnDefs": []}
],
"defaultMode": "month",
"modeSwitchField": "period_mode"
}"#;
let info: SceneInfoJson = serde_json::from_str(json).unwrap();
assert_eq!(info.scene_id, "test-report");
assert_eq!(info.modes.len(), 2);
assert_eq!(info.default_mode, Some("month".to_string()));
}
#[test]
fn test_parse_scene_info_backward_compatible() {
// Old format without modes should still work
let json = r#"{
"sceneId": "old-report",
"sceneName": "旧格式报表",
"apiEndpoints": [{"name": "test", "url": "http://example.com"}],
"columnDefs": [["col1", "列1"]]
}"#;
let info: SceneInfoJson = serde_json::from_str(json).unwrap();
assert_eq!(info.scene_id, "old-report");
assert!(info.modes.is_empty());
assert_eq!(info.api_endpoints.len(), 1);
}
}
```
- [ ] **Step 2: Add test module to generator.rs**
Add at the end of `generator.rs`:
```rust
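#[cfg(test)]
#[path = "generator_test.rs"] // sibling file; without this, rustc looks for generator/generator_test.rs
mod generator_test;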
```
- [ ] **Step 3: Run tests**
```bash
cargo test --lib generator
```
Expected: All tests pass.
- [ ] **Step 4: Commit**
```bash
git add src/generated_scene/generator_test.rs src/generated_scene/generator.rs
git commit -m "test(generator): add unit tests for multi-mode schema parsing"
```
---
### Task 5: Integration Test with tq-lineloss-report
**Files:**
- Test: Generate skill from tq-lineloss-report source
**Goal:** Verify the enhanced template can generate a multi-mode script.
- [ ] **Step 1: Build the project**
```bash
cargo build --release
```
Expected: Build succeeds.
- [ ] **Step 2: Create a test multi-mode scene-info JSON**
Create a test JSON file to simulate LLM output:
```json
{
"sceneId": "tq-lineloss-test",
"sceneName": "台区线损测试报表",
"sceneKind": "report_collection",
"modes": [
{
"name": "month",
"label": "月度报表",
"condition": {"field": "period_mode", "operator": "equals", "value": "month"},
"apiEndpoint": {
"name": "月度线损查询",
"url": "http://20.76.57.61:18080/gsllys/fourVerEightHor/fourVerEightHorLinelossRateList",
"method": "POST",
"contentType": "application/x-www-form-urlencoded"
},
"columnDefs": [["ORG_NAME", "供电单位"], ["YGDL", "供电量"], ["YYDL", "售电量"]],
"requestTemplate": {"orgno": "${args.org_code}", "rows": 1000, "page": 1, "sidx": "ORG_NO", "sord": "asc"},
"normalizeRules": {"type": "validate_all_columns", "filterNull": true},
"responsePath": "content"
},
{
"name": "week",
"label": "周报表",
"condition": {"field": "period_mode", "operator": "equals", "value": "week"},
"apiEndpoint": {
"name": "周线损查询",
"url": "http://20.76.57.61:18080/gsllys/tqLinelossStatis/getYearMonWeekLinelossAnalysisList",
"method": "POST",
"contentType": "application/x-www-form-urlencoded"
},
"columnDefs": [["ORG_NAME", "供电单位"], ["LINE_LOSS_RATE", "线损率"]],
"requestTemplate": {"orgno": "${args.org_code}", "tjzq": "week", "rows": 1000},
"normalizeRules": {"type": "validate_required", "requiredFields": ["ORG_NAME", "LINE_LOSS_RATE"], "filterNull": true},
"responsePath": "content"
}
],
"defaultMode": "month",
"modeSwitchField": "period_mode"
}
```
Save to `tmp_multi_mode_test.json`.
- [ ] **Step 3: Run generator with the multi-mode JSON**
```bash
cargo run --bin sg_scene_generate -- --source-dir "examples/test-scene" --scene-id "tq-lineloss-test" --scene-name "台区线损测试" --output-root "tmp_multi_test" --scene-info-json "$(cat tmp_multi_mode_test.json)"
```
Expected: Skill package generated without errors.
- [ ] **Step 4: Verify generated script syntax**
```bash
node --check tmp_multi_test/skills/tq-lineloss-test/scripts/collect_tq_lineloss_test.js
```
Expected: No syntax errors.
- [ ] **Step 5: Verify generated script has multi-mode logic**
Check that the generated script contains:
- `detectMode()` function
- `MODES` constant with mode configurations
- `buildModeRequest()` function
- `normalizeModeRows()` function
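The manual check above can also be scripted. This is a small sketch under the assumption that the generated file keeps the declaration forms used in the template (`function detectMode(`, `const MODES =`, and so on); `missingMarkers` is a hypothetical helper name.

```javascript
// Markers copied from the declarations in the multi-mode template.
const REQUIRED_MARKERS = [
  "function detectMode(",
  "const MODES =",
  "function buildModeRequest(",
  "function normalizeModeRows(",
];

// Returns the markers that do NOT appear in the generated script source.
function missingMarkers(source) {
  return REQUIRED_MARKERS.filter((marker) => !source.includes(marker));
}
```

To use it, read the generated script with `fs.readFileSync(path, "utf8")` and pass the text in; an empty result means all four pieces are present.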
- [ ] **Step 6: Commit**
```bash
git add -A
git commit -m "test: verify multi-mode template generates valid JavaScript"
```
---
### Task 6: Update Web UI to Display Mode Information
**Files:**
- Modify: `frontend/scene-generator/sg_scene_generator.html`
**Goal:** Add mode information display in the extraction preview panel.
- [ ] **Step 1: Add mode display section**
Add after the column defs display in the preview panel:
```html
<div id="modes-preview" class="preview-section" style="display: none;">
<h4>业务模式</h4>
<div id="modes-list" class="preview-list"></div>
</div>
```
- [ ] **Step 2: Add JavaScript to populate mode info**
In the `showExtractionPreview` function, add:
```javascript
// Show modes if present
const modesSection = document.getElementById('modes-preview');
const modesList = document.getElementById('modes-list');
if (data.modes && data.modes.length > 0) {
modesSection.style.display = 'block';
modesList.innerHTML = data.modes.map(mode => {
const name = escapeHtml(mode.name || 'unknown');
const label = escapeHtml(mode.label || '');
const api = escapeHtml(mode.apiEndpoint?.url || '');
return `<div class="preview-list-item">
<strong>${name}</strong>${label ? ` (${label})` : ''}: ${api}
</div>`;
}).join('');
} else {
modesSection.style.display = 'none';
}
```
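The snippet above calls `escapeHtml`, which is not shown in this plan. If the page does not already define it, a minimal version could look like the sketch below (an assumption, not the project's existing helper).

```javascript
// Minimal HTML escaper for text inserted via innerHTML.
// The ampersand must be replaced first so later entities are not double-escaped.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```

Escaping matters here because mode names, labels, and URLs come from LLM output derived from arbitrary scene source code.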
- [ ] **Step 3: Verify changes**
```bash
node frontend/scene-generator/server.js
```
Note: `node --check` only parses JavaScript, so it cannot validate the HTML file; instead, confirm that the server starts and the page loads.
- [ ] **Step 4: Commit**
```bash
git add frontend/scene-generator/sg_scene_generator.html
git commit -m "feat(ui): add mode information display in extraction preview"
```
---
## Self-Review Checklist
**1. Spec Coverage:**
- [x] Multi-mode schema structs → Task 1
- [x] Enhanced LLM prompt → Task 2
- [x] Multi-mode template → Task 3
- [x] Unit tests → Task 4
- [x] Integration test → Task 5
- [x] UI update → Task 6
**2. Placeholder Scan:**
- No TBD, TODO, or placeholder text found
- All code snippets are complete
- All commands have expected output
**3. Type Consistency:**
- `ModeConditionJson` field names match JSON schema
- `ModeConfigJson` uses `apiEndpoint` (camelCase) matching JSON
- `NormalizeRulesJson` uses `rules_type` with serde rename
---
## Execution Handoff
Plan complete and saved to `docs/superpowers/plans/2026-04-17-enhanced-llm-extraction-schema-plan.md`. Two execution options:
**1. Subagent-Driven (recommended)** - I dispatch a fresh subagent per task, review between tasks, fast iteration
**2. Inline Execution** - Execute tasks in this session using executing-plans, batch execution with checkpoints
Which approach?

@@ -0,0 +1,627 @@
# Progressive Browser Script Template Enhancement Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Enhance the `browser_script_with_business_logic` template in Rust to generate complete, runnable browser scripts with proper HTTP handling, status determination, and error handling.
**Architecture:** Modify `src/generated_scene/generator.rs` to replace the current incomplete JavaScript template with an enhanced version that includes: direct URL usage (fixing the URL construction bug), jQuery + fetch dual HTTP client support, complete status determination (blocked/error/partial/empty/ok), and enhanced entrypoint with page context validation.
**Tech Stack:** Rust, JavaScript (browser script), serde_json
---
## File Structure
| File | Action | Purpose |
|------|--------|---------|
| `src/generated_scene/generator.rs` | Modify | Replace `browser_script_with_business_logic` function with enhanced template |
---
### Task 1: Fix URL Building in buildRequest()
**Files:**
- Modify: `src/generated_scene/generator.rs:308-321` (current `buildRequest` function in template)
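**Current bug:** The template builds the URL with `new URL(endpoint.url, window.location.origin)`, so any `endpoint.url` that is not absolute is resolved against the current page's origin instead of being used as the complete endpoint URL. It also copies every parameter into the query string and sends no request body.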
**Goal:** Replace the buggy URL construction with direct URL usage.
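When reproducing the bug it may help to recall how the standard `URL` constructor treats its base argument. The sketch below uses a placeholder page origin and a made-up `period_mode` value; only the endpoint URL comes from the plan.

```javascript
// Placeholder origin standing in for window.location.origin.
const pageOrigin = 'http://different-origin.example';

// An absolute URL ignores the base argument and keeps its own origin.
const absolute = new URL('http://20.76.57.61:18080/gsllys/api/test', pageOrigin);
console.log(absolute.href);

// A relative URL is resolved against the base, i.e. the page origin.
const relative = new URL('/gsllys/api/test', pageOrigin);
console.log(relative.href);

// searchParams.set appends parameters to the query string, which is what the
// current buildRequest does with every argument.
absolute.searchParams.set('period_mode', 'month');
console.log(absolute.search);
```

Using `endpoint.url` as-is and moving the parameters into a JSON body, as the fix does, sidesteps both behaviors.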
- [ ] **Step 1: Write the failing test**
Create a test file to verify URL construction behavior:
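```javascript
// Test that the URL is used directly, without window.location.origin
const assert = require('assert');
// Assumption: path to a generated script that exports buildRequest (see Task 8); adjust as needed
const { buildRequest } = require('./tmp_test_enhanced/skills/test-enhanced/scripts/collect_test_enhanced.js');
// Mock a complete URL in the endpoint
const endpoint = { url: 'http://20.76.57.61:18080/gsllys/api/test', method: 'POST' };
// Expected: buildRequest returns the URL unchanged,
// NOT new URL(endpoint.url, 'http://different-origin.com')
assert.strictEqual(buildRequest({}, endpoint).url, endpoint.url);
```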
- [ ] **Step 2: Implement the fix**
Replace the `buildRequest` function in `browser_script_with_business_logic` (lines 308-321 in the generated template):
**Current (buggy):**
```javascript
function buildRequest(args, endpoint) {
const url = new URL(endpoint.url, window.location.origin);
const params = { ...STATIC_PARAMS, ...args };
for (const [key, value] of Object.entries(params)) {
if (value !== undefined && value !== null) {
url.searchParams.set(key, String(value));
}
}
return {
url: url.toString(),
method: endpoint.method || 'GET',
headers: { 'Content-Type': 'application/json' }
};
}
```
**Fixed:**
```javascript
function buildRequest(args, endpoint) {
// Use endpoint.url directly - it's already a complete URL
const url = endpoint.url;
const method = endpoint.method || 'POST';
const headers = { 'Content-Type': 'application/json' };
const body = JSON.stringify({ ...STATIC_PARAMS, ...args });
return { url, method, headers, body };
}
```
Locate this in `src/generated_scene/generator.rs` within the `browser_script_with_business_logic` function (around line 308 in the format! string). Replace the entire `buildRequest` function definition.
- [ ] **Step 3: Verify the change**
Run `cargo build` to verify the Rust code compiles:
```bash
cargo build
```
Expected: Build succeeds without errors.
- [ ] **Step 4: Commit**
```bash
git add src/generated_scene/generator.rs
git commit -m "fix(generator): use endpoint.url directly in buildRequest to fix URL construction bug"
```
---
### Task 2: Add jQuery + fetch Dual HTTP Client Support
**Files:**
- Modify: `src/generated_scene/generator.rs:355-368` (current `defaultDeps` object in template)
**Goal:** Add jQuery `$.ajax` as primary HTTP client with fetch as fallback for environments without jQuery.
- [ ] **Step 1: Replace defaultDeps with enhanced version**
Replace the current `defaultDeps` object in the template with enhanced jQuery + fetch support:
**Current:**
```javascript
const defaultDeps = {
validatePageContext: async () => true,
queryData: async (args) => {
const endpoint = API_ENDPOINTS[0];
if (!endpoint) throw new Error('No API endpoint configured');
const request = buildRequest(args, endpoint);
const response = await fetch(request.url, {
method: request.method,
headers: request.headers
});
if (!response.ok) throw new Error(`HTTP ${response.status}: ${response.statusText}`);
return response.json();
}
};
```
**Enhanced:**
```javascript
const defaultDeps = {
validatePageContext(args) {
const host = (globalThis.location?.hostname || '').trim();
const expected = (args.expected_domain || '').trim();
if (!host) return { ok: false, reason: 'page_context_unavailable' };
if (host !== expected) return { ok: false, reason: 'page_context_mismatch' };
return { ok: true };
},
async queryData(args) {
const endpoint = API_ENDPOINTS[0];
if (!endpoint) throw new Error('No API endpoint configured');
const request = buildRequest(args, endpoint);
// Prefer jQuery (internal pages typically have it)
if (typeof $ !== 'undefined' && typeof $.ajax === 'function') {
return new Promise((resolve, reject) => {
$.ajax({
url: request.url,
type: request.method,
data: request.body,
contentType: 'application/json',
dataType: 'json',
success: resolve,
error: (xhr, status, err) => reject(new Error(
`API failed (${xhr.status}): ${err} | body=${(xhr.responseText || '').substring(0, 200)}`
))
});
});
}
// Fallback: fetch API
if (typeof fetch === 'function') {
const response = await fetch(request.url, {
method: request.method,
headers: request.headers,
body: request.method !== 'GET' ? request.body : undefined
});
if (!response.ok) {
const text = await response.text().catch(() => '');
throw new Error(`HTTP ${response.status}: ${text.substring(0, 200)}`);
}
return response.json();
}
throw new Error('No HTTP client available (need jQuery or fetch)');
}
};
```
This code goes into the format! string in `browser_script_with_business_logic` function in `src/generated_scene/generator.rs`.
- [ ] **Step 2: Verify the change**
Run `cargo build` to verify:
```bash
cargo build
```
Expected: Build succeeds.
- [ ] **Step 3: Commit**
```bash
git add src/generated_scene/generator.rs
git commit -m "feat(generator): add jQuery + fetch dual HTTP client support in template"
```
---
### Task 3: Add determineArtifactStatus Function
**Files:**
- Modify: `src/generated_scene/generator.rs` (add new function to template before `buildArtifact`)
**Goal:** Add complete status determination logic supporting blocked/error/partial/empty/ok statuses.
- [ ] **Step 1: Add determineArtifactStatus function to template**
Insert the following function into the template, before the `buildArtifact` function:
```javascript
function determineArtifactStatus({ blockedReason = '', fatalError = '', reasons = [], rows = [] }) {
if (blockedReason) return 'blocked';
if (fatalError) return 'error';
if (reasons.length > 0) return 'partial';
if (!rows.length) return 'empty';
return 'ok';
}
```
This should be placed in the template string between `normalizeRows` and `buildArtifact` functions.
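The order of the checks defines a precedence: blocked, then error, then partial, then empty, then ok. A short sketch of that precedence, with the function copied from the step above and purely illustrative inputs:

```javascript
// Copied from the step above so the example is self-contained.
function determineArtifactStatus({ blockedReason = '', fatalError = '', reasons = [], rows = [] }) {
  if (blockedReason) return 'blocked';
  if (fatalError) return 'error';
  if (reasons.length > 0) return 'partial';
  if (!rows.length) return 'empty';
  return 'ok';
}

// Illustrative inputs; the reason strings and row shape are made up.
const cases = [
  { blockedReason: 'page_context_mismatch', fatalError: 'boom' }, // blocked wins over error
  { fatalError: 'HTTP 500', rows: [{ col1: 1 }] },                // error wins over rows
  { reasons: ['row_normalization_partial'], rows: [{ col1: 1 }] },
  { rows: [] },
  { rows: [{ col1: 1 }] },
];
const statuses = cases.map((input) => determineArtifactStatus(input));
console.log(statuses);
```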
- [ ] **Step 2: Verify the change**
```bash
cargo build
```
Expected: Build succeeds.
- [ ] **Step 3: Commit**
```bash
git add src/generated_scene/generator.rs
git commit -m "feat(generator): add determineArtifactStatus for complete status determination"
```
---
### Task 4: Enhance buildArtifact Function
**Files:**
- Modify: `src/generated_scene/generator.rs:334-353` (current `buildArtifact` function in template)
**Goal:** Enhance `buildArtifact` to use `determineArtifactStatus` and accept additional parameters.
- [ ] **Step 1: Replace buildArtifact function**
Replace the current `buildArtifact` function with enhanced version:
**Current:**
```javascript
function buildArtifact(args, rows) {
return {
type: 'report-artifact',
report_name: '{scene_id}',
status: rows.length > 0 ? 'ok' : 'empty',
period: {
mode: args.period_mode,
mode_code: args.period_mode_code,
value: args.period_value,
payload: normalizePayload(args.period_payload)
},
org: { label: args.org_label, code: args.org_code },
column_defs: COLUMN_DEFS,
columns: {columns_json},
rows,
counts: { detail_rows: rows.length },
partial_reasons: [],
reasons: []
};
}
```
**Enhanced:**
```javascript
function buildArtifact({ status, blockedReason = '', fatalError = '', reasons = [], rows = [], args }) {
return {
type: 'report-artifact',
report_name: REPORT_NAME,
status: status || determineArtifactStatus({ blockedReason, fatalError, reasons, rows }),
period: {
mode: args.period_mode,
mode_code: args.period_mode_code,
value: args.period_value,
payload: normalizePayload(args.period_payload)
},
org: { label: args.org_label, code: args.org_code },
column_defs: COLUMN_DEFS,
columns: COLUMNS,
rows,
counts: { detail_rows: rows.length },
partial_reasons: reasons.filter(r => r && !r.startsWith('api_') && !r.startsWith('validation_')),
reasons: Array.from(new Set(reasons.filter(Boolean)))
};
}
```
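To see what the two `reasons` expressions in the enhanced `buildArtifact` produce, here is a small sketch that runs them on a hypothetical reason list (the strings are invented for illustration):

```javascript
// Hypothetical reason list, including a duplicate and an empty entry.
const reasons = [
  'api_query_failed:timeout',
  'validation_missing_org_code',
  'row_normalization_partial',
  'row_normalization_partial',
  '',
];

// Same expressions as in the enhanced buildArtifact.
const partialReasons = reasons.filter(r => r && !r.startsWith('api_') && !r.startsWith('validation_'));
const uniqueReasons = Array.from(new Set(reasons.filter(Boolean)));

console.log(partialReasons);
console.log(uniqueReasons);
```

As written, `reasons` is deduplicated but `partial_reasons` is not, so a repeated reason appears twice in `partial_reasons`.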
- [ ] **Step 2: Verify the change**
```bash
cargo build
```
Expected: Build succeeds.
- [ ] **Step 3: Commit**
```bash
git add src/generated_scene/generator.rs
git commit -m "feat(generator): enhance buildArtifact with determineArtifactStatus integration"
```
---
### Task 5: Enhance buildBrowserEntrypointResult Function
**Files:**
- Modify: `src/generated_scene/generator.rs:370-405` (current `buildBrowserEntrypointResult` function in template)
**Goal:** Complete rewrite of entrypoint with proper validation flow, page context check, and error handling.
- [ ] **Step 1: Replace buildBrowserEntrypointResult function**
Replace the entire `buildBrowserEntrypointResult` function:
**Current:**
```javascript
async function buildBrowserEntrypointResult(args, deps = defaultDeps) {
const validation = validateArgs(args);
if (!validation.valid) {
return {
type: 'report-artifact',
report_name: '{scene_id}',
status: 'error',
error: 'Validation failed: ' + validation.errors.join(', '),
column_defs: COLUMN_DEFS,
columns: {columns_json},
rows: [],
counts: { detail_rows: 0 },
partial_reasons: [],
reasons: validation.errors
};
}
try {
const rawData = await (deps.queryData ? deps.queryData(args) : Promise.resolve([]));
const rows = normalizeRows(rawData);
return buildArtifact(args, rows);
} catch (error) {
return {
type: 'report-artifact',
report_name: '{scene_id}',
status: 'error',
error: error.message,
column_defs: COLUMN_DEFS,
columns: {columns_json},
rows: [],
counts: { detail_rows: 0 },
partial_reasons: [],
reasons: [error.message]
};
}
}
```
**Enhanced:**
```javascript
async function buildBrowserEntrypointResult(args, deps = defaultDeps) {
// 1. Parameter validation
const validation = validateArgs(args);
if (!validation.valid) {
return buildArtifact({
status: 'blocked',
blockedReason: 'validation_failed',
reasons: validation.errors,
rows: [],
args
});
}
// 2. Page context validation
const pageValidation = typeof deps.validatePageContext === 'function'
? deps.validatePageContext(args)
: { ok: true };
if (!pageValidation?.ok) {
return buildArtifact({
status: 'blocked',
blockedReason: pageValidation?.reason || 'page_context_mismatch',
reasons: [pageValidation?.reason || 'page_context_mismatch'],
rows: [],
args
});
}
// 3. Data fetching
const reasons = [];
let rawData = null;
try {
rawData = await (deps.queryData ? deps.queryData(args) : Promise.resolve([]));
} catch (error) {
return buildArtifact({
status: 'error',
fatalError: error.message,
reasons: ['api_query_failed:' + error.message],
rows: [],
args
});
}
// 4. Row normalization
const rows = normalizeRows(rawData);
if (rows.length === 0 && Array.isArray(rawData) && rawData.length > 0) {
reasons.push('row_normalization_partial');
}
// 5. Build artifact
return buildArtifact({ reasons, rows, args });
}
```
- [ ] **Step 2: Verify the change**
```bash
cargo build
```
Expected: Build succeeds.
- [ ] **Step 3: Commit**
```bash
git add src/generated_scene/generator.rs
git commit -m "feat(generator): enhance buildBrowserEntrypointResult with validation flow"
```
---
### Task 6: Add Helper Functions and Constants
**Files:**
- Modify: `src/generated_scene/generator.rs` (add helper functions to template)
**Goal:** Add utility functions used by the enhanced template.
- [ ] **Step 1: Add helper functions after COLUMN_DEFS constant**
Add these utility functions to the template after the constant definitions:
```javascript
const REPORT_NAME = '{scene_id}';
const COLUMNS = {columns_json};
function pickFirstNonEmpty(...values) {
for (const value of values) {
if (typeof value === 'string' && value.trim() !== '') {
return value.trim();
}
}
return '';
}
function isNonEmptyString(value) {
return typeof value === 'string' && value.trim() !== '';
}
```
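A possible use of the two helpers, sketched with a hypothetical `args` object (the field values are made up):

```javascript
// Copied from the step above so the example is self-contained.
function pickFirstNonEmpty(...values) {
  for (const value of values) {
    if (typeof value === 'string' && value.trim() !== '') {
      return value.trim();
    }
  }
  return '';
}

function isNonEmptyString(value) {
  return typeof value === 'string' && value.trim() !== '';
}

// Hypothetical args: org_label is blank, so the next candidate is used.
const args = { org_label: '   ', org_code: ' 0401 ' };
const orgDisplay = pickFirstNonEmpty(args.org_label, args.org_code, 'unknown');
console.log(orgDisplay);
console.log(isNonEmptyString(args.org_label));
```

Both helpers ignore non-string values, so a numeric code would be skipped unless it is converted to a string first.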
- [ ] **Step 2: Verify the change**
```bash
cargo build
```
Expected: Build succeeds.
- [ ] **Step 3: Commit**
```bash
git add src/generated_scene/generator.rs
git commit -m "feat(generator): add helper functions for enhanced template"
```
---
### Task 7: Update Module Exports
**Files:**
- Modify: `src/generated_scene/generator.rs:407-409` (current module.exports in template)
**Goal:** Update module exports to include new functions.
- [ ] **Step 1: Update module.exports**
Replace the current export block:
**Current:**
```javascript
if (typeof module !== 'undefined') {
module.exports = { buildBrowserEntrypointResult, normalizePayload, validateArgs, buildRequest, normalizeRows, buildArtifact, API_ENDPOINTS, STATIC_PARAMS, COLUMN_DEFS };
}
```
**Enhanced:**
```javascript
if (typeof module !== 'undefined') {
module.exports = {
buildBrowserEntrypointResult,
normalizePayload,
validateArgs,
buildRequest,
normalizeRows,
determineArtifactStatus,
buildArtifact,
API_ENDPOINTS,
STATIC_PARAMS,
COLUMN_DEFS,
COLUMNS,
REPORT_NAME
};
}
```
- [ ] **Step 2: Verify the change**
```bash
cargo build
```
Expected: Build succeeds.
- [ ] **Step 3: Commit**
```bash
git add src/generated_scene/generator.rs
git commit -m "feat(generator): update module exports for enhanced template"
```
---
### Task 8: Integration Test - Generate and Verify Script
**Files:**
- Test: Generate a skill package and verify the output
**Goal:** Verify the enhanced template generates valid JavaScript.
- [ ] **Step 1: Build the project**
```bash
cargo build --release
```
Expected: Build succeeds.
- [ ] **Step 2: Generate a test skill package**
Use the scene generator to create a test skill:
```bash
# Assuming you have a test scene directory
cargo run --bin sg_scene_generate -- --source-dir "examples/test-scene" --scene-id "test-enhanced" --scene-name "Test Enhanced" --output-root "tmp_test_enhanced" --scene-info-json '{"sceneId":"test-enhanced","sceneName":"Test Enhanced","apiEndpoints":[{"name":"testApi","url":"http://example.com/api/test","method":"POST"}],"staticParams":{},"columnDefs":[["col1","Column 1"]]}'
```
Expected: Skill package generated without errors.
- [ ] **Step 3: Verify generated script syntax**
Check the generated JavaScript for syntax errors:
```bash
node --check tmp_test_enhanced/skills/test-enhanced/scripts/collect_test_enhanced.js
```
Expected: No syntax errors.
- [ ] **Step 4: Run the generated test**
```bash
node tmp_test_enhanced/skills/test-enhanced/scripts/collect_test_enhanced.test.js
```
Expected: Test passes (may fail on API call, but artifact structure should be valid).
- [ ] **Step 5: Commit**
```bash
git add -A
git commit -m "test: verify enhanced template generates valid JavaScript"
```
---
## Self-Review Checklist
**1. Spec Coverage:**
- [x] URL construction bug fix → Task 1
- [x] jQuery + fetch dual support → Task 2
- [x] determineArtifactStatus function → Task 3
- [x] Enhanced buildArtifact → Task 4
- [x] Enhanced buildBrowserEntrypointResult → Task 5
- [x] Helper functions → Task 6
- [x] Module exports → Task 7
- [x] Integration testing → Task 8
**2. Placeholder Scan:**
- No TBD, TODO, or placeholder text found
- All code snippets are complete
- All commands have expected output
**3. Type Consistency:**
- `buildArtifact` parameter signature consistent across all call sites
- `args` object properties consistently named
- Status values: blocked/error/partial/empty/ok consistently used
---
## Execution Handoff
Plan complete and saved to `docs/superpowers/plans/2026-04-17-progressive-template-enhancement-plan.md`. Two execution options:
**1. Subagent-Driven (recommended)** - I dispatch a fresh subagent per task, review between tasks, fast iteration
**2. Inline Execution** - Execute tasks in this session using executing-plans, batch execution with checkpoints
Which approach?


@@ -0,0 +1,482 @@
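# sgClaw Scene Generator Quality Improvement: Implementation Plan
> Design document: `docs/superpowers/specs/2026-04-17-scene-generator-quality-improvement-design.md`
## Overview
3 phases, 8 tasks. Each task lists the files to change, the concrete steps, how to verify, and the commit message.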
---
## Phase 1: Fix the Foundations
### Task 1: Unify the Generation Path (Deprecate browser_script_with_business_logic)
**File**: `src/generated_scene/generator.rs`
**Current state** (lines 728-735):
```rust
fn compile_scene(scene_ir: &SceneIr, analysis: &SceneSourceAnalysis, tool_name: &str) -> CompiledScene {
let scene_toml = render_scene_toml(scene_ir, analysis, tool_name);
let browser_script = match scene_ir.workflow_archetype() {
WorkflowArchetype::SingleRequestTable => compile_single_request_table(scene_ir),
WorkflowArchetype::MultiModeRequest => compile_multi_mode_request(scene_ir),
WorkflowArchetype::PaginatedEnrichment => compile_paginated_enrichment(scene_ir),
WorkflowArchetype::PageStateEval => compile_page_state_eval(scene_ir),
};
...
}
```
**Steps**:
1. **Change the `compile_scene` routing logic** (lines 730-735):
   - `SingleRequestTable` no longer calls `compile_simple_request_script` (the function underlying `compile_single_request_table`); instead the single-mode scene is wrapped into one mode and compiled through `compile_multi_mode_request`
   - Add a helper `ensure_modes_populated(scene_ir: &SceneIr) -> SceneIr`:
     - If `scene_ir.modes` is empty but `scene_ir.api_endpoints` is not, generate a default mode
     - Change the `workflow_archetype` of `SingleRequestTable` and `PageStateEval` scenes to `MultiModeRequest` (they all go through the unified modes path)
   - Update the match arms:
```rust
let browser_script = match scene_ir.workflow_archetype() {
WorkflowArchetype::MultiModeRequest => compile_multi_mode_request(scene_ir),
WorkflowArchetype::PaginatedEnrichment => compile_paginated_enrichment(scene_ir),
_ => {
// SingleRequestTable, PageStateEval — fallback to multi-mode with default mode
let adapted = ensure_modes_populated(scene_ir);
compile_multi_mode_request(&adapted)
}
};
```
2. **Implement `ensure_modes_populated`**:
   - Takes `&SceneIr` and returns a `SceneIr` (a clone)
   - If `modes` is already non-empty, return the clone unchanged
   - If `modes` is empty but `api_endpoints` is not:
     - Build a default mode from the first endpoint
     - Set `name: "default"`, `label: Some("default")`
     - `condition`: `{ field: "period_mode", operator: "equals", value: "default" }`
     - `apiEndpoint`: a copy of the first endpoint
     - `requestTemplate`: taken from `scene_ir.request_template`
     - `responsePath`: taken from `scene_ir.response_path`
     - `normalizeRules`: taken from `scene_ir.normalize_rules`, or the default
     - `columnDefs`: taken from `scene_ir.column_defs`
     - Also set `default_mode = Some("default")` and `mode_switch_field = Some("period_mode")`
3. **Mark `browser_script_with_business_logic` as deprecated** (if it still exists in the code):
   - In the current code the function no longer exists (it was replaced by `compile_simple_request_script`). Note "legacy path, superseded by multi-mode unified path" in a comment
**Verification**:
- `cargo check` reports no compile errors
- The JS script generated for a single-mode scene contains `const MODES =` and the `detectMode` logic
**Commit message**:
```
feat(generator): unify all scene types through multi-mode path
Single-mode and page-state-eval scenes now get auto-wrapped into a
default mode and compiled through compile_multi_mode_request. This
eliminates the old browser_script_with_business_logic code path and
ensures all scenes get responsePath extraction, requestTemplate, and
contentType support.
```
---
### Task 2: Fix the jQuery processData Parameter
**File**: `src/generated_scene/generator.rs`, function `compile_multi_mode_request` (lines 1069-1253)
**Current state**: In the template, `buildModeRequest` (lines 1098-1118) chooses the body serialization by `contentType` (form-urlencoded uses `Object.entries().join('&')`, JSON uses `JSON.stringify`), but the jQuery ajax call (lines 1185-1196) does **not** set `processData`.
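With the default `processData: true`, jQuery converts any non-string `data` into a query string, so a form-urlencoded body that reaches `$.ajax` in non-string form would be serialized a second time (double encoding). Setting the option explicitly guarantees that an already serialized body is sent unchanged.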
**Steps**:
1. Modify the jQuery ajax call template in `compile_multi_mode_request` (around lines 1185-1196):
   - Add a `processData` option to `$.ajax({...})`:
```javascript
$.ajax({
url: request.url,
type: request.method,
data: request.body,
contentType: request.headers['Content-Type'],
processData: request.headers['Content-Type'] !== 'application/x-www-form-urlencoded',
dataType: 'json',
success: resolve,
error: (xhr, status, err) => reject(new Error(`API failed (${xhr.status}): ${err}`))
});
```
   - `contentType` must be reachable inside the Promise callback, so read it from the `request` object
2. Apply the same change to the jQuery ajax call in `compile_simple_request_script` (around lines 994-1004), adding the same `processData` logic
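The two steps can be sketched as follows. `serializeBody` and `buildAjaxOptions` are hypothetical names, the parameter names are invented, and the serialization is an assumption about what `buildModeRequest` does, based only on the description above.

```javascript
// Hypothetical helper mirroring the two body formats described above.
function serializeBody(params, contentType) {
  if (contentType === 'application/x-www-form-urlencoded') {
    return Object.entries(params)
      .filter(([, value]) => value !== undefined && value !== null)
      .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`)
      .join('&');
  }
  return JSON.stringify(params);
}

// Builds the $.ajax options; processData is false only for the pre-serialized form body.
function buildAjaxOptions(request) {
  const contentType = request.headers['Content-Type'];
  return {
    url: request.url,
    type: request.method,
    data: request.body,
    contentType,
    processData: contentType !== 'application/x-www-form-urlencoded',
    dataType: 'json',
  };
}

const formType = 'application/x-www-form-urlencoded';
const request = {
  url: 'http://example.com/api/test', // placeholder URL
  method: 'POST',
  headers: { 'Content-Type': formType },
  body: serializeBody({ orgCode: '0401', period: '2026-04' }, formType),
};
console.log(buildAjaxOptions(request));
```

Reading the content type from `request.headers` keeps the decision next to the data it describes, so no extra variable has to be captured by the Promise callback.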
**Verification**:
- The `$.ajax` call in the generated JS includes the `processData` option
- form-urlencoded requests are not double-encoded
**Commit message**:
```
fix(generator): add processData to jQuery ajax for form-urlencoded requests
jQuery default processData:true re-serializes string bodies, causing
double-encoding for form-urlencoded payloads. Set processData:false
when contentType is application/x-www-form-urlencoded.
```
---
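### Task 3: Auto-Wrap Single-Mode Scenes into a Mode Configuration
**File**: `frontend/scene-generator/llm-client.js`
**Current state**: `analyzeSceneDeep` (lines 729-769) calls the LLM and returns the result of `normalizeSceneIr` directly. If the LLM outputs `modes: []` together with `apiEndpoints`, nothing is wrapped automatically.
**Steps**:
1. In `analyzeSceneDeep`, after `normalizeSceneIr(...)` and before returning, add the auto-wrap logic: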
```javascript
async function analyzeSceneDeep(sourceDir, dirContents, config) {
const content = await requestChatCompletionWithRetry(...);
const normalized = normalizeSceneIr(await extractJsonFromResponseWithRepair(content, config));
// ... existing sceneId validation ...
// AUTO-WRAP: single-mode scenes → modes array
if (normalized.modes.length === 0 && normalized.apiEndpoints.length > 0) {
normalized.modes.push({
name: "default",
label: "default",
condition: { field: "period_mode", operator: "equals", value: "default" },
apiEndpoint: normalized.apiEndpoints[0],
columnDefs: normalized.columnDefs || [],
requestTemplate: normalized.requestTemplate || {},
normalizeRules: normalized.normalizeRules || { type: "validate_required", requiredFields: [], filterNull: true },
responsePath: normalized.responsePath || "",
});
normalized.defaultMode = "default";
normalized.modeSwitchField = "period_mode";
// Upgrade archetype if it was single_request_table
if (normalized.workflowArchetype === "single_request_table") {
normalized.workflowArchetype = "multi_mode_request";
}
}
return normalized;
}
```
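2. Also make sure `normalizeSceneIr` gives `defaultMode` and `modeSwitchField` correct defaults (already handled at lines 477-478)
**Verification**:
- Run generation on a single-mode scene (e.g. `用户日电量监测`, user daily power monitoring) and confirm that the `modes` array contains one default mode
- Confirm that `workflowArchetype` is upgraded to `multi_mode_request`
**Commit message**: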
```
feat(llm-client): auto-wrap single-mode scenes into modes array
When the LLM returns an empty modes array but has apiEndpoints,
automatically create a default mode with the first endpoint,
requestTemplate, responsePath, and normalizeRules. This ensures all
scenes compile through the multi-mode path.
```
---
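## Phase 2: Strengthen Extraction
### Task 4: Add Mandatory-Field Constraints to the LLM Prompt
**File**: `frontend/scene-generator/llm-client.js`, `DEEP_SYSTEM_PROMPT` (lines 19-82)
**Current state**: The prompt already lists the schema but does not stress which fields **must** be filled. The LLM often skips `contentType`, `responsePath`, and `requestTemplate`.
**Steps**:
1. After the schema definition in `DEEP_SYSTEM_PROMPT`, add a **mandatory field constraints** paragraph: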
```
MANDATORY FIELDS (never leave empty):
- apiEndpoints[].contentType: detect from source code.
* For $.ajax({}): look for 'contentType' property. Default 'application/json' if absent.
* For $http.sendByAxios(): contentType is 'application/json' (axios default).
* For XMLHttpRequest: look for setRequestHeader('Content-Type', ...).
* For form submissions: 'application/x-www-form-urlencoded'.
- modes[].responsePath: the JSON path from raw API response to the data array.
* Common patterns: 'data.list', 'data.rcvblAcctSumAll.rcvblAcctVOS', 'content', 'data.records'
* If response is the array itself, use empty string "".
- modes[].requestTemplate: the static request body shape from the source code.
* Extract ALL keys that appear in the request body object.
* Mark dynamic values as "${args.fieldName}" and static values as literals.
- apiEndpoints[].url: the full API URL as seen in the source code.
RULES:
- If you cannot determine contentType, default to 'application/json'.
- If you cannot determine responsePath, default to '' (empty string).
- If you cannot determine requestTemplate, use {} (empty object).
- NEVER leave these fields as null or undefined.
```
2. Insert this text into `DEEP_SYSTEM_PROMPT` after the schema definition and before `Instructions`
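To make the `responsePath` convention in the prompt concrete, here is a minimal sketch of a dotted-path resolver. The name `safeGet` appears in the unit-test plan, but this body and the sample payload are assumptions, not the generated template's actual code.

```javascript
// Hypothetical resolver for a dotted responsePath; '' means the response is the array itself.
function safeGet(raw, path) {
  if (!path) return raw;
  return path.split('.').reduce(
    (node, key) => (node !== null && node !== undefined ? node[key] : undefined),
    raw
  );
}

const response = { code: 0, data: { list: [{ id: 1 }, { id: 2 }] } }; // made-up payload
console.log(safeGet(response, 'data.list'));    // the nested array
console.log(safeGet(response, 'data.records')); // undefined: path does not exist
console.log(safeGet([{ id: 3 }], ''));          // empty path returns the input unchanged
```

Defaulting an unknown `responsePath` to the empty string, as the prompt rules require, therefore degrades to treating the raw response as the row array.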
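**Verification**:
- Run generation on the `营销2.0零度户报表数据生成` scene and confirm that the `contentType` and `responsePath` output by the LLM are no longer empty
- Confirm that `requestTemplate` contains the fields the business logic requires
**Commit message**: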
```
feat(llm-client): add mandatory field constraints to DEEP_SYSTEM_PROMPT
Explicitly require LLM to fill contentType, responsePath, and
requestTemplate with detected values or defaults. Reduces empty-field
rate from ~60% to target ~10%.
```
---
### Task 5: Extract Business JS Files
**Files**:
- `frontend/scene-generator/server.js`
- `frontend/scene-generator/generator-runner.js`
**Current state**: `readDirectory` in `generator-runner.js` already reads every file into `dirContents`, but `buildDeepAnalyzePrompt` (`llm-client.js`, lines 125-157) mainly pushes fragments of `index.html`. The contents of business JS files (e.g. `js/mca.js`, `js/sgApi.js`) are not extracted and pushed separately.
**Steps**:
1. **Identify business JS files in `generator-runner.js`**:
   - Add a `businessJsFragments` array in `buildAnalysisContext`
   - Pick up `.js` files under `js/` (excluding third-party libraries such as `vue.js` and `element-ui`)
   - For each business JS file, extract the first 600 characters as the key fragment (function definitions, API calls, config objects)
   - Store the result in `analysisContext.businessJsFragments`
2. **Push the business JS fragments in `buildDeepAnalyzePrompt` in `llm-client.js`**:
   - After the existing `pushFragments` calls, add:
```javascript
pushFragments(parts, "business JS files", context.businessJsFragments, 4);
```
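   - Keep the total prompt size within `MAX_DEEP_PROMPT_CHARS` (60000)
3. **Make sure `server.js` reads the business JS files**:
   - Check whether the `readDirectory` call in the `/handle-analyze-deep` endpoint already reads the files under `js/`
   - If it does not, add logic to read `js/*.js`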
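Step 1 can be sketched as below. The function name, the `{ path, content }` shape of `dirContents`, and the third-party deny-list are all assumptions; only the limits (4 fragments, 600 characters) come from the plan.

```javascript
// Limits taken from the plan; everything else in this sketch is hypothetical.
const MAX_FRAGMENTS = 4;
const MAX_FRAGMENT_CHARS = 600;
// Assumed deny-list of third-party bundles; extend as needed.
const THIRD_PARTY_PATTERNS = [/vue(\.min)?\.js$/i, /element-ui/i, /jquery/i, /\.min\.js$/i];

// dirContents is assumed to be an array of { path, content } objects.
function collectBusinessJsFragments(dirContents) {
  return dirContents
    .filter(({ path }) => /(^|\/)js\/[^/]+\.js$/.test(path))
    .filter(({ path }) => !THIRD_PARTY_PATTERNS.some((pattern) => pattern.test(path)))
    .slice(0, MAX_FRAGMENTS)
    .map(({ path, content }) => ({ path, snippet: content.slice(0, MAX_FRAGMENT_CHARS) }));
}

const fragments = collectBusinessJsFragments([
  { path: 'index.html', content: '<html></html>' },
  { path: 'js/vue.js', content: '/* vendor */' },
  { path: 'js/mca.js', content: 'function query() {}'.padEnd(1000, ' ') },
  { path: 'js/sgApi.js', content: '$.ajax({ url: "/api" });' },
]);
console.log(fragments.map((f) => `${f.path}:${f.snippet.length}`));
```

This version simply truncates each file; picking out function definitions or API calls, as the step suggests, would need a smarter selector on top.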
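**Verification**:
- Run on the `台区线损大数据` scene and confirm that the contents of `js/mca.js` or a similar business file are pushed to the LLM
- Confirm that the total prompt size does not exceed 60000 characters
**Commit message**: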
```
feat(scene-generator): extract business JS files for LLM analysis
Identify and push js/ directory business logic files (mca.js, sgApi.js,
etc.) to the LLM prompt. Exclude third-party libraries. Capped at 4
fragments to stay within MAX_DEEP_PROMPT_CHARS budget.
```
---
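### Task 6: Post-Extraction Validation and One Follow-Up Prompt
**File**: `frontend/scene-generator/llm-client.js`
**Current state**: After the LLM responds, `analyzeSceneDeep` runs `normalizeSceneIr` and returns immediately, without checking whether key fields are missing.
**Steps**:
1. Add a `validateExtractedSceneInfo(sceneIr)` function: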
```javascript
function validateExtractedSceneInfo(sceneIr) {
const issues = [];
// Check: at least one apiEndpoint has contentType
const endpointsWithCt = (sceneIr.apiEndpoints || []).filter(
ep => ep && ep.contentType
);
if ((sceneIr.apiEndpoints || []).length > 0 && endpointsWithCt.length === 0) {
issues.push("missing_contentType_on_endpoints");
}
// Check: at least one mode has responsePath (if modes exist)
if ((sceneIr.modes || []).length > 0) {
const modesWithPath = sceneIr.modes.filter(m => m.responsePath !== undefined && m.responsePath !== null);
if (modesWithPath.length === 0) {
issues.push("missing_responsePath_on_modes");
}
}
// Check: workflowArchetype is set
if (!sceneIr.workflowArchetype) {
issues.push("missing_workflowArchetype");
}
return issues;
}
```
2. In `analyzeSceneDeep`, call the validation after `normalizeSceneIr`:
```javascript
const issues = validateExtractedSceneInfo(normalized);
if (issues.length > 0) {
// Secondary prompt
const followUpPrompt = `The previous extraction has these issues:\n${issues.join('\n')}\nPlease re-analyze the source snippets and fill in the missing fields. Use defaults if truly unavailable.`;
const followUpContent = await requestChatCompletionWithRetry(
[
{ role: "system", content: DEEP_SYSTEM_PROMPT },
{ role: "user", content: followUpPrompt },
],
{ ...config, maxTokens: 2400, timeoutMs: DEEP_REQUEST_TIMEOUT_MS, retryAttempts: 1 }
);
const repaired = normalizeSceneIr(await extractJsonFromResponseWithRepair(followUpContent, config));
// Merge repaired fields into normalized (only fill empty fields)
Object.assign(normalized, mergeSceneIrFields(repaired, normalized));
}
```
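3. Add a `mergeSceneIrFields(repaired, original)` helper:
   - Overwrite a field with the repaired value only when the original field is empty or still at its default
   - This avoids losing valid information from the first extraction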
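One way the helper could look, as a sketch under stated assumptions: it works on top-level fields only, and it treats `undefined`, `null`, `''`, `[]`, and `{}` as "empty". The sample objects are invented.

```javascript
// Hypothetical emptiness check used by the merge below.
function isEmptyValue(value) {
  if (value === undefined || value === null || value === '') return true;
  if (Array.isArray(value)) return value.length === 0;
  if (typeof value === 'object') return Object.keys(value).length === 0;
  return false;
}

// Returns only the fields that should be copied from repaired onto original.
function mergeSceneIrFields(repaired, original) {
  const patch = {};
  for (const [key, value] of Object.entries(repaired)) {
    if (isEmptyValue(original[key]) && !isEmptyValue(value)) {
      patch[key] = value;
    }
  }
  return patch;
}

const original = { sceneId: 'demo', workflowArchetype: '', modes: [], responsePath: 'data.list' };
const repaired = { sceneId: 'other', workflowArchetype: 'multi_mode_request', modes: [], responsePath: 'content' };
Object.assign(original, mergeSceneIrFields(repaired, original));
console.log(original);
```

Two caveats with this sketch: nested values such as `apiEndpoints[].contentType` would need a per-element merge, and an empty `responsePath` is a legitimate value under the prompt rules, so a real implementation may want to exempt it from the emptiness check.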
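**Verification**:
- Simulate an LLM response that lacks `contentType` and confirm that the follow-up prompt is triggered
- Confirm that at most one follow-up is sent, with no infinite loop
**Commit message**: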
```
feat(llm-client): add post-extraction validation with one-shot retry
After LLM returns scene IR, validate that critical fields (contentType,
responsePath, workflowArchetype) are present. If missing, send one
follow-up prompt to fill gaps. Merges repaired fields without overwriting
valid data from the first extraction.
```
---
## Phase 3: Test and Verify
### Task 7: Unit Tests
**File**: `tests/scene_generator_modes_test.rs` (new)
**Steps**:
1. Create the test file `tests/scene_generator_modes_test.rs`
2. Write 5 test cases:
```rust
#[cfg(test)]
mod tests {
use super::*; // adjust imports as needed
use crate::generated_scene::generator::*;
use crate::generated_scene::ir::*;
use serde_json::json;
#[test]
fn test_single_mode_generates_modes_array() {
// Create a SingleRequestTable scene with one endpoint
let scene_ir = make_test_scene_ir();
// ... assertions: generated JS contains "const MODES ="
}
#[test]
fn test_multi_mode_generates_mode_routing() {
// Create a MultiModeRequest scene with two modes
// ... assertions: generated JS contains "detectMode"
}
#[test]
fn test_snake_camel_consistency() {
// Verify field name serialization is consistent
// between Rust (snake_case) and JS (camelCase)
}
#[test]
fn test_form_urlencoded_request_body() {
// Create a mode with contentType = "application/x-www-form-urlencoded"
// ... assertions: body is Object.entries().join('&'), not JSON.stringify
}
#[test]
fn test_response_path_extraction_in_template() {
// Create a mode with responsePath = "data.list"
// ... assertions: generated JS contains "safeGet(raw, mode.responsePath"
}
}
```
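3. Each test builds a `SceneIr` instance, calls `compile_multi_mode_request`, and checks that the generated string contains the expected code fragments
**Verification**:
- `cargo test --test scene_generator_modes_test` passes in full
**Commit message**: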
```
test: add unit tests for multi-mode generation path
Covers: single-mode auto-wrap, multi-mode routing, snake/camel
consistency, form-urlencoded body format, and responsePath extraction.
```
---
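### Task 8: Integration Test
**Steps**:
1. **Run full generation on two representative scenes**:
   - Simple scene: `用户日电量监测` (pattern C, direct AJAX)
   - Complex scene: `台区线损大数据-月_周累计线损率统计分析` (pattern A, dual mode)
2. **Compare the generated output with tq-lineloss-report**:
   - Compare the structure of `SKILL.toml`
   - Compare the key functions in `scripts/*.js` (`buildModeRequest`, `detectMode`, `normalizeRows`)
   - Compare the bootstrap and params configuration in `scene.toml`
3. **Produce the integration test report**:
   - File: `docs/superpowers/reports/2026-04-17-integration-test-report.md`
   - Contents: gap list, quality score, open issues
4. **Record the gap list**:
   - Which fields are still not extracted correctly
   - Which logic still needs manual correction
   - Which scenes are still unsuitable for automation
**Verification**:
- The integration test report has been written
- At least one scene reaches 80% or more of the quality of tq-lineloss-report
**Commit message**: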
```
docs: add integration test report for scene generator quality
Generated skills for user-daily-power and tq-lineloss scenes. Compared
against manually-authored tq-lineloss-report. Quality assessment and
gap analysis documented.
```
---
## Execution Order
```
Task 1 → Task 2 → Task 3 → Task 4 → Task 5 → Task 6 → Task 7 → Task 8
├── Phase 1: Foundations ─┤ ├─── Phase 2: Extraction ───┤ ├─ Phase 3 ─┤
```
The three Phase 1 tasks have dependencies (Task 1 must be completed first; Tasks 2 and 3 can run in parallel).
The three Phase 2 tasks can run in parallel (Tasks 4/5/6 change different functions, although all three edit `llm-client.js`).
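Phase 3 depends on Phases 1 and 2 being fully complete.
## Risks and Mitigations
| Risk | Impact | Mitigation |
|------|------|------|
| The LLM follow-up prompt increases generation time | Worse user experience | Limit to 1 follow-up, 120s timeout |
| After unification, JS generated for SingleRequestTable scenes contains unnecessary mode logic | Larger scripts | The default-mode condition check is simple; performance impact is negligible |
| Too many business JS files push the prompt over the limit | The LLM cannot process it | Limit to 4 files, 600 characters each |
| The `processData` change affects scenes that currently work | Regressions | Set it to false only for form-urlencoded; JSON is unaffected |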


@@ -0,0 +1,309 @@
# Multi-Scene-Kind Generator Design
> **Status:** Draft
> **Date:** 2026-04-16
> **Author:** Qoder
## Problem Statement
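`sg_scene_generate` currently supports only `report_collection` scenes and requires the scene directory's `index.html` to contain the `sgclaw-scene-kind` and `sgclaw-tool-kind` meta tags.
**In practice**:
- 400+ third-party scene directories have **no** meta tags
- Scene kinds vary: some are **report collection** (query data and export to Excel), others are **monitoring** (scheduled status checks, monitoring alerts)
- Requiring every third-party scene to add meta tags is not feasible
**Result**: `sg_scene_generate` is currently unusable on real scene directories.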
## Goal
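Extend `sg_scene_generate` to support multiple scene kinds, letting the user **select the scene kind manually** in the Web UI instead of relying on meta tags in the scene directory.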
## Non-Goals
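- No automatic scene-kind detection by the LLM (a later enhancement)
- No runtime inference of the scene kind
- No change to the runtime validation logic in `registry.rs` (V1 still supports only registered kinds)
- No full parameter parser for monitoring scenes (a simplified template is generated and the user edits it by hand)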
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                        Web UI (HTML)                        │
│ ┌──────────────────┐ ┌───────────────┐ ┌──────────────────┐ │
│ │ Scene path input │ │ Kind dropdown │ │ Analyze/Generate │ │
│ └──────────────────┘ └───────────────┘ └──────────────────┘ │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│                       Node.js Server                        │
│  /analyze  → LLM extracts scene-id, scene-name              │
│  /generate → runs cargo run, passing --scene-kind           │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│                sg_scene_generate (Rust CLI)                 │
│  --scene-kind report_collection | monitoring                │
│                                                             │
│  analyzer.rs  → relaxes meta check, accepts user kind       │
│  generator.rs → picks a template by scene kind              │
└─────────────────────────────────────────────────────────────┘
```
## Implementation Details
### 1. analyzer.rs — 放宽校验逻辑
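**Current behavior**:
- Requires the `sgclaw-scene-kind` meta tag to equal `report_collection`
- Requires the `sgclaw-tool-kind` meta tag to equal `browser_script`
- Exits with an error if either is missing
**New behavior**:
- Meta tags are **optional**
- A type passed by the user via `--scene-kind` takes precedence over the meta tag (see the Compatibility Matrix)
- If the user passes nothing, the meta tag value is used; if that is also missing, the type defaults to `report_collection`
- `sgclaw-tool-kind` defaults to `browser_script` (V1 supports only this kind)
**Enum extension**: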
```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SceneKind {
ReportCollection,
Monitoring,
}
impl SceneKind {
pub fn from_str(s: &str) -> Option<Self> {
match s {
"report_collection" => Some(Self::ReportCollection),
"monitoring" => Some(Self::Monitoring),
_ => None,
}
}
}
```
**函数签名变更**
```rust
// 改动前
pub fn analyze_scene_source(source_dir: &Path) -> Result<SceneSourceAnalysis, AnalyzeSceneError>;
// 改动后
pub fn analyze_scene_source(
source_dir: &Path,
scene_kind_hint: Option<SceneKind>,
) -> Result<SceneSourceAnalysis, AnalyzeSceneError>;
```
### 2. generator.rs — 多模板支持
**函数签名变更**
```rust
// 改动前
fn scene_toml(request: &GenerateSceneRequest, tool_name: &str, expected_domain: &str, target_url: &str) -> String;
// 改动后
fn scene_toml(request: &GenerateSceneRequest, analysis: &SceneSourceAnalysis, tool_name: &str) -> String;
```
**模板路由**
```rust
fn scene_toml(request: &GenerateSceneRequest, analysis: &SceneSourceAnalysis, tool_name: &str) -> String {
match analysis.scene_kind {
SceneKind::ReportCollection => scene_toml_report_collection(request, analysis, tool_name),
SceneKind::Monitoring => scene_toml_monitoring(request, analysis, tool_name),
}
}
```
### 3. sg_scene_generate.rs — 新增 CLI 参数
**新增参数**
```
--scene-kind <report_collection|monitoring>
场景类型,默认 report_collection
- report_collection: 报表收集类(查询数据导出报表)
- monitoring: 监测类(定时检查状态、监控告警)
```
**用法示例**
```bash
# 报表类(默认)
cargo run --bin sg_scene_generate -- \
--source-dir "D:/desk/场景/营销报表" \
--scene-id marketing-report \
--scene-name "营销报表" \
--output-root "./out" \
--lessons "docs/superpowers/references/tq-lineloss-lessons-learned.toml"
# 监测类
cargo run --bin sg_scene_generate -- \
--source-dir "D:/desk/场景/设备监测" \
--scene-id device-monitor \
--scene-name "设备监测" \
--scene-kind monitoring \
--output-root "./out" \
--lessons "docs/superpowers/references/tq-lineloss-lessons-learned.toml"
```
### 4. 监测类模板设计
监测类场景差异较大,生成**简化模板**,用户后续手动编辑:
```toml
[scene]
id = "<scene-id>"
skill = "<scene-id>"
tool = "monitor_<scene-id>"
kind = "browser_script"
version = "0.1.0"
category = "monitoring"
[manifest]
schema_version = "1"
[bootstrap]
expected_domain = "<从 analyzer 提取或空>"
target_url = "<从 analyzer 提取或空>"
requires_target_page = true
[deterministic]
suffix = "。。。"
include_keywords = ["<scene-name>"]
exclude_keywords = []
# 参数部分留空,用户手动编辑
# [[params]]
# name = "xxx"
# resolver = "literal_passthrough"
[artifact]
type = "monitoring-status"
success_status = ["ok", "running"]
failure_status = ["error", "timeout"]
# 后处理留空,用户手动编辑
```
### 5. Web UI 改动
**新增控件**
```html
<div class="form-group">
<label>场景类型</label>
<select id="sceneKind">
<option value="report_collection" selected>报表收集类</option>
<option value="monitoring">监测类</option>
</select>
<span class="hint">报表类:查询数据导出 Excel;监测类:定时检查状态</span>
</div>
```
**JavaScript 改动**
```javascript
// generate() 函数增加 sceneKind 参数
const sceneKind = document.getElementById('sceneKind').value;
const response = await fetch('/generate', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
sourceDir,
sceneId,
sceneName,
sceneKind, // 新增
outputRoot,
lessons
})
});
```
### 6. Node.js Server 改动
**/generate 接口**
```javascript
async function handleGenerate(req, res) {
const { sourceDir, sceneId, sceneName, sceneKind, outputRoot, lessons } = body;
// ...
const args = [
"run", "--bin", "sg_scene_generate", "--",
"--source-dir", normalize(sourceDir),
"--scene-id", sceneId,
"--scene-name", sceneName,
"--scene-kind", sceneKind || "report_collection", // 新增
"--output-root", normalize(outputRoot),
"--lessons", normalize(lessons),
];
// ...
}
```
## Compatibility Matrix
| 场景目录 | meta 标签 | 用户选择 | 最终类型 |
|---------|----------|---------|---------|
| 有 meta | `report_collection` | 未选择 | `report_collection` |
| 有 meta | `report_collection` | `report_collection` | `report_collection` |
| 有 meta | `report_collection` | `monitoring` | `monitoring`(用户优先) |
| 无 meta | 无 | 未选择 | `report_collection`(默认) |
| 无 meta | 无 | `monitoring` | `monitoring` |
**用户选择优先于 meta 标签**:因为用户比静态 meta 标签更了解场景的实际用途。
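The precedence in the Compatibility Matrix can be sketched as a small function. This is only an illustration: `resolveSceneKind` and `KNOWN_KINDS` are hypothetical names, and the choice to silently ignore an unrecognized user value (rather than reject it) is an assumption that the document does not specify.

```javascript
// Hypothetical helper mirroring the Compatibility Matrix:
// user choice > meta tag > default.
const KNOWN_KINDS = ["report_collection", "monitoring"];

function resolveSceneKind(metaKind, userKind) {
  // 1. A valid, explicit user choice wins over the meta tag.
  if (KNOWN_KINDS.includes(userKind)) return userKind;
  // 2. Otherwise fall back to a valid meta tag value.
  if (KNOWN_KINDS.includes(metaKind)) return metaKind;
  // 3. Otherwise use the documented default.
  return "report_collection";
}

console.log(resolveSceneKind("report_collection", "monitoring"));
console.log(resolveSceneKind(undefined, undefined));
```

A stricter variant could return an error for an unknown `userKind` so that a typo in `--scene-kind` does not quietly produce a report-collection template.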
## File Changes Summary
| 文件 | 改动类型 | 改动量 |
|------|---------|-------|
| `src/generated_scene/analyzer.rs` | 修改 | ~30 行 |
| `src/generated_scene/generator.rs` | 修改 | ~50 行 |
| `src/bin/sg_scene_generate.rs` | 修改 | ~20 行 |
| `frontend/scene-generator/sg_scene_generator.html` | 修改 | ~15 行 |
| `frontend/scene-generator/server.js` | 修改 | ~5 行 |
| `frontend/scene-generator/generator-runner.js` | 修改 | ~5 行 |
| `tests/scene_generator_test.rs` | 修改 | ~30 行 |
| `tests/fixtures/generated_scene/monitoring/index.html` | 新增 | ~20 行 |
**Total**: ~175 changed lines across 8 files.
## Testing Strategy
### 单元测试
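1. `analyzer_accepts_missing_meta_with_hint`: uses the hint parameter when the meta tag is missing
2. `analyzer_uses_meta_when_present`: uses the meta value when the meta tag is present and no hint is given
3. `generator_emits_report_collection_template`: the report-collection template is correct
4. `generator_emits_monitoring_template`: the monitoring template is correct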
### 集成测试
1. 无 meta 标签的场景目录 + `--scene-kind report_collection` → 生成成功
2. 无 meta 标签的场景目录 + `--scene-kind monitoring` → 生成成功
3. Web UI 选择监测类 → 生成的 scene.toml 包含 `category = "monitoring"`
### 手动验证
1. 真实场景目录 `D:\desk\智能体资料\场景\营销2.0零度户报表数据生成` → 选择报表类 → 生成成功
2. 真实监测类场景 → 选择监测类 → 生成成功
## Risks and Mitigations
| 风险 | 影响 | 缓解措施 |
|------|------|---------|
| 监测类模板过于简化 | 用户需要大量手动编辑 | 文档说明 + 后续迭代优化 |
| 用户选错类型 | 生成错误模板 | UI 上提供清晰说明 |
| registry.rs 不支持 monitoring | 生成的包无法注册 | V1 只生成,运行时支持后续迭代 |
## Open Questions
1. 监测类场景的 `artifact.type` 应该是什么?当前设计为 `monitoring-status`,是否合适?
2. 监测类是否需要新的 resolver 类型?
3. 是否需要在前端 UI 显示更多类型说明?
## References
- `docs/superpowers/plans/2026-04-15-generated-scene-skill-platform-plan.md` — 原实现计划
- `src/generated_scene/analyzer.rs` — 当前分析器
- `src/generated_scene/generator.rs` — 当前生成器

@@ -0,0 +1,417 @@
# Scene Skill Generator — Design Document
> **Date:** 2026-04-16
> **Status:** Draft — awaiting review
> **Author:** Qoder
---
## 1. Goal
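Provide a visual interface in which the user selects a scene directory; scene-id and scene-name are then extracted automatically by an LLM, the user configures the output path and lessons file, and a single click invokes `sg_scene_generate` to generate the complete skill package while showing the generation log in real time.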
---
## 2. Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ sg_scene_generator.html (浏览器) │
│ ┌───────────────────┐ ┌───────────────────────────────┐ │
│ │ 左侧:操作面板 │ │ 右侧:实时日志流 │ │
│ │ │ │ │ │
│ │ 📂 选择场景目录 │───────│ [状态卡片 + 实时滚动日志] │ │
│ │ │ │ │ │
│ │ 自动填充字段: │ │ 分析场景目录... │ │
│ │ - scene-id │ │ 调用大模型提取场景信息... │ │
│ │ - scene-name │ │ scene-id: tq-lineloss-report │ │
│ │ │ │ scene-name: 台区线损报表 │ │
│ │ 可编辑字段: │ │ 生成 skill 包... │ │
│ │ - 输出根路径 │ │ 写入 SKILL.toml... │ │
│ │ - lessons 路径 │ │ 写入 browser_script... │ │
│ │ │ │ ✅ 生成完成 │ │
│ │ [⚙ 设置] │ └───────────────────────────────┘ │
│ │ [🚀 生成 Skill] │ │
│ └───────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│ 1. POST /analyze (选择目录后自动触发)
│ → 发送目录路径 + 文件内容
│ 2. SSE /generate (点击生成按钮后触发)
│ → 推送实时进度
┌─────────────────────────────────────────────────────────────────┐
│ server.js (Node.js, 默认端口 3210) │
│ │
│ POST /analyze │
│ 1. 读取 source-dir 下的关键文件 │
│ - scene.toml (如果存在) │
│ - *.js 脚本文件 │
│ - SKILL.md / SKILL.toml (如果存在) │
│ - 目录结构树 │
│ 2. 构造 prompt调用 LLM API │
│ - baseUrl + apiKey + model 来自 sgclaw_config.json │
│ 3. 返回 JSON: { sceneId, sceneName } │
│ │
│ POST /generate │
│ 1. 接收 { sourceDir, sceneId, sceneName, outputRoot, lessons } │
│ 2. spawn: cargo run --bin sg_scene_generate \ │
│ --source-dir <sourceDir> \ │
│ --scene-id <sceneId> \ │
│ --scene-name <sceneName> \ │
│ --output-root <outputRoot> \ │
│ --lessons <lessons> │
│ 3. 通过 SSE 实时推送 stdout/stderr │
│ 4. 推送完成/失败事件 │
│ │
│ GET /health │
│ → { status: "ok", pid: 12345 } │
│ │
│ GET / │
│ → 服务 sg_scene_generator.html 静态文件 │
└─────────────────────────────────────────────────────────────────┘
│ LLM API (OpenAI-compatible format)
│ POST {baseUrl}/v1/chat/completions
┌──────────────────────┐
│ LLM (DeepSeek) │
│ │
│ System: 你是一个场景 │
│ 信息提取助手... │
│ User: 以下是场景目录 │
│ 内容... 请提取 │
│ scene-id 和 │
│ scene-name │
└──────────────────────┘
```
---
## 3. File Map
### New files
| File | Description |
|------|------|
| `frontend/scene-generator/sg_scene_generator.html` | Main page with inline CSS + JS, reusing the service-console design style |
| `frontend/scene-generator/server.js` | Lightweight Node.js HTTP server (zero external dependencies) |
| `frontend/scene-generator/serve.sh` | One-click start script (Windows compatible) |
| `frontend/scene-generator/serve.cmd` | One-click start script for Windows |
| `frontend/scene-generator/config-loader.js` | Reads and parses `sgclaw_config.json` |
| `frontend/scene-generator/llm-client.js` | Wraps LLM API calls (OpenAI-compatible format) |
| `frontend/scene-generator/generator-runner.js` | Wraps the `sg_scene_generate` subprocess call + SSE push |
### Referenced files (not modified)
| File | Purpose |
|------|------|
| `src/bin/sg_scene_generate.rs` | Invoked by server.js via `cargo run` |
| `src/generated_scene/generator.rs` | Reference for the generation logic and output structure |
| `sgclaw_config.json` | Source of the LLM connection settings (apiKey, baseUrl, model) |
| `docs/superpowers/references/tq-lineloss-lessons-learned.toml` | Default lessons path |
| `frontend/service-console/sg_claw_service_console.html` | UI style reference |
---
## 4. UI Design
### 4.1 整体布局
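Reuses the two-column layout of service-console:
- **Outer container (`.shell`)**: rounded glassmorphism panel sharing CSS variables with service-console
- **Top (`.hero`)**: title "场景 Skill 生成器" plus a short description
- **Content area (`.content`)**: two-column `grid`, operation panel on the left and log stream on the right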
### 4.2 左侧操作面板
#### 场景目录选择区
```
📂 场景目录
[ 粘贴或输入路径 ____________________________ ] [ 浏览 📁 ]
当前D:\data\ideaSpace\rust\sgClaw\claw-new\examples\generated_scene_platform\scenarios\tq-lineloss-report
```
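The original idea was a text input plus a "浏览" (Browse) button, where clicking the button would call `POST /browse` and have Node.js open a system directory picker. An `electron`-style `open-dialog` is not feasible in a plain Node.js server, so the design instead has **the user paste or type the path into the input box**, and the server checks the path with `fs.stat`.
To keep the implementation simple, the pragmatic approach is:
- Main input: the user pastes or types the **absolute path** of the scene directory
- Pressing Enter or clicking the "分析" (Analyze) button triggers the `/analyze` request
- The server validates the path with `fs.statSync(sourceDir).isDirectory()`
**Optional enhancement**: if `electron` is available, `dialog.showOpenDialog` could open a system picker, but this adds a dependency, so it is not adopted by default.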
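The server-side check described above might look like the following sketch. `validateSourceDir` and the shape of its return value are hypothetical; the 400 status follows the error-handling table in section 6.4, and rejecting relative paths is an assumption based on the "absolute path" requirement.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical validator for the sourceDir field of POST /analyze.
function validateSourceDir(sourceDir) {
  if (typeof sourceDir !== "string" || sourceDir.trim() === "") {
    return { ok: false, status: 400, error: "sourceDir is required" };
  }
  if (!path.isAbsolute(sourceDir)) {
    return { ok: false, status: 400, error: "sourceDir must be an absolute path" };
  }
  try {
    if (!fs.statSync(sourceDir).isDirectory()) {
      return { ok: false, status: 400, error: "sourceDir is not a directory" };
    }
  } catch (err) {
    // statSync throws (for example ENOENT) when the path does not exist.
    return { ok: false, status: 400, error: "sourceDir does not exist" };
  }
  return { ok: true, path: path.resolve(sourceDir) };
}
```

Returning a result object instead of throwing keeps the route handler free of try/catch and makes the 400 response easy to build.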
#### 自动提取结果(只读展示,可手动修正)
```
scene-id
tq-lineloss-report
scene-name
台区线损报表
```
分析中显示 loading 状态,分析失败时可手动输入。
#### 设置按钮
点击弹出模态框,包含以下字段:
| Field | Default | Description |
|------|--------|------|
| Output root path | `D:/data/ideaSpace/rust/sgClaw/claw-new/examples/generated_scene_platform` | Output root for skill packages; the actual output goes to `<output-root>/skills/<scene-id>/` |
| Lessons path | `D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/references/tq-lineloss-lessons-learned.toml` | Path of the lessons TOML file |
| LLM service URL | `baseUrl` from `sgclaw_config.json` | Can be overridden |
| LLM model | `model` from `sgclaw_config.json` | Can be overridden |
| Node service port | `3210` | Port that server.js listens on |
#### 生成按钮
```
[ 🚀 生成 Skill ] (disabled 直到选择了目录且提取完成)
```
### 4.3 右侧日志流
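Streaming log display consistent with service-console:
- **Empty state**: shows the hint "选择场景目录开始生成"
- **status lines**: key stage markers ("开始分析", "提取完成", "开始生成", "生成成功")
- **log lines**: stdout of cargo run
- **error lines**: stderr output or error messages
- **complete line**: final result, including the path of the generated skill package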
### 4.4 状态卡片
左侧面板顶部显示当前状态:
```
[●] 就绪 / 分析中 / 生成中 / 完成 / 错误
```
Color coding:
- 就绪 (ready): gray
- 分析中 (analyzing): orange
- 生成中 (generating): cyan (accent)
- 完成 (done): green
- 错误 (error): red
---
## 5. API Design
### 5.1 POST /analyze
**请求体:**
```json
{
"sourceDir": "D:/data/ideaSpace/rust/sgClaw/claw-new/examples/generated_scene_platform/scenarios/tq-lineloss-report"
}
```
服务端自行读取目录内容:
- 校验路径是否存在且为目录
- 读取 `scene.toml`(如果存在)
- 读取 `*.js` 脚本文件
- 读取 `SKILL.md` / `SKILL.toml`(如果存在)
- 生成目录结构树
**响应:**
```json
{
"sceneId": "tq-lineloss-report",
"sceneName": "台区线损报表"
}
```
**LLM Prompt 设计:**
```
System: 你是一个场景信息提取助手。根据场景目录的内容,提取 scene-id 和 scene-name。
scene-id 规则:
- 使用英文短横线连接,如 tq-lineloss-report
- 全小写,有业务含义
scene-name 规则:
- 使用中文,简短描述性名称
- 如 "台区线损报表"、"知乎热榜导出"
User: 以下是场景目录的内容:
=== scene.toml ===
[scene content here]
=== 脚本文件 ===
[script content here]
=== 目录结构 ===
[file tree here]
请以 JSON 格式返回:{"sceneId": "...", "sceneName": "..."}
```
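Because the prompt asks for JSON but LLM replies often wrap it in a code fence or extra prose, the server needs a tolerant parser. The sketch below is one possible approach: `parseSceneIdentity` is a hypothetical name, and allowing digits in `sceneId` is an assumption beyond the "lowercase, hyphen-joined" rule stated in the prompt.

```javascript
// Hypothetical parser for the LLM reply to the /analyze prompt.
function parseSceneIdentity(llmText) {
  // Take everything from the first "{" to the last "}" so that code fences
  // or surrounding prose in the reply are ignored.
  const start = llmText.indexOf("{");
  const end = llmText.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON object found in LLM reply");
  }
  const { sceneId, sceneName } = JSON.parse(llmText.slice(start, end + 1));
  // scene-id rule from the prompt: lowercase words joined by hyphens.
  if (typeof sceneId !== "string" || !/^[a-z0-9]+(-[a-z0-9]+)*$/.test(sceneId)) {
    throw new Error("invalid sceneId: " + sceneId);
  }
  if (typeof sceneName !== "string" || sceneName.trim() === "") {
    throw new Error("invalid sceneName");
  }
  return { sceneId, sceneName: sceneName.trim() };
}
```

When parsing fails, the route can return the 502 described in section 6.4 so that the frontend falls back to manual input.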
### 5.2 POST /generate (SSE)
**请求体:**
```json
{
"sourceDir": "/path/to/scenario/dir",
"sceneId": "tq-lineloss-report",
"sceneName": "台区线损报表",
"outputRoot": "/path/to/output/root",
"lessons": "/path/to/lessons.toml"
}
```
**SSE 事件流:**
```
event: status
data: {"message": "开始生成 skill 包..."}
event: status
data: {"message": "调用 sg_scene_generate..."}
event: log
data: {"message": "generated scene package: ..."}
event: complete
data: {"success": true, "skillRoot": "/path/to/skills/tq-lineloss-report"}
event: error
data: {"message": "生成失败: ..."}
```
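Each frame in the event stream above follows the standard Server-Sent Events wire format: an `event:` line, a `data:` line, and a blank line as terminator. A minimal sketch of the serialization, where `formatSseEvent` and `SSE_HEADERS` are hypothetical names rather than identifiers from the actual `generator-runner.js`:

```javascript
// Response headers for an SSE stream.
const SSE_HEADERS = {
  "Content-Type": "text/event-stream",
  "Cache-Control": "no-cache",
  Connection: "keep-alive",
};

// Serialize one SSE frame. JSON.stringify never emits raw newlines,
// so the payload always fits on a single "data:" line.
function formatSseEvent(event, data) {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

console.log(formatSseEvent("status", { message: "调用 sg_scene_generate..." }));
```

In a handler this would be used as `res.writeHead(200, SSE_HEADERS)` followed by `res.write(formatSseEvent(...))` for each stdout or stderr chunk.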
### 5.3 GET /health
**响应:**
```json
{
"status": "ok",
"pid": 12345,
"configLoaded": true,
"configPath": "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
}
```
---
## 6. Server Design (server.js)
### 6.1 模块结构
```
server.js           - HTTP routing entry point, SSE connection management
config-loader.js    - reads sgclaw_config.json, exposes LLM config + projectRoot
llm-client.js       - calls the LLM API, returns the extracted JSON
generator-runner.js - spawns the subprocess, pushes output over SSE
```
### 6.1.1 projectRoot 配置
`cargo run --bin sg_scene_generate` 需要在项目根目录下执行。`projectRoot` 的确定优先级:
1. 环境变量 `SGCLAW_PROJECT_ROOT`(最高优先级)
2. `sgclaw_config.json` 同级目录(常见情况:配置文件在项目根目录)
3. 启动脚本所在目录
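The priority order above could be implemented roughly as follows. `resolveProjectRoot` and its parameter names are hypothetical; passing `env` in explicitly (instead of reading `process.env` inside) is a choice made here only to keep the function easy to test.

```javascript
const path = require("path");

// Hypothetical resolver following the documented priority:
// 1. SGCLAW_PROJECT_ROOT, 2. directory of sgclaw_config.json, 3. script directory.
function resolveProjectRoot(env, configPath, scriptDir) {
  if (env.SGCLAW_PROJECT_ROOT) {
    return path.resolve(env.SGCLAW_PROJECT_ROOT);
  }
  if (configPath) {
    return path.dirname(path.resolve(configPath));
  }
  return path.resolve(scriptDir);
}
```

A typical call would be `resolveProjectRoot(process.env, loadedConfigPath, __dirname)`, with the result passed as `cwd` when spawning `cargo run`.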
### 6.2 零依赖原则
仅使用 Node.js 内置模块:
- `http` — HTTP 服务器
- `fs` — 文件读取
- `path` — 路径处理
- `child_process` — 子进程调用
- `events` — 事件发射
### 6.3 启动流程
```
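1. Read sgclaw_config.json (path from the SGCLAW_CONFIG_PATH environment variable, defaulting to ../sgclaw_config.json)
2. Validate required fields: apiKey, baseUrl, model
3. Start the HTTP server, listening on 127.0.0.1:3210
4. Print startup info, including the access URL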
```
### 6.4 错误处理
| 场景 | 处理方式 |
|------|----------|
| sgclaw_config.json 不存在 | 启动失败,提示用户设置环境变量 |
| LLM API 调用失败 | 返回 502 + 错误信息,前端允许手动输入 |
| cargo run 失败 | SSE 推送 error 事件,显示 stderr |
| source-dir 不存在 | 返回 400 |
| 端口被占用 | 启动失败,提示更换端口 |
---
## 7. Security Considerations
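1. **Listen on localhost only**: server.js binds to `127.0.0.1` by default and is not exposed to external networks
2. **API key never reaches the frontend**: LLM API calls happen entirely in the Node.js server; the frontend never touches the API key
3. **Path validation**: `sourceDir` and `outputRoot` get basic validity checks to guard against path traversal
4. **Subprocess timeout**: `cargo run` gets a 5-minute timeout to prevent hangs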
---
## 8. Default Configuration
| 配置项 | 默认值 | 来源 |
|--------|--------|------|
| LLM apiKey | `sgclaw_config.json` 中的 `apiKey` | 启动时读取 |
| LLM baseUrl | `sgclaw_config.json` 中的 `baseUrl` | 启动时读取 |
| LLM model | `sgclaw_config.json` 中的 `model` | 启动时读取 |
| 默认 lessons 路径 | `docs/superpowers/references/tq-lineloss-lessons-learned.toml` | 项目约定 |
| 默认输出根路径 | `examples/generated_scene_platform` | 项目约定 |
| Node 服务端口 | `3210` | 硬编码,可配置 |
---
## 9. User Flow
```
1. 用户运行 bash serve.sh (或 node server.js)
2. 浏览器打开 http://127.0.0.1:3210
3. 页面加载,显示 "就绪" 状态
4. 用户在 "场景目录" 输入框中粘贴或输入绝对路径
5. 用户点击 "分析" 按钮(或输入框回车),触发 /analyze 请求
6. server.js 读取目录内容,调用 LLM 提取 scene-id/name
7. 页面自动填充 scene-id 和 scene-name 字段
8. 用户确认/修改字段,点击 "设置" 检查输出路径和 lessons
9. 用户点击 "生成 Skill"
10. server.js 通过 SSE 推送实时进度
11. 页面右侧日志流展示生成过程
12. 生成完成,显示 skill 包路径
13. 用户可前往输出目录查看生成的 skill
```
---
## 10. Windows Compatibility
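Because the target platform is Windows:
- `serve.cmd` is provided alongside `serve.sh` as the Windows alternative
- Path separators are normalized to `/` (handled by the Node.js `path` module)
- The `cargo run` command works the same on Windows
- The path input accepts Windows-style paths (e.g. `D:\data\ideaSpace\...`)
- The server converts `\` to `/` for compatibility with the Rust CLI arguments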
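The backslash conversion mentioned in the last bullet is a one-liner. In the sketch below, `toForwardSlashes` is a hypothetical name, and stripping surrounding double quotes is an added assumption (Windows Explorer's "Copy as path" wraps the path in quotes), not something the document requires.

```javascript
// Hypothetical normalizer for paths pasted into the Web UI.
function toForwardSlashes(input) {
  return String(input)
    .trim()
    .replace(/^"(.*)"$/, "$1") // assumption: drop quotes added by "Copy as path"
    .replace(/\\/g, "/");      // D:\a\b -> D:/a/b
}

console.log(toForwardSlashes("D:\\desk\\场景\\营销报表"));
```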
---
## 11. Future Extensions (Not in Scope)
- 批量生成:一次选择多个场景目录
- 生成后自动注册到 scene.toml manifest
- 生成后自动运行 skill 测试
- 历史记录:保存之前的生成记录
- 生成参数模板:保存常用的输出路径/lessons 组合

@@ -0,0 +1,573 @@
# 增强 LLM 提取 Schema - 多模式业务逻辑自动化
> **Status:** Draft
> **Date:** 2026-04-17
> **Author:** Qoder
## Problem Statement
当前 `sg_scene_generate` 自动生成的 skill 脚本与 Claude 手写的 skill 存在显著差距:
### 差距清单
| 差距类型 | 描述 | 严重程度 |
|----------|------|----------|
| 多模式支持 | 无法识别 month/week 等多模式场景,只生成单一逻辑 | 🔴 高 |
| 多 API 端点 | 定义多个 API 但只使用第一个 | 🔴 高 |
| 请求格式检测 | 默认 JSON body未检测 form-urlencoded | 🟡 中 |
| 数据归一化 | 简单映射,无关键字段验证和空行过滤 | 🟡 中 |
| 参数标准化 | 无参数标准化处理,直接透传 | 🟡 中 |
| 分页参数 | 未提取和处理分页参数 (rows/page/sidx/sord) | 🟡 中 |
### 对比分析
| 功能维度 | tq-lineloss-report (手写) | marketing-zero-consumer-report (自动) |
|----------|---------------------------|----------------------------------------|
| 模式切换 | month/week 两套完整逻辑 | 无 |
| API 端点 | queryMonthData + queryWeekData | 只用 API_ENDPOINTS[0] |
| 列定义 | MONTH/WEEK_COLUMN_DEFS 双套 | 单一 COLUMN_DEFS |
| 请求构建 | buildMonthRequest + buildWeekRequest | 单一 buildRequest |
| 请求格式 | form-urlencoded | application/json |
| 数据验证 | 关键字段验证 + filter(Boolean) | 无验证 |
| 参数标准化 | normalized 对象 + 类型转换 | 无 |
## Goal
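Enhance the LLM extraction schema so that it can automatically identify the following from index.html:
1. **Multi-mode business logic** (month/week, etc.)
2. **Mode switch conditions** (how to decide which mode applies)
3. **Per-mode configuration** (API, column definitions, request format, validation rules)
4. **Data normalization rules** (key fields, filter conditions)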
## Non-Goals
- 不改变现有的两阶段架构 (LLM 提取 → Rust 渲染)
- 不增加新的 CLI 参数
- 不支持超过 2 种模式的复杂场景 (Phase 1)
- 不处理认证/鉴权逻辑 (假设页面已登录)
## Architecture
### 增强后的 Schema 结构
```json
{
"sceneId": "tq-lineloss-report",
"sceneName": "台区线损报表",
"sceneKind": "report_collection",
"modes": [
{
"name": "month",
"label": "月度报表",
"condition": {
"field": "period_mode",
"operator": "equals",
"value": "month"
},
"apiEndpoint": {
"name": "月度线损查询",
"url": "http://20.76.57.61:18080/gsllys/fourVerEightHor/fourVerEightHorLinelossRateList",
"method": "POST",
"contentType": "application/x-www-form-urlencoded"
},
"columnDefs": [
["ORG_NAME", "供电单位"],
["YGDL", "累计供电量"],
["YYDL", "累计售电量"],
["YXSL", "线损完成率(%)"],
["RAT_SCOPE", "线损率累计目标值"],
["BLANK3", "目标完成率"],
["BLANK2", "排行"]
],
"requestTemplate": {
"orgno": "${args.org_code}",
"yn_flag": 0,
"_search": false,
"nd": "${Date.now()}",
"rows": 1000,
"page": 1,
"sidx": "TO_NUMBER(ORG_NO)",
"sord": "asc"
},
"normalizeRules": {
"type": "validate_all_columns",
"filterNull": true
},
"responsePath": "content"
},
{
"name": "week",
"label": "周报表",
"condition": {
"field": "period_mode",
"operator": "equals",
"value": "week"
},
"apiEndpoint": {
"name": "周线损查询",
"url": "http://20.76.57.61:18080/gsllys/tqLinelossStatis/getYearMonWeekLinelossAnalysisList",
"method": "POST",
"contentType": "application/x-www-form-urlencoded"
},
"columnDefs": [
["ORG_NAME", "供电单位"],
["LINE_LOSS_RATE", "综合线损率(%)"],
["PPQ", "供电量(Kwh)"],
["UPQ", "售电量(Kwh)"],
["LOSS_PQ", "损失电量(Kwh)"]
],
"requestTemplate": {
"orgno": "${args.org_code}",
"tjzq": "week",
"level": "00",
"_search": false,
"nd": "${Date.now()}",
"rows": 1000,
"page": 1,
"sidx": "TO_NUMBER(ORG_NO)",
"sord": "asc"
},
"normalizeRules": {
"type": "validate_required",
"requiredFields": ["ORG_NAME", "LINE_LOSS_RATE"],
"filterNull": true
},
"responsePath": "content"
}
],
"defaultMode": "month",
"modeSwitchField": "period_mode",
"commonParams": {},
"staticParams": {},
"expectedDomain": "20.76.57.61",
"targetUrl": "http://20.76.57.61:18080/gsllys"
}
```
### Schema 字段说明
#### 顶层字段
| 字段 | 类型 | 说明 |
|------|------|------|
| `modes` | Array | 模式配置数组 |
| `defaultMode` | string | 默认模式名称 |
| `modeSwitchField` | string | 用于切换模式的参数字段名 |
#### Mode 配置
| 字段 | 类型 | 说明 |
|------|------|------|
| `name` | string | 模式标识符 |
| `label` | string | 模式显示名称 |
| `condition` | object | 模式激活条件 |
| `apiEndpoint` | object | 该模式的 API 配置 |
| `columnDefs` | Array | 该模式的列定义 |
| `requestTemplate` | object | 请求参数模板 |
| `normalizeRules` | object | 数据归一化规则 |
| `responsePath` | string | 响应数据路径 (如 `content`) |
#### Condition 结构
```json
{
"field": "period_mode", // 检查的字段
"operator": "equals", // 操作符: equals, not_equals, in, contains
"value": "month" // 比较值
}
```
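To make the operator list concrete, here is a sketch of how a condition could be evaluated against `args`. The function names are hypothetical, and the semantics chosen for `in` (value is an array of allowed values) and `contains` (the argument is a string or array that includes the value) are assumptions; the document only names the operators.

```javascript
// Hypothetical evaluator for the Condition structure.
function matchesCondition(condition, args) {
  const actual = args[condition.field];
  const expected = condition.value;
  switch (condition.operator || "equals") {
    case "equals":
      return actual === expected;
    case "not_equals":
      return actual !== expected;
    case "in":
      // Assumption: `value` is an array of allowed values.
      return Array.isArray(expected) && expected.includes(actual);
    case "contains":
      // Assumption: the argument is a string or array that includes `value`.
      return (typeof actual === "string" || Array.isArray(actual)) && actual.includes(expected);
    default:
      throw new Error("unsupported operator: " + condition.operator);
  }
}

// Pick the first mode whose condition matches, else the default mode, else null.
function selectMode(modes, args, defaultMode) {
  return (
    modes.find((m) => matchesCondition(m.condition, args)) ||
    modes.find((m) => m.name === defaultMode) ||
    null
  );
}
```

Returning `null` for an empty `modes` array leaves it to the caller to raise a clear error instead of reading properties of `undefined`.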
#### NormalizeRules 结构
```json
{
"type": "validate_required", // validate_all_columns | validate_required
"requiredFields": ["ORG_NAME", "LINE_LOSS_RATE"], // 仅 validate_required 需要
"filterNull": true // 是否过滤空值行
}
```
### 数据流
```
用户请求 (args)
validateArgs() → 参数验证
detectMode(args[modeSwitchField]) → 检测当前模式
selectModeConfig(mode) → 选择模式配置
buildRequest(args, modeConfig) → 构建该模式的请求
queryData(request, modeConfig.apiEndpoint) → HTTP 请求
extractResponse(response, modeConfig.responsePath) → 提取数据
normalizeRows(data, modeConfig.normalizeRules) → 数据归一化
buildArtifact({ mode, columnDefs, rows, ... }) → 构建输出
```
## Implementation Details
### 1. 增强 LLM 提取 Prompt
Modify `DEEP_SYSTEM_PROMPT` and `buildDeepAnalyzePrompt` in `frontend/scene-generator/llm-client.js`.
**Key prompt enhancements:**
```
分析 index.html 中的业务逻辑模式:
1. **模式识别**
- 查找条件分支逻辑 (if/switch) 中基于 period_mode、reportType 等字段的分支
- 识别不同分支对应的 API 端点、列定义、请求格式
2. **API 提取**
- 提取 $.ajax/fetch 调用中的 URL、method、contentType
- 识别请求参数构造方式 (JSON.stringify vs 对象展开)
- 检测分页参数 (rows/page/sidx/sord)
3. **请求格式检测**
- contentType: application/json → JSON body
- contentType: application/x-www-form-urlencoded → 对象展开
- 无显式 contentType → 检查 data 参数格式
4. **数据归一化**
- 查找数据渲染/表格填充逻辑
- 识别字段映射关系
- 检测数据验证条件 (哪些字段不能为空)
5. **响应路径**
- 识别数据在响应中的位置 (response.content / response.data / response)
```
### 2. 增强 Rust Schema 结构
Modify `src/generated_scene/generator.rs`:
```rust
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct ModeConditionJson {
    pub field: String,
    #[serde(default)]
    pub operator: String, // equals, not_equals, in, contains
    pub value: serde_json::Value,
}

// Note on `alias`: the LLM JSON uses camelCase keys (apiEndpoint, columnDefs, ...).
// `alias` accepts those keys when deserializing, while serialization keeps the
// snake_case names that the generated JS reads (mode.api_endpoint, ...).
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct NormalizeRulesJson {
    #[serde(rename = "type")]
    pub rules_type: String, // validate_all_columns, validate_required
    #[serde(default, alias = "requiredFields")]
    pub required_fields: Vec<String>,
    #[serde(default, alias = "filterNull")]
    pub filter_null: bool,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct ApiEndpointEnhancedJson {
    pub name: String,
    pub url: String,
    #[serde(default)]
    pub method: String,
    #[serde(default, alias = "contentType")]
    pub content_type: Option<String>,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct ModeConfigJson {
    pub name: String,
    pub label: Option<String>,
    pub condition: ModeConditionJson,
    #[serde(alias = "apiEndpoint")]
    pub api_endpoint: ApiEndpointEnhancedJson,
    #[serde(alias = "columnDefs")]
    pub column_defs: Vec<(String, String)>,
    #[serde(default, alias = "requestTemplate")]
    pub request_template: Option<serde_json::Value>,
    #[serde(default, alias = "normalizeRules")]
    pub normalize_rules: Option<NormalizeRulesJson>,
    #[serde(default, alias = "responsePath")]
    pub response_path: Option<String>,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct SceneInfoEnhancedJson {
    #[serde(rename = "sceneId")]
    pub scene_id: String,
    #[serde(rename = "sceneName")]
    pub scene_name: String,
    // New: multi-mode support. `default` lets single-mode JSON omit `modes`.
    #[serde(default)]
    pub modes: Vec<ModeConfigJson>,
    #[serde(default, alias = "defaultMode")]
    pub default_mode: Option<String>,
    #[serde(default, alias = "modeSwitchField")]
    pub mode_switch_field: Option<String>,
    // Backward compatibility: single-mode scenes
    #[serde(rename = "apiEndpoints", default)]
    pub api_endpoints: Vec<ApiEndpointJson>,
    #[serde(rename = "columnDefs", default)]
    pub column_defs: Vec<(String, String)>,
    // Other fields unchanged
    // ...
}
```
### 3. 增强模板生成逻辑
```rust
fn browser_script_with_modes(scene_id: &str, scene_info: &SceneInfoEnhancedJson) -> String {
if scene_info.modes.is_empty() {
// 向后兼容:使用现有单模式模板
return browser_script_with_business_logic(scene_id, scene_info);
}
// 多模式模板
generate_multi_mode_script(scene_id, scene_info)
}
fn generate_multi_mode_script(scene_id: &str, scene_info: &SceneInfoEnhancedJson) -> String {
let modes_json = serde_json::to_string_pretty(&scene_info.modes).unwrap();
let default_mode = scene_info.default_mode.as_deref().unwrap_or("month");
format!(r#"
const REPORT_NAME = '{scene_id}';
const MODES = {modes_json};
const DEFAULT_MODE = '{default_mode}';
const MODE_SWITCH_FIELD = '{mode_switch_field}';
function detectMode(args) {{
const modeValue = args[MODE_SWITCH_FIELD] || DEFAULT_MODE;
return MODES.find(m => m.condition.value === modeValue) || MODES[0];
}}
function buildModeRequest(args, mode) {{
const endpoint = mode.api_endpoint;
const template = mode.request_template || {{}};
// 根据 contentType 构建请求
const contentType = endpoint.content_type || 'application/json';
const url = endpoint.url;
const method = endpoint.method || 'POST';
// 解析模板,替换变量
let body;
if (contentType === 'application/x-www-form-urlencoded') {{
body = {{ ...template }};
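// Substitute template variables. Only `args.<field>` and `Date.now()` are resolved;
// LLM-extracted text is never passed to eval(). Other placeholders are left as-is.
for (const [key, value] of Object.entries(body)) {{
if (typeof value === 'string' && value.startsWith('${{') && value.endsWith('}}')) {{
const expr = value.slice(2, -1);
if (expr === 'Date.now()') {{
body[key] = Date.now();
}} else if (/^args\.[A-Za-z_$][\w$]*$/.test(expr)) {{
body[key] = args[expr.slice(5)];
}}
}}
}}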
body.orgno = args.org_code;
}} else {{
body = JSON.stringify({{ ...template, ...args }});
}}
return {{ url, method, headers: {{ 'Content-Type': contentType }}, body }};
}}
function normalizeModeRows(data, mode) {{
const rules = mode.normalize_rules || {{ type: 'validate_all_columns' }};
const columns = mode.column_defs.map(([key]) => key);
if (!Array.isArray(data)) return [];
return data.map(row => {{
const result = {{}};
for (const key of columns) {{
result[key] = row[key] ?? '';
}}
return result;
}}).filter(row => {{
if (!rules.filter_null) return true;
if (rules.type === 'validate_required') {{
return rules.required_fields.every(f => row[f] !== '');
}}
return columns.every(k => row[k] !== '');
}});
}}
async function queryModeData(args, mode) {{
const endpoint = mode.api_endpoint;
const request = buildModeRequest(args, mode);
// jQuery 优先
if (typeof $ !== 'undefined' && typeof $.ajax === 'function') {{
const contentType = endpoint.content_type || 'application/json';
return new Promise((resolve, reject) => {{
$.ajax({{
url: request.url,
type: request.method,
data: request.body,
contentType: contentType,
dataType: 'json',
success: resolve,
error: (xhr, status, err) => reject(new Error(
`API failed (${{xhr.status}}): ${{err}}`
))
}});
}});
}}
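// fetch fallback: a form-urlencoded body is a plain object here and must be
// encoded explicitly, otherwise fetch would send "[object Object]"
if (typeof fetch === 'function') {{
const fetchBody = typeof request.body === 'string'
? request.body
: new URLSearchParams(request.body).toString();
const response = await fetch(request.url, {{
method: request.method,
headers: request.headers,
body: request.method !== 'GET' ? fetchBody : undefined
}});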
if (!response.ok) {{
const text = await response.text().catch(() => '');
throw new Error(`HTTP ${{response.status}}: ${{text.substring(0, 200)}}`);
}}
return response.json();
}}
throw new Error('No HTTP client available');
}}
async function buildBrowserEntrypointResult(args, deps = defaultDeps) {{
// 1. 参数验证
const validation = validateArgs(args);
if (!validation.valid) {{
return buildArtifact({{
status: 'blocked',
blockedReason: 'validation_failed',
reasons: validation.errors,
rows: [],
args
}});
}}
// 2. 页面上下文验证
const pageValidation = typeof deps.validatePageContext === 'function'
? deps.validatePageContext(args)
: {{ ok: true }};
if (!pageValidation?.ok) {{
return buildArtifact({{
status: 'blocked',
blockedReason: pageValidation?.reason || 'page_context_mismatch',
reasons: [pageValidation?.reason || 'page_context_mismatch'],
rows: [],
args
}});
}}
// 3. 检测模式
const mode = detectMode(args);
// 4. 数据获取
const reasons = [];
let rawData = null;
try {{
rawData = await queryModeData(args, mode);
}} catch (error) {{
return buildArtifact({{
status: 'error',
fatalError: error.message,
reasons: ['api_query_failed:' + error.message],
rows: [],
args
}});
}}
// 5. 提取响应数据
const responsePath = mode.response_path || '';
let data = rawData;
if (responsePath && rawData) {{
data = rawData[responsePath] || rawData;
}}
// 6. 数据归一化
const rows = normalizeModeRows(data, mode);
if (rows.length === 0 && Array.isArray(data) && data.length > 0) {{
reasons.push('row_normalization_partial');
}}
// 7. 构建 Artifact
return buildArtifact({{
reasons,
rows,
args,
columnDefs: mode.column_defs,
columns: mode.column_defs.map(([key]) => key)
}});
}}
"#, scene_id = scene_id, modes_json = modes_json, default_mode = default_mode, mode_switch_field = scene_info.mode_switch_field.as_deref().unwrap_or("period_mode"))
}
```
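Since `requestTemplate` values come from LLM output, resolving `${...}` placeholders with `eval` would execute arbitrary extracted text. One safer alternative is a resolver that recognizes only a fixed set of patterns. The sketch below is written as plain JavaScript (outside the Rust `format!` escaping); it supports `args.field`, `args['field']`, and, as an assumption based on the schema example, `Date.now()`. All helper names are illustrative.

```javascript
const ARGS_DOT = /^args\.([A-Za-z_$][\w$]*)$/;
const ARGS_BRACKET = /^args\[(['"])([^'"\]]+)\1\]$/;

// Read only own properties so that names like "constructor" resolve to undefined.
function lookup(args, key) {
  return Object.prototype.hasOwnProperty.call(args, key) ? args[key] : undefined;
}

function resolveTemplateValue(value, args, now = Date.now) {
  // Non-strings and strings that are not `${...}` placeholders pass through.
  if (typeof value !== "string" || !value.startsWith("${") || !value.endsWith("}")) {
    return value;
  }
  const expr = value.slice(2, -1).trim();
  if (expr === "Date.now()") return now();
  const dot = ARGS_DOT.exec(expr);
  if (dot) return lookup(args, dot[1]);
  const bracket = ARGS_BRACKET.exec(expr);
  if (bracket) return lookup(args, bracket[2]);
  throw new Error("unsupported template expression: " + expr);
}

function resolveTemplate(template, args, now) {
  return Object.fromEntries(
    Object.entries(template).map(([k, v]) => [k, resolveTemplateValue(v, args, now)])
  );
}
```

Throwing on an unknown expression surfaces a bad extraction early; a more lenient variant could leave the placeholder untouched and record a warning instead.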
## Testing Strategy
### 单元测试
1. **Schema 解析测试**
- 测试多模式 JSON 正确解析
- 测试向后兼容(无 modes 字段时退化)
2. **模式检测测试**
- 测试 `detectMode()` 根据参数正确选择模式
3. **请求构建测试**
- 测试 form-urlencoded 格式正确
- 测试 JSON 格式正确
- 测试模板变量替换
4. **数据归一化测试**
- 测试 validate_required 类型
- 测试 validate_all_columns 类型
- 测试空行过滤
### 集成测试
1. **端到端测试**
- 使用 tq-lineloss-report 源场景
- 验证生成的脚本与手写版本功能一致
2. **回归测试**
- 验证单模式场景仍正常工作
- 验证现有测试用例通过
## Migration Path
### Phase 1: Schema 增强
1. 增强 Rust schema 结构
2. 更新 LLM 提取 prompt
3. 验证 schema 解析正确
### Phase 2: 模板实现
1. 实现多模式模板
2. 实现请求格式检测
3. 实现数据归一化规则
### Phase 3: 测试验证
1. 使用 tq-lineloss-report 源场景测试
2. 对比生成代码与手写代码
3. 修复差异
## Risks and Mitigations
| 风险 | 影响 | 缓解措施 |
|------|------|----------|
| LLM 提取不准确 | 生成代码不可用 | 提供 few-shot 示例,增加验证步骤 |
| 模式条件复杂 | 无法正确切换 | Phase 1 只支持 equals 操作符 |
| 请求模板变量 | 表达能力有限 | 支持常用表达式,复杂场景用 lessons 补充 |
| 向后兼容 | 现有场景受影响 | 无 modes 时使用旧模板 |
## Success Criteria
1. **多模式支持**:能够生成 month/week 双模式脚本
2. **请求格式正确**:自动检测 form-urlencoded vs JSON
3. **数据验证**:支持关键字段验证和空行过滤
4. **向后兼容**:单模式场景不受影响
5. **代码质量**:生成的代码与手写 tq-lineloss-report 功能对等

@@ -0,0 +1,490 @@
# LLM-Driven Skill Generation Design
> **Status:** Draft
> **Date:** 2026-04-17
> **Author:** Qoder
## Problem Statement
`sg_scene_generate` 当前只生成"骨架"技能包,缺乏实际业务逻辑:
### 当前产出 vs 实际需求
| 方面 | 当前产出 | 实际需求 (tq-lineloss-report) |
|------|----------|------------------------------|
| 脚本代码量 | 51 行 | 433 行 |
| API 端点 | 无 | 有完整定义 |
| 静态参数 | 无 | 有业务参数 |
| 列定义 | 通用模板 | 业务特定 |
| 可运行性 | 需手动填充 | 开箱即用 |
### 根本原因
1. **LLM 分析不读取 index.html** — 场景源码中的业务逻辑被忽略
2. **只提取 scene-id/scene-name** — 缺少 API、参数、列定义等关键信息
3. **Rust 模板过于简单** — 只生成骨架,无法渲染业务逻辑
## Goal
`sg_scene_generate` 自动生成**可直接运行**的完整技能包,包含:
- API 端点定义
- 静态业务参数
- 列定义(导出报表用)
- 数据采集逻辑骨架
- 参数验证逻辑
## Non-Goals
- 不实现 100% 自动化 — 复杂业务逻辑仍需人工校验
- 不支持所有 JavaScript 模式 — 仅覆盖常见场景
- 不替换现有 Rust 模板系统 — 在其基础上增强
- 不处理认证/授权逻辑 — 由运行时环境处理
## Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│ Web UI (HTML) │
│ [选择场景目录] → [分析] → [预览提取结果] → [生成 Skill] │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│ Node.js Server (server.js) │
│ /analyze → LLM 深度提取 (index.html + scripts) │
│ /generate → 传递提取结果给 Rust CLI │
└─────────────────────────────────────────────────────────────────────┘
┌───────────────────┼───────────────────┐
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ llm-client.js │ │ generator- │ │ Rust CLI │
│ (增强提取) │ │ runner.js │ │ (模板渲染) │
└─────────────────┘ └─────────────────┘ └─────────────────┘
```
## LLM Extraction Schema
### Input Sources
| 文件 | 提取内容 |
|------|----------|
| `index.html` | API 端点、静态参数、列定义、业务方法 |
| `scripts/*.js` | 辅助函数、数据转换逻辑 |
| 目录结构 | 文件组织方式 |
### Output Schema
```json
{
"sceneId": "string - 场景标识",
"sceneName": "string - 场景中文名",
"sceneKind": "report_collection | monitoring",
"sourceSystem": "string - 来源系统名",
"expectedDomain": "string - 目标域名",
"targetUrl": "string | null - 目标页面URL",
"apiEndpoints": [
{
"name": "string - API 名称",
"url": "string - 完整 URL",
"method": "GET | POST",
"description": "string - 用途说明"
}
],
"staticParams": {
"key": "value - 静态业务参数"
},
"columnDefs": [
["fieldName", "中文列名"]
],
"entryMethod": "string - 入口方法名",
"businessLogic": {
"dataFetch": "string - 数据获取逻辑描述",
"dataTransform": "string - 数据转换逻辑描述"
}
}
```
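Before the extracted result is shown to the user or passed to the Rust CLI, it is worth checking it against the schema above. The following sketch is a hypothetical `validateSceneInfo` that returns a list of problems; which fields count as mandatory is an assumption, since the schema itself does not mark required fields.

```javascript
// Hypothetical structural check for the LLM extraction result.
function validateSceneInfo(info) {
  const errors = [];
  const isNonEmptyString = (v) => typeof v === "string" && v.trim() !== "";
  const asArray = (v) => (Array.isArray(v) ? v : []);

  if (!isNonEmptyString(info.sceneId)) errors.push("sceneId must be a non-empty string");
  if (!isNonEmptyString(info.sceneName)) errors.push("sceneName must be a non-empty string");
  if (!["report_collection", "monitoring"].includes(info.sceneKind)) {
    errors.push("sceneKind must be report_collection or monitoring");
  }
  asArray(info.apiEndpoints).forEach((ep, i) => {
    if (!isNonEmptyString(ep.url)) errors.push(`apiEndpoints[${i}].url is missing`);
    if (!["GET", "POST"].includes(ep.method)) {
      errors.push(`apiEndpoints[${i}].method must be GET or POST`);
    }
  });
  asArray(info.columnDefs).forEach((col, i) => {
    const ok = Array.isArray(col) && col.length === 2 && col.every((v) => isNonEmptyString(v));
    if (!ok) errors.push(`columnDefs[${i}] must be a [fieldName, label] pair`);
  });
  return errors;
}
```

An empty array means the result can go to the preview step; otherwise the messages can be shown next to the editable fields in the Web UI.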
### LLM Prompt Template
```
你是一个场景代码分析专家。分析以下场景源码,提取关键业务信息。
## 分析目标
1. **API 端点**: 识别所有 HTTP 请求地址
2. **静态参数**: 识别硬编码的业务参数
3. **列定义**: 识别数据表格/导出的列配置
4. **业务逻辑**: 理解数据获取和转换流程
## 源码内容
=== 目录结构 ===
{directoryTree}
=== index.html ===
{indexHtmlContent}
=== 脚本文件 ===
{scriptsContent}
## 输出格式
请以 JSON 格式返回提取结果:
{
"sceneId": "...",
"sceneName": "...",
...
}
```
## Data Flow
### 当前流程
```
用户选择目录
Node.js 读取目录结构、脚本文件 (不读 index.html)
LLM 只提取 scene-id, scene-name
Rust 生成骨架脚本 (无业务逻辑)
```
### 改造后流程
```
用户选择目录
Node.js 读取目录结构、index.html、脚本文件
LLM 深度提取 API/参数/列定义/业务逻辑
Web UI 展示提取结果供用户确认
用户确认后,提取结果通过 CLI 参数传给 Rust
Rust 根据提取结果渲染完整脚本
```
## Implementation Details
### Task 1: 增强 llm-client.js
**文件**: `frontend/scene-generator/llm-client.js`
**改动**:
1. 新增 `buildDeepAnalyzePrompt()` 函数
2. 增强 `SYSTEM_PROMPT` 包含深度提取指令
3. 新增 `extractSceneInfo()` 函数处理复杂 JSON
**新增接口**:
```javascript
async function analyzeSceneDeep(sourceDir, dirContents, indexHtmlContent, config) {
// 返回完整的 SceneInfo 对象
}
```
### Task 2: 增强 generator-runner.js
**文件**: `frontend/scene-generator/generator-runner.js`
**改动**:
1. `readDirectory()` 增加读取 `index.html`
2. 返回值增加 `indexHtml` 字段
```javascript
function readDirectory(sourceDir) {
// ... 现有逻辑 ...
const indexHtmlPath = p.join(sourceDir, "index.html");
if (fs.existsSync(indexHtmlPath)) {
result.indexHtml = fs.readFileSync(indexHtmlPath, "utf-8");
}
return result;
}
```
### Task 3: Enhance server.js
**File**: `frontend/scene-generator/server.js`
**Changes**:
1. `/analyze` calls the deep analysis
2. `/generate` passes the extraction result to the Rust CLI
```javascript
async function handleAnalyze(req, res) {
// ...
const indexHtml = dirContents.indexHtml;
const result = await analyzeSceneDeep(sourceDir, dirContents, indexHtml, config);
res.json(result);
}
async function handleGenerate(req, res) {
const { sceneInfo, outputRoot, lessons } = body;
// 将 sceneInfo 作为 JSON 参数传递给 Rust CLI
}
```
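A minimal sketch of how `handleGenerate` might hand `sceneInfo` to the Rust CLI, assuming the object is serialized once and passed as a single argv element so that no shell quoting is involved. `buildGeneratorArgs` is a hypothetical helper name; the flag names follow Task 4 and the flags used by `generator-runner.js` in this change.

```javascript
// Hypothetical helper: builds the argv for `cargo run --bin sg_scene_generate`.
// Passing an array to child_process.spawn avoids shell quoting of the JSON.
function buildGeneratorArgs({ sourceDir, sceneInfo, outputRoot, lessons }) {
  const args = [
    "run", "--bin", "sg_scene_generate", "--",
    "--source-dir", sourceDir,
    "--scene-id", sceneInfo.sceneId,
    "--scene-name", sceneInfo.sceneName,
    "--output-root", outputRoot,
  ];
  if (lessons) args.push("--lessons", lessons);
  // The whole extraction result travels as one JSON string.
  args.push("--scene-info-json", JSON.stringify(sceneInfo));
  return args;
}
```

The returned array could then be given to `spawn("cargo", args, ...)`. Very large `sceneInfo` objects may run into OS argument-length limits, which this sketch does not handle.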
### Task 4: Enhance the Rust CLI
**File**: `src/bin/sg_scene_generate.rs`
**New argument**:
```bash
--scene-info-json '<JSON>' # full scene info as JSON
```
**Parsing logic**:
```rust
struct SceneInfoJson {
scene_id: String,
scene_name: String,
scene_kind: String,
source_system: Option<String>,
expected_domain: Option<String>,
target_url: Option<String>,
api_endpoints: Option<Vec<ApiEndpoint>>,
static_params: Option<HashMap<String, String>>,
column_defs: Option<Vec<(String, String)>>,
entry_method: Option<String>,
business_logic: Option<BusinessLogic>,
}
```
### Task 5: Enhance generator.rs
**File**: `src/generated_scene/generator.rs`
**New structs**:
```rust
pub struct ApiEndpoint {
pub name: String,
pub url: String,
pub method: String,
pub description: Option<String>,
}
pub struct BusinessLogic {
pub data_fetch: Option<String>,
pub data_transform: Option<String>,
}
pub struct SceneInfo {
pub scene_id: String,
pub scene_name: String,
pub scene_kind: SceneKind,
pub source_system: Option<String>,
pub expected_domain: Option<String>,
pub target_url: Option<String>,
pub api_endpoints: Vec<ApiEndpoint>,
pub static_params: HashMap<String, String>,
pub column_defs: Vec<(String, String)>,
pub entry_method: Option<String>,
pub business_logic: Option<BusinessLogic>,
}
```
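**Template rendering enhancements**:
```rust
fn browser_script_with_business_logic(scene_id: &str, info: &SceneInfo) -> String {
    // Generate the complete script from SceneInfo:
    // API endpoint definitions, static parameters, column definitions, data-fetch logic
    todo!()
}
```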
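### Task 6: Web UI preview
**File**: `frontend/scene-generator/sg_scene_generator.html`
**Changes**:
1. Show a summary of the extraction result once analysis completes
2. Let the user edit key fields
3. Proceed to generation after confirmation
**Fields shown**:
- Scene ID / name
- API endpoint list
- Column definition preview
- Static parameter summary
## Generated Script Template
### Structure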
```javascript
// ===== 自动生成部分 =====
// 常量定义
const REPORT_NAME = '{scene_id}';
const API_BASE = '{api_base_url}';
const EXPECTED_DOMAIN = '{expected_domain}';
// API 端点
const API_ENDPOINTS = {
{api_name}: '{api_path}',
// ...
};
// 静态参数
const STATIC_PARAMS = {
{key}: '{value}',
// ...
};
// 列定义
const COLUMN_DEFS = [
['{field}', '{label}'],
// ...
];
const COLUMNS = COLUMN_DEFS.map(([k]) => k);
// ===== 标准框架 =====
function validateArgs(args) { /* 参数验证 */ }
function buildRequest(args) { /* 构建请求 */ }
function normalizeRows(rawRows) { /* 数据标准化 */ }
function buildArtifact(opts) { /* 构建 Artifact */ }
async function buildBrowserEntrypointResult(args, deps = defaultDeps()) {
// 1. 参数验证
const validation = validateArgs(args);
if (!validation.ok) {
return buildArtifact({ status: 'blocked', reasons: validation.reasons });
}
// 2. 页面上下文验证
const pageValidation = deps.validatePageContext?.(args);
if (!pageValidation?.ok) {
return buildArtifact({ status: 'blocked', reasons: ['page_context_mismatch'] });
}
// 3. 数据获取
try {
const request = buildRequest(args);
const response = await deps.queryData(request);
const rows = normalizeRows(response.rows || []);
return buildArtifact({
status: rows.length > 0 ? 'ok' : 'empty',
column_defs: COLUMN_DEFS,
columns: COLUMNS,
rows,
});
} catch (error) {
return buildArtifact({ status: 'error', reasons: [error.message] });
}
}
// ===== 默认依赖实现 =====
function defaultDeps() {
return {
validatePageContext(args) {
const host = globalThis.location?.hostname;
return host === args.expected_domain
? { ok: true }
: { ok: false, reason: 'domain_mismatch' };
},
async queryData(request) {
// 根据 API_ENDPOINTS 调用实际接口
// 此处为模板,可能需要根据具体场景调整
if (typeof $ !== 'undefined' && typeof $.ajax === 'function') {
return new Promise((resolve, reject) => {
$.ajax({
url: API_ENDPOINTS.primary,
type: 'POST',
data: request,
success: resolve,
error: (xhr, status, err) => reject(new Error(`API failed: ${err}`)),
});
});
}
throw new Error('No HTTP client available');
},
};
}
// ===== 模块导出 =====
if (typeof module !== 'undefined') {
module.exports = {
buildBrowserEntrypointResult,
validateArgs,
buildRequest,
normalizeRows,
COLUMN_DEFS,
COLUMNS,
};
}
if (typeof args !== 'undefined') {
return buildBrowserEntrypointResult(args);
}
```
## Testing Strategy
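### Unit Tests
1. **LLM extraction tests**
   - Test against fixture HTML files
   - Verify that all extracted fields are present
   - Verify that JSON parsing is robust
2. **Template rendering tests**
   - Verify that the generated script is syntactically valid
   - Verify that constants are defined correctly
   - Verify that the function structure is complete
### Integration Tests
1. **End-to-end test**
   - Select a fixture scene directory
   - Analyze → preview → generate
   - Verify that the generated skill can be loaded
2. **Real-scene test**
   - Use the 营销零度户 scene
   - Compare with the version converted manually by Claude
   - Verify functional equivalence
## Migration Path
### Phase 1: Infrastructure (Week 1)
- Task 1-2: LLM extraction enhancements
- Task 3: Server changes
### Phase 2: Rust templates (Week 2)
- Task 4: CLI argument extension
- Task 5: Generator template enhancements
### Phase 3: User experience (Week 3)
- Task 6: Web UI preview
- Integration tests
### Phase 4: Validation and tuning (Week 4)
- Real-scene tests
- Template tuning
- Documentation
## Risks and Mitigations
| Risk | Impact | Mitigation |
|------|------|----------|
| Inaccurate LLM extraction | The generated script does not run | Provide preview and editing in the Web UI so a person can correct it |
| Scene source formats vary widely | Higher extraction failure rate | Cover the common patterns and provide a fallback |
| Excessive token consumption | Higher cost | Limit how much of index.html is read and extract key passages first |
| Complex business logic cannot be generated automatically | Incomplete functionality | Generate a skeleton with TODO comments that prompt manual completion |
## Success Criteria
1. **Automation rate**: 80% of scenes yield a runnable script automatically
2. **Accuracy**: extracted APIs, parameters, and column definitions are more than 90% accurate
3. **Manual intervention**: average manual editing time per scene is under 5 minutes
4. **Parity with Claude**: the generated script is functionally equivalent to the version converted manually by Claude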


@@ -0,0 +1,333 @@
# Progressive Browser Script Template Enhancement
> **Status:** Draft
> **Date:** 2026-04-17
> **Author:** Qoder
## Problem Statement
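The auto-generated `browser_script_with_business_logic` template currently has the following problems:
### Problem list
| Problem | Impact | Severity |
|------|------|----------|
| Incorrect URL construction | `new URL(endpoint.url, window.location.origin)` resolves `endpoint.url` against the current page origin whenever it is not an absolute URL | High |
| No jQuery support | Intranet pages usually use jQuery `$.ajax`; fetch may run into CORS problems | High |
| Simplistic status determination | Only ok/error are distinguished; blocked/partial/empty are missing | Medium |
| No multi-endpoint support | Only a single `queryData` method, so multi-API scenes cannot be handled | Medium |
### Comparison with tq-lineloss-report
| Feature | tq-lineloss (complete) | Current template (skeleton) |
|------|-------------------|-----------------|
| HTTP client | jQuery $.ajax + error handling | fetch only |
| URL handling | Hard-coded full URL | Incorrect URL construction |
| Status determination | `determineArtifactStatus` function | Simple ternary expression |
| API endpoints | Multiple endpoint methods | Single queryData |
## Goal
Enhance the `browser_script_with_business_logic` template so that the generated scripts can:
1. Handle API URLs correctly (fix the bug)
2. Support both jQuery and fetch as HTTP clients
3. Provide complete status determination (blocked/error/partial/empty/ok)
4. Support scenes with multiple API endpoints
## Non-Goals
- No change to the LLM extraction logic (this round only enhances the Rust template)
- No new CLI arguments
- No change to the existing API contract
- No automatic generation of tq-lineloss's "separate monthly/weekly column definitions" pattern (this requires LLM enhancements)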
## Architecture
### Template structure
```
browser_script_with_business_logic()
├── Constants
│   ├── API_ENDPOINTS (from LLM)
│   ├── STATIC_PARAMS (from LLM)
│   └── COLUMN_DEFS (from LLM)
├── Utility functions
│   ├── normalizePayload()
│   ├── pickFirstNonEmpty()
│   └── isNonEmptyString()
├── Argument validation
│   └── validateArgs() - enhanced
├── Request building
│   ├── buildRequest() - URL fix
│   └── buildRequestBody() - new
├── HTTP client
│   ├── defaultDeps.queryData() - jQuery first
│   └── fetchFallback() - new
├── Data processing
│   └── normalizeRows()
├── Status determination
│   └── determineArtifactStatus() - new
├── Artifact building
│   └── buildArtifact() - enhanced
└── Entry function
    └── buildBrowserEntrypointResult()
```
### Data flow
```
User request (args)
  ↓
validateArgs() → argument validation
  ↓ (blocked if invalid)
validatePageContext() → page context validation
  ↓ (blocked if mismatch)
buildRequest() → build request parameters
  ↓
queryData() → HTTP request (jQuery/fetch)
  ↓
normalizeRows() → row normalization
  ↓
determineArtifactStatus() → status determination
  ↓
buildArtifact() → build the artifact
```
## Implementation Details
### 1. Fix URL construction
**Current code (buggy)**:
```javascript
function buildRequest(args, endpoint) {
  const url = new URL(endpoint.url, window.location.origin); // wrong!
  // ...
}
```
**After the fix**:
```javascript
function buildRequest(args, endpoint) {
  // Use the full URL as given; do not resolve it against window.location.origin
  const url = endpoint.url;
  const method = endpoint.method || 'POST';
  const headers = { 'Content-Type': 'application/json' };
  const body = JSON.stringify({ ...STATIC_PARAMS, ...args });
  return { url, method, headers, body };
}
```
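For context, here is a small sketch of how the standard `URL` constructor treats its `base` argument. It suggests the old `buildRequest` misbehaves when `endpoint.url` is not absolute; which form the LLM-extracted URLs actually take is an assumption here, and the hosts are placeholders. `resolveAgainstPage` is a hypothetical name used only for illustration.

```javascript
// Demonstrates how the WHATWG URL constructor treats its `base` argument.
// The hosts below are placeholders, not real endpoints.
function resolveAgainstPage(endpointUrl, pageOrigin) {
  return new URL(endpointUrl, pageOrigin).href;
}

const pageOrigin = "http://page.example.test";

// Absolute URL: the base is ignored.
const absolute = resolveAgainstPage("http://api.example.test/query", pageOrigin);
// Relative URL: resolved against the page origin, which is wrong if the API
// lives on a different host.
const relative = resolveAgainstPage("/query", pageOrigin);

console.log(absolute); // http://api.example.test/query
console.log(relative); // http://page.example.test/query
```

Using `endpoint.url` verbatim, as the fixed version does, avoids the second case but relies on the extracted URL already being complete.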
### 2. Dual jQuery + fetch support
```javascript
const defaultDeps = {
validatePageContext(args) {
const host = (globalThis.location?.hostname || '').trim();
const expected = (args.expected_domain || '').trim();
if (!host) return { ok: false, reason: 'page_context_unavailable' };
if (host !== expected) return { ok: false, reason: 'page_context_mismatch' };
return { ok: true };
},
async queryData(args) {
const endpoint = API_ENDPOINTS[0];
if (!endpoint) throw new Error('No API endpoint configured');
const request = buildRequest(args, endpoint);
// Prefer jQuery (intranet pages usually ship it)
if (typeof $ !== 'undefined' && typeof $.ajax === 'function') {
return new Promise((resolve, reject) => {
$.ajax({
url: request.url,
type: request.method,
data: request.body,
contentType: 'application/json',
dataType: 'json',
success: resolve,
error: (xhr, status, err) => reject(new Error(
`API failed (${xhr.status}): ${err} | body=${(xhr.responseText || '').substring(0, 200)}`
))
});
});
}
// Fallback: fetch API
if (typeof fetch === 'function') {
const response = await fetch(request.url, {
method: request.method,
headers: request.headers,
body: request.method !== 'GET' ? request.body : undefined
});
if (!response.ok) {
const text = await response.text().catch(() => '');
throw new Error(`HTTP ${response.status}: ${text.substring(0, 200)}`);
}
return response.json();
}
throw new Error('No HTTP client available (need jQuery or fetch)');
}
};
```
### 3. Complete status determination
```javascript
function determineArtifactStatus({ blockedReason = '', fatalError = '', reasons = [], rows = [] }) {
if (blockedReason) return 'blocked';
if (fatalError) return 'error';
if (reasons.length > 0) return 'partial';
if (!rows.length) return 'empty';
return 'ok';
}
function buildArtifact({ status, blockedReason, fatalError, reasons, rows, args }) {
return {
type: 'report-artifact',
report_name: REPORT_NAME,
status: status || determineArtifactStatus({ blockedReason, fatalError, reasons, rows }),
period: { mode: args.period_mode, mode_code: args.period_mode_code, value: args.period_value, payload: normalizePayload(args.period_payload) },
org: { label: args.org_label, code: args.org_code },
column_defs: COLUMN_DEFS,
columns: COLUMNS,
rows,
counts: { detail_rows: rows.length },
partial_reasons: reasons.filter(r => r && !r.startsWith('api_') && !r.startsWith('validation_')),
reasons: Array.from(new Set(reasons.filter(Boolean)))
};
}
```
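The order of the checks in `determineArtifactStatus` defines a precedence: blocked, then error, then partial, then empty, then ok. The sketch below repeats the function so it runs on its own and exercises one input per branch; the sample reasons and rows are made up for illustration.

```javascript
// Copy of determineArtifactStatus from above so the snippet runs on its own.
function determineArtifactStatus({ blockedReason = '', fatalError = '', reasons = [], rows = [] }) {
  if (blockedReason) return 'blocked';
  if (fatalError) return 'error';
  if (reasons.length > 0) return 'partial';
  if (!rows.length) return 'empty';
  return 'ok';
}

// Each case sets only the inputs needed to reach that branch.
const cases = [
  [{ blockedReason: 'page_context_mismatch', fatalError: 'x' }, 'blocked'],
  [{ fatalError: 'timeout', rows: [{ id: 1 }] }, 'error'],
  [{ reasons: ['row_normalization_partial'], rows: [{ id: 1 }] }, 'partial'],
  [{}, 'empty'],
  [{ rows: [{ id: 1 }] }, 'ok'],
];
const results = cases.map(([input, expected]) => determineArtifactStatus(input) === expected);
console.log(results.every(Boolean)); // true
```

One consequence worth noting: a non-empty `reasons` list yields `partial` even when `rows` is empty, because the partial check runs before the empty check.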
### 4. Enhanced entry function
```javascript
async function buildBrowserEntrypointResult(args, deps = defaultDeps) {
// 1. 参数验证
const validation = validateArgs(args);
if (!validation.valid) {
return buildArtifact({
status: 'blocked',
blockedReason: 'validation_failed',
reasons: validation.errors,
rows: [],
args
});
}
// 2. 页面上下文验证
const pageValidation = typeof deps.validatePageContext === 'function'
? deps.validatePageContext(args)
: { ok: true };
if (!pageValidation?.ok) {
return buildArtifact({
status: 'blocked',
blockedReason: pageValidation?.reason || 'page_context_mismatch',
reasons: [pageValidation?.reason || 'page_context_mismatch'],
rows: [],
args
});
}
// 3. 数据获取
const reasons = [];
let rawData = null;
try {
rawData = await (deps.queryData ? deps.queryData(args) : Promise.resolve([]));
} catch (error) {
return buildArtifact({
status: 'error',
fatalError: error.message,
reasons: ['api_query_failed:' + error.message],
rows: [],
args
});
}
// 4. 数据归一化
const rows = normalizeRows(rawData);
if (rows.length === 0 && Array.isArray(rawData) && rawData.length > 0) {
reasons.push('row_normalization_partial');
}
// 5. 构建 Artifact
return buildArtifact({ reasons, rows, args });
}
```
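### 5. Rust template changes
Modify the `browser_script_with_business_logic` function in `src/generated_scene/generator.rs` and hard-code the JavaScript template above into it.
Key changes:
- Replace the `buildRequest` function (fixes the URL bug)
- Replace the `defaultDeps` object (adds jQuery support)
- Add the `determineArtifactStatus` function
- Enhance the `buildArtifact` function
- Enhance the `buildBrowserEntrypointResult` function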
## Testing Strategy
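### Unit Tests
1. **URL construction tests**
   - Verify that the full URL is passed through unchanged
   - Verify that GET/POST methods are handled correctly
2. **Status determination tests**
   - blocked → blockedReason is set
   - error → fatalError is set
   - partial → reasons is non-empty
   - empty → rows is empty
   - ok → rows is non-empty
3. **HTTP client tests**
   - Mock a jQuery environment and verify that $.ajax is called
   - Mock a fetch environment and verify that fetch is called
   - Verify error handling
### Integration Tests
1. **End-to-end test**
   - Use a fixture scene directory
   - Run deep analysis → generate the skill
   - Verify that the generated script is syntactically valid
   - Verify that the generated script can be loaded by Node.js
2. **Real-scene test**
   - Use the marketing-zero-consumer-report source scene
   - Regenerate the skill
   - Verify that the script runs (requires the intranet environment)
## Migration Path
### Phase 1: Fix the bug
- Fix the URL construction problem
- Keep all other logic unchanged
- Verify that existing scenes are unaffected
### Phase 2: Enhance the HTTP client
- Add jQuery support
- Keep fetch as the fallback
- Verify that both paths work
### Phase 3: Complete status determination
- Add determineArtifactStatus
- Enhance buildArtifact
- Verify each status scenario
## Risks and Mitigations
| Risk | Impact | Mitigation |
|------|------|----------|
| Template grows too large | Hard to maintain | Organize it in sections and add comments |
| jQuery global detection | Possible false positives | Check both $ and $.ajax |
| Status determination too strict | Some scenes become incompatible | Provide a configuration option |
| Backward compatibility | Existing skills may be affected | Change only scenes that have scene_info |
## Success Criteria
1. **URL fix**: `buildRequest` no longer depends on `window.location.origin`
2. **jQuery support**: pages that have jQuery use `$.ajax` first
3. **Complete statuses**: all five statuses blocked/error/partial/empty/ok are supported
4. **Backward compatibility**: the skeleton template is still generated when scene_info is absent
5. **Runnability**: the generated marketing-zero-consumer-report runs on the intranet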


@@ -0,0 +1,192 @@
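# sgClaw Scene Generator Quality Improvement Design
## Problem Statement
The sgClaw scene generator can convert the raw HTML scenes under `D:\desk\智能体资料\场景\` into skill packages, but the quality of the automatically generated output is only ~40% of the tq-lineloss-report hand-written by Claude.
### Four core problems
| # | Problem | Root cause |
|---|------|------|
| 1 | Content-Type is hard-coded to `application/json` | Hard-coded in the `browser_script_with_business_logic()` template; `ApiEndpointJson` has no content_type field |
| 2 | Request body lacks fields the business requires | Only a simple `{...STATIC_PARAMS, ...args}` merge; no requestTemplate mechanism |
| 3 | Response data is not extracted correctly | No responsePath extraction step; `normalizeRows` receives the raw response directly |
| 4 | No multi-mode routing | The old path hard-codes `API_ENDPOINTS[0]`; no mode detection |
### Fundamental architectural flaw
The generator has two paths:
- `browser_script_with_modes()`: the new path, supports all advanced features
- `browser_script_with_business_logic()`: the old path, supports only basic functionality

Scenes that the LLM classifies as single-mode take the old path and lose all advanced features.
### Classification of the original scenes
The 13 scenes fall into three classes by API call pattern:
- **Pattern A (BrowserAction cross-page injection)** (5 scenes): line-loss, service-risk, and scheduled-work-order scenes
- **Pattern B (`$http.sendByAxios` marketing gateway)** (2 scenes): Marketing 2.0 zero-consumption users, photovoltaic
- **Pattern C (direct AJAX)** (6 scenes): 95598, monitoring, and large-consumption scenes

By complexity they fall into three tiers:
| Tier | Scenes | Current quality | Target quality | Characteristics |
|------|--------|---------|---------|------|
| Tier 1 | 5 | ~60% | ~90% | Single-mode, direct AJAX, no BrowserAction |
| Tier 2 | 5 | ~40% | ~70% | Involves BrowserAction or form-urlencoded |
| Tier 3 | 3 | ~20% | ~40% | Chained multi-step API calls; still needs manual work |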
## Solution
### Core strategy
Unify the generation path + deepen LLM extraction + fix known issues
### Architecture change
```
Before:
browser_script()
├─ modes non-empty → browser_script_with_modes() [new path]
├─ api_endpoints non-empty → browser_script_with_business_logic() [old path, poor quality]
└─ otherwise → browser_script_skeleton() [skeleton path]
After:
browser_script()
├─ modes non-empty → browser_script_with_modes() [unified path]
├─ single-mode scene → auto-wrapped as modes=[default_mode] → browser_script_with_modes()
└─ no endpoints → browser_script_skeleton() [skeleton path]
```
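## Implementation Phases
### Phase 1: Fix the foundation (unify the path, fix the template)
#### Task 1: Unify the generation path
**Goal**: Retire `browser_script_with_business_logic()`; all scenes go through `browser_script_with_modes()`
**Changes**:
- Change the routing in `browser_script()`: single-mode scenes are wrapped automatically into one mode
- Mark `browser_script_with_business_logic()` as `#[deprecated]`
- Use `browser_script_skeleton()` only for scenes with no endpoints and no column definitions

**File**: `src/generated_scene/generator.rs`
#### Task 2: Fix the jQuery processData parameter
**Goal**: form-urlencoded requests are serialized correctly in jQuery ajax
**Changes**:
- In `buildModeRequest`, branch on `contentType`:
  - `application/x-www-form-urlencoded`: pass the body as an object; the jQuery call adds `processData: false`
  - `application/json`: pass the body as `JSON.stringify()`; do not add processData
- In the template, add a conditional on `processData` for the jQuery ajax call

**File**: `src/generated_scene/generator.rs` (`browser_script_with_modes` template)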
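The content-type branching for the jQuery call could be sketched as follows. `buildAjaxOptions` and its parameter names are hypothetical. One assumption is made loudly here: jQuery's documented behavior is that `data` is sent as-is when `processData` is `false`, so this sketch serializes the form body to a string before setting that flag. Whether the real template does the same is not shown in this change.

```javascript
// Hypothetical helper: derives jQuery.ajax options from a mode request.
// Assumption: with processData: false jQuery sends `data` as-is, so the
// form body is serialized to a query string here first.
function buildAjaxOptions({ url, method = "POST", contentType, params }) {
  const isForm = contentType === "application/x-www-form-urlencoded";
  const entries = Object.entries(params).map(([k, v]) => [k, String(v)]);
  return {
    url,
    type: method,
    contentType,
    dataType: "json",
    data: isForm ? new URLSearchParams(entries).toString() : JSON.stringify(params),
    // Only the form branch sets processData, mirroring the task description.
    ...(isForm ? { processData: false } : {}),
  };
}
```

An alternative with the same effect would be to pass the plain object and leave `processData` at its default, letting jQuery build the query string itself.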
#### Task 3: Auto-wrap single-mode scenes into a mode config
**Goal**: When the LLM output has no modes but does have apiEndpoints, wrap it automatically into a single mode
**Changes**:
- Add auto-wrapping logic to `analyzeSceneDeep()`
- When `modes` is empty and `apiEndpoints` is non-empty, generate a default mode:
```js
{
  name: "default",
  condition: { field: "period_mode", operator: "equals", value: "default" },
  apiEndpoint: apiEndpoints[0], // use the first endpoint
  columnDefs: columnDefs,
  requestTemplate: staticParams,
  responsePath: "",
  normalizeRules: { type: "validate_all_columns", filterNull: true }
}
```
- Set `defaultMode: "default"`, `modeSwitchField: "period_mode"`

**File**: `frontend/scene-generator/llm-client.js`
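A minimal sketch of the auto-wrapping step, assuming it runs on the parsed LLM result before it is returned. `ensureModes` is a hypothetical helper name; the shape of the default mode is taken from the task description above.

```javascript
// Hypothetical helper: wraps a single-mode extraction result into the modes format.
// Results that already have modes, or that have no endpoints, are returned unchanged.
function ensureModes(sceneInfo) {
  const modes = sceneInfo.modes || [];
  const endpoints = sceneInfo.apiEndpoints || [];
  if (modes.length > 0 || endpoints.length === 0) return sceneInfo;
  return {
    ...sceneInfo,
    modes: [{
      name: "default",
      condition: { field: "period_mode", operator: "equals", value: "default" },
      apiEndpoint: endpoints[0], // only the first endpoint is used
      columnDefs: sceneInfo.columnDefs || [],
      requestTemplate: sceneInfo.staticParams || {},
      responsePath: "",
      normalizeRules: { type: "validate_all_columns", filterNull: true },
    }],
    defaultMode: "default",
    modeSwitchField: "period_mode",
  };
}
```

Returning a new object rather than mutating the input keeps the original LLM output available for logging or for the follow-up query in Task 6.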
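### Phase 2: Improve extraction (make LLM extraction more accurate)
#### Task 4: Strengthen mandatory constraints in the LLM prompt
**Goal**: Make the LLM always output key fields such as Content-Type, responsePath, and requestTemplate
**Changes**:
- Add mandatory-field constraints to `DEEP_SYSTEM_PROMPT`
- Add "fill in a default if not found" guidance so the LLM does not skip these fields
- Add contentType detection examples (look for the `contentType` property in `$.ajax`; inspect the wrapper for `$http.sendByAxios`)

**File**: `frontend/scene-generator/llm-client.js`
#### Task 5: Extract business JS files as well
**Goal**: The LLM reads not only index.html but also the business-logic files under js/
**Changes**:
- Slicing logic in `frontend/scene-generator/server.js`: identify and extract business files such as `js/mca.js`, `js/ami.js`, `js/sgApi.js`, `js/power.js`
- Push snippets of the business JS files into `buildDeepAnalyzePrompt`
- Cap the total prompt size at `MAX_DEEP_PROMPT_CHARS` (60000) characters

**Files**:
- `frontend/scene-generator/server.js` (new: identification and extraction of business JS files)
- `frontend/scene-generator/llm-client.js` (business JS snippets added to the prompt)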
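One way to respect the character cap when adding business JS snippets is to fill a running budget file by file. This is a sketch under assumptions: `packSnippets` is a hypothetical helper, the `--- name ---` header mirrors the style already used in `buildDeepAnalyzePrompt`, and files are taken in insertion order without any prioritization.

```javascript
const MAX_DEEP_PROMPT_CHARS = 60000; // value taken from the task description

// Hypothetical helper: appends file snippets until the character budget is used up.
// Later files are truncated or dropped once the budget runs out.
function packSnippets(files, budget = MAX_DEEP_PROMPT_CHARS) {
  const parts = [];
  let remaining = budget;
  for (const [name, content] of Object.entries(files)) {
    const header = `\n--- ${name} ---\n`;
    if (remaining <= header.length) break;
    const body = String(content || "").substring(0, remaining - header.length);
    parts.push(header + body);
    remaining -= header.length + body.length;
  }
  return parts.join("");
}
```

A real implementation would probably also reserve part of the budget for `index.html` and the system prompt, which this sketch leaves out.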
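#### Task 6: Post-extraction validation and follow-up query
**Goal**: After the LLM returns, check whether key information is missing, and ask a follow-up question if so
**Changes**:
- Add a `validateExtractedSceneInfo()` function
- Checks: at least one apiEndpoint has contentType; at least one mode has responsePath
- If anything is missing, build a second prompt asking the LLM to fill it in
- Ask at most once; on timeout, fall back to defaults

**File**: `frontend/scene-generator/llm-client.js`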
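The two checks listed for `validateExtractedSceneInfo()` might look like the sketch below. The return shape (`{ ok, missing }`) is an assumption, as is the choice to count endpoints nested inside modes and to accept an empty-string `responsePath` as present (it is the value the Task 3 default mode uses).

```javascript
// Sketch of validateExtractedSceneInfo(); the return shape is an assumption.
function validateExtractedSceneInfo(info) {
  const modes = info.modes || [];
  // Endpoints can appear at the top level or inside each mode.
  const endpoints = [
    ...(info.apiEndpoints || []),
    ...modes.map((m) => m.apiEndpoint).filter(Boolean),
  ];
  const missing = [];
  if (!endpoints.some((e) => e.contentType)) missing.push("contentType");
  // An empty string counts as present: it is taken to mean "use the response root".
  if (!modes.some((m) => typeof m.responsePath === "string")) missing.push("responsePath");
  return { ok: missing.length === 0, missing };
}
```

The `missing` list could then drive the single follow-up prompt, naming exactly the fields the LLM is asked to supply.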
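### Phase 3: Testing and validation
#### Task 7: Unit tests
**Goal**: Add test coverage for the multi-mode generation path
**Changes**:
- Add `tests/scene_generator_modes_test.rs`
- Test cases:
  1. A single-mode scene generates JS containing `const MODES =`
  2. A multi-mode scene generates the mode routing logic
  3. snake_case/camelCase serialization is consistent
  4. The form-urlencoded request body is formatted correctly
  5. The responsePath extraction step is present

**File**: `tests/scene_generator_modes_test.rs` (new)
#### Task 8: Integration tests
**Goal**: Validate generation quality on real scenes
**Steps**:
1. Run a full generation on `用户日电量监测` (the simplest Pattern C scene)
2. Run a full generation on `台区线损大数据` (the most complex Pattern A dual-mode scene)
3. Compare the generated output against tq-lineloss-report
4. Record the list of gaps

**Deliverable**: integration test report
## Scope of Impact
| File | Change type | Tasks |
|------|---------|-----------|
| `src/generated_scene/generator.rs` | Modified | Task 1, Task 2 |
| `frontend/scene-generator/llm-client.js` | Modified | Task 3, Task 4, Task 6 |
| `frontend/scene-generator/server.js` | Modified | Task 5 |
| `tests/scene_generator_modes_test.rs` | New | Task 7 |
| Integration test report | New | Task 8 |
## Non-Goals
- Full automation of Tier 3 scenes (chained multi-step API calls) is out of scope for this design
- Automatic detection and conversion of BrowserAction cross-page injection is partially supported for Tier 2; 100% accuracy is not a goal
- Web UI changes are out of scope (to be handled in a separate design later)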


@@ -0,0 +1,90 @@
const fs = require("fs");
const path = require("path");
function resolveProjectRoot() {
const envRoot = process.env.SGCLAW_PROJECT_ROOT;
if (envRoot && fs.existsSync(envRoot)) {
return path.resolve(envRoot);
}
const configPath = resolveConfigPath();
if (configPath && fs.existsSync(configPath)) {
return path.dirname(configPath);
}
return path.resolve(__dirname);
}
function resolveConfigPath() {
const envPath = process.env.SGCLAW_CONFIG_PATH;
if (envPath && fs.existsSync(envPath)) {
return path.resolve(envPath);
}
const candidates = [
path.resolve(__dirname, "..", "..", "..", "sgclaw_config.json"), // Parent sgClaw directory
path.resolve(__dirname, "..", "..", "sgclaw_config.json"), // claw-new/sgclaw_config.json
path.resolve(__dirname, "..", "sgclaw_config.json"), // frontend/sgclaw_config.json
path.resolve(__dirname, "sgclaw_config.json"), // scene-generator/sgclaw_config.json
];
for (const p of candidates) {
if (fs.existsSync(p)) return p;
}
return null;
}
function loadConfig() {
const configPath = resolveConfigPath();
if (!configPath) {
throw new Error(
"sgclaw_config.json not found. Set SGCLAW_CONFIG_PATH or place it in the project root."
);
}
const raw = fs.readFileSync(configPath, "utf-8");
const config = JSON.parse(raw);
const apiKey = config.apiKey || "";
const baseUrl = config.baseUrl || "";
const model = config.model || "";
if (!apiKey) throw new Error("sgclaw_config.json: 'apiKey' is required");
if (!baseUrl) throw new Error("sgclaw_config.json: 'baseUrl' is required");
if (!model) throw new Error("sgclaw_config.json: 'model' is required");
return {
apiKey,
baseUrl: normalizeBaseUrl(baseUrl),
model,
projectRoot: resolveProjectRoot(),
configPath,
};
}
function normalizeBaseUrl(url) {
url = url.replace(/\/+$/, "");
url = url.replace(/\/v1\/?$/, "");
return url + "/v1";
}
function getDefaults() {
const config = loadConfig();
const projectRoot = config.projectRoot;
return {
outputRoot: path.join(projectRoot, "examples", "generated_scene_platform"),
lessonsPath: path.join(
projectRoot,
"docs",
"superpowers",
"references",
"tq-lineloss-lessons-learned.toml"
),
llmBaseUrl: config.baseUrl,
llmModel: config.model,
};
}
module.exports = { loadConfig, getDefaults, resolveProjectRoot, resolveConfigPath };


@@ -0,0 +1,197 @@
const { spawn } = require("child_process");
const path = require("path");
function runGenerator(params, sseWriter, projectRoot) {
const { sourceDir, sceneId, sceneName, sceneKind, targetUrl, outputRoot, lessons, sceneInfoJson } = params;
const normalize = (p) => p.replace(/\\/g, "/");
const args = [
"run",
"--bin",
"sg_scene_generate",
"--",
"--source-dir",
normalize(sourceDir),
"--scene-id",
sceneId,
"--scene-name",
sceneName,
];
// Add the flag only when sceneKind is explicitly given (otherwise the default report_collection applies)
if (sceneKind) {
args.push("--scene-kind", sceneKind);
}
// Add the flag when targetUrl is provided
if (targetUrl) {
args.push("--target-url", targetUrl);
}
args.push("--output-root", normalize(outputRoot));
if (lessons) {
args.push("--lessons", normalize(lessons));
}
// Pass scene info JSON if available
if (sceneInfoJson) {
args.push("--scene-info-json", sceneInfoJson);
}
return new Promise((resolve, reject) => {
sseWriter.write(
`event: status\ndata: ${JSON.stringify({
message: "开始生成 skill 包...",
})}\n\n`
);
sseWriter.write(
`event: status\ndata: ${JSON.stringify({
message: `执行: cargo ${args.join(" ")}`,
})}\n\n`
);
const child = spawn("cargo", args, {
cwd: projectRoot,
stdio: ["ignore", "pipe", "pipe"],
env: { ...process.env, RUST_BACKTRACE: "1" },
});
let stdout = "";
let stderr = "";
let timedOut = false;
const timeout = setTimeout(() => {
timedOut = true;
child.kill("SIGTERM");
sseWriter.write(
`event: error\ndata: ${JSON.stringify({
message: "生成超时5分钟",
})}\n\n`
);
resolve({ success: false, error: "timeout" });
}, 5 * 60 * 1000);
child.stdout.on("data", (data) => {
const text = data.toString();
stdout += text;
sseWriter.write(
`event: log\ndata: ${JSON.stringify({ message: text.trim() })}\n\n`
);
});
child.stderr.on("data", (data) => {
const text = data.toString();
stderr += text;
sseWriter.write(
`event: log\ndata: ${JSON.stringify({ message: text.trim() })}\n\n`
);
});
child.on("close", (code) => {
clearTimeout(timeout);
if (timedOut) return;
if (code === 0) {
const match = stdout.match(/generated scene package:\s*(.+)/);
const skillRoot = match ? match[1] : null;
sseWriter.write(
`event: status\ndata: ${JSON.stringify({
message: "✅ 生成成功",
})}\n\n`
);
sseWriter.write(
`event: complete\ndata: ${JSON.stringify({
success: true,
skillRoot,
})}\n\n`
);
resolve({ success: true, skillRoot });
} else {
sseWriter.write(
`event: error\ndata: ${JSON.stringify({
message: `生成失败 (exit code ${code})`,
})}\n\n`
);
if (stderr.trim()) {
sseWriter.write(
`event: error\ndata: ${JSON.stringify({
message: stderr.substring(0, 500),
})}\n\n`
);
}
resolve({ success: false, code, stderr });
}
});
child.on("error", (err) => {
clearTimeout(timeout);
sseWriter.write(
`event: error\ndata: ${JSON.stringify({
message: `无法启动 cargo: ${err.message}`,
})}\n\n`
);
reject(err);
});
});
}
function readDirectory(sourceDir) {
const fs = require("fs");
const p = require("path");
if (!fs.existsSync(sourceDir)) {
throw new Error(`Directory not found: ${sourceDir}`);
}
const stat = fs.statSync(sourceDir);
if (!stat.isDirectory()) {
throw new Error(`Not a directory: ${sourceDir}`);
}
const result = {};
const entries = fs.readdirSync(sourceDir, { withFileTypes: true });
const treeLines = [];
for (const entry of entries) {
treeLines.push(`├── ${entry.name}`);
}
result.tree = treeLines.join("\n");
const sceneTomlPath = p.join(sourceDir, "scene.toml");
if (fs.existsSync(sceneTomlPath)) {
result["scene.toml"] = fs.readFileSync(sceneTomlPath, "utf-8");
}
const skillTomlPath = p.join(sourceDir, "SKILL.toml");
if (fs.existsSync(skillTomlPath)) {
result["SKILL.toml"] = fs.readFileSync(skillTomlPath, "utf-8");
}
const skillMdPath = p.join(sourceDir, "SKILL.md");
if (fs.existsSync(skillMdPath)) {
result["SKILL.md"] = fs.readFileSync(skillMdPath, "utf-8");
}
// Read index.html
const indexHtmlPath = p.join(sourceDir, "index.html");
if (fs.existsSync(indexHtmlPath)) {
result.indexHtml = fs.readFileSync(indexHtmlPath, "utf-8");
}
const scripts = {};
for (const entry of entries) {
if (entry.isFile() && entry.name.endsWith(".js")) {
const scriptPath = p.join(sourceDir, entry.name);
scripts[entry.name] = fs.readFileSync(scriptPath, "utf-8");
}
}
if (Object.keys(scripts).length > 0) {
result.scripts = scripts;
}
return result;
}
module.exports = { runGenerator, readDirectory };


@@ -0,0 +1,348 @@
const http = require("http");
const https = require("https");
const SYSTEM_PROMPT = `你是一个场景信息提取助手。根据场景目录的内容,提取 scene-id 和 scene-name。
scene-id 规则:
- 使用英文短横线连接,如 tq-lineloss-report
- 全小写,有业务含义
scene-name 规则:
- 使用中文,简短描述性名称
- 如 "台区线损报表"、"知乎热榜导出"
请以 JSON 格式返回:{"sceneId": "...", "sceneName": "..."}`;
const DEEP_SYSTEM_PROMPT = `你是一个场景代码分析专家。分析场景源码,提取关键业务信息。
## 分析目标
1. **多模式识别** (关键):
- 查找条件分支逻辑 (if/switch) 中基于 period_mode、reportType 等字段的分支
- 识别不同分支对应的 API 端点、列定义、请求格式
- 如果发现多模式,使用 modes 数组格式输出
2. **API 端点**: 识别所有 HTTP 请求地址 (URL, method, contentType, 用途)
- 从 \$.ajax/fetch 调用中提取 contentType
- 检测请求格式: application/json 或 application/x-www-form-urlencoded
3. **请求模板**: 识别请求参数结构
- 提取硬编码的分页参数 (rows, page, sidx, sord)
- 识别模板变量如 \${args.org_code}
4. **数据归一化**: 识别数据处理规则
- 查找数据渲染/表格填充逻辑
- 检测数据验证条件 (哪些字段不能为空)
5. **响应路径**: 识别数据在响应中的位置
- 如 response.content 或 response.data
## 输出格式
### 单模式场景 (无 modes 数组):
{
"sceneId": "string",
"sceneName": "string",
"sceneKind": "report_collection | monitoring",
"expectedDomain": "string",
"targetUrl": "string",
"apiEndpoints": [{"name": "", "url": "", "method": "POST"}],
"staticParams": {"key": "value"},
"columnDefs": [["fieldName", "中文列名"]]
}
### 多模式场景 (有 modes 数组):
{
"sceneId": "tq-lineloss-report",
"sceneName": "台区线损报表",
"sceneKind": "report_collection",
"modes": [
{
"name": "month",
"label": "月度报表",
"condition": {"field": "period_mode", "operator": "equals", "value": "month"},
"apiEndpoint": {
"name": "月度线损查询",
"url": "http://...",
"method": "POST",
"contentType": "application/x-www-form-urlencoded"
},
"columnDefs": [["ORG_NAME", "供电单位"], ...],
"requestTemplate": {"orgno": "\${args.org_code}", "rows": 1000, "page": 1},
"normalizeRules": {"type": "validate_all_columns", "filterNull": true},
"responsePath": "content"
},
{
"name": "week",
"label": "周报表",
"condition": {"field": "period_mode", "operator": "equals", "value": "week"},
"apiEndpoint": {...},
"columnDefs": [...],
...
}
],
"defaultMode": "month",
"modeSwitchField": "period_mode"
}
**重要**: 如果发现代码中有基于 period_mode 的 if/switch 分支,必须使用多模式格式输出!`;
function buildAnalyzePrompt(sourceDir, dirContents) {
const parts = [];
parts.push(`=== 目录结构 ===`);
parts.push(dirContents.tree || "(empty)");
if (dirContents["scene.toml"]) {
parts.push(`\n=== scene.toml ===`);
parts.push(dirContents["scene.toml"]);
}
if (dirContents["SKILL.toml"]) {
parts.push(`\n=== SKILL.toml ===`);
parts.push(dirContents["SKILL.toml"]);
}
if (dirContents["SKILL.md"]) {
parts.push(`\n=== SKILL.md ===`);
parts.push(dirContents["SKILL.md"]);
}
if (dirContents.scripts && Object.keys(dirContents.scripts).length > 0) {
parts.push(`\n=== 脚本文件 ===`);
for (const [name, content] of Object.entries(dirContents.scripts)) {
parts.push(`\n--- ${name} ---`);
const contentStr = typeof content === 'string' ? content : String(content || '');
parts.push(contentStr.substring(0, 2000));
}
}
return `以下是场景目录 "${sourceDir}" 的内容:\n\n${parts.join("\n")}\n\n请以 JSON 格式返回:{"sceneId": "...", "sceneName": "..."}`;
}
function buildDeepAnalyzePrompt(sourceDir, dirContents, indexHtmlContent) {
const parts = [];
parts.push(`=== 目录结构 ===`);
parts.push(dirContents.tree || "(empty)");
if (dirContents["scene.toml"]) {
parts.push(`\n=== scene.toml ===`);
parts.push(dirContents["scene.toml"]);
}
if (dirContents["SKILL.toml"]) {
parts.push(`\n=== SKILL.toml ===`);
parts.push(dirContents["SKILL.toml"]);
}
if (dirContents["SKILL.md"]) {
parts.push(`\n=== SKILL.md ===`);
parts.push(dirContents["SKILL.md"]);
}
// Include index.html content (key addition)
if (indexHtmlContent && typeof indexHtmlContent === 'string') {
parts.push(`\n=== index.html ===`);
// Limit to first 15000 chars to avoid token limits
parts.push(indexHtmlContent.substring(0, 15000));
}
if (dirContents.scripts && Object.keys(dirContents.scripts).length > 0) {
parts.push(`\n=== 脚本文件 ===`);
for (const [name, content] of Object.entries(dirContents.scripts)) {
parts.push(`\n--- ${name} ---`);
const contentStr = typeof content === 'string' ? content : String(content || '');
parts.push(contentStr.substring(0, 3000));
}
}
return `以下是场景目录 "${sourceDir}" 的内容:\n\n${parts.join("\n")}\n\n请分析以上代码,提取完整的场景信息。`;
}
function extractJsonFromResponse(text) {
const codeBlockMatch = text.match(/```(?:json)?\s*\n([\s\S]*?)\n```/);
if (codeBlockMatch) return JSON.parse(codeBlockMatch[1]);
const jsonMatch = text.match(
/\{[\s\S]*"sceneId"[\s\S]*"sceneName"[\s\S]*\}/
);
if (jsonMatch) return JSON.parse(jsonMatch[0]);
return JSON.parse(text);
}
function extractSceneInfo(text) {
// Try code block first
const codeBlockMatch = text.match(/```(?:json)?\s*\n([\s\S]*?)\n```/);
if (codeBlockMatch) {
try {
return JSON.parse(codeBlockMatch[1]);
} catch (e) {
// fall through
}
}
// Try to find JSON object with sceneId
const jsonMatch = text.match(/\{[\s\S]*"sceneId"[\s\S]*\}/);
if (jsonMatch) {
try {
return JSON.parse(jsonMatch[0]);
} catch (e) {
// fall through
}
}
// Last resort: parse entire text
try {
return JSON.parse(text);
} catch (e) {
throw new Error("Failed to extract valid SceneInfo JSON from LLM response");
}
}
function analyzeScene(sourceDir, dirContents, { apiKey, baseUrl, model }) {
const userPrompt = buildAnalyzePrompt(sourceDir, dirContents);
const requestBody = JSON.stringify({
model,
messages: [
{ role: "system", content: SYSTEM_PROMPT },
{ role: "user", content: userPrompt },
],
temperature: 0.1,
max_tokens: 256,
});
return new Promise((resolve, reject) => {
const url = new URL(baseUrl.replace(/\/v1\/?$/, "") + "/v1/chat/completions");
const options = {
hostname: url.hostname,
port: url.port || (url.protocol === "https:" ? 443 : 80),
path: url.pathname,
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: `Bearer ${apiKey}`,
"Content-Length": Buffer.byteLength(requestBody),
},
};
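const httpModule = url.protocol === "https:" ? https : http; // honor http:// base URLs, as analyzeSceneDeep does
const req = httpModule.request(options, (res) => {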
let data = "";
res.on("data", (chunk) => (data += chunk));
res.on("end", () => {
if (res.statusCode !== 200) {
return reject(new Error(`LLM API error ${res.statusCode}: ${data}`));
}
try {
const parsed = JSON.parse(data);
const content = parsed.choices?.[0]?.message?.content;
if (!content) return reject(new Error("LLM returned empty response"));
const result = extractJsonFromResponse(content);
if (!result.sceneId || !result.sceneName) {
return reject(
new Error(`LLM response missing sceneId/sceneName: ${content}`)
);
}
resolve(result);
} catch (err) {
reject(new Error(`Failed to parse LLM response: ${err.message}`));
}
});
});
req.on("error", reject);
req.setTimeout(30000, () => {
req.destroy(new Error("LLM API request timed out"));
});
req.write(requestBody);
req.end();
});
}
function analyzeSceneDeep(sourceDir, dirContents, indexHtmlContent, { apiKey, baseUrl, model }) {
const userPrompt = buildDeepAnalyzePrompt(sourceDir, dirContents, indexHtmlContent);
const requestBody = JSON.stringify({
model,
messages: [
{ role: "system", content: DEEP_SYSTEM_PROMPT },
{ role: "user", content: userPrompt },
],
temperature: 0.1,
max_tokens: 2048, // Increased for detailed response
});
return new Promise((resolve, reject) => {
const url = new URL(baseUrl.replace(/\/v1\/?$/, "") + "/v1/chat/completions");
const options = {
hostname: url.hostname,
port: url.port || (url.protocol === "https:" ? 443 : 80),
path: url.pathname,
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: `Bearer ${apiKey}`,
"Content-Length": Buffer.byteLength(requestBody),
},
};
const httpModule = url.protocol === "https:" ? https : http;
const req = httpModule.request(options, (res) => {
let data = "";
res.on("data", (chunk) => (data += chunk));
res.on("end", () => {
if (res.statusCode !== 200) {
return reject(new Error(`LLM API error ${res.statusCode}: ${data}`));
}
try {
const parsed = JSON.parse(data);
const content = parsed.choices?.[0]?.message?.content;
if (!content) return reject(new Error("LLM returned empty response"));
const result = extractSceneInfo(content);
// Validate required fields
if (!result.sceneId || !result.sceneName) {
return reject(new Error(`LLM response missing sceneId/sceneName: ${content}`));
}
// Set defaults for optional fields
result.sceneKind = result.sceneKind || "report_collection";
result.apiEndpoints = result.apiEndpoints || [];
result.staticParams = result.staticParams || {};
result.columnDefs = result.columnDefs || [];
result.businessLogic = result.businessLogic || {};
result.modes = result.modes || [];
result.defaultMode = result.defaultMode || (result.modes.length > 0 ? result.modes[0].name : null);
result.modeSwitchField = result.modeSwitchField || "period_mode";
resolve(result);
} catch (err) {
reject(new Error(`Failed to parse LLM response: ${err.message}`));
}
});
});
req.on("error", reject);
req.setTimeout(60000, () => {
req.destroy(new Error("LLM API request timed out"));
});
req.write(requestBody);
req.end();
});
}
module.exports = {
buildAnalyzePrompt,
extractJsonFromResponse,
analyzeScene,
// New exports
buildDeepAnalyzePrompt,
extractSceneInfo,
analyzeSceneDeep,
};
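
`analyzeSceneDeep` accepts a `baseUrl` with or without a trailing `/v1` and derives the request options from the resulting URL. A minimal standalone sketch of that normalization is given here; the helper name `buildChatCompletionsOptions` and the `useTls` field are hypothetical and not part of the module, and the example hosts are placeholders.

```javascript
"use strict";

// Hypothetical helper mirroring the URL handling inside analyzeSceneDeep.
function buildChatCompletionsOptions(baseUrl) {
  // Strip one trailing "/v1" or "/v1/" so the suffix is never doubled.
  const url = new URL(baseUrl.replace(/\/v1\/?$/, "") + "/v1/chat/completions");
  return {
    hostname: url.hostname,
    // url.port is "" when the URL uses the protocol's default port.
    port: url.port || (url.protocol === "https:" ? 443 : 80),
    path: url.pathname,
    // Illustrative flag: the caller would pick https or http from this.
    useTls: url.protocol === "https:",
  };
}

console.log(buildChatCompletionsOptions("https://api.example.com/v1"));
console.log(buildChatCompletionsOptions("http://127.0.0.1:8000"));
```

An explicit port comes back as a string (`"8000"`) while the defaults are numbers; Node's `http.request` accepts either form.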

@@ -0,0 +1,19 @@
@echo off
setlocal
set PORT=%1
if "%PORT%"=="" set PORT=3210
set SG_SCENE_GENERATOR_PORT=%PORT%
echo.
echo +==================================================+
echo ^| sgClaw Scene Skill Generator ^|
echo +==================================================+
echo ^| ^|
echo ^| 访问地址: http://127.0.0.1:%PORT%/ ^|
echo ^| ^|
echo ^| 按 Ctrl+C 停止服务 ^|
echo +==================================================+
echo.
cd /d "%~dp0"
node server.js

@@ -0,0 +1,44 @@
#!/bin/bash
# ============================================================
# sgClaw Scene Skill Generator: HTTP server launch script
#
# Usage:
# ./serve.sh # default port 3210
# ./serve.sh 9090 # custom port
# ============================================================
set -e
PORT="${1:-3210}"
DIR="$(cd "$(dirname "$0")" && pwd)"
cd "$DIR"
get_ip() {
ip -4 addr show 2>/dev/null \
| grep -oP 'inet \K[\d.]+' \
| grep -v '127.0.0.1' \
| head -1
}
LOCAL_IP=$(get_ip)
if [ -z "$LOCAL_IP" ]; then
LOCAL_IP=$(hostname -I 2>/dev/null | awk '{print $1}')
fi
if [ -z "$LOCAL_IP" ]; then
LOCAL_IP="<本机IP>"
fi
# Note: server.js binds to 127.0.0.1, so the LAN address printed below is
# reachable only if HOST in server.js is changed.
echo ""
echo " ╔══════════════════════════════════════════════════╗"
echo " ║ sgClaw · Scene Skill Generator ║"
echo " ╠══════════════════════════════════════════════════╣"
echo " ║ ║"
echo " ║ 本机访问: http://127.0.0.1:${PORT}/ ║"
echo " ║ 局域网访问: http://${LOCAL_IP}:${PORT}/ ║"
echo " ║ ║"
echo " ║ 按 Ctrl+C 停止服务 ║"
echo " ╚══════════════════════════════════════════════════╝"
echo ""
export SG_SCENE_GENERATOR_PORT="$PORT"
node server.js

@@ -0,0 +1,417 @@
#!/usr/bin/env node
"use strict";
const http = require("http");
const fs = require("fs");
const path = require("path");
const { spawn } = require("child_process");
const { loadConfig, getDefaults } = require("./config-loader");
const { analyzeScene, analyzeSceneDeep } = require("./llm-client");
const { runGenerator, readDirectory } = require("./generator-runner");
let config;
let defaults;
try {
config = loadConfig();
defaults = getDefaults();
console.log(`[config] Loaded from: ${config.configPath}`);
console.log(`[config] Project root: ${config.projectRoot}`);
} catch (err) {
console.error(`[error] Failed to load config: ${err.message}`);
process.exit(1);
}
const PORT = parseInt(process.env.SG_SCENE_GENERATOR_PORT, 10) || 3210;
const HOST = "127.0.0.1";
const MIME_TYPES = {
".html": "text/html; charset=utf-8",
".js": "application/javascript",
".css": "text/css",
".json": "application/json",
};
function serveStatic(res, filePath) {
const ext = path.extname(filePath);
const contentType = MIME_TYPES[ext] || "application/octet-stream";
fs.readFile(filePath, (err, data) => {
if (err) {
res.writeHead(404);
res.end("Not found");
return;
}
res.writeHead(200, { "Content-Type": contentType });
res.end(data);
});
}
function initSSE(res) {
res.writeHead(200, {
"Content-Type": "text/event-stream",
"Cache-Control": "no-cache",
Connection: "keep-alive",
"Access-Control-Allow-Origin": "*",
});
res.write(":\n");
return res;
}
function writeSSE(res, event, data) {
res.write(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`);
}
function parseBody(req) {
return new Promise((resolve, reject) => {
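req.setEncoding("utf8"); // decode as UTF-8 so multi-byte characters split across chunks stay intact
let body = "";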
req.on("data", (chunk) => (body += chunk));
req.on("end", () => {
try {
resolve(JSON.parse(body));
} catch (err) {
reject(new Error("Invalid JSON"));
}
});
req.on("error", reject);
});
}
async function handleAnalyze(req, res) {
let body;
try {
body = await parseBody(req);
} catch {
res.writeHead(400, { "Content-Type": "application/json" });
res.end(JSON.stringify({ error: "Invalid JSON body" }));
return;
}
const sourceDir = (body.sourceDir || "").replace(/\\/g, "/");
if (!sourceDir) {
res.writeHead(400, { "Content-Type": "application/json" });
res.end(JSON.stringify({ error: "sourceDir is required" }));
return;
}
let dirContents;
try {
dirContents = readDirectory(sourceDir);
} catch (err) {
res.writeHead(400, { "Content-Type": "application/json" });
res.end(JSON.stringify({ error: err.message }));
return;
}
try {
const result = await analyzeScene(sourceDir, dirContents, config);
res.writeHead(200, { "Content-Type": "application/json" });
res.end(JSON.stringify(result));
} catch (err) {
res.writeHead(502, { "Content-Type": "application/json" });
res.end(
JSON.stringify({
error: `LLM analysis failed: ${err.message}`,
hint: "You can still enter scene-id and scene-name manually",
})
);
}
}
async function handleAnalyzeDeep(req, res) {
let body;
try {
body = await parseBody(req);
} catch {
res.writeHead(400, { "Content-Type": "application/json" });
res.end(JSON.stringify({ error: "Invalid JSON body" }));
return;
}
const sourceDir = (body.sourceDir || "").replace(/\\/g, "/");
if (!sourceDir) {
res.writeHead(400, { "Content-Type": "application/json" });
res.end(JSON.stringify({ error: "sourceDir is required" }));
return;
}
let dirContents;
try {
dirContents = readDirectory(sourceDir);
} catch (err) {
res.writeHead(400, { "Content-Type": "application/json" });
res.end(JSON.stringify({ error: err.message }));
return;
}
try {
const indexHtmlContent = dirContents.indexHtml || null;
const result = await analyzeSceneDeep(sourceDir, dirContents, indexHtmlContent, config);
// Log extraction results for debugging
console.log(`[analyze-deep] Extracted scene: ${result.sceneId} / ${result.sceneName}`);
console.log(`[analyze-deep] API endpoints: ${result.apiEndpoints?.length || 0}`);
console.log(`[analyze-deep] Column defs: ${result.columnDefs?.length || 0}`);
res.writeHead(200, { "Content-Type": "application/json" });
res.end(JSON.stringify(result));
} catch (err) {
console.error(`[analyze-deep] Error: ${err.message}`);
res.writeHead(502, { "Content-Type": "application/json" });
res.end(
JSON.stringify({
error: `Deep analysis failed: ${err.message}`,
hint: "You can still use basic analysis or enter data manually",
})
);
}
}
async function handleGenerate(req, res) {
let body;
try {
body = await parseBody(req);
} catch {
res.writeHead(400, { "Content-Type": "application/json" });
res.end(JSON.stringify({ error: "Invalid JSON body" }));
return;
}
const { sourceDir, sceneId, sceneName, sceneKind, targetUrl, outputRoot, lessons, sceneInfoJson } = body;
if (!sourceDir || !sceneId || !sceneName || !outputRoot) {
res.writeHead(400, { "Content-Type": "application/json" });
res.end(
JSON.stringify({
error:
"All fields required: sourceDir, sceneId, sceneName, outputRoot",
})
);
return;
}
const sseWriter = initSSE(res);
try {
await runGenerator(
{ sourceDir, sceneId, sceneName, sceneKind, targetUrl, outputRoot, lessons, sceneInfoJson },
sseWriter,
config.projectRoot
);
} catch (err) {
writeSSE(sseWriter, "error", { message: `Server error: ${err.message}` });
}
sseWriter.end();
}
function handleHealth(req, res) {
res.writeHead(200, { "Content-Type": "application/json" });
res.end(
JSON.stringify({
status: "ok",
pid: process.pid,
configLoaded: true,
configPath: config.configPath,
projectRoot: config.projectRoot,
})
);
}
/**
* Open a native Windows folder selection dialog using PowerShell.
* Returns the selected folder path or null if cancelled.
*/
function openFolderDialog(defaultPath) {
return new Promise((resolve) => {
const psScript = `
[Console]::OutputEncoding = [System.Text.Encoding]::UTF8
Add-Type -AssemblyName System.Windows.Forms
$dialog = New-Object System.Windows.Forms.FolderBrowserDialog
$dialog.Description = "选择文件夹"
$dialog.ShowNewFolderButton = $true
${defaultPath ? `$dialog.SelectedPath = '${defaultPath.replace(/'/g, "''")}'` : ""}
if ($dialog.ShowDialog() -eq 'OK') {
Write-Output $dialog.SelectedPath
}
`.trim();
const ps = spawn("powershell.exe", [
"-NoProfile",
"-NonInteractive",
"-Command",
psScript,
], {
windowsHide: true,
});
let output = "";
let error = "";
ps.stdout.on("data", (data) => {
output += data.toString("utf8");
});
ps.stderr.on("data", (data) => {
error += data.toString("utf8");
});
ps.on("close", (code) => {
if (code === 0 && output.trim()) {
// Strip a leading BOM if present
let selected = output.trim();
if (selected.charCodeAt(0) === 0xFEFF) {
selected = selected.slice(1);
}
resolve(selected);
} else {
resolve(null);
}
});
ps.on("error", () => {
resolve(null);
});
});
}
async function handleSelectFolder(req, res) {
let body = {};
try {
body = await parseBody(req);
} catch {
// ignore parse error, use empty body
}
const selectedPath = await openFolderDialog(body.defaultPath || "");
res.writeHead(200, { "Content-Type": "application/json" });
res.end(JSON.stringify({ path: selectedPath }));
}
async function handleSelectFile(req, res) {
let body = {};
try {
body = await parseBody(req);
} catch {
// ignore parse error
}
const filter = body.filter || "所有文件 (*.*)|*.*";
const psScript = `
[Console]::OutputEncoding = [System.Text.Encoding]::UTF8
Add-Type -AssemblyName System.Windows.Forms
$dialog = New-Object System.Windows.Forms.OpenFileDialog
$dialog.Filter = '${String(filter).replace(/'/g, "''")}'
$dialog.Title = "选择文件"
${body.defaultPath ? `$dialog.InitialDirectory = '${body.defaultPath.replace(/'/g, "''")}'` : ""}
if ($dialog.ShowDialog() -eq 'OK') {
Write-Output $dialog.FileName
}
`.trim();
return new Promise((resolve) => {
const ps = spawn("powershell.exe", [
"-NoProfile",
"-NonInteractive",
"-Command",
psScript,
], {
windowsHide: true,
});
let output = "";
ps.stdout.on("data", (data) => {
output += data.toString("utf8");
});
ps.on("close", (code) => {
let selected = output.trim();
if (selected.charCodeAt(0) === 0xFEFF) {
selected = selected.slice(1);
}
res.writeHead(200, { "Content-Type": "application/json" });
res.end(JSON.stringify({ path: code === 0 && selected ? selected : null }));
resolve();
});
ps.on("error", () => {
res.writeHead(200, { "Content-Type": "application/json" });
res.end(JSON.stringify({ path: null }));
resolve();
});
});
}
const server = http.createServer(async (req, res) => {
const url = new URL(req.url, `http://${req.headers.host}`);
const pathname = url.pathname;
if (req.method === "OPTIONS") {
res.writeHead(204, {
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Methods": "GET, POST, OPTIONS",
"Access-Control-Allow-Headers": "Content-Type",
});
res.end();
return;
}
try {
if (pathname === "/health" && req.method === "GET") {
handleHealth(req, res);
} else if (pathname === "/analyze" && req.method === "POST") {
await handleAnalyze(req, res);
} else if (pathname === "/analyze-deep" && req.method === "POST") {
await handleAnalyzeDeep(req, res);
} else if (pathname === "/generate" && req.method === "POST") {
await handleGenerate(req, res);
} else if (pathname === "/select-folder" && req.method === "POST") {
await handleSelectFolder(req, res);
} else if (pathname === "/select-file" && req.method === "POST") {
await handleSelectFile(req, res);
} else if (pathname === "/" || pathname === "/index.html") {
serveStatic(res, path.join(__dirname, "sg_scene_generator.html"));
} else {
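// path.join keeps the request path relative to __dirname; path.resolve would
// treat the leading "/" as an absolute path and always fail the check below.
const filePath = path.join(__dirname, pathname);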
const resolvedDir = path.resolve(__dirname);
if (!filePath.startsWith(resolvedDir + path.sep) && filePath !== resolvedDir) {
res.writeHead(403, { "Content-Type": "application/json" });
res.end(JSON.stringify({ error: "Forbidden" }));
return;
}
if (fs.existsSync(filePath) && fs.statSync(filePath).isFile()) {
serveStatic(res, filePath);
} else {
res.writeHead(404, { "Content-Type": "application/json" });
res.end(JSON.stringify({ error: "Not found" }));
}
}
} catch (err) {
console.error(`[error] ${req.method} ${pathname}: ${err.message}`);
if (!res.headersSent) {
res.writeHead(500, { "Content-Type": "application/json" });
}
res.end(JSON.stringify({ error: err.message }));
}
});
server.listen(PORT, HOST, () => {
console.log("");
console.log(" ╔══════════════════════════════════════════════════╗");
console.log(" ║ sgClaw · Scene Skill Generator ║");
console.log(" ╠══════════════════════════════════════════════════╣");
console.log(" ║ ║");
console.log(` ║ 访问地址: http://${HOST}:${PORT}/ ║`);
console.log(" ║ ║");
console.log(" ║ 按 Ctrl+C 停止服务 ║");
console.log(" ╚══════════════════════════════════════════════════╝");
console.log("");
});
process.on("SIGINT", () => {
if (server.closing) return;
server.closing = true;
console.log("\n[info] Shutting down...");
server.close(() => process.exit(0));
// Force exit if close() has not finished in time
setTimeout(() => process.exit(0), 2000);
});
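
The static-file branch of the request handler guards against path traversal by checking that the resolved file path stays under the server directory. A minimal sketch of such a containment check follows, assuming a hypothetical helper named `resolveInside` and an illustrative `public` root; neither appears in the original file.

```javascript
"use strict";
const path = require("path");

// Hypothetical helper: map a URL pathname to a file under rootDir,
// or return null when the result would escape rootDir.
function resolveInside(rootDir, pathname) {
  const root = path.resolve(rootDir);
  // path.join treats the leading "/" as a plain separator, whereas
  // path.resolve would restart from the filesystem root.
  const candidate = path.join(root, pathname);
  if (candidate !== root && !candidate.startsWith(root + path.sep)) {
    return null;
  }
  return candidate;
}

// Illustrative root directory; any absolute directory works.
const root = path.join(process.cwd(), "public");
console.log(resolveInside(root, "/sg_scene_generator.html"));
console.log(resolveInside(root, "/../secret.txt")); // null
```

Comparing against `root + path.sep` rather than `root` alone keeps a sibling directory such as `public-old` from passing the prefix test.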

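
`writeSSE` emits frames of the form `event: <name>`, then `data: <json>`, then a blank line, and `initSSE` opens the stream with a `:` comment line. A minimal sketch of a parser for that framing is given here, assuming exactly one `data:` line per frame, as `writeSSE` produces; the name `parseSSEFrames` is hypothetical.

```javascript
"use strict";

// Hypothetical parser for the frames produced by writeSSE:
//   event: <name>\n
//   data: <json>\n
//   \n
// Returns the parsed events plus any trailing partial frame.
function parseSSEFrames(buffer) {
  const events = [];
  const frames = buffer.split("\n\n");
  // The last piece is incomplete until its terminating blank line arrives.
  const rest = frames.pop();
  for (const frame of frames) {
    let event = "message"; // default event type when no "event:" line is present
    let data = null;
    for (const line of frame.split("\n")) {
      if (line.startsWith(":")) continue; // comment line, e.g. the initial ":"
      if (line.startsWith("event: ")) event = line.slice(7);
      else if (line.startsWith("data: ")) data = JSON.parse(line.slice(6));
    }
    if (data !== null) events.push({ event, data });
  }
  return { events, rest };
}

const chunk = ':\nevent: log\ndata: {"message":"step 1"}\n\nevent: comp';
console.log(parseSSEFrames(chunk));
```

A caller would prepend `rest` to the next chunk it receives, so a frame split across network reads is parsed once it is complete.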
@@ -0,0 +1,690 @@
<!doctype html>
<html lang="zh-CN">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>场景 Skill 生成器</title>
<style>
:root {
--bg: #f3efe4;
--panel: rgba(255, 252, 247, 0.88);
--panel-strong: #fffaf2;
--text: #1f2329;
--muted: #636b74;
--line: rgba(31, 35, 41, 0.12);
--accent: #0f766e;
--accent-strong: #115e59;
--warn: #b45309;
--error: #b42318;
--success: #166534;
--shadow: 0 24px 60px rgba(34, 42, 53, 0.14);
}
* { box-sizing: border-box; }
body {
margin: 0;
min-height: 100vh;
font-family: "Microsoft YaHei", "PingFang SC", "Segoe UI", sans-serif;
color: var(--text);
background:
radial-gradient(circle at top left, rgba(15, 118, 110, 0.18), transparent 32%),
radial-gradient(circle at right, rgba(180, 83, 9, 0.14), transparent 28%),
linear-gradient(160deg, #f5f0e6 0%, #eef5f4 56%, #f7f3eb 100%);
padding: 24px;
}
.shell {
width: min(1040px, 100%);
margin: 0 auto;
background: var(--panel);
backdrop-filter: blur(14px);
border: 1px solid rgba(255, 255, 255, 0.7);
border-radius: 28px;
box-shadow: var(--shadow);
overflow: hidden;
}
.hero {
padding: 28px 28px 18px;
border-bottom: 1px solid var(--line);
background: linear-gradient(135deg, rgba(255, 250, 242, 0.96), rgba(237, 246, 243, 0.92));
}
.hero h1 { margin: 0; font-size: clamp(1.8rem, 4vw, 2.6rem); line-height: 1.05; letter-spacing: 0.02em; }
.hero p { margin: 10px 0 0; max-width: 60ch; color: var(--muted); line-height: 1.6; }
.content { display: grid; grid-template-columns: minmax(280px, 360px) minmax(0, 1fr); gap: 0; }
.sidebar, .stream-panel { padding: 24px; }
.sidebar { border-right: 1px solid var(--line); background: rgba(255, 255, 255, 0.38); }
.section-label { margin: 0 0 14px; font-size: 0.83rem; font-weight: 700; letter-spacing: 0.14em; text-transform: uppercase; color: var(--muted); }
.field { margin-bottom: 16px; }
.field label { display: block; margin-bottom: 6px; font-size: 0.92rem; color: var(--muted); }
.input-row { display: flex; gap: 8px; }
.input-row input { flex: 1; }
.input-row .browse-btn { width: auto; min-width: 60px; padding: 10px 14px; font-size: 0.85rem; }
input, button {
width: 100%;
border: 1px solid var(--line);
border-radius: 16px;
font: inherit;
}
input {
background: rgba(255, 255, 255, 0.92);
color: var(--text);
padding: 12px 14px;
outline: none;
transition: border-color 140ms ease, box-shadow 140ms ease;
}
input:focus { border-color: rgba(15, 118, 110, 0.5); box-shadow: 0 0 0 4px rgba(15, 118, 110, 0.12); }
select {
width: 100%;
border: 1px solid var(--line);
border-radius: 16px;
background: rgba(255, 255, 255, 0.92);
color: var(--text);
padding: 12px 14px;
font: inherit;
outline: none;
cursor: pointer;
transition: border-color 140ms ease, box-shadow 140ms ease;
}
select:focus { border-color: rgba(15, 118, 110, 0.5); box-shadow: 0 0 0 4px rgba(15, 118, 110, 0.12); }
button { border: 0; padding: 12px 14px; font-weight: 700; cursor: pointer; transition: transform 140ms ease, opacity 140ms ease; }
button:hover:not(:disabled) { transform: translateY(-1px); }
button:disabled { cursor: not-allowed; opacity: 0.45; }
.primary-btn { background: linear-gradient(135deg, var(--accent), var(--accent-strong)); color: #f6fffd; box-shadow: 0 14px 28px rgba(15, 118, 110, 0.18); }
.ghost-btn { background: rgba(255, 255, 255, 0.9); color: var(--text); border: 1px solid var(--line); }
.status-card { display: grid; gap: 8px; padding: 14px; border-radius: 20px; background: var(--panel-strong); border: 1px solid var(--line); margin-bottom: 16px; }
.state-chip { display: inline-flex; align-items: center; width: fit-content; padding: 5px 10px; border-radius: 999px; font-size: 0.85rem; font-weight: 700; background: rgba(99, 107, 116, 0.12); color: var(--muted); }
.state-chip[data-state="ready"] { background: rgba(99, 107, 116, 0.12); color: var(--muted); }
.state-chip[data-state="analyzing"] { background: rgba(180, 83, 9, 0.12); color: var(--warn); }
.state-chip[data-state="generating"] { background: rgba(15, 118, 110, 0.12); color: var(--accent); }
.state-chip[data-state="complete"] { background: rgba(22, 101, 52, 0.12); color: var(--success); }
.state-chip[data-state="error"] { background: rgba(180, 35, 24, 0.12); color: var(--error); }
.validation { min-height: 1.4em; margin: 8px 0 12px; color: var(--error); font-size: 0.92rem; }
.stream-panel { display: grid; grid-template-rows: auto minmax(320px, 1fr); gap: 18px; }
.stream-head { display: flex; justify-content: space-between; align-items: end; gap: 16px; }
.stream-head h2 { margin: 0; font-size: 1.35rem; }
.stream-head p { margin: 6px 0 0; color: var(--muted); font-size: 0.94rem; }
.stream { display: grid; gap: 12px; align-content: start; min-height: 320px; max-height: 70vh; overflow: auto; padding: 4px; }
.empty-state { padding: 22px; border-radius: 20px; background: rgba(255, 255, 255, 0.52); border: 1px dashed rgba(31, 35, 41, 0.16); color: var(--muted); line-height: 1.6; }
.row { display: grid; grid-template-columns: auto 1fr; gap: 12px; align-items: start; padding: 14px 16px; border-radius: 18px; background: rgba(255, 255, 255, 0.76); border: 1px solid rgba(31, 35, 41, 0.08); animation: rise 180ms ease; }
.row-badge { display: inline-flex; align-items: center; justify-content: center; min-width: 76px; padding: 6px 10px; border-radius: 999px; font-size: 0.76rem; font-weight: 700; text-transform: uppercase; letter-spacing: 0.08em; background: rgba(99, 107, 116, 0.14); color: var(--muted); }
.row.status .row-badge { background: rgba(15, 118, 110, 0.14); color: var(--accent-strong); }
.row.log .row-badge { background: rgba(57, 91, 163, 0.14); color: #315aa2; }
.row.complete .row-badge { background: rgba(22, 101, 52, 0.14); color: var(--success); }
.row.error .row-badge { background: rgba(180, 35, 24, 0.14); color: var(--error); }
.row-text { margin: 0; line-height: 1.6; white-space: pre-wrap; word-break: break-word; }
@keyframes rise { from { opacity: 0; transform: translateY(6px); } to { opacity: 1; transform: translateY(0); } }
.hint { font-size: 0.8rem; color: var(--muted); margin-top: 4px; }
.divider { height: 1px; background: var(--line); margin: 12px 0; }
@media (max-width: 900px) { body { padding: 16px; } .content { grid-template-columns: 1fr; } .sidebar { border-right: 0; border-bottom: 1px solid var(--line); } .stream { max-height: none; } }
/* Preview panel styles */
.preview-panel {
background: rgba(255, 255, 255, 0.6);
border-radius: 16px;
border: 1px solid var(--line);
overflow: hidden;
margin-top: 16px;
}
.preview-header {
display: flex;
justify-content: space-between;
align-items: center;
padding: 14px 16px;
cursor: pointer;
background: rgba(255, 255, 255, 0.5);
}
.preview-header h3 {
margin: 0;
font-size: 0.95rem;
font-weight: 700;
color: var(--text);
}
.preview-header:hover {
background: rgba(255, 255, 255, 0.7);
}
.preview-content {
padding: 16px;
}
.preview-section {
margin-bottom: 16px;
}
.preview-section:last-child {
margin-bottom: 0;
}
.preview-section h4 {
margin: 0 0 8px 0;
font-size: 0.85rem;
font-weight: 700;
color: var(--accent);
}
.preview-row {
display: flex;
margin-bottom: 6px;
}
.preview-row .label {
min-width: 80px;
color: var(--muted);
flex-shrink: 0;
font-size: 0.88rem;
}
.preview-row .value {
color: var(--text);
font-size: 0.88rem;
}
.preview-list {
max-height: 120px;
overflow-y: auto;
background: rgba(255, 255, 255, 0.8);
border-radius: 12px;
padding: 8px 12px;
border: 1px solid var(--line);
}
.preview-list-item {
padding: 5px 0;
border-bottom: 1px solid var(--line);
font-size: 0.85rem;
}
.preview-list-item:last-child {
border-bottom: none;
}
.preview-code {
background: rgba(0, 0, 0, 0.04);
padding: 10px;
border-radius: 10px;
font-family: "Consolas", "Monaco", monospace;
font-size: 0.8rem;
overflow-x: auto;
white-space: pre-wrap;
color: var(--text);
}
.btn-group {
display: flex;
gap: 8px;
margin-top: 12px;
}
.btn-group button {
flex: 1;
}
.secondary-btn {
background: rgba(15, 118, 110, 0.1);
color: var(--accent);
border: 1px solid rgba(15, 118, 110, 0.3);
}
.secondary-btn:hover:not(:disabled) {
background: rgba(15, 118, 110, 0.15);
}
</style>
</head>
<body>
<div class="shell">
<div class="hero">
<h1>场景 Skill 生成器</h1>
<p>选择场景目录,配置参数,一键生成 skill 包。</p>
</div>
<div class="content">
<div class="sidebar">
<p class="section-label">Status</p>
<div class="status-card">
<span id="stateChip" class="state-chip" data-state="ready">就绪</span>
<span id="statusText">请选择场景目录</span>
</div>
<p class="section-label">Source</p>
<div class="field">
<label>场景目录路径</label>
<div class="input-row">
<input id="sourceDir" placeholder="点击浏览选择目录..." readonly />
<button id="browseSourceDir" class="ghost-btn browse-btn">浏览</button>
</div>
</div>
<div class="field">
<label for="sceneId">scene-id</label>
<input id="sceneId" placeholder="例如:tq-lineloss-report" />
</div>
<div class="field">
<label for="sceneName">scene-name</label>
<input id="sceneName" placeholder="例如:台区线损报表" />
</div>
<div class="field">
<label for="sceneKind">场景类型</label>
<select id="sceneKind">
<option value="report_collection" selected>报表收集类</option>
<option value="monitoring">监测类</option>
</select>
</div>
<div class="field">
<label for="targetUrl">目标 URL (可选)</label>
<input id="targetUrl" placeholder="例如:http://20.76.57.61:18080/report" />
<p class="hint">场景要访问的目标页面地址,留空则使用自动提取的域名</p>
</div>
<div class="btn-group" style="margin-top: 8px; margin-bottom: 16px;">
<button id="analyzeBtn" class="secondary-btn" onclick="analyzeDeep()">深度分析</button>
</div>
<!-- Extraction result preview -->
<div id="extractionPreview" class="preview-panel" style="display: none;">
<div class="preview-header" onclick="togglePreview()">
<h3>LLM 提取结果</h3>
<span id="previewToggleIcon"></span>
</div>
<div id="previewContent" class="preview-content">
<div class="preview-section">
<h4>基本信息</h4>
<div class="preview-row">
<span class="label">场景 ID:</span>
<span id="previewSceneId" class="value"></span>
</div>
<div class="preview-row">
<span class="label">场景名称:</span>
<span id="previewSceneName" class="value"></span>
</div>
<div class="preview-row">
<span class="label">场景类型:</span>
<span id="previewSceneKind" class="value"></span>
</div>
<div class="preview-row">
<span class="label">目标域名:</span>
<span id="previewExpectedDomain" class="value"></span>
</div>
</div>
<div class="preview-section">
<h4>API 端点 (<span id="previewApiCount">0</span>)</h4>
<div id="previewApiEndpoints" class="preview-list"></div>
</div>
<div class="preview-section">
<h4>列定义 (<span id="previewColumnCount">0</span>)</h4>
<div id="previewColumnDefs" class="preview-list"></div>
</div>
<div class="preview-section">
<h4>静态参数</h4>
<pre id="previewStaticParams" class="preview-code"></pre>
</div>
<div class="preview-section">
<h4>业务逻辑</h4>
<div class="preview-row">
<span class="label">数据获取:</span>
<span id="previewDataFetch" class="value"></span>
</div>
<div class="preview-row">
<span class="label">数据转换:</span>
<span id="previewDataTransform" class="value"></span>
</div>
</div>
</div>
</div>
<div class="divider"></div>
<p class="section-label">Settings</p>
<div class="field">
<label>输出根路径</label>
<div class="input-row">
<input id="settingOutputRoot" placeholder="点击浏览选择目录..." readonly />
<button id="browseOutputRoot" class="ghost-btn browse-btn">浏览</button>
</div>
</div>
<div id="validationText" class="validation"></div>
<button id="generateBtn" class="primary-btn" disabled>生成 Skill</button>
</div>
<div class="stream-panel">
<div class="stream-head">
<div>
<p class="section-label">Generation Log</p>
<h2>实时日志</h2>
<p>显示生成过程的完整输出</p>
</div>
</div>
<div id="messageStream" class="stream">
<div class="empty-state" id="emptyState">选择场景目录并点击"生成 Skill"开始。</div>
</div>
</div>
</div>
</div>
<script>
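// Use the page's own origin when served over HTTP so a non-default port works;
// fall back to the default port when the file is opened from disk.
const SERVER_URL = /^https?:$/.test(window.location.protocol) ? window.location.origin : "http://127.0.0.1:3210";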
const els = {
sourceDir: document.getElementById("sourceDir"),
sceneId: document.getElementById("sceneId"),
sceneName: document.getElementById("sceneName"),
sceneKind: document.getElementById("sceneKind"),
targetUrl: document.getElementById("targetUrl"),
browseSourceDir: document.getElementById("browseSourceDir"),
browseOutputRoot: document.getElementById("browseOutputRoot"),
settingOutputRoot: document.getElementById("settingOutputRoot"),
generateBtn: document.getElementById("generateBtn"),
validationText: document.getElementById("validationText"),
stateChip: document.getElementById("stateChip"),
statusText: document.getElementById("statusText"),
messageStream: document.getElementById("messageStream"),
emptyState: document.getElementById("emptyState"),
};
let defaultsLoaded = false;
let currentSceneInfo = null; // Stores deep extraction results
let previewExpanded = false;
function setState(state, text) {
els.stateChip.textContent = text;
els.stateChip.dataset.state = state;
els.statusText.textContent = text;
}
function setValidation(msg) { els.validationText.textContent = msg; }
function updateGenerateBtn() {
const ready = els.sourceDir.value.trim() && els.sceneId.value.trim() && els.sceneName.value.trim() && defaultsLoaded;
els.generateBtn.disabled = !ready;
}
function appendRow(kind, text) {
if (els.emptyState) { els.emptyState.remove(); els.emptyState = null; }
const row = document.createElement("div");
row.className = "row " + kind;
const badge = document.createElement("span");
badge.className = "row-badge";
badge.textContent = kind;
const content = document.createElement("p");
content.className = "row-text";
content.textContent = text;
row.appendChild(badge);
row.appendChild(content);
els.messageStream.appendChild(row);
els.messageStream.scrollTop = els.messageStream.scrollHeight;
}
async function selectFolder(defaultPath) {
try {
const res = await fetch(`${SERVER_URL}/select-folder`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ defaultPath }),
});
const data = await res.json();
return data.path;
} catch (err) {
console.error("Failed to select folder:", err);
return null;
}
}
async function selectFile(defaultPath, filter) {
try {
const res = await fetch(`${SERVER_URL}/select-file`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ defaultPath, filter }),
});
const data = await res.json();
return data.path;
} catch (err) {
console.error("Failed to select file:", err);
return null;
}
}
async function loadDefaults() {
try {
const res = await fetch(`${SERVER_URL}/health`);
if (!res.ok) throw new Error("health check failed");
defaultsLoaded = true;
const health = await res.json();
if (health.projectRoot) {
const root = health.projectRoot.replace(/\\/g, "/");
els.settingOutputRoot.value = root + "/examples/generated_scene_platform";
}
updateGenerateBtn();
} catch (err) {
console.error("Failed to load defaults:", err);
setState("error", "无法连接服务器");
appendRow("error", `服务器连接失败: ${err.message}`);
}
}
async function analyzeSourceDir(sourceDir) {
if (!sourceDir) return;
setState("analyzing", "正在分析场景目录...");
appendRow("status", "开始分析场景目录...");
try {
const res = await fetch(`${SERVER_URL}/analyze`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ sourceDir }),
});
const data = await res.json();
if (!res.ok) throw new Error(data.error || "Analyze failed");
if (data.sceneId) {
els.sceneId.value = data.sceneId;
}
if (data.sceneName) {
els.sceneName.value = data.sceneName;
}
appendRow("status", `分析完成: ${data.sceneId || ""} ${data.sceneName || ""}`.trim());
} catch (err) {
appendRow("error", `分析失败: ${err.message}`);
} finally {
setState("ready", "就绪");
updateGenerateBtn();
}
}
function togglePreview() {
const content = document.getElementById("previewContent");
const icon = document.getElementById("previewToggleIcon");
previewExpanded = !previewExpanded;
content.style.display = previewExpanded ? "block" : "none";
icon.textContent = previewExpanded ? "▲" : "▼";
}
function showExtractionPreview(data) {
const panel = document.getElementById("extractionPreview");
panel.style.display = "block";
previewExpanded = true;
document.getElementById("previewContent").style.display = "block";
document.getElementById("previewToggleIcon").textContent = "▲";
// Basic info
document.getElementById("previewSceneId").textContent = data.sceneId || "-";
document.getElementById("previewSceneName").textContent = data.sceneName || "-";
document.getElementById("previewSceneKind").textContent = data.sceneKind || "-";
document.getElementById("previewExpectedDomain").textContent = data.expectedDomain || "-";
// API endpoints
const apiList = document.getElementById("previewApiEndpoints");
const apiCount = document.getElementById("previewApiCount");
if (data.apiEndpoints && data.apiEndpoints.length > 0) {
apiCount.textContent = data.apiEndpoints.length;
apiList.innerHTML = data.apiEndpoints.map(ep => {
const name = escapeHtml(ep.name || "unknown");
const url = escapeHtml(ep.url || "");
const method = escapeHtml(ep.method || "GET");
return `<div class="preview-list-item"><strong>${name}</strong>: ${url} <span style="color: var(--muted);">[${method}]</span></div>`;
}).join("");
} else {
apiCount.textContent = "0";
apiList.innerHTML = '<div class="preview-list-item" style="color: var(--muted);">无</div>';
}
// Column definitions
const colList = document.getElementById("previewColumnDefs");
const colCount = document.getElementById("previewColumnCount");
if (data.columnDefs && data.columnDefs.length > 0) {
colCount.textContent = data.columnDefs.length;
colList.innerHTML = data.columnDefs.map(col => {
const field = escapeHtml(Array.isArray(col) ? col[0] : (col.field || ""));
const label = escapeHtml(Array.isArray(col) ? col[1] : (col.label || ""));
return `<div class="preview-list-item"><code>${field}</code> → ${label}</div>`;
}).join("");
} else {
colCount.textContent = "0";
colList.innerHTML = '<div class="preview-list-item" style="color: var(--muted);">无</div>';
}
// Static params
const staticParams = document.getElementById("previewStaticParams");
if (data.staticParams) {
staticParams.textContent = JSON.stringify(data.staticParams, null, 2);
} else {
staticParams.textContent = "{}";
}
// Business logic
const biz = data.businessLogic || {};
document.getElementById("previewDataFetch").textContent = biz.dataFetch || "-";
document.getElementById("previewDataTransform").textContent = biz.dataTransform || "-";
}
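function escapeHtml(str) {
if (str === null || str === undefined) return "";
// Coerce first: LLM output may contain numbers or booleans, which have no .replace
return String(str)
.replace(/&/g, "&amp;")
.replace(/</g, "&lt;")
.replace(/>/g, "&gt;")
.replace(/"/g, "&quot;")
.replace(/'/g, "&#39;");
}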
async function analyzeDeep() {
const sourceDir = els.sourceDir.value.trim().replace(/\\/g, "/");
if (!sourceDir) {
setValidation("请先选择场景目录");
return;
}
setValidation("");
setState("analyzing", "正在深度分析...");
appendRow("status", "开始深度分析场景...");
try {
const res = await fetch(`${SERVER_URL}/analyze-deep`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ sourceDir }),
});
const data = await res.json();
if (!res.ok) throw new Error(data.error || "Deep analysis failed");
// Store the scene info for generation
currentSceneInfo = data;
// Fill in form fields if not already set
if (data.sceneId && !els.sceneId.value.trim()) {
els.sceneId.value = data.sceneId;
}
if (data.sceneName && !els.sceneName.value.trim()) {
els.sceneName.value = data.sceneName;
}
// Show preview
showExtractionPreview(data);
appendRow("status", `深度分析完成: 找到 ${data.apiEndpoints?.length || 0} 个 API 端点, ${data.columnDefs?.length || 0} 个列定义`);
} catch (err) {
appendRow("error", `深度分析失败: ${err.message}`);
} finally {
setState("ready", "就绪");
updateGenerateBtn();
}
}
async function generate() {
const sourceDir = els.sourceDir.value.trim().replace(/\\/g, "/");
const sceneId = els.sceneId.value.trim();
const sceneName = els.sceneName.value.trim();
const sceneKind = els.sceneKind.value;
const targetUrl = els.targetUrl.value.trim();
const outputRoot = els.settingOutputRoot.value.trim().replace(/\\/g, "/");
if (!sourceDir || !sceneId || !sceneName || !outputRoot) { setValidation("场景目录、scene-id、scene-name、输出根路径为必填"); return; }
setValidation("");
setState("generating", "正在生成 skill 包...");
els.generateBtn.disabled = true;
appendRow("status", "开始生成 skill 包...");
try {
const requestBody = {
sourceDir,
sceneId,
sceneName,
sceneKind,
targetUrl: targetUrl || null,
outputRoot,
};
// Include deep extraction results if available
if (currentSceneInfo) {
requestBody.sceneInfoJson = JSON.stringify(currentSceneInfo);
}
const res = await fetch(`${SERVER_URL}/generate`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(requestBody),
});
if (!res.ok) { const err = await res.json().catch(() => ({})); throw new Error(err.error || "Generation failed"); }
const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
let lastEvent = "";
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split("\n");
buffer = lines.pop() || "";
for (const line of lines) {
if (line.startsWith("event:")) { lastEvent = line.slice(6).trim(); }
else if (line.startsWith("data:") && line.trim()) {
const dataStr = line.slice(5).trim();
if (!dataStr) continue;
try {
const data = JSON.parse(dataStr);
switch (lastEvent) {
case "status": appendRow("status", data.message); break;
case "log": appendRow("log", data.message); break;
case "complete":
if (data.success) { setState("complete", "生成完成"); appendRow("complete", `生成完成: ${data.skillRoot || ""}`); }
else { setState("error", "生成失败"); appendRow("error", data.message || "生成失败"); }
break;
case "error": setState("error", "生成失败"); appendRow("error", data.message); break;
default: appendRow("log", JSON.stringify(data));
}
} catch (_) { /* Ignore data lines that fail to parse rather than abort the stream. */ }
}
}
}
} catch (err) {
setState("error", "生成失败");
appendRow("error", err.message);
} finally {
els.generateBtn.disabled = false;
updateGenerateBtn();
}
}
// Browse buttons
els.browseSourceDir.addEventListener("click", async () => {
const path = await selectFolder(els.sourceDir.value || null);
if (path) {
els.sourceDir.value = path;
const parts = path.replace(/\\/g, "/").split("/");
const folderName = parts[parts.length - 1];
if (folderName && !els.sceneId.value) {
els.sceneId.value = folderName;
}
updateGenerateBtn();
await analyzeSourceDir(path.replace(/\\/g, "/"));
}
});
els.browseOutputRoot.addEventListener("click", async () => {
const path = await selectFolder(els.settingOutputRoot.value || null);
if (path) {
els.settingOutputRoot.value = path;
updateGenerateBtn();
}
});
els.generateBtn.addEventListener("click", generate);
els.sceneId.addEventListener("input", updateGenerateBtn);
els.sceneName.addEventListener("input", updateGenerateBtn);
loadDefaults();
</script>
</body>
</html>
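The streaming loop in `generate()` above reads the `/generate` response as `event:` / `data:` lines and carries any incomplete trailing line over to the next chunk. The same buffering idea can be sketched in Rust as follows. This is a minimal illustration only: `parse_sse_chunk` is a hypothetical helper, not part of this repository, and it ignores `\r\n` line endings and multi-line `data:` fields.

```rust
/// Hypothetical helper: parses the complete lines of one chunk into
/// (event, data) pairs and returns the unfinished tail for the next call.
fn parse_sse_chunk<'a>(
    buffer: &'a str,
    last_event: &mut String,
) -> (Vec<(String, String)>, &'a str) {
    // Everything after the last newline is an incomplete line.
    let (complete, tail) = match buffer.rfind('\n') {
        Some(idx) => (&buffer[..idx], &buffer[idx + 1..]),
        None => ("", buffer),
    };
    let mut events = Vec::new();
    for line in complete.split('\n') {
        if let Some(name) = line.strip_prefix("event:") {
            // Remember the event name for the data lines that follow.
            *last_event = name.trim().to_string();
        } else if let Some(data) = line.strip_prefix("data:") {
            let data = data.trim();
            if !data.is_empty() {
                events.push((last_event.clone(), data.to_string()));
            }
        }
    }
    (events, tail)
}

fn main() {
    let mut last_event = String::new();
    let chunk = "event: status\ndata: {\"message\":\"started\"}\n\nevent: comp";
    let (events, tail) = parse_sse_chunk(chunk, &mut last_event);
    for (event, data) in &events {
        println!("{event}: {data}");
    }
    println!("carry over: {tail:?}");
}
```

Returning the tail instead of storing it keeps the helper free of hidden state; the caller prepends it to the next chunk, which mirrors what `buffer = lines.pop() || ""` does in the page script.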


@@ -4,6 +4,7 @@ use std::sync::Arc;
use crate::browser::ws_backend::WsBrowserBackend;
use crate::browser::{BrowserBackend, PipeBrowserBackend};
use crate::config::SgClawSettings;
use crate::pipe::{BrowserMessage, BrowserPipeTool, PipeError, Transport};
pub use task_runner::{
@@ -22,13 +23,27 @@ fn browser_backend_for_submit<T: Transport + 'static>(
request: &SubmitTaskRequest,
) -> Result<Arc<dyn BrowserBackend>, PipeError> {
if let Some(browser_ws_url) = configured_browser_ws_url(context) {
let settings = context.load_sgclaw_settings()?.unwrap_or(
SgClawSettings::from_legacy_deepseek_fields(
"test-key".to_string(),
"https://example.invalid".to_string(),
"test-model".to_string(),
None,
)
.map_err(|err| PipeError::Protocol(err.to_string()))?,
);
let bootstrap_target = crate::service::server::resolve_submit_bootstrap_target(
request,
context.workspace_root(),
&settings,
);
return Ok(Arc::new(
WsBrowserBackend::new(
Arc::new(crate::service::browser_ws_client::ServiceWsClient::connect(
&browser_ws_url,
)?),
browser_tool.mac_policy().clone(),
crate::service::browser_ws_client::initial_request_url_for_submit_task(request),
bootstrap_target.request_url,
)
.with_response_timeout(browser_tool.response_timeout()),
));


@@ -68,6 +68,10 @@ impl AgentRuntimeContext {
self.config_path.as_deref()
}
pub fn workspace_root(&self) -> &Path {
&self.workspace_root
}
fn settings_source_label(&self) -> String {
match &self.config_path {
Some(path) if path.exists() => path.display().to_string(),


@@ -0,0 +1,107 @@
use std::env;
use std::path::PathBuf;
use sgclaw::generated_scene::analyzer::SceneKind;
use sgclaw::generated_scene::generator::{generate_scene_package, GenerateSceneRequest, SceneInfoJson};
fn main() {
if let Err(err) = run() {
eprintln!("sg_scene_generate: {err}");
std::process::exit(1);
}
}
fn run() -> Result<(), String> {
let args = parse_args(env::args().skip(1))?;
let scene_info: Option<SceneInfoJson> = args.scene_info_json
.map(|json| serde_json::from_str(&json))
.transpose()
.map_err(|e| format!("Invalid scene-info-json: {}", e))?;
let skill_root = generate_scene_package(GenerateSceneRequest {
source_dir: args.source_dir,
scene_id: args.scene_id,
scene_name: args.scene_name,
scene_kind: args.scene_kind,
target_url: args.target_url,
output_root: args.output_root,
lessons_path: args.lessons_path,
scene_info_json: scene_info,
})
.map_err(|err| err.to_string())?;
println!("generated scene package: {}", skill_root.display());
Ok(())
}
struct CliArgs {
source_dir: PathBuf,
scene_id: String,
scene_name: String,
scene_kind: Option<SceneKind>,
target_url: Option<String>,
output_root: PathBuf,
lessons_path: Option<PathBuf>,
scene_info_json: Option<String>,
}
fn parse_args(args: impl Iterator<Item = String>) -> Result<CliArgs, String> {
let mut source_dir = None;
let mut scene_id = None;
let mut scene_name = None;
let mut scene_kind = None;
let mut target_url = None;
let mut output_root = None;
let mut lessons_path = None;
let mut scene_info_json = None;
let mut pending_flag: Option<String> = None;
for arg in args {
if let Some(flag) = pending_flag.take() {
match flag.as_str() {
"--source-dir" => source_dir = Some(PathBuf::from(arg)),
"--scene-id" => scene_id = Some(arg),
"--scene-name" => scene_name = Some(arg),
"--scene-kind" => {
scene_kind = Some(
SceneKind::from_str(&arg)
.ok_or_else(|| format!("invalid scene kind: {}", arg))?,
);
}
"--target-url" => target_url = Some(arg),
"--output-root" => output_root = Some(PathBuf::from(arg)),
"--lessons" => lessons_path = Some(PathBuf::from(arg)),
"--scene-info-json" => scene_info_json = Some(arg),
_ => return Err(format!("unsupported argument {flag}")),
}
continue;
}
match arg.as_str() {
"--source-dir" | "--scene-id" | "--scene-name" | "--scene-kind" | "--target-url"
| "--output-root" | "--lessons" | "--scene-info-json" => {
pending_flag = Some(arg);
}
"--help" | "-h" => return Err(usage()),
_ => return Err(format!("unsupported argument {arg}\n{}", usage())),
}
}
if let Some(flag) = pending_flag {
return Err(format!("missing value for {flag}"));
}
Ok(CliArgs {
source_dir: source_dir.ok_or_else(usage)?,
scene_id: scene_id.ok_or_else(usage)?,
scene_name: scene_name.ok_or_else(usage)?,
scene_kind,
target_url,
output_root: output_root.ok_or_else(usage)?,
lessons_path,
scene_info_json,
})
}
fn usage() -> String {
"usage: sg_scene_generate --source-dir <scenario-dir> --scene-id <scene-id> --scene-name <display-name> [--scene-kind <report_collection|monitoring>] [--target-url <url>] --output-root <skill-staging-root> [--lessons <lessons-toml>] [--scene-info-json '<json>']".to_string()
}
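The `parse_args` function above uses a "pending flag" pattern: a recognized flag is remembered, and the next argument is taken as its value. A stripped-down sketch of that pattern is shown here, assuming every flag takes exactly one value. `parse_flag_pairs` and the `known` list are hypothetical names introduced for illustration; the real binary maps each flag to a typed field instead of a map.

```rust
use std::collections::HashMap;

/// Hypothetical helper: collects `--flag value` pairs into a map.
fn parse_flag_pairs(
    args: impl Iterator<Item = String>,
    known: &[&str],
) -> Result<HashMap<String, String>, String> {
    let mut values = HashMap::new();
    let mut pending: Option<String> = None;
    for arg in args {
        // A flag seen on the previous iteration consumes this argument.
        if let Some(flag) = pending.take() {
            values.insert(flag, arg);
            continue;
        }
        if known.contains(&arg.as_str()) {
            pending = Some(arg);
        } else {
            return Err(format!("unsupported argument {arg}"));
        }
    }
    // A trailing flag without a value is an error.
    if let Some(flag) = pending {
        return Err(format!("missing value for {flag}"));
    }
    Ok(values)
}

fn main() {
    let argv = ["--scene-id", "demo", "--output-root", "out"]
        .iter()
        .map(|s| s.to_string());
    match parse_flag_pairs(argv, &["--scene-id", "--output-root"]) {
        Ok(values) => println!("{values:?}"),
        Err(err) => eprintln!("{err}"),
    }
}
```

One consequence of this design is that a value which happens to look like a flag (for example `--scene-name --x`) is still accepted as a value, because the pending check runs before flag matching.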


@@ -1,7 +1,9 @@
use std::fs;
use std::path::Path;
use std::sync::Arc;
use reqwest::Url;
use serde::Deserialize;
use serde_json::{Map, Value};
use zeroclaw::skills::{load_skills_from_directory, SkillTool};
@@ -12,6 +14,12 @@ use crate::compat::runtime::CompatTaskContext;
use crate::config::SgClawSettings;
use crate::pipe::{BrowserPipeTool, PipeError, Transport};
#[derive(Debug, Clone, PartialEq, Eq)]
pub(crate) struct DirectSubmitBootstrapMetadata {
pub bootstrap_url: String,
pub expected_domain: Option<String>,
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DirectSubmitOutcome {
pub success: bool,
@@ -111,6 +119,32 @@ pub fn execute_browser_script_skill_raw_output_with_browser_backend(
)
}
pub(crate) fn resolve_direct_submit_bootstrap_metadata(
configured_tool: &str,
workspace_root: &Path,
settings: &SgClawSettings,
) -> Result<Option<DirectSubmitBootstrapMetadata>, PipeError> {
let (tool, skill_root) = resolve_browser_script_skill(configured_tool, workspace_root, settings)?;
let manifest_path = skill_root.join("SKILL.toml");
let Ok(manifest) = fs::read_to_string(&manifest_path) else {
return Ok(None);
};
let Ok(parsed) = toml::from_str::<DirectSubmitSkillManifest>(&manifest) else {
return Ok(None);
};
let metadata = parsed
.tools
.into_iter()
.find(|candidate| candidate.name == tool.name)
.and_then(|candidate| candidate.metadata)
.and_then(|metadata| {
normalize_bootstrap_metadata(metadata.bootstrap_url, metadata.expected_domain)
});
Ok(metadata)
}
fn resolve_browser_script_skill(
configured_tool: &str,
workspace_root: &Path,
@@ -306,6 +340,50 @@ fn count_summary_rows(counts: Option<&Value>, sections: Option<&Value>) -> usize
.unwrap_or(0)
}
#[derive(Debug, Deserialize)]
struct DirectSubmitSkillManifest {
#[serde(default)]
tools: Vec<DirectSubmitSkillManifestTool>,
}
#[derive(Debug, Deserialize)]
struct DirectSubmitSkillManifestTool {
name: String,
#[serde(default)]
metadata: Option<DirectSubmitToolMetadata>,
}
#[derive(Debug, Deserialize)]
struct DirectSubmitToolMetadata {
#[serde(default)]
bootstrap_url: Option<String>,
#[serde(default)]
expected_domain: Option<String>,
}
fn normalize_bootstrap_metadata(
bootstrap_url: Option<String>,
expected_domain: Option<String>,
) -> Option<DirectSubmitBootstrapMetadata> {
let bootstrap_url = bootstrap_url
.as_deref()
.map(str::trim)
.filter(|value| !value.is_empty())?;
let parsed = Url::parse(bootstrap_url).ok()?;
if parsed.scheme().is_empty() || parsed.host_str().is_none() {
return None;
}
Some(DirectSubmitBootstrapMetadata {
bootstrap_url: parsed.to_string(),
expected_domain: expected_domain
.as_deref()
.map(str::trim)
.filter(|value| !value.is_empty())
.map(ToString::to_string),
})
}
fn parse_configured_tool_name(configured_tool: &str) -> Result<(&str, &str), PipeError> {
let (skill_name, tool_name) = configured_tool.split_once('.').ok_or_else(|| {
PipeError::Protocol(format!(


@@ -0,0 +1,202 @@
use std::fmt;
use std::fs;
use std::path::{Path, PathBuf};
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SceneKind {
ReportCollection,
Monitoring,
}
impl SceneKind {
pub fn from_str(s: &str) -> Option<Self> {
match s {
"report_collection" => Some(Self::ReportCollection),
"monitoring" => Some(Self::Monitoring),
_ => None,
}
}
pub fn as_str(&self) -> &'static str {
match self {
Self::ReportCollection => "report_collection",
Self::Monitoring => "monitoring",
}
}
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ToolKind {
BrowserScript,
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BootstrapAnalysis {
pub target_url: Option<String>,
pub expected_domain: Option<String>,
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SceneSourceAnalysis {
pub scene_kind: SceneKind,
pub tool_kind: ToolKind,
pub bootstrap: BootstrapAnalysis,
pub collection_entry_script: Option<String>,
pub source_dir: PathBuf,
}
#[derive(Debug)]
pub struct AnalyzeSceneError {
message: String,
}
impl AnalyzeSceneError {
fn new(message: impl Into<String>) -> Self {
Self {
message: message.into(),
}
}
}
impl fmt::Display for AnalyzeSceneError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.write_str(&self.message)
}
}
impl std::error::Error for AnalyzeSceneError {}
/// Analyze scene source with an optional scene kind hint.
///
/// The hint parameter takes priority over meta tags.
/// If neither hint nor meta tag is present, defaults to ReportCollection.
pub fn analyze_scene_source_with_hint(
source_dir: &Path,
scene_kind_hint: Option<SceneKind>,
) -> Result<SceneSourceAnalysis, AnalyzeSceneError> {
let index_path = source_dir.join("index.html");
let html = fs::read_to_string(&index_path).map_err(|err| {
AnalyzeSceneError::new(format!(
"failed to read scene source {}: {err}",
index_path.display()
))
})?;
// Determine scene kind: hint > meta > default
let scene_kind = if let Some(hint) = scene_kind_hint {
hint
} else {
let meta_kind = meta_content(&html, "sgclaw-scene-kind");
meta_kind
.as_deref()
.and_then(SceneKind::from_str)
.unwrap_or(SceneKind::ReportCollection)
};
// Tool kind is currently only browser_script
let tool_kind = meta_content(&html, "sgclaw-tool-kind");
if let Some(ref tk) = tool_kind {
if tk != "browser_script" {
return Err(AnalyzeSceneError::new(format!(
"unsupported tool kind: {}",
tk
)));
}
}
// Default tool kind to BrowserScript
let target_url = meta_content(&html, "sgclaw-target-url");
let expected_domain = meta_content(&html, "sgclaw-expected-domain");
let entry_script = meta_content(&html, "sgclaw-entry-script");
// Auto-extract expected_domain from external script URLs if not provided via meta tag
let expected_domain = expected_domain.or_else(|| extract_domain_from_external_scripts(&html));
// All fields are optional - generator will use placeholders for missing values
// This allows third-party scenes without meta tags to be processed
Ok(SceneSourceAnalysis {
scene_kind,
tool_kind: ToolKind::BrowserScript,
bootstrap: BootstrapAnalysis {
target_url,
expected_domain,
},
collection_entry_script: entry_script,
source_dir: source_dir.to_path_buf(),
})
}
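/// Analyze scene source (compatibility wrapper).
///
/// Equivalent to `analyze_scene_source_with_hint(source_dir, None)`: the scene kind is read
/// from the `sgclaw-scene-kind` meta tag and defaults to `ReportCollection` when the tag is
/// absent. For new code, use `analyze_scene_source_with_hint`.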
pub fn analyze_scene_source(source_dir: &Path) -> Result<SceneSourceAnalysis, AnalyzeSceneError> {
analyze_scene_source_with_hint(source_dir, None)
}
fn meta_content(html: &str, name: &str) -> Option<String> {
for tag in html
.split('<')
.filter(|fragment| fragment.starts_with("meta"))
{
let tag = tag.split('>').next().unwrap_or(tag);
if attr_value(tag, "name").as_deref() == Some(name) {
return attr_value(tag, "content").map(|value| value.trim().to_string());
}
}
None
}
fn attr_value(tag: &str, attr: &str) -> Option<String> {
let needle = format!("{attr}=");
let start = tag.find(&needle)? + needle.len();
let rest = &tag[start..];
let quote = rest.chars().next()?;
if quote != '"' && quote != '\'' {
return None;
}
let rest = &rest[quote.len_utf8()..];
let end = rest.find(quote)?;
Some(rest[..end].to_string())
}
/// Extract domain from the first external script URL found in HTML.
///
/// Looks for `<script src="http://...">` or `<script src="https://...">` tags
/// and extracts the host:port portion as expected_domain.
fn extract_domain_from_external_scripts(html: &str) -> Option<String> {
for tag in html
.split('<')
.filter(|fragment| fragment.starts_with("script"))
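{
// Only inspect the opening tag, so `src=` inside an inline script body is ignored.
let tag = tag.split('>').next().unwrap_or(tag);
if let Some(src) = attr_value(tag, "src") {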
// Check if it's an external URL (http:// or https://)
if src.starts_with("http://") || src.starts_with("https://") {
// Extract domain (host:port) from URL
if let Some(domain) = extract_domain_from_url(&src) {
return Some(domain);
}
}
}
}
None
}
/// Extract domain (host:port) from a URL string.
fn extract_domain_from_url(url: &str) -> Option<String> {
// Remove protocol prefix
let rest = url
.strip_prefix("http://")
.or_else(|| url.strip_prefix("https://"))?;
// Find the end of the domain (first '/', '?', or '#', or end of string)
let domain_end = rest.find(&['/', '?', '#'][..]).unwrap_or(rest.len());
let domain = &rest[..domain_end];
// Return non-empty domain
if domain.is_empty() {
None
} else {
Some(domain.to_string())
}
}
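The scene-kind rule documented on `analyze_scene_source_with_hint` (hint first, then the `sgclaw-scene-kind` meta tag, then the `ReportCollection` default) can be isolated into a few lines. The sketch below is illustrative only: `Kind`, `parse_kind`, and `resolve_kind` are hypothetical stand-ins for the real `SceneKind` type and the inline logic in the analyzer.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    ReportCollection,
    Monitoring,
}

/// Maps the meta tag value to a kind; unknown values yield `None`.
fn parse_kind(raw: &str) -> Option<Kind> {
    match raw {
        "report_collection" => Some(Kind::ReportCollection),
        "monitoring" => Some(Kind::Monitoring),
        _ => None,
    }
}

/// Precedence: explicit hint > parsable meta value > default.
fn resolve_kind(hint: Option<Kind>, meta: Option<&str>) -> Kind {
    hint.or_else(|| meta.and_then(parse_kind))
        .unwrap_or(Kind::ReportCollection)
}

fn main() {
    // The hint overrides a conflicting meta tag.
    println!("{:?}", resolve_kind(Some(Kind::Monitoring), Some("report_collection")));
    // An unrecognized meta value falls through to the default.
    println!("{:?}", resolve_kind(None, Some("dashboard")));
}
```

Note that an unrecognized meta value is silently treated as missing, which matches the `and_then(SceneKind::from_str)` chain in the analyzer rather than raising an error.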

File diff suppressed because it is too large


@@ -18,7 +18,7 @@ pub use protocol::{ClientMessage, ConfigUpdatePayload, ServiceMessage};
pub use server::{ServiceEventSink, ServiceSession};
pub(crate) mod browser_ws_client {
pub(crate) use super::server::{initial_request_url_for_submit_task, ServiceWsClient};
pub(crate) use super::server::ServiceWsClient;
}
#[derive(Debug, Clone, Default, PartialEq, Eq)]


@@ -4,6 +4,7 @@ use std::path::PathBuf;
use std::sync::{Arc, Mutex};
use std::time::Duration;
use reqwest::Url;
#[cfg(test)]
use reqwest::blocking::Client;
#[cfg(test)]
@@ -28,6 +29,7 @@ use crate::browser::bridge_transport::BridgeActionTransport;
use crate::browser::{BrowserBackend, BrowserCallbackBackend};
#[cfg(test)]
use crate::browser::BridgeBrowserBackend;
use crate::config::SgClawSettings;
use crate::pipe::{AgentMessage, BrowserMessage, PipeError, Transport};
#[cfg(test)]
use crate::pipe::Timing;
@@ -329,7 +331,21 @@ pub(crate) fn serve_client(
// Lazily create and cache the browser callback host. On first
// task it opens the helper page; subsequent tasks reuse it.
if cached_host.is_none() {
let bootstrap_url = initial_request_url_for_submit_task(&request);
let settings = context.load_sgclaw_settings()?.unwrap_or(
SgClawSettings::from_legacy_deepseek_fields(
"test-key".to_string(),
"https://example.invalid".to_string(),
"test-model".to_string(),
None,
)
.map_err(|err| PipeError::Protocol(err.to_string()))?,
);
let bootstrap_target = resolve_submit_bootstrap_target(
&request,
context.workspace_root(),
&settings,
);
let bootstrap_url = bootstrap_target.request_url;
match LiveBrowserCallbackHost::start_with_browser_ws_url(
browser_ws_url,
&bootstrap_url,
@@ -419,15 +435,82 @@ pub(crate) fn serve_client(
}
}
pub(crate) fn initial_request_url_for_submit_task(request: &crate::agent::SubmitTaskRequest) -> String {
request
#[derive(Debug, Clone, PartialEq, Eq)]
pub(crate) struct SubmitBootstrapTarget {
pub request_url: String,
pub expected_domain: Option<String>,
pub source: BootstrapTargetSource,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub(crate) enum BootstrapTargetSource {
PageContext,
DeterministicPlan,
SkillConfig,
Fallback,
}
pub(crate) fn resolve_submit_bootstrap_target(
request: &crate::agent::SubmitTaskRequest,
workspace_root: &Path,
settings: &SgClawSettings,
) -> SubmitBootstrapTarget {
if let Some(page_url) = request
.page_url
.as_deref()
.map(str::trim)
.filter(|value| !value.is_empty())
.map(ToString::to_string)
.or_else(|| derive_request_url_from_instruction(&request.instruction))
.unwrap_or_else(|| "about:blank".to_string())
{
return SubmitBootstrapTarget {
request_url: page_url.to_string(),
expected_domain: Url::parse(page_url)
.ok()
.and_then(|url| url.domain().map(ToString::to_string)),
source: BootstrapTargetSource::PageContext,
};
}
if let crate::compat::deterministic_submit::DeterministicSubmitDecision::Execute(plan) =
crate::compat::deterministic_submit::decide_deterministic_submit(
&request.instruction,
request.page_url.as_deref(),
request.page_title.as_deref(),
)
{
return SubmitBootstrapTarget {
request_url: plan.target_url.clone(),
expected_domain: Some(plan.expected_domain.clone()),
source: BootstrapTargetSource::DeterministicPlan,
};
}
if let Some(configured_tool) = settings
.direct_submit_skill
.as_deref()
.map(str::trim)
.filter(|value| !value.is_empty())
{
if let Ok(Some(metadata)) =
crate::compat::direct_skill_runtime::resolve_direct_submit_bootstrap_metadata(
configured_tool,
workspace_root,
settings,
)
{
return SubmitBootstrapTarget {
request_url: metadata.bootstrap_url,
expected_domain: metadata.expected_domain,
source: BootstrapTargetSource::SkillConfig,
};
}
}
SubmitBootstrapTarget {
request_url: derive_request_url_from_instruction(&request.instruction)
.unwrap_or_else(|| "about:blank".to_string()),
expected_domain: None,
source: BootstrapTargetSource::Fallback,
}
}
fn derive_request_url_from_instruction(instruction: &str) -> Option<String> {
@@ -457,12 +540,6 @@ fn derive_request_url_from_instruction(instruction: &str) -> Option<String> {
return Some("https://zhuanlan.zhihu.com".to_string());
}
// Transformer-area line loss
// TODO: temporary workaround; this should later come from the skill config or the deterministic_submit result
if instruction.contains("线损") || instruction.contains("lineloss") {
return Some("http://20.76.57.61:18080".to_string());
}
None
}
@@ -834,16 +911,77 @@ fn write_http_json_response(stream: &mut impl std::io::Write, status: &str, body
#[cfg(test)]
mod tests {
use super::*;
use std::fs;
use std::net::TcpListener;
use std::sync::{mpsc, Arc};
use std::thread;
use serde_json::json;
use uuid::Uuid;
use crate::agent::SubmitTaskRequest;
use crate::browser::BrowserBackend;
use crate::pipe::Action;
fn service_test_settings(
skills_dir: Option<PathBuf>,
direct_submit_skill: Option<&str>,
) -> SgClawSettings {
let mut settings = SgClawSettings::from_legacy_deepseek_fields(
"test-key".to_string(),
"https://example.invalid".to_string(),
"test-model".to_string(),
skills_dir,
)
.expect("settings");
settings.direct_submit_skill = direct_submit_skill.map(ToString::to_string);
settings
}
fn staged_skill_staging_dir() -> PathBuf {
PathBuf::from(env!("CARGO_MANIFEST_DIR"))
.join("../claw/claw/skills/skill_staging/skills")
.canonicalize()
.expect("staged skills dir")
}
fn temp_direct_submit_skill_root(bootstrap_url: &str) -> PathBuf {
let root = std::env::temp_dir().join(format!(
"sgclaw-bootstrap-target-skill-root-{}",
Uuid::new_v4()
));
let skill_dir = root.join("fault-details-report");
let script_dir = skill_dir.join("scripts");
fs::create_dir_all(&script_dir).expect("create script dir");
fs::write(
skill_dir.join("SKILL.toml"),
format!(
r#"[skill]
name = "fault-details-report"
description = "Collect 95598 fault detail data via browser eval."
version = "0.1.0"
[[tools]]
name = "collect_fault_details"
description = "Collect structured fault detail rows for a specific period."
kind = "browser_script"
command = "scripts/collect_fault_details.js"
[tools.metadata]
bootstrap_url = "{bootstrap_url}"
expected_domain = "95598.sgcc.com.cn"
"#
),
)
.expect("write skill manifest");
fs::write(
script_dir.join("collect_fault_details.js"),
"return { ok: true };\n",
)
.expect("write skill script");
root
}
fn service_test_policy() -> MacPolicy {
MacPolicy::from_json_str(
r#"{
@@ -859,56 +997,165 @@ mod tests {
}
#[test]
fn initial_request_url_prefers_submit_task_page_url() {
fn page_context_bootstrap_target_wins_over_deterministic_and_skill_fallback() {
let request = SubmitTaskRequest {
instruction: "打开知乎热榜".to_string(),
page_url: Some(" https://www.zhihu.com/ ".to_string()),
instruction: "兰州公司 台区线损大数据 月累计线损率统计分析。。。".to_string(),
page_url: Some(" https://already-open.example.com/page ".to_string()),
..SubmitTaskRequest::default()
};
let settings = SgClawSettings::from_legacy_deepseek_fields(
"test-key".to_string(),
"https://example.invalid".to_string(),
"test-model".to_string(),
None,
)
.expect("settings");
assert_eq!(
initial_request_url_for_submit_task(&request),
"https://www.zhihu.com/"
);
let target = resolve_submit_bootstrap_target(&request, Path::new("."), &settings);
assert_eq!(target.request_url, "https://already-open.example.com/page");
assert_eq!(target.expected_domain.as_deref(), Some("already-open.example.com"));
assert_eq!(target.source, BootstrapTargetSource::PageContext);
}
#[test]
fn initial_request_url_falls_back_to_zhihu_origin_for_hotlist_routes() {
fn whitespace_page_url_does_not_short_circuit_bootstrap_fallback() {
let request = SubmitTaskRequest {
instruction: "打开知乎热榜获取前10条数据并导出 Excel".to_string(),
page_url: Some(" ".to_string()),
..SubmitTaskRequest::default()
};
let settings = SgClawSettings::from_legacy_deepseek_fields(
"test-key".to_string(),
"https://example.invalid".to_string(),
"test-model".to_string(),
None,
)
.expect("settings");
assert_eq!(
initial_request_url_for_submit_task(&request),
"https://www.zhihu.com"
);
let target = resolve_submit_bootstrap_target(&request, Path::new("."), &settings);
assert_eq!(target.request_url, "https://www.zhihu.com");
assert_eq!(target.source, BootstrapTargetSource::Fallback);
}
#[test]
fn initial_request_url_falls_back_to_zhihu_origin_for_generated_article_publish_routes() {
let request = SubmitTaskRequest {
instruction: "在知乎自动发表一篇名称为人工智能技能大全".to_string(),
..SubmitTaskRequest::default()
};
assert_eq!(
initial_request_url_for_submit_task(&request),
"https://www.zhihu.com"
);
}
#[test]
fn initial_request_url_falls_back_to_lineloss_origin_for_lineloss_instructions() {
fn deterministic_bootstrap_target_uses_plan_target_url() {
let request = SubmitTaskRequest {
instruction: "兰州公司 台区线损大数据 月累计线损率统计分析。。。".to_string(),
..SubmitTaskRequest::default()
};
let settings = service_test_settings(None, None);
let target = resolve_submit_bootstrap_target(&request, Path::new("."), &settings);
assert_eq!(
initial_request_url_for_submit_task(&request),
"http://20.76.57.61:18080"
target.request_url,
"http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor"
);
assert_eq!(target.expected_domain.as_deref(), Some("20.76.57.61"));
assert_eq!(target.source, BootstrapTargetSource::DeterministicPlan);
}
#[test]
fn skill_metadata_bootstrap_url_is_used_when_no_page_context_or_plan_exists() {
let request = SubmitTaskRequest {
instruction: "请采集 2026-03 的故障明细并返回结果".to_string(),
..SubmitTaskRequest::default()
};
let settings = service_test_settings(
Some(staged_skill_staging_dir()),
Some("fault-details-report.collect_fault_details"),
);
let target = resolve_submit_bootstrap_target(&request, Path::new("."), &settings);
assert_eq!(target.request_url, "https://95598.sgcc.com.cn/");
assert_eq!(target.expected_domain.as_deref(), Some("95598.sgcc.com.cn"));
assert_eq!(target.source, BootstrapTargetSource::SkillConfig);
}
#[test]
fn malformed_skill_bootstrap_url_falls_back_to_about_blank() {
let request = SubmitTaskRequest {
instruction: "请采集 2026-03 的故障明细并返回结果".to_string(),
..SubmitTaskRequest::default()
};
let skills_dir = temp_direct_submit_skill_root("not-a-valid-absolute-url");
let settings = service_test_settings(
Some(skills_dir.clone()),
Some("fault-details-report.collect_fault_details"),
);
let target = resolve_submit_bootstrap_target(&request, Path::new("."), &settings);
assert_eq!(target.request_url, "about:blank");
assert_eq!(target.expected_domain, None);
assert_eq!(target.source, BootstrapTargetSource::Fallback);
let _ = fs::remove_dir_all(skills_dir);
}
#[test]
fn bootstrap_target_precedence_matrix_covers_page_context_deterministic_skill_and_fallback() {
let page_request = SubmitTaskRequest {
instruction: "兰州公司 台区线损大数据 月累计线损率统计分析。。。".to_string(),
page_url: Some(" https://already-open.example.com/page ".to_string()),
..SubmitTaskRequest::default()
};
let page_settings = service_test_settings(
Some(staged_skill_staging_dir()),
Some("fault-details-report.collect_fault_details"),
);
let page_target =
resolve_submit_bootstrap_target(&page_request, Path::new("."), &page_settings);
assert_eq!(page_target.request_url, "https://already-open.example.com/page");
assert_eq!(page_target.source, BootstrapTargetSource::PageContext);
let deterministic_request = SubmitTaskRequest {
instruction: "兰州公司 台区线损大数据 月累计线损率统计分析。。。".to_string(),
..SubmitTaskRequest::default()
};
let deterministic_settings = service_test_settings(
Some(staged_skill_staging_dir()),
Some("fault-details-report.collect_fault_details"),
);
let deterministic_target = resolve_submit_bootstrap_target(
&deterministic_request,
Path::new("."),
&deterministic_settings,
);
assert_eq!(
deterministic_target.request_url,
"http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor"
);
assert_eq!(
deterministic_target.source,
BootstrapTargetSource::DeterministicPlan
);
let skill_request = SubmitTaskRequest {
instruction: "请采集 2026-03 的故障明细并返回结果".to_string(),
..SubmitTaskRequest::default()
};
let skill_settings = service_test_settings(
Some(staged_skill_staging_dir()),
Some("fault-details-report.collect_fault_details"),
);
let skill_target =
resolve_submit_bootstrap_target(&skill_request, Path::new("."), &skill_settings);
assert_eq!(skill_target.request_url, "https://95598.sgcc.com.cn/");
assert_eq!(skill_target.source, BootstrapTargetSource::SkillConfig);
let fallback_request = SubmitTaskRequest {
instruction: "完全不相关的普通问题".to_string(),
..SubmitTaskRequest::default()
};
let fallback_settings = service_test_settings(None, None);
let fallback_target =
resolve_submit_bootstrap_target(&fallback_request, Path::new("."), &fallback_settings);
assert_eq!(fallback_target.request_url, "about:blank");
assert_eq!(fallback_target.source, BootstrapTargetSource::Fallback);
}
#[test]
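The tests above exercise a fixed precedence for the bootstrap target: page context, then deterministic plan, then skill configuration, then fallback. A reduced sketch of that ordering is given here under simplifying assumptions: each source is represented as an already-resolved optional URL, and the fallback is always `about:blank` (the real `resolve_submit_bootstrap_target` also tries to derive a URL from the instruction text). `Source`, `non_blank`, and `pick_target` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Source {
    PageContext,
    DeterministicPlan,
    SkillConfig,
    Fallback,
}

/// Trims the value and treats empty or whitespace-only input as absent.
fn non_blank(value: Option<&str>) -> Option<String> {
    value
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(ToString::to_string)
}

/// Returns the first available URL in precedence order, with its source.
fn pick_target(
    page: Option<&str>,
    plan: Option<&str>,
    skill: Option<&str>,
) -> (String, Source) {
    if let Some(url) = non_blank(page) {
        return (url, Source::PageContext);
    }
    if let Some(url) = non_blank(plan) {
        return (url, Source::DeterministicPlan);
    }
    if let Some(url) = non_blank(skill) {
        return (url, Source::SkillConfig);
    }
    ("about:blank".to_string(), Source::Fallback)
}

fn main() {
    // A whitespace-only page URL does not short-circuit the chain.
    let (url, source) = pick_target(Some("   "), None, Some("https://95598.sgcc.com.cn/"));
    println!("{url} via {source:?}");
}
```

Recording the source alongside the URL, as the `source` field does in `SubmitBootstrapTarget`, lets tests assert which branch was taken instead of inferring it from the URL alone.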


@@ -0,0 +1,11 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8" />
<title>测试外部脚本提取</title>
<script src="http://25.215.213.128:18080/a_js/YPTAPI.js"></script>
</head>
<body>
<div id="app">测试页面</div>
</body>
</html>


@@ -0,0 +1,14 @@
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>设备监测状态</title>
<!-- Note: there is no sgclaw-scene-kind meta tag; this fixture tests the hint parameter -->
</head>
<body>
<main>
<h1>设备监测状态</h1>
<div id="monitor-status">running</div>
</main>
</body>
</html>


@@ -0,0 +1,15 @@
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>普通门户页面</title>
<!-- No sgclaw-scene-kind meta tag - user must specify via hint -->
<meta name="sgclaw-tool-kind" content="browser_script">
<meta name="sgclaw-target-url" content="http://example.com/dashboard">
<meta name="sgclaw-expected-domain" content="example.com">
<meta name="sgclaw-entry-script" content="js/monitor.js">
</head>
<body>
<button>打开菜单</button>
</body>
</html>


@@ -0,0 +1,30 @@
use std::fs;
use std::path::PathBuf;
#[test]
fn scene_generator_html_exists_and_has_required_elements() {
let manifest_dir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
let html_path = manifest_dir
.join("frontend")
.join("scene-generator")
.join("sg_scene_generator.html");
let source = fs::read_to_string(&html_path)
.unwrap_or_else(|err| panic!("HTML file not found at {:?}: {}", html_path, err));
assert!(source.contains("场景 Skill 生成器"), "missing title");
assert!(source.contains("sourceDir"), "missing sourceDir input");
assert!(source.contains("sceneId"), "missing sceneId input");
assert!(source.contains("sceneName"), "missing sceneName input");
assert!(source.contains("/analyze"), "missing /analyze endpoint");
assert!(source.contains("/generate"), "missing /generate endpoint");
assert!(
source.contains("fetch("),
"missing fetch for API calls"
);
assert!(
source.contains("127.0.0.1") || source.contains("localhost"),
"should reference localhost server"
);
}


@@ -0,0 +1,45 @@
const assert = require("assert");
const {
  buildAnalyzePrompt,
  extractJsonFromResponse,
} = require("../frontend/scene-generator/llm-client");

function testBuildAnalyzePromptIncludesFileContents() {
  const dirContents = {
    "scene.toml": '[scene]\nid = "test-scene"',
    scripts: { "collect_test.js": "async function main() {}" },
    tree: "├── scene.toml\n└── collect_test.js",
  };
  const prompt = buildAnalyzePrompt("D:/test/scenario", dirContents);
  assert.ok(prompt.includes("scene.toml"), "should include scene.toml");
  assert.ok(prompt.includes("collect_test.js"), "should include script name");
  assert.ok(prompt.includes("D:/test/scenario"), "should include sourceDir");
  console.log("PASS: testBuildAnalyzePromptIncludesFileContents");
}

function testExtractJsonFromResponse() {
  const withMarkdown =
    '```json\n{"sceneId": "test", "sceneName": "测试"}\n```';
  const plain = '{"sceneId": "test", "sceneName": "测试"}';
  const withPrefix =
    'Here is the result:\n{"sceneId": "test", "sceneName": "测试"}';
  assert.deepStrictEqual(extractJsonFromResponse(withMarkdown), {
    sceneId: "test",
    sceneName: "测试",
  });
  assert.deepStrictEqual(extractJsonFromResponse(plain), {
    sceneId: "test",
    sceneName: "测试",
  });
  assert.deepStrictEqual(extractJsonFromResponse(withPrefix), {
    sceneId: "test",
    sceneName: "测试",
  });
  console.log("PASS: testExtractJsonFromResponse");
}

testBuildAnalyzePromptIncludesFileContents();
testExtractJsonFromResponse();


@@ -0,0 +1,39 @@
use std::fs;
use std::path::PathBuf;

#[test]
fn scene_generator_server_files_exist() {
    let manifest_dir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
    let server_js = manifest_dir
        .join("frontend")
        .join("scene-generator")
        .join("server.js");
    let config_loader = manifest_dir
        .join("frontend")
        .join("scene-generator")
        .join("config-loader.js");
    assert!(
        server_js.exists(),
        "server.js not found at {:?}",
        server_js
    );
    assert!(
        config_loader.exists(),
        "config-loader.js not found at {:?}",
        config_loader
    );
}

#[test]
fn sgclaw_config_is_readable() {
    let manifest_dir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
    let config_path = manifest_dir.join("sgclaw_config.json");
    let content = fs::read_to_string(&config_path)
        .unwrap_or_else(|err| panic!("sgclaw_config.json not found: {}", err));
    let parsed: serde_json::Value =
        serde_json::from_str(&content).expect("should be valid JSON");
    assert!(parsed.get("apiKey").is_some(), "missing apiKey");
    assert!(parsed.get("baseUrl").is_some(), "missing baseUrl");
    assert!(parsed.get("model").is_some(), "missing model");
}


@@ -0,0 +1,187 @@
use std::fs;
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};
use sgclaw::compat::scene_platform::registry::load_scene_registry;
use sgclaw::generated_scene::analyzer::{analyze_scene_source, analyze_scene_source_with_hint, SceneKind, ToolKind};
use sgclaw::generated_scene::generator::{generate_scene_package, GenerateSceneRequest};
#[test]
fn analyzer_classifies_supported_report_collection_source() {
    let analysis = analyze_scene_source(Path::new(
        "tests/fixtures/generated_scene/report_collection",
    ))
    .unwrap();
    assert_eq!(analysis.scene_kind, SceneKind::ReportCollection);
    assert_eq!(analysis.tool_kind, ToolKind::BrowserScript);
    assert_eq!(
        analysis.bootstrap.target_url.as_deref(),
        Some("http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor")
    );
    assert_eq!(
        analysis.bootstrap.expected_domain.as_deref(),
        Some("20.76.57.61")
    );
    assert_eq!(
        analysis.collection_entry_script.as_deref(),
        Some("js/report.js")
    );
}
#[test]
fn generator_writes_registration_ready_package_with_scene_toml() {
    let output_root = temp_workspace("sgclaw-scene-generator");
    generate_scene_package(GenerateSceneRequest {
        source_dir: PathBuf::from("tests/fixtures/generated_scene/report_collection"),
        scene_id: "sample-report-scene".to_string(),
        scene_name: "示例报表场景".to_string(),
        scene_kind: None,
        target_url: None,
        output_root: output_root.clone(),
        lessons_path: PathBuf::from("docs/superpowers/references/tq-lineloss-lessons-learned.toml"),
    })
    .unwrap();
    let skill_root = output_root.join("skills/sample-report-scene");
    assert!(skill_root.join("SKILL.toml").exists());
    assert!(skill_root.join("SKILL.md").exists());
    assert!(skill_root.join("scene.toml").exists());
    assert!(skill_root
        .join("scripts/collect_sample_report_scene.js")
        .exists());
    assert!(skill_root
        .join("scripts/collect_sample_report_scene.test.js")
        .exists());
    assert!(skill_root.join("references/generation-lessons.md").exists());
    assert!(skill_root.join("references/org-dictionary.json").exists());
    let generated_script =
        fs::read_to_string(skill_root.join("scripts/collect_sample_report_scene.js")).unwrap();
    assert!(generated_script.contains("return buildBrowserEntrypointResult(args);"));
    let generated_manifest = fs::read_to_string(skill_root.join("scene.toml")).unwrap();
    assert!(generated_manifest.contains("resolver = \"dictionary_entity\""));
    assert!(generated_manifest.contains("dictionary_ref = \"references/org-dictionary.json\""));
    assert!(generated_manifest.contains("required = true"));
    let registry = load_scene_registry(&output_root.join("skills")).unwrap();
    let entry = registry
        .iter()
        .find(|entry| entry.manifest.scene.id == "sample-report-scene")
        .expect("generated package should be registration-ready");
    assert_eq!(entry.manifest.scene.kind, "browser_script");
    assert_eq!(entry.manifest.scene.category, "report_collection");
    assert_eq!(entry.manifest.scene.tool, "collect_sample_report_scene");
    assert_eq!(entry.manifest.bootstrap.expected_domain, "20.76.57.61");
}
#[test]
fn analyzer_defaults_to_report_collection_when_no_scene_kind_meta() {
    // non_report fixture has no scene-kind meta tag - should default to ReportCollection
    let analysis =
        analyze_scene_source(Path::new("tests/fixtures/generated_scene/non_report")).unwrap();
    // With the new delegation, it defaults to ReportCollection instead of rejecting
    assert_eq!(analysis.scene_kind, SceneKind::ReportCollection);
    assert_eq!(analysis.tool_kind, ToolKind::BrowserScript);
}

fn temp_workspace(prefix: &str) -> PathBuf {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_nanos();
    let path = std::env::temp_dir().join(format!("{prefix}-{nanos}"));
    fs::create_dir_all(&path).unwrap();
    path
}
#[test]
fn analyzer_accepts_missing_meta_with_scene_kind_hint() {
    // non_report fixture has no scene-kind meta tag
    let analysis = analyze_scene_source_with_hint(
        Path::new("tests/fixtures/generated_scene/non_report"),
        Some(SceneKind::ReportCollection),
    )
    .unwrap();
    // should succeed, using hint parameter as type
    assert_eq!(analysis.scene_kind, SceneKind::ReportCollection);
}

#[test]
fn analyzer_uses_hint_when_meta_missing() {
    let analysis = analyze_scene_source_with_hint(
        Path::new("tests/fixtures/generated_scene/non_report"),
        Some(SceneKind::Monitoring),
    )
    .unwrap();
    assert_eq!(analysis.scene_kind, SceneKind::Monitoring);
}

#[test]
fn analyzer_uses_meta_when_present_and_no_hint() {
    // report_collection fixture has correct meta tag
    let analysis = analyze_scene_source_with_hint(
        Path::new("tests/fixtures/generated_scene/report_collection"),
        None,
    )
    .unwrap();
    assert_eq!(analysis.scene_kind, SceneKind::ReportCollection);
}

#[test]
fn analyzer_hint_overrides_meta() {
    // user choice takes priority over meta tag
    let analysis = analyze_scene_source_with_hint(
        Path::new("tests/fixtures/generated_scene/report_collection"),
        Some(SceneKind::Monitoring),
    )
    .unwrap();
    assert_eq!(analysis.scene_kind, SceneKind::Monitoring);
}
#[test]
fn generator_emits_monitoring_template() {
    let output_root = temp_workspace("sgclaw-monitoring-generator");
    generate_scene_package(GenerateSceneRequest {
        source_dir: PathBuf::from("tests/fixtures/generated_scene/monitoring"),
        scene_id: "sample-monitor-scene".to_string(),
        scene_name: "示例监测场景".to_string(),
        scene_kind: Some(SceneKind::Monitoring),
        target_url: None,
        output_root: output_root.clone(),
        lessons_path: PathBuf::from("docs/superpowers/references/tq-lineloss-lessons-learned.toml"),
    })
    .unwrap();
    let skill_root = output_root.join("skills/sample-monitor-scene");
    assert!(skill_root.join("SKILL.toml").exists());
    assert!(skill_root.join("scene.toml").exists());
    let generated_manifest = fs::read_to_string(skill_root.join("scene.toml")).unwrap();
    assert!(generated_manifest.contains("category = \"monitoring\""));
    // Monitoring scenes should not have an org/period resolver
    assert!(!generated_manifest.contains("resolver = \"dictionary_entity\""));
}
#[test]
fn analyzer_extracts_domain_from_external_script() {
    // external_script fixture has no expected_domain meta tag,
    // but has an external script URL that should be auto-extracted
    let analysis = analyze_scene_source(Path::new(
        "tests/fixtures/generated_scene/external_script",
    ))
    .unwrap();
    assert_eq!(analysis.scene_kind, SceneKind::ReportCollection);
    // Should auto-extract "25.215.213.128:18080" from script src
    assert_eq!(
        analysis.bootstrap.expected_domain.as_deref(),
        Some("25.215.213.128:18080")
    );
}
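
The analyzer tests above imply a precedence for choosing the scene kind: an explicit hint wins, the meta tag is used when no hint is given, and `ReportCollection` is the default when neither is present. The following is a minimal sketch of that rule under stated assumptions: `resolve_scene_kind` is a hypothetical helper and not a function in `sgclaw`, and the local `SceneKind` enum mirrors only the two variants the tests mention.

```rust
/// Local stand-in for the crate's `SceneKind`; only the two variants
/// named in the tests are modelled here (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SceneKind {
    ReportCollection,
    Monitoring,
}

/// Hypothetical helper: pick the effective scene kind.
/// Precedence, as the tests suggest: hint, then meta tag, then the default.
pub fn resolve_scene_kind(hint: Option<SceneKind>, meta: Option<SceneKind>) -> SceneKind {
    hint.or(meta).unwrap_or(SceneKind::ReportCollection)
}
```

For instance, `resolve_scene_kind(Some(SceneKind::Monitoring), Some(SceneKind::ReportCollection))` yields `Monitoring`, which matches the case `analyzer_hint_overrides_meta` checks.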


@@ -144,12 +144,34 @@ fn start_callback_host_hotlist_browser_server(
     .send(CallbackHostBrowserEvent::BrowserFrame(first_action.clone()))
     .unwrap();
-let Some(values) = first_action.as_array() else {
+let second_action = match websocket.read().unwrap() {
+    Message::Text(text) => serde_json::from_str::<Value>(&text).unwrap(),
+    other => panic!("expected second browser action frame, got {other:?}"),
+};
+event_tx
+    .send(CallbackHostBrowserEvent::BrowserFrame(second_action.clone()))
+    .unwrap();
+let Some(close_values) = first_action.as_array() else {
     websocket.close(None).ok();
     return;
 };
+let is_helper_close = close_values.len() >= 3
+    && close_values[1] == json!("sgHideBrowerserClosePage")
+    && close_values[2]
+        .as_str()
+        .is_some_and(|url| url.ends_with("/sgclaw/browser-helper.html"));
+if !is_helper_close {
+    websocket.close(None).ok();
+    return;
+}
+let Some(values) = second_action.as_array() else {
+    websocket.close(None).ok();
+    return;
+};
 let is_helper_open = values.len() >= 3
-    && values[1] == json!("sgBrowerserOpenPage")
+    && values[1] == json!("sgHideBrowerserOpenPage")
     && values[2]
         .as_str()
         .is_some_and(|url| url.ends_with("/sgclaw/browser-helper.html"));
@@ -825,6 +847,7 @@ fn client_to_service_regression_routes_zhihu_through_callback_host_without_inval
 browser_server.join().unwrap();
 let register = event_rx.recv_timeout(Duration::from_secs(2)).unwrap();
+let bootstrap_close = event_rx.recv_timeout(Duration::from_secs(2)).unwrap();
 let bootstrap = event_rx.recv_timeout(Duration::from_secs(2)).unwrap();
 let pre_ready = event_rx.recv_timeout(Duration::from_secs(2)).unwrap();
 let open_page = event_rx.recv_timeout(Duration::from_secs(4)).unwrap();
@@ -873,12 +896,22 @@ fn client_to_service_regression_routes_zhihu_through_callback_host_without_inval
 };
 assert_eq!(register, json!({ "type": "register", "role": "web" }));
+let bootstrap_close = match bootstrap_close {
+    CallbackHostBrowserEvent::BrowserFrame(value) => value,
+    other => panic!("expected helper close frame, got {other:?}"),
+};
+assert_eq!(bootstrap_close[0], json!("https://www.zhihu.com"));
+assert_eq!(bootstrap_close[1], json!("sgHideBrowerserClosePage"));
+assert!(bootstrap_close[2]
+    .as_str()
+    .is_some_and(|url| url.ends_with("/sgclaw/browser-helper.html")));
 let bootstrap = match bootstrap {
     CallbackHostBrowserEvent::BrowserFrame(value) => value,
     other => panic!("expected helper bootstrap frame, got {other:?}"),
 };
 assert_eq!(bootstrap[0], json!("https://www.zhihu.com"));
-assert_eq!(bootstrap[1], json!("sgBrowerserOpenPage"));
+assert_eq!(bootstrap[1], json!("sgHideBrowerserOpenPage"));
 assert!(bootstrap[2]
     .as_str()
     .is_some_and(|url| url.ends_with("/sgclaw/browser-helper.html")));


@@ -194,12 +194,34 @@ fn start_callback_host_hotlist_browser_server(
     .send(CallbackHostBrowserEvent::BrowserFrame(first_action.clone()))
     .unwrap();
-let Some(values) = first_action.as_array() else {
+let second_action = match websocket.read().unwrap() {
+    Message::Text(text) => serde_json::from_str::<Value>(&text).unwrap(),
+    other => panic!("expected second browser action frame, got {other:?}"),
+};
+event_tx
+    .send(CallbackHostBrowserEvent::BrowserFrame(second_action.clone()))
+    .unwrap();
+let Some(close_values) = first_action.as_array() else {
     websocket.close(None).ok();
     return;
 };
+let is_helper_close = close_values.len() >= 3
+    && close_values[1] == json!("sgHideBrowerserClosePage")
+    && close_values[2]
+        .as_str()
+        .is_some_and(|url| url.ends_with("/sgclaw/browser-helper.html"));
+if !is_helper_close {
+    websocket.close(None).ok();
+    return;
+}
+let Some(values) = second_action.as_array() else {
+    websocket.close(None).ok();
+    return;
+};
 let is_helper_open = values.len() >= 3
-    && values[1] == json!("sgBrowerserOpenPage")
+    && values[1] == json!("sgHideBrowerserOpenPage")
     && values[2]
         .as_str()
         .is_some_and(|url| url.ends_with("/sgclaw/browser-helper.html"));
@@ -873,6 +895,7 @@ fn service_binary_submit_flow_routes_zhihu_through_callback_host() {
 browser_server.join().unwrap();
 let register = event_rx.recv_timeout(Duration::from_secs(2)).unwrap();
+let bootstrap_close = event_rx.recv_timeout(Duration::from_secs(2)).unwrap();
 let bootstrap = event_rx.recv_timeout(Duration::from_secs(2)).unwrap();
 let pre_ready = event_rx.recv_timeout(Duration::from_secs(2)).unwrap();
 let open_page = event_rx.recv_timeout(Duration::from_secs(4)).unwrap();
@@ -902,12 +925,22 @@ fn service_binary_submit_flow_routes_zhihu_through_callback_host() {
 };
 assert_eq!(register, json!({ "type": "register", "role": "web" }));
+let bootstrap_close = match bootstrap_close {
+    CallbackHostBrowserEvent::BrowserFrame(value) => value,
+    other => panic!("expected helper close frame, got {other:?}"),
+};
+assert_eq!(bootstrap_close[0], json!("https://www.zhihu.com"));
+assert_eq!(bootstrap_close[1], json!("sgHideBrowerserClosePage"));
+assert!(bootstrap_close[2]
+    .as_str()
+    .is_some_and(|url| url.ends_with("/sgclaw/browser-helper.html")));
 let bootstrap = match bootstrap {
     CallbackHostBrowserEvent::BrowserFrame(value) => value,
     other => panic!("expected helper bootstrap frame, got {other:?}"),
 };
 assert_eq!(bootstrap[0], json!("https://www.zhihu.com"));
-assert_eq!(bootstrap[1], json!("sgBrowerserOpenPage"));
+assert_eq!(bootstrap[1], json!("sgHideBrowerserOpenPage"));
 assert!(bootstrap[2]
     .as_str()
     .is_some_and(|url| url.ends_with("/sgclaw/browser-helper.html")));
@@ -1046,6 +1079,7 @@ fn service_binary_submit_flow_uses_callback_host_command_semantics_for_zhihu() {
 browser_server.join().unwrap();
 let register = event_rx.recv_timeout(Duration::from_secs(2)).unwrap();
+let bootstrap_close = event_rx.recv_timeout(Duration::from_secs(2)).unwrap();
 let bootstrap = event_rx.recv_timeout(Duration::from_secs(2)).unwrap();
 let pre_ready = event_rx.recv_timeout(Duration::from_secs(2)).unwrap();
 let open_page = event_rx.recv_timeout(Duration::from_secs(4)).unwrap();
@@ -1075,12 +1109,22 @@ fn service_binary_submit_flow_uses_callback_host_command_semantics_for_zhihu() {
 };
 assert_eq!(register, json!({ "type": "register", "role": "web" }));
+let bootstrap_close = match bootstrap_close {
+    CallbackHostBrowserEvent::BrowserFrame(value) => value,
+    other => panic!("expected helper close frame, got {other:?}"),
+};
+assert_eq!(bootstrap_close[0], json!("https://www.zhihu.com"));
+assert_eq!(bootstrap_close[1], json!("sgHideBrowerserClosePage"));
+assert!(bootstrap_close[2]
+    .as_str()
+    .is_some_and(|url| url.ends_with("/sgclaw/browser-helper.html")));
 let bootstrap = match bootstrap {
     CallbackHostBrowserEvent::BrowserFrame(value) => value,
     other => panic!("expected helper bootstrap frame, got {other:?}"),
 };
 assert_eq!(bootstrap[0], json!("https://www.zhihu.com"));
-assert_eq!(bootstrap[1], json!("sgBrowerserOpenPage"));
+assert_eq!(bootstrap[1], json!("sgHideBrowerserOpenPage"));
 assert!(bootstrap[2]
     .as_str()
     .is_some_and(|url| url.ends_with("/sgclaw/browser-helper.html")));
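
The hunks above repeat one check in several places: a browser frame is a three-element array `[origin, action, url]`, and the helper bootstrap is now expected as a close frame followed by an open frame, both pointing at `/sgclaw/browser-helper.html`. The following is a minimal sketch of that check factored into helpers. It assumes frames are already decoded into string slices (the tests work on `serde_json::Value`), and `is_helper_frame` and `is_helper_bootstrap` are hypothetical names that do not appear in the commit.

```rust
/// Hypothetical helper: does `frame` carry `action` for the helper page?
/// `frame` stands in for a decoded `[origin, action, url]` browser frame.
pub fn is_helper_frame(frame: &[&str], action: &str) -> bool {
    frame.len() >= 3
        && frame[1] == action
        && frame[2].ends_with("/sgclaw/browser-helper.html")
}

/// Hypothetical helper: the bootstrap sequence the updated tests expect,
/// a hidden close of the helper page followed by a hidden open.
pub fn is_helper_bootstrap(close: &[&str], open: &[&str]) -> bool {
    is_helper_frame(close, "sgHideBrowerserClosePage")
        && is_helper_frame(open, "sgHideBrowerserOpenPage")
}
```

A shared predicate like this would let the mock browser server and the assertions agree on a single definition of a helper frame, rather than repeating the length, action, and suffix checks in each test.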