# LLM-Driven Skill Generation Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Enhance `sg_scene_generate` to generate complete, runnable skill packages instead of skeleton code, by deeply analyzing scene source code (`index.html`) with an LLM to extract API endpoints, static params, column definitions, and business logic.

**Architecture:**

- LLM reads `index.html` from the scene directory
- Extracts a complete SceneInfo (sceneId, sceneName, apiEndpoints, staticParams, columnDefs, businessLogic)
- Web UI shows a preview for user confirmation
- Rust CLI receives the extracted info via the `--scene-info-json` parameter
- Rust template renders a complete browser_script with business logic

**Tech Stack:** JavaScript (Node.js), Rust, HTML/CSS, OpenAI-compatible LLM API

---

## Scope Check

This plan covers the enhancement of the existing scene skill generator to support LLM-driven deep extraction. It builds upon:

- Existing `frontend/scene-generator/` files (server.js, llm-client.js, generator-runner.js)
- Existing `src/generated_scene/generator.rs` and `src/bin/sg_scene_generate.rs`

---

## File Map

### Modified Files

| File | Changes |
|------|---------|
| `frontend/scene-generator/llm-client.js` | Add deep extraction prompt + `analyzeSceneDeep()` |
| `frontend/scene-generator/generator-runner.js` | Add `index.html` reading in `readDirectory()` |
| `frontend/scene-generator/server.js` | New `/analyze-deep` route, pass sceneInfo to generator |
| `src/bin/sg_scene_generate.rs` | Add `--scene-info-json` CLI parameter |
| `src/generated_scene/generator.rs` | Add SceneInfo struct, enhanced template rendering |
| `frontend/scene-generator/sg_scene_generator.html` | Add extraction preview UI |

### Reference Files (not modified)

| File | Purpose |
|------|---------|
| `docs/superpowers/specs/2026-04-17-llm-driven-skill-generation-design.md` | Design spec |
| `claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_tq_lineloss_report.js` | Reference complete script (433 lines) |
| `claw/skills/skill_staging/skills/marketing-zero-consumer-report/scripts/collect_marketing_zero_consumer_report.js` | Reference skeleton (51 lines) |

---

## Scope Guardrails

- Do not change existing API contracts (backward compatibility)
- Do not require `index.html` to exist (fall back to current behavior)
- Do not break the existing `--scene-id`, `--scene-name` CLI arguments
- Do not add npm dependencies (only Node.js built-in modules)

---

### Task 1: Enhance llm-client.js with Deep Extraction

**Files:**

- Modify: `frontend/scene-generator/llm-client.js`

**Goal:** Add a new function `analyzeSceneDeep()` that reads index.html content and extracts a complete SceneInfo including API endpoints, static params, column definitions, and business logic.
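To make the extraction target concrete before writing the prompt, here is an illustrative SceneInfo object of the shape the deep-analysis prompt asks the LLM to return. Every concrete value below is hypothetical; only the field names and types are what the later tasks depend on.

```javascript
// Illustrative SceneInfo matching the output format requested by
// DEEP_SYSTEM_PROMPT (Step 1). All concrete values are made up.
const exampleSceneInfo = {
  sceneId: "demo-lineloss-report",
  sceneName: "示例线损报表",
  sceneKind: "report_collection",
  sourceSystem: "示例营销系统",
  expectedDomain: "example.internal",
  targetUrl: null,
  apiEndpoints: [
    { name: "primary", url: "/api/report/query", method: "POST", description: "查询报表数据" },
  ],
  staticParams: { reportType: "monthly" },
  columnDefs: [
    ["org_label", "供电单位"],
    ["value", "采集值"],
  ],
  entryMethod: "buildBrowserEntrypointResult",
  businessLogic: {
    dataFetch: "POST /api/report/query with org + period params",
    dataTransform: "map response rows onto columnDefs fields",
  },
};

// columnDefs is an array of [field, label] pairs — the same shape the Rust
// side later deserializes as Vec<(String, String)>.
console.log(exampleSceneInfo.columnDefs.map(([field]) => field)); // [ 'org_label', 'value' ]
```

Keeping columnDefs as pairs (rather than objects) is deliberate: it serializes compactly and maps 1:1 onto the Rust tuple representation.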
- [ ] **Step 1: Add DEEP_SYSTEM_PROMPT constant**

Add after the existing `SYSTEM_PROMPT` constant in `llm-client.js`:

```javascript
const DEEP_SYSTEM_PROMPT = `你是一个场景代码分析专家。分析场景源码,提取关键业务信息。

## 分析目标

1. **API 端点**: 识别所有 HTTP 请求地址 (URL, method, 用途)
2. **静态参数**: 识别硬编码的业务参数 (key-value pairs)
3. **列定义**: 识别数据表格/导出的列配置 ([field, label] pairs)
4. **业务逻辑**: 理解数据获取和转换流程
5. **场景类型**: 判断是 report_collection 还是 monitoring

## 输出格式

请以 JSON 格式返回:

{
  "sceneId": "string - 场景标识 (英文短横线)",
  "sceneName": "string - 场景中文名",
  "sceneKind": "report_collection | monitoring",
  "sourceSystem": "string - 来源系统名 (可选)",
  "expectedDomain": "string - 目标域名 (可选)",
  "targetUrl": "string | null - 目标页面URL",
  "apiEndpoints": [
    {"name": "string", "url": "string", "method": "GET|POST", "description": "string"}
  ],
  "staticParams": {"key": "value"},
  "columnDefs": [["fieldName", "中文列名"]],
  "entryMethod": "string - 入口方法名",
  "businessLogic": {
    "dataFetch": "string - 数据获取逻辑描述",
    "dataTransform": "string - 数据转换逻辑描述"
  }
}`;
```

- [ ] **Step 2: Add buildDeepAnalyzePrompt function**

Add after the `buildAnalyzePrompt` function:

```javascript
function buildDeepAnalyzePrompt(sourceDir, dirContents, indexHtmlContent) {
  const parts = [];
  parts.push(`=== 目录结构 ===`);
  parts.push(dirContents.tree || "(empty)");
  if (dirContents["scene.toml"]) {
    parts.push(`\n=== scene.toml ===`);
    parts.push(dirContents["scene.toml"]);
  }
  if (dirContents["SKILL.toml"]) {
    parts.push(`\n=== SKILL.toml ===`);
    parts.push(dirContents["SKILL.toml"]);
  }
  if (dirContents["SKILL.md"]) {
    parts.push(`\n=== SKILL.md ===`);
    parts.push(dirContents["SKILL.md"]);
  }
  // Include index.html content (key addition)
  if (indexHtmlContent) {
    parts.push(`\n=== index.html ===`);
    // Limit to first 15000 chars to avoid token limits
    parts.push(indexHtmlContent.substring(0, 15000));
  }
  if (dirContents.scripts && Object.keys(dirContents.scripts).length > 0) {
    parts.push(`\n=== 脚本文件 ===`);
    for (const [name, content] of Object.entries(dirContents.scripts)) {
      parts.push(`\n--- ${name} ---`);
      parts.push(content.substring(0, 3000));
    }
  }
  return `以下是场景目录 "${sourceDir}" 的内容:\n\n${parts.join("\n")}\n\n请分析以上代码,提取完整的场景信息。`;
}
```

- [ ] **Step 3: Add extractSceneInfo function**

Add after the `extractJsonFromResponse` function:

```javascript
function extractSceneInfo(text) {
  // Try code block first
  const codeBlockMatch = text.match(/```(?:json)?\s*\n([\s\S]*?)\n```/);
  if (codeBlockMatch) {
    try {
      return JSON.parse(codeBlockMatch[1]);
    } catch (e) {
      // fall through
    }
  }
  // Try to find a JSON object containing sceneId
  const jsonMatch = text.match(/\{[\s\S]*"sceneId"[\s\S]*\}/);
  if (jsonMatch) {
    try {
      return JSON.parse(jsonMatch[0]);
    } catch (e) {
      // fall through
    }
  }
  // Last resort: parse the entire text
  try {
    return JSON.parse(text);
  } catch (e) {
    throw new Error("Failed to extract valid SceneInfo JSON from LLM response");
  }
}
```

- [ ] **Step 4: Add analyzeSceneDeep function**

Add after the `analyzeScene` function:

```javascript
function analyzeSceneDeep(sourceDir, dirContents, indexHtmlContent, { apiKey, baseUrl, model }) {
  const userPrompt = buildDeepAnalyzePrompt(sourceDir, dirContents, indexHtmlContent);
  const requestBody = JSON.stringify({
    model,
    messages: [
      { role: "system", content: DEEP_SYSTEM_PROMPT },
      { role: "user", content: userPrompt },
    ],
    temperature: 0.1,
    max_tokens: 2048, // Increased for detailed response
  });
  return new Promise((resolve, reject) => {
    const url = new URL(baseUrl.replace(/\/v1\/?$/, "") + "/v1/chat/completions");
    const options = {
      hostname: url.hostname,
      port: url.port || (url.protocol === "https:" ? 443 : 80),
      path: url.pathname,
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
        "Content-Length": Buffer.byteLength(requestBody),
      },
    };
    const httpModule = url.protocol === "https:" ? https : http;
    const req = httpModule.request(options, (res) => {
      let data = "";
      res.on("data", (chunk) => (data += chunk));
      res.on("end", () => {
        if (res.statusCode !== 200) {
          return reject(new Error(`LLM API error ${res.statusCode}: ${data}`));
        }
        try {
          const parsed = JSON.parse(data);
          const content = parsed.choices?.[0]?.message?.content;
          if (!content) return reject(new Error("LLM returned empty response"));
          const result = extractSceneInfo(content);
          // Validate required fields
          if (!result.sceneId || !result.sceneName) {
            return reject(new Error(`LLM response missing sceneId/sceneName: ${content}`));
          }
          // Set defaults for optional fields
          result.sceneKind = result.sceneKind || "report_collection";
          result.apiEndpoints = result.apiEndpoints || [];
          result.staticParams = result.staticParams || {};
          result.columnDefs = result.columnDefs || [];
          result.businessLogic = result.businessLogic || {};
          resolve(result);
        } catch (err) {
          reject(new Error(`Failed to parse LLM response: ${err.message}`));
        }
      });
    });
    req.on("error", reject);
    req.setTimeout(60000, () => {
      req.destroy(new Error("LLM API request timed out"));
    });
    req.write(requestBody);
    req.end();
  });
}
```

- [ ] **Step 5: Add http module import and update exports**

At the top of the file, add the `http` import alongside `https`:

```javascript
const http = require("http");
const https = require("https");
```

Update the exports at the bottom:

```javascript
module.exports = {
  buildAnalyzePrompt,
  extractJsonFromResponse,
  analyzeScene,
  // New exports
  buildDeepAnalyzePrompt,
  extractSceneInfo,
  analyzeSceneDeep,
};
```

- [ ] **Step 6: Verify syntax**

Run: `node -c frontend/scene-generator/llm-client.js`

Expected: No syntax errors

- [ ] **Step 7: Commit**

```bash
git add frontend/scene-generator/llm-client.js
git commit -m "feat(llm-client): add deep extraction with apiEndpoints, staticParams, columnDefs"
```

---

### Task 2: Enhance generator-runner.js to Read index.html

**Files:**

- Modify: `frontend/scene-generator/generator-runner.js`

**Goal:** Modify `readDirectory()` to also read `index.html` content.

- [ ] **Step 1: Add index.html reading in readDirectory function**

Locate the `readDirectory` function and add index.html reading after the SKILL.md section:

```javascript
// After the SKILL.md reading section, add:
const indexHtmlPath = p.join(sourceDir, "index.html");
if (fs.existsSync(indexHtmlPath)) {
  result.indexHtml = fs.readFileSync(indexHtmlPath, "utf-8");
}
```

The complete modified function should look like:

```javascript
function readDirectory(sourceDir) {
  const fs = require("fs");
  const p = require("path");
  if (!fs.existsSync(sourceDir)) {
    throw new Error(`Directory not found: ${sourceDir}`);
  }
  const stat = fs.statSync(sourceDir);
  if (!stat.isDirectory()) {
    throw new Error(`Not a directory: ${sourceDir}`);
  }
  const result = {};
  const entries = fs.readdirSync(sourceDir, { withFileTypes: true });
  const treeLines = [];
  for (const entry of entries) {
    treeLines.push(`├── ${entry.name}`);
  }
  result.tree = treeLines.join("\n");
  const sceneTomlPath = p.join(sourceDir, "scene.toml");
  if (fs.existsSync(sceneTomlPath)) {
    result["scene.toml"] = fs.readFileSync(sceneTomlPath, "utf-8");
  }
  const skillTomlPath = p.join(sourceDir, "SKILL.toml");
  if (fs.existsSync(skillTomlPath)) {
    result["SKILL.toml"] = fs.readFileSync(skillTomlPath, "utf-8");
  }
  const skillMdPath = p.join(sourceDir, "SKILL.md");
  if (fs.existsSync(skillMdPath)) {
    result["SKILL.md"] = fs.readFileSync(skillMdPath, "utf-8");
  }
  // NEW: Read index.html
  const indexHtmlPath = p.join(sourceDir, "index.html");
  if (fs.existsSync(indexHtmlPath)) {
    result.indexHtml = fs.readFileSync(indexHtmlPath, "utf-8");
  }
  const scripts = {};
  for (const entry of entries) {
    if (entry.isFile() && entry.name.endsWith(".js")) {
      const scriptPath = p.join(sourceDir, entry.name);
      scripts[entry.name] = fs.readFileSync(scriptPath, "utf-8");
    }
  }
  if (Object.keys(scripts).length > 0) {
    result.scripts = scripts;
  }
  return result;
}
```

- [ ] **Step 2: Verify syntax**

Run: `node -c frontend/scene-generator/generator-runner.js`

Expected: No syntax errors

- [ ] **Step 3: Commit**

```bash
git add frontend/scene-generator/generator-runner.js
git commit -m "feat(generator-runner): read index.html in readDirectory()"
```

---

### Task 3: Add /analyze-deep Route in server.js

**Files:**

- Modify: `frontend/scene-generator/server.js`

**Goal:** Add a new `/analyze-deep` endpoint that calls the deep extraction LLM function.

- [ ] **Step 1: Update llm-client import**

Change the import line at the top:

```javascript
const { analyzeScene, analyzeSceneDeep } = require("./llm-client");
```

- [ ] **Step 2: Add handleAnalyzeDeep function**

Add after the existing `handleAnalyze` function:

```javascript
async function handleAnalyzeDeep(req, res) {
  let body;
  try {
    body = await parseBody(req);
  } catch {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "Invalid JSON body" }));
    return;
  }
  const sourceDir = (body.sourceDir || "").replace(/\\/g, "/");
  if (!sourceDir) {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "sourceDir is required" }));
    return;
  }
  let dirContents;
  try {
    dirContents = readDirectory(sourceDir);
  } catch (err) {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: err.message }));
    return;
  }
  try {
    const indexHtmlContent = dirContents.indexHtml || null;
    const result = await analyzeSceneDeep(sourceDir, dirContents, indexHtmlContent, config);
    // Log extraction results for debugging
    console.log(`[analyze-deep] Extracted scene: ${result.sceneId} / ${result.sceneName}`);
    console.log(`[analyze-deep] API endpoints: ${result.apiEndpoints?.length || 0}`);
    console.log(`[analyze-deep] Column defs: ${result.columnDefs?.length || 0}`);
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify(result));
  } catch (err) {
    console.error(`[analyze-deep] Error: ${err.message}`);
    res.writeHead(502, { "Content-Type": "application/json" });
    res.end(
      JSON.stringify({
        error: `Deep analysis failed: ${err.message}`,
        hint: "You can still use basic analysis or enter data manually",
      })
    );
  }
}
```

- [ ] **Step 3: Add route in server request handler**

In the `http.createServer` handler, add the new route after `/analyze`:

```javascript
} else if (pathname === "/analyze-deep" && req.method === "POST") {
  await handleAnalyzeDeep(req, res);
```

- [ ] **Step 4: Verify syntax**

Run: `node -c frontend/scene-generator/server.js`

Expected: No syntax errors

- [ ] **Step 5: Commit**

```bash
git add frontend/scene-generator/server.js
git commit -m "feat(server): add /analyze-deep endpoint for deep extraction"
```

---

### Task 4: Add --scene-info-json CLI Parameter

**Files:**

- Modify: `src/bin/sg_scene_generate.rs`
- Modify: `src/generated_scene/generator.rs`

**Goal:** Add a `--scene-info-json` parameter to the Rust CLI to receive pre-extracted scene info from the Node.js server.
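The hand-off itself is just string passing. A minimal sketch of it (hypothetical scene values; the cargo argument list mirrors what Task 6 adds to `runGenerator`) shows that one `JSON.stringify` on the Node side produces the single argv entry the Rust side parses back:

```javascript
// Sketch of the Node → Rust hand-off for --scene-info-json.
// The sceneInfo values are hypothetical; only the key names matter,
// since the Rust structs match them via #[serde(rename = "...")].
const sceneInfo = {
  sceneId: "demo-scene",
  sceneName: "示例场景",
  apiEndpoints: [{ name: "primary", url: "/api/data", method: "POST" }],
  columnDefs: [["org_label", "供电单位"]],
};

// Serialized once and passed as ONE argv entry, so no shell quoting is
// needed when spawning cargo with an args array (child_process.spawn style).
const args = [
  "run", "--bin", "sg_scene_generate", "--",
  "--scene-id", sceneInfo.sceneId,
  "--scene-name", sceneInfo.sceneName,
  "--scene-info-json", JSON.stringify(sceneInfo),
];

// Round trip: the Rust CLI sees exactly the camelCase keys serde expects.
const roundTripped = JSON.parse(args[args.length - 1]);
console.log(roundTripped.sceneId); // demo-scene
```

Passing the JSON through an argv array (never a concatenated shell string) is what keeps embedded quotes and Chinese text intact on both Windows and POSIX.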
- [ ] **Step 1: Add SceneInfoJson struct in generator.rs**

In `src/generated_scene/generator.rs`, add after the imports:

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, serde::Deserialize)]
pub struct ApiEndpointJson {
    pub name: String,
    pub url: String,
    #[serde(default)]
    pub method: String,
    #[serde(default)]
    pub description: Option<String>,
}

#[derive(Debug, Clone, serde::Deserialize)]
pub struct BusinessLogicJson {
    #[serde(rename = "dataFetch", default)]
    pub data_fetch: Option<String>,
    #[serde(rename = "dataTransform", default)]
    pub data_transform: Option<String>,
}

#[derive(Debug, Clone, serde::Deserialize)]
pub struct SceneInfoJson {
    #[serde(rename = "sceneId")]
    pub scene_id: String,
    #[serde(rename = "sceneName")]
    pub scene_name: String,
    #[serde(rename = "sceneKind", default)]
    pub scene_kind: String,
    #[serde(rename = "sourceSystem", default)]
    pub source_system: Option<String>,
    #[serde(rename = "expectedDomain", default)]
    pub expected_domain: Option<String>,
    #[serde(rename = "targetUrl", default)]
    pub target_url: Option<String>,
    #[serde(rename = "apiEndpoints", default)]
    pub api_endpoints: Vec<ApiEndpointJson>,
    #[serde(rename = "staticParams", default)]
    pub static_params: HashMap<String, String>,
    #[serde(rename = "columnDefs", default)]
    pub column_defs: Vec<(String, String)>,
    #[serde(rename = "entryMethod", default)]
    pub entry_method: Option<String>,
    #[serde(rename = "businessLogic", default)]
    pub business_logic: Option<BusinessLogicJson>,
}
```

Note the `rename` attributes on `BusinessLogicJson`: the LLM emits camelCase `dataFetch`/`dataTransform`, so without them the fields would silently deserialize to `None`.

- [ ] **Step 2: Add scene_info_json field to GenerateSceneRequest**

In `src/generated_scene/generator.rs`, modify `GenerateSceneRequest`:

```rust
#[derive(Debug, Clone)]
pub struct GenerateSceneRequest {
    pub source_dir: PathBuf,
    pub scene_id: String,
    pub scene_name: String,
    pub scene_kind: Option<String>,
    pub target_url: Option<String>,
    pub output_root: PathBuf,
    pub lessons_path: Option<PathBuf>,
    // NEW
    pub scene_info_json: Option<SceneInfoJson>,
}
```

- [ ] **Step 3: Modify browser_script function to use SceneInfo**

Replace the existing `browser_script` function with the enhanced version:

```rust
fn browser_script(
    scene_id: &str,
    analysis: &SceneSourceAnalysis,
    scene_info: Option<&SceneInfoJson>,
) -> String {
    // If we have scene info with business logic, generate the enhanced script
    if let Some(info) = scene_info {
        if !info.api_endpoints.is_empty() || !info.column_defs.is_empty() {
            return browser_script_with_business_logic(scene_id, info);
        }
    }
    // Fallback to skeleton template
    browser_script_skeleton(scene_id, analysis)
}

fn browser_script_skeleton(scene_id: &str, _analysis: &SceneSourceAnalysis) -> String {
    // Keep existing skeleton template
    format!(
        "function normalizePayload(payload) {{
  if (typeof payload === 'string') {{
    try {{ return JSON.parse(payload); }} catch (_) {{ return {{}}; }}
  }}
  return payload && typeof payload === 'object' ? payload : {{}};
}}

async function buildBrowserEntrypointResult(args, deps = {{}}) {{
  const rows = typeof deps.collectRows === 'function'
    ? await deps.collectRows(args)
    : [{{ org_label: args.org_label || '', org_code: args.org_code || '', period_mode: args.period_mode || '', period_value: args.period_value || '', value: '' }}];
  return {{
    type: 'report-artifact',
    report_name: '{}',
    status: rows.length > 0 ? 'ok' : 'empty',
    period: {{ mode: args.period_mode, mode_code: args.period_mode_code, value: args.period_value, payload: normalizePayload(args.period_payload) }},
    org: {{ label: args.org_label, code: args.org_code }},
    column_defs: [
      ['org_label', '供电单位'],
      ['org_code', '供电单位编码'],
      ['period_mode', '统计周期类型'],
      ['period_value', '统计周期'],
      ['value', '采集值']
    ],
    columns: ['org_label', 'org_code', 'period_mode', 'period_value', 'value'],
    rows,
    counts: {{ detail_rows: rows.length }},
    partial_reasons: [],
    reasons: []
  }};
}}

if (typeof module !== 'undefined') {{
  module.exports = {{ buildBrowserEntrypointResult, normalizePayload }};
}}
if (typeof args !== 'undefined') {{
  return buildBrowserEntrypointResult(args);
}}
",
        scene_id
    )
}

fn browser_script_with_business_logic(scene_id: &str, info: &SceneInfoJson) -> String {
    // Generate API endpoints constant
    let api_endpoints_code = info
        .api_endpoints
        .iter()
        .map(|ep| format!("  {}: '{}',", ep.name, ep.url))
        .collect::<Vec<_>>()
        .join("\n");
    // Generate static params constant
    let static_params_code = info
        .static_params
        .iter()
        .map(|(k, v)| format!("  {}: '{}',", k, v))
        .collect::<Vec<_>>()
        .join("\n");
    // Generate column defs
    let column_defs_code = info
        .column_defs
        .iter()
        .map(|(field, label)| format!("  ['{}', '{}'],", field, label))
        .collect::<Vec<_>>()
        .join("\n");
    let columns_code = info
        .column_defs
        .iter()
        .map(|(field, _)| format!("'{}'", field))
        .collect::<Vec<_>>()
        .join(", ");
    let primary_api = info
        .api_endpoints
        .first()
        .map(|ep| ep.url.clone())
        .unwrap_or_else(|| "/api/data".to_string());
    let expected_domain = info.expected_domain.as_deref().unwrap_or("");
    format!(
        r#"// ===== 自动生成部分 =====
const REPORT_NAME = '{scene_id}';
const EXPECTED_DOMAIN = '{expected_domain}';

// API 端点
const API_ENDPOINTS = {{
{api_endpoints_code}
}};

// 静态参数
const STATIC_PARAMS = {{
{static_params_code}
}};

// 列定义
const COLUMN_DEFS = [
{column_defs_code}
];
const COLUMNS = [{columns_code}];

// ===== 标准框架 =====
function normalizePayload(payload) {{
  if (typeof payload === 'string') {{
    try {{ return JSON.parse(payload); }} catch (_) {{ return {{}}; }}
  }}
  return payload && typeof payload === 'object' ? payload : {{}};
}}

function validateArgs(args) {{
  const reasons = [];
  if (!args.org_code) reasons.push('missing org_code');
  if (!args.period_value) reasons.push('missing period_value');
  return reasons.length === 0 ? {{ ok: true }} : {{ ok: false, reasons }};
}}

function buildRequest(args) {{
  return {{
    orgCode: args.org_code,
    periodMode: args.period_mode,
    periodValue: args.period_value,
    ...STATIC_PARAMS
  }};
}}

function normalizeRows(rawRows) {{
  if (!Array.isArray(rawRows)) return [];
  return rawRows.map((row, index) => ({{
    org_label: row.orgLabel || row.org_label || '',
    org_code: row.orgCode || row.org_code || args.org_code || '',
    period_mode: args.period_mode || '',
    period_value: args.period_value || '',
    ...row
  }}));
}}

function buildArtifact(opts) {{
  return {{
    type: 'report-artifact',
    report_name: REPORT_NAME,
    status: opts.status || 'ok',
    period: {{ mode: args.period_mode, mode_code: args.period_mode_code, value: args.period_value, payload: normalizePayload(args.period_payload) }},
    org: {{ label: args.org_label, code: args.org_code }},
    column_defs: COLUMN_DEFS,
    columns: COLUMNS,
    rows: opts.rows || [],
    counts: {{ detail_rows: (opts.rows || []).length }},
    partial_reasons: opts.partial_reasons || [],
    reasons: opts.reasons || []
  }};
}}

async function buildBrowserEntrypointResult(args, deps = defaultDeps()) {{
  // 1. 参数验证
  const validation = validateArgs(args);
  if (!validation.ok) {{
    return buildArtifact({{ status: 'blocked', reasons: validation.reasons }});
  }}
  // 2. 页面上下文验证
  const pageValidation = deps.validatePageContext?.(args);
  if (!pageValidation?.ok) {{
    return buildArtifact({{ status: 'blocked', reasons: ['page_context_mismatch'] }});
  }}
  // 3. 数据获取
  try {{
    const request = buildRequest(args);
    const response = await deps.queryData(request);
    const rows = normalizeRows(response.rows || response.data || []);
    return buildArtifact({{ status: rows.length > 0 ? 'ok' : 'empty', rows }});
  }} catch (error) {{
    return buildArtifact({{ status: 'error', reasons: [error.message] }});
  }}
}}

// ===== 默认依赖实现 =====
function defaultDeps() {{
  return {{
    validatePageContext(args) {{
      const host = globalThis.location?.hostname;
      return host === args.expected_domain || host === EXPECTED_DOMAIN
        ? {{ ok: true }}
        : {{ ok: false, reason: 'domain_mismatch' }};
    }},
    async queryData(request) {{
      // 根据 API_ENDPOINTS 调用实际接口
      if (typeof $ !== 'undefined' && typeof $.ajax === 'function') {{
        return new Promise((resolve, reject) => {{
          $.ajax({{
            url: API_ENDPOINTS.primary || '{primary_api}',
            type: 'POST',
            data: JSON.stringify(request),
            contentType: 'application/json',
            success: resolve,
            error: (xhr, status, err) => reject(new Error(`API failed: ${{err}}`)),
          }});
        }});
      }}
      // Fallback: fetch API
      if (typeof fetch === 'function') {{
        const response = await fetch(API_ENDPOINTS.primary || '{primary_api}', {{
          method: 'POST',
          headers: {{ 'Content-Type': 'application/json' }},
          body: JSON.stringify(request)
        }});
        return response.json();
      }}
      throw new Error('No HTTP client available');
    }},
  }};
}}

// ===== 模块导出 =====
if (typeof module !== 'undefined') {{
  module.exports = {{
    buildBrowserEntrypointResult,
    validateArgs,
    buildRequest,
    normalizeRows,
    COLUMN_DEFS,
    COLUMNS,
  }};
}}
if (typeof args !== 'undefined') {{
  return buildBrowserEntrypointResult(args);
}}
"#,
        scene_id = scene_id,
        expected_domain = expected_domain,
        api_endpoints_code = api_endpoints_code,
        static_params_code = static_params_code,
        column_defs_code = column_defs_code,
        columns_code = columns_code,
        primary_api = primary_api
    )
}
```

- [ ] **Step 4: Update generate_scene_package function**

Modify `generate_scene_package` in generator.rs to pass the scene info through to `browser_script`:

```rust
// Inside generate_scene_package (existing signature unchanged):
let analysis = analyze_scene_source_with_hint(&request.source_dir, request.scene_kind.clone())?;
// ... existing code ...
write_file(
    &scripts_dir.join(format!("{tool_name}.js")),
    &browser_script(&request.scene_id, &analysis, request.scene_info_json.as_ref()),
)?;
// ... rest of function ...
```

- [ ] **Step 5: Add CLI parameter in sg_scene_generate.rs**

Modify the `CliArgs` struct:

```rust
struct CliArgs {
    source_dir: PathBuf,
    scene_id: String,
    scene_name: String,
    scene_kind: Option<String>,
    target_url: Option<String>,
    output_root: PathBuf,
    lessons_path: Option<PathBuf>,
    // NEW
    scene_info_json: Option<String>,
}
```

Add parsing in `parse_args`:

```rust
fn parse_args(args: impl Iterator<Item = String>) -> Result<CliArgs, String> {
    // ... existing code ...
    let mut scene_info_json = None;
    // ... in the match block, following the same value-consuming pattern as the other flags ...
    "--scene-info-json" => scene_info_json = Some(arg),
    // ...
}
```

Parse the JSON in `run`:

```rust
fn run() -> Result<(), String> {
    let args = parse_args(env::args().skip(1))?;
    let scene_info = args
        .scene_info_json
        .map(|json| serde_json::from_str(&json))
        .transpose()
        .map_err(|e| format!("Invalid scene-info-json: {}", e))?;
    let skill_root = generate_scene_package(GenerateSceneRequest {
        source_dir: args.source_dir,
        scene_id: args.scene_id,
        scene_name: args.scene_name,
        scene_kind: args.scene_kind,
        target_url: args.target_url,
        output_root: args.output_root,
        lessons_path: args.lessons_path,
        scene_info_json: scene_info,
    })
    .map_err(|err| err.to_string())?;
    println!("generated scene package: {}", skill_root.display());
    Ok(())
}
```

Update usage:

```rust
fn usage() -> String {
    "usage: sg_scene_generate --source-dir <dir> --scene-id <id> --scene-name <name> [--scene-kind <kind>] [--target-url <url>] --output-root <dir> [--lessons <path>] [--scene-info-json '<json>']".to_string()
}
```

- [ ] **Step 6: Verify Rust compilation**

Run: `cargo check`

Expected: No compilation errors

- [ ] **Step 7: Commit**

```bash
git add src/bin/sg_scene_generate.rs src/generated_scene/generator.rs
git commit -m "feat(rust): add --scene-info-json parameter for LLM extraction results"
```

---

### Task 5: Update Web UI with Extraction Preview

**Files:**

- Modify: `frontend/scene-generator/sg_scene_generator.html`

**Goal:** Add UI elements to show extraction results and allow user confirmation before generation.

- [ ] **Step 1: Add extraction results preview section**

Add a new collapsible panel for extraction preview after the existing form section. The element `id`s are the ones the Step 3 JavaScript reads and writes:

```html
<div class="preview-panel" id="extractionPreview" style="display: none;">
  <div class="preview-header" onclick="togglePreview()">
    <h3>提取结果预览</h3>
    <span id="previewToggleIcon">▼</span>
  </div>
  <div class="preview-content" id="previewContent">
    <div class="preview-section">
      <h4>基本信息</h4>
      <div class="preview-row"><span class="label">场景 ID</span><span class="value" id="previewSceneId">-</span></div>
      <div class="preview-row"><span class="label">场景名称</span><span class="value" id="previewSceneName">-</span></div>
      <div class="preview-row"><span class="label">场景类型</span><span class="value" id="previewSceneKind">-</span></div>
      <div class="preview-row"><span class="label">目标域名</span><span class="value" id="previewExpectedDomain">-</span></div>
    </div>
    <div class="preview-section">
      <h4>API 端点 (<span id="previewApiCount">0</span>)</h4>
      <div class="preview-list" id="previewApiEndpoints"></div>
    </div>
    <div class="preview-section">
      <h4>列定义 (<span id="previewColumnCount">0</span>)</h4>
      <div class="preview-list" id="previewColumnDefs"></div>
    </div>
    <div class="preview-section">
      <h4>静态参数</h4>
      <pre class="preview-code" id="previewStaticParams">{}</pre>
    </div>
    <div class="preview-section">
      <h4>业务逻辑</h4>
      <div class="preview-row"><span class="label">数据获取</span><span class="value" id="previewDataFetch">-</span></div>
      <div class="preview-row"><span class="label">数据转换</span><span class="value" id="previewDataTransform">-</span></div>
    </div>
  </div>
</div>
```

- [ ] **Step 2: Add CSS for preview panel**

Add styles:

```css
.preview-panel {
  background: rgba(255, 255, 255, 0.05);
  border-radius: 12px;
  border: 1px solid rgba(255, 255, 255, 0.1);
  overflow: hidden;
}
.preview-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
  padding: 16px 20px;
  cursor: pointer;
  background: rgba(255, 255, 255, 0.03);
}
.preview-header h3 { margin: 0; font-size: 16px; }
.preview-content { padding: 20px; }
.preview-section { margin-bottom: 20px; }
.preview-section h4 { margin: 0 0 10px 0; font-size: 14px; color: #a78bfa; }
.preview-row { display: flex; margin-bottom: 8px; }
.preview-row .label { width: 100px; color: #888; flex-shrink: 0; }
.preview-row .value { color: #fff; }
.preview-list {
  max-height: 150px;
  overflow-y: auto;
  background: rgba(0, 0, 0, 0.2);
  border-radius: 8px;
  padding: 10px;
}
.preview-list-item {
  padding: 6px 0;
  border-bottom: 1px solid rgba(255, 255, 255, 0.05);
}
.preview-code {
  background: rgba(0, 0, 0, 0.3);
  padding: 10px;
  border-radius: 8px;
  font-family: monospace;
  font-size: 12px;
  overflow-x: auto;
  white-space: pre-wrap;
}
```

- [ ] **Step 3: Add JavaScript for deep analysis and preview**

```javascript
let currentSceneInfo = null;

async function analyzeDeep() {
  const sourceDir = document.getElementById('sourceDir').value;
  if (!sourceDir) {
    alert('请先选择场景目录');
    return;
  }
  showStatus('正在深度分析...');
  try {
    const response = await fetch(`${SERVER_URL}/analyze-deep`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ sourceDir })
    });
    const data = await response.json();
    if (data.error) {
      showStatus('分析失败: ' + data.error);
      return;
    }
    currentSceneInfo = data;
    // Update form fields
    document.getElementById('sceneId').value = data.sceneId || '';
    document.getElementById('sceneName').value = data.sceneName || '';
    document.getElementById('sceneKind').value = data.sceneKind || 'report_collection';
    if (data.targetUrl) {
      document.getElementById('targetUrl').value = data.targetUrl;
    }
    // Show preview
    showExtractionPreview(data);
    showStatus('分析完成,请确认提取结果');
  } catch (err) {
    showStatus('分析失败: ' + err.message);
  }
}

function showExtractionPreview(data) {
  document.getElementById('previewSceneId').textContent = data.sceneId || '-';
  document.getElementById('previewSceneName').textContent = data.sceneName || '-';
  document.getElementById('previewSceneKind').textContent = data.sceneKind || '-';
  document.getElementById('previewExpectedDomain').textContent = data.expectedDomain || '-';
  // API endpoints
  const apiList = document.getElementById('previewApiEndpoints');
  const apiCount = (data.apiEndpoints || []).length;
  document.getElementById('previewApiCount').textContent = apiCount;
  apiList.innerHTML = (data.apiEndpoints || []).map(ep =>
    `<div class="preview-list-item"><strong>${ep.name}</strong>: ${ep.url} [${ep.method || 'GET'}]</div>`
  ).join('') || '<div class="preview-list-item">无 API 端点</div>';
  // Column defs
  const colList = document.getElementById('previewColumnDefs');
  const colCount = (data.columnDefs || []).length;
  document.getElementById('previewColumnCount').textContent = colCount;
  colList.innerHTML = (data.columnDefs || []).map(([field, label]) =>
    `<div class="preview-list-item"><code>${field}</code> → ${label}</div>`
  ).join('') || '<div class="preview-list-item">无列定义</div>';
  // Static params
  document.getElementById('previewStaticParams').textContent =
    JSON.stringify(data.staticParams || {}, null, 2) || '{}';
  // Business logic
  document.getElementById('previewDataFetch').textContent = data.businessLogic?.dataFetch || '-';
  document.getElementById('previewDataTransform').textContent = data.businessLogic?.dataTransform || '-';
  document.getElementById('extractionPreview').style.display = 'block';
}

function togglePreview() {
  const content = document.getElementById('previewContent');
  const icon = document.getElementById('previewToggleIcon');
  if (content.style.display === 'none') {
    content.style.display = 'block';
    icon.textContent = '▼';
  } else {
    content.style.display = 'none';
    icon.textContent = '▶';
  }
}
```

- [ ] **Step 4: Add "深度分析" button**

Add a new button in the button group (reuse the styling of the existing buttons):

```html
<button type="button" onclick="analyzeDeep()">深度分析</button>
```

- [ ] **Step 5: Update generate function to pass sceneInfo**

Modify the generate function to include the scene info JSON:

```javascript
async function generate() {
  const params = {
    sourceDir: document.getElementById('sourceDir').value,
    sceneId: document.getElementById('sceneId').value,
    sceneName: document.getElementById('sceneName').value,
    sceneKind: document.getElementById('sceneKind').value,
    targetUrl: document.getElementById('targetUrl').value || null,
    outputRoot: document.getElementById('outputRoot').value,
    lessons: document.getElementById('lessons').value || null,
  };
  // Add scene info JSON if available
  if (currentSceneInfo) {
    params.sceneInfoJson = JSON.stringify(currentSceneInfo);
  }
  // ... rest of generate function ...
}
```

- [ ] **Step 6: Verify UI loads**

Run the server and open the page in a browser:

```bash
cd frontend/scene-generator && node server.js
```

Open `http://127.0.0.1:3210/`

Expected: Page loads without JavaScript errors

- [ ] **Step 7: Commit**

```bash
git add frontend/scene-generator/sg_scene_generator.html
git commit -m "feat(ui): add deep extraction preview panel with API/column/static-params display"
```

---

### Task 6: Update generator-runner.js to Pass sceneInfoJson

**Files:**

- Modify: `frontend/scene-generator/generator-runner.js`

**Goal:** Update `runGenerator` to pass the `sceneInfoJson` parameter to the Rust CLI.

- [ ] **Step 1: Modify runGenerator function**

Update the function to accept and pass `sceneInfoJson`:

```javascript
function runGenerator(params, sseWriter, projectRoot) {
  const { sourceDir, sceneId, sceneName, sceneKind, targetUrl, outputRoot, lessons, sceneInfoJson } = params;
  const normalize = (p) => p.replace(/\\/g, "/");
  const args = [
    "run", "--bin", "sg_scene_generate", "--",
    "--source-dir", normalize(sourceDir),
    "--scene-id", sceneId,
    "--scene-name", sceneName,
  ];
  if (sceneKind) {
    args.push("--scene-kind", sceneKind);
  }
  if (targetUrl) {
    args.push("--target-url", targetUrl);
  }
  args.push("--output-root", normalize(outputRoot));
  if (lessons) {
    args.push("--lessons", normalize(lessons));
  }
  // NEW: Pass scene info JSON
  if (sceneInfoJson) {
    args.push("--scene-info-json", sceneInfoJson);
  }
  // ... rest of function unchanged ...
}
```

- [ ] **Step 2: Update server.js handleGenerate**

Ensure `handleGenerate` passes the new parameter:

```javascript
async function handleGenerate(req, res) {
  let body;
  try {
    body = await parseBody(req);
  } catch {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "Invalid JSON body" }));
    return;
  }
  const { sourceDir, sceneId, sceneName, sceneKind, targetUrl, outputRoot, lessons, sceneInfoJson } = body;
  if (!sourceDir || !sceneId || !sceneName || !outputRoot) {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(
      JSON.stringify({
        error: "All fields required: sourceDir, sceneId, sceneName, outputRoot",
      })
    );
    return;
  }
  const sseWriter = initSSE(res);
  try {
    await runGenerator(
      { sourceDir, sceneId, sceneName, sceneKind, targetUrl, outputRoot, lessons, sceneInfoJson },
      sseWriter,
      config.projectRoot
    );
  } catch (err) {
    writeSSE(sseWriter, "error", { message: `Server error: ${err.message}` });
  }
  sseWriter.end();
}
```

- [ ] **Step 3: Verify syntax**

Run: `node -c frontend/scene-generator/generator-runner.js && node -c frontend/scene-generator/server.js`

Expected: No syntax errors

- [ ] **Step 4: Commit**

```bash
git add frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js
git commit -m "feat(runner): pass sceneInfoJson to Rust CLI for enhanced template rendering"
```

---

### Task 7: End-to-End Verification

**Files:**

- All modified files

**Goal:** Verify the complete flow works from UI to Rust CLI.
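Before clicking through the UI, the shape of an `/analyze-deep` response can be sanity-checked with a small standalone helper. This is a hypothetical sketch (not one of the plan's file changes) that mirrors the required-field validation and defaulting `analyzeSceneDeep()` performs in Task 1:

```javascript
// Hypothetical helper: sanity-check a deep-extraction result locally,
// mirroring the validation/defaulting done inside analyzeSceneDeep().
function checkSceneInfo(info) {
  const problems = [];
  if (!info.sceneId) problems.push("missing sceneId");
  if (!info.sceneName) problems.push("missing sceneName");
  // Same defaults the server applies before returning the result.
  const normalized = {
    sceneKind: "report_collection",
    apiEndpoints: [],
    staticParams: {},
    columnDefs: [],
    businessLogic: {},
    ...info,
  };
  return { ok: problems.length === 0, problems, normalized };
}

// Example run against a made-up /analyze-deep response:
const sample = {
  sceneId: "demo-scene",
  sceneName: "示例场景",
  columnDefs: [["value", "采集值"]],
};
const result = checkSceneInfo(sample);
console.log(result.ok, result.normalized.sceneKind); // true report_collection
```

A response that passes this check should also deserialize cleanly on the Rust side, since the defaults match the `#[serde(default)]` fields.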
- [ ] **Step 1: Build Rust binary**

```bash
cargo build --release --bin sg_scene_generate
```

Expected: Build succeeds

- [ ] **Step 2: Start the server**

```bash
cd frontend/scene-generator && node server.js
```

Expected: Server starts on port 3210

- [ ] **Step 3: Test health endpoint**

```bash
curl http://127.0.0.1:3210/health
```

Expected: `{"status":"ok",...}`

- [ ] **Step 4: Test analyze-deep endpoint with a real scene**

Use a real scene directory with index.html:

```bash
curl -X POST http://127.0.0.1:3210/analyze-deep \
  -H "Content-Type: application/json" \
  -d '{"sourceDir": "D:/path/to/scene/with/index.html"}'
```

Expected: JSON response with sceneId, sceneName, apiEndpoints, columnDefs

- [ ] **Step 5: Test full generation flow**

1. Open a browser to `http://127.0.0.1:3210/`
2. Select a scene directory with index.html
3. Click the "深度分析" button
4. Verify the preview shows extracted API/column data
5. Click the "生成" button
6. Verify the generated script contains the extracted API endpoints and column definitions

- [ ] **Step 6: Compare generated script**

Compare the generated script with the reference:

- Before: 51 lines (skeleton)
- After: should have API_ENDPOINTS, COLUMN_DEFS constants populated

- [ ] **Step 7: Final commit**

```bash
git add -A
git commit -m "feat: complete LLM-driven skill generation with deep extraction

- Add /analyze-deep endpoint for deep LLM extraction
- Extract apiEndpoints, staticParams, columnDefs from index.html
- Pass extraction results via --scene-info-json to Rust CLI
- Generate complete browser_script with business logic constants
- Add UI preview panel for extraction results
"
```

---

## Self-Review

### 1. Spec Coverage

| Spec Requirement | Task |
|------------------|------|
| LLM reads index.html | Task 1 (buildDeepAnalyzePrompt), Task 2 (readDirectory) |
| Extract apiEndpoints | Task 1 (DEEP_SYSTEM_PROMPT, analyzeSceneDeep) |
| Extract staticParams | Task 1 (DEEP_SYSTEM_PROMPT, analyzeSceneDeep) |
| Extract columnDefs | Task 1 (DEEP_SYSTEM_PROMPT, analyzeSceneDeep) |
| Extract businessLogic | Task 1 (DEEP_SYSTEM_PROMPT, analyzeSceneDeep) |
| --scene-info-json CLI parameter | Task 4 |
| Enhanced template rendering | Task 4 (browser_script_with_business_logic) |
| Web UI preview | Task 5 |
| User confirmation before generation | Task 5 (extraction preview) |

All covered.

### 2. Placeholder Scan

No TBD/TODO/"implement later"/"add tests"/"similar to" patterns found.

### 3. Type Consistency

- `/analyze-deep`: `{ sourceDir }` → `SceneInfoJson` — consistent in Tasks 1, 3, 5
- `/generate`: `{ ..., sceneInfoJson }` — consistent in Tasks 5, 6
- SceneInfoJson struct fields match the JavaScript extraction output — consistent in Tasks 1, 4
- Column defs: `Vec<(String, String)>` matches `[[field, label]]` — consistent

All consistent.

### 4. Backward Compatibility

- Existing `/analyze` endpoint unchanged
- Existing CLI arguments (`--scene-id`, `--scene-name`) still work
- `--scene-info-json` is optional, falls back to the skeleton template
- `index.html` reading is optional, falls back if not present