木炎 956f0c2b68 feat: add generated scene skill platform hardening

2026-04-21 23:19:06 +08:00

LLM-Driven Skill Generation Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Enhance sg_scene_generate so that it produces complete, runnable skill packages instead of skeleton code. An LLM analyzes the scene source code (index.html) in depth and extracts API endpoints, static params, column definitions, and business logic.

Architecture:

LLM reads index.html from scene directory
Extracts complete SceneInfo (sceneId, sceneName, apiEndpoints, staticParams, columnDefs, businessLogic)
Web UI shows preview for user confirmation
Rust CLI receives extracted info via --scene-info-json parameter
Rust template renders complete browser_script with business logic

Tech Stack: JavaScript (Node.js), Rust, HTML/CSS, OpenAI-compatible LLM API

Scope Check

This plan enhances the existing scene skill generator to support LLM-driven deep extraction. It builds on:

Existing frontend/scene-generator/ files (server.js, llm-client.js, generator-runner.js)
Existing src/generated_scene/generator.rs and src/bin/sg_scene_generate.rs

File Map

Modified Files

| File | Changes |
| --- | --- |
| `frontend/scene-generator/llm-client.js` | Add deep extraction prompt + `analyzeSceneDeep()` |
| `frontend/scene-generator/generator-runner.js` | Add `index.html` reading in `readDirectory()` |
| `frontend/scene-generator/server.js` | New `/analyze-deep` route, pass sceneInfo to generator |
| `src/bin/sg_scene_generate.rs` | Add `--scene-info-json` CLI parameter |
| `src/generated_scene/generator.rs` | Add SceneInfo struct, enhanced template rendering |
| `frontend/scene-generator/sg_scene_generator.html` | Add extraction preview UI |

Reference Files (not modified)

| File | Purpose |
| --- | --- |
| `docs/superpowers/specs/2026-04-17-llm-driven-skill-generation-design.md` | Design spec |
| `claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_tq_lineloss_report.js` | Reference complete script (433 lines) |
| `claw/skills/skill_staging/skills/marketing-zero-consumer-report/scripts/collect_marketing_zero_consumer_report.js` | Reference skeleton (51 lines) |

Scope Guardrails

Do not change existing API contracts; they must stay backward compatible
Do not require index.html to exist (fall back to the current behavior when it is missing)
Do not break the existing --scene-id and --scene-name CLI arguments
Do not add npm dependencies (use only Node.js built-in modules)

Task 1: Enhance llm-client.js with Deep Extraction

Files:

Modify: frontend/scene-generator/llm-client.js

Goal: Add a new function analyzeSceneDeep() that reads index.html content and extracts complete SceneInfo including API endpoints, static params, column definitions, and business logic.

Step 1: Add DEEP_SYSTEM_PROMPT constant

Add after the existing SYSTEM_PROMPT constant in llm-client.js:

const DEEP_SYSTEM_PROMPT = `你是一个场景代码分析专家。分析场景源码，提取关键业务信息。

## 分析目标

1. **API 端点**: 识别所有 HTTP 请求地址 (URL, method, 用途)
2. **静态参数**: 识别硬编码的业务参数 (key-value pairs)
3. **列定义**: 识别数据表格/导出的列配置 ([field, label] pairs)
4. **业务逻辑**: 理解数据获取和转换流程
5. **场景类型**: 判断是 report_collection 还是 monitoring

## 输出格式

请以 JSON 格式返回：
{
  "sceneId": "string - 场景标识 (英文短横线)",
  "sceneName": "string - 场景中文名",
  "sceneKind": "report_collection | monitoring",
  "sourceSystem": "string - 来源系统名 (可选)",
  "expectedDomain": "string - 目标域名 (可选)",
  "targetUrl": "string | null - 目标页面URL",
  "apiEndpoints": [
    {"name": "string", "url": "string", "method": "GET|POST", "description": "string"}
  ],
  "staticParams": {"key": "value"},
  "columnDefs": [["fieldName", "中文列名"]],
  "entryMethod": "string - 入口方法名",
  "businessLogic": {
    "dataFetch": "string - 数据获取逻辑描述",
    "dataTransform": "string - 数据转换逻辑描述"
  }
}`;
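For orientation, a response that satisfies the output format above might look like the object below. This is only a sketch: every value (scene, domain, URL, column names) is made up for illustration, and `checkSceneInfoShape` is a hypothetical helper that is not part of llm-client.js.

```javascript
// Hypothetical example of the JSON the deep-extraction prompt asks for.
const exampleSceneInfo = {
  sceneId: "demo-report",
  sceneName: "示例报表",
  sceneKind: "report_collection",
  sourceSystem: "Demo System",
  expectedDomain: "example.internal",
  targetUrl: null,
  apiEndpoints: [
    { name: "queryList", url: "/api/demo/list", method: "POST", description: "list query" },
  ],
  staticParams: { pageSize: "50" },
  columnDefs: [["org_label", "供电单位"], ["value", "采集值"]],
  entryMethod: "queryList",
  businessLogic: { dataFetch: "POST to queryList", dataTransform: "map rows to columns" },
};

// Hypothetical helper: returns a list of problems; an empty list means the shape looks usable.
function checkSceneInfoShape(info) {
  if (!info || typeof info !== "object") return ["not an object"];
  const problems = [];
  if (typeof info.sceneId !== "string" || info.sceneId === "") problems.push("sceneId missing");
  if (typeof info.sceneName !== "string" || info.sceneName === "") problems.push("sceneName missing");
  if (!["report_collection", "monitoring"].includes(info.sceneKind)) problems.push("sceneKind invalid");
  if (!Array.isArray(info.apiEndpoints)) problems.push("apiEndpoints is not an array");
  if (!Array.isArray(info.columnDefs)) {
    problems.push("columnDefs is not an array");
  } else if (info.columnDefs.some((c) => !Array.isArray(c) || c.length !== 2)) {
    problems.push("columnDefs entries must be [field, label] pairs");
  }
  return problems;
}
```

A check like this could run on the parsed response before the preview is shown, so that a malformed extraction is reported early rather than at generation time.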

Step 2: Add buildDeepAnalyzePrompt function

Add after buildAnalyzePrompt function:

function buildDeepAnalyzePrompt(sourceDir, dirContents, indexHtmlContent) {
  const parts = [];

  parts.push(`=== 目录结构 ===`);
  parts.push(dirContents.tree || "(empty)");

  if (dirContents["scene.toml"]) {
    parts.push(`\n=== scene.toml ===`);
    parts.push(dirContents["scene.toml"]);
  }

  if (dirContents["SKILL.toml"]) {
    parts.push(`\n=== SKILL.toml ===`);
    parts.push(dirContents["SKILL.toml"]);
  }

  if (dirContents["SKILL.md"]) {
    parts.push(`\n=== SKILL.md ===`);
    parts.push(dirContents["SKILL.md"]);
  }

  // Include index.html content (key addition)
  if (indexHtmlContent) {
    parts.push(`\n=== index.html ===`);
    // Limit to first 15000 chars to avoid token limits
    parts.push(indexHtmlContent.substring(0, 15000));
  }

  if (dirContents.scripts && Object.keys(dirContents.scripts).length > 0) {
    parts.push(`\n=== 脚本文件 ===`);
    for (const [name, content] of Object.entries(dirContents.scripts)) {
      parts.push(`\n--- ${name} ---`);
      parts.push(content.substring(0, 3000));
    }
  }

  return `以下是场景目录 "${sourceDir}" 的内容：\n\n${parts.join("\n")}\n\n请分析以上代码，提取完整的场景信息。`;
}
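Note that the 15000-character cut keeps only the head of index.html, so request code defined late in a long page may never reach the model. One possible refinement, sketched here with a hypothetical helper `pickRelevantHtml` (not part of the plan), sends inline `<script>` bodies first and falls back to the plain prefix when the page has none.

```javascript
// Hypothetical helper: when the page exceeds the limit, prefer inline <script> bodies,
// since that is where API URLs and column definitions usually live.
function pickRelevantHtml(html, limit = 15000) {
  if (html.length <= limit) return html;

  const scripts = [];
  const re = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  let match;
  while ((match = re.exec(html)) !== null) {
    const body = match[1].trim();
    if (body) scripts.push(body); // external scripts have an empty body and are skipped
  }

  if (scripts.length === 0) return html.substring(0, limit);
  return scripts.join("\n\n").substring(0, limit);
}
```

If adopted, the call would replace `indexHtmlContent.substring(0, 15000)` in `buildDeepAnalyzePrompt`; the regex is a heuristic and does not attempt full HTML parsing.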

Step 3: Add extractSceneInfo function

Add after extractJsonFromResponse function:

function extractSceneInfo(text) {
  // Try code block first
  const codeBlockMatch = text.match(/```(?:json)?\s*\n([\s\S]*?)\n```/);
  if (codeBlockMatch) {
    try {
      return JSON.parse(codeBlockMatch[1]);
    } catch (e) {
      // fall through
    }
  }

  // Try to find JSON object with sceneId
  const jsonMatch = text.match(/\{[\s\S]*"sceneId"[\s\S]*\}/);
  if (jsonMatch) {
    try {
      return JSON.parse(jsonMatch[0]);
    } catch (e) {
      // fall through
    }
  }

  // Last resort: parse entire text
  try {
    return JSON.parse(text);
  } catch (e) {
    throw new Error("Failed to extract valid SceneInfo JSON from LLM response");
  }
}
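One caveat with the second fallback: the regex is greedy, so it spans from the first `{` to the last `}` in the response. If the model adds trailing prose that contains braces, the match is no longer valid JSON and extraction fails. A brace-balancing scan is one way around this. The sketch below uses a hypothetical helper name, `findJsonObject`, and is an alternative technique rather than what the plan specifies.

```javascript
// Hypothetical helper: return the first balanced {...} span that parses as JSON, or null.
function findJsonObject(text) {
  for (let start = text.indexOf("{"); start !== -1; start = text.indexOf("{", start + 1)) {
    let depth = 0;
    let inString = false;
    let escaped = false;
    for (let i = start; i < text.length; i++) {
      const ch = text[i];
      if (inString) {
        // Braces inside string values must not affect the depth counter
        if (escaped) escaped = false;
        else if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') inString = true;
      else if (ch === "{") depth++;
      else if (ch === "}") {
        depth--;
        if (depth === 0) {
          try {
            return JSON.parse(text.slice(start, i + 1));
          } catch (_) {
            break; // not valid JSON from this start; try the next "{"
          }
        }
      }
    }
  }
  return null;
}
```

It could sit between the code-block attempt and the whole-text attempt in `extractSceneInfo`, with an extra check that the returned object has a `sceneId`.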

Step 4: Add analyzeSceneDeep function

Add after analyzeScene function:

function analyzeSceneDeep(sourceDir, dirContents, indexHtmlContent, { apiKey, baseUrl, model }) {
  const userPrompt = buildDeepAnalyzePrompt(sourceDir, dirContents, indexHtmlContent);

  const requestBody = JSON.stringify({
    model,
    messages: [
      { role: "system", content: DEEP_SYSTEM_PROMPT },
      { role: "user", content: userPrompt },
    ],
    temperature: 0.1,
    max_tokens: 2048, // Increased for detailed response
  });

  return new Promise((resolve, reject) => {
    const url = new URL(baseUrl.replace(/\/v1\/?$/, "") + "/v1/chat/completions");
    const options = {
      hostname: url.hostname,
      port: url.port || (url.protocol === "https:" ? 443 : 80),
      path: url.pathname,
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
        "Content-Length": Buffer.byteLength(requestBody),
      },
    };

    const httpModule = url.protocol === "https:" ? https : http;
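    const req = httpModule.request(options, (res) => {
      // Decode as UTF-8 up front so multi-byte characters split across chunks stay intact
      res.setEncoding("utf8");
      let data = "";
      res.on("data", (chunk) => (data += chunk));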
      res.on("end", () => {
        if (res.statusCode !== 200) {
          return reject(new Error(`LLM API error ${res.statusCode}: ${data}`));
        }

        try {
          const parsed = JSON.parse(data);
          const content = parsed.choices?.[0]?.message?.content;
          if (!content) return reject(new Error("LLM returned empty response"));
          const result = extractSceneInfo(content);

          // Validate required fields
          if (!result.sceneId || !result.sceneName) {
            return reject(new Error(`LLM response missing sceneId/sceneName: ${content}`));
          }

          // Set defaults for optional fields
          result.sceneKind = result.sceneKind || "report_collection";
          result.apiEndpoints = result.apiEndpoints || [];
          result.staticParams = result.staticParams || {};
          result.columnDefs = result.columnDefs || [];
          result.businessLogic = result.businessLogic || {};

          resolve(result);
        } catch (err) {
          reject(new Error(`Failed to parse LLM response: ${err.message}`));
        }
      });
    });

    req.on("error", reject);
    req.setTimeout(60000, () => {
      req.destroy(new Error("LLM API request timed out"));
    });

    req.write(requestBody);
    req.end();
  });
}
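The defaults set above cover missing fields, but not fields of the wrong type. The Rust structs in Task 4 declare `static_params` as a string-to-string map and require `name` and `url` on every endpoint, so a numeric static param or an endpoint without a name would make `--scene-info-json` fail to deserialize. A minimal normalization pass, assuming a hypothetical helper named `normalizeSceneInfo`, could look like this:

```javascript
// Hypothetical helper: coerce the LLM result into the types the Rust side expects.
function normalizeSceneInfo(info) {
  // staticParams: keep only non-null values and turn everything into strings
  const staticParams = {};
  for (const [key, value] of Object.entries(info.staticParams || {})) {
    if (value === null || value === undefined) continue;
    staticParams[key] = typeof value === "string" ? value : JSON.stringify(value);
  }

  // columnDefs: keep only [field, label] pairs, as strings
  const columnDefs = (Array.isArray(info.columnDefs) ? info.columnDefs : [])
    .filter((pair) => Array.isArray(pair) && pair.length >= 2)
    .map(([field, label]) => [String(field), String(label)]);

  // apiEndpoints: drop entries missing the required name or url
  const apiEndpoints = (Array.isArray(info.apiEndpoints) ? info.apiEndpoints : [])
    .filter((ep) => ep && typeof ep.name === "string" && typeof ep.url === "string");

  return { ...info, staticParams, columnDefs, apiEndpoints };
}
```

Calling it just before `resolve(result)` would keep the preview and the CLI input consistent.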

Step 5: Add http module import and update exports

At the top of the file, add http import alongside https:

const http = require("http");
const https = require("https");

Update the exports at the bottom:

module.exports = {
  buildAnalyzePrompt,
  extractJsonFromResponse,
  analyzeScene,
  // New exports
  buildDeepAnalyzePrompt,
  extractSceneInfo,
  analyzeSceneDeep,
};

Step 6: Verify syntax

Run: node -c frontend/scene-generator/llm-client.js
Expected: No syntax errors

Step 7: Commit

git add frontend/scene-generator/llm-client.js
git commit -m "feat(llm-client): add deep extraction with apiEndpoints, staticParams, columnDefs"

Task 2: Enhance generator-runner.js to Read index.html

Files:

Modify: frontend/scene-generator/generator-runner.js

Goal: Modify readDirectory() to also read index.html content.

Step 1: Add index.html reading in readDirectory function

Locate the readDirectory function and add index.html reading after the SKILL.md section:

// After the SKILL.md reading section, add:

const indexHtmlPath = p.join(sourceDir, "index.html");
if (fs.existsSync(indexHtmlPath)) {
  result.indexHtml = fs.readFileSync(indexHtmlPath, "utf-8");
}

The complete modified function should look like:

function readDirectory(sourceDir) {
  const fs = require("fs");
  const p = require("path");

  if (!fs.existsSync(sourceDir)) {
    throw new Error(`Directory not found: ${sourceDir}`);
  }

  const stat = fs.statSync(sourceDir);
  if (!stat.isDirectory()) {
    throw new Error(`Not a directory: ${sourceDir}`);
  }

  const result = {};
  const entries = fs.readdirSync(sourceDir, { withFileTypes: true });

  const treeLines = [];
  for (const entry of entries) {
    treeLines.push(`├── ${entry.name}`);
  }
  result.tree = treeLines.join("\n");

  const sceneTomlPath = p.join(sourceDir, "scene.toml");
  if (fs.existsSync(sceneTomlPath)) {
    result["scene.toml"] = fs.readFileSync(sceneTomlPath, "utf-8");
  }

  const skillTomlPath = p.join(sourceDir, "SKILL.toml");
  if (fs.existsSync(skillTomlPath)) {
    result["SKILL.toml"] = fs.readFileSync(skillTomlPath, "utf-8");
  }

  const skillMdPath = p.join(sourceDir, "SKILL.md");
  if (fs.existsSync(skillMdPath)) {
    result["SKILL.md"] = fs.readFileSync(skillMdPath, "utf-8");
  }

  // NEW: Read index.html
  const indexHtmlPath = p.join(sourceDir, "index.html");
  if (fs.existsSync(indexHtmlPath)) {
    result.indexHtml = fs.readFileSync(indexHtmlPath, "utf-8");
  }

  const scripts = {};
  for (const entry of entries) {
    if (entry.isFile() && entry.name.endsWith(".js")) {
      const scriptPath = p.join(sourceDir, entry.name);
      scripts[entry.name] = fs.readFileSync(scriptPath, "utf-8");
    }
  }
  if (Object.keys(scripts).length > 0) {
    result.scripts = scripts;
  }

  return result;
}
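The function repeats an `existsSync` check followed by `readFileSync` for each optional file. A small helper that attempts the read and treats a missing file as "not present" would remove the repetition and the gap between check and read. The sketch below is an optional refactoring idea; `readOptionalFile` is a hypothetical name, and the demo directory is created only for illustration.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: return the file content, or undefined if the file does not exist.
function readOptionalFile(dir, name) {
  try {
    return fs.readFileSync(path.join(dir, name), "utf-8");
  } catch (err) {
    if (err.code === "ENOENT") return undefined;
    throw err; // permission errors and the like should still surface
  }
}

// Usage against a throwaway directory that contains only index.html
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "scene-demo-"));
fs.writeFileSync(path.join(demoDir, "index.html"), "<html></html>");
const indexHtml = readOptionalFile(demoDir, "index.html");
const sceneToml = readOptionalFile(demoDir, "scene.toml");
fs.rmSync(demoDir, { recursive: true, force: true });
```

With this helper, a missing index.html still leaves `result.indexHtml` undefined, which preserves the fallback behavior required by the scope guardrails.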

Step 2: Verify syntax

Run: node -c frontend/scene-generator/generator-runner.js
Expected: No syntax errors

Step 3: Commit

git add frontend/scene-generator/generator-runner.js
git commit -m "feat(generator-runner): read index.html in readDirectory()"

Task 3: Add /analyze-deep Route in server.js

Files:

Modify: frontend/scene-generator/server.js

Goal: Add new /analyze-deep endpoint that calls the deep extraction LLM function.

Step 1: Update llm-client import

Change the import line at the top:

const { analyzeScene, analyzeSceneDeep } = require("./llm-client");

Step 2: Add handleAnalyzeDeep function

Add after the existing handleAnalyze function:

async function handleAnalyzeDeep(req, res) {
  let body;
  try {
    body = await parseBody(req);
  } catch {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "Invalid JSON body" }));
    return;
  }

  const sourceDir = (body.sourceDir || "").replace(/\\/g, "/");
  if (!sourceDir) {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "sourceDir is required" }));
    return;
  }

  let dirContents;
  try {
    dirContents = readDirectory(sourceDir);
  } catch (err) {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: err.message }));
    return;
  }

  try {
    const indexHtmlContent = dirContents.indexHtml || null;
    const result = await analyzeSceneDeep(sourceDir, dirContents, indexHtmlContent, config);

    // Log extraction results for debugging
    console.log(`[analyze-deep] Extracted scene: ${result.sceneId} / ${result.sceneName}`);
    console.log(`[analyze-deep] API endpoints: ${result.apiEndpoints?.length || 0}`);
    console.log(`[analyze-deep] Column defs: ${result.columnDefs?.length || 0}`);

    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify(result));
  } catch (err) {
    console.error(`[analyze-deep] Error: ${err.message}`);
    res.writeHead(502, { "Content-Type": "application/json" });
    res.end(
      JSON.stringify({
        error: `Deep analysis failed: ${err.message}`,
        hint: "You can still use basic analysis or enter data manually",
      })
    );
  }
}

Step 3: Add route in server request handler

In the http.createServer handler, add the new route after /analyze:

} else if (pathname === "/analyze-deep" && req.method === "POST") {
  await handleAnalyzeDeep(req, res);
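For reference, the handler above has three outcomes: 200 with the extracted SceneInfo, 400 for a bad body or directory, and 502 when the LLM call fails. A client could build the request and interpret the response roughly as follows. Both function names are hypothetical, and the default address is an assumption taken from the "Verify UI loads" step of this plan.

```javascript
// Hypothetical helper: build the arguments for fetch() against POST /analyze-deep.
function buildAnalyzeDeepRequest(sourceDir, serverUrl = "http://127.0.0.1:3210") {
  return {
    url: `${serverUrl}/analyze-deep`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ sourceDir }),
    },
  };
}

// Hypothetical helper: map the handler's status codes to a simple result object.
function interpretAnalyzeDeepResponse(status, body) {
  if (status === 200) return { ok: true, sceneInfo: body };
  // 400: the request itself is wrong, retrying unchanged will not help
  if (status === 400) return { ok: false, retryable: false, message: body.error };
  // 502 (or anything else): the LLM call failed; basic analysis or manual entry still works
  return { ok: false, retryable: true, message: body.error, hint: body.hint };
}
```

Usage would be along the lines of `const { url, init } = buildAnalyzeDeepRequest(dir); const res = await fetch(url, init);` followed by `interpretAnalyzeDeepResponse(res.status, await res.json())`.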

Step 4: Verify syntax

Run: node -c frontend/scene-generator/server.js
Expected: No syntax errors

Step 5: Commit

git add frontend/scene-generator/server.js
git commit -m "feat(server): add /analyze-deep endpoint for deep extraction"

Task 4: Add --scene-info-json CLI Parameter

Files:

Modify: src/bin/sg_scene_generate.rs
Modify: src/generated_scene/generator.rs

Goal: Add --scene-info-json parameter to Rust CLI to receive pre-extracted scene info from the Node.js server.

Step 1: Add SceneInfoJson struct in generator.rs

In src/generated_scene/generator.rs, add after imports:

use std::collections::HashMap;

#[derive(Debug, Clone, serde::Deserialize)]
pub struct ApiEndpointJson {
    pub name: String,
    pub url: String,
    #[serde(default)]
    pub method: String,
    #[serde(default)]
    pub description: Option<String>,
}

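// The JSON keys are camelCase (dataFetch, dataTransform), so map them explicitly;
// without this attribute both fields would always deserialize as None.
#[derive(Debug, Clone, serde::Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct BusinessLogicJson {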
    #[serde(default)]
    pub data_fetch: Option<String>,
    #[serde(default)]
    pub data_transform: Option<String>,
}

#[derive(Debug, Clone, serde::Deserialize)]
pub struct SceneInfoJson {
    #[serde(rename = "sceneId")]
    pub scene_id: String,
    #[serde(rename = "sceneName")]
    pub scene_name: String,
    #[serde(rename = "sceneKind", default)]
    pub scene_kind: String,
    #[serde(rename = "sourceSystem", default)]
    pub source_system: Option<String>,
    #[serde(rename = "expectedDomain", default)]
    pub expected_domain: Option<String>,
    #[serde(rename = "targetUrl", default)]
    pub target_url: Option<String>,
    #[serde(rename = "apiEndpoints", default)]
    pub api_endpoints: Vec<ApiEndpointJson>,
    #[serde(rename = "staticParams", default)]
    pub static_params: HashMap<String, String>,
    #[serde(rename = "columnDefs", default)]
    pub column_defs: Vec<(String, String)>,
    #[serde(rename = "entryMethod", default)]
    pub entry_method: Option<String>,
    #[serde(rename = "businessLogic", default)]
    pub business_logic: Option<BusinessLogicJson>,
}

Step 2: Add scene_info_json field to GenerateSceneRequest

In src/generated_scene/generator.rs, modify GenerateSceneRequest:

#[derive(Debug, Clone)]
pub struct GenerateSceneRequest {
    pub source_dir: PathBuf,
    pub scene_id: String,
    pub scene_name: String,
    pub scene_kind: Option<SceneKind>,
    pub target_url: Option<String>,
    pub output_root: PathBuf,
    pub lessons_path: Option<PathBuf>,
    // NEW
    pub scene_info_json: Option<SceneInfoJson>,
}

Step 3: Modify browser_script function to use SceneInfo

Replace the existing browser_script function with enhanced version:

fn browser_script(scene_id: &str, analysis: &SceneSourceAnalysis, scene_info: Option<&SceneInfoJson>) -> String {
    // If we have scene info with business logic, generate enhanced script
    if let Some(info) = scene_info {
        if !info.api_endpoints.is_empty() || !info.column_defs.is_empty() {
            return browser_script_with_business_logic(scene_id, info);
        }
    }

    // Fallback to skeleton template
    browser_script_skeleton(scene_id, analysis)
}

fn browser_script_skeleton(scene_id: &str, _analysis: &SceneSourceAnalysis) -> String {
    // Keep existing skeleton template
    format!(
        "function normalizePayload(payload) {{
  if (typeof payload === 'string') {{
    try {{ return JSON.parse(payload); }} catch (_) {{ return {{}}; }}
  }}
  return payload && typeof payload === 'object' ? payload : {{}};
}}

async function buildBrowserEntrypointResult(args, deps = {{}}) {{
  const rows = typeof deps.collectRows === 'function'
    ? await deps.collectRows(args)
    : [{{
        org_label: args.org_label || '',
        org_code: args.org_code || '',
        period_mode: args.period_mode || '',
        period_value: args.period_value || '',
        value: ''
      }}];
  return {{
    type: 'report-artifact',
    report_name: '{}',
    status: rows.length > 0 ? 'ok' : 'empty',
    period: {{
      mode: args.period_mode,
      mode_code: args.period_mode_code,
      value: args.period_value,
      payload: normalizePayload(args.period_payload)
    }},
    org: {{ label: args.org_label, code: args.org_code }},
    column_defs: [
      ['org_label', '供电单位'],
      ['org_code', '供电单位编码'],
      ['period_mode', '统计周期类型'],
      ['period_value', '统计周期'],
      ['value', '采集值']
    ],
    columns: ['org_label', 'org_code', 'period_mode', 'period_value', 'value'],
    rows,
    counts: {{ detail_rows: rows.length }},
    partial_reasons: [],
    reasons: []
  }};
}}

if (typeof module !== 'undefined') {{
  module.exports = {{ buildBrowserEntrypointResult, normalizePayload }};
}}

if (typeof args !== 'undefined') {{
  return buildBrowserEntrypointResult(args);
}}
",
        scene_id
    )
}

fn browser_script_with_business_logic(scene_id: &str, info: &SceneInfoJson) -> String {
    // Generate API endpoints constant
    let api_endpoints_code = info.api_endpoints.iter()
        .map(|ep| format!("  {}: '{}',", ep.name, ep.url))
        .collect::<Vec<_>>()
        .join("\n");

    // Generate static params constant
    let static_params_code = info.static_params.iter()
        .map(|(k, v)| format!("  {}: '{}',", k, v))
        .collect::<Vec<_>>()
        .join("\n");

    // Generate column defs
    let column_defs_code = info.column_defs.iter()
        .map(|(field, label)| format!("  ['{}', '{}'],", field, label))
        .collect::<Vec<_>>()
        .join("\n");

    let columns_code = info.column_defs.iter()
        .map(|(field, _)| format!("'{}'", field))
        .collect::<Vec<_>>()
        .join(", ");

    let primary_api = info.api_endpoints.first()
        .map(|ep| ep.url.clone())
        .unwrap_or_else(|| "/api/data".to_string());

    let expected_domain = info.expected_domain.as_deref().unwrap_or("");

    format!(r#"// ===== Auto-generated section =====

const REPORT_NAME = '{scene_id}';
const EXPECTED_DOMAIN = '{expected_domain}';

// API endpoints
const API_ENDPOINTS = {{
{api_endpoints_code}
}};

// Static params
const STATIC_PARAMS = {{
{static_params_code}
}};

// Column definitions
const COLUMN_DEFS = [
{column_defs_code}
];
const COLUMNS = [{columns_code}];

// ===== Standard framework =====

function normalizePayload(payload) {{
  if (typeof payload === 'string') {{
    try {{ return JSON.parse(payload); }} catch (_) {{ return {{}}; }}
  }}
  return payload && typeof payload === 'object' ? payload : {{}};
}}

function validateArgs(args) {{
  const reasons = [];
  if (!args.org_code) reasons.push('missing org_code');
  if (!args.period_value) reasons.push('missing period_value');
  return reasons.length === 0 ? {{ ok: true }} : {{ ok: false, reasons }};
}}

function buildRequest(args) {{
  return {{
    orgCode: args.org_code,
    periodMode: args.period_mode,
    periodValue: args.period_value,
    ...STATIC_PARAMS
  }};
}}

// args is passed explicitly so these helpers also work when the script is loaded
// as a module, where no outer `args` binding exists.
function normalizeRows(rawRows, args) {{
  if (!Array.isArray(rawRows)) return [];
  return rawRows.map((row) => ({{
    org_label: row.orgLabel || row.org_label || '',
    org_code: row.orgCode || row.org_code || args.org_code || '',
    period_mode: args.period_mode || '',
    period_value: args.period_value || '',
    ...row
  }}));
}}

function buildArtifact(args, opts) {{
  return {{
    type: 'report-artifact',
    report_name: REPORT_NAME,
    status: opts.status || 'ok',
    period: {{
      mode: args.period_mode,
      mode_code: args.period_mode_code,
      value: args.period_value,
      payload: normalizePayload(args.period_payload)
    }},
    org: {{ label: args.org_label, code: args.org_code }},
    column_defs: COLUMN_DEFS,
    columns: COLUMNS,
    rows: opts.rows || [],
    counts: {{ detail_rows: (opts.rows || []).length }},
    partial_reasons: opts.partial_reasons || [],
    reasons: opts.reasons || []
  }};
}}

async function buildBrowserEntrypointResult(args, deps = defaultDeps()) {{
  // 1. Validate arguments
  const validation = validateArgs(args);
  if (!validation.ok) {{
    return buildArtifact(args, {{ status: 'blocked', reasons: validation.reasons }});
  }}

  // 2. Validate page context
  const pageValidation = deps.validatePageContext?.(args);
  if (!pageValidation?.ok) {{
    return buildArtifact(args, {{ status: 'blocked', reasons: ['page_context_mismatch'] }});
  }}

  // 3. Fetch data
  try {{
    const request = buildRequest(args);
    const response = await deps.queryData(request);
    const rows = normalizeRows(response.rows || response.data || [], args);

    return buildArtifact(args, {{
      status: rows.length > 0 ? 'ok' : 'empty',
      rows
    }});
  }} catch (error) {{
    return buildArtifact(args, {{ status: 'error', reasons: [error.message] }});
  }}
}}

// ===== Default dependency implementations =====

function defaultDeps() {{
  return {{
    validatePageContext(args) {{
      const host = globalThis.location?.hostname;
      return host === args.expected_domain || host === EXPECTED_DOMAIN
        ? {{ ok: true }}
        : {{ ok: false, reason: 'domain_mismatch' }};
    }},

    async queryData(request) {{
      // Call the real endpoint listed in API_ENDPOINTS
      if (typeof $ !== 'undefined' && typeof $.ajax === 'function') {{
        return new Promise((resolve, reject) => {{
          $.ajax({{
            url: API_ENDPOINTS.primary || '{primary_api}',
            type: 'POST',
            data: JSON.stringify(request),
            contentType: 'application/json',
            success: resolve,
            error: (xhr, status, err) => reject(new Error(`API failed: ${{err}}`)),
          }});
        }});
      }}
      // Fallback: fetch API
      if (typeof fetch === 'function') {{
        const response = await fetch(API_ENDPOINTS.primary || '{primary_api}', {{
          method: 'POST',
          headers: {{ 'Content-Type': 'application/json' }},
          body: JSON.stringify(request)
        }});
        return response.json();
      }}
      throw new Error('No HTTP client available');
    }},
  }};
}}

// ===== Module exports =====

if (typeof module !== 'undefined') {{
  module.exports = {{
    buildBrowserEntrypointResult,
    validateArgs,
    buildRequest,
    normalizeRows,
    COLUMN_DEFS,
    COLUMNS,
  }};
}}

if (typeof args !== 'undefined') {{
  return buildBrowserEntrypointResult(args);
}}
"#, scene_id = scene_id, expected_domain = expected_domain, api_endpoints_code = api_endpoints_code, static_params_code = static_params_code, column_defs_code = column_defs_code, columns_code = columns_code, primary_api = primary_api)
}
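One thing to watch in the template above: endpoint names, URLs, static param values, and column labels are interpolated directly between single quotes. A value containing a quote, backslash, or newline, or an endpoint name that is not a valid identifier (for example `get-list`), would yield a script that fails to parse. The quoting rule the renderer would need can be sketched in JavaScript, assuming a hypothetical helper `renderObjectLiteral`: emit both keys and values as JSON string literals, which are also valid JavaScript string literals.

```javascript
// Hypothetical helper: render [key, value] pairs as a JavaScript object literal,
// quoting both sides with JSON.stringify so any characters are safe.
function renderObjectLiteral(entries) {
  const lines = entries.map(
    ([key, value]) => `  ${JSON.stringify(key)}: ${JSON.stringify(value)},`
  );
  return `{\n${lines.join("\n")}\n}`;
}

// Values that would break naive single-quote interpolation
const source = renderObjectLiteral([
  ["get-list", "/api/list?name=O'Brien"],
  ["note", "line1\nline2"],
]);

// Evaluate the generated text to confirm it parses back to the same values
const roundTripped = new Function(`return ${source};`)();
```

On the Rust side, `serde_json::to_string` applied to each string should give the same quoting; treat that as a suggestion to verify rather than part of this plan.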

Step 4: Update generate_scene_package function

Modify generate_scene_package in generator.rs to pass scene_info:

pub fn generate_scene_package(
    request: GenerateSceneRequest,
) -> Result<PathBuf, GenerateSceneError> {
    let analysis = analyze_scene_source_with_hint(&request.source_dir, request.scene_kind.clone())?;
    // ... existing code ...

    write_file(
        &scripts_dir.join(format!("{tool_name}.js")),
        &browser_script(&request.scene_id, &analysis, request.scene_info_json.as_ref()),
    )?;

    // ... rest of function ...
}

Step 5: Add CLI parameter in sg_scene_generate.rs

Modify CliArgs struct:

struct CliArgs {
    source_dir: PathBuf,
    scene_id: String,
    scene_name: String,
    scene_kind: Option<SceneKind>,
    target_url: Option<String>,
    output_root: PathBuf,
    lessons_path: Option<PathBuf>,
    // NEW
    scene_info_json: Option<String>,
}

Add parsing in parse_args:

fn parse_args(args: impl Iterator<Item = String>) -> Result<CliArgs, String> {
    // ... existing code ...
    let mut scene_info_json = None;
    // ... in match block ...
    "--scene-info-json" => scene_info_json = Some(arg),
    // ...
}

Parse JSON in run:

fn run() -> Result<(), String> {
    let args = parse_args(env::args().skip(1))?;

    let scene_info = args.scene_info_json
        .map(|json| serde_json::from_str(&json))
        .transpose()
        .map_err(|e| format!("Invalid scene-info-json: {}", e))?;

    let skill_root = generate_scene_package(GenerateSceneRequest {
        source_dir: args.source_dir,
        scene_id: args.scene_id,
        scene_name: args.scene_name,
        scene_kind: args.scene_kind,
        target_url: args.target_url,
        output_root: args.output_root,
        lessons_path: args.lessons_path,
        scene_info_json: scene_info,
    })
    .map_err(|err| err.to_string())?;

    println!("generated scene package: {}", skill_root.display());
    Ok(())
}

Update usage:

fn usage() -> String {
    "usage: sg_scene_generate --source-dir <scenario-dir> --scene-id <scene-id> --scene-name <display-name> [--scene-kind <report_collection|monitoring>] [--target-url <url>] --output-root <skill-staging-root> [--lessons <lessons-toml>] [--scene-info-json '<json>']".to_string()
}
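On the Node.js side, the JSON has to reach this flag intact. A sketch of how the argument list might be assembled is shown below, using only the flags listed in the usage string above; `buildGeneratorArgs` and the shape of `params` are assumptions, not existing code in generator-runner.js.

```javascript
// Hypothetical helper: map request params to sg_scene_generate arguments.
function buildGeneratorArgs(params) {
  const args = [
    "--source-dir", params.sourceDir,
    "--scene-id", params.sceneId,
    "--scene-name", params.sceneName,
    "--output-root", params.outputRoot,
  ];
  if (params.sceneKind) args.push("--scene-kind", params.sceneKind);
  if (params.targetUrl) args.push("--target-url", params.targetUrl);
  if (params.lessons) args.push("--lessons", params.lessons);
  // Passed as a single array element, so no shell quoting of the JSON is needed
  if (params.sceneInfoJson) args.push("--scene-info-json", params.sceneInfoJson);
  return args;
}
```

Handing this array to `child_process.execFile` or `spawn` (without a shell) avoids quoting problems with the embedded JSON. Very large payloads may still run into operating-system limits on command-line length, so keeping the SceneInfo compact is advisable.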

Step 6: Verify Rust compilation

Run: cargo check
Expected: No compilation errors

Step 7: Commit

git add src/bin/sg_scene_generate.rs src/generated_scene/generator.rs
git commit -m "feat(rust): add --scene-info-json parameter for LLM extraction results"

Task 5: Update Web UI with Extraction Preview

Files:

Modify: frontend/scene-generator/sg_scene_generator.html

Goal: Add UI elements to show extraction results and allow user confirmation before generation.

Step 1: Add extraction results preview section

Add after the existing form section, a new collapsible panel for extraction preview:

<!-- Extraction result preview -->
<div id="extractionPreview" class="preview-panel" style="display: none; margin-top: 20px;">
  <div class="preview-header" onclick="togglePreview()">
    <h3>LLM 提取结果</h3>
    <span id="previewToggleIcon">▼</span>
  </div>
  <div id="previewContent" class="preview-content">
    <div class="preview-section">
      <h4>基本信息</h4>
      <div class="preview-row">
        <span class="label">场景 ID:</span>
        <span id="previewSceneId" class="value"></span>
      </div>
      <div class="preview-row">
        <span class="label">场景名称:</span>
        <span id="previewSceneName" class="value"></span>
      </div>
      <div class="preview-row">
        <span class="label">场景类型:</span>
        <span id="previewSceneKind" class="value"></span>
      </div>
      <div class="preview-row">
        <span class="label">目标域名:</span>
        <span id="previewExpectedDomain" class="value"></span>
      </div>
    </div>

    <div class="preview-section">
      <h4>API 端点 (<span id="previewApiCount">0</span>)</h4>
      <div id="previewApiEndpoints" class="preview-list"></div>
    </div>

    <div class="preview-section">
      <h4>列定义 (<span id="previewColumnCount">0</span>)</h4>
      <div id="previewColumnDefs" class="preview-list"></div>
    </div>

    <div class="preview-section">
      <h4>静态参数</h4>
      <pre id="previewStaticParams" class="preview-code"></pre>
    </div>

    <div class="preview-section">
      <h4>业务逻辑</h4>
      <div class="preview-row">
        <span class="label">数据获取:</span>
        <span id="previewDataFetch" class="value"></span>
      </div>
      <div class="preview-row">
        <span class="label">数据转换:</span>
        <span id="previewDataTransform" class="value"></span>
      </div>
    </div>
  </div>
</div>

Step 2: Add CSS for preview panel

Add styles:

.preview-panel {
  background: rgba(255, 255, 255, 0.05);
  border-radius: 12px;
  border: 1px solid rgba(255, 255, 255, 0.1);
  overflow: hidden;
}

.preview-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
  padding: 16px 20px;
  cursor: pointer;
  background: rgba(255, 255, 255, 0.03);
}

.preview-header h3 {
  margin: 0;
  font-size: 16px;
}

.preview-content {
  padding: 20px;
}

.preview-section {
  margin-bottom: 20px;
}

.preview-section h4 {
  margin: 0 0 10px 0;
  font-size: 14px;
  color: #a78bfa;
}

.preview-row {
  display: flex;
  margin-bottom: 8px;
}

.preview-row .label {
  width: 100px;
  color: #888;
  flex-shrink: 0;
}

.preview-row .value {
  color: #fff;
}

.preview-list {
  max-height: 150px;
  overflow-y: auto;
  background: rgba(0, 0, 0, 0.2);
  border-radius: 8px;
  padding: 10px;
}

.preview-list-item {
  padding: 6px 0;
  border-bottom: 1px solid rgba(255, 255, 255, 0.05);
}

.preview-code {
  background: rgba(0, 0, 0, 0.3);
  padding: 10px;
  border-radius: 8px;
  font-family: monospace;
  font-size: 12px;
  overflow-x: auto;
  white-space: pre-wrap;
}

Step 3: Add JavaScript for deep analysis and preview

let currentSceneInfo = null;

async function analyzeDeep() {
  const sourceDir = document.getElementById('sourceDir').value;
  if (!sourceDir) {
    alert('请先选择场景目录');
    return;
  }

  showStatus('正在深度分析...');

  try {
    const response = await fetch(`${SERVER_URL}/analyze-deep`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ sourceDir })
    });

    const data = await response.json();

    if (data.error) {
      showStatus('分析失败: ' + data.error);
      return;
    }

    currentSceneInfo = data;

    // Update form fields
    document.getElementById('sceneId').value = data.sceneId || '';
    document.getElementById('sceneName').value = data.sceneName || '';
    document.getElementById('sceneKind').value = data.sceneKind || 'report_collection';
    if (data.targetUrl) {
      document.getElementById('targetUrl').value = data.targetUrl;
    }

    // Show preview
    showExtractionPreview(data);
    showStatus('分析完成，请确认提取结果');

  } catch (err) {
    showStatus('分析失败: ' + err.message);
  }
}

function showExtractionPreview(data) {
  document.getElementById('previewSceneId').textContent = data.sceneId || '-';
  document.getElementById('previewSceneName').textContent = data.sceneName || '-';
  document.getElementById('previewSceneKind').textContent = data.sceneKind || '-';
  document.getElementById('previewExpectedDomain').textContent = data.expectedDomain || '-';

  // LLM output is untrusted: escape it before inserting it into innerHTML
  const esc = (value) => String(value ?? '').replace(/[&<>"']/g, (ch) => ({
    '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;'
  })[ch]);

  // API endpoints
  const apiList = document.getElementById('previewApiEndpoints');
  const apiCount = (data.apiEndpoints || []).length;
  document.getElementById('previewApiCount').textContent = apiCount;
  apiList.innerHTML = (data.apiEndpoints || []).map(ep => `
    <div class="preview-list-item">
      <strong>${esc(ep.name)}</strong>: ${esc(ep.url)}
      <span style="color: #888">[${esc(ep.method || 'GET')}]</span>
    </div>
  `).join('') || '<div style="color: #888">无 API 端点</div>';

  // Column defs
  const colList = document.getElementById('previewColumnDefs');
  const colCount = (data.columnDefs || []).length;
  document.getElementById('previewColumnCount').textContent = colCount;
  colList.innerHTML = (data.columnDefs || []).map(([field, label]) => `
    <div class="preview-list-item">
      <code>${esc(field)}</code> → ${esc(label)}
    </div>
  `).join('') || '<div style="color: #888">无列定义</div>';

  // Static params
  document.getElementById('previewStaticParams').textContent =
    JSON.stringify(data.staticParams || {}, null, 2) || '{}';

  // Business logic
  document.getElementById('previewDataFetch').textContent =
    data.businessLogic?.dataFetch || '-';
  document.getElementById('previewDataTransform').textContent =
    data.businessLogic?.dataTransform || '-';

  document.getElementById('extractionPreview').style.display = 'block';
}

function togglePreview() {
  const content = document.getElementById('previewContent');
  const icon = document.getElementById('previewToggleIcon');
  if (content.style.display === 'none') {
    content.style.display = 'block';
    icon.textContent = '▼';
  } else {
    content.style.display = 'none';
    icon.textContent = '▶';
  }
}

Step 4: Add the "深度分析" (Deep Analysis) button

Add a new button in the button group:

<button onclick="analyzeDeep()" class="btn btn-secondary">
  深度分析
</button>

Step 5: Update generate function to pass sceneInfo

Modify the generate function to include scene info JSON:

async function generate() {
  const params = {
    sourceDir: document.getElementById('sourceDir').value,
    sceneId: document.getElementById('sceneId').value,
    sceneName: document.getElementById('sceneName').value,
    sceneKind: document.getElementById('sceneKind').value,
    targetUrl: document.getElementById('targetUrl').value || null,
    outputRoot: document.getElementById('outputRoot').value,
    lessons: document.getElementById('lessons').value || null,
  };

  // Add scene info JSON if available
  if (currentSceneInfo) {
    params.sceneInfoJson = JSON.stringify(currentSceneInfo);
  }

  // ... rest of generate function ...
}
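
The user can still edit the form fields after the deep analysis has filled them in, so `currentSceneInfo` may disagree with the values that are actually submitted. The plan does not say which side should win. The sketch below assumes the form values take precedence and copies them into the scene info before it is serialized; `mergeSceneInfo` is a hypothetical helper name, not part of the existing code.

```javascript
// Hypothetical helper: reconcile the LLM-extracted scene info with the
// (possibly user-edited) form values before sending it to /generate.
// Assumption: form values win over extracted values for overlapping fields.
function mergeSceneInfo(sceneInfo, formValues) {
  if (!sceneInfo) return null;
  const merged = { ...sceneInfo };
  for (const key of ['sceneId', 'sceneName', 'sceneKind', 'targetUrl']) {
    const value = formValues[key];
    // Only override when the form actually holds a non-empty string.
    if (typeof value === 'string' && value.trim() !== '') {
      merged[key] = value.trim();
    }
  }
  return merged;
}

// Possible use inside generate(), in place of JSON.stringify(currentSceneInfo):
// const merged = mergeSceneInfo(currentSceneInfo, params);
// if (merged) params.sceneInfoJson = JSON.stringify(merged);
```

If the extracted values should win instead, the loop can simply be dropped and `currentSceneInfo` passed through unchanged, as in the step above.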

Step 6: Verify UI loads

Start the server and open the page in a browser:

cd frontend/scene-generator && node server.js

Open http://127.0.0.1:3210/

Expected: Page loads without JavaScript errors

Step 7: Commit

git add frontend/scene-generator/sg_scene_generator.html
git commit -m "feat(ui): add deep extraction preview panel with API/column/static-params display"

Task 6: Update generator-runner.js to Pass sceneInfoJson

Files:

Modify: frontend/scene-generator/generator-runner.js

Goal: Update runGenerator to pass sceneInfoJson parameter to Rust CLI.

Step 1: Modify runGenerator function

Update the function to accept and pass sceneInfoJson:

function runGenerator(params, sseWriter, projectRoot) {
  const { sourceDir, sceneId, sceneName, sceneKind, targetUrl, outputRoot, lessons, sceneInfoJson } = params;

  const normalize = (p) => p.replace(/\\/g, "/");

  const args = [
    "run",
    "--bin",
    "sg_scene_generate",
    "--",
    "--source-dir",
    normalize(sourceDir),
    "--scene-id",
    sceneId,
    "--scene-name",
    sceneName,
  ];

  if (sceneKind) {
    args.push("--scene-kind", sceneKind);
  }

  if (targetUrl) {
    args.push("--target-url", targetUrl);
  }

  args.push("--output-root", normalize(outputRoot));

  if (lessons) {
    args.push("--lessons", normalize(lessons));
  }

  // NEW: Pass scene info JSON
  if (sceneInfoJson) {
    args.push("--scene-info-json", sceneInfoJson);
  }

  // ... rest of function unchanged ...
}

Step 2: Update server.js handleGenerate

Ensure handleGenerate passes the new parameter:

async function handleGenerate(req, res) {
  let body;
  try {
    body = await parseBody(req);
  } catch {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "Invalid JSON body" }));
    return;
  }

  const { sourceDir, sceneId, sceneName, sceneKind, targetUrl, outputRoot, lessons, sceneInfoJson } = body;
  if (!sourceDir || !sceneId || !sceneName || !outputRoot) {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(
      JSON.stringify({
        error: "All fields required: sourceDir, sceneId, sceneName, outputRoot",
      })
    );
    return;
  }

  const sseWriter = initSSE(res);

  try {
    await runGenerator(
      { sourceDir, sceneId, sceneName, sceneKind, targetUrl, outputRoot, lessons, sceneInfoJson },
      sseWriter,
      config.projectRoot
    );
  } catch (err) {
    writeSSE(sseWriter, "error", { message: `Server error: ${err.message}` });
  }

  sseWriter.end();
}
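
Because `sceneInfoJson` arrives from the browser and is forwarded verbatim as a command-line argument, it may be worth rejecting malformed values before the Rust process is spawned. The following is a minimal sketch, assuming the shape described in this plan (an object with an optional `apiEndpoints` array and `columnDefs` as `[field, label]` pairs). `validateSceneInfoJson` is a hypothetical helper, and the Rust side may apply different or stricter rules.

```javascript
// Hypothetical pre-flight check for the sceneInfoJson string.
// Returns null when the value looks acceptable, or an error message otherwise.
function validateSceneInfoJson(sceneInfoJson) {
  // The parameter is optional, so a missing value is fine.
  if (sceneInfoJson === undefined || sceneInfoJson === null) return null;
  if (typeof sceneInfoJson !== 'string') return 'sceneInfoJson must be a JSON string';

  let info;
  try {
    info = JSON.parse(sceneInfoJson);
  } catch (err) {
    return 'sceneInfoJson is not valid JSON: ' + err.message;
  }
  if (info === null || typeof info !== 'object' || Array.isArray(info)) {
    return 'sceneInfoJson must encode a JSON object';
  }
  if (info.apiEndpoints !== undefined && !Array.isArray(info.apiEndpoints)) {
    return 'apiEndpoints must be an array';
  }
  if (info.columnDefs !== undefined) {
    // Each entry must be a two-element array of strings: [field, label].
    const isPair = (p) =>
      Array.isArray(p) && p.length === 2 && p.every((s) => typeof s === 'string');
    if (!Array.isArray(info.columnDefs) || !info.columnDefs.every(isPair)) {
      return 'columnDefs must be an array of [field, label] string pairs';
    }
  }
  return null;
}
```

In `handleGenerate`, such a check could run right after the required-field check and answer with a 400 response carrying the returned message, so that a bad payload never reaches `runGenerator`.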

Step 3: Verify syntax

Run: node -c frontend/scene-generator/generator-runner.js && node -c frontend/scene-generator/server.js

Expected: No syntax errors

Step 4: Commit

git add frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js
git commit -m "feat(runner): pass sceneInfoJson to Rust CLI for enhanced template rendering"

Task 7: End-to-End Verification

Files:

All modified files

Goal: Verify the complete flow works from UI to Rust CLI.

Step 1: Build Rust binary

cargo build --release --bin sg_scene_generate

Expected: Build succeeds

Step 2: Start the server

cd frontend/scene-generator && node server.js

Expected: Server starts on port 3210

Step 3: Test health endpoint

curl http://127.0.0.1:3210/health

Expected: {"status":"ok",...}

Step 4: Test analyze-deep endpoint with real scene

Use a real scene directory with index.html:

curl -X POST http://127.0.0.1:3210/analyze-deep \
  -H "Content-Type: application/json" \
  -d '{"sourceDir": "D:/path/to/scene/with/index.html"}'

Expected: JSON response with sceneId, sceneName, apiEndpoints, columnDefs

Step 5: Test full generation flow

Open a browser at http://127.0.0.1:3210/
Select a scene directory that contains index.html
Click the "深度分析" (Deep Analysis) button
Verify that the preview shows the extracted API and column data
Click the "生成" (Generate) button
Verify that the generated script contains the extracted API endpoints and column definitions

Step 6: Compare generated script

Compare the generated script with the reference:

Before: 51 lines (skeleton)
After: Should have API_ENDPOINTS, COLUMN_DEFS constants populated
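
This comparison can be partly automated. The sketch below only checks that the two constants named above appear in the generated script and are not assigned an empty literal. The declaration syntax (`const NAME = <literal>;`) is an assumption about the template output, and `checkGeneratedScript` is a hypothetical helper.

```javascript
// Hypothetical check: does the generated script declare non-empty
// API_ENDPOINTS and COLUMN_DEFS constants?
// Assumption: the template emits them as `const NAME = <literal>;`.
function checkGeneratedScript(scriptText) {
  const problems = [];
  for (const name of ['API_ENDPOINTS', 'COLUMN_DEFS']) {
    // Capture everything between `=` and the first semicolon.
    const pattern = new RegExp('const\\s+' + name + '\\s*=\\s*([\\s\\S]*?);');
    const match = scriptText.match(pattern);
    if (!match) {
      problems.push(name + ' is not declared');
    } else if (/^(\[\s*\]|\{\s*\})$/.test(match[1].trim())) {
      // An empty array or object literal means nothing was extracted.
      problems.push(name + ' is empty');
    }
  }
  return problems;
}
```

Reading the generated file with `fs.readFileSync(path, 'utf8')` and passing the text to this function returns an empty array when both constants are populated, and a list of problems otherwise.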

Step 7: Final commit

git add -A
git commit -m "feat: complete LLM-driven skill generation with deep extraction

- Add /analyze-deep endpoint for deep LLM extraction
- Extract apiEndpoints, staticParams, columnDefs from index.html
- Pass extraction results via --scene-info-json to Rust CLI
- Generate complete browser_script with business logic constants
- Add UI preview panel for extraction results
"

Self-Review

1. Spec Coverage

Spec Requirement	Task
LLM reads index.html	Task 1 (buildDeepAnalyzePrompt), Task 2 (readDirectory)
Extract apiEndpoints	Task 1 (DEEP_SYSTEM_PROMPT, analyzeSceneDeep)
Extract staticParams	Task 1 (DEEP_SYSTEM_PROMPT, analyzeSceneDeep)
Extract columnDefs	Task 1 (DEEP_SYSTEM_PROMPT, analyzeSceneDeep)
Extract businessLogic	Task 1 (DEEP_SYSTEM_PROMPT, analyzeSceneDeep)
--scene-info-json CLI parameter	Task 4
Enhanced template rendering	Task 4 (browser_script_with_business_logic)
Web UI preview	Task 5
User confirmation before generation	Task 5 (extraction preview)

All covered.

2. Placeholder Scan

No TBD/TODO/"implement later"/"add tests"/"similar to" patterns found.

3. Type Consistency

/analyze-deep: { sourceDir } → SceneInfoJson — consistent in Tasks 1, 3, 5
/generate: { ..., sceneInfoJson } — consistent in Tasks 5, 6
SceneInfoJson struct fields match JavaScript extraction output — consistent in Task 1, 4
Column defs: Vec<(String, String)> matches [[field, label]] — consistent

All consistent.

4. Backward Compatibility

Existing /analyze endpoint unchanged
Existing CLI arguments (--scene-id, --scene-name) still work
--scene-info-json is optional, falls back to skeleton template
index.html reading is optional, falls back if not present
