claw/docs/superpowers/plans/2026-04-17-llm-driven-skill-generation-plan.md

# LLM-Driven Skill Generation Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Enhance `sg_scene_generate` to generate complete, runnable skill packages instead of skeleton code. An LLM analyzes the scene source code (`index.html`) in depth to extract API endpoints, static params, column definitions, and business logic.

**Architecture:**
- The Node.js server reads `index.html` from the scene directory and sends it to the LLM
- The LLM extracts a complete SceneInfo (sceneId, sceneName, apiEndpoints, staticParams, columnDefs, businessLogic)
- The Web UI shows a preview for user confirmation
- The Rust CLI receives the extracted info via the `--scene-info-json` parameter
- The Rust template renders a complete browser_script with business logic
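
To make the data flow concrete, here is a minimal sketch of a SceneInfo payload and a shape check that could run before the payload is handed to the Rust CLI. The field names come from this plan; the sample values and the helper name `checkSceneInfo` are hypothetical.

```javascript
// Hypothetical sample; only the field names are taken from this plan.
const sampleSceneInfo = {
  sceneId: "demo-report",
  sceneName: "Demo Report",
  sceneKind: "report_collection",
  apiEndpoints: [{ name: "primary", url: "/api/demo/query", method: "POST", description: "demo" }],
  staticParams: { pageSize: "100" },
  columnDefs: [["org_code", "Org code"], ["value", "Value"]],
  businessLogic: { dataFetch: "POST query", dataTransform: "flatten rows" },
};

// Hypothetical helper: returns a list of problems; an empty list means the shape looks usable.
function checkSceneInfo(info) {
  if (!info || typeof info !== "object") return ["not an object"];
  const problems = [];
  for (const key of ["sceneId", "sceneName"]) {
    if (typeof info[key] !== "string" || info[key] === "") problems.push(`missing ${key}`);
  }
  if (!["report_collection", "monitoring"].includes(info.sceneKind)) {
    problems.push("unknown sceneKind");
  }
  if (!Array.isArray(info.apiEndpoints)) problems.push("apiEndpoints must be an array");
  const cols = info.columnDefs;
  if (!Array.isArray(cols) || !cols.every((c) => Array.isArray(c) && c.length === 2)) {
    problems.push("columnDefs must be [field, label] pairs");
  }
  return problems;
}
```

An empty result from `checkSceneInfo` would be a reasonable precondition for enabling the generate button in the preview step.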

**Tech Stack:** JavaScript (Node.js), Rust, HTML/CSS, OpenAI-compatible LLM API

---

## Scope Check

This plan covers enhancing the existing scene skill generator to support LLM-driven deep extraction. It builds on:
- Existing `frontend/scene-generator/` files (server.js, llm-client.js, generator-runner.js)
- Existing `src/generated_scene/generator.rs` and `src/bin/sg_scene_generate.rs`

---

## File Map

### Modified Files

| File | Changes |
|------|---------|
| `frontend/scene-generator/llm-client.js` | Add deep extraction prompt + `analyzeSceneDeep()` |
| `frontend/scene-generator/generator-runner.js` | Add `index.html` reading in `readDirectory()` |
| `frontend/scene-generator/server.js` | New `/analyze-deep` route, pass sceneInfo to generator |
| `src/bin/sg_scene_generate.rs` | Add `--scene-info-json` CLI parameter |
| `src/generated_scene/generator.rs` | Add SceneInfo struct, enhanced template rendering |
| `frontend/scene-generator/sg_scene_generator.html` | Add extraction preview UI |

### Reference Files (not modified)

| File | Purpose |
|------|---------|
| `docs/superpowers/specs/2026-04-17-llm-driven-skill-generation-design.md` | Design spec |
| `claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_tq_lineloss_report.js` | Reference complete script (433 lines) |
| `claw/skills/skill_staging/skills/marketing-zero-consumer-report/scripts/collect_marketing_zero_consumer_report.js` | Reference skeleton (51 lines) |

---

## Scope Guardrails

- Do not change existing API contracts; they must stay backward compatible
- Do not require `index.html` to exist (fall back to the current behavior when it is missing)
- Do not break existing `--scene-id`, `--scene-name` CLI arguments
- Do not add npm dependencies (only Node.js built-in modules)

---

### Task 1: Enhance llm-client.js with Deep Extraction

**Files:**
- Modify: `frontend/scene-generator/llm-client.js`

**Goal:** Add a new function `analyzeSceneDeep()` that reads index.html content and extracts complete SceneInfo including API endpoints, static params, column definitions, and business logic.

- [ ] **Step 1: Add DEEP_SYSTEM_PROMPT constant**

Add after the existing `SYSTEM_PROMPT` constant in `llm-client.js`:

```javascript
const DEEP_SYSTEM_PROMPT = `你是一个场景代码分析专家。分析场景源码，提取关键业务信息。

## 分析目标

1. **API 端点**: 识别所有 HTTP 请求地址 (URL, method, 用途)
2. **静态参数**: 识别硬编码的业务参数 (key-value pairs)
3. **列定义**: 识别数据表格/导出的列配置 ([field, label] pairs)
4. **业务逻辑**: 理解数据获取和转换流程
5. **场景类型**: 判断是 report_collection 还是 monitoring

## 输出格式

请以 JSON 格式返回：
{
  "sceneId": "string - 场景标识 (英文短横线)",
  "sceneName": "string - 场景中文名",
  "sceneKind": "report_collection | monitoring",
  "sourceSystem": "string - 来源系统名 (可选)",
  "expectedDomain": "string - 目标域名 (可选)",
  "targetUrl": "string | null - 目标页面URL",
  "apiEndpoints": [
    {"name": "string", "url": "string", "method": "GET|POST", "description": "string"}
  ],
  "staticParams": {"key": "value"},
  "columnDefs": [["fieldName", "中文列名"]],
  "entryMethod": "string - 入口方法名",
  "businessLogic": {
    "dataFetch": "string - 数据获取逻辑描述",
    "dataTransform": "string - 数据转换逻辑描述"
  }
}`;
```

- [ ] **Step 2: Add buildDeepAnalyzePrompt function**

Add after `buildAnalyzePrompt` function:

```javascript
function buildDeepAnalyzePrompt(sourceDir, dirContents, indexHtmlContent) {
  const parts = [];

  parts.push(`=== 目录结构 ===`);
  parts.push(dirContents.tree || "(empty)");

  if (dirContents["scene.toml"]) {
    parts.push(`\n=== scene.toml ===`);
    parts.push(dirContents["scene.toml"]);
  }

  if (dirContents["SKILL.toml"]) {
    parts.push(`\n=== SKILL.toml ===`);
    parts.push(dirContents["SKILL.toml"]);
  }

  if (dirContents["SKILL.md"]) {
    parts.push(`\n=== SKILL.md ===`);
    parts.push(dirContents["SKILL.md"]);
  }

  // Include index.html content (key addition)
  if (indexHtmlContent) {
    parts.push(`\n=== index.html ===`);
    // Limit to first 15000 chars to avoid token limits
    parts.push(indexHtmlContent.substring(0, 15000));
  }

  if (dirContents.scripts && Object.keys(dirContents.scripts).length > 0) {
    parts.push(`\n=== 脚本文件 ===`);
    for (const [name, content] of Object.entries(dirContents.scripts)) {
      parts.push(`\n--- ${name} ---`);
      parts.push(content.substring(0, 3000));
    }
  }

  return `以下是场景目录 "${sourceDir}" 的内容：\n\n${parts.join("\n")}\n\n请分析以上代码，提取完整的场景信息。`;
}
```
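
Truncating `index.html` at 15,000 characters can cut off inline scripts near the end of the file, which is often where API calls and column configs live. One possible refinement, sketched here under that assumption, is to put inline `<script>` bodies first and fill the remaining budget with markup. `prioritizeScripts` is a hypothetical helper, not part of the existing `llm-client.js`.

```javascript
// Hypothetical helper: keep inline <script> bodies ahead of markup within a character budget.
function prioritizeScripts(html, limit = 15000) {
  const scriptRe = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  const scripts = [];
  let match;
  while ((match = scriptRe.exec(html)) !== null) {
    const body = match[1].trim();
    if (body) scripts.push(body); // external scripts (src=...) have empty bodies and are skipped
  }
  const scriptText = scripts.join("\n\n");
  if (!scriptText) return html.substring(0, limit);

  const separator = "\n\n";
  const remaining = limit - scriptText.length - separator.length;
  if (remaining <= 0) return scriptText.substring(0, limit);

  // Fill what is left of the budget with the markup, scripts removed
  const markup = html.replace(scriptRe, "").substring(0, remaining);
  return scriptText + separator + markup;
}
```

If adopted, the call in `buildDeepAnalyzePrompt` would become `parts.push(prioritizeScripts(indexHtmlContent))` in place of the plain `substring`.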

- [ ] **Step 3: Add extractSceneInfo function**

Add after `extractJsonFromResponse` function:

```javascript
function extractSceneInfo(text) {
  // Try code block first
  const codeBlockMatch = text.match(/```(?:json)?\s*\n([\s\S]*?)\n```/);
  if (codeBlockMatch) {
    try {
      return JSON.parse(codeBlockMatch[1]);
    } catch (e) {
      // fall through
    }
  }

  // Try to find JSON object with sceneId
  const jsonMatch = text.match(/\{[\s\S]*"sceneId"[\s\S]*\}/);
  if (jsonMatch) {
    try {
      return JSON.parse(jsonMatch[0]);
    } catch (e) {
      // fall through
    }
  }

  // Last resort: parse entire text
  try {
    return JSON.parse(text);
  } catch (e) {
    throw new Error("Failed to extract valid SceneInfo JSON from LLM response");
  }
}
```
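
The second strategy uses a greedy match, so it can swallow trailing prose when the model adds text containing `}` after the JSON. A brace-balanced scan is one way to tighten it. This is a sketch with a hypothetical helper name, `findFirstJsonObject`; it is not part of the current `llm-client.js`.

```javascript
// Hypothetical helper: return the first parseable top-level {...} object in `text`, or null.
function findFirstJsonObject(text) {
  for (let start = text.indexOf("{"); start !== -1; start = text.indexOf("{", start + 1)) {
    let depth = 0;
    let inString = false;
    let escaped = false;
    for (let i = start; i < text.length; i++) {
      const ch = text[i];
      if (inString) {
        // Track escapes so a quote or brace inside a string is not counted
        if (escaped) escaped = false;
        else if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') inString = true;
      else if (ch === "{") depth++;
      else if (ch === "}" && --depth === 0) {
        try {
          return JSON.parse(text.slice(start, i + 1));
        } catch (_) {
          break; // not valid JSON from this start; try the next "{"
        }
      }
    }
  }
  return null;
}
```

A caller would still need to check that the returned object has a `sceneId` before accepting it, as the regex-based version does implicitly.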

- [ ] **Step 4: Add analyzeSceneDeep function**

Add after `analyzeScene` function:

```javascript
function analyzeSceneDeep(sourceDir, dirContents, indexHtmlContent, { apiKey, baseUrl, model }) {
  const userPrompt = buildDeepAnalyzePrompt(sourceDir, dirContents, indexHtmlContent);

  const requestBody = JSON.stringify({
    model,
    messages: [
      { role: "system", content: DEEP_SYSTEM_PROMPT },
      { role: "user", content: userPrompt },
    ],
    temperature: 0.1,
    max_tokens: 2048, // Increased for detailed response
  });

  return new Promise((resolve, reject) => {
    const url = new URL(baseUrl.replace(/\/v1\/?$/, "") + "/v1/chat/completions");
    const options = {
      hostname: url.hostname,
      port: url.port || (url.protocol === "https:" ? 443 : 80),
      path: url.pathname,
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
        "Content-Length": Buffer.byteLength(requestBody),
      },
    };

    const httpModule = url.protocol === "https:" ? https : http;
    const req = httpModule.request(options, (res) => {
      let data = "";
      res.on("data", (chunk) => (data += chunk));
      res.on("end", () => {
        if (res.statusCode !== 200) {
          return reject(new Error(`LLM API error ${res.statusCode}: ${data}`));
        }

        try {
          const parsed = JSON.parse(data);
          const content = parsed.choices?.[0]?.message?.content;
          if (!content) return reject(new Error("LLM returned empty response"));
          const result = extractSceneInfo(content);

          // Validate required fields
          if (!result.sceneId || !result.sceneName) {
            return reject(new Error(`LLM response missing sceneId/sceneName: ${content}`));
          }

          // Set defaults for optional fields
          result.sceneKind = result.sceneKind || "report_collection";
          result.apiEndpoints = result.apiEndpoints || [];
          result.staticParams = result.staticParams || {};
          result.columnDefs = result.columnDefs || [];
          result.businessLogic = result.businessLogic || {};

          resolve(result);
        } catch (err) {
          reject(new Error(`Failed to parse LLM response: ${err.message}`));
        }
      });
    });

    req.on("error", reject);
    req.setTimeout(60000, () => {
      req.destroy(new Error("LLM API request timed out"));
    });

    req.write(requestBody);
    req.end();
  });
}
```
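
The Rust side (Task 4) deserializes `staticParams` into `HashMap<String, String>` and `columnDefs` into `Vec<(String, String)>`, so a numeric param or a malformed column entry from the model would make `--scene-info-json` fail to parse. A minimal sketch of coercing the result before resolving it, assuming a hypothetical helper `coerceForCli`:

```javascript
// Hypothetical helper: make an LLM result match the string-only shapes the Rust CLI expects.
function coerceForCli(result) {
  const staticParams = {};
  for (const [key, value] of Object.entries(result.staticParams || {})) {
    if (value === null || value === undefined) continue; // drop empty values
    // Nested objects/arrays are kept as JSON text; scalars become plain strings
    staticParams[key] = typeof value === "object" ? JSON.stringify(value) : String(value);
  }
  const columnDefs = (result.columnDefs || [])
    .filter((pair) => Array.isArray(pair) && pair.length === 2)
    .map(([field, label]) => [String(field), String(label)]);
  return { ...result, staticParams, columnDefs };
}
```

Calling `resolve(coerceForCli(result))` instead of `resolve(result)` would be one place to apply it.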

- [ ] **Step 5: Add http module import and update exports**

At the top of the file, add `http` import alongside `https`:

```javascript
const http = require("http");
const https = require("https");
```

Update the exports at the bottom:

```javascript
module.exports = {
  buildAnalyzePrompt,
  extractJsonFromResponse,
  analyzeScene,
  // New exports
  buildDeepAnalyzePrompt,
  extractSceneInfo,
  analyzeSceneDeep,
};
```

- [ ] **Step 6: Verify syntax**

Run: `node -c frontend/scene-generator/llm-client.js`
Expected: No syntax errors

- [ ] **Step 7: Commit**

```bash
git add frontend/scene-generator/llm-client.js
git commit -m "feat(llm-client): add deep extraction with apiEndpoints, staticParams, columnDefs"
```

---

### Task 2: Enhance generator-runner.js to Read index.html

**Files:**
- Modify: `frontend/scene-generator/generator-runner.js`

**Goal:** Modify `readDirectory()` to also read `index.html` content.

- [ ] **Step 1: Add index.html reading in readDirectory function**

Locate the `readDirectory` function and add index.html reading after the SKILL.md section:

```javascript
// After the SKILL.md reading section, add:

const indexHtmlPath = p.join(sourceDir, "index.html");
if (fs.existsSync(indexHtmlPath)) {
  result.indexHtml = fs.readFileSync(indexHtmlPath, "utf-8");
}
```

The complete modified function should look like:

```javascript
function readDirectory(sourceDir) {
  const fs = require("fs");
  const p = require("path");

  if (!fs.existsSync(sourceDir)) {
    throw new Error(`Directory not found: ${sourceDir}`);
  }

  const stat = fs.statSync(sourceDir);
  if (!stat.isDirectory()) {
    throw new Error(`Not a directory: ${sourceDir}`);
  }

  const result = {};
  const entries = fs.readdirSync(sourceDir, { withFileTypes: true });

  const treeLines = [];
  for (const entry of entries) {
    treeLines.push(`├── ${entry.name}`);
  }
  result.tree = treeLines.join("\n");

  const sceneTomlPath = p.join(sourceDir, "scene.toml");
  if (fs.existsSync(sceneTomlPath)) {
    result["scene.toml"] = fs.readFileSync(sceneTomlPath, "utf-8");
  }

  const skillTomlPath = p.join(sourceDir, "SKILL.toml");
  if (fs.existsSync(skillTomlPath)) {
    result["SKILL.toml"] = fs.readFileSync(skillTomlPath, "utf-8");
  }

  const skillMdPath = p.join(sourceDir, "SKILL.md");
  if (fs.existsSync(skillMdPath)) {
    result["SKILL.md"] = fs.readFileSync(skillMdPath, "utf-8");
  }

  // NEW: Read index.html
  const indexHtmlPath = p.join(sourceDir, "index.html");
  if (fs.existsSync(indexHtmlPath)) {
    result.indexHtml = fs.readFileSync(indexHtmlPath, "utf-8");
  }

  const scripts = {};
  for (const entry of entries) {
    if (entry.isFile() && entry.name.endsWith(".js")) {
      const scriptPath = p.join(sourceDir, entry.name);
      scripts[entry.name] = fs.readFileSync(scriptPath, "utf-8");
    }
  }
  if (Object.keys(scripts).length > 0) {
    result.scripts = scripts;
  }

  return result;
}
```

- [ ] **Step 2: Verify syntax**

Run: `node -c frontend/scene-generator/generator-runner.js`
Expected: No syntax errors

- [ ] **Step 3: Commit**

```bash
git add frontend/scene-generator/generator-runner.js
git commit -m "feat(generator-runner): read index.html in readDirectory()"
```

---

### Task 3: Add /analyze-deep Route in server.js

**Files:**
- Modify: `frontend/scene-generator/server.js`

**Goal:** Add a new `/analyze-deep` endpoint that calls the deep extraction LLM function.

- [ ] **Step 1: Update llm-client import**

Change the import line at the top:

```javascript
const { analyzeScene, analyzeSceneDeep } = require("./llm-client");
```

- [ ] **Step 2: Add handleAnalyzeDeep function**

Add after the existing `handleAnalyze` function:

```javascript
async function handleAnalyzeDeep(req, res) {
  let body;
  try {
    body = await parseBody(req);
  } catch {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "Invalid JSON body" }));
    return;
  }

  const sourceDir = (body.sourceDir || "").replace(/\\/g, "/");
  if (!sourceDir) {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "sourceDir is required" }));
    return;
  }

  let dirContents;
  try {
    dirContents = readDirectory(sourceDir);
  } catch (err) {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: err.message }));
    return;
  }

  try {
    const indexHtmlContent = dirContents.indexHtml || null;
    const result = await analyzeSceneDeep(sourceDir, dirContents, indexHtmlContent, config);

    // Log extraction results for debugging
    console.log(`[analyze-deep] Extracted scene: ${result.sceneId} / ${result.sceneName}`);
    console.log(`[analyze-deep] API endpoints: ${result.apiEndpoints?.length || 0}`);
    console.log(`[analyze-deep] Column defs: ${result.columnDefs?.length || 0}`);

    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify(result));
  } catch (err) {
    console.error(`[analyze-deep] Error: ${err.message}`);
    res.writeHead(502, { "Content-Type": "application/json" });
    res.end(
      JSON.stringify({
        error: `Deep analysis failed: ${err.message}`,
        hint: "You can still use basic analysis or enter data manually",
      })
    );
  }
}
```

- [ ] **Step 3: Add route in server request handler**

In the `http.createServer` handler, add the new route after `/analyze`:

```javascript
} else if (pathname === "/analyze-deep" && req.method === "POST") {
  await handleAnalyzeDeep(req, res);
```

- [ ] **Step 4: Verify syntax**

Run: `node -c frontend/scene-generator/server.js`
Expected: No syntax errors

- [ ] **Step 5: Commit**

```bash
git add frontend/scene-generator/server.js
git commit -m "feat(server): add /analyze-deep endpoint for deep extraction"
```

---

### Task 4: Add --scene-info-json CLI Parameter

**Files:**
- Modify: `src/bin/sg_scene_generate.rs`
- Modify: `src/generated_scene/generator.rs`

**Goal:** Add a `--scene-info-json` parameter to the Rust CLI so it can receive pre-extracted scene info from the Node.js server.

- [ ] **Step 1: Add SceneInfoJson struct in generator.rs**

In `src/generated_scene/generator.rs`, add after imports:

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, serde::Deserialize)]
pub struct ApiEndpointJson {
    pub name: String,
    pub url: String,
    #[serde(default)]
    pub method: String,
    #[serde(default)]
    pub description: Option<String>,
}

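#[derive(Debug, Clone, serde::Deserialize)]
pub struct BusinessLogicJson {
    // The LLM emits camelCase keys; without the rename these fields silently stay None
    #[serde(rename = "dataFetch", default)]
    pub data_fetch: Option<String>,
    #[serde(rename = "dataTransform", default)]
    pub data_transform: Option<String>,
}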

#[derive(Debug, Clone, serde::Deserialize)]
pub struct SceneInfoJson {
    #[serde(rename = "sceneId")]
    pub scene_id: String,
    #[serde(rename = "sceneName")]
    pub scene_name: String,
    #[serde(rename = "sceneKind", default)]
    pub scene_kind: String,
    #[serde(rename = "sourceSystem", default)]
    pub source_system: Option<String>,
    #[serde(rename = "expectedDomain", default)]
    pub expected_domain: Option<String>,
    #[serde(rename = "targetUrl", default)]
    pub target_url: Option<String>,
    #[serde(rename = "apiEndpoints", default)]
    pub api_endpoints: Vec<ApiEndpointJson>,
    #[serde(rename = "staticParams", default)]
    pub static_params: HashMap<String, String>,
    #[serde(rename = "columnDefs", default)]
    pub column_defs: Vec<(String, String)>,
    #[serde(rename = "entryMethod", default)]
    pub entry_method: Option<String>,
    #[serde(rename = "businessLogic", default)]
    pub business_logic: Option<BusinessLogicJson>,
}
```

- [ ] **Step 2: Add scene_info_json field to GenerateSceneRequest**

In `src/generated_scene/generator.rs`, modify `GenerateSceneRequest`:

```rust
#[derive(Debug, Clone)]
pub struct GenerateSceneRequest {
    pub source_dir: PathBuf,
    pub scene_id: String,
    pub scene_name: String,
    pub scene_kind: Option<SceneKind>,
    pub target_url: Option<String>,
    pub output_root: PathBuf,
    pub lessons_path: Option<PathBuf>,
    // NEW
    pub scene_info_json: Option<SceneInfoJson>,
}
```

- [ ] **Step 3: Modify browser_script function to use SceneInfo**

Replace the existing `browser_script` function with an enhanced version:

```rust
fn browser_script(scene_id: &str, analysis: &SceneSourceAnalysis, scene_info: Option<&SceneInfoJson>) -> String {
    // If we have scene info with business logic, generate enhanced script
    if let Some(info) = scene_info {
        if !info.api_endpoints.is_empty() || !info.column_defs.is_empty() {
            return browser_script_with_business_logic(scene_id, info);
        }
    }

    // Fallback to skeleton template
    browser_script_skeleton(scene_id, analysis)
}

fn browser_script_skeleton(scene_id: &str, _analysis: &SceneSourceAnalysis) -> String {
    // Keep existing skeleton template
    format!(
        "function normalizePayload(payload) {{
  if (typeof payload === 'string') {{
    try {{ return JSON.parse(payload); }} catch (_) {{ return {{}}; }}
  }}
  return payload && typeof payload === 'object' ? payload : {{}};
}}

async function buildBrowserEntrypointResult(args, deps = {{}}) {{
  const rows = typeof deps.collectRows === 'function'
    ? await deps.collectRows(args)
    : [{{
        org_label: args.org_label || '',
        org_code: args.org_code || '',
        period_mode: args.period_mode || '',
        period_value: args.period_value || '',
        value: ''
      }}];
  return {{
    type: 'report-artifact',
    report_name: '{}',
    status: rows.length > 0 ? 'ok' : 'empty',
    period: {{
      mode: args.period_mode,
      mode_code: args.period_mode_code,
      value: args.period_value,
      payload: normalizePayload(args.period_payload)
    }},
    org: {{ label: args.org_label, code: args.org_code }},
    column_defs: [
      ['org_label', '供电单位'],
      ['org_code', '供电单位编码'],
      ['period_mode', '统计周期类型'],
      ['period_value', '统计周期'],
      ['value', '采集值']
    ],
    columns: ['org_label', 'org_code', 'period_mode', 'period_value', 'value'],
    rows,
    counts: {{ detail_rows: rows.length }},
    partial_reasons: [],
    reasons: []
  }};
}}

if (typeof module !== 'undefined') {{
  module.exports = {{ buildBrowserEntrypointResult, normalizePayload }};
}}

if (typeof args !== 'undefined') {{
  return buildBrowserEntrypointResult(args);
}}
",
        scene_id
    )
}

fn browser_script_with_business_logic(scene_id: &str, info: &SceneInfoJson) -> String {
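    // JSON string literals are also valid JS string literals, so serializing each
    // key and value keeps quotes, backslashes, and non-identifier endpoint names
    // from breaking the generated script.
    fn js_str(s: &str) -> String {
        serde_json::to_string(s).unwrap_or_else(|_| String::from("\"\""))
    }

    // Generate API endpoints constant
    let api_endpoints_code = info.api_endpoints.iter()
        .map(|ep| format!("  {}: {},", js_str(&ep.name), js_str(&ep.url)))
        .collect::<Vec<_>>()
        .join("\n");

    // Generate static params constant
    let static_params_code = info.static_params.iter()
        .map(|(k, v)| format!("  {}: {},", js_str(k), js_str(v)))
        .collect::<Vec<_>>()
        .join("\n");

    // Generate column defs
    let column_defs_code = info.column_defs.iter()
        .map(|(field, label)| format!("  [{}, {}],", js_str(field), js_str(label)))
        .collect::<Vec<_>>()
        .join("\n");

    let columns_code = info.column_defs.iter()
        .map(|(field, _)| js_str(field))
        .collect::<Vec<_>>()
        .join(", ");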

    let primary_api = info.api_endpoints.first()
        .map(|ep| ep.url.clone())
        .unwrap_or_else(|| "/api/data".to_string());

    let expected_domain = info.expected_domain.as_deref().unwrap_or("");

    format!(r#"// ===== 自动生成部分 =====

const REPORT_NAME = '{scene_id}';
const EXPECTED_DOMAIN = '{expected_domain}';

// API 端点
const API_ENDPOINTS = {{
{api_endpoints_code}
}};

// 静态参数
const STATIC_PARAMS = {{
{static_params_code}
}};

// 列定义
const COLUMN_DEFS = [
{column_defs_code}
];
const COLUMNS = [{columns_code}];

// ===== 标准框架 =====

function normalizePayload(payload) {{
  if (typeof payload === 'string') {{
    try {{ return JSON.parse(payload); }} catch (_) {{ return {{}}; }}
  }}
  return payload && typeof payload === 'object' ? payload : {{}};
}}

function validateArgs(args) {{
  const reasons = [];
  if (!args.org_code) reasons.push('missing org_code');
  if (!args.period_value) reasons.push('missing period_value');
  return reasons.length === 0 ? {{ ok: true }} : {{ ok: false, reasons }};
}}

function buildRequest(args) {{
  return {{
    orgCode: args.org_code,
    periodMode: args.period_mode,
    periodValue: args.period_value,
    ...STATIC_PARAMS
  }};
}}

function normalizeRows(rawRows, args = {{}}) {{
  if (!Array.isArray(rawRows)) return [];
  return rawRows.map((row) => ({{
    org_label: row.orgLabel || row.org_label || '',
    org_code: row.orgCode || row.org_code || args.org_code || '',
    period_mode: args.period_mode || '',
    period_value: args.period_value || '',
    ...row
  }}));
}}

// args is passed in explicitly: a free args variable is undefined
// when the script is loaded as a module, for example in unit tests.
function buildArtifact(args, opts) {{
  return {{
    type: 'report-artifact',
    report_name: REPORT_NAME,
    status: opts.status || 'ok',
    period: {{
      mode: args.period_mode,
      mode_code: args.period_mode_code,
      value: args.period_value,
      payload: normalizePayload(args.period_payload)
    }},
    org: {{ label: args.org_label, code: args.org_code }},
    column_defs: COLUMN_DEFS,
    columns: COLUMNS,
    rows: opts.rows || [],
    counts: {{ detail_rows: (opts.rows || []).length }},
    partial_reasons: opts.partial_reasons || [],
    reasons: opts.reasons || []
  }};
}}

async function buildBrowserEntrypointResult(args, deps = defaultDeps()) {{
  // 1. 参数验证
  const validation = validateArgs(args);
  if (!validation.ok) {{
    return buildArtifact(args, {{ status: 'blocked', reasons: validation.reasons }});
  }}

  // 2. 页面上下文验证
  const pageValidation = deps.validatePageContext?.(args);
  if (!pageValidation?.ok) {{
    return buildArtifact(args, {{ status: 'blocked', reasons: ['page_context_mismatch'] }});
  }}

  // 3. 数据获取
  try {{
    const request = buildRequest(args);
    const response = await deps.queryData(request);
    const rows = normalizeRows(response.rows || response.data || [], args);

    return buildArtifact(args, {{
      status: rows.length > 0 ? 'ok' : 'empty',
      rows
    }});
  }} catch (error) {{
    return buildArtifact(args, {{ status: 'error', reasons: [error.message] }});
  }}
}}

// ===== 默认依赖实现 =====

function defaultDeps() {{
  return {{
    validatePageContext(args) {{
      const host = globalThis.location?.hostname;
      return host === args.expected_domain || host === EXPECTED_DOMAIN
        ? {{ ok: true }}
        : {{ ok: false, reason: 'domain_mismatch' }};
    }},

    async queryData(request) {{
      // 根据 API_ENDPOINTS 调用实际接口
      if (typeof $ !== 'undefined' && typeof $.ajax === 'function') {{
        return new Promise((resolve, reject) => {{
          $.ajax({{
            url: API_ENDPOINTS.primary || '{primary_api}',
            type: 'POST',
            data: JSON.stringify(request),
            contentType: 'application/json',
            success: resolve,
            error: (xhr, status, err) => reject(new Error(`API failed: ${{err}}`)),
          }});
        }});
      }}
      // Fallback: fetch API
      if (typeof fetch === 'function') {{
        const response = await fetch(API_ENDPOINTS.primary || '{primary_api}', {{
          method: 'POST',
          headers: {{ 'Content-Type': 'application/json' }},
          body: JSON.stringify(request)
        }});
        return response.json();
      }}
      throw new Error('No HTTP client available');
    }},
  }};
}}

// ===== 模块导出 =====

if (typeof module !== 'undefined') {{
  module.exports = {{
    buildBrowserEntrypointResult,
    validateArgs,
    buildRequest,
    normalizeRows,
    COLUMN_DEFS,
    COLUMNS,
  }};
}}

if (typeof args !== 'undefined') {{
  return buildBrowserEntrypointResult(args);
}}
"#, scene_id = scene_id, expected_domain = expected_domain, api_endpoints_code = api_endpoints_code, static_params_code = static_params_code, column_defs_code = column_defs_code, columns_code = columns_code, primary_api = primary_api)
}
```
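
Because the generated script takes its I/O through `deps`, it can be exercised in Node.js without a browser. The sketch below uses a trimmed stand-in for the generated entrypoint (not the full template output) together with stub dependencies; the argument values and the stub response are hypothetical.

```javascript
// Stand-in for a generated script: a trimmed copy of the entrypoint shape from the template above.
async function buildBrowserEntrypointResult(args, deps) {
  const page = deps.validatePageContext?.(args);
  if (!page?.ok) return { status: "blocked", rows: [], reasons: ["page_context_mismatch"] };
  try {
    const response = await deps.queryData({
      orgCode: args.org_code,
      periodValue: args.period_value,
    });
    const rows = response.rows || response.data || [];
    return { status: rows.length > 0 ? "ok" : "empty", rows, reasons: [] };
  } catch (error) {
    return { status: "error", rows: [], reasons: [error.message] };
  }
}

// Stub dependencies: no browser, no network.
const stubDeps = {
  validatePageContext: () => ({ ok: true }),
  queryData: async (request) => ({ rows: [{ orgCode: request.orgCode, value: "42" }] }),
};

// A promise for the artifact; a test can await it and inspect status and rows.
const demo = buildBrowserEntrypointResult({ org_code: "A01", period_value: "2026-03" }, stubDeps);
```

Swapping `queryData` for a stub that throws, or `validatePageContext` for one that returns `{ ok: false }`, exercises the `error` and `blocked` branches the same way.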

- [ ] **Step 4: Update generate_scene_package function**

Modify `generate_scene_package` in generator.rs to pass scene_info:

```rust
pub fn generate_scene_package(
    request: GenerateSceneRequest,
) -> Result<PathBuf, GenerateSceneError> {
    let analysis = analyze_scene_source_with_hint(&request.source_dir, request.scene_kind.clone())?;
    // ... existing code ...

    write_file(
        &scripts_dir.join(format!("{tool_name}.js")),
        &browser_script(&request.scene_id, &analysis, request.scene_info_json.as_ref()),
    )?;

    // ... rest of function ...
}
```

- [ ] **Step 5: Add CLI parameter in sg_scene_generate.rs**

Modify `CliArgs` struct:

```rust
struct CliArgs {
    source_dir: PathBuf,
    scene_id: String,
    scene_name: String,
    scene_kind: Option<SceneKind>,
    target_url: Option<String>,
    output_root: PathBuf,
    lessons_path: Option<PathBuf>,
    // NEW
    scene_info_json: Option<String>,
}
```

Add parsing in `parse_args`:

```rust
fn parse_args(args: impl Iterator<Item = String>) -> Result<CliArgs, String> {
    // ... existing code ...
    let mut scene_info_json = None;
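    // ... in match block; `arg` must hold the value that follows the flag,
    // read the same way the existing flags read theirs ...
    "--scene-info-json" => scene_info_json = Some(arg),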
    // ...
}
```

Parse JSON in `run`:

```rust
fn run() -> Result<(), String> {
    let args = parse_args(env::args().skip(1))?;

    let scene_info = args.scene_info_json
        .map(|json| serde_json::from_str(&json))
        .transpose()
        .map_err(|e| format!("Invalid scene-info-json: {}", e))?;

    let skill_root = generate_scene_package(GenerateSceneRequest {
        source_dir: args.source_dir,
        scene_id: args.scene_id,
        scene_name: args.scene_name,
        scene_kind: args.scene_kind,
        target_url: args.target_url,
        output_root: args.output_root,
        lessons_path: args.lessons_path,
        scene_info_json: scene_info,
    })
    .map_err(|err| err.to_string())?;

    println!("generated scene package: {}", skill_root.display());
    Ok(())
}
```

Update usage:

```rust
fn usage() -> String {
    "usage: sg_scene_generate --source-dir <scenario-dir> --scene-id <scene-id> --scene-name <display-name> [--scene-kind <report_collection|monitoring>] [--target-url <url>] --output-root <skill-staging-root> [--lessons <lessons-toml>] [--scene-info-json '<json>']".to_string()
}
```
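
On the Node.js side, the confirmed SceneInfo has to reach this flag. The steps above do not show that call, so the following is only a sketch: `buildGeneratorArgs` and the shape of `form` are hypothetical. Passing an argument array to `child_process.spawn` (rather than building a shell string) avoids having to quote the JSON, and omitting the flag when no SceneInfo exists keeps the existing invocation unchanged.

```javascript
// Hypothetical helper: build argv for sg_scene_generate; flag names are from the usage string above.
function buildGeneratorArgs(form, sceneInfo) {
  const args = [
    "--source-dir", form.sourceDir,
    "--scene-id", form.sceneId,
    "--scene-name", form.sceneName,
    "--output-root", form.outputRoot,
  ];
  if (form.sceneKind) args.push("--scene-kind", form.sceneKind);
  if (form.targetUrl) args.push("--target-url", form.targetUrl);
  // Only add the new flag when deep extraction produced something;
  // otherwise the CLI call is identical to the current one.
  if (sceneInfo) args.push("--scene-info-json", JSON.stringify(sceneInfo));
  return args;
}
```

The result would be passed as the second argument of `spawn`, with the binary path as the first.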

- [ ] **Step 6: Verify Rust compilation**

Run: `cargo check`
Expected: No compilation errors

- [ ] **Step 7: Commit**

```bash
git add src/bin/sg_scene_generate.rs src/generated_scene/generator.rs
git commit -m "feat(rust): add --scene-info-json parameter for LLM extraction results"
```

---

### Task 5: Update Web UI with Extraction Preview

**Files:**
- Modify: `frontend/scene-generator/sg_scene_generator.html`

**Goal:** Add UI elements to show extraction results and allow user confirmation before generation.

- [ ] **Step 1: Add extraction results preview section**

After the existing form section, add a new collapsible panel for the extraction preview:

```html
<!-- 提取结果预览 -->
<div id="extractionPreview" class="preview-panel" style="display: none; margin-top: 20px;">
  <div class="preview-header" onclick="togglePreview()">
    <h3>LLM 提取结果</h3>
    <span id="previewToggleIcon">▼</span>
  </div>
  <div id="previewContent" class="preview-content">
    <div class="preview-section">
      <h4>基本信息</h4>
      <div class="preview-row">
        <span class="label">场景 ID:</span>
        <span id="previewSceneId" class="value"></span>
      </div>
      <div class="preview-row">
        <span class="label">场景名称:</span>
        <span id="previewSceneName" class="value"></span>
      </div>
      <div class="preview-row">
        <span class="label">场景类型:</span>
        <span id="previewSceneKind" class="value"></span>
      </div>
      <div class="preview-row">
        <span class="label">目标域名:</span>
        <span id="previewExpectedDomain" class="value"></span>
      </div>
    </div>

    <div class="preview-section">
      <h4>API 端点 (<span id="previewApiCount">0</span>)</h4>
      <div id="previewApiEndpoints" class="preview-list"></div>
    </div>

    <div class="preview-section">
      <h4>列定义 (<span id="previewColumnCount">0</span>)</h4>
      <div id="previewColumnDefs" class="preview-list"></div>
    </div>

    <div class="preview-section">
      <h4>静态参数</h4>
      <pre id="previewStaticParams" class="preview-code"></pre>
    </div>

    <div class="preview-section">
      <h4>业务逻辑</h4>
      <div class="preview-row">
        <span class="label">数据获取:</span>
        <span id="previewDataFetch" class="value"></span>
      </div>
      <div class="preview-row">
        <span class="label">数据转换:</span>
        <span id="previewDataTransform" class="value"></span>
      </div>
    </div>
  </div>
</div>
```

- [ ] **Step 2: Add CSS for preview panel**

Add styles:

```css
.preview-panel {
  background: rgba(255, 255, 255, 0.05);
  border-radius: 12px;
  border: 1px solid rgba(255, 255, 255, 0.1);
  overflow: hidden;
}

.preview-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
  padding: 16px 20px;
  cursor: pointer;
  background: rgba(255, 255, 255, 0.03);
}

.preview-header h3 {
  margin: 0;
  font-size: 16px;
}

.preview-content {
  padding: 20px;
}

.preview-section {
  margin-bottom: 20px;
}

.preview-section h4 {
  margin: 0 0 10px 0;
  font-size: 14px;
  color: #a78bfa;
}

.preview-row {
  display: flex;
  margin-bottom: 8px;
}

.preview-row .label {
  width: 100px;
  color: #888;
  flex-shrink: 0;
}

.preview-row .value {
  color: #fff;
}

.preview-list {
  max-height: 150px;
  overflow-y: auto;
  background: rgba(0, 0, 0, 0.2);
  border-radius: 8px;
  padding: 10px;
}

.preview-list-item {
  padding: 6px 0;
  border-bottom: 1px solid rgba(255, 255, 255, 0.05);
}

.preview-code {
  background: rgba(0, 0, 0, 0.3);
  padding: 10px;
  border-radius: 8px;
  font-family: monospace;
  font-size: 12px;
  overflow-x: auto;
  white-space: pre-wrap;
}
```

- [ ] **Step 3: Add JavaScript for deep analysis and preview**

```javascript
let currentSceneInfo = null;

async function analyzeDeep() {
  const sourceDir = document.getElementById('sourceDir').value;
  if (!sourceDir) {
    alert('请先选择场景目录');
    return;
  }

  showStatus('正在深度分析...');

  try {
    const response = await fetch(`${SERVER_URL}/analyze-deep`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ sourceDir })
    });

    const data = await response.json();

    if (data.error) {
      showStatus('分析失败: ' + data.error);
      return;
    }

    currentSceneInfo = data;

    // Update form fields
    document.getElementById('sceneId').value = data.sceneId || '';
    document.getElementById('sceneName').value = data.sceneName || '';
    document.getElementById('sceneKind').value = data.sceneKind || 'report_collection';
    if (data.targetUrl) {
      document.getElementById('targetUrl').value = data.targetUrl;
    }

    // Show preview
    showExtractionPreview(data);
    showStatus('分析完成，请确认提取结果');

  } catch (err) {
    showStatus('分析失败: ' + err.message);
  }
}

// Escape text before interpolating it into innerHTML: the values below come
// from LLM output derived from arbitrary scene HTML and must not be trusted.
function escapeHtml(value) {
  return String(value ?? '')
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

function showExtractionPreview(data) {
  document.getElementById('previewSceneId').textContent = data.sceneId || '-';
  document.getElementById('previewSceneName').textContent = data.sceneName || '-';
  document.getElementById('previewSceneKind').textContent = data.sceneKind || '-';
  document.getElementById('previewExpectedDomain').textContent = data.expectedDomain || '-';

  // API endpoints
  const apiList = document.getElementById('previewApiEndpoints');
  const apiCount = (data.apiEndpoints || []).length;
  document.getElementById('previewApiCount').textContent = apiCount;
  apiList.innerHTML = (data.apiEndpoints || []).map(ep => `
    <div class="preview-list-item">
      <strong>${escapeHtml(ep.name)}</strong>: ${escapeHtml(ep.url)}
      <span style="color: #888">[${escapeHtml(ep.method || 'GET')}]</span>
    </div>
  `).join('') || '<div style="color: #888">无 API 端点</div>'; // "No API endpoints"

  // Column defs: each entry is a [field, label] pair
  const colList = document.getElementById('previewColumnDefs');
  const colCount = (data.columnDefs || []).length;
  document.getElementById('previewColumnCount').textContent = colCount;
  colList.innerHTML = (data.columnDefs || []).map(([field, label]) => `
    <div class="preview-list-item">
      <code>${escapeHtml(field)}</code> → ${escapeHtml(label)}
    </div>
  `).join('') || '<div style="color: #888">无列定义</div>'; // "No column definitions"

  // Static params
  document.getElementById('previewStaticParams').textContent =
    JSON.stringify(data.staticParams || {}, null, 2);

  // Business logic
  document.getElementById('previewDataFetch').textContent =
    data.businessLogic?.dataFetch || '-';
  document.getElementById('previewDataTransform').textContent =
    data.businessLogic?.dataTransform || '-';

  document.getElementById('extractionPreview').style.display = 'block';
}

function togglePreview() {
  const content = document.getElementById('previewContent');
  const icon = document.getElementById('previewToggleIcon');
  if (content.style.display === 'none') {
    content.style.display = 'block';
    icon.textContent = '▼';
  } else {
    content.style.display = 'none';
    icon.textContent = '▶';
  }
}
```
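
The preview code destructures each `columnDefs` entry as a `[field, label]` pair and reads `name`, `url`, and `method` from each endpoint, so a malformed LLM response could throw while rendering. A minimal sketch of a defensive normalization step that could run before `showExtractionPreview(data)`; the helper name `normalizeSceneInfo` is hypothetical, and dropping malformed entries (rather than rejecting the whole response) is an assumption:

```javascript
// Hypothetical helper, not part of the existing UI code. Field names follow
// the plan; the drop-on-malformed policy is an assumption.
function normalizeSceneInfo(data) {
  const info = data && typeof data === 'object' ? data : {};

  // Keep only endpoint objects that carry a string url; default method to GET.
  const apiEndpoints = (Array.isArray(info.apiEndpoints) ? info.apiEndpoints : [])
    .filter(ep => ep && typeof ep === 'object' && typeof ep.url === 'string')
    .map(ep => ({ ...ep, method: ep.method || 'GET' }));

  // Keep only two-element [field, label] pairs, coerced to strings.
  const columnDefs = (Array.isArray(info.columnDefs) ? info.columnDefs : [])
    .filter(def => Array.isArray(def) && def.length === 2)
    .map(([field, label]) => [String(field), String(label)]);

  // staticParams must be a plain object; anything else becomes {}.
  const sp = info.staticParams;
  const staticParams = sp && typeof sp === 'object' && !Array.isArray(sp) ? sp : {};

  return { ...info, apiEndpoints, columnDefs, staticParams };
}
```

With this in place, `analyzeDeep` could store `currentSceneInfo = normalizeSceneInfo(data)` so that the previewed data and the data sent to `/generate` have the same shape.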

- [ ] **Step 4: Add "深度分析" (Deep Analysis) button**

Add a new button to the existing button group:

```html
<button onclick="analyzeDeep()" class="btn btn-secondary">
  深度分析
</button>
```

- [ ] **Step 5: Update generate function to pass sceneInfo**

Modify the generate function to include scene info JSON:

```javascript
async function generate() {
  const params = {
    sourceDir: document.getElementById('sourceDir').value,
    sceneId: document.getElementById('sceneId').value,
    sceneName: document.getElementById('sceneName').value,
    sceneKind: document.getElementById('sceneKind').value,
    targetUrl: document.getElementById('targetUrl').value || null,
    outputRoot: document.getElementById('outputRoot').value,
    lessons: document.getElementById('lessons').value || null,
  };

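  // Add scene info JSON if available. Form values override the analyzed ones
  // so that edits made after analysis reach the generator consistently.
  if (currentSceneInfo) {
    params.sceneInfoJson = JSON.stringify({
      ...currentSceneInfo,
      sceneId: params.sceneId,
      sceneName: params.sceneName,
      sceneKind: params.sceneKind || currentSceneInfo.sceneKind,
    });
  }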

  // ... rest of generate function ...
}
```

- [ ] **Step 6: Verify UI loads**

Run the server:
```bash
cd frontend/scene-generator && node server.js
```
Then open `http://127.0.0.1:3210/` in a browser.

Expected: The page loads with no JavaScript errors in the browser console.

- [ ] **Step 7: Commit**

```bash
git add frontend/scene-generator/sg_scene_generator.html
git commit -m "feat(ui): add deep extraction preview panel with API/column/static-params display"
```

---

### Task 6: Update generator-runner.js to Pass sceneInfoJson

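**Files:**
- Modify: `frontend/scene-generator/generator-runner.js`
- Modify: `frontend/scene-generator/server.js`

**Goal:** Update `runGenerator` to pass the `sceneInfoJson` parameter to the Rust CLI as `--scene-info-json`, and update `handleGenerate` to forward it.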

- [ ] **Step 1: Modify runGenerator function**

Update the function to accept and pass `sceneInfoJson`:

```javascript
function runGenerator(params, sseWriter, projectRoot) {
  const { sourceDir, sceneId, sceneName, sceneKind, targetUrl, outputRoot, lessons, sceneInfoJson } = params;

  const normalize = (p) => p.replace(/\\/g, "/");

  const args = [
    "run",
    "--bin",
    "sg_scene_generate",
    "--",
    "--source-dir",
    normalize(sourceDir),
    "--scene-id",
    sceneId,
    "--scene-name",
    sceneName,
  ];

  if (sceneKind) {
    args.push("--scene-kind", sceneKind);
  }

  if (targetUrl) {
    args.push("--target-url", targetUrl);
  }

  args.push("--output-root", normalize(outputRoot));

  if (lessons) {
    args.push("--lessons", normalize(lessons));
  }

  // NEW: Pass scene info JSON
  if (sceneInfoJson) {
    args.push("--scene-info-json", sceneInfoJson);
  }

  // ... rest of function unchanged ...
}
```
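
Because the extracted JSON travels as a single command-line argument, its size is bounded by the operating system's command-line limit; on Windows, `CreateProcess` accepts at most 32,767 characters. A minimal sketch of an early check that could run before spawning the generator. The helper names, the safety margin, and the length estimate are all assumptions, not part of the existing runner:

```javascript
// Hypothetical guard, not part of the existing runner. 32,767 is the
// documented Windows CreateProcess limit; the margin is an arbitrary choice.
const WINDOWS_CMDLINE_LIMIT = 32767;
const SAFETY_MARGIN = 1024;

// Rough estimate: each argument costs its own length, one extra character
// per embedded double quote (escaping), plus two wrapping quotes and a space.
function estimateCommandLineLength(command, args) {
  return args.reduce((total, arg) => {
    const text = String(arg);
    const quotes = (text.match(/"/g) || []).length;
    return total + text.length + quotes + 3;
  }, command.length);
}

// Throws with a readable message instead of letting the spawn fail obscurely.
function assertArgsFit(command, args, limit = WINDOWS_CMDLINE_LIMIT - SAFETY_MARGIN) {
  const length = estimateCommandLineLength(command, args);
  if (length > limit) {
    throw new Error(`Command line too long: ~${length} characters (limit ${limit})`);
  }
  return length;
}
```

If the check fails, the runner could report the error over SSE and skip `--scene-info-json`, which the plan already treats as optional.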

- [ ] **Step 2: Update server.js handleGenerate**

Ensure `handleGenerate` passes the new parameter:

```javascript
async function handleGenerate(req, res) {
  let body;
  try {
    body = await parseBody(req);
  } catch {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "Invalid JSON body" }));
    return;
  }

  const { sourceDir, sceneId, sceneName, sceneKind, targetUrl, outputRoot, lessons, sceneInfoJson } = body;
  if (!sourceDir || !sceneId || !sceneName || !outputRoot) {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(
      JSON.stringify({
        error: "All fields required: sourceDir, sceneId, sceneName, outputRoot",
      })
    );
    return;
  }

  const sseWriter = initSSE(res);

  try {
    await runGenerator(
      { sourceDir, sceneId, sceneName, sceneKind, targetUrl, outputRoot, lessons, sceneInfoJson },
      sseWriter,
      config.projectRoot
    );
  } catch (err) {
    writeSSE(sseWriter, "error", { message: `Server error: ${err.message}` });
  }

  sseWriter.end();
}
```
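
`handleGenerate` forwards `sceneInfoJson` without inspecting it, so a malformed value would only surface as a Rust CLI failure. A minimal sketch of a server-side check that could run alongside the required-field validation; the helper name `checkSceneInfoJson` and its return shape are hypothetical:

```javascript
// Hypothetical validator for the optional sceneInfoJson request field.
function checkSceneInfoJson(raw) {
  // The field is optional: absence means the skeleton template is used.
  if (raw === undefined || raw === null || raw === '') {
    return { ok: true, value: null };
  }
  if (typeof raw !== 'string') {
    return { ok: false, error: 'sceneInfoJson must be a JSON string' };
  }
  try {
    const value = JSON.parse(raw);
    if (!value || typeof value !== 'object' || Array.isArray(value)) {
      return { ok: false, error: 'sceneInfoJson must encode a JSON object' };
    }
    return { ok: true, value };
  } catch (err) {
    return { ok: false, error: `sceneInfoJson is not valid JSON: ${err.message}` };
  }
}
```

A failed check could be answered with the same 400 response used for missing fields, before `initSSE(res)` is called.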

- [ ] **Step 3: Verify syntax**

Run: `node -c frontend/scene-generator/generator-runner.js && node -c frontend/scene-generator/server.js`
Expected: No syntax errors

- [ ] **Step 4: Commit**

```bash
git add frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js
git commit -m "feat(runner): pass sceneInfoJson to Rust CLI for enhanced template rendering"
```

---

### Task 7: End-to-End Verification

**Files:**
- All modified files

**Goal:** Verify the complete flow works from UI to Rust CLI.

- [ ] **Step 1: Build Rust binary**

```bash
cargo build --release --bin sg_scene_generate
```
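Expected: Build succeeds. Note that the runner's argument list in Task 6 begins with `run --bin sg_scene_generate` and has no `--release`, so if it is invoked through `cargo`, the first generation will also compile a debug build.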

- [ ] **Step 2: Start the server**

```bash
cd frontend/scene-generator && node server.js
```
Expected: Server starts on port 3210

- [ ] **Step 3: Test health endpoint**

```bash
curl http://127.0.0.1:3210/health
```
Expected: `{"status":"ok",...}`

- [ ] **Step 4: Test analyze-deep endpoint with real scene**

Use a real scene directory that contains `index.html` (replace the placeholder path below):

```bash
curl -X POST http://127.0.0.1:3210/analyze-deep \
  -H "Content-Type: application/json" \
  -d '{"sourceDir": "D:/path/to/scene/with/index.html"}'
```

Expected: JSON response with `sceneId`, `sceneName`, `apiEndpoints`, `columnDefs`
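
The expected response shape can be checked mechanically rather than by eye. A minimal sketch, assuming the four keys named in the expectation above are the ones worth asserting; the helper name `missingSceneInfoKeys` is hypothetical:

```javascript
// Hypothetical check; the key list mirrors the expected response fields.
const REQUIRED_KEYS = ['sceneId', 'sceneName', 'apiEndpoints', 'columnDefs'];

// Returns the required keys that are absent from a parsed response body.
function missingSceneInfoKeys(response) {
  if (!response || typeof response !== 'object') {
    return [...REQUIRED_KEYS];
  }
  return REQUIRED_KEYS.filter(key => !(key in response));
}
```

An empty result means all four keys are present; it does not say whether `apiEndpoints` or `columnDefs` are non-empty, which still needs a manual look.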

- [ ] **Step 5: Test full generation flow**

1. Open a browser at `http://127.0.0.1:3210/`
2. Select a scene directory that contains `index.html`
3. Click the "深度分析" (Deep Analysis) button
4. Verify the preview shows the extracted API endpoints and column definitions
5. Click the "生成" (Generate) button
6. Verify the generated script contains the extracted API endpoints and column definitions

- [ ] **Step 6: Compare generated script**

Compare the newly generated script with the skeleton output produced before this change:
- Before: 51 lines (skeleton)
- After: `API_ENDPOINTS` and `COLUMN_DEFS` constants are present and populated
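
A minimal sketch of a quick automated comparison, assuming the generated script is available as a string. The helper name `summarizeGeneratedScript` is hypothetical, and the check only detects that the constant names appear; whether they are populated still has to be confirmed by reading the output:

```javascript
// Hypothetical check. The constant names and the 51-line skeleton baseline
// come from the comparison above; everything else is an assumption.
const SKELETON_LINE_COUNT = 51;

function summarizeGeneratedScript(scriptText) {
  const lines = scriptText.split(/\r?\n/).length;
  return {
    lines,
    hasApiEndpoints: /\bAPI_ENDPOINTS\b/.test(scriptText),
    hasColumnDefs: /\bCOLUMN_DEFS\b/.test(scriptText),
    longerThanSkeleton: lines > SKELETON_LINE_COUNT,
  };
}
```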

- [ ] **Step 7: Final commit**

```bash
git add -A
git commit -m "feat: complete LLM-driven skill generation with deep extraction

- Add /analyze-deep endpoint for deep LLM extraction
- Extract apiEndpoints, staticParams, columnDefs from index.html
- Pass extraction results via --scene-info-json to Rust CLI
- Generate complete browser_script with business logic constants
- Add UI preview panel for extraction results
"
```

---

## Self-Review

### 1. Spec Coverage

| Spec Requirement | Task |
|------------------|------|
| LLM reads index.html | Task 1 (buildDeepAnalyzePrompt), Task 2 (readDirectory) |
| Extract apiEndpoints | Task 1 (DEEP_SYSTEM_PROMPT, analyzeSceneDeep) |
| Extract staticParams | Task 1 (DEEP_SYSTEM_PROMPT, analyzeSceneDeep) |
| Extract columnDefs | Task 1 (DEEP_SYSTEM_PROMPT, analyzeSceneDeep) |
| Extract businessLogic | Task 1 (DEEP_SYSTEM_PROMPT, analyzeSceneDeep) |
| --scene-info-json CLI parameter | Task 4 |
| Enhanced template rendering | Task 4 (browser_script_with_business_logic) |
| Web UI preview | Task 5 |
| User confirmation before generation | Task 5 (extraction preview) |

All covered.

### 2. Placeholder Scan

No TBD/TODO/"implement later"/"add tests"/"similar to" patterns found.

### 3. Type Consistency

- `/analyze-deep`: `{ sourceDir }` → `SceneInfoJson` — consistent in Tasks 1, 3, 5
- `/generate`: `{ ..., sceneInfoJson }` — consistent in Tasks 5, 6
- SceneInfoJson struct fields match JavaScript extraction output — consistent in Task 1, 4
- Column defs: `Vec<(String, String)>` matches `[[field, label]]` — consistent

All consistent.

### 4. Backward Compatibility

- Existing `/analyze` endpoint unchanged
- Existing CLI arguments (`--scene-id`, `--scene-name`) still work
- `--scene-info-json` is optional, falls back to skeleton template
- Reading `index.html` is optional; when the file is not present, the reader falls back and proceeds without it