# sgClaw Scene Generator Quality Improvement: Implementation Plan

> Corresponding design document: `docs/superpowers/specs/2026-04-17-scene-generator-quality-improvement-design.md`

## Overview

3 phases, 8 tasks. Each task lists the files to change, the concrete steps, how to verify the result, and the commit message.

---

## Phase 1: Fix the Foundation

### Task 1: Unify the generation path (retire browser_script_with_business_logic)

**File**: `src/generated_scene/generator.rs`

**Current state** (lines 728-735):

```rust
fn compile_scene(scene_ir: &SceneIr, analysis: &SceneSourceAnalysis, tool_name: &str) -> CompiledScene {
    let scene_toml = render_scene_toml(scene_ir, analysis, tool_name);
    // Each archetype currently has its own compile function.
    let browser_script = match scene_ir.workflow_archetype() {
        WorkflowArchetype::SingleRequestTable => compile_single_request_table(scene_ir),
        WorkflowArchetype::MultiModeRequest => compile_multi_mode_request(scene_ir),
        WorkflowArchetype::PaginatedEnrichment => compile_paginated_enrichment(scene_ir),
        WorkflowArchetype::PageStateEval => compile_page_state_eval(scene_ir),
    };
    // ...
}
```

**Steps**:

1. **Change the routing logic in `compile_scene`** (lines 730-735):
   - `SingleRequestTable` no longer calls `compile_simple_request_script` (the function underneath `compile_single_request_table`). Instead, the single-mode scene is wrapped into one mode and compiled through `compile_multi_mode_request`.
   - Add a helper `ensure_modes_populated(scene_ir: &SceneIr) -> SceneIr`:
     - If `scene_ir.modes` is empty but `scene_ir.api_endpoints` is not, generate a default mode.
     - Change the `workflow_archetype` of `SingleRequestTable` and `PageStateEval` scenes to `MultiModeRequest`, because every scene now goes through the modes path.
   - Update the match arms:

   ```rust
   let browser_script = match scene_ir.workflow_archetype() {
       WorkflowArchetype::MultiModeRequest => compile_multi_mode_request(scene_ir),
       WorkflowArchetype::PaginatedEnrichment => compile_paginated_enrichment(scene_ir),
       _ => {
           // SingleRequestTable, PageStateEval: fall back to multi-mode with a default mode
           let adapted = ensure_modes_populated(scene_ir);
           compile_multi_mode_request(&adapted)
       }
   };
   ```

2. **Implement `ensure_modes_populated`**:
   - Takes `&SceneIr` and returns a `SceneIr` (a clone).
   - If `modes` is already non-empty, return the clone unchanged.
   - If `modes` is empty but `api_endpoints` is not:
     - Build the default mode from the first endpoint.
     - Set `name: "default"`, `label: Some("default")`.
     - `condition`: `{ field: "period_mode", operator: "equals", value: "default" }`
     - `apiEndpoint`: copy of the first endpoint
     - `requestTemplate`: taken from `scene_ir.request_template`
     - `responsePath`: taken from `scene_ir.response_path`
     - `normalizeRules`: taken from `scene_ir.normalize_rules`, or the default
     - `columnDefs`: taken from `scene_ir.column_defs`
     - Also set `default_mode = Some("default")` and `mode_switch_field = Some("period_mode")`.

3. **Mark `browser_script_with_business_logic` as deprecated** (if it still exists in the code):
   - In the current code this function no longer exists; it has been replaced by `compile_simple_request_script`. Add a comment reading "legacy path, superseded by multi-mode unified path".

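
The default mode above only works if the generated script can fall back to it when no `period_mode` argument is supplied. The following is a minimal sketch of that selection logic, not the template in `generator.rs`: the signature of `detectMode` and the fallback order are assumptions made for illustration, and only the `equals` operator named in the plan is handled.

```javascript
// Hypothetical sketch: pick the mode whose condition matches the call arguments.
function detectMode(args, modes, defaultMode) {
  const matched = modes.find((mode) => {
    const cond = mode.condition;
    // Only "equals" appears in the plan; any other operator is treated as no match.
    return Boolean(cond) && cond.operator === "equals" && args[cond.field] === cond.value;
  });
  if (matched) return matched;
  // Fall back to the configured default mode, then to the first mode, else null.
  return modes.find((mode) => mode.name === defaultMode) || modes[0] || null;
}

const MODES = [
  { name: "default", condition: { field: "period_mode", operator: "equals", value: "default" } },
];

// A single-mode scene is normally called without period_mode, so the fallback applies.
console.log(detectMode({}, MODES, "default").name); // default
```

With a single wrapped mode, every call resolves to `default` regardless of the arguments, which is why the extra routing adds little overhead.
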
**Verification**:

- `cargo check` reports no compile errors.
- The JS script generated for a single-mode scene contains `const MODES =` and the `detectMode` logic.

**Commit message**:

```
feat(generator): unify all scene types through multi-mode path

Single-mode and page-state-eval scenes now get auto-wrapped into a
default mode and compiled through compile_multi_mode_request. This
eliminates the old browser_script_with_business_logic code path and
ensures all scenes get responsePath extraction, requestTemplate, and
contentType support.
```

---

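### Task 2: Fix the jQuery processData parameter

**File**: `src/generated_scene/generator.rs` (function `compile_multi_mode_request`, lines 1069-1253)

**Current state**: In the template, `buildModeRequest` (lines 1098-1118) already serializes the body according to `contentType`: `Object.entries().join('&')` for form-urlencoded and `JSON.stringify` for JSON. The jQuery ajax call (lines 1185-1196), however, does **not** set `processData`.

The concern is that jQuery's default `processData: true` serializes the body a second time and double-encodes form-urlencoded payloads. Per the jQuery documentation, `processData` only transforms non-string `data`, so an already serialized string should pass through unchanged; setting the option explicitly still guarantees that the body built by `buildModeRequest` is sent exactly as is.
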
**Steps**:

1. Change the jQuery ajax call template in `compile_multi_mode_request` (around lines 1185-1196):
   - Add the `processData` option to `$.ajax({...})`:

   ```javascript
   // Read the content type from the request so it is in scope inside the Promise callback.
   const contentType = request.headers['Content-Type'];
   $.ajax({
     url: request.url,
     type: request.method,
     data: request.body,
     contentType: contentType,
     // The form-urlencoded body is already a string; do not let jQuery process it again.
     processData: contentType !== 'application/x-www-form-urlencoded',
     dataType: 'json',
     success: resolve,
     error: (xhr, status, err) => reject(new Error(`API failed (${xhr.status}): ${err}`))
   });
   ```

   - `contentType` must be accessible inside the Promise callback, so it is read from the `request` object.

2. Apply the same change to the jQuery ajax call in `compile_simple_request_script` (around lines 994-1004), adding the same `processData` logic.

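
To make the interaction between body serialization and `processData` concrete, here is a small sketch that can run outside the browser. The helper names (`isFormUrlencoded`, `serializeBody`, `buildAjaxOptions`), the placeholder URL, and the sample payload are all hypothetical; the sketch also assumes the header may carry a `charset` suffix, which a strict `!==` comparison would miss.

```javascript
// Hypothetical helpers; names are illustrative and not taken from generator.rs.
const FORM = "application/x-www-form-urlencoded";

// True when the header names a form-urlencoded body, ignoring case and any charset suffix.
function isFormUrlencoded(contentType) {
  return String(contentType || "").toLowerCase().startsWith(FORM);
}

// Serialize the payload exactly once, according to the content type.
function serializeBody(payload, contentType) {
  if (isFormUrlencoded(contentType)) {
    return Object.entries(payload)
      .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
      .join("&");
  }
  return JSON.stringify(payload);
}

// Build the options object that would be handed to $.ajax.
function buildAjaxOptions(request) {
  const contentType = request.headers["Content-Type"];
  return {
    url: request.url,
    type: request.method,
    data: request.body,
    contentType,
    processData: !isFormUrlencoded(contentType),
    dataType: "json",
  };
}

const contentType = `${FORM}; charset=UTF-8`;
const options = buildAjaxOptions({
  url: "/api/report", // placeholder URL
  method: "POST",
  headers: { "Content-Type": contentType },
  body: serializeBody({ orgNo: "A&B", month: "2026-04" }, contentType),
});
console.log(options.data, options.processData); // orgNo=A%26B&month=2026-04 false
```

The `&` inside the value is encoded once as `%26`; a second encoding pass would turn it into `%2526`, which is the failure this task is meant to rule out.
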
**Verification**:

- The `$.ajax` call in the generated JS contains the `processData` option.
- form-urlencoded requests are not double-encoded.

**Commit message**:

```
fix(generator): add processData to jQuery ajax for form-urlencoded requests

buildModeRequest already serializes form-urlencoded bodies to a string.
Set processData:false when contentType is
application/x-www-form-urlencoded so jQuery sends that string unchanged
and the payload cannot be encoded twice.
```

---

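### Task 3: Auto-wrap single-mode scenes into a mode configuration

**File**: `frontend/scene-generator/llm-client.js`

**Current state**: `analyzeSceneDeep` (lines 729-769) calls the LLM and returns the result of `normalizeSceneIr` directly. If the LLM outputs `modes: []` together with a non-empty `apiEndpoints`, nothing wraps the endpoint into a mode.
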
**Steps**:

1. In `analyzeSceneDeep`, after `normalizeSceneIr(...)` and before returning, add the auto-wrap logic:

   ```javascript
   async function analyzeSceneDeep(sourceDir, dirContents, config) {
     const content = await requestChatCompletionWithRetry(/* ... existing arguments ... */);
     const normalized = normalizeSceneIr(await extractJsonFromResponseWithRepair(content, config));

     // ... existing sceneId validation ...

     // AUTO-WRAP: single-mode scenes → modes array
     if (normalized.modes.length === 0 && normalized.apiEndpoints.length > 0) {
       normalized.modes.push({
         name: "default",
         label: "default",
         condition: { field: "period_mode", operator: "equals", value: "default" },
         apiEndpoint: normalized.apiEndpoints[0],
         columnDefs: normalized.columnDefs || [],
         requestTemplate: normalized.requestTemplate || {},
         normalizeRules: normalized.normalizeRules || { type: "validate_required", requiredFields: [], filterNull: true },
         responsePath: normalized.responsePath || "",
       });
       normalized.defaultMode = "default";
       normalized.modeSwitchField = "period_mode";
       // Upgrade archetype if it was single_request_table
       if (normalized.workflowArchetype === "single_request_table") {
         normalized.workflowArchetype = "multi_mode_request";
       }
     }

     return normalized;
   }
   ```

2. Also make sure `normalizeSceneIr` gives `defaultMode` and `modeSwitchField` correct default values (already handled at lines 477-478).

**Verification**:

- Run generation on a single-mode scene such as `用户日电量监测` (user daily electricity monitoring) and confirm that the `modes` array contains one default mode.
- Confirm that `workflowArchetype` is upgraded to `multi_mode_request`.

**Commit message**:

```
feat(llm-client): auto-wrap single-mode scenes into modes array

When the LLM returns an empty modes array but has apiEndpoints,
automatically create a default mode with the first endpoint,
requestTemplate, responsePath, and normalizeRules. This ensures all
scenes compile through the multi-mode path.
```

---

## Phase 2: Enhance Extraction

### Task 4: Strengthen the mandatory constraints in the LLM prompt

**File**: `frontend/scene-generator/llm-client.js` (`DEEP_SYSTEM_PROMPT`, lines 19-82)

**Current state**: The prompt lists the schema but does not stress which fields **must** be filled. The LLM frequently skips `contentType`, `responsePath`, and `requestTemplate`.

**Steps**:

1. After the schema definition in `DEEP_SYSTEM_PROMPT`, add a **mandatory field constraints** section:

   ```
   MANDATORY FIELDS (never leave empty):
   - apiEndpoints[].contentType: detect from source code.
     * For $.ajax({}): look for 'contentType' property. Default 'application/json' if absent.
     * For $http.sendByAxios(): contentType is 'application/json' (axios default).
     * For XMLHttpRequest: look for setRequestHeader('Content-Type', ...).
     * For form submissions: 'application/x-www-form-urlencoded'.
   - modes[].responsePath: the JSON path from raw API response to the data array.
     * Common patterns: 'data.list', 'data.rcvblAcctSumAll.rcvblAcctVOS', 'content', 'data.records'
     * If response is the array itself, use empty string "".
   - modes[].requestTemplate: the static request body shape from the source code.
     * Extract ALL keys that appear in the request body object.
     * Mark dynamic values as "${args.fieldName}" and static values as literals.
   - apiEndpoints[].url: the full API URL as seen in the source code.

   RULES:
   - If you cannot determine contentType, default to 'application/json'.
   - If you cannot determine responsePath, default to '' (empty string).
   - If you cannot determine requestTemplate, use {} (empty object).
   - NEVER leave these fields as null or undefined.
   ```

2. Insert this text into `DEEP_SYSTEM_PROMPT` after the schema definition and before `Instructions`.

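
The two conventions the prompt asks the LLM to follow, a dotted `responsePath` and `"${args.fieldName}"` placeholders in `requestTemplate`, can be illustrated with a short sketch. `safeGet` is named in this plan's tests, but its signature here is an assumption; `fillTemplate` and the sample data are hypothetical, and the template handling is limited to flat objects.

```javascript
// Walk a dot-separated path; an empty path means the response itself is the data.
function safeGet(raw, path) {
  if (!path) return raw;
  return path.split(".").reduce(
    (node, key) => (node !== null && node !== undefined ? node[key] : undefined),
    raw
  );
}

// Replace whole-value "${args.name}" placeholders with values from args (flat templates only).
function fillTemplate(template, args) {
  const filled = {};
  for (const [key, value] of Object.entries(template)) {
    const match = typeof value === "string" && value.match(/^\$\{args\.(\w+)\}$/);
    filled[key] = match ? args[match[1]] : value;
  }
  return filled;
}

const raw = { code: 0, data: { list: [{ id: 1 }, { id: 2 }] } };
console.log(safeGet(raw, "data.list").length); // 2
console.log(safeGet([1, 2, 3], "").length); // 3
console.log(safeGet(raw, "data.missing.rows")); // undefined

const body = fillTemplate({ orgNo: "${args.orgNo}", pageSize: 50 }, { orgNo: "3201" });
console.log(JSON.stringify(body)); // {"orgNo":"3201","pageSize":50}
```

Returning `undefined` for a missing segment, rather than throwing, lets the caller decide whether an absent path is an error or simply an empty result.
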
**Verification**:

- Run generation on the `营销2.0零度户报表数据生成` scene (Marketing 2.0 zero-consumption customer report) and confirm that `contentType` and `responsePath` in the LLM output are no longer empty.
- Confirm that `requestTemplate` contains the fields the business request requires.

**Commit message**:

```
feat(llm-client): add mandatory field constraints to DEEP_SYSTEM_PROMPT

Explicitly require LLM to fill contentType, responsePath, and
requestTemplate with detected values or defaults. Reduces empty-field
rate from ~60% to target ~10%.
```

---

### Task 5: Add extraction of business JS files

**Files**:

- `frontend/scene-generator/server.js`
- `frontend/scene-generator/generator-runner.js`

**Current state**: `readDirectory` in `generator-runner.js` already reads every file into `dirContents`, but `buildDeepAnalyzePrompt` (`llm-client.js`, lines 125-157) mostly pushes fragments of `index.html`. The contents of business JS files such as `js/mca.js` and `js/sgApi.js` are not extracted and pushed separately.

**Steps**:

1. **Identify business JS files in `generator-runner.js`**:
   - In `buildAnalysisContext`, add a `businessJsFragments` array.
   - Pick up the `.js` files under the `js/` directory, excluding third-party libraries such as `vue.js` and `element-ui`.
   - For each business JS file, extract a key fragment of up to 600 characters (function definitions, API calls, configuration objects).
   - Store the result in `analysisContext.businessJsFragments`.

2. **Push the business JS fragments in `buildDeepAnalyzePrompt` in `llm-client.js`**:
   - After the existing `pushFragments` calls, add:

   ```javascript
   pushFragments(parts, "business JS files", context.businessJsFragments, 4);
   ```

   - Make sure the total prompt size stays within `MAX_DEEP_PROMPT_CHARS` (60000).

3. **Make sure `server.js` reads the business JS files**:
   - Check whether the `readDirectory` call in the `/handle-analyze-deep` endpoint already reads the files under `js/`.
   - If it does not, add logic to read `js/*.js`.

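
A minimal sketch of the identification step follows. It assumes `dirContents` is a plain object mapping relative paths to file text, which this plan does not confirm, and it simply takes the first 600 characters of each file instead of locating key fragments. The function names and the exclusion patterns beyond `vue.js` and `element-ui` (jQuery, minified files) are illustrative additions.

```javascript
// Assumed exclusion list; only vue.js and element-ui are named in the plan.
const THIRD_PARTY_PATTERNS = [/^vue(\.|-|$)/i, /^element-ui/i, /^jquery/i, /\.min\.js$/i];

// A path qualifies when it is a .js file under js/ and no path segment looks third-party.
function isBusinessJs(path) {
  if (!/^js\/.+\.js$/.test(path)) return false;
  const segments = path.split("/").slice(1);
  return !segments.some((segment) =>
    THIRD_PARTY_PATTERNS.some((pattern) => pattern.test(segment))
  );
}

// Assumed shape: dirContents maps relative paths to file text.
function collectBusinessJsFragments(dirContents, maxFiles = 4, maxChars = 600) {
  return Object.keys(dirContents)
    .filter(isBusinessJs)
    .sort()
    .slice(0, maxFiles)
    .map((path) => ({ path, text: dirContents[path].slice(0, maxChars) }));
}

const dirContents = {
  "index.html": "<html></html>",
  "js/vue.js": "/* Vue */",
  "js/element-ui/index.js": "/* Element UI */",
  "js/mca.js": "function query() { return $.ajax({ url: '/api/mca' }); }",
  "js/sgApi.js": "x".repeat(1000),
};

const fragments = collectBusinessJsFragments(dirContents);
console.log(fragments.map((f) => f.path)); // [ 'js/mca.js', 'js/sgApi.js' ]
console.log(fragments[1].text.length); // 600
```

At the stated limits the business JS section adds at most 4 × 600 = 2400 characters, a small share of the 60000-character `MAX_DEEP_PROMPT_CHARS` budget.
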
**Verification**:

- Run on the `台区线损大数据` scene (transformer-area line loss big data) and confirm that the contents of `js/mca.js` or a similar business file are pushed to the LLM.
- Confirm that the total prompt size does not exceed 60000 characters.

**Commit message**:

```
feat(scene-generator): extract business JS files for LLM analysis

Identify and push js/ directory business logic files (mca.js, sgApi.js,
etc.) to the LLM prompt. Exclude third-party libraries. Capped at 4
fragments to stay within MAX_DEEP_PROMPT_CHARS budget.
```

---

### Task 6: Post-extraction validation and a follow-up prompt

**File**: `frontend/scene-generator/llm-client.js`

**Current state**: After the LLM responds, `analyzeSceneDeep` runs `normalizeSceneIr` and returns immediately, without checking whether critical fields are missing.

**Steps**:

1. Add a `validateExtractedSceneInfo(sceneIr)` function:

   ```javascript
   function validateExtractedSceneInfo(sceneIr) {
     const issues = [];

     // Check: at least one apiEndpoint has contentType
     const endpointsWithCt = (sceneIr.apiEndpoints || []).filter(
       ep => ep && ep.contentType
     );
     if ((sceneIr.apiEndpoints || []).length > 0 && endpointsWithCt.length === 0) {
       issues.push("missing_contentType_on_endpoints");
     }

     // Check: at least one mode has responsePath (if modes exist)
     if ((sceneIr.modes || []).length > 0) {
       const modesWithPath = sceneIr.modes.filter(m => m.responsePath !== undefined && m.responsePath !== null);
       if (modesWithPath.length === 0) {
         issues.push("missing_responsePath_on_modes");
       }
     }

     // Check: workflowArchetype is set
     if (!sceneIr.workflowArchetype) {
       issues.push("missing_workflowArchetype");
     }

     return issues;
   }
   ```

2. In `analyzeSceneDeep`, call the validation after `normalizeSceneIr`:

   ```javascript
   const issues = validateExtractedSceneInfo(normalized);
   if (issues.length > 0) {
     // Secondary prompt
     const followUpPrompt = `The previous extraction has these issues:\n${issues.join('\n')}\nPlease re-analyze the source snippets and fill in the missing fields. Use defaults if truly unavailable.`;

     const followUpContent = await requestChatCompletionWithRetry(
       [
         { role: "system", content: DEEP_SYSTEM_PROMPT },
         { role: "user", content: followUpPrompt },
       ],
       { ...config, maxTokens: 2400, timeoutMs: DEEP_REQUEST_TIMEOUT_MS, retryAttempts: 1 }
     );

     const repaired = normalizeSceneIr(await extractJsonFromResponseWithRepair(followUpContent, config));
     // Merge repaired fields into normalized (only fill empty fields)
     Object.assign(normalized, mergeSceneIrFields(repaired, normalized));
   }
   ```

3. Add a helper `mergeSceneIrFields(repaired, original)`:
   - Overwrite a field with the repaired value only when the original field is empty or still at its default.
   - This avoids losing valid information from the first extraction.

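
One possible shape for `mergeSceneIrFields` is sketched below. The plan only states the rule (fill empty fields, never overwrite valid ones), so everything else here is an assumption: what counts as "empty", matching array items by index, and recursing into nested objects so that fields such as `apiEndpoints[].contentType` can be repaired.

```javascript
// Assumed definition of "empty": null, undefined, "", [] and {}.
function isEmptyValue(value) {
  if (value === null || value === undefined || value === "") return true;
  if (Array.isArray(value)) return value.length === 0;
  if (typeof value === "object") return Object.keys(value).length === 0;
  return false;
}

function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Keep the original value unless it is empty; recurse into arrays and objects.
function fillEmpty(original, repaired) {
  if (isEmptyValue(original)) return isEmptyValue(repaired) ? original : repaired;
  if (Array.isArray(original) && Array.isArray(repaired)) {
    // Items are paired by index, an assumption the plan does not specify.
    return original.map((item, i) => (i < repaired.length ? fillEmpty(item, repaired[i]) : item));
  }
  if (isPlainObject(original) && isPlainObject(repaired)) {
    const merged = { ...original };
    for (const key of Object.keys(repaired)) {
      const value = fillEmpty(original[key], repaired[key]);
      if (value !== undefined) merged[key] = value;
    }
    return merged;
  }
  return original;
}

function mergeSceneIrFields(repaired, original) {
  return fillEmpty(original, repaired);
}

const original = {
  workflowArchetype: "multi_mode_request",
  apiEndpoints: [{ url: "/api/a", contentType: "" }],
  modes: [],
};
const repaired = {
  workflowArchetype: "single_request_table",
  apiEndpoints: [{ url: "/api/other", contentType: "application/json" }],
  modes: [{ name: "default", responsePath: "data.list" }],
};

const merged = mergeSceneIrFields(repaired, original);
console.log(merged.workflowArchetype); // multi_mode_request
console.log(merged.apiEndpoints[0]); // { url: '/api/a', contentType: 'application/json' }
console.log(merged.modes.length); // 1
```

Because an empty string counts as empty in this sketch, a `responsePath` of `""` from the first pass can be replaced by a non-empty repaired path; if `""` should be treated as a deliberate value, the emptiness test would need to be narrowed.
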
**Verification**:

- Simulate an LLM response that lacks `contentType` and confirm that the follow-up prompt is triggered.
- Confirm that at most one follow-up is sent, so the flow cannot loop indefinitely.

**Commit message**:

```
feat(llm-client): add post-extraction validation with one-shot retry

After LLM returns scene IR, validate that critical fields (contentType,
responsePath, workflowArchetype) are present. If missing, send one
follow-up prompt to fill gaps. Merges repaired fields without overwriting
valid data from the first extraction.
```

---

## Phase 3: Testing and Verification

### Task 7: Unit tests

**File**: `tests/scene_generator_modes_test.rs` (new)

**Steps**:

1. Create the test file `tests/scene_generator_modes_test.rs`.

2. Write 5 test cases:

   ```rust
   #[cfg(test)]
   mod tests {
       // Files under tests/ build as a separate crate, so import through the package
       // name rather than `crate::`. `sgclaw` is an assumed name; use the one in Cargo.toml.
       use sgclaw::generated_scene::generator::*;
       use sgclaw::generated_scene::ir::*;
       use serde_json::json;

       #[test]
       fn test_single_mode_generates_modes_array() {
           // Create a SingleRequestTable scene with one endpoint
           // (make_test_scene_ir is a helper still to be written)
           let scene_ir = make_test_scene_ir();
           // ... assertions: generated JS contains "const MODES ="
       }

       #[test]
       fn test_multi_mode_generates_mode_routing() {
           // Create a MultiModeRequest scene with two modes
           // ... assertions: generated JS contains "detectMode"
       }

       #[test]
       fn test_snake_camel_consistency() {
           // Verify field name serialization is consistent
           // between Rust (snake_case) and JS (camelCase)
       }

       #[test]
       fn test_form_urlencoded_request_body() {
           // Create a mode with contentType = "application/x-www-form-urlencoded"
           // ... assertions: body is Object.entries().join('&'), not JSON.stringify
       }

       #[test]
       fn test_response_path_extraction_in_template() {
           // Create a mode with responsePath = "data.list"
           // ... assertions: generated JS contains "safeGet(raw, mode.responsePath"
       }
   }
   ```

3. Each test builds a `SceneIr` instance, calls `compile_multi_mode_request` (which must be `pub` to be reachable from `tests/`), and checks that the generated string contains the expected code fragment.

**Verification**:

- `cargo test --test scene_generator_modes_test` passes in full. (The `--test` flag selects the integration test file; without it the argument is only a filter on test function names and would match none of the tests above.)

**Commit message**:

```
test: add unit tests for multi-mode generation path

Covers: single-mode auto-wrap, multi-mode routing, snake/camel
consistency, form-urlencoded body format, and responsePath extraction.
```

---

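### Task 8: Integration tests

**Steps**:

1. **Run full generation on two representative scenes**:
   - Simple scene: `用户日电量监测` (user daily electricity monitoring; pattern C, direct AJAX)
   - Complex scene: `台区线损大数据-月_周累计线损率统计分析` (transformer-area line loss, monthly/weekly cumulative line-loss rate analysis; pattern A, dual mode)

2. **Compare the generated output against tq-lineloss-report**:
   - Compare the structure of `SKILL.toml`.
   - Compare the key functions in `scripts/*.js` (`buildModeRequest`, `detectMode`, `normalizeRows`).
   - Compare the bootstrap and params configuration in `scene.toml`.

3. **Produce the integration test report**:
   - File: `docs/superpowers/reports/2026-04-17-integration-test-report.md`
   - Contents: gap list, quality score, open issues

4. **Record the gap list**:
   - Which fields are still not extracted correctly
   - Which logic still needs manual correction
   - Which scenes are still unsuitable for automation
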
**Verification**:

- The integration test report has been written.
- At least one scene reaches 80% or more of the quality of tq-lineloss-report.

**Commit message**:

```
docs: add integration test report for scene generator quality

Generated skills for user-daily-power and tq-lineloss scenes. Compared
against manually-authored tq-lineloss-report. Quality assessment and
gap analysis documented.
```

---

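## Execution Order

```
Task 1 → Task 2 → Task 3 → Task 4 → Task 5 → Task 6 → Task 7 → Task 8
├ Phase 1: foundation ─┤   ├ Phase 2: extraction ─┤   ├── Phase 3 ──┤
```

- The three Phase 1 tasks have dependencies: Task 1 must finish first, after which Task 2 and Task 3 can run in parallel.
- The three Phase 2 tasks (Tasks 4, 5, and 6) can run in parallel. Tasks 4 and 6 both edit `llm-client.js`, and Task 5 also touches `buildDeepAnalyzePrompt` in that file, so their changes need to be merged with care.
- Phase 3 depends on Phases 1 and 2 being fully complete.
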
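## Risks and Mitigations

| Risk | Impact | Mitigation |
|------|--------|------------|
| The LLM follow-up prompt lengthens generation time | Worse user experience | Limit to 1 follow-up with a 120 s timeout |
| After unification, JS generated for SingleRequestTable scenes contains unnecessary mode logic | Larger scripts | The default mode's condition check is trivial; the performance impact is negligible |
| Too many business JS files push the prompt over its limit | The LLM cannot process the prompt | Limit to 4 files at 600 characters each |
| The `processData` change affects scenes that currently work | Regressions | Set it to false only for form-urlencoded; JSON requests are unaffected |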