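sgClaw Scene Generator Quality Improvement: Implementation Plan
Corresponding design document:
docs/superpowers/specs/2026-04-17-scene-generator-quality-improvement-design.md
Overview
3 phases, 8 tasks. Each task lists the files changed, the concrete steps, how to verify, and the commit message.
Phase 1: Fix the foundations
Task 1: Unify the generation path (deprecate browser_script_with_business_logic)
File: src/generated_scene/generator.rs
Current state (line 728-735):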
fn compile_scene(scene_ir: &SceneIr, analysis: &SceneSourceAnalysis, tool_name: &str) -> CompiledScene {
    let scene_toml = render_scene_toml(scene_ir, analysis, tool_name);
    let browser_script = match scene_ir.workflow_archetype() {
        WorkflowArchetype::SingleRequestTable => compile_single_request_table(scene_ir),
        WorkflowArchetype::MultiModeRequest => compile_multi_mode_request(scene_ir),
        WorkflowArchetype::PaginatedEnrichment => compile_paginated_enrichment(scene_ir),
        WorkflowArchetype::PageStateEval => compile_page_state_eval(scene_ir),
    };
    ...
}
Steps:
- Modify the compile_scene routing logic (line 730-735):
  - SingleRequestTable no longer calls compile_simple_request_script (the function underlying compile_single_request_table). Instead, the single-mode scene is wrapped into one mode and compiled through compile_multi_mode_request
  - Add a helper function ensure_modes_populated(scene_ir: &SceneIr) -> SceneIr:
    - If scene_ir.modes is empty but scene_ir.api_endpoints is non-empty, generate a default mode
    - Change the workflow_archetype of SingleRequestTable and PageStateEval scenes to MultiModeRequest (because they now go through the unified modes path)
  - Modify the match arms:

        let browser_script = match scene_ir.workflow_archetype() {
            WorkflowArchetype::MultiModeRequest => compile_multi_mode_request(scene_ir),
            WorkflowArchetype::PaginatedEnrichment => compile_paginated_enrichment(scene_ir),
            _ => {
                // SingleRequestTable, PageStateEval: fall back to multi-mode with a default mode
                let adapted = ensure_modes_populated(scene_ir);
                compile_multi_mode_request(&adapted)
            }
        };

- Implement ensure_modes_populated:
  - Takes &SceneIr and returns a SceneIr (clone)
  - If modes is already non-empty, return the clone unchanged
  - If modes is empty but api_endpoints is non-empty:
    - Build the default mode from the first endpoint
    - Set the mode fields:
      - name: "default", label: Some("default")
      - condition: { field: "period_mode", operator: "equals", value: "default" }
      - apiEndpoint: copy of the first endpoint
      - requestTemplate: taken from scene_ir.request_template
      - responsePath: taken from scene_ir.response_path
      - normalizeRules: taken from scene_ir.normalize_rules, or the default
      - columnDefs: taken from scene_ir.column_defs
    - Also set default_mode = Some("default") and mode_switch_field = Some("period_mode")
- Mark browser_script_with_business_logic as deprecated (if it still exists in the code):
  - In the current code this function no longer exists (it has been replaced by compile_simple_request_script). Add a comment noting "legacy path, superseded by multi-mode unified path"
Verification:
- cargo check reports no compile errors
- The JS script generated for a single-mode scene contains const MODES = and the detectMode logic
Commit message:
feat(generator): unify all scene types through multi-mode path
Single-mode and page-state-eval scenes now get auto-wrapped into a
default mode and compiled through compile_multi_mode_request. This
eliminates the old browser_script_with_business_logic code path and
ensures all scenes get responsePath extraction, requestTemplate, and
contentType support.
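Task 2: Fix the jQuery processData parameter
File: src/generated_scene/generator.rs (the compile_multi_mode_request function, line 1069-1253)
Current state: in the template, the buildModeRequest function (line 1098-1118) picks the body serialization by contentType (form-urlencoded uses Object.entries().join('&'), JSON uses JSON.stringify), but the jQuery ajax call (line 1185-1196) does not set the processData parameter.
With the default processData: true, jQuery converts any non-string data into a query string itself. If a form-urlencoded body is serialized both by buildModeRequest and by jQuery, the payload ends up double-encoded, so the template should state explicitly that the body is already serialized.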
Steps:
- Modify the jQuery ajax call template in compile_multi_mode_request (around line 1185-1196):
  - Add the processData parameter to $.ajax({...}):

        // contentType is read from the request object so it is in scope here
        const contentType = request.headers['Content-Type'];
        $.ajax({
          url: request.url,
          type: request.method,
          data: request.body,
          contentType: contentType,
          processData: contentType !== 'application/x-www-form-urlencoded',
          dataType: 'json',
          success: resolve,
          error: (xhr, status, err) => reject(new Error(`API failed (${xhr.status}): ${err}`))
        });

  - The contentType variable must be accessible inside the Promise callback; extract it from the request object
- Apply the same change to the jQuery ajax call in compile_simple_request_script (around line 994-1004), adding the same processData logic
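The processData decision above can be isolated into a small pure function, which makes it easy to unit-test outside the generated template. The following is only a sketch: buildAjaxOptions is a hypothetical helper name, the request shape ({ url, method, body, headers }) is assumed from the template snippet, the /api/report URL is made up, and stripping a "; charset=..." suffix before comparing is an added assumption rather than something the plan specifies.

```javascript
// Hypothetical helper: builds the options object that would be passed to $.ajax.
// `request` is assumed to look like { url, method, body, headers: { 'Content-Type': ... } }.
function buildAjaxOptions(request) {
  const rawContentType =
    (request.headers && request.headers['Content-Type']) || 'application/json';
  // Drop parameters such as "; charset=UTF-8" before comparing media types.
  const mediaType = rawContentType.split(';')[0].trim().toLowerCase();
  return {
    url: request.url,
    type: request.method,
    data: request.body,
    contentType: rawContentType,
    // A form-urlencoded body is already serialized, so jQuery must not process it again.
    processData: mediaType !== 'application/x-www-form-urlencoded',
    dataType: 'json',
  };
}

const formOptions = buildAjaxOptions({
  url: '/api/report', // made-up URL for illustration
  method: 'POST',
  body: 'orgNo=1001&period=2026-04',
  headers: { 'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8' },
});

const jsonOptions = buildAjaxOptions({
  url: '/api/report',
  method: 'POST',
  body: JSON.stringify({ orgNo: '1001' }),
  headers: { 'Content-Type': 'application/json' },
});

console.log(formOptions.processData); // false
console.log(jsonOptions.processData); // true
```

Comparing only the media type keeps the check stable if a source page appends a charset parameter to the header value.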
Verification:
- The $.ajax call in the generated JS contains the processData parameter
- form-urlencoded requests are not double-encoded
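When checking a captured request body for the second point, it helps to know what double encoding looks like. The sketch below is illustrative only: toFormBody is a hypothetical stand-in for the serialization done in buildModeRequest, and the field names are invented. Encoding a value twice turns each "%" into "%25", so a "%25" sequence in the payload is a useful hint, although a literal "%" in a field value produces it as well.

```javascript
// Serializes a flat object as application/x-www-form-urlencoded.
function toFormBody(params) {
  return Object.entries(params)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&');
}

// Encoded once: the "&" inside the value becomes "%26".
const once = toFormBody({ orgName: 'A&B', period: '2026-04' });
// Encoding the already-encoded string again turns every "%" into "%25".
const twice = toFormBody({ payload: once });

console.log(once);  // orgName=A%26B&period=2026-04
console.log(twice); // payload=orgName%3DA%2526B%26period%3D2026-04
```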
Commit message:
fix(generator): add processData to jQuery ajax for form-urlencoded requests
With the default processData:true, jQuery runs any non-string data through
$.param, which can double-encode form-urlencoded payloads that the generator
has already serialized. Set processData:false when contentType is
application/x-www-form-urlencoded.
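Task 3: Auto-wrap single-mode scenes into a mode configuration
File: frontend/scene-generator/llm-client.js
Current state: analyzeSceneDeep (line 729-769) calls the LLM and returns the result of normalizeSceneIr directly. If the LLM outputs modes: [] but does provide apiEndpoints, nothing wraps them into a mode.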
Steps:
- In analyzeSceneDeep, after normalizeSceneIr(...) and before returning, add the auto-wrap logic:

      async function analyzeSceneDeep(sourceDir, dirContents, config) {
        const content = await requestChatCompletionWithRetry(...);
        const normalized = normalizeSceneIr(await extractJsonFromResponseWithRepair(content, config));
        // ... existing sceneId validation ...

        // AUTO-WRAP: single-mode scenes → modes array
        if (normalized.modes.length === 0 && normalized.apiEndpoints.length > 0) {
          normalized.modes.push({
            name: "default",
            label: "default",
            condition: { field: "period_mode", operator: "equals", value: "default" },
            apiEndpoint: normalized.apiEndpoints[0],
            columnDefs: normalized.columnDefs || [],
            requestTemplate: normalized.requestTemplate || {},
            normalizeRules: normalized.normalizeRules || { type: "validate_required", requiredFields: [], filterNull: true },
            responsePath: normalized.responsePath || "",
          });
          normalized.defaultMode = "default";
          normalized.modeSwitchField = "period_mode";
          // Upgrade archetype if it was single_request_table
          if (normalized.workflowArchetype === "single_request_table") {
            normalized.workflowArchetype = "multi_mode_request";
          }
        }
        return normalized;
      }

- Also make sure normalizeSceneIr gives defaultMode and modeSwitchField correct default values (already handled at line 477-478)
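Verification:
- Run generation on a single-mode scene (for example 用户日电量监测, the user daily electricity monitoring scene) and confirm that the modes array contains one default mode
- Confirm that workflowArchetype is upgraded to multi_mode_request
Commit message: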
feat(llm-client): auto-wrap single-mode scenes into modes array
When the LLM returns an empty modes array but has apiEndpoints,
automatically create a default mode with the first endpoint,
requestTemplate, responsePath, and normalizeRules. This ensures all
scenes compile through the multi-mode path.
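Phase 2: Improve extraction
Task 4: Strengthen the mandatory constraints in the LLM prompt
File: frontend/scene-generator/llm-client.js (DEEP_SYSTEM_PROMPT, line 19-82)
Current state: the prompt already lists the schema but does not stress which fields must be filled. The LLM often skips contentType, responsePath and requestTemplate.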
Steps:
- After the schema definition in DEEP_SYSTEM_PROMPT, add a paragraph of mandatory field constraints:

      MANDATORY FIELDS (never leave empty):
      - apiEndpoints[].contentType: detect from source code.
        * For $.ajax({}): look for 'contentType' property. Default 'application/x-www-form-urlencoded' if absent (jQuery default).
        * For $http.sendByAxios(): contentType is 'application/json' (axios default).
        * For XMLHttpRequest: look for setRequestHeader('Content-Type', ...).
        * For form submissions: 'application/x-www-form-urlencoded'.
      - modes[].responsePath: the JSON path from raw API response to the data array.
        * Common patterns: 'data.list', 'data.rcvblAcctSumAll.rcvblAcctVOS', 'content', 'data.records'
        * If response is the array itself, use empty string "".
      - modes[].requestTemplate: the static request body shape from the source code.
        * Extract ALL keys that appear in the request body object.
        * Mark dynamic values as "${args.fieldName}" and static values as literals.
      - apiEndpoints[].url: the full API URL as seen in the source code.

      RULES:
      - If you cannot determine contentType, default to 'application/json'.
      - If you cannot determine responsePath, default to '' (empty string).
      - If you cannot determine requestTemplate, use {} (empty object).
      - NEVER leave these fields as null or undefined.

- Insert this text into DEEP_SYSTEM_PROMPT after the schema definition and before Instructions
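Verification:
- Run generation on the 营销2.0零度户报表数据生成 scene (Marketing 2.0 zero-consumption customer report generation) and confirm that contentType and responsePath in the LLM output are no longer empty
- Confirm that requestTemplate contains the fields the business logic requires
Commit message: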
feat(llm-client): add mandatory field constraints to DEEP_SYSTEM_PROMPT
Explicitly require LLM to fill contentType, responsePath, and
requestTemplate with detected values or defaults. Reduces empty-field
rate from ~60% to target ~10%.
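Task 5: Add business JS file extraction
Files:
- frontend/scene-generator/server.js
- frontend/scene-generator/generator-runner.js
- frontend/scene-generator/llm-client.js (buildDeepAnalyzePrompt, see step 2)
Current state: readDirectory in generator-runner.js already reads every file into dirContents, but buildDeepAnalyzePrompt (llm-client.js line 125-157) mainly pushes fragments of index.html. The contents of business JS files (such as js/mca.js, js/sgApi.js) are not extracted and pushed separately.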
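Steps:
- Add business JS file detection in generator-runner.js:
  - In the buildAnalysisContext function, add a businessJsFragments array
  - Identify .js files under the js/ directory (excluding third-party libraries such as vue.js and element-ui)
  - For each business JS file, extract up to 600 characters of key fragments (function definitions, API calls, config objects)
  - Store the result in analysisContext.businessJsFragments
- Push the business JS fragments in buildDeepAnalyzePrompt in llm-client.js:
  - After the existing pushFragments calls, add: pushFragments(parts, "business JS files", context.businessJsFragments, 4);
  - Make sure the total prompt size does not exceed MAX_DEEP_PROMPT_CHARS (60000)
- Make sure the business JS files are read in server.js:
  - Check whether the readDirectory call in the /handle-analyze-deep endpoint already reads the files under js/
  - If it does not, add logic to read js/*.js files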
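The detection and truncation in the first step might look roughly like the sketch below. Several things here are assumptions rather than the project's actual code: collectBusinessJsFragments is a hypothetical name, the input is modelled as an array of { path, content } records (the real shape of dirContents is not shown in this plan), "key fragments" is simplified to the first 600 characters of each file, and the jquery and *.min.js exclusion patterns are additions beyond the two libraries the plan names.

```javascript
// Assumed exclusion patterns. The plan names vue.js and element-ui;
// jquery and *.min.js are added here for illustration only.
const THIRD_PARTY_PATTERNS = [/vue/i, /element-ui/i, /jquery/i, /\.min\.js$/i];
const FRAGMENT_CHARS = 600; // per-file cap from the plan
const MAX_FRAGMENTS = 4;    // same limit that is passed to pushFragments

// `files` is a hypothetical array of { path, content } records.
function collectBusinessJsFragments(files) {
  return files
    // Keep only .js files that sit directly inside a js/ directory.
    .filter((file) => /(^|\/)js\/[^/]+\.js$/.test(file.path))
    // Drop files whose name looks like a third-party library.
    .filter((file) => {
      const fileName = file.path.split('/').pop();
      return !THIRD_PARTY_PATTERNS.some((pattern) => pattern.test(fileName));
    })
    .slice(0, MAX_FRAGMENTS)
    .map((file) => ({ path: file.path, text: file.content.slice(0, FRAGMENT_CHARS) }));
}

const fragments = collectBusinessJsFragments([
  { path: 'index.html', content: '<html></html>' },
  { path: 'js/vue.js', content: '/* library */' },
  { path: 'js/element-ui.js', content: '/* library */' },
  { path: 'js/mca.js', content: 'function query() {}'.padEnd(1000, ' ') },
  { path: 'js/sgApi.js', content: 'var api = {};' },
]);

console.log(fragments.map((fragment) => fragment.path)); // [ 'js/mca.js', 'js/sgApi.js' ]
console.log(fragments[0].text.length); // 600
```

Name-based exclusion is deliberately crude: a business file whose name happens to contain "vue" would be skipped too, so an explicit allow or deny list may be a better fit for real scenes.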
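Verification:
- Run the 台区线损大数据 scene (transformer-district line-loss big data) and confirm that the contents of js/mca.js or a similar business file are pushed to the LLM
- Confirm that the total prompt size does not exceed 60000 characters
Commit message: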
feat(scene-generator): extract business JS files for LLM analysis
Identify and push js/ directory business logic files (mca.js, sgApi.js,
etc.) to the LLM prompt. Exclude third-party libraries. Capped at 4
fragments to stay within MAX_DEEP_PROMPT_CHARS budget.
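Task 6: Post-extraction validation and a follow-up prompt
File: frontend/scene-generator/llm-client.js
Current state: after receiving the LLM response, analyzeSceneDeep runs normalizeSceneIr and returns immediately, without checking whether key fields are missing.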
Steps:
- Add a validateExtractedSceneInfo(sceneIr) function:

      function validateExtractedSceneInfo(sceneIr) {
        const issues = [];
        // Check: at least one apiEndpoint has contentType
        const endpointsWithCt = (sceneIr.apiEndpoints || []).filter(
          ep => ep && ep.contentType
        );
        if ((sceneIr.apiEndpoints || []).length > 0 && endpointsWithCt.length === 0) {
          issues.push("missing_contentType_on_endpoints");
        }
        // Check: at least one mode has responsePath (if modes exist)
        if ((sceneIr.modes || []).length > 0) {
          const modesWithPath = sceneIr.modes.filter(m => m.responsePath !== undefined && m.responsePath !== null);
          if (modesWithPath.length === 0) {
            issues.push("missing_responsePath_on_modes");
          }
        }
        // Check: workflowArchetype is set
        if (!sceneIr.workflowArchetype) {
          issues.push("missing_workflowArchetype");
        }
        return issues;
      }

- In analyzeSceneDeep, call the validation after normalizeSceneIr:

      const issues = validateExtractedSceneInfo(normalized);
      if (issues.length > 0) {
        // Secondary prompt
        const followUpPrompt = `The previous extraction has these issues:\n${issues.join('\n')}\nPlease re-analyze the source snippets and fill in the missing fields. Use defaults if truly unavailable.`;
        const followUpContent = await requestChatCompletionWithRetry(
          [
            { role: "system", content: DEEP_SYSTEM_PROMPT },
            { role: "user", content: followUpPrompt },
          ],
          { ...config, maxTokens: 2400, timeoutMs: DEEP_REQUEST_TIMEOUT_MS, retryAttempts: 1 }
        );
        const repaired = normalizeSceneIr(await extractJsonFromResponseWithRepair(followUpContent, config));
        // Merge repaired fields into normalized (only fill empty fields)
        Object.assign(normalized, mergeSceneIrFields(repaired, normalized));
      }

- Add a mergeSceneIrFields(repaired, original) helper:
  - A field of original is overwritten with the repaired value only when it is empty or still at its default
  - This avoids losing valid information from the first extraction
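One possible shape for mergeSceneIrFields is sketched below. It is not the project's implementation: it assumes that "empty" means null, undefined, an empty string, an empty array or an empty object, it returns a patch object so that it fits the Object.assign call in the previous step, and it matches apiEndpoints by index when filling a missing contentType. The isEmptyValue helper and the sample data are hypothetical.

```javascript
// Treats null, undefined, "", [] and {} as empty (an assumption about "empty/default").
function isEmptyValue(value) {
  if (value === null || value === undefined || value === '') return true;
  if (Array.isArray(value)) return value.length === 0;
  if (typeof value === 'object') return Object.keys(value).length === 0;
  return false;
}

// Returns a patch with only the fields that are empty in `original`
// and non-empty in `repaired`; existing values are never overwritten.
function mergeSceneIrFields(repaired, original) {
  const patch = {};
  for (const key of Object.keys(repaired)) {
    if (isEmptyValue(original[key]) && !isEmptyValue(repaired[key])) {
      patch[key] = repaired[key];
    }
  }
  // Endpoints are matched by index (an assumption); only a missing contentType is filled.
  if (Array.isArray(original.apiEndpoints) && original.apiEndpoints.length > 0) {
    const repairedEndpoints = repaired.apiEndpoints || [];
    patch.apiEndpoints = original.apiEndpoints.map((endpoint, index) => {
      const candidate = repairedEndpoints[index];
      const canFill = endpoint && isEmptyValue(endpoint.contentType) &&
        candidate && !isEmptyValue(candidate.contentType);
      return canFill ? { ...endpoint, contentType: candidate.contentType } : endpoint;
    });
  }
  return patch;
}

const original = {
  sceneId: 'demo',
  workflowArchetype: '',
  apiEndpoints: [{ url: '/api/a', contentType: '' }],
};
const repaired = {
  sceneId: 'other',
  workflowArchetype: 'multi_mode_request',
  apiEndpoints: [{ url: '/api/b', contentType: 'application/json' }],
};
Object.assign(original, mergeSceneIrFields(repaired, original));

console.log(original.sceneId);           // demo
console.log(original.workflowArchetype); // multi_mode_request
console.log(original.apiEndpoints[0]);   // { url: '/api/a', contentType: 'application/json' }
```

Because an empty string counts as empty here, a responsePath of "" (a valid value when the response is the array itself) can still be replaced by a non-empty repaired value. Whether that is desirable depends on how much the follow-up answer is trusted.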
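Verification:
- Simulate an LLM response that is missing contentType and confirm that the follow-up prompt is triggered
- Confirm that at most 1 follow-up is sent and that no infinite loop is possible
Commit message: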
feat(llm-client): add post-extraction validation with one-shot retry
After LLM returns scene IR, validate that critical fields (contentType,
responsePath, workflowArchetype) are present. If missing, send one
follow-up prompt to fill gaps. Merges repaired fields without overwriting
valid data from the first extraction.
Phase 3: Testing and verification
Task 7: Unit tests
File: tests/scene_generator_modes_test.rs (new)
Steps:
- Create the test file tests/scene_generator_modes_test.rs
- Write 5 test cases:

      #[cfg(test)]
      mod tests {
          use super::*; // adjust imports as needed
          use crate::generated_scene::generator::*;
          use crate::generated_scene::ir::*;
          use serde_json::json;

          #[test]
          fn test_single_mode_generates_modes_array() {
              // Create a SingleRequestTable scene with one endpoint
              let scene_ir = make_test_scene_ir();
              // ... assertions: generated JS contains "const MODES ="
          }

          #[test]
          fn test_multi_mode_generates_mode_routing() {
              // Create a MultiModeRequest scene with two modes
              // ... assertions: generated JS contains "detectMode"
          }

          #[test]
          fn test_snake_camel_consistency() {
              // Verify field name serialization is consistent
              // between Rust (snake_case) and JS (camelCase)
          }

          #[test]
          fn test_form_urlencoded_request_body() {
              // Create a mode with contentType = "application/x-www-form-urlencoded"
              // ... assertions: body is Object.entries().join('&'), not JSON.stringify
          }

          #[test]
          fn test_response_path_extraction_in_template() {
              // Create a mode with responsePath = "data.list"
              // ... assertions: generated JS contains "safeGet(raw, mode.responsePath"
          }
      }

- Each test builds a SceneIr instance, calls compile_multi_mode_request, and then checks that the generated string contains the expected code fragments
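Verification:
- cargo test --test scene_generator_modes_test passes in full (the --test flag selects the integration test file; a bare name would only filter by test function name)
Commit message: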
test: add unit tests for multi-mode generation path
Covers: single-mode auto-wrap, multi-mode routing, snake/camel
consistency, form-urlencoded body format, and responsePath extraction.
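Task 8: Integration tests
Steps:
- Run full generation for two representative scenes:
  - Simple scene: 用户日电量监测 (pattern C, direct AJAX)
  - Complex scene: 台区线损大数据-月_周累计线损率统计分析 (pattern A, dual mode)
- Compare the generated output against tq-lineloss-report:
  - Compare the SKILL.toml structure
  - Compare the key functions in scripts/*.js (buildModeRequest, detectMode, normalizeRows)
  - Compare the bootstrap and params configuration in scene.toml
- Produce the integration test report:
  - File: docs/superpowers/reports/2026-04-17-integration-test-report.md
  - Content: gap list, quality score, open issues
- Record the gap list:
  - Which fields are still not extracted correctly
  - Which logic still needs manual correction
  - Which scenes are still unsuitable for automation
Verification:
- The integration test report has been written
- At least one scene reaches 80% or more of the quality of tq-lineloss-report, as measured by the quality score in the report
Commit message: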
docs: add integration test report for scene generator quality
Generated skills for user-daily-power and tq-lineloss scenes. Compared
against manually-authored tq-lineloss-report. Quality assessment and
gap analysis documented.
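Execution order
Task 1 → Task 2 → Task 3 → Task 4 → Task 5 → Task 6 → Task 7 → Task 8
├─ Phase 1: Fix the foundations ─┤ ├─ Phase 2: Improve extraction ─┤ ├─ Phase 3 ─┤
The three tasks in Phase 1 have dependencies: Task 1 must finish first, after which Task 2 and Task 3 can run in parallel. The three tasks in Phase 2 can also run in parallel, but Tasks 4, 5 and 6 all edit llm-client.js (DEEP_SYSTEM_PROMPT, buildDeepAnalyzePrompt and analyzeSceneDeep respectively), so parallel branches touch different parts of the same file and need to be merged with care. Phase 3 depends on Phase 1 and Phase 2 being fully complete.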
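Risks and mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| The LLM follow-up prompt increases generation time | Worse user experience | Limit to 1 follow-up, with a 120s timeout |
| After unification, the JS generated for SingleRequestTable scenes contains unnecessary mode logic | Larger script size | The default-mode condition check is simple, so the performance impact is negligible |
| Too many business JS files push the prompt over the limit | The LLM cannot process the prompt | Limit to 4 files, 600 characters each |
| The processData change affects scenes that currently work | Regressions | Set false only for form-urlencoded; JSON requests are unaffected |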