diff --git a/docs/superpowers/plans/2026-04-10-fault-details-full-skill-alignment-plan.md b/docs/superpowers/plans/2026-04-10-fault-details-full-skill-alignment-plan.md
new file mode 100644
index 0000000..4b10566
--- /dev/null
+++ b/docs/superpowers/plans/2026-04-10-fault-details-full-skill-alignment-plan.md
@@ -0,0 +1,672 @@
+# Fault Details Full Skill Alignment Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Upgrade `fault-details-report.collect_fault_details` into a real staged browser skill that matches the original fault-details workflow, and make `claw-new` interpret the returned artifact status correctly in the direct-submit path.
+
+**Architecture:** Keep routing and direct-skill selection in `claw-new`, but move all fault-details collection, normalization, classification, summary, export, and report-log behavior into the staged skill under `skill_staging`. Implement the staged skill as a true browser-eval entrypoint that remains valid in page context, while exposing testable pure helpers through an environment-safe export guard for `node:test`; then add a narrow Rust artifact interpreter in `src/compat/direct_skill_runtime.rs` so `ok` / `partial` / `empty` map to successful task completion while `blocked` / `error` map to failed completion.
+
+**Tech Stack:** Rust 2021, `serde_json`, existing `BrowserPipeTool` / `browser_script` runtime, `node:test`, staged skill fixtures, Cargo integration tests.
+
+---
+
+## Execution Context
+
+- Follow @superpowers:test-driven-development for every behavior change.
+- Follow @superpowers:verification-before-completion before claiming each task is done.
+- Do **not** create a git worktree unless the user explicitly asks. This repo preference is already established.
+- Keep scope tight. Do **not** add a new browser protocol, new dispatch metadata, new UI opener behavior, or Rust-side fault classification logic.
+- Keep the current direct path bootstrap requirement intact: the user instruction must still include an explicit `YYYY-MM`, but the staged skill must treat the page-selected range as the source of truth for collection once execution begins.
+- Preserve parity with the original package’s real behavior: port the original classification table, `qxxcjl`-based reason heuristics, canonical detail mapping, summary aggregation rules, localhost export call, and report-log call into the staged skill rather than implementing a fixture-only subset.
+
+## File Map
+
+### Existing files to modify in `claw-new`
+
+- Modify: `src/compat/direct_skill_runtime.rs`
+  - add narrow structured artifact parsing and status-to-summary mapping
+  - keep direct-skill routing/config ownership unchanged
+- Modify: `tests/agent_runtime_test.rs`
+  - add direct-submit regressions for `ok`, `partial`, `empty`, `blocked`, and `error`
+- Modify: `tests/browser_script_skill_tool_test.rs`
+  - add browser-script execution-shape regression for browser-eval return payloads used by fault-details
+
+### Existing files to modify in `skill_staging`
+
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
+  - replace empty shell with browser-eval entrypoint plus parity helpers
+- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js`
+  - deterministic fixture coverage for normalization, classification, summary, artifact contract, export/logging degradation, and entrypoint shape helpers
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.toml`
+  - align tool description with real collection/export/report-log behavior
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.md`
+  - align written contract with actual runtime behavior and artifact fields
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/references/collection-flow.md`
+  - align flow with page-range/query/export/report-log sequence
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/references/data-quality.md`
+  - make canonical columns, original classification tables, reason heuristics, summary rules, and partial semantics explicit
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/fault-details-report/scene.json`
+  - keep scene output/state contract aligned with real staged artifact behavior
+
+### Existing files to read but not redesign
+
+- Read only: `docs/superpowers/specs/2026-04-10-fault-details-full-skill-alignment-design.md`
+- Read only: `src/agent/mod.rs`
+- Read only: `src/compat/browser_script_skill_tool.rs`
+- Read only: `D:/desk/智能体资料/大四区报告监测项/故障明细/index.html`
+
+---
+
+### Task 1: Add staged-skill red tests for normalization, summary, and artifact-contract semantics
+
+**Files:**
+- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js`
+- Read only: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
+- Read only: `D:/desk/智能体资料/大四区报告监测项/故障明细/index.html`
+
+- [ ] **Step 1: Write the failing staged-skill test file**
+
+Add `collect_fault_details.test.js` using `node:test` and `assert/strict`.
+Cover these behaviors with fixed fixtures:
+
+```javascript
+const test = require('node:test');
+const assert = require('node:assert/strict');
+
+const {
+  DETAIL_COLUMNS,
+  SUMMARY_COLUMNS,
+  normalizeDetailRow,
+  deriveSummaryRows,
+  determineArtifactStatus,
+  buildFaultDetailsArtifact,
+  buildBrowserEntrypointResult
+} = require('./collect_fault_details.js');
+
+test('normalizeDetailRow maps canonical detail fields from raw repair rows', () => {
+  const row = normalizeDetailRow({
+    qxdbh: 'QX-1',
+    bxsj: '2026-03-09 08:00:00',
+    cityName: '国网兰州供电公司',
+    maintOrgName: '城关供电服务班',
+    maintGroupName: '抢修一班',
+    bdzMc: '110kV东岗变',
+    xlmc10: '10kV东岗线',
+    byqmc: '东岗1号变',
+    yjflMc: '电网故障',
+    ejflMc: '线路故障',
+    sjflMc: '低压线路',
+    qxxcjl: '现场检查:低压线路断线,已处理完成',
+    gzms: '客户报修停电'
+  }, {
+    companyName: '国网兰州供电公司'
+  });
+
+  assert.equal(row.slsj, '2026-03-09 08:00:00');
+  assert.equal(row.gssgs, '甘肃省电力公司');
+  assert.equal(row.gddw, '城关供电服务班');
+  assert.equal(row.gds, '抢修一班');
+  assert.equal(row.clzt, '处理完成');
+  assert.equal(row.bdz, '110kV东岗变');
+  assert.equal(row.line, '10kV东岗线');
+  assert.equal(row.pb, '东岗1号变');
+});
+
+test('deriveSummaryRows groups normalized rows by gds and computes counters', () => {
+  const rows = [
+    { gds: '抢修一班', gddw: '城关供电服务班', sgs: '国网兰州供电公司', sxfl1: '无效', sxfl2: '无效', gzsb: '' },
+    { gds: '抢修一班', gddw: '城关供电服务班', sgs: '国网兰州供电公司', sxfl1: '有效', sxfl2: '用户侧', gzsb: '表后线' },
+    { gds: '抢修一班', gddw: '城关供电服务班', sgs: '国网兰州供电公司', sxfl1: '有效', sxfl2: '电网侧', dwcFl: '低压故障', gzsb: '低压线路' }
+  ];
+
+  const summaryRows = deriveSummaryRows(rows, { companyName: '国网兰州供电公司' });
+  assert.equal(summaryRows.length, 1);
+  assert.equal(summaryRows[0].className, '抢修一班');
+  assert.equal(summaryRows[0].allCount, 3);
+  assert.equal(summaryRows[0].wxCount, 1);
+  assert.equal(summaryRows[0].khcCount, 0);
+  assert.equal(summaryRows[0].dyGzCount, 1);
+  assert.equal(summaryRows[0].dyxlCount, 1);
+  assert.equal(summaryRows[0].bhxCount, 1);
+});
+
+test('determineArtifactStatus follows blocked > error > partial > empty > ok precedence', () => {
+  assert.equal(determineArtifactStatus({ blockedReason: 'missing_session', fatalError: null, partialReasons: [], detailRows: [{}] }), 'blocked');
+  assert.equal(determineArtifactStatus({ blockedReason: null, fatalError: 'parse_failed', partialReasons: [], detailRows: [{}] }), 'error');
+  assert.equal(determineArtifactStatus({ blockedReason: null, fatalError: null, partialReasons: ['export_failed'], detailRows: [{}] }), 'partial');
+  assert.equal(determineArtifactStatus({ blockedReason: null, fatalError: null, partialReasons: [], detailRows: [] }), 'empty');
+  assert.equal(determineArtifactStatus({ blockedReason: null, fatalError: null, partialReasons: [], detailRows: [{}] }), 'ok');
+});
+
+test('buildFaultDetailsArtifact keeps canonical fields, selected range, counts, and downstream results', () => {
+  const artifact = buildFaultDetailsArtifact({
+    period: '2026-03',
+    selectedRange: { start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' },
+    detailRows: [{ qxdbh: 'QX-1' }],
+    summaryRows: [{ index: 1 }],
+    partialReasons: ['report_log_failed'],
+    downstream: {
+      export: { attempted: true, success: true, path: 'http://localhost/export.xlsx' },
+      report_log: { attempted: true, success: false, error: '500' }
+    }
+  });
+
+  assert.equal(artifact.type, 'report-artifact');
+  assert.equal(artifact.status, 'partial');
+  assert.deepEqual(artifact.selected_range, { start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' });
+  assert.equal(artifact.counts.detail_rows, 1);
+  assert.equal(artifact.counts.summary_rows, 1);
+  assert.deepEqual(artifact.partial_reasons, ['report_log_failed']);
+});
+
+test('buildFaultDetailsArtifact keeps required top-level fields for blocked artifact', () => {
+  const artifact = buildFaultDetailsArtifact({
+    period: '2026-03',
+    blockedReason: 'selected_range_unavailable',
+    partialReasons: ['selected_range_unavailable']
+  });
+
+  assert.equal(artifact.type, 'report-artifact');
+  assert.equal(artifact.report_name, 'fault-details-report');
+  assert.equal(artifact.period, '2026-03');
+  assert.equal(artifact.status, 'blocked');
+  assert.deepEqual(artifact.partial_reasons, ['selected_range_unavailable']);
+  assert.equal('downstream' in artifact, false);
+});
+
+test('buildFaultDetailsArtifact keeps known selected range and counts on late error', () => {
+  const artifact = buildFaultDetailsArtifact({
+    period: '2026-03',
+    selectedRange: { start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' },
+    detailRows: [],
+    summaryRows: [],
+    fatalError: 'summary_failed',
+    partialReasons: ['summary_failed']
+  });
+
+  assert.equal(artifact.status, 'error');
+  assert.deepEqual(artifact.selected_range, { start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' });
+  assert.equal(artifact.counts.detail_rows, 0);
+  assert.equal(artifact.counts.summary_rows, 0);
+});
+
+test('buildBrowserEntrypointResult returns blocked artifact when selected range is unavailable', async () => {
+  const artifact = await buildBrowserEntrypointResult({
+    period: '2026-03'
+  }, {
+    readSelectedRange: async () => null
+  });
+
+  assert.equal(artifact.status, 'blocked');
+  assert.ok(artifact.partial_reasons.includes('selected_range_unavailable'));
+});
+```
+
+- [ ] **Step 2: Run the staged-skill test file and verify it fails**
+
+Run:
+
+```bash
+node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
+```
+
+Expected: FAIL because `collect_fault_details.js` does not export these helpers yet and still only returns an empty shell.
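
The precedence that the `determineArtifactStatus` test asserts can be sketched as a chain of early returns. This is only a minimal illustration, assuming the helper receives exactly the four fields used in the fixtures above; the real helper in `collect_fault_details.js` may take more inputs or treat edge cases differently.

```javascript
// Hypothetical sketch of the status precedence: blocked > error > partial > empty > ok.
// The field names mirror the test fixtures; everything else is an assumption.
function determineArtifactStatus({ blockedReason, fatalError, partialReasons, detailRows }) {
  // A blocked run never reached collection, so it wins over every other signal.
  if (blockedReason) return 'blocked';
  // A fatal error outranks partial reasons that were recorded alongside it.
  if (fatalError) return 'error';
  // Collection succeeded but a later stage (export, report log) degraded.
  if (Array.isArray(partialReasons) && partialReasons.length > 0) return 'partial';
  // Clean run that simply found no rows.
  if (!Array.isArray(detailRows) || detailRows.length === 0) return 'empty';
  return 'ok';
}

console.log(determineArtifactStatus({
  blockedReason: null,
  fatalError: 'summary_failed',
  partialReasons: ['summary_failed'],
  detailRows: []
}));
```

Ordering the checks from most to least severe keeps the late-error fixture (`fatalError` set with empty `detailRows`) from being reported as `empty`.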
+
+---
+
+### Task 2: Implement staged-skill parity helpers and a valid browser entrypoint
+
+**Files:**
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
+- Test: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js`
+
+- [ ] **Step 1: Implement the helper exports and browser entrypoint shape needed to satisfy the red tests**
+
+Refactor `collect_fault_details.js` so the file remains a valid browser-eval script in page context while still supporting `node:test` through an environment-safe export guard.
+
+Required implementation pieces:
+
+```javascript
+const DETAIL_COLUMNS = [/* existing canonical columns */];
+const SUMMARY_COLUMNS = [/* existing summary columns */];
+
+function normalizeDetailRow(raw, context) {
+  // map qxdbh/gssgs/sgs/gddw/gds/slsj/clzt/bdz/line/pb
+  // derive sxfl1/sxfl2/sxfl3/gzsb/gzyy from the original package rules
+}
+
+function deriveSummaryRows(detailRows, context) {
+  // group by gds and compute all original package counters
+}
+
+function determineArtifactStatus({ blockedReason, fatalError, partialReasons, detailRows }) {
+  // blocked > error > partial > empty > ok
+}
+
+function buildFaultDetailsArtifact({
+  period,
+  selectedRange,
+  detailRows,
+  summaryRows,
+  partialReasons,
+  blockedReason,
+  fatalError,
+  downstream
+}) {
+  // return report-artifact with columns, sections, counts, status, partial_reasons, downstream
+}
+
+async function buildBrowserEntrypointResult(input, deps = defaultBrowserDeps()) {
+  // read selected range from page
+  // collect raw rows from page query
+  // normalize rows
+  // derive summary
+  // attempt export + report log
+  // return final artifact
+}
+
+if (typeof module !== 'undefined' && module.exports) {
+  module.exports = {
+    DETAIL_COLUMNS,
+    SUMMARY_COLUMNS,
+    normalizeDetailRow,
+    deriveSummaryRows,
+    determineArtifactStatus,
+    buildFaultDetailsArtifact,
+    buildBrowserEntrypointResult
+  };
+}
+
+// `args` is defined only under the browser-eval wrapper. Returning the promise
+// without a top-level `await` keeps the file parseable when Node loads it via `require`.
+if (typeof args !== 'undefined') {
+  return buildBrowserEntrypointResult(args);
+}
+```
+
+Rules:
+- keep `DETAIL_COLUMNS` and `SUMMARY_COLUMNS` canonical and stable
+- keep helper functions self-contained in this file unless a separate pure helper file becomes necessary for runtime validity
+- keep the browser entrypoint compatible with current `eval` wrapper
+- keep browser runtime free of unguarded Node-only assumptions
+- do **not** invent a new protocol or callback surface
+
+- [ ] **Step 2: Re-run the staged-skill test file and verify it now reaches deeper failures or passes the initial helper coverage**
+
+Run:
+
+```bash
+node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
+```
+
+Expected: either PASS for the Task 1 cases, or fail only on the still-missing full parity/export/history specifics added in Task 3.
+
+---
+
+### Task 3: Add red tests for full classification parity, downstream partials, and empty-result export semantics
+
+**Files:**
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js`
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
+- Read only: `D:/desk/智能体资料/大四区报告监测项/故障明细/index.html`
+
+- [ ] **Step 1: Extend the staged-skill tests with failing parity and downstream cases**
+
+Add focused failing tests such as:
+
+```javascript
+test('normalizeDetailRow derives gzyy from qxxcjl text heuristics', () => {
+  const row = normalizeDetailRow({
+    qxxcjl: '现场检查:客户表后线烧损,已恢复送电',
+    ejflMc: '客户侧故障',
+    sjflMc: '表后线'
+  }, { companyName: '国网兰州供电公司' });
+
+  assert.equal(row.gzsb, '表后线');
+  assert.equal(row.gzyy, '表后线烧损');
+});
+
+test('buildBrowserEntrypointResult returns partial when export fails after detail collection succeeds', async () => {
+  const artifact = await buildBrowserEntrypointResult({ period: '2026-03' }, {
+    readSelectedRange: async () => ({ start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' }),
+    queryFaultRows: async () => [{ qxdbh: 'QX-1', bxsj: '2026-03-09 08:00:00', maintGroupName: '抢修一班' }],
+    readCompanyContext: () => ({ companyName: '国网兰州供电公司' }),
+    exportWorkbook: async () => {
+      throw new Error('export_failed');
+    },
+    writeReportLog: async () => ({ success: true })
+  });
+
+  assert.equal(artifact.status, 'partial');
+  assert.ok(artifact.partial_reasons.includes('export_failed'));
+  assert.equal(artifact.counts.detail_rows, 1);
+  assert.equal(artifact.downstream.export.attempted, true);
+  assert.equal(artifact.downstream.export.success, false);
+});
+
+test('buildBrowserEntrypointResult returns error when normalized detail rows cannot be produced', async () => {
+  const artifact = await buildBrowserEntrypointResult({ period: '2026-03' }, {
+    readSelectedRange: async () => ({ start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' }),
+    queryFaultRows: async () => [{ qxdbh: '', bxsj: '' }],
+    readCompanyContext: () => ({ companyName: '国网兰州供电公司' })
+  });
+
+  assert.equal(artifact.status, 'error');
+  assert.ok(artifact.partial_reasons.includes('detail_normalization_failed'));
+});
+
+test('buildBrowserEntrypointResult keeps canonical rows empty for empty result and omits downstream before attempts', async () => {
+  const artifact = await buildBrowserEntrypointResult({ period: '2026-03' }, {
+    readSelectedRange: async () => ({ start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' }),
+    queryFaultRows: async () => [],
+    readCompanyContext: () => ({ companyName: '国网兰州供电公司' })
+  });
+
+  assert.equal(artifact.status, 'empty');
+  assert.deepEqual(artifact.rows, []);
+  assert.equal('downstream' in artifact, false);
+});
+```
+
+Also add fixture cases derived from the original package’s full classification table and summary counters so the staged skill is forced toward parity, not a subset implementation.
+
+- [ ] **Step 2: Run the staged-skill test file and verify it fails on the new cases**
+
+Run:
+
+```bash
+node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
+```
+
+Expected: FAIL on missing full classification parity or downstream partial/error behavior.
+
+- [ ] **Step 3: Implement the full business logic needed to satisfy the new tests**
+
+In `collect_fault_details.js`:
+- port the original classification table and `qxxcjl` text heuristics for `sxfl1`, `sxfl2`, `sxfl3`, `gzsb`, `gzyy`
+- port the original summary derivation rules and counters completely
+- add required-field validation so structurally unusable normalized rows escalate to `error`
+- add downstream `exportWorkbook` and `writeReportLog` stages that record `{attempted, success, path, error}`
+- keep collection success distinct from downstream failures so export/logging failures become `partial`, not full failure
+- keep placeholder rows, if needed for downstream empty-export payloads, downstream-only and never in canonical returned `rows`
+- include both `period` and `selected_range` in the artifact
+- omit `downstream` when export/report-log have not been attempted yet
+
+- [ ] **Step 4: Re-run the staged-skill test file and verify it passes**
+
+Run:
+
+```bash
+node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
+```
+
+Expected: PASS.
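
One way to keep downstream failures from turning into full failures, as the Step 3 bullets require, is to wrap each stage in a small recorder. The sketch below is an assumption-heavy illustration: `runDownstreamStage` is a hypothetical helper name, and the `<stage>_failed` reason format is inferred from the `export_failed` and `report_log_failed` strings in the tests rather than taken from the original package.

```javascript
// Hypothetical helper: run one downstream stage and record its outcome.
// A failure is pushed into partialReasons instead of being rethrown, so the
// collected rows stay valid and the artifact can degrade to `partial`.
async function runDownstreamStage(name, action, partialReasons) {
  try {
    const result = await action();
    const outcome = { attempted: true, success: true };
    // Record a path only when the stage actually reports one.
    if (result && result.path) outcome.path = result.path;
    return outcome;
  } catch (error) {
    partialReasons.push(`${name}_failed`);
    return {
      attempted: true,
      success: false,
      error: error instanceof Error ? error.message : String(error)
    };
  }
}

// Usage: an export stage that throws, followed by a report-log stage that succeeds.
const partialReasons = [];
runDownstreamStage('export', async () => { throw new Error('500'); }, partialReasons)
  .then((exportOutcome) => runDownstreamStage('report_log', async () => ({}), partialReasons)
    .then((logOutcome) => console.log({ export: exportOutcome, report_log: logOutcome, partialReasons })));
```

Because the recorder always returns an object with `attempted: true`, the caller can decide whether to attach `downstream` to the artifact simply by checking whether any stage was run.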
+
+---
+
+### Task 4: Align staged-skill metadata and reference docs with the implemented behavior
+
+**Files:**
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.toml`
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.md`
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/references/collection-flow.md`
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/references/data-quality.md`
+- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/fault-details-report/scene.json`
+
+- [ ] **Step 1: Update the staged metadata/docs to match the implemented runtime contract**
+
+Required changes:
+- `SKILL.toml`: description must say the tool collects rows, derives summary, attempts localhost export, and records report history
+- `SKILL.md`: artifact example must include `selected_range`, `counts`, `status`, `partial_reasons`, and `downstream`
+- `references/collection-flow.md`: sequence must explicitly include page-selected range -> raw query -> normalization -> summary -> export -> report-log
+- `references/data-quality.md`: document the original classification tables, `qxxcjl` heuristics, summary rules, partial/error escalation rules, and empty-result semantics explicitly enough to match the implemented helpers
+- `scene.json`: keep inputs/outputs/status semantics aligned with the richer artifact; do not add routing policy there
+
+- [ ] **Step 2: Read the updated staged docs and verify they match the implemented JS behavior**
+
+Read and confirm:
+- descriptions no longer claim “artifact shell” behavior
+- docs do not move routing ownership out of `claw-new`
+- docs do not promise auto-opening/downloading behavior in this slice
+- docs reflect blocked/error field-presence rules and downstream-attempt semantics
+
+Expected: staged metadata/docs accurately reflect the implemented collector.
+
+---
+
+### Task 5: Add Rust red tests for artifact-status interpretation in the direct-submit runtime
+
+**Files:**
+- Modify: `tests/agent_runtime_test.rs`
+- Modify: `tests/browser_script_skill_tool_test.rs`
+- Modify: `src/compat/direct_skill_runtime.rs`
+- Read only: `src/compat/browser_script_skill_tool.rs`
+
+- [ ] **Step 1: Add failing direct-submit runtime tests for structured artifact statuses**
+
+Extend `tests/agent_runtime_test.rs` with focused regressions that use the existing temp skill-root harness but return real `report-artifact` payloads:
+
+```rust
+#[test]
+fn submit_task_treats_partial_report_artifact_as_success_with_warning_summary() {
+    let skill_root = build_direct_runtime_skill_root();
+    let runtime_context = direct_submit_runtime_context(&skill_root);
+    let transport = Arc::new(MockTransport::new(vec![success_browser_response(
+        1,
+        serde_json::json!({
+            "text": {
+                "type": "report-artifact",
+                "report_name": "fault-details-report",
+                "period": "2026-03",
+                "selected_range": { "start": "2026-03-08 16:00:00", "end": "2026-03-09 16:00:00" },
+                "columns": ["qxdbh"],
+                "rows": [{ "qxdbh": "QX-1" }],
+                "sections": [{ "name": "summary-sheet", "columns": ["index"], "rows": [{ "index": 1 }] }],
+                "counts": { "detail_rows": 1, "summary_rows": 1 },
+                "status": "partial",
+                "partial_reasons": ["report_log_failed"],
+                "downstream": {
+                    "export": { "attempted": true, "success": true, "path": "http://localhost/export.xlsx" },
+                    "report_log": { "attempted": true, "success": false, "error": "500" }
+                }
+            }
+        }),
+    )]));
+    // ... invoke handle_browser_message_with_context(...)
+    // assert TaskComplete.success == true
+    // assert summary contains partial/report_log_failed/detail_rows=1
+}
+
+#[test]
+fn submit_task_treats_empty_report_artifact_as_success() { /* status=empty => success=true */ }
+
+#[test]
+fn submit_task_treats_blocked_report_artifact_as_failure() { /* status=blocked => success=false */ }
+
+#[test]
+fn submit_task_treats_error_report_artifact_as_failure() { /* status=error => success=false */ }
+```
+
+Also add one focused helper regression to `tests/browser_script_skill_tool_test.rs` that proves the browser-script helper can return a structured object payload used by the fault-details path without flattening required fields away.
+
+Suggested test name:
+
+```rust
+#[tokio::test]
+async fn execute_browser_script_tool_preserves_structured_report_artifact_payload() { /* ... */ }
+```
+
+- [ ] **Step 2: Run the focused Rust tests and verify they fail**
+
+Run:
+
+```bash
+cargo test --test agent_runtime_test submit_task_treats_partial_report_artifact_as_success_with_warning_summary -- --nocapture
+cargo test --test browser_script_skill_tool_test execute_browser_script_tool_preserves_structured_report_artifact_payload -- --nocapture
+```
+
+Expected: the new `agent_runtime_test` case fails because `execute_direct_submit_skill` still returns raw JSON text and `src/agent/mod.rs` still marks all direct-submit results as success when no Rust-side interpretation exists.
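
The status-to-success contract these Rust tests pin down can be illustrated independently of the runtime. The sketch below is written in JavaScript only to match the staged-skill tests; it is not the planned Rust interpreter, and the function name, summary format, and treatment of unknown statuses as failures are all assumptions.

```javascript
// Illustration of the contract only; the real interpreter is planned for
// src/compat/direct_skill_runtime.rs. Names and summary layout are hypothetical.
const SUCCESS_STATUSES = new Set(['ok', 'partial', 'empty']);

function interpretArtifactSketch(outputText) {
  let payload;
  try {
    payload = JSON.parse(outputText);
  } catch (_) {
    // Non-JSON output keeps the previous behavior: raw text, treated as success.
    return { success: true, summary: outputText };
  }
  if (!payload || payload.type !== 'report-artifact') {
    // Unrecognized payloads also fall back to the raw output text.
    return { success: true, summary: outputText };
  }
  const counts = payload.counts || {};
  const reasons = (payload.partial_reasons || []).join(',');
  const summary = [
    payload.report_name,
    payload.period,
    `status=${payload.status}`,
    `detail_rows=${counts.detail_rows ?? 0}`,
    `summary_rows=${counts.summary_rows ?? 0}`,
    reasons ? `reasons=${reasons}` : ''
  ].filter(Boolean).join(' ');
  // ok / partial / empty complete successfully; blocked / error (and, by
  // assumption, any unknown status) complete as failure.
  return { success: SUCCESS_STATUSES.has(payload.status), summary };
}

console.log(interpretArtifactSketch(JSON.stringify({
  type: 'report-artifact',
  report_name: 'fault-details-report',
  period: '2026-03',
  status: 'partial',
  counts: { detail_rows: 1, summary_rows: 1 },
  partial_reasons: ['report_log_failed']
})));
```

The interpreter reads only the envelope fields (`status`, `counts`, `partial_reasons`), which is what keeps fault classification and summary rules out of the runtime side.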
+
+---
+
+### Task 6: Implement narrow Rust artifact interpretation without moving business rules into Rust
+
+**Files:**
+- Modify: `src/compat/direct_skill_runtime.rs`
+- Modify: `tests/agent_runtime_test.rs`
+- Modify: `tests/browser_script_skill_tool_test.rs`
+
+- [ ] **Step 1: Implement a narrow structured-artifact interpreter in `src/compat/direct_skill_runtime.rs`**
+
+Add a small internal result type and parser, for example:
+
+```rust
+struct DirectSubmitOutcome {
+    success: bool,
+    summary: String,
+}
+
+fn interpret_direct_submit_output(output: &str) -> DirectSubmitOutcome {
+    // parse JSON if possible
+    // if type == "report-artifact", read status/counts/partial_reasons/downstream
+    // map ok/partial/empty => success=true
+    // map blocked/error => success=false
+    // build concise summary with report_name, period, detail_rows, summary_rows, status, partial reasons
+    // fall back to raw output text when payload is not a recognized artifact
+}
+```
+
+Then change the public entrypoint shape from `Result` to a narrow result carrying `success` and `summary`, or add a second helper that `src/agent/mod.rs` can use without changing routing ownership.
+
+Rules:
+- do **not** reimplement fault normalization/classification/summary in Rust
+- do **not** add fault-specific branching in `src/agent/mod.rs`
+- keep unrecognized non-artifact outputs working as before
+- keep explicit `YYYY-MM` derivation and configured `skill.tool` resolution unchanged
+
+- [ ] **Step 2: Update the submit-path caller to use the interpreted success flag**
+
+Adjust the direct-submit branch so `TaskComplete.success` comes from the artifact interpretation instead of blindly treating every `Ok(summary)` as success.
+
+Implementation target:
+- keep the direct path in `src/agent/mod.rs`
+- keep error handling narrow
+- if needed, return a dedicated direct-submit outcome from `execute_direct_submit_skill`
+
+- [ ] **Step 3: Re-run the focused Rust tests and verify they pass**
+
+Run:
+
+```bash
+cargo test --test agent_runtime_test submit_task_treats_partial_report_artifact_as_success_with_warning_summary -- --nocapture
+cargo test --test agent_runtime_test submit_task_treats_empty_report_artifact_as_success -- --nocapture
+cargo test --test agent_runtime_test submit_task_treats_blocked_report_artifact_as_failure -- --nocapture
+cargo test --test agent_runtime_test submit_task_treats_error_report_artifact_as_failure -- --nocapture
+cargo test --test browser_script_skill_tool_test execute_browser_script_tool_preserves_structured_report_artifact_payload -- --nocapture
+```
+
+Expected: PASS.
+
+---
+
+### Task 7: Run the full verification sweep for the staged skill and direct runtime
+
+**Files:**
+- Verify only
+
+- [ ] **Step 1: Run the staged-skill deterministic test file**
+
+Run:
+
+```bash
+node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
+```
+
+Expected: PASS.
+
+- [ ] **Step 2: Run the relevant Rust regression suites**
+
+Run:
+
+```bash
+cargo test --test browser_script_skill_tool_test -- --nocapture
+cargo test --test agent_runtime_test -- --nocapture
+```
+
+Expected: PASS.
+
+- [ ] **Step 3: Run the broader compatibility coverage and build**
+
+Run:
+
+```bash
+cargo test --test compat_runtime_test -- --nocapture
+cargo test --test compat_config_test -- --nocapture
+cargo build --bin sgclaw
+```
+
+Expected: PASS.
+
+- [ ] **Step 4: Manually verify the requirements against the approved spec**
+
+Checklist:
+- staged skill now reads page-selected range instead of inventing a month window after entry
+- staged skill returns canonical detail rows and summary rows
+- staged skill ports the original classification table, `qxxcjl` heuristics, and summary counters with parity coverage
+- staged skill records downstream export/report-log outcome
+- staged skill distinguishes `ok` / `partial` / `empty` / `blocked` / `error`
+- `blocked` / `error` artifacts keep the required top-level fields, and preserve known `selected_range` / `counts` when failure happens late enough
+- `downstream` is omitted when export/report-log were not attempted and included with attempted/success flags once they were attempted
+- empty-result canonical `rows` stay empty even if downstream export uses a placeholder transport row
+- `claw-new` maps `ok` / `partial` / `empty` to success and `blocked` / `error` to failure
+- no new routing metadata was added to `SKILL.toml` or `scene.json`
+- no new browser protocol or opener/UI behavior was introduced
+
+Expected: all checklist items satisfied before calling the work complete.
+
+---
+
+## Verification Checklist
+
+### Staged skill behavior
+
+```bash
+node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
+```
+
+Expected: deterministic fixture coverage passes for normalization, full classification parity, summary derivation, artifact shape, empty semantics, and downstream partial semantics.
+
+### Direct-submit runtime mapping
+
+```bash
+cargo test --test agent_runtime_test -- --nocapture
+```
+
+Expected:
+- valid artifact `ok` / `partial` / `empty` completes successfully
+- valid artifact `blocked` / `error` completes as failure
+- existing invalid config regression still passes
+- existing direct-submit happy path still passes
+
+### Browser-script helper safety
+
+```bash
+cargo test --test browser_script_skill_tool_test -- --nocapture
+```
+
+Expected: current browser-script execution semantics remain intact while returning structured artifact payloads.
+
+### Compatibility/build
+
+```bash
+cargo test --test compat_runtime_test -- --nocapture
+cargo test --test compat_config_test -- --nocapture
+cargo build --bin sgclaw
+```
+
+Expected: no regressions in compat execution/config loading; main binary builds cleanly.
+
+---
+
+## Notes For The Engineer
+
+- The paired spec is `docs/superpowers/specs/2026-04-10-fault-details-full-skill-alignment-design.md`.
+- Keep all fault business transforms in `skill_staging`, not in Rust.
+- Keep direct routing config-owned via `skillsDir` + `directSubmitSkill`.
+- Do **not** broaden this slice into LLM routing, generic dispatch policy, new browser opcodes, or export auto-open behavior.
+- If the original package reveals extra classification rules that are needed for parity, add them only inside `collect_fault_details.js` and its staged references/tests, not in `claw-new`.
diff --git a/docs/superpowers/specs/2026-04-10-fault-details-full-skill-alignment-design.md b/docs/superpowers/specs/2026-04-10-fault-details-full-skill-alignment-design.md
new file mode 100644
index 0000000..4795190
--- /dev/null
+++ b/docs/superpowers/specs/2026-04-10-fault-details-full-skill-alignment-design.md
@@ -0,0 +1,495 @@
+# Fault Details Full Skill Alignment Design
+
+**Goal:** Upgrade `fault-details-report.collect_fault_details` from an empty artifact shell into a real staged business skill that matches the original fault-details package's collection, normalization, summary, export, and report-history behavior, while keeping direct-skill routing config-owned in `claw-new`.
+
+**Status:** Approved design direction for the next remediation slice.
+
+---
+
+## Decision Summary
+
+1. Keep direct-skill selection in `claw-new` via `skillsDir` + `directSubmitSkill`; do not move dispatch ownership into `skill_staging` manifests.
+2. Put the fault-details business logic in `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging`, not in `claw-new`.
+3. Align the staged skill with the original package's real behavior: query raw rows, normalize detail columns, derive summary rows, call localhost export, and write report history.
+4. Keep the current browser-execution seam narrow: use the existing `browser_script` / browser-eval path, not a new browser protocol or new opcodes.
+5. Add a narrow artifact interpreter in `claw-new` so structured fault-results map cleanly to `TaskComplete.success` and a readable completion summary.
+
+---
+
+## Why This Slice Exists
+
+The current staged skill contract and the current staged skill implementation do not match.
+ +### What the original package actually does + +The original package under `D:/desk/智能体资料/大四区报告监测项/故障明细` does all of the following: + +- reads the selected date range from the page UI +- queries the D4 repair-order data source +- filters and normalizes raw rows into the canonical detail export schema +- derives grouped summary rows by `gds` +- calls `http://localhost:13313/SurfaceServices/personalBread/export/faultDetailsExportXLSXS` +- auto-opens/downloads the generated file +- writes report history through `http://localhost:13313/ReportServices/Api/setReportLog` + +### What the staged skill currently does + +The current staged `collect_fault_details.js` only returns an empty `report-artifact` shell with empty `rows` and empty summary `sections`. + +It also still uses a Node-style export shape instead of the browser-eval entrypoint shape that the current `browser_script` runtime expects. In practice, this means the staged script is not yet aligned with the real runtime contract even before business behavior is considered. + +This slice closes that gap by making the staged skill actually perform the work the original package performs, but through the current sgClaw direct-skill runtime. + +--- + +## Design Rules + +### 1. `claw-new` owns routing, not business transforms + +`claw-new` stays responsible for: + +- loading config +- deciding whether submit-task takes the direct-skill path +- resolving the configured staged skill +- executing the staged browser-script tool +- turning the returned artifact into `TaskComplete.success` + human-readable summary + +`claw-new` must **not** become the place where the original fault classification table, detail-row field mapping, or summary aggregation rules are reimplemented. + +### 2. 
`skill_staging` owns fault-details business behavior + +The staged skill package owns: + +- query orchestration inside the browser page context +- raw-row extraction +- canonical detail-row normalization +- classification and derived fields +- summary-sheet derivation +- localhost export request +- localhost report-log request +- structured result payload + +### 3. Keep the current browser seam narrow + +Do not introduce a new browser bridge, callback protocol, or skill-specific browser opcode for this slice. + +The implementation should continue using the current `browser_script` execution seam already wired through `claw-new/src/compat/browser_script_skill_tool.rs` and `claw-new/src/compat/direct_skill_runtime.rs`. + +### 4. Match business behavior, not the original shell verbatim + +The original package is a local HTML/Vue shell that uses `BrowserAction(...)`, timers, and hidden-browser choreography. That shell does **not** need to be recreated inside `claw-new`. + +What must be preserved is the business outcome: + +- same canonical detail columns +- same key field mappings +- same classification rules +- same summary metrics +- same downstream export/history behavior +- same distinction between empty, partial, blocked, and failed work + +--- + +## Ownership Boundary and Landing Zones + +### Staged skill changes + +These changes land in `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging`. 
+ +Primary files: + +- `skills/fault-details-report/scripts/collect_fault_details.js` + - becomes the real browser-eval entrypoint + - must directly `return` the final structured artifact from the wrapped browser script + - may contain internal helper functions, but should remain self-contained for the current runtime +- `skills/fault-details-report/SKILL.toml` + - keep `browser_script` + - tighten the tool description so it matches the real behavior + - do not turn `SKILL.toml` into the source of truth for classification rules or routing policy +- `skills/fault-details-report/SKILL.md` + - align the written contract with the implemented runtime behavior +- `skills/fault-details-report/references/collection-flow.md` + - align the staged flow with the implemented query/export/history sequence +- `skills/fault-details-report/references/data-quality.md` + - stay authoritative for canonical columns, required fields, classification tables, `qxxcjl`-based reason heuristics, summary rules, and partial semantics +- `scenes/fault-details-report/scene.json` + - keep the scene contract aligned with the actual output and state semantics + - do not move classification or routing policy into scene metadata + +### Caller/runtime changes + +These changes land in `D:/data/ideaSpace/rust/sgClaw/claw-new`. 
+ +Primary files: + +- `src/compat/direct_skill_runtime.rs` + - keep configured direct-skill execution here + - add narrow structured-artifact interpretation after the browser-script returns +- `src/agent/mod.rs` + - keep the current direct-submit routing seam here + - do not add fault-specific business logic here +- `src/compat/browser_script_skill_tool.rs` + - keep the browser-script contract strict: browser-eval entrypoint, no Node-only assumptions +- `tests/agent_runtime_test.rs` + - direct-submit path and result-surface regressions +- `tests/browser_script_skill_tool_test.rs` + - browser-script execution-shape regressions + +If a new helper is needed in `claw-new`, it should be a narrow artifact-format/parser helper, not a new business-rules module. + +--- + +## Target Runtime Flow + +### Step 1: Submit-task stays config-owned + +The user still types natural language into the current sgClaw input. + +`claw-new`: + +- receives `BrowserMessage::SubmitTask` +- loads `SgClawSettings` +- sees `directSubmitSkill = "fault-details-report.collect_fault_details"` +- bypasses LLM routing exactly as it does now +- resolves the staged skill from `skillsDir` + +This preserves the already approved config-owned routing boundary. + +### Step 2: Browser-script tool executes as a true browser entrypoint + +`collect_fault_details.js` must be shaped for the current runtime: + +- the script runs inside the current browser page context through `eval` +- it must not rely on `module.exports` +- it must directly `return collectFaultDetails(args)` from the wrapped script body + +This is required because the current sgClaw browser-script runtime reads one script file and wraps it in a browser-side IIFE. + +### Step 3: The skill reads the page-selected time range + +The source-of-truth query window should come from the current page state, matching the original package behavior. 
+ +Design rule: + +- read the selected start and end time from the business page controls or page state +- include that exact selected range in the returned artifact +- keep `period` as a bootstrap label from `claw-new`, not as a license to silently guess a different business range + +Compatibility rule with the current direct-submit seam: + +- the current `claw-new` direct path still requires an explicit `YYYY-MM` token in the user's instruction in order to enter the configured direct-skill flow +- that requirement remains in place for this slice +- once inside the skill, the browser page's selected start/end range is the source of truth for collection +- the returned artifact should include both the user-visible `period` label and the exact selected page range so mismatches are observable instead of hidden + +If the page-selected range cannot be read reliably, the skill should return `blocked` instead of inventing a month-wide query window from `period` alone. + +### Step 4: The skill collects raw rows and normalizes detail fields + +The staged skill must reproduce the original package's detail normalization logic inside the browser-executed script. + +That includes preserving the canonical detail schema from the original `excleIni[0].cols`, including the key transforms already present in the original package, such as: + +- `slsj = bxsj` +- `gssgs = "甘肃省电力公司"` +- `sgs` derived from the current company/city context +- `gddw = maintOrgName` +- `gds = maintGroupName` +- `clzt = "处理完成"` +- `bdz = bdzMc` +- `line = xlmc10` +- `pb = byqmc` + +The staged skill must also port the original classification/derivation logic that fills: + +- `sxfl1` +- `sxfl2` +- `sxfl3` +- `gzsb` +- `gzyy` + +That includes the original matching table and the `qxxcjl`-based text extraction heuristics that derive the fault reason. 
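The fixed field mappings listed in Step 4 can be sketched as a small pure helper. This is a minimal illustration, not the original package's code: the function name `normalizeDetailRow`, the `context.cityCompany` parameter, and the empty-string default for missing raw fields are all assumptions. Only the field-to-field assignments come from the list above. The classification fields (`sxfl1`, `sxfl2`, `sxfl3`, `gzsb`, `gzyy`) and the remaining canonical columns are deliberately left out, because their rules live in the original matching table.

```javascript
// Hypothetical helper: maps one raw repair-order row onto the subset of
// canonical detail columns whose mapping is spelled out in Step 4.
function normalizeDetailRow(raw, context) {
  return {
    slsj: raw.bxsj ?? "",            // slsj = bxsj
    gssgs: "甘肃省电力公司",           // fixed value from the design
    sgs: context.cityCompany ?? "",  // assumption: caller supplies company/city context
    gddw: raw.maintOrgName ?? "",    // gddw = maintOrgName
    gds: raw.maintGroupName ?? "",   // gds = maintGroupName
    clzt: "处理完成",                 // fixed value from the design
    bdz: raw.bdzMc ?? "",            // bdz = bdzMc
    line: raw.xlmc10 ?? "",          // line = xlmc10
    pb: raw.byqmc ?? "",             // pb = byqmc
  };
}

// Usage with a made-up fixture row (all values are placeholders).
const sample = normalizeDetailRow(
  {
    bxsj: "2026-03-09 08:15:00",
    maintOrgName: "org-A",
    maintGroupName: "group-1",
    bdzMc: "bdz-1",
    xlmc10: "line-1",
    byqmc: "pb-1",
  },
  { cityCompany: "city-company-placeholder" }
);
```

Keeping the mapping in a pure function like this lets fixed raw-row fixtures exercise it without a live browser session, while the browser-eval entrypoint calls it in page context.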
+ +### Step 5: The skill derives summary rows from normalized detail rows + +The staged skill must derive the summary sheet from grouped detail rows, keyed around the same business totals the original package computes. + +At minimum that includes: + +- `index` +- `gsName` +- `fwDept` +- `className` +- `allCount` +- `wxCount` +- `khcCount` +- `sbdSbCount` +- `gyGzCount` +- `dyGzCount` +- `tqdzCount` +- `tqbxCount` +- `dyxlCount` +- `bqxCount` +- `jllCount` +- `bhxCount` +- `qftdCount` + +The summary derivation must stay in the staged skill so the same package can later be routed by LLM without moving business logic back into `claw-new`. + +### Step 6: The skill performs downstream export and report logging + +After detail rows and summary rows are available, the staged skill should reproduce the original package's downstream behavior: + +- build the export payload for `faultDetailsExportXLSXS` +- call the localhost export endpoint +- capture the returned export path/URL +- write report history via `setReportLog` + +Important boundary: + +- export/report-log are downstream side effects +- they do not redefine whether collection itself succeeded +- if collection succeeds but export/logging fails, the result is `partial`, not a full collection failure +- auto-opening/downloading the exported file is out of scope for this slice; this slice records the export path/result in the artifact but does not add new opener/UI behavior in `claw-new` + +### Step 7: The skill returns one structured artifact + +The staged skill should return one self-describing JSON artifact containing: + +- business identity (`type`, `report_name`) +- selected period label +- exact selected start/end range +- canonical detail columns + normalized rows +- summary section columns + rows +- counts +- business status +- partial reasons if any +- downstream export outcome +- downstream report-log outcome + +### Step 8: `claw-new` interprets the artifact, not the business rules + +After the browser-script 
returns, `claw-new` should parse the JSON artifact and map it into final submit-task behavior. + +Recommended mapping: + +- `status = ok` -> `TaskComplete.success = true` +- `status = partial` -> `TaskComplete.success = true`, with warnings in summary +- `status = empty` -> `TaskComplete.success = true`, clearly reported as empty-result +- `status = blocked` -> `TaskComplete.success = false` +- `status = error` -> `TaskComplete.success = false` + +This keeps business classification in the staged skill while preventing false-positive success in the direct path. + +--- + +## Artifact Contract + +The returned payload should stay `type = "report-artifact"`, but it must become rich enough to describe the real run. + +Recommended contract: + +```json +{ + "type": "report-artifact", + "report_name": "fault-details-report", + "period": "2026-03", + "selected_range": { + "start": "2026-03-08 16:00:00", + "end": "2026-03-09 16:00:00" + }, + "columns": ["qxdbh", "gssgs", "sgs", "gddw", "gds", "slsj", "yjflMc", "ejflMc", "sjflMc", "gzms", "yhbh", "yhmc", "lxr", "gzdd", "lxdh", "bxsj", "gdsj", "clzt", "qxxcjl", "bdz", "line", "pb", "sxfl1", "sxfl2", "sxfl3", "gzsb", "gzyy", "bz"], + "rows": [], + "sections": [ + { + "name": "summary-sheet", + "columns": ["index", "gsName", "fwDept", "className", "allCount", "wxCount", "khcCount", "sbdSbCount", "gyGzCount", "dyGzCount", "tqdzCount", "tqbxCount", "dyxlCount", "bqxCount", "jllCount", "bhxCount", "qftdCount"], + "rows": [] + } + ], + "counts": { + "detail_rows": 0, + "summary_rows": 0 + }, + "status": "ok", + "partial_reasons": [], + "downstream": { + "export": { + "attempted": true, + "success": true, + "path": "http://localhost:13313/.../fault-details.xlsx" + }, + "report_log": { + "attempted": true, + "success": true, + "report_name": "国网XX故障报修明细表(03月09日)", + "path": "http://localhost:13313/.../fault-details.xlsx" + } + } +} +``` + +### Contract notes + +- The example above leaves `rows` empty and both counts at zero only to keep the shape readable. A real `ok` run carries populated rows and matching counts; a run with zero matching rows reports `status = empty`. +- `rows` is the canonical returned detail table, not the export-service
transport payload. +- If the export service still requires a placeholder row for an empty spreadsheet, that placeholder should be synthesized only for the downstream export call, not as the canonical returned `rows` contract. +- `counts` should be computed from the canonical returned tables. +- `selected_range`, `columns`, `sections`, `counts`, `status`, and `partial_reasons` should always be present for `ok`, `partial`, and `empty`. +- For `blocked` and `error`, the artifact should still include `type`, `report_name`, `period`, `status`, and `partial_reasons`; `selected_range`, `columns`, `sections`, and `counts` should be included whenever they were already known before the failure point. +- `downstream` should be omitted only when export/report-log were not attempted yet; otherwise include it with `attempted` / `success` flags and any available path or failure detail. + +--- + +## Error Handling and Status Semantics + +### `ok` + +Use `ok` when all of the following are true: + +- raw collection succeeded +- required detail-field normalization succeeded +- summary derivation succeeded +- export succeeded +- report-log write succeeded + +### `partial` + +Use `partial` when raw collection succeeded and the detail table is still usable, but at least one later stage (normalization, summary derivation, export, or report-log) degraded: + +- one or more required fields could not be normalized, but the row set still remains exportable and summary derivation can proceed with explicit gaps recorded +- summary derivation was incomplete, but the detail table is still available +- export failed after rows were available +- report-log write failed after rows/export were available + +Escalation rule: + +- if the raw query succeeds but required fields are missing so broadly that the canonical detail table cannot be produced at all, use `error`, not `partial` +- if summary derivation cannot even start because the normalized detail rows are structurally unusable, use `error`, not `partial` + +`partial_reasons` must name the degraded stage instead of hiding
it. + +### `empty` + +Use `empty` when: + +- the query succeeds for the selected range +- zero real detail rows match + +This is not a failure. + +If the business flow still wants an empty export file or placeholder export payload, that happens downstream without changing the semantic meaning of the result. + +### `blocked` + +Use `blocked` when the page/session preconditions are not met, for example: + +- expected page/session is not available +- required page controls cannot be read +- login/session state is missing or expired +- required browser-visible APIs are unavailable in the current page context + +### `error` + +Use `error` when the run starts but fails due to operational or parsing problems, for example: + +- request failure +- page script failure +- raw response parse failure +- malformed export response + +### `claw-new` completion mapping + +`claw-new` should convert structured status into final submit completion behavior: + +- `ok` / `partial` / `empty`: return a success completion with a concise human summary +- `blocked` / `error`: return a failed completion with a concise human summary + +This avoids the current risk where a structured error-like payload could still be surfaced as a nominal success string. + +--- + +## Testing and Acceptance Strategy + +### Skill-side deterministic coverage + +Add deterministic coverage around the staged skill's business logic in `skill_staging` for: + +- canonical detail field mapping +- classification table parity +- `gzyy` extraction heuristics +- summary aggregation parity +- empty-result handling +- partial-result generation when downstream export/logging fails +- browser-script entrypoint shape (`return ...`, not `module.exports`) + +The classification/summary tests should use fixed raw-row fixtures so the business rules are validated without a live browser session. 
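One way to keep the status semantics above testable with fixed fixtures is to fold the stage outcomes into a status through a single pure helper. The sketch below is an assumption-heavy illustration: the helper name `deriveStatus`, the shape of its `run` argument, and the reason strings are all hypothetical. It also assumes that `empty` takes precedence over downstream export or report-log failures, following the note that downstream handling does not change the meaning of an empty result.

```javascript
// Hypothetical pure helper: turns per-stage outcomes into the artifact
// status and partial_reasons described in the status semantics.
function deriveStatus(run) {
  if (run.blocked) return { status: "blocked", partial_reasons: [] };
  if (run.failed) return { status: "error", partial_reasons: [] };
  // Assumption: zero matching rows is `empty` regardless of downstream outcome.
  if (run.detailRowCount === 0) return { status: "empty", partial_reasons: [] };

  const reasons = [];
  if (run.normalizationGaps) reasons.push("normalization"); // gaps recorded, rows still exportable
  if (!run.summaryComplete) reasons.push("summary");        // detail table still available
  if (!run.exportOk) reasons.push("export");                // export failed after rows were available
  if (!run.reportLogOk) reasons.push("report_log");         // report-log write failed

  return {
    status: reasons.length > 0 ? "partial" : "ok",
    partial_reasons: reasons,
  };
}

// Usage: collection succeeded, but the localhost export call failed.
const degraded = deriveStatus({
  detailRowCount: 3,
  normalizationGaps: false,
  summaryComplete: true,
  exportOk: false,
  reportLogOk: true,
});
```

The escalation cases (detail table cannot be produced at all, or summary derivation cannot start) would be signalled by the caller setting `failed`, so the helper itself never has to judge how broad a gap is.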
+ +### `claw-new` runtime regressions + +Add Rust coverage in `claw-new` for: + +- direct-submit success with a populated `report-artifact` +- `partial` artifact mapping to `TaskComplete.success = true` +- `empty` artifact mapping to `TaskComplete.success = true` +- `blocked` / `error` artifact mapping to `TaskComplete.success = false` +- browser-script helper behavior for a real browser-eval return payload + +### Manual acceptance + +The live manual acceptance bar for this slice should be: + +1. Configure `skillsDir` to the staged skill root and `directSubmitSkill` to `fault-details-report.collect_fault_details`. +2. Attach sgClaw to the real target browser page/session. +3. Submit a natural-language fault-details request without LLM routing. +4. Verify the staged skill: + - reads the selected page range + - queries real fault rows + - produces populated detail rows + - produces populated summary rows + - exports the workbook through localhost + - writes report history +5. Verify the final sgClaw completion message reports the correct status, counts, and downstream file/log outcome. 
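When checking step 5 by hand, it can help to write down what completion each artifact status should produce. The lookup below is an illustration in JavaScript only; the design places the real interpreter in Rust inside `src/compat/direct_skill_runtime.rs`. The function name `expectedCompletion`, the summary wording, and the choice to treat an unrecognized status as a failure are assumptions, not part of the spec.

```javascript
// Statuses the design maps to TaskComplete.success = true / false.
const SUCCESS_STATUSES = new Set(["ok", "partial", "empty"]);
const FAILURE_STATUSES = new Set(["blocked", "error"]);

// Hypothetical checker: returns the completion a reviewer should expect
// for a given returned artifact.
function expectedCompletion(artifact) {
  const status = artifact ? artifact.status : undefined;
  if (!SUCCESS_STATUSES.has(status) && !FAILURE_STATUSES.has(status)) {
    // Assumption: anything outside the five known statuses is a failure.
    return { success: false, summary: "unrecognized artifact status" };
  }
  const counts = artifact.counts || {};
  const reasons = artifact.partial_reasons || [];
  const parts = [
    `status=${status}`,
    `detail_rows=${counts.detail_rows ?? "unknown"}`,
    `summary_rows=${counts.summary_rows ?? "unknown"}`,
  ];
  if (reasons.length > 0) parts.push(`reasons=${reasons.join(",")}`);
  return { success: SUCCESS_STATUSES.has(status), summary: parts.join(" ") };
}

// Usage: a partial artifact still completes successfully, with its reason surfaced.
const partialCompletion = expectedCompletion({
  status: "partial",
  counts: { detail_rows: 2, summary_rows: 1 },
  partial_reasons: ["export"],
});
```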
+ +### Acceptance matrix + +At minimum, acceptance should cover: + +- normal populated result +- empty result with no matching rows +- partial result where export or report-log fails after collection +- blocked result where page/session preconditions are missing +- error result where parsing/query execution fails + +--- + +## Explicit Non-Goals + +This slice does **not**: + +- move routing ownership out of `claw-new` +- require LLM routing to be available first +- add per-skill dispatch metadata to external manifests for routing policy +- introduce a new browser protocol or browser opcode +- recreate the original Vue shell inside `claw-new` +- move fault classification logic into Rust +- redesign the submit-task protocol beyond better interpretation of the returned artifact + +--- + +## Resulting Design Rule + +For the fault-details path: + +- `claw-new` decides whether to invoke the fixed staged skill +- the staged skill performs the real fault business workflow +- the staged skill returns a structured artifact that describes collection + downstream outcomes +- `claw-new` interprets that artifact for submit-task success/failure and summary output + +That keeps routing config-owned, keeps business logic with the staged skill, and makes `fault-details-report.collect_fault_details` ready for both the current no-LLM path and a later LLM-routed path. + +--- + +## Document Landing Zones + +- Approved spec: `docs/superpowers/specs/2026-04-10-fault-details-full-skill-alignment-design.md` +- Follow-up implementation plan: `docs/superpowers/plans/2026-04-10-fault-details-full-skill-alignment-plan.md`