docs: add fault-details alignment design and plan

Capture the approved fault-details staged-skill design and implementation plan so the remaining work can be resumed from the documented contract.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
木炎
2026-04-10 18:10:02 +08:00
parent 7443b9da7f
commit 311cc1fee6
2 changed files with 1167 additions and 0 deletions


@@ -0,0 +1,495 @@
# Fault Details Full Skill Alignment Design
**Goal:** Upgrade `fault-details-report.collect_fault_details` from an empty artifact shell into a real staged business skill that matches the original fault-details package's collection, normalization, summary, export, and report-history behavior, while keeping direct-skill routing config-owned in `claw-new`.
**Status:** Approved design direction for the next remediation slice.
---
## Decision Summary
1. Keep direct-skill selection in `claw-new` via `skillsDir` + `directSubmitSkill`; do not move dispatch ownership into `skill_staging` manifests.
2. Put the fault-details business logic in `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging`, not in `claw-new`.
3. Align the staged skill with the original package's real behavior: query raw rows, normalize detail columns, derive summary rows, call localhost export, and write report history.
4. Keep the current browser-execution seam narrow: use the existing `browser_script` / browser-eval path, not a new browser protocol or new opcodes.
5. Add a narrow artifact interpreter in `claw-new` so structured fault-results map cleanly to `TaskComplete.success` and a readable completion summary.
---
## Why This Slice Exists
The current staged skill contract and the current staged skill implementation do not match.
### What the original package actually does
The original package under `D:/desk/智能体资料/大四区报告监测项/故障明细` does all of the following:
- reads the selected date range from the page UI
- queries the D4 repair-order data source
- filters and normalizes raw rows into the canonical detail export schema
- derives grouped summary rows by `gds`
- calls `http://localhost:13313/SurfaceServices/personalBread/export/faultDetailsExportXLSXS`
- auto-opens/downloads the generated file
- writes report history through `http://localhost:13313/ReportServices/Api/setReportLog`
### What the staged skill currently does
The current staged `collect_fault_details.js` only returns an empty `report-artifact` shell with empty `rows` and empty summary `sections`.
It also still uses a Node-style export shape instead of the browser-eval entrypoint shape that the current `browser_script` runtime expects. In practice, this means the staged script is not yet aligned with the real runtime contract even before business behavior is considered.
This slice closes that gap by making the staged skill actually perform the work the original package performs, but through the current sgClaw direct-skill runtime.
---
## Design Rules
### 1. `claw-new` owns routing, not business transforms
`claw-new` stays responsible for:
- loading config
- deciding whether submit-task takes the direct-skill path
- resolving the configured staged skill
- executing the staged browser-script tool
- turning the returned artifact into `TaskComplete.success` + human-readable summary
`claw-new` must **not** become the place where the original fault classification table, detail-row field mapping, or summary aggregation rules are reimplemented.
### 2. `skill_staging` owns fault-details business behavior
The staged skill package owns:
- query orchestration inside the browser page context
- raw-row extraction
- canonical detail-row normalization
- classification and derived fields
- summary-sheet derivation
- localhost export request
- localhost report-log request
- structured result payload
### 3. Keep the current browser seam narrow
Do not introduce a new browser bridge, callback protocol, or skill-specific browser opcode for this slice.
The implementation should continue using the current `browser_script` execution seam already wired through `claw-new/src/compat/browser_script_skill_tool.rs` and `claw-new/src/compat/direct_skill_runtime.rs`.
### 4. Match business behavior, not the original shell verbatim
The original package is a local HTML/Vue shell that uses `BrowserAction(...)`, timers, and hidden-browser choreography. That shell does **not** need to be recreated inside `claw-new`.
What must be preserved is the business outcome:
- same canonical detail columns
- same key field mappings
- same classification rules
- same summary metrics
- same downstream export/history behavior
- same distinction between empty, partial, blocked, and failed work
---
## Ownership Boundary and Landing Zones
### Staged skill changes
These changes land in `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging`.
Primary files:
- `skills/fault-details-report/scripts/collect_fault_details.js`
  - becomes the real browser-eval entrypoint
  - must directly `return` the final structured artifact from the wrapped browser script
  - may contain internal helper functions, but should remain self-contained for the current runtime
- `skills/fault-details-report/SKILL.toml`
  - keep `browser_script`
  - tighten the tool description so it matches the real behavior
  - do not turn `SKILL.toml` into the source of truth for classification rules or routing policy
- `skills/fault-details-report/SKILL.md`
  - align the written contract with the implemented runtime behavior
- `skills/fault-details-report/references/collection-flow.md`
  - align the staged flow with the implemented query/export/history sequence
- `skills/fault-details-report/references/data-quality.md`
  - stay authoritative for canonical columns, required fields, classification tables, `qxxcjl`-based reason heuristics, summary rules, and partial semantics
- `scenes/fault-details-report/scene.json`
  - keep the scene contract aligned with the actual output and state semantics
  - do not move classification or routing policy into scene metadata
### Caller/runtime changes
These changes land in `D:/data/ideaSpace/rust/sgClaw/claw-new`.
Primary files:
- `src/compat/direct_skill_runtime.rs`
  - keep configured direct-skill execution here
  - add narrow structured-artifact interpretation after the browser-script returns
- `src/agent/mod.rs`
  - keep the current direct-submit routing seam here
  - do not add fault-specific business logic here
- `src/compat/browser_script_skill_tool.rs`
  - keep the browser-script contract strict: browser-eval entrypoint, no Node-only assumptions
- `tests/agent_runtime_test.rs`
  - direct-submit path and result-surface regressions
- `tests/browser_script_skill_tool_test.rs`
  - browser-script execution-shape regressions
If a new helper is needed in `claw-new`, it should be a narrow artifact-format/parser helper, not a new business-rules module.
---
## Target Runtime Flow
### Step 1: Submit-task stays config-owned
The user still types natural language into the current sgClaw input.
`claw-new`:
- receives `BrowserMessage::SubmitTask`
- loads `SgClawSettings`
- sees `directSubmitSkill = "fault-details-report.collect_fault_details"`
- bypasses LLM routing exactly as it does now
- resolves the staged skill from `skillsDir`
This preserves the already approved config-owned routing boundary.
### Step 2: Browser-script tool executes as a true browser entrypoint
`collect_fault_details.js` must be shaped for the current runtime:
- the script runs inside the current browser page context through `eval`
- it must not rely on `module.exports`
- it must directly `return collectFaultDetails(args)` from the wrapped script body
This is required because the current sgClaw browser-script runtime reads one script file and wraps it in a browser-side IIFE.
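
A minimal sketch of the entrypoint shape described above. The wrapper `wrapAsIife` is a hypothetical stand-in for whatever text `claw-new` actually generates, and the placeholder body of `collectFaultDetails` is illustrative; only the constraints (no `module.exports`, a bare `return` in the script body) come from this design.

```javascript
// Script body as it might sit in collect_fault_details.js.
// The bare `return` is only legal once the runtime wraps the body in a function.
const scriptBody = `
function collectFaultDetails(args) {
  // Placeholder: the real skill performs collection here.
  return { type: "report-artifact", report_name: "fault-details-report", period: args.period };
}
return collectFaultDetails(args);
`;

// Hypothetical wrapper: the exact IIFE text used by claw-new is an assumption.
function wrapAsIife(body, args) {
  return "(function (args) {\n" + body + "\n})(" + JSON.stringify(args) + ")";
}

// Indirect eval stands in for browser-side evaluation of the wrapped script.
const artifact = (0, eval)(wrapAsIife(scriptBody, { period: "2026-03" }));
console.log(artifact.type, artifact.period);
```

A script that ended with `module.exports = ...` instead would throw a `ReferenceError` under this kind of wrapper in a browser page, because `module` is not defined there.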
### Step 3: The skill reads the page-selected time range
The source-of-truth query window should come from the current page state, matching the original package behavior.
Design rule:
- read the selected start and end time from the business page controls or page state
- include that exact selected range in the returned artifact
- keep `period` as a bootstrap label from `claw-new`, not as a license to silently guess a different business range
Compatibility rule with the current direct-submit seam:
- the current `claw-new` direct path still requires an explicit `YYYY-MM` token in the user's instruction to enter the configured direct-skill flow
- that requirement remains in place for this slice
- once inside the skill, the browser page's selected start/end range is the source of truth for collection
- the returned artifact should include both the user-visible `period` label and the exact selected page range so mismatches are observable instead of hidden
If the page-selected range cannot be read reliably, the skill should return `blocked` instead of inventing a month-wide query window from `period` alone.
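
The range rule above can be sketched as a small guard. This is an illustration under assumptions: `pageState` is a hypothetical plain object already read from the page controls, the timestamp format is taken from the artifact example in this document, and the reason string `selected_range_unreadable` is invented for the sketch.

```javascript
// Matches "YYYY-MM-DD HH:mm:ss", the format shown in the artifact example.
const RANGE_FORMAT = /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$/;

// `pageState` is hypothetical: { start, end } as read from the page UI.
function resolveSelectedRange(pageState, period) {
  const start = pageState && pageState.start;
  const end = pageState && pageState.end;
  const readable =
    typeof start === "string" &&
    typeof end === "string" &&
    RANGE_FORMAT.test(start) &&
    RANGE_FORMAT.test(end) &&
    start <= end; // fixed-width timestamps compare correctly as strings

  if (!readable) {
    // Never fall back to a month-wide window derived from `period`.
    return {
      type: "report-artifact",
      report_name: "fault-details-report",
      period,
      status: "blocked",
      partial_reasons: ["selected_range_unreadable"], // hypothetical reason label
    };
  }
  // Carry both values so a period/range mismatch stays observable.
  return { period, selected_range: { start, end } };
}
```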
### Step 4: The skill collects raw rows and normalizes detail fields
The staged skill must reproduce the original package's detail normalization logic inside the browser-executed script.
That includes preserving the canonical detail schema from the original `excleIni[0].cols`, including the key transforms already present in the original package, such as:
- `slsj = bxsj`
- `gssgs = "甘肃省电力公司"`
- `sgs` derived from the current company/city context
- `gddw = maintOrgName`
- `gds = maintGroupName`
- `clzt = "处理完成"`
- `bdz = bdzMc`
- `line = xlmc10`
- `pb = byqmc`
The staged skill must also port the original classification/derivation logic that fills:
- `sxfl1`
- `sxfl2`
- `sxfl3`
- `gzsb`
- `gzyy`
That includes the original matching table and the `qxxcjl`-based text extraction heuristics that derive the fault reason.
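
A rough sketch of the field mapping listed above, projected onto the canonical columns from the artifact contract. Several things here are assumptions: `ctx.sgs` stands for the company/city label however the page exposes it, missing values default to an empty string, and the classification fields (`sxfl1`, `sxfl2`, `sxfl3`, `gzsb`, `gzyy`) are left empty because the matching table and `qxxcjl` heuristics are not reproduced in this document.

```javascript
const DETAIL_COLUMNS = [
  "qxdbh", "gssgs", "sgs", "gddw", "gds", "slsj", "yjflMc", "ejflMc", "sjflMc",
  "gzms", "yhbh", "yhmc", "lxr", "gzdd", "lxdh", "bxsj", "gdsj", "clzt", "qxxcjl",
  "bdz", "line", "pb", "sxfl1", "sxfl2", "sxfl3", "gzsb", "gzyy", "bz",
];

// `raw` is one row from the repair-order query; `ctx` is hypothetical page context.
function normalizeDetailRow(raw, ctx) {
  const mapped = {
    ...raw,
    slsj: raw.bxsj,
    gssgs: "甘肃省电力公司",
    sgs: ctx.sgs,
    gddw: raw.maintOrgName,
    gds: raw.maintGroupName,
    clzt: "处理完成",
    bdz: raw.bdzMc,
    line: raw.xlmc10,
    pb: raw.byqmc,
  };
  // Keep only canonical columns; the "" default for gaps is an assumption.
  const row = {};
  for (const column of DETAIL_COLUMNS) {
    row[column] = mapped[column] ?? "";
  }
  return row;
}
```

Projecting onto `DETAIL_COLUMNS` keeps source-only fields such as `maintOrgName` out of the returned `rows`, so the artifact matches its declared `columns`.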
### Step 5: The skill derives summary rows from normalized detail rows
The staged skill must derive the summary sheet from the normalized detail rows, grouped by `gds`, and must produce the same business totals the original package computes.
At minimum that includes:
- `index`
- `gsName`
- `fwDept`
- `className`
- `allCount`
- `wxCount`
- `khcCount`
- `sbdSbCount`
- `gyGzCount`
- `dyGzCount`
- `tqdzCount`
- `tqbxCount`
- `dyxlCount`
- `bqxCount`
- `jllCount`
- `bhxCount`
- `qftdCount`
The summary derivation must stay in the staged skill so the same package can later be routed by LLM without moving business logic back into `claw-new`.
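
One possible shape for the grouping step, as a sketch only. The `counters` argument maps a summary column to a predicate over a normalized row; those predicates are hypothetical, since the real counting rules depend on the original classification table. Using `gds` as the value of `className` is also an assumption.

```javascript
// Groups normalized detail rows by `gds` and counts rows per summary column.
function deriveSummaryRows(detailRows, counters) {
  const groups = new Map(); // Map keeps first-seen order of `gds`
  for (const row of detailRows) {
    if (!groups.has(row.gds)) groups.set(row.gds, []);
    groups.get(row.gds).push(row);
  }
  return Array.from(groups, ([gds, rows], position) => {
    const summary = {
      index: position + 1,
      className: gds, // assumption: the group name feeds `className`
      allCount: rows.length,
    };
    for (const [column, matches] of Object.entries(counters)) {
      summary[column] = rows.filter(matches).length;
    }
    return summary;
  });
}

// Usage with a made-up predicate; the real rule for `wxCount` is not shown here.
const demoSummary = deriveSummaryRows(
  [{ gds: "A", sxfl1: "x" }, { gds: "B", sxfl1: "y" }, { gds: "A", sxfl1: "y" }],
  { wxCount: (row) => row.sxfl1 === "x" }
);
console.log(demoSummary);
```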
### Step 6: The skill performs downstream export and report logging
After detail rows and summary rows are available, the staged skill should reproduce the original package's downstream behavior:
- build the export payload for `faultDetailsExportXLSXS`
- call the localhost export endpoint
- capture the returned export path/URL
- write report history via `setReportLog`
Important boundary:
- export/report-log are downstream side effects
- they do not redefine whether collection itself succeeded
- if collection succeeds but export/logging fails, the result is `partial`, not a full collection failure
- auto-opening/downloading the exported file is out of scope for this slice; this slice records the export path/result in the artifact but does not add new opener/UI behavior in `claw-new`
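
The boundary above can be sketched with an injected transport so that a downstream failure is recorded instead of thrown. The `post(url, body)` function, the request payloads, the `path` field on the export response, and the reason labels are all assumptions for illustration; only the two endpoint URLs come from the original package. Skipping the report-log call when export fails is a choice made in this sketch, not something the design mandates.

```javascript
const EXPORT_URL =
  "http://localhost:13313/SurfaceServices/personalBread/export/faultDetailsExportXLSXS";
const REPORT_LOG_URL = "http://localhost:13313/ReportServices/Api/setReportLog";

// `post` is a hypothetical async function (for example a fetch wrapper)
// that resolves to a parsed response object or rejects on failure.
async function runDownstream(tables, reportName, post) {
  const downstream = {
    export: { attempted: true, success: false },
    report_log: { attempted: false, success: false },
  };
  const partialReasons = [];

  try {
    const response = await post(EXPORT_URL, tables); // payload shape is assumed
    downstream.export.success = true;
    downstream.export.path = response.path; // response shape is assumed
  } catch (err) {
    downstream.export.error = String(err && err.message ? err.message : err);
    partialReasons.push("export_failed");
    return { downstream, partialReasons }; // no file path to log
  }

  downstream.report_log.attempted = true;
  try {
    await post(REPORT_LOG_URL, { report_name: reportName, path: downstream.export.path });
    downstream.report_log.success = true;
    downstream.report_log.report_name = reportName;
    downstream.report_log.path = downstream.export.path;
  } catch (err) {
    downstream.report_log.error = String(err && err.message ? err.message : err);
    partialReasons.push("report_log_failed");
  }
  return { downstream, partialReasons };
}
```

Because the function never rejects, the caller can attach `downstream` and `partialReasons` to an artifact whose collection already succeeded.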
### Step 7: The skill returns one structured artifact
The staged skill should return one self-describing JSON artifact containing:
- business identity (`type`, `report_name`)
- selected period label
- exact selected start/end range
- canonical detail columns + normalized rows
- summary section columns + rows
- counts
- business status
- partial reasons if any
- downstream export outcome
- downstream report-log outcome
### Step 8: `claw-new` interprets the artifact, not the business rules
After the browser-script returns, `claw-new` should parse the JSON artifact and map it into final submit-task behavior.
Recommended mapping:
- `status = ok` -> `TaskComplete.success = true`
- `status = partial` -> `TaskComplete.success = true`, with warnings in summary
- `status = empty` -> `TaskComplete.success = true`, clearly reported as empty-result
- `status = blocked` -> `TaskComplete.success = false`
- `status = error` -> `TaskComplete.success = false`
This keeps business classification in the staged skill while preventing false-positive success in the direct path.
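
The mapping is small enough to state as code. The sketch below uses JavaScript only to stay consistent with the skill-side examples; in `claw-new` the equivalent would be a narrow Rust helper. The function name, the summary format, and the decision to treat an unknown or missing status as `error` are assumptions.

```javascript
const SUCCESS_STATUSES = new Set(["ok", "partial", "empty"]);

// Returns a plain object mirroring `TaskComplete.success` plus a readable summary.
function interpretArtifact(artifact) {
  // Assumption: a malformed artifact must never surface as success.
  const status =
    artifact && typeof artifact.status === "string" ? artifact.status : "error";
  const counts = (artifact && artifact.counts) || {};
  const reasons = (artifact && artifact.partial_reasons) || [];

  const parts = ["status=" + status];
  if (typeof counts.detail_rows === "number") parts.push("detail_rows=" + counts.detail_rows);
  if (typeof counts.summary_rows === "number") parts.push("summary_rows=" + counts.summary_rows);
  if (reasons.length > 0) parts.push("reasons=" + reasons.join(","));

  return { success: SUCCESS_STATUSES.has(status), summary: parts.join("; ") };
}
```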
---
## Artifact Contract
The returned payload should stay `type = "report-artifact"`, but it must become rich enough to describe the real run.
Recommended contract:
```json
{
  "type": "report-artifact",
  "report_name": "fault-details-report",
  "period": "2026-03",
  "selected_range": {
    "start": "2026-03-08 16:00:00",
    "end": "2026-03-09 16:00:00"
  },
  "columns": ["qxdbh", "gssgs", "sgs", "gddw", "gds", "slsj", "yjflMc", "ejflMc", "sjflMc", "gzms", "yhbh", "yhmc", "lxr", "gzdd", "lxdh", "bxsj", "gdsj", "clzt", "qxxcjl", "bdz", "line", "pb", "sxfl1", "sxfl2", "sxfl3", "gzsb", "gzyy", "bz"],
  "rows": [],
  "sections": [
    {
      "name": "summary-sheet",
      "columns": ["index", "gsName", "fwDept", "className", "allCount", "wxCount", "khcCount", "sbdSbCount", "gyGzCount", "dyGzCount", "tqdzCount", "tqbxCount", "dyxlCount", "bqxCount", "jllCount", "bhxCount", "qftdCount"],
      "rows": []
    }
  ],
  "counts": {
    "detail_rows": 0,
    "summary_rows": 0
  },
  "status": "ok",
  "partial_reasons": [],
  "downstream": {
    "export": {
      "attempted": true,
      "success": true,
      "path": "http://localhost:13313/.../fault-details.xlsx"
    },
    "report_log": {
      "attempted": true,
      "success": true,
      "report_name": "国网XX故障报修明细表(03月09日)",
      "path": "http://localhost:13313/.../fault-details.xlsx"
    }
  }
}
```
### Contract notes
- `rows` is the canonical returned detail table, not the export-service transport payload.
- If the export service still requires a placeholder row for an empty spreadsheet, that placeholder should be synthesized only for the downstream export call, not as the canonical returned `rows` contract.
- `counts` should be computed from the canonical returned tables.
- `selected_range`, `columns`, `sections`, `counts`, `status`, and `partial_reasons` should always be present for `ok`, `partial`, and `empty`.
- For `blocked` and `error`, the artifact should still include `type`, `report_name`, `period`, `status`, and `partial_reasons`; `selected_range`, `columns`, `sections`, and `counts` should be included whenever they were already known before the failure point.
- `downstream` should be omitted only when export/report-log were not attempted yet; otherwise include it with `attempted` / `success` flags and any available path or failure detail.
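
A small presence check can make these notes testable. This is a sketch, not a full schema validator; treating `type`, `report_name`, and `period` as required for every status is an inference from the recommended contract, and the function name is hypothetical.

```javascript
const ALWAYS_REQUIRED = ["type", "report_name", "period", "status", "partial_reasons"];
const REQUIRED_WHEN_COLLECTED = ["selected_range", "columns", "sections", "counts"];

// Returns the required keys that are missing from the artifact for its status.
function missingArtifactKeys(artifact) {
  const collected = ["ok", "partial", "empty"].includes(artifact.status);
  const required = collected
    ? ALWAYS_REQUIRED.concat(REQUIRED_WHEN_COLLECTED)
    : ALWAYS_REQUIRED; // blocked/error: the table keys are optional
  return required.filter((key) => !(key in artifact));
}
```

For `blocked` and `error` the table-level keys stay optional, which matches the rule that they are included only when known before the failure point.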
---
## Error Handling and Status Semantics
### `ok`
Use `ok` when all of the following are true:
- raw collection succeeded
- required detail-field normalization succeeded
- summary derivation succeeded
- export succeeded
- report-log write succeeded
### `partial`
Use `partial` when detail collection succeeded but at least one later stage (normalization, summary derivation, export, or report-log) degraded, including:
- one or more required fields could not be normalized, but the row set still remains exportable and summary derivation can proceed with explicit gaps recorded
- summary derivation was incomplete, but the detail table is still available
- export failed after rows were available
- report-log write failed after rows/export were available
Escalation rule:
- if the raw query succeeds but required fields are missing so broadly that the canonical detail table cannot be produced at all, use `error`, not `partial`
- if summary derivation cannot even start because the normalized detail rows are structurally unusable, use `error`, not `partial`
`partial_reasons` must name the degraded stage instead of hiding it.
### `empty`
Use `empty` when:
- the query succeeds for the selected range
- zero real detail rows match
This is not a failure.
If the business flow still wants an empty export file or placeholder export payload, that happens downstream without changing the semantic meaning of the result.
### `blocked`
Use `blocked` when the page/session preconditions are not met, for example:
- expected page/session is not available
- required page controls cannot be read
- login/session state is missing or expired
- required browser-visible APIs are unavailable in the current page context
### `error`
Use `error` when the run starts but fails due to operational or parsing problems, for example:
- request failure
- page script failure
- raw response parse failure
- malformed export response
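
The status rules in this section can be sketched as one pure function over a record of what happened during the run. The `stages` shape and its field names are hypothetical; the ordering of the checks is one reading of the rules above, in which `empty` stays `empty` regardless of downstream outcome.

```javascript
// `stages` is a hypothetical record of one run:
//   preconditionsMet: page/session/controls were available
//   collected:        the raw query and parse succeeded
//   rowCount:         number of real detail rows that matched
//   tableUsable:      the canonical detail table could be produced
//   degraded:         reason labels for stages that degraded
function deriveStatus(stages) {
  if (!stages.preconditionsMet) return "blocked";
  if (!stages.collected) return "error";
  if (stages.rowCount === 0) return "empty";
  if (!stages.tableUsable) return "error"; // escalation rule: not `partial`
  return stages.degraded.length > 0 ? "partial" : "ok";
}
```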
### `claw-new` completion mapping
`claw-new` should convert structured status into final submit completion behavior:
- `ok` / `partial` / `empty`: return a success completion with a concise human summary
- `blocked` / `error`: return a failed completion with a concise human summary
This avoids the current risk where a structured error-like payload could still be surfaced as a nominal success string.
---
## Testing and Acceptance Strategy
### Skill-side deterministic coverage
Add deterministic coverage around the staged skill's business logic in `skill_staging` for:
- canonical detail field mapping
- classification table parity
- `gzyy` extraction heuristics
- summary aggregation parity
- empty-result handling
- partial-result generation when downstream export/logging fails
- browser-script entrypoint shape (`return ...`, not `module.exports`)
The classification/summary tests should use fixed raw-row fixtures so the business rules are validated without a live browser session.
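
For the entrypoint-shape item, a source-text check is one low-cost option. The sketch below is a heuristic under assumptions: a regular expression cannot prove the `return` sits at the top level of the script body, it only flags the two shape problems named in this design. The function name and messages are hypothetical.

```javascript
// Returns a list of shape problems found in the script source text.
function checkEntrypointShape(source) {
  const problems = [];
  if (/\bmodule\.exports\b|\bexports\.\w+\s*=/.test(source)) {
    problems.push("uses Node-style exports");
  }
  // Looks for a line that starts with `return collectFaultDetails(`.
  if (!/^\s*return\s+collectFaultDetails\s*\(/m.test(source)) {
    problems.push("missing `return collectFaultDetails(...)`");
  }
  return problems;
}

const browserShaped =
  "function collectFaultDetails(args) { return {}; }\nreturn collectFaultDetails(args);";
const nodeShaped = "module.exports = { collectFaultDetails };";
console.log(checkEntrypointShape(browserShaped), checkEntrypointShape(nodeShaped));
```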
### `claw-new` runtime regressions
Add Rust coverage in `claw-new` for:
- direct-submit success with a populated `report-artifact`
- `partial` artifact mapping to `TaskComplete.success = true`
- `empty` artifact mapping to `TaskComplete.success = true`
- `blocked` / `error` artifact mapping to `TaskComplete.success = false`
- browser-script helper behavior for a real browser-eval return payload
### Manual acceptance
The live manual acceptance bar for this slice should be:
1. Configure `skillsDir` to the staged skill root and `directSubmitSkill` to `fault-details-report.collect_fault_details`.
2. Attach sgClaw to the real target browser page/session.
3. Submit a natural-language fault-details request without LLM routing.
4. Verify the staged skill:
   - reads the selected page range
   - queries real fault rows
   - produces populated detail rows
   - produces populated summary rows
   - exports the workbook through localhost
   - writes report history
5. Verify the final sgClaw completion message reports the correct status, counts, and downstream file/log outcome.
### Acceptance matrix
At minimum, acceptance should cover:
- normal populated result
- empty result with no matching rows
- partial result where export or report-log fails after collection
- blocked result where page/session preconditions are missing
- error result where parsing/query execution fails
---
## Explicit Non-Goals
This slice does **not**:
- move routing ownership out of `claw-new`
- require LLM routing to be available first
- add per-skill dispatch metadata to external manifests for routing policy
- introduce a new browser protocol or browser opcode
- recreate the original Vue shell inside `claw-new`
- move fault classification logic into Rust
- redesign the submit-task protocol beyond better interpretation of the returned artifact
---
## Resulting Design Rule
For the fault-details path:
- `claw-new` decides whether to invoke the fixed staged skill
- the staged skill performs the real fault business workflow
- the staged skill returns a structured artifact that describes collection + downstream outcomes
- `claw-new` interprets that artifact for submit-task success/failure and summary output
That keeps routing config-owned, keeps business logic with the staged skill, and makes `fault-details-report.collect_fault_details` ready for both the current no-LLM path and a later LLM-routed path.
---
## Document Landing Zones
- Approved spec: `docs/superpowers/specs/2026-04-10-fault-details-full-skill-alignment-design.md`
- Follow-up implementation plan: `docs/superpowers/plans/2026-04-10-fault-details-full-skill-alignment-plan.md`