From 72b79feca9ae51e1977e67f78f362fe65025e2a7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=9C=A8=E7=82=8E?= <635735027@qq.com> Date: Sun, 12 Apr 2026 13:12:35 +0800 Subject: [PATCH] docs: add tq lineloss design and plan Add the tq lineloss design spec and implementation plan documents used for the deterministic submit work. Co-Authored-By: Claude Sonnet 4.6 --- ...11-tq-lineloss-deterministic-skill-plan.md | 808 ++++++++++++++++++ ...-tq-lineloss-deterministic-skill-design.md | 618 ++++++++++++++ 2 files changed, 1426 insertions(+) create mode 100644 docs/superpowers/plans/2026-04-11-tq-lineloss-deterministic-skill-plan.md create mode 100644 docs/superpowers/specs/2026-04-11-tq-lineloss-deterministic-skill-design.md diff --git a/docs/superpowers/plans/2026-04-11-tq-lineloss-deterministic-skill-plan.md b/docs/superpowers/plans/2026-04-11-tq-lineloss-deterministic-skill-plan.md new file mode 100644 index 0000000..5562a1f --- /dev/null +++ b/docs/superpowers/plans/2026-04-11-tq-lineloss-deterministic-skill-plan.md @@ -0,0 +1,808 @@ +# TQ Lineloss Deterministic Skill Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Add a staged `tq-lineloss-report.collect_lineloss` browser-script skill plus a `。。。` deterministic submit path in `claw-new` that extracts and normalizes company/month/week parameters without LLM, executes through the existing pipe browser-script seam, and does not regress Zhihu hotlist behavior. + +**Architecture:** Keep the new behavior behind a narrow deterministic branch that activates only when the raw instruction ends with the exact suffix `。。。`. 
`claw-new` owns deterministic trigger detection, explicit scene matching, semantic extraction, canonical normalization, prompt-or-execute control flow, and artifact interpretation; the staged skill owns page inspection, source/API collection, row normalization, export/report-log behavior, and final artifact generation. Reuse the existing `browser_script` execution seam already used by the direct browser path so the backend can later swap from pipe to ws without changing the deterministic contract. + +**Tech Stack:** Rust 2021, Cargo tests, existing `BrowserPipeTool` / `execute_browser_script_tool` seam, staged skill packaging under `claw/claw/skills/skill_staging`, browser-side JavaScript, deterministic string parsing and normalization. + +--- + +## Execution Context + +- Follow @superpowers:test-driven-development for every behavior change. +- Follow @superpowers:verification-before-completion before claiming each task is done. +- Do **not** create a git worktree unless the user explicitly asks. +- Keep the new behavior as a narrow branch; do **not** redesign the whole runtime into a general registry engine in this slice. +- Preserve `src/runtime/engine.rs:147-159` and `src/runtime/engine.rs:265-286` behavior unless a failing regression test proves a change is required. +- Do **not** add ws runtime requirements on `main`; keep ws-readiness isolated to backend-neutral contracts only. +- Never fall back to page defaults for missing company, mode, or period in deterministic mode. +- If a deterministic request does not match the lineloss whitelist scene, return a deterministic mismatch prompt instead of falling through to ordinary orchestration. 
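The exact-suffix rule from the Architecture paragraph can be sketched as follows. This is a minimal illustration only: the function name `has_deterministic_suffix` and the constant `DETERMINISTIC_SUFFIX` are assumptions, not names fixed by this plan. The extra check on the preceding character reflects the trigger contract's rule that `。。。。` must not trigger deterministic mode.

```rust
/// Hypothetical constant: the three-Chinese-dot trigger described in this plan.
const DETERMINISTIC_SUFFIX: &str = "。。。";

/// Returns true only when the raw instruction ends with exactly `。。。`.
/// The input is deliberately not trimmed, so trailing whitespace disables the trigger.
pub fn has_deterministic_suffix(raw: &str) -> bool {
    match raw.strip_suffix(DETERMINISTIC_SUFFIX) {
        // A fourth `。` directly before the suffix means the user typed `。。。。`.
        Some(rest) => !rest.ends_with('。'),
        // ASCII `...`, a mid-instruction `。。。`, or no suffix at all.
        None => false,
    }
}
```

Working on the raw, untrimmed string keeps the negative cases cheap to test: each one is a single string literal with no page or runtime context.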
+ +## File Map + +### New or modified files in `claw-new` + +- Create: `src/compat/deterministic_submit.rs` + - suffix detection, deterministic scene match, prompt-or-execute decision +- Create: `src/compat/tq_lineloss/mod.rs` + - public normalization and artifact helpers +- Create: `src/compat/tq_lineloss/contracts.rs` + - canonical request/result data structures and status semantics +- Create: `src/compat/tq_lineloss/org_resolver.rs` + - alias generation, canonical label/code resolution, ambiguity handling +- Create: `src/compat/tq_lineloss/period_resolver.rs` + - month/week extraction, contradiction detection, canonical payload building +- Create: `src/compat/tq_lineloss/org_units.rs` + - checked-in canonical unit dictionary derived from the real source tree data +- Modify: `src/compat/mod.rs` + - export the deterministic and lineloss modules +- Modify: `src/agent/mod.rs` + - insert the deterministic branch before ordinary LLM interpretation, but only when the exact suffix is present +- Modify only if code duplication would otherwise occur: `src/compat/direct_skill_runtime.rs` + - extract narrow shared browser-script execution helpers without changing current configured direct-submit behavior +- Read but avoid changing unless tests force it: `src/runtime/engine.rs` + - existing Zhihu hotlist routing/prompt logic must remain intact + +### New staged skill package in `claw` + +- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/SKILL.md` +- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/SKILL.toml` +- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/references/collection-flow.md` +- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/references/data-quality.md` +- Create: 
`D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/assets/scene-snapshot/index.html` +- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js` +- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js` +- Create if staging conventions require it: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/tq-lineloss-report/scene.json` + +### Tests + +- Create: `tests/deterministic_submit_test.rs` +- Modify: `tests/compat_runtime_test.rs` +- Modify only if end-to-end submit coverage requires it: `tests/runtime_task_flow_test.rs` + +--- + +## Locked contracts + +### Deterministic trigger contract + +- Trigger only when the raw instruction ends with the exact suffix `。。。`. +- No suffix: current behavior unchanged. +- Suffix + unsupported scene: explicit deterministic mismatch prompt. +- Suffix is not permission for arbitrary browser actions; only fixed deterministic scenes are allowed. 
+- Negative cases must stay non-deterministic or mismatched exactly as designed: + - ASCII `...` is not the trigger + - `。。。。` is not the trigger + - `。。。` appearing in the middle of the instruction is not the trigger + - `。。。` followed by any trailing whitespace is not the trigger in this slice + +### Canonical org contract + +The resolver must output both display and backend values: + +```rust +pub struct ResolvedOrg { + pub label: String, + pub code: String, +} +``` + +Required supported inputs include: +- `兰州公司` +- `天水公司` +- `国网兰州供电公司` +- `城关供电分公司` +- `榆中县供电公司` +- normalized shorthand such as `榆中县公司` + +Rules: +- derive aliases from the real unit tree data +- require uniqueness before execution +- ambiguous aliases prompt and stop +- a missing company prompts and stops + +### Canonical period contract + +```rust +pub enum PeriodMode { + Month, + Week, +} + +pub struct ResolvedPeriod { + pub mode: PeriodMode, + pub mode_code: String, + pub value: String, + pub payload: serde_json::Value, +} +``` + +Required supported inputs include: +- `月累计 2026-03` +- `月累计 2026年3月` +- `周累计 2026年第12周` + +Rules: +- month and week intent are mutually exclusive +- a missing mode prompts and stops +- a missing period prompts and stops +- bare `第12周` is incomplete in this slice and must prompt for the year instead of guessing +- derive the real backend `period_mode_code` values and request payload field names from the source page/API contract before implementation; do not ship placeholder enum echoes such as `month`/`week` unless the source materials prove those are the real backend codes +- never use page-selected defaults in deterministic mode + +### Artifact contract + +Lock the field names now so `claw-new` can interpret status without re-embedding business logic: + +```json +{ + "type": "report-artifact", + "report_name": "tq-lineloss-report", + "status": "ok", + "org": { + "label": "国网兰州供电公司", + "code": "008df5db70319f73e0508eoac23e0c3c" + }, + "period": { + "mode": "month", + "mode_code": "", + 
"value": "2026-03", + "payload": { + "": "" + } + }, + "columns": [], + "rows": [], + "counts": { + "rows": 0 + }, + "export": { + "attempted": false, + "status": "skipped", + "message": null + }, + "reasons": [] +} +``` + +Status mapping in `claw-new`: +- `ok` -> task success +- `partial` -> task success with partial summary +- `blocked` -> task failure +- `error` -> task failure + +--- + +### Task 1: Scaffold the staged skill package and written contract + +**Files:** +- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/SKILL.md` +- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/SKILL.toml` +- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/references/collection-flow.md` +- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/references/data-quality.md` +- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/assets/scene-snapshot/index.html` +- Create if required: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/tq-lineloss-report/scene.json` + +- [ ] **Step 1: Write the failing package contract files** + +Create the package using `fault-details-report` as the structure reference. Lock one tool only: + +```toml +[[tools]] +name = "collect_lineloss" +kind = "browser_script" +description = "Collect 台区线损月/周累计线损率 rows using normalized company and period parameters and return a structured report artifact." 
+``` + +Declare required args in `SKILL.toml`: +- `expected_domain` +- `org_label` +- `org_code` +- `period_mode` +- `period_mode_code` +- `period_value` +- `period_payload` + +- [ ] **Step 2: Write `SKILL.md` before implementation** + +Document: +- when to use / when not to use +- required normalized args only +- blocked/error semantics +- exact returned artifact fields +- no raw natural-language values passed to backend requests + +- [ ] **Step 3: Write the reference docs** + +`references/collection-flow.md` must describe: +- relevant page state +- month request mapping +- week request mapping +- export/report-log flow if retained + +`references/data-quality.md` must define: +- canonical output columns +- required field coverage +- status semantics +- partial/error rules +- org/period normalization assumptions + +- [ ] **Step 4: Add scene metadata if the current staging registry needs it** + +Keep it narrow: one scene, one tool, one artifact type. + +- [ ] **Step 5: Add an automated staged-skill load/resolve check** + +Add `tests/deterministic_submit_test.rs` coverage that loads the staged skills root used by runtime tests, resolves `tq-lineloss-report.collect_lineloss`, and asserts the tool is discoverable with the required args: +- `expected_domain` +- `org_label` +- `org_code` +- `period_mode` +- `period_mode_code` +- `period_value` +- `period_payload` + +Run: +```bash +cargo test deterministic_submit_discovers_tq_lineloss_skill_contract -- --exact +``` + +Expected: FAIL before the package is fully wired, PASS once the staged skill contract is discoverable and complete. + +- [ ] **Step 6: Verify structural parity with `fault-details-report`** + +Run a manual file-layout diff and confirm there are no placeholder descriptions or missing required docs. 
+ +- [ ] **Step 7: Commit** + +```bash +git add "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report" "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/tq-lineloss-report/scene.json" +git commit -m "feat: scaffold tq lineloss staged skill contract" +``` + +--- + +### Task 2: Add browser-side JS red tests and implement the staged collector + +**Files:** +- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js` +- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js` + +- [ ] **Step 1: Write the failing JS tests first** + +Cover deterministic pure helpers for: +- missing normalized args -> blocked/error contract +- month request shape uses `org_code` + canonical month payload +- week request shape uses `org_code` + canonical week payload +- artifact field names and counts +- partial/error status shaping +- no raw user-entered org text leakage into request fields + +Example test skeleton: + +```javascript +const test = require('node:test'); +const assert = require('node:assert/strict'); + +const { + validateArgs, + buildMonthRequest, + buildWeekRequest, + normalizeRows, + buildArtifact +} = require('./collect_lineloss.js'); + +test('buildMonthRequest uses canonical org code and month payload', () => { + const request = buildMonthRequest({ + org_code: 'ORG-1', + period_payload: { year: 2026, month: 3 } + }); + + assert.equal(request.orgCode, 'ORG-1'); + assert.equal(request.year, 2026); + assert.equal(request.month, 3); +}); + +test('buildArtifact locks field names and partial semantics', () => { + const artifact = buildArtifact({ + org_label: '国网兰州供电公司', + org_code: 'ORG-1', + period_mode: 'month', + period_mode_code: 'month', + period_value: '2026-03', + period_payload: { year: 2026, month: 3 }, + rows: [{ id: 1 }], + status: 'partial', + reasons: ['export_failed'] + }); 
+ + assert.equal(artifact.report_name, 'tq-lineloss-report'); + assert.equal(artifact.org.code, 'ORG-1'); + assert.equal(artifact.period.value, '2026-03'); + assert.deepEqual(artifact.reasons, ['export_failed']); +}); +``` + +- [ ] **Step 2: Run the JS test file to confirm failure** + +Run: +```bash +node --test "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js" +``` + +Expected: FAIL because the script/helpers do not exist yet. + +- [ ] **Step 3: Write the minimal browser-side implementation** + +Required structure: + +```javascript +function validateArgs(args) { /* require normalized canonical args */ } +function buildMonthRequest(args) { /* build month request from canonical values */ } +function buildWeekRequest(args) { /* build week request from canonical values */ } +function normalizeRows(rawRows) { /* canonical columns only */ } +function buildArtifact(input) { /* locked artifact shape */ } + +return (async () => { + const args = __SKILL_ARGS__; + validateArgs(args); + // validate page context + // collect from page/API + // normalize rows + // optionally attempt export/report-log if the real business flow requires it + return buildArtifact(result); +})(); +``` + +Keep test exports behind an environment-safe guard so the file still works as browser-eval code. + +- [ ] **Step 4: Re-run the JS tests until they pass** + +Run: +```bash +node --test "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js" +``` + +Expected: PASS. 
+ +- [ ] **Step 5: Commit** + +```bash +git add "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js" "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js" +git commit -m "feat: add tq lineloss browser collection script" +``` + +--- + +### Task 3: Add deterministic suffix detection and explicit scene routing + +**Files:** +- Create: `src/compat/deterministic_submit.rs` +- Modify: `src/compat/mod.rs` +- Modify: `src/agent/mod.rs` +- Create: `tests/deterministic_submit_test.rs` + +- [ ] **Step 1: Write failing routing tests** + +Add Rust tests for: +- exact raw `。。。` suffix enables deterministic mode +- no suffix leaves current routing untouched +- suffix + unsupported deterministic request returns supported-scene prompt +- when page URL/title context is available and does not match the lineloss scene, deterministic routing returns mismatch/block prompt instead of proceeding +- Zhihu hotlist request without suffix keeps the current route +- ASCII `...` does not trigger deterministic mode +- `。。。。` does not trigger deterministic mode +- `。。。` in the middle of the instruction does not trigger deterministic mode +- trailing whitespace after `。。。` does not trigger deterministic mode in this slice + +Suggested tests: + +```rust +#[test] +fn deterministic_submit_requires_exact_suffix() {} + +#[test] +fn deterministic_submit_nonmatch_returns_supported_scene_message() {} + +#[test] +fn deterministic_submit_rejects_page_context_mismatch() {} + +#[test] +fn zhihu_hotlist_request_without_suffix_keeps_existing_route() {} + +#[test] +fn deterministic_submit_rejects_non_exact_suffix_variants() {} +``` + +- [ ] **Step 2: Run the targeted routing tests and confirm failure** + +Run: +```bash +cargo test deterministic_submit_requires_exact_suffix -- --exact +cargo test deterministic_submit_nonmatch_returns_supported_scene_message -- --exact +cargo test 
zhihu_hotlist_request_without_suffix_keeps_existing_route -- --exact +``` + +Expected: FAIL because the deterministic routing seam does not exist yet. + +- [ ] **Step 3: Implement the narrow deterministic routing module** + +Recommended public shape: + +```rust +pub enum DeterministicSubmitDecision { + NotDeterministic, + Prompt { summary: String }, + Execute(DeterministicExecutionPlan), +} +``` + +`src/agent/mod.rs` should: +1. detect deterministic suffix +2. if not deterministic, continue current flow untouched +3. if prompt, return `TaskComplete` +4. if execute, pass the plan into the browser-script execution seam + +- [ ] **Step 4: Re-run the routing tests** + +Run: +```bash +cargo test deterministic_submit_requires_exact_suffix -- --exact +cargo test deterministic_submit_nonmatch_returns_supported_scene_message -- --exact +cargo test zhihu_hotlist_request_without_suffix_keeps_existing_route -- --exact +``` + +Expected: PASS. + +- [ ] **Step 5: Commit** + +```bash +git add src/compat/deterministic_submit.rs src/compat/mod.rs src/agent/mod.rs tests/deterministic_submit_test.rs +git commit -m "feat: add deterministic submit routing seam" +``` + +--- + +### Task 4: Implement company/unit normalization from real source data + +**Files:** +- Create: `src/compat/tq_lineloss/mod.rs` +- Create: `src/compat/tq_lineloss/contracts.rs` +- Create: `src/compat/tq_lineloss/org_resolver.rs` +- Create: `src/compat/tq_lineloss/org_units.rs` +- Modify: `tests/deterministic_submit_test.rs` + +- [ ] **Step 1: Write failing org resolver tests** + +Cover: +- `兰州公司` -> canonical `国网兰州供电公司` + correct code +- `天水公司` -> canonical `国网天水供电公司` + correct code +- `城关供电分公司` -> lower-level direct match +- `榆中县公司` -> normalized county alias match +- ambiguous alias prompts instead of guessing +- missing company prompts instead of executing + +Example skeleton: + +```rust +#[test] +fn lineloss_org_resolver_matches_city_alias() {} + +#[test] +fn lineloss_org_resolver_matches_county_alias() {} + 
+#[test] +fn lineloss_org_resolver_prompts_on_ambiguity() {} +``` + +- [ ] **Step 2: Run the org tests and confirm failure** + +Run: +```bash +cargo test lineloss_org_resolver_matches_city_alias -- --exact +cargo test lineloss_org_resolver_matches_county_alias -- --exact +cargo test lineloss_org_resolver_prompts_on_ambiguity -- --exact +``` + +Expected: FAIL because the resolver and checked-in unit dictionary do not exist yet. + +- [ ] **Step 3: Check in the canonical unit dictionary and implement alias resolution** + +Rules: +- derive data from the real source materials, not guessed literals +- keep canonical `label` and `code` +- generate normalized aliases from formal names +- support both city-company and district/county/sub-company levels +- require uniqueness before execution + +- [ ] **Step 4: Implement explicit prompt messages** + +Examples: +- `已命中台区线损报表技能,但缺少供电单位,请补充如“兰州公司”或“城关供电分公司”。` +- `已命中台区线损报表技能,但供电单位存在歧义,请补充更完整名称。` + +- [ ] **Step 5: Re-run the org tests** + +Run: +```bash +cargo test lineloss_org_resolver_matches_city_alias -- --exact +cargo test lineloss_org_resolver_matches_county_alias -- --exact +cargo test lineloss_org_resolver_prompts_on_ambiguity -- --exact +``` + +Expected: PASS. 
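The alias rules in this task could be sketched roughly as below. This is not the checked-in resolver: the alias derivation (strip a leading `国网`, drop one `供电`) is an assumed heuristic, the `OrgOutcome` enum and function names are hypothetical, and the caller is assumed to pass an already extracted company token plus a unit list whose codes come from the real source data.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct ResolvedOrg {
    pub label: String,
    pub code: String,
}

/// Hypothetical outcome type: one hit executes, anything else prompts.
#[derive(Debug, PartialEq)]
pub enum OrgOutcome {
    Resolved(ResolvedOrg),
    NoMatch,
    Ambiguous(Vec<String>),
}

/// Assumed alias heuristic: the formal name, the name without a leading `国网`,
/// and that short name with one `供电` removed (for example `兰州公司`).
fn aliases(label: &str) -> Vec<String> {
    let short = label.strip_prefix("国网").unwrap_or(label);
    let mut out = vec![
        label.to_string(),
        short.to_string(),
        short.replacen("供电", "", 1),
    ];
    out.sort();
    out.dedup();
    out
}

/// Resolve a company token against the unit dictionary, requiring uniqueness.
pub fn resolve_org(input: &str, units: &[ResolvedOrg]) -> OrgOutcome {
    let hits: Vec<&ResolvedOrg> = units
        .iter()
        .filter(|unit| aliases(&unit.label).iter().any(|alias| alias.as_str() == input))
        .collect();
    match hits.as_slice() {
        [] => OrgOutcome::NoMatch,
        [only] => OrgOutcome::Resolved(ResolvedOrg::clone(only)),
        many => OrgOutcome::Ambiguous(many.iter().map(|unit| unit.label.clone()).collect()),
    }
}
```

Returning the full list of colliding labels in `Ambiguous` lets the prompt message name the candidates instead of only reporting that a collision occurred.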
+ +- [ ] **Step 6: Commit** + +```bash +git add src/compat/tq_lineloss/mod.rs src/compat/tq_lineloss/contracts.rs src/compat/tq_lineloss/org_resolver.rs src/compat/tq_lineloss/org_units.rs tests/deterministic_submit_test.rs +git commit -m "feat: add tq lineloss org normalization" +``` + +--- + +### Task 5: Implement period extraction and canonical payload building + +**Files:** +- Create: `src/compat/tq_lineloss/period_resolver.rs` +- Modify: `src/compat/tq_lineloss/mod.rs` +- Modify: `tests/deterministic_submit_test.rs` + +- [ ] **Step 1: Write failing period resolver tests** + +Cover: +- `月累计 2026-03` +- `月累计 2026年3月` +- `周累计 2026年第12周` +- contradictory month/week expressions prompt +- missing mode prompts +- missing period prompts +- bare `第12周` prompts for year in this slice +- real backend month/week mode codes and request payload field names are derived from source materials instead of placeholder values + +Example skeleton: + +```rust +#[test] +fn lineloss_period_resolver_parses_month_text() {} + +#[test] +fn lineloss_period_resolver_parses_week_text() {} + +#[test] +fn lineloss_period_resolver_prompts_for_missing_year_on_week() {} + +#[test] +fn lineloss_period_resolver_rejects_contradictory_mode() {} +``` + +- [ ] **Step 2: Run the period tests and confirm failure** + +Run: +```bash +cargo test lineloss_period_resolver_parses_month_text -- --exact +cargo test lineloss_period_resolver_parses_week_text -- --exact +cargo test lineloss_period_resolver_prompts_for_missing_year_on_week -- --exact +cargo test lineloss_period_resolver_rejects_contradictory_mode -- --exact +``` + +Expected: FAIL because the period resolver does not exist yet. 
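The period rules covered by the tests above could be sketched with the standard library alone. Everything here is illustrative: `PeriodOutcome`, the prompt strings, and the `2026-W12` week value format are assumptions, and the sketch omits `mode_code` and `payload` because the plan requires those to be derived from the real source page/API contract.

```rust
#[derive(Debug, PartialEq)]
pub enum PeriodMode { Month, Week }

/// Hypothetical outcome type: either a normalized period or a prompt reason.
#[derive(Debug, PartialEq)]
pub enum PeriodOutcome {
    Resolved { mode: PeriodMode, value: String },
    Prompt(&'static str),
}

/// Parse the run of ASCII digits that ends right before byte offset `end`.
fn digits_before(text: &str, end: usize) -> Option<u32> {
    let head = &text[..end];
    let start = head
        .char_indices()
        .rev()
        .take_while(|(_, c)| c.is_ascii_digit())
        .last()
        .map(|(i, _)| i)?;
    head[start..].parse().ok()
}

/// Parse the run of ASCII digits that starts at byte offset `start`.
fn digits_after(text: &str, start: usize) -> Option<u32> {
    let tail = &text[start..];
    let len = tail.chars().take_while(|c| c.is_ascii_digit()).count();
    tail[..len].parse().ok()
}

fn resolve_month(text: &str) -> PeriodOutcome {
    // Accept `2026年3月` first, then `2026-03`.
    for sep in ["年", "-"] {
        if let Some(idx) = text.find(sep) {
            let year = digits_before(text, idx);
            let month = digits_after(text, idx + sep.len());
            if let (Some(y), Some(m)) = (year, month) {
                if (1..=12).contains(&m) {
                    let value = format!("{y:04}-{m:02}");
                    return PeriodOutcome::Resolved { mode: PeriodMode::Month, value };
                }
            }
        }
    }
    PeriodOutcome::Prompt("missing or unreadable month period")
}

fn resolve_week(text: &str) -> PeriodOutcome {
    let Some(idx) = text.find("第") else {
        return PeriodOutcome::Prompt("missing week period");
    };
    let week = digits_after(text, idx + "第".len());
    // The year must be written directly before `第`, as in `2026年第12周`.
    let year = text[..idx]
        .strip_suffix("年")
        .and_then(|head| digits_before(head, head.len()));
    match (year, week) {
        (Some(y), Some(w)) if (1..=53).contains(&w) => PeriodOutcome::Resolved {
            mode: PeriodMode::Week,
            value: format!("{y:04}-W{w:02}"), // assumed display format
        },
        (None, Some(_)) => PeriodOutcome::Prompt("missing year for week period"),
        _ => PeriodOutcome::Prompt("missing or unreadable week period"),
    }
}

/// Month and week intent are mutually exclusive; no defaults are ever applied.
pub fn resolve_period(text: &str) -> PeriodOutcome {
    match (text.contains("月累计"), text.contains("周累计")) {
        (true, true) => PeriodOutcome::Prompt("contradictory month/week mode"),
        (false, false) => PeriodOutcome::Prompt("missing mode"),
        (true, false) => resolve_month(text),
        (false, true) => resolve_week(text),
    }
}
```

The sketch never consults the current date, which matches the rule against implicit current-year assumptions.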
+ +- [ ] **Step 3: Implement the minimal resolver** + +Output contract: + +```rust +pub struct ResolvedPeriod { + pub mode: PeriodMode, + pub mode_code: String, + pub value: String, + pub payload: serde_json::Value, +} +``` + +Rules: +- no page-default fallback +- no implicit current-year assumptions +- no mixed month/week execution + +- [ ] **Step 4: Re-run the period tests** + +Run: +```bash +cargo test lineloss_period_resolver_parses_month_text -- --exact +cargo test lineloss_period_resolver_parses_week_text -- --exact +cargo test lineloss_period_resolver_prompts_for_missing_year_on_week -- --exact +cargo test lineloss_period_resolver_rejects_contradictory_mode -- --exact +``` + +Expected: PASS. + +- [ ] **Step 5: Commit** + +```bash +git add src/compat/tq_lineloss/period_resolver.rs src/compat/tq_lineloss/mod.rs tests/deterministic_submit_test.rs +git commit -m "feat: add tq lineloss period normalization" +``` + +--- + +### Task 6: Wire deterministic execution through the existing browser-script seam + +**Files:** +- Modify: `src/compat/deterministic_submit.rs` +- Modify: `src/agent/mod.rs` +- Modify if needed: `src/compat/direct_skill_runtime.rs` +- Modify: `tests/deterministic_submit_test.rs` +- Modify: `tests/compat_runtime_test.rs` + +- [ ] **Step 1: Write failing execution tests** + +Cover: +- successful deterministic lineloss request builds canonical tool args +- missing company/mode/period returns prompt without browser execution +- `partial` artifact maps to successful partial summary +- `blocked` and `error` artifacts map to failed completion + +Example skeleton: + +```rust +#[test] +fn deterministic_lineloss_execution_passes_canonical_args() {} + +#[test] +fn deterministic_lineloss_missing_company_does_not_invoke_browser() {} + +#[test] +fn deterministic_lineloss_partial_artifact_maps_to_partial_summary() {} +``` + +- [ ] **Step 2: Run the execution tests and confirm failure** + +Run: +```bash +cargo test 
deterministic_lineloss_execution_passes_canonical_args -- --exact +cargo test deterministic_lineloss_missing_company_does_not_invoke_browser -- --exact +cargo test deterministic_lineloss_partial_artifact_maps_to_partial_summary -- --exact +``` + +Expected: FAIL because the deterministic execution plan is not wired yet. + +- [ ] **Step 3: Implement execution via the existing `browser_script` seam** + +Build tool args only from normalized values: +- `expected_domain` +- `org_label` +- `org_code` +- `period_mode` +- `period_mode_code` +- `period_value` +- `period_payload` + +Resolve the tool explicitly to: +- `tq-lineloss-report.collect_lineloss` + +Do not introduce a new browser opcode family or second browser protocol. + +- [ ] **Step 4: Implement central artifact interpretation** + +Recommended helper: + +```rust +fn summarize_lineloss_artifact(artifact: &serde_json::Value) -> (bool, String) +``` + +Summary must include canonical org/period and row counts, and surface blocked/partial/error reasons. + +- [ ] **Step 5: Re-run the execution tests** + +Run: +```bash +cargo test deterministic_lineloss_execution_passes_canonical_args -- --exact +cargo test deterministic_lineloss_missing_company_does_not_invoke_browser -- --exact +cargo test deterministic_lineloss_partial_artifact_maps_to_partial_summary -- --exact +``` + +Expected: PASS. 
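The status mapping behind the artifact interpretation step can be illustrated without `serde_json`. The `ArtifactView` struct below is a hypothetical, already extracted view of the artifact fields, and treating an unknown status as a failure is an assumption rather than something this plan specifies.

```rust
/// Hypothetical borrowed view over the fields the summary needs.
pub struct ArtifactView<'a> {
    pub status: &'a str,
    pub org_label: &'a str,
    pub period_value: &'a str,
    pub row_count: usize,
    pub reasons: &'a [&'a str],
}

/// Map artifact status to (task success, human-readable summary).
/// `ok` and `partial` succeed; `blocked`, `error`, and anything unknown fail.
pub fn summarize_lineloss(view: &ArtifactView<'_>) -> (bool, String) {
    let success = matches!(view.status, "ok" | "partial");
    let mut summary = format!(
        "{} {} status={} rows={}",
        view.org_label, view.period_value, view.status, view.row_count
    );
    // Surface reasons for every non-ok outcome so partial results are explained.
    if view.status != "ok" && !view.reasons.is_empty() {
        summary.push_str(&format!(" reasons={}", view.reasons.join(",")));
    }
    (success, summary)
}
```

Keeping the mapping in one function means the runtime reads only `status`, counts, and reasons, and never re-derives business meaning from the rows.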
+ +- [ ] **Step 6: Commit** + +```bash +git add src/compat/deterministic_submit.rs src/agent/mod.rs src/compat/direct_skill_runtime.rs tests/deterministic_submit_test.rs tests/compat_runtime_test.rs +git commit -m "feat: execute deterministic tq lineloss skill through browser script seam" +``` + +--- + +### Task 7: Add Zhihu regression coverage and run the full verification set + +**Files:** +- Modify: `tests/compat_runtime_test.rs` +- Modify only if required: `tests/runtime_task_flow_test.rs` +- Reuse: `tests/deterministic_submit_test.rs` + +- [ ] **Step 1: Add focused Zhihu regression tests** + +Required assertions: +- ordinary Zhihu hotlist requests without `。。。` still use the current path +- existing export/presentation requests still preserve their current behavior +- deterministic suffix does not silently route unmatched requests into Zhihu logic +- an existing non-lineloss direct `browser_script` path outside the new scene still behaves unchanged + +- [ ] **Step 2: Add end-to-end deterministic submit coverage** + +Required assertions: +- suffix detection +- scene match +- page-context mismatch prompt/block behavior when URL/title contradict the lineloss scene +- missing/ambiguous prompts +- canonical args passed to the browser-script tool +- returned summary shows canonical org and period +- execution stays on the existing pipe-backed browser-script seam with no ws-only dependency introduced on `main` + +- [ ] **Step 3: Run the focused Rust tests** + +Run: +```bash +cargo test --test deterministic_submit_test +cargo test --test compat_runtime_test +cargo test --test runtime_task_flow_test +``` + +Expected: PASS. + +- [ ] **Step 4: Run the whole Rust suite** + +Run: +```bash +cargo test +``` + +Expected: PASS. + +- [ ] **Step 5: Re-run the staged skill JS tests** + +Run: +```bash +node --test "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js" +``` + +Expected: PASS. 
+ +- [ ] **Step 6: Commit** + +```bash +git add tests/deterministic_submit_test.rs tests/compat_runtime_test.rs tests/runtime_task_flow_test.rs +git commit -m "test: cover deterministic tq lineloss routing and zhihu regression" +``` + +--- + +## Final verification checklist + +- [ ] `。。。` is the only deterministic trigger. +- [ ] Non-`。。。` requests preserve current routing. +- [ ] Deterministic page-context mismatch blocks or mismatches before execution when URL/title contradict the lineloss scene. +- [ ] Zhihu hotlist behavior is unchanged. +- [ ] Existing non-lineloss direct `browser_script` behavior is unchanged. +- [ ] Deterministic non-match returns an explicit supported-scene message. +- [ ] Missing company prompts. +- [ ] Ambiguous company prompts. +- [ ] Missing mode prompts. +- [ ] Missing period prompts. +- [ ] Bare `第12周` prompts for year. +- [ ] Canonical org code is passed to the staged skill. +- [ ] Canonical period mode code and payload are passed to the staged skill. +- [ ] The staged skill returns the locked artifact shape. +- [ ] Execution uses the existing `browser_script` seam only. +- [ ] No ws-specific runtime dependency is added on `main`. + +## Implementation notes + +- Prefer extracting a tiny shared execution helper from `src/compat/direct_skill_runtime.rs` if needed instead of duplicating tool lookup or browser-script invocation code. +- Keep deterministic whitelist configuration in one place, but do not expand this slice into a full general scene-registry redesign. +- If a failing test suggests changing Zhihu behavior, fix the deterministic branch or test harness instead of weakening the existing Zhihu path. +- The checked-in unit dictionary is part of the deterministic contract; treat updates to that data as explicit behavior changes and cover them with tests. 
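One way to keep the deterministic whitelist in a single place, as the notes above suggest, is a small constant table. This is only a sketch: the `SceneEntry` type, the keyword list, and the message wording are assumptions, and real scene matching may also need the page URL/title context described in this plan.

```rust
/// Hypothetical whitelist entry: one tool plus the keywords that select it.
pub struct SceneEntry {
    pub tool: &'static str,
    pub keywords: &'static [&'static str],
}

/// Single source of truth for deterministic scenes in this slice.
/// The keyword choice is an assumption made for illustration.
pub const DETERMINISTIC_SCENES: &[SceneEntry] = &[SceneEntry {
    tool: "tq-lineloss-report.collect_lineloss",
    keywords: &["台区线损", "线损"],
}];

#[derive(Debug, PartialEq)]
pub enum SceneMatch {
    Matched(&'static str),
    Unsupported(String),
}

/// Match the instruction body (suffix already removed) against the whitelist.
/// A miss returns an explicit message instead of falling through to other routes.
pub fn match_scene(body: &str) -> SceneMatch {
    for scene in DETERMINISTIC_SCENES {
        if scene.keywords.iter().any(|keyword| body.contains(*keyword)) {
            return SceneMatch::Matched(scene.tool);
        }
    }
    let supported: Vec<&str> = DETERMINISTIC_SCENES.iter().map(|scene| scene.tool).collect();
    SceneMatch::Unsupported(format!(
        "unsupported deterministic request; supported scenes: {}",
        supported.join(", ")
    ))
}
```

Adding a second deterministic scene later would then be a one-entry change to the table rather than a new routing branch.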
diff --git a/docs/superpowers/specs/2026-04-11-tq-lineloss-deterministic-skill-design.md b/docs/superpowers/specs/2026-04-11-tq-lineloss-deterministic-skill-design.md new file mode 100644 index 0000000..1f6f806 --- /dev/null +++ b/docs/superpowers/specs/2026-04-11-tq-lineloss-deterministic-skill-design.md @@ -0,0 +1,618 @@ +# TQ Line-Loss Deterministic Skill Design + +**Goal:** Add a staged business skill for `台区线损大数据-月_周累计线损率统计分析` and a deterministic natural-language routing path in `claw-new` that can bypass LLM when the instruction ends with `。。。`, while preserving the existing Zhihu hotlist behavior and keeping the execution seam pipe-first but ws-ready. + +**Status:** Approved design direction for implementation planning. + +--- + +## Decision Summary + +1. Add a new staged skill package `tq-lineloss-report` under `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/`, following the same packaging discipline as `fault-details-report`. +2. In `claw-new`, add a deterministic submit path triggered only when the instruction ends with the three-Chinese-dot suffix `。。。`. +3. In deterministic mode, route only through a fixed whitelist of staged skills; for this slice the new target is `tq-lineloss-report.collect_lineloss`. +4. Deterministic mode must extract business parameters from natural language without using an LLM: company/unit, month-vs-week mode, and period text. +5. Parsed natural-language parameters are not the final backend parameters. They must be normalized into the canonical codes required by the source page / source APIs (for example company code and period mode code). +6. If required parameters are missing or ambiguous, the runtime must stop and ask the user to provide them explicitly. It must **not** silently fall back to page defaults in this slice. +7. Skill execution must reuse the existing browser-script → pipe injection seam already proven by the Zhihu hotlist path. Do not create a second browser execution protocol. +8. 
The design must not regress or weaken the existing Zhihu hotlist direct path, browser-script path, export path, or current routing behavior. +9. The main branch implementation remains pipe-only, but all new deterministic-routing and skill contracts must stay backend-neutral so the execution backend can later be swapped to ws on the ws branch. + +--- + +## Non-Negotiable Boundaries + +### 1. Do not break the existing Zhihu hotlist flow + +This is the top safety boundary for the slice. + +The new deterministic routing for `tq-lineloss-report` must not break, narrow, or silently change: + +- current Zhihu hotlist routing +- current Zhihu direct browser-script execution +- current Zhihu export behavior +- current browser-script skill loading/execution +- existing direct-submit configuration behavior + +Design implication: + +- The new deterministic path must be added as a narrow, explicit branch. +- Existing Zhihu logic must keep its current trigger semantics and current execution seam. +- Verification for this slice must include targeted Zhihu regression coverage before implementation is considered complete. + +### 2. Current main branch is pipe-only + +The implementation landing on `main` must execute browser-script skills through the current pipe-backed browser execution seam. + +Do not introduce ws as an active runtime requirement for this slice. + +### 3. Future ws migration must stay cheap + +Although `main` remains pipe-only, the new work must leave a clean extension seam so that after this slice is merged into `ws`, the browser backend can be switched without redesigning: + +- the staged skill package +- the deterministic trigger contract +- the parameter extraction contract +- the parameter normalization contract +- the returned artifact contract + +--- + +## Why This Slice Exists + +The user wants a staged business skill for `台区线损大数据-月_周累计线损率统计分析` that behaves like a deterministic business operation, not a free-form LLM task. 
+
+The desired operator experience is:
+
+- ordinary instructions continue to use the current normal routing / LLM path
+- an instruction ending in `。。。` switches to deterministic business execution
+- deterministic execution targets a fixed staged skill
+- business parameters are extracted from the instruction
+- those parameters are normalized to the real coded values the source page/API needs
+- the staged browser-script is injected into the third-party browser through the existing pipe seam
+
+This provides an inner-network-safe path that can work without a model today, while reserving an upgrade path for future semantic fallback.
+
+---
+
+## Terminology
+
+### Deterministic mode
+
+A submit-task mode enabled only when the instruction ends with `。。。`.
+
+### Natural-language business parameters
+
+Values expressed by the user in text, such as:
+
+- `兰州公司`
+- `天水公司`
+- `月累计`
+- `周累计`
+- `2026-03`
+- `2026年第12周`
+
+These are intermediate semantic values, not final backend parameters.
+
+### Canonical execution parameters
+
+The normalized values required by the source page / source API, such as:
+
+- canonical company label
+- canonical company code
+- period mode code (month/week)
+- canonical request period payload
+
+---
+
+## Ownership Boundary and Landing Zones
+
+### Staged skill changes
+
+These land in:
+
+`D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging`
+
+Primary landing zone:
+
+- `skills/tq-lineloss-report/`
+
+Target package structure:
+
+- `SKILL.md`
+- `SKILL.toml`
+- `references/collection-flow.md`
+- `references/data-quality.md`
+- `assets/scene-snapshot/index.html`
+- `scripts/collect_lineloss.js`
+- `scripts/collect_lineloss.test.js`
+
+Potential aligned scene metadata (if included in this slice):
+
+- `scenes/tq-lineloss-report/scene.json`
+- optional scene registry updates if the current staging conventions require it
+
+### Caller/runtime changes
+
+These land in:
+
+`D:/data/ideaSpace/rust/sgClaw/claw-new`
+
+Likely ownership areas:
+
+- deterministic instruction detection and deterministic skill matching
+- parameter extraction and normalization
+- deterministic skill dispatch to the existing browser-script seam
+- narrow result interpretation for the returned artifact
+- focused regression tests
+
+Design rule:
+
+`claw-new` owns routing, extraction, normalization, and dispatch.
+
+`claw-new` must **not** absorb the line-loss business logic itself.
+
+The staged skill package owns:
+
+- page inspection
+- page-side state reading
+- page/API data collection
+- row normalization
+- export/report-log behavior
+- final artifact generation
+
+---
+
+## Target Runtime Flow
+
+### Step 1: Submit-task enters deterministic mode only on `。。。`
+
+When the user instruction does **not** end in `。。。`:
+
+- keep the current runtime behavior unchanged
+- preserve existing Zhihu hotlist behavior exactly
+- preserve existing direct-submit and compat/LLM flows
+
+When the instruction **does** end in `。。。`:
+
+- enter deterministic mode
+- do not run the ordinary LLM interpretation branch for this request
+- evaluate only the deterministic skill whitelist
+
+### Step 2: Deterministic whitelist match
+
+The runtime should match the instruction against deterministic business scenes.
+
+For this slice the new required deterministic scene is:
+
+- `tq-lineloss-report.collect_lineloss`
+
+The matching layer should remain narrow and explicit. It should not become a general scene-registry runtime in this slice.
+
+Matching should use a deterministic combination of:
+
+- instruction keywords
+- optional page URL/title constraints when available
+
+The runtime must not accidentally steal instructions that should still go down the Zhihu path.
+
+### Step 3: Extract semantic business parameters from natural language
+
+After `tq-lineloss-report` is matched, the runtime extracts semantic business parameters from the instruction.
+
+Required semantic categories:
+
+- company/unit expression
+- period mode (`month` vs `week`)
+- period text/value
+
+Examples of accepted user-facing expressions include:
+
+- `兰州公司`
+- `天水公司`
+- `国网兰州供电公司`
+- `城关供电分公司`
+- `2026-03`
+- `2026年3月`
+- `2026年第12周`
+- `第12周`
+- `月累计`
+- `周累计`
+
+### Step 4: Normalize semantic values into canonical coded values
+
+This is a required separate design step.
+
+The runtime must not pass raw natural-language company text directly to the business request layer.
+
+Instead it must normalize semantic values into canonical execution parameters, including:
+
+- `org_label` — canonical unit label
+- `org_code` — the actual code/value required by the business page/API
+- `period_mode` — canonical mode (`month` or `week`)
+- `period_mode_code` — the page/API code (for example `timeChage`-style encoded mode)
+- canonical time payload required by the source APIs/page state
+
+This normalization should be derived from the actual source materials, including page-side dictionaries such as the existing unit tree data.
+
+### Step 5: Missing and ambiguous parameters must stop execution
+
+This slice must not silently infer missing parameters from page defaults.
+
+If a required parameter is missing, execution must stop with an explicit prompt to the user.
+
+If a parameter is ambiguous, execution must stop with an explicit ambiguity prompt.
+
+Examples:
+
+- no company matched
+- no month/week mode matched
+- no period value matched when required
+- a short company alias matches multiple canonical units
+- both monthly and weekly intent appear in the same instruction
+
+This is preferable to silently using the wrong company code or the wrong query period.
+
+### Step 6: Execute the staged skill through the existing pipe seam
+
+If and only if parameters are present and successfully normalized:
+
+- resolve `tq-lineloss-report.collect_lineloss`
+- build the args object
+- execute it through the current `browser_script` runtime
+- inject the script into the browser through the existing pipe-backed browser tool seam
+
+This slice must reuse the execution pattern already proven by the current browser-script/direct-skill infrastructure and the current Zhihu hotlist path.
+
+Do not introduce a second browser protocol, new browser opcode family, or parallel execution harness.
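The routing decisions in Steps 1, 2, and 5 can be sketched roughly as follows. This is a minimal illustration, not the `claw-new` implementation: `route`, `Outcome`, `PromptReason`, and `PeriodMode` are hypothetical names, and matching the scene on the keyword `线损` is an assumed stand-in for the real keyword set. Only the month/week mode is resolved here; company and period resolution are omitted.

```rust
/// Suffix that switches a submit-task into deterministic mode.
const DETERMINISTIC_SUFFIX: &str = "。。。";

/// Canonical period mode (hypothetical type name).
#[derive(Debug, PartialEq, Clone, Copy)]
enum PeriodMode {
    Month,
    Week,
}

/// Why the runtime must stop and ask the user (hypothetical type name).
#[derive(Debug, PartialEq)]
enum PromptReason {
    MissingMode,
    ContradictoryMode,
}

/// Result of the narrow deterministic branch (hypothetical type name).
#[derive(Debug, PartialEq)]
enum Outcome {
    /// No suffix: the existing routing must stay untouched.
    NotDeterministic,
    /// Suffix present, but no whitelisted scene matched.
    SceneMismatch,
    /// Scene matched, but a parameter is missing or ambiguous.
    Prompt(PromptReason),
    /// Mode resolved; company and period resolution would follow (not shown).
    Proceed(PeriodMode),
}

fn route(raw: &str) -> Outcome {
    // Step 1: only the exact trailing suffix enables deterministic mode.
    let Some(body) = raw.strip_suffix(DETERMINISTIC_SUFFIX) else {
        return Outcome::NotDeterministic;
    };
    // Step 2: whitelist match. The keyword below is an assumption.
    if !body.contains("线损") {
        return Outcome::SceneMismatch;
    }
    // Step 5: never guess the mode; stop on missing or contradictory intent.
    match (body.contains("月累计"), body.contains("周累计")) {
        (true, true) => Outcome::Prompt(PromptReason::ContradictoryMode),
        (false, false) => Outcome::Prompt(PromptReason::MissingMode),
        (true, false) => Outcome::Proceed(PeriodMode::Month),
        (false, true) => Outcome::Proceed(PeriodMode::Week),
    }
}
```

A caller would invoke `route` ahead of the ordinary routing branch and fall through unchanged on `NotDeterministic`, which is what keeps instructions without the suffix (including the Zhihu path) on their existing code path.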
+
+### Step 7: Skill JS performs page-side work and returns one artifact
+
+The staged script owns the actual line-loss business behavior:
+
+- reading page-side state when needed
+- validating the page context
+- using normalized codes/parameters from args
+- building source API requests
+- collecting/normalizing rows
+- export/report logging behavior if required by the final business contract
+- returning a structured artifact
+
+---
+
+## Deterministic Trigger Contract
+
+### Trigger rule
+
+Deterministic mode is activated only when the raw instruction ends with the exact three-Chinese-dot suffix:
+
+- `。。。`
+
+This suffix is a user-controlled explicit mode switch.
+
+### Why the suffix exists
+
+It lets the user force business-deterministic behavior without relying on a model, while preserving the normal LLM path for ordinary requests.
+
+### Scope rule
+
+The suffix is not a free pass to run arbitrary browser actions.
+
+It only selects among the deterministic skill whitelist.
+
+If no deterministic scene matches, the runtime should return a deterministic-mode mismatch error that explains the currently supported deterministic scenes, rather than silently dropping into another behavior.
+
+---
+
+## Company / Unit Matching Contract
+
+### Accepted input style
+
+The user does **not** need to type the exact full canonical label.
+
+The runtime should support business shorthand such as:
+
+- `兰州公司`
+- `天水公司`
+- `白银公司`
+- `城关供电分公司`
+- `榆中县供电公司`
+
+### Matching approach
+
+Do not use regex alone as the primary company-resolution mechanism.
+
+Use a three-stage resolution strategy:
+
+1. text normalization
+2. alias/candidate generation from canonical unit names
+3. uniqueness resolution against the real unit dictionary
+
+### Normalization examples
+
+Canonical names such as:
+
+- `国网兰州供电公司`
+- `国网天水供电公司`
+- `国网榆中县供电公司`
+
+should be matchable from business shorthand forms such as:
+
+- `兰州公司`
+- `天水公司`
+- `榆中县公司`
+- `榆中供电公司`
+
+### Data source for canonical mapping
+
+The company/unit resolver should derive canonical mappings from the real source materials used by the business page, such as the current unit tree dictionary embedded in the source page resources.
+
+Design implication:
+
+- the resolver should produce the real `value`/code required downstream
+- the resolver should also keep the canonical label for display/auditability
+
+### Ambiguity rule
+
+If a short alias resolves to more than one valid unit, execution must stop and ask the user to be more specific.
+
+Do not auto-guess.
+
+### Supported granularity
+
+The first implementation must support both:
+
+- city-company level
+- district/county/sub-company level
+
+This includes forms like:
+
+- `兰州公司`
+- `天水公司`
+- `城关供电分公司`
+- `榆中县供电公司`
+
+---
+
+## Period Extraction and Normalization Contract
+
+### Required period dimensions
+
+The runtime must identify:
+
+- mode: `month` or `week`
+- actual requested period value in a canonical form
+
+### Accepted user-facing patterns
+
+At minimum the design should account for patterns such as:
+
+- `月累计`
+- `周累计`
+- `2026-03`
+- `2026年3月`
+- `2026年第12周`
+- `第12周`
+
+### Normalization output
+
+The resolver should produce:
+
+- a canonical mode enum/string
+- a mode code required by the page/API
+- a canonical period payload consumable by the script/business request layer
+
+### Ambiguity rule
+
+If both month and week intent appear, stop and ask the user to clarify.
+
+### Missing-period rule
+
+If the selected line-loss query requires a time period and the instruction does not provide enough information to construct one, stop and ask the user to provide it.
+
+Do not default to the page-selected period in this slice.
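The accepted period patterns could be parsed with plain string handling, for example as below. This is a sketch under assumptions: `Period`, `parse_month`, and `parse_week` are hypothetical names, the canonical payload actually required by the source APIs is not specified here, and the year for a bare `第12周` is deliberately left unresolved for the caller to handle (for instance by prompting).

```rust
/// Parsed period value (hypothetical shape, not the final payload).
#[derive(Debug, PartialEq)]
enum Period {
    Month { year: u32, month: u32 },
    Week { year: Option<u32>, week: u32 },
}

/// Reads the run of ASCII digits at the start of `s`; returns it with the rest.
fn leading_number(s: &str) -> Option<(u32, &str)> {
    let end = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    let value = s[..end].parse().ok()?;
    Some((value, &s[end..]))
}

/// Reads the run of ASCII digits at the end of `s`.
fn trailing_number(s: &str) -> Option<u32> {
    let prefix = s.trim_end_matches(|c: char| c.is_ascii_digit());
    s[prefix.len()..].parse().ok()
}

/// Matches `2026-03` and `2026年3月` anywhere in `text`.
fn parse_month(text: &str) -> Option<Period> {
    let mut prev_is_digit = false;
    for (i, c) in text.char_indices() {
        let starts_run = c.is_ascii_digit() && !prev_is_digit;
        prev_is_digit = c.is_ascii_digit();
        if !starts_run {
            continue;
        }
        let Some((year, rest)) = leading_number(&text[i..]) else { continue };
        // Only a four-digit run is treated as a year.
        if text.len() - i - rest.len() != 4 {
            continue;
        }
        let month = if let Some(tail) = rest.strip_prefix('-') {
            leading_number(tail).map(|(m, _)| m)
        } else if let Some(tail) = rest.strip_prefix('年') {
            leading_number(tail)
                .filter(|(_, r)| r.starts_with('月'))
                .map(|(m, _)| m)
        } else {
            None
        };
        if let Some(month) = month.filter(|m| (1..=12).contains(m)) {
            return Some(Period::Month { year, month });
        }
    }
    None
}

/// Matches `第12周`, optionally preceded by `2026年`.
fn parse_week(text: &str) -> Option<Period> {
    let pos = text.find('第')?;
    let (week, rest) = leading_number(&text[pos + '第'.len_utf8()..])?;
    if !rest.starts_with('周') || !(1..=53).contains(&week) {
        return None;
    }
    let year = text[..pos].strip_suffix('年').and_then(trailing_number);
    Some(Period::Week { year, week })
}
```

The caller would pick `parse_month` or `parse_week` according to the already-resolved mode, and treat `None` as the missing-period case rather than reading a default from the page.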
+
+---
+
+## Parameter Prompting Contract
+
+When deterministic mode matches `tq-lineloss-report` but one or more required parameters are missing or ambiguous, the runtime should return a user-facing prompt rather than executing.
+
+Expected prompting cases include:
+
+- missing company/unit
+- missing month/week mode
+- missing period value
+- ambiguous company alias
+- contradictory period expressions
+
+The prompt should be specific enough to let the user correct only the missing field(s).
+
+Example style:
+
+- `已命中台区线损报表技能,但缺少供电单位,请补充如“兰州公司”或“城关供电分公司”。`
+- `已命中台区线损报表技能,但未识别到月/周类型,请补充“月累计”或“周累计”。`
+
+---
+
+## Skill Package Contract
+
+### SKILL.toml
+
+The new skill package must declare a single deterministic collection entrypoint:
+
+- tool name: `collect_lineloss`
+- kind: `browser_script`
+
+The tool description must reflect the real staged behavior, not a placeholder shell.
+
+### SKILL.md
+
+The written contract should cover:
+
+- when to use the skill
+- when not to use it
+- collection workflow
+- runtime contract
+- explicit missing/partial/error semantics
+- returned artifact contract
+
+### references/collection-flow.md
+
+Must explain:
+
+- the source page state used by the skill
+- how company and period parameters map to business requests
+- which page/API calls are used for month vs week
+- export/report-log sequencing if retained in the business flow
+
+### references/data-quality.md
+
+Must define:
+
+- canonical output columns
+- required field coverage
+- status semantics
+- partial/error conditions
+- company/period normalization assumptions that the script relies on
+
+### scripts/collect_lineloss.js
+
+This is the real browser-side entrypoint. It should:
+
+- accept normalized args
+- validate page context
+- execute deterministic page/API data collection
+- normalize rows
+- perform downstream export/report-history behavior if required
+- directly return the final artifact from the browser-script runtime entrypoint shape
+
+### scripts/collect_lineloss.test.js
+
+Must cover the business transforms that can be tested off-browser, especially:
+
+- company normalization assumptions consumed by the script
+- monthly vs weekly request-shape logic
+- status semantics
+- artifact shaping
+
+---
+
+## Returned Artifact Contract
+
+The final line-loss skill should return one structured artifact object rather than free-form prose.
+
+At minimum it should expose:
+
+- artifact type
+- report name
+- canonical company label/code used for the query
+- period mode and canonical period value used for the query
+- columns
+- rows
+- status
+- counts
+- downstream export/report-log status when applicable
+- clear reasons for blocked/partial/error states
+
+The exact field names may be finalized during implementation planning, but the contract must be structured enough for `claw-new` to interpret success vs partial vs blocked without re-embedding business logic.
+
+---
+
+## Pipe-First / Ws-Ready Execution Seam
+
+### Current requirement
+
+The first implementation on `main` must use the existing pipe-backed browser execution path.
+
+### Future requirement
+
+The design must allow later ws adoption without redesigning the skill or routing contract.
+
+### Practical design rule
+
+Keep these backend-neutral:
+
+- deterministic trigger contract
+- skill matching contract
+- parameter extraction contract
+- parameter normalization contract
+- tool args contract
+- artifact contract
+
+Keep backend-specific code isolated to the execution seam only.
+
+That way the later ws migration can replace the browser backend beneath the same deterministic skill contract.
+
+---
+
+## Caller/Runtime Design Rules
+
+### 1. Keep new business logic out of broad orchestration
+
+Do not thread line-loss-specific business behavior through the general orchestration/runtime path.
+
+### 2. Add a narrow deterministic-routing seam
+
+This slice should add a narrow deterministic branch around submit-task routing, rather than rewriting the whole runtime decision tree.
+
+### 3. Separate extraction from normalization
+
+Do not mix “what the user typed” with “what the backend needs”.
+
+There must be a distinct normalization step.
+
+### 4. Keep the direct-skill browser seam narrow
+
+Reuse the current `browser_script` execution seam instead of inventing a new browser bridge.
+
+### 5. Preserve Zhihu behavior by design, not by hope
+
+The design should assume new deterministic routing can accidentally steal or alter existing Zhihu behavior unless explicitly guarded against.
+
+This is why focused Zhihu regression coverage is mandatory.
+
+---
+
+## Verification Requirements for the Future Implementation Plan
+
+Implementation planning must include explicit verification for:
+
+1. deterministic suffix detection
+2. deterministic lineloss scene matching
+3. company alias normalization to canonical code
+4. support for both company-level and district/county/sub-company-level units
+5. month/week extraction and normalization
+6. missing-parameter prompt behavior
+7. ambiguous-company prompt behavior
+8. pipe-backed browser-script execution for the new skill
+9. no regression to the existing Zhihu hotlist path
+10. preserved direct-skill/browser-script behavior outside the new line-loss scene
+
+---
+
+## Out of Scope for This Slice
+
+- enabling ws execution on `main`
+- replacing the current Zhihu routing model
+- general scene-registry runtime architecture redesign
+- full free-form semantic understanding of arbitrary business language
+- typo-tolerant fuzzy NLP beyond deterministic business-safe matching
+- making page defaults the hidden source of truth when the user omitted parameters
+
+---
+
+## Planning Notes
+
+The implementation plan should likely split into distinct work items for:
+
+1. staged skill package creation and business contract definition
+2. deterministic trigger + scene match in `claw-new`
+3. company/unit normalization and ambiguity handling
+4. period extraction/normalization and ambiguity handling
+5. pipe-backed direct execution integration
+6. returned artifact interpretation
+7. Zhihu regression verification
+8. ws-readiness seam verification
+
+The plan should explicitly keep the “do not break Zhihu hotlist” boundary visible in every execution and verification stage.