木炎 72b79feca9 docs: add tq lineloss design and plan

Add the tq lineloss design spec and implementation plan documents used for the deterministic submit work.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-12 13:12:35 +08:00

TQ Line-Loss Deterministic Skill Design

Goal: Add a staged business skill for 台区线损大数据-月_周累计线损率统计分析 (the transformer-district line-loss report covering monthly and weekly cumulative line-loss rates) and a deterministic natural-language routing path in claw-new that bypasses the LLM when the instruction ends with 。。。. The change must preserve the existing Zhihu hotlist behavior and keep the execution seam pipe-first but ws-ready.

Status: Approved design direction for implementation planning.

Decision Summary

Add a new staged skill package tq-lineloss-report under D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/, following the same packaging discipline as fault-details-report.
In claw-new, add a deterministic submit path triggered only when the instruction ends with the three-Chinese-dot suffix 。。。.
In deterministic mode, route only through a fixed whitelist of staged skills; for this slice the new target is tq-lineloss-report.collect_lineloss.
Deterministic mode must extract business parameters from natural language without using an LLM: company/unit, month-vs-week mode, and period text.
Parsed natural-language parameters are not the final backend parameters. They must be normalized into the canonical codes required by the source page / source APIs (for example company code and period mode code).
If required parameters are missing or ambiguous, the runtime must stop and ask the user to provide them explicitly. It must not silently fall back to page defaults in this slice.
Skill execution must reuse the existing browser-script → pipe injection seam already proven by the Zhihu hotlist path. Do not create a second browser execution protocol.
The design must not regress or weaken the existing Zhihu hotlist direct path, browser-script path, export path, or current routing behavior.
The main branch implementation remains pipe-only, but all new deterministic-routing and skill contracts must stay backend-neutral so the execution backend can later be swapped to ws on the ws branch.

Non-Negotiable Boundaries

1. Do not break the existing Zhihu hotlist flow

This is the top safety boundary for the slice.

The new deterministic routing for tq-lineloss-report must not break, narrow, or silently change:

current Zhihu hotlist routing
current Zhihu direct browser-script execution
current Zhihu export behavior
current browser-script skill loading/execution
existing direct-submit configuration behavior

Design implication:

The new deterministic path must be added as a narrow, explicit branch.
Existing Zhihu logic must keep its current trigger semantics and current execution seam.
Verification for this slice must include targeted Zhihu regression coverage before implementation is considered complete.

2. Current main branch is pipe-only

The implementation landing on main must execute browser-script skills through the current pipe-backed browser execution seam.

Do not introduce ws as an active runtime requirement for this slice.

3. Future ws migration must stay cheap

Although main remains pipe-only, the new work must leave a clean extension seam so that after this slice is merged into ws, the browser backend can be switched without redesigning:

the staged skill package
the deterministic trigger contract
the parameter extraction contract
the parameter normalization contract
the returned artifact contract

Why This Slice Exists

The user wants a staged business skill for 台区线损大数据-月_周累计线损率统计分析 that behaves like a deterministic business operation, not a free-form LLM task.

The desired operator experience is:

ordinary instructions continue to use the current normal routing / LLM path
an instruction ending in 。。。 switches to deterministic business execution
deterministic execution targets a fixed staged skill
business parameters are extracted from the instruction
those parameters are normalized to the real coded values the source page/API needs
the staged browser-script is injected into the third-party browser through the existing pipe seam

This provides an intranet-safe path that works without a model today, while reserving an upgrade path for a future semantic fallback.

Terminology

Deterministic mode

A submit-task mode enabled only when the instruction ends with 。。。.

Natural-language business parameters

Values expressed by the user in text, such as:

兰州公司
天水公司
月累计
周累计
2026-03
2026年第12周

These are intermediate semantic values, not final backend parameters.

Canonical execution parameters

The normalized values required by the source page / source API, such as:

canonical company label
canonical company code
period mode code (month/week)
canonical request period payload

Ownership Boundary and Landing Zones

Staged skill changes

These land in:

D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging

Primary landing zone:

skills/tq-lineloss-report/

Target package structure:

SKILL.md
SKILL.toml
references/collection-flow.md
references/data-quality.md
assets/scene-snapshot/index.html
scripts/collect_lineloss.js
scripts/collect_lineloss.test.js

Potential aligned scene metadata (if included in this slice):

scenes/tq-lineloss-report/scene.json
optional scene registry updates if the current staging conventions require it

Caller/runtime changes

These land in:

D:/data/ideaSpace/rust/sgClaw/claw-new

Likely ownership areas:

deterministic instruction detection and deterministic skill matching
parameter extraction and normalization
deterministic skill dispatch to the existing browser-script seam
narrow result interpretation for the returned artifact
focused regression tests

Design rule:

claw-new owns routing, extraction, normalization, and dispatch.

claw-new must not absorb the line-loss business logic itself.

The staged skill package owns:

page inspection
page-side state reading
page/API data collection
row normalization
export/report-log behavior
final artifact generation

Target Runtime Flow

Step 1: Submit-task enters deterministic mode only on `。。。`

When the user instruction does not end in 。。。:

keep the current runtime behavior unchanged
preserve existing Zhihu hotlist behavior exactly
preserve existing direct-submit and compat/LLM flows

When the instruction does end in 。。。:

enter deterministic mode
do not run the ordinary LLM interpretation branch for this request
evaluate only the deterministic skill whitelist
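The suffix check described in this step can be sketched as follows. This is an illustrative Python sketch, not the claw-new implementation: the function names are hypothetical, and ignoring trailing whitespace before the check is an assumption the design does not state.

```python
DETERMINISTIC_SUFFIX = "。。。"  # three U+3002 ideographic full stops


def split_deterministic(instruction: str) -> tuple[bool, str]:
    """Return (is_deterministic, instruction body without the suffix)."""
    # Assumption: trailing whitespace after the suffix is ignored.
    text = instruction.rstrip()
    if text.endswith(DETERMINISTIC_SUFFIX):
        return True, text[: -len(DETERMINISTIC_SUFFIX)].rstrip()
    # Anything else is returned untouched so the existing path sees it as-is.
    return False, instruction


def route(instruction: str) -> str:
    """Pick the branch; only the suffix decides, never the content."""
    deterministic, _body = split_deterministic(instruction)
    return "deterministic" if deterministic else "normal"
```

An ASCII ellipsis (`...`) does not trigger the branch in this sketch, which keeps ordinary instructions, including Zhihu hotlist requests, on the current path.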

Step 2: Deterministic whitelist match

The runtime should match the instruction against deterministic business scenes.

For this slice the new required deterministic scene is:

tq-lineloss-report.collect_lineloss

The matching layer should remain narrow and explicit. It should not become a general scene-registry runtime in this slice.

Matching should use a deterministic combination of:

instruction keywords
optional page URL/title constraints when available

The runtime must not accidentally steal instructions that should still go down the Zhihu path.
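A narrow whitelist matcher along these lines might look like the sketch below. The keyword list, the `DeterministicScene` type, and the `match_scene` function are hypothetical; only the tool identifier comes from this design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeterministicScene:
    tool: str                        # "<skill>.<tool>" identifier
    keywords: tuple[str, ...]        # at least one keyword must appear
    url_hint: Optional[str] = None   # optional page-URL substring constraint


# Hypothetical keywords; the real list should come from the scene definition.
WHITELIST = (
    DeterministicScene(
        tool="tq-lineloss-report.collect_lineloss",
        keywords=("线损",),
    ),
)


def match_scene(body: str, page_url: Optional[str] = None) -> Optional[DeterministicScene]:
    """Return the single whitelisted scene matching the instruction body."""
    hits = [
        scene
        for scene in WHITELIST
        if any(keyword in body for keyword in scene.keywords)
        # The URL constraint applies only when both sides are available.
        and (scene.url_hint is None or page_url is None or scene.url_hint in page_url)
    ]
    # Zero hits or several hits both count as "no match"; never guess.
    return hits[0] if len(hits) == 1 else None
```

Because the matcher only runs after the suffix check and only knows the whitelisted scenes, an instruction that mentions no line-loss keyword returns `None` rather than being claimed by this branch.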

Step 3: Extract semantic business parameters from natural language

After tq-lineloss-report is matched, the runtime extracts semantic business parameters from the instruction.

Required semantic categories:

company/unit expression
period mode (month vs week)
period text/value

Examples of accepted user-facing expressions include:

兰州公司
天水公司
国网兰州供电公司
城关供电分公司
2026-03
2026年3月
2026年第12周
第12周
月累计
周累计

Step 4: Normalize semantic values into canonical coded values

This is a required separate design step.

The runtime must not pass raw natural-language company text directly to the business request layer.

Instead it must normalize semantic values into canonical execution parameters, including:

org_label — canonical unit label
org_code — the actual code/value required by the business page/API
period_mode — canonical mode (month or week)
period_mode_code — the page/API code (for example timeChage-style encoded mode)
canonical time payload required by the source APIs/page state

This normalization should be derived from the actual source materials, including page-side dictionaries such as the existing unit tree data.
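To make the two layers concrete, the sketch below separates what the user typed from what the request layer needs. All codes, the alias table, and the type names are placeholders invented for illustration; real values must come from the source page's unit tree and request parameters.

```python
from dataclasses import dataclass

# Placeholder dictionaries, for illustration only.
ALIASES = {"兰州公司": "国网兰州供电公司"}
UNIT_CODES = {"国网兰州供电公司": "ORG-EXAMPLE-01"}
MODE_CODES = {"month": "MODE-EXAMPLE-M", "week": "MODE-EXAMPLE-W"}


@dataclass(frozen=True)
class SemanticParams:
    """What the user typed, after extraction."""
    org_text: str
    period_mode: str    # "month" or "week"
    period_value: str


@dataclass(frozen=True)
class CanonicalParams:
    """What the staged script and source API need."""
    org_label: str
    org_code: str
    period_mode: str
    period_mode_code: str
    period_value: str


def normalize(sem: SemanticParams) -> CanonicalParams:
    """Map semantic values to coded values, failing loudly on unknowns."""
    label = ALIASES.get(sem.org_text, sem.org_text)
    if label not in UNIT_CODES:
        raise ValueError(f"unknown unit: {sem.org_text}")
    if sem.period_mode not in MODE_CODES:
        raise ValueError(f"unknown period mode: {sem.period_mode}")
    return CanonicalParams(
        org_label=label,
        org_code=UNIT_CODES[label],
        period_mode=sem.period_mode,
        period_mode_code=MODE_CODES[sem.period_mode],
        period_value=sem.period_value,
    )
```

Keeping both the label and the code on the canonical object lets the caller show the resolved unit to the user while sending only the code downstream.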

Step 5: Missing and ambiguous parameters must stop execution

This slice must not silently infer missing parameters from page defaults.

If a required parameter is missing, execution must stop with an explicit prompt to the user.

If a parameter is ambiguous, execution must stop with an explicit ambiguity prompt.

Examples:

no company matched
no month/week mode matched
no period value matched when required
a short company alias matches multiple canonical units
both monthly and weekly intent appear in the same instruction

This is preferable to silently using the wrong company code or the wrong query period.

Step 6: Execute the staged skill through the existing pipe seam

If and only if parameters are present and successfully normalized:

resolve tq-lineloss-report.collect_lineloss
build the args object
execute it through the current browser_script runtime
inject the script into the browser through the existing pipe-backed browser tool seam

This slice must reuse the execution pattern already proven by the current browser-script/direct-skill infrastructure and the current Zhihu hotlist path.

Do not introduce a second browser protocol, new browser opcode family, or parallel execution harness.
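The dispatch step can be pictured as a thin wrapper around whatever callable the existing browser_script runtime already exposes. The `BrowserScriptRunner` signature and the args field names below are assumptions, since this document does not specify them.

```python
import json
from typing import Any, Callable

# Stand-in for the existing browser_script runtime entry; signature assumed.
BrowserScriptRunner = Callable[[str, dict[str, Any]], dict[str, Any]]

TOOL = "tq-lineloss-report.collect_lineloss"


def build_args(org_label: str, org_code: str, period_mode: str,
               period_mode_code: str, period_value: str) -> dict[str, Any]:
    """Assemble the args object from already-normalized values."""
    return {
        "org_label": org_label,
        "org_code": org_code,
        "period_mode": period_mode,
        "period_mode_code": period_mode_code,
        "period_value": period_value,
    }


def dispatch(run: BrowserScriptRunner, args: dict[str, Any]) -> dict[str, Any]:
    """Hand the args to the existing seam, refusing incomplete input."""
    missing = [key for key, value in args.items() if value in (None, "")]
    if missing:
        raise ValueError(f"refusing to dispatch, unnormalized fields: {missing}")
    json.dumps(args)  # raises TypeError if args cannot be serialized
    return run(TOOL, args)
```

The wrapper adds no transport of its own: it only checks the args and calls the runner it is given, which is the point of reusing the existing seam.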

Step 7: Skill JS performs page-side work and returns one artifact

The staged script owns the actual line-loss business behavior:

reading page-side state when needed
validating the page context
using normalized codes/parameters from args
building source API requests
collecting/normalizing rows
export/report logging behavior if required by the final business contract
returning a structured artifact

Deterministic Trigger Contract

Trigger rule

Deterministic mode is activated only when the raw instruction ends with exactly three consecutive Chinese full stops (。, U+3002):

。。。

This suffix is a user-controlled explicit mode switch.

Why the suffix exists

It lets the user force business-deterministic behavior without relying on a model, while preserving the normal LLM path for ordinary requests.

Scope rule

The suffix is not a free pass to run arbitrary browser actions.

It only selects among the deterministic skill whitelist.

If no deterministic scene matches, the runtime should return a deterministic-mode mismatch error that explains the currently supported deterministic scenes, rather than silently dropping into another behavior.
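A mismatch error that lists the supported scenes could be as small as the following sketch. The exception name and message wording are hypothetical.

```python
# Tool identifier mapped to the report name shown to the user.
SUPPORTED_SCENES = {
    "tq-lineloss-report.collect_lineloss": "台区线损大数据-月_周累计线损率统计分析",
}


class DeterministicMismatch(Exception):
    """Raised when a deterministic instruction matches no whitelisted scene."""


def mismatch_error() -> DeterministicMismatch:
    """Build an error that names every currently supported scene."""
    listing = "; ".join(f"{tool} ({name})" for tool, name in SUPPORTED_SCENES.items())
    return DeterministicMismatch(
        f"No deterministic scene matched. Supported scenes: {listing}"
    )
```

Returning this error instead of falling through keeps the suffix from silently routing into the LLM path or any other behavior.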

Company / Unit Matching Contract

Accepted input style

The user does not need to type the exact full canonical label.

The runtime should support business shorthand such as:

兰州公司
天水公司
白银公司
城关供电分公司
榆中县供电公司

Matching approach

Do not use regex alone as the primary company-resolution mechanism.

Use a three-stage resolution strategy:

text normalization
alias/candidate generation from canonical unit names
uniqueness resolution against the real unit dictionary

Normalization examples

Canonical names such as:

国网兰州供电公司
国网天水供电公司
国网榆中县供电公司

should be matchable from business shorthand forms such as:

兰州公司
天水公司
榆中县公司
榆中供电公司

Data source for canonical mapping

The company/unit resolver should derive canonical mappings from the real source materials used by the business page, such as the current unit tree dictionary embedded in the source page resources.

Design implication:

the resolver should produce the real value/code required downstream
the resolver should also keep the canonical label for display/auditability

Ambiguity rule

If a short alias resolves to more than one valid unit, execution must stop and ask the user to be more specific.

Do not auto-guess.
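The three-stage strategy and the ambiguity rule can be sketched as below. The prefix and suffix lists, the trailing 县/区/市 handling, and the function names are assumptions chosen to cover the examples in this section; the unit dictionary passed in stands for the real unit tree.

```python
import re

PREFIXES = ("国网",)
SUFFIXES = ("供电分公司", "供电公司", "分公司", "公司")  # longest first
LEVEL_MARKS = ("县", "区", "市")


class UnknownUnit(LookupError):
    pass


class AmbiguousUnit(LookupError):
    pass


def core(name: str) -> str:
    """Stage 1: strip whitespace, the 国网 prefix, and one company suffix."""
    text = re.sub(r"\s+", "", name)
    for prefix in PREFIXES:
        if text.startswith(prefix):
            text = text[len(prefix):]
    for suffix in SUFFIXES:
        if text.endswith(suffix):
            text = text[: -len(suffix)]
            break
    return text


def aliases(canonical: str) -> set[str]:
    """Stage 2: candidate short forms derived from a canonical name."""
    base = core(canonical)
    out = {base}
    if len(base) > 2 and base.endswith(LEVEL_MARKS):
        out.add(base[:-1])  # e.g. 榆中县 also answers to 榆中
    return out


def resolve(user_text: str, units: dict[str, str]) -> tuple[str, str]:
    """Stage 3: require exactly one canonical unit; return (label, code)."""
    wanted = core(user_text)
    hits = [label for label in units if wanted in aliases(label)]
    if not hits:
        raise UnknownUnit(user_text)
    if len(hits) > 1:
        raise AmbiguousUnit(", ".join(hits))
    return hits[0], units[hits[0]]
```

The resolver never picks among several hits; the `AmbiguousUnit` message carries the candidate labels so the caller can ask the user to choose.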

Supported granularity

The first implementation must support both:

city-company level
district/county/sub-company level

This includes forms like:

兰州公司
天水公司
城关供电分公司
榆中县供电公司

Period Extraction and Normalization Contract

Required period dimensions

The runtime must identify:

mode: month or week
actual requested period value in a canonical form

Accepted user-facing patterns

At minimum the design should account for patterns such as:

月累计
周累计
2026-03
2026年3月
2026年第12周
第12周

Normalization output

The resolver should produce:

a canonical mode enum/string
a mode code required by the page/API
a canonical period payload consumable by the script/business request layer

Ambiguity rule

If both month and week intent appear, stop and ask the user to clarify.

Missing-period rule

If the selected line-loss query requires a time period and the instruction does not provide enough information to construct one, stop and ask the user to provide it.

Do not default to the page-selected period in this slice.
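The period rules above can be illustrated with a small extractor. The sketch assumes that mode is taken only from the 月累计 / 周累计 keywords and that the period value must agree with it; the reason codes and the `Period` shape are hypothetical, and the final API payload format is left open.

```python
import re
from dataclasses import dataclass
from typing import Optional

MONTH_DASH = re.compile(r"(\d{4})-(\d{1,2})(?!\d)")   # 2026-03
MONTH_CN = re.compile(r"(\d{4})年(\d{1,2})月")         # 2026年3月
WEEK_CN = re.compile(r"(?:(\d{4})年)?第(\d{1,2})周")   # 2026年第12周, 第12周


@dataclass(frozen=True)
class Period:
    mode: str             # "month" or "week"
    year: Optional[int]   # None when only "第12周" was given
    index: int            # month number or week number


class PeriodError(ValueError):
    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


def extract_period(text: str) -> Period:
    """Extract mode and period, stopping on any gap or contradiction."""
    wants_month = "月累计" in text
    wants_week = "周累计" in text
    if wants_month and wants_week:
        raise PeriodError("ambiguous_mode")
    if not (wants_month or wants_week):
        raise PeriodError("missing_mode")
    month = MONTH_DASH.search(text) or MONTH_CN.search(text)
    week = WEEK_CN.search(text)
    chosen, other = (month, week) if wants_month else (week, month)
    if other is not None:
        raise PeriodError("contradictory_period")
    if chosen is None:
        raise PeriodError("missing_period")
    year = int(chosen.group(1)) if chosen.group(1) else None
    index = int(chosen.group(2))
    if not 1 <= index <= (12 if wants_month else 53):
        raise PeriodError("invalid_period")
    return Period("month" if wants_month else "week", year, index)
```

A year-less week such as 第12周 comes back with `year=None`; whether the caller prompts for the year is a decision this sketch leaves to the implementation, since the page default must not be used.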

Parameter Prompting Contract

When deterministic mode matches tq-lineloss-report but one or more required parameters are missing or ambiguous, the runtime should return a user-facing prompt rather than executing.

Expected prompting cases include:

missing company/unit
missing month/week mode
missing period value
ambiguous company alias
contradictory period expressions

The prompt should be specific enough to let the user correct only the missing field(s).

Example style:

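已命中台区线损报表技能，但缺少供电单位，请补充如“兰州公司”或“城关供电分公司”。 (Matched the TQ line-loss report skill, but the power supply unit is missing; please add one, such as “兰州公司” or “城关供电分公司”.)
已命中台区线损报表技能，但未识别到月/周类型，请补充“月累计”或“周累计”。 (Matched the TQ line-loss report skill, but no month/week type was recognized; please add “月累计” or “周累计”.)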
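One way to keep prompts field-specific is a table keyed by issue code, as in this sketch. The first two messages are the examples from this section; the remaining wording, the issue codes, and the `build_prompt` function are hypothetical.

```python
PROMPTS = {
    "missing_org": "已命中台区线损报表技能，但缺少供电单位，请补充如“兰州公司”或“城关供电分公司”。",
    "missing_mode": "已命中台区线损报表技能，但未识别到月/周类型，请补充“月累计”或“周累计”。",
    # Hypothetical wording for the remaining cases.
    "missing_period": "已命中台区线损报表技能，但缺少统计周期，请补充如“2026-03”或“2026年第12周”。",
    "contradictory_period": "已命中台区线损报表技能，但月/周类型与统计周期不一致，请明确其中一种。",
}
AMBIGUOUS_ORG = "已命中台区线损报表技能，但供电单位对应多个候选：{options}，请补充完整名称。"


def build_prompt(issues: list[str], candidates: tuple[str, ...] = ()) -> str:
    """Render one line per issue so the user can fix only what is missing."""
    lines = []
    for issue in issues:
        if issue == "ambiguous_org":
            lines.append(AMBIGUOUS_ORG.format(options="、".join(candidates)))
        else:
            lines.append(PROMPTS[issue])
    return "\n".join(lines)
```

Reporting every open issue in one reply avoids a round trip per missing field.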

Skill Package Contract

SKILL.toml

The new skill package must declare a single deterministic collection entrypoint:

tool name: collect_lineloss
kind: browser_script

The tool description must reflect the real staged behavior, not a placeholder shell.

SKILL.md

The written contract should cover:

when to use the skill
when not to use it
collection workflow
runtime contract
explicit missing/partial/error semantics
returned artifact contract

references/collection-flow.md

Must explain:

the source page state used by the skill
how company and period parameters map to business requests
which page/API calls are used for month vs week
export/report-log sequencing if retained in the business flow

references/data-quality.md

Must define:

canonical output columns
required field coverage
status semantics
partial/error conditions
company/period normalization assumptions that the script relies on

scripts/collect_lineloss.js

This is the real browser-side entrypoint. It should:

accept normalized args
validate page context
execute deterministic page/API data collection
normalize rows
perform downstream export/report-history behavior if required
return the final artifact directly, in the shape the browser-script runtime entrypoint expects

scripts/collect_lineloss.test.js

Must cover the business transforms that can be tested off-browser, especially:

company normalization assumptions consumed by the script
monthly vs weekly request-shape logic
status semantics
artifact shaping

Returned Artifact Contract

The final line-loss skill should return one structured artifact object rather than free-form prose.

At minimum it should expose:

artifact type
report name
canonical company label/code used for the query
period mode and canonical period value used for the query
columns
rows
status
counts
downstream export/report-log status when applicable
clear reasons for blocked/partial/error states

The exact field names may be finalized during implementation planning, but the contract must be structured enough for claw-new to interpret success vs partial vs blocked without re-embedding business logic.
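As a starting point for that planning discussion, the minimum fields could be modeled as below. Every field name and the status vocabulary (`ok`, `partial`, `blocked`, `error`) are assumptions, not the finalized contract.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

STATUSES = ("ok", "partial", "blocked", "error")  # assumed vocabulary


@dataclass
class LineLossArtifact:
    artifact_type: str
    report_name: str
    org_label: str
    org_code: str
    period_mode: str
    period_value: str
    columns: list[str]
    rows: list[dict[str, Any]]
    status: str
    counts: dict[str, int] = field(default_factory=dict)
    export_status: Optional[str] = None
    reasons: list[str] = field(default_factory=list)


def parse_artifact(raw: dict[str, Any]) -> LineLossArtifact:
    """Validate the structure only; no line-loss business rules live here."""
    artifact = LineLossArtifact(**raw)  # TypeError on missing or unknown keys
    if artifact.status not in STATUSES:
        raise ValueError(f"unknown status: {artifact.status}")
    if artifact.status != "ok" and not artifact.reasons:
        raise ValueError("a non-ok artifact must carry at least one reason")
    return artifact
```

With this shape the caller can branch on `status` and surface `reasons` without knowing how the rows were collected.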

Pipe-First / Ws-Ready Execution Seam

Current requirement

The first implementation on main must use the existing pipe-backed browser execution path.

Future requirement

The design must allow later ws adoption without redesigning the skill or routing contract.

Practical design rule

Keep these backend-neutral:

deterministic trigger contract
skill matching contract
parameter extraction contract
parameter normalization contract
tool args contract
artifact contract

Keep backend-specific code isolated to the execution seam only.

That way the later ws migration can replace the browser backend beneath the same deterministic skill contract.
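The isolation rule can be pictured as a single interface with the pipe backend as today's only implementation. The `BrowserBackend` protocol, the `PipeBackend` wrapper, and the message shape are hypothetical stand-ins; the real pipe seam is not described in this document.

```python
from typing import Any, Callable, Protocol


class BrowserBackend(Protocol):
    """The only surface the deterministic path is allowed to depend on."""

    def run_script(self, script: str, args: dict[str, Any]) -> dict[str, Any]: ...


class PipeBackend:
    """Placeholder wrapper around the existing pipe-backed seam."""

    def __init__(self, send: Callable[[dict[str, Any]], dict[str, Any]]):
        self._send = send  # assumed: a callable that performs the pipe round trip

    def run_script(self, script: str, args: dict[str, Any]) -> dict[str, Any]:
        # Hypothetical message shape, for illustration only.
        return self._send({"script": script, "args": args})


def run_deterministic_skill(backend: BrowserBackend, script: str,
                            args: dict[str, Any]) -> dict[str, Any]:
    """Backend-neutral entry: nothing here names pipe or ws."""
    return backend.run_script(script, args)
```

A later ws backend would implement the same `run_script` method, leaving the trigger, matching, extraction, normalization, args, and artifact contracts untouched.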

Caller/Runtime Design Rules

1. Keep new business logic out of broad orchestration

Do not thread line-loss-specific business behavior through the general orchestration/runtime path.

2. Add a narrow deterministic-routing seam

This slice should add a narrow deterministic branch around submit-task routing, rather than rewriting the whole runtime decision tree.

3. Separate extraction from normalization

Do not mix “what the user typed” with “what the backend needs”.

There must be a distinct normalization step.

4. Keep the direct-skill browser seam narrow

Reuse the current browser_script execution seam instead of inventing a new browser bridge.

5. Preserve Zhihu behavior by design, not by hope

The design should assume new deterministic routing can accidentally steal or alter existing Zhihu behavior unless explicitly guarded against.

This is why focused Zhihu regression coverage is mandatory.

Verification Requirements for the Future Implementation Plan

Implementation planning must include explicit verification for:

deterministic suffix detection
deterministic lineloss scene matching
company alias normalization to canonical code
support for both company-level and district/county/sub-company-level units
month/week extraction and normalization
missing-parameter prompt behavior
ambiguous-company prompt behavior
pipe-backed browser-script execution for the new skill
no regression to the existing Zhihu hotlist path
preserved direct-skill/browser-script behavior outside the new line-loss scene

Out of Scope for This Slice

enabling ws execution on main
replacing the current Zhihu routing model
general scene-registry runtime architecture redesign
full free-form semantic understanding of arbitrary business language
typo-tolerant fuzzy NLP beyond deterministic business-safe matching
making page defaults the hidden source of truth when the user omitted parameters

Planning Notes

The implementation plan should likely split into distinct work items for:

staged skill package creation and business contract definition
deterministic trigger + scene match in claw-new
company/unit normalization and ambiguity handling
period extraction/normalization and ambiguity handling
pipe-backed direct execution integration
returned artifact interpretation
Zhihu regression verification
ws-readiness seam verification

The plan should explicitly keep the “do not break Zhihu hotlist” boundary visible in every execution and verification stage.
