Add the tq lineloss design spec and implementation plan documents used for the deterministic submit work. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
TQ Line-Loss Deterministic Skill Design
Goal: Add a staged business skill for 台区线损大数据-月_周累计线损率统计分析 (the transformer-district line-loss report covering monthly and weekly cumulative line-loss rates) and a deterministic natural-language routing path in claw-new that can bypass the LLM when the instruction ends with 。。。, while preserving the existing Zhihu hotlist behavior and keeping the execution seam pipe-first but ws-ready.
Status: Approved design direction for implementation planning.
Decision Summary
- Add a new staged skill package tq-lineloss-report under D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/, following the same packaging discipline as fault-details-report.
- In claw-new, add a deterministic submit path triggered only when the instruction ends with the three-Chinese-dot suffix 。。。.
- In deterministic mode, route only through a fixed whitelist of staged skills; for this slice the new target is tq-lineloss-report.collect_lineloss.
- Deterministic mode must extract business parameters from natural language without using an LLM: company/unit, month-vs-week mode, and period text.
- Parsed natural-language parameters are not the final backend parameters. They must be normalized into the canonical codes required by the source page / source APIs (for example company code and period mode code).
- If required parameters are missing or ambiguous, the runtime must stop and ask the user to provide them explicitly. It must not silently fall back to page defaults in this slice.
- Skill execution must reuse the existing browser-script → pipe injection seam already proven by the Zhihu hotlist path. Do not create a second browser execution protocol.
- The design must not regress or weaken the existing Zhihu hotlist direct path, browser-script path, export path, or current routing behavior.
- The main branch implementation remains pipe-only, but all new deterministic-routing and skill contracts must stay backend-neutral so the execution backend can later be swapped to ws on the ws branch.
Non-Negotiable Boundaries
1. Do not break the existing Zhihu hotlist flow
This is the top safety boundary for the slice.
The new deterministic routing for tq-lineloss-report must not break, narrow, or silently change:
- current Zhihu hotlist routing
- current Zhihu direct browser-script execution
- current Zhihu export behavior
- current browser-script skill loading/execution
- existing direct-submit configuration behavior
Design implication:
- The new deterministic path must be added as a narrow, explicit branch.
- Existing Zhihu logic must keep its current trigger semantics and current execution seam.
- Verification for this slice must include targeted Zhihu regression coverage before implementation is considered complete.
2. Current main branch is pipe-only
The implementation landing on main must execute browser-script skills through the current pipe-backed browser execution seam.
Do not introduce ws as an active runtime requirement for this slice.
3. Future ws migration must stay cheap
Although main remains pipe-only, the new work must leave a clean extension seam so that after this slice is merged into ws, the browser backend can be switched without redesigning:
- the staged skill package
- the deterministic trigger contract
- the parameter extraction contract
- the parameter normalization contract
- the returned artifact contract
Why This Slice Exists
The user wants a staged business skill for 台区线损大数据-月_周累计线损率统计分析 that behaves like a deterministic business operation, not a free-form LLM task.
The desired operator experience is:
- ordinary instructions continue to use the current normal routing / LLM path
- an instruction ending in 。。。 switches to deterministic business execution
- deterministic execution targets a fixed staged skill
- business parameters are extracted from the instruction
- those parameters are normalized to the real coded values the source page/API needs
- the staged browser-script is injected into the third-party browser through the existing pipe seam
This provides a path that is safe for the internal network and works without a model today, while leaving room for a future semantic fallback.
Terminology
Deterministic mode
A submit-task mode enabled only when the instruction ends with 。。。.
Natural-language business parameters
Values expressed by the user in text, such as:
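- 兰州公司 (Lanzhou company)
- 天水公司 (Tianshui company)
- 月累计 (monthly cumulative)
- 周累计 (weekly cumulative)
- 2026-03
- 2026年第12周 (week 12 of 2026)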
These are intermediate semantic values, not final backend parameters.
Canonical execution parameters
The normalized values required by the source page / source API, such as:
- canonical company label
- canonical company code
- period mode code (month/week)
- canonical request period payload
Ownership Boundary and Landing Zones
Staged skill changes
These land in:
D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging
Primary landing zone:
skills/tq-lineloss-report/
Target package structure:
- SKILL.md
- SKILL.toml
- references/collection-flow.md
- references/data-quality.md
- assets/scene-snapshot/index.html
- scripts/collect_lineloss.js
- scripts/collect_lineloss.test.js
Potential aligned scene metadata (if included in this slice):
- scenes/tq-lineloss-report/scene.json
- optional scene registry updates if the current staging conventions require it
Caller/runtime changes
These land in:
D:/data/ideaSpace/rust/sgClaw/claw-new
Likely ownership areas:
- deterministic instruction detection and deterministic skill matching
- parameter extraction and normalization
- deterministic skill dispatch to the existing browser-script seam
- narrow result interpretation for the returned artifact
- focused regression tests
Design rule:
claw-new owns routing, extraction, normalization, and dispatch.
claw-new must not absorb the line-loss business logic itself.
The staged skill package owns:
- page inspection
- page-side state reading
- page/API data collection
- row normalization
- export/report-log behavior
- final artifact generation
Target Runtime Flow
Step 1: Submit-task enters deterministic mode only on 。。。
When the user instruction does not end in 。。。:
- keep the current runtime behavior unchanged
- preserve existing Zhihu hotlist behavior exactly
- preserve existing direct-submit and compat/LLM flows
When the instruction does end in 。。。:
- enter deterministic mode
- do not run the ordinary LLM interpretation branch for this request
- evaluate only the deterministic skill whitelist
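A minimal sketch of the suffix check described above, written in Python purely for illustration (the real check belongs in claw-new). The names DETERMINISTIC_SUFFIX, is_deterministic, and strip_suffix are hypothetical. The sketch assumes the comparison runs against the raw instruction with no trimming, which is one reading of "ends with".

```python
# Three ideographic full stops (U+3002), the explicit mode switch from the spec.
DETERMINISTIC_SUFFIX = "。。。"


def is_deterministic(instruction: str) -> bool:
    """Return True only when the raw instruction ends with the exact suffix."""
    return instruction.endswith(DETERMINISTIC_SUFFIX)


def strip_suffix(instruction: str) -> str:
    """Drop the mode-switch suffix so later stages see only the business text."""
    if is_deterministic(instruction):
        return instruction[: -len(DETERMINISTIC_SUFFIX)]
    return instruction
```

Because the check is a plain suffix comparison, an instruction ending in a single 。 or in ASCII dots stays on the ordinary path.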
Step 2: Deterministic whitelist match
The runtime should match the instruction against deterministic business scenes.
For this slice the new required deterministic scene is:
tq-lineloss-report.collect_lineloss
The matching layer should remain narrow and explicit. It should not become a general scene-registry runtime in this slice.
Matching should use a deterministic combination of:
- instruction keywords
- optional page URL/title constraints when available
The runtime must not accidentally steal instructions that should still go down the Zhihu path.
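The matching rule above could be sketched as follows. This is an illustration in Python, not the claw-new implementation. The keyword 线损 and the url_contains field are assumptions made for the example; the spec does not fix the keyword set or the page URL.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DeterministicScene:
    tool: str                           # fully qualified tool name
    keywords: Tuple[str, ...]           # every keyword must appear in the instruction
    url_contains: Optional[str] = None  # optional page constraint, unset here


# Hypothetical whitelist with the single scene required by this slice.
WHITELIST = (
    DeterministicScene(
        tool="tq-lineloss-report.collect_lineloss",
        keywords=("线损",),  # assumed keyword, not taken from the spec
    ),
)


def match_scene(instruction, page_url=None, whitelist=WHITELIST):
    """Return the single matching scene, or None when zero or several match."""
    hits = []
    for scene in whitelist:
        if not all(word in instruction for word in scene.keywords):
            continue
        # The URL constraint applies only when both sides are available.
        if scene.url_contains and page_url and scene.url_contains not in page_url:
            continue
        hits.append(scene)
    return hits[0] if len(hits) == 1 else None
```

Requiring every keyword keeps the match narrow: a Zhihu instruction that happens to carry the suffix produces no hit instead of being routed to the line-loss skill.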
Step 3: Extract semantic business parameters from natural language
After tq-lineloss-report is matched, the runtime extracts semantic business parameters from the instruction.
Required semantic categories:
- company/unit expression
- period mode (month vs week)
- period text/value
Examples of accepted user-facing expressions include:
- 兰州公司
- 天水公司
- 国网兰州供电公司
- 城关供电分公司
- 2026-03
- 2026年3月
- 2026年第12周
- 第12周
- 月累计
- 周累计
Step 4: Normalize semantic values into canonical coded values
This is a required separate design step.
The runtime must not pass raw natural-language company text directly to the business request layer.
Instead it must normalize semantic values into canonical execution parameters, including:
- org_label: canonical unit label
- org_code: the actual code/value required by the business page/API
- period_mode: canonical mode (month or week)
- period_mode_code: the page/API code (for example timeChage-style encoded mode)
- canonical time payload required by the source APIs/page state
This normalization should be derived from the actual source materials, including page-side dictionaries such as the existing unit tree data.
Step 5: Missing and ambiguous parameters must stop execution
This slice must not silently infer missing parameters from page defaults.
If a required parameter is missing, execution must stop with an explicit prompt to the user.
If a parameter is ambiguous, execution must stop with an explicit ambiguity prompt.
Examples:
- no company matched
- no month/week mode matched
- no period value matched when required
- a short company alias matches multiple canonical units
- both monthly and weekly intent appear in the same instruction
This is preferable to silently using the wrong company code or the wrong query period.
Step 6: Execute the staged skill through the existing pipe seam
If and only if parameters are present and successfully normalized:
- resolve tq-lineloss-report.collect_lineloss
- build the args object
- execute it through the current browser_script runtime
- inject the script into the browser through the existing pipe-backed browser tool seam
This slice must reuse the execution pattern already proven by the current browser-script/direct-skill infrastructure and the current Zhihu hotlist path.
Do not introduce a second browser protocol, new browser opcode family, or parallel execution harness.
Step 7: Skill JS performs page-side work and returns one artifact
The staged script owns the actual line-loss business behavior:
- reading page-side state when needed
- validating the page context
- using normalized codes/parameters from args
- building source API requests
- collecting/normalizing rows
- export/report logging behavior if required by the final business contract
- returning a structured artifact
Deterministic Trigger Contract
Trigger rule
Deterministic mode is activated only when the raw instruction ends with the exact three-Chinese-dot suffix:
。。。
This suffix is a user-controlled explicit mode switch.
Why the suffix exists
It lets the user force business-deterministic behavior without relying on a model, while preserving the normal LLM path for ordinary requests.
Scope rule
The suffix is not a free pass to run arbitrary browser actions.
It only selects among the deterministic skill whitelist.
If no deterministic scene matches, the runtime should return a deterministic-mode mismatch error that explains the currently supported deterministic scenes, rather than silently dropping into another behavior.
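One possible shape for the scope rule, sketched in Python under stated assumptions: route, DeterministicMismatch, and SUPPORTED_TOOLS are hypothetical names, and the matcher is passed in as a callable so the sketch stays independent of how matching is implemented.

```python
from typing import Callable, Optional

DETERMINISTIC_SUFFIX = "。。。"
SUPPORTED_TOOLS = ("tq-lineloss-report.collect_lineloss",)


class DeterministicMismatch(Exception):
    """Deterministic mode was requested but no whitelisted scene matched."""


def route(instruction: str, match: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return a whitelisted tool name, or None to keep the ordinary path."""
    if not instruction.endswith(DETERMINISTIC_SUFFIX):
        return None  # ordinary routing / LLM path stays untouched
    tool = match(instruction)
    if tool not in SUPPORTED_TOOLS:
        # Fail loudly and list what is supported instead of falling through.
        raise DeterministicMismatch(
            "No deterministic scene matched. Supported scenes: "
            + ", ".join(SUPPORTED_TOOLS)
        )
    return tool
```

Raising instead of returning None in the mismatch case keeps the two outcomes distinct: None always means "not deterministic mode", never "deterministic mode that found nothing".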
Company / Unit Matching Contract
Accepted input style
The user does not need to type the exact full canonical label.
The runtime should support business shorthand such as:
- 兰州公司
- 天水公司
- 白银公司
- 城关供电分公司
- 榆中县供电公司
Matching approach
Do not use regex alone as the primary company-resolution mechanism.
Use a three-stage resolution strategy:
- text normalization
- alias/candidate generation from canonical unit names
- uniqueness resolution against the real unit dictionary
Normalization examples
Canonical names such as:
- 国网兰州供电公司
- 国网天水供电公司
- 国网榆中县供电公司
should be matchable from business shorthand forms such as:
- 兰州公司
- 天水公司
- 榆中县公司
- 榆中供电公司
Data source for canonical mapping
The company/unit resolver should derive canonical mappings from the real source materials used by the business page, such as the current unit tree dictionary embedded in the source page resources.
Design implication:
- the resolver should produce the real value/code required downstream
- the resolver should also keep the canonical label for display/auditability
Ambiguity rule
If a short alias resolves to more than one valid unit, execution must stop and ask the user to be more specific.
Do not auto-guess.
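The three-stage strategy and the ambiguity rule can be illustrated with a short Python sketch. Everything here is an assumption made for the example: the codes in UNIT_DICTIONARY are placeholders (the real values come from the page's unit tree), and the alias rules cover only the shorthand forms listed in this spec.

```python
import unicodedata
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class ResolvedUnit:
    org_label: str
    org_code: str


class UnitResolutionError(ValueError):
    """args[0] is 'missing_unit' or 'ambiguous_unit'."""


# Hypothetical excerpt: labels are the spec's examples, codes are placeholders.
UNIT_DICTIONARY: Dict[str, str] = {
    "国网兰州供电公司": "ORG-0001",
    "国网天水供电公司": "ORG-0002",
    "国网榆中县供电公司": "ORG-0003",
}


def normalize_text(text: str) -> str:
    """Stage 1: fold full-width forms (NFKC) and drop all whitespace."""
    return "".join(unicodedata.normalize("NFKC", text).split())


def aliases(label: str) -> Set[str]:
    """Stage 2: derive shorthand candidates from one canonical label."""
    core = label[2:] if label.startswith("国网") else label
    out = {label, core, core.replace("供电", "")}
    if "县" in core:
        out.add(core.replace("县", "", 1))  # e.g. 榆中县供电公司 -> 榆中供电公司
    return out


def resolve_unit(text: str, dictionary: Dict[str, str]) -> ResolvedUnit:
    """Stage 3: accept only a unique hit; never guess between candidates."""
    cleaned = normalize_text(text)
    hits = sorted(
        label for label in dictionary
        if any(alias in cleaned for alias in aliases(label))
    )
    if not hits:
        raise UnitResolutionError("missing_unit")
    if len(hits) > 1:
        raise UnitResolutionError("ambiguous_unit")
    return ResolvedUnit(hits[0], dictionary[hits[0]])
```

With this excerpt, 兰州公司 resolves to 国网兰州供电公司, and both 榆中县公司 and 榆中供电公司 resolve to 国网榆中县供电公司. An alias shared by two dictionary entries raises ambiguous_unit instead of picking one.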
Supported granularity
The first implementation must support both:
- city-company level
- district/county/sub-company level
This includes forms like:
- 兰州公司
- 天水公司
- 城关供电分公司
- 榆中县供电公司
Period Extraction and Normalization Contract
Required period dimensions
The runtime must identify:
- mode: month or week
- actual requested period value in a canonical form
Accepted user-facing patterns
At minimum the design should account for patterns such as:
- 月累计
- 周累计
- 2026-03
- 2026年3月
- 2026年第12周
- 第12周
Normalization output
The resolver should produce:
- a canonical mode enum/string
- a mode code required by the page/API
- a canonical period payload consumable by the script/business request layer
Ambiguity rule
If both month and week intent appear, stop and ask the user to clarify.
Missing-period rule
If the selected line-loss query requires a time period and the instruction does not provide enough information to construct one, stop and ask the user to provide it.
Do not default to the page-selected period in this slice.
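A hedged Python sketch of period extraction covering the patterns listed in this section. The Period shape and the reason codes are hypothetical. Two choices are assumptions rather than decisions from this spec: the mode is read only from 月累计 / 周累计, and a bare 第12周 yields year=None because the spec does not say how its year is determined.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Period:
    mode: str             # "month" or "week"
    year: Optional[int]   # None for a bare 第12周
    index: int            # month number or week number


class PeriodError(ValueError):
    """args[0] is a reason code such as 'missing_mode' or 'ambiguous_mode'."""


MONTH_PATTERNS = (
    re.compile(r"(\d{4})-(\d{1,2})(?!\d)"),   # 2026-03
    re.compile(r"(\d{4})年(\d{1,2})月"),       # 2026年3月
)
WEEK_PATTERN = re.compile(r"(?:(\d{4})年)?第(\d{1,2})周")  # 2026年第12周, 第12周


def _find_month(text: str):
    for pattern in MONTH_PATTERNS:
        found = pattern.search(text)
        if found:
            return found
    return None


def extract_period(text: str) -> Period:
    wants_month = "月累计" in text
    wants_week = "周累计" in text
    month = _find_month(text)
    week = WEEK_PATTERN.search(text)
    if wants_month and wants_week:
        raise PeriodError("ambiguous_mode")
    if not (wants_month or wants_week):
        raise PeriodError("missing_mode")
    # A period value that belongs to the other mode is a contradiction.
    if (wants_month and week) or (wants_week and month):
        raise PeriodError("contradictory_period")
    if wants_month:
        if not month:
            raise PeriodError("missing_period")
        return Period("month", int(month.group(1)), int(month.group(2)))
    if not week:
        raise PeriodError("missing_period")
    year = int(week.group(1)) if week.group(1) else None
    return Period("week", year, int(week.group(2)))
```

Every failure path raises with a reason code and none of them substitutes a default, which matches the rule that the page-selected period is never used as a fallback. Range checks on the month and week numbers are left out of the sketch.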
Parameter Prompting Contract
When deterministic mode matches tq-lineloss-report but one or more required parameters are missing or ambiguous, the runtime should return a user-facing prompt rather than executing.
Expected prompting cases include:
- missing company/unit
- missing month/week mode
- missing period value
- ambiguous company alias
- contradictory period expressions
The prompt should be specific enough to let the user correct only the missing field(s).
Example style:
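- 已命中台区线损报表技能,但缺少供电单位,请补充如“兰州公司”或“城关供电分公司”。 (The line-loss report skill matched, but the power supply unit is missing; please add one, such as “兰州公司” or “城关供电分公司”.)
- 已命中台区线损报表技能,但未识别到月/周类型,请补充“月累计”或“周累计”。 (The line-loss report skill matched, but the month/week type was not recognized; please add “月累计” or “周累计”.)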
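A small Python sketch of field-specific prompting, assuming the extraction stage hands back None for anything it could not resolve. The function name prompts_for is hypothetical. The unit and mode wordings are the two example prompts from this section; the period wording is an illustrative addition, not agreed text.

```python
from typing import List, Optional

SKILL_HIT = "已命中台区线损报表技能,"

PROMPTS = {
    # Wording taken from the example prompts in this spec.
    "unit": "但缺少供电单位,请补充如“兰州公司”或“城关供电分公司”。",
    "mode": "但未识别到月/周类型,请补充“月累计”或“周累计”。",
    # Illustrative wording only; not specified in this document.
    "period": "但缺少统计周期,请补充如“2026-03”或“2026年第12周”。",
}


def prompts_for(
    unit: Optional[str], mode: Optional[str], period: Optional[str]
) -> List[str]:
    """Return one prompt per unresolved field; an empty list means ready to run."""
    missing = []
    if unit is None:
        missing.append("unit")
    if mode is None:
        missing.append("mode")
    if period is None:
        missing.append("period")
    return [SKILL_HIT + PROMPTS[name] for name in missing]
```

Returning one prompt per field lets the user correct only what is missing. Ambiguity prompts would follow the same pattern with their own table entries.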
Skill Package Contract
SKILL.toml
The new skill package must declare a single deterministic collection entrypoint:
- tool name: collect_lineloss
- kind: browser_script
The tool description must reflect the real staged behavior, not a placeholder shell.
SKILL.md
The written contract should cover:
- when to use the skill
- when not to use it
- collection workflow
- runtime contract
- explicit missing/partial/error semantics
- returned artifact contract
references/collection-flow.md
Must explain:
- the source page state used by the skill
- how company and period parameters map to business requests
- which page/API calls are used for month vs week
- export/report-log sequencing if retained in the business flow
references/data-quality.md
Must define:
- canonical output columns
- required field coverage
- status semantics
- partial/error conditions
- company/period normalization assumptions that the script relies on
scripts/collect_lineloss.js
This is the real browser-side entrypoint. It should:
- accept normalized args
- validate page context
- execute deterministic page/API data collection
- normalize rows
- perform downstream export/report-history behavior if required
- directly return the final artifact from the browser-script runtime entrypoint shape
scripts/collect_lineloss.test.js
Must cover the business transforms that can be tested off-browser, especially:
- company normalization assumptions consumed by the script
- monthly vs weekly request-shape logic
- status semantics
- artifact shaping
Returned Artifact Contract
The final line-loss skill should return one structured artifact object rather than free-form prose.
At minimum it should expose:
- artifact type
- report name
- canonical company label/code used for the query
- period mode and canonical period value used for the query
- columns
- rows
- status
- counts
- downstream export/report-log status when applicable
- clear reasons for blocked/partial/error states
The exact field names may be finalized during implementation planning, but the contract must be structured enough for claw-new to interpret success vs partial vs blocked without re-embedding business logic.
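To show what "structured enough to interpret" might mean in practice, here is a Python sketch of a caller-side interpreter. All field names (status, rows, reasons) and the status values are placeholders, since the spec defers the exact names to implementation planning.

```python
from typing import Any, Dict

# Placeholder status vocabulary; the real set is defined during planning.
KNOWN_STATUSES = {"ok", "partial", "blocked", "error"}


def interpret(artifact: Dict[str, Any]) -> str:
    """Summarize an artifact using only its declared status and reasons."""
    status = artifact.get("status")
    if status not in KNOWN_STATUSES:
        return "error: artifact has no recognizable status"
    if status == "ok":
        return "ok: {} rows".format(len(artifact.get("rows", [])))
    reasons = artifact.get("reasons") or ["no reason given"]
    return "{}: {}".format(status, "; ".join(reasons))
```

The interpreter never inspects row contents or recomputes business rules; it reads the status the skill declared, which is the separation the contract asks for.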
Pipe-First / Ws-Ready Execution Seam
Current requirement
The first implementation on main must use the existing pipe-backed browser execution path.
Future requirement
The design must allow later ws adoption without redesigning the skill or routing contract.
Practical design rule
Keep these backend-neutral:
- deterministic trigger contract
- skill matching contract
- parameter extraction contract
- parameter normalization contract
- tool args contract
- artifact contract
Keep backend-specific code isolated to the execution seam only.
That way the later ws migration can replace the browser backend beneath the same deterministic skill contract.
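The seam could look roughly like the following Python sketch. BrowserBackend, run_browser_script, and RecordingBackend are hypothetical names used only to show the shape; the arg keys are the canonical parameter names from this spec, and their values here are placeholders.

```python
from typing import Any, Dict, List, Protocol, Tuple

TOOL = "tq-lineloss-report.collect_lineloss"


class BrowserBackend(Protocol):
    """The only backend-specific surface; pipe today, ws later."""

    def run_browser_script(self, tool: str, args: Dict[str, Any]) -> Dict[str, Any]:
        ...


class RecordingBackend:
    """Stand-in backend for the sketch; it records calls instead of using a pipe."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Dict[str, Any]]] = []

    def run_browser_script(self, tool: str, args: Dict[str, Any]) -> Dict[str, Any]:
        self.calls.append((tool, args))
        return {"status": "ok", "rows": []}


def execute_lineloss(backend: BrowserBackend, args: Dict[str, Any]) -> Dict[str, Any]:
    """Dispatch normalized args; nothing here depends on the transport."""
    return backend.run_browser_script(TOOL, args)


# Placeholder values; real codes come from the normalization step.
example_args = {
    "org_label": "国网兰州供电公司",
    "org_code": "ORG-0001",
    "period_mode": "month",
    "period_mode_code": "PLACEHOLDER",
    "period": "2026-03",
}
```

Under this shape, moving to ws would mean supplying another object with the same run_browser_script method, while routing, extraction, normalization, args, and the artifact stay as they are.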
Caller/Runtime Design Rules
1. Keep new business logic out of broad orchestration
Do not thread line-loss-specific business behavior through the general orchestration/runtime path.
2. Add a narrow deterministic-routing seam
This slice should add a narrow deterministic branch around submit-task routing, rather than rewriting the whole runtime decision tree.
3. Separate extraction from normalization
Do not mix “what the user typed” with “what the backend needs”.
There must be a distinct normalization step.
4. Keep the direct-skill browser seam narrow
Reuse the current browser_script execution seam instead of inventing a new browser bridge.
5. Preserve Zhihu behavior by design, not by hope
The design should assume new deterministic routing can accidentally steal or alter existing Zhihu behavior unless explicitly guarded against.
This is why focused Zhihu regression coverage is mandatory.
Verification Requirements for the Future Implementation Plan
Implementation planning must include explicit verification for:
- deterministic suffix detection
- deterministic lineloss scene matching
- company alias normalization to canonical code
- support for both company-level and district/county/sub-company-level units
- month/week extraction and normalization
- missing-parameter prompt behavior
- ambiguous-company prompt behavior
- pipe-backed browser-script execution for the new skill
- no regression to the existing Zhihu hotlist path
- preserved direct-skill/browser-script behavior outside the new line-loss scene
Out of Scope for This Slice
- enabling ws execution on main
- replacing the current Zhihu routing model
- general scene-registry runtime architecture redesign
- full free-form semantic understanding of arbitrary business language
- typo-tolerant fuzzy NLP beyond deterministic business-safe matching
- making page defaults the hidden source of truth when the user omitted parameters
Planning Notes
The implementation plan should likely split into distinct work items for:
- staged skill package creation and business contract definition
- deterministic trigger + scene match in claw-new
- company/unit normalization and ambiguity handling
- period extraction/normalization and ambiguity handling
- pipe-backed direct execution integration
- returned artifact interpretation
- Zhihu regression verification
- ws-readiness seam verification
The plan should explicitly keep the “do not break Zhihu hotlist” boundary visible in every execution and verification stage.