67 Commits

Author SHA1 Message Date
木炎
044d38003d chore: remove sgclaw_config.json (gitignored) and skills-lock.json
🤖 Generated with [Qoder][https://qoder.com]
2026-04-14 14:48:26 +08:00
木炎
f07f7d63ef chore: add .sgclaw-zeroclaw-workspace and sgclaw_config.json to gitignore
Remove tracked internal state files from repository.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-14 14:39:05 +08:00
木炎
c60cd308ca feat: service console auto-connect, settings panel, and batch of enhancements
- Auto-connect WebSocket on page load in service console
- Settings modal for editing sgclaw_config.json (API key, base URL, model, skills dir, etc.)
- UpdateConfig/ConfigUpdated protocol messages for remote config save
- save_to_path() for SgClawSettings serialization
- ConfigUpdated handler in sg_claw_client binary
- Protocol serialization tests for new message types
- HTML test assertions for auto-connect and settings UI
- Additional pending changes: deterministic submit, org units, lineloss xlsx export, browser script tool, and docs

🤖 Generated with [Qoder][https://qoder.com]
2026-04-14 14:32:46 +08:00
木炎
6aa0c110bd fix(callback_host): close orphaned helper page before opening on same WS connection
Sends sgHideBrowerserClosePage (best-effort) before sgHideBrowerserOpenPage
on the same bootstrap WebSocket connection. This prevents duplicate helper
pages across process restarts. Also enables hidden domain mode so the helper
page is not visible to users.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-14 10:01:50 +08:00
木炎
390a431a4b fix(callback_host): revert close_helper_page that broke helper page loading
The close_helper_page function opened a second browser WebSocket
connection during Drop and sent a close command directly via the WS
bypassing the HTTP polling system. This interfered with the browser's
normal state and caused the helper page to fail to open.

The cached_host lift (previous commit) already solves the duplicate
helper page issue within a single process lifetime. The Drop-based
close logic is deferred until a proper cleanup mechanism is designed.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-14 09:41:33 +08:00
木炎
0f70702914 test(callback_host): add hidden domain bootstrap test
🤖 Generated with [Qoder][https://qoder.com]
2026-04-14 09:21:17 +08:00
木炎
8decd9554c fix(service): lift cached_host to outer loop to prevent duplicate helper pages
🤖 Generated with [Qoder][https://qoder.com]
2026-04-14 09:15:33 +08:00
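
The commit above gives only a subject line. The following is a minimal sketch of what lifting a cached host to an outer loop could look like, assuming the service has an outer per-connection loop and an inner per-task loop. `HelperHost`, `run_service`, the counter, and the simplified `bootstrap_helper_page` signature are hypothetical and are not taken from the repository.

```rust
/// Hypothetical stand-in for the helper-page host handle.
struct HelperHost {
    id: u32,
}

/// Hypothetical bootstrap: counts how many helper pages were opened.
fn bootstrap_helper_page(opened: &mut u32) -> HelperHost {
    *opened += 1;
    HelperHost { id: *opened }
}

/// Runs `connections` reconnect cycles with `tasks_per_connection` tasks each
/// and returns how many helper pages were opened in total.
fn run_service(connections: u32, tasks_per_connection: u32) -> u32 {
    let mut opened = 0;
    // Declared outside the connection loop, so the host survives reconnects.
    // Declaring it inside the loop would bootstrap a new page per connection.
    let mut cached_host: Option<HelperHost> = None;
    for _connection in 0..connections {
        for _task in 0..tasks_per_connection {
            let host = cached_host.get_or_insert_with(|| bootstrap_helper_page(&mut opened));
            let _ = host.id; // the task would use the host here
        }
    }
    opened
}
```

Under these assumptions, three connections with two tasks each still open a single helper page, because the `Option` is only filled once.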
木炎
adb64429ee feat(callback_host): close helper page on Drop via browser WS
🤖 Generated with [Qoder][https://qoder.com]
2026-04-14 09:09:59 +08:00
木炎
32e2c59a40 feat(callback_host): add use_hidden_domain param to bootstrap_helper_page
🤖 Generated with [Qoder][https://qoder.com]
2026-04-14 09:07:20 +08:00
木炎
fae2fd57d6 docs: add helper page lifecycle fix implementation plan
🤖 Generated with [Qoder][https://qoder.com]
2026-04-14 09:01:59 +08:00
木炎
899c670e5c docs: add helper page lifecycle fix & hidden domain design spec
🤖 Generated with [Qoder][https://qoder.com]
2026-04-14 08:59:15 +08:00
木炎
583bb117cb docs: add async eval .then() fix design spec
🤖 Generated with [Qoder][https://qoder.com]
2026-04-13 18:32:05 +08:00
木炎
ad3778d4c5 fix: pass expected_domain to wrapped browser scripts
The `expected_domain` was removed from args for normalization but never
re-inserted, causing JS scripts to receive empty expected_domain and
report "missing_expected_domain" errors.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-13 17:20:48 +08:00
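
The bug described above (a value removed for normalization and never put back) can be sketched as follows. This is an illustration under assumptions, not the repository's code: `prepare_script_args`, `normalize`, and the use of a string map for the arguments are hypothetical. Only the `expected_domain` key comes from the commit message.

```rust
use std::collections::BTreeMap;

/// Hypothetical normalization step: trims every value. It stands in for
/// whatever the real normalization does.
fn normalize(args: &mut BTreeMap<String, String>) {
    for value in args.values_mut() {
        *value = value.trim().to_string();
    }
}

/// Prepares the arguments handed to a wrapped browser script.
fn prepare_script_args(mut args: BTreeMap<String, String>) -> BTreeMap<String, String> {
    // Take expected_domain out so that normalization does not touch it.
    let expected_domain = args.remove("expected_domain");
    normalize(&mut args);
    // The step the fix restores: re-insert the value before the script runs.
    // Without it the script sees no expected_domain at all.
    if let Some(domain) = expected_domain {
        args.insert("expected_domain".to_string(), domain);
    }
    args
}
```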
木炎
4d1070dff0 docs: add expected_domain arg fix spec
🤖 Generated with [Qoder][https://qoder.com]
2026-04-13 17:13:53 +08:00
木炎
0303111d5b test: add async browser script test case
🤖 Generated with [Qoder][https://qoder.com]
2026-04-13 16:13:46 +08:00
木炎
7320fb7f79 fix: support async browser scripts in build_eval_js
Wrap eval script in async IIFE and await Promise-like results.
Fixes Promise serialization returning '{}' for async skill scripts.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-13 16:12:08 +08:00
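
To illustrate the fix described above, here is a minimal sketch of a `build_eval_js`-style helper that wraps a script in an async IIFE. The signature and the generated JavaScript are assumptions. The real function in `callback_backend.rs` may differ, and how the host receives the resolved value is not shown.

```rust
/// Hypothetical sketch: wraps `script` (a function body that `return`s a
/// value) in an async IIFE. The inner call is awaited, so a returned Promise
/// is resolved before JSON.stringify runs; stringifying the Promise itself
/// would yield "{}".
fn build_eval_js(script: &str) -> String {
    format!(
        "(async () => {{\n  const result = await (async () => {{\n{script}\n  }})();\n  return JSON.stringify(result);\n}})()"
    )
}
```

A synchronous script goes through the same wrapper unchanged, since `await` on a non-Promise value simply yields that value.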
木炎
dbbc5d030b docs: add async browser script support implementation plan
Plan for modifying build_eval_js to support async scripts.
Two tasks: modify callback_backend.rs, add test case.

🤖 Generated with [Qoder][https://qoder.com]
2026-04-13 16:09:09 +08:00
木炎
ce6b3e6749 docs: add async browser script support design
Design for fixing Promise serialization issue in build_eval_js.
Async functions return Promise which gets JSON.stringify'd to "{}".

🤖 Generated with [Qoder][https://qoder.com]
2026-04-13 16:06:41 +08:00
木炎
a957712590 fix: add target_url param for Action::Eval in browser_script_skill_tool
🤖 Generated with [Qoder][https://qoder.com]
2026-04-13 15:03:49 +08:00
木炎
0ebe060484 docs: add lineloss target_url fix implementation plan
🤖 Generated with [Qoder][https://qoder.com]
2026-04-13 15:01:23 +08:00
木炎
695a888840 docs: add lineloss target_url fix design spec
🤖 Generated with [Qoder][https://qoder.com]
2026-04-13 14:59:51 +08:00
木炎
733aee1e9a feat: add lineloss URL mapping in derive_request_url_from_instruction
Temporary workaround: when the instruction contains '线损' or 'lineloss', return the tq (台区) lineloss platform URL

🤖 Generated with [Qoder][https://qoder.com]
2026-04-13 14:41:14 +08:00
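
A keyword-based mapping of this kind might look like the sketch below. The function name comes from the commit subject, but its signature, the return type, the case-insensitive match on `lineloss`, and the placeholder URL are assumptions. The real platform URL is not given in the commit message.

```rust
/// Placeholder only: the actual platform URL does not appear in the commit.
const LINELOSS_PLATFORM_URL: &str = "https://lineloss.example.invalid/";

/// Returns the lineloss platform URL when the instruction mentions
/// '线损' or 'lineloss', and `None` for any other instruction.
fn derive_request_url_from_instruction(instruction: &str) -> Option<&'static str> {
    // Assumption: the ASCII keyword is matched case-insensitively.
    let lowered = instruction.to_lowercase();
    if instruction.contains("线损") || lowered.contains("lineloss") {
        Some(LINELOSS_PLATFORM_URL)
    } else {
        None
    }
}
```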
木炎
f8f822e1f3 test: add lineloss requesturl mapping test 2026-04-13 14:38:03 +08:00
木炎
3b156e4bd1 docs: add lineloss requesturl fix implementation plan
🤖 Generated with [Qoder][https://qoder.com]
2026-04-13 14:34:35 +08:00
木炎
645dc60bae docs: add lineloss requesturl fix design spec
Temporary workaround: add the tq (台区) lineloss URL mapping in derive_request_url_from_instruction

🤖 Generated with [Qoder][https://qoder.com]
2026-04-13 14:31:39 +08:00
木炎
007959b903 merge: bring main lineloss defaults into ws
Bring the main-branch lineloss default-period fix into feature/claw-ws while keeping the ws submit/backend path intact.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 21:37:03 +08:00
木炎
a8a470481d fix: align lineloss default periods with page semantics
Default month/week deterministic lineloss requests to the source page's built-in time ranges while preserving explicit-period parsing and existing routing contracts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 21:35:28 +08:00
木炎
447457b7d3 fix: restore zhihu ws routing before direct submit
Keep ws-backed submit flows routing Zhihu natural-language requests through orchestration before direct submit so sg_claw service console behavior stays consistent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 21:14:35 +08:00
木炎
45b60e37f7 fix: restore zhihu export routing before direct submit
Keep Zhihu hotlist export requests on the orchestration path so natural-language submits without page context no longer fail in direct-submit routing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 19:24:09 +08:00
木炎
d230ff0389 merge: integrate main deterministic submit into ws branch
Keep the ws submit path while bringing over main's deterministic lineloss routing and the focused merge verification updates.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 14:05:55 +08:00
木炎
72b79feca9 docs: add tq lineloss design and plan
Add the tq lineloss design spec and implementation plan documents used for the deterministic submit work.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 13:12:35 +08:00
木炎
dd7805d341 feat: add deterministic tq lineloss submit path
Add the deterministic tq-lineloss routing and normalization flow so exact-suffix requests execute through the existing browser-script seam with canonical org and period arguments.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 13:10:58 +08:00
木炎
883647dffc feat: add config-owned direct submit runtime
Keep browser-attached workflows on the configured direct-skill path and align the Zhihu export/browser regression contracts with the current ws merge state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 15:45:42 +08:00
木炎
b454fa3f54 refactor: remove ws-only scene routing remnants
Keep the ws branch focused on websocket and Zhihu behavior by dropping staged scene-routing artifacts and restoring single-path skills dir semantics.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 22:35:43 +08:00
木炎
311cc1fee6 docs: add fault-details alignment design and plan
Capture the approved fault-details staged-skill design and implementation plan so the remaining work can be resumed from the documented contract.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 18:10:02 +08:00
木炎
7443b9da7f fix: classify direct report artifacts by status
Treat direct skill report-artifact payloads as task outcomes so partial and empty reports stay successful while blocked and error statuses fail explicitly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 18:01:18 +08:00
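
The status rule in the commit above can be sketched as a small classifier. The four status names (`partial`, `empty`, `blocked`, `error`) come from the commit message. The `TaskOutcome` type, the `ok` status, the failure messages, and treating unknown statuses as failures are assumptions made for illustration.

```rust
/// Hypothetical task outcome type.
#[derive(Debug, PartialEq)]
enum TaskOutcome {
    Success,
    Failure(String),
}

/// Maps a report-artifact status string to a task outcome.
fn classify_report_status(status: &str) -> TaskOutcome {
    match status {
        // Partial and empty reports still count as successful tasks.
        "ok" | "partial" | "empty" => TaskOutcome::Success,
        // Blocked and error statuses fail explicitly.
        "blocked" | "error" => TaskOutcome::Failure(format!("report status: {status}")),
        // Assumption: anything unrecognized is treated as a failure.
        other => TaskOutcome::Failure(format!("unknown report status: {other}")),
    }
}
```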
zhaoyilun
34035cdc9c fix: stabilize zhihu export and dashboard flow 2026-04-10 17:21:13 +08:00
木炎
4becf81066 feat: add config-owned direct skill submit path
Add fixed direct-submit skill loading from configured staged skills and validate directSubmitSkill early so malformed configs fail before routing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 19:02:30 +08:00
木炎
81de162756 docs: add ws branch cleanup plan
Document the post-main cleanup steps for removing staged scene routing from the ws branch while preserving websocket and Zhihu flows.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 11:13:43 +08:00
木炎
630190e4d3 docs: add command center analysis artifacts
Capture the command-center operation analyses, inventory outputs, and browser pipeline reference files produced during the current research pass so they can be reviewed from the branch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 10:43:09 +08:00
木炎
57b9be733d fix: harden service websocket reconnect flows
Stabilize the service console and callback-host websocket paths so idle disconnects and mid-task client drops no longer wedge task execution or spam repeated commands.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 10:34:34 +08:00
木炎
96c3bf1dee feat: route staged scene skills through runtime
Add registry-driven scene routing and multi-root skill loading so fault-details and 95598 scene skills can be triggered from natural language while still running through the browser-backed runtime.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 16:17:17 +08:00
木炎
bdf8e12246 feat: align browser callback runtime and export flows
Consolidate the browser task runtime around the callback path, add safer artifact opening for Zhihu exports, and cover the new service/browser flows with focused tests and supporting docs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 21:44:53 +08:00
木炎
0dd655712c feat: add standalone service chat console
Provide a local HTML console that reuses the existing service websocket so task entry stays outside the browser-helper runtime path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 15:26:15 +08:00
木炎
6068a8228b feat: realign zhihu browser callback runtime
Keep Zhihu browser-attached execution on the callback-host path so direct routes, runtime wiring, and service startup stay aligned for the current websocket browser flow.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 12:09:47 +08:00
木炎
3e18350320 feat: add websocket browser service runtime
Wire the service/browser runtime onto the websocket-driven execution path and add the new browser/service modules needed for the submit flow and runtime integration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 23:42:27 +08:00
zyl
2ae71fb1c9 compat: probe zhihu extractor before renavigate 2026-03-30 18:39:49 +08:00
zyl
5bccd02d6f docs: add sgclaw project briefing materials 2026-03-30 18:39:49 +08:00
zyl
f51d6b7659 sgclaw: snapshot today's runtime and skill updates 2026-03-30 18:39:49 +08:00
zyl
c793bfc6a1 build: restore repo cargo registry override 2026-03-30 18:39:49 +08:00
zyl
305b6d5110 build: stop using broken rsproxy registry 2026-03-30 18:39:49 +08:00
zyl
4c4f45581f tests: cover nested zhihu creator write entry 2026-03-30 18:39:49 +08:00
zyl
cd94904329 sgclaw: stop zhihu publish flow before editor on creator page 2026-03-30 18:39:49 +08:00
zyl
bf09de6700 test: cover zhihu hotlist script fallback 2026-03-30 08:29:44 +08:00
zyl
dbb18a094c fix: sanitize provider tool names 2026-03-30 08:29:44 +08:00
zyl
5db25b513e fix: sync launcher rules for source checkout 2026-03-30 08:29:44 +08:00
zyl
d2c9902966 feat: add browser script skill execution 2026-03-30 08:29:44 +08:00
zyl
f7e2ff256e logging: include runtime and skill versions 2026-03-30 08:29:44 +08:00
zyl
c7d3d45c68 chore: record final sgclaw superrpa runtime verification 2026-03-30 08:29:44 +08:00
zyl
0fc6fe0c8e frontend: document runtime host bundle 2026-03-30 08:29:44 +08:00
zyl
ef88487f4a acceptance: stabilize zhihu hotlist excel flow 2026-03-30 08:29:44 +08:00
zyl
e294fbb9b1 wip: checkpoint 2026-03-29 runtime work 2026-03-30 08:29:44 +08:00
zyl
7d9036b2d4 sgclaw: move runtime policy into config 2026-03-30 08:29:44 +08:00
zyl
54049a1e1e tools: add sgclaw browser runtime entry wrapper 2026-03-30 08:29:44 +08:00
zyl
3844f2c34c docs: define superrpa sgclaw runtime boundary 2026-03-30 08:29:44 +08:00
zyl
5ed81e5f0c docs: redefine sgclaw as hardened zeroclaw runtime 2026-03-30 08:29:44 +08:00
木炎
b87968632a chore: ignore local workspace directories
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 12:40:21 +08:00
198 changed files with 56345 additions and 1336 deletions

5
.cargo/config.toml Normal file

@@ -0,0 +1,5 @@
[source.crates-io]
replace-with = "rsproxy-sparse"
[source.rsproxy-sparse]
registry = "sparse+https://rsproxy.cn/index/"

11
.gitignore vendored

@@ -1,2 +1,13 @@
.worktrees/
target/
.claude/
.idea/
.playwright-mcp/
.qoder/
.sgclaw_workspace/
.sgclaw_workspace_dev1/
.sgclaw-zeroclaw-workspace/
sgclaw_config.json
nul
target-test/
target-zhihu-nav/

View File

@@ -3,6 +3,8 @@
## Project Structure & Module Organization
`docs/` is the main source of product, architecture, integration, and team-process documentation. Keep active engineering documents in `docs/*.md`; presentation exports belong under `docs/archive/领导演示资料/`. `frontend/archive/sgClaw验证-已归档/` contains the historical Vue 2 verification page (`index.html`, `index.vue`) plus helper scripts (`serve.sh`, `download-libs.sh`, `testRunner.js`). `frontend/README.md` and `docs/README.md` describe what is active versus archived.
This repository only manages the sgClaw runtime, compatibility layers, skills, and architecture docs. The frontend overlay and Chromium-side code that embed sgClaw into SuperRPA live in the `superRPA` checkout; see `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/` (overlay HTML/JS/CSS) together with the surrounding Chromium resources and host bootstrap code for the actual browser-integrated UI.
## Build, Test, and Development Commands
There is no formal build system in the repository today. Use the local verification page directly:

336
Cargo.lock generated

@@ -26,18 +26,6 @@ dependencies = [
"generic-array",
]
[[package]]
name = "ahash"
version = "0.8.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5a15f179cd60c4584b8a8c596927aadc462e27f2ca70c04e0071964a73ba7a75"
dependencies = [
"cfg-if",
"once_cell",
"version_check",
"zerocopy",
]
[[package]]
name = "aho-corasick"
version = "1.1.4"
@@ -47,12 +35,6 @@ dependencies = [
"memchr",
]
[[package]]
name = "allocator-api2"
version = "0.2.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "683d7910e743518b0e34f1186f92494becacb047c7b6bf616c96772180fef923"
[[package]]
name = "android_system_properties"
version = "0.1.5"
@@ -118,15 +100,6 @@ version = "1.0.102"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7f202df86484c868dbad7eaa557ef785d5c66295e41b460ef922eca0723b842c"
[[package]]
name = "ar_archive_writer"
version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7eb93bbb63b9c227414f6eb3a0adfddca591a8ce1e9b60661bb08969b87e340b"
dependencies = [
"object",
]
[[package]]
name = "async-channel"
version = "1.9.0"
@@ -220,9 +193,9 @@ dependencies = [
[[package]]
name = "aws-lc-sys"
version = "0.39.0"
version = "0.39.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1fa7e52a4c5c547c741610a2c6f123f3881e409b714cd27e6798ef020c514f0a"
checksum = "83a25cf98105baa966497416dbd42565ce3a8cf8dbfd59803ec9ad46f3126399"
dependencies = [
"cc",
"cmake",
@@ -328,6 +301,12 @@ version = "1.25.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c8efb64bd706a16a1bdde310ae86b351e4d21550d98d056f22f8a7f7a2183fec"
[[package]]
name = "byteorder"
version = "1.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1fd0f2584146f6f2ef48085050886acf353beff7305ebd1ae69500e27c67f64b"
[[package]]
name = "byteorder-lite"
version = "0.1.0"
@@ -342,9 +321,9 @@ checksum = "1e748733b7cbc798e1434b6ac524f0c1ff2ab456fe201501e6497c8417a4fc33"
[[package]]
name = "cc"
version = "1.2.57"
version = "1.2.59"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7a0dd1ca384932ff3641c8718a02769f1698e7563dc6974ffd03346116310423"
checksum = "b7a4d3ec6524d28a329fc53654bbadc9bdd7b0431f5d65f1a56ffb28a1ee5283"
dependencies = [
"find-msvc-tools",
"jobserver",
@@ -421,16 +400,6 @@ dependencies = [
"phf",
]
[[package]]
name = "chumsky"
version = "0.9.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8eebd66744a15ded14960ab4ccdbfb51ad3b81f51f3f04a80adac98c985396c9"
dependencies = [
"hashbrown 0.14.5",
"stacker",
]
[[package]]
name = "cipher"
version = "0.4.4"
@@ -493,9 +462,9 @@ checksum = "c8d4a3bb8b1e0c1050499d1815f5ab16d04f0959b233085fb31653fbfc9d98f9"
[[package]]
name = "cmake"
version = "0.1.57"
version = "0.1.58"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "75443c44cd6b379beb8c5b45d85d0773baf31cce901fe7bb252f4eff3008ef7d"
checksum = "c0f78a02292a74a88ac736019ab962ece0bc380e3f977bf72e376c5d78ff0678"
dependencies = [
"cc",
]
@@ -772,9 +741,9 @@ checksum = "7360491ce676a36bf9bb3c56c1aa791658183a54d2744120f27285738d90465a"
[[package]]
name = "fastrand"
version = "2.3.0"
version = "2.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "37909eebbb50d72f9059c3b6d82c0463f2ff062c9e95845c43a6c9c0355411be"
checksum = "9f1f227452a390804cdb637b74a86990f2a7d7ba4b7d5693aac9b4dd6defd8d6"
[[package]]
name = "fdeflate"
@@ -1012,16 +981,6 @@ dependencies = [
"tracing",
]
[[package]]
name = "hashbrown"
version = "0.14.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e5274423e17b7c9fc20b6e7e208532f9b19825d82dfd615708b70edd83df41f1"
dependencies = [
"ahash",
"allocator-api2",
]
[[package]]
name = "hashbrown"
version = "0.15.5"
@@ -1039,10 +998,11 @@ checksum = "841d1cc9bed7f9236f321df977030373f4a4163ae1a7dbfe1a51a2c1a51d9100"
[[package]]
name = "hashify"
version = "0.2.7"
version = "0.2.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "149e3ea90eb5a26ad354cfe3cb7f7401b9329032d0235f2687d03a35f30e5d4c"
checksum = "dd1246c0e5493286aeb2dde35b1f4eb9c4ce00e628641210a5e553fc001a1f26"
dependencies = [
"indexmap",
"proc-macro2",
"quote",
"syn",
@@ -1136,9 +1096,9 @@ checksum = "df3b46402a9d5adb4c86a0cf463f42e19994e3ee891101b1841f30a545cb49a9"
[[package]]
name = "hyper"
version = "1.8.1"
version = "1.9.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2ab2d4f250c3d7b1c9fcdff1cece94ea4e2dfbec68614f7b87cb205f24ca9d11"
checksum = "6299f016b246a94207e63da54dbe807655bf9e00044f73ded42c3ac5305fbcca"
dependencies = [
"atomic-waker",
"bytes",
@@ -1151,7 +1111,6 @@ dependencies = [
"httpdate",
"itoa",
"pin-project-lite",
"pin-utils",
"smallvec",
"tokio",
"want",
@@ -1223,12 +1182,13 @@ dependencies = [
[[package]]
name = "icu_collections"
version = "2.1.1"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4c6b649701667bbe825c3b7e6388cb521c23d88644678e83c0c4d0a621a34b43"
checksum = "2984d1cd16c883d7935b9e07e44071dca8d917fd52ecc02c04d5fa0b5a3f191c"
dependencies = [
"displaydoc",
"potential_utf",
"utf8_iter",
"yoke",
"zerofrom",
"zerovec",
@@ -1236,9 +1196,9 @@ dependencies = [
[[package]]
name = "icu_locale_core"
version = "2.1.1"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "edba7861004dd3714265b4db54a3c390e880ab658fec5f7db895fae2046b5bb6"
checksum = "92219b62b3e2b4d88ac5119f8904c10f8f61bf7e95b640d25ba3075e6cac2c29"
dependencies = [
"displaydoc",
"litemap",
@@ -1249,9 +1209,9 @@ dependencies = [
[[package]]
name = "icu_normalizer"
version = "2.1.1"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5f6c8828b67bf8908d82127b2054ea1b4427ff0230ee9141c54251934ab1b599"
checksum = "c56e5ee99d6e3d33bd91c5d85458b6005a22140021cc324cea84dd0e72cff3b4"
dependencies = [
"icu_collections",
"icu_normalizer_data",
@@ -1263,15 +1223,15 @@ dependencies = [
[[package]]
name = "icu_normalizer_data"
version = "2.1.1"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7aedcccd01fc5fe81e6b489c15b247b8b0690feb23304303a9e560f37efc560a"
checksum = "da3be0ae77ea334f4da67c12f149704f19f81d1adf7c51cf482943e84a2bad38"
[[package]]
name = "icu_properties"
version = "2.1.2"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "020bfc02fe870ec3a66d93e677ccca0562506e5872c650f893269e08615d74ec"
checksum = "bee3b67d0ea5c2cca5003417989af8996f8604e34fb9ddf96208a033901e70de"
dependencies = [
"icu_collections",
"icu_locale_core",
@@ -1283,15 +1243,15 @@ dependencies = [
[[package]]
name = "icu_properties_data"
version = "2.1.2"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "616c294cf8d725c6afcd8f55abc17c56464ef6211f9ed59cccffe534129c77af"
checksum = "8e2bbb201e0c04f7b4b3e14382af113e17ba4f63e2c9d2ee626b720cbce54a14"
[[package]]
name = "icu_provider"
version = "2.1.1"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "85962cf0ce02e1e0a629cc34e7ca3e373ce20dda4c4d7294bbd0bf1fdb59e614"
checksum = "139c4cf31c8b5f33d7e199446eff9c1e02decfc2f0eec2c8d71f65befa45b421"
dependencies = [
"displaydoc",
"icu_locale_core",
@@ -1355,9 +1315,9 @@ dependencies = [
[[package]]
name = "indexmap"
version = "2.13.0"
version = "2.13.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7714e70437a7dc3ac8eb7e6f8df75fd8eb422675fc7678aff7364301092b1017"
checksum = "45a8a2b9cb3e0b0c1803dbb0758ffac5de2f425b23c28f518faabd9d805342ff"
dependencies = [
"equivalent",
"hashbrown 0.16.1",
@@ -1395,9 +1355,9 @@ checksum = "d98f6fed1fde3f8c21bc40a1abb88dd75e67924f9cffc3ef95607bad8017f8e2"
[[package]]
name = "iri-string"
version = "0.7.11"
version = "0.7.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d8e7418f59cc01c88316161279a7f665217ae316b388e58a0d10e29f54f1e5eb"
checksum = "25e659a4bb38e810ebc252e53b5814ff908a8c58c2a9ce2fae1bbec24cbf4e20"
dependencies = [
"memchr",
"serde",
@@ -1427,10 +1387,12 @@ dependencies = [
[[package]]
name = "js-sys"
version = "0.3.91"
version = "0.3.94"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b49715b7073f385ba4bc528e5747d02e66cb39c6146efb66b781f131f0fb399c"
checksum = "2e04e2ef80ce82e13552136fabeef8a5ed1f985a96805761cbb9a2c34e7664d9"
dependencies = [
"cfg-if",
"futures-util",
"once_cell",
"wasm-bindgen",
]
@@ -1449,12 +1411,11 @@ checksum = "09edd9e8b54e49e587e4f6295a7d29c3ea94d469cb40ab8ca70b288248a81db2"
[[package]]
name = "lettre"
version = "0.11.19"
version = "0.11.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9e13e10e8818f8b2a60f52cb127041d388b89f3a96a62be9ceaffa22262fef7f"
checksum = "dabda5859ee7c06b995b9d1165aa52c39110e079ef609db97178d86aeb051fa7"
dependencies = [
"base64",
"chumsky",
"email-encoding",
"email_address",
"fastrand",
@@ -1473,9 +1434,9 @@ dependencies = [
[[package]]
name = "libc"
version = "0.2.183"
version = "0.2.184"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b5b646652bf6661599e1da8901b3b9522896f01e736bad5f723fe7a3a27f899d"
checksum = "48f5d2a454e16a5ea0f4ced81bd44e4cfc7bd3a507b61887c99fd3538b28e4af"
[[package]]
name = "libloading"
@@ -1518,9 +1479,9 @@ checksum = "32a66949e030da00e8c7d4434b251670a91556f4144941d37452769c25d58a53"
[[package]]
name = "litemap"
version = "0.8.1"
version = "0.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6373607a59f0be73a39b6fe456b8192fcc3585f602af20751600e974dd455e77"
checksum = "92daf443525c4cce67b150400bc2316076100ce0b3686209eb8cf3c31612e6f0"
[[package]]
name = "lock_api"
@@ -1607,9 +1568,9 @@ dependencies = [
[[package]]
name = "mio"
version = "1.1.1"
version = "1.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a69bcab0ad47271a0234d9422b131806bf3968021e5dc9328caf2d4cd58557fc"
checksum = "50b7e5b27aa02a74bac8c3f23f448f8d87ff11f92d3aac1a6ed369ee08cc56c1"
dependencies = [
"libc",
"wasi",
@@ -1669,15 +1630,6 @@ dependencies = [
"autocfg",
]
[[package]]
name = "object"
version = "0.37.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ff76201f031d8863c38aa7f905eca4f53abbfa15f609db4277d44cd8938f33fe"
dependencies = [
"memchr",
]
[[package]]
name = "once_cell"
version = "1.21.4"
@@ -1831,9 +1783,9 @@ checksum = "c33a9471896f1c69cecef8d20cbe2f7accd12527ce60845ff44c153bb2a21b49"
[[package]]
name = "potential_utf"
version = "0.1.4"
version = "0.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b73949432f5e2a09657003c25bca5e19a0e9c84f8058ca374f49e0ebe605af77"
checksum = "0103b1cef7ec0cf76490e969665504990193874ea05c85ff9bab8b911d0a0564"
dependencies = [
"zerovec",
]
@@ -1866,16 +1818,6 @@ dependencies = [
"unicode-ident",
]
[[package]]
name = "psm"
version = "0.1.30"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3852766467df634d74f0b2d7819bf8dc483a0eb2e3b0f50f756f9cfe8b0d18d8"
dependencies = [
"ar_archive_writer",
"cc",
]
[[package]]
name = "pxfm"
version = "0.1.28"
@@ -2204,9 +2146,9 @@ dependencies = [
[[package]]
name = "rustc-hash"
version = "2.1.1"
version = "2.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "357703d41365b4b27c590e3ed91eabb1b663f07c4c084095e60cbed4362dff0d"
checksum = "94300abf3f1ae2e2b8ffb7b58043de3d399c73fa6f4b73826402a5c457614dbe"
[[package]]
name = "rustix"
@@ -2328,9 +2270,9 @@ checksum = "b12e76d157a900eb52e81bc6e9f3069344290341720e9178cde2407113ac8d89"
[[package]]
name = "semver"
version = "1.0.27"
version = "1.0.28"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d767eb0aabc880b29956c35734170f26ed551a859dbd361d140cdbeca61ab1e2"
checksum = "8a7852d02fc848982e0c167ef163aaff9cd91dc640ba85e263cb1ce46fae51cd"
[[package]]
name = "serde"
@@ -2399,9 +2341,9 @@ dependencies = [
[[package]]
name = "serde_spanned"
version = "1.1.0"
version = "1.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "876ac351060d4f882bb1032b6369eb0aef79ad9df1ea8bc404874d8cc3d0cd98"
checksum = "6662b5879511e06e8999a8a235d848113e942c9124f211511b16466ee2995f26"
dependencies = [
"serde_core",
]
@@ -2420,7 +2362,7 @@ dependencies = [
[[package]]
name = "sgclaw"
version = "0.1.0"
version = "0.1.0-2026.4.9"
dependencies = [
"anyhow",
"async-trait",
@@ -2428,14 +2370,17 @@ dependencies = [
"futures-util",
"hex",
"hmac",
"regex",
"reqwest",
"serde",
"serde_json",
"sha2",
"thiserror 1.0.69",
"tokio",
"tungstenite 0.29.0",
"uuid",
"zeroclawlabs",
"zip 0.6.6",
]
[[package]]
@@ -2502,9 +2447,9 @@ dependencies = [
[[package]]
name = "simd-adler32"
version = "0.3.8"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e320a6c5ad31d271ad523dcf3ad13e2767ad8b1cb8f047f75a8aeaf8da139da2"
checksum = "703d5c7ef118737c72f1af64ad2f6f8c5e1921f818cdcb97b8fe6fc69bf66214"
[[package]]
name = "siphasher"
@@ -2540,20 +2485,6 @@ version = "1.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6ce2be8dc25455e1f91df71bfa12ad37d7af1092ae736f3a6cd0e37bc7810596"
[[package]]
name = "stacker"
version = "0.1.23"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "08d74a23609d509411d10e2176dc2a4346e3b4aea2e7b1869f19fdedbc71c013"
dependencies = [
"cc",
"cfg-if",
"libc",
"psm",
"windows-sys 0.52.0",
"windows-sys 0.59.0",
]
[[package]]
name = "stop-token"
version = "0.7.0"
@@ -2684,9 +2615,9 @@ dependencies = [
[[package]]
name = "tinystr"
version = "0.8.2"
version = "0.8.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "42d3e9c45c09de15d06dd8acf5f4e0e399e85927b7f00711024eb7ae10fa4869"
checksum = "c8323304221c2a851516f22236c5722a72eaa19749016521d6dff0824447d96d"
dependencies = [
"displaydoc",
"zerovec",
@@ -2709,9 +2640,9 @@ checksum = "1f3ccbac311fea05f86f61904b462b55fb3df8837a366dfc601a0161d0532f20"
[[package]]
name = "tokio"
version = "1.50.0"
version = "1.51.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "27ad5e34374e03cfffefc301becb44e9dc3c17584f414349ebe29ed26661822d"
checksum = "f66bf9585cda4b724d3e78ab34b73fb2bbaba9011b9bfdf69dc836382ea13b8c"
dependencies = [
"bytes",
"libc",
@@ -2725,9 +2656,9 @@ dependencies = [
[[package]]
name = "tokio-macros"
version = "2.6.1"
version = "2.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5c55a2eff8b69ce66c84f85e1da1c233edc36ceb85a2058d11b0d6a3c7e7569c"
checksum = "385a6cb71ab9ab790c5fe8d67f1645e6c450a7ce006a33de03daa956cf70a496"
dependencies = [
"proc-macro2",
"quote",
@@ -2811,9 +2742,9 @@ dependencies = [
[[package]]
name = "toml"
version = "1.1.0+spec-1.1.0"
version = "1.1.2+spec-1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f8195ca05e4eb728f4ba94f3e3291661320af739c4e43779cbdfae82ab239fcc"
checksum = "81f3d15e84cbcd896376e6730314d59fb5a87f31e4b038454184435cd57defee"
dependencies = [
"indexmap",
"serde_core",
@@ -2821,32 +2752,32 @@ dependencies = [
"toml_datetime",
"toml_parser",
"toml_writer",
"winnow 1.0.0",
"winnow 1.0.1",
]
[[package]]
name = "toml_datetime"
version = "1.1.0+spec-1.1.0"
version = "1.1.1+spec-1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "97251a7c317e03ad83774a8752a7e81fb6067740609f75ea2b585b569a59198f"
checksum = "3165f65f62e28e0115a00b2ebdd37eb6f3b641855f9d636d3cd4103767159ad7"
dependencies = [
"serde_core",
]
[[package]]
name = "toml_parser"
version = "1.1.0+spec-1.1.0"
version = "1.1.2+spec-1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2334f11ee363607eb04df9b8fc8a13ca1715a72ba8662a26ac285c98aabb4011"
checksum = "a2abe9b86193656635d2411dc43050282ca48aa31c2451210f4202550afb7526"
dependencies = [
"winnow 1.0.0",
"winnow 1.0.1",
]
[[package]]
name = "toml_writer"
version = "1.1.0+spec-1.1.0"
version = "1.1.1+spec-1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d282ade6016312faf3e41e57ebbba0c073e4056dab1232ab1cb624199648f8ed"
checksum = "756daf9b1013ebe47a8776667b466417e2d4c5679d441c26230efd9ef78692db"
[[package]]
name = "tower"
@@ -3067,9 +2998,9 @@ checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821"
[[package]]
name = "uuid"
version = "1.22.0"
version = "1.23.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a68d3c8f01c0cfa54a75291d83601161799e4a89a39e0929f4b0354d88757a37"
checksum = "5ac8b6f42ead25368cf5b098aeb3dc8a1a2c05a3eee8a9a1a68c640edbfc79d9"
dependencies = [
"getrandom 0.4.2",
"js-sys",
@@ -3133,9 +3064,9 @@ dependencies = [
[[package]]
name = "wasm-bindgen"
version = "0.2.114"
version = "0.2.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6532f9a5c1ece3798cb1c2cfdba640b9b3ba884f5db45973a6f442510a87d38e"
checksum = "0551fc1bb415591e3372d0bc4780db7e587d84e2a7e79da121051c5c4b89d0b0"
dependencies = [
"cfg-if",
"once_cell",
@@ -3146,23 +3077,19 @@ dependencies = [
[[package]]
name = "wasm-bindgen-futures"
version = "0.4.64"
version = "0.4.67"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e9c5522b3a28661442748e09d40924dfb9ca614b21c00d3fd135720e48b67db8"
checksum = "03623de6905b7206edd0a75f69f747f134b7f0a2323392d664448bf2d3c5d87e"
dependencies = [
"cfg-if",
"futures-util",
"js-sys",
"once_cell",
"wasm-bindgen",
"web-sys",
]
[[package]]
name = "wasm-bindgen-macro"
version = "0.2.114"
version = "0.2.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "18a2d50fcf105fb33bb15f00e7a77b772945a2ee45dcf454961fd843e74c18e6"
checksum = "7fbdf9a35adf44786aecd5ff89b4563a90325f9da0923236f6104e603c7e86be"
dependencies = [
"quote",
"wasm-bindgen-macro-support",
@@ -3170,9 +3097,9 @@ dependencies = [
[[package]]
name = "wasm-bindgen-macro-support"
version = "0.2.114"
version = "0.2.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "03ce4caeaac547cdf713d280eda22a730824dd11e6b8c3ca9e42247b25c631e3"
checksum = "dca9693ef2bab6d4e6707234500350d8dad079eb508dca05530c85dc3a529ff2"
dependencies = [
"bumpalo",
"proc-macro2",
@@ -3183,9 +3110,9 @@ dependencies = [
[[package]]
name = "wasm-bindgen-shared"
version = "0.2.114"
version = "0.2.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "75a326b8c223ee17883a4251907455a2431acc2791c98c26279376490c378c16"
checksum = "39129a682a6d2d841b6c429d0c51e5cb0ed1a03829d8b3d1e69a011e62cb3d3b"
dependencies = [
"unicode-ident",
]
@@ -3239,9 +3166,9 @@ dependencies = [
[[package]]
name = "web-sys"
version = "0.3.91"
version = "0.3.94"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "854ba17bb104abfb26ba36da9729addc7ce7f06f5c0f90f3c391f8461cca21f9"
checksum = "cd70027e39b12f0849461e08ffc50b9cd7688d942c1c8e3c7b22273236b4dd0a"
dependencies = [
"js-sys",
"wasm-bindgen",
@@ -3361,15 +3288,6 @@ dependencies = [
"windows-targets 0.52.6",
]
[[package]]
name = "windows-sys"
version = "0.59.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1e38bc4d79ed67fd075bcc251a1c39b32a1776bbe92e5bef1f0bf1f8c531853b"
dependencies = [
"windows-targets 0.52.6",
]
[[package]]
name = "windows-sys"
version = "0.60.2"
@@ -3528,9 +3446,9 @@ dependencies = [
[[package]]
name = "winnow"
version = "1.0.0"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a90e88e4667264a994d34e6d1ab2d26d398dcdca8b7f52bec8668957517fc7d8"
checksum = "09dac053f1cd375980747450bfc7250c264eaae0583872e845c0c7cd578872b5"
[[package]]
name = "wit-bindgen"
@@ -3622,9 +3540,9 @@ dependencies = [
[[package]]
name = "writeable"
version = "0.6.2"
version = "0.6.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9edde0db4769d2dc68579893f2306b26c6ecfbe0ef499b013d731b7b9247e0b9"
checksum = "1ffae5123b2d3fc086436f8834ae3ab053a283cfac8fe0a0b8eaae044768a4c4"
[[package]]
name = "xattr"
@@ -3638,9 +3556,9 @@ dependencies = [
[[package]]
name = "yoke"
version = "0.8.1"
version = "0.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "72d6e5c6afb84d73944e5cedb052c4680d5657337201555f9f2a16b7406d4954"
checksum = "abe8c5fda708d9ca3df187cae8bfb9ceda00dd96231bed36e445a1a48e66f9ca"
dependencies = [
"stable_deref_trait",
"yoke-derive",
@@ -3649,9 +3567,9 @@ dependencies = [
[[package]]
name = "yoke-derive"
version = "0.8.1"
version = "0.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b659052874eb698efe5b9e8cf382204678a0086ebf46982b79d6ca3182927e5d"
checksum = "de844c262c8848816172cef550288e7dc6c7b7814b4ee56b3e1553f275f1858e"
dependencies = [
"proc-macro2",
"quote",
@@ -3728,23 +3646,23 @@ dependencies = [
"uuid",
"webpki-roots 1.0.6",
"which",
"zip",
"zip 8.5.1",
]
[[package]]
name = "zerocopy"
version = "0.8.47"
version = "0.8.48"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "efbb2a062be311f2ba113ce66f697a4dc589f85e78a4aea276200804cea0ed87"
checksum = "eed437bf9d6692032087e337407a86f04cd8d6a16a37199ed57949d415bd68e9"
dependencies = [
"zerocopy-derive",
]
[[package]]
name = "zerocopy-derive"
version = "0.8.47"
version = "0.8.48"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0e8bc7269b54418e7aeeef514aa68f8690b8c0489a06b0136e5f57c4c5ccab89"
checksum = "70e3cd084b1788766f53af483dd21f93881ff30d7320490ec3ef7526d203bad4"
dependencies = [
"proc-macro2",
"quote",
@@ -3753,18 +3671,18 @@ dependencies = [
[[package]]
name = "zerofrom"
version = "0.1.6"
version = "0.1.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "50cc42e0333e05660c3587f3bf9d0478688e15d870fab3346451ce7f8c9fbea5"
checksum = "69faa1f2a1ea75661980b013019ed6687ed0e83d069bc1114e2cc74c6c04c4df"
dependencies = [
"zerofrom-derive",
]
[[package]]
name = "zerofrom-derive"
version = "0.1.6"
version = "0.1.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d71e5d6e06ab090c67b5e44993ec16b72dcbaabc526db883a360057678b48502"
checksum = "11532158c46691caf0f2593ea8358fed6bbf68a0315e80aae9bd41fbade684a1"
dependencies = [
"proc-macro2",
"quote",
@@ -3780,9 +3698,9 @@ checksum = "b97154e67e32c85465826e8bcc1c59429aaaf107c1e4a9e53c8d8ccd5eff88d0"
[[package]]
name = "zerotrie"
version = "0.2.3"
version = "0.2.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2a59c17a5562d507e4b54960e8569ebee33bee890c70aa3fe7b97e85a9fd7851"
checksum = "0f9152d31db0792fa83f70fb2f83148effb5c1f5b8c7686c3459e361d9bc20bf"
dependencies = [
"displaydoc",
"yoke",
@@ -3791,9 +3709,9 @@ dependencies = [
[[package]]
name = "zerovec"
version = "0.11.5"
version = "0.11.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6c28719294829477f525be0186d13efa9a3c602f7ec202ca9e353d310fb9a002"
checksum = "90f911cbc359ab6af17377d242225f4d75119aec87ea711a880987b18cd7b239"
dependencies = [
"yoke",
"zerofrom",
@@ -3802,9 +3720,9 @@ dependencies = [
[[package]]
name = "zerovec-derive"
version = "0.11.2"
version = "0.11.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "eadce39539ca5cb3985590102671f2567e659fca9666581ad3411d59207951f3"
checksum = "625dc425cab0dca6dc3c3319506e6593dcb08a9f387ea3b284dbd52a92c40555"
dependencies = [
"proc-macro2",
"quote",
@@ -3813,9 +3731,21 @@ dependencies = [
[[package]]
name = "zip"
version = "8.4.0"
version = "0.6.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7756d0206d058333667493c4014f545f4b9603c4330ccd6d9b3f86dcab59f7d9"
checksum = "760394e246e4c28189f19d488c058bf16f564016aefac5d32bb1f3b51d5e9261"
dependencies = [
"byteorder",
"crc32fast",
"crossbeam-utils",
"flate2",
]
[[package]]
name = "zip"
version = "8.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dcab981e19633ebcf0b001ddd37dd802996098bc1864f90b7c5d970ce76c1d59"
dependencies = [
"crc32fast",
"flate2",
@@ -3838,9 +3768,9 @@ checksum = "cb8a0807f7c01457d0379ba880ba6322660448ddebc890ce29bb64da71fb40f9"
[[package]]
name = "zune-jpeg"
version = "0.5.14"
version = "0.5.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0b7a1c0af6e5d8d1363f4994b7a091ccf963d8b694f7da5b0b9cceb82da2c0a6"
checksum = "27bc9d5b815bc103f142aa054f561d9187d191692ec7c2d1e2b4737f8dbd7296"
dependencies = [
"zune-core",
]


@@ -1,6 +1,6 @@
[package]
name = "sgclaw"
version = "0.1.0"
version = "0.1.0-2026.4.9"
edition = "2021"
[dependencies]
@@ -11,10 +11,13 @@ futures-util = "0.3"
hex = "0.4"
hmac = "0.12"
reqwest = { version = "0.12", default-features = false, features = ["blocking", "json", "rustls-tls"] }
regex = "1.10"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
sha2 = "0.10"
thiserror = "1"
tokio = { version = "1", default-features = false, features = ["rt-multi-thread", "macros"] }
tungstenite = "0.29"
uuid = { version = "1", features = ["v4"] }
zip = { version = "0.6.6", default-features = false, features = ["deflate"] }
zeroclaw = { package = "zeroclawlabs", path = "third_party/zeroclaw", default-features = false }


@@ -8,15 +8,15 @@
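## 1. Product Definition
sgClaw is a browser-agent execution kernel embedded in an enterprise browser runtime. Its job is not to replace the whole platform, nor to promise a "fully automated digital employee", but to turn natural-language tasks into controlled browser operations and execute them on pages through the existing browser host.
sgClaw is not a browser-only product in the "browser-agent execution kernel" sense. More precisely, it is a security-hardened zeroclaw runtime embedded in an enterprise browser runtime: zeroclaw provides the agent capabilities themselves, and sgClaw wraps those capabilities in a controlled execution boundary.
After the ZeroClaw refactor, sgClaw's product shape comes down to three things:
Its job is not to replace the whole platform, nor to promise a "fully automated digital employee", but to:
1. Route user tasks into a unified Agent execution entry point.
2. Translate intent into browser commands through the fixed `browser_action` tool.
3. Complete auditable page operations under protocol, domain, and action allowlist constraints.
1. Provide, with zeroclaw at the core, a unified Agent execution entry point plus prompt/skills/memory/tool routing capabilities.
2. Use sgClaw's pipe, HMAC, MAC Policy, and host-side re-validation to narrow the high-risk execution surface into a protected tool surface.
3. In browser scenarios, expose page operations to the runtime as a privileged tool surface, rather than letting the browser define the whole runtime.
The sgClaw in this repository is not a complete frontend product, nor the browser distribution itself, but the product core: "browser Agent Runtime + Pipe protocol + ZeroClaw compatibility layer".
The sgClaw in this repository is not a complete frontend product, nor the browser distribution itself, but the product core: "vendored zeroclaw core + sgClaw security wrapper + browser host integration".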
---
@@ -24,17 +24,20 @@ ZeroClaw 重构之后sgClaw 的产品形态可以概括为三件事:
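### 2.1 Capabilities already in place
- `zeroclaw` is vendored, and the Rust runtime wires in the entry points for core capabilities such as provider, skills, and memory.
- The browser side talks to the Rust process over a STDIO JSON Lines protocol.
- On startup an `init -> init_ack` handshake runs and a session-level HMAC key is established.
- All task input goes through the `submit_task` message.
- The Rust side supports two execution paths:
- With no LLM configured, the repository's built-in planner/fallback logic is used.
- With the `DEEPSEEK_*` environment variables set, it switches to the ZeroClaw compatibility runtime.
- The effective tool surface is narrowed to a single tool: `browser_action`.
- The privileged execution surface that is actually stable today is narrowed to a single tool: `browser_action`.
- Only 4 actions are currently exposed to the model: `click`, `type`, `navigate`, `getText`.
- All browser actions are constrained by the domain and action allowlists in `resources/rules.json`.
- During execution, structured logs and the final task result are sent to the host.
Note: the "browser-first + compat-only" description above is the current implementation state, not the target product definition. The mainline goal remains a "security-hardened zeroclaw runtime", not "an agent that can only drive a browser".
### 2.2 Capabilities explicitly not claimed today
The following items were described at length as plans in older documents, but are not implemented facts in the current repository: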
@@ -45,6 +48,7 @@ ZeroClaw 重构之后sgClaw 的产品形态可以概括为三件事:
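- A complete browser Side Panel product UI.
- All 40+ page actions exposed on the Agent side.
- Production-grade multi-tenancy, an audit backend, or a task orchestration center.
- The definition "the browser is sgClaw's entire runtime".
These capabilities can remain as future extension directions, but should no longer be written into L0-L4 as current state.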
@@ -52,7 +56,7 @@ ZeroClaw 重构之后sgClaw 的产品形态可以概括为三件事:
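## 3. Product Value Proposition
After the ZeroClaw refactor, sgClaw's core value is not "feature stacking" but consolidating previously scattered browser-automation capabilities into a controllable, replaceable, verifiable agent execution base.
After the ZeroClaw refactor, sgClaw's core value is not "feature stacking", nor "plain browser automation", but placing zeroclaw's agent capabilities inside a controllable, replaceable, verifiable secure execution base.
### 3.1 For the business side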
@@ -62,15 +66,15 @@ ZeroClaw 重构后的 sgClaw核心价值不在“功能堆叠”而在于
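### 3.2 For the integration side
- The browser host only has to implement a fixed protocol; it does not need to understand model internals.
- The Agent Runtime can switch implementation strategies while keeping the host protocol.
- The ZeroClaw compatibility layer reserves, on the Rust side, the upgrade path for future models, memory, and tool scheduling.
- The browser host only has to implement a fixed protocol and host-side security re-checks; it does not need to understand model internals.
- The sgClaw Runtime can evolve zeroclaw configuration, skills, and tool policy while keeping the host protocol.
- The browser is just one protected execution surface; future clients should reuse the same runtime rather than start a separate architecture.
### 3.3 For the security side
- Not "the model may operate the browser freely", but "the model may only invoke allowed actions".
- The security boundary is moved up front into the protocol and MAC Policy, rather than left to prompts.
- Domain, action, and HMAC controls together form the minimal trusted execution surface.
- Not "the model may operate the browser freely", but "the runtime may trigger high-risk actions only through an authorized privileged tool surface".
- The security boundary is moved up front into the protocol, MAC Policy, host-side re-validation, and runtime tool policy, rather than left to prompts.
- Domain, action, HMAC, and tool-exposure policy together form the minimal trusted execution surface.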
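The deny-by-default idea behind the domain and action allowlists can be sketched as follows. This is a minimal illustration only: `AllowList`, `Denied`, and `check` are hypothetical names, the real layout of `resources/rules.json` and of `src/security/mac_policy.rs` is not shown in this document, and HMAC verification is deliberately left out.

```rust
use std::collections::HashSet;

/// Why a command was rejected before reaching the browser host.
#[derive(Debug, PartialEq)]
pub enum Denied {
    Action(String),
    Domain(String),
}

/// Hypothetical in-memory form of the allowlists (an assumption, not the
/// repository's actual type).
pub struct AllowList {
    domains: HashSet<String>,
    actions: HashSet<String>,
}

impl AllowList {
    pub fn new(domains: &[&str], actions: &[&str]) -> Self {
        Self {
            // Domains are compared case-insensitively; action names are exact.
            domains: domains.iter().map(|d| d.to_ascii_lowercase()).collect(),
            actions: actions.iter().map(|a| a.to_string()).collect(),
        }
    }

    /// Deny by default: both the action and the target domain must be listed.
    pub fn check(&self, action: &str, domain: &str) -> Result<(), Denied> {
        if !self.actions.contains(action) {
            return Err(Denied::Action(action.to_string()));
        }
        if !self.domains.contains(&domain.to_ascii_lowercase()) {
            return Err(Denied::Domain(domain.to_string()));
        }
        Ok(())
    }
}
```

The point of the sketch is that the decision depends only on the rule sets, never on what the model says it intends to do.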
---
@@ -80,15 +84,16 @@ ZeroClaw 重构后的 sgClaw核心价值不在“功能堆叠”而在于
|---|---|---|
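| Task intake | Implemented | Receives the `submit_task` instruction sent by the browser host |
| Protocol handshake | Implemented | Unified version, session identifier, HMAC seed exchange |
| Agent execution | Implemented | planner fallback and ZeroClaw compat coexist |
| Browser tool | Implemented | Single `browser_action` tool |
| zeroclaw Core | Wired in but not fully released | The capability core for prompt, skills, memory, and provider routing is already in the repository |
| Agent execution | Implemented but still transitional | planner fallback and browser-first ZeroClaw compat coexist |
| Browser privileged tool surface | Implemented | Exposes protected browser execution through `browser_action` |
| Core actions | Implemented | `click/type/navigate/getText` |
| Domain allowlist | Implemented | Only domains in the rules file are allowed |
| Action allowlist | Implemented | Only actions in the rules file are allowed |
| Structured logs | Implemented | `log_entry` and `task_complete` are sent back |
| Extended action enum | Reserved | The protocol enum is defined but not exposed by default |
| Skill engine | Not implemented independently | Only a semantic entry point that "can be extended via tools and prompts" is kept today |
| MCP ecosystem | Not enabled on the main path | The ZeroClaw compatibility layer reserves a place for it |
| Skill system | Vendored but only partly used at runtime | Should reuse zeroclaw's native mechanism rather than a separate browser-specific skill system |
| MCP ecosystem | Not enabled on the main path | May be integrated later, but must still be subject to the sgClaw security wrapper |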
---
@@ -109,13 +114,13 @@ ZeroClaw 重构后的 sgClaw核心价值不在“功能堆叠”而在于
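When page element locator rules are well defined, the system can combine `click` and `type` to fill forms, click buttons, and perform simple submissions.
Whether a full business flow can be covered depends on whether the browser host provides the corresponding pages, selectors, and response data, not on a document-level assumption that "every flow can run end to end".
### 5.3 As the Agent execution core of a larger product
### 5.3 As the secure execution core of a larger product
sgClaw is better understood as an execution core inside a product base:
- The upper layer can plug in a task input box, an approval entry point, or a business orchestrator.
- The lower layer executes through the existing browser control surface.
- In between, sgClaw connects natural language to browser actions.
- The lower layer can execute through the existing browser control surface, and may later connect other protected tool surfaces.
- In between, sgClaw connects the zeroclaw runtime to external controlled execution surfaces.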
---


@@ -1,32 +1,43 @@
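# L1: System Architecture and Security Model Layer
**Document version**: 2.0
**Applies to**: sgClaw (ZeroClaw refactor)
**Date**: 2026-03-26
**Document version**: 2.1<br>
**Applies to**: sgClaw (ZeroClaw refactor)<br>
**Date**: 2026-03-29
---
## 1. Architecture Overview
The key points of the refactored sgClaw architecture are simple: the browser host executes on pages, the Rust process interprets tasks and orchestrates the protocol, and ZeroClaw is wired into the Rust side as a compatibility runtime rather than replacing the whole system.
The refactored sgClaw architecture should be understood as follows: `host` (SuperRPA) is the protected host security boundary, `sgClaw` is the runtime capability orchestration layer, `zeroclaw` is the capability core for planner / model / skill, and the `frontend bundle` is only a presentation surface. The current code has not fully reached this target structure, but the mainline architecture terminology must be unified first.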
```
┌──────────────────────────────┐
│ Browser Host / Chromium Side │
│ - Read launch config │
│ - Start sgClaw subprocess │
│ - Send init / submit_task │
│ - Run command, send response
│ - Recheck HMAC / domain / params
│ - Run browser command, reply │
│ - Select frontend bundle │
└──────────────┬───────────────┘
│ STDIO + JSON Line
┌──────────────▼───────────────┐
│ sgClaw Rust Runtime
│ sgClaw Security Envelope
│ - Handshake and message loop │
│ - MAC Policy │
│ - BrowserPipeTool │
│ - Planner fallback
│ - ZeroClaw compat runtime
│ - Runtime / Tool Policy
│ - Config Adaptation
└──────────────┬───────────────┘
Provider API / Local Config
zeroclaw APIs / Local Config
┌──────────────▼───────────────┐
│ ZeroClaw Core Runtime │
│ - Prompt Builder │
│ - Skills / Memory │
│ - Tool Loop / Routing │
│ - Provider Dispatch │
└──────────────┬───────────────┘
│ Provider API / Optional Tools
┌──────────────▼───────────────┐
│ Model Provider │
│ - DeepSeek/OpenAI-compatible │
@@ -34,23 +45,25 @@
└──────────────────────────────┘
```
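The most important architectural change: the current system is not a "complete ZeroClaw product", but "ZeroClaw introduced as a compatible execution kernel while keeping the existing browser protocol".
The most important architectural change: sgClaw should not be defined as a "browser-specific agent", but as "a runtime distribution that wraps zeroclaw securely while keeping the existing browser protocol". At the same time, `host` no longer holds frequently changing business policy; `launch config`, `runtime config`, and `frontend bundle` should become objects that can be replaced at run time.
---
## 2. Runtime Layers
### 2.1 Browser host layer
### 2.1 host (browser host)
The host has three kinds of responsibility:
`host` has four kinds of responsibility inside the protected boundary:
- Read and validate the `launch config` and decide how the process is launched.
- Start and supervise the sgClaw Rust subprocess.
- Send `init`, `submit_task`, and `response` according to the protocol.
- Execute browser commands sent by Rust and reply.
- Select an external `frontend bundle` or built-in resources as the presentation surface.
The sgClaw repository itself contains no Chromium/C++ implementation code, so L1 only defines the host's responsibility boundary and no longer describes assumed file structures from external repositories as "current repository state".
`host` does not own the planner, model routing, skills orchestration, or business policy definitions. The sgClaw repository itself contains no Chromium/C++ implementation code, so L1 only defines the host's responsibility boundary and no longer describes assumed file structures from external repositories as "current repository state".
### 2.2 Rust control layer
### 2.2 sgClaw security/control layer
The Rust side is the factual core of the current repository. Its responsibilities include: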
@@ -58,10 +71,40 @@ Rust 侧是当前仓库的事实主体,职责包括:
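- Complete the handshake, load `rules.json`, and create the `BrowserPipeTool`.
- Receive browser messages in the message loop and dispatch them to the execution layer.
- Send execution logs and task results back to the host.
- Read the `runtime config` and decide runtime behavior such as planner, provider, skills prompt mode, and backend selection.
- Decide which zeroclaw capabilities may be exposed to the current runtime environment.
### 2.3 Execution
### 2.3 zeroclaw core
The execution layer currently has two paths
In the mainline target, zeroclaw should own:
- planner / executor decisions
- prompt/system sections assembly
- skills / memory / routing
- tool loop and provider coordination
- general agent capabilities, not only browser capabilities
In other words, `planner-first` is sgClaw / zeroclaw runtime behavior, not `frontend bundle` or `host` page logic.
### 2.4 Presentation layer (frontend bundle)
The frontend presentation layer has display rights only, not execution rights. Its responsibilities should be limited to:
- Showing session state, logs, messages, and acceptance results.
- Turning user input into events the host can accept.
- Rendering the visual state of `planner-first`, for example "show the plan first, then execute".
The frontend must not decide:
- whether to invoke the planner
- whether to switch provider / model
- whether to bypass sgClaw / zeroclaw execution
These decisions must stay in the sgClaw / zeroclaw runtime.
### 2.5 Transitional state of the current implementation
The current execution layer still has two paths:
1. `planner fallback`
Note: when environment variables such as `DEEPSEEK_API_KEY` are not configured, the repository's built-in lightweight planner is used.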
@@ -69,23 +112,38 @@ Rust 侧是当前仓库的事实主体,职责包括:
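2. `ZeroClaw compat runtime`
Note: once model configuration is provided, [`src/compat/runtime.rs`](/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs) builds the provider, memory, and `browser_action` tool and hands the task to the vendored ZeroClaw Agent.
The coexistence of these two paths is the core reality of the current refactor period. The documentation must preserve this, or it will mislead implementation and integration testing.
These two paths are the current code reality, but neither should be written up as the long-term product definition. The long-term goal is "zeroclaw-first runtime + sgClaw security layer", not browser-only compat.
### 2.6 Runtime configuration boundaries
After this round's architecture freeze, the following unified terms must be used:
- `launch config`
Read by `host`; describes launch-time boundaries such as `binary`, `args`, `env`, `working_dir`, `runtime_config_path`, and `frontend_bundle_dir`.
- `runtime config`
Read by sgClaw; describes runtime behavior such as provider, model, planner mode, backend selection, and skills policy.
- `frontend bundle`
A presentation resource directory loaded by `host`; an external bundle takes priority, with built-in resources as fallback.
All three should be adjustable at run time; only the security boundary and capability exposure are fixed by `host`.
---
## 3. Architectural Significance of the ZeroClaw Refactor
ZeroClaw's role in this project is not "a big all-in-one framework taking over everything", but solving three concrete problems:
ZeroClaw's role in this project is not "a big all-in-one framework taking over everything", nor "wrapped by sgClaw until only a single browser_action tool is left"; it is the system's capability core. On top of it, sgClaw should solve three concrete problems:
- Unify the model Provider abstraction.
- Leave standard extension points for later memory, tool scheduling, and observability.
- Replace the task execution kernel without changing the browser protocol.
- Unify the model Provider abstraction, skills, memory, and the tool loop.
- Constrain high-risk execution to a protected tool surface without changing the browser protocol.
- Make the browser a privileged execution surface, rather than letting the browser define the whole runtime.
The limits of the current compatibility layer must also be stated clearly:
- Only one tool is registered: `browser_action`.
- Only 4 actions are exposed: `click/type/navigate/getText`.
- ZeroClaw's full tool ecosystem is not claimed as an external capability.
- ZeroClaw's full tool ecosystem is not claimed as a current external capability.
These limits are the current implementation state, not principles of the target architecture.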
---
@@ -142,21 +200,38 @@ ZeroClaw 在本项目中的角色不是“大而全框架接管一切”,而
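Reasons:
- Lowest integration cost for the browser host.
- The Rust side can iterate on the planner and ZeroClaw paths independently.
- The Rust side can evolve the zeroclaw runtime and security policy independently without breaking host integration.
- Product docs, tests, and protocol standards can converge on a single contract.
### 5.3 Start with a minimal tool surface, then extend actions
### 5.3 Start with a minimal privileged tool surface, then extend actions
Reasons:
- The most stable actions today are `click/type/navigate/getText`.
- The more actions there are, the harder it is to keep the contract between host and model stable.
- The more actions there are, the harder it is to keep the contract between host and runtime stable.
- While the rules file still exposes only 4 actions, the documentation should not inflate the capability scope ahead of time.
### 5.4 Make launch strategy configurable instead of hard-coding it at compile time
Reasons:
- Replacing the sgClaw binary should not require rebuilding Chromium.
- Switching the `runtime config` should not require rebuilding Chromium.
- Replacing the `frontend bundle` should not require rebuilding Chromium.
- The security boundary is still controlled by `host`; configurability does not mean giving up validation.
### 5.5 planner-first belongs to the runtime only, not the presentation layer
Reasons:
- "Show the plan first, then execute" is execution-engine behavior, not a frontend animation effect.
- The frontend may only display planner state; it must not trigger or skip the planner on its own.
- Acceptance must cover both visual presentation and functional execution.
---
## 6. Architecture Conclusion
At the L1 level sgClaw can be defined as: a browser agent runtime that connects to the host through a fixed browser protocol, uses Rust as the control layer, ZeroClaw as the compatible execution core, and MAC Policy as the minimal security boundary.
At the L1 level sgClaw can be defined as: a security-hardened runtime distribution that connects to `host` through a fixed browser protocol, uses Rust as the security and control layer, zeroclaw as the capability core, `runtime config` to drive execution behavior, the `frontend bundle` as a display-only interface, and MAC Policy plus host re-checks as the protected execution boundary.
This definition is consistent with the current repository implementation and leaves clear boundaries for further extending actions, tools, and the memory system.
This definition both acknowledges that the current repository still has a transitional browser-first compat implementation and leaves clear boundaries for later converging the runtime onto the zeroclaw-first, planner-first mainline.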


@@ -1,8 +1,8 @@
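# L2: Core Modules and Interface Contracts Layer
**Document version**: 2.0
**Applies to**: sgClaw (ZeroClaw refactor)
**Date**: 2026-03-26
**Document version**: 2.1<br>
**Applies to**: sgClaw (ZeroClaw refactor)<br>
**Date**: 2026-03-29
**Audience**: architecture engineers, implementation engineers, integration engineers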
@@ -24,14 +24,28 @@ src/
└── security/
```
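Module boundaries are divided by responsibility into layers:
Module boundaries are divided by responsibility into layers:
| Layer | Module | Responsibility |
|---|---|---|
| Transport layer | `pipe` | Defines messages, handshake, sequence numbers, send/receive, and command waiting |
| Control layer | `lib.rs`, `agent` | Receives tasks, selects the execution path, sends back logs and results |
| Compatibility layer | `compat` | Interfaces with vendored ZeroClaw and exposes a single `browser_action` |
| Security layer | `security`, `resources/rules.json` | Domain and action allowlist control |
| Security layer | `security`, `resources/rules.json` | Domain, action, and pipe command boundary control |
| Runtime entry layer | `lib.rs`, `agent` | Receives tasks, selects the execution path, sends back logs and results |
| Adapter layer | `compat`, `config` | Maps the sgClaw host environment onto the zeroclaw runtime; still carries transitional limits |
| Core capability layer | `third_party/zeroclaw` | prompt, skills, memory, tool loop, provider abstraction |
After this round's freeze, a set of cross-repository unified terms is also needed:
- `host`
The SuperRPA browser host; it owns only the security boundary, process supervision, and presentation assembly.
- `launch config`
The launch description file read by `host`.
- `runtime config`
The runtime policy file read by sgClaw.
- `frontend bundle`
The frontend presentation resources loaded by `host`.
- `planner-first`
The runtime behavior in which sgClaw / zeroclaw produces a plan first, the frontend shows the plan first, and execution follows.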
---
@@ -51,26 +65,27 @@ src/
### 2.2 `src/agent/mod.rs`
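[`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs) decides the execution path:
[`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs) currently decides the execution path:
- On `BrowserMessage::SubmitTask`, it first tries to read `DeepSeekSettings`.
- If environment configuration exists, it goes through `compat::runtime::execute_task`.
- If environment configuration does not exist, it goes through the built-in planner fallback.
- On `BrowserMessage::SubmitTask`, it first tries to read `SgClawSettings`.
- If environment configuration exists, it goes through `compat::runtime::execute_task_with_sgclaw_settings`.
- If environment configuration does not exist, it returns "no large language model configured" directly, rather than routing production submit traffic back to the old planner.
This is the current system's "router".
This is the current system's "router". In the documentation, however, it should be understood as a transitional runtime selector, not the center of the final product architecture. In the long run, `planner-first`, provider selection, and backend selection should all be driven explicitly by the `runtime config`.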
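The newer routing rule described above (settings present: compat runtime; settings absent: report the missing model instead of falling back to the old planner) can be sketched roughly like this. `Settings`, `Route`, and `select_route` are hypothetical stand-ins, not the real `SgClawSettings` or the code in `src/agent/mod.rs`, and treating a blank API key as "not configured" is an added assumption.

```rust
/// Hypothetical stand-in for `SgClawSettings`; the real fields are not shown here.
#[derive(Debug, Clone, PartialEq)]
pub struct Settings {
    pub api_key: String,
    pub model: String,
}

#[derive(Debug, PartialEq)]
pub enum Route {
    /// Hand the task to the zeroclaw compat runtime.
    Compat(Settings),
    /// Report the missing configuration; do not route to the legacy planner.
    NotConfigured(&'static str),
}

/// Pick the execution route for one submitted task.
pub fn select_route(settings: Option<Settings>) -> Route {
    match settings {
        // Assumption: a blank key counts as "no configuration".
        Some(s) if !s.api_key.trim().is_empty() => Route::Compat(s),
        _ => Route::NotConfigured("no large language model configured"),
    }
}
```

Keeping the decision in one small function makes it easy to replace later with a selector driven by the `runtime config`.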
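### 2.3 `src/agent/runtime.rs`
This file keeps the repository's lightweight LLM/tool invocation logic. Key characteristics:
This file should now be treated as a `legacy/dev-only` module that keeps the repository's lightweight LLM/tool invocation logic only for local verification. Key characteristics:
- The tool name is fixed to `browser_action`.
- The schema allows only `click/type/navigate/getText`.
- A `log_entry` is sent before and after every tool call.
- On failure it returns `PipeError::Protocol` directly.
- It does not take part in the current production browser submit routing.
### 2.4 `src/compat/runtime.rs`
[`src/compat/runtime.rs`](/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs) is the key module of the ZeroClaw refactor:
[`src/compat/runtime.rs`](/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs) is the key module for the current ZeroClaw integration:
- Builds the ZeroClaw config.
- Creates the provider.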
@@ -81,11 +96,64 @@ src/
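- The compat layer currently registers only one tool with ZeroClaw.
- `allowed_tools` is narrowed to `browser_action`.
- This means ZeroClaw in this project is a "compatible executor", not a "multi-tool platform".
- This means the current code has not yet made sgClaw a zeroclaw-first runtime.
- This is an implementation limit and should not be elevated by the documentation into a product principle.
### 2.7 host / launch config / runtime config contract
After the cross-repository interfaces are consolidated, the minimal contract between `host` and sgClaw should be:
| Object | Reader | Responsibility | Fallback on failure |
|---|---|---|---|
| `launch config` | `host` | Parse `binary`, `args`, `env`, `working_dir`, `runtime_config_path`, `frontend_bundle_dir` | Fall back to the browser's built-in default launch strategy |
| `runtime config` | sgClaw | Parse provider, model, planner mode, backend, skills policy | Fall back to sgClaw's default runtime configuration |
| `frontend bundle` | `host` | Provide floating-window/panel presentation resources | Fall back to the browser's built-in WebUI resources |
The key point: `host` only validates and assembles; it does not decide planner, model routing, or business behavior.
### 2.8 launch config file paths and fallback rules
The frozen design position is:
- profile-local `launch config` path: `<profile>/superrpa/sgclaw_launch_config.json`
- profile-local `runtime config` path: `<profile>/superrpa/sgclaw_config.json`
- profile-local hooks / rules continue to use:
- `<profile>/superrpa/hooks.json`
- `<profile>/superrpa/rules.json`
The fallback rules for `launch config` must stay stable:
1. Read `<profile>/superrpa/sgclaw_launch_config.json` first
2. If it is missing or fails to parse, fall back to the browser's current built-in default launch parameters
3. If `runtime_config_path` is missing, fall back to `<profile>/superrpa/sgclaw_config.json`
4. If `frontend_bundle_dir` is missing, invalid, or fails validation, fall back to bundled frontend resources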
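The four fallback rules above can be sketched as pure path-resolution functions. This is an illustration under stated assumptions: the host is not part of this repository, so `LaunchConfig`, `FrontendSource`, and the function names are hypothetical, JSON parsing is omitted, and a missing or unparsable launch config is modelled simply as `None`.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical, already-parsed subset of a launch config (field names follow
/// the document; the struct itself is an assumption).
#[derive(Default)]
pub struct LaunchConfig {
    pub runtime_config_path: Option<PathBuf>,
    pub frontend_bundle_dir: Option<PathBuf>,
}

#[derive(Debug, PartialEq)]
pub enum FrontendSource {
    External(PathBuf),
    Bundled,
}

/// Rule 1: the profile-local launch config location.
pub fn launch_config_path(profile: &Path) -> PathBuf {
    profile.join("superrpa").join("sgclaw_launch_config.json")
}

/// Rules 2 and 3: with no launch config, or no `runtime_config_path`,
/// use the profile-local runtime config.
pub fn resolve_runtime_config_path(profile: &Path, launch: Option<&LaunchConfig>) -> PathBuf {
    launch
        .and_then(|cfg| cfg.runtime_config_path.clone())
        .unwrap_or_else(|| profile.join("superrpa").join("sgclaw_config.json"))
}

/// Rule 4: an external bundle is used only when present and validated by the
/// caller-supplied check; otherwise the bundled resources win.
pub fn resolve_frontend(
    launch: Option<&LaunchConfig>,
    is_valid: impl Fn(&Path) -> bool,
) -> FrontendSource {
    match launch.and_then(|cfg| cfg.frontend_bundle_dir.as_deref()) {
        Some(dir) if is_valid(dir) => FrontendSource::External(dir.to_path_buf()),
        _ => FrontendSource::Bundled,
    }
}
```

Passing the validation check in as a closure keeps the resolution logic testable without touching the file system.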
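### 2.9 frontend bundle and planner-first contract
The `frontend bundle` may consume only these runtime states forwarded by `host`:
- current state and logs
- planner output
- session messages
- final execution result
The `frontend bundle` must not directly own:
- the decision to switch provider
- control over the planner switch
- any way to bypass the executor
The `planner-first` contract should therefore be:
1. sgClaw / zeroclaw produces the plan
2. The `frontend bundle` shows the plan first
3. Execution proceeds after the runtime confirms
This is a runtime contract, not a frontend-internal convention.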
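One way to picture the three-step `planner-first` contract is as a small state machine owned by the runtime. The sketch below is purely illustrative: `Stage`, `Event`, and `advance` are hypothetical names and do not correspond to types in the repository.

```rust
/// Where a task stands in the planner-first sequence.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Stage {
    Idle,
    PlanProduced,
    PlanShown,
    Executing,
}

/// Events the runtime observes; only the runtime can confirm execution.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Event {
    RuntimeProducedPlan,
    FrontendShowedPlan,
    RuntimeConfirmed,
}

/// Advance the contract by one step; any out-of-order event is rejected,
/// so the frontend cannot skip the planner or start execution itself.
pub fn advance(stage: Stage, event: Event) -> Result<Stage, &'static str> {
    match (stage, event) {
        (Stage::Idle, Event::RuntimeProducedPlan) => Ok(Stage::PlanProduced),
        (Stage::PlanProduced, Event::FrontendShowedPlan) => Ok(Stage::PlanShown),
        (Stage::PlanShown, Event::RuntimeConfirmed) => Ok(Stage::Executing),
        _ => Err("transition not allowed by the planner-first contract"),
    }
}
```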
### 2.5 `src/pipe/browser_tool.rs`
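This module is responsible for actually sending browser commands:
This module is responsible for actually sending browser commands, and is the most important bridge to the privileged tool surface in the current system:
- Assigns a `seq` to each command.
- Computes the HMAC.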
@@ -93,11 +161,11 @@ src/
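- Blocks waiting for the matching `BrowserMessage::Response`.
- Returns an error on timeout, response mismatch, or validation failure.
It is the most important protocol execution point on the Rust side.
It is the most important protocol execution point on the Rust side. Architecturally it should be understood as a "privileged browser surface adapter", not "the whole runtime itself".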
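The "block until the matching response, or fail" behavior can be sketched with a standard-library channel. This is not the code in `src/pipe/browser_tool.rs`: `Response`, `WaitError`, and `wait_for_response` are hypothetical names, the real module reads from the STDIO pipe rather than an in-process channel, and HMAC computation and verification are left out.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical, simplified response message carrying the echoed sequence number.
#[derive(Debug, PartialEq)]
pub struct Response {
    pub seq: u64,
    pub success: bool,
}

#[derive(Debug, PartialEq)]
pub enum WaitError {
    Timeout,
    Mismatch { expected: u64, got: u64 },
    Disconnected,
}

/// Block for at most `timeout` and accept only the response whose `seq`
/// matches the command that was just sent.
pub fn wait_for_response(
    rx: &Receiver<Response>,
    expected_seq: u64,
    timeout: Duration,
) -> Result<Response, WaitError> {
    match rx.recv_timeout(timeout) {
        Ok(resp) if resp.seq == expected_seq => Ok(resp),
        Ok(resp) => Err(WaitError::Mismatch { expected: expected_seq, got: resp.seq }),
        Err(RecvTimeoutError::Timeout) => Err(WaitError::Timeout),
        Err(RecvTimeoutError::Disconnected) => Err(WaitError::Disconnected),
    }
}
```

Treating a mismatched `seq` as a hard error, instead of silently skipping it, keeps the command and response streams strictly paired.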
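### 2.6 `src/security/mac_policy.rs`
The security policy trusts only the rules file, not model intent.
The security policy trusts only the rules file and the pipe contract, not model intent.
The rules come from [`resources/rules.json`](/home/zyl/projects/sgClaw/claw/resources/rules.json); the current default constraints are:
- Allowed domains: `oa.example.com`, `erp.example.com`, `hr.example.com`, and demo domains.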
@@ -235,6 +303,8 @@ L2 是产品内核视角的契约说明。两者关系如下:
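2. The tool definition in `src/agent/runtime.rs`
3. `parameters_schema` and `parse_action` in `src/compat/browser_tool_adapter.rs`
These three express the exposure scope of the "current privileged browser tool surface", not the long-term capability ceiling of the sgClaw runtime as a whole.
---
## 5. `browser_action` Tool Contract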


@@ -10,32 +10,32 @@
## 1. End-to-End Data Flow
The data flow on the current main path is:
The data flow in the mainline target should be:
```
Browser Host
└─ submit_task
Client Surface
└─ submit_task (+ optional browser context)
sgClaw Transport / Handshake
└─ handle_browser_message
└─ sgClaw security envelope
Execution Path Select
planner fallback
└─ zeroclaw compat runtime
ZeroClaw-first runtime
runtime profile / tool policy
browser_action
AgentMessage::Command
Browser executes action
BrowserMessage::Response
Tool execution
├─ browser_action -> pipe -> browser host -> response
└─ non-browser-safe future surfaces only when policy allows
log_entry / task_complete
```
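This path has no separate Skill executor and no separate task orchestration database.
So the focus of L3 is no longer "describing an idealized agent platform", but explaining the data flow and state machine that actually exist in the current repository.
There is still a transitional gap between the current code and the target above:
- The browser is currently the only mature privileged tool surface.
- The old `planner/runtime` path is still in the repository, but has been reduced to a `legacy/dev-only` helper module.
- `zeroclaw` is vendored, but the runtime does not yet fully release its capabilities in a zeroclaw-first way.
So L3 must both describe the target data flow and state clearly that the current code is still in a transitional consolidation phase.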
---
@@ -52,26 +52,27 @@ log_entry / task_complete
### 2.2 Task intake phase
The host sends:
The browser host currently sends:
```json
{ "type": "submit_task", "instruction": "..." }
```
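After receiving it in [`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs), the Rust side does not execute page commands directly; it first decides which execution path to take.
After receiving it in [`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs), the Rust side should not be understood as "starting web automation right away"; it first decides which runtime/profile the task uses and whether browser context is actually needed.
### 2.3 Execution path selection
### 2.3 Current execution path selection (transitional)
#### Path A: planner fallback
#### Path A: legacy planner/runtime (not the production submit path)
Condition: no usable `DEEPSEEK_*` environment configuration.
Behavior: the repository's built-in planner directly produces a few steps and calls `BrowserPipeTool` for each one.
Condition: used only for dev/test verification or to keep regression coverage.
Behavior: the repository's built-in planner or lightweight runtime directly produces a few steps and calls `BrowserPipeTool` for each one.
Characteristics:
- Fewer dependencies.
- Predictable logic.
- Suited to protocol integration testing and minimal functional verification.
- No longer carries production browser submit traffic.
#### Path B: ZeroClaw compat runtime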
@@ -108,7 +109,7 @@ Rust 侧在 [`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs
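6. Form the observation
Note: `success`, `data`, `aom_snapshot`, and `timing` are used to form the next input or the final result.
This means the interface between "agent behavior" and "browser action execution" has been compressed into a very thin layer, which is the most valuable structural change of the ZeroClaw refactor.
This means the interface between "runtime decisions" and "browser action execution" has been compressed into a very thin layer, which is the most valuable structural change of sgClaw as the zeroclaw security wrapper.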
---
@@ -126,6 +127,7 @@ Rust 侧在 [`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs
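Typical content:
- The action about to be executed.
- The plan preview that sgClaw emits first when `planner_mode=zeroclaw_plan_first`.
- Event summaries translated by the compat runtime.
- Informational hints during execution.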
@@ -144,26 +146,28 @@ Rust 侧在 [`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs
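The premise of "L3 is the soul" is not to make Skill ever more mystical, but to state clearly what Skill really means at the current stage.
### 5.1 There is currently no separate Skill engine
### 5.1 No Skill engine separate from zeroclaw should be designed any more
The current repository has no separate:
In the current repository, Skill should no longer be understood as a browser-specific add-on subsystem. sgClaw has already vendored zeroclaw, and its built-in skill system is the mainline. Note that the current runtime still underuses it.
Parts the current code still lacks or has not fully released include:
- Skill script directory loading flow
- Skill registry
- Skill sandbox executor
- Main path for Skill version and signature verification
So Skill can no longer be described as a subsystem already in place.
So the documentation can no longer describe Skill as "a separate engine on the browser side"; it should be described as "a capability layer that should reuse the zeroclaw-native mechanism, with the current implementation still being consolidated".
### 5.2 Skill semantics that can be kept today
In the ZeroClaw refactor, the more accurate meaning of Skill is:
In sgClaw's mainline architecture, the more accurate meaning of Skill is:
- A forward-looking abstraction of "reusable task patterns".
- It may be carried by prompts, templates, preset tool combinations, or planner rules.
- Ultimately it must still land on the unified `browser_action` contract.
- Reusable task patterns, prompt conventions, and callable tool combinations managed by zeroclaw.
- In compact/full mode, it enters the system prompt or is read on demand through `read_skill`.
- When a task needs the browser, it can ultimately land on the unified `browser_action` contract; when a task does not need the browser, it should not be forced through the browser.
In other words, Skill today is not a runtime directory but an abstraction language between the product and the execution layer.
In other words, Skill is not an alias for "browser script directory" but a part of the zeroclaw runtime.
### 5.3 Skill evolution constraints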
@@ -179,13 +183,23 @@ Rust 侧在 [`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs
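### 6.1 Configuration
The key configuration that actually takes part in execution comes from [`src/config/settings.rs`](/home/zyl/projects/sgClaw/claw/src/config/settings.rs):
The key configuration that actually takes part in execution comes from [`src/config/settings.rs`](/home/zyl/projects/sgClaw/claw/src/config/settings.rs). It is no longer just a single `DEEPSEEK_*` shim; it is starting to carry sgClaw's own runtime policy: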
- `DEEPSEEK_API_KEY`
- `DEEPSEEK_BASE_URL`
- `DEEPSEEK_MODEL`
- `providers` / `active_provider`
- `planner_mode`
- `browser_backend`
- `office_backend`
- `skills_prompt_mode`
- `runtime_profile`
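These settings decide whether the compat runtime is enabled and how model requests are routed.
The current default semantics are:
- When `providers` is empty, the old single-model DeepSeek configuration `apiKey/baseUrl/model` is still supported.
- When `planner_mode=zeroclaw_plan_first`, sgClaw sends a displayable plan preview to the host before real execution; the frontend only renders it.
- When `browser_backend=superrpa`, privileged browser actions are still bounded by the host pipe; sgClaw decides only runtime policy and does not move privilege up into the frontend.
- When `office_backend=openxml`, the sgClaw runtime still selects the actual export tool for export tasks.
The goal of this configuration is clear: bring model switching, planner policy, and runtime backend selection back into sgClaw itself, rather than leaving them scattered across SuperRPA compile-time constants or frontend logic.
### 6.2 Memory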
@@ -199,10 +213,10 @@ ZeroClaw compat 路径中已经接入 memory adapter但在产品能力层面
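## 7. L3 Conclusion
The core of L3 is not "putting every future capability into one grand data-flow diagram", but explaining how the current system compresses natural-language tasks into verifiable, response-returning, controllable browser actions.
The core of L3 is not "putting every future capability into one grand data-flow diagram", nor "interpreting every task as browser actions", but explaining how sgClaw first hands the task to the zeroclaw runtime and then compresses the part that needs high-risk external execution into verifiable, response-returning, controllable browser actions.
The soul of the refactor comes down to three points:
- A unified task entry point.
- A unified action contract.
- The execution path is replaceable, but the protocol and security boundary do not change.
- The runtime core is unified on zeroclaw.
- The privileged tool surface is replaceable, but the protocol and security boundary do not change.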


@@ -1,8 +1,8 @@
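# L4: Engineering Implementation and Deployment Topology Layer
**Document version**: 2.0
**Applies to**: sgClaw (ZeroClaw refactor)
**Date**: 2026-03-26
**Document version**: 2.1<br>
**Applies to**: sgClaw (ZeroClaw refactor)<br>
**Date**: 2026-03-29
**Audience**: developers, test engineers, integration engineers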
@@ -44,7 +44,7 @@ claw/
└── archive/
```
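From an engineering standpoint, `third_party/zeroclaw` should be understood as a "vendored compatibility dependency", not a separately maintained sibling project.
From an engineering standpoint, `third_party/zeroclaw` should be understood as the "vendored capability core", not a separately maintained sibling project, and not an auxiliary dependency used only for compatibility.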
---
@@ -83,10 +83,15 @@ claw/
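Responsibilities:
- Decide between fallback and compat execution.
- Map the unified tool contract onto the browser protocol.
- Currently decide between fallback and compat execution.
- Map the protected browser tool contract onto the browser protocol.
- Bridge between ZeroClaw turn events and host logs.
Notes:
- `src/agent/runtime.rs` and `src/agent/planner.rs` are transitional lightweight paths and should no longer be described as the long-term product mainline.
- The mainline target should be "sgClaw security layer + zeroclaw core runtime", not a long-lived browser-only compat fork.
### 2.4 Security and configuration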
- [`src/security/mac_policy.rs`](/home/zyl/projects/sgClaw/claw/src/security/mac_policy.rs)
@@ -96,7 +101,7 @@ claw/
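Responsibilities:
- Maintain the runtime security boundary.
- Read provider configuration from environment variables.
- Read runtime configuration such as provider / skills, and gradually converge on a zeroclaw-first configuration model.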
---
@@ -130,7 +135,7 @@ sgClaw 不是独立交互式 CLI 产品,正常运行前提是:
### 3.3 Model configuration
Key environment variables that enable the ZeroClaw compat runtime:
Key environment variables that currently enable the ZeroClaw compat runtime:
```bash
DEEPSEEK_API_KEY=...
@@ -138,7 +143,70 @@ DEEPSEEK_BASE_URL=...
DEEPSEEK_MODEL=...
```
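If these variables are missing or incomplete, the system falls back to planner fallback.
If these variables are missing or incomplete, the system falls back to planner fallback. This behavior is the current implementation state, not a long-term architectural recommendation.
### 3.4 runtime config file
When `host` launches `sgclaw` with `--config-path=<workspace_root>/sgclaw_config.json`, `sgclaw` reads that JSON file itself instead of requiring the host to copy skills separately.
Key fields currently supported: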
```json
{
"apiKey": "sk-...",
"baseUrl": "https://api.deepseek.com",
"model": "deepseek-chat",
"skillsDir": "skill_lib"
}
```
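Notes:
- `skillsDir` may be omitted.
- If omitted, `<workspace_root>/.sgclaw-zeroclaw-workspace/skills` is used by default.
- If it is a relative path, it is resolved relative to the directory containing `sgclaw_config.json`.
- If it points to the root of a skill repo that has a `skills/` subdirectory, the runtime automatically uses that `skills/` directory.
- So `host` only has to pass the `runtime config` path; the skill lookup strategy is controlled by `sgclaw` itself.
- In the long run this file should express zeroclaw-first, `planner-first` runtime/profile/tool policy configuration, not just a provider shim.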
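The `skillsDir` lookup rules listed above can be sketched as a single resolution function. This is a hedged illustration, not sgClaw's actual implementation: `resolve_skills_dir` is a hypothetical name, and the directory check is injected as a closure so the logic can be exercised without a real file system.

```rust
use std::path::{Path, PathBuf};

/// Resolve the effective skills directory from the optional `skillsDir` value.
/// `is_dir` is a caller-supplied directory check (an assumption made for
/// testability; production code would typically pass `|p| p.is_dir()`).
pub fn resolve_skills_dir(
    config_path: &Path,
    workspace_root: &Path,
    skills_dir: Option<&str>,
    is_dir: impl Fn(&Path) -> bool,
) -> PathBuf {
    // Omitted: fall back to the default workspace location.
    let Some(raw) = skills_dir else {
        return workspace_root.join(".sgclaw-zeroclaw-workspace").join("skills");
    };
    let raw = Path::new(raw);
    // Relative paths are resolved against the directory holding the config file.
    let base = if raw.is_absolute() {
        raw.to_path_buf()
    } else {
        config_path.parent().unwrap_or(Path::new(".")).join(raw)
    };
    // A skill repo root with a `skills/` subdirectory resolves to that subdirectory.
    let nested = base.join("skills");
    if is_dir(&nested) { nested } else { base }
}
```

With the sample config shown earlier (`"skillsDir": "skill_lib"` next to `sgclaw_config.json`), this resolves to `<workspace_root>/skill_lib`, or to `<workspace_root>/skill_lib/skills` when that subdirectory exists.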
### 3.5 launch config file and fallback
The `launch config` is read by `host`, not parsed by sgClaw itself. The recommended path after the design freeze is:
```text
<profile>/superrpa/sgclaw_launch_config.json
```
The fields this file carries should include:
- `binary`
- `args`
- `env`
- `working_dir`
- `runtime_config_path`
- `frontend_bundle_dir`
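The loading rules must stay stable:
1. `host` reads the profile-local `launch config` first
2. If `binary` is missing or invalid, fall back to the default sgClaw path the browser knows it can launch
3. If `runtime_config_path` is missing, fall back to `<profile>/superrpa/sgclaw_config.json`
4. If `frontend_bundle_dir` is missing or invalid, fall back to bundled frontend resources
The purpose is not to weaken host control, but to turn frequently changing items from compile-time constants into objects that can be replaced at run time.
### 3.6 frontend bundle loading topology
The `frontend bundle` should be deployed as "external bundle first, built-in resources as fallback":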
```text
host
├─ validate frontend_bundle_dir
├─ if valid: load external frontend bundle
└─ else: load bundled frontend resources
```
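This means later changes to the floating-window UI, acceptance pages, or planner display logic should no longer require rebuilding Chromium by default.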
---
@@ -165,7 +233,7 @@ DEEPSEEK_MODEL=...
cargo test
```
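This test suite expresses an important engineering fact: the stable core of the current system is the protocol, runtime selection, and compat adaptation, not the old frontend verification page.
This test suite expresses an important engineering fact: the stable core of the current system is the protocol, security boundary, runtime selection, and zeroclaw integration, not the old frontend verification page.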
---
@@ -179,9 +247,11 @@ cargo test
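### 5.2 What the external host is responsible for
- Reading and validating the `launch config`.
- Launching and supervising the sgClaw process.
- Providing page execution capability.
- Implementing command execution, response return, and host-side validation.
- Loading the `frontend bundle`, falling back to built-in resources when it is invalid.
### 5.3 Content not delivered in this repository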
@@ -199,13 +269,17 @@ L4 的工程边界必须按仓库现实写清楚,否则会把“外部依赖
```
Browser Host Process
├─ reads launch config
├─ launches sgclaw binary
├─ writes init / submit_task to stdin
├─ reads command / log / task_complete from stdout
└─ executes page actions in host environment
├─ executes page actions in host environment
└─ loads external frontend bundle or bundled resources
sgclaw binary
├─ loads runtime config
├─ loads resources/rules.json
├─ runs planner-first execution
├─ verifies action/domain
├─ optionally calls provider API
└─ waits for browser response
@@ -220,4 +294,5 @@ sgclaw binary
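The core conclusions at the L4 level are just two:
1. This repository has been consolidated from a "mixed directory with demo pages" into a kernel repository of "Rust Runtime + protocol docs + tests".
2. The engineering focus after the ZeroClaw refactor is ensuring that the compat runtime, the fallback runtime, and the browser protocol all work on the same contract.
2. The engineering focus after the ZeroClaw refactor is consolidating the engineering shape from "browser-first compat" into "zeroclaw-first runtime + sgClaw security envelope" while keeping the browser protocol stable.
3. `host`, `launch config`, `runtime config`, `frontend bundle`, and `planner-first` must use the same terminology across docs, code, and acceptance, to avoid moving frontend logic outside sgClaw again.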


@@ -5,24 +5,25 @@
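- Goal: provide an actionable engineering refactoring path and implementation record
## 1. Conclusion (conclusion first)
The project has at least two main prompt construction chains:
The project currently has at least two main prompt construction chains, but only one authoritative chain can remain on the long-term mainline:
1) **Lightweight runtime chain** (`src/agent/runtime.rs`)
1) **Lightweight runtime chain** (`src/agent/runtime.rs`, transitional)
- Has only a very basic fixed system prompt.
- Suited to local/minimal execution scenarios outside the full flow.
- Should not be extended further into the mainline product prompt system.
2) **ZeroClaw main chain** (`third_party/zeroclaw/*`)
- This chain is the body of the "system prompt" and consists of:
- The structured builder inside `Agent` (`SystemPromptBuilder`)
- Unified string assembly on the `channels` side
- Multiple injection sources such as skills / personality / identity / bootstrap files / tool descriptions
- This is also the main security surface you need to watch.
- This is also the main security surface you need to watch, and the only mainline that should be kept in the future.
---
## 2. Prompt Distribution Structure (by file/module)
### 2.1 Fixed system prompt (lightweight chain)
### 2.1 Fixed system prompt (lightweight chain, to be consolidated)
- `src/agent/runtime.rs`
- `ChatMessage { role: "system" ... }` in `execute_task_with_provider`
- Current content: `You are sgClaw. Use browser_action to complete the browser task.`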
@@ -108,6 +109,10 @@
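## 4. Security Refactoring Recommendations (by priority)
### P0 (recommended immediately)
0) Consolidate the two main chains
- `sgclaw` should not maintain both a lightweight custom prompt chain and a zeroclaw main chain in the long run.
- Goal: keep the zeroclaw main chain; sgClaw only adds a security summary, browser context, and a description of the controlled tool surface.
1) Wire in `PromptGuard`
- `third_party/zeroclaw/src/security/prompt_guard.rs` already exists
- Add scanning, with truncation/alerting, at the following entry points: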
@@ -148,7 +153,8 @@
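## 5. "Key risks" confirmed in this round
- `PromptGuard` is not yet mounted uniformly at the main entry points (the detection capability exists, but no mandatory interception chain has been formed).
- workspace/skills content can enter the prompt directly, so the injection surface is wide.
- The two system-prompt construction chains (agent builder and channel builder) differ in wording and need to be unified.
- Two system-prompt construction chains (the lightweight chain and the zeroclaw main chain) exist at the same time, which easily leads to security-policy drift.
- If `sgclaw` keeps placing browser-specific prompt patches outside the main chain, it will recreate a third prompt source.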
---
@@ -161,4 +167,3 @@
- Reproduce prompt-override attacks
- System prompt repetition/continuation scenarios (seed/reseed)
- Compare the compact and full skill-injection modes


@@ -10,6 +10,16 @@
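- `L5-提示词分布与安全改造方案.md`: prompt governance and risk-control hardening strategy.
- `浏览器对接标准.md`: the protocol baseline for Rust-to-Chromium integration.
Current supplementary runtime conventions:
- In mainline terms, `sgclaw` should be understood as "a security-hardened zeroclaw runtime", not a browser-side UI product and not a browser-only agent.
- The browser host and the pipe protocol provide a protected execution surface; the browser is an important tool and front-end carrier, but it does not define the whole runtime.
- When documents mention `planner fallback`, `compat runtime`, or `browser_action only`, read them first as "current implementation state / transitional state", not as long-term product boundaries.
- The browser-side configuration file for `sgclaw` is `sgclaw_config.json`.
- `sgclaw` is responsible for parsing the model configuration and the optional `skillsDir`.
- When `skillsDir` is not set, it falls back to `<workspace_root>/.sgclaw-zeroclaw-workspace/skills` by default.
- When `skillsDir` is set, it may point directly at a skills directory, or at the root of a skill repo that contains a `skills/` subdirectory.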
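The `skillsDir` fallback rules can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: `resolve_skills_dir` and the injected `has_skills_subdir` predicate are hypothetical names, and detecting a skill repo root is assumed to be a plain "does `skills/` exist" test.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: picks the effective skills directory.
/// `has_skills_subdir` is injected so the rule can be exercised
/// without touching the real filesystem.
fn resolve_skills_dir(
    workspace_root: &Path,
    skills_dir: Option<&Path>,
    has_skills_subdir: impl Fn(&Path) -> bool,
) -> PathBuf {
    match skills_dir {
        // Not configured: fall back to the workspace default.
        None => workspace_root
            .join(".sgclaw-zeroclaw-workspace")
            .join("skills"),
        // Configured as a skill repo root: descend into `skills/`.
        Some(dir) if has_skills_subdir(dir) => dir.join("skills"),
        // Configured as the skills directory itself.
        Some(dir) => dir.to_path_buf(),
    }
}
```

In real use the predicate could be something like `|p| p.join("skills").is_dir()`; injecting it keeps the decision logic separate from I/O.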
## Archived documents
### Project management and scheduling (archived)

File diff suppressed because it is too large.


@@ -0,0 +1,145 @@
# sgBrowser websocket probe transcript
Endpoint: `ws://127.0.0.1:12345`
Timeout: `1500ms`
Cargo target dir override: `D:/data/ideaSpace/rust/sgClaw/claw-new/target_task4`
## baseline-open
```text
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baseline-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.95s
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'baseline-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
STEP 1 baseline-open
SEND: ["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
RECV: Welcome! You are client #1
OUTCOME: received
```
## open-agent
```text
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "post-open-agent-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.98s
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'open-agent::["about:blank","sgOpenAgent"]' --step 'post-open-agent-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
STEP 1 open-agent
SEND: ["about:blank","sgOpenAgent"]
RECV: Welcome! You are client #1
OUTCOME: received
STEP 2 post-open-agent-open
SEND: ["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
RECV: <none>
OUTCOME: timeout
```
## set-auth
```text
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "set-auth::[\"about:blank\",\"sgSetAuthInfo\",\"probe-user\",\"probe-token\"]" --step "post-set-auth-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.95s
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'set-auth::["about:blank","sgSetAuthInfo","probe-user","probe-token"]' --step 'post-set-auth-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
STEP 1 set-auth
SEND: ["about:blank","sgSetAuthInfo","probe-user","probe-token"]
RECV: Welcome! You are client #1
OUTCOME: received
STEP 2 post-set-auth-open
SEND: ["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
RECV: <none>
OUTCOME: timeout
```
## browser-login
```text
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step 'browser-login::["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]' --step 'post-browser-login-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.95s
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'browser-login::["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]' --step 'post-browser-login-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
STEP 1 browser-login
SEND: ["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]
RECV: Welcome! You are client #1
OUTCOME: received
STEP 2 post-browser-login-open
SEND: ["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
RECV: <none>
OUTCOME: timeout
```
## active-tab
```text
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "post-active-tab-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.95s
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'active-tab::["about:blank","sgBrowerserActiveTab","https://www.zhihu.com/hot","probeCallback"]' --step 'post-active-tab-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
STEP 1 active-tab
SEND: ["about:blank","sgBrowerserActiveTab","https://www.zhihu.com/hot","probeCallback"]
RECV: Welcome! You are client #1
OUTCOME: received
STEP 2 post-active-tab-open
SEND: ["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
RECV: <none>
OUTCOME: timeout
```
## combined-bootstrap
```text
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "combined-open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "combined-active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "combined-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.95s
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'combined-open-agent::["about:blank","sgOpenAgent"]' --step 'combined-active-tab::["about:blank","sgBrowerserActiveTab","https://www.zhihu.com/hot","probeCallback"]' --step 'combined-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
STEP 1 combined-open-agent
SEND: ["about:blank","sgOpenAgent"]
RECV: Welcome! You are client #1
OUTCOME: received
STEP 2 combined-active-tab
SEND: ["about:blank","sgBrowerserActiveTab","https://www.zhihu.com/hot","probeCallback"]
RECV: <none>
OUTCOME: timeout
STEP 3 combined-open
SEND: ["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
RECV: <none>
OUTCOME: timeout
```
## requesturl-variants
```text
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "target-as-requesturl::[\"https://www.zhihu.com/hot\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.94s
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'target-as-requesturl::["https://www.zhihu.com/hot","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
STEP 1 target-as-requesturl
SEND: ["https://www.zhihu.com/hot","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
RECV: Welcome! You are client #1
OUTCOME: received
```
```text
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baidu-requesturl::[\"https://www.baidu.com\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.94s
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'baidu-requesturl::["https://www.baidu.com","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
STEP 1 baidu-requesturl
SEND: ["https://www.baidu.com","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
RECV: Welcome! You are client #1
OUTCOME: received
```
| Sequence | Sent frames | First reply | Final outcome | Decision signal |
| --- | --- | --- | --- | --- |
| baseline-open | `["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | received only welcome banner; no numeric status or callback frame captured | does not satisfy Option A rule |
| open-agent | `["about:blank","sgOpenAgent"]` then `["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | step 2 timed out with no reply | does not satisfy Option A rule |
| set-auth | `["about:blank","sgSetAuthInfo","probe-user","probe-token"]` then `["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | step 2 timed out with no reply | does not satisfy Option A rule |
| browser-login | `["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]` then `["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | step 2 timed out with no reply | does not satisfy Option A rule |
| active-tab | `["about:blank","sgBrowerserActiveTab","https://www.zhihu.com/hot","probeCallback"]` then `["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | step 2 timed out with no reply | does not satisfy Option A rule |
| combined-bootstrap | `["about:blank","sgOpenAgent"]` then `["about:blank","sgBrowerserActiveTab","https://www.zhihu.com/hot","probeCallback"]` then `["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | steps 2 and 3 timed out with no reply | does not satisfy Option A rule |
| requesturl-variants | `["https://www.zhihu.com/hot","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` and `["https://www.baidu.com","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | each one-shot run received only the welcome banner; no numeric status or callback frame captured | does not satisfy Option A rule |
## Final decision
**Option B wins.**
Reason: the strict rule says Option A wins only if at least one sequence reproducibly yields real numeric status and/or callback frames for a real business action. Across the full required matrix, the reachable endpoint consistently returned only the websocket welcome banner on the first reply for each fresh connection, and every follow-on business-action step either timed out or produced no numeric status/callback frame. Therefore the evidence does not validate a raw-websocket bootstrap contract, so Option B is the required outcome.
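The decision rule can be expressed as a small check over the captured replies. The sketch below is an assumption-laden illustration: the transcript never shows what a numeric status or callback frame looks like, so treating "a frame that parses as an integer" as a numeric status and "a frame mentioning `probeCallback`" as a callback is hypothetical, as are the names `Reply`, `classify`, and `option_a_wins`. It also ignores the reproducibility requirement and only looks at a single run.

```rust
#[derive(Debug, PartialEq)]
enum Reply {
    Banner,
    NumericStatus,
    Callback,
    Other,
}

/// Classifies one received frame (format assumptions noted above).
fn classify(frame: &str) -> Reply {
    let text = frame.trim();
    if text.starts_with("Welcome!") {
        Reply::Banner
    } else if text.parse::<i64>().is_ok() {
        Reply::NumericStatus
    } else if text.contains("probeCallback") {
        Reply::Callback
    } else {
        Reply::Other
    }
}

/// Each sequence is the list of replies per step; `None` is a timeout.
/// Option A wins only if some sequence shows a status or callback frame.
fn option_a_wins(sequences: &[Vec<Option<&str>>]) -> bool {
    sequences.iter().any(|steps| {
        steps
            .iter()
            .flatten()
            .any(|frame| matches!(classify(frame), Reply::NumericStatus | Reply::Callback))
    })
}
```

Fed with the matrix recorded here (welcome banners and timeouts only), `option_a_wins` returns `false`, which corresponds to the Option B outcome.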


@@ -0,0 +1,89 @@
# Zhihu Hotlist Excel Acceptance
- Date: 2026-03-30 03:46:51 +0800
- Mode: real provider + live Zhihu hotlist API + simulated browser pipe
- Workspace: `/tmp/sgclaw-live-acceptance-85j8m_dq`
- Final success: `True`
- Total score: `90/100`
## Rubric
- skill selection: `30/30`
- tool discipline: `25/25`
- hotlist data correctness: `20/20`
- xlsx export success: `20/20`
- final response quality: `5/5`
## Final Output
- exported_path: `/tmp/sgclaw-live-acceptance-85j8m_dq/.sgclaw-zeroclaw-workspace/out/zhihu-hotlist-1774813602601179684.xlsx`
- final_summary: `完美!我已经成功完成了知乎热榜数据的读取和导出。以下是任务完成总结:
## 任务完成情况
1. **导航到知乎热榜页面**成功从知乎首页导航到热榜页面https://www.zhihu.com/hot
2. **提取热榜数据**通过浏览器获取了当前热榜前10条数据包括
- 排名1-10
- 标题(完整问题描述)
- 热度值(以"万"为单位)
3. **导出Excel文件**成功将结构化数据导出为Excel文件包含以下列
- rank排名
- title标题
- heat热度
## 导出的数据内容
热榜前10条话题涵盖了多个领域
- 娱乐事件(李荣浩与单依纯的演唱会授权争议)
- 体育赛事张雪机车WSBK夺冠
- 国际政治经济(日本外储操作、伊朗冲突)
- 科技产业字节跳动发展、DDR5内存价格
- 社会现象(周末情侣模式、学霸家长低调现象)
- 生活安全(灭火毯安全问题)
## 生成的文件
**Excel文件路径**`/tmp/sgclaw-live-acceptance-85j8m_dq/.sgclaw-zeroclaw-workspace/out/zhihu-hotlist-1774813602601179684.xlsx`
该文件包含了2026年3月30日03:46:02采集的知乎热榜数据您可以使用Excel或其他电子表格软件打开查看完整的热榜信息。`
## Skill Logs
- `sgclaw runtime version=0.1.0 protocol=1.0`
- `DeepSeek config loaded from /tmp/sgclaw-live-acceptance-85j8m_dq/sgclaw_config.json model=deepseek-chat base_url=https://api.deepseek.com`
- `skills dir resolved to /home/zyl/projects/sgClaw/skill_lib/skills`
- `runtime profile=BrowserAttached skills_prompt_mode=Compact`
- `zeroclaw_process_message_primary`
- `先规划再执行知乎热榜 Excel 导出
navigate https://www.zhihu.com/hot
getText main
call openxml_office
return generated local .xlsx path`
- `loaded skills: office-export-xlsx@0.1.0, zhihu-hotlist@0.1.0, zhihu-hotlist-screen@0.1.0, zhihu-navigate@0.1.0, zhihu-write@0.1.0`
- `navigate https://www.zhihu.com/hot`
- `call zhihu-hotlist.extract_hotlist`
- `browser script failed: {"unsupported_action":"eval"}`
- `getText body`
- `getText .HotList-list`
- `call openxml_office`
- `unsupported columns: expected [rank, title, heat]`
- `call openxml_office`
## Live Hotlist Sample
- 1. 李荣浩摆证据 4 连质问单依纯,为什么没有授权的歌曲也能放进演唱会?演唱会筹备中可能出了什么问题? | 1220万
- 2. 如何看待张雪机车在 2026 年 WSBK 葡萄牙站夺冠?这对国内的摩托赛事发展有什么影响? | 370万
- 3. 日本拟动用外储做空国际原油,以挽救日元汇率,对此你怎么看,其会重演 96 年「住友铜事件」么? | 356万
- 4. 字节跳动是怎么短短数年就能单挑所有互联网巨头的? | 277万
- 5. 如何看待张雪机车 820rr 拿下 wsbk 葡萄牙站第一回合冠军?这个冠军含金量如何? | 241万
- 6. 伊朗科技大学遭袭后,伊朗将美以大学列为「合法袭击目标」,如果战争扩大到教育机构,冲突还有回头路吗? | 202万
- 7. 「周末情侣」模式日渐兴起,工作日通过消息视频联系,仅周末相聚,如何看待这种模式?你有过类似的经历吗? | 163万
- 8. 男孩玩灭火毯全身扎满超细玻璃纤维,又痒又痛取不出来,灭火毯为什么会「扎人」?怎么处理才不遭罪? | 158万
- 9. DDR5 内存价格 3 月出现明显下降,请问这是短期现象,还是内存供需紧张真的缓和了? | 151万
- 10. 为什么越厉害的学霸,她们家长越低调?从来不在朋友圈晒孩子成绩? | 139万
## Stderr
- `sgclaw ready: agent_id=4b984e63-3254-4518-a75a-127e7dad6474`


@@ -0,0 +1,422 @@
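# collect_lineloss.js: complete troubleshooting record, from generation to working
This document records the full troubleshooting process that took the `tq-lineloss-report` skill script from its initial generation to a working state, including each error encountered, its root-cause analysis, and the fix. It can serve as a troubleshooting template for developing similar skills.
---
## Background
### Architecture overview
```
User input "兰州公司 月累计 2026-03。。。"
sgClaw Rust process
├── parse the instruction → DeterministicExecutionPlan
├── read the collect_lineloss.js script
├── wrap it as an IIFE: (function(){ const args = {...}; <script body> })()
├── call sgBrowserExcuteJsCodeByDomain(domain, wrappedJs)
│ injected into the browser page that matches domain, then executed
├── wait for the callback: the script returns its JSON result via callBackJsToCpp
├── parse the artifact JSON → extract status/rows/reasons
└── generate the XLSX (Rust side) → return outcome
```
### Key differences: original scenario vs Skill mode
| Aspect | Original scenario (index.html) | Skill mode |
|--------|----------------------|------------|
| Script injection | `sgBrowserExcuteJsCode(exactURL, js)`: exact URL | `sgBrowserExcuteJsCodeByDomain(domain, js)`: domain match only |
| Execution page | Business subpage `/tqLinelossStatis/tqQualifyRateMonitor` | May hit the parent frame page `/gsllys` |
| `window.mac` | Present (Vue instance; `window.mac = this` is set in `mounted()`) | Absent (no Vue instance) |
| Excel export | JS calls `localhost:13313` (reachable from the local scenario page) | JS cannot call `localhost:13313` (blocked by CORS) |
| Result return | Rust only needs the result from the `.then()` callback | Same, but the script is an async function and needs `.then()` handling |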
---
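## Troubleshooting timeline
### Phase 1: basic pipeline problems
#### Problem 1: `missing_expected_domain`
**Symptom**: `status=blocked reasons=missing_expected_domain`
**Root cause**: when building args, the Rust side (`deterministic_submit.rs`) did not pass the `expected_domain` field. `derive_expected_domain()` extracted only the hostname, without the port, from `page_url`, but the key did not match when the value was put into args.
**Fix**: make sure `deterministic_submit_args()` inserts `expected_domain` into the args map correctly.
**Files involved**: `src/compat/deterministic_submit.rs`
**Recompile required**: yes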
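For illustration, extracting a port-free hostname from a page URL might look like the sketch below. It is not the repository's `derive_expected_domain()`; `hostname_without_port` is a hypothetical, standard-library-only helper that assumes a `scheme://host[:port]/...` URL and does not handle bracketed IPv6 literals.

```rust
/// Hypothetical sketch: returns the hostname of `url` without the port,
/// mirroring what `location.hostname` reports in the browser.
fn hostname_without_port(url: &str) -> Option<&str> {
    // Drop the scheme; URLs without "://" are rejected.
    let rest = url.split_once("://")?.1;
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest.split(|c| c == '/' || c == '?' || c == '#').next()?;
    // Drop optional userinfo ("user:pass@").
    let host_port = authority.rsplit('@').next()?;
    // Drop the optional port.
    let host = host_port.split(':').next()?;
    if host.is_empty() {
        None
    } else {
        Some(host)
    }
}
```

A value produced this way compares directly against the hostname the script sees in the page, which is the comparison the page-context check relies on.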
---
#### Problem 2: `target_url` is missing the port number
**Symptom**: script injection fails, or the script is injected into the wrong page.
**Root cause**: `target_url` was set to `http://20.76.57.61` (no port), but the actual business page is at `http://20.76.57.61:18080/gsllys/...`. `sgBrowserExcuteJsCodeByDomain` has to be able to match the correct tab.
**Fix**: set the full `target_url` in `deterministic_submit.rs`:
```rust
const LINELLOSS_TARGET_URL: &str = "http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor";
```
**Files involved**: `src/compat/deterministic_submit.rs`
**Recompile required**: yes
---
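#### Problem 3: the script returns an empty `{}` object
**Symptom**: the artifact received on the Rust side is `{}`, with no data.
**Root cause**: `buildBrowserEntrypointResult()`, the entry point of `collect_lineloss.js`, is an `async` function and returns a Promise. The Rust-side `build_eval_js` wrapper originally called `_s(v)` directly to send the result, but `v` was a Promise object, which becomes `{}` after `JSON.stringify`.
**Fix**: add Promise detection in `build_eval_js` (`callback_backend.rs`):
```rust
// old code
"_s(v);"
// new code
"if(v&&typeof v.then==='function'){v.then(_s).catch(function(){});}else{_s(v);}"
```
If the return value is a thenable (Promise), wait for it to resolve before sending the callback.
**Files involved**: the `build_eval_js` function in `src/browser/callback_backend.rs`
**Recompile required**: yes
**Lesson**: every browser_script skill whose entry function is async (returns a Promise) needs this `.then()` handling. This is a general fix at the pipeline layer.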
---
### Phase 2: page-context problems
#### Problem 4: `page_context_unavailable` (mac_missing)
**Symptom**:
```
tq-lineloss-report 国网兰州供电公司 2026-03 status=blocked rows=0 reasons=page_context_unavailable
```
**Troubleshooting steps**:
1. Add diagnostic information in `validatePageContext`:
```javascript
// temporary diagnostic code
const diag = 'href=' + href + '|host=' + host + '|port=' + port + '|title=' + title + '|mac=' + hasMac;
return { ok: false, reason: 'page_context_unavailable:mac_missing|' + diag };
```
2. Diagnostic result returned from the page:
```
href=http://20.76.57.61:18080/gsllys
host=20.76.57.61
port=18080
title=台区线损大数据分析模块
mac=false
```
**Root cause**: `sgBrowserExcuteJsCodeByDomain("20.76.57.61")` matched the parent frame page `/gsllys` rather than the business subpage. `window.mac` is the Vue instance of the business subpage, set in `mounted()` via `window.mac = this`; the parent frame page has no such instance.
**Key insight**: in Skill mode there is no Vue instance, so the `window.mac` check does not apply architecturally. The script sends AJAX requests to absolute URLs and does not depend on page-local state.
**Fix**: remove the `globalThis.mac` check and keep only the host match:
```javascript
// before the fix
validatePageContext(args) {
  // ... includes the mac check + diagnostic code
  if (!hasMac) {
    return { ok: false, reason: 'page_context_unavailable:mac_missing|' + diag };
  }
}
// after the fix
validatePageContext(args) {
  const host = normalizeText(globalThis.location?.hostname);
  const expected = normalizeText(args.expected_domain);
  if (!host) {
    return { ok: false, reason: 'page_context_unavailable' };
  }
  if (host !== expected) {
    return { ok: false, reason: 'page_context_mismatch' };
  }
  return { ok: true };
},
```
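**Files involved**: the `validatePageContext` function in `collect_lineloss.js`
**Recompile required**: no (the JS file is read at runtime)
**Troubleshooting tip**: append diagnostic information (href/host/port/title/mac) to reasons. No F12 console is needed, because the information shows up directly in the Rust-side summary output.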
---
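### Phase 3: API request problems
#### Problem 5: `api_query_failed`: HTML returned instead of JSON
**Symptom**:
```
status=error rows=0 reasons=api_query_failed:month_api_failed: SyntaxError: Unexpected token '<', "<!DOCTYPE "... is not valid JSON
```
**Root cause**: the backend detected that the request lacked the `X-Requested-With: XMLHttpRequest` header, decided it was not an AJAX request, and returned the HTML login page. jQuery's `$.ajax` does not add this header automatically on cross-domain requests, which includes calls that set `crossDomain: true`, as this one does.
**Fix**: add the request header to the `$.ajax` calls in `queryMonthData` and `queryWeekData`: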
```javascript
$.ajax({
  url,
  type: 'POST',
  dataType: 'json',
  crossDomain: true,
  headers: { 'X-Requested-With': 'XMLHttpRequest' }, // <-- added
  data: request,
  contentType: 'application/x-www-form-urlencoded;charset=UTF-8',
  success: resolve,
  error: (xhr, _status, err) => reject(new Error(
    `month_api_failed(${xhr.status}): ${String(err)}|body=${String(xhr.responseText || '').substring(0, 200)}`
  ))
});
```
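**Files involved**: `queryMonthData` and `queryWeekData` in `collect_lineloss.js`
**Recompile required**: no
**Troubleshooting tip**: in the error handler, append the first 200 characters of `xhr.responseText` to reasons. If it starts with `<!DOCTYPE`, the backend returned HTML rather than JSON.
**General rule**: intranet Java backends commonly rely on `X-Requested-With: XMLHttpRequest` to tell page requests from AJAX requests. Every `$.ajax` call to an intranet API should include this header.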
---
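### Phase 4: data-normalization problems
#### Problem 6: `row_normalization_failed`: column names do not match
**Symptom**:
```
status=error rows=0 reasons=row_normalization_failed:rawRows=12|keys=YGDL,ORG_NO,YXSL,TG_NUM...
```
**Root cause**: the initially generated `MONTH_COLUMN_DEFS` used guessed column names:
```javascript
// wrong column names
['LINE_LOSS_RATE', '线损完成率(%)'],
['PPQ', '累计供电量'],
['UPQ', '累计售电量'],
```
The column names the API actually returns are as follows (see `cols2` in the original scenario's `index.html`):
```javascript
// correct column names
['ORG_NAME', '供电单位'],
['YGDL', '累计供电量'],
['YYDL', '累计售电量'],
['YXSL', '线损完成率(%)'],
['RAT_SCOPE', '线损率累计目标值'],
['BLANK3', '目标完成率'],
['BLANK2', '排行']
```
**Fix**: correct `MONTH_COLUMN_DEFS` to match the `cols2` definition in the original scenario's `index.html`.
**Troubleshooting tip**: appending `rawRows.length` and `Object.keys(rawRows[0]).join(',')` to `reasons` shows directly which fields the API returned.
**General rule**: when generating a skill script, column definitions must be copied exactly from the original scenario code, not guessed. Look for `cols1`/`cols2` or the table-rendering code.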
---
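#### Problem 7: `row_normalization_failed`: incompatible value types
**Symptom**: after the column names were corrected, `row_normalization_failed:rawRows=12` was still reported; all 12 rows were filtered out.
**Root cause**: the `pickFirstNonEmpty()` function only recognizes string values:
```javascript
function pickFirstNonEmpty(...values) {
  for (const value of values) {
    if (isNonEmptyString(value)) { // isNonEmptyString: typeof value === 'string'
      return value.trim();
    }
  }
  return ''; // the API returns the number 12345.67; typeof === 'number', so it is treated as empty
}
```
The field values returned by the API are numbers (for example `YGDL: 12345.67`), not strings. `pickFirstNonEmpty` returns `''` for numbers, so every field of every row is empty and all rows are filtered out.
**Fix**: `normalizeMonthRow` no longer uses `pickFirstNonEmpty`; it handles values of any type directly: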
```javascript
// before the fix
function normalizeMonthRow(rawRow) {
  const row = {};
  for (const key of MONTH_COLUMNS) {
    row[key] = pickFirstNonEmpty(rawRow?.[key]); // number → ''
  }
  return MONTH_COLUMNS.every((key) => row[key] !== '') ? row : null;
}
// after the fix
function normalizeMonthRow(rawRow) {
  const row = {};
  for (const key of MONTH_COLUMNS) {
    const v = rawRow?.[key];
    row[key] = (v === null || v === undefined || v === '') ? '' : String(v).trim();
  }
  return MONTH_COLUMNS.every((key) => row[key] !== '') ? row : null;
}
```
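**Files involved**: `normalizeMonthRow` in `collect_lineloss.js`
**Recompile required**: no
**General rule**: numeric fields in the JSON returned by intranet APIs are usually of type `number`, not strings. Row-normalization functions must convert with `String(v)` rather than rely on a `typeof === 'string'` check.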
---
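### Phase 5: export problem (architecture-level)
#### Problem 8: export hangs forever
**Symptom**:
```
tq-lineloss-report 国网兰州供电公司 2026-03 status=pl rows=12
```
Data collection succeeded (12 rows), but nothing was ever returned afterwards; the script was stuck in the export step.
**Troubleshooting steps**:
1. `exportWorkbook` calls `fetch('http://localhost:13313/...')`: blocked by CORS
2. Switched to `$.ajax({ crossDomain: true })`: blocked as well
3. Confirmed that this is a restriction of the browser security model, not a configuration problem
**Root cause**: the script runs on the remote page `http://20.76.57.61:18080`, and the browser blocks requests from that page to `localhost:13313` (same-origin policy and CORS). `crossDomain: true` only tells jQuery to use cross-domain mode; it cannot bypass browser security policy.
How the original scenario solves this: a local scenario page (`index.html` on `localhost`) acts as a proxy. Data is first collected on the remote page, then passed back to the local page via `postMessage` or a callback, and the local page calls `localhost:13313`.
Skill mode has no local scenario page, so this proxy mechanism does not exist.
**Solution**: move the export logic from browser JS to the Rust side (option A2: Rust generates the XLSX locally).
**Final architecture**:
```
JS (browser): collect data → return artifact { rows, column_defs, status }
Rust (local): parse artifact → extract rows + column_defs → generate the XLSX file
```
**Specific changes**:
1. **JS side**: remove the export-related code such as `exportWorkbook()`, `writeReportLog()`, `postJson()`, and `buildExportPayload()`. Add a `column_defs` field to the artifact and set the export status to `deferred_to_rust`.
2. **Rust side**: add `lineloss_xlsx_export.rs`, which generates the XLSX with the `zip` crate plus OpenXML XML. In `deterministic_submit.rs`, call the XLSX generation after the artifact is received.
**Files involved**:
- `collect_lineloss.js`: remove the export code, add `column_defs`
- `src/compat/lineloss_xlsx_export.rs`: new
- `src/compat/deterministic_submit.rs`: add export integration
- `src/compat/mod.rs`: register the new module
**Recompile required**: yes
**General rule**: calling `localhost` from a remote page is not feasible in Skill mode. Features that need local services, such as export and log writing, must be implemented on the Rust side.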
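To make the Rust-side export concrete, here is a minimal sketch of how the worksheet part of an XLSX could be rendered from header labels and rows using inline strings. It is not the contents of `lineloss_xlsx_export.rs`: the function names are hypothetical, every cell is written as text, and a complete workbook would still need the other OpenXML parts (content types, workbook, relationships) zipped alongside this sheet.

```rust
/// Escapes XML special characters for use in element text.
fn xml_escape(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&apos;"),
            other => out.push(other),
        }
    }
    out
}

/// Renders one `<row>` whose cells are inline strings.
fn row_xml(cells: &[String]) -> String {
    let mut out = String::from("<row>");
    for cell in cells {
        out.push_str("<c t=\"inlineStr\"><is><t>");
        out.push_str(&xml_escape(cell));
        out.push_str("</t></is></c>");
    }
    out.push_str("</row>");
    out
}

/// Renders the worksheet XML: one header row followed by the data rows.
fn sheet_xml(headers: &[String], rows: &[Vec<String>]) -> String {
    let mut out = String::from(
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\
         <worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\"><sheetData>",
    );
    out.push_str(&row_xml(headers));
    for row in rows {
        out.push_str(&row_xml(row));
    }
    out.push_str("</sheetData></worksheet>");
    out
}
```

Inline strings avoid a shared-strings table, which keeps a hand-written generator small at the cost of a slightly larger file.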
---
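## Summary of troubleshooting methodology
### 1. Diagnostic-information injection pattern
The script runs inside the browser, where the F12 console cannot be viewed. The only information channel is the `reasons` field of the artifact JSON.
```javascript
// inject the detailed error in the catch block
reasons: ['api_query_failed:' + String(error?.message || error || 'unknown')]
// inject a summary of the raw data when normalization fails
reasons: ['row_normalization_failed:rawRows=' + rawRows.length + '|keys=' + Object.keys(rawRows[0]).join(',')]
// inject environment information in the page-context check
reason: 'page_context_unavailable:mac_missing|href=' + href + '|host=' + host + '|port=' + port
```
The Rust-side summary output includes these reasons, so they are visible directly in the logs.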
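Because the reasons follow a loose `code:detail|key=value|...` shape, they can also be split mechanically when reading logs. The parser below is a hypothetical sketch, not part of sgClaw; `Reason` and `parse_reason` are illustrative names, and a free-text segment that happens to contain `=` would be filed as a field.

```rust
/// A reason string split into its code, free-form notes, and key=value fields.
#[derive(Debug, PartialEq)]
struct Reason<'a> {
    code: &'a str,
    notes: Vec<&'a str>,
    fields: Vec<(&'a str, &'a str)>,
}

/// Splits `code:segment|segment|...`; segments with `=` become fields.
fn parse_reason(reason: &str) -> Reason<'_> {
    // Only the first ':' separates the code, so URLs in values survive.
    let (code, rest) = reason.split_once(':').unwrap_or((reason, ""));
    let mut notes = Vec::new();
    let mut fields = Vec::new();
    for segment in rest.split('|').filter(|s| !s.is_empty()) {
        match segment.split_once('=') {
            Some((key, value)) => fields.push((key, value)),
            None => notes.push(segment),
        }
    }
    Reason { code, notes, fields }
}
```

Applied to the page-context example, the code is `page_context_unavailable`, the note is `mac_missing`, and `href`, `host`, and `port` come back as fields.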
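### 2. Layer-by-layer troubleshooting order
```
Layer 1: pipeline layer (Rust)
├── Are args passed in correctly? (expected_domain, target_url, org_code, etc.)
├── Is the script file read correctly?
├── Is the async return value handled correctly? (.then() pattern)
└── Does the callback return successfully?
Layer 2: page context (JS)
├── Which page was the script injected into? (href, title)
├── Does the page have the required globals? (window.mac, etc.)
└── Is the domain match correct?
Layer 3: API requests (JS)
├── Are the request headers complete? (X-Requested-With)
├── Is the response format correct? (JSON vs HTML)
└── What is the response status code?
Layer 4: data processing (JS)
├── Do the field names returned by the API match the column definitions?
├── Are the field value types compatible? (number vs string)
└── Are there valid rows after normalization?
Layer 5: export (architecture)
├── Are cross-origin requests involved?
├── Is localhost reachable?
└── Does it need to be handled on the Rust side?
```
### 3. Post-change verification checklist
- [ ] JS syntax check: `node --check collect_lineloss.js`
- [ ] If Rust code changed: `cargo build` compiles
- [ ] `cargo test` passes in full (excluding known pre-existing failures)
- [ ] Replace the JS file in the deployment directory
- [ ] If Rust changed: redeploy the rebuilt sgclaw binary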
---
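## Final file list
### JS file: `collect_lineloss.js`
**Location**: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js`
**Purpose**: pure data collection. Injected into the browser, it queries the line-loss platform API and returns a structured artifact.
**What it does not do**: it does not call localhost:13313, does not export Excel, and does not write the report log.
### Rust files: change list
| File | Change | Change type |
|------|---------|---------|
| `src/browser/callback_backend.rs` | `build_eval_js` adds `.then()` handling for async return values | General pipeline-layer fix |
| `src/compat/deterministic_submit.rs` | Full `target_url`; calls XLSX export after parsing the artifact | Business integration |
| `src/compat/lineloss_xlsx_export.rs` | XLSX generation (zip + OpenXML) | New |
| `src/compat/mod.rs` | Registers the `lineloss_xlsx_export` module | New |
---
## Quick reuse template
When creating a similar skill, check the following points directly:
1. **Does `build_eval_js` support async?** If the entry function is `async`, confirm that `callback_backend.rs` has the `.then()` handling.
2. **`validatePageContext` does not check page-local state**: check only the host, not scenario-page-specific variables such as `window.mac` or `window.app`.
3. **API requests must carry `X-Requested-With: XMLHttpRequest`**: standard for intranet Java backends.
4. **Copy column definitions exactly from the original scenario code**: look for `cols1`/`cols2` or the table `columns` configuration.
5. **`normalizeRow` uses `String(v)` rather than `pickFirstNonEmpty`**: the API returns numbers, not strings.
6. **Export goes through the Rust side, not the browser**: JS returns rows + column_defs; Rust generates the XLSX.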


@@ -0,0 +1,551 @@
# ZeroClaw Prompt Safety Hardening Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Harden ZeroClaw prompt handling and tool execution so non-skill freeform operations degrade to read-only or business-approved execution, while trusted skill-defined operations retain bounded execution privileges.
**Architecture:** Build a security gate around the existing prompt and tool-entry paths instead of rewriting the full prompt architecture. The gate classifies prompt-injection risk, records operation provenance (`trusted_skill` vs `non_skill`), sanitizes injected workspace/skill content, and enforces execution mode transitions (`clean`, `suspect_readonly`, `suspect_waiting_approval`, `suspect_business_approved`). Trusted skills gain structured business-operation metadata; non-skill operations require business-level approval before any privileged capability is released.
**Tech Stack:** Rust, vendored ZeroClaw (`third_party/zeroclaw`), existing approval/autonomy system, current prompt guard and prompt builder tests, `cargo test`.
### Task 1: Create an Isolated Worktree and Verify a Clean Baseline
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.gitignore`
- Create: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/**`
**Step 1: Verify the worktree directory is safe to use**
Run:
```bash
cd /home/zyl/projects/sgClaw/claw
ls -d .worktrees
git check-ignore -v .worktrees
```
Expected: `.worktrees` exists and is ignored by git.
**Step 2: Create the implementation worktree**
Run:
```bash
cd /home/zyl/projects/sgClaw/claw
git worktree add .worktrees/zeroclaw-prompt-safety-hardening -b zeroclaw-prompt-safety-hardening
```
Expected: a new branch and worktree are created.
**Step 3: Build the baseline in the worktree**
Run:
```bash
cd /home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening
cargo test -p zeroclawlabs prompt_guard -- --nocapture
cargo test -p zeroclawlabs build_system_prompt -- --nocapture
```
Expected: existing relevant tests pass before any code changes.
**Step 4: Commit the clean worktree setup if `.gitignore` changed**
Run:
```bash
git add .gitignore
git commit -m "chore: prepare worktree for prompt safety hardening"
```
Expected: commit only if `.gitignore` required an adjustment.
### Task 2: Add the Core Security-Mode Data Model
**Files:**
- Create: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/operation_policy.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/mod.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/operation_policy.rs`
**Step 1: Write the failing policy tests**
Add tests that prove:
- suspicious non-skill input maps to `suspect_readonly`
- trusted skill operations can request bounded privileged execution
- any out-of-scope capability request downgrades the operation
Use concrete enums and assertions, for example:
```rust
assert_eq!(
ExecutionMode::from_guard_and_provenance(GuardRisk::Suspicious, OperationProvenance::NonSkill),
ExecutionMode::SuspectReadOnly
);
```
**Step 2: Run the tests to verify RED**
Run:
```bash
cd /home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening
cargo test -p zeroclawlabs operation_policy -- --nocapture
```
Expected: fail because the new types do not exist yet.
**Step 3: Implement the minimal policy model**
Define:
- `GuardRisk` (`Clean`, `Suspicious`, `Dangerous`)
- `OperationProvenance` (`TrustedSkill`, `NonSkill`, `Mixed`)
- `ExecutionMode` (`Clean`, `SuspectReadOnly`, `SuspectWaitingApproval`, `SuspectBusinessApproved`)
- `CapabilityClass` for privileged business actions
Add small helper functions that do only state mapping. Do not pull prompt-building logic into this module.
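One possible shape for this model is sketched below. Only the `Suspicious` + `NonSkill` → `SuspectReadOnly` mapping comes from the test shown in Step 1; every other arm of the match, and the `allows_privileged` helper, is an assumption made for illustration and would need to be settled by the real policy tests. `CapabilityClass` is omitted.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GuardRisk {
    Clean,
    Suspicious,
    Dangerous,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OperationProvenance {
    TrustedSkill,
    NonSkill,
    Mixed,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionMode {
    Clean,
    SuspectReadOnly,
    SuspectWaitingApproval,
    SuspectBusinessApproved,
}

impl ExecutionMode {
    /// Pure state mapping; no prompt-building logic lives here.
    pub fn from_guard_and_provenance(risk: GuardRisk, provenance: OperationProvenance) -> Self {
        match (risk, provenance) {
            // Assumption: clean input keeps full execution for any provenance.
            (GuardRisk::Clean, _) => ExecutionMode::Clean,
            // Assumption: a suspicious but trusted skill waits for approval.
            (GuardRisk::Suspicious, OperationProvenance::TrustedSkill) => {
                ExecutionMode::SuspectWaitingApproval
            }
            // From the plan: suspicious non-skill input is read-only.
            // Assumption: everything else is treated the same way.
            _ => ExecutionMode::SuspectReadOnly,
        }
    }

    /// Hypothetical helper: whether privileged capabilities may be released.
    pub fn allows_privileged(self) -> bool {
        matches!(
            self,
            ExecutionMode::Clean | ExecutionMode::SuspectBusinessApproved
        )
    }
}
```

Keeping the mapping in a single `match` makes the downgrade paths easy to audit and to cover exhaustively in tests.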
**Step 4: Re-run the policy tests to verify GREEN**
Run:
```bash
cargo test -p zeroclawlabs operation_policy -- --nocapture
```
Expected: the new policy tests pass.
**Step 5: Commit**
Run:
```bash
git add third_party/zeroclaw/src/security/mod.rs third_party/zeroclaw/src/security/operation_policy.rs
git commit -m "feat: add prompt security execution mode model"
```
### Task 3: Add Structured Skill Trust Metadata
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/skills/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/tools/read_skill.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/skills/mod.rs`
**Step 1: Write failing skill metadata tests**
Add tests that prove:
- `SKILL.toml` can declare a business operation type, capability list, argument constraints, and `step_budget`
- markdown-only skills default to unprivileged metadata
- malformed privileged metadata is rejected or downgraded safely
Use a manifest shape like:
```toml
[skill]
name = "export-report"
description = "Export the monthly report"
[security]
operation_type = "browser_export_data"
allowed_capabilities = ["browser_read", "browser_export"]
step_budget = 6
approval_mode = "trusted_skill"
```
**Step 2: Run the tests to verify RED**
Run:
```bash
cargo test -p zeroclawlabs skill -- --nocapture
```
Expected: fail because the structured metadata fields are missing.
**Step 3: Implement minimal structured metadata**
Extend `Skill` with a structured security block, for example:
- `operation_type`
- `business_description`
- `allowed_capabilities`
- `arg_constraints`
- `step_budget`
- `approval_mode`
Default markdown-only skills to unprivileged metadata so existing skills remain compatible.
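A minimal sketch of such a security block follows, assuming plain owned types for each field. The field names come from the list above; the concrete types, the `ApprovalMode` enum, and the `or_downgrade` rule for malformed privileged metadata are hypothetical choices, and manifest parsing is left out.

```rust
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub enum ApprovalMode {
    /// Markdown-only skills land here by default.
    #[default]
    Unprivileged,
    TrustedSkill,
}

#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct SkillSecurity {
    pub operation_type: Option<String>,
    pub business_description: Option<String>,
    pub allowed_capabilities: Vec<String>,
    pub arg_constraints: Vec<String>,
    pub step_budget: u32,
    pub approval_mode: ApprovalMode,
}

impl SkillSecurity {
    /// Assumed downgrade rule: a block that claims `TrustedSkill` but lacks
    /// an operation type, capabilities, or a positive step budget falls
    /// back to the unprivileged default instead of being trusted.
    pub fn or_downgrade(self) -> Self {
        let privileged = self.approval_mode == ApprovalMode::TrustedSkill;
        let well_formed = self.operation_type.is_some()
            && !self.allowed_capabilities.is_empty()
            && self.step_budget > 0;
        if privileged && !well_formed {
            SkillSecurity::default()
        } else {
            self
        }
    }
}
```

Deriving `Default` means a skill without a `[security]` table gets the unprivileged block without any extra code path.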
**Step 4: Make `read_skill` expose the metadata**
Return or prepend enough structured metadata so the runtime can distinguish trusted skill operations from plain prompt text.
**Step 5: Re-run the tests to verify GREEN**
Run:
```bash
cargo test -p zeroclawlabs skill -- --nocapture
```
Expected: skill parsing and `read_skill` tests pass.
**Step 6: Commit**
Run:
```bash
git add third_party/zeroclaw/src/skills/mod.rs third_party/zeroclaw/src/tools/read_skill.rs
git commit -m "feat: add trusted skill security metadata"
```
### Task 4: Sanitize Injected Workspace and Skill Content Before Prompt Assembly
**Files:**
- Create: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/prompt_sanitizer.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/channels/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/prompt.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/channels/mod.rs`
**Step 1: Write failing sanitizer tests**
Add tests that prove:
- dangerous bootstrap phrases are removed, escaped, or summarized before prompt injection
- control characters are stripped
- overlong files are truncated with an audit-friendly marker
- safe business content remains readable
**Step 2: Run the tests to verify RED**
Run:
```bash
cargo test -p zeroclawlabs build_system_prompt -- --nocapture
```
Expected: fail because injected files are still copied verbatim.
**Step 3: Implement the sanitizer**
Create a small sanitizer that:
- strips control characters
- caps content length
- flags prompt-override phrases
- emits sanitized content plus metadata such as `truncated` and matched rules
Use this sanitizer in:
- `load_openclaw_bootstrap_files`
- any shared path in `agent/prompt.rs` that renders workspace or skill text into the system prompt
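The sanitizer described in Step 3 might look roughly like the following std-only sketch. The names `Sanitized` and `sanitize_injected`, the phrase list, and the marker text are assumptions for illustration; the real rule set would likely be broader and may remove or escape phrases rather than only flag them.

```rust
// Hypothetical sketch: names, phrase list, and marker text are assumptions.

/// Sanitized content plus metadata for audit logging.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Sanitized {
    pub content: String,
    pub truncated: bool,
    pub matched_rules: Vec<&'static str>,
}

/// Illustrative prompt-override phrases, matched case-insensitively.
const OVERRIDE_PHRASES: &[&str] = &[
    "ignore previous instructions",
    "disregard the system prompt",
];

/// Audit-friendly marker appended when content is cut.
const TRUNCATION_MARKER: &str = "\n[truncated by prompt sanitizer]";

pub fn sanitize_injected(raw: &str, max_chars: usize) -> Sanitized {
    // Strip control characters, keeping newlines and tabs so prose stays readable.
    let cleaned: String = raw
        .chars()
        .filter(|c| !c.is_control() || *c == '\n' || *c == '\t')
        .collect();

    // Flag (but do not remove) prompt-override phrases.
    let lowered = cleaned.to_lowercase();
    let matched_rules: Vec<&'static str> = OVERRIDE_PHRASES
        .iter()
        .copied()
        .filter(|phrase| lowered.contains(*phrase))
        .collect();

    // Cap by character count so a multi-byte character is never split.
    let truncated = cleaned.chars().count() > max_chars;
    let mut content: String = cleaned.chars().take(max_chars).collect();
    if truncated {
        content.push_str(TRUNCATION_MARKER);
    }

    Sanitized { content, truncated, matched_rules }
}
```

Returning the matched rules alongside the content lets the caller decide whether to drop the file, degrade the turn, or only log the match.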
**Step 4: Re-run the tests to verify GREEN**
Run:
```bash
cargo test -p zeroclawlabs build_system_prompt -- --nocapture
```
Expected: prompt-building tests pass with the new sanitized behavior.
**Step 5: Commit**
Run:
```bash
git add third_party/zeroclaw/src/security/mod.rs third_party/zeroclaw/src/security/prompt_sanitizer.rs third_party/zeroclaw/src/channels/mod.rs third_party/zeroclaw/src/agent/prompt.rs
git commit -m "feat: sanitize injected workspace prompt content"
```
### Task 5: Wire `PromptGuard` into Main Agent and Gateway Entry Points
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/prompt_guard.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/agent.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/gateway/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/gateway/ws.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/agent.rs`
**Step 1: Write failing entry-point tests**
Add tests that prove:
- suspicious input marks the turn as degraded instead of silently continuing
- dangerous input is blocked
- clean input remains unchanged
Prefer tests that assert on a security decision object instead of brittle prompt strings.
**Step 2: Run the tests to verify RED**
Run:
```bash
cargo test -p zeroclawlabs prompt_guard -- --nocapture
cargo test -p zeroclawlabs agent -- --nocapture
```
Expected: fail because no entry path consumes the guard result.
**Step 3: Implement guarded entry evaluation**
Before each turn:
- scan the inbound user content
- map the guard result into `GuardRisk`
- create an execution context carrying risk and provenance
- attach audit details for later logging
Keep the existing `PromptGuard` regexes unless a test demands a specific adjustment.
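As a rough illustration of the per-turn evaluation above, the sketch below maps a guard outcome to a risk level and builds an execution context. Only the `GuardRisk` name comes from the plan; `GuardOutcome` is a stand-in for whatever `PromptGuard` actually returns, and `Provenance`, `ExecutionContext`, and `evaluate_turn` are hypothetical names.

```rust
// Hypothetical sketch: `GuardOutcome` stands in for the real PromptGuard result;
// `Provenance`, `ExecutionContext`, and `evaluate_turn` are illustrative names.

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum GuardOutcome {
    Safe,
    Suspicious { matched_rules: Vec<String> },
    Blocked { reason: String },
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GuardRisk {
    Clean,
    Suspect,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provenance {
    UserInput,
    GatewayHttp,
    GatewayWebSocket,
}

/// Security decision object carried through the turn.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ExecutionContext {
    pub risk: GuardRisk,
    pub provenance: Provenance,
    /// Matched rule names kept for later audit logging.
    pub audit_rules: Vec<String>,
}

impl ExecutionContext {
    /// A suspect turn continues, but in a degraded mode.
    pub fn is_degraded(&self) -> bool {
        self.risk == GuardRisk::Suspect
    }
}

/// Returns a context for clean or suspect input, and an error for blocked input.
pub fn evaluate_turn(
    outcome: GuardOutcome,
    provenance: Provenance,
) -> Result<ExecutionContext, String> {
    match outcome {
        GuardOutcome::Safe => Ok(ExecutionContext {
            risk: GuardRisk::Clean,
            provenance,
            audit_rules: Vec::new(),
        }),
        GuardOutcome::Suspicious { matched_rules } => Ok(ExecutionContext {
            risk: GuardRisk::Suspect,
            provenance,
            audit_rules: matched_rules,
        }),
        GuardOutcome::Blocked { reason } => Err(reason),
    }
}
```

Tests can then assert on `ExecutionContext` directly, which matches the preference above for asserting on a security decision object rather than on prompt strings.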
**Step 4: Re-run the tests to verify GREEN**
Run:
```bash
cargo test -p zeroclawlabs prompt_guard -- --nocapture
cargo test -p zeroclawlabs agent -- --nocapture
```
Expected: suspicious and blocked paths now behave deterministically.
**Step 5: Commit**
Run:
```bash
git add third_party/zeroclaw/src/security/prompt_guard.rs third_party/zeroclaw/src/agent/agent.rs third_party/zeroclaw/src/gateway/mod.rs third_party/zeroclaw/src/gateway/ws.rs
git commit -m "feat: enforce prompt guard at runtime entry points"
```
### Task 6: Add Business-Level Privileged Operation Registry and Approval Tokens
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/approval/mod.rs`
- Create: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/business_approval.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/mod.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/business_approval.rs`
**Step 1: Write failing business approval tests**
Add tests that prove:
- only operations in the privileged registry can request approval
- approval tokens bind to `session_id`, `operation_type`, `allowed_capabilities`, `step_budget`, and expiration
- a mismatched or expired approval token is rejected
**Step 2: Run the tests to verify RED**
Run:
```bash
cargo test -p zeroclawlabs business_approval -- --nocapture
```
Expected: fail because the business approval registry does not exist yet.
**Step 3: Implement the registry and token model**
Create:
- a privileged business operation registry
- a single-operation approval token
- helper checks for `can_request_approval` and `matches_execution_request`
Model approval at the business-operation level, not raw tool calls.
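One possible shape for the registry and token, sketched with the standard library only. The helper names `can_request_approval` and `matches_execution_request` and the bound fields come from the plan; the struct names, field types, and the choice to pass `now` explicitly (so tests stay deterministic) are assumptions.

```rust
// Hypothetical sketch: struct names and field types are assumptions.
use std::collections::BTreeSet;
use std::time::SystemTime;

/// Business operations that are allowed to ask for approval at all.
pub struct PrivilegedRegistry {
    operations: BTreeSet<String>,
}

impl PrivilegedRegistry {
    pub fn new(operations: &[&str]) -> Self {
        Self {
            operations: operations.iter().map(|op| op.to_string()).collect(),
        }
    }

    pub fn can_request_approval(&self, operation_type: &str) -> bool {
        self.operations.contains(operation_type)
    }
}

/// Approval for a single business operation within one session.
pub struct ApprovalToken {
    pub session_id: String,
    pub operation_type: String,
    pub allowed_capabilities: Vec<String>,
    pub step_budget: u32,
    pub expires_at: SystemTime,
}

/// What the runtime is about to execute.
pub struct ExecutionRequest<'a> {
    pub session_id: &'a str,
    pub operation_type: &'a str,
    pub capability: &'a str,
    pub steps_used: u32,
}

impl ApprovalToken {
    /// True only if every bound field matches and the token has not expired.
    pub fn matches_execution_request(&self, req: &ExecutionRequest, now: SystemTime) -> bool {
        now < self.expires_at
            && self.session_id == req.session_id
            && self.operation_type == req.operation_type
            && self.allowed_capabilities.iter().any(|c| c == req.capability)
            && req.steps_used < self.step_budget
    }
}
```

Because the token binds a business operation rather than a tool name, a token approved for one operation cannot be replayed against another even when both use the same underlying tool.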
**Step 4: Extend the existing approval module**
Teach the approval module to carry business-level fields through the current request/response flow without breaking old call sites.
**Step 5: Re-run the tests to verify GREEN**
Run:
```bash
cargo test -p zeroclawlabs business_approval -- --nocapture
```
Expected: the token validation and registry tests pass.
**Step 6: Commit**
Run:
```bash
git add third_party/zeroclaw/src/approval/mod.rs third_party/zeroclaw/src/security/mod.rs third_party/zeroclaw/src/security/business_approval.rs
git commit -m "feat: add business-level approval registry"
```
### Task 7: Enforce Execution Modes in Tool Dispatch
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/dispatcher.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/agent.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/loop_.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/dispatcher.rs`
**Step 1: Write failing dispatcher tests**
Add tests that prove:
- `suspect_readonly` allows only safe read capabilities
- `trusted_skill` can execute capabilities listed in its metadata within `step_budget`
- `mixed` or non-skill privileged calls require a matching business approval token
**Step 2: Run the tests to verify RED**
Run:
```bash
cargo test -p zeroclawlabs dispatcher -- --nocapture
```
Expected: fail because the dispatcher does not yet know about execution modes.
**Step 3: Implement capability enforcement**
Before dispatching any tool:
- resolve the operation context
- map the tool call to a capability class
- reject calls outside the current execution mode
- decrement or validate `step_budget` for approved bounded flows
Do not rely on prompt text for enforcement.
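The enforcement steps above could be sketched as follows. The mode names mirror `suspect_readonly`, `trusted_skill`, and `mixed` from the tests; the `Capability` variants, the tool names in `capability_for_tool`, and the rule that safe reads pass in `mixed` mode without a token are assumptions made for the example.

```rust
// Hypothetical sketch: capability classes, tool names, and error variants are assumptions.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Capability {
    BrowserRead,
    BrowserExport,
    FileWrite,
}

pub enum ExecutionMode {
    SuspectReadonly,
    TrustedSkill { allowed: Vec<Capability>, steps_left: u32 },
    /// `token_capabilities` is `Some` only after a business approval token validated.
    Mixed { token_capabilities: Option<Vec<Capability>> },
}

#[derive(Debug, PartialEq, Eq)]
pub enum DispatchError {
    UnknownTool,
    CapabilityDenied,
    BudgetExhausted,
    ApprovalRequired,
}

/// Map a tool call to a capability class (tool names are illustrative).
pub fn capability_for_tool(tool: &str) -> Option<Capability> {
    match tool {
        "browser_get_text" => Some(Capability::BrowserRead),
        "browser_export" => Some(Capability::BrowserExport),
        "file_write" => Some(Capability::FileWrite),
        _ => None,
    }
}

fn is_safe_read(cap: Capability) -> bool {
    matches!(cap, Capability::BrowserRead)
}

/// Decide whether a tool call may be dispatched in the current mode.
pub fn authorize(mode: &mut ExecutionMode, tool: &str) -> Result<Capability, DispatchError> {
    let cap = capability_for_tool(tool).ok_or(DispatchError::UnknownTool)?;
    match mode {
        ExecutionMode::SuspectReadonly => {
            if is_safe_read(cap) { Ok(cap) } else { Err(DispatchError::CapabilityDenied) }
        }
        ExecutionMode::TrustedSkill { allowed, steps_left } => {
            if !allowed.contains(&cap) {
                return Err(DispatchError::CapabilityDenied);
            }
            if *steps_left == 0 {
                return Err(DispatchError::BudgetExhausted);
            }
            // Each authorized call consumes one step of the budget.
            *steps_left -= 1;
            Ok(cap)
        }
        ExecutionMode::Mixed { token_capabilities } => {
            if is_safe_read(cap) {
                return Ok(cap);
            }
            let approved = token_capabilities
                .as_ref()
                .map_or(false, |caps| caps.contains(&cap));
            if approved { Ok(cap) } else { Err(DispatchError::ApprovalRequired) }
        }
    }
}
```

The decision depends only on typed state (mode, capability, budget), never on prompt text, so a prompt-injection payload cannot widen what the dispatcher allows.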
**Step 4: Re-run the tests to verify GREEN**
Run:
```bash
cargo test -p zeroclawlabs dispatcher -- --nocapture
```
Expected: dispatch now respects read-only, trusted skill, and business-approved modes.
**Step 5: Commit**
Run:
```bash
git add third_party/zeroclaw/src/agent/dispatcher.rs third_party/zeroclaw/src/agent/agent.rs third_party/zeroclaw/src/agent/loop_.rs
git commit -m "feat: enforce execution mode in tool dispatch"
```
### Task 8: Default Skills Prompt Injection to Compact for Safer Runtime Behavior
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/config/schema.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/prompt.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/channels/mod.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/config/schema.rs`
**Step 1: Write the failing configuration test**
Add a test that asserts the default skill prompt injection mode is `Compact` unless explicitly configured otherwise.
**Step 2: Run the test to verify RED**
Run:
```bash
cargo test -p zeroclawlabs skills_prompt_injection_mode -- --nocapture
```
Expected: fail because defaults still point to `Full`.
**Step 3: Implement the default flip**
Update config defaults and any prompt-builder defaults that currently assume `Full`. Keep explicit user config backward compatible.
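A small sketch of the default flip, assuming the mode is an enum with `Full` and `Compact` variants. The type name `SkillsPromptInjectionMode`, the config strings `"full"` and `"compact"`, and the `resolve_injection_mode` helper are guesses for illustration; the real schema in `config/schema.rs` may differ.

```rust
// Hypothetical sketch: type name, config strings, and helper are assumptions.

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum SkillsPromptInjectionMode {
    Full,
    /// New default: only a compact skill summary is injected into the prompt.
    #[default]
    Compact,
}

/// Resolve the effective mode; an explicit user setting always wins.
pub fn resolve_injection_mode(configured: Option<&str>) -> SkillsPromptInjectionMode {
    match configured.map(|s| s.trim().to_ascii_lowercase()).as_deref() {
        Some("full") => SkillsPromptInjectionMode::Full,
        Some("compact") => SkillsPromptInjectionMode::Compact,
        // Missing or unrecognized values fall back to the default.
        _ => SkillsPromptInjectionMode::default(),
    }
}
```

Keeping the fallback in one place means prompt builders can call the resolver instead of hard-coding `Full`, and users who explicitly configured `Full` keep their current behavior.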
**Step 4: Re-run the test to verify GREEN**
Run:
```bash
cargo test -p zeroclawlabs skills_prompt_injection_mode -- --nocapture
```
Expected: default configuration now resolves to `Compact`.
**Step 5: Commit**
Run:
```bash
git add third_party/zeroclaw/src/config/schema.rs third_party/zeroclaw/src/agent/prompt.rs third_party/zeroclaw/src/channels/mod.rs
git commit -m "feat: default skills prompt injection to compact"
```
### Task 9: Add Audit Logging and Regression Coverage
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/observability/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/agent.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/channels/mod.rs`
- Create: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/tests/prompt_safety_regression.rs`
**Step 1: Write the failing regression tests**
Cover:
- prompt override attack from user content
- malicious `AGENTS.md` bootstrap content
- trusted skill execution within bounds
- non-skill privileged request requiring business approval
- approval token mismatch
- session history restore preserving degraded mode
**Step 2: Run the tests to verify RED**
Run:
```bash
cargo test -p zeroclawlabs --test prompt_safety_regression -- --nocapture
```
Expected: fail because the end-to-end behavior is not wired together yet.
**Step 3: Implement audit logging**
Record:
- input hash
- matched guard rules
- risk level
- provenance
- execution mode transitions
- approval decisions
Avoid logging raw sensitive content.
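The audit record could be modeled along these lines. The struct name, field types, and log format are assumptions, and `DefaultHasher` is used only to keep the sketch dependency-free: it is not cryptographic and its output is not guaranteed stable across Rust releases, so a real audit trail would more plausibly use a cryptographic digest from a vetted crate.

```rust
// Hypothetical sketch: `AuditRecord` and its log format are assumptions.
// DefaultHasher is a placeholder, NOT a cryptographic or release-stable hash.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

pub struct AuditRecord {
    pub input_hash: u64,
    pub matched_rules: Vec<String>,
    pub risk_level: String,
    pub provenance: String,
    /// (from, to) when the execution mode changed during the turn.
    pub mode_transition: Option<(String, String)>,
    pub approval_decision: Option<bool>,
}

/// Hash the raw input so the log can correlate turns without storing the content.
pub fn hash_input(raw: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    raw.hash(&mut hasher);
    hasher.finish()
}

impl AuditRecord {
    /// Render a single log line that never contains the raw input.
    pub fn to_log_line(&self) -> String {
        let transition = match &self.mode_transition {
            Some((from, to)) => format!("{from}->{to}"),
            None => "none".to_string(),
        };
        let approval = match self.approval_decision {
            Some(true) => "approved",
            Some(false) => "denied",
            None => "n/a",
        };
        format!(
            "input_hash={:016x} risk={} provenance={} rules=[{}] transition={} approval={}",
            self.input_hash,
            self.risk_level,
            self.provenance,
            self.matched_rules.join(","),
            transition,
            approval
        )
    }
}
```

Since the record holds only a hash, rule names, and enumerated decisions, it can be written to ordinary logs without exposing user content.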
**Step 4: Re-run the regression tests to verify GREEN**
Run:
```bash
cargo test -p zeroclawlabs --test prompt_safety_regression -- --nocapture
```
Expected: the regression suite passes.
**Step 5: Commit**
Run:
```bash
git add third_party/zeroclaw/src/observability/mod.rs third_party/zeroclaw/src/agent/agent.rs third_party/zeroclaw/src/channels/mod.rs third_party/zeroclaw/tests/prompt_safety_regression.rs
git commit -m "test: add prompt safety regression coverage"
```
### Task 10: Final Verification and Integration Review
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/docs/L5-提示词分布与安全改造方案.md`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/docs/README.md`
**Step 1: Run targeted verification**
Run:
```bash
cd /home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening
cargo test -p zeroclawlabs prompt_guard -- --nocapture
cargo test -p zeroclawlabs build_system_prompt -- --nocapture
cargo test -p zeroclawlabs dispatcher -- --nocapture
cargo test -p zeroclawlabs --test prompt_safety_regression -- --nocapture
```
Expected: all prompt safety and dispatcher tests pass.
**Step 2: Run a broad ZeroClaw package test pass if time permits**
Run:
```bash
cargo test -p zeroclawlabs -- --nocapture
```
Expected: no regressions in the vendored package test suite, or a documented list of unrelated existing failures.
**Step 3: Update the security design docs**
Document:
- execution modes
- trusted skill metadata contract
- business approval flow
- why non-skill privileged actions are gated
**Step 4: Commit the docs**
Run:
```bash
git add docs/L5-提示词分布与安全改造方案.md docs/README.md
git commit -m "docs: record prompt safety hardening design"
```
**Step 5: Prepare merge review notes**
Write a short integration summary covering:
- changed entry points
- backward-compatibility expectations
- any skills that need metadata upgrades
- rollout recommendation for existing integrators


@@ -0,0 +1,179 @@
# sgClaw Chat-First UI Refactor Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Rebuild the sgClaw floating chat UI into a chat-first plugin-style product where the message timeline is primary, `执行摘要` (execution summary) is folded into the conversation, and `调试` (debug) opens as a full-window overlay instead of occupying persistent space.
**Architecture:** Keep `chrome://superrpa-functions/sgclaw-chat` as the first verified host because it already has Lit-based unit tests, then mirror the same information architecture and visual hierarchy into the ordinary-page injected `sgclaw_overlay.js`. Do not introduce a new backend contract; only rearrange presentation, panel semantics, and message/result composition around the existing runtime state.
**Tech Stack:** Chromium WebUI, Lit templates/components, injected Shadow DOM overlay JavaScript, existing SuperRPA bridge/runtime callbacks, mainline TS unit tests.
### Task 1: Lock The New Information Architecture In Tests
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state_mainline_unittest.ts`
**Step 1: Write the failing test**
Add assertions for these exact product rules:
- `getHtml()` must no longer emit the legacy `debug-note`.
- the main chat template must define a dedicated overlay/sheet container for `history`, `settings`, and `debug`.
- the debug panel must be described as a full-window overlay rather than a side drawer/log block.
- the result presentation must be part of the message stream, not a standalone persistent secondary panel.
**Step 2: Run test to verify it fails**
Run:
```bash
node --test /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts
```
Expected: FAIL because the current template still includes `debug-note`, a side-by-side panel layout, and standalone result panel semantics.
**Step 3: Write minimal implementation**
Change only template/component strings and assertions needed to express the new structure, without touching styling yet.
**Step 4: Run test to verify it passes**
Run the same command.
Expected: PASS.
### Task 2: Refactor `chrome://` sgClaw Into Chat-First Structure
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-header.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-composer.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-debug-drawer.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-history-panel.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-settings-panel.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-list.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-result.ts`
**Step 1: Keep the header narrow**
Make the header carry only:
- brand
- current page label
- compact runtime status
- actions for `新对话 / 历史 / 设置 / 调试 / 收起` (new conversation / history / settings / debug / collapse)
Remove the large subtitle and debug framing, and drop the separate, heavy runtime action row.
**Step 2: Make the message timeline primary**
Turn the main shell body into:
- a single timeline container
- optional empty-state presets
- no persistent secondary summary card
`finalResult` should render as a folded result card appended in the stream.
**Step 3: Convert secondary panels into full overlays**
Render `history`, `settings`, and `debug` inside a full-window overlay/sheet that covers the chat content area instead of sitting above or beside it.
**Step 4: Re-skin toward the approved direction**
Use:
- soft neutral surfaces
- restrained accent usage
- thinner borders
- calmer shadows
- clearer assistant/user card contrast
Avoid:
- debug-workbench feeling
- large gradient blocks
- heavy explanatory copy in the main flow
**Step 5: Run the unit tests**
Run:
```bash
node --test /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts
```
Expected: PASS.
### Task 3: Mirror The Same Structure Into Ordinary-Page Overlay
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js`
**Step 1: Remove the standalone result panel**
Delete the always-visible `执行摘要` block from the main window body.
**Step 2: Introduce overlay panels**
Change panel rendering so `history`, `settings`, and `debug` appear in a dedicated full-window overlay layer within the floating window instead of as sibling blocks consuming vertical space.
**Step 3: Rebuild the shell**
Match the `chrome://` layout:
- compact header
- primary message timeline
- folded result card inside conversation
- sticky composer
**Step 4: Preserve behavior**
Do not break:
- `sgclaw.newConversation`
- `sgclaw.restoreConversation`
- runtime polling
- config save/load
- unread badge behavior
**Step 5: Run a syntax sanity check**
Run:
```bash
node --check /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js
```
Expected: PASS.
### Task 4: Verify Browser Resource Integration
**Files:**
- No new source files; verification only
**Step 1: Build the TS / mainline test target**
Run:
```bash
bash -lc "autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease functions_ui_mainline_unittests"
```
Expected: build succeeds.
**Step 2: Run targeted mainline unit tests**
Run:
```bash
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests --gtest_filter='FunctionsUiMainlineTest.*sgclaw*'
```
If the filter matches no test names, run the full binary and confirm it exits with status `0`.
**Step 3: Rebuild browser resources if needed**
Run:
```bash
bash -lc "autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome"
```
**Step 4: Manually verify product behavior**
Check:
- ordinary webpage floating window
- `chrome://superrpa-functions/sgclaw-chat`
- `调试` opens as full overlay
- `执行摘要` no longer blocks the main conversation
- `历史` and `设置` do not consume persistent layout space


@@ -0,0 +1,148 @@
# SGClaw Configurable Skills Directory Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Let `sgclaw` own skill-directory resolution and allow users to set a custom skills directory in `sgclaw_config.json` without relying on SuperRPA to copy skills into the runtime workspace.
**Architecture:** Extend the existing browser JSON config parser so `sgclaw` can read an optional `skillsDir` field alongside DeepSeek settings. Keep the current embedded ZeroClaw workspace for memory/config internals, but decouple skill loading from that fixed path by resolving a configurable skills root at runtime. Preserve backward compatibility by defaulting to `<workspace_root>/.sgclaw-zeroclaw-workspace/skills` when `skillsDir` is absent or empty.
**Tech Stack:** Rust, serde JSON parsing, existing ZeroClaw compatibility runtime, cargo test
### Task 1: Capture browser config requirements
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/src/config/settings.rs`
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_config_test.rs`
**Step 1: Write the failing test**
Add tests that load `sgclaw_config.json` containing:
- no `skillsDir`
- a relative `skillsDir`
- an absolute `skillsDir`
Assert that:
- `skillsDir` missing falls back to default workspace skills path
- relative values resolve against the browser config directory
- absolute values are preserved
**Step 2: Run test to verify it fails**
Run: `cargo test compat_config -- --nocapture`
Expected: FAIL because `DeepSeekSettings` / config adapter do not expose any skills directory override yet.
**Step 3: Write minimal implementation**
Add a browser-config structure that parses `skillsDir` and expose a resolver function that returns the effective skills directory for `sgclaw`.
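A minimal sketch of the resolver, assuming "browser config directory" means the directory that contains `sgclaw_config.json`. The function name `resolve_skills_dir` and its signature are illustrative; the default path follows the fallback stated in the architecture note.

```rust
// Hypothetical sketch: function name and signature are assumptions.
use std::path::{Path, PathBuf};

/// Return the effective skills directory for sgclaw.
///
/// - missing or blank `skillsDir`: `<workspace_root>/.sgclaw-zeroclaw-workspace/skills`
/// - relative `skillsDir`: resolved against the directory containing the config file
/// - absolute `skillsDir`: preserved as-is
pub fn resolve_skills_dir(
    skills_dir: Option<&str>,
    config_path: &Path,
    workspace_root: &Path,
) -> PathBuf {
    let default_dir = workspace_root
        .join(".sgclaw-zeroclaw-workspace")
        .join("skills");

    let raw = match skills_dir.map(str::trim) {
        Some(value) if !value.is_empty() => value,
        _ => return default_dir,
    };

    let candidate = Path::new(raw);
    if candidate.is_absolute() {
        candidate.to_path_buf()
    } else {
        // Fall back to the current directory if the config path has no parent.
        config_path
            .parent()
            .unwrap_or_else(|| Path::new("."))
            .join(candidate)
    }
}
```

The function takes the already-parsed `skillsDir` value rather than the JSON itself, so the three cases in Step 1 can be tested without touching the filesystem.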
**Step 4: Run test to verify it passes**
Run: `cargo test compat_config -- --nocapture`
Expected: PASS for the new parsing and path-resolution cases.
### Task 2: Route compat runtime skill loading through sgclaw-owned resolution
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/config_adapter.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
**Step 1: Write the failing test**
Add a compat runtime test that creates:
- a default workspace skill package under `.sgclaw-zeroclaw-workspace/skills`
- a custom skill package under another directory configured via `skillsDir`
Assert that provider request payload contains only the configured skill name when `skillsDir` is set, and still contains workspace skill names when the override is absent.
**Step 2: Run test to verify it fails**
Run: `cargo test compat_runtime -- --nocapture`
Expected: FAIL because the runtime currently always loads skills from `config.workspace_dir`.
**Step 3: Write minimal implementation**
Keep `config.workspace_dir` for ZeroClaw internal state, but load skills from the resolved effective skills directory by calling `load_skills_from_directory` directly when a custom directory is configured.
**Step 4: Run test to verify it passes**
Run: `cargo test compat_runtime -- --nocapture`
Expected: PASS and provider request payload shows the right `Available Skills` content.
### Task 3: Document and verify backward compatibility
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/docs/README.md`
- Modify: `/home/zyl/projects/sgClaw/claw/docs/L5-提示词分布与安全改造方案.md`
**Step 1: Write the failing check**
Record the expected runtime behavior:
- `sgclaw` owns skill lookup
- SuperRPA only passes `--config-path`
- `skillsDir` is optional
**Step 2: Run verification**
Run: `rg -n "skillsDir|sgclaw owns skill lookup|config-path" docs`
Expected: no matches for the new behavior before the docs are updated.
**Step 3: Write minimal documentation**
Document:
- JSON field name
- relative-path resolution base
- default fallback
- operational implication for SuperRPA integration
**Step 4: Run verification**
Run: `rg -n "skillsDir|sgclaw owns skill lookup|config-path" docs`
Expected: PASS, with every pattern now matching in the updated docs.
### Task 4: Final verification
**Files:**
- Review only: `/home/zyl/projects/sgClaw/claw/src/config/settings.rs`
- Review only: `/home/zyl/projects/sgClaw/claw/src/compat/config_adapter.rs`
- Review only: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
- Review only: `/home/zyl/projects/sgClaw/claw/tests/compat_config_test.rs`
- Review only: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
**Step 1: Run targeted tests**
Run: `cargo test compat_config -- --nocapture`
Expected: PASS
**Step 2: Run runtime tests**
Run: `cargo test compat_runtime -- --nocapture`
Expected: PASS
**Step 3: Run skill-lib structural validation**
Run: `python3 -m unittest tests.skill_lib_validation_test -v`
Expected: PASS
**Step 4: Commit**
```bash
git add docs/plans/2026-03-27-sgclaw-configurable-skills-dir-plan.md \
src/config/settings.rs \
src/compat/config_adapter.rs \
src/compat/runtime.rs \
tests/compat_config_test.rs \
tests/compat_runtime_test.rs \
docs/README.md \
docs/L5-提示词分布与安全改造方案.md
git commit -m "feat: make sgclaw skills directory configurable"
```


@@ -0,0 +1,624 @@
# sgClaw Floating Chat Frontend Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Replace the current debug-style `sgclaw-chat` UI with a complete floating-chat frontend that matches the product structure of Doubao's side panel while preserving the current SuperRPA bridge and configuration capabilities.
**Architecture:** Keep `chrome://superrpa-functions/sgclaw-chat` as the first delivery host so the new UI can be built and verified without waiting for the final page-floating container. Split the current monolithic Lit component into host adapter, state modules, typed message model, presentational components, and secondary panels so the same UI can later be mounted in a real injected floating window on normal web pages. Preserve the existing browser bridge (`sgclawConnect`, `sgclawStart`, `sgclawStop`, `sgclawSubmitTask`) and re-home logs/configuration into secondary panels instead of deleting them.
**Tech Stack:** Chromium WebUI, Lit, existing `FunctionsUI` router, SuperRPA browser bridge callbacks, current `sgclaw-config` config page logic, future floating host injection in SuperRPA.
## Product Target
The frontend target is a single-column chat product, not a multi-card debug workstation.
Final visual structure:
```text
Collapsed Fab
┌────────────┐
│ sgClaw ●2 │
└────────────┘
Expanded Chat
┌──────────────────────────────────────────┐
│ sgClaw | 当前网页example.com │
│ [新对话] [历史] [设置] [收起] │
│ 状态:待命 / 执行中 / 出错 │
├──────────────────────────────────────────┤
│ 欢迎区 / 推荐动作 │
│ [总结当前页面] [提取表格] [执行网页操作] │
├──────────────────────────────────────────┤
│ 消息流 │
│ 用户消息 │
│ 助手消息 │
│ 步骤卡 / 结果卡 / 错误卡 │
├──────────────────────────────────────────┤
│ [网页执行] [页面问答] [页面总结] │
│ [上下文开关] [调试] [更多] │
│ ┌──────────────────────────────────────┐ │
│ │ 输入任务... │ │
│ └──────────────────────────────────────┘ │
│ [发送]│
└──────────────────────────────────────────┘
```
Core UX rules:
- The primary content area is always the message stream.
- `finalResult` becomes a result card inside the message stream.
- `logs` move into a hidden debug drawer.
- `start/stop` remain available but move to the header status area.
- Configuration remains available but opens inside a settings panel first, with route-navigation fallback to `chrome://superrpa-functions/sgclaw-config`.
- The same component tree must work in `FunctionsUI` first and later inside a real injected floating host.
## Scope
### In Scope For This Frontend Plan
- Complete visual redesign of `sgclaw-chat`
- Empty state, active chat state, running state, success state, error state
- Local conversation history UI
- Embedded settings panel
- Debug drawer
- Stable typed message model
- Separation of host bridge code from UI code
- Floating launcher state model
### Explicitly Out Of Scope For First Frontend Delivery
- Real attachment upload execution
- Deep-thinking or multi-skill plugin ecosystem
- Provider/protocol redesign on the Rust side
- Full page-injected floating host implementation
- New backend APIs beyond the current bridge
## Existing Baseline To Reuse
The implementation should reuse these existing assets instead of replacing them blindly:
- Host page routing: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/functions.ts`
- Existing chat entry registration: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/functions_manifest.json`
- Current chat page bridge logic: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
- Current floating state prototype: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state.ts`
- Current config UI and bridge: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config.ts`
## Final File Layout
All implementation paths below are exact and rooted in `/home/zyl/projects/superRpa/src`.
### Core Chat Entry
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts`
### State Modules
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_window_state.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_conversation_state.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts`
### Host Adapter
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_host_adapter.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_runtime_bridge.ts`
### Message Model And Rendering
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-list.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-user.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-assistant.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-step.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-result.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-error.ts`
### Shell Components
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-shell.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-header.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-composer.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-history-panel.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-settings-panel.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-debug-drawer.ts`
### Build And Host Wiring
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/BUILD.gn`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/functions.html.ts`
### Tests
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state_mainline_unittest.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_window_state_mainline_unittest.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state_mainline_unittest.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state_mainline_unittest.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_runtime_bridge_mainline_unittest.ts`
## Target State Model
Use a typed model instead of the current loose shape.
```ts
interface SgClawChatWindowState {
  windowOpen: boolean;
  activePanel: 'chat' | 'history' | 'settings' | 'debug';
  unreadCount: number;
}

interface SgClawChatConversationState {
  conversationId: string;
  draftInput: string;
  mode: 'web-action' | 'page-qa' | 'page-summary';
  contextEnabled: boolean;
  messages: SgClawMessage[];
}

interface SgClawMessage {
  id: string;
  type: 'user_text' | 'assistant_text' | 'task_step' | 'task_result' | 'task_error' | 'system_notice';
  role: 'user' | 'assistant' | 'system';
  content: string;
  status?: 'pending' | 'running' | 'done' | 'failed';
  timestamp: number;
  meta?: Record<string, unknown>;
}
```
The current `logs`, `messages`, `finalResult`, `pendingReply`, and `busy` state should be re-expressed through these typed stores instead of being owned directly by the entry component.
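As a rough illustration of that re-expression, the sketch below folds a legacy state object into a typed `SgClawMessage[]`. The `LegacyChatState` shape, the `migrateLegacyState` name, and the injected `now`/`nextId` helpers are assumptions made for this example; the real field types of the current entry component are not shown in this plan.

```ts
type SgClawMessageType =
  | 'user_text' | 'assistant_text' | 'task_step'
  | 'task_result' | 'task_error' | 'system_notice';

interface SgClawMessage {
  id: string;
  type: SgClawMessageType;
  role: 'user' | 'assistant' | 'system';
  content: string;
  status?: 'pending' | 'running' | 'done' | 'failed';
  timestamp: number;
  meta?: Record<string, unknown>;
}

// Hypothetical legacy shape: the real fields may be typed differently.
interface LegacyChatState {
  messages: Array<{role: 'user' | 'assistant'; text: string}>;
  finalResult: string;
  pendingReply: boolean;
}

function migrateLegacyState(
    legacy: LegacyChatState,
    now: () => number,
    nextId: () => string): SgClawMessage[] {
  // Existing turns become plain text messages.
  const out: SgClawMessage[] = legacy.messages.map((m): SgClawMessage => ({
    id: nextId(),
    type: m.role === 'user' ? 'user_text' : 'assistant_text',
    role: m.role,
    content: m.text,
    status: 'done',
    timestamp: now(),
  }));
  // A non-empty final result becomes an inline result message.
  if (legacy.finalResult !== '') {
    out.push({
      id: nextId(),
      type: 'task_result',
      role: 'assistant',
      content: legacy.finalResult,
      status: 'done',
      timestamp: now(),
    });
  }
  // A pending reply becomes an assistant message with `pending` status.
  if (legacy.pendingReply) {
    out.push({
      id: nextId(),
      type: 'assistant_text',
      role: 'assistant',
      content: '',
      status: 'pending',
      timestamp: now(),
    });
  }
  return out;
}
```

Injecting `now` and `nextId` keeps the function pure, so it can be exercised in the same style as the state-module tests planned in Task 2.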
## Task 1: Freeze The Current Entry And Enable Real Template/CSS Modules
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts`
**Step 1: Write the failing structure test**
Add assertions that the entry no longer hardcodes the full DOM layout in `render()` and instead imports its shell template and style helpers.
**Step 2: Run test to verify it fails**
Run:
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease sgclaw-chat_build_ts
```
Expected: fail because `sgclaw-chat.html.ts` and `sgclaw-chat.css.ts` are empty and the new test expects real exports.
**Step 3: Write the minimal implementation**
- Move root shell markup to `getHtml()`
- Move root style tokens/layout to `getCss()`
- Keep `sgclaw-chat.ts` focused on state + events
**Step 4: Run test to verify it passes**
Run the same build target.
Expected: TS build succeeds and the entry uses external template/style helpers.
**Step 5: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "refactor: extract sgclaw chat shell template"
```
## Task 2: Build The Window, Conversation, History, And Settings State Modules
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_window_state.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_conversation_state.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state_mainline_unittest.ts`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_window_state_mainline_unittest.ts`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state_mainline_unittest.ts`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state_mainline_unittest.ts`
**Step 1: Write the failing pure-state tests**
Cover:
- open/close/switch panel transitions
- unread count clear on open
- create/reset conversation
- local history push/select/remove
- settings draft dirty detection
**Step 2: Run tests to verify RED**
Run:
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease sgclaw-chat_build_ts
```
Expected: build fails because the new modules and tests do not exist yet.
**Step 3: Write the minimal implementation**
Implement pure functions only. Do not mix DOM work into these modules.
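A minimal sketch of what "pure functions only" could look like for the window state follows. The function names and the two behaviors marked as assumptions (ignoring panel switches while closed, counting unread messages only while closed) are illustrative choices, not decisions already made by this plan.

```ts
type SgClawPanel = 'chat' | 'history' | 'settings' | 'debug';

interface SgClawChatWindowState {
  windowOpen: boolean;
  activePanel: SgClawPanel;
  unreadCount: number;
}

// Opening the window clears the unread badge.
function openWindow(state: SgClawChatWindowState): SgClawChatWindowState {
  return {...state, windowOpen: true, unreadCount: 0};
}

function closeWindow(state: SgClawChatWindowState): SgClawChatWindowState {
  return {...state, windowOpen: false};
}

// Assumption: panel switches are ignored while the window is closed.
function switchPanel(
    state: SgClawChatWindowState,
    panel: SgClawPanel): SgClawChatWindowState {
  return state.windowOpen ? {...state, activePanel: panel} : state;
}

// Assumption: the unread count only grows while the window is closed.
function noteIncomingMessage(
    state: SgClawChatWindowState): SgClawChatWindowState {
  return state.windowOpen ?
      state :
      {...state, unreadCount: state.unreadCount + 1};
}
```

Each function returns a new object instead of mutating its input, which is what lets the unit tests assert on plain values without any DOM setup.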
**Step 4: Run tests to verify GREEN**
Run the same build target.
Expected: all pure-state modules compile and their tests pass.
**Step 5: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_window_state.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_conversation_state.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state_mainline_unittest.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_window_state_mainline_unittest.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state_mainline_unittest.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "feat: add sgclaw chat state modules"
```
## Task 3: Introduce A Host Adapter So UI Stops Owning Bridge Details
**Files:**
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_host_adapter.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_runtime_bridge.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_runtime_bridge_mainline_unittest.ts`
**Step 1: Write the failing bridge test**
Test that:
- `connect()` issues `sgclawConnect`
- `start()` issues `sgclawStart`
- `stop()` issues `sgclawStop`
- `submitTask()` issues `sgclawSubmitTask`
- callback payload parsing is handled in one place
**Step 2: Run test to verify RED**
Run:
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease sgclaw-chat_build_ts
```
Expected: fail because adapter modules do not exist.
**Step 3: Write minimal implementation**
- Wrap `chrome.send`
- Centralize callback registration
- Return typed runtime events/state to the UI layer
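The three points above can be sketched roughly as follows. The message names come from the bridge test list in Step 1; everything else is an assumption for illustration: the `SgClawRuntimeBridge` class name, the injected `SendFn` (standing in for `chrome.send` so tests can pass a fake), the single-argument shape of `sgclawSubmitTask`, and the `{type, payload}` callback payload format.

```ts
// Same call shape as WebUI's chrome.send; injected so tests can pass a fake.
type SendFn = (message: string, args?: unknown[]) => void;

interface SgClawRuntimeEvent {
  type: string;
  payload: unknown;
}

// Single place where raw callback payloads are parsed.
// Assumption: payloads are JSON strings or objects with a string `type`.
function parseCallbackPayload(raw: unknown): SgClawRuntimeEvent | null {
  let value: unknown = raw;
  if (typeof raw === 'string') {
    try {
      value = JSON.parse(raw);
    } catch {
      return null;
    }
  }
  if (typeof value !== 'object' || value === null) {
    return null;
  }
  const record = value as Record<string, unknown>;
  const type = record['type'];
  return typeof type === 'string' ? {type, payload: record['payload']} : null;
}

class SgClawRuntimeBridge {
  private readonly send: SendFn;

  constructor(send: SendFn) {
    this.send = send;
  }

  connect(): void {
    this.send('sgclawConnect');
  }

  start(): void {
    this.send('sgclawStart');
  }

  stop(): void {
    this.send('sgclawStop');
  }

  // Assumption: the instruction travels as the single argument.
  submitTask(instruction: string): void {
    this.send('sgclawSubmitTask', [instruction]);
  }
}
```

In the real page the bridge would presumably be constructed with a thin wrapper around `chrome.send`, while the unit test passes a recording fake and asserts on the captured message names.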
**Step 4: Run test to verify GREEN**
Run the same build target.
Expected: adapter tests compile and pass.
**Step 5: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_host_adapter.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_runtime_bridge.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_runtime_bridge_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "refactor: isolate sgclaw chat host bridge"
```
## Task 4: Replace The Loose Message Format With Typed Cards In The Message Stream
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-list.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-user.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-assistant.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-step.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-result.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-error.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts`
**Step 1: Write the failing rendering test**
Add expectations that:
- empty state shows guidance instead of a blank box
- a `task_complete` event renders a result card (`task_result`) in the message stream
- an `error` event renders an error card (`task_error`) in the message stream
- `pendingReply` renders a pending assistant card
**Step 2: Run test to verify RED**
Run the TS build target.
Expected: fail because message types and card components do not exist.
**Step 3: Write minimal implementation**
- Keep the message list single-column
- Preserve current user/assistant turn behavior
- Move `finalResult` handling into result-card rendering
- Move error display into message flow
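One way to keep the list single-column while routing each message to its card is an exhaustive switch over the message type, sketched below. The tag names mirror the component file names listed under **Files**, but treating them as custom element tags is an assumption, as are the `sgclaw-empty-state` placeholder and the choice to show `system_notice` with the assistant card.

```ts
type SgClawMessageType =
  | 'user_text' | 'assistant_text' | 'task_step'
  | 'task_result' | 'task_error' | 'system_notice';

// Assumption: each card component registers an element named after its file.
function cardTagFor(type: SgClawMessageType): string {
  switch (type) {
    case 'user_text':
      return 'sgclaw-message-card-user';
    case 'assistant_text':
      return 'sgclaw-message-card-assistant';
    case 'task_step':
      return 'sgclaw-message-card-step';
    case 'task_result':
      return 'sgclaw-message-card-result';
    case 'task_error':
      return 'sgclaw-message-card-error';
    case 'system_notice':
      // No dedicated notice card is planned; reuse the assistant card here.
      return 'sgclaw-message-card-assistant';
  }
  // Compile-time guard: adding a new message type without a case fails here.
  const unreachable: never = type;
  return unreachable;
}

// Single-column render plan: guidance when empty, else one card per message.
function renderPlan(types: readonly SgClawMessageType[]): string[] {
  if (types.length === 0) {
    return ['sgclaw-empty-state'];  // hypothetical placeholder element
  }
  return types.map(type => cardTagFor(type));
}
```

The `never` assignment means the compiler, not a runtime test, catches a message type that was added to the model without a matching card.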
**Step 4: Run test to verify GREEN**
Run the same build target.
Expected: cards render correctly and the old standalone result area is no longer required.
**Step 5: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-list.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-user.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-assistant.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-step.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-result.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-error.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "feat: add sgclaw chat message cards"
```
## Task 5: Build The Real Header, Empty State, And Composer
**Files:**
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-shell.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-header.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-composer.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts`
**Step 1: Write the failing shell test**
Assert that the rendered page now contains:
- header with title, current page label, and status pill
- empty state recommendation buttons
- fixed composer at the bottom
- no standalone `实时日志` (live log) or `最终结果` (final result) primary sections
**Step 2: Run test to verify RED**
Run the TS build target.
Expected: fail because the shell components do not exist.
**Step 3: Write minimal implementation**
- Header: title, page context, new-chat/history/settings/collapse actions
- Empty state: 3 to 4 recommended actions
- Composer: text input, send button, mode toggles, context switch
**Step 4: Run test to verify GREEN**
Run the same build target.
Expected: the page renders as a product-style chat shell.
**Step 5: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-shell.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-header.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-composer.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "feat: add sgclaw chat shell and composer"
```
## Task 6: Embed Settings And Move Raw Logs Into A Debug Drawer
**Files:**
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-settings-panel.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-debug-drawer.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts`
- Reference only (reuse, do not modify): `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config.ts`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state_mainline_unittest.ts`
**Step 1: Write the failing panel tests**
Cover:
- opening settings panel from header
- editing embedded config draft
- opening debug drawer and showing logs
- closing secondary panels without destroying the chat draft
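The draft-editing cases can be tested against a small pure model like the one sketched here. The three fields are placeholders, since the real field set should mirror `sgclaw-config`, and the function names are hypothetical.

```ts
// Hypothetical field set; the real one should mirror `sgclaw-config`.
interface SgClawSettingsValues {
  apiKey: string;
  baseUrl: string;
  model: string;
}

interface SgClawSettingsState {
  saved: SgClawSettingsValues;
  draft: SgClawSettingsValues;
}

// Applies a partial edit to the draft without touching the saved values.
function editDraft(
    state: SgClawSettingsState,
    patch: Partial<SgClawSettingsValues>): SgClawSettingsState {
  return {...state, draft: {...state.draft, ...patch}};
}

// The draft is dirty when any field differs from the saved value.
function isDirty(state: SgClawSettingsState): boolean {
  const keys = Object.keys(state.saved) as Array<keyof SgClawSettingsValues>;
  return keys.some(key => state.saved[key] !== state.draft[key]);
}

// After a successful save the draft becomes the new baseline.
function markSaved(state: SgClawSettingsState): SgClawSettingsState {
  return {saved: {...state.draft}, draft: {...state.draft}};
}

// Closing the panel without saving restores the last saved values.
function discardDraft(state: SgClawSettingsState): SgClawSettingsState {
  return {...state, draft: {...state.saved}};
}
```

Because the settings state lives apart from the conversation state, opening or closing the panel has no path to the chat draft, which is the property the last test case above is meant to pin down.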
**Step 2: Run test to verify RED**
Run the TS build target.
Expected: fail because secondary panel components do not exist.
**Step 3: Write minimal implementation**
- Reuse config field structure from `sgclaw-config`
- Keep raw logs in debug only
- Preserve route-navigation fallback for full config page if embedded save/load fails
**Step 4: Run test to verify GREEN**
Run the same build target.
Expected: settings and debug layers behave as secondary panels instead of separate pages.
**Step 5: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-settings-panel.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-debug-drawer.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "feat: add sgclaw settings panel and debug drawer"
```
## Task 7: Add Local Conversation History And New-Chat Recovery
**Files:**
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-history-panel.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state_mainline_unittest.ts`
**Step 1: Write the failing history tests**
Cover:
- saving a conversation preview to local history
- creating a fresh conversation resets the message stream but keeps the config
- reopening a history item restores messages and draft
**Step 2: Run test to verify RED**
Run the TS build target.
Expected: fail because history panel and persistence behavior do not exist.
**Step 3: Write minimal implementation**
- Store history on the client only, in browser-local storage such as `localStorage`
- Keep only small metadata plus a message snapshot for the first version
- No backend schema change in this phase
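A frontend-only history store along these lines might look like the sketch below. The entry shape, the storage key, the cap of 20 entries, and all function names are assumptions for illustration. The `KeyValueStore` interface is a minimal subset of the Web Storage API, so `localStorage` can be passed in at runtime and an in-memory fake in tests.

```ts
// Hypothetical entry shape: small metadata plus a message snapshot.
interface SgClawHistoryEntry {
  conversationId: string;
  preview: string;
  updatedAt: number;
  draftInput: string;
  messages: Array<{type: string; content: string}>;
}

// Minimal subset of the Web Storage API; `localStorage` satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const HISTORY_KEY = 'sgclaw.chat.history';  // hypothetical storage key
const HISTORY_LIMIT = 20;                   // hypothetical cap

// Puts the entry first and replaces any older copy of the same conversation.
function pushHistory(
    entries: readonly SgClawHistoryEntry[],
    entry: SgClawHistoryEntry): SgClawHistoryEntry[] {
  const rest = entries.filter(e => e.conversationId !== entry.conversationId);
  return [entry, ...rest].slice(0, HISTORY_LIMIT);
}

function removeHistory(
    entries: readonly SgClawHistoryEntry[],
    conversationId: string): SgClawHistoryEntry[] {
  return entries.filter(e => e.conversationId !== conversationId);
}

function selectHistory(
    entries: readonly SgClawHistoryEntry[],
    conversationId: string): SgClawHistoryEntry | undefined {
  return entries.find(e => e.conversationId === conversationId);
}

function saveHistory(
    store: KeyValueStore,
    entries: readonly SgClawHistoryEntry[]): void {
  store.setItem(HISTORY_KEY, JSON.stringify(entries));
}

// Falls back to an empty list when nothing is stored or the data is corrupt.
function loadHistory(store: KeyValueStore): SgClawHistoryEntry[] {
  const raw = store.getItem(HISTORY_KEY);
  if (raw === null) {
    return [];
  }
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as SgClawHistoryEntry[]) : [];
  } catch {
    return [];
  }
}
```

Keeping the list operations separate from `saveHistory`/`loadHistory` means the push/select/remove tests from Task 2 need no storage at all, and only the persistence tests need the fake store.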
**Step 4: Run test to verify GREEN**
Run the same build target.
Expected: local conversation switching works fully in the frontend.
**Step 5: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-history-panel.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "feat: add sgclaw local conversation history"
```
## Task 8: Wire New Shell Assets Into BUILD And Polish The Host Page
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/BUILD.gn`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/functions.html.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/functions.css`
**Step 1: Write the failing host expectation**
Add a small host-level check that:
- `sgclaw-chat` still loads from the manifest
- host quick actions still work
- the function page provides enough room for the new chat shell
**Step 2: Run test/build to verify RED**
Run:
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease sgclaw-chat_build_ts
```
Expected: fail or render incorrectly because new component files are not all wired into build/host styling yet.
**Step 3: Write minimal implementation**
- Add all new TS modules to `BUILD.gn`
- Keep `sgclaw-chat` and `sgclaw-config` quick actions
- Adjust host layout so the new shell is not boxed into the old debug-page proportions
**Step 4: Run verification**
Run:
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease sgclaw-chat_build_ts
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease superrpa_resources
```
Expected: build completes with all new chat modules wired in.
**Step 5: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/resources/superrpa/devtools/BUILD.gn \
chrome/browser/resources/superrpa/devtools/functions/functions.html.ts \
chrome/browser/resources/superrpa/devtools/functions/functions.css
git -C /home/zyl/projects/superRpa/src commit -m "chore: wire sgclaw chat frontend modules"
```
## Manual Verification Matrix
Run all manual checks in `chrome://superrpa-functions/sgclaw-chat` after the full frontend plan lands.
### UX States
- Empty state appears on first open.
- Recommended actions generate user messages.
- Composer stays visible while history/settings/debug panels switch.
- Message stream auto-scrolls to the latest item.
- Result cards and error cards appear inline.
### Runtime
- `启动` (Start) works from the header area.
- `停止` (Stop) works from the header area.
- submit creates an immediate user message.
- pending assistant card appears while waiting.
- result card replaces the old standalone result behavior.
### Settings
- embedded settings loads existing values
- save updates status and clears dirty state
- fallback route to `chrome://superrpa-functions/sgclaw-config` still works
### Debug
- logs are not visible in the main chat view
- debug drawer shows raw logs when opened
### History
- new conversation starts clean
- previous conversation can be restored from local history
- unread badge clears when reopening the window
## Execution Notes
- Keep the current backend/runtime bridge unchanged until the new frontend shell is stable.
- Do not combine page-injected floating host work into this same branch. The first milestone is a complete product-grade frontend inside the existing `FunctionsUI` host.
- When this frontend plan is complete, the next plan should focus only on mounting the same component tree inside a real page floating container.
Plan saved to `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`.


@@ -0,0 +1,85 @@
# sgClaw Overlay And Basic Navigation Fixes Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Make ordinary webpages render the new sgClaw floating chat design and support basic navigation instructions like `打开知乎` (open Zhihu).
**Architecture:** Keep the ordinary-page injection entrypoint unchanged, but replace the layout inside its Shadow DOM with the same floating-window shell used by the new debug page. On the runtime side, extend the deterministic planner with explicit homepage navigation plans for supported sites so free-form open-site commands do not fail before the compat runtime can help.
**Tech Stack:** Chromium WebUI resource pipeline, injected Shadow DOM overlay JavaScript, Rust planner tests
### Task 1: Lock the current regressions with failing tests
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs`
- Modify: `/home/zyl/projects/sgClaw/claw/tests/planner_test.rs`
**Step 1: Write the failing smoke expectations**
Add assertions that the ordinary webpage overlay shows the new subtitle `面向当前网页的悬浮式对话与自动执行` (floating chat and automated execution for the current page) and no longer exposes old card titles such as `聊天记录` (chat history).
**Step 2: Run the smoke to verify it fails**
Run: `node /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs`
Expected: FAIL because ordinary webpages still render the old overlay shell.
**Step 3: Write the failing planner test**
Add a test asserting `plan_instruction("打开知乎")` returns one `Navigate` step to `https://www.zhihu.com`.
**Step 4: Run the planner test to verify it fails**
Run: `cargo test planner_supports_open_zhihu_homepage_instruction --test planner_test`
Expected: FAIL with `unsupported instruction: 打开知乎`.
### Task 2: Migrate the ordinary webpage overlay to the new shell
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs`
**Step 1: Replace the old card layout with the new floating shell**
Keep bridge calls, ids, and polling behavior intact, but render the new header, message pane, composer, settings panel, and debug drawer structure inside the existing injected Shadow DOM.
**Step 2: Keep runtime visibility without reintroducing the old layout**
Move logs and final result into secondary panels or inline cards so the ordinary webpage still exposes execution details without the old four-card layout.
**Step 3: Run the smoke again**
Run: `node /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs`
Expected: PASS once rebuilt resources are being served by the browser binary.
### Task 3: Extend planner support for basic open-site commands
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/planner.rs`
- Test: `/home/zyl/projects/sgClaw/claw/tests/planner_test.rs`
**Step 1: Implement the minimal homepage plans**
Support `打开知乎` (open Zhihu) and `打开百度` (open Baidu) by returning single-step `Navigate` plans to their homepages.
**Step 2: Run planner tests**
Run: `cargo test --test planner_test`
Expected: PASS.
### Task 4: Build and verify the integrated behavior
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/AGENTS.md`
- Modify: `/home/zyl/projects/superRpa/src/docs/handoffs/2026-03-27-sgclaw-runtime-verification.md`
**Step 1: Rebuild impacted targets**
Run: `autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome/browser/resources/superrpa:resources sgclaw`
**Step 2: Re-run targeted verification**
Run the smoke and a focused `sgclaw` task submission check for `打开知乎`.
**Step 3: Document the final runtime path**
Record that ordinary webpages and `chrome://superrpa-functions/sgclaw-chat` now share the same floating shell, and that homepage navigation commands are handled by the planner.


@@ -0,0 +1,158 @@
# Skill Lib Testing Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Add an in-project, repeatable test harness that validates `/home/zyl/projects/sgClaw/skill_lib` against the current ZeroClaw `SKILL.md` loader and security-audit expectations.
**Architecture:** Keep the test runner inside the SGClaw repository and target the sibling `skill_lib` directory by relative path. Implement a small Python validator that mirrors the ZeroClaw markdown frontmatter parser and the relevant skill-audit checks, then cover it with a Python `unittest` suite that exercises the actual three migrated Zhihu skills.
**Tech Stack:** Python 3 standard library, `unittest`, local file-system inspection, ZeroClaw source code as behavioral reference, Markdown/YAML-like frontmatter parsing.
### Task 1: Freeze The Test Contract
**Files:**
- Create: `/home/zyl/projects/sgClaw/claw/docs/plans/2026-03-27-skill-lib-testing-plan.md`
- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/skills/mod.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/skills/audit.rs`
- Reference only: `/home/zyl/projects/sgClaw/skill_lib/skills/*/SKILL.md`
**Step 1: Capture the loader semantics to preserve**
Document and implement tests for:
- `SKILL.md` frontmatter splitting on `---`
- supported metadata keys: `name`, `description`, `version`, `author`, `tags`
- fallback rules for name, description, and version
- prompt body must exclude the frontmatter block
**Step 2: Capture the audit semantics to preserve**
Document and implement tests for:
- skill root must contain `SKILL.md` or `SKILL.toml`
- symlinks are rejected
- shell-script files are blocked when `allow_scripts` is false
- markdown links must not escape the skill root
- high-risk command snippets inside markdown are rejected
**Step 3: Define the migrated-skill expectations**
The test suite must verify:
- exactly three skill packages exist
- the loaded names are `zhihu-hotlist`, `zhihu-navigate`, `zhihu-write`
- each package has both `references/` and `assets/`
- each description stays trigger-oriented and starts with `Use when`
### Task 2: Write The Failing Tests First
**Files:**
- Create: `/home/zyl/projects/sgClaw/claw/tests/skill_lib_validation_test.py`
**Step 1: Write a failing import-level test**
Import a not-yet-created validator module from:
- `/home/zyl/projects/sgClaw/claw/scripts/validate_skill_lib.py`
Expected initial failure:
- `ModuleNotFoundError` or `FileNotFoundError`
**Step 2: Encode the project expectations**
Add tests for:
- skill discovery count and names
- parsed metadata for each current skill
- audit cleanliness for each skill with `allow_scripts=False`
- package shape (`SKILL.md`, `references/`, `assets/`)
**Step 3: Run the tests and watch them fail**
Run:
```bash
python3 -m unittest tests.skill_lib_validation_test -v
```
Expected:
- failure because the validator module does not exist yet
### Task 3: Implement The Minimal Validator
**Files:**
- Create: `/home/zyl/projects/sgClaw/claw/scripts/validate_skill_lib.py`
**Step 1: Implement discovery helpers**
Implement:
- repo root resolution
- sibling `skill_lib` root resolution
- `skills/` directory enumeration
**Step 2: Implement the markdown loader**
Implement:
- frontmatter split
- lightweight frontmatter parsing
- description fallback extraction
- metadata normalization into a `SkillRecord`
**Step 3: Implement the relevant audit checks**
Implement:
- symlink detection
- blocked shell-script detection
- markdown link boundary checks
- high-risk snippet detection
- deterministic findings collection
**Step 4: Implement a small CLI**
Running:
```bash
python3 scripts/validate_skill_lib.py
```
Should:
- print one summary line per skill
- exit `0` when all skills pass
- exit non-zero when any skill fails
### Task 4: Run The Tests Green
**Files:**
- Test: `/home/zyl/projects/sgClaw/claw/tests/skill_lib_validation_test.py`
- Test: `/home/zyl/projects/sgClaw/claw/scripts/validate_skill_lib.py`
**Step 1: Re-run the unit tests**
Run:
```bash
python3 -m unittest tests.skill_lib_validation_test -v
```
Expected:
- all tests pass
**Step 2: Run the CLI validator**
Run:
```bash
python3 scripts/validate_skill_lib.py
```
Expected:
- all three skills print `PASS`
- process exits `0`
### Task 5: Document The Verification Entry Point
**Files:**
- Modify: `/home/zyl/projects/sgClaw/skill_lib/VERIFY.md`
**Step 1: Add the project-local validation command**
Add:
- `python3 /home/zyl/projects/sgClaw/claw/scripts/validate_skill_lib.py`
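- `python3 -m unittest /home/zyl/projects/sgClaw/claw/tests/skill_lib_validation_test.py` (run from `/home/zyl/projects/sgClaw/claw`; `unittest` only maps a file path to a module name when the file sits under the current working directory)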
**Step 2: Re-run both commands after the doc update**
Expected:
- validator still exits `0`
- unit tests still pass


@@ -0,0 +1,411 @@
# Skill Lib ZeroClaw Migration Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Create `/home/zyl/projects/sgClaw/skill_lib` as a dedicated skill library directory and restructure the current Zhihu browser capabilities into ZeroClaw-style skill packages.
**Architecture:** Treat `skill_lib` as a standalone skill repository, not as an embedded Rust module tree. Use the ZeroClaw/open-skills layout `skill_lib/skills/<skill-name>/SKILL.md`, keep each skill self-contained, and move long operational detail into `references/` plus any preserved source artifacts into `assets/`. Map the current four exposed Rust capabilities into three end-user skills: `zhihu-navigate`, `zhihu-write`, and `zhihu-hotlist`.
**Tech Stack:** Markdown `SKILL.md`, YAML frontmatter, directory-based ZeroClaw skill packaging, existing SGClaw Zhihu Rust/JSON source material, shell validation commands.
### Task 1: Freeze The Target Layout
**Files:**
- Create: `/home/zyl/projects/sgClaw/skill_lib/`
- Create: `/home/zyl/projects/sgClaw/skill_lib/README.md`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/`
- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/skills/mod.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/skills/browser/SKILL.md`
**Step 1: Create the top-level repository skeleton**
Create:
- `/home/zyl/projects/sgClaw/skill_lib/README.md`
- `/home/zyl/projects/sgClaw/skill_lib/skills/`
The README should state:
- this directory is a dedicated ZeroClaw-style skill library
- runtime skill packages live under `skills/<name>/`
- each skill package uses `SKILL.md` plus optional `references/`, `scripts/`, and `assets/`
**Step 2: Document the package contract in the README**
Include:
- required file: `SKILL.md`
- supported frontmatter for this repo: `name`, `description`, `version`, `author`, `tags`
- design rule: `description` must be trigger-oriented and not a workflow dump
- design rule: keep `SKILL.md` concise and push long detail into `references/`
**Step 3: Run structural sanity checks**
Run:
```bash
test -d /home/zyl/projects/sgClaw/skill_lib
test -d /home/zyl/projects/sgClaw/skill_lib/skills
test -f /home/zyl/projects/sgClaw/skill_lib/README.md
```
Expected: all commands exit `0`.
### Task 2: Define The Skill Inventory And Source Mapping
**Files:**
- Create: `/home/zyl/projects/sgClaw/skill_lib/skill_inventory.md`
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/mod.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/router.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu_hotlist.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu_hotlist_store.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu_navigation.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_write_flow.json`
- Reference only: `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_hotlist_flow.json`
- Reference only: `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_navigation_pages.json`
**Step 1: Write the migration inventory**
Create `/home/zyl/projects/sgClaw/skill_lib/skill_inventory.md` with a three-row mapping:
- `zhihu-navigate` ← current `zhihu_navigate`
- `zhihu-write` ← current `zhihu_write`
- `zhihu-hotlist` ← current `zhihu_hotlist_collect` + `zhihu_hotlist_report`
**Step 2: Capture the non-migrated code responsibilities**
Document explicitly that this migration does **not** port:
- Rust router dispatch
- browser pipe transport code
- snapshot persistence implementation detail
Document that the new repo is a skill library, not a Rust runtime.
**Step 3: Record source artifacts per target skill**
For each target skill, list:
- source Rust module(s)
- source JSON flow/catalog file(s)
- important risk notes discovered during analysis
**Step 4: Validate the inventory**
Run:
```bash
rg -n "zhihu-navigate|zhihu-write|zhihu-hotlist" /home/zyl/projects/sgClaw/skill_lib/skill_inventory.md
```
Expected: all three skill names appear exactly once as top-level migration targets.
### Task 3: Author The `zhihu-navigate` Skill Package
**Files:**
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/SKILL.md`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/references/routes-and-targets.md`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/references/selector-strategy.md`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/assets/zhihu_navigation_pages.source.json`
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu_navigation.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_navigation_pages.json`
**Step 1: Preserve the raw source artifact**
Copy the current navigation catalog into:
- `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/assets/zhihu_navigation_pages.source.json`
This file is for traceability only, not for frontmatter or prompt injection.
**Step 2: Write the `SKILL.md`**
Use ZeroClaw-style frontmatter:
```yaml
---
name: zhihu-navigate
description: Use when the user wants to open, switch, or navigate to a Zhihu page, tab, menu, profile area, notifications area, message area, or creator area through browser actions.
version: 0.1.0
author: sgclaw
tags:
- zhihu
- browser
- navigation
---
```
The body should include:
- overview
- when to use
- workflow for route vs component vs flow navigation
- ambiguity handling rules
- output contract
- common mistakes
**Step 3: Write `routes-and-targets.md`**
Summarize:
- route/component/flow/target model
- representative target names
- known alias conflicts
- preferred disambiguation wording for future prompts
**Step 4: Write `selector-strategy.md`**
Document:
- why selectors should prefer semantic hooks over CSS hash classes
- fallback ordering
- known brittle selectors from the current source
**Step 5: Validate the package**
Run:
```bash
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/SKILL.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/references/routes-and-targets.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/references/selector-strategy.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/assets/zhihu_navigation_pages.source.json
```
Expected: all commands exit `0`.
### Task 4: Author The `zhihu-write` Skill Package
**Files:**
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/SKILL.md`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/references/editor-flow.md`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/references/publish-safety.md`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/assets/zhihu_write_flow.source.json`
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_write_flow.json`
**Step 1: Preserve the raw source artifact**
Copy:
- `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_write_flow.json`
to:
- `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/assets/zhihu_write_flow.source.json`
**Step 2: Write the `SKILL.md`**
The frontmatter should name a single skill:
- `name: zhihu-write`
- description focused on when article drafting or publishing is requested
The body should include:
- prerequisites before touching the editor
- workflow for draft-only vs publish
- explicit confirmation gate before publish
- required final report fields: title, mode, final URL if published, unresolved issues
**Step 3: Write `editor-flow.md`**
Document:
- entry page
- editor readiness checks
- title/body fill rules
- publish confirmation sequence
- URL capture rules
**Step 4: Write `publish-safety.md`**
Document:
- when to stop and ask for confirmation
- what to do if title verification fails
- what to do if the URL remains in edit mode
- brittle selectors that must be revalidated first
**Step 5: Validate the package**
Run:
```bash
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/SKILL.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/references/editor-flow.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/references/publish-safety.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/assets/zhihu_write_flow.source.json
```
Expected: all commands exit `0`.
### Task 5: Author The `zhihu-hotlist` Skill Package
**Files:**
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/references/collection-flow.md`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/references/report-format.md`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/references/data-quality.md`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/assets/zhihu_hotlist_flow.source.json`
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu_hotlist.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu_hotlist_store.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_hotlist_flow.json`
**Step 1: Preserve the raw source artifact**
Copy:
- `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_hotlist_flow.json`
to:
- `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/assets/zhihu_hotlist_flow.source.json`
**Step 2: Write the `SKILL.md`**
Use one skill to cover:
- hotlist collection
- comment metric collection
- snapshot-style reporting
The body should clearly separate:
- collection workflow
- report workflow
- partial-failure handling
- output contract
**Step 3: Write `collection-flow.md`**
Include:
- hotlist page detection
- hotlist HTML capture strategy
- top N extraction
- detail-page comment collection flow
- metric parsing notes
**Step 4: Write `report-format.md`**
Define:
- report header line
- per-item summary line
- field names and order
- handling when comment metrics are missing
**Step 5: Write `data-quality.md`**
Document:
- why partial success must be surfaced
- what counts as incomplete data
- known parser risks
- recommended caution language in outputs
**Step 6: Validate the package**
Run:
```bash
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/references/collection-flow.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/references/report-format.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/references/data-quality.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/assets/zhihu_hotlist_flow.source.json
```
Expected: all commands exit `0`.
### Task 6: Normalize Frontmatter And Trigger Quality
**Files:**
- Modify: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/SKILL.md`
- Modify: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/SKILL.md`
- Modify: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md`
**Step 1: Normalize frontmatter keys**
Ensure each `SKILL.md` contains exactly these frontmatter keys in this order:
- `name`
- `description`
- `version`
- `author`
- `tags`
Do not add Rust-only or unofficial parser fields as required metadata.
**Step 2: Check naming rules**
Ensure skill names are:
- lowercase
- hyphenated
- stable
Names to keep:
- `zhihu-navigate`
- `zhihu-write`
- `zhihu-hotlist`
**Step 3: Tighten descriptions**
Each description must:
- begin with `Use when`
- describe triggering conditions
- mention Zhihu/browser context
- avoid dumping full workflow detail
**Step 4: Validate frontmatter**
Run:
```bash
rg -n "^name: |^description: |^version: |^author: |^tags:" /home/zyl/projects/sgClaw/skill_lib/skills/*/SKILL.md
```
Expected: every skill emits the same five key families.
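The `rg` command only shows that the keys are present. A stricter check of the "exactly these keys in this order" and "begin with `Use when`" rules could be sketched as follows. The function names are hypothetical, and the parser only looks at unindented `key:` lines inside the frontmatter block.

```python
import re
from typing import List, Tuple

REQUIRED_KEYS = ["name", "description", "version", "author", "tags"]
FIELD_RE = re.compile(r"^([A-Za-z_][\w-]*):(.*)$")


def frontmatter_fields(text: str) -> List[Tuple[str, str]]:
    """Return top-level (key, value) pairs in file order; [] if frontmatter is missing."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    fields = []
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        match = FIELD_RE.match(line)
        if match:
            fields.append((match.group(1), match.group(2).strip()))
    return []  # unterminated frontmatter is treated as missing


def check_skill_frontmatter(text: str) -> List[str]:
    """Return problems for one SKILL.md; an empty list means it conforms."""
    problems = []
    fields = frontmatter_fields(text)
    keys = [key for key, _ in fields]
    if keys != REQUIRED_KEYS:
        problems.append(f"expected keys {REQUIRED_KEYS}, found {keys}")
    if not dict(fields).get("description", "").startswith("Use when"):
        problems.append("description must begin with 'Use when'")
    return problems
```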
### Task 7: Add Repository-Level Verification Notes
**Files:**
- Create: `/home/zyl/projects/sgClaw/skill_lib/VERIFY.md`
- Modify: `/home/zyl/projects/sgClaw/skill_lib/README.md`
**Step 1: Create `VERIFY.md`**
Document the manual verification checklist:
- all skill packages are under `skill_lib/skills/`
- each package has `SKILL.md`
- long details live in `references/`
- preserved source JSON is in `assets/`
- no Rust source is copied into the skill repo
**Step 2: Link verification from the README**
Add a short section in `README.md` pointing to `VERIFY.md`.
**Step 3: Run repository-level checks**
Run:
```bash
find /home/zyl/projects/sgClaw/skill_lib/skills -mindepth 2 -maxdepth 2 -name SKILL.md | sort
find /home/zyl/projects/sgClaw/skill_lib/skills -type d \( -name references -o -name assets \) | sort
```
Expected:
- exactly three `SKILL.md` files
- each skill has `references/`
- each skill has `assets/`
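The two `find` commands need a manual read of their output. The same expectations can be asserted programmatically, as in this sketch; `check_repo_layout` is a hypothetical helper, and the expected skill names are the three targets defined in Task 2.

```python
from pathlib import Path
from typing import List

EXPECTED_SKILLS = ["zhihu-hotlist", "zhihu-navigate", "zhihu-write"]


def check_repo_layout(skill_lib_root: Path) -> List[str]:
    """Return layout problems; an empty list means all three packages are well-formed."""
    problems = []
    skills_dir = skill_lib_root / "skills"
    # Equivalent to: find skills -mindepth 2 -maxdepth 2 -name SKILL.md
    found = sorted(p.parent.name for p in skills_dir.glob("*/SKILL.md"))
    if found != EXPECTED_SKILLS:
        problems.append(f"expected skills {EXPECTED_SKILLS}, found {found}")
    for name in found:
        for sub in ("references", "assets"):
            if not (skills_dir / name / sub).is_dir():
                problems.append(f"{name}: missing {sub}/")
    return problems
```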
### Task 8: Final Review Before Claiming Completion
**Files:**
- Review only: `/home/zyl/projects/sgClaw/skill_lib/`
- Review only: `/home/zyl/projects/sgClaw/claw/docs/plans/2026-03-27-skill-lib-zeroclaw-plan.md`
**Step 1: Review against ZeroClaw runtime constraints**
Check that the final library respects the currently observed runtime facts:
- directory-based skills
- `SKILL.md` supported
- simple frontmatter fields
- optional `references/`, `scripts/`, `assets/`
**Step 2: Review against authoring quality**
Check that each skill:
- is self-contained
- has a narrow trigger boundary
- avoids copying Rust internals into the prompt body
- surfaces ambiguity and failure modes
**Step 3: Produce the implementation report**
The completion report must include:
- created directories
- created skill packages
- any deliberate deviations from upstream ZeroClaw examples
- verification commands actually run
- unresolved risks
**Step 4: Stop before unrelated expansion**
Do not add:
- extra skills beyond the three mapped ones
- generic utility libraries
- unrelated automation scripts
- runtime code changes in `/home/zyl/projects/sgClaw/claw/src/skill/`


@@ -0,0 +1,598 @@
# SGClaw ZeroClaw Core Realignment Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Rebuild `sgclaw` as a security-hardened zeroclaw runtime where the browser is a privileged tool surface and client host, not the product's defining execution model.
**Architecture:** `zeroclaw` remains the real agent core for prompt building, skills, memory, tool routing, autonomy, and execution loops. `sgclaw` adds a security envelope around zeroclaw: hardened configuration defaults, MAC-validated pipe communication, browser-host revalidation, and constrained exposure of privileged tools. The browser stays important, but only as one protected execution surface.
**Tech Stack:** Rust, vendored `zeroclaw`, SuperRPA browser pipe protocol, Chromium WebUI/overlay, JSON config, HMAC/MAC policy rules.
## Scope Guard
- In scope:
  - zeroclaw-first runtime restructuring
  - security and tool policy layering
  - browser pipe re-framing as privileged tool surface
  - documentation correction before code changes
  - browser UI wording and observability alignment
- Out of scope:
  - broad UI redesign before the runtime contract is fixed
  - indiscriminately enabling every zeroclaw built-in tool
  - keeping parallel browser-only and zeroclaw-first runtime stacks long-term
## Task 1: Rewrite The Product Docs Before Touching Runtime Code
**Files:**
- Modify: `docs/README.md`
- Modify: `docs/L0-产品白皮书与能力全景层.md`
- Modify: `docs/L1-系统架构与安全模型层.md`
- Modify: `docs/L2-核心模块与接口契约层.md`
- Modify: `docs/L3-数据流与Skill体系层.md`
- Modify: `docs/L4-工程实现与部署拓扑层.md`
- Modify: `docs/L5-提示词分布与安全改造方案.md`
- Modify: `docs/浏览器对接标准.md`
**Intent:**
- Make the docs the single source of truth before implementation starts.
- Remove the misleading framing that `sgclaw` is fundamentally a browser execution core.
- Replace it with the correct framing: `sgclaw` is a security-hardened zeroclaw distribution with a privileged browser execution surface.
**Step 1: Add a docs checklist in the plan branch**
Checklist to apply consistently across the docs:
```text
1. zeroclaw is the runtime core
2. sgclaw adds security policy and protected execution surfaces
3. browser pipe is one privileged tool surface, not the whole runtime
4. docs must distinguish current implementation gaps from target architecture
5. no doc may imply that browser-only compat is the desired end state
```
**Step 2: Update the architecture docs**
Required wording changes:
- Replace phrases equivalent to “浏览器智能体执行内核” (“browser agent execution kernel”) with wording equivalent to “安全加固后的 zeroclaw runtime” (“security-hardened zeroclaw runtime”).
- Reframe `compat` as a temporary adaptation layer, not the final product identity.
- Clarify that `browser_action` is a protected tool contract, not the only capability sgClaw should ever have.
- Clarify that prompt governance, security summaries, skills, memory, and routing should stay aligned with zeroclaw-native mechanisms.
**Step 3: Run doc consistency checks**
Run:
```bash
rg -n '浏览器智能体执行内核|单一 `browser_action` 工具|兼容执行器' docs
rg -n "zeroclaw.*核心|特权工具面|安全加固" docs
```
Expected:
- The first command should only return historical or explicitly marked current-state references.
- The second command should show the new target framing in the mainline docs.
**Step 4: Commit**
```bash
git add docs/README.md docs/L0-产品白皮书与能力全景层.md docs/L1-系统架构与安全模型层.md docs/L2-核心模块与接口契约层.md docs/L3-数据流与Skill体系层.md docs/L4-工程实现与部署拓扑层.md docs/L5-提示词分布与安全改造方案.md docs/浏览器对接标准.md
git commit -m "docs: redefine sgclaw as hardened zeroclaw runtime"
```
## Task 2: Introduce A Zeroclaw-First Runtime Module
**Files:**
- Create: `src/runtime/mod.rs`
- Create: `src/runtime/profile.rs`
- Create: `src/runtime/tool_policy.rs`
- Create: `src/runtime/engine.rs`
- Modify: `src/lib.rs`
- Test: `tests/runtime_profile_test.rs`
**Intent:**
- Create a real runtime namespace that represents sgClaw's zeroclaw-first architecture.
- Stop letting `compat/runtime.rs` be the place where product architecture is defined.
**Step 1: Write the failing tests**
Create `tests/runtime_profile_test.rs` with at least:
```rust
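// Assumption: the library crate is importable as `sgclaw` and re-exports these
// types from `runtime`; adjust the path to the real crate and module names.
use sgclaw::runtime::{RuntimeProfile, ToolPolicy};

#[test]
fn browser_attached_profile_exposes_browser_surface_without_becoming_browser_only() {
    let profile = RuntimeProfile::BrowserAttached;
    let policy = ToolPolicy::for_profile(profile);
    // `allowed_tools` is a `Vec<String>`, so compare element-wise; `contains("...")` would not compile.
    assert!(policy.allowed_tools.iter().any(|tool| tool == "browser_action"));
    assert!(policy.may_use_non_browser_tools);
}

#[test]
fn general_assistant_profile_does_not_require_browser_surface() {
    let profile = RuntimeProfile::GeneralAssistant;
    let policy = ToolPolicy::for_profile(profile);
    assert!(!policy.requires_browser_surface);
}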
```
**Step 2: Run the tests to confirm failure**
Run:
```bash
cargo test --test runtime_profile_test -- --nocapture
```
Expected:
- Fail with unresolved imports or missing types for `RuntimeProfile` and `ToolPolicy`.
**Step 3: Add the minimal runtime module**
Implement the first-pass skeleton:
```rust
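/// Runtime shapes sgClaw can run in (first-pass skeleton).
// `Debug` and `PartialEq` are needed for the `assert_eq!` checks in the config tests.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuntimeProfile {
    BrowserAttached,
    BrowserHeavy,
    GeneralAssistant,
}

/// Tool exposure derived from a runtime profile.
// `ToolPolicy::for_profile(profile)` must also be implemented for the Step 1 tests to compile.
#[derive(Debug, Clone)]
pub struct ToolPolicy {
    pub requires_browser_surface: bool,
    pub may_use_non_browser_tools: bool,
    pub allowed_tools: Vec<String>,
}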
```
**Step 4: Re-run the focused tests**
Run:
```bash
cargo test --test runtime_profile_test -- --nocapture
```
Expected:
- Both new tests pass.
**Step 5: Commit**
```bash
git add src/runtime/mod.rs src/runtime/profile.rs src/runtime/tool_policy.rs src/runtime/engine.rs src/lib.rs tests/runtime_profile_test.rs
git commit -m "feat: add zeroclaw-first runtime module skeleton"
```
## Task 3: Replace DeepSeek-Only Settings With Zeroclaw-First SGClaw Settings
**Files:**
- Modify: `src/config/settings.rs`
- Modify: `src/config/mod.rs`
- Modify: `src/compat/config_adapter.rs`
- Test: `tests/compat_config_test.rs`
- Test: `tests/compat_memory_test.rs`
- Test: `tests/compat_cron_test.rs`
**Intent:**
- Stop treating browser config as only a DeepSeek shim.
- Introduce sgClaw settings that can express zeroclaw-first runtime behavior while staying backward-compatible with the existing `sgclaw_config.json`.
**Step 1: Add the failing config tests**
Add tests beside the existing ones in `tests/compat_config_test.rs`:
```rust
#[test]
fn sgclaw_settings_default_to_compact_skills_and_browser_attached_profile() {
    let settings = SgClawSettings::from_legacy_deepseek_fields(
        "sk-test".into(),
        "https://api.deepseek.com".into(),
        "deepseek-chat".into(),
        None,
    ).unwrap();
    assert_eq!(settings.runtime_profile, RuntimeProfile::BrowserAttached);
    assert_eq!(settings.skills_prompt_mode, SkillsPromptMode::Compact);
}

#[test]
fn sgclaw_settings_load_new_runtime_fields_from_browser_config() {
    // write config with runtimeProfile / skillsPromptMode / allowedToolProfiles
}
```
**Step 2: Run the focused tests**
Run:
```bash
cargo test --test compat_config_test -- --nocapture
```
Expected:
- Fail because `SgClawSettings`, `RuntimeProfile`, or new parsing fields do not exist yet.
**Step 3: Implement backward-compatible settings expansion**
Minimal target shape:
```rust
use std::path::PathBuf;

pub struct SgClawSettings {
    pub provider_api_key: String,
    pub provider_base_url: String,
    pub provider_model: String,
    pub skills_dir: Option<PathBuf>,
    pub skills_prompt_mode: SkillsPromptMode,
    pub runtime_profile: RuntimeProfile,
}
```
Compatibility rules:
- Existing `apiKey/baseUrl/model/skillsDir` continue to load.
- New fields such as `skillsPromptMode` and `runtimeProfile` are optional.
- Defaults should be hardened, not legacy-full-prompt by accident.
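The compatibility rules can be illustrated independently of the Rust implementation. The sketch below uses Python only to show the intended loading behavior: legacy keys keep working, new keys are optional, and missing new keys fall back to the hardened defaults named in the Step 1 test (compact skills, browser-attached profile). The value spellings (`"compact"`, `"browser_attached"`), the treatment of `apiKey`/`baseUrl`/`model` as required, and the function name are assumptions, not the actual `sgclaw_config.json` schema.

```python
import json
from typing import Any, Dict

# Assumption: illustrative spellings for the hardened defaults.
DEFAULTS = {"skillsPromptMode": "compact", "runtimeProfile": "browser_attached"}
LEGACY_REQUIRED = ("apiKey", "baseUrl", "model")


def load_settings(raw_json: str) -> Dict[str, Any]:
    """Load a config document, accepting legacy-only files and optional new fields."""
    raw = json.loads(raw_json)
    missing = [key for key in LEGACY_REQUIRED if not raw.get(key)]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    settings = {key: raw[key] for key in LEGACY_REQUIRED}
    settings["skillsDir"] = raw.get("skillsDir")  # optional legacy field
    for key, default in DEFAULTS.items():
        # New fields are optional; absence selects the hardened default.
        settings[key] = raw.get(key, default)
    return settings
```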
**Step 4: Run the config and adapter tests**
Run:
```bash
cargo test --test compat_config_test -- --nocapture
cargo test --test compat_memory_test -- --nocapture
cargo test --test compat_cron_test -- --nocapture
```
Expected:
- All pass.
**Step 5: Commit**
```bash
git add src/config/settings.rs src/config/mod.rs src/compat/config_adapter.rs tests/compat_config_test.rs tests/compat_memory_test.rs tests/compat_cron_test.rs
git commit -m "feat: add zeroclaw-first sgclaw settings model"
```
## Task 4: Rebuild The Execution Path Around The New Runtime Engine
**Files:**
- Modify: `src/compat/runtime.rs`
- Modify: `src/agent/mod.rs`
- Modify: `src/compat/event_bridge.rs`
- Test: `tests/compat_runtime_test.rs`
**Intent:**
- Make `compat/runtime.rs` a thin bridge into the new runtime engine instead of the place where the product's core execution policy lives.
- Keep browser-originated tasks attached to a browser surface, but do not collapse the runtime into a browser-only tool list.
**Step 1: Add the failing runtime tests**
Extend `tests/compat_runtime_test.rs` with at least:
```rust
#[test]
fn compat_runtime_can_complete_a_text_only_turn_without_browser_tool_calls() {
    // provider returns direct assistant content
    // no BrowserMessage::Response is queued
    // summary should still succeed
}

#[test]
fn compat_runtime_allows_read_skill_under_compact_mode_policy() {
    // configure compact skills mode
    // verify the runtime tool policy includes read_skill
}
```
**Step 2: Run the focused runtime test file**
Run:
```bash
cargo test --test compat_runtime_test -- --nocapture
```
Expected:
- Fail because the current runtime still hardcodes browser-only exposure or cannot express the new policy.
**Step 3: Refactor the runtime path**
Implementation target:
- `src/agent/mod.rs` decides the runtime profile and passes browser context into the runtime engine.
- `src/compat/runtime.rs` becomes a compatibility bridge, not the architecture center.
- `src/runtime/engine.rs` owns:
  - zeroclaw agent creation
  - tool registration
  - skill loading
  - runtime profile application
  - browser contract prompt injection only when browser surface is actually present
**Step 4: Re-run the focused runtime tests**
Run:
```bash
cargo test --test compat_runtime_test -- --nocapture
```
Expected:
- Existing compat runtime tests still pass.
- New text-only / compact-skill tests pass.
**Step 5: Commit**
```bash
git add src/compat/runtime.rs src/agent/mod.rs src/compat/event_bridge.rs tests/compat_runtime_test.rs
git commit -m "refactor: route browser requests through zeroclaw-first runtime engine"
```
## Task 5: Treat The Browser Pipe As A Privileged Tool Surface
**Files:**
- Modify: `src/compat/browser_tool_adapter.rs`
- Modify: `src/pipe/protocol.rs`
- Modify: `src/pipe/browser_tool.rs`
- Modify: `src/security/mac_policy.rs`
- Test: `tests/compat_browser_tool_test.rs`
- Test: `tests/browser_tool_test.rs`
- Test: `tests/pipe_protocol_test.rs`
- Test: `tests/pipe_handshake_test.rs`
**Intent:**
- Keep the browser powerful, but explicitly as a protected external execution surface.
- Preserve the current HMAC/MAC/rules boundary while making it obvious in code that browser execution is not synonymous with runtime execution.
**Step 1: Add the failing tests**
Examples:
```rust
#[test]
fn browser_tool_contract_is_marked_as_privileged_surface_in_policy_metadata() {
    // assert runtime metadata treats browser tool separately from generic tools
}

#[test]
fn browser_submit_task_can_carry_browser_context_without_forcing_browser_only_execution() {
    // protocol/adapter level test
}
```
**Step 2: Run the safety-related tests**
Run:
```bash
cargo test --test compat_browser_tool_test -- --nocapture
cargo test --test browser_tool_test -- --nocapture
cargo test --test pipe_protocol_test -- --nocapture
cargo test --test pipe_handshake_test -- --nocapture
```
Expected:
- Fail on new privilege/metadata expectations before implementation.
**Step 3: Implement the policy split**
Implementation target:
- keep `browser_action` schema constrained
- keep `MacPolicy` as the final guard on outbound browser commands
- annotate browser surface metadata in the runtime/tool policy layer
- avoid leaking browser-only assumptions into generic runtime config
**Step 4: Re-run the focused browser safety tests**
Run the same four commands from Step 2.
Expected:
- All pass.
**Step 5: Commit**
```bash
git add src/compat/browser_tool_adapter.rs src/pipe/protocol.rs src/pipe/browser_tool.rs src/security/mac_policy.rs tests/compat_browser_tool_test.rs tests/browser_tool_test.rs tests/pipe_protocol_test.rs tests/pipe_handshake_test.rs
git commit -m "feat: model browser pipe as privileged runtime surface"
```
## Task 6: Restore Zeroclaw-Native Skills Instead Of Browser-Specific Skill Prompt Hacks
**Files:**
- Modify: `src/runtime/engine.rs`
- Modify: `src/compat/event_bridge.rs`
- Modify: `src/agent/mod.rs`
- Test: `tests/compat_runtime_test.rs`
- Test: `tests/skill_lib_validation_test.py`
**Intent:**
- Make skill loading and skill usage follow zeroclaw-native behavior first.
- Prefer compact mode plus `read_skill` when safe.
- Make skill usage observable in runtime logs so the browser UI can explain what happened.
**Step 1: Add the failing skill-observability tests**
Add tests for:
- compact mode includes `read_skill`
- runtime logs show when `read_skill` or skill-defined tools are invoked
- configured `skillsDir` still resolves both repo-root and nested `skills/` layouts
**Step 2: Run the skill-related tests**
Run:
```bash
cargo test --test compat_runtime_test -- --nocapture
python3 tests/skill_lib_validation_test.py
```
Expected:
- New observability assertions fail before implementation.
**Step 3: Implement the minimal skill-first runtime behavior**
Implementation target:
- compact skills mode becomes the sgClaw default unless explicitly overridden
- `read_skill` is allowed when the active tool policy permits it
- runtime logs include loaded skill names and actual skill/tool usage
**Step 4: Re-run the skill tests**
Run the same two commands from Step 2.
Expected:
- Both pass.
**Step 5: Commit**
```bash
git add src/runtime/engine.rs src/compat/event_bridge.rs src/agent/mod.rs tests/compat_runtime_test.rs tests/skill_lib_validation_test.py
git commit -m "feat: restore zeroclaw-native skill flow and observability"
```
## Task 7: Remove Or Quarantine The Legacy Parallel Runtime Paths
**Files:**
- Modify or delete: `src/agent/runtime.rs`
- Modify or delete: `src/agent/planner.rs`
- Modify: `tests/agent_runtime_test.rs`
- Modify: `tests/planner_test.rs`
- Modify: `docs/L2-核心模块与接口契约层.md`
- Modify: `docs/L3-数据流与Skill体系层.md`
**Intent:**
- Stop shipping multiple conceptual runtimes.
- If the light runtime/planner path still has value, mark it as legacy/dev-only.
- If it has no production value, remove it after the zeroclaw-first runtime is stable.
**Step 1: Decide the disposition**
Choose one:
```text
A. delete planner/runtime legacy path
B. keep as explicit legacy/dev-only module with zero production routing
```
Recommendation: `B` first, then `A` after one clean release cycle.
**Step 2: Add the failing cleanup tests**
Examples:
- production browser path never routes into planner fallback
- docs no longer describe planner fallback as the primary model-enabled path
**Step 3: Run the legacy-path tests**
Run:
```bash
cargo test --test agent_runtime_test -- --nocapture
cargo test --test planner_test -- --nocapture
```
Expected:
- Fail once the cleanup assertions are added.
**Step 4: Apply the cleanup**
Implementation target:
- remove production routing dependency
- rename or mark legacy modules clearly
- update docs to match the new reality
**Step 5: Re-run the legacy-path tests and commit**
Run the same two commands, then:
```bash
git add src/agent/runtime.rs src/agent/planner.rs tests/agent_runtime_test.rs tests/planner_test.rs docs/L2-核心模块与接口契约层.md docs/L3-数据流与Skill体系层.md
git commit -m "refactor: quarantine legacy browser-only runtime paths"
```
## Task 8: Align The Browser UI With The New Runtime Truth
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts`
- Modify: `/home/zyl/projects/superRpa/src/AGENTS.md`
**Intent:**
- Make the browser UI an honest client of the runtime.
- Stop labeling all tasks as “网页执行” (“web page execution”) when the runtime may have solved them through non-browser zeroclaw capabilities.
**Step 1: Add the UI contract checklist**
Checklist:
```text
1. UI shows runtime profile or capability mode
2. UI can show whether browser tools were actually used
3. UI can show when skills were read or invoked
4. wording does not imply browser is the whole runtime
```
**Step 2: Implement the label and observability changes**
Examples:
- replace fixed “网页执行” (“web page execution”) copy with runtime-derived capability wording
- add a compact “本轮调用” (“calls this turn”) / “能力来源” (“capability source”) section
- avoid implying every successful task came from page automation
**Step 3: Verify browser resources**
Run:
```bash
node --check /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome
```
Expected:
- All pass.
**Step 4: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add chrome/browser/resources/superrpa/sgclaw_overlay.js chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts AGENTS.md
git -C /home/zyl/projects/superRpa/src commit -m "feat: align browser ui with zeroclaw-first runtime"
```
## Final Verification Matrix
Run, in order:
```bash
cargo test --test runtime_profile_test -- --nocapture
cargo test --test compat_config_test -- --nocapture
cargo test --test compat_runtime_test -- --nocapture
cargo test --test compat_browser_tool_test -- --nocapture
cargo test --test browser_tool_test -- --nocapture
cargo test --test pipe_protocol_test -- --nocapture
cargo test --test pipe_handshake_test -- --nocapture
cargo test --test agent_runtime_test -- --nocapture
cargo test --test planner_test -- --nocapture
python3 tests/skill_lib_validation_test.py
```
If local Cargo registry state is unstable, use the Chromium hermetic wrapper instead:
```bash
python3 /home/zyl/projects/superRpa/src/tools/crates/run_cargo.py test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml -- --nocapture
```
Then verify browser integration:
```bash
node --check /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome
```
## Success Criteria
- The docs describe the zeroclaw-first runtime before the code change begins.
- `sgclaw` behaves like a security-hardened zeroclaw runtime, not a browser-only compat shell.
- The browser pipe remains central for protected execution, but it is no longer mistaken for the whole product architecture.
- Skills, prompt building, memory, routing, and tool policy all flow through zeroclaw-native mechanisms first.
- The UI becomes a thin, honest client of the shared runtime.

@@ -0,0 +1,482 @@
# sgClaw SuperRPA Decoupled Runtime Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Keep the SuperRPA parent-child security model, while moving high-frequency sgClaw startup, model, skill, and frontend presentation changes out of SuperRPA compile-time code and into runtime-managed configuration.
**Architecture:** SuperRPA remains the trusted host that owns process spawning, pipe security, browser/office capability gates, and frontend host contracts. sgClaw becomes the runtime-configured execution engine that reads launch/runtime policy from files, with SuperRPA preferring external launch descriptors and external frontend bundles before falling back to bundled defaults. This preserves the security boundary while removing the need to rebuild the browser for routine sgClaw iteration.
**Tech Stack:** Chromium C++ WebUI, TypeScript/Lit frontend, Rust sgClaw runtime, JSON config files, local filesystem-based runtime assets, existing pipe protocol and Zeroclaw planner-first execution path.
### Task 1: Freeze the design in docs before further code changes
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/docs/L1-系统架构与安全模型层.md`
- Modify: `/home/zyl/projects/sgClaw/claw/docs/L2-核心模块与接口契约层.md`
- Modify: `/home/zyl/projects/sgClaw/claw/docs/L4-工程实现与部署拓扑层.md`
- Create: `/home/zyl/projects/sgClaw/claw/docs/plans/2026-03-29-sgclaw-superrpa-runtime-config-design.md`
**Step 1: Write the design delta doc**
Document these decisions explicitly:
- SuperRPA owns host security and capability exposure only.
- sgClaw owns planner, model routing, skill orchestration, and business behavior.
- Launch behavior is described by runtime files, not hardcoded browser-side constants.
- Frontend only has display rights; planner/executor decisions stay in sgClaw/Zeroclaw.
**Step 2: Add the failing doc checklist**
Create a checklist inside the design doc with these questions and mark them initially unresolved:
- Can browser startup switch sgClaw binary without rebuilding Chromium?
- Can model/provider selection change without rebuilding Chromium?
- Can floating UI be replaced without rebuilding Chromium?
- Can acceptance flows prove planner-first behavior visually and functionally?
**Step 3: Update the core architecture docs**
Add short sections showing:
- Launch config file path and fallback rules.
- Runtime config ownership split between SuperRPA and sgClaw.
- External frontend bundle loading path and fallback to bundled assets.
**Step 4: Review docs for consistency**
Check that `L1`, `L2`, `L4`, and the new design doc all use the same terms:
- `host`
- `launch config`
- `runtime config`
- `frontend bundle`
- `planner-first`
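The consistency review can be partly automated. The sketch below only checks that each required term appears at least once in a document; it cannot detect synonyms or misuse, so it supplements the manual review rather than replacing it. The function names are illustrative.

```python
REQUIRED_TERMS = ("host", "launch config", "runtime config", "frontend bundle", "planner-first")


def missing_terms(doc_text: str, terms=REQUIRED_TERMS) -> list:
    """Return the required terms that never appear in a document (case-insensitive)."""
    lowered = doc_text.lower()
    return [term for term in terms if term not in lowered]


def check_docs(docs: dict) -> dict:
    """Map each document name to its missing terms, omitting documents that use all of them."""
    report = {}
    for name, text in docs.items():
        gaps = missing_terms(text)
        if gaps:
            report[name] = gaps
    return report
```

To use it, read `L1`, `L2`, `L4`, and the design doc into a dict keyed by file name and pass that to `check_docs`; an empty result means every document mentions every term.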
**Step 5: Commit**
```bash
git -C /home/zyl/projects/sgClaw/claw add \
docs/L1-系统架构与安全模型层.md \
docs/L2-核心模块与接口契约层.md \
docs/L4-工程实现与部署拓扑层.md \
docs/plans/2026-03-29-sgclaw-superrpa-runtime-config-design.md
git -C /home/zyl/projects/sgClaw/claw commit -m "docs: define superrpa sgclaw runtime boundary"
```
### Task 2: Finish and lock down the current stale-backend fix
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.h`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui_mainline_unittest.cc`
- Create: `/home/zyl/projects/sgClaw/claw/tools/browser_runtime/sgclaw_browser_entry.sh`
**Step 1: Write the failing regression test**
Add internal tests for binary resolution priority:
1. `SUPERRPA_SGCLAW_BINARY` override wins.
2. `skillsDir`-inferred source checkout wrapper wins over bundled binary.
3. Bundled `out/.../sgclaw` is only a fallback.
**Step 2: Run the failing test**
Run:
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease functions_ui_mainline_unittests
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests --gtest_filter="SgClawSessionServiceInternalTest.*"
```
Expected: the new tests fail, or the test target fails to build, until the test helper wiring is complete.
**Step 3: Write the minimal implementation**
Expose a testable internal resolver function that accepts:
- config path
- bundled binary path
- optional env override string
- output detail string
Keep production `Start()` calling the same shared resolver to avoid divergence.
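The priority order can be sketched as a pure function, which is what makes it testable without spawning a process. This Python version is illustrative only (the real resolver is C++ in `sgclaw_session_service.cc`). The `infer_wrapper` callable stands in for the `skillsDir` inference rule, which this plan does not specify, and the final "no sgclaw binary found" detail string is hypothetical.

```python
from typing import Callable, Optional, Tuple


def resolve_sgclaw_binary(
    skills_dir: Optional[str],
    bundled_binary: str,
    env_override: Optional[str],
    infer_wrapper: Callable[[str], Optional[str]],
    exists: Callable[[str], bool],
) -> Tuple[Optional[str], str]:
    """Return (binary path, detail string) using the documented priority order."""
    # 1. An explicit SUPERRPA_SGCLAW_BINARY override always wins.
    if env_override:
        return env_override, f"using SUPERRPA_SGCLAW_BINARY override: {env_override}"
    # 2. A source-checkout wrapper inferred from skillsDir beats the bundled binary.
    if skills_dir:
        wrapper = infer_wrapper(skills_dir)  # ASSUMPTION: inference rule is injected
        if wrapper and exists(wrapper):
            return wrapper, f"using source checkout sgclaw inferred from skillsDir: {wrapper}"
    # 3. The bundled binary is only a fallback.
    if exists(bundled_binary):
        return bundled_binary, f"using bundled sgclaw from browser output dir: {bundled_binary}"
    return None, "no sgclaw binary found"
```

Injecting `exists` and `infer_wrapper` lets the regression tests cover all three branches with in-memory fakes, and production code can pass the real filesystem checks to the same function.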
**Step 4: Run tests to verify they pass**
Run:
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease functions_ui_mainline_unittests
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests --gtest_filter="SgClawSessionServiceInternalTest.*:FunctionsUiMainlineTest.StartPublishesDetailedRulesDiagnosticsToUiLogs"
```
Expected: all targeted tests pass.
**Step 5: Run browser compile verification**
Run:
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome
```
Expected: `LINK ./chrome` with exit code `0`.
**Step 6: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc \
chrome/browser/ui/webui/superrpa/sgclaw_session_service.h \
chrome/browser/ui/webui/superrpa/functions_ui_mainline_unittest.cc
git -C /home/zyl/projects/superRpa/src commit -m "superrpa: resolve sgclaw binary from runtime config"
git -C /home/zyl/projects/sgClaw/claw add \
tools/browser_runtime/sgclaw_browser_entry.sh
git -C /home/zyl/projects/sgClaw/claw commit -m "sgclaw: add browser entry wrapper script"
```
### Task 3: Add a real launch descriptor so SuperRPA no longer hardcodes sgClaw startup policy
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_webui_config.h`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_webui_config.cc`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui.cc`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config_state.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config_mainline_unittest.ts`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_launch_config.h`
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_launch_config.cc`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui_mainline_unittest.cc`
**Step 1: Write the failing config tests**
Cover:
- missing launch config falls back safely
- explicit `binary`, `args`, `env`, `working_dir`, `runtime_config_path` parse correctly
- unsafe or nonexistent paths are rejected with clear UI-visible errors
**Step 2: Run the failing tests**
Run:
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease functions_ui_mainline_unittests
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests --gtest_filter="*SgClaw*Config*"
```
Expected: launch-config cases fail before parser/consumer code is added.
**Step 3: Implement minimal launch config support**
Define a host-side launch descriptor with fields:
- `binary`
- `args`
- `env`
- `working_dir`
- `runtime_config_path`
- `frontend_bundle_dir`
Load it from a predictable profile-local path, with safe defaults and fallback to existing behavior.
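A minimal sketch of the parsing and validation rules, written in Python for brevity; the real parser would live in `sgclaw_launch_config.cc`. The specific safety checks (absolute POSIX paths, no `..` segments, path must exist) and the error wording are assumptions about what "unsafe" means here, not rules taken from the host code.

```python
import json
import posixpath
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


class LaunchConfigError(ValueError):
    """Carries a message that could be surfaced in the config UI."""


@dataclass
class LaunchConfig:
    binary: Optional[str] = None
    args: List[str] = field(default_factory=list)
    env: Dict[str, str] = field(default_factory=dict)
    working_dir: Optional[str] = None
    runtime_config_path: Optional[str] = None
    frontend_bundle_dir: Optional[str] = None


PATH_FIELDS = ("binary", "working_dir", "runtime_config_path", "frontend_bundle_dir")


def _checked_path(name: str, value, exists: Callable[[str], bool]) -> Optional[str]:
    """Validate one optional path field (ASSUMED rules, see lead-in)."""
    if value is None:
        return None
    if not isinstance(value, str) or not posixpath.isabs(value):
        raise LaunchConfigError(f"{name} must be an absolute path")
    if ".." in value.split("/"):
        raise LaunchConfigError(f"{name} must not contain '..' segments")
    if not exists(value):
        raise LaunchConfigError(f"{name} does not exist: {value}")
    return value


def parse_launch_config(text: Optional[str], exists: Callable[[str], bool]) -> LaunchConfig:
    """Parse launch config JSON; `text=None` models a missing file."""
    if text is None:
        return LaunchConfig()  # missing file: caller keeps existing bundled behavior
    try:
        raw = json.loads(text)
    except json.JSONDecodeError as exc:
        raise LaunchConfigError(f"launch config is not valid JSON: {exc.msg}") from exc
    if not isinstance(raw, dict):
        raise LaunchConfigError("launch config must be a JSON object")
    args = raw.get("args", [])
    env = raw.get("env", {})
    if not (isinstance(args, list) and all(isinstance(a, str) for a in args)):
        raise LaunchConfigError("args must be a list of strings")
    if not (isinstance(env, dict) and all(isinstance(v, str) for v in env.values())):
        raise LaunchConfigError("env must map names to strings")
    paths = {name: _checked_path(name, raw.get(name), exists) for name in PATH_FIELDS}
    return LaunchConfig(args=list(args), env=dict(env), **paths)
```

Raising a dedicated error type keeps the "clear UI-visible errors" requirement separate from the fallback decision: the caller can show the message and still start with defaults.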
**Step 4: Wire startup to the descriptor**
Have `SgClawSessionService::Start()` resolve:
- executable path
- process args
- working dir
- env
- runtime config path
without requiring browser recompilation for routine changes.
**Step 5: Wire config UI to persist supported fields**
Make `sgclaw-config` save and load the new fields so local users can adjust launch behavior from the UI or by editing the JSON file directly.
**Step 6: Run tests and browser compile**
Run:
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease functions_ui_mainline_unittests chrome
```
Expected: config tests pass and browser still links.
**Step 7: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/ui/webui/superrpa/sgclaw_launch_config.h \
chrome/browser/ui/webui/superrpa/sgclaw_launch_config.cc \
chrome/browser/ui/webui/superrpa/sgclaw_webui_config.h \
chrome/browser/ui/webui/superrpa/sgclaw_webui_config.cc \
chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc \
chrome/browser/ui/webui/superrpa/functions_ui.cc \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config_state.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "superrpa: add runtime launch config for sgclaw"
```
### Task 4: Expand sgClaw runtime config so model/provider/skill policy lives in sgClaw, not SuperRPA
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/src/config/settings.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/config/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/config_adapter.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/runtime.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/planner.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_config_test.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/tests/runtime_profile_test.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/tests/planner_test.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/docs/L3-数据流与Skill体系层.md`
**Step 1: Write failing Rust tests**
Cover config-driven behavior for:
- planner-first mode
- provider list / active provider
- browser backend selection
- office backend selection
- skills prompt mode
- runtime profile
**Step 2: Run the failing tests**
Run:
```bash
cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_config_test --test runtime_profile_test --test planner_test
```
Expected: new config fields are missing or ignored.
**Step 3: Implement minimal config schema changes**
Add fields that let sgClaw choose behavior without host rebuild:
- `planner_mode`
- `providers`
- `active_provider`
- `browser_backend`
- `office_backend`
- `skills_prompt_mode`
- `runtime_profile`
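The shape of the schema change can be sketched as follows. This is Python for illustration only; the real schema belongs in `src/config/settings.rs`. Every default value, the `name` key on provider entries, and the rule that `active_provider` must match a listed provider are assumptions, not behavior confirmed by this plan.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List


@dataclass
class RuntimeConfig:
    # Field names come from the plan; every default is a placeholder.
    planner_mode: str = "planner_first"
    providers: List[Dict[str, Any]] = field(default_factory=list)
    active_provider: str = ""
    browser_backend: str = "default"
    office_backend: str = "default"
    skills_prompt_mode: str = "default"
    runtime_profile: str = "default"


def load_runtime_config(raw: Dict[str, Any]) -> RuntimeConfig:
    """Build a config from parsed JSON, ignoring keys this schema does not know."""
    known = {f.name for f in fields(RuntimeConfig)}
    config = RuntimeConfig(**{k: v for k, v in raw.items() if k in known})
    # ASSUMPTION: each provider entry carries a "name" that active_provider refers to.
    names = [p.get("name") for p in config.providers]
    if config.active_provider and config.active_provider not in names:
        raise ValueError(f"active_provider {config.active_provider!r} is not in providers")
    return config
```

Ignoring unknown keys keeps older binaries tolerant of newer config files, which matters when the config changes more often than the host is rebuilt.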
**Step 4: Keep Zeroclaw-first execution**
Ensure the planner reads config before execution and produces a visible plan event for the frontend, but the frontend still only renders what sgClaw emits.
**Step 5: Re-run Rust tests**
Run:
```bash
cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_config_test --test runtime_profile_test --test planner_test --test runtime_task_flow_test
```
Expected: planner/config tests pass.
**Step 6: Commit**
```bash
git -C /home/zyl/projects/sgClaw/claw add \
src/config/settings.rs \
src/config/mod.rs \
src/compat/config_adapter.rs \
src/agent/runtime.rs \
src/agent/planner.rs \
tests/compat_config_test.rs \
tests/runtime_profile_test.rs \
tests/planner_test.rs \
docs/L3-数据流与Skill体系层.md
git -C /home/zyl/projects/sgClaw/claw commit -m "sgclaw: move runtime policy into config"
```
### Task 5: Decouple the floating UI so visual iteration stops depending on Chromium rebuilds
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui.cc`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
- Create: `/home/zyl/projects/sgClaw/claw/frontend/runtime-host/README.md`
- Create: `/home/zyl/projects/sgClaw/claw/frontend/runtime-host/manifest.example.json`
**Step 1: Write failing UI host tests**
Cover:
- external frontend bundle dir is preferred when declared in launch config
- bundled frontend assets still load when external assets are absent
- planner events are rendered as plan cards/log lines before execution
**Step 2: Run the failing frontend/browser tests**
Run:
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome/test/data/webui_test_resources
```
Then run the relevant TypeScript tests already wired for the sgClaw chat surface.
**Step 3: Implement the minimal external bundle loader**
SuperRPA should:
- keep the host shell and JS bridge fixed
- optionally load external `sgclaw-chat` assets from the runtime-configured directory
- fall back to bundled assets when missing
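The prefer-external, fall-back-to-bundled rule can be sketched per asset. This is an illustrative Python version, not the host's C++ loader; the containment check via `realpath` and `commonpath` is one assumed way to keep a requested asset name from escaping its directory.

```python
import os
from typing import Callable, Optional


def _inside(root: str, name: str) -> Optional[str]:
    """Return the resolved asset path if it stays within root, else None."""
    root = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(root, name))
    if os.path.commonpath([root, candidate]) == root:
        return candidate
    return None


def resolve_asset(
    name: str,
    bundled_dir: str,
    external_dir: Optional[str] = None,
    is_file: Callable[[str], bool] = os.path.isfile,
) -> str:
    """Prefer the external bundle; fall back to bundled assets when it is absent."""
    if external_dir:
        external = _inside(external_dir, name)
        if external is not None and is_file(external):
            return external
    bundled = _inside(bundled_dir, name)
    if bundled is None:
        raise ValueError(f"asset name escapes the bundle directory: {name}")
    return bundled
```

Because the fallback is decided per asset, a partial external bundle still works: any file it does not provide is served from the bundled resources.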
**Step 4: Surface planner output early**
Use existing runtime event flow so the frontend shows:
- plan summary
- current step
- execution logs
without moving control logic into the frontend.
**Step 5: Re-run tests**
Run the existing sgClaw chat WebUI tests and a browser smoke test.
**Step 6: Commit**
```bash
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts \
chrome/browser/ui/webui/superrpa/functions_ui.cc \
chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc
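git -C /home/zyl/projects/superRpa/src commit -m "superrpa: support external sgclaw frontend bundle"
git -C /home/zyl/projects/sgClaw/claw add \
frontend/runtime-host/README.md \
frontend/runtime-host/manifest.example.json
git -C /home/zyl/projects/sgClaw/claw commit -m "sgclaw: add runtime-host frontend bundle docs and example manifest"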
```
### Task 6: Close the current remaining behavioral gaps before new feature work
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/tests/live_acceptance_score_test.py`
- Modify: `/home/zyl/projects/sgClaw/claw/tools/live_acceptance/run_zhihu_hotlist_excel_acceptance.py`
- Modify: `/home/zyl/projects/sgClaw/claw/docs/acceptance/2026-03-29-zhihu-hotlist-excel.md`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts`
**Step 1: Write failing acceptance assertions**
Add explicit checks for:
- no repeated assistant paragraphs
- no fake fallback data when browser path exists
- planner-first output appears before tool execution
- Zhihu hotlist extraction returns structured rows
- office export returns a real output path
**Step 2: Run the failing acceptance flow**
Run:
```bash
python3 /home/zyl/projects/sgClaw/claw/tools/live_acceptance/run_zhihu_hotlist_excel_acceptance.py
```
Expected: the current score exposes any remaining regressions.
**Step 3: Fix the smallest issue set first**
Order:
1. repeated message rendering / repeated summary emission
2. planner event visibility
3. structured hotlist extraction handoff
4. office export path propagation
**Step 4: Re-run acceptance**
Run the same command until:
- `hotlist_data_correctness > 0`
- `xlsx_export_success > 0`
- repeated text is absent
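The exit condition above can be expressed as a small gate. This sketch assumes the acceptance run yields a dict keyed by the score names shown in the list, and that "repeated text" means an identical paragraph appearing more than once in the assistant output; both are assumptions about the scorer, not its documented format.

```python
import re
from typing import Dict, List


def repeated_paragraphs(text: str) -> List[str]:
    """Return paragraphs (blank-line separated) that occur more than once."""
    seen = set()
    repeats = []
    for paragraph in re.split(r"\n\s*\n", text):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        if paragraph in seen and paragraph not in repeats:
            repeats.append(paragraph)
        seen.add(paragraph)
    return repeats


def acceptance_passed(score: Dict[str, float], assistant_text: str) -> bool:
    """True only when both scores are positive and no paragraph is repeated."""
    return (
        score.get("hotlist_data_correctness", 0) > 0
        and score.get("xlsx_export_success", 0) > 0
        and not repeated_paragraphs(assistant_text)
    )
```

Exact-match comparison is deliberately strict; near-duplicate summaries would need a fuzzier check.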
**Step 5: Record fresh evidence**
Update the acceptance markdown with:
- timestamp
- score
- exact exported path
- screenshot/log snippets
**Step 6: Commit**
```bash
git -C /home/zyl/projects/sgClaw/claw add \
tests/live_acceptance_score_test.py \
tools/live_acceptance/run_zhihu_hotlist_excel_acceptance.py \
docs/acceptance/2026-03-29-zhihu-hotlist-excel.md
git -C /home/zyl/projects/sgClaw/claw commit -m "acceptance: stabilize zhihu hotlist excel flow"
git -C /home/zyl/projects/superRpa/src add \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts
git -C /home/zyl/projects/superRpa/src commit -m "superrpa: stabilize sgclaw chat rendering for acceptance flow"
```
### Task 7: Final integrated verification
**Files:**
- Verify only: `/home/zyl/projects/sgClaw/claw/docs/acceptance/2026-03-29-zhihu-hotlist-excel.md`
- Verify only: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
- Verify only: `/home/zyl/projects/sgClaw/claw/tools/browser_runtime/sgclaw_browser_entry.sh`
**Step 1: Build all affected binaries**
Run:
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome functions_ui_mainline_unittests
cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml
```
Expected: both complete successfully.
**Step 2: Run the live browser smoke test**
Run the browser with the local profile and verify that the logs include one of:
- `using SUPERRPA_SGCLAW_BINARY override: ...`
- `using source checkout sgclaw inferred from skillsDir: ...`
- `using bundled sgclaw from browser output dir: ...`
The expected dev mode result is the source checkout path, not the stale bundled fallback.
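The log check can be scripted so the smoke test fails loudly when the stale bundled fallback is picked up. The sketch matches only the three message prefixes listed above; how the log lines are collected, and the short source labels returned, are assumptions.

```python
from typing import Iterable, Optional

# Message prefixes taken from the list above, mapped to illustrative labels.
PREFIXES = {
    "using SUPERRPA_SGCLAW_BINARY override:": "env_override",
    "using source checkout sgclaw inferred from skillsDir:": "source_checkout",
    "using bundled sgclaw from browser output dir:": "bundled",
}


def binary_source(log_lines: Iterable[str]) -> Optional[str]:
    """Return the label for the first resolver message found, or None."""
    for line in log_lines:
        for prefix, source in PREFIXES.items():
            if prefix in line:
                return source
    return None
```

In dev mode the smoke test would assert that `binary_source(...)` equals `"source_checkout"`.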
**Step 3: Run the final business acceptance**
Ask sgClaw to:
1. read Zhihu hotlist
2. export Excel
3. open the screen presentation in a new tab
Verify:
- planner appears first
- skills are actually used
- exported file path is returned
- new-tab presentation opens
**Step 4: Record the result**
Append the final evidence to:
- `/home/zyl/projects/sgClaw/claw/docs/acceptance/2026-03-29-zhihu-hotlist-excel.md`
**Step 5: Commit**
```bash
git -C /home/zyl/projects/sgClaw/claw add docs/acceptance/2026-03-29-zhihu-hotlist-excel.md
git -C /home/zyl/projects/sgClaw/claw commit -m "chore: record final sgclaw superrpa runtime verification"
```
## Remaining Items Explicitly Carried Into This Plan
- The current stale-backend risk is not considered closed until the resolver has automated regression coverage.
- The current local edit in `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui_mainline_unittest.cc` must be either completed under Task 2 or replaced with the final tested version.
- The current wrapper script `/home/zyl/projects/sgClaw/claw/tools/browser_runtime/sgclaw_browser_entry.sh` is still untracked and must be committed as part of Task 2.
- The Zhihu hotlist to Excel acceptance still has unresolved correctness and export-path gaps and remains part of the critical path.
- The repeated assistant text regression remains part of the critical path because it corrupts operator trust during demos.
Plan complete and saved to `docs/plans/2026-03-29-sgclaw-superrpa-decoupled-runtime-plan.md`. Two execution options:
**1. Subagent-Driven (this session)** - I dispatch a fresh subagent per task and review between tasks, for fast iteration
**2. Parallel Session (separate)** - Open a new session with executing-plans for batch execution with checkpoints
**Which approach?**

@@ -0,0 +1,137 @@
# sgClaw SuperRPA Runtime Config Design
**Status**: Draft frozen before implementation
**Date**: 2026-03-29
## 1. Goal
Freeze the runtime-boundary design before further implementation so Task 2+ do not drift back into browser-compiled behavior.
The design line is fixed:
- `host` keeps the security boundary.
- sgClaw keeps runtime behavior.
- `frontend bundle` keeps display rights only.
- High-frequency changes move to runtime-managed files whenever possible.
## 2. Ownership Split
### 2.1 host
SuperRPA as `host` owns only the trusted boundary:
- process spawning
- pipe lifecycle and session security
- browser / office capability exposure
- path validation for runtime-managed files
- fallback to bundled defaults when external files are missing or unsafe
`host` does not own planner policy, model routing, provider selection, skill orchestration, or business behavior.
### 2.2 sgClaw runtime
sgClaw owns runtime behavior:
- planner / executor orchestration
- provider list and active provider selection
- skill loading and prompt mode
- browser / office backend selection
- runtime profile behavior
- planner-first execution sequencing
### 2.3 frontend bundle
`frontend bundle` owns presentation only:
- render runtime state, logs, and conversation
- render planner output before execution
- collect user input and forward it through host events
`frontend bundle` must not:
- decide whether planner runs
- directly select provider/backend outside runtime contract
- bypass sgClaw / zeroclaw execution
## 3. Runtime-Managed Files
### 3.1 launch config
Owned by `host`, preferred path:
```text
<profile>/superrpa/sgclaw_launch_config.json
```
Fields:
- `binary`
- `args`
- `env`
- `working_dir`
- `runtime_config_path`
- `frontend_bundle_dir`
Fallback rules:
1. Prefer external `launch config`
2. Fall back to bundled browser defaults when file is missing or invalid
3. Fall back to profile-local `runtime config` if `runtime_config_path` is absent
4. Fall back to bundled frontend resources if `frontend_bundle_dir` is absent or invalid
### 3.2 runtime config
Owned by sgClaw, current default path:
```text
<profile>/superrpa/sgclaw_config.json
```
This file should carry runtime behavior instead of browser compile-time constants, including:
- planner mode
- providers and active provider
- browser backend
- office backend
- skills prompt mode
- runtime profile
### 3.3 frontend bundle
Owned by `host` for loading, but externally replaceable at runtime:
- prefer `frontend_bundle_dir`
- validate path and allowed loading rules
- fall back to bundled resources if invalid
## 4. Planner-First Rule
`planner-first` is a runtime contract, not a frontend trick.
The sequence must be:
1. sgClaw / zeroclaw produces a plan
2. `frontend bundle` displays the plan
3. runtime continues into execution
4. acceptance verifies both visible plan rendering and actual execution ordering
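The ordering half of the acceptance check can be sketched as a predicate over the runtime event stream. The event shape and the `plan` / `tool_call` kind names are hypothetical; the real event names come from the sgClaw runtime contract, which this document does not spell out.

```python
from typing import Iterable, Mapping


def plan_precedes_execution(events: Iterable[Mapping[str, str]]) -> bool:
    """True when a plan event exists, execution exists, and the plan comes first."""
    first_plan = None
    first_exec = None
    for index, event in enumerate(events):
        kind = event.get("kind")
        if kind == "plan" and first_plan is None:
            first_plan = index
        elif kind == "tool_call" and first_exec is None:
            first_exec = index
    if first_plan is None or first_exec is None:
        return False  # acceptance needs both a visible plan and real execution
    return first_plan < first_exec
```

This covers only the functional ordering; whether the plan was actually rendered in the `frontend bundle` still needs a visual or DOM-level check.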
## 5. Failing Checklist
The following questions remain intentionally unresolved at design-freeze time and must be closed by implementation plus verification:
- [ ] Can browser startup switch sgClaw binary without rebuilding Chromium?
- [ ] Can model/provider selection change without rebuilding Chromium?
- [ ] Can floating UI be replaced without rebuilding Chromium?
- [ ] Can acceptance flows prove planner-first behavior visually and functionally?
## 6. Terminology Guardrail
All related docs and code reviews must use the same terms:
- `host`
- `launch config`
- `runtime config`
- `frontend bundle`
- `planner-first`
Any proposal that moves planner or executor logic back into browser-side presentation code is out of bounds for this design.

@@ -0,0 +1,451 @@
# SGClaw ZeroClaw Planner-First Realignment Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Realign the browser submit path so `sgclaw` uses `zeroclaw` as the primary planner/executor, with `sgclaw` acting only as the secure SuperRPA host plus custom tool bridge.
**Architecture:** Stop treating `zeroclaw` as a thin LLM wrapper. The browser message path should enter a `zeroclaw`-native orchestration entry point first, let `zeroclaw` perform planning/tool-loop control, and expose SuperRPA-specific browser/office/screen capabilities as regular tools inside that runtime. Any deterministic fast paths for Zhihu/Office must be implemented as `zeroclaw`-aligned execution components, not as frontend-owned control flow. The frontend may display the generated plan and current stage for UX, but it must not own planning or execution decisions.
**Tech Stack:** Rust, `sgclaw` compat bridge, `third_party/zeroclaw` agent loop, SuperRPA browser pipe, local skill library, OpenXML office export, HTML screen export, cargo tests, Python live acceptance.
### Task 1: Freeze The Current Architecture Gap With Characterization Tests
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/agent/loop_.rs`
**Step 1: Write the failing test**
Add a test that submits `读取知乎热榜前10并导出 excel 文件` ("read the top 10 of the Zhihu hot list and export an Excel file") through `handle_browser_message_with_context(...)` and asserts the browser submit path does **not** terminate inside the current thin `Agent::turn_streamed(...)` compat bridge.
The test should check for one of these observable signals:
- a new orchestration mode log such as `zeroclaw_process_message_primary`
- absence of the old `compat_llm_primary` mode log
- absence of selector-thrashing logs like repeated `getText .HotList-item`, `[data-hot-item]`, `ol li`
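The three signals can be expressed as simple predicates over captured log lines. This is a Python illustration of the idea, not the Rust test itself. The `getText <selector>` log format is inferred from the examples above, and treating "more than two distinct selectors" as thrashing is an arbitrary threshold chosen for the sketch.

```python
import re
from typing import Iterable, List

# ASSUMPTION: selector probes are logged as "getText <selector>".
GET_TEXT = re.compile(r"getText\s+(.+?)\s*$")


def selectors_tried(log_lines: Iterable[str]) -> List[str]:
    """Collect distinct selectors probed via getText, in first-seen order."""
    seen = []
    for line in log_lines:
        match = GET_TEXT.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


def looks_like_selector_thrashing(log_lines: Iterable[str], max_distinct: int = 2) -> bool:
    """Flag runs that probe more distinct selectors than the (arbitrary) limit."""
    return len(selectors_tried(log_lines)) > max_distinct


def uses_zeroclaw_primary(log_lines: Iterable[str]) -> bool:
    """True when the new mode log is present and the old one is absent."""
    lines = list(log_lines)
    has_new = any("zeroclaw_process_message_primary" in line for line in lines)
    has_old = any("compat_llm_primary" in line for line in lines)
    return has_new and not has_old
```

Counting distinct selectors catches the model wandering between guesses; repeated retries of the same selector would need a separate count.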
**Step 2: Run test to verify it fails**
Run:
```bash
cargo test --test compat_runtime_test browser_submit_path_prefers_zeroclaw_process_message_orchestrator -- --nocapture
```
Expected: FAIL because the current implementation still enters `src/compat/runtime.rs` and drives `agent.turn_streamed(...)` directly.
**Step 3: Write the smallest additional characterization test**
Add a second failing test that proves SuperRPA-specific tools remain available after the orchestration switch:
- browser host tool
- `openxml_office`
- `screen_html_export`
This test should not require real network calls.
**Step 4: Run both failing tests**
Run:
```bash
cargo test --test compat_runtime_test -- --nocapture
```
Expected: at least the new characterization tests fail for the expected reason.
**Step 5: Commit**
```bash
git add tests/compat_runtime_test.rs
git commit -m "test: characterize browser path bypass of zeroclaw orchestrator"
```
### Task 2: Introduce A ZeroClaw-Native Browser Orchestration Entry Point
**Files:**
- Create: `/home/zyl/projects/sgClaw/claw/src/compat/orchestration.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/agent/loop_.rs:4752`
**Step 1: Write the failing unit test for the new entry point**
Add a test for a new helper in `src/compat/orchestration.rs` that:
- receives browser task context
- builds a `zeroclaw` config
- returns a browser-safe orchestration handle or result
The test should prove the new helper is chosen by `handle_browser_message_with_context(...)`.
**Step 2: Run the new test to verify it fails**
Run:
```bash
cargo test --test compat_runtime_test browser_submit_path_prefers_zeroclaw_process_message_orchestrator -- --nocapture
```
Expected: FAIL because the helper does not exist yet.
**Step 3: Implement the minimal entry point**
Create `src/compat/orchestration.rs` with one responsibility:
- bridge browser submit tasks into a `zeroclaw`-native orchestration path
Do not implement Zhihu-specific logic here. This layer must only:
- map config
- map task context/history
- inject SuperRPA tools
- call the chosen `zeroclaw` orchestration function
**Step 4: Switch `handle_browser_message_with_context(...)` to the new entry point**
Modify:
- `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
Replace the direct `compat::runtime::execute_task_with_sgclaw_settings(...)` primary path with the new orchestration bridge.
**Step 5: Run the test to verify it passes**
Run:
```bash
cargo test --test compat_runtime_test browser_submit_path_prefers_zeroclaw_process_message_orchestrator -- --nocapture
```
Expected: PASS.
**Step 6: Commit**
```bash
git add src/compat/orchestration.rs src/compat/mod.rs src/agent/mod.rs src/compat/runtime.rs tests/compat_runtime_test.rs
git commit -m "refactor: route browser submit flow through zeroclaw orchestration bridge"
```
### Task 3: Register SuperRPA Browser/Office/Screen Capabilities As Native ZeroClaw Tools
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/browser_tool_adapter.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/openxml_office_tool.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/screen_html_export_tool.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/engine.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/orchestration.rs`
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_openxml_office_tool_test.rs`
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_screen_html_export_tool_test.rs`
**Step 1: Write the failing tool-registration test**
Add a test that asserts the `zeroclaw` orchestration path exposes:
- the preferred SuperRPA browser tool
- `openxml_office` when Excel export is requested
- `screen_html_export` when screen export is requested
The test must verify this through the new orchestration path, not the old compat path.
**Step 2: Run the test to verify it fails**
Run:
```bash
cargo test --test compat_runtime_test browser_orchestration_registers_superrpa_tools_natively -- --nocapture
```
Expected: FAIL until tool wiring is complete.
**Step 3: Implement minimal native tool registration**
Ensure the new orchestration bridge injects `sgclaw` tools into the `zeroclaw` runtime without changing frontend code. Keep tool naming stable:
- `superrpa_browser`
- `openxml_office`
- `screen_html_export`
**Step 4: Verify tool-level tests still pass**
Run:
```bash
cargo test --test compat_openxml_office_tool_test -- --nocapture
cargo test --test compat_screen_html_export_tool_test -- --nocapture
```
Expected: PASS.
**Step 5: Run the new orchestration registration test**
Run:
```bash
cargo test --test compat_runtime_test browser_orchestration_registers_superrpa_tools_natively -- --nocapture
```
Expected: PASS.
**Step 6: Commit**
```bash
git add src/compat/browser_tool_adapter.rs src/compat/openxml_office_tool.rs src/compat/screen_html_export_tool.rs src/runtime/engine.rs src/compat/orchestration.rs tests/compat_runtime_test.rs tests/compat_openxml_office_tool_test.rs tests/compat_screen_html_export_tool_test.rs
git commit -m "feat: expose superrpa browser and export tools through zeroclaw orchestration"
```
### Task 4: Remove Frontend-Owned Or Custom Compat Mainline Control Flow
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/skill_runner.rs`
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
- Reference only: `/home/zyl/projects/sgClaw/claw/docs/plans/2026-03-29-sgclaw-zeroclaw-planner-first-execution-plan.md`
**Step 1: Write the failing regression test**
Add a test that proves Zhihu hotlist export no longer depends on a frontend-owned mainline such as:
- `compat_skill_runner_primary`
- direct `sgclaw`-local branching before `zeroclaw`
The expected primary mode should be a `zeroclaw`-owned orchestration mode.
**Step 2: Run the regression test to verify it fails**
Run:
```bash
cargo test --test compat_runtime_test zhihu_export_does_not_use_frontend_owned_mainline -- --nocapture
```
Expected: FAIL while `src/compat/skill_runner.rs` still owns primary control flow.
**Step 3: Remove or demote the custom mainline**
Change the code so:
- `src/compat/skill_runner.rs` becomes either a helper invoked inside the `zeroclaw` tool/runtime ecosystem, or is removed if redundant
- `src/agent/mod.rs` no longer branches to a custom primary executor before `zeroclaw`
Do not leave two competing primary modes.
**Step 4: Run the regression test**
Run:
```bash
cargo test --test compat_runtime_test zhihu_export_does_not_use_frontend_owned_mainline -- --nocapture
```
Expected: PASS.
**Step 5: Run the broader compat suite**
Run:
```bash
cargo test --test compat_runtime_test -- --nocapture
```
Expected: PASS.
**Step 6: Commit**
```bash
git add src/compat/runtime.rs src/agent/mod.rs src/compat/skill_runner.rs tests/compat_runtime_test.rs
git commit -m "refactor: remove frontend-owned primary control flow from browser submit path"
```
### Task 5: Align Skills With ZeroClaw Execution Semantics Instead Of Prompt-Only Semantics
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/engine.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/tools/read_skill.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/tests/read_skill_tool_test.rs`
- Reference only: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md`
- Reference only: `/home/zyl/projects/sgClaw/skill_lib/skills/office-export-xlsx/SKILL.md`
- Reference only: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist-screen/SKILL.md`
**Step 1: Write the failing skill-execution regression test**
Add a test that proves skill usage in the browser submit path is not just:
- prompt injection
- `read_skill` text stuffing
- model-led selector wandering
Instead, the test should verify the task produces:
- a plan-driven collection/execution flow
- a real `.xlsx` or `.html` artifact path
- no selector-thrashing loop
**Step 2: Run the test to verify it fails**
Run:
```bash
cargo test --test compat_runtime_test browser_skill_usage_is_execution_not_prompt_only -- --nocapture
```
Expected: FAIL until skill semantics are aligned with execution.
**Step 3: Implement the minimal alignment**
Change the orchestration so `read_skill` is a fallback for missing context, not the primary means of making high-frequency browser workflows executable.
Keep:
- skill discovery
- skill references
- artifact contract wording
Reduce:
- over-reliance on prompt stuffing
- over-reliance on model-led selector discovery for known workflows
**Step 4: Re-run the skill regression tests**
Run:
```bash
cargo test --test compat_runtime_test browser_skill_usage_is_execution_not_prompt_only -- --nocapture
cargo test --test read_skill_tool_test -- --nocapture
```
Expected: PASS.
**Step 5: Commit**
```bash
git add src/runtime/engine.rs src/compat/runtime.rs third_party/zeroclaw/src/tools/read_skill.rs tests/compat_runtime_test.rs tests/read_skill_tool_test.rs
git commit -m "refactor: align browser skill execution with zeroclaw-native workflow semantics"
```
### Task 6: Verify The Planner-First Path End-To-End
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/docs/acceptance/2026-03-29-zhihu-hotlist-excel.md`
- Test: `/home/zyl/projects/sgClaw/claw/tests/runtime_profile_test.rs`
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_config_test.rs`
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
- Test: `/home/zyl/projects/sgClaw/claw/tests/live_acceptance_score_test.py`
- Reference only: `/home/zyl/projects/superRpa/src/out/KylinRelease/sgclaw`
**Step 1: Run the Rust regression suites**
Run:
```bash
cargo test --test runtime_profile_test -- --nocapture
cargo test --test compat_config_test -- --nocapture
cargo test --test compat_runtime_test -- --nocapture
cargo test --test read_skill_tool_test -- --nocapture
```
Expected: PASS.
**Step 2: Run the Python scoring test**
Run:
```bash
python3 -m unittest tests/live_acceptance_score_test.py
```
Expected: PASS.
**Step 3: Run the live Zhihu hotlist Excel acceptance**
Run:
```bash
python3 tools/live_acceptance/run_zhihu_hotlist_excel_acceptance.py
```
Expected:
- total score returns to `100`
- logs show planner-first `zeroclaw` orchestration instead of selector-thrashing
- no `shell`, `web_fetch`, or `web_search_tool` calls appear in the logs
- final summary includes a real `.xlsx` path
**Step 4: Update the acceptance note**
Record:
- new orchestration mode
- tool sequence
- timing notes
- any remaining selector or latency risk
**Step 5: Rebuild and sync the runtime binary used by SuperRPA**
Run:
```bash
cargo build
cp /home/zyl/projects/sgClaw/claw/target/debug/sgclaw /home/zyl/projects/superRpa/src/out/KylinRelease/sgclaw
sha256sum /home/zyl/projects/sgClaw/claw/target/debug/sgclaw /home/zyl/projects/superRpa/src/out/KylinRelease/sgclaw
```
Expected: the two hashes match exactly.
**Step 6: Commit**
```bash
git add docs/acceptance/2026-03-29-zhihu-hotlist-excel.md tests/runtime_profile_test.rs tests/compat_config_test.rs tests/compat_runtime_test.rs tests/live_acceptance_score_test.py
git commit -m "test: verify planner-first zeroclaw browser orchestration end to end"
```
### Task 7: Surface The Generated Plan In The Chat UI Without Giving Frontend Control
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/event_bridge.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/pipe/protocol.rs`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/` (the active sgClaw chat UI files that render task progress)
- Test: `/home/zyl/projects/sgClaw/claw/tests/pipe_protocol_test.rs`
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
**Step 1: Write the failing protocol/UI test**
Add a test that proves the backend can emit a structured planning event before tool execution starts. The event must carry:
- a short plan title
- a flat ordered step list
- current phase such as `planning`, `executing`, `completed`
The frontend test or fixture should verify the chat can render the plan summary without waiting for final completion.
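A minimal sketch of what such a planning event could look like is shown below. The type names `PlanEvent` and `PlanPhase` and the `render` helper are assumptions for illustration only; the real shape belongs in `src/pipe/protocol.rs` and may differ.

```rust
use std::fmt;

/// Execution phase shown in the chat UI. The three phase names come from
/// the plan text; the enum itself is a hypothetical type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlanPhase {
    Planning,
    Executing,
    Completed,
}

impl fmt::Display for PlanPhase {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            PlanPhase::Planning => "planning",
            PlanPhase::Executing => "executing",
            PlanPhase::Completed => "completed",
        };
        f.write_str(name)
    }
}

/// Hypothetical read-only plan-progress event emitted by the backend.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PlanEvent {
    /// Short plan title.
    pub title: String,
    /// Flat, ordered step list (no nesting).
    pub steps: Vec<String>,
    /// Current phase.
    pub phase: PlanPhase,
}

impl PlanEvent {
    /// Render a compact numbered summary that a UI could display as-is.
    pub fn render(&self) -> String {
        let mut out = format!("[{}] {}", self.phase, self.title);
        for (i, step) in self.steps.iter().enumerate() {
            out.push_str(&format!("\n{}. {}", i + 1, step));
        }
        out
    }
}
```

Keeping the step list flat and the event free of any callback or command field is one way to make the "display only" rule hard to violate by accident.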
**Step 2: Run test to verify it fails**
Run:
```bash
cargo test --test pipe_protocol_test -- --nocapture
cargo test --test compat_runtime_test plan_events_are_emitted_before_browser_execution -- --nocapture
```
Expected: FAIL because the protocol does not yet expose a dedicated plan-progress event.
**Step 3: Add the minimal backend event shape**
Extend the `sgclaw` pipe/event bridge so the orchestration layer can emit:
- planner summary
- execution stage transitions
Keep the event read-only from the frontend's perspective. The UI may display it, but cannot edit it or branch execution.
**Step 4: Render the plan in the active chat UI**
Update the SuperRPA sgClaw chat UI so it:
- prints the generated plan immediately after planning completes
- keeps the plan compact and collapsible
- highlights the current phase while waiting
Do not add frontend-owned retry logic, decision logic, or browser action generation.
**Step 5: Run verification**
Run:
```bash
cargo test --test pipe_protocol_test -- --nocapture
cargo test --test compat_runtime_test -- --nocapture
```
Expected: PASS.
**Step 6: Manual browser validation**
Submit:
```text
读取知乎热榜前10并导出 excel 文件
```
Expected:
- the chat first shows a short generated plan
- the user sees stage transitions instead of a blank wait
- execution still follows the backend-owned `zeroclaw` path
**Step 7: Commit**
```bash
git add src/compat/event_bridge.rs src/pipe/protocol.rs tests/pipe_protocol_test.rs tests/compat_runtime_test.rs
git commit -m "feat: surface backend-generated execution plans in sgclaw chat ui"
```

@@ -0,0 +1,444 @@
# Zhihu Hotlist To Excel Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Make sgClaw reliably read Zhihu hotlist data through a Zhihu browser skill and export the collected structured result into a local `.xlsx` file through an independent Office skill.
**Architecture:** Keep zeroclaw as the core planner, but stop it from wandering across unrelated tools once a browser-attached skill is selected. The hotlist skill must produce a strict structured artifact, and the Office skill must consume that artifact through a dedicated `openxml_office` tool that wraps the sibling `openxml_cli` project. For the first delivery, reuse `openxml_cli template render` with a bundled `.xlsx` template instead of inventing a new workbook-construction API.
**Tech Stack:** Rust, vendored zeroclaw, sgClaw browser pipe, skill packages under `/home/zyl/projects/sgClaw/skill_lib`, sibling `openxml_cli`, JSON payload handoff, `.xlsx` template render, Python/Rust regression tests, real-provider smoke verification.
## Scope Guard
- In scope:
  - browser-attached skill execution discipline
  - `zhihu-hotlist` structured export artifact
  - new `office-export-xlsx` skill
  - new `openxml_office` runtime tool
  - end-to-end acceptance for "读取知乎热榜数据,并导出 excel 文件"
- Out of scope:
  - generic Office authoring platform
  - arbitrary shell-based export flows
  - browser-side file generation as the main export path
  - broad multi-site data export before Zhihu hotlist is stable
## Current Findings To Preserve
- Real-provider validation already proved that `zhihu-hotlist`, `zhihu-navigate`, and `zhihu-write` can be selected through `read_skill`.
- The current failure mode is not "skill missing" but "tool discipline collapse":
  - `file_read`, `glob_search`, and `shell` are attempted after `read_skill`
  - `zhihu-write` can fill title/body but still exceeds max tool iterations
  - `zhihu-navigate` succeeds for some intents but still detours through non-browser tools
- The sibling Office project already exists at `/home/zyl/projects/sgClaw/openxml_cli`.
- `openxml_cli` currently exposes `capabilities`, `template inspect`, `template validate`, and `template render`; it does not yet expose a direct "create workbook from scratch" command.
## Final Acceptance Contract
Input:
```text
读取知乎热榜数据,并导出 excel 文件
```
Required behavior:
1. sgClaw selects `zhihu-hotlist`.
2. sgClaw gathers hotlist rows through the SuperRPA browser interface only.
3. sgClaw converts the result into a structured JSON export payload.
4. sgClaw selects `office-export-xlsx`.
5. sgClaw calls `openxml_office`.
6. A local `.xlsx` file is produced and its path is returned.
Required logs:
- `read_skill zhihu-hotlist`
- browser actions only: `navigate`, `getText`, optionally `click`
- `read_skill office-export-xlsx`
- `call openxml_office`
Forbidden logs during the mainline path:
- `call shell`
- `call glob_search`
- `call file_read` on skill references or skill roots
- `docker run`
Required Excel content:
- one sheet named `知乎热榜`
- columns: `rank`, `title`, `heat`
- at least 10 hotlist rows
- exported values match the collected rows
## Task 1: Lock Browser-Attached Skill Runs To The Right Tools
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/engine.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/tool_policy.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
**Intent:**
- Once the task is clearly in a browser-attached Zhihu skill flow, the runtime must stop offering unrelated tools such as `shell`, `glob_search`, and arbitrary `file_read`.
**Step 1: Write the failing regression tests**
Add focused tests in `tests/compat_runtime_test.rs` for:
```rust
#[test]
fn zhihu_hotlist_skill_flow_does_not_expose_shell_or_glob_tools() {}
#[test]
fn browser_attached_export_flow_exposes_browser_and_office_tools_only() {}
```
Assertions to include:
- request tool list contains `superrpa_browser`
- request tool list contains `read_skill`
- request tool list does not contain `shell`
- request tool list does not contain `glob_search`
- request tool list does not contain generic `file_read` during the constrained browser skill phase
**Step 2: Run the focused tests to verify failure**
Run:
```bash
cargo test --test compat_runtime_test zhihu_hotlist_skill_flow_does_not_expose_shell_or_glob_tools -- --nocapture
cargo test --test compat_runtime_test browser_attached_export_flow_exposes_browser_and_office_tools_only -- --nocapture
```
Expected:
- fail because current runtime still exposes too many tools in browser-attached mode
**Step 3: Implement minimal constrained-tool policy**
Implement a browser-skill execution mode that:
- keeps `superrpa_browser`
- keeps compatibility alias `browser_action`
- keeps `read_skill`
- optionally keeps the new `openxml_office` tool only for export tasks
- removes `shell`, `glob_search`, and free-form `file_read` from the allowed tool list for these phases
Do this in `src/runtime/engine.rs` by deriving a narrower `allowed_tools` set from:
- runtime profile
- browser surface present flag
- instruction intent
- whether export mode is active
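The derivation above can be sketched as a pure function. This is an illustration under assumptions, not the code in `src/runtime/engine.rs`: `ToolPolicyInput` and `allowed_tools` are hypothetical names, and the runtime profile input is left out because its variants are not described in this plan.

```rust
use std::collections::BTreeSet;

/// Hypothetical inputs for deriving the tool set (runtime profile omitted).
pub struct ToolPolicyInput {
    pub browser_surface_present: bool,
    pub browser_skill_intent: bool,
    pub export_mode: bool,
}

/// Return the subset of `all_tools` that may be offered to the model.
pub fn allowed_tools(all_tools: &[&str], input: &ToolPolicyInput) -> BTreeSet<String> {
    // Outside the constrained browser-skill phase, leave the list untouched.
    if !(input.browser_surface_present && input.browser_skill_intent) {
        return all_tools.iter().map(|t| t.to_string()).collect();
    }

    // Tools the plan says to keep during a browser-attached skill flow.
    let mut keep: Vec<&str> = vec!["superrpa_browser", "browser_action", "read_skill"];
    if input.export_mode {
        // The Office tool is only offered for export tasks.
        keep.push("openxml_office");
    }

    // Everything else (shell, glob_search, file_read, ...) is dropped.
    all_tools
        .iter()
        .filter(|t| keep.iter().any(|k| *k == **t))
        .map(|t| t.to_string())
        .collect()
}
```

Filtering against a keep-list instead of removing named tools means a tool added to the runtime later stays hidden in this phase until someone opts it in.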
**Step 4: Re-run the focused tests**
Run the same commands.
Expected:
- both pass
## Task 2: Convert Zhihu Hotlist Skill To Structured Output First
**Files:**
- Modify: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md`
- Modify: `/home/zyl/projects/sgClaw/claw/tests/skill_lib_validation_test.py`
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
**Intent:**
- The hotlist skill should stop ending with prose-only summaries. Its primary output must be a stable export artifact the Office skill can consume.
**Step 1: Write the failing tests**
Add tests that enforce:
- `zhihu-hotlist` prompt body contains an explicit `Export Artifact` section
- the artifact schema includes `sheet_name`, `columns`, and `rows`
- runtime regression checks can find those fields in the skill content when `read_skill` is used
**Step 2: Run tests to verify failure**
Run:
```bash
python3 -m unittest tests.skill_lib_validation_test
cargo test --test compat_runtime_test handle_browser_message_executes_real_zhihu_hotlist_skill_flow -- --nocapture
```
Expected:
- validation fails because the artifact contract is not yet required
**Step 3: Update `zhihu-hotlist`**
Add an `Export Artifact` section that requires this shape:
```json
{
  "source": "https://www.zhihu.com/hot",
  "sheet_name": "知乎热榜",
  "columns": ["rank", "title", "heat"],
  "rows": [[1, "标题", "344万"]]
}
```
Also add hard rules:
- no extra exploratory tools after the browser data is collected
- prose summary is secondary, structured artifact is primary
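As a rough illustration of how a consumer could check the artifact shape before handing it to the Office skill, the sketch below models the four fields and validates them. `ExportArtifact`, `Cell`, and `validate` are hypothetical names, and the rule that every row must have as many cells as there are columns is an assumption inferred from the sample.

```rust
/// A cell is either a number (such as `rank`) or text (such as `title`, `heat`).
/// Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq)]
pub enum Cell {
    Int(i64),
    Text(String),
}

/// In-memory form of the `Export Artifact` JSON shape.
pub struct ExportArtifact {
    pub source: String,
    pub sheet_name: String,
    pub columns: Vec<String>,
    pub rows: Vec<Vec<Cell>>,
}

impl ExportArtifact {
    /// Reject artifacts the Office skill could not render.
    pub fn validate(&self) -> Result<(), String> {
        if self.sheet_name.trim().is_empty() {
            return Err("sheet_name is empty".to_string());
        }
        if self.columns.is_empty() {
            return Err("columns is empty".to_string());
        }
        if self.rows.is_empty() {
            return Err("rows is empty".to_string());
        }
        // Assumed rule: each row lines up with the declared columns.
        for (i, row) in self.rows.iter().enumerate() {
            if row.len() != self.columns.len() {
                return Err(format!(
                    "row {} has {} cells, expected {}",
                    i,
                    row.len(),
                    self.columns.len()
                ));
            }
        }
        Ok(())
    }
}
```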
**Step 4: Re-run tests**
Expected:
- validation passes
## Task 3: Create The Office Export Skill Package
**Files:**
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/office-export-xlsx/SKILL.md`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/office-export-xlsx/references/export-flow.md`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/office-export-xlsx/assets/zhihu_hotlist_template.xlsx`
- Modify: `/home/zyl/projects/sgClaw/claw/tests/skill_lib_validation_test.py`
**Intent:**
- Add a fully separate Office skill that knows nothing about browser scraping and only turns structured table data into a local Excel file.
**Step 1: Write the failing validation test**
Extend `tests/skill_lib_validation_test.py` so discovery expects:
```python
EXPECTED_SKILL_NAMES = [
    "office-export-xlsx",
    "zhihu-hotlist",
    "zhihu-navigate",
    "zhihu-write",
]
```
Also require the new skill to mention:
- `openxml_office`
- `.xlsx`
- `sheet_name`
- `columns`
- `rows`
**Step 2: Run the validation test to verify failure**
Run:
```bash
python3 -m unittest tests.skill_lib_validation_test
```
Expected:
- fail because the new skill package does not exist yet
**Step 3: Create the skill package**
`SKILL.md` must define:
- when to use: local Office export from structured rows
- required input schema
- output: exported file path
- tool rule: only call `openxml_office`, do not use browser tools
`export-flow.md` must define:
- validate payload shape
- choose output path
- invoke `openxml_office`
- return file path and row count
The first workbook template should be a fixed `zhihu_hotlist_template.xlsx` with:
- sheet `知乎热榜`
- row 1 headers already present
- table fill anchored to a stable name or placeholder expected by `openxml_cli`
**Step 4: Re-run validation**
Expected:
- new skill passes audit
## Task 4: Add The `openxml_office` Runtime Tool
**Files:**
- Create: `/home/zyl/projects/sgClaw/claw/src/compat/openxml_office_tool.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/tool_policy.rs`
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_openxml_office_tool_test.rs`
**Intent:**
- Wrap sibling `openxml_cli` as a first-class local tool instead of leaking Office export through shell prompting.
**Step 1: Write the failing tool test**
Create `tests/compat_openxml_office_tool_test.rs` with cases for:
- capability probe
- render request assembly for xlsx export
- rejection when rows/columns are missing
- stable JSON output containing `output_path`
**Step 2: Run the test to verify failure**
Run:
```bash
cargo test --test compat_openxml_office_tool_test -- --nocapture
```
Expected:
- fail because the tool does not exist
**Step 3: Implement minimal tool**
Tool contract:
```json
{
  "action": "export_hotlist_xlsx",
  "template_path": ".../zhihu_hotlist_template.xlsx",
  "output_path": "/tmp/zhihu_hotlist.xlsx",
  "sheet_name": "知乎热榜",
  "columns": ["rank", "title", "heat"],
  "rows": [[1, "标题", "344万"]]
}
```
Implementation rules:
- write the payload JSON to a temp file
- invoke sibling `openxml_cli template render --request <file> --json`
- return parsed JSON result and normalized `output_path`
- no free-form shell composition from model text
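The "no free-form shell composition" rule can be illustrated with a fixed argument vector built through `std::process::Command`, which passes arguments directly to the process without a shell. The helper name `build_render_command` is hypothetical, writing the payload to the temp file is omitted, and the sketch does not run the CLI; only the subcommand and flags quoted above are used.

```rust
use std::path::Path;
use std::process::Command;

/// Build the `openxml_cli template render --request <file> --json` invocation.
/// Hypothetical helper: the model never supplies any part of the command line,
/// only the payload that was written to `request_file` beforehand.
pub fn build_render_command(cli: &Path, request_file: &Path) -> Command {
    let mut cmd = Command::new(cli);
    cmd.arg("template")
        .arg("render")
        .arg("--request")
        .arg(request_file) // passed as one argument, never interpolated into a shell string
        .arg("--json");
    cmd
}
```

A caller would then run the command with `output()`, parse stdout as JSON, and normalize `output_path` before returning it to the runtime.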
**Step 4: Re-run the focused tests**
Expected:
- pass
## Task 5: Wire Export Tasks To Use Two Skills In Sequence
**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/engine.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
**Intent:**
- The single user instruction must naturally flow from hotlist capture into Office export, not end after the first skill.
**Step 1: Write the failing runtime test**
Add a focused regression test for:
```rust
#[test]
fn zhihu_hotlist_export_task_reads_hotlist_skill_then_office_skill() {}
```
Assertions:
- request stream includes `read_skill zhihu-hotlist`
- later includes `read_skill office-export-xlsx`
- office phase exposes `openxml_office`
- no `shell` is exposed in the constrained task path
**Step 2: Run the test to verify failure**
Run:
```bash
cargo test --test compat_runtime_test zhihu_hotlist_export_task_reads_hotlist_skill_then_office_skill -- --nocapture
```
Expected:
- fail because the task currently has no structured handoff to Office export
**Step 3: Implement minimal chaining support**
Do not add a hard-coded workflow engine.
Minimal implementation:
- strengthen prompt contract so export tasks require structured hotlist artifact
- include `openxml_office` in allowed tools for export intent
- keep browser-only tools for the collection phase and Office-only tool for the export phase
**Step 4: Re-run the test**
Expected:
- pass
## Task 6: Add Real Acceptance Harness And Scoring
**Files:**
- Create: `/home/zyl/projects/sgClaw/claw/tools/live_acceptance/run_zhihu_hotlist_excel_acceptance.py`
- Create: `/home/zyl/projects/sgClaw/claw/docs/acceptance/2026-03-29-zhihu-hotlist-excel.md`
**Intent:**
- Make the final acceptance repeatable with the real user config and a transparent score.
**Step 1: Write the script**
The script must:
- use `/home/zyl/.config/superrpa/Default/superrpa/sgclaw_config.json`
- boot local `target/debug/sgclaw`
- send one browser `submit_task`
- respond to browser commands with controlled fixture responses
- capture:
  - loaded skills
  - selected skills
  - forbidden tool calls
  - final summary
  - exported file path
**Step 2: Define score rubric**
Rubric:
- `skill selection`: 30
- `tool discipline`: 25
- `hotlist data correctness`: 20
- `xlsx export success`: 20
- `final response quality`: 5
Automatic deductions:
- `shell` called: `-15`
- `glob_search` called: `-10`
- `file_read` on skill references: `-10`
- wrong skill selected first: `-15`
- export missing output path: `-20`
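The arithmetic of the rubric can be sketched as follows. This is an illustration, not the acceptance script (which the plan writes in Python): the names `Earned`, `Violations`, and `total_score` are hypothetical, and two behaviours are assumptions because the rubric does not state them: each deduction applies once per run regardless of how many times the violation occurred, and the total is floored at 0.

```rust
/// Points earned per rubric dimension; each is capped at its rubric maximum.
pub struct Earned {
    pub skill_selection: u32,  // max 30
    pub tool_discipline: u32,  // max 25
    pub data_correctness: u32, // max 20
    pub export_success: u32,   // max 20
    pub response_quality: u32, // max 5
}

/// Violations that trigger automatic deductions.
pub struct Violations {
    pub shell_called: bool,
    pub glob_search_called: bool,
    pub file_read_on_skill_refs: bool,
    pub wrong_skill_first: bool,
    pub export_missing_output_path: bool,
}

pub fn total_score(e: &Earned, v: &Violations) -> u32 {
    let base = e.skill_selection.min(30)
        + e.tool_discipline.min(25)
        + e.data_correctness.min(20)
        + e.export_success.min(20)
        + e.response_quality.min(5);

    let mut deduction = 0;
    if v.shell_called { deduction += 15; }
    if v.glob_search_called { deduction += 10; }
    if v.file_read_on_skill_refs { deduction += 10; }
    if v.wrong_skill_first { deduction += 15; }
    if v.export_missing_output_path { deduction += 20; }

    // Assumed: the score never goes below zero.
    base.saturating_sub(deduction)
}
```

Under these assumptions, a run with full marks in every dimension that called `shell` scores 100 - 15 = 85, and one that also called `glob_search` scores 75.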
**Step 3: Run acceptance**
Run:
```bash
python3 tools/live_acceptance/run_zhihu_hotlist_excel_acceptance.py
```
Expected:
- prints total score and per-dimension breakdown
- stores final evidence in `docs/acceptance/2026-03-29-zhihu-hotlist-excel.md`
## Delivery Sequence
Execute in this order:
1. Task 1: constrain tools
2. Task 2: structure hotlist output
3. Task 3: add office skill package
4. Task 4: add `openxml_office`
5. Task 5: chain the two skills
6. Task 6: run acceptance and score
## Definition Of Done
- browser-attached hotlist tasks no longer wander into `shell`, `glob_search`, or ad-hoc `file_read`
- `office-export-xlsx` exists as an independent skill
- `openxml_office` exists as an explicit tool
- a single user task can collect hotlist data and export `.xlsx`
- acceptance score is at least `85/100`

@@ -0,0 +1,175 @@
# Browser Script Skill Capability Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Let zeroclaw-backed sgclaw skills call deterministic browser-context scripts through existing SuperRPA page-execution capabilities.
**Architecture:** Reuse SuperRPA's existing page script execution path instead of inventing a parallel workflow executor. Extend sgclaw's browser pipe and zeroclaw skill-tool mapping so a skill can declare a browser script tool, execute a packaged script file with parameters, and receive structured JSON back in the current page context.
**Tech Stack:** Rust (`sgclaw`, vendored `zeroclaw`), Chromium/SuperRPA C++, existing sgclaw browser pipe protocol, skill package `SKILL.toml` / `scripts/`.
### Task 1: Define the browser-script contract
**Files:**
- Modify: `third_party/zeroclaw/src/skills/mod.rs`
- Modify: `third_party/zeroclaw/src/tools/mod.rs`
- Create: `third_party/zeroclaw/src/tools/browser_script_skill.rs`
- Test: `tests/compat_browser_tool_test.rs`
**Step 1: Write the failing test**
Add a sgclaw-side test proving the browser tool schema accepts a script execution action with required browser-script fields and rejects missing script payload.
**Step 2: Run test to verify it fails**
Run: `cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_browser_tool_test`
Expected: FAIL because the browser tool still exposes only `click/type/navigate/getText`.
**Step 3: Write minimal implementation**
Extend the browser tool adapter and protocol-facing action parsing to support a deterministic script execution action carrying:
- `action`
- `expected_domain`
- `script`
- optional `args`
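A minimal sketch of the acceptance and rejection behaviour the test should prove is shown below. `BrowserScriptCall` and `build_script_call` are hypothetical names, the action name used in any example is not defined by this plan, and treating `script` as an opaque non-empty string and `args` as a list of strings are assumptions.

```rust
/// Hypothetical validated form of the browser script action.
#[derive(Debug, PartialEq, Eq)]
pub struct BrowserScriptCall {
    pub action: String,
    pub expected_domain: String,
    pub script: String,
    pub args: Vec<String>,
}

/// Validate the four fields listed in the contract.
/// `script` and `args` are optional on input so that a missing payload can be rejected.
pub fn build_script_call(
    action: &str,
    expected_domain: &str,
    script: Option<&str>,
    args: Option<Vec<String>>,
) -> Result<BrowserScriptCall, String> {
    if action.trim().is_empty() {
        return Err("missing action".to_string());
    }
    if expected_domain.trim().is_empty() {
        return Err("missing expected_domain".to_string());
    }
    // The script payload is required; an absent or blank value is rejected.
    let script = match script {
        Some(s) if !s.trim().is_empty() => s,
        _ => return Err("missing script payload".to_string()),
    };
    Ok(BrowserScriptCall {
        action: action.to_string(),
        expected_domain: expected_domain.to_string(),
        script: script.to_string(),
        // `args` is optional and defaults to an empty list.
        args: args.unwrap_or_default(),
    })
}
```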
**Step 4: Run test to verify it passes**
Run: `cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_browser_tool_test`
Expected: PASS for the new browser script action coverage.
**Step 5: Commit**
```bash
git add tests/compat_browser_tool_test.rs src/compat/browser_tool_adapter.rs src/pipe/protocol.rs third_party/zeroclaw/src/skills/mod.rs third_party/zeroclaw/src/tools/mod.rs third_party/zeroclaw/src/tools/browser_script_skill.rs
git commit -m "feat: add browser script skill tool support"
```
### Task 2: Wire browser script execution through the sgclaw host bridge
**Files:**
- Modify: `src/pipe/protocol.rs`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol.cc`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.cc`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol_mainline_unittest.cc`
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_process_host_mainline_unittest.cc`
**Step 1: Write the failing tests**
Add host-side tests proving:
- sgclaw protocol accepts the new browser script action
- handshake advertises the action
- the security gate allows it only through the sgclaw host allowlist
**Step 2: Run tests to verify they fail**
Run the relevant SuperRPA unit tests.
Expected: FAIL because the host protocol and allowlist do not yet include browser script execution.
**Step 3: Write minimal implementation**
Reuse existing SuperRPA router support by bridging the new sgclaw action to the host's page script execution capability instead of inventing a second execution path.
**Step 4: Run tests to verify they pass**
Run the same SuperRPA unit tests.
Expected: PASS with the new action accepted end-to-end.
**Step 5: Commit**
```bash
git add src/pipe/protocol.rs /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol.cc /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.cc /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol_mainline_unittest.cc /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_process_host_mainline_unittest.cc
git commit -m "feat: expose browser script execution to sgclaw"
```
### Task 3: Convert Zhihu hotlist collection into a script-backed skill path
**Files:**
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.toml`
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/scripts/extract_hotlist.js`
- Modify: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md`
- Modify: `tests/compat_runtime_test.rs`
**Step 1: Write the failing test**
Add runtime coverage proving the Zhihu hotlist export flow uses the skill-backed browser script tool instead of repeated `getText` probing.
**Step 2: Run test to verify it fails**
Run: `cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_runtime_test`
Expected: FAIL because the current flow still uses `read_skill` plus generic `getText`.
**Step 3: Write minimal implementation**
Package the hotlist extractor as a deterministic browser script tool returning structured rows and update the skill description to require that tool before export.
**Step 4: Run test to verify it passes**
Run: `cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_runtime_test`
Expected: PASS with the script-backed tool visible in logs and the generic repeated `getText` fallback removed from the primary path.
**Step 5: Commit**
```bash
git add /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.toml /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/scripts/extract_hotlist.js /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md tests/compat_runtime_test.rs
git commit -m "feat: make zhihu hotlist extraction script-backed"
```
### Task 4: Fix plan visibility in the sgclaw frontend
**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
- Test: frontend or host tests covering `[sgclaw/plan]`
**Step 1: Write the failing test**
Add coverage proving `[sgclaw/plan]` logs are treated as planner logs and preserve multiline task-step rendering.
**Step 2: Run test to verify it fails**
Run the relevant frontend or browser-side tests.
Expected: FAIL because the current frontend only recognizes `[browser/plan]` or `[plan]`.
**Step 3: Write minimal implementation**
Accept `[sgclaw/plan]` as a planner log source and keep the plan text rendered as multiple visible lines.
**Step 4: Run test to verify it passes**
Run the same tests.
Expected: PASS with sgclaw planner logs rendered correctly.
**Step 5: Commit**
```bash
git add /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts
git commit -m "fix: render sgclaw planner logs in chat ui"
```
### Task 5: Full verification
**Files:**
- Verify only
**Step 1: Run sgclaw Rust tests**
Run: `cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml`
Expected: PASS.
**Step 2: Run relevant SuperRPA unit tests**
Run the updated sgclaw host and frontend tests.
Expected: PASS.
**Step 3: Run a manual browser acceptance flow**
Verify that:
- startup logs include runtime and skill versions
- `[sgclaw/plan]` is shown as a real multi-line plan
- Zhihu hotlist export invokes the script-backed skill tool
- the flow returns a real `.xlsx` path without repeated self-referential output
**Step 4: Commit verification artifacts if needed**
Keep verification commits separate from behavior changes.

@@ -0,0 +1,314 @@
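# sgClaw Project: Current Strengths and Next-Step Plan (Briefing Draft)
## I. Current Strengths of the Project
Compared with traditional openclaw-style automation projects, this project's strength is not merely that it "can operate pages"; it already has the foundation to evolve toward enterprise-grade, long-term use. Put simply, the traditional approach is closer to "a collection of automation scripts that happen to run", while this project is closer to "an intelligent execution base with security boundaries, unified control, and room for long-term evolution".
Moreover, judging from the commits that have since landed, the project's strengths are no longer just an architecture that is "heading in the right direction"; they now show up as a set of delivered, concrete capabilities, including:
- Runtime policy has moved from hard-coded logic to configuration-driven behavior.
- A planner-first mode (plan first, then execute) is supported.
- Skill-package-driven browser script execution is supported.
- Task-specific Excel export and screen-display export are supported.
- Version-level logs, skill version logs, and real acceptance records are in place.
This means the project has moved from a "prototype base" into a stage that is verifiable, extensible, and deliverable.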
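### 1. From "script tool" to "unified execution base"
Traditional projects usually have one set of scripts per scenario and one set of rules per system. That solves the immediate problem but is hard to reuse, manage, and evolve. When a page changes, the script has to change; as the number of systems grows, maintenance cost rises quickly.
This project already places task intake, runtime control, browser execution, log reporting, and result reporting in one framework. Adding capabilities, extending scenarios, or swapping models and policies no longer requires starting over; each is an incremental enhancement on the same base. For the business side, the project's value no longer stops at "getting one process to work"; it accumulates into shared capability that can support more business scenarios.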
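### 2. More complete security design, better suited to enterprise intranets
This is the project's most prominent strength. The biggest problem with traditional automation is that once it gains page-operation capability, it easily becomes "the script clicks whatever it wants": security boundaries are unclear and risk is hard to control. This project no longer works that way; browser capability sits inside a strictly controlled execution boundary.
For briefing purposes, the current security design can be summarized as the following groups of capabilities:
#### 3 layers of security
The first layer is the startup gate. The browser host and the runtime must complete a handshake first; on a version mismatch, a timeout, or an out-of-order sequence, the system does not enter the normal running state.
The second layer is runtime policy validation. Before any page action is actually executed, the runtime checks that the action is legal and that the target page is within the allowed scope.
The third layer is a second check by the host. Even after the runtime has issued a command, the browser host performs its own local validation to keep an abnormal command from taking effect.
Together, these 3 layers form a control model in which the model does not simply do whatever it wants; every step has to pass a check.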
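#### 6 hard protocol constraints
The protocol layer currently defines 6 rigid requirements:
1. The transport format is fixed; messages cannot be sent in arbitrary shapes.
2. The encoding is fixed, which avoids parsing anomalies.
3. Each message has a size limit, which keeps abnormal data from overwhelming the system.
4. Sequence numbers must be strictly increasing, with no repeats and no reordering.
5. Every critical command must carry security fields.
6. One request maps to exactly one response; replies cannot be mixed up.
This shows that the system does not "send a command and see what happens"; every message follows strict rules, which suits stable operation in enterprise environments.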
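#### 2 kinds of allowlist
At least two core allowlists are in effect at the same time.
The first is the domain allowlist. Only permitted business domains can be operated on; not every page in the browser is reachable.
The second is the action allowlist. Only permitted action types can be executed; a script being able to express an action does not mean it will run.
The point of the allowlists is that the system defines in advance which pages can be touched and which actions can be performed, instead of leaving that decision entirely to the model or the script.
#### 1 explicit blocklist
In addition to the allowlists, the project keeps explicit blocked items. That is, the rule is not only "anything not permitted is denied", but also "high-risk actions are directly and explicitly prohibited".
This matters in enterprise settings because some capabilities are not "unused for now" but "must not be opened as a matter of principle". With a blocklist, the design can prevent high-risk capabilities from leaking out in the first place.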
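How the two allowlists and the explicit blocklist might combine into one check can be sketched as follows. This is an assumption-laden illustration: `BrowserPolicy`, `Denied`, and `check` are hypothetical names, domains are compared by exact host match, and the blocklist is evaluated first so that a blocked action is denied even if it was also allowlisted by mistake.

```rust
/// Hypothetical policy holding the two allowlists and the blocklist.
pub struct BrowserPolicy {
    pub allowed_domains: Vec<String>,
    pub allowed_actions: Vec<String>,
    pub blocked_actions: Vec<String>,
}

/// Reason a command was refused.
#[derive(Debug, PartialEq, Eq)]
pub enum Denied {
    BlockedAction,
    ActionNotAllowed,
    DomainNotAllowed,
}

impl BrowserPolicy {
    /// Check one (action, domain) pair before it is executed.
    pub fn check(&self, action: &str, domain: &str) -> Result<(), Denied> {
        // Explicit blocks win over everything else.
        if self.blocked_actions.iter().any(|a| a.as_str() == action) {
            return Err(Denied::BlockedAction);
        }
        // Default deny: the action must be on the allowlist.
        if !self.allowed_actions.iter().any(|a| a.as_str() == action) {
            return Err(Denied::ActionNotAllowed);
        }
        // Default deny: the domain must be on the allowlist (exact match assumed).
        if !self.allowed_domains.iter().any(|d| d.as_str() == domain) {
            return Err(Denied::DomainNotAllowed);
        }
        Ok(())
    }
}
```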
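#### 5 default allowed actions
By default, 5 actions are actually open to the runtime:
1. Click
2. Type
3. Page navigation
4. Text reading
5. Controlled script execution
The key point here is not that "one more action was added", but that the new capability does not break the security boundary. It does not open up arbitrary page scripting; it gives skill packages a stronger yet still controllable way to execute, within the existing controlled protocol and validation chain.
This may look like fewer actions than some traditional solutions offer, but its value lies precisely in the clear boundary. Getting the most stable, controllable, and auditable core actions right first and then expanding gradually is safer than opening a large number of high-risk actions from the start.
#### 7 default allowed domains
Under the current rules, the default allowed domains are a finite set; not every web page in the browser can be touched. This fits the real needs of enterprise intranets well. For office systems, ERP, OA, and similar scenarios, what is needed is not "automation across the whole web" but "controllable automation within a defined scope".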
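#### 1 HMAC signing mechanism
Critical commands are not sent as bare plaintext; they carry a signature that is verified. Put simply, every critical operation carries an "anti-forgery mark".
The benefit is that commands are hard to forge, tamper with, or wrongly replay along the link, so overall security is far higher than with ordinary scripts calling page interfaces directly.
#### 1 sequence-number mechanism
Every command has a strictly increasing sequence number, and each sequence number maps to exactly one result.
This lets the system know exactly which operation a given result belongs to, avoiding cross-talk, reordering, and mismatched replies, and improving stability and traceability.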
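The strictly-increasing rule can be illustrated with a small guard. `SeqGuard` is a hypothetical name, and the sketch assumes "strictly increasing" means each number must be greater than the last accepted one, so gaps are tolerated; the actual protocol may require consecutive numbers.

```rust
/// Hypothetical guard that accepts only strictly increasing sequence numbers.
pub struct SeqGuard {
    last: Option<u64>,
}

impl SeqGuard {
    pub fn new() -> Self {
        SeqGuard { last: None }
    }

    /// Returns true and records `seq` if it is greater than every number
    /// accepted so far; returns false for repeats and out-of-order numbers.
    pub fn accept(&mut self, seq: u64) -> bool {
        match self.last {
            Some(last) if seq <= last => false,
            _ => {
                self.last = Some(seq);
                true
            }
        }
    }
}
```

Because a rejected number never updates the stored value, a replayed or reordered command cannot move the guard backwards.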
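#### 3 constraints on script execution
Later commits added skill-package-driven browser scripting, but this does not "open up page execution completely"; it adds one more controlled capability inside the existing security boundary.
It can be understood as 3 constraints:
1. Scripts must come from a managed path inside the skill package and cannot read files outside the skill directory.
2. Execution must declare a target domain and cannot run detached from the page context.
3. Scripts still run through the existing browser pipe and action allowlist, not by bypassing the host.
This design matters because it shows that, while adding capability, the project still holds to the rule that new capability must stay inside the security boundary, instead of widening the security opening for convenience.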
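#### 5 error-handling strategies
The system does not simply crash on failure; it handles errors by type.
- Some errors are not retried and fail immediately.
- Some errors can be retried a limited number of times.
- Some errors wait for configuration or manual confirmation.
- Some errors trip the circuit breaker.
- Every failure must be returned in a structured form to make diagnosis easier.
This is far more mature than the traditional script approach of "rerun it by hand when it errors".
#### 1 circuit-breaker threshold
If the same action fails consecutively more than the threshold allows, the system stops retrying and notifies the UI, instead of repeating indefinitely.
This effectively prevents repeated clicks, repeated submissions, and repeated operations in an error state, reducing business risk and the cost of mistaken operations.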
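A per-action circuit breaker of this kind could look roughly like the sketch below. `CircuitBreaker` and its methods are hypothetical names, the threshold value is a constructor parameter because the document does not state it, and the sketch assumes that a success resets the consecutive-failure count for that action.

```rust
use std::collections::HashMap;

/// Hypothetical breaker that counts consecutive failures per action.
pub struct CircuitBreaker {
    threshold: u32,
    consecutive: HashMap<String, u32>,
}

impl CircuitBreaker {
    pub fn new(threshold: u32) -> Self {
        CircuitBreaker { threshold, consecutive: HashMap::new() }
    }

    /// A success clears the failure streak for this action (assumed behaviour).
    pub fn record_success(&mut self, action: &str) {
        self.consecutive.remove(action);
    }

    /// Record one failure. Returns true once consecutive failures exceed the
    /// threshold, meaning the caller should stop retrying and notify the UI.
    pub fn record_failure(&mut self, action: &str) -> bool {
        let count = self.consecutive.entry(action.to_string()).or_insert(0);
        *count += 1;
        *count > self.threshold
    }
}
```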
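#### 7 integration acceptance criteria
The project has already written down what counts as successful integration, including:
1. Handshake success-rate requirement
2. Failure handling for version mismatch
3. Handling of abnormal sequence-number scenarios
4. Interception of oversized messages
5. Core action success-rate requirement
6. Structured error return requirement
7. End-to-end log traceability
This shows that the project does not "get by on experience"; it is starting to form engineering standards that can be replicated, accepted, and delivered.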
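### 3. The browser is only an execution surface and no longer defines the whole system
A common problem in traditional openclaw-style projects is that browser capability is so strong that the whole system ends up revolving around page scripts, and the browser scripts practically become the system itself.
This project explicitly defines the browser as a "protected, privileged execution surface", not the runtime itself. Even if the project later expands to other tool surfaces or execution surfaces, the existing architecture does not need to be overturned; there is more room to evolve and the overall structure is clearer.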
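### 4. Runtime capability has moved from "hard-coded logic" to "configuration-driven"
This is an important advance in the later commits. Traditional projects often hard-code models, policies, modes, and environment differences, so any later change ripples through the whole system.
This project now collects a set of key decisions into runtime configuration, including:
- which model provider to use
- which provider is currently active
- which planner mode to use
- which runtime profile to adopt
- which backend the browser capability uses
- which backend Office export uses
- which directory skills are loaded from
For briefing purposes, this can be summarized as:
- 1 unified runtime config
- 3 runtime profiles
- multiple switchable runtime policies
This means the system no longer just "runs however the code was written"; it is entering a stage where behavior can be switched by environment, task, and scenario, which suits real enterprise deployment.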
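### 5. A closed loop of "plan first, then execute, then produce results" is in place
Traditional automation projects tend to start operating on pages immediately; the intermediate process is hard to explain, which also hinders later auditing and governance.
Later commits further strengthened the planner-first mode: before actually executing, the system presents a plan, executes according to it, and then outputs results. For business and management, the value is direct:
- easier to understand what the system is about to do
- easier to check whether execution drifts from the goal
- easier to tie plan, execution, and result into a closed loop
Meanwhile, the project is no longer a single simple browser tool; a clearer division of capabilities is taking shape, for example:
- `superrpa_browser` handles controlled browser operations
- `openxml_office` handles result export
- `screen_html_export` handles export of presentation artifacts
This shows the project is moving from "a single browser-operation entry point" toward "a toolchain organized around business results".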
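### 6. The skill system is moving from "prompt descriptions" to "executable capability packages"
This is a very important practical advantage over traditional openclaw-style projects. In many traditional projects a "skill" is mostly a prompt template, and real delivery falls back to piling up page scripts.
Later commits added skill-package-driven browser script execution. Put simply, a skill no longer just "tells the model how to do it"; it can ship together with deterministic script capability. This has several clear benefits:
1. More stable capability
Key steps need not rely entirely on the model improvising; packaged scripts can carry them out.
2. Better reusability
The same skill package can be reused in similar scenarios without reorganizing page operations every time.
3. Better suited to building enterprise assets
Many high-value processes can gradually be distilled from "prompt know-how" into "reusable skill packages".
The project is thus moving from an "intelligent execution framework" to an "intelligent execution framework + reusable skill assets" model.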
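### 7. The frontend only displays; it holds no execution authority
In traditional projects the frontend, scripts, and execution logic are often mixed together, so "a lot of business decisions end up hidden in the UI". This is fast to build in the short term but risky and costly to maintain in the long term.
This project restricts the frontend to a presentation layer: it only shows status, logs, plans, and results, and does not decide whether to execute, how to switch models, or how to get around the security boundary. The system structure is clearer as a result, and later maintenance and upgrades are less likely to get out of hand.
In addition, later commits support a loading scheme that prefers an external frontend bundle and falls back to built-in resources. UI and presentation changes therefore no longer require rebuilding the browser host each time, which improves both development and delivery efficiency.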
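### 8. Stronger configuration capability, better suited to business rollout
Traditional projects often hard-code key logic in scripts, so every change means changing code. That is inefficient and makes it easy for a local change to ripple through the whole.
This project has begun pulling runtime, model, and strategy configuration out of the code, making the responsibilities of host, runtime, and frontend clearer. Future business adjustments, model switches, and strategy upgrades can therefore proceed more smoothly instead of requiring large-scale rework each time.
Later commits also strengthened the source-checkout launch wrapper and rules synchronization, which matters to the development team: the project is suited not only to binary delivery but also to continuous integration testing and fast iteration from source.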
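### 9. Better suited to long-term asset accumulation
A common problem with traditional automation is that once a process is built its value is essentially spent, and experience rarely accumulates into assets.
This project differs: its direction is to gradually turn execution capability, rules, security boundaries, logging, and the upcoming element-recognition capability into reusable assets. For an enterprise this is worth far more than "getting one process to run today", because it determines whether future work gets faster, more stable, and cheaper.
This now has concrete support. The project does not merely "operate pages"; it can already solidify skills, scripts, export flows, runtime strategies, and logging standards step by step. When the "hybrid self-healing selector" and the element fingerprint library are pursued later, these will naturally form the base layer for asset accumulation.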
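### 10. Stronger observability, with a foundation for runtime-level auditing
Traditional projects often print a log only on failure, so when something goes wrong it is hard to tell what the system actually did.
Later commits added a set of key runtime-level log fields, including:
- runtime version
- protocol version
- configuration source
- skills directory resolution result
- runtime profile
- skills prompt mode
- loaded skills and their versions
- current execution mode
The value is direct: the system starts to be able to "explain how it is running". For development, testing, acceptance, and later auditing, this is an important foundation.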
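### 11. "Real acceptance" rather than "concept demo"
Later commits left more complete acceptance records instead of stopping at capability descriptions in documents. Taking the Zhihu hot-list Excel export as an example, real acceptance results now exist, including:
- running in real provider mode
- live hot-list data collection
- structured result export
- Excel file generation
- acceptance scoring
This shows the project is no longer "able to do it in theory"; it has proven in a real task chain that it "runs end to end, produces output, and yields an acceptance record".
For external reporting this matters, because it marks the move from "capability concept" to "capability verification".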
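### 12. A better engineering foundation
The project is no longer just a validation page or prototype; its engineering structure centers on the runtime kernel, protocol, rules, and tests. It is closer to "a product kernel that can be built on sustainably" than to "an automation effect that demos well".
As the repository stands, more than 20 top-level test files cover the protocol, handshake, runtime, configuration, compat layer, export tools, skill execution, and acceptance scoring. The project is heading toward "sustainable delivery and sustainable verification" rather than remaining an ad hoc script project.
In the long run, this engineering capability determines whether the project can truly enter production, be used collaboratively by more teams, and keep expanding its capabilities.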
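### 13. The current advantages in one sentence
In non-technical terms, the project's biggest current advantage can be summarized as:
Not "better at clicking web pages", but "already equipped with the security boundary, control, real delivery capability, stable foundation, and long-term room to evolve that an enterprise-grade intelligent execution system should have".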
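## II. Next steps
The next phase focuses not on piling up more scripts but on solving the old problem of "breaking whenever the page changes", while lifting the project from "able to execute tasks" to "able to keep accumulating enterprise-grade automation assets".
### 1. Develop the "hybrid self-healing selector" (Hybrid Self-Healing Selector)
In the intranet environment, gradually move away from reliance on a single XPath and build an enterprise-grade element fingerprint library, so the system can still find target elements reliably after pages change.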
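### 2. Define the element fingerprint data structure (JSON)
Create a "digital profile" for each operable element, with these core fields:
- semantic text
- A11y role
- relative spatial position
- attribute hash
- fallback XPath
- visual snapshot (small Base64 image)
With this, the system no longer relies on a single path to find an element; it works more like "multi-feature recognition".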
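As a rough illustration of "multi-feature recognition", the fields above could be modeled and compared as follows. Everything here is a hypothetical sketch: the field names, the 0.0 to 1.0 position encoding, the 0.05 tolerance, and the equal weighting are assumptions, and JSON serialization is left out.

```rust
/// Hypothetical fingerprint record mirroring the listed fields.
#[derive(Debug, Clone, PartialEq)]
pub struct ElementFingerprint {
    pub semantic_text: String,
    pub a11y_role: String,
    /// Relative position within the viewport, each in 0.0..=1.0 (assumed encoding).
    pub rel_x: f32,
    pub rel_y: f32,
    pub attr_hash: u64,
    pub fallback_xpath: String,
    /// Base64-encoded thumbnail; optional because capture may fail.
    pub thumbnail_b64: Option<String>,
}

/// Counts how many independent features of `candidate` agree with `stored`.
/// XPath and thumbnail are deliberately ignored in this simplified score.
pub fn match_score(stored: &ElementFingerprint, candidate: &ElementFingerprint) -> u32 {
    let mut score = 0;
    if stored.semantic_text == candidate.semantic_text {
        score += 1;
    }
    if stored.a11y_role == candidate.a11y_role {
        score += 1;
    }
    if stored.attr_hash == candidate.attr_hash {
        score += 1;
    }
    let near = (stored.rel_x - candidate.rel_x).abs() < 0.05
        && (stored.rel_y - candidate.rel_y).abs() < 0.05;
    if near {
        score += 1;
    }
    score
}
```

An element whose attributes and XPath changed but whose text, role, and position did not would still score 3 out of 4 here, which is the kind of resilience a single XPath cannot offer.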
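### 3. Advance the "shadow recording" (Shadow Recording) mechanism
While the existing traditional RPA runs normally, enable a shadow mode underneath. When an old script hits an element via XPath and completes the operation, the background silently captures that element's full fingerprint and writes it to a local database.
This keeps building a high-value element asset library without adding a large manual recording cost.
### 4. Build the penetration-layer capability
Use the low-level privileges of the customized Chromium to solve locating problems in complex page structures such as `iframe` and closed `Shadow DOM`, providing stronger support for the self-healing selector.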
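## III. Expected outcomes
Through the next phase, the project will advance from "able to execute tasks" to being:
- more stable on complex enterprise pages
- less sensitive to page changes
- easier to keep accumulating high-value assets
- better suited to long-term rollout in enterprise environments
From a business perspective, the project's value will likewise advance from "automating individual processes" to "building an enterprise-grade intelligent execution foundation".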


@@ -0,0 +1,425 @@
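# Claw-WS Development Execution Order Cards
> Companion plan: [`2026-04-01-claw-ws-parallel-transport.md`](./2026-04-01-claw-ws-parallel-transport.md)
>
> How to use: follow the cards strictly in order. After finishing each card, run that card's tests before moving on. Do not skip cards, do not wire things up early, and do not write service/client first and come back to abstract the lower layers.
---
## Card 0: Pre-execution constraints
**Goal**
Lock down the boundaries first so the implementation does not break pipe mode.
**Must follow**
- The existing pipe mode must stay usable
- What is added is a parallel `claw-ws` mode, not a replacement for pipe
- v1 is single-client, single-task, serial only
- The external names `browser_action` / `superrpa_browser` stay stable
- If WS `Eval` is incomplete, disable exposure of the related browser-script skills first
- Do not build multi-client support, task queues, or admin endpoints early
**Done when**
- Developers are clear that all subsequent changes revolve around "abstract and reuse + add in parallel"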
---
## Card 1: Extract the shared SubmitTask runner
**Goal**
Extract the main execution logic of the current `BrowserMessage::SubmitTask` out of the pipe entry into a shared executor.
**Do first**
1. Add `tests/task_runner_test.rs`
2. Write failing cases first:
- empty instruction
- no LLM configuration
- log order is still `LogEntry` -> `TaskComplete`
**Files to change**
- `src/agent/mod.rs`
- `src/lib.rs`
- `src/agent/task_runner.rs`
- `tests/task_runner_test.rs`
**Implementation steps**
1. Add `SubmitTaskRequest`
2. Add `AgentEventSink`
3. Add `run_submit_task(...)`
4. Make the pipe entry only:
- unpack `BrowserMessage::SubmitTask`
- convert it to `SubmitTaskRequest`
- call the shared runner
**Never do**
- Do not introduce the ws backend in this card
- Do not change the tool adapter
- Do not touch service/client
**Test command for this card**
```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```
**Pass criteria**
- The old `runtime_task_flow_test` stays green
- The new `task_runner_test` passes
- pipe behavior is unchanged
**Commit when done**
```bash
git commit -m "refactor: extract shared submit-task runner"
```
---
## Card 2: Extract the BrowserBackend abstraction
**Goal**
Decouple the upper runtime / orchestration / tool adapter from `BrowserPipeTool<T>` so they all depend on a browser backend interface.
**Do first**
1. Add `tests/browser_backend_capability_test.rs`
2. Write failing cases first:
- pipe backend metadata is unchanged
- pipe backend supports `Eval`
- browser-script tools are not exposed when `supports_eval() == false`
**Files to change**
- `src/browser/mod.rs`
- `src/browser/backend.rs`
- `src/browser/pipe_backend.rs`
- `src/compat/browser_tool_adapter.rs`
- `src/compat/browser_script_skill_tool.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
- `src/lib.rs`
- `tests/browser_backend_capability_test.rs`
**Implementation steps**
1. Define `BrowserBackend`
2. Wrap the current `BrowserPipeTool` in `pipe_backend`
3. Change upper-layer signatures to `Arc<dyn BrowserBackend>`
4. Keep tool names unchanged:
- `browser_action`
- `superrpa_browser`
5. Add `supports_eval()` gating
**Never do**
- Do not hook up the browser ws protocol in this card
- Do not build the service
- Do not add the client protocol
**Test command for this card**
```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```
**Pass criteria**
- Existing browser tool tests do not regress
- The new capability test passes
- Upper-layer logic is no longer hard-coupled to `BrowserPipeTool<T>`
**Commit when done**
```bash
git commit -m "refactor: abstract browser backend from pipe transport"
```
---
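## Card 3: Pin down the fixed browser WS protocol codec
**Goal**
Build the encode/decode layer for the browser's fixed WebSocket protocol on its own, so protocol details do not spread into the backend and service.
**Do first**
1. Add `tests/browser_ws_protocol_test.rs`
2. Write failing cases first:
- exact outbound frame encoding
- callback payload parsing
- rejection of malformed formats
- complete coverage of v1 actions
**Files to change**
- `src/browser/ws_protocol.rs`
- `tests/browser_ws_protocol_test.rs`
**Implementation steps**
1. Encode array messages per the browser document
2. Support only the actions v1 requires:
- `Navigate`
- `GetText`
- `Click`
- `Type`
- `Eval`
3. Define callback parsing and correlation rules
4. Fail early on unsupported / malformed input
**Never do**
- Do not connect to a real browser in this card
- Do not write the service protocol
- Do not put network connection logic into the codec
**Test command for this card**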
```bash
cargo test --test browser_ws_protocol_test
```
**Pass criteria**
- All codec unit tests pass
- No network dependency
- Usable as the pure protocol layer underneath the backend
**Commit when done**
```bash
git commit -m "test: codify fixed browser websocket protocol"
```
---
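## Card 4: Implement the Browser WS backend
**Goal**
On top of the codec, provide a blocking `invoke(...)` capability similar to the pipe backend.
**Do first**
1. Add `tests/browser_ws_backend_test.rs`
2. Write failing cases first:
- `0` + no callback succeeds
- non-`0` fails
- `0` + callback succeeds
- callback timeout
- socket drop
**Files to change**
- `src/browser/mod.rs`
- `src/browser/ws_backend.rs`
- `tests/browser_ws_backend_test.rs`
**Implementation steps**
1. Build a long-lived connection manager
2. Start with serialized, single-flight requests
3. Check `MacPolicy` before sending
4. Unify immediate returns and callback returns
5. Emit a unified `CommandOutput`
**Never do**
- Do not hook up the service layer in this card
- Do not support concurrent browser requests
- Do not scatter browser ws code directly into the runtime
**Test command for this card**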
```bash
cargo test --test browser_ws_backend_test
```
**Pass criteria**
- The backend passes reliably under mocks/fakes
- `invoke` semantics are close to the pipe backend
- The upper runtime can use it as a drop-in replacement
**Commit when done**
```bash
git commit -m "feat: add browser websocket backend"
```
---
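## Card 5: Implement the sg_claw service
**Goal**
Add a long-running local server that accepts client requests and reuses the shared task runner.
**Do first**
1. Add `tests/service_ws_session_test.rs`
2. Write failing cases first:
- the first client attaches successfully
- the second client receives busy
- state is released after disconnect
- task re-entry is rejected
**Files to change**
- `src/service/mod.rs`
- `src/service/protocol.rs`
- `src/service/server.rs`
- `src/bin/sg_claw.rs`
- `src/lib.rs`
- `Cargo.toml`
- `tests/service_ws_session_test.rs`
**Implementation steps**
1. Define the client/service protocol
2. Implement the service-side event sink
3. Build the single-active-session state machine:
- `Idle`
- `ClientAttached`
- `TaskRunning`
4. Route `SubmitTask` to the shared runner
5. Keep the pipe entry unchanged
**Never do**
- Do not polish the client interaction experience in this card
- Do not add a task queue
- Do not support concurrent clients
**Test command for this card**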
```bash
cargo test --test service_ws_session_test
```
**Pass criteria**
- The server-side session lock works
- The service can reuse the shared runner
- The pipe-mode entry is unaffected
**Commit when done**
```bash
git commit -m "feat: add claw-ws service entrypoint"
```
---
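## Card 6: Implement sg_claw_client
**Goal**
Add a thin client that offers an interactive command-line experience similar to `claude/codex`.
**Do first**
1. Add `tests/service_task_flow_test.rs`
2. Write failing cases first:
- submit-task reaches the service
- logs stream back in order
- completion arrives exactly once
- disconnect after completion is handled cleanly
**Files to change**
- `src/bin/sg_claw_client.rs`
- `Cargo.toml`
- `tests/service_task_flow_test.rs`
**Implementation steps**
1. Open a ws connection to the local service
2. Read user input
3. Send `SubmitTask`
4. Print logs in real time
5. End the turn on `TaskComplete`
**Never do**
- Do not copy the runtime, skills, or browser backend into the client
- Do not let the client connect to the browser directly
- Do not let the client carry business logic
**Test command for this card**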
```bash
cargo test --test service_task_flow_test
cargo build --bin sg_claw --bin sg_claw_client
```
**Pass criteria**
- The client is a thin shell
- The task flow works
- Both new binaries compile
**Commit when done**
```bash
git commit -m "feat: add interactive claw-ws client"
```
---
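## Card 7: Final wiring and regression verification
**Goal**
Wire up the ws path while confirming zero regression on the pipe path.
**Do first**
1. Add only the minimum config items:
- `browser_ws_url`
- `service_ws_listen_addr`
2. Check that external tool names stay stable
**Files to change**
- `Cargo.toml`
- `src/lib.rs`
- `src/config/settings.rs`
- `src/runtime/engine.rs` (only if truly necessary)
**Implementation steps**
1. Wire in the minimum config surface
2. Make sure tool names match under pipe / ws
3. Run the old pipe regression
4. Run the new ws tests
5. Run the full Rust test suite
6. Build all binaries
7. Do one real local smoke test
**Pipe regression command for this card**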
```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test --test pipe_handshake_test --test pipe_protocol_test --test task_protocol_test
```
**WS test command for this card**
```bash
cargo test --test task_runner_test --test browser_ws_protocol_test --test browser_ws_backend_test --test browser_backend_capability_test --test service_ws_session_test --test service_task_flow_test
```
**Full-suite commands for this card**
```bash
cargo test --tests
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
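**Manual verification**
1. Start the browser and confirm `ws://127.0.0.1:12345` is available
2. `cargo run --bin sg_claw`
3. In a new terminal, run `cargo run --bin sg_claw_client`
4. Submit a simple browser task
5. Confirm the log stream and a single completion
6. Confirm the old `cargo run` pipe entry still starts
**Pass criteria**
- Zero regression in pipe mode
- ws mode works on its own
- Both modes exist side by side
**Commit when done**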
```bash
git commit -m "feat: wire parallel claw-ws transport"
```
---
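## Execution order at a glance
Follow this order strictly:
1. Shared runner
2. Browser backend abstraction
3. ws protocol codec
4. ws backend
5. service
6. client
7. Config wiring + regression
If the order is broken, the most likely problems are:
- duplicated implementation in upper layers
- pipe mode damaged by accident
- ws protocol details spreading across the whole project
- service/client written early and then forced into a refactor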


@@ -0,0 +1,687 @@
# Claw-WS Parallel Transport Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add a parallel `claw-ws` transport path that keeps the current pipe mode intact while introducing a long-lived `sg_claw` local service, an interactive `sg_claw_client`, and a browser WebSocket backend at `ws://127.0.0.1:12345`.
**Architecture:** First extract a transport-agnostic submit-task runner and browser backend abstraction from the current pipe-coupled flow. Keep the existing pipe path as one adapter/backend, then add a fixed-protocol browser WebSocket backend plus a small service/session layer and an interactive CLI client that reuse the same runtime, orchestration, and browser-facing tool adapters.
**Tech Stack:** Rust 2021, current sgclaw compat runtime, zeroclaw runtime engine, `serde`/`serde_json`, existing `MacPolicy`, and a blocking WebSocket crate for v1 (`tungstenite` preferred over a broad async rewrite).
---
## Scope Guardrails
- Keep the current pipe mode entrypoint and behavior working.
- Do **not** replace the existing browser pipe path.
- Add a **parallel** WebSocket path only.
- v1 supports **one active client session** only.
- Reuse existing tool names and runtime behavior whenever possible.
- If WS `Eval` support is incomplete, disable eval-dependent browser-script skill exposure in WS mode rather than shipping partial behavior.
- Do not broaden v1 with task queues, multi-client support, or admin endpoints.
---
## File Structure
### Existing files to reuse
- Modify: `src/lib.rs` — current pipe bootstrap and receive loop; keep as the legacy pipe entrypoint.
- Modify: `src/agent/mod.rs` — current `BrowserMessage::SubmitTask` entrypoint and config-loading flow.
- Modify: `src/compat/runtime.rs` — compat runtime and tool assembly.
- Modify: `src/compat/orchestration.rs` — direct workflow vs compat runtime routing.
- Modify: `src/compat/browser_tool_adapter.rs` — exposes `browser_action` and `superrpa_browser`.
- Modify: `src/compat/browser_script_skill_tool.rs` — browser-script skill execution.
- Modify: `src/compat/workflow_executor.rs` — direct browser workflows such as Zhihu flows.
- Reuse: `src/pipe/browser_tool.rs` — current browser command executor; retain as the pipe backend implementation.
- Reuse: `src/pipe/protocol.rs` — `BrowserMessage`, `AgentMessage`, `Action`, `ExecutionSurfaceMetadata`.
- Reuse: `src/security/mac_policy.rs` — local action/domain guardrails.
- Modify: `src/config/settings.rs` — minimal new config surface for WS mode.
- Optional modify: `src/runtime/engine.rs` — only if backend capability wiring requires it.
### New files to create
- Create: `src/agent/task_runner.rs` — shared submit-task execution entrypoint.
- Create: `src/browser/mod.rs` — browser backend exports.
- Create: `src/browser/backend.rs` — `BrowserBackend` trait and helpers.
- Create: `src/browser/pipe_backend.rs` — wrapper around existing `BrowserPipeTool`.
- Create: `src/browser/ws_protocol.rs` — fixed browser WS request/response codec.
- Create: `src/browser/ws_backend.rs` — browser WS backend with blocking invoke semantics.
- Create: `src/service/mod.rs` — service exports.
- Create: `src/service/protocol.rs` — client/service WS message types.
- Create: `src/service/server.rs` — single-session `sg_claw` server.
- Create: `src/bin/sg_claw.rs` — service binary.
- Create: `src/bin/sg_claw_client.rs` — interactive CLI client.
- Create: `tests/task_runner_test.rs` — shared submit-task runner regressions.
- Create: `tests/browser_backend_capability_test.rs` — backend capability/tool exposure tests.
- Create: `tests/browser_ws_protocol_test.rs` — browser WS protocol tests.
- Create: `tests/browser_ws_backend_test.rs` — browser WS backend tests.
- Create: `tests/service_ws_session_test.rs` — single-session server tests.
- Create: `tests/service_task_flow_test.rs` — client/service task flow tests.
---
## Task 1: Extract a shared submit-task runner
**Files:**
- Create: `src/agent/task_runner.rs`
- Modify: `src/agent/mod.rs`
- Modify: `src/lib.rs`
- Test: `tests/task_runner_test.rs`
- Reuse: `src/compat/runtime.rs`, `src/compat/orchestration.rs`
- [ ] **Step 1: Write a failing runner regression test**
Create `tests/task_runner_test.rs` covering:
- empty instruction returns the same `TaskComplete` failure summary
- missing LLM config still returns the same summary shape
- the pipe adapter still emits `LogEntry` before `TaskComplete`
- [ ] **Step 2: Run the targeted regression tests first**
Run:
```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```
Expected: `task_runner_test` fails because the shared runner does not exist yet.
- [ ] **Step 3: Define the transport-neutral request model**
Create `src/agent/task_runner.rs` with a request struct that mirrors the current pipe payload:
```rust
pub struct SubmitTaskRequest {
    pub instruction: String,
    pub conversation_id: Option<String>,
    pub messages: Vec<ConversationMessage>,
    pub page_url: Option<String>,
    pub page_title: Option<String>,
}
```
Normalize empty strings to `None` at the adapter boundary.
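The normalization at the adapter boundary could be a one-line helper like this sketch. `non_empty` is a hypothetical name, and treating whitespace-only strings as empty is an assumption beyond what the plan states.

```rust
/// Maps empty or whitespace-only strings to `None`; non-empty input is returned unchanged.
/// Treating whitespace-only input as empty is an assumption of this sketch.
pub fn non_empty(value: String) -> Option<String> {
    if value.trim().is_empty() {
        None
    } else {
        Some(value)
    }
}
```

The pipe adapter would apply it to fields such as `page_url` and `page_title` while building the request, so the shared runner never has to distinguish `""` from absent.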
- [ ] **Step 4: Define an event sink abstraction**
Add a small trait that can emit the current agent events without depending on a specific transport:
```rust
pub trait AgentEventSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError>;
}
```
The existing pipe transport should implement this first.
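One reason to put the sink behind a trait is that tests can substitute an in-memory recorder and assert on event order. The sketch below uses simplified stand-ins for `AgentMessage` and `PipeError` (the project's real types are richer), and `RecordingSink` is a hypothetical test double.

```rust
use std::sync::Mutex;

// Stand-ins for the project's real types, reduced to what the sketch needs.
#[derive(Debug, Clone, PartialEq)]
pub enum AgentMessage {
    LogEntry { level: String, message: String },
    TaskComplete { success: bool, summary: String },
}

#[derive(Debug)]
pub struct PipeError(pub String);

pub trait AgentEventSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError>;
}

/// Test double that records every event in arrival order.
#[derive(Default)]
pub struct RecordingSink {
    events: Mutex<Vec<AgentMessage>>,
}

impl AgentEventSink for RecordingSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError> {
        self.events
            .lock()
            .map_err(|_| PipeError("sink mutex poisoned".to_string()))?
            .push(message.clone());
        Ok(())
    }
}

impl RecordingSink {
    /// Returns a snapshot of everything sent so far.
    pub fn events(&self) -> Vec<AgentMessage> {
        self.events.lock().expect("sink mutex poisoned").clone()
    }
}
```

A regression test for "`LogEntry` before `TaskComplete`" can then run the shared runner against this sink and inspect the recorded vector.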
- [ ] **Step 5: Move submit-task execution into a shared function**
Extract the body currently inside `BrowserMessage::SubmitTask` handling from `src/agent/mod.rs` into a shared function such as:
```rust
pub fn run_submit_task(
    sink: &dyn AgentEventSink,
    browser_backend: Arc<dyn BrowserBackend>,
    context: &AgentRuntimeContext,
    request: SubmitTaskRequest,
) -> Result<(), PipeError>
```
This function must still:
- validate empty instruction
- load sgclaw settings
- log runtime/config info
- choose orchestration vs compat runtime
- emit `AgentMessage::TaskComplete`
- [ ] **Step 6: Keep pipe mode as a thin adapter**
Refactor `handle_browser_message_with_context(...)` in `src/agent/mod.rs` so it only:
- pattern matches `BrowserMessage`
- converts `SubmitTask` into `SubmitTaskRequest`
- forwards into `run_submit_task(...)`
- [ ] **Step 7: Re-run the runner regressions**
Run:
```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```
Expected: both tests pass and pipe behavior remains unchanged.
- [ ] **Step 8: Commit**
```bash
git add src/agent/mod.rs src/agent/task_runner.rs src/lib.rs tests/task_runner_test.rs
git commit -m "refactor: extract shared submit-task runner"
```
---
## Task 2: Introduce a browser backend abstraction and wrap the current pipe implementation
**Files:**
- Create: `src/browser/mod.rs`
- Create: `src/browser/backend.rs`
- Create: `src/browser/pipe_backend.rs`
- Modify: `src/lib.rs`
- Modify: `src/compat/browser_tool_adapter.rs`
- Modify: `src/compat/browser_script_skill_tool.rs`
- Modify: `src/compat/runtime.rs`
- Modify: `src/compat/orchestration.rs`
- Modify: `src/compat/workflow_executor.rs`
- Test: `tests/browser_backend_capability_test.rs`
- Reuse: `src/pipe/browser_tool.rs`, `src/security/mac_policy.rs`
- [ ] **Step 1: Add a failing backend capability test**
Create `tests/browser_backend_capability_test.rs` to verify:
- pipe backend still exposes privileged surface metadata
- pipe backend still supports `Eval`
- browser-script tool exposure is disabled when `supports_eval()` is false
- [ ] **Step 2: Run the current browser adapter tests first**
Run:
```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```
Expected: new capability test fails because the backend abstraction does not exist yet.
- [ ] **Step 3: Define the shared browser interface**
Create `src/browser/backend.rs`:
```rust
pub trait BrowserBackend: Send + Sync {
    fn invoke(
        &self,
        action: Action,
        params: Value,
        expected_domain: &str,
    ) -> Result<CommandOutput, PipeError>;
    fn surface_metadata(&self) -> ExecutionSurfaceMetadata;
    fn supports_eval(&self) -> bool {
        true
    }
}
```
- [ ] **Step 4: Implement the pipe backend as a wrapper**
Create `src/browser/pipe_backend.rs` that stores the current `BrowserPipeTool<T>` and forwards `invoke(...)` and `surface_metadata()` unchanged.
Pipe mode must continue using:
- `perform_handshake(...)`
- `MacPolicy::load_from_path(...)`
- `BrowserPipeTool::new(...).with_response_timeout(...)`
- [ ] **Step 5: Refactor runtime and tool adapters to depend on `Arc<dyn BrowserBackend>`**
Update:
- `src/compat/browser_tool_adapter.rs`
- `src/compat/browser_script_skill_tool.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
Preserve external tool names:
- `browser_action`
- `superrpa_browser`
- [ ] **Step 6: Add capability gating for eval-dependent script tools**
If `supports_eval()` is false, do **not** expose browser-script skill tools from `build_browser_script_skill_tools(...)` in that backend mode.
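The gating rule can be sketched with a cut-down trait that models only the capability probe. `PipeLike`, `WsWithoutEval`, and `exposed_tools` are hypothetical names; `script_tools` stands in for whatever `build_browser_script_skill_tools(...)` would return in the real code.

```rust
// Minimal stand-in for the plan's trait; only the capability probe is modeled.
pub trait BrowserBackend {
    fn supports_eval(&self) -> bool {
        true
    }
}

/// Backend that keeps the default (eval supported), like the pipe backend in the plan.
pub struct PipeLike;
impl BrowserBackend for PipeLike {}

/// Backend whose eval support is incomplete.
pub struct WsWithoutEval;
impl BrowserBackend for WsWithoutEval {
    fn supports_eval(&self) -> bool {
        false
    }
}

/// Returns the tool names to expose for a given backend.
pub fn exposed_tools(backend: &dyn BrowserBackend, script_tools: &[&str]) -> Vec<String> {
    // The two stable tool names are always present.
    let mut tools = vec!["browser_action".to_string(), "superrpa_browser".to_string()];
    // Eval-dependent script tools are added only when the backend can run them.
    if backend.supports_eval() {
        tools.extend(script_tools.iter().map(|t| t.to_string()));
    }
    tools
}
```

Gating at tool-assembly time means the model never sees a tool it cannot successfully call, which is the "no partial behavior" goal stated in the scope guardrails.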
- [ ] **Step 7: Re-run browser adapter tests**
Run:
```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```
Expected: all three pass.
- [ ] **Step 8: Commit**
```bash
git add src/browser src/lib.rs src/compat/browser_tool_adapter.rs src/compat/browser_script_skill_tool.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/browser_backend_capability_test.rs
git commit -m "refactor: abstract browser backend from pipe transport"
```
---
## Task 3: Implement the fixed browser WebSocket protocol codec in isolation
**Files:**
- Create: `src/browser/ws_protocol.rs`
- Test: `tests/browser_ws_protocol_test.rs`
- Reuse: `docs/_tmp_sgbrowser_ws_api_doc.txt`
- [ ] **Step 1: Write failing protocol codec tests**
Create `tests/browser_ws_protocol_test.rs` covering:
- exact outbound frame encoding
- callback payload decoding
- unknown callback format rejection
- mapping coverage for every supported v1 action
- [ ] **Step 2: Run the protocol tests first**
Run:
```bash
cargo test --test browser_ws_protocol_test
```
Expected: fail because the WS protocol codec does not exist yet.
- [ ] **Step 3: Encode the exact browser frame shapes**
Create `src/browser/ws_protocol.rs` so it can build exact array-form payloads such as:
```rust
[requesturl, "sgBrowserExcuteJsCodeByArea", target_url, js_code, area]
```
Serialize to the JSON string format required by the browser service.
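For illustration, the array form above could be serialized as in the following dependency-free sketch. It assumes every element, including `area`, is a string, which the browser document may contradict; `encode_eval_frame` and `json_string` are hypothetical helpers, and real code would more likely use `serde_json`.

```rust
/// Escapes a string as a JSON string literal (surrounding quotes included).
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Builds the five-element array frame as a JSON string.
/// Assumption: all five positions are strings.
pub fn encode_eval_frame(request_url: &str, target_url: &str, js_code: &str, area: &str) -> String {
    let parts = [request_url, "sgBrowserExcuteJsCodeByArea", target_url, js_code, area];
    let encoded: Vec<String> = parts.iter().map(|p| json_string(p)).collect();
    format!("[{}]", encoded.join(","))
}
```

Keeping the encoder a pure function of its inputs is what lets the protocol tests assert on exact frames without any network dependency.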
- [ ] **Step 4: Define the v1 action mapping table**
Support only the actions already needed by current sgclaw flows:
- `Navigate`
- `GetText`
- `Click`
- `Type`
- `Eval`
Document which browser functions each one maps to and what assumptions they rely on.
- [ ] **Step 5: Define callback parsing and correlation rules**
Represent callback-bearing operations explicitly, including the callback function naming or request-correlation strategy the backend will depend on.
- [ ] **Step 6: Reject unsupported or malformed shapes early**
Fail fast for:
- unsupported actions
- malformed callback payloads
- missing request correlation metadata
- [ ] **Step 7: Re-run the protocol tests**
Run:
```bash
cargo test --test browser_ws_protocol_test
```
Expected: pass with no network dependency.
- [ ] **Step 8: Commit**
```bash
git add src/browser/ws_protocol.rs tests/browser_ws_protocol_test.rs
git commit -m "test: codify fixed browser websocket protocol"
```
---
## Task 4: Build the browser WS backend with synchronous invoke semantics
**Files:**
- Create: `src/browser/ws_backend.rs`
- Modify: `src/browser/mod.rs`
- Test: `tests/browser_ws_backend_test.rs`
- Reuse: `CommandOutput`, `PipeError`, `ExecutionSurfaceMetadata`, `MacPolicy`
- [ ] **Step 1: Write failing backend behavior tests**
Create `tests/browser_ws_backend_test.rs` covering:
- zero return + no callback => success
- non-zero return => failure
- zero return + callback => success with normalized `CommandOutput`
- callback timeout => timeout error
- dropped socket => clear failure
- [ ] **Step 2: Run backend tests first**
Run:
```bash
cargo test --test browser_ws_backend_test
```
Expected: fail because the WS backend does not exist yet.
- [ ] **Step 3: Build a long-lived browser connection manager**
Implement `src/browser/ws_backend.rs` to connect to `ws://127.0.0.1:12345` and expose blocking `invoke(...)` calls.
Use a dedicated connection loop plus request/response coordination instead of scattering raw socket calls through the runtime.
- [ ] **Step 4: Preserve local guardrails before send**
Validate `MacPolicy` before translating an action into the browser WS protocol, matching current pipe backend behavior.
- [ ] **Step 5: Normalize immediate status returns and delayed callbacks**
For each `invoke(...)` call:
- fail immediately on non-zero return codes
- succeed immediately for operations with no data callback
- wait for the matching callback for result-bearing operations
- convert the final outcome into `CommandOutput`
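The four rules above reduce to a small decision function. In this sketch `InvokeOutcome` is a hypothetical stand-in for the project's `CommandOutput`/`PipeError` pair, and a missing callback is assumed to mean the wait timed out.

```rust
/// Hypothetical normalized result; the real `CommandOutput` has its own shape.
#[derive(Debug, PartialEq)]
pub enum InvokeOutcome {
    Success { data: Option<String> },
    Failed { code: i32 },
    TimedOut,
}

/// `status` is the immediate return code, `expects_callback` says whether the
/// operation is result-bearing, and `callback` is the payload if one arrived in time.
pub fn normalize(status: i32, expects_callback: bool, callback: Option<String>) -> InvokeOutcome {
    // Non-zero return codes fail immediately, regardless of any callback.
    if status != 0 {
        return InvokeOutcome::Failed { code: status };
    }
    // Operations without a data callback succeed as soon as the status is zero.
    if !expects_callback {
        return InvokeOutcome::Success { data: None };
    }
    // Result-bearing operations need the matching callback payload.
    match callback {
        Some(payload) => InvokeOutcome::Success { data: Some(payload) },
        None => InvokeOutcome::TimedOut,
    }
}
```

The branches line up with the failing tests listed in Step 1, except the dropped-socket case, which belongs to the connection layer rather than this function.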
- [ ] **Step 6: Keep v1 concurrency intentionally serialized**
Allow only one in-flight browser request at a time unless the browser callback protocol proves a stable request-id guarantee.
- [ ] **Step 7: Re-run backend tests**
Run:
```bash
cargo test --test browser_ws_backend_test
```
Expected: pass using mocks/fakes, not the real browser.
- [ ] **Step 8: Commit**
```bash
git add src/browser/mod.rs src/browser/ws_backend.rs tests/browser_ws_backend_test.rs
git commit -m "feat: add browser websocket backend"
```
---
## Task 5: Add the `sg_claw` service protocol and single-session server
**Files:**
- Create: `src/service/mod.rs`
- Create: `src/service/protocol.rs`
- Create: `src/service/server.rs`
- Create: `src/bin/sg_claw.rs`
- Modify: `src/lib.rs`
- Modify: `Cargo.toml`
- Test: `tests/service_ws_session_test.rs`
- Reuse: `AgentMessage::LogEntry`, `AgentMessage::TaskComplete`, `SubmitTaskRequest`, `run_submit_task(...)`
- [ ] **Step 1: Write failing service session tests**
Create `tests/service_ws_session_test.rs` to verify:
- first client attaches
- second client gets `Busy`
- disconnect resets session state
- overlapping task submission is rejected clearly
- [ ] **Step 2: Run the session tests first**
Run:
```bash
cargo test --test service_ws_session_test
```
Expected: fail because the service layer does not exist yet.
- [ ] **Step 3: Define a thin client/service WS protocol**
In `src/service/protocol.rs`, reuse existing task/event shapes as much as possible:
```rust
ClientMessage::SubmitTask { instruction, conversation_id, messages, page_url, page_title }
ClientMessage::Ping
ServiceMessage::LogEntry { level, message }
ServiceMessage::TaskComplete { success, summary }
ServiceMessage::Busy { message }
```
- [ ] **Step 4: Add the service event sink adapter**
Implement `AgentEventSink` for the service session writer so the shared task runner can stream `LogEntry` and `TaskComplete` over the service WebSocket.
- [ ] **Step 5: Implement single-active-client session state**
Model explicit states such as:
- `Idle`
- `ClientAttached`
- `TaskRunning`
Reject a second client with `ServiceMessage::Busy` and close the socket. Reject overlapping tasks instead of queueing them.
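A minimal sketch of the three-state session model follows. The state names come from the plan; `Session`, `SessionError`, and the transition methods are hypothetical, and returning to `ClientAttached` after a task finishes is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    Idle,
    ClientAttached,
    TaskRunning,
}

/// Hypothetical rejection reasons; `Busy` would map to `ServiceMessage::Busy`.
#[derive(Debug, PartialEq, Eq)]
pub enum SessionError {
    Busy,
    NoClient,
    TaskAlreadyRunning,
}

pub struct Session {
    state: SessionState,
}

impl Session {
    pub fn new() -> Self {
        Session { state: SessionState::Idle }
    }

    pub fn state(&self) -> SessionState {
        self.state
    }

    /// Only an idle service accepts a client; anyone else is told it is busy.
    pub fn attach(&mut self) -> Result<(), SessionError> {
        match self.state {
            SessionState::Idle => {
                self.state = SessionState::ClientAttached;
                Ok(())
            }
            _ => Err(SessionError::Busy),
        }
    }

    /// Overlapping tasks are rejected instead of queued.
    pub fn start_task(&mut self) -> Result<(), SessionError> {
        match self.state {
            SessionState::ClientAttached => {
                self.state = SessionState::TaskRunning;
                Ok(())
            }
            SessionState::TaskRunning => Err(SessionError::TaskAlreadyRunning),
            SessionState::Idle => Err(SessionError::NoClient),
        }
    }

    /// Returns to the attached state once the running task completes.
    pub fn finish_task(&mut self) {
        if self.state == SessionState::TaskRunning {
            self.state = SessionState::ClientAttached;
        }
    }

    /// Disconnect resets everything, whatever the current state.
    pub fn detach(&mut self) {
        self.state = SessionState::Idle;
    }
}
```

Keeping the transitions in one type makes the four session tests from Step 1 straightforward to express without a socket.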
- [ ] **Step 6: Add the service binary**
Create `src/bin/sg_claw.rs` that:
- loads config
- creates the browser WS backend
- listens for local client connections
- routes `SubmitTask` into `run_submit_task(...)`
Keep `src/main.rs` and the existing `sgclaw::run()` pipe path unchanged.
- [ ] **Step 7: Re-run the session tests**
Run:
```bash
cargo test --test service_ws_session_test
```
Expected: pass without the real browser.
- [ ] **Step 8: Commit**
```bash
git add src/service src/bin/sg_claw.rs src/lib.rs Cargo.toml tests/service_ws_session_test.rs
git commit -m "feat: add claw-ws service entrypoint"
```
---
## Task 6: Add the `sg_claw_client` interactive CLI
**Files:**
- Create: `src/bin/sg_claw_client.rs`
- Modify: `Cargo.toml`
- Test: `tests/service_task_flow_test.rs`
- Reuse: `src/service/protocol.rs`
- [ ] **Step 1: Write failing client/service task flow tests**
Create `tests/service_task_flow_test.rs` to verify:
- the submit-task request reaches the service
- log entries stream in order
- the final summary arrives exactly once
- disconnect after task completion is handled cleanly
- [ ] **Step 2: Run the flow tests first**
Run:
```bash
cargo test --test service_task_flow_test
```
Expected: fail because the client does not exist yet.
- [ ] **Step 3: Implement a thin interactive client loop**
Create `src/bin/sg_claw_client.rs` that:
- connects to the local `sg_claw` service
- reads a line of user input
- sends `ClientMessage::SubmitTask`
- prints streamed `LogEntry` events as they arrive
- ends the turn on `TaskComplete`
- [ ] **Step 4: Keep the client intentionally dumb**
Do **not** duplicate runtime logic in the client. Browser access, skills, orchestration, and task execution remain entirely inside the service.
- [ ] **Step 5: Re-run the flow tests**
Run:
```bash
cargo test --test service_task_flow_test
```
Expected: pass without the real browser.
- [ ] **Step 6: Build the new binaries explicitly**
Run:
```bash
cargo build --bin sg_claw --bin sg_claw_client
```
Expected: both binaries compile successfully.
- [ ] **Step 7: Commit**
```bash
git add src/bin/sg_claw_client.rs Cargo.toml tests/service_task_flow_test.rs
git commit -m "feat: add interactive claw-ws client"
```
---
## Task 7: Finish wiring, preserve pipe mode, and verify end-to-end behavior
**Files:**
- Modify: `Cargo.toml`
- Modify: `src/lib.rs`
- Modify: `src/config/settings.rs`
- Optional modify: `src/runtime/engine.rs`
- Reuse: `tests/browser_tool_test.rs`, `tests/runtime_task_flow_test.rs`, `tests/compat_runtime_test.rs`
- [ ] **Step 1: Add only the minimum config surface for v1**
Add settings such as:
- `browser_ws_url` defaulting to `ws://127.0.0.1:12345`
- `service_ws_listen_addr` defaulting to local loopback
Do **not** change the meaning of existing browser backend/profile settings just to represent service mode.
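The two settings and their defaults might be modeled as below. `WsSettings` and `listens_on_loopback` are hypothetical; the browser URL default comes from the plan, while the listen address `127.0.0.1:0` is only a placeholder, since the plan says "local loopback" without naming a port.

```rust
use std::net::SocketAddr;

/// Hypothetical slice of the settings struct; only the two new fields are shown.
#[derive(Debug, Clone, PartialEq)]
pub struct WsSettings {
    pub browser_ws_url: String,
    pub service_ws_listen_addr: String,
}

impl Default for WsSettings {
    fn default() -> Self {
        WsSettings {
            browser_ws_url: "ws://127.0.0.1:12345".to_string(),
            // Placeholder port: the plan only requires a loopback address.
            service_ws_listen_addr: "127.0.0.1:0".to_string(),
        }
    }
}

/// Returns true only if `addr` parses as a socket address on a loopback interface.
pub fn listens_on_loopback(addr: &str) -> bool {
    addr.parse::<SocketAddr>()
        .map(|a| a.ip().is_loopback())
        .unwrap_or(false)
}
```

A startup check like `listens_on_loopback` is one way to keep the v1 service from being exposed beyond the local machine by a mistyped config value.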
- [ ] **Step 2: Keep external browser tool naming stable**
Verify that the runtime still exposes:
- `superrpa_browser`
- `browser_action`
under both pipe and WS modes where the backend supports them.
- [ ] **Step 3: Re-run the current pipe regression suite**
Run:
```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test --test pipe_handshake_test --test pipe_protocol_test --test task_protocol_test
```
Expected: all existing pipe-oriented tests still pass unchanged.
- [ ] **Step 4: Run the new WS-focused suite**
Run:
```bash
cargo test --test task_runner_test --test browser_ws_protocol_test --test browser_ws_backend_test --test browser_backend_capability_test --test service_ws_session_test --test service_task_flow_test
```
Expected: all new tests pass without launching the real browser.
- [ ] **Step 5: Run a full Rust test sweep**
Run:
```bash
cargo test --tests
```
Expected: all Rust tests pass.
- [ ] **Step 6: Build all three binaries**
Run:
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: all three binaries compile.
- [ ] **Step 7: Perform a manual local smoke test**
Manual test:
1. Start the browser app so `ws://127.0.0.1:12345` is available.
2. Run `cargo run --bin sg_claw`.
3. In another terminal, run `cargo run --bin sg_claw_client`.
4. Submit a simple browser task such as opening a page or fetching visible text.
5. Confirm the client prints streaming logs and exactly one final completion summary.
6. Confirm the old pipe-mode entry still starts via `cargo run`.
Expected: both modes work side-by-side.
- [ ] **Step 8: Commit**
```bash
git add Cargo.toml src/lib.rs src/config/settings.rs src/runtime/engine.rs
git commit -m "feat: wire parallel claw-ws transport"
```
---
## Verification Checklist
### Fast regression checks
```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test
```
Expected: current pipe/browser runtime behavior remains green.
### Full Rust test sweep
```bash
cargo test --tests
```
Expected: all Rust tests pass.
### Binary build verification
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: all three binaries compile.
### Manual end-to-end verification
- Browser app listening on `ws://127.0.0.1:12345`
- `cargo run --bin sg_claw`
- `cargo run --bin sg_claw_client`
- submit one browser task
- verify streaming logs, final completion, and single-client lock behavior
- verify `cargo run` still preserves old pipe bootstrap
---
## Notes for Implementation
- Keep the current pipe bootstrap in `src/lib.rs` intact until the shared runner and pipe backend wrapper are both green.
- Prefer small commits at each task boundary.
- Keep the new WS path additive and isolated.
- Do not ship partial browser capabilities under stable tool names.
- Treat `docs/_tmp_sgbrowser_ws_api_doc.txt` as the browser WS protocol source of truth while implementing `src/browser/ws_protocol.rs`.


@@ -0,0 +1,607 @@
# WS Browser Backend Auth Replacement Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Replace the ws service path's empty-session-key `BrowserPipeTool` dependency with a ws-native browser backend path so real browser websocket calls work, while preserving legacy pipe behavior exactly.
**Architecture:** Keep the existing pipe entry untouched and add a ws-only parallel execution seam. The ws service path will construct a `ServiceBrowserWsClient` plus `WsBrowserBackend`, pass that backend through a new ws-only submit-task entry, and let the existing compat/runtime stack consume `Arc<dyn BrowserBackend>` instead of requiring `BrowserPipeTool` on the ws path.
**Tech Stack:** Rust 2021, current sgclaw agent/task runner, compat runtime/orchestration stack, `tungstenite`, `serde_json`, existing `MacPolicy`, existing `BrowserBackend`/`WsBrowserBackend`, and the current Rust test suite.
---
## Scope Guardrails
- Only change the ws service path.
- Do **not** change `src/lib.rs` pipe runtime behavior.
- Do **not** change pipe handshake semantics.
- Do **not** introduce fake session keys, fake HMAC seeds, or auth bypasses.
- Keep legacy `run_submit_task(...)` available for the pipe entry.
- If a shared layer must change, add a parallel ws-only entry instead of weakening the pipe path.
- Keep the current single-client, single-task service model.
- Do not broaden this slice into browser process launch, queueing, multi-client support, or protocol extensions.
---
## File Structure
### Existing files to modify
- Modify: `src/agent/task_runner.rs` — keep the current pipe-oriented submit path and add the ws-only backend-based submit path.
- Modify: `src/compat/runtime.rs` — add a backend-driven execution entry that accepts `Arc<dyn BrowserBackend>` directly, while keeping the current pipe-oriented public functions behaviorally unchanged.
- Modify: `src/compat/orchestration.rs` — add a matching backend-driven execution entry for orchestration/direct-route flows, while keeping the current pipe-oriented public functions behaviorally unchanged.
- Modify: `src/compat/workflow_executor.rs` — add backend-driven sibling APIs for direct-route/fallback execution, while keeping the current pipe-oriented public functions behaviorally unchanged.
- Modify: `src/service/server.rs` — replace the ws service's `BrowserPipeTool::new(..., vec![])` path with a ws-native `WsClient` + `WsBrowserBackend` path.
- Modify: `src/service/mod.rs` — only if minimal re-export or call-signature changes are needed around the new ws-only submit path.
- Modify: `src/browser/mod.rs` — only if export cleanup is truly needed for the service wiring.
- Reuse: `src/agent/mod.rs` — keep the current pipe routing unchanged unless a tiny internal refactor is strictly needed to reuse shared code.
- Reuse: `src/browser/backend.rs` — existing shared browser backend trait.
- Reuse: `src/browser/ws_backend.rs` — existing ws-native browser backend implementation.
- Reuse: `src/browser/ws_protocol.rs` — existing browser websocket protocol codec.
- Reuse: `src/compat/browser_tool_adapter.rs` — should already speak `BrowserBackend`; only touch if a narrow ws regression forces it.
- Reuse: `src/compat/browser_script_skill_tool.rs` — eval-capability gating already exists; only touch if a narrow ws regression forces it.
- Reuse: `src/lib.rs` — pipe entrypoint must remain behaviorally unchanged; verify only.
### Existing tests to extend
- Modify: `tests/browser_ws_backend_test.rs` — keep existing ws backend coverage green after the service adapter wiring lands.
- Modify: `tests/browser_script_skill_tool_test.rs` — re-verify eval-gating and browser-script behavior after the shared compat/runtime seam changes.
- Modify: `tests/service_ws_session_test.rs` — update service-side unit/session tests to exercise the ws-only submit path.
- Modify: `tests/service_task_flow_test.rs` — add client→service chain coverage proving the ws path reaches a browser websocket and no longer emits `invalid hmac seed`.
- Modify: `src/service/server.rs` under `#[cfg(test)]` if the private service-side ws adapter cannot be exercised from an integration test crate without changing production visibility.
### New files to create
- Create: `tests/browser_ws_service_adapter_test.rs` if the adapter can be exercised through a public seam; otherwise keep the deterministic adapter tests as unit tests in `src/service/server.rs` so no production visibility changes are required.
---
## Task 1: Lock the ws-only behavior with deterministic failing tests
**Files:**
- Create: `tests/browser_ws_service_adapter_test.rs`
- Modify: `tests/service_task_flow_test.rs`
- Reuse: `tests/browser_ws_backend_test.rs`, `src/browser/ws_backend.rs`, `src/service/server.rs`
- [ ] **Step 1: Write the first failing backend/adapter test**
Create `tests/browser_ws_service_adapter_test.rs` with one focused test that directly exercises the ws-service adapter layer, without `sg_claw_client`, without LLM planning, and without natural-language tasks.
Start with the smallest real behavior from the spec:
- fake browser websocket server accepts one connection
- the ws-service adapter builds the same kind of client the service will use
- `WsBrowserBackend.invoke(Action::Navigate, ...)` succeeds on status `0`
- the fake server receives one text frame that decodes as a ws `Navigate` call
- [ ] **Step 2: Run that single new test and watch it fail**
Run:
```bash
cargo test --test browser_ws_service_adapter_test ws_service_backend_navigate_reaches_browser_websocket -- --nocapture
```
Expected: FAIL because the service-side ws client/adapter does not exist yet.
- [ ] **Step 3: Add the second failing deterministic test**
In the same file, add a test for the forced-close path:
- fake browser websocket server accepts a request, then closes/resets the socket before returning a status frame
- observe the error at the `WsBrowserBackend.invoke(...)` call site
- assert the outward error is exactly `PipeError::PipeClosed`
- [ ] **Step 4: Run only the forced-close test and watch it fail**
Run:
```bash
cargo test --test browser_ws_service_adapter_test ws_service_backend_maps_browser_disconnect_to_pipe_closed -- --nocapture
```
Expected: FAIL because the service-side ws client/adapter still does not exist.
- [ ] **Step 5: Add the third failing deterministic test**
In the same file, add a callback-timeout test:
- fake browser websocket server returns status `0`
- it never returns the callback frame
- assert the outward error at `invoke(...)` is exactly `PipeError::Timeout`
Use a tiny response timeout in the backend under test.
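The forced-close and callback-timeout expectations from Steps 3 and 5 can be sketched without a real socket. This is only an illustration under stated assumptions: `PipeError` here is a local stand-in with just the two variants the tests assert, `wait_for_callback` is a hypothetical helper, and a std channel plays the role of the browser websocket (a dropped sender stands in for the closed socket, a silent sender for the missing callback frame). The real `WsBrowserBackend` internals may look different.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::time::Duration;

/// Local stand-in for the crate's error type (assumption: only the two
/// variants these tests assert on are modelled here).
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Timeout,
    PipeClosed,
}

/// Hypothetical helper: wait for one callback frame and map the channel
/// outcome onto the outward error semantics the tests expect.
pub fn wait_for_callback(
    frames: &mpsc::Receiver<String>,
    timeout: Duration,
) -> Result<String, PipeError> {
    match frames.recv_timeout(timeout) {
        Ok(frame) => Ok(frame),
        // Peer is still connected but sent nothing in time.
        Err(RecvTimeoutError::Timeout) => Err(PipeError::Timeout),
        // Peer went away before sending a frame.
        Err(RecvTimeoutError::Disconnected) => Err(PipeError::PipeClosed),
    }
}
```

Keeping the timeout in the tens of milliseconds keeps the timeout test fast, which is the point of "a tiny response timeout" above.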
- [ ] **Step 6: Run only the callback-timeout test and watch it fail**
Run:
```bash
cargo test --test browser_ws_service_adapter_test ws_service_backend_times_out_waiting_for_callback -- --nocapture
```
Expected: FAIL because the service-side ws client/adapter still does not exist.
- [ ] **Step 7: Add the end-to-end failing regression for the auth bug**
Extend `tests/service_task_flow_test.rs` with one client→service integration test that:
- starts a fake browser websocket server
- starts the real `sg_claw` service binary with a temp config pointing `browserWsUrl` to that fake server
- starts the real `sg_claw_client`
- submits the fixed instruction `打开知乎热榜并读取页面主区域文本`
- captures service/client output
- asserts the fake browser server received at least one text frame
- asserts output does **not** contain `invalid hmac seed: session key must not be empty`
Do not assert planner details here. This test only proves the service path no longer goes through the empty-session-key auth failure.
- [ ] **Step 8: Run the integration regression and watch it fail**
Run:
```bash
cargo test --test service_task_flow_test ws_service_submit_task_no_longer_hits_invalid_hmac_seed -- --nocapture
```
Expected: FAIL on the current code because the ws service still constructs `BrowserPipeTool::new(..., vec![])`.
- [ ] **Step 9: Commit the red tests only after they are all in place**
Do not commit yet if any required red test was skipped. The next task will make them pass.
---
## Task 2: Add a ws-only browser-backend execution seam without changing the pipe path
**Files:**
- Modify: `src/agent/task_runner.rs`
- Modify: `src/compat/runtime.rs`
- Modify: `src/compat/orchestration.rs`
- Modify: `src/compat/workflow_executor.rs`
- Reuse: `src/agent/mod.rs`, `src/browser/backend.rs`
- Test: `tests/task_runner_test.rs`, `tests/browser_script_skill_tool_test.rs`
- [ ] **Step 1: Write the smallest failing runner-level ws entry test**
Extend `tests/task_runner_test.rs` with a focused test that proves there is a ws-only submit entry accepting `Arc<dyn BrowserBackend>` and an arbitrary event sink, while the old `run_submit_task(...)` signature still exists for pipe mode.
The test can stay on the missing-LLM-config path so it does not need a real browser call. It should compile only once the new ws-only function exists.
- [ ] **Step 2: Run the targeted runner test and watch it fail**
Run:
```bash
cargo test --test task_runner_test ws_only_submit_task_entry_accepts_browser_backend -- --nocapture
```
Expected: FAIL to compile or FAIL to link because the ws-only entry does not exist yet.
- [ ] **Step 3: Add the new ws-only submit-task entry in `src/agent/task_runner.rs`**
Keep the current pipe function intact:
```rust
pub fn run_submit_task<T: Transport + 'static>(... browser_tool: &BrowserPipeTool<T>, ...)
```
Add a parallel entry for the service path, for example:
```rust
pub fn run_submit_task_with_browser_backend(
    sink: &dyn AgentEventSink,
    browser_backend: Arc<dyn BrowserBackend>,
    context: &AgentRuntimeContext,
    request: SubmitTaskRequest,
) -> Result<(), PipeError>
```
Rules:
- share as much internal logic as possible with the pipe path
- do not change `run_submit_task(...)` behavior
- do not change `src/agent/mod.rs` pipe wiring except, at most, small internal refactoring to reuse common code
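One possible way to honor the "share as much internal logic as possible" rule is a private helper that both public entries call while each keeps its own signature. The sketch below is hypothetical throughout: the traits, `PipeTool`, `FakeWsBackend`, `RecordingSink`, and `run_task_core` are simplified stand-ins, not the real sgclaw types, and the error type is reduced to `String`. The recording sink shows how a test could compare the event order of the two entries.

```rust
use std::sync::{Arc, Mutex};

/// Simplified stand-ins for the real traits (assumption: signatures differ in the crate).
pub trait BrowserBackend: Send + Sync {
    fn invoke(&self, action: &str) -> Result<String, String>;
}
pub trait AgentEventSink {
    fn emit(&self, line: &str);
}

/// Stand-in for the pipe-side tool.
pub struct PipeTool;
impl BrowserBackend for PipeTool {
    fn invoke(&self, action: &str) -> Result<String, String> {
        Ok(format!("pipe:{action}"))
    }
}

/// Stand-in for a ws-native backend.
pub struct FakeWsBackend;
impl BrowserBackend for FakeWsBackend {
    fn invoke(&self, action: &str) -> Result<String, String> {
        Ok(format!("ws:{action}"))
    }
}

/// Sink that records every emitted line so tests can compare ordering.
#[derive(Default)]
pub struct RecordingSink {
    pub lines: Mutex<Vec<String>>,
}
impl AgentEventSink for RecordingSink {
    fn emit(&self, line: &str) {
        self.lines.lock().unwrap().push(line.to_string());
    }
}

/// Private shared core: the only place where task steps are sequenced.
fn run_task_core(
    sink: &dyn AgentEventSink,
    backend: &dyn BrowserBackend,
    instruction: &str,
) -> Result<(), String> {
    sink.emit("task started");
    let output = backend.invoke(instruction)?;
    sink.emit(&output);
    sink.emit("task completed");
    Ok(())
}

/// Pipe entry: public signature stays as it was.
pub fn run_submit_task(
    sink: &dyn AgentEventSink,
    tool: &PipeTool,
    instruction: &str,
) -> Result<(), String> {
    run_task_core(sink, tool, instruction)
}

/// ws-only entry: accepts the shared backend trait object.
pub fn run_submit_task_with_browser_backend(
    sink: &dyn AgentEventSink,
    backend: Arc<dyn BrowserBackend>,
    instruction: &str,
) -> Result<(), String> {
    run_task_core(sink, backend.as_ref(), instruction)
}
```

Whether the real pipe entry can delegate like this without behavior drift is exactly what the pipe regression suite has to confirm; if it cannot, keep the pipe body as-is and share only smaller helpers.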
- [ ] **Step 4: Add a backend-driven compat runtime entry**
In `src/compat/runtime.rs`, add a parallel entry that accepts `Arc<dyn BrowserBackend>` directly instead of `BrowserPipeTool<T>`.
Keep the existing pipe-oriented public function in place.
The backend-driven entry must preserve:
- existing log emission order
- tool names (`superrpa_browser`, `browser_action`)
- existing browser-script tool gating behavior
- existing office/screen tool attachment logic
- existing conversation seeding and provider setup
- [ ] **Step 5: Add backend-driven orchestration and workflow-executor entries**
In `src/compat/orchestration.rs`, add the matching backend-driven entry so direct-route flows and fallback flows can run with `Arc<dyn BrowserBackend>` on the ws path.
In `src/compat/workflow_executor.rs`, add backend-driven sibling APIs for any direct-route/fallback execution that is currently hard-wired to `BrowserPipeTool<T>`.
Keep the existing pipe-oriented orchestration and workflow-executor public functions in place.
- [ ] **Step 6: Route the new ws-only submit entry through the backend-driven compat/orchestration/workflow-executor path**
Inside `src/agent/task_runner.rs`, make the new ws-only submit entry call the new backend-based compat/orchestration functions, while the old pipe entry keeps calling the old pipe-based functions.
This is the core compatibility seam, and it must cover both normal compat-runtime execution and direct-route/fallback workflow execution.
- [ ] **Step 7: Re-run the new runner test**
Run:
```bash
cargo test --test task_runner_test ws_only_submit_task_entry_accepts_browser_backend -- --nocapture
```
Expected: PASS.
- [ ] **Step 8: Re-run the full runner, workflow, and browser-script regressions**
Run:
```bash
cargo test --test task_runner_test --test browser_script_skill_tool_test -- --nocapture
```
Then run the workflow executor unit coverage that protects direct-route behavior:
```bash
cargo test compat::workflow_executor::tests -- --nocapture
```
Expected: all existing runner, workflow, and browser-script tests still pass, proving the pipe-facing path, direct-route behavior, and eval-gating stayed stable.
- [ ] **Step 9: Commit**
```bash
git add src/agent/task_runner.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/task_runner_test.rs tests/browser_script_skill_tool_test.rs
git commit -m "refactor: add ws-only browser backend submit path"
```
---
## Task 3: Replace the ws service's empty-session-key browser tool with a ws-native backend
**Files:**
- Modify: `src/service/server.rs`
- Modify: `src/service/mod.rs` only if minimal re-export or signature cleanup is required
- Modify: `src/browser/mod.rs` only if export cleanup is needed
- Test: `tests/browser_ws_service_adapter_test.rs`
- Reuse: `src/browser/ws_backend.rs`, `src/browser/ws_protocol.rs`
- [ ] **Step 1: Write the smallest failing service-side adapter compile target**
Add a compile-level or construction-level assertion in `tests/browser_ws_service_adapter_test.rs` that the service path can construct the new service-side ws client type used by `serve_client(...)`.
This should fail until the type exists in `src/service/server.rs`.
- [ ] **Step 2: Run the adapter test group and watch the constructor test fail**
Run:
```bash
cargo test --test browser_ws_service_adapter_test -- --nocapture
```
Expected: FAIL because the service-side ws client type does not exist yet.
- [ ] **Step 3: Introduce `ServiceBrowserWsClient` in `src/service/server.rs`**
Create a narrow client type that owns the real websocket connection to `browser_ws_url` and implements `WsClient`.
Required responsibilities only:
- lazily connect on first use
- send raw text frames
- receive raw text frames with timeout
- map close/reset to exactly `PipeError::PipeClosed`
- map connect failure to exactly `PipeError::Protocol("browser websocket connect failed: ...")`
- map timeouts to exactly `PipeError::Timeout`
Do **not** duplicate `WsBrowserBackend` responsibilities here.
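The error-mapping responsibilities above could be isolated in two small functions so they can be unit-tested without a socket. This is a sketch, assuming a local stand-in `PipeError` and hypothetical helper names `map_io_error` and `map_connect_error`; the fallback message `browser websocket io failed` is also an assumption, since the plan only fixes the connect-failure prefix. Both `WouldBlock` and `TimedOut` are treated as timeouts because a socket read timeout surfaces as one or the other depending on the platform.

```rust
use std::io::{self, ErrorKind};

/// Local stand-in for the crate's error type (assumption: reduced variant set).
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Timeout,
    PipeClosed,
    Protocol(String),
}

/// Hypothetical helper: classify an I/O error raised on an established connection.
pub fn map_io_error(err: &io::Error) -> PipeError {
    match err.kind() {
        // Read timeouts appear as WouldBlock on Unix and TimedOut on Windows.
        ErrorKind::WouldBlock | ErrorKind::TimedOut => PipeError::Timeout,
        // Any flavour of "the peer is gone" collapses to PipeClosed.
        ErrorKind::ConnectionReset
        | ErrorKind::ConnectionAborted
        | ErrorKind::BrokenPipe
        | ErrorKind::UnexpectedEof => PipeError::PipeClosed,
        // Assumed fallback; the plan does not specify this message.
        _ => PipeError::Protocol(format!("browser websocket io failed: {err}")),
    }
}

/// Hypothetical helper: wrap a failure that happened while connecting.
pub fn map_connect_error(err: &io::Error) -> PipeError {
    PipeError::Protocol(format!("browser websocket connect failed: {err}"))
}
```

Keeping the mapping separate from the connection code leaves `ServiceBrowserWsClient` itself limited to lifecycle and raw frame I/O.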
- [ ] **Step 4: Remove ws-path use of `BrowserPipeTool::new(..., vec![])`**
In `serve_client(...)`, replace this shape:
```rust
let transport = Arc::new(ServiceBrowserTransport::new(...));
let browser_tool = BrowserPipeTool::new(transport.clone(), mac_policy.clone(), vec![]);
```
with the ws-native shape:
```rust
let ws_client = Arc::new(ServiceBrowserWsClient::new(...));
let browser_backend: Arc<dyn BrowserBackend> = Arc::new(
    WsBrowserBackend::new(ws_client, mac_policy.clone(), initial_request_url(...))
        .with_response_timeout(BROWSER_RESPONSE_TIMEOUT),
);
```
Then route the task through the new ws-only submit entry from Task 2.
- [ ] **Step 5: Delete or narrow old ws-path transport code that duplicated protocol handling**
Remove the service-only callback polling / response queue logic that existed solely to feed `BrowserPipeTool`.
Keep only what is still needed for:
- service client websocket I/O (`sg_claw_client` ↔ `sg_claw`)
- browser websocket I/O (`sg_claw` ↔ `browser_ws_url`)
Do not leave two competing ws protocol implementations in `src/service/server.rs`.
- [ ] **Step 6: Re-run deterministic adapter/backend tests**
Run:
```bash
cargo test --test browser_ws_service_adapter_test -- --nocapture
```
Expected: PASS, including:
- navigate success
- disconnect => `PipeError::PipeClosed`
- callback timeout => `PipeError::Timeout`
- [ ] **Step 7: Re-run existing ws backend tests**
Run:
```bash
cargo test --test browser_ws_backend_test -- --nocapture
```
Expected: PASS, confirming the service adapter change did not break the existing ws backend semantics.
- [ ] **Step 8: Commit**
```bash
git add src/service/server.rs src/service/mod.rs src/browser/mod.rs tests/browser_ws_service_adapter_test.rs
git commit -m "feat: switch ws service to ws-native browser backend"
```
---
## Task 4: Prove the auth bug is gone and pipe mode is unchanged
**Files:**
- Modify: `tests/service_ws_session_test.rs`
- Modify: `tests/service_task_flow_test.rs`
- Reuse: `src/lib.rs`, `src/service/mod.rs`, `src/compat/workflow_executor.rs`
- [ ] **Step 1: Update service session tests for the new ws-only call path**
Adjust any service session tests that still call `handle_client_message(...)` through the old ws-path `BrowserPipeTool` assumption.
Prefer one of these narrow approaches:
- add a backend-based sibling of `handle_client_message(...)` (Rust has no function overloading) used only in ws tests, or
- keep `handle_client_message(...)` pipe-oriented and test the ws path through `serve_client(...)` and the real service binary instead
Choose the option that changes the fewest existing tests and leaves the pipe path simplest.
- [ ] **Step 2: Run the focused service session file**
Run:
```bash
cargo test --test service_ws_session_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 3: Make the auth-regression integration test pass**
Re-run the exact end-to-end regression from Task 1:
```bash
cargo test --test service_task_flow_test ws_service_submit_task_no_longer_hits_invalid_hmac_seed -- --nocapture
```
Expected: PASS, with evidence that:
- the fake browser websocket server received at least one frame
- output no longer contains `invalid hmac seed: session key must not be empty`
- [ ] **Step 4: Add one explicit mandatory assertion for browser websocket connect failures**
Add one focused assertion that a browser websocket connect failure surfaces outward as:
```rust
PipeError::Protocol("browser websocket connect failed: ...")
```
Do not leave this semantic implied.
- [ ] **Step 5: Add one explicit ws direct-route regression**
Add one focused regression that proves a ws-backed browser backend can traverse a direct-route/fallback path that currently flows through `src/compat/workflow_executor.rs`.
Keep it deterministic and narrow. Prefer a fake backend plus direct function invocation over a planner-dependent natural-language end-to-end test.
- [ ] **Step 6: Run the ws-focused regression suite**
Run:
```bash
cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
Then run the workflow-executor direct-route coverage:
```bash
cargo test compat::workflow_executor::tests -- --nocapture
```
Expected: all ws-focused and direct-route workflow tests pass.
- [ ] **Step 7: Run the required pipe and browser-script regression suite**
Run:
```bash
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```
Expected: all required pipe and browser-script regressions pass unchanged.
- [ ] **Step 8: Run the full relevant verification sweep**
Run:
```bash
cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```
Then run:
```bash
cargo test compat::workflow_executor::tests -- --nocapture
```
Expected: full mixed ws+pipe verification passes in fresh runs.
- [ ] **Step 9: Build the affected binaries**
Run:
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: all three binaries compile.
- [ ] **Step 10: Commit**
```bash
git add tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/browser_ws_service_adapter_test.rs src/compat/workflow_executor.rs
git commit -m "test: verify ws auth replacement and pipe regressions"
```
---
## Task 5: Manual smoke verification against the real browser
**Files:**
- Reuse only: no code changes unless a verified bug is found during smoke work
- [ ] **Step 1: Start the real browser websocket target**
Confirm the real sgBrowser endpoint is reachable at the configured `browserWsUrl`.
- [ ] **Step 2: Start the real ws service**
Run:
```bash
cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```
Expected: service prints the resolved listen address and browser websocket URL.
- [ ] **Step 3: Run the minimal browser task through the real client**
Run from a separate terminal with UTF-8-safe input:
```bash
cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```
Submit:
```text
打开知乎热榜并读取页面主区域文本
```
Expected:
- browser actions start executing
- no `invalid hmac seed: session key must not be empty`
- one final completion is returned
- [ ] **Step 4: Run the old Zhihu skill smoke**
Submit:
```text
读取知乎热榜数据,并导出 excel 文件
```
Expected: the task enters the real browser action path instead of dying at auth initialization.
- [ ] **Step 5: Re-check the legacy pipe entry without modifying it**
Run:
```bash
cargo run
```
Only verify startup behavior appropriate for the current pipe environment. Do not change pipe code during this smoke step.
- [ ] **Step 6: If a smoke failure appears, stop and debug before editing**
Any failure found here must be handled with:
- a fresh reproducer
- a failing automated test if feasible
- the smallest scoped fix
Do not fold speculative smoke fixes into this slice.
---
## Verification Checklist
### Deterministic ws-only tests
```bash
cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test -- --nocapture
```
Expected: ws-native backend and service adapter semantics are green without LLM/planner dependencies.
### Client→service ws chain tests
```bash
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
Expected: the ws service path reaches the browser websocket and no longer emits the empty-session-key auth failure.
### Required pipe and browser-script regressions
```bash
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```
Expected: legacy pipe behavior and browser-script eval-gating remain unchanged.
### Binary build verification
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: all affected binaries compile.
### Manual end-to-end verification
- real sgBrowser running at configured `browserWsUrl`
- `cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"`
- `cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"`
- run the Zhihu minimal task
- run the old Zhihu export task
- verify no `invalid hmac seed` appears
- verify pipe startup still behaves as before
---
## Notes for Implementation
- Keep the current pipe bootstrap in `src/lib.rs` untouched.
- Prefer adding ws-only functions over changing existing pipe signatures.
- Reuse `WsBrowserBackend` for protocol semantics; do not re-implement callback handling inside the service.
- Keep `ServiceBrowserWsClient` narrow: connection lifecycle + raw websocket I/O only.
- Preserve exact outward error semantics from the spec:
  - connect failure => `PipeError::Protocol("browser websocket connect failed: ...")`
  - non-zero status => `PipeError::Protocol("browser returned non-zero status: ...")`
  - callback timeout => `PipeError::Timeout`
  - close/reset => `PipeError::PipeClosed`
- Do not claim success until the mixed ws+pipe verification commands have been run fresh.


@@ -0,0 +1,482 @@
# WS Browser Bridge Path Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Replace the raw-ws-direct browser execution assumption with a bridge-backed browser integration path that matches the validated FunctionsUI / BrowserAction / CommandRouter model while preserving existing pipe behavior.
**Architecture:** Keep the current Rust-side browser orchestration flow centered on `Arc<dyn BrowserBackend>`, but stop treating `WsBrowserBackend` as the real production browser surface. Model the validated bridge as two explicit layers: Layer 1 session/lifecycle calls (`sgclawConnect`, `sgclawStart`, `sgclawStop`, `sgclawSubmitTask`) and Layer 2 browser-action execution (`window.sgFunctionsUI(...)`, `window.BrowserAction(...)`, `CommandRouter`). The new backend targets Layer 2 only through a narrow repo-local `BridgeActionTransport` seam, while lifecycle/session concerns stay separate from per-action browser execution.
**Tech Stack:** Rust 2021, existing `BrowserBackend` abstraction, compat/runtime/orchestration stack, current service/task runner integration, existing bridge-oriented design docs, existing Rust unit/integration test suite.
---
## Scope Guardrails
- Do **not** continue extending raw external sgBrowser websocket business-frame handling as the mainline path.
- Do **not** modify `src/lib.rs`, pipe handshake behavior, or the working `BrowserPipeTool` path.
- Do **not** invent a parallel browser-command contract unrelated to the documented bridge surface.
- Do **not** rewrite the whole compat/runtime stack when a narrow adapter will do.
- Do **not** assume access to the full SuperRPA browser-host codebase from this repository; encode the validated contract at the nearest seam available here.
---
## File Structure
### Existing files to modify
- Modify: `src/browser/mod.rs`
  - export the new bridge contract/transport/backend modules explicitly
- Modify: `src/browser/backend.rs`
  - only if a tiny shared helper or trait documentation update is needed for the new bridge-backed backend
- Modify: `src/compat/browser_tool_adapter.rs`
  - ensure existing browser action mapping remains reusable with the new backend implementation
- Modify: `src/compat/runtime.rs`
  - wire the bridge-backed browser backend into the ws service/browser execution path without changing the pipe path
- Modify: `src/compat/orchestration.rs`
  - only where browser backend wiring requires the bridge-backed path to flow through orchestration
- Modify: `src/compat/workflow_executor.rs`
  - preserve direct-route/fallback use of `BrowserBackend` when the backend is bridge-backed instead of websocket-backed
- Modify: `src/service/server.rs`
  - replace the current real-browser execution assumption with bridge-backend construction plus a repo-local bridge transport provider seam for the relevant service path
- Modify: `tests/compat_browser_tool_test.rs`
  - extend browser tool mapping coverage if needed for bridge-backed execution
- Modify: `tests/service_task_flow_test.rs`
  - replace raw-ws-direct expectations with bridge-path expectations where appropriate
- Modify: `tests/service_ws_session_test.rs`
  - update service-side tests if they currently assume the real browser path is raw websocket driven
### New files to create
- Create: `src/browser/bridge_contract.rs`
  - narrow, explicit contract types that keep lifecycle/session bridge calls separate from browser-action execution requests/replies
- Create: `src/browser/bridge_transport.rs`
  - repo-local `BridgeActionTransport` seam used by the backend and injected by service/runtime wiring
- Create: `src/browser/bridge_backend.rs`
  - new `BrowserBackend` implementation that maps browser actions onto the Layer-2 bridge action contract through `BridgeActionTransport`
- Create: `tests/browser_bridge_backend_test.rs`
  - deterministic unit tests for action-to-bridge mapping and reply/error normalization using a fake bridge transport
- Create: `tests/browser_bridge_contract_test.rs`
  - narrow tests proving the two bridge layers stay explicit and browser-action requests remain semantic rather than raw-websocket-shaped
### Evidence files to consult during implementation
- Read: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
- Read: `frontend/archive/sgClaw验证-已归档/testRunner.js`
- Read: `docs/superpowers/specs/2026-03-25-superrpa-sgclaw-browser-control-design.md`
- Read: `docs/archive/项目管理与排期/协作时间表.md`
- Read: `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`
---
## Task 1: Lock the bridge contract in deterministic tests before adding the backend
**Files:**
- Create: `src/browser/bridge_contract.rs`
- Create: `tests/browser_bridge_contract_test.rs`
- Reuse as design evidence:
  - `frontend/archive/sgClaw验证-已归档/testRunner.js`
  - `docs/archive/项目管理与排期/协作时间表.md`
  - `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`
- [ ] **Step 1: Write the first failing contract test for named bridge calls**
Create `tests/browser_bridge_contract_test.rs` with one focused test that encodes the bridge naming expectations already evidenced in the repo.
Start with a test shape like:
```rust
#[test]
fn bridge_contract_names_match_documented_bridge_surface() {
    // assert the contract contains the exact bridge action names
}
```
Required expectations:
- `sgclawConnect`
- `sgclawStart`
- `sgclawStop`
- `sgclawSubmitTask`
- these names live in an explicit lifecycle/session contract type, not in the browser-action request type
Do **not** invent additional action names in this first test.
- [ ] **Step 2: Run the single contract test and verify it fails**
Run:
```bash
cargo test --test browser_bridge_contract_test bridge_contract_names_match_documented_bridge_surface -- --nocapture
```
Expected: FAIL because `src/browser/bridge_contract.rs` does not exist yet.
- [ ] **Step 3: Add the second failing contract test for browser-action request shaping**
In the same file, add one focused test proving the bridge contract can represent a browser action request without leaking raw websocket business-frame semantics.
Test shape:
```rust
#[test]
fn bridge_contract_represents_browser_action_requests_without_ws_business_frames() {
    // create a click/navigate/getText style action request and assert shape
}
```
Required assertions:
- request shape identifies the intended browser action semantically
- request shape is distinct from the lifecycle/session bridge call type
- request shape does **not** embed `sgBrowerserOpenPage`, `callBackJsToCpp`, or other raw websocket business-frame names
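The "does not embed raw frame names" assertion could be backed by a small predicate like the one below. This is a sketch under assumptions: the struct is a dependency-free stand-in whose `params` is a plain `String` (the recommended shape in this plan uses `serde_json::Value`), and `RAW_WS_FRAME_NAMES` and `is_semantic` are hypothetical names. Only the two frame names quoted in the assertions above are listed.

```rust
/// Stand-in for the browser-action request (assumption: `params` simplified to a String).
#[derive(Debug, Clone, PartialEq)]
pub struct BridgeBrowserActionRequest {
    pub action: String,
    pub params: String,
    pub expected_domain: String,
}

/// Raw websocket business-frame names that must never leak into a request.
pub const RAW_WS_FRAME_NAMES: [&str; 2] = ["sgBrowerserOpenPage", "callBackJsToCpp"];

/// Hypothetical predicate: true when no field carries a raw frame name.
pub fn is_semantic(request: &BridgeBrowserActionRequest) -> bool {
    let fields = [&request.action, &request.params, &request.expected_domain];
    !fields
        .iter()
        .any(|field| RAW_WS_FRAME_NAMES.iter().any(|name| field.contains(*name)))
}
```

A contract test can then build one navigate-style request and one deliberately leaky request and assert that the predicate accepts the first and rejects the second.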
- [ ] **Step 4: Run the second contract test and verify it fails**
Run:
```bash
cargo test --test browser_bridge_contract_test bridge_contract_represents_browser_action_requests_without_ws_business_frames -- --nocapture
```
Expected: FAIL because the bridge contract does not exist yet.
- [ ] **Step 5: Implement the minimal bridge contract module**
Create `src/browser/bridge_contract.rs` with only the types needed by the tests.
Recommended shape:
```rust
pub enum BridgeLifecycleCall {
    Connect,
    Start,
    Stop,
    SubmitTask,
}

impl BridgeLifecycleCall {
    pub fn bridge_name(&self) -> &'static str {
        match self {
            Self::Connect => "sgclawConnect",
            Self::Start => "sgclawStart",
            Self::Stop => "sgclawStop",
            Self::SubmitTask => "sgclawSubmitTask",
        }
    }
}

pub struct BridgeBrowserActionRequest {
    pub action: String,
    pub params: serde_json::Value,
    pub expected_domain: String,
}
```
Rules:
- model the documented bridge/lifecycle naming explicitly
- keep the browser action request semantic, not websocket-frame-shaped
- keep the module small and repository-local
- [ ] **Step 6: Re-run the contract tests**
Run:
```bash
cargo test --test browser_bridge_contract_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 7: Commit**
```bash
git add src/browser/bridge_contract.rs tests/browser_bridge_contract_test.rs
git commit -m "test: define sgClaw bridge contract surface"
```
---
## Task 2: Add the repo-local transport seam and bridge-backed `BrowserBackend`
**Files:**
- Create: `src/browser/bridge_transport.rs`
- Create: `src/browser/bridge_backend.rs`
- Create: `tests/browser_bridge_backend_test.rs`
- Reuse: `src/browser/backend.rs`
- Reuse: `src/browser/bridge_contract.rs`
- Reuse: `src/compat/browser_tool_adapter.rs`
- [ ] **Step 1: Write the first failing backend test for action mapping**
Create `tests/browser_bridge_backend_test.rs` with one focused test proving a `BrowserBackend` action is translated into the bridge contract request shape.
Start with a narrow action such as `Action::Navigate`.
Required assertions:
- `Action::Navigate` becomes one semantic bridge browser-action request
- the request preserves action parameters and expected domain
- the test does **not** assert any raw websocket payload strings
- [ ] **Step 2: Run the first backend test and verify it fails**
Run:
```bash
cargo test --test browser_bridge_backend_test bridge_backend_maps_navigate_to_bridge_action_request -- --nocapture
```
Expected: FAIL because `src/browser/bridge_backend.rs` does not exist yet.
- [ ] **Step 3: Add the second failing backend test for reply normalization**
Add one focused test proving the backend can normalize a successful bridge reply into the existing `CommandOutput` shape expected by `BrowserBackend` callers.
- [ ] **Step 4: Run the second backend test and verify it fails**
Run:
```bash
cargo test --test browser_bridge_backend_test bridge_backend_normalizes_successful_bridge_reply -- --nocapture
```
Expected: FAIL because the backend does not exist yet.
- [ ] **Step 5: Add the third failing backend test for bridge-side errors**
Add one focused test proving a bridge-side error normalizes into the correct outward `PipeError` semantics for backend callers.
- [ ] **Step 6: Run the error-path test and verify it fails**
Run:
```bash
cargo test --test browser_bridge_backend_test bridge_backend_maps_bridge_failure_to_pipe_error -- --nocapture
```
Expected: FAIL because the backend does not exist yet.
- [ ] **Step 7: Implement the minimal transport seam and bridge backend**
Create `src/browser/bridge_transport.rs` and `src/browser/bridge_backend.rs`.
The transport seam must:
- define the repo-local `BridgeActionTransport` contract used for Layer-2 browser-action execution only
- accept semantic `BridgeBrowserActionRequest` values and return semantic success/error replies
- remain small, explicit, and easy to fake in tests
The backend must:
- implement the existing `BrowserBackend` trait
- translate supported actions into `BridgeBrowserActionRequest`
- depend on `BridgeActionTransport` instead of raw websocket payload building
- normalize success/error replies into existing backend-facing result types
Rules:
- do not embed raw websocket business-frame names
- do not change `BrowserBackend` semantics for existing callers
- do not pull lifecycle/session bridge calls into this backend layer
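The seam and backend described in this step could be sketched roughly as follows. Everything here is an assumption made for illustration: the `execute` method name, the `BridgeActionReply` enum, the `"navigate"` action string, the `"url"` parameter key, and the simplified `CommandOutput` / `PipeError` stand-ins. `params` uses a `BTreeMap` only to keep the sketch free of external crates; the plan's request type uses `serde_json::Value`.

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;

/// Stand-in for the semantic request (the plan uses `serde_json::Value` for `params`).
#[derive(Debug, Clone, PartialEq)]
pub struct BridgeBrowserActionRequest {
    pub action: String,
    pub params: BTreeMap<String, String>,
    pub expected_domain: String,
}

/// Hypothetical semantic reply returned by the bridge.
#[derive(Debug, Clone, PartialEq)]
pub enum BridgeActionReply {
    Success { data: String },
    Failure { message: String },
}

/// Hypothetical seam: per-action browser execution only, no lifecycle calls.
pub trait BridgeActionTransport {
    fn execute(&self, request: &BridgeBrowserActionRequest) -> BridgeActionReply;
}

/// Simplified stand-ins for the repository's existing result types.
#[derive(Debug, PartialEq)]
pub struct CommandOutput {
    pub success: bool,
    pub data: String,
}

#[derive(Debug, PartialEq)]
pub enum PipeError {
    Protocol(String),
}

/// Backend that depends on the seam instead of building websocket payloads.
pub struct BridgeBackend<T: BridgeActionTransport> {
    pub transport: T,
}

impl<T: BridgeActionTransport> BridgeBackend<T> {
    pub fn new(transport: T) -> Self {
        Self { transport }
    }

    /// Maps a navigate action to one semantic request and normalizes the reply.
    pub fn navigate(&self, url: &str, expected_domain: &str) -> Result<CommandOutput, PipeError> {
        let mut params = BTreeMap::new();
        params.insert("url".to_string(), url.to_string());
        let request = BridgeBrowserActionRequest {
            action: "navigate".to_string(),
            params,
            expected_domain: expected_domain.to_string(),
        };
        match self.transport.execute(&request) {
            BridgeActionReply::Success { data } => Ok(CommandOutput { success: true, data }),
            BridgeActionReply::Failure { message } => Err(PipeError::Protocol(message)),
        }
    }
}

/// Test fake: records every request and answers with a canned reply.
pub struct FakeTransport {
    pub reply: BridgeActionReply,
    pub seen: RefCell<Vec<BridgeBrowserActionRequest>>,
}

impl BridgeActionTransport for FakeTransport {
    fn execute(&self, request: &BridgeBrowserActionRequest) -> BridgeActionReply {
        self.seen.borrow_mut().push(request.clone());
        self.reply.clone()
    }
}
```

The fake lets the three backend tests assert on the recorded request and on the normalized result without touching any raw websocket payload strings.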
- [ ] **Step 8: Re-run the bridge backend tests**
Run:
```bash
cargo test --test browser_bridge_backend_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 9: Re-run browser tool adapter coverage**
Run:
```bash
cargo test --test compat_browser_tool_test -- --nocapture
```
Expected: PASS, proving the existing browser action mapping remains reusable with the new backend.
- [ ] **Step 10: Commit**
```bash
git add src/browser/bridge_transport.rs src/browser/bridge_backend.rs tests/browser_bridge_backend_test.rs src/compat/browser_tool_adapter.rs src/browser/mod.rs
git commit -m "feat: add bridge-backed browser backend"
```
---
## Task 3: Wire the bridge-backed backend into the real-browser service path
**Files:**
- Modify: `src/service/server.rs`
- Modify: `src/compat/runtime.rs`
- Modify: `src/compat/orchestration.rs`
- Modify: `src/compat/workflow_executor.rs`
- Modify: `tests/service_task_flow_test.rs`
- Modify: `tests/service_ws_session_test.rs`
- Reuse: `src/browser/bridge_backend.rs`
- Reuse: `src/browser/bridge_contract.rs`
- Reuse: `src/browser/bridge_transport.rs`
- [ ] **Step 1: Write the first failing service-path test for bridge backend construction**
Add or update one focused service test proving the real-browser execution path constructs and uses the bridge-backed backend instead of the raw websocket backend assumption.
The test should observe backend selection at the nearest possible seam.
- [ ] **Step 2: Run the focused service test and verify it fails**
Run the narrowest affected service test command.
Expected: FAIL because the service path is not wired to the bridge backend yet.
- [ ] **Step 3: Add the minimal service/runtime wiring**
Change the relevant service/browser execution path so it constructs the new bridge-backed backend, injects the repo-local bridge transport provider at the nearest seam, and passes the backend through the existing runtime/orchestration flow.
Rules:
- keep the pipe path unchanged
- keep changes localized
- keep lifecycle/session bridge handling separate from per-action browser execution
- preserve existing runtime log and task flow behavior where possible
- [ ] **Step 4: Add one direct-route/fallback regression**
Add one focused regression proving a bridge-backed backend still works through the direct-route or fallback path exercised by `src/compat/workflow_executor.rs`.
- [ ] **Step 5: Run the bridge-focused service tests**
Run:
```bash
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 6: Re-run workflow/runtime regressions**
Run:
```bash
cargo test compat::workflow_executor::tests -- --nocapture
cargo test --test compat_browser_tool_test --test browser_script_skill_tool_test --test task_runner_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 7: Commit**
```bash
git add src/service/server.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/service_ws_session_test.rs tests/service_task_flow_test.rs
git commit -m "refactor: route real browser path through bridge backend"
```
---
## Task 4: Verify bridge-path behavior without pipe regression
**Files:**
- Reuse existing files only, unless a failing test proves a minimal fix is still needed
- [ ] **Step 1: Run bridge/backend unit coverage**
Run:
```bash
cargo test --test browser_bridge_contract_test --test browser_bridge_backend_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 2: Run service/runtime bridge-path regressions**
Run:
```bash
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 3: Run required pipe regressions**
Run:
```bash
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 4: Build the affected binaries**
Run:
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: PASS.
- [ ] **Step 5: Stop if any regression points back to raw websocket assumptions**
If any test still encodes raw external websocket business-frame assumptions as the real-browser path, update that test to the bridge-backed design rather than patching production code to satisfy the old assumption.
- [ ] **Step 6: Commit**
```bash
git add tests/browser_bridge_contract_test.rs tests/browser_bridge_backend_test.rs tests/service_ws_session_test.rs tests/service_task_flow_test.rs
git commit -m "test: verify bridge path and preserve pipe behavior"
```
---
## Verification Checklist
### Bridge contract tests
```bash
cargo test --test browser_bridge_contract_test -- --nocapture
```
Expected: documented bridge names and semantic browser-action request shaping are locked.
### Bridge backend tests
```bash
cargo test --test browser_bridge_backend_test -- --nocapture
```
Expected: backend action mapping and reply/error normalization are green.
### Service/runtime integration tests
```bash
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
cargo test compat::workflow_executor::tests -- --nocapture
```
Expected: real-browser path uses the bridge-backed backend and direct-route/fallback behavior remains intact.
### Pipe regressions
```bash
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```
Expected: pipe path remains unchanged.
### Binary build verification
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: affected binaries compile.
---
## Notes for Implementation
- The websocket probe work stays in the repository as diagnostic tooling; do not repurpose it into the bridge adapter.
- `docs/_tmp_sgbrowser_ws_probe_transcript.md` is evidence that rejected the raw-ws-direct assumption, not a contract to keep satisfying.
- Favor one narrow bridge-backed backend over broad runtime rewrites.
- If the nearest repo-local seam is still slightly abstract because the external SuperRPA host code is outside this repository, make that abstraction explicit and test it rather than guessing hidden behavior.


@@ -0,0 +1,566 @@
# WS Browser Integration Surface Correction Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Replace the unvalidated raw-ws-direct assumption with an evidence-backed decision: either prove a minimal sgBrowser bootstrap sequence for raw websocket control, or pivot to the real browser bridge surface.
**Architecture:** Treat the existing ws-native backend as a protocol/testing asset, not as a validated production integration surface. First build a narrow probe/validation harness that can run candidate bootstrap sequences and capture exact live transcripts from the real endpoint. Then branch decisively: if a reproducible bootstrap sequence yields real status/callback frames, implement that bootstrap path; otherwise stop raw-ws speculation and write the bridge-first implementation slice.
**Tech Stack:** Rust 2021, existing `src/browser/ws_protocol.rs` / `src/browser/ws_backend.rs`, service websocket infrastructure, `tungstenite`, `serde_json`, current Rust test suite, local sgBrowser websocket documentation.
---
## Scope Guardrails
- Do **not** add more speculative production fixes to `src/service/server.rs` just to “try one more thing.”
- Do **not** claim raw websocket is the supported path unless the live probe transcript proves it.
- Do **not** modify `src/lib.rs`, pipe handshake behavior, or the pipe browser-tool path.
- Do **not** implement both the bootstrap architecture and the bridge architecture in the same branch.
- Keep the ws-native code unless and until the bridge decision makes specific pieces obsolete.
- Prefer a dedicated probe surface over embedding validation logic into production request handling.
---
## File Structure
### Existing files to modify
- Modify: `src/browser/mod.rs`
  - export the new `ws_probe` module so both tests and the probe binary use the same library surface
- Modify: `src/browser/ws_protocol.rs`
  - only if a tiny helper extraction is required for test/probe readability
  - do not change existing protocol semantics in this slice
- Modify: `tests/browser_ws_protocol_test.rs`
  - add deterministic coverage for any extracted helper used by the probe harness
### New files to create
- Create: `src/bin/sgbrowser_ws_probe.rs`
  - standalone diagnostic binary for ordered frame-script probing against a live sgBrowser websocket endpoint
- Create: `src/browser/ws_probe.rs`
  - small reusable probe/transcript module, if needed, to keep the binary and tests focused
- Create: `tests/browser_ws_probe_test.rs`
  - deterministic fake-server tests for transcript capture, timeout reporting, and scripted sequence execution
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md` **only if Option B wins after probing**
  - follow-up bridge design, not part of the initial coding slice
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md` **only if Option B wins after probing**
  - follow-up bridge implementation plan, not part of the initial coding slice
- Create: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
  - temporary evidence artifact capturing the real endpoint probe matrix and outcomes
### Files deliberately not changed in the initial slice
- `src/lib.rs`
- `src/agent/task_runner.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
- `src/browser/ws_backend.rs`
Unless the probe results prove a real bootstrap contract, these files stay untouched.
---
## Task 1: Build a deterministic websocket probe harness before touching production behavior
**Files:**
- Create: `src/browser/ws_probe.rs`
- Create: `tests/browser_ws_probe_test.rs`
- Reuse: `src/browser/ws_protocol.rs`
- [ ] **Step 1: Write the first failing transcript test**
Create `tests/browser_ws_probe_test.rs` with one focused fake-server test that executes a scripted sequence of outgoing text frames and records all received text frames in order.
Start with this shape:
```rust
#[test]
fn probe_records_welcome_then_silence_transcript() {
    // fake server sends one welcome frame and then stays silent
    // probe result should preserve that exact transcript and mark timeout/silence explicitly
}
```
Required assertions:
- the probe can connect to the fake websocket server
- it can send a scripted first frame
- it records the first inbound text frame exactly
- it returns a transcript/result object that distinguishes timeout from protocol parse failure
- [ ] **Step 2: Run the single new test and verify it fails**
Run:
```bash
cargo test --test browser_ws_probe_test probe_records_welcome_then_silence_transcript -- --nocapture
```
Expected: FAIL because the probe harness does not exist yet.
- [ ] **Step 3: Add the second failing probe test for ordered multi-step scripts**
In the same file, add a test proving the harness can run multiple outgoing frames in a fixed order and keep the transcript segmented by step.
Test shape:
```rust
#[test]
fn probe_runs_ordered_frame_script_and_records_per_step_results() {
    // send bootstrap frame 1, bootstrap frame 2, then minimal action
    // fake server replies differently at each step
    // probe result preserves exact order and outcomes
}
```
Required assertions:
- outgoing frames are sent in the configured order
- inbound frames are attached to the correct step
- the probe can stop the sequence on timeout/close if configured
- [ ] **Step 4: Run the ordered-script test and verify it fails**
Run:
```bash
cargo test --test browser_ws_probe_test probe_runs_ordered_frame_script_and_records_per_step_results -- --nocapture
```
Expected: FAIL because the probe harness does not exist yet.
- [ ] **Step 5: Add the third failing probe test for close/reset visibility**
Add one focused fake-server test that closes the connection after a script step and asserts the transcript reports close/reset rather than generic timeout.
- [ ] **Step 6: Run the close/reset test and verify it fails**
Run:
```bash
cargo test --test browser_ws_probe_test probe_reports_socket_close_separately_from_timeout -- --nocapture
```
Expected: FAIL because the probe harness does not exist yet.
- [ ] **Step 7: Implement the minimal probe module**
Create `src/browser/ws_probe.rs` with only the types and behavior needed by the tests.
Recommended shape:
```rust
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
    ConnectFailed(String),
}

pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

pub fn run_probe_script(/* ws url, timeout, steps */) -> Result<Vec<ProbeStepResult>, ProbeError> {
    // connect, send ordered frames, collect exact transcript
}
```
Rules:
- do not parse business meaning yet
- do not mix this into normal task execution
- preserve exact raw text frames in transcript results
- keep the module small and diagnostic-oriented
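The per-step loop can be sketched without a real socket by putting a small seam between the loop and the websocket. The `FrameIo` trait, `ReadEvent` enum, `ScriptedIo` fake, and `run_steps` function below are all hypothetical names; the real module would sit on top of `tungstenite` and would also need connect and timeout handling. As a simplification, this sketch reads at most one frame per step and stops only on close.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, PartialEq)]
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
    ConnectFailed(String),
}

#[derive(Debug, Clone, PartialEq)]
pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

/// What one read attempt can yield (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum ReadEvent {
    Text(String),
    Timeout,
    Closed,
}

/// Hypothetical seam over the websocket so the loop can be tested with a fake.
pub trait FrameIo {
    fn send_text(&mut self, payload: &str);
    fn read_text(&mut self) -> ReadEvent;
}

/// Sends steps in order, records raw frames untouched, and stops once the peer closes.
pub fn run_steps(io: &mut dyn FrameIo, steps: &[ProbeStep]) -> Vec<ProbeStepResult> {
    let mut results = Vec::new();
    for step in steps {
        io.send_text(&step.payload);
        let outcome = if step.expect_reply {
            match io.read_text() {
                ReadEvent::Text(frame) => ProbeOutcome::Received(vec![frame]),
                ReadEvent::Timeout => ProbeOutcome::TimedOut,
                ReadEvent::Closed => ProbeOutcome::Closed,
            }
        } else {
            ProbeOutcome::Received(Vec::new())
        };
        let closed = outcome == ProbeOutcome::Closed;
        results.push(ProbeStepResult {
            label: step.label.clone(),
            sent: step.payload.clone(),
            outcome,
        });
        if closed {
            break; // nothing further can be sent on a closed socket
        }
    }
    results
}

/// Scripted fake: replays queued events and records what was sent.
pub struct ScriptedIo {
    pub sent: Vec<String>,
    pub replies: VecDeque<ReadEvent>,
}

impl FrameIo for ScriptedIo {
    fn send_text(&mut self, payload: &str) {
        self.sent.push(payload.to_string());
    }

    fn read_text(&mut self) -> ReadEvent {
        // An exhausted script behaves like a silent server.
        self.replies.pop_front().unwrap_or(ReadEvent::Timeout)
    }
}
```

Keeping `TimedOut` and `Closed` as separate variants is what lets the close/reset test distinguish a dropped connection from plain silence.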
- [ ] **Step 8: Re-run the new probe tests**
Run:
```bash
cargo test --test browser_ws_probe_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 9: Commit**
```bash
git add src/browser/ws_probe.rs tests/browser_ws_probe_test.rs
git commit -m "test: add sgBrowser websocket probe harness"
```
---
## Task 2: Add a standalone probe binary for live sgBrowser evidence collection
**Files:**
- Create: `src/bin/sgbrowser_ws_probe.rs`
- Modify: `src/browser/ws_probe.rs` (created in Task 1)
- Modify: `src/browser/mod.rs`
- Modify: `tests/browser_ws_probe_test.rs` (created in Task 1)
- [ ] **Step 1: Write the failing helper parser test**
In `tests/browser_ws_probe_test.rs`, add one focused test for a new helper function in `src/browser/ws_probe.rs`:
```rust
#[test]
fn parse_probe_args_accepts_ws_url_timeout_and_ordered_steps() {
    // parse a fixed argv-style slice into a ProbeCliConfig
}
```
Create and use this exact helper shape:
```rust
pub struct ProbeCliConfig {
    pub ws_url: String,
    pub timeout_ms: u64,
    pub steps: Vec<ProbeStep>,
}

pub fn parse_probe_args(args: &[String]) -> Result<ProbeCliConfig, ProbeError>
```
The test must assert that these exact arguments parse successfully and preserve step order:
```text
--ws-url ws://127.0.0.1:12345
--timeout-ms 1500
--step open-agent::["about:blank","sgOpenAgent"]
--step open-hot::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
```
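A minimal sketch of the parser, assuming each `--step` value has the form `label::payload` and is split at the first `::`. The `ProbeError::InvalidArgs` variant and the default of `expect_reply: true` are assumptions; the plan names `ProbeError` but does not define it.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ProbeCliConfig {
    pub ws_url: String,
    pub timeout_ms: u64,
    pub steps: Vec<ProbeStep>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum ProbeError {
    InvalidArgs(String),
}

/// Parses `--ws-url`, `--timeout-ms`, and repeated `--step label::payload` flags.
pub fn parse_probe_args(args: &[String]) -> Result<ProbeCliConfig, ProbeError> {
    let mut ws_url: Option<String> = None;
    let mut timeout_ms: Option<u64> = None;
    let mut steps = Vec::new();

    let mut iter = args.iter();
    while let Some(flag) = iter.next() {
        // Every flag in this sketch takes exactly one value.
        let value = iter
            .next()
            .ok_or_else(|| ProbeError::InvalidArgs(format!("missing value for {}", flag)))?;
        match flag.as_str() {
            "--ws-url" => ws_url = Some(value.clone()),
            "--timeout-ms" => {
                let parsed = value
                    .parse::<u64>()
                    .map_err(|e| ProbeError::InvalidArgs(format!("bad timeout {}: {}", value, e)))?;
                timeout_ms = Some(parsed);
            }
            "--step" => {
                // Split at the first "::" so the payload keeps its own colons.
                let (label, payload) = value.split_once("::").ok_or_else(|| {
                    ProbeError::InvalidArgs(format!("step needs label::payload, got {}", value))
                })?;
                steps.push(ProbeStep {
                    label: label.to_string(),
                    payload: payload.to_string(),
                    expect_reply: true, // assumed default
                });
            }
            other => return Err(ProbeError::InvalidArgs(format!("unknown flag {}", other))),
        }
    }

    Ok(ProbeCliConfig {
        ws_url: ws_url.ok_or_else(|| ProbeError::InvalidArgs("missing --ws-url".to_string()))?,
        timeout_ms: timeout_ms
            .ok_or_else(|| ProbeError::InvalidArgs("missing --timeout-ms".to_string()))?,
        steps,
    })
}
```

Splitting on `::` rather than `:` matters because payloads such as `about:blank` and `https://...` contain single colons.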
- [ ] **Step 2: Run the parser test and verify it fails**
Run:
```bash
cargo test --test browser_ws_probe_test parse_probe_args_accepts_ws_url_timeout_and_ordered_steps -- --nocapture
```
Expected: FAIL because `parse_probe_args(...)` and `ProbeCliConfig` do not exist yet.
- [ ] **Step 3: Implement the helper and binary together**
In `src/browser/ws_probe.rs`, add `ProbeCliConfig` and `parse_probe_args(...)`.
In `src/browser/mod.rs`, add the module export:
```rust
pub mod ws_probe;
```
In `src/bin/sgbrowser_ws_probe.rs`, implement the binary using only `std::env::args()` plus `parse_probe_args(...)`.
Required behavior:
- accepts a websocket URL
- accepts a timeout in milliseconds
- accepts repeated ordered steps
- runs the probe harness
- prints a markdown-friendly transcript including:
  - step label
  - exact sent payload
  - exact received frames, if any
  - timeout/close outcome
Output shape can be simple, for example:
```text
STEP 1 bootstrap-open-agent
SEND: ["about:blank","sgOpenAgent"]
RECV: Welcome! You are client #1
OUTCOME: timeout
```
Rules:
- no production/browser-runtime integration
- no hidden fallback logic
- no “best effort” guessing of next steps
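The output shape above can be produced by a small pure function, which keeps the binary itself trivial. This is a sketch under assumptions: `StepTranscript` and `FinalOutcome` are hypothetical types that hold received frames alongside a final outcome (the example output shows both a `RECV` line and a `timeout` outcome for one step), and the `ok` and `closed` labels are invented here; only `timeout` appears in the plan.

```rust
/// Hypothetical final state of a step after all frames were collected.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FinalOutcome {
    Completed,
    Timeout,
    Closed,
}

/// Hypothetical per-step record used only for rendering.
#[derive(Debug, Clone, PartialEq)]
pub struct StepTranscript {
    pub label: String,
    pub sent: String,
    pub received: Vec<String>,
    pub outcome: FinalOutcome,
}

/// Renders steps as plain text, copying sent and received frames verbatim.
pub fn render_transcript(steps: &[StepTranscript]) -> String {
    let mut out = String::new();
    for (index, step) in steps.iter().enumerate() {
        out.push_str(&format!("STEP {} {}\n", index + 1, step.label));
        out.push_str(&format!("SEND: {}\n", step.sent));
        for frame in &step.received {
            out.push_str(&format!("RECV: {}\n", frame));
        }
        let outcome = match step.outcome {
            FinalOutcome::Completed => "ok",
            FinalOutcome::Timeout => "timeout",
            FinalOutcome::Closed => "closed",
        };
        out.push_str(&format!("OUTCOME: {}\n", outcome));
    }
    out
}
```

The renderer does no interpretation of frame contents, which is in line with the rule against guessing.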
- [ ] **Step 4: Re-run the parser/helper test**
Run:
```bash
cargo test --test browser_ws_probe_test parse_probe_args_accepts_ws_url_timeout_and_ordered_steps -- --nocapture
```
Expected: PASS.
- [ ] **Step 5: Build the probe binary**
Run:
```bash
cargo build --bin sgbrowser_ws_probe
```
Expected: PASS.
- [ ] **Step 6: Commit**
```bash
git add src/bin/sgbrowser_ws_probe.rs src/browser/ws_probe.rs src/browser/mod.rs tests/browser_ws_probe_test.rs
git commit -m "feat: add live sgBrowser websocket probe binary"
```
---
## Task 3: Run the real endpoint probe matrix and write the evidence transcript
**Files:**
- Create: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
- Reuse only: `src/bin/sgbrowser_ws_probe.rs`, `docs/_tmp_sgbrowser_ws_api_doc.txt`
- [ ] **Step 1: Run the no-bootstrap baseline probe**
Run exactly:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baseline-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
Append the exact output under a `## baseline-open` heading in `docs/_tmp_sgbrowser_ws_probe_transcript.md`.
- [ ] **Step 2: Run the documented `sgOpenAgent` candidate**
Run exactly:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "post-open-agent-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
Append the exact output under a `## open-agent` heading.
- [ ] **Step 3: Run the documented `sgSetAuthInfo` candidate**
Run exactly:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "set-auth::[\"about:blank\",\"sgSetAuthInfo\",\"probe-user\",\"probe-token\"]" --step "post-set-auth-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
Append the exact output under a `## set-auth` heading.
- [ ] **Step 4: Run the documented `sgBrowserLogin` candidate**
Run the command below, but only after replacing its placeholder payload as described underneath it:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "browser-login::{\"request\":\"use-json-helper\"}"
```
Before running, replace the placeholder payload with the exact JSON-array frame produced by the helper for:
```json
["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]
```
Then add a second step in the same command:
```json
["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
```
Append the exact output under a `## browser-login` heading.
- [ ] **Step 5: Run the documented `sgBrowerserActiveTab` candidate**
Run exactly:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "post-active-tab-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
Append the exact output under a `## active-tab` heading.
- [ ] **Step 6: Run one combined bootstrap candidate**
Run exactly:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "combined-open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "combined-active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "combined-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
Append the exact output under a `## combined-bootstrap` heading.
- [ ] **Step 7: Run `requesturl` variants for the minimal action**
Run exactly these two additional commands:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "target-as-requesturl::[\"https://www.zhihu.com/hot\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baidu-requesturl::[\"https://www.baidu.com\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
Append the exact outputs under `## requesturl-variants`.
- [ ] **Step 8: Summarize the matrix in the transcript file**
At the end of `docs/_tmp_sgbrowser_ws_probe_transcript.md`, add this exact table template and fill it in:
```markdown
| Sequence | Sent frames | First reply | Final outcome | Decision signal |
| --- | --- | --- | --- | --- |
```
- [ ] **Step 9: Determine which architecture option wins**
Decision rule:
- if at least one sequence reproducibly yields real numeric status and/or callback frames for a real business action, Option A (bootstrap-validated raw websocket) wins
- otherwise, Option B (bridge-first) wins
Do not weaken this decision rule.
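To fill in the "Decision signal" column consistently, a small classifier over each sequence's received frames may help. This is only a sketch: it assumes a numeric status is a bare integer frame such as `0`, it cannot recognize callback frames (their format is not specified here, so they fall into manual review), and it does not check reproducibility. It supports the decision rule; it does not replace it.

```rust
/// True when the frame is a bare integer such as "0" (assumed status format).
pub fn is_numeric_status_frame(frame: &str) -> bool {
    let trimmed = frame.trim();
    !trimmed.is_empty() && trimmed.parse::<i64>().is_ok()
}

/// True for the welcome banner the endpoint sends on connect.
pub fn is_welcome_banner(frame: &str) -> bool {
    frame.starts_with("Welcome! You are client #")
}

/// Hypothetical per-sequence signal for the summary table.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DecisionSignal {
    /// At least one numeric status frame arrived: candidate evidence for Option A.
    NumericStatusSeen,
    /// Nothing but the banner (or nothing at all) arrived: evidence for Option B.
    OnlyBannerOrSilence,
    /// Some other frame arrived (possibly a callback): a human must read it.
    NeedsManualReview,
}

/// Classifies the frames received during one probe sequence.
pub fn classify_sequence(received: &[&str]) -> DecisionSignal {
    if received.iter().any(|frame| is_numeric_status_frame(frame)) {
        DecisionSignal::NumericStatusSeen
    } else if received.iter().all(|frame| is_welcome_banner(frame)) {
        DecisionSignal::OnlyBannerOrSilence
    } else {
        DecisionSignal::NeedsManualReview
    }
}
```

A sequence only counts toward Option A if `NumericStatusSeen` repeats across runs and the status belongs to a real business action.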
- [ ] **Step 10: Commit the evidence artifact**
```bash
git add docs/_tmp_sgbrowser_ws_probe_transcript.md
git commit -m "docs: capture sgBrowser websocket probe evidence"
```
---
## Task 4A: If Option A wins, write the narrow bootstrap implementation slice
**Files:**
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md`
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md`
- Reuse as evidence input:
  - `docs/_tmp_sgbrowser_ws_probe_transcript.md`
  - `docs/_tmp_sgbrowser_ws_api_doc.txt`
  - `src/browser/ws_backend.rs`
  - `src/browser/ws_protocol.rs`
- [ ] **Step 1: Write one new design doc capturing the proven bootstrap contract**
Create:
```text
docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md
```
Include:
- exact validated sequence
- exact required state (`requesturl`, active tab, agent page, auth payload)
- exact failure semantics
- why this is sufficient evidence to keep raw websocket as the product surface
- [ ] **Step 2: Write one new implementation plan for the bootstrap path**
Create:
```text
docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md
```
Plan only the minimal production changes required to embed the validated bootstrap sequence into the service/browser path.
- [ ] **Step 3: Commit the bootstrap decision docs**
```bash
git add docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md
git commit -m "docs: capture ws browser bootstrap contract"
```
- [ ] **Step 4: Stop after writing the bootstrap plan**
Do not begin production implementation in the same slice unless the user explicitly asks for execution.
---
## Task 4B: If Option B wins, write the bridge-first implementation slice
**Files:**
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md`
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md`
- Reuse as evidence input:
  - `docs/_tmp_sgbrowser_ws_probe_transcript.md`
  - `frontend/archive/sgClaw验证-已归档/testRunner.js`
  - `docs/superpowers/specs/2026-03-25-superrpa-sgclaw-browser-control-design.md`
  - `docs/archive/项目管理与排期/协作时间表.md`
  - `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`
- [ ] **Step 1: Write the bridge-path design doc**
Create `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md`.
The design must specify:
- why raw websocket is considered non-validated for external control
- which bridge surface becomes authoritative
- where sgClaw should integrate (`FunctionsUI`, host bridge, `BrowserAction`, `CommandRouter`, or the nearest validated seam in this repo)
- how to preserve pipe behavior and existing abstractions where practical
- [ ] **Step 2: Write the bridge-path implementation plan**
Create `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md`.
The plan must:
- identify exact files to touch
- describe the narrowest adapter implementation
- keep TDD/task granularity as in this document
- avoid speculative work outside the bridge slice
- [ ] **Step 3: Commit the bridge decision docs**
```bash
git add docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md
git commit -m "docs: define bridge-first sgBrowser integration"
```
- [ ] **Step 4: Stop after writing the bridge plan**
Do not start the bridge implementation in the same slice unless the user explicitly asks for execution.
---
## Verification Checklist
### Deterministic probe harness tests
```bash
cargo test --test browser_ws_probe_test -- --nocapture
```
Expected: transcript capture, ordered scripts, timeout reporting, and close/reset reporting all pass.
### Probe binary build
```bash
cargo build --bin sgbrowser_ws_probe
```
Expected: PASS.
### Live evidence collection
- run the probe matrix against the real configured endpoint
- save exact transcripts to `docs/_tmp_sgbrowser_ws_probe_transcript.md`
- make the architecture decision using the documented rule
### Follow-up branch condition
- if Option A wins, repository contains a bootstrap-contract design + plan
- if Option B wins, repository contains a bridge-path design + plan
- no production runtime changes are made until that decision is written down
---
## Notes for Implementation
- The existing `WsBrowserBackend` fix that remembers the last navigated URL remains valid; do not revert it.
- The previous auth-replacement work also remains valid; it removed a real bug but did not prove the raw websocket architecture.
- Keep the probe tool brutally literal: exact sent frames, exact received frames, explicit timeout/close outcomes.
- Resist the temptation to make the probe “smart.” Smart probes hide evidence.
- If the real endpoint still replies only with the welcome banner and then silence across the matrix, treat that as a decision, not as an excuse for more guessing.


@@ -0,0 +1,362 @@
# WS Browser Welcome Frame Compatibility Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Make the ws service path tolerate the real sgBrowser welcome banner (`Welcome! You are client #...`) without weakening general ws protocol validation or changing pipe behavior.
**Architecture:** Keep the shared `WsBrowserBackend` strict and implement the compatibility shim only in `ServiceBrowserWsClient`, which is already the real-browser adapter for the ws service path. Add one positive red test for the known welcome frame and one negative red test proving non-matching first text frames still fail as protocol errors, then make the minimal stateful change in `src/service/server.rs` and verify ws + pipe regressions.
**Tech Stack:** Rust 2021, tungstenite websocket client/server, existing `WsBrowserBackend`, existing `ServiceBrowserWsClient`, existing Rust unit/integration test suite.
---
## File Structure
### Existing files to modify
- Modify: `src/service/server.rs`
  - Add the one-time per-connection welcome-skip state to `ServiceBrowserWsClient`
  - Add the minimal helper(s) for detecting and discarding the first known welcome frame
  - Add focused service-adapter unit tests in the existing `#[cfg(test)]` module
- Reuse: `src/browser/ws_backend.rs`
  - Do not change protocol parsing rules; only verify behavior remains strict for all non-service callers
- Reuse: `tests/service_task_flow_test.rs`
  - Re-run to confirm the ws service path still reaches the browser websocket after the service-side shim
- Reuse: `tests/browser_ws_backend_test.rs`
  - Re-run to prove the shared backend semantics remain unchanged
### Files deliberately not changed
- `src/browser/ws_backend.rs`
- `src/browser/ws_protocol.rs`
- `src/agent/task_runner.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
- `src/lib.rs`
The design explicitly keeps the welcome-banner workaround out of the shared backend and out of the pipe path.
---
## Task 1: Reproduce the real welcome-frame failure with focused unit tests
**Files:**
- Modify: `src/service/server.rs`
- [ ] **Step 1: Add the positive failing test for the known welcome frame**
In the existing `#[cfg(test)] mod tests` inside `src/service/server.rs`, add one focused test next to the current ws adapter tests.
Test shape:
```rust
#[test]
fn future_server_side_ws_native_adapter_skips_initial_known_welcome_frame() {
    // fake server sends:
    // 1. "Welcome! You are client #1"
    // 2. "0"
    // backend.invoke(Action::Navigate, ...) should succeed
}
```
Required assertions:
- the fake websocket server accepts one connection
- it sends the welcome banner first, then the numeric success status
- `WsBrowserBackend.invoke(Action::Navigate, ...)` returns `Ok(CommandOutput { success: true, .. })`
- [ ] **Step 2: Run only the positive new test and watch it fail**
Run:
```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_skips_initial_known_welcome_frame -- --nocapture
```
Expected: FAIL with a protocol error containing `invalid browser status frame: Welcome! You are client #1`.
- [ ] **Step 3: Add the negative guard test for arbitrary first text**
In the same `#[cfg(test)]` module, add one negative test proving we do **not** silently skip arbitrary first text frames.
Test shape:
```rust
#[test]
fn future_server_side_ws_native_adapter_does_not_skip_unknown_first_text_frame() {
    // fake server sends:
    // 1. "Hello from server"
    // assert invoke(...) fails as PipeError::Protocol(...)
}
```
Required assertions:
- the first frame is a non-matching text frame such as `Hello from server`
- `invoke(...)` fails
- the failure remains a protocol error rather than success or timeout
- [ ] **Step 4: Run only the negative new test and verify the current behavior is already strict**
Run:
```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_does_not_skip_unknown_first_text_frame -- --nocapture
```
Expected: PASS, proving the current implementation already treats arbitrary first text as a protocol error. Keep that assertion in place before any production change.
- [ ] **Step 5: Confirm the TDD gate before implementation**
Do not implement production code before both tests exist and the positive test has failed on current behavior.
---
## Task 2: Add the minimal per-connection welcome-skip state in the service adapter
**Files:**
- Modify: `src/service/server.rs`
- [ ] **Step 1: Add one-time per-connection state to `ServiceBrowserWsClient`**
Extend `ServiceBrowserWsClient` with one extra state field that tracks whether the initial welcome candidate has already been consumed for the current websocket connection.
Allowed shape:
```rust
struct ServiceBrowserWsClient {
    browser_ws_url: String,
    browser_socket: Mutex<Option<WebSocket<MaybeTlsStream<TcpStream>>>>,
    initial_text_frame_checked: Mutex<bool>,
}
```
or an equally small equivalent.
Rules:
- state is per connection, not per request
- state must survive multiple `invoke(...)` calls while reusing the same socket
- do not add broader protocol state machines
- [ ] **Step 2: Add a narrow welcome-frame matcher**
In `src/service/server.rs`, add one small helper that recognizes only the known banner prefix:
```rust
fn is_known_welcome_frame(frame: &str) -> bool {
    frame.starts_with("Welcome! You are client #")
}
```
Rules:
- no regex needed
- no generic “ignore arbitrary text” logic
- keep the matcher local to `src/service/server.rs`
- [ ] **Step 3: Update `recv_text_timeout(...)` to skip at most one initial known banner**
Modify `impl WsClient for ServiceBrowserWsClient` so that the first text frame received after connection establishment is handled like this:
1. read the next text frame
2. if the initial-frame state is still false:
   - mark the first-frame check as consumed
   - if the frame matches `is_known_welcome_frame(...)`, read the next frame and return that next frame instead
3. in every other case (the check was already consumed, or the first frame does not match), return the frame unchanged
Rules:
- skip only once per connection
- do not loop indefinitely over multiple text frames
- do not swallow unknown first text frames
- do not change timeout / close / reset / connect-failure behavior
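The skip-once rule above can be sketched without a real socket. This is a minimal illustration, not the adapter itself: `ScriptedFrames` and `recv_skipping_welcome_once` are hypothetical names, and a scripted queue stands in for the websocket so the logic can be checked in isolation.

```rust
use std::collections::VecDeque;

/// Recognizes only the known banner prefix, as in the plan.
fn is_known_welcome_frame(frame: &str) -> bool {
    frame.starts_with("Welcome! You are client #")
}

/// Hypothetical stand-in for a socket: yields scripted text frames in order.
struct ScriptedFrames {
    frames: VecDeque<String>,
}

impl ScriptedFrames {
    fn new(frames: &[&str]) -> Self {
        Self {
            frames: frames.iter().map(|f| f.to_string()).collect(),
        }
    }

    fn next_text(&mut self) -> Option<String> {
        self.frames.pop_front()
    }
}

/// Reads one frame, skipping at most one known welcome banner per connection.
/// `initial_checked` is the per-connection flag; it is set on the first read
/// whether or not the banner matched, so no later frame is ever skipped.
fn recv_skipping_welcome_once(
    source: &mut ScriptedFrames,
    initial_checked: &mut bool,
) -> Option<String> {
    let frame = source.next_text()?;
    if !*initial_checked {
        *initial_checked = true;
        if is_known_welcome_frame(&frame) {
            // Exactly one extra read; there is no loop over further text frames.
            return source.next_text();
        }
    }
    Some(frame)
}
```

Because the flag flips on the first read regardless of the match, an unknown first frame such as `Hello from server` is returned unchanged and still reaches the strict status parser.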
- [ ] **Step 4: Reset the one-time state when a fresh socket is created**
When `with_socket(...)` establishes a brand-new websocket connection, ensure the one-time banner-check state is reset so a new connection can tolerate its own first welcome frame.
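One way to keep the reset tied to connection creation is to clear the flag in the same place the new connection is stored. The sketch below assumes a simplified client where a numeric id stands in for the socket; `ClientState`, `ensure_connected`, and `drop_connection` are hypothetical names, not the real `with_socket(...)` code.

```rust
/// Hypothetical stand-in for the client: tracks a connection id and the
/// one-time banner flag together so they are reset as a unit.
struct ClientState {
    connection: Option<u32>,
    next_id: u32,
    initial_text_frame_checked: bool,
}

impl ClientState {
    fn new() -> Self {
        Self {
            connection: None,
            next_id: 1,
            initial_text_frame_checked: false,
        }
    }

    /// Returns the live connection id, opening a new one if needed.
    /// Opening a new connection clears the banner flag; reuse leaves it alone.
    fn ensure_connected(&mut self) -> u32 {
        if let Some(id) = self.connection {
            return id;
        }
        let id = self.next_id;
        self.next_id += 1;
        self.connection = Some(id);
        self.initial_text_frame_checked = false;
        id
    }

    /// Called on close/reset: drop the connection so the next call reconnects.
    fn drop_connection(&mut self) {
        self.connection = None;
    }
}
```

Resetting inside the connect path, rather than in the close path, means a connection that dies without a clean close still gets a fresh flag on the next use.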
- [ ] **Step 5: Add one reconnect regression in the service adapter tests**
Add one focused test proving the welcome skip resets on a fresh connection after socket close/reset.
Test shape:
```rust
#[test]
fn future_server_side_ws_native_adapter_skips_welcome_again_after_reconnect() {
    // first connection closes after use
    // second fresh connection sends the same welcome banner again
    // both invocations succeed
}
```
Required assertion:
- the one-time skip is per connection, not global for the client instance
- [ ] **Step 6: Run the positive new test**
Run:
```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_skips_initial_known_welcome_frame -- --nocapture
```
Expected: PASS.
- [ ] **Step 7: Run the negative new test**
Run:
```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_does_not_skip_unknown_first_text_frame -- --nocapture
```
Expected: PASS, proving unknown first text is still treated as a protocol error.
- [ ] **Step 8: Run the reconnect regression**
Run:
```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_skips_welcome_again_after_reconnect -- --nocapture
```
Expected: PASS.
- [ ] **Step 9: Run the full service adapter unit group**
Run:
```bash
cargo test service::server::tests -- --nocapture
```
Expected: PASS, including the existing tests for:
- status `0` success
- connect failure => `PipeError::Protocol("browser websocket connect failed: ...")`
- disconnect/reset => `PipeError::PipeClosed`
- callback timeout => `PipeError::Timeout`
- new known-welcome success path
- new unknown-first-frame strictness path
- new reconnect reset behavior
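The error categories in this list can be pictured as a small classification function. This is only a sketch under assumptions: `PipeErrorSketch` is a local stand-in for the real `PipeError`, the classification is keyed on `std::io::ErrorKind` rather than on tungstenite errors, and the `read failed` message is invented for illustration.

```rust
use std::io::ErrorKind;

/// Local stand-in for the variants named in the list above.
#[derive(Debug, PartialEq)]
enum PipeErrorSketch {
    Protocol(String),
    PipeClosed,
    Timeout,
}

/// Which stage of the exchange produced the I/O error.
enum Phase {
    Connecting,
    Reading,
}

/// Maps an I/O error kind to the category the adapter tests expect.
fn classify_io_error(phase: Phase, kind: ErrorKind, detail: &str) -> PipeErrorSketch {
    match (phase, kind) {
        // Any failure while connecting is reported as a protocol error.
        (Phase::Connecting, _) => {
            PipeErrorSketch::Protocol(format!("browser websocket connect failed: {}", detail))
        }
        // Read timeouts surface as WouldBlock on some platforms.
        (Phase::Reading, ErrorKind::TimedOut | ErrorKind::WouldBlock) => PipeErrorSketch::Timeout,
        (
            Phase::Reading,
            ErrorKind::ConnectionReset
            | ErrorKind::ConnectionAborted
            | ErrorKind::BrokenPipe
            | ErrorKind::UnexpectedEof,
        ) => PipeErrorSketch::PipeClosed,
        // Hypothetical fallback message, not taken from the real code.
        (Phase::Reading, _) => {
            PipeErrorSketch::Protocol(format!("browser websocket read failed: {}", detail))
        }
    }
}
```

The welcome-skip change should leave every arm of such a mapping untouched; only the frame returned on the success path differs.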
---
## Task 3: Verify the shared backend stayed strict and the ws service path still works
**Files:**
- Reuse: `tests/browser_ws_backend_test.rs`
- Reuse: `tests/service_task_flow_test.rs`
- Reuse: `src/browser/ws_backend.rs`
- [ ] **Step 1: Re-run the shared ws backend tests unchanged**
Run:
```bash
cargo test --test browser_ws_backend_test -- --nocapture
```
Expected: PASS. This proves `WsBrowserBackend` semantics remain unchanged for its existing deterministic callers.
- [ ] **Step 2: Re-run the service task-flow regression**
Run:
```bash
cargo test --test service_task_flow_test -- --nocapture
```
Expected: PASS, including the auth-regression test that proves the ws service path reaches the browser websocket and no longer emits `invalid hmac seed: session key must not be empty`.
- [ ] **Step 3: Re-run the ws-focused mixed verification**
Run:
```bash
cargo test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
Expected: PASS.
---
## Task 4: Re-run the real manual smoke that originally failed
**Files:**
- Reuse only: no code changes unless a fresh reproducer proves another bug
- [ ] **Step 1: Confirm real browser websocket reachability**
Run a reachability check for `ws://127.0.0.1:12345` (or the configured `browserWsUrl`) before starting smoke.
Expected: reachable.
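The plan does not prescribe a tool for the reachability check. One possible sketch, assuming a plain TCP connect is enough to show that the port is listening (`is_tcp_reachable` is a hypothetical helper, and it does not perform a websocket handshake):

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` succeeds within `timeout`.
/// This checks only that the port accepts connections, not that a
/// websocket handshake would succeed.
fn is_tcp_reachable(addr: &str, timeout: Duration) -> bool {
    match addr.parse::<SocketAddr>() {
        Ok(socket_addr) => TcpStream::connect_timeout(&socket_addr, timeout).is_ok(),
        Err(_) => false,
    }
}
```

For the default endpoint this would be called as `is_tcp_reachable("127.0.0.1:12345", Duration::from_secs(2))`; a host name such as `localhost` would first need resolving, which this sketch leaves out.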
- [ ] **Step 2: Start the real ws service**
Run:
```bash
cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```
Expected: the service prints:
- `sg_claw ready: ...`
- the resolved `service_ws_listen_addr`
- the configured `browser_ws_url`
- [ ] **Step 3: Re-run the original failing manual smoke**
Run:
```bash
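# Prompt text (kept in Chinese because it is the task input): "Open the Zhihu hot list and read the text of the page's main area"
printf '打开知乎热榜并读取页面主区域文本\n' | cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"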
```
Expected:
- no `invalid browser status frame: Welcome! You are client #1`
- browser actions proceed past the first status frame
- if the browser later fails for another reason, capture that new reason exactly
- [ ] **Step 4: Re-run the old Zhihu export task smoke**
Run:
```bash
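# Prompt text (kept in Chinese because it is the task input): "Read the Zhihu hot list data and export an Excel file"
printf '读取知乎热榜数据,并导出 excel 文件\n' | cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"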
```
Expected:
- no `invalid browser status frame: Welcome! You are client #1`
- the task reaches the real browser action path beyond connection banner handling
- [ ] **Step 5: Stop and debug if a new real-browser issue appears**
If smoke now fails for a different reason, do not piggyback a second fix into this slice without:
- capturing the exact new output
- writing a new focused spec/plan if the issue is materially different
---
## Verification Checklist
### Service adapter unit tests
```bash
cargo test service::server::tests -- --nocapture
```
Expected: all service-side ws adapter tests pass, including the new welcome-frame positive/negative cases and reconnect reset case.
### Shared ws backend + ws service regressions
```bash
cargo test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
Expected: PASS.
### Real smoke verification
- `browserWsUrl` reachable
- `sg_claw` starts with real config
- `sg_claw_client` no longer fails on `Welcome! You are client #...`
- Zhihu minimal read task gets past the first status frame
- Zhihu export task gets past the first status frame

@@ -0,0 +1,564 @@
# Zhihu Release WS Function-Callback Migration Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Move only the Zhihu direct-execution path to the new Release browser websocket interaction style while keeping the existing pipe protocol and non-Zhihu submit behavior unchanged.
**Architecture:** Keep `ClientMessage` / `ServiceMessage`, `run_submit_task_with_browser_backend(...)`, and the high-level Zhihu workflow steps unchanged. First prove the exact Release browser interaction contract with transcript-backed probes. Then implement the smallest Zhihu-scoped backend path that follows that proven contract. Do not globally rewire the submit path unless the probe evidence proves there is no narrower safe seam.
**Tech Stack:** Rust, tungstenite, existing sgclaw service/client pipe protocol, `docs/_tmp_sgbrowser_ws_api_doc.txt`, Release browser websocket at `ws://127.0.0.1:12345`, current Zhihu direct-execution workflow.
---
## Context
The user has now made the target behavior explicit:
- the browser has changed and the working reference behavior is the user-provided HTML page that connects to `ws://127.0.0.1:12345`
- that page sends a bootstrap registration frame: `{"type":"register","role":"web"}`
- browser requests are still JSON arrays such as `[window.location.href, "sgBrowserSetTheme", "1"]` and `[window.location.href, "sgBrowerserGetUrls", "showUrls"]`
- callback-bearing browser behavior is now centered on page-defined JS callback functions like `showUrls`, not on Rust directly reading a websocket callback frame as the final business result
- the existing sgclaw pipe protocol must remain unchanged
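The two frame shapes described above (a register object, then `[requesturl, action, ...args]` arrays) can be sketched as plain string builders. This is an illustration only: `register_frame`, `action_frame`, and `json_string` are hypothetical helpers, and the JSON escaping is hand-rolled here to keep the sketch dependency-free, where real code would more likely use `serde_json`.

```rust
/// Escapes a string for embedding in a JSON string literal.
/// Handles quotes, backslashes, and control characters.
fn json_string(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// The bootstrap registration frame quoted in the plan.
fn register_frame() -> String {
    r#"{"type":"register","role":"web"}"#.to_string()
}

/// Builds `[requesturl, action, ...args]` as a JSON array of strings.
fn action_frame(request_url: &str, action: &str, args: &[&str]) -> String {
    let mut parts = vec![json_string(request_url), json_string(action)];
    parts.extend(args.iter().map(|a| json_string(a)));
    format!("[{}]", parts.join(","))
}
```

Keeping the register frame separate from the array builder mirrors the plan's point that the handshake is a distinct first step and not just another action.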
The current sgclaw drift that must be corrected is visible in:
- `src/browser/ws_protocol.rs`
  - `Action::Navigate` currently emits `sgHideBrowserCallAfterLoaded` with an inline `callBackJsToCpp(...)` string
- `src/browser/ws_backend.rs`
  - Rust currently waits for a browser websocket callback frame and treats that as the action result
- `tests/service_ws_session_test.rs:498-605`
- `tests/service_task_flow_test.rs:499-635`
  - existing **generic submit-flow** regressions still lock in the old direct raw-websocket callback-frame assumption
  - these are useful as non-regression guardrails, but they are not themselves Zhihu-specific regressions
Zhihu-specific verification must therefore be added explicitly instead of assuming those Baidu-path tests already cover Zhihu.
The new browser style proves these facts and only these facts so far:
1. sgclaw must handle a register-first websocket handshake
2. browser requests are still `[requesturl, action, ...args]`
3. some browser capabilities now return through page-defined callback functions like `showUrls`
4. the current direct raw-websocket callback expectation in the Zhihu path is no longer a safe assumption
The production seam is **not** pre-decided here. Task 1 must determine whether Zhihu can be integrated by:
- a direct Zhihu-scoped backend with no helper page (Option A), or
- a helper page / relay design because named page callbacks are the only reliable result path (Option B)
Until Task 1 evidence is captured, both remain hypotheses.
## Evidence to preserve in the implementation
### Browser websocket API doc
From `docs/_tmp_sgbrowser_ws_api_doc.txt`:
- `ws://localhost:12345` is the browser websocket endpoint
- request frames are array payloads with `requesturl`
- `sgBrowerserGetUrls(callback)` uses a callback **function name**: `[requesturl,"sgBrowerserGetUrls", callback]`
- `sgBrowserCallAfterLoaded(targetUrl, callback)` and `sgHideBrowserCallAfterLoaded(targetUrl, callback)` use callback strings with parentheses
- `callBackJsToCpp(param)` uses `sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt`
- `sgBrowserRegJsFun(targeturl, funContent)` and `sgBrowserExcuteJsFun(targeturl, funName)` exist and may be useful when the helper page needs durable callback helpers
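The `@_@`-delimited `callBackJsToCpp` payload can be split into its five fields as sketched below. Two things here are assumptions and not statements from the API doc: that the response text is always the final field, and that it may itself contain `@_@`. `CallbackPayload` and `parse_callback_payload` are hypothetical names.

```rust
/// The five fields of the `@_@`-delimited callback payload.
#[derive(Debug, PartialEq)]
struct CallbackPayload {
    source_url: String,
    target_url: String,
    callback: String,
    action_url: String,
    response_text: String,
}

/// Parses `sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt`.
/// Returns `None` when fewer than five fields are present.
fn parse_callback_payload(raw: &str) -> Option<CallbackPayload> {
    // splitn(5) stops after four delimiters, so any later "@_@"
    // stays inside the response text instead of being dropped.
    let mut parts = raw.splitn(5, "@_@");
    Some(CallbackPayload {
        source_url: parts.next()?.to_string(),
        target_url: parts.next()?.to_string(),
        callback: parts.next()?.to_string(),
        action_url: parts.next()?.to_string(),
        response_text: parts.next()?.to_string(),
    })
}
```

Whether the Release browser still delivers this payload over the websocket at all is exactly what the probe transcript is meant to establish.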
### Current working HTML pattern from the user
The now-working reference interaction is:
```javascript
const socket = new WebSocket('ws://127.0.0.1:12345');
socket.onopen = () => {
  socket.send(JSON.stringify({type: 'register', role: 'web'}));
};
// Excerpt: this send must run only after the socket is open.
socket.send(JSON.stringify([window.location.href, "sgBrowerserGetUrls", "showUrls"]));
function showUrls(urls) {
  // browser invokes this page-defined callback
}
```
That is the browser behavior sgclaw now needs to follow.
---
## Critical files
### Production files to modify
- `src/browser/ws_protocol.rs`
- `src/compat/workflow_executor.rs` (only if a narrow Zhihu-specific correction is required after backend swap)
- `src/service/server.rs` (only if the chosen Zhihu-scoped integration seam must be wired here)
- `src/service/mod.rs` (only if startup plumbing changes are truly required)
- `src/browser/mod.rs`
### New production files likely needed
- `src/browser/zhihu_release_backend.rs`
  - a Zhihu-scoped `BrowserBackend` adapter that follows the proven Release browser interaction style without changing non-Zhihu routes
- `src/service/browser_callback_host.rs` **only if the probe proves a service-controlled helper page is actually required**
  - service-local helper-page lifecycle and callback relay, if evidence shows the browser cannot be driven safely without it
### Existing files to preserve
- `src/agent/task_runner.rs`
- `src/service/protocol.rs`
- `src/compat/orchestration.rs`
- `src/compat/runtime.rs`
- `src/pipe/*`
### Existing direct-ws files to review explicitly
- `src/browser/ws_backend.rs`
- `tests/browser_ws_backend_test.rs`
These files currently encode the old direct raw-websocket callback expectation. The implementation must either:
- leave them untouched as legacy/direct-contract coverage with no Zhihu production callers, or
- update/remove the Zhihu-specific assumptions they currently lock in.
### Primary test files
- `tests/browser_ws_probe_test.rs`
- `tests/browser_ws_protocol_test.rs`
- `tests/service_ws_session_test.rs`
- `tests/service_task_flow_test.rs`
- `tests/task_runner_test.rs`
- `tests/browser_ws_backend_test.rs`
---
## File structure decisions
### `src/browser/zhihu_release_backend.rs`
Prefer a Zhihu-scoped backend first.
Responsibilities:
- keep the same `BrowserBackend` trait surface
- implement only the behavior needed by the current Zhihu direct-execution route
- translate `Action::Navigate`, `Action::GetText`, and `Action::Eval` into the proven Release-browser interaction style
- normalize results back into `CommandOutput`
- avoid affecting non-Zhihu callers
This is the preferred seam because the user asked to change the current Zhihu flow, not to redesign the whole submit pipeline.
### `src/service/browser_callback_host.rs` (conditional)
Create this file only if Task 1 probe evidence proves that sgclaw must host or control a page in order to receive named callback-function results.
If it is needed, the plan must keep the design minimal and specific:
- one concrete transport only (choose websocket or HTTP, not “websocket or HTTP”)
- explicit readiness handshake
- explicit request correlation by `request_id`
- explicit cleanup when the submit task ends
If Task 1 shows a simpler seam, do not create this file.
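If the helper-page route is chosen, request correlation by `request_id` could look roughly like the table below. This is a sketch under assumptions, not a design decision: `PendingRequests` is a hypothetical type, ids are plain counters, and results are delivered over `std::sync::mpsc` channels.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical correlation table: one pending sender per in-flight request_id.
struct PendingRequests {
    next_id: u64,
    pending: HashMap<u64, Sender<String>>,
}

impl PendingRequests {
    fn new() -> Self {
        Self {
            next_id: 1,
            pending: HashMap::new(),
        }
    }

    /// Registers a request and returns its id plus the receiver to wait on.
    fn register(&mut self) -> (u64, Receiver<String>) {
        let id = self.next_id;
        self.next_id += 1;
        let (tx, rx) = channel();
        self.pending.insert(id, tx);
        (id, rx)
    }

    /// Delivers a result; returns false for unknown or already-completed ids.
    fn complete(&mut self, request_id: u64, result: String) -> bool {
        match self.pending.remove(&request_id) {
            Some(tx) => tx.send(result).is_ok(),
            None => false,
        }
    }

    /// Cleanup when the submit task ends: dropping the senders makes any
    /// outstanding `recv` call return an error instead of blocking.
    fn clear(&mut self) {
        self.pending.clear();
    }
}
```

A caller would typically wait with `recv_timeout` so that a callback that never arrives maps to a timeout and not to a hang.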
### `src/browser/ws_protocol.rs`
Do not let this file keep only the old direct-callback assumption.
It should become the shared place for doc-native request builders such as:
- browser bootstrap frames proven by the transcript
- `sgBrowserCallAfterLoaded` / `sgHideBrowserCallAfterLoaded`
- `sgBrowserExcuteJsCodeByArea`
- optional `sgBrowserRegJsFun` / `sgBrowserExcuteJsFun`
But do **not** let `ws_protocol.rs` absorb service-host lifecycle logic.
### `src/browser/ws_backend.rs` and `tests/browser_ws_backend_test.rs`
Handle these explicitly in the implementation:
- if they still describe a valid direct browser contract, keep them as isolated legacy/direct-ws coverage only
- if their current navigate/callback assumptions conflict with the proven Release Zhihu path, update or narrow those tests so they no longer describe the active Zhihu integration path
Do not leave the old direct-callback assumptions ambiguously “reviewed”; the implementation must make their status explicit.
---
## Task 1: Capture the new Release browser contract in a reproducible probe transcript
**Files:**
- Review/modify: `src/browser/ws_probe.rs`
- Review/modify: `src/bin/sgbrowser_ws_probe.rs`
- Review/modify: `tests/browser_ws_probe_test.rs`
- Create: `docs/_tmp_release_ws_callback_host_transcript.md`
- [ ] **Step 1: Verify current probe coverage against the Release-browser questions**
Read the existing probe module and tests and check whether they already prove all of the following:
- a register-first websocket script can be expressed
- a later array action frame can be expressed in the same script
- per-step inbound frames/outcomes are preserved separately
- timeout/close remain distinguishable in the transcript
Required result:
- identify the exact existing tests that already prove these behaviors
- identify the smallest missing Release-specific coverage, if any
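The properties being checked in this step (per-step separation, and timeout kept distinct from close) can be pictured with a small transcript model. The types below are hypothetical and are not the actual `ws_probe.rs` structures; they only show one shape that would satisfy the checklist.

```rust
/// What happened after one frame was sent.
#[derive(Debug, PartialEq)]
enum StepOutcome {
    /// Zero or more inbound text frames, in arrival order.
    Frames(Vec<String>),
    /// No frame arrived before the step deadline.
    Timeout,
    /// The peer closed the connection during this step.
    Closed,
}

/// One scripted step: the frame that was sent and what came back.
struct TranscriptStep {
    sent: String,
    outcome: StepOutcome,
}

/// Renders each step separately so replies are never merged across steps.
fn render_transcript(steps: &[TranscriptStep]) -> String {
    let mut out = String::new();
    for (index, step) in steps.iter().enumerate() {
        out.push_str(&format!("step {}\n  sent: {}\n", index + 1, step.sent));
        match &step.outcome {
            StepOutcome::Frames(frames) => {
                for frame in frames {
                    out.push_str(&format!("  recv: {}\n", frame));
                }
            }
            StepOutcome::Timeout => out.push_str("  outcome: timeout\n"),
            StepOutcome::Closed => out.push_str("  outcome: closed\n"),
        }
    }
    out
}
```

Modeling timeout and close as separate variants, not as an empty frame list, is what keeps the two distinguishable in the saved transcript.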
- [ ] **Step 2: Add only the missing regression coverage**
If current tests do **not** already prove the Release-browser bootstrap shape, add the narrowest failing regression in `tests/browser_ws_probe_test.rs`.
Preferred shape if coverage is missing:
```rust
#[test]
fn probe_supports_register_then_array_action_script() {
    // fake server expects:
    //   1. {"type":"register","role":"web"}
    //   2. ["http://127.0.0.1/helper.html","sgBrowerserGetUrls","showUrls"]
}
```
And, if still missing, add one regression proving per-step transcript separation for the register reply and later action reply.
If those behaviors are already covered, skip new test creation and record the exact test names to rely on.
- [ ] **Step 3: Run the relevant probe tests**
Run the narrowest exact tests that prove the Release bootstrap behavior, or the full file if multiple areas changed:
```bash
cargo test --test browser_ws_probe_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 4: Make the probe binary ergonomic for the Release transcript if needed**
Only if the current CLI cannot conveniently express the real Release-browser script, make the smallest change needed in `src/bin/sgbrowser_ws_probe.rs` / `src/browser/ws_probe.rs` so it can capture:
- register frame behavior
- minimal `sgBrowserSetTheme`
- minimal `sgBrowerserGetUrls`
- exact inbound websocket text per step
Do not redesign the probe if it already supports this.
- [ ] **Step 5: Run the live probe against the Release browser and record the real bootstrap**
Use the probe binary against the real endpoint to capture at minimum:
- register frame behavior
- minimal `sgBrowserSetTheme`
- minimal `sgBrowerserGetUrls`
- whether replies come back as websocket text, page-function invocation only, or both
Save the exact transcript in `docs/_tmp_release_ws_callback_host_transcript.md`.
Required output in that temp doc:
- exact sent frames
- exact received websocket frames
- the observed rule for when named callback functions are invoked
- whether Option A or Option B is supported by evidence
- [ ] **Step 6: Commit the probe-only slice if code changed**
If probe code/tests changed:
```bash
git add src/browser/ws_probe.rs src/bin/sgbrowser_ws_probe.rs tests/browser_ws_probe_test.rs docs/_tmp_release_ws_callback_host_transcript.md
git commit -m "test: capture release browser ws bootstrap contract"
```
If only the transcript doc changed, stage only that file and use a docs/test-appropriate commit message.
---
## Task 2: Choose the narrowest Zhihu-only production seam from the probe evidence
**Files:**
- Modify: `src/service/server.rs` (only if required)
- Modify: `src/browser/mod.rs`
- Modify: `src/compat/workflow_executor.rs` (only if required)
- Create: `src/browser/zhihu_release_backend.rs`
- Create: `src/service/browser_callback_host.rs` **only if required**
- Test: `tests/service_ws_session_test.rs`
- Test: `tests/service_task_flow_test.rs`
- [ ] **Step 1: Write down the seam decision in the plan notes before coding**
Based on the transcript from Task 1, record which one of these is supported by evidence:
- Option A: a Zhihu-scoped backend can talk to the Release browser directly with no service-hosted helper page
- Option B: a Zhihu-scoped backend needs a service-controlled helper page because named page callbacks are the only reliable way to get business results
Do not proceed until one option is chosen explicitly from evidence.
- [ ] **Step 2: Add a failing service/task-flow regression that proves only the Zhihu path changes**
Update or add focused tests so that:
- Zhihu submit flow uses the new Release-browser interaction seam
- non-Zhihu behavior is unchanged
- pipe messages remain unchanged
Required assertions:
- the new path is activated only for Zhihu route detection
- `ClientMessage` / `ServiceMessage` stay identical
- existing non-Zhihu submit behavior is not accidentally rerouted
- [ ] **Step 3: Run the new focused regression and confirm failure first**
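Run the narrowest exact test names you added, substituting each one for the `<new_test_name>` placeholder (the angle brackets are not literal; an unquoted `<` is a shell redirect):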
```bash
cargo test --test service_ws_session_test <new_test_name> -- --nocapture
cargo test --test service_task_flow_test <new_test_name> -- --nocapture
```
Expected: FAIL because the Zhihu-specific seam does not exist yet.
- [ ] **Step 4: Implement the chosen seam with the smallest blast radius**
If Option A won:
- add `src/browser/zhihu_release_backend.rs`
- wire it only where the Zhihu direct-execution route is selected
- leave global submit-path wiring alone
If Option B won:
- add `src/service/browser_callback_host.rs` with one specific transport and one explicit readiness/correlation model
- add `src/browser/zhihu_release_backend.rs` to talk to that helper path
- wire it only for the Zhihu route
In both cases:
- do not change non-Zhihu callers
- do not redesign `run_submit_task_with_browser_backend(...)`
- do not change the pipe protocol
- [ ] **Step 5: Make the status of old direct-ws code explicit**
Update `src/browser/ws_backend.rs` / `tests/browser_ws_backend_test.rs` only as needed so they no longer ambiguously describe the active Zhihu path.
Allowed outcomes:
- keep them untouched as legacy/direct-ws coverage with no Zhihu production caller
- narrow/update the tests so they no longer claim the active Zhihu integration path
Not allowed:
- leaving the plan and code in a state where both old and new paths appear to be the active Zhihu contract
- [ ] **Step 6: Run focused integration tests**
Run:
```bash
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 7: Commit the seam-selection slice**
Adjust staged files to match the option actually implemented, for example:
```bash
git add src/browser/zhihu_release_backend.rs src/browser/mod.rs src/service/server.rs src/service/browser_callback_host.rs tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/browser_ws_backend_test.rs
git commit -m "feat: route zhihu flow through release browser ws contract"
```
Only stage files that were truly changed.
---
## Task 3: Implement Zhihu action mapping on the chosen Release-browser seam
**Files:**
- Modify: `src/browser/ws_protocol.rs`
- Modify: `src/browser/zhihu_release_backend.rs`
- Test: `tests/browser_ws_protocol_test.rs`
- Create: `tests/browser_zhihu_release_backend_test.rs`
- [ ] **Step 1: Write the first failing backend test for Zhihu navigate mapping**
Create `tests/browser_zhihu_release_backend_test.rs` with a fake transport/relay and assert that `Action::Navigate` for the Zhihu path becomes the exact browser request shape proven by Task 1.
Start with this shape:
```rust
#[test]
fn zhihu_release_backend_maps_navigate_to_proven_release_frame() {
    // invoke Action::Navigate
    // assert exact outbound frame/opcode chosen from transcript evidence
}
```
Required assertions:
- the call site still uses `BrowserBackend::invoke(...)`
- the exact outbound frame matches the recorded Release-browser evidence
- request correlation stays deterministic
- [ ] **Step 2: Run the single new backend test and verify it fails**
Run:
```bash
cargo test --test browser_zhihu_release_backend_test zhihu_release_backend_maps_navigate_to_proven_release_frame -- --nocapture
```
Expected: FAIL because the backend does not exist yet.
- [ ] **Step 3: Implement minimal `Navigate` support**
In `src/browser/zhihu_release_backend.rs`:
- implement `BrowserBackend`
- support `Action::Navigate` first
- use `ws_protocol.rs` helpers for exact browser-frame construction
- do not hardcode speculative opcodes; follow the transcript from Task 1
- [ ] **Step 4: Add failing tests for `GetText` and `Eval`**
Add tests proving:
- `Action::GetText` returns `CommandOutput.data == {"text": "..."}`
- `Action::Eval` returns `CommandOutput.data == {"text": "..."}`
- callback or relay failures become `PipeError::Protocol(...)`
- [ ] **Step 5: Implement `GetText` and `Eval` on the chosen seam**
Use the smallest proven mechanism:
- if the transcript proves page-defined callback functions are required, route through them
- if `callBackJsToCpp(...)` to a page context is still part of the proven path, use it deliberately
- if `sgBrowserRegJsFun` / `sgBrowserExcuteJsFun` becomes necessary, add it only with test coverage and only for the Zhihu path
- [ ] **Step 6: Run focused backend/protocol tests**
Run:
```bash
cargo test --test browser_zhihu_release_backend_test -- --nocapture
cargo test --test browser_ws_protocol_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 7: Commit the Zhihu backend slice**
```bash
git add src/browser/ws_protocol.rs src/browser/zhihu_release_backend.rs src/browser/mod.rs tests/browser_ws_protocol_test.rs tests/browser_zhihu_release_backend_test.rs
git commit -m "feat: add zhihu release ws backend"
```
---
## Task 4: Keep the Zhihu workflow logic stable and patch only proven mismatches
**Files:**
- Review: `src/compat/workflow_executor.rs`
- Test: `tests/service_task_flow_test.rs`
- Test: `tests/compat_runtime_test.rs` (only if a focused direct-execution regression is needed)
- [ ] **Step 1: Write a failing Zhihu-specific regression only if the chosen seam changes route assumptions**
If the new Zhihu backend changes request-url or target-url handling enough to break hotlist flow, add one focused failing regression for that exact behavior.
Candidate assertions:
- hotlist navigate still logs `navigate https://www.zhihu.com/hot`
- follow-up `GetText body` still targets the Zhihu page, not any helper page
- extractor `Eval` still runs against Zhihu, not any helper page
- [ ] **Step 2: Keep the current high-level Zhihu action sequence unless a test proves otherwise**
`src/compat/workflow_executor.rs` currently does the right high-level work:
- navigate to Zhihu hotlist
- poll body text until ready
- run the extractor script
Prefer to keep this file unchanged. Only patch it if the new backend needs a narrow explicit `target_url` fix or similar evidence-backed adjustment.
- [ ] **Step 3: Run the smallest Zhihu-focused verification sweep**
Run:
```bash
cargo test --test service_task_flow_test -- --nocapture
cargo test --test compat_runtime_test zhihu -- --nocapture
```
If the `compat_runtime_test zhihu` filter is too broad or unstable, run the exact focused Zhihu cases that cover hotlist extraction.
- [ ] **Step 4: Commit only if a Zhihu-specific code change was actually required**
```bash
git add src/compat/workflow_executor.rs tests/service_task_flow_test.rs tests/compat_runtime_test.rs
git commit -m "fix: keep zhihu workflow aligned with release ws backend"
```
Skip this commit if no production change in `workflow_executor.rs` was needed.
---
## Task 5: Prove that pipe behavior and non-Zhihu behavior stayed unchanged
**Files:**
- Test: `tests/service_ws_session_test.rs`
- Test: `tests/service_task_flow_test.rs`
- Test: `tests/task_runner_test.rs`
- [ ] **Step 1: Add or update one regression that proves pipe messages are unchanged**
Use the smallest existing test seam to assert that `ClientMessage` / `ServiceMessage` payloads remain unchanged while the Zhihu route uses the new browser integration path internally.
- [ ] **Step 2: Add or update one regression that proves non-Zhihu behavior is unchanged**
Use a non-Zhihu submit or service-session case and assert it does not take the new Zhihu-specific backend path.
- [ ] **Step 3: Preserve current runtime regression guards**
The end-to-end tests must continue asserting that output does **not** contain:
- `invalid hmac seed: session key must not be empty`
- `Cannot drop a runtime in a context where blocking is not allowed`
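These guards can share one helper so every end-to-end test checks the same list. A minimal sketch, assuming the captured output is available as a single string (`FORBIDDEN_OUTPUT` and `forbidden_messages_in` are hypothetical names):

```rust
/// Messages that must never appear in end-to-end output, per the list above.
const FORBIDDEN_OUTPUT: [&str; 2] = [
    "invalid hmac seed: session key must not be empty",
    "Cannot drop a runtime in a context where blocking is not allowed",
];

/// Returns every forbidden message found in `output`, so a failing test
/// can report all of them at once.
fn forbidden_messages_in(output: &str) -> Vec<&'static str> {
    FORBIDDEN_OUTPUT
        .iter()
        .copied()
        .filter(|message| output.contains(*message))
        .collect()
}
```

A test would then assert that `forbidden_messages_in(&captured_output)` is empty, which gives a readable failure naming the offending message.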
- [ ] **Step 4: Run the final focused verification sweep**
Run:
```bash
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 5: Commit the verification sweep**
```bash
git add tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/task_runner_test.rs tests/browser_ws_backend_test.rs
git commit -m "test: constrain zhihu release ws migration scope"
```
Only stage files that were truly changed.
---
## Out of scope
Do **not** do these in this slice:
- change the pipe protocol
- change `ClientMessage` / `ServiceMessage`
- redesign `run_submit_task_with_browser_backend(...)`
- reintroduce any browser bridge surface
- keep adding speculative direct-raw-websocket callback patches to `ws_backend.rs`
- redesign non-Zhihu workflows unless the new backend abstraction forces a shared fix
- create a long-lived external dependency or third-party server just to host the helper page
---
## Verification checklist
Run at minimum:
```bash
cargo test --test browser_ws_probe_test -- --nocapture
cargo test --test browser_zhihu_release_backend_test -- --nocapture
cargo test --test browser_ws_protocol_test -- --nocapture
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```
If Task 2 chose the helper-page / relay design, also run the helper-page-specific backend tests you added for that path.
Manual verification after code changes:
1. start the real Release browser/runtime that exposes `ws://127.0.0.1:12345`
2. start `sg_claw` with real config
3. start `sg_claw_client`
4. submit:
   - `打开知乎热榜获取前10条数据并导出 Excel` ("Open the Zhihu hot list, fetch the top 10 entries, and export them to Excel")
5. confirm the Zhihu path uses the exact Release-browser interaction seam proven by Task 1
6. if Task 2 chose Option B, confirm the helper page / relay path is used only for the Zhihu integration seam
7. confirm non-Zhihu behavior is unchanged
8. confirm the task completes without:
- `timeout while waiting for browser message`
- `invalid browser status frame: Welcome! You are client #1`
- `invalid hmac seed: session key must not be empty`
- `Cannot drop a runtime in a context where blocking is not allowed`
---
## Expected outcome
After this slice:
- sgclaw still exposes the same pipe/service contract
- Zhihu hotlist execution uses the Release-browser websocket contract proven by Task 1
- non-Zhihu behavior remains unchanged
- old direct-ws Zhihu assumptions are no longer ambiguous in production/tests
- if Option A won, Zhihu uses a direct Release-browser backend
- if Option B won, Zhihu uses the minimal helper-page / relay seam justified by the probe evidence

@@ -0,0 +1,322 @@
# Zhihu WS Submit Realignment Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Realign Zhihu submit routes to the documented websocket callback model, removing helper-page bootstrap from the mainline while keeping the existing pipe/service contract unchanged.
**Architecture:** The change stays inside the existing submit-path backend selection and websocket protocol flow. Zhihu routes stop choosing `BrowserCallbackBackend` and instead use `WsBrowserBackend` when a real browser websocket is configured, preserving the existing pipe fallback in direct runtime when no websocket URL is available.
**Tech Stack:** Rust, tungstenite websocket client/server, serde_json, cargo test
---
## File Map
- Modify: `src/service/server.rs`
  - Change only the Zhihu route-gated submit-path backend selection
  - Remove Zhihu submit mainline use of `LiveBrowserCallbackHost` / `BrowserCallbackBackend`
  - Keep service submit path on `WsBrowserBackend`
  - Preserve initial request URL derivation for Zhihu routes
- Modify: `src/agent/mod.rs`
  - Change only the Zhihu route-gated submit-path backend selection
  - Remove Zhihu submit mainline use of `LiveBrowserCallbackHost` / `BrowserCallbackBackend`
  - Keep direct runtime pipe fallback when browser websocket URL is absent
- Modify: `tests/agent_runtime_test.rs`
  - Replace helper-page bootstrap regression with direct websocket submit regression
  - Assert no `/sgclaw/browser-helper.html` bootstrap frames are emitted
  - Assert real-page request ownership on follow-up Zhihu actions
- Modify: `src/browser/callback_host.rs`
  - Remove or rewrite the now-wrong red test that preserves Option-B callback-host startup behavior
- Verify: `tests/browser_ws_backend_test.rs`
  - Reuse existing websocket request-url behavior coverage; extend only if the new regression proves insufficient
- Reference: `docs/superpowers/specs/2026-04-04-zhihu-ws-submit-realignment-design.md`
### Task 1: Rewrite the stale submit regression around the real websocket mainline
**Files:**
- Modify: `tests/agent_runtime_test.rs:507-660`
- Test: `tests/agent_runtime_test.rs`
- [ ] **Step 1: Write the failing test**
Rename and rewrite the existing helper-page regression so it asserts the new behavior:
```rust
#[test]
fn production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap() {
    // arrange runtime context and fake browser ws server
    // submit Zhihu hotlist request
    // assert ws frames never contain "/sgclaw/browser-helper.html"
    // assert first action is navigate to https://www.zhihu.com/hot
    // assert follow-up action uses real-page requesturl instead of helper page
}
```
Use the existing fake ws helpers in the file where possible. Do not add localhost callback-host HTTP plumbing to this rewritten test.
- [ ] **Step 2: Run test to verify it fails**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
Expected: FAIL because current production code still routes Zhihu submit into `BrowserCallbackBackend` and emits helper-page bootstrap frames.
- [ ] **Step 3: Keep the regression focused**
Before touching production code, confirm the rewritten test checks only these behaviors:
```text
- no callback-host bootstrap frame
- no helper-page URL
- navigate frame still targets https://www.zhihu.com/hot
- follow-up websocket action uses real-page request ownership
```
Do not assert unrelated workflow details beyond what is needed to prove the route correction.
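Two of the checks listed in Step 3 can be expressed as small predicates over captured frames. This is a hedged sketch: it assumes the fake websocket server exposes captured frames as plain strings, and `no_helper_bootstrap` and `navigates_to_zhihu_hot` are hypothetical helper names rather than existing functions in `tests/agent_runtime_test.rs`.

```rust
/// Marker that identifies the helper-page bootstrap in a websocket frame.
const HELPER_PAGE_PATH: &str = "/sgclaw/browser-helper.html";
/// Navigation target the Zhihu hotlist route is expected to open.
const ZHIHU_HOT_URL: &str = "https://www.zhihu.com/hot";

/// Returns true when no captured frame mentions the helper page.
fn no_helper_bootstrap(frames: &[String]) -> bool {
    frames.iter().all(|frame| !frame.contains(HELPER_PAGE_PATH))
}

/// Returns true when at least one captured frame targets the Zhihu hotlist URL.
/// (The real regression should additionally check that it is the first action.)
fn navigates_to_zhihu_hot(frames: &[String]) -> bool {
    frames.iter().any(|frame| frame.contains(ZHIHU_HOT_URL))
}
```

Keeping the predicates substring-based avoids coupling the regression to frame details that this slice does not change.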
- [ ] **Step 4: Commit the red test**
```bash
git add tests/agent_runtime_test.rs
git commit -m "test: rewrite zhihu submit ws routing regression"
```
### Task 2: Switch service Zhihu submit routes off the callback-host backend
**Files:**
- Modify: `src/service/server.rs:287-328`
- Test: `tests/agent_runtime_test.rs`
- [ ] **Step 1: Write the minimal production change**
Replace only the Zhihu-route callback-host branch with direct websocket backend selection.
Minimal target shape:
```rust
fn browser_backend_for_submit(
    browser_ws_url: &str,
    mac_policy: &MacPolicy,
    request: &SubmitTaskRequest,
) -> Result<Arc<dyn BrowserBackend>, PipeError> {
    if should_use_callback_host_backend(request) {
        return Ok(Arc::new(WsBrowserBackend::new(
            Arc::new(ServiceWsClient::connect(browser_ws_url)?),
            mac_policy.clone(),
            initial_request_url_for_submit_task(request),
        )));
    }
    Ok(Arc::new(WsBrowserBackend::new(
        Arc::new(ServiceWsClient::connect(browser_ws_url)?),
        mac_policy.clone(),
        initial_request_url_for_submit_task(request),
    )))
}
```
After the route-gated branch is removed, simplify further only if the branch becomes redundant without changing non-Zhihu behavior.
- [ ] **Step 2: Run the rewritten regression**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
Expected: the test still FAILS, or fails at a later assertion, until the direct-runtime path is also corrected in Task 3.
- [ ] **Step 3: Add or update a service-specific regression if needed**
If the rewritten `agent_runtime_test` does not exercise the service submit path directly, add one narrow service regression before continuing.
Target shape:
```rust
#[test]
fn service_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap() {
    // fake browser ws
    // submit Zhihu route through service path
    // assert no helper bootstrap frame
}
```
Run the exact test you end up using:
`cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test <exact test file> -- --nocapture`
Expected: PASS
- [ ] **Step 4: Commit the service-path fix**
```bash
git add src/service/server.rs tests/agent_runtime_test.rs
git commit -m "fix: route zhihu submit through ws backend"
```
### Task 3: Switch direct runtime Zhihu submit routes off the callback-host backend while keeping pipe fallback
**Files:**
- Modify: `src/agent/mod.rs:49-100`
- Test: `tests/agent_runtime_test.rs`
- [ ] **Step 1: Write the minimal production change**
Remove callback-host backend selection from `browser_backend_for_submit(...)`.
Minimal target behavior:
```rust
if let Some(browser_ws_url) = configured_browser_ws_url(context) {
    return Ok(Arc::new(WsBrowserBackend::new(
        Arc::new(ServiceWsClient::connect(&browser_ws_url)?),
        browser_tool.mac_policy().clone(),
        initial_request_url_for_submit_task(request),
    ).with_response_timeout(browser_tool.response_timeout())));
}
Ok(Arc::new(PipeBrowserBackend::from_inner(browser_tool.clone())))
```
If `ServiceWsClient` is not reusable from `src/service/server.rs`, extract the smallest shared websocket client helper into the browser module instead of inventing a new abstraction.
- [ ] **Step 2: Add a focused fallback assertion only if needed**
If the rewritten regression does not cover the direct-runtime no-websocket case, add one small test:
```rust
#[test]
fn production_submit_task_keeps_pipe_fallback_when_browser_ws_url_is_unset() {
    // no SGCLAW_BROWSER_WS_URL
    // blank/no ws config
    // assert no websocket bootstrap attempt occurs
}
```
Only add this test if current coverage is insufficient.
- [ ] **Step 3: Run tests to verify green**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
Expected: PASS
If a fallback test was added, run it immediately after and expect PASS.
- [ ] **Step 4: Commit the direct-runtime fix**
```bash
git add src/agent/mod.rs tests/agent_runtime_test.rs
git commit -m "fix: align runtime zhihu submit with ws contract"
```
### Task 4: Reassess stale callback-host regression coverage only if it blocks the approved slice
**Files:**
- Maybe modify: `src/browser/callback_host.rs:793-810`
- Test: `src/browser/callback_host.rs`
- [ ] **Step 1: Check whether the callback-host red test still blocks the approved Option A slice**
Inspect whether this test still preserves rejected Option-B behavior and whether it fails or becomes misleading after Tasks 1-3:
```rust
#[test]
fn live_callback_host_starts_without_bootstrapping_external_helper_page() {
    // inspect before editing
}
```
If the test is unrelated to the approved Zhihu mainline or remains harmless, leave it unchanged in this slice.
- [ ] **Step 2: Remove or rewrite only if required by the changed submit-path behavior**
If the test blocks the approved slice, make the smallest change needed:
- delete it if it exists only to preserve rejected Option B behavior, or
- rewrite it so it no longer asserts callback-host startup as the accepted Zhihu mainline
- [ ] **Step 3: Run focused callback-host tests only if Step 2 changed code**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_host --lib -- --nocapture`
Expected: PASS
- [ ] **Step 4: Commit only if Step 2 changed code**
```bash
git add src/browser/callback_host.rs
git commit -m "test: clean up stale callback host regression"
```
### Task 5: Run the focused verification sweep
**Files:**
- Verify: `tests/agent_runtime_test.rs`
- Verify: `tests/compat_runtime_test.rs`
- Verify: any directly affected service/browser websocket tests
- [ ] **Step 1: Run submit-path regression coverage**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
Expected: PASS
- [ ] **Step 2: Run websocket backend request-url coverage**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" ws_backend_reuses_last_navigated_url_for_followup_requests --test browser_ws_backend_test -- --nocapture`
Expected: PASS
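The behavior this test name describes, reusing the last navigated URL for follow-up requests, can be modeled roughly as follows. This is only an illustrative model inferred from the test name and the "real-page request ownership" wording in this plan; `RequestUrlTracker` is a hypothetical type and not the actual `WsBrowserBackend` implementation.

```rust
/// Hypothetical model of request-URL ownership; not the real `WsBrowserBackend`.
struct RequestUrlTracker {
    current: String,
}

impl RequestUrlTracker {
    /// Starts from the initial request URL derived for the submit task.
    fn new(initial_request_url: &str) -> Self {
        Self { current: initial_request_url.to_string() }
    }

    /// Records a navigate so later actions are attributed to the new page.
    fn on_navigate(&mut self, url: &str) {
        self.current = url.to_string();
    }

    /// URL a follow-up action would carry as its request URL.
    fn request_url(&self) -> &str {
        &self.current
    }
}
```

Under this model, a follow-up action after navigating to the hotlist carries the hotlist URL, never a helper-page URL.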
- [ ] **Step 3: Run Zhihu compat runtime coverage**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" zhihu --test compat_runtime_test -- --nocapture`
Expected: PASS for tests that cover the changed submit-path surface; any remaining failures must be clearly and directly related to this change.
- [ ] **Step 4: Run affected service submit regression coverage**
Run the exact service-specific regression from Task 2 if you added one.
Otherwise, run the narrowest existing service submit test that covers `ClientMessage::SubmitTask` handling for browser routes.
Expected: PASS
- [ ] **Step 5: Commit the verified slice**
```bash
git add src/service/server.rs src/agent/mod.rs tests/agent_runtime_test.rs src/browser/callback_host.rs
git commit -m "fix: realign zhihu submit with browser ws callbacks"
```
### Task 6: Run stronger real-browser validation
**Files:**
- Verify live behavior through existing binaries and real config only
- [ ] **Step 1: Start the service with the real config**
Run: `cargo run --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"`
Expected: service starts without failing at callback-host readiness timeout.
- [ ] **Step 2: Run the client against the started service**
Run: `cargo run --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sg_claw_client`
Expected: for `打开知乎热榜获取前10条数据并导出 Excel` ("open the Zhihu hotlist, fetch the top 10 entries, and export to Excel"), the browser proceeds into real Zhihu page work instead of stalling before page open.
- [ ] **Step 3: Capture the narrow acceptance evidence**
Verify all of the following from logs/frames/observed behavior:
```text
- no callback-host readiness timeout
- no helper-page bootstrap frame
- at least one real-page follow-up browser action after navigate
```
- [ ] **Step 4: Commit only if live verification required code changes**
```bash
git add <only files changed during live-fix follow-up>
git commit -m "fix: tighten zhihu ws submit live validation follow-up"
```
If no further code changes were needed, do not create an extra commit.


@@ -0,0 +1,406 @@
# Service Chat Web Console Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add a standalone local HTML console that connects to the existing service websocket, submits natural-language tasks with the current `submit_task` payload, and leaves the browser-helper/runtime path untouched.
**Architecture:** The change stays fully at the presentation edge. A new self-contained HTML file under `frontend/service-console/` reuses the current websocket protocol from `src/service/protocol.rs`, while one narrow Rust integration test guards the page's protocol shape and forbids any reference to `browser-helper.html`, callback-host endpoints, or the browser websocket. No Rust runtime logic changes are part of this slice.
**Tech Stack:** HTML, CSS, vanilla JavaScript, Rust integration tests, std::fs, Cargo test
---
## File Map
- Create: `frontend/service-console/sg_claw_service_console.html`
- Standalone local page with inline CSS and JavaScript
- Connects to the existing service websocket at `ws://127.0.0.1:42321` by default
- Sends existing `ClientMessage::SubmitTask` JSON
- Renders inbound `ServiceMessage` rows only
- Create: `tests/service_console_html_test.rs`
- Source guard for the standalone page
- Verifies file location, allowed protocol usage, and forbidden helper/callback references
- Reference: `src/service/protocol.rs`
- Existing websocket message shape to mirror exactly
- Reference: `src/bin/sg_claw_client.rs`
- Existing terminal client behavior to mirror for `submit_task`
- Reference: `docs/superpowers/specs/2026-04-06-service-chat-web-console-design.md`
## Scope Guardrails
- Do not modify `src/service/server.rs`.
- Do not modify `src/browser/callback_host.rs`.
- Do not modify `src/browser/callback_backend.rs`.
- Do not modify `src/bin/sg_claw_client.rs`.
- Do not add an HTTP server.
- Do not connect the new page to `ws://127.0.0.1:12345`.
- Do not reference `/sgclaw/browser-helper.html` or `/sgclaw/callback/*` anywhere in the new page.
### Task 1: Add a failing source-guard test for the standalone page
**Files:**
- Create: `tests/service_console_html_test.rs`
- Reference: `docs/superpowers/specs/2026-04-06-service-chat-web-console-design.md`
- [ ] **Step 1: Write the failing test**
Create a focused integration test that resolves the HTML path from `CARGO_MANIFEST_DIR` and asserts the file contract.
```rust
use std::fs;
use std::path::PathBuf;

#[test]
fn service_console_html_stays_on_service_ws_boundary() {
    let manifest_dir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
    let html_path = manifest_dir
        .join("frontend")
        .join("service-console")
        .join("sg_claw_service_console.html");
    let source = fs::read_to_string(&html_path)
        .expect("service console html should exist");
    assert!(source.contains("ws://127.0.0.1:42321"));
    assert!(source.contains("submit_task"));
    assert!(!source.contains("/sgclaw/browser-helper.html"));
    assert!(!source.contains("/sgclaw/callback/ready"));
    assert!(!source.contains("/sgclaw/callback/events"));
    assert!(!source.contains("/sgclaw/callback/commands/next"));
    assert!(!source.contains("/sgclaw/callback/commands/ack"));
    assert!(!source.contains("ws://127.0.0.1:12345"));
}
```
- [ ] **Step 2: Run test to verify it fails**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_console_html_stays_on_service_ws_boundary --test service_console_html_test -- --exact
```
Expected: FAIL because the HTML file does not exist yet.
- [ ] **Step 3: Keep the test narrow**
Before writing production code, confirm the test guards only the approved boundary:
```text
- file exists at frontend/service-console/sg_claw_service_console.html
- service websocket default is present
- submit_task payload marker is present
- no helper-page path
- no callback-host endpoints
- no browser websocket URL
```
Do not turn this into an end-to-end browser test.
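If the forbidden-string checks in the guard test above ever need better failure output, they could be collapsed into one helper that reports which markers leaked. This is a sketch under that assumption; `FORBIDDEN_MARKERS` and `forbidden_markers_in` are hypothetical names, and the marker list is copied from the test's existing assertions.

```rust
/// Strings the standalone page must never contain (from the guard test).
const FORBIDDEN_MARKERS: [&str; 6] = [
    "/sgclaw/browser-helper.html",
    "/sgclaw/callback/ready",
    "/sgclaw/callback/events",
    "/sgclaw/callback/commands/next",
    "/sgclaw/callback/commands/ack",
    "ws://127.0.0.1:12345",
];

/// Returns every forbidden marker found in `source`, in list order.
fn forbidden_markers_in(source: &str) -> Vec<&'static str> {
    FORBIDDEN_MARKERS
        .iter()
        .copied()
        .filter(|marker| source.contains(*marker))
        .collect()
}
```

A test could then assert that the returned vector is empty and print the offending markers on failure, while still staying a pure source-level guard.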
- [ ] **Step 4: Commit the red test**
```bash
git add tests/service_console_html_test.rs
git commit -m "test: add service console html boundary guard"
```
### Task 2: Implement the standalone HTML console with the approved boundary
**Files:**
- Create: `frontend/service-console/sg_claw_service_console.html`
- Reference: `src/service/protocol.rs:6`
- Reference: `src/bin/sg_claw_client.rs:16`
- Test: `tests/service_console_html_test.rs`
- [ ] **Step 1: Create the HTML file with the minimal structure**
Write one self-contained page with:
```html
<!doctype html>
<html lang="zh-CN">
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <title>sgClaw Service Console</title>
  </head>
  <body>
    <div id="app">
      <input id="wsUrl" value="ws://127.0.0.1:42321" />
      <button id="connectBtn">连接</button>
      <div id="connectionState">未连接</div>
      <div id="messageStream"></div>
      <textarea id="instructionInput"></textarea>
      <div id="validationText"></div>
      <button id="sendBtn" disabled>发送任务</button>
    </div>
  </body>
</html>
```
Keep all CSS and JavaScript inline. Do not add external assets or a build step.
- [ ] **Step 2: Implement websocket connect/disconnect behavior**
Add the smallest possible JS behavior, including explicit disconnect on the same button so the UI matches the approved connect/disconnect contract:
```javascript
let socket = null;

function appendRow(kind, text) {
  // append a visible row to #messageStream
}

function updateUiState() {
  const connected = socket && socket.readyState === WebSocket.OPEN;
  // UI labels: '断开' = disconnect, '连接' = connect,
  // '已连接' = connected, '未连接' = not connected
  document.getElementById('connectBtn').textContent = connected ? '断开' : '连接';
  document.getElementById('sendBtn').disabled = !connected;
  document.getElementById('connectionState').textContent = connected ? '已连接' : '未连接';
}

function connectOrDisconnectService() {
  // The same button closes an open socket instead of opening a second one.
  if (socket && socket.readyState === WebSocket.OPEN) {
    socket.close();
    return;
  }
  const url = document.getElementById('wsUrl').value.trim() || 'ws://127.0.0.1:42321';
  socket = new WebSocket(url);
  updateUiState();
  socket.addEventListener('open', () => {
    appendRow('status', 'service websocket connected');
    updateUiState();
  });
  socket.addEventListener('close', () => {
    appendRow('status', 'service websocket disconnected');
    updateUiState();
  });
  socket.addEventListener('error', () => appendRow('error', 'service websocket error'));
  socket.addEventListener('message', handleMessage);
}
```
Do not add retry loops or background reconnect logic.
- [ ] **Step 3: Implement submit_task sending with the current message shape**
Mirror the terminal client payload shape exactly and show inline validation for empty input:
```javascript
function setValidation(message) {
  document.getElementById('validationText').textContent = message;
}

function sendTask() {
  const instruction = document.getElementById('instructionInput').value.trim();
  if (!socket || socket.readyState !== WebSocket.OPEN) {
    return;
  }
  if (!instruction) {
    // Validation text means "Please enter the task content."
    setValidation('请输入任务内容。');
    return;
  }
  setValidation('');
  socket.send(JSON.stringify({
    type: 'submit_task',
    instruction,
    conversation_id: '',
    messages: [],
    page_url: '',
    page_title: ''
  }));
}
```
Do not add new fields. Do not add conversation replay logic in this slice.
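For cross-checking the frame shape from the Rust side, the same payload and the empty-input rule can be reproduced with the standard library alone. This is a hedged sketch: `escape_json` and `submit_task_payload` are hypothetical helpers, the field list is copied from the JavaScript above, and the real project would more likely serialize through `serde_json`.

```rust
/// Escapes the characters that must not appear raw inside a JSON string.
fn escape_json(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds the `submit_task` frame, or returns `None` for blank input,
/// mirroring the page's inline validation.
fn submit_task_payload(instruction: &str) -> Option<String> {
    let trimmed = instruction.trim();
    if trimmed.is_empty() {
        return None;
    }
    Some(format!(
        r#"{{"type":"submit_task","instruction":"{}","conversation_id":"","messages":[],"page_url":"","page_title":""}}"#,
        escape_json(trimmed)
    ))
}
```

Returning `None` for blank input keeps the "no frame is sent" rule explicit instead of relying on the caller to check first.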
- [ ] **Step 4: Render existing inbound service messages only**
Handle the current `ServiceMessage` variants with a minimal dispatcher:
```javascript
function handleMessage(event) {
  const message = JSON.parse(event.data);
  switch (message.type) {
    case 'status_changed':
      appendRow('status', message.state);
      break;
    case 'log_entry':
      appendRow('log', message.message);
      break;
    case 'task_complete':
      appendRow(message.success ? 'complete' : 'error', message.summary);
      break;
    case 'busy':
      appendRow('error', message.message);
      break;
    default:
      appendRow('error', 'unknown service message: ' + event.data);
  }
}
```
Keep the composer enabled during in-flight work so repeated submits surface the existing `busy` response instead of inventing a frontend queue.
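The dispatcher's mapping from message type to rendered row can also be written down as a Rust table, which may help when comparing the page against `src/service/protocol.rs`. This is an assumption-laden sketch: the enum below is a hypothetical mirror built only from the fields the JavaScript reads, and the real `ServiceMessage` may carry different or additional fields.

```rust
/// Hypothetical mirror of the inbound variants the dispatcher handles.
enum ServiceMessage {
    StatusChanged { state: String },
    LogEntry { message: String },
    TaskComplete { success: bool, summary: String },
    Busy { message: String },
}

/// Maps a message to the (row kind, text) pair the page would render.
fn row_for(message: &ServiceMessage) -> (&'static str, &str) {
    match message {
        ServiceMessage::StatusChanged { state } => ("status", state.as_str()),
        ServiceMessage::LogEntry { message } => ("log", message.as_str()),
        ServiceMessage::TaskComplete { success, summary } => {
            // A failed task is rendered as an error row, as in the JS dispatcher.
            (if *success { "complete" } else { "error" }, summary.as_str())
        }
        ServiceMessage::Busy { message } => ("error", message.as_str()),
    }
}
```

Both a failed `task_complete` and a `busy` reply land in the same `error` row kind, so the page needs no separate queue or retry state.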
- [ ] **Step 5: Keep the helper boundary explicit in the source**
Before running tests, inspect the HTML source and confirm:
```text
- no /sgclaw/browser-helper.html
- no /sgclaw/callback/*
- no ws://127.0.0.1:12345
- no browser websocket register frame logic
```
If any such string appears, remove it before testing.
- [ ] **Step 6: Run the source-guard test to verify green**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_console_html_stays_on_service_ws_boundary --test service_console_html_test -- --exact
```
Expected: PASS
- [ ] **Step 7: Commit the standalone page**
```bash
git add frontend/service-console/sg_claw_service_console.html tests/service_console_html_test.rs
git commit -m "feat: add standalone service chat console"
```
### Task 3: Run the focused verification sweep
**Files:**
- Verify: `tests/service_console_html_test.rs`
- Reference: `src/service/protocol.rs`
- Reference: `src/bin/sg_claw_client.rs`
- [ ] **Step 1: Re-run the source-guard test**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_console_html_stays_on_service_ws_boundary --test service_console_html_test -- --exact
```
Expected: PASS
- [ ] **Step 2: Manually inspect disconnected-send and validation markers in the HTML source**
Before broader verification, confirm the page source clearly contains all three UI-local rules:
```text
- connect button can disconnect an open websocket
- send button starts disabled while disconnected
- empty instruction shows inline validation text
```
This inspection stays source-level; do not add extra backend tests for it in this slice.
- [ ] **Step 3: Run an existing service protocol regression for safety**
Run the narrow existing protocol coverage to prove the page did not require backend changes:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" submit_task_client_message_converts_into_shared_runner_request --test service_ws_session_test -- --exact
```
Expected: PASS
- [ ] **Step 4: Run an existing terminal-client regression for safety**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" client_submits_first_user_line_to_service --test service_task_flow_test -- --exact
```
Expected: PASS
- [ ] **Step 5: Commit only if verification required any code change**
```bash
git add frontend/service-console/sg_claw_service_console.html tests/service_console_html_test.rs
git commit -m "test: tighten service console verification"
```
If verification required no code changes, do not create an extra commit.
### Task 4: Perform the manual smoke check
**Files:**
- Verify live behavior only; no new code required
- [ ] **Step 1: Start the existing service binary**
Run:
```bash
cargo run --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```
Expected: service starts and prints its ready line with the service websocket listen address.
- [ ] **Step 2: Open the standalone page directly**
Open:
```text
D:/data/ideaSpace/rust/sgClaw/claw-new/frontend/service-console/sg_claw_service_console.html
```
Expected: the page loads through the browser as a local file and shows the default websocket URL `ws://127.0.0.1:42321`.
- [ ] **Step 3: Connect, disconnect, and reconnect once**
Expected:
```text
- message stream shows websocket connected
- clicking the same button disconnects the websocket cleanly
- message stream shows websocket disconnected
- send button is disabled again while disconnected
- reconnect succeeds without reloading the page
```
- [ ] **Step 4: Submit one natural-language task**
Use a small harmless instruction such as the following ("open Baidu"):
```text
打开百度
```
Expected:
```text
- empty textarea send attempt first shows inline validation without sending a websocket frame
- page sends one submit_task payload after valid input
- page receives and renders status/log/task_complete or busy rows
```
- [ ] **Step 5: Confirm the helper boundary stayed untouched**
Verify from the page source and observed behavior:
```text
- the page never loads /sgclaw/browser-helper.html
- the page never calls /sgclaw/callback/*
- the page never connects to ws://127.0.0.1:12345
```
If the task itself triggers browser automation, that remains owned by the existing Rust runtime rather than by the page.
- [ ] **Step 6: Commit only if the manual pass required code changes**
```bash
git add frontend/service-console/sg_claw_service_console.html tests/service_console_html_test.rs
git commit -m "fix: tighten standalone service console smoke flow"
```
If the manual pass required no code changes, do not create an extra commit.


@@ -0,0 +1,637 @@
# Zhihu Hotlist Post-Export Auto-Open Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Extend the existing Zhihu hotlist Excel and dashboard routes so each route can auto-open its own generated artifact after export, while preserving the current callback-host-backed browser boundary and route exclusivity.
**Architecture:** Keep orchestration in `src/compat/workflow_executor.rs`, but move post-export side effects into a new `src/compat/artifact_open.rs` helper so workflow routing stays readable. Excel auto-open is a local OS-launch side effect; dashboard auto-open reuses `screen_html_export`'s existing `presentation.url` and sends one narrow, marker-based `Action::Navigate` request through `BrowserCallbackBackend`, with a matching special-case validator in `MacPolicy` so arbitrary `file://` navigation remains blocked.
**Tech Stack:** Rust, serde_json, std::process::Command, std::path, Cargo tests
---
## File Map
- Create: `src/compat/artifact_open.rs`
- Define the narrow post-export helper surface for this slice only
- Parse and validate generated artifact payload fields passed in by the workflow layer
- Open generated `.xlsx` files with the local default app
- Build the exact approved local-dashboard navigate payload
- Keep one testable internal seam, `open_exported_xlsx_with(output_path, opener)`, so unit tests can prove the generated `.xlsx` path is handed to the launcher without starting a real spreadsheet app
- Include unit tests in the same file for exact Excel path handoff and launcher-failure reporting
- Modify: `src/compat/mod.rs`
- Export the new `artifact_open` module
- Modify: `src/compat/workflow_executor.rs`
- Keep route detection and artifact generation where they are now
- Change `export_xlsx(...)` and `export_screen(...)` so they parse tool payloads, call the route-specific opener, and produce the new success/failure summaries
- Modify: `src/browser/callback_backend.rs`
- Recognize only the approved local-dashboard navigate request shape at `Action::Navigate`
- Keep normal remote navigate behavior unchanged
- Continue emitting `sgBrowerserOpenPage` for the approved local-dashboard case so the helper page stays alive and the dashboard opens in a new visible tab
- Add focused callback-backend unit tests in the existing test module for approved and malformed local-dashboard requests
- Modify: `src/security/mac_policy.rs`
- Add a narrow validator for the approved local-dashboard presentation case
- Keep `validate(...)` unchanged for ordinary remote-domain flow
- Reject malformed marker payloads, non-HTML local paths, and mismatched `file://` / output-path combinations
- Modify: `tests/compat_runtime_test.rs`
- Keep the concrete hotlist workflow regressions in this existing integration test file
- Extend existing Zhihu hotlist export/screen regressions to assert the new summaries and the dashboard marker payload
- Keep the Excel route workflow assertion limited to summary plus “no dashboard navigate marker,” because exact launcher handoff is covered in `src/compat/artifact_open.rs` unit tests
- Modify: `tests/browser_tool_test.rs`
- Add `MacPolicy` coverage for approved local-dashboard presentation, rejected malformed presentation, and unchanged normal-domain validation in one exact file
- Extend the existing `default_rules_allow_zhihu_navigation` area with the new local-dashboard validation tests rather than creating a second policy test location
- Reference only if summary wording ripples outward: `tests/agent_runtime_test.rs:173-258`
- Existing direct-runtime user-visible summary assertion for Zhihu Excel export
- Reference only if summary wording ripples outward: `tests/service_task_flow_test.rs:704-839`
- Existing CLI-to-service user-visible summary assertion for Zhihu Excel export
- Reference only if summary wording ripples outward: `tests/service_ws_session_test.rs:755-869`
- Existing service-binary user-visible summary assertion for Zhihu Excel export
- Reference: `tests/compat_screen_html_export_tool_test.rs`
- Reuse the exact test seam `screen_html_export_tool_renders_dashboard_html_with_presentation_contract`
- Existing proof that `screen_html_export` already returns `presentation.url`
- Reference: `docs/superpowers/specs/2026-04-06-zhihu-hotlist-post-export-auto-open-design.md`
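The narrow validator described for `src/security/mac_policy.rs` in the file map above could take roughly the following shape. This is a sketch, not the planned implementation: `is_approved_local_dashboard` is a hypothetical name, the marker domain is taken from this plan's dashboard test, and the output-path cross-check mentioned in the file map is deliberately left out.

```rust
/// Marker domain carried by the dashboard navigate request in this plan.
const LOCAL_DASHBOARD_DOMAIN: &str = "__sgclaw_local_dashboard__";

/// Accepts only a `file://` HTML URL that equals the presentation URL and
/// arrives with the marker domain. Everything else stays rejected.
fn is_approved_local_dashboard(url: &str, presentation_url: &str, expected_domain: &str) -> bool {
    expected_domain == LOCAL_DASHBOARD_DOMAIN
        && url == presentation_url
        && url.starts_with("file://")
        && url.to_ascii_lowercase().ends_with(".html")
}
```

Because every condition must hold at once, an ordinary remote navigate never matches and continues through the unchanged `validate(...)` path.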
## Scope Guardrails
- Do not modify `frontend/service-console/sg_claw_service_console.html`.
- Do not modify `src/service/protocol.rs`.
- Do not modify `browser-helper.html`.
- Do not modify `/sgclaw/callback/*` endpoint contracts.
- Do not modify websocket protocol framing or `src/browser/ws_protocol.rs`.
- Do not turn Excel-open and dashboard-open into a combined mode.
- Do not add a general-purpose local file browser or generic `file://` allowlist.
- Do not move post-export decisions into the frontend service console.
- Do not require websocket-backend parity in this slice.
### Task 1: Add failing workflow tests for route-specific post-export actions
**Files:**
- Modify: `tests/compat_runtime_test.rs:2154-2304`
- Reference: `src/compat/workflow_executor.rs:375-446`
- Reference: `docs/superpowers/specs/2026-04-06-zhihu-hotlist-post-export-auto-open-design.md`
- [ ] **Step 1: Rewrite the Excel hotlist assertion as a red test for the new summary only**
Keep the current flow setup, but tighten the expectation so it proves the workflow route now reports post-export open success while staying exclusive from the dashboard path.
Target shape:
```rust
#[test]
fn handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open() {
    // existing setup
    // Summary text means "exported and opened the Zhihu hotlist Excel".
    assert!(summary.contains("已导出并打开知乎热榜 Excel"));
    assert!(generated.exists());
    // The Excel route must not emit the dashboard navigate marker.
    assert!(!sent.iter().any(|message| {
        matches!(
            message,
            AgentMessage::Command { action, params, .. }
                if action == &Action::Navigate
                    && params.get("sgclaw_local_dashboard_open").is_some()
        )
    }));
}
```
Do not try to prove real OS launching in this workflow test. The exact `.xlsx` path handoff to the launcher belongs in `src/compat/artifact_open.rs` unit tests from Task 2.
- [ ] **Step 2: Rewrite the dashboard hotlist assertion as a red test for browser auto-open**
Tighten the existing dashboard test so it proves the workflow consumes `presentation.url` and emits the approved compat marker payload.
Target shape:
```rust
#[test]
fn handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open() {
    // existing setup
    // Summary text means "opened the Zhihu hotlist dashboard in the browser".
    assert!(summary.contains("已在浏览器中打开知乎热榜大屏"));
    let navigate = sent
        .iter()
        .find_map(|message| match message {
            AgentMessage::Command { action, params, security, .. }
                if action == &Action::Navigate
                    && security.expected_domain == "__sgclaw_local_dashboard__" =>
            {
                Some((params, security))
            }
            _ => None,
        })
        .expect("dashboard route should emit local-dashboard navigate request");
    assert!(navigate.0["url"].as_str().unwrap().starts_with("file://"));
    assert_eq!(navigate.0["sgclaw_local_dashboard_open"]["source"], json!("compat.workflow_executor"));
    assert_eq!(navigate.0["sgclaw_local_dashboard_open"]["kind"], json!("zhihu_hotlist_screen"));
    assert_eq!(navigate.0["sgclaw_local_dashboard_open"]["presentation_url"], navigate.0["url"]);
}
```
Also assert that this route still logs `call screen_html_export` and does not invoke the Excel opener path.
- [ ] **Step 3: Add a missing-`presentation.url` regression in the workflow test module if none exists**
Put this close to the existing hotlist tests and keep it narrow:
```rust
#[test]
fn handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing() {
    // mock screen_html_export success payload with output_path but no presentation.url
    // assert summary contains 已生成知乎热榜大屏 <path>,但浏览器自动打开失败:
    // (meaning: "generated the Zhihu hotlist dashboard <path>, but automatic opening in the browser failed:")
}
```
Use the existing summary/path helpers in the file instead of inventing new parsing helpers.
- [ ] **Step 4: Run the focused compat runtime tests to verify they fail**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing --test compat_runtime_test -- --exact
```
Expected: FAIL because the workflow still returns artifact-only summaries and has no post-export open handling.
- [ ] **Step 5: Commit the red workflow tests**
```bash
git add tests/compat_runtime_test.rs
git commit -m "test: add hotlist post-export auto-open regressions"
```
### Task 2: Implement the compat post-export opener and update workflow summaries
**Files:**
- Create: `src/compat/artifact_open.rs`
- Modify: `src/compat/mod.rs`
- Modify: `src/compat/workflow_executor.rs:375-446`
- Test: `src/compat/artifact_open.rs`
- Test: `tests/compat_runtime_test.rs`
- [ ] **Step 1: Add the red unit tests in `src/compat/artifact_open.rs` before writing production code**
Create the new module with a `#[cfg(test)]` block first so the Excel opener has an exact, non-UI verification seam.
Target tests:
```rust
#[test]
fn open_exported_xlsx_with_passes_generated_path_to_launcher() {
    let mut seen = None;
    let result = open_exported_xlsx_with(Path::new("C:/tmp/zhihu-hotlist.xlsx"), |path| {
        seen = Some(path.to_path_buf());
        Ok(())
    });
    assert!(matches!(result, PostExportOpen::Opened));
    assert_eq!(seen.unwrap(), PathBuf::from("C:/tmp/zhihu-hotlist.xlsx"));
}

#[test]
fn open_exported_xlsx_with_reports_launcher_failure() {
    let result = open_exported_xlsx_with(Path::new("C:/tmp/zhihu-hotlist.xlsx"), |_path| {
        Err("launcher failed".to_string())
    });
    assert!(matches!(result, PostExportOpen::Failed(reason) if reason.contains("launcher failed")));
}
```
Add one matching dashboard payload test in the same file:
```rust
#[test]
fn open_local_dashboard_uses_exact_approved_marker_payload() {
    // FakeBrowserBackend records invoke(action, params, expected_domain)
    // assert expected_domain == "__sgclaw_local_dashboard__"
    // assert params.url == params.sgclaw_local_dashboard_open.presentation_url
    // assert source/kind/output_path all match the approved contract
}
```
This step is mandatory so the Excel route is proven to hand the generated path to the opener without launching a real application.
- [ ] **Step 2: Run the new unit tests to verify they fail**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_passes_generated_path_to_launcher --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_reports_launcher_failure --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_local_dashboard_uses_exact_approved_marker_payload --lib
```
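Expected: FAIL, because `src/compat/artifact_open.rs` holds only the tests so far and the production items they call (`open_exported_xlsx_with`, `PostExportOpen`, `open_local_dashboard`) do not exist yet. If cargo instead reports 0 tests run, the module is not yet declared in `src/compat/mod.rs` (Step 6); treat that as red as well.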
- [ ] **Step 3: Create the small compat opener module**
Add one focused helper module rather than embedding side effects directly into `workflow_executor.rs`.
Target shape:
```rust
pub const LOCAL_DASHBOARD_EXPECTED_DOMAIN: &str = "__sgclaw_local_dashboard__";
pub const LOCAL_DASHBOARD_SOURCE: &str = "compat.workflow_executor";
pub const LOCAL_DASHBOARD_KIND_ZHIHU_HOTLIST_SCREEN: &str = "zhihu_hotlist_screen";

pub enum PostExportOpen {
    Opened,
    Failed(String),
}

pub fn open_exported_xlsx(output_path: &Path) -> PostExportOpen {
    open_exported_xlsx_with(output_path, launch_with_default_xlsx_app)
}

fn open_exported_xlsx_with<F>(output_path: &Path, opener: F) -> PostExportOpen
where
    F: FnOnce(&Path) -> Result<(), String>,
{ /* test seam */ }

pub fn open_local_dashboard(
    browser_backend: &dyn BrowserBackend,
    output_path: &Path,
    presentation_url: &str,
) -> PostExportOpen { /* invoke Action::Navigate with exact marker payload */ }
```
Keep the module tiny. The only dedicated test seam in this file should be `open_exported_xlsx_with(...)`; do not introduce a general launcher trait.
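The `/* test seam */` body can stay very small. The following is a minimal sketch, assuming the seam only maps the launcher result onto `PostExportOpen`; the `Debug`/`PartialEq` derives are an addition for easier assertions and are not part of the target shape above.

```rust
use std::path::Path;

/// Outcome of the post-export open attempt (same variants as the target shape).
/// The derives are an assumption made for this sketch.
#[derive(Debug, PartialEq)]
pub enum PostExportOpen {
    Opened,
    Failed(String),
}

/// Test seam: the caller supplies the launcher, so tests never start a real application.
pub fn open_exported_xlsx_with<F>(output_path: &Path, opener: F) -> PostExportOpen
where
    F: FnOnce(&Path) -> Result<(), String>,
{
    // Deliberately no filesystem check here: the unit tests pass a path that
    // need not exist, so any existence check belongs in the real launcher.
    match opener(output_path) {
        Ok(()) => PostExportOpen::Opened,
        Err(reason) => PostExportOpen::Failed(reason),
    }
}
```

Keeping the existence check out of the seam is a design choice of this sketch: it lets both Step 1 tests run on machines where `C:/tmp/zhihu-hotlist.xlsx` is absent.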
- [ ] **Step 4: Implement the Windows-first `.xlsx` opener minimally**
Use a focused local launcher that targets the current environment first.
Preferred target shape:
```rust
Command::new("cmd")
    // `start` treats its first quoted argument as the window title, so pass an empty one
    .args(["/C", "start", "", output_path_as_windows_string])
    .spawn()
```
Requirements:
```text
- fail if the path does not exist
- do not swallow command-spawn errors
- do not open arbitrary user-selected files from outside this workflow
- keep cross-platform behavior minimal; only add a fallback branch if required to keep tests/build portable
```
If you need a non-Windows fallback for compilation, keep it obviously minimal and out of the hot path.
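The requirements above could be met by a launcher along these lines. This is a sketch under assumptions: the body of `launch_with_default_xlsx_app` and the helper name `spawn_default_app` are hypothetical, and the error strings are illustrative only.

```rust
use std::path::Path;

/// Hypothetical body for `launch_with_default_xlsx_app`: reject missing files
/// before spawning anything, and surface spawn errors as text.
pub fn launch_with_default_xlsx_app(output_path: &Path) -> Result<(), String> {
    if !output_path.is_file() {
        return Err(format!("exported file not found: {}", output_path.display()));
    }
    spawn_default_app(output_path)
}

#[cfg(windows)]
fn spawn_default_app(output_path: &Path) -> Result<(), String> {
    use std::process::Command;
    // `start` treats its first quoted argument as a window title, so pass an empty one.
    Command::new("cmd")
        .args(["/C", "start", ""])
        .arg(output_path)
        .spawn()
        .map(|_child| ())
        .map_err(|err| format!("failed to spawn cmd /C start: {err}"))
}

#[cfg(not(windows))]
fn spawn_default_app(_output_path: &Path) -> Result<(), String> {
    // Minimal fallback so the crate still builds elsewhere; nothing is launched.
    Err("auto-open is only implemented for Windows".to_string())
}
```

A successful `spawn()` only shows that `cmd` started; it does not prove that the spreadsheet application opened the file.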
- [ ] **Step 5: Parse payloads in `workflow_executor.rs` and call the new helper**
Refactor `export_xlsx(...)` and `export_screen(...)` just enough to separate:
```text
- tool execution
- payload parsing
- route-specific post-export open
- summary formatting
```
Minimal target behavior:
```rust
// `output_path` is a `PathBuf`; `Path` has no `Display` impl, so format it via `.display()`.
match open_exported_xlsx(&output_path) {
    PostExportOpen::Opened => format!("已导出并打开知乎热榜 Excel {}", output_path.display()),
    PostExportOpen::Failed(reason) => format!("已导出知乎热榜 Excel {},但自动打开失败:{reason}", output_path.display()),
}
```
```rust
match open_local_dashboard(browser_backend, &output_path, &presentation_url) {
    PostExportOpen::Opened => format!("已在浏览器中打开知乎热榜大屏 {}", output_path.display()),
    PostExportOpen::Failed(reason) => format!("已生成知乎热榜大屏 {},但浏览器自动打开失败:{reason}", output_path.display()),
}
```
Change signatures only as much as needed to pass `browser_backend` into the dashboard route. Do not broaden unrelated call chains.
- [ ] **Step 6: Export the helper module**
Update `src/compat/mod.rs`:
```rust
pub mod artifact_open;
```
Do not reorder unrelated module exports unless rustfmt does it.
- [ ] **Step 7: Run the focused library and workflow regressions to verify green**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_passes_generated_path_to_launcher --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_reports_launcher_failure --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_local_dashboard_uses_exact_approved_marker_payload --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing --test compat_runtime_test -- --exact
```
Expected: PASS for the new library tests and the Excel workflow regression. The dashboard workflow regressions may still FAIL at backend/policy validation until Task 4 lands; Task 4 Step 5 re-runs the dashboard route.
- [ ] **Step 8: Commit the compat opener and workflow changes**
```bash
git add src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs tests/compat_runtime_test.rs
git commit -m "feat: auto-open zhihu hotlist export artifacts"
```
### Task 3: Add failing backend and security tests for the narrow local-dashboard allowance
**Files:**
- Modify: `src/browser/callback_backend.rs:536-840`
- Modify: `tests/browser_tool_test.rs` (`default_rules_allow_zhihu_navigation` section plus new local-dashboard validation tests)
- Reference: `src/security/mac_policy.rs:56-132`
- [ ] **Step 1: Add a red callback-backend acceptance test for the approved local-dashboard request shape**
Extend the existing `src/browser/callback_backend.rs` test module with one focused navigate test.
Target shape:
```rust
#[test]
fn callback_backend_accepts_approved_local_dashboard_navigate_request() {
    let host = Arc::new(FakeCallbackHost::new(vec![success_reply(json!({ "navigated": true }))]));
    let backend = BrowserCallbackBackend::new(
        host.clone(),
        test_policy(),
        "http://127.0.0.1:17888/sgclaw/browser-helper.html",
    );
    let output = backend.invoke(
        Action::Navigate,
        json!({
            "url": "file:///C:/tmp/zhihu-hotlist-screen.html",
            "sgclaw_local_dashboard_open": {
                "source": "compat.workflow_executor",
                "kind": "zhihu_hotlist_screen",
                "output_path": "C:/tmp/zhihu-hotlist-screen.html",
                "presentation_url": "file:///C:/tmp/zhihu-hotlist-screen.html"
            }
        }),
        "__sgclaw_local_dashboard__",
    );
    assert!(output.unwrap().success);
    assert_eq!(host.requests()[0].command, json!([
        "http://127.0.0.1:17888/sgclaw/browser-helper.html",
        "sgBrowerserOpenPage",
        "file:///C:/tmp/zhihu-hotlist-screen.html"
    ]));
}
```
Do not weaken any existing normal-domain tests.
- [ ] **Step 2: Add red rejection tests in exact files**
Put malformed-request rejection in `src/browser/callback_backend.rs` next to the acceptance test:
```rust
#[test]
fn callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields() {}
```
Put policy-only validation in `tests/browser_tool_test.rs` so all public `MacPolicy` assertions stay in one place:
```rust
#[test]
fn mac_policy_rejects_non_html_local_dashboard_presentation() {}
#[test]
fn default_rules_allow_zhihu_navigation() {
    let policy = MacPolicy::load_from_path(...).unwrap();
    policy.validate(&Action::Navigate, "www.zhihu.com").unwrap();
}
```
Do not create a second `MacPolicy` regression location.
- [ ] **Step 3: Run the focused backend/policy tests to verify red**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_accepts_approved_local_dashboard_navigate_request --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" mac_policy_rejects_non_html_local_dashboard_presentation --test browser_tool_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" default_rules_allow_zhihu_navigation --test browser_tool_test -- --exact
```
Expected: the new local-dashboard tests FAIL; `default_rules_allow_zhihu_navigation` should still PASS.
- [ ] **Step 4: Commit the red backend/security tests**
```bash
git add src/browser/callback_backend.rs tests/browser_tool_test.rs
git commit -m "test: lock local dashboard navigate boundary"
```
### Task 4: Implement the narrow callback-backend and MacPolicy allowance
**Files:**
- Modify: `src/browser/callback_backend.rs:300-351`
- Modify: `src/security/mac_policy.rs:56-132`
- Maybe modify: `src/security/mod.rs:9-27`
- Test: `src/browser/callback_backend.rs:536-840`
- Test: `tests/browser_tool_test.rs` (`default_rules_allow_zhihu_navigation` section plus new local-dashboard validation tests)
- [ ] **Step 1: Add a narrow local-dashboard validation helper in `MacPolicy`**
Keep `validate(...)` unchanged for ordinary domain flow. Add one small explicit helper instead.
Target shape:
```rust
pub fn validate_local_dashboard_presentation(
    &self,
    action: &Action,
    expected_domain: &str,
    presentation_url: &str,
    output_path: &str,
) -> Result<(), SecurityError> {
    // require Action::Navigate
    // require expected_domain == "__sgclaw_local_dashboard__"
    // require file:// URL
    // require .html path
    // require normalized file URL path matches output_path
}
```
If you need a new `SecurityError` variant for malformed local-dashboard input, add the smallest one that keeps error text clear.
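The five commented requirements can be sketched as a free function. Everything below is illustrative: `Action` and `LocalDashboardError` are stand-ins for the project's `Action` and `SecurityError`, `normalize_local_path` is a hypothetical helper, and the normalization assumes plain Windows-style paths without percent-encoding.

```rust
/// Hypothetical stand-in for the project's `Action` type.
#[derive(Debug, PartialEq)]
pub enum Action {
    Navigate,
    Click,
}

/// Hypothetical stand-in for the project's `SecurityError`.
#[derive(Debug, PartialEq)]
pub enum LocalDashboardError {
    WrongAction,
    WrongDomain,
    NotFileUrl,
    NotHtml,
    PathMismatch,
}

pub const LOCAL_DASHBOARD_EXPECTED_DOMAIN: &str = "__sgclaw_local_dashboard__";

/// Reduce `file:///C:/tmp/a.html` and `C:\tmp\a.html` to the same `C:/tmp/a.html` form.
/// Assumption: no percent-encoding; real code should use a proper URL parser.
fn normalize_local_path(raw: &str) -> String {
    let without_scheme = raw.strip_prefix("file://").unwrap_or(raw);
    let forward = without_scheme.replace('\\', "/");
    // `file:///C:/...` leaves `/C:/...`; drop the slash in front of the drive letter.
    let has_drive = forward.starts_with('/') && forward.as_bytes().get(2) == Some(&b':');
    if has_drive { forward[1..].to_string() } else { forward }
}

pub fn validate_local_dashboard_presentation(
    action: &Action,
    expected_domain: &str,
    presentation_url: &str,
    output_path: &str,
) -> Result<(), LocalDashboardError> {
    if *action != Action::Navigate {
        return Err(LocalDashboardError::WrongAction);
    }
    if expected_domain != LOCAL_DASHBOARD_EXPECTED_DOMAIN {
        return Err(LocalDashboardError::WrongDomain);
    }
    if !presentation_url.starts_with("file://") {
        return Err(LocalDashboardError::NotFileUrl);
    }
    if !presentation_url.to_ascii_lowercase().ends_with(".html") {
        return Err(LocalDashboardError::NotHtml);
    }
    if normalize_local_path(presentation_url) != normalize_local_path(output_path) {
        return Err(LocalDashboardError::PathMismatch);
    }
    Ok(())
}
```

The checks run from cheapest to most specific, so a request for an ordinary domain is rejected before any path handling happens.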
- [ ] **Step 2: Recognize only the exact approved request shape in `BrowserCallbackBackend::invoke(...)`**
Before the normal `self.mac_policy.validate(&action, expected_domain)?` path runs, detect the one approved special case.
Minimal target behavior:
```rust
if let Some(local_dashboard) = approved_local_dashboard_request(&action, &params, expected_domain) {
    self.mac_policy.validate_local_dashboard_presentation(
        &action,
        expected_domain,
        &local_dashboard.presentation_url,
        &local_dashboard.output_path,
    )?;
} else {
    self.mac_policy.validate(&action, expected_domain)?;
}
```
The helper should require all of these fields exactly:
```text
- action == Action::Navigate
- expected_domain == "__sgclaw_local_dashboard__"
- params.url exists
- params.sgclaw_local_dashboard_open.source == "compat.workflow_executor"
- params.sgclaw_local_dashboard_open.kind == "zhihu_hotlist_screen"
- params.sgclaw_local_dashboard_open.output_path exists
- params.sgclaw_local_dashboard_open.presentation_url exists and equals params.url
```
Anything else must continue down the normal rejection path.
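The field checks listed above might look like the following sketch. It assumes a flattened, hypothetical `NavigateParams` view instead of the real JSON `params` value, and a boolean in place of the `Action` comparison; only the decision logic is meant to carry over.

```rust
/// Hypothetical flattened view of the navigate params; real code would read
/// these fields out of the JSON `params` value.
pub struct NavigateParams<'a> {
    pub url: Option<&'a str>,
    pub marker_source: Option<&'a str>,
    pub marker_kind: Option<&'a str>,
    pub marker_output_path: Option<&'a str>,
    pub marker_presentation_url: Option<&'a str>,
}

/// The two values the policy helper needs once the shape is approved.
#[derive(Debug, PartialEq)]
pub struct LocalDashboardRequest {
    pub presentation_url: String,
    pub output_path: String,
}

/// Returns `Some` only for the exact approved shape; any missing or
/// mismatched field yields `None`, which sends the request down the
/// normal validation path.
pub fn approved_local_dashboard_request(
    is_navigate: bool,
    params: &NavigateParams<'_>,
    expected_domain: &str,
) -> Option<LocalDashboardRequest> {
    if !is_navigate || expected_domain != "__sgclaw_local_dashboard__" {
        return None;
    }
    let url = params.url?;
    if params.marker_source? != "compat.workflow_executor" {
        return None;
    }
    if params.marker_kind? != "zhihu_hotlist_screen" {
        return None;
    }
    let output_path = params.marker_output_path?;
    let presentation_url = params.marker_presentation_url?;
    if presentation_url != url {
        return None;
    }
    Some(LocalDashboardRequest {
        presentation_url: presentation_url.to_string(),
        output_path: output_path.to_string(),
    })
}
```

Returning `Option` rather than `Result` keeps the helper a pure recognizer: it never decides to reject, it only declines to approve.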
- [ ] **Step 3: Keep `build_command(Action::Navigate, ...)` simple**
Do not add a second browser opcode or change the callback-host runtime contract. The approved local-dashboard case should still flow into the existing navigate command builder so the emitted command stays:
```rust
json!([
    self.helper_page_url,
    "sgBrowerserOpenPage",
    target_url,
])
```
- [ ] **Step 4: Run the focused backend/security tests to verify green**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_accepts_approved_local_dashboard_navigate_request --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" mac_policy_rejects_non_html_local_dashboard_presentation --test browser_tool_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" default_rules_allow_zhihu_navigation --test browser_tool_test -- --exact
```
Expected: PASS
- [ ] **Step 5: Re-run the dashboard workflow regression after backend validation lands**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
```
Expected: PASS
- [ ] **Step 6: Commit the backend/security implementation**
```bash
git add src/browser/callback_backend.rs src/security/mac_policy.rs src/security/mod.rs tests/browser_tool_test.rs tests/compat_runtime_test.rs
git commit -m "fix: allow approved local dashboard auto-open"
```
If `src/security/mod.rs` did not change, omit it from the commit.
### Task 5: Run the focused verification sweep
**Files:**
- Verify: `src/compat/artifact_open.rs`
- Verify: `tests/compat_runtime_test.rs`
- Verify: `tests/compat_screen_html_export_tool_test.rs`
- Verify: `tests/browser_tool_test.rs`
- Verify: `src/browser/callback_backend.rs` test module
- Reference only if summary wording ripples outward: `tests/agent_runtime_test.rs:173-258`
- Reference only if summary wording ripples outward: `tests/service_task_flow_test.rs:704-839`
- Reference only if summary wording ripples outward: `tests/service_ws_session_test.rs:755-869`
- [ ] **Step 1: Re-run the library and workflow regressions**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_passes_generated_path_to_launcher --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_reports_launcher_failure --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_local_dashboard_uses_exact_approved_marker_payload --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing --test compat_runtime_test -- --exact
```
Expected: PASS
- [ ] **Step 2: Re-run the tool contract regression that the dashboard route depends on**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" screen_html_export_tool_renders_dashboard_html_with_presentation_contract --test compat_screen_html_export_tool_test -- --exact
```
Expected: PASS
- [ ] **Step 3: Re-run the callback-backend and policy boundary tests**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_accepts_approved_local_dashboard_navigate_request --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" mac_policy_rejects_non_html_local_dashboard_presentation --test browser_tool_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" default_rules_allow_zhihu_navigation --test browser_tool_test -- --exact
```
Expected: PASS
- [ ] **Step 4: Re-run outward-facing summary regressions only if needed**
Only if the updated summary text breaks existing assertions, run exactly these existing regressions and adjust only the affected expectation text:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" client_to_service_regression_routes_zhihu_without_helper_bootstrap_or_invalid_hmac_seed_output --test service_task_flow_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_binary_submit_flow_routes_zhihu_without_helper_bootstrap --test service_ws_session_test -- --exact
```
Expected: PASS for any test you had to touch. Skip this step entirely if those files needed no edits.
- [ ] **Step 5: Inspect scope before finishing with exact git commands**
Run:
```bash
git diff --name-only -- src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs src/browser/callback_backend.rs src/security/mac_policy.rs src/security/mod.rs tests/compat_runtime_test.rs tests/browser_tool_test.rs tests/agent_runtime_test.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs
git diff --stat -- src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs src/browser/callback_backend.rs src/security/mac_policy.rs src/security/mod.rs tests/compat_runtime_test.rs tests/browser_tool_test.rs tests/agent_runtime_test.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs
```
Confirm the diff only touches:
```text
- compat workflow/orchestration
- compat post-export helper module
- callback backend narrow local-dashboard acceptance
- MacPolicy narrow local-dashboard validation
- focused related tests
```
Confirm it does **not** touch:
```text
- frontend/service-console/
- src/service/protocol.rs
- browser-helper.html
- callback-host endpoint contracts
- websocket transport/protocol files
```
- [ ] **Step 6: Commit only if verification required additional code changes**
```bash
git add src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs src/browser/callback_backend.rs src/security/mac_policy.rs tests/compat_runtime_test.rs tests/browser_tool_test.rs tests/agent_runtime_test.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs
git commit -m "test: tighten hotlist post-export auto-open verification"
```
If verification required no further code changes, do not create an extra commit.

@@ -0,0 +1,281 @@
# Config-Owned Direct Skill Contract Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Validate the `directSubmitSkill` control surface early and prevent malformed direct-skill configs from entering the submit routing path, without changing the current happy-path direct execution behavior.
**Architecture:** Keep the existing direct-submit runtime and submit-task seam intact for valid configs. Move `directSubmitSkill` format validation into the normal `SgClawSettings` load path so malformed config fails before routing begins, while leaving valid-but-unresolvable `skill.tool` targets as direct runtime errors in the current direct path.
**Tech Stack:** Rust 2021, `serde` config parsing, current `BrowserMessage::SubmitTask` path, current direct skill runtime, Rust integration tests.
---
## Execution Context
- Follow @superpowers:test-driven-development for the Rust code changes in this plan.
- Follow @superpowers:verification-before-completion before claiming any task is done.
- Do **not** create a git worktree unless the user explicitly asks. This project prefers staying in the current checkout.
- Keep scope tight: this plan does **not** add per-skill dispatch metadata, docs changes, intent classification, or LLM routing changes.
## File Map
### Existing files to modify
- Modify: `src/config/settings.rs`
  - validate `directSubmitSkill` during config normalization
  - keep the stored field as `Option<String>` so the current direct runtime API stays stable
- Modify: `tests/compat_config_test.rs`
  - add a failing config-load regression for malformed `directSubmitSkill`
- Modify: `tests/agent_runtime_test.rs`
  - add a failing submit-path regression proving malformed config is rejected before direct routing begins
### Existing files to read but not broaden
- Reuse without redesign: `src/agent/mod.rs`
- Reuse without redesign: `src/compat/direct_skill_runtime.rs`
- Reuse without redesign: `docs/superpowers/specs/2026-04-09-config-owned-direct-skill-dispatch-design.md`
### No new files expected
This slice should fit in the existing config and tests surfaces only.
---
### Task 1: Validate `directSubmitSkill` Before Submit Routing
**Files:**
- Modify: `tests/compat_config_test.rs`
- Modify: `tests/agent_runtime_test.rs`
- Modify: `src/config/settings.rs`
- Read only: `src/agent/mod.rs`
- Read only: `src/compat/direct_skill_runtime.rs`
- [ ] **Step 1: Write the failing config test for malformed `directSubmitSkill`**
Add this focused test to `tests/compat_config_test.rs`:
```rust
#[test]
fn sgclaw_settings_reject_invalid_direct_submit_skill_format() {
    let root = std::env::temp_dir().join(format!(
        "sgclaw-invalid-direct-submit-skill-{}",
        Uuid::new_v4()
    ));
    fs::create_dir_all(&root).unwrap();
    let config_path = root.join("sgclaw_config.json");
    fs::write(
        &config_path,
        r#"{
            "providers": [],
            "skillsDir": "skill_lib",
            "directSubmitSkill": "fault-details-report"
        }"#,
    )
    .unwrap();

    let err = SgClawSettings::load(Some(config_path.as_path()))
        .expect_err("expected invalid directSubmitSkill format");
    let message = err.to_string();
    assert!(message.contains("directSubmitSkill"));
    assert!(message.contains("skill.tool"));
}
```
- [ ] **Step 2: Run the focused config test and verify it fails**
Run:
```bash
cargo test --test compat_config_test sgclaw_settings_reject_invalid_direct_submit_skill_format -- --nocapture
```
Expected: FAIL because the current config loader accepts the malformed string instead of rejecting it early.
- [ ] **Step 3: Write the failing agent regression for malformed config**
Add this focused test to `tests/agent_runtime_test.rs`:
```rust
#[test]
fn submit_task_rejects_invalid_direct_submit_skill_config_before_routing() {
    std::env::remove_var("DEEPSEEK_API_KEY");
    std::env::remove_var("DEEPSEEK_BASE_URL");
    std::env::remove_var("DEEPSEEK_MODEL");

    let skill_root = build_direct_runtime_skill_root();
    let workspace_root = std::env::temp_dir().join(format!(
        "sgclaw-invalid-direct-submit-workspace-{}",
        Uuid::new_v4()
    ));
    fs::create_dir_all(&workspace_root).unwrap();
    let config_path = workspace_root.join("sgclaw_config.json");
    fs::write(
        &config_path,
        serde_json::json!({
            "providers": [],
            "skillsDir": skill_root,
            "directSubmitSkill": "fault-details-report"
        })
        .to_string(),
    )
    .unwrap();

    let runtime_context = AgentRuntimeContext::new(Some(config_path), workspace_root);
    let transport = Arc::new(MockTransport::new(vec![]));
    let browser_tool = BrowserPipeTool::new(
        transport.clone(),
        direct_runtime_test_policy(),
        vec![1, 2, 3, 4, 5, 6, 7, 8],
    )
    .with_response_timeout(Duration::from_secs(1));

    handle_browser_message_with_context(
        transport.as_ref(),
        &browser_tool,
        &runtime_context,
        submit_fault_details_message(),
    )
    .unwrap();

    let sent = transport.sent_messages();
    assert!(matches!(
        sent.last(),
        Some(AgentMessage::TaskComplete { success, summary })
            if !success && summary.contains("skill.tool")
    ));
    assert!(direct_submit_mode_logs(&sent).is_empty());
    assert!(!sent.iter().any(|message| matches!(message, AgentMessage::Command { .. })));
}
```
- [ ] **Step 4: Run the focused agent test and verify it fails**
Run:
```bash
cargo test --test agent_runtime_test submit_task_rejects_invalid_direct_submit_skill_config_before_routing -- --nocapture
```
Expected: FAIL because the malformed config currently loads, enters the direct-submit branch, and emits `direct_skill_primary` before failing later.
- [ ] **Step 5: Implement the minimal config validation**
In `src/config/settings.rs`, add a small helper that validates the normalized `directSubmitSkill` string during `SgClawSettings::new(...)`.
Recommended implementation shape:
```rust
fn normalize_direct_submit_skill(raw: Option<String>) -> Result<Option<String>, ConfigError> {
    let value = normalize_optional_value(raw);
    let Some(value) = value.as_deref() else {
        return Ok(None);
    };
    let Some((skill_name, tool_name)) = value.split_once('.') else {
        return Err(ConfigError::InvalidValue(
            "directSubmitSkill",
            format!("must use skill.tool format, got {value}"),
        ));
    };
    if skill_name.trim().is_empty() || tool_name.trim().is_empty() {
        return Err(ConfigError::InvalidValue(
            "directSubmitSkill",
            format!("must use skill.tool format, got {value}"),
        ));
    }
    Ok(Some(value.to_string()))
}
```
Then use it here:
```rust
let direct_submit_skill = normalize_direct_submit_skill(direct_submit_skill)?;
```
Rules:
- do not change the public field type from `Option<String>`
- do not move parsing responsibility into `src/agent/mod.rs`
- do not redesign `src/compat/direct_skill_runtime.rs`
- keep valid-but-unresolvable `skill.tool` targets as runtime errors in the direct path
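
The validation rule can be exercised in isolation with a self-contained sketch. It assumes stand-ins for project types: `InvalidValue` replaces `ConfigError::InvalidValue`, and the body of `normalize_optional_value` (trim, then drop empty strings) is a guess at what the existing helper does.

```rust
use std::fmt;

/// Hypothetical stand-in for the project's `ConfigError::InvalidValue`.
#[derive(Debug, PartialEq)]
pub struct InvalidValue {
    pub field: &'static str,
    pub message: String,
}

impl fmt::Display for InvalidValue {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid {}: {}", self.field, self.message)
    }
}

/// Assumed behavior of the existing helper: trim, then treat empty as absent.
fn normalize_optional_value(raw: Option<String>) -> Option<String> {
    raw.map(|value| value.trim().to_string())
        .filter(|value| !value.is_empty())
}

/// Accept only `skill.tool` with both halves non-empty; absent stays absent.
pub fn normalize_direct_submit_skill(raw: Option<String>) -> Result<Option<String>, InvalidValue> {
    let Some(value) = normalize_optional_value(raw) else {
        return Ok(None);
    };
    let well_formed = matches!(
        value.split_once('.'),
        Some((skill, tool)) if !skill.trim().is_empty() && !tool.trim().is_empty()
    );
    if !well_formed {
        return Err(InvalidValue {
            field: "directSubmitSkill",
            message: format!("must use skill.tool format, got {value}"),
        });
    }
    Ok(Some(value))
}
```

Because the function only checks the format, a well-formed but nonexistent target such as `missing-skill.missing_tool` still loads and fails later in the direct runtime, which matches the last rule above.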
- [ ] **Step 6: Re-run the two focused tests and verify they pass**
Run:
```bash
cargo test --test compat_config_test sgclaw_settings_reject_invalid_direct_submit_skill_format -- --nocapture
cargo test --test agent_runtime_test submit_task_rejects_invalid_direct_submit_skill_config_before_routing -- --nocapture
```
Expected: PASS.
- [ ] **Step 7: Re-run the broader regression suites**
Run:
```bash
cargo test --test compat_config_test -- --nocapture
cargo test --test agent_runtime_test -- --nocapture
cargo test --test browser_script_skill_tool_test -- --nocapture
cargo build --bin sgclaw
```
Expected: PASS, including:
- the direct-submit happy path
- the existing no-LLM fallback behavior when `directSubmitSkill` is absent
- unchanged browser-script helper semantics
- clean binary build
---
## Verification Checklist
### Config validation
```bash
cargo test --test compat_config_test -- --nocapture
```
Expected: malformed `directSubmitSkill` is rejected early, while the existing direct-only config shape still loads.
### Submit-path behavior
```bash
cargo test --test agent_runtime_test -- --nocapture
```
Expected:
- malformed `directSubmitSkill` never reaches direct routing
- valid configured direct skill still succeeds without LLM config
- no direct skill configured still returns the existing no-LLM message
### Browser-script helper safety
```bash
cargo test --test browser_script_skill_tool_test -- --nocapture
```
Expected: current browser-script execution semantics remain unchanged.
### Build
```bash
cargo build --bin sgclaw
```
Expected: the main binary compiles cleanly.
---
## Notes For The Engineer
- The paired spec is `docs/superpowers/specs/2026-04-09-config-owned-direct-skill-dispatch-design.md`.
- Do **not** add sgClaw-specific dispatch metadata under `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging` in this slice.
- Do **not** turn this into a per-skill registry task yet. This plan only hardens the current config-owned bootstrap contract.
- Keep the current direct target example as `fault-details-report.collect_fault_details`; avoid hard-coding that name into new generic APIs.
- If you discover a need for broader policy routing (`direct_browser` / `llm_agent` by skill), stop and write a new spec/plan instead of expanding this one.

@@ -0,0 +1,520 @@
# Direct Skill Invocation Without LLM Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Let the current pipe submit-task flow accept natural-language input but directly invoke one fixed staged browser skill without calling any model, while reserving a clean switch back to LLM-based routing later.
**Architecture:** Keep the existing `BrowserMessage::SubmitTask` entrypoint and add one narrow pre-routing seam before the current compat/LLM chain. When a new config field points to a fixed direct-submit skill, sgClaw loads that skill package from the configured external skills root, finds the target `browser_script` tool, executes it through the existing browser-script wrapper, and returns the result directly. When the field is absent, the current behavior stays unchanged. This preserves a future path where each skill can later declare `direct_browser` or `llm_agent` dispatch without rewriting the submit pipeline again.
**Tech Stack:** Rust 2021, existing `BrowserPipeTool`, current submit-task agent entrypoint, current browser-script skill executor, current sgClaw JSON config loader, `zeroclaw` skill manifest loader.
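The pre-routing seam described in the Architecture paragraph can be pictured as a single decision made before the compat/LLM chain runs. This is only a sketch: `SubmitRoute` and `choose_submit_route` are illustrative names, not project APIs, and a value without a `.` simply falls back here because format checking is assumed to happen elsewhere.

```rust
/// Which path a submit-task request takes. Names are illustrative only.
#[derive(Debug, PartialEq)]
pub enum SubmitRoute<'a> {
    /// Run `skill.tool` directly through the browser-script wrapper, with no model call.
    DirectSkill { skill: &'a str, tool: &'a str },
    /// Fall through to the existing compat/LLM chain.
    CompatLlm,
}

/// Decide the route from the optional config field alone; the natural-language
/// instruction is not inspected, so no intent classification is involved.
pub fn choose_submit_route(direct_submit_skill: Option<&str>) -> SubmitRoute<'_> {
    match direct_submit_skill.and_then(|target| target.split_once('.')) {
        Some((skill, tool)) => SubmitRoute::DirectSkill { skill, tool },
        None => SubmitRoute::CompatLlm,
    }
}
```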
---
## Recommended First Skill
Use `fault-details-report.collect_fault_details` from:
- `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/fault-details-report/scene.json`
- `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.toml`
- `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
Why this one first:
- it is clearly a report/export skill
- it exposes exactly one browser-script tool: `collect_fault_details`
- it has the smallest contract surface (`period` only)
- its current JS is deterministic and simple, so the first slice can focus on plumbing instead of browser scraping complexity
## Scope Guardrails
- Do **not** redesign the existing submit-task protocol.
- Do **not** remove or rewrite the current LLM/compat path; leave it as the fallback/default path.
- Do **not** introduce generic NL intent routing in this slice; this is one fixed direct skill only.
- Do **not** modify `third_party/zeroclaw` skill manifest schema in phase 1.
- Do **not** add Excel export wiring in the first slice unless a test explicitly requires it.
- Do **not** invent a new browser-script execution model; reuse the existing wrapper semantics.
---
## File Map
### Existing files to modify
- Modify: `src/config/settings.rs`
  - add a minimal config field for one direct-submit skill name
- Modify: `src/agent/mod.rs`
  - add a narrow pre-routing branch before the current compat/LLM path
- Modify: `src/compat/browser_script_skill_tool.rs`
  - expose the smallest reusable helper for direct browser-script execution
- Modify: `src/compat/mod.rs` or the nearest module export surface
  - export the new narrow direct-skill runtime module if needed
- Modify: `tests/compat_config_test.rs`
  - add config coverage for the new direct-submit field
- Modify: `tests/browser_script_skill_tool_test.rs`
  - add coverage for the reusable direct-execution helper
- Modify: `tests/agent_runtime_test.rs`
  - prove submit-task can bypass the model and directly invoke the fixed skill
### New files to create
- Create: `src/compat/direct_skill_runtime.rs`
  - small runtime for loading one configured skill, resolving one configured tool, deriving minimal args, and executing it directly
### Files to reuse without changing behavior
- Reuse: `src/compat/runtime.rs`
- Reuse: `src/compat/orchestration.rs`
- Reuse: `src/compat/config_adapter.rs`
- Reuse: `third_party/zeroclaw/src/skills/mod.rs`
---
### Task 1: Add A Minimal Direct-Submit Skill Config Field
**Files:**
- Modify: `src/config/settings.rs`
- Modify: `tests/compat_config_test.rs`
- [ ] **Step 1: Write the failing config test for the new field**
In `tests/compat_config_test.rs`, add a focused config-load test proving the browser config file can declare one fixed direct-submit skill.
Test shape:
```rust
#[test]
fn sgclaw_settings_load_direct_submit_skill_from_browser_config() {
    let root = std::env::temp_dir().join(format!("sgclaw-direct-skill-{}", uuid::Uuid::new_v4()));
    std::fs::create_dir_all(&root).unwrap();
    let config_path = root.join("sgclaw_config.json");
    std::fs::write(
        &config_path,
        r#"{
  "apiKey": "sk-runtime",
  "baseUrl": "https://api.deepseek.com",
  "model": "deepseek-chat",
  "skillsDir": "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging",
  "directSubmitSkill": "fault-details-report.collect_fault_details"
}"#,
    )
    .unwrap();

    let settings = sgclaw::config::SgClawSettings::load(Some(config_path.as_path()))
        .unwrap()
        .expect("expected sgclaw settings from config file");

    assert_eq!(
        settings.direct_submit_skill.as_deref(),
        Some("fault-details-report.collect_fault_details")
    );
}
```
- [ ] **Step 2: Run the focused config test and verify it fails**
Run:
```bash
cargo test --test compat_config_test sgclaw_settings_load_direct_submit_skill_from_browser_config -- --nocapture
```
Expected: FAIL because the config field does not exist yet.
- [ ] **Step 3: Implement the minimal config field**
In `src/config/settings.rs`, add:
- `direct_submit_skill: Option<String>` to `SgClawSettings`
- `direct_submit_skill: Option<String>` to `RawSgClawSettings`
- field normalization in `SgClawSettings::new(...)`
Recommended JSON key shape:
```rust
#[serde(rename = "directSubmitSkill", alias = "direct_submit_skill", default)]
direct_submit_skill: Option<String>,
```
Rules:
- trim empty values to `None`
- keep `DeepSeekSettings` unchanged for this slice unless a compile error proves it must mirror the field
- do not alter unrelated config semantics
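The "trim empty values to `None`" rule can be sketched as a small standalone function. This is a minimal illustration only: `normalize_optional_setting` is a hypothetical name, and the real normalization is expected to live inside `SgClawSettings::new(...)` in whatever shape that constructor already uses.

```rust
/// Hypothetical helper (not part of the existing code base): collapse a missing,
/// empty, or whitespace-only config value to `None`, and trim everything else.
fn normalize_optional_setting(raw: Option<String>) -> Option<String> {
    raw.map(|value| value.trim().to_string())
        .filter(|value| !value.is_empty())
}
```

With this shape, a `directSubmitSkill` of `"   "` behaves exactly like an absent field, so the existing submit path stays the default.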
- [ ] **Step 4: Re-run the focused config test**
Run:
```bash
cargo test --test compat_config_test sgclaw_settings_load_direct_submit_skill_from_browser_config -- --nocapture
```
Expected: PASS.
- [ ] **Step 5: Re-run the broader config file tests**
Run:
```bash
cargo test --test compat_config_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 6: Commit Task 1**
```bash
git add src/config/settings.rs tests/compat_config_test.rs
git commit -m "feat: add direct submit skill config"
```
---
### Task 2: Extract A Reusable Browser-Script Direct Execution Helper
**Files:**
- Modify: `src/compat/browser_script_skill_tool.rs`
- Modify: `tests/browser_script_skill_tool_test.rs`
- [ ] **Step 1: Write the first failing helper test**
In `tests/browser_script_skill_tool_test.rs`, add a focused test proving direct code can execute a packaged browser script without constructing a full `Tool` object first.
Test shape:
```rust
#[tokio::test]
async fn execute_browser_script_tool_runs_packaged_script_with_expected_domain() {
    // build temp skill script
    // call the helper directly
    // assert Action::Eval was sent with wrapped args and normalized domain
}
```
Required assertions:
- the helper reads the packaged JS file
- it wraps args with `const args = ...`
- it normalizes URL-like `expected_domain`
- it returns the serialized payload string on success
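The two transformations the assertions refer to can be sketched in isolation. This is an assumption-laden illustration, not the existing wrapper: `wrap_script_with_args` and `normalize_expected_domain` are hypothetical names, the exact wrapper text used by `BrowserScriptSkillTool::execute` may differ, and the domain handling here ignores edge cases such as `user@host` authorities.

```rust
/// Hypothetical: prepend the serialized arguments so the packaged script can read `args`.
/// `args_json` is assumed to be an already-serialized JSON object.
fn wrap_script_with_args(args_json: &str, script_body: &str) -> String {
    format!("const args = {args_json};\n{script_body}")
}

/// Hypothetical: reduce a URL-like value to its bare, lowercase host.
/// Returns `None` when nothing usable remains, so the caller can reject it
/// before any browser command is sent.
fn normalize_expected_domain(raw: &str) -> Option<String> {
    let trimmed = raw.trim();
    // Drop an optional scheme such as `https://`.
    let without_scheme = trimmed.split_once("://").map_or(trimmed, |(_, rest)| rest);
    // Keep only the authority: everything before the first path, query, or fragment delimiter.
    let authority = without_scheme
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("");
    // Drop an optional port.
    let host = authority.split(':').next().unwrap_or("");
    if host.is_empty() {
        None
    } else {
        Some(host.to_ascii_lowercase())
    }
}
```

Returning `Option` from the domain helper keeps the missing-or-invalid case explicit, which matches the failure-path test added in the next step.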
- [ ] **Step 2: Run the helper test and verify it fails**
Run:
```bash
cargo test --test browser_script_skill_tool_test execute_browser_script_tool_runs_packaged_script_with_expected_domain -- --nocapture
```
Expected: FAIL because the helper does not exist yet.
- [ ] **Step 3: Add the second failing helper test for required-domain validation**
Add a focused failure-path test proving the helper rejects missing or invalid `expected_domain` before any browser command is sent.
- [ ] **Step 4: Run the validation test and verify it fails**
Run:
```bash
cargo test --test browser_script_skill_tool_test execute_browser_script_tool_rejects_missing_expected_domain -- --nocapture
```
Expected: FAIL because the helper does not exist yet.
- [ ] **Step 5: Implement the minimal reusable helper**
In `src/compat/browser_script_skill_tool.rs`, extract the smallest reusable function, for example:
```rust
pub async fn execute_browser_script_tool<T: Transport + 'static>(
    tool: &SkillTool,
    skill_root: &Path,
    browser_tool: BrowserPipeTool<T>,
    args: Value,
) -> anyhow::Result<ToolResult>
```
Rules:
- reuse the current path validation, script loading, wrapping, `Action::Eval`, and payload formatting logic already used by `BrowserScriptSkillTool::execute`
- do not change outward behavior of `BrowserScriptSkillTool`
- keep the helper narrow and browser-script-only
- [ ] **Step 6: Refactor `BrowserScriptSkillTool::execute` to call the helper**
Keep existing behavior and tests green while removing duplicate execution logic.
- [ ] **Step 7: Re-run the browser-script tests**
Run:
```bash
cargo test --test browser_script_skill_tool_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 8: Commit Task 2**
```bash
git add src/compat/browser_script_skill_tool.rs tests/browser_script_skill_tool_test.rs
git commit -m "refactor: extract direct browser script execution helper"
```
---
### Task 3: Add A Narrow Direct Skill Runtime For One Fixed Skill
**Files:**
- Create: `src/compat/direct_skill_runtime.rs`
- Modify: `src/compat/mod.rs` or nearest module export point
- Reuse: `src/compat/config_adapter.rs`
- Reuse: `third_party/zeroclaw/src/skills/mod.rs`
- [ ] **Step 1: Write the first failing direct-runtime test**
Add a focused test in `tests/agent_runtime_test.rs` or a new narrow compat test proving code can resolve the configured external skills root, load `fault-details-report`, find `collect_fault_details`, and execute it directly.
Recommended shape:
```rust
#[test]
fn direct_skill_runtime_executes_fault_details_report_without_provider() {
    // config points at skill_staging root
    // direct_submit_skill points at fault-details-report.collect_fault_details
    // browser response returns report-artifact payload
    // assert no provider/http path is touched
}
```
- [ ] **Step 2: Run the focused direct-runtime test and verify it fails**
Run the narrowest test command for the new test, filtering on `direct_skill_runtime_executes_fault_details_report_without_provider`.
Expected: FAIL because the direct runtime does not exist yet.
- [ ] **Step 3: Implement `src/compat/direct_skill_runtime.rs`**
Add a narrow runtime whose only responsibilities are to:
- resolve the configured skills dir with `resolve_skills_dir_from_sgclaw_settings(...)`
- load skills from that directory with `load_skills_from_directory(...)`
- parse the configured tool name into `skill_name` + `tool_name`
- find the matching skill and matching tool
- verify `tool.kind == "browser_script"`
- derive the minimal argument object
- call the new browser-script helper
- return the output string or a clear `PipeError`
Do **not** add generic routing, scenes, or model fallback here.
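The "parse the configured tool name into `skill_name` + `tool_name`" step can be sketched as follows. `parse_direct_submit_skill` is a hypothetical name, and splitting at the last dot is an assumption chosen so that a dotted skill name would still parse; the plan itself only shows the `skill.tool` form.

```rust
/// Hypothetical: split a configured `skill.tool` value at the last dot.
/// Both halves must be non-empty, otherwise a descriptive error is returned.
fn parse_direct_submit_skill(configured: &str) -> Result<(&str, &str), String> {
    match configured.trim().rsplit_once('.') {
        Some((skill, tool)) if !skill.is_empty() && !tool.is_empty() => Ok((skill, tool)),
        _ => Err(format!(
            "directSubmitSkill must look like `skill.tool`, got `{configured}`"
        )),
    }
}
```

For the bootstrap value `fault-details-report.collect_fault_details`, this yields the skill `fault-details-report` and the tool `collect_fault_details`. In the real runtime the error would presumably be mapped into a `PipeError` rather than a `String`.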
- [ ] **Step 4: Keep argument derivation intentionally minimal**
For the first slice, derive only:
- `expected_domain` from `page_url` when present, otherwise fail with a clear message
- `period` from the instruction using a narrow deterministic pattern such as `YYYY-MM`
If the period cannot be derived, return a concise error telling the user to provide it explicitly. Do not guess.
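One way to keep the period derivation deterministic without pulling in a regex dependency is a fixed-width byte scan. This is a sketch under assumptions: `extract_period` is a hypothetical name, and the rules it applies (first match wins, month must be 01 to 12, no adjacent digits) are one reasonable reading of "a narrow deterministic pattern such as `YYYY-MM`".

```rust
/// Hypothetical: return the first `YYYY-MM` token whose month is in 01..=12.
/// Returns `None` instead of guessing when no such token is present.
fn extract_period(instruction: &str) -> Option<String> {
    let bytes = instruction.as_bytes();
    if bytes.len() < 7 {
        return None;
    }
    for start in 0..=bytes.len() - 7 {
        let window = &bytes[start..start + 7];
        let shaped = window[..4].iter().all(u8::is_ascii_digit)
            && window[4] == b'-'
            && window[5..].iter().all(u8::is_ascii_digit);
        // Reject matches embedded in a longer digit run such as `12026-03` or `2026-031`.
        let digit_before = start > 0 && bytes[start - 1].is_ascii_digit();
        let digit_after = bytes.get(start + 7).map_or(false, u8::is_ascii_digit);
        if shaped && !digit_before && !digit_after {
            let month = (window[5] - b'0') * 10 + (window[6] - b'0');
            if (1..=12).contains(&month) {
                // The window is pure ASCII, so this conversion is lossless.
                return Some(String::from_utf8_lossy(window).into_owned());
            }
        }
    }
    None
}
```

Scanning bytes is safe for mixed Chinese and ASCII instructions because ASCII bytes never occur inside a multi-byte UTF-8 sequence. A `None` result is the point at which the runtime should return the concise "please provide the period explicitly" error.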
- [ ] **Step 5: Re-run the focused direct-runtime test**
Run the same test command again.
Expected: PASS.
- [ ] **Step 6: Commit Task 3**
```bash
git add src/compat/direct_skill_runtime.rs src/compat/mod.rs tests/agent_runtime_test.rs
git commit -m "feat: add fixed direct skill runtime"
```
---
### Task 4: Insert The Pre-Routing Seam In Submit-Task Entry
**Files:**
- Modify: `src/agent/mod.rs`
- Modify: `tests/agent_runtime_test.rs`
- [ ] **Step 1: Write the first failing submit-path bypass test**
In `tests/agent_runtime_test.rs`, add a focused regression proving that when `directSubmitSkill` is configured, `BrowserMessage::SubmitTask` can succeed without any model/provider being configured.
Test shape:
```rust
#[test]
fn submit_task_uses_direct_skill_mode_without_llm_configuration() {
    // config contains skillsDir + directSubmitSkill, but no reachable provider
    // natural-language instruction includes period and page_url
    // expect TaskComplete success from direct skill result
}
```
Required assertions:
- task succeeds even if provider would be unavailable
- output contains the report artifact payload
- no summary like `未配置大语言模型` ("no large language model configured")
- [ ] **Step 2: Run the bypass test and verify it fails**
Run:
```bash
cargo test --test agent_runtime_test submit_task_uses_direct_skill_mode_without_llm_configuration -- --nocapture
```
Expected: FAIL because submit-task still goes into the current LLM-oriented path.
- [ ] **Step 3: Add the second failing priority test**
Add one focused test proving the direct-submit branch runs before the existing compat/LLM branch.
The easiest assertion is that the mode log becomes something new like:
- `direct_skill_primary`
and the normal mode logs do not appear for that turn.
- [ ] **Step 4: Run the priority test and verify it fails**
Run the narrowest test command for the new test (Step 6 refers to it as `direct_skill_mode_logs_direct_skill_primary`).
Expected: FAIL because the mode does not exist yet.
- [ ] **Step 5: Add the narrow pre-routing branch in `src/agent/mod.rs`**
In `handle_browser_message_with_context(...)`, after config load/logging and before the existing `should_use_primary_orchestration(...)` / `compat::runtime` path:
- check `settings.direct_submit_skill`
- if present, emit mode log `direct_skill_primary`
- call the new direct runtime
- send `TaskComplete` and return immediately
Rules:
- if `direct_submit_skill` is absent, keep existing behavior byte-for-byte where possible
- do not modify `compat::runtime.rs` or `compat::orchestration.rs` for this slice
- do not silently fall through to LLM when direct execution fails; return the direct error clearly so the first slice is debuggable
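The shape of the pre-routing decision can be sketched independently of the real handler. Everything here is illustrative: `SubmitRoute`, `choose_submit_route`, `Completion`, and `complete_direct_skill` are hypothetical names, and the actual `TaskComplete` message fields in sgClaw may differ.

```rust
/// Hypothetical routing decision made before the existing compat/LLM chain.
#[derive(Debug, PartialEq, Eq)]
enum SubmitRoute<'a> {
    /// A fixed direct skill is configured; the mode log would be `direct_skill_primary`.
    DirectSkill(&'a str),
    /// No direct skill configured; the existing path runs unchanged.
    CompatLlm,
}

fn choose_submit_route(direct_submit_skill: Option<&str>) -> SubmitRoute<'_> {
    match direct_submit_skill.map(str::trim) {
        Some(name) if !name.is_empty() => SubmitRoute::DirectSkill(name),
        _ => SubmitRoute::CompatLlm,
    }
}

/// Illustrative stand-in for the payload reported back in `TaskComplete`.
#[derive(Debug, PartialEq, Eq)]
struct Completion {
    success: bool,
    summary: String,
}

fn complete_direct_skill(result: Result<String, String>) -> Completion {
    match result {
        Ok(output) => Completion { success: true, summary: output },
        // No silent fallback to the LLM path: surface the direct error as-is.
        Err(error) => Completion {
            success: false,
            summary: format!("direct skill failed: {error}"),
        },
    }
}
```

Keeping the decision in one small function makes the "absent field means unchanged behavior" rule easy to test without a browser or provider.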
- [ ] **Step 6: Re-run the focused submit-path tests**
Run:
```bash
cargo test --test agent_runtime_test submit_task_uses_direct_skill_mode_without_llm_configuration -- --nocapture
cargo test --test agent_runtime_test direct_skill_mode_logs_direct_skill_primary -- --nocapture
```
Expected: PASS.
- [ ] **Step 7: Re-run existing no-LLM submit regression coverage**
Run:
```bash
cargo test --test agent_runtime_test -- --nocapture
```
Expected: PASS, including existing cases where no direct skill is configured and the old no-LLM failure still applies.
- [ ] **Step 8: Commit Task 4**
```bash
git add src/agent/mod.rs tests/agent_runtime_test.rs
git commit -m "feat: route submit tasks through fixed direct skill mode"
```
---
### Task 5: Lock The Future Migration Seam Without Implementing LLM Dispatch Yet
**Files:**
- Modify only if needed: `src/config/settings.rs`
- Modify only if needed: `src/compat/direct_skill_runtime.rs`
- Reuse: docs/plan only unless code needs one tiny naming fix
- [ ] **Step 1: Keep the config naming compatible with future per-skill dispatch**
Document and preserve this future meaning in code naming:
- current field: one fixed direct skill for submit-task bootstrap
- future model: each skill can declare dispatch mode such as `direct_browser` or `llm_agent`
Prefer neutral names in helper code like:
- `direct skill mode`
- `direct submit skill`
Avoid hard-coding `fault_details` into generic APIs.
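For orientation only, the future per-skill dispatch declaration might look like the sketch below. It is not part of this slice: the `DispatchMode` enum and its string forms are assumptions derived from the two mode names mentioned above, and phase 1 should not add this to any skill manifest.

```rust
use std::str::FromStr;

/// Hypothetical future dispatch declaration for a skill.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DispatchMode {
    DirectBrowser,
    LlmAgent,
}

impl FromStr for DispatchMode {
    type Err = String;

    fn from_str(raw: &str) -> Result<Self, Self::Err> {
        match raw.trim() {
            "direct_browser" => Ok(Self::DirectBrowser),
            "llm_agent" => Ok(Self::LlmAgent),
            other => Err(format!("unknown dispatch mode `{other}`")),
        }
    }
}
```

Naming helpers around "direct skill mode" rather than a specific skill keeps the current config field compatible with a later move to a declaration like this.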
- [ ] **Step 2: Add one small negative test for fallback behavior**
Add a focused test proving that when `directSubmitSkill` is not configured, submit-task still behaves exactly as before and can still return the existing no-LLM message.
If an existing test already proves this, keep it and do not add another.
- [ ] **Step 3: Re-run the focused end-to-end verification set**
Run:
```bash
cargo test --test compat_config_test -- --nocapture
cargo test --test browser_script_skill_tool_test -- --nocapture
cargo test --test agent_runtime_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 4: Build the main binary**
Run:
```bash
cargo build --bin sgclaw
```
Expected: PASS.
- [ ] **Step 5: Commit Task 5**
```bash
git add src/config/settings.rs src/compat/direct_skill_runtime.rs src/compat/browser_script_skill_tool.rs src/agent/mod.rs tests/compat_config_test.rs tests/browser_script_skill_tool_test.rs tests/agent_runtime_test.rs
git commit -m "test: verify fixed direct skill submit path"
```
---
## Verification Checklist
### Config loading
```bash
cargo test --test compat_config_test -- --nocapture
```
Expected: `directSubmitSkill` loads correctly and existing config behavior remains intact.
### Browser-script helper
```bash
cargo test --test browser_script_skill_tool_test -- --nocapture
```
Expected: direct helper preserves the existing browser-script execution semantics.
### Submit-path bypass
```bash
cargo test --test agent_runtime_test -- --nocapture
```
Expected: configured direct skill bypasses the model path, while unconfigured submit-task behavior stays unchanged.
### Build
```bash
cargo build --bin sgclaw
```
Expected: the binary compiles cleanly.
---
## Notes For The Engineer
- The key to keeping this slice small is to avoid changing `compat::runtime.rs` and `compat::orchestration.rs`; they remain the future LLM path.
- `fault-details-report.collect_fault_details` is only the bootstrap skill. The plumbing must stay generic enough that the configured tool name can later point to another staged browser skill.
- Phase 1 should not add per-skill dispatch metadata to the external skill manifests yet. Keep that decision in sgClaw config first; move it into skill metadata only after the direct path is proven useful.
- Once the intranet model is ready, the clean next step is to add a dispatch policy layer that chooses between `direct_browser` and `llm_agent` before the current compat path is entered, reusing this same pre-routing seam.

@@ -0,0 +1,666 @@
# WS Branch Scene Cleanup Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Strip `feature/claw-ws` back to websocket plus Zhihu execution only by removing staged scene-skill routing, `skill_staging`-aware loading, and array-style `skillsDir` config behavior from this branch.
**Architecture:** Treat `feature/claw-ws` as a transport-focused branch, not a business-scene branch. Keep the browser websocket/callback submit path and the existing Zhihu direct workflows, but delete the fault-details / `95598` scene registry, scene-specific prompt injection, staged scene directory expansion, and scene-only docs/tests so the branch stays small and merges cleanly after the real scene implementation lands on `main`.
**Tech Stack:** Rust 2021, existing sgClaw compat/runtime/orchestration stack, websocket browser backend, callback-host service path, existing `cargo test` suite.
---
## Preconditions
- Execute this plan **only after** `main` already contains the desired clean scene-skill implementation.
- Run it on `feature/claw-ws`, not on `main`.
- Keep websocket and Zhihu behavior intact; this plan is cleanup, not a redesign.
- Keep `docs/_tmp_sgbrowser_ws_api_doc.txt`; it remains the browser integration contract for this branch.
## Scope Guardrails
- Do **not** change the working Zhihu websocket flow in `tests/agent_runtime_test.rs`.
- Do **not** remove `src/browser/ws_backend.rs`, `src/service/server.rs`, or Zhihu routes from `src/compat/workflow_executor.rs`.
- Do **not** add a replacement scene abstraction on this branch.
- Do **not** keep partial scene plumbing “for future use”; delete it completely if it is scene-only.
- Do **not** keep array-style `skillsDir` tests or docs on this branch once the single-path cleanup is complete.
---
## File Map
### Delete
- `src/runtime/scene_registry.rs`
  - staged scene registry, hard-coded `skill_staging` scene root, scene matching helpers
- `tests/scene_registry_test.rs`
  - scene-registry-specific coverage that should disappear with the feature
- `docs/superpowers/specs/2026-04-06-scene-skill-runtime-routing-design.md`
  - scene-routing design doc that no longer belongs on the ws-only branch
- `docs/superpowers/plans/2026-04-06-scene-skill-runtime-routing-plan.md`
  - scene-routing implementation plan that no longer belongs on the ws-only branch
### Modify
- `src/runtime/mod.rs`
  - stop exporting deleted scene registry APIs
- `src/runtime/engine.rs`
  - remove scene-contract prompt injection and staged scene skill loading
- `src/compat/workflow_executor.rs`
  - remove `FaultDetailsReport` route detection/execution while keeping Zhihu routes
- `src/compat/orchestration.rs`
  - keep direct Zhihu orchestration only; remove scene-driven primary routing triggers
- `src/config/settings.rs`
  - collapse `skillsDir` config handling back to single-path semantics
- `src/compat/config_adapter.rs`
  - remove scene-specific skills-dir helpers and keep one resolved skills dir
- `src/compat/runtime.rs`
  - stop carrying scene-expanded skills dirs through compat runtime
- `src/agent/task_runner.rs`
  - update runtime logging and runtime calls to the single skills-dir contract
- `tests/compat_runtime_test.rs`
  - remove fault-details / `95598` assertions and keep Zhihu/direct-route coverage
- `tests/runtime_profile_test.rs`
  - remove `95598` scene-contract expectations and keep normal browser-profile coverage
- `tests/compat_config_test.rs`
  - remove scene-dir / array-config coverage and add single-path cleanup coverage
- `tests/agent_runtime_test.rs`
  - only extend if one extra Zhihu keep-path regression is needed after the config cleanup
### Keep As-Is Unless A Signature Change Forces A Tiny Edit
- `src/browser/ws_backend.rs`
- `src/browser/callback_backend.rs`
- `src/browser/callback_host.rs`
- `src/service/server.rs`
- `src/agent/mod.rs`
- `tests/browser_ws_backend_test.rs`
- `tests/service_ws_session_test.rs`
- `tests/task_runner_test.rs`
---
### Task 1: Lock The Cleanup Contract In Failing Tests
**Files:**
- Modify: `tests/compat_runtime_test.rs`
- Modify: `tests/runtime_profile_test.rs`
- Modify: `tests/compat_config_test.rs`
- Reuse: `tests/agent_runtime_test.rs`
- [ ] **Step 1: Add the first failing route-removal test**
In `tests/compat_runtime_test.rs`, add a focused assertion proving the ws branch no longer recognizes the fault-details scene as a direct route:
```rust
#[test]
fn ws_cleanup_no_longer_detects_fault_details_scene_route() {
    use sgclaw::compat::workflow_executor::detect_route;

    // Instruction: "export fault details"; page title: "business ledger".
    assert_eq!(
        detect_route(
            "导出故障明细",
            Some("https://example.invalid/workbench"),
            Some("业务台账"),
        ),
        None,
    );
}
```
- [ ] **Step 2: Run the focused route test and verify it fails**
Run:
```bash
cargo test --test compat_runtime_test ws_cleanup_no_longer_detects_fault_details_scene_route -- --nocapture
```
Expected: FAIL because `FaultDetailsReport` is still detected today.
- [ ] **Step 3: Add the second failing orchestration-gate test**
In `tests/compat_runtime_test.rs`, add one focused assertion proving scene keywords no longer open the primary direct-orchestration path:
```rust
#[test]
fn ws_cleanup_scene_keywords_do_not_trigger_primary_orchestration() {
    // Instruction: "please handle 95598 repair city-dispatch monitoring";
    // the page title repeats the scene name.
    assert!(!sgclaw::compat::orchestration::should_use_primary_orchestration(
        "请处理95598抢修市指监测",
        Some("https://95598.example.invalid/dispatch"),
        Some("95598抢修市指监测"),
    ));
}
```
- [ ] **Step 4: Run the orchestration-gate test and verify it fails**
Run:
```bash
cargo test --test compat_runtime_test ws_cleanup_scene_keywords_do_not_trigger_primary_orchestration -- --nocapture
```
Expected: FAIL because the scene matcher still feeds primary orchestration today.
- [ ] **Step 5: Add the third failing runtime-instruction test**
In `tests/runtime_profile_test.rs`, add a focused negative assertion proving browser-attached turns no longer receive the `95598` scene execution contract:
```rust
#[test]
fn ws_cleanup_browser_profile_does_not_inject_95598_scene_contract() {
    let engine = RuntimeEngine::new(RuntimeProfile::BrowserAttached);
    // Instruction: "please handle the 95598-repair-city-dispatch scene: view the
    // repair city-dispatch orders and summarize the current queue".
    let instruction = engine.build_instruction(
        "请处理95598-repair-city-dispatch场景查看抢修市指派单并汇总当前队列",
        Some("https://95598.example.invalid/dispatch"),
        Some("95598抢修市指监测"),
        true,
    );

    assert!(!instruction.contains("95598-repair-city-dispatch.collect_repair_orders"));
}
```
- [ ] **Step 6: Run the runtime-profile test and verify it fails**
Run:
```bash
cargo test --test runtime_profile_test ws_cleanup_browser_profile_does_not_inject_95598_scene_contract -- --nocapture
```
Expected: FAIL because `src/runtime/engine.rs` still injects the scene contract today.
- [ ] **Step 7: Add the fourth failing config-shape test**
In `tests/compat_config_test.rs`, add one focused assertion proving ws cleanup goes back to a single configured skills path and no longer accepts array-style `skillsDir` JSON:
```rust
#[test]
fn ws_cleanup_rejects_array_style_skills_dir_config() {
    let root = std::env::temp_dir().join(format!("sgclaw-config-{}", uuid::Uuid::new_v4()));
    std::fs::create_dir_all(&root).unwrap();
    let config_path = root.join("sgclaw_config.json");
    std::fs::write(
        &config_path,
        r#"{
  "apiKey": "sk-test",
  "baseUrl": "https://api.deepseek.com",
  "model": "deepseek-chat",
  "skillsDir": ["skill_lib", "skill_staging"]
}"#,
    )
    .unwrap();

    assert!(sgclaw::config::SgClawSettings::load(Some(config_path.as_path())).is_err());
}
```
- [ ] **Step 8: Run the config-shape test and verify it fails**
Run:
```bash
cargo test --test compat_config_test ws_cleanup_rejects_array_style_skills_dir_config -- --nocapture
```
Expected: FAIL because the current parser still accepts string-or-array `skillsDir` input.
- [ ] **Step 9: Re-run the existing Zhihu keep-path test as a safety baseline**
Run:
```bash
cargo test --test agent_runtime_test production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap -- --nocapture
```
Expected: PASS, proving the behavior we want to keep is already covered before deletion starts.
---
### Task 2: Remove Scene Registry, Scene Prompt Injection, And Fault-Details Routing
**Files:**
- Delete: `src/runtime/scene_registry.rs`
- Modify: `src/runtime/mod.rs`
- Modify: `src/runtime/engine.rs`
- Modify: `src/compat/workflow_executor.rs`
- Modify: `src/compat/orchestration.rs`
- Modify: `tests/compat_runtime_test.rs`
- Modify: `tests/runtime_profile_test.rs`
- Delete: `tests/scene_registry_test.rs`
- [ ] **Step 1: Remove the runtime scene module export surface**
Update `src/runtime/mod.rs` so it no longer declares or re-exports scene registry items.
Target shape:
```rust
mod engine;
mod profile;
mod tool_policy;

pub use engine::{
    is_zhihu_hotlist_task,
    is_zhihu_write_task,
    task_requests_zhihu_article_publish,
    RuntimeEngine,
};
pub use profile::RuntimeProfile;
pub use tool_policy::ToolPolicy;
```
- [ ] **Step 2: Delete `src/runtime/scene_registry.rs`**
Remove the file entirely. Do not leave a stub module or comments about future scene support.
- [ ] **Step 3: Remove scene-aware prompt injection from `src/runtime/engine.rs`**
Delete:
- the `resolve_scene_skills_dir_path` import
- the `DispatchMode` / `match_scene_instruction` imports
- `REPAIR_CITY_DISPATCH_EXECUTION_PROMPT`
- `build_scene_execution_contract(...)`
- the `if let Some(scene_contract) = ...` block inside `RuntimeEngine::build_instruction(...)`
- staged scene directory loading inside `load_runtime_skills(...)`
The resulting instruction assembly should keep:
- browser tool contract
- Zhihu hotlist/export prompts
- Zhihu publish guard
- page context
Do **not** change Zhihu prompt text.
- [ ] **Step 4: Remove the fault-details route from `src/compat/workflow_executor.rs`**
Shrink `WorkflowRoute` back to Zhihu-only variants:
```rust
pub enum WorkflowRoute {
    ZhihuHotlistExportXlsx,
    ZhihuHotlistScreen,
    ZhihuArticleEntry,
    ZhihuArticleDraft,
    ZhihuArticlePublish,
    ZhihuArticleAutoPublishGenerated,
}
```
Delete:
- `FAULT_DETAILS_SCENE_ID`
- the scene check at the top of `detect_route(...)`
- `WorkflowRoute::FaultDetailsReport`
- `execute_fault_details_route(...)`
- any scene-only helpers used only by that path
Keep the Zhihu route order unchanged.
- [ ] **Step 5: Simplify `src/compat/orchestration.rs` to Zhihu-only direct routing**
After the fault-details route is gone, keep `should_use_primary_orchestration(...)` and the two execute functions focused on:
- Zhihu direct routes detected by `detect_route(...)`
- existing Zhihu export/dashboard fallback behavior
Do not add new conditions.
- [ ] **Step 6: Remove scene-only tests and replace them with cleanup assertions**
In `tests/compat_runtime_test.rs` and `tests/runtime_profile_test.rs`:
- delete `fault-details` assertions that require the old route to exist
- delete `95598` scene-contract assertions that require the old prompt injection to exist
- keep the new negative cleanup tests from Task 1
- keep the existing Zhihu assertions intact
Delete `tests/scene_registry_test.rs` completely.
- [ ] **Step 7: Run the focused cleanup tests**
Run:
```bash
cargo test --test compat_runtime_test ws_cleanup_no_longer_detects_fault_details_scene_route -- --nocapture && cargo test --test compat_runtime_test ws_cleanup_scene_keywords_do_not_trigger_primary_orchestration -- --nocapture && cargo test --test runtime_profile_test ws_cleanup_browser_profile_does_not_inject_95598_scene_contract -- --nocapture
```
Expected: PASS.
- [ ] **Step 8: Re-run the focused Zhihu runtime tests**
Run:
```bash
cargo test --test compat_runtime_test zhihu_ -- --nocapture
```
Expected: PASS, proving the Zhihu direct routes still work after the scene deletion.
- [ ] **Step 9: Commit Task 2**
```bash
git add src/runtime/mod.rs src/runtime/engine.rs src/compat/workflow_executor.rs src/compat/orchestration.rs tests/compat_runtime_test.rs tests/runtime_profile_test.rs
git rm src/runtime/scene_registry.rs tests/scene_registry_test.rs
git commit -m "refactor: remove scene routing from ws branch"
```
---
### Task 3: Collapse `skillsDir` Back To Single-Path Semantics
**Files:**
- Modify: `src/config/settings.rs`
- Modify: `src/compat/config_adapter.rs`
- Modify: `src/compat/runtime.rs`
- Modify: `src/agent/task_runner.rs`
- Modify if needed: `tests/agent_runtime_test.rs`
- Modify: `tests/compat_config_test.rs`
- [ ] **Step 1: Change config parsing to a single configured skills path**
In `src/config/settings.rs`, replace the string-or-array parser with a single optional string field.
Target shape:
```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DeepSeekSettings {
    pub api_key: String,
    pub base_url: String,
    pub model: String,
    pub skills_dir: Option<PathBuf>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SgClawSettings {
    // ...
    pub skills_dir: Option<PathBuf>,
    // ...
}
```
And in `RawSgClawSettings`:
```rust
#[serde(rename = "skillsDir", alias = "skills_dir", default)]
skills_dir: Option<String>,
```
Delete `deserialize_skills_dirs(...)` entirely.
- [ ] **Step 2: Keep relative-path resolution, but only for one path**
Replace `resolve_configured_skills_dirs(...) -> Vec<PathBuf>` with a single-path helper such as:
```rust
fn resolve_configured_skills_dir(raw: Option<String>, config_dir: &Path) -> Option<PathBuf> {
    raw.map(|value| value.trim().to_string())
        .filter(|value| !value.is_empty())
        .map(PathBuf::from)
        .map(|path| if path.is_absolute() { path } else { config_dir.join(path) })
}
```
- [ ] **Step 3: Collapse compat config adapter back to one resolved skills dir**
In `src/compat/config_adapter.rs`:
- keep `zeroclaw_default_skills_dir(...)`
- change `resolve_skills_dir(...)` and `resolve_skills_dir_from_sgclaw_settings(...)` to return a single `PathBuf`
- delete `resolve_scene_skills_dir_from_sgclaw_settings(...)`
- delete `resolve_scene_skills_dir_path(...)`
- delete any helper branches that append `skill_staging/skills`
Recommended shape:
```rust
pub fn resolve_skills_dir_from_sgclaw_settings(
    workspace_root: &Path,
    settings: &SgClawSettings,
) -> PathBuf {
    settings
        .skills_dir
        .as_ref()
        .map(|dir| normalize_configured_skills_dir(dir))
        .unwrap_or_else(|| zeroclaw_default_skills_dir(workspace_root))
}
```
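The recommended shape leans on `normalize_configured_skills_dir`, whose body is not shown in this plan. The sketch below is a guess at its contract, inferred only from the positive test in Step 5 (a configured `skill_lib` resolving to `skill_lib/skills`); treat it as a hypothetical stand-in and check the existing helper before relying on it.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `normalize_configured_skills_dir`:
/// prefer a nested `skills/` directory when one exists, otherwise
/// use the configured directory as given.
fn normalize_configured_skills_dir(dir: &Path) -> PathBuf {
    let nested = dir.join("skills");
    if nested.is_dir() {
        nested
    } else {
        dir.to_path_buf()
    }
}
```

Whatever the real rule is, the point of this step is that it produces exactly one `PathBuf`, with no branch that appends `skill_staging/skills`.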
- [ ] **Step 4: Update runtime callers to the single-path contract**
In `src/compat/runtime.rs` and `src/agent/task_runner.rs`:
- stop passing vectors of skills dirs around
- update logging from `skills dirs resolved to [...]` to a single-path message such as `skills dir resolved to ...`
- keep the rest of the runtime behavior unchanged
In `src/runtime/engine.rs`, if the method still needs a collection internally, convert the one path at the call site instead of preserving public multi-root plumbing.
- [ ] **Step 5: Replace config tests with single-path cleanup coverage**
In `tests/compat_config_test.rs`:
- keep single-string `skillsDir` resolution tests
- remove `resolve_scene_skills_dir_path_*` coverage
- remove array-acceptance expectations
- keep the new rejecting-array test from Task 1
Add one focused positive test like:
```rust
#[test]
fn ws_cleanup_resolves_single_configured_skills_dir() {
    let root = std::env::temp_dir().join(format!("sgclaw-skills-{}", uuid::Uuid::new_v4()));
    std::fs::create_dir_all(root.join("skill_lib/skills")).unwrap();

    let settings = DeepSeekSettings {
        api_key: "key".to_string(),
        base_url: "https://api.deepseek.com".to_string(),
        model: "deepseek-chat".to_string(),
        skills_dir: Some(root.join("skill_lib")),
    };

    assert_eq!(
        resolve_skills_dir(&root, &settings),
        root.join("skill_lib/skills"),
    );
}
```
- [ ] **Step 6: Run the focused config tests**
Run:
```bash
cargo test --test compat_config_test ws_cleanup_ -- --nocapture
```
Expected: PASS.
- [ ] **Step 7: Re-run the Zhihu websocket keep-path test**
Run:
```bash
cargo test --test agent_runtime_test production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap -- --nocapture
```
Expected: PASS.
- [ ] **Step 8: Commit Task 3**
```bash
git add src/config/settings.rs src/compat/config_adapter.rs src/compat/runtime.rs src/agent/task_runner.rs tests/compat_config_test.rs tests/agent_runtime_test.rs
git commit -m "refactor: restore single skills dir on ws branch"
```
---
### Task 4: Remove Scene-Only Docs And Residual Test References
**Files:**
- Delete: `docs/superpowers/specs/2026-04-06-scene-skill-runtime-routing-design.md`
- Delete: `docs/superpowers/plans/2026-04-06-scene-skill-runtime-routing-plan.md`
- Modify: `tests/compat_runtime_test.rs`
- Modify: `tests/runtime_profile_test.rs`
- Modify: `tests/compat_config_test.rs`
- [ ] **Step 1: Delete the two scene-only planning documents**
Remove exactly these files:
- `docs/superpowers/specs/2026-04-06-scene-skill-runtime-routing-design.md`
- `docs/superpowers/plans/2026-04-06-scene-skill-runtime-routing-plan.md`
Keep the websocket/browser docs and Zhihu docs.
- [ ] **Step 2: Sweep remaining tests for scene-only names**
Remove or rewrite any remaining test blocks that still require:
- `fault-details-report`
- `95598-repair-city-dispatch`
- `resolve_scene_skills_dir_path`
- `resolve_scene_skills_dir_from_sgclaw_settings`
- `scene_registry`
Do not delete Zhihu-related assertions during this sweep.
- [ ] **Step 3: Run a focused grep-style audit from the shell**
Run:
```bash
git grep -n "fault-details-report\|95598-repair-city-dispatch\|resolve_scene_skills_dir_path\|resolve_scene_skills_dir_from_sgclaw_settings\|scene_registry" -- src tests docs
```
Expected: no matches in `src/` or `tests/`; doc matches should be gone after the deletions.
- [ ] **Step 4: Commit Task 4**
```bash
git add tests/compat_runtime_test.rs tests/runtime_profile_test.rs tests/compat_config_test.rs
git rm docs/superpowers/specs/2026-04-06-scene-skill-runtime-routing-design.md docs/superpowers/plans/2026-04-06-scene-skill-runtime-routing-plan.md
git commit -m "docs: remove ws-only scene planning artifacts"
```
---
### Task 5: Verify The Branch Is Back To WS Plus Zhihu Only
**Files:**
- Verify only unless a failing test proves one tiny follow-up fix is needed
- [ ] **Step 1: Run the retained Zhihu websocket regression**
Run:
```bash
cargo test --test agent_runtime_test production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap -- --nocapture
```
Expected: PASS.
- [ ] **Step 2: Run websocket/backend focused coverage**
Run:
```bash
cargo test --test browser_ws_backend_test -- --nocapture && cargo test --test service_ws_session_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 3: Run direct-route/runtime Zhihu coverage**
Run:
```bash
cargo test --test compat_runtime_test zhihu_ -- --nocapture && cargo test --test task_runner_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 4: Run config/runtime verification after the single-dir cleanup**
Run:
```bash
cargo test --test compat_config_test -- --nocapture && cargo test --test runtime_profile_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 5: Build the affected binaries**
Run:
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: PASS.
- [ ] **Step 6: Audit the remaining branch diff against `main`**
Run:
```bash
git diff --stat main...HEAD
```
Expected: the remaining meaningful differences are websocket/browser transport work and Zhihu-related behavior, not scene-routing or staged-scene config churn.
- [ ] **Step 7: Commit the final verification pass**
```bash
git add src/config/settings.rs src/compat/config_adapter.rs src/compat/runtime.rs src/compat/workflow_executor.rs src/compat/orchestration.rs src/runtime/mod.rs src/runtime/engine.rs tests/compat_config_test.rs tests/runtime_profile_test.rs tests/compat_runtime_test.rs tests/agent_runtime_test.rs tests/task_runner_test.rs
git commit -m "test: verify ws branch cleanup preserves zhihu websocket flow"
```
---
## Verification Checklist
### Cleanup regressions
```bash
cargo test --test compat_runtime_test ws_cleanup_ -- --nocapture
cargo test --test runtime_profile_test ws_cleanup_ -- --nocapture
cargo test --test compat_config_test ws_cleanup_ -- --nocapture
```
Expected: scene detection, scene prompt injection, and array-style `skillsDir` behavior are gone.
### Retained Zhihu websocket behavior
```bash
cargo test --test agent_runtime_test production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap -- --nocapture
cargo test --test browser_ws_backend_test -- --nocapture
cargo test --test service_ws_session_test -- --nocapture
cargo test --test compat_runtime_test zhihu_ -- --nocapture
```
Expected: websocket submit path and Zhihu direct workflows still pass.
### Runtime/config verification
```bash
cargo test --test compat_config_test -- --nocapture
cargo test --test runtime_profile_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```
Expected: runtime/config plumbing is stable after the single-dir cleanup.
### Build verification
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: the branch still compiles cleanly.
---
## Notes For The Engineer
- The current scene support touches three different seams: runtime prompt injection, direct route detection/execution, and multi-root `skillsDir` plumbing. Remove all three; deleting only one leaves conflict-prone leftovers.
- If collapsing `skillsDir` to `Option<PathBuf>` creates more churn than expected, keep the internal representation temporarily as a one-element collection, but the public config contract and tests on this branch must still go back to a single configured path.
- Do not delete browser websocket or callback-host code just because it is adjacent to the scene work; this plan is about stripping scene behavior, not reworking transport.
- If `git diff --stat main...HEAD` still shows scene-specific files after Task 5, stop and remove them before merging `main` back into this branch.

@@ -0,0 +1,672 @@
# Fault Details Full Skill Alignment Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Upgrade `fault-details-report.collect_fault_details` into a real staged browser skill that matches the original fault-details workflow, and make `claw-new` interpret the returned artifact status correctly in the direct-submit path.
**Architecture:** Keep routing and direct-skill selection in `claw-new`, but move all fault-details collection, normalization, classification, summary, export, and report-log behavior into the staged skill under `skill_staging`. Implement the staged skill as a true browser-eval entrypoint that remains valid in page context, while exposing testable pure helpers through an environment-safe export guard for `node:test`; then add a narrow Rust artifact interpreter in `src/compat/direct_skill_runtime.rs` so `ok` / `partial` / `empty` map to successful task completion while `blocked` / `error` map to failed completion.
**Tech Stack:** Rust 2021, `serde_json`, existing `BrowserPipeTool` / `browser_script` runtime, `node:test`, staged skill fixtures, Cargo integration tests.
---
## Execution Context
- Follow @superpowers:test-driven-development for every behavior change.
- Follow @superpowers:verification-before-completion before claiming each task is done.
- Do **not** create a git worktree unless the user explicitly asks. This repo preference is already established.
- Keep scope tight. Do **not** add a new browser protocol, new dispatch metadata, new UI opener behavior, or Rust-side fault classification logic.
- Keep the current direct path bootstrap requirement intact: the user instruction must still include an explicit `YYYY-MM`, but the staged skill must treat the page-selected range as the source of truth for collection once execution begins.
- Preserve parity with the original package's real behavior: port the original classification table, `qxxcjl`-based reason heuristics, canonical detail mapping, summary aggregation rules, localhost export call, and report-log call into the staged skill rather than implementing a fixture-only subset.
## File Map
### Existing files to modify in `claw-new`
- Modify: `src/compat/direct_skill_runtime.rs`
  - add narrow structured artifact parsing and status-to-summary mapping
  - keep direct-skill routing/config ownership unchanged
- Modify: `tests/agent_runtime_test.rs`
  - add direct-submit regressions for `ok`, `partial`, `empty`, `blocked`, and `error`
- Modify: `tests/browser_script_skill_tool_test.rs`
  - add browser-script execution-shape regression for browser-eval return payloads used by fault-details
### Existing files to modify in `skill_staging`
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
  - replace empty shell with browser-eval entrypoint plus parity helpers
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js`
  - deterministic fixture coverage for normalization, classification, summary, artifact contract, export/logging degradation, and entrypoint shape helpers
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.toml`
  - align tool description with real collection/export/report-log behavior
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.md`
  - align written contract with actual runtime behavior and artifact fields
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/references/collection-flow.md`
  - align flow with page-range/query/export/report-log sequence
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/references/data-quality.md`
  - make canonical columns, original classification tables, reason heuristics, summary rules, and partial semantics explicit
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/fault-details-report/scene.json`
  - keep scene output/state contract aligned with real staged artifact behavior
### Existing files to read but not redesign
- Read only: `docs/superpowers/specs/2026-04-10-fault-details-full-skill-alignment-design.md`
- Read only: `src/agent/mod.rs`
- Read only: `src/compat/browser_script_skill_tool.rs`
- Read only: `D:/desk/智能体资料/大四区报告监测项/故障明细/index.html`
---
### Task 1: Add staged-skill red tests for normalization, summary, and artifact-contract semantics
**Files:**
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js`
- Read only: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
- Read only: `D:/desk/智能体资料/大四区报告监测项/故障明细/index.html`
- [ ] **Step 1: Write the failing staged-skill test file**
Add `collect_fault_details.test.js` using `node:test` and `assert/strict`. Cover these behaviors with fixed fixtures:
```javascript
const test = require('node:test');
const assert = require('node:assert/strict');
const {
  DETAIL_COLUMNS,
  SUMMARY_COLUMNS,
  normalizeDetailRow,
  deriveSummaryRows,
  determineArtifactStatus,
  buildFaultDetailsArtifact,
  buildBrowserEntrypointResult
} = require('./collect_fault_details.js');

test('normalizeDetailRow maps canonical detail fields from raw repair rows', () => {
  const row = normalizeDetailRow({
    qxdbh: 'QX-1',
    bxsj: '2026-03-09 08:00:00',
    cityName: '国网兰州供电公司',
    maintOrgName: '城关供电服务班',
    maintGroupName: '抢修一班',
    bdzMc: '110kV东岗变',
    xlmc10: '10kV东岗线',
    byqmc: '东岗1号变',
    yjflMc: '电网故障',
    ejflMc: '线路故障',
    sjflMc: '低压线路',
    qxxcjl: '现场检查:低压线路断线,已处理完成',
    gzms: '客户报修停电'
  }, {
    companyName: '国网兰州供电公司'
  });
  assert.equal(row.slsj, '2026-03-09 08:00:00');
  assert.equal(row.gssgs, '甘肃省电力公司');
  assert.equal(row.gddw, '城关供电服务班');
  assert.equal(row.gds, '抢修一班');
  assert.equal(row.clzt, '处理完成');
  assert.equal(row.bdz, '110kV东岗变');
  assert.equal(row.line, '10kV东岗线');
  assert.equal(row.pb, '东岗1号变');
});

test('deriveSummaryRows groups normalized rows by gds and computes counters', () => {
  const rows = [
    { gds: '抢修一班', gddw: '城关供电服务班', sgs: '国网兰州供电公司', sxfl1: '无效', sxfl2: '无效', gzsb: '' },
    { gds: '抢修一班', gddw: '城关供电服务班', sgs: '国网兰州供电公司', sxfl1: '有效', sxfl2: '用户侧', gzsb: '表后线' },
    { gds: '抢修一班', gddw: '城关供电服务班', sgs: '国网兰州供电公司', sxfl1: '有效', sxfl2: '电网侧', dwcFl: '低压故障', gzsb: '低压线路' }
  ];
  const summaryRows = deriveSummaryRows(rows, { companyName: '国网兰州供电公司' });
  assert.equal(summaryRows.length, 1);
  assert.equal(summaryRows[0].className, '抢修一班');
  assert.equal(summaryRows[0].allCount, 3);
  assert.equal(summaryRows[0].wxCount, 1);
  assert.equal(summaryRows[0].khcCount, 0);
  assert.equal(summaryRows[0].dyGzCount, 1);
  assert.equal(summaryRows[0].dyxlCount, 1);
  assert.equal(summaryRows[0].bhxCount, 1);
});

test('determineArtifactStatus follows blocked > error > partial > empty > ok precedence', () => {
  assert.equal(determineArtifactStatus({ blockedReason: 'missing_session', fatalError: null, partialReasons: [], detailRows: [{}] }), 'blocked');
  assert.equal(determineArtifactStatus({ blockedReason: null, fatalError: 'parse_failed', partialReasons: [], detailRows: [{}] }), 'error');
  assert.equal(determineArtifactStatus({ blockedReason: null, fatalError: null, partialReasons: ['export_failed'], detailRows: [{}] }), 'partial');
  assert.equal(determineArtifactStatus({ blockedReason: null, fatalError: null, partialReasons: [], detailRows: [] }), 'empty');
  assert.equal(determineArtifactStatus({ blockedReason: null, fatalError: null, partialReasons: [], detailRows: [{}] }), 'ok');
});

test('buildFaultDetailsArtifact keeps canonical fields, selected range, counts, and downstream results', () => {
  const artifact = buildFaultDetailsArtifact({
    period: '2026-03',
    selectedRange: { start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' },
    detailRows: [{ qxdbh: 'QX-1' }],
    summaryRows: [{ index: 1 }],
    partialReasons: ['report_log_failed'],
    downstream: {
      export: { attempted: true, success: true, path: 'http://localhost/export.xlsx' },
      report_log: { attempted: true, success: false, error: '500' }
    }
  });
  assert.equal(artifact.type, 'report-artifact');
  assert.equal(artifact.status, 'partial');
  assert.deepEqual(artifact.selected_range, { start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' });
  assert.equal(artifact.counts.detail_rows, 1);
  assert.equal(artifact.counts.summary_rows, 1);
  assert.deepEqual(artifact.partial_reasons, ['report_log_failed']);
});

test('buildFaultDetailsArtifact keeps required top-level fields for blocked artifact', () => {
  const artifact = buildFaultDetailsArtifact({
    period: '2026-03',
    blockedReason: 'selected_range_unavailable',
    partialReasons: ['selected_range_unavailable']
  });
  assert.equal(artifact.type, 'report-artifact');
  assert.equal(artifact.report_name, 'fault-details-report');
  assert.equal(artifact.period, '2026-03');
  assert.equal(artifact.status, 'blocked');
  assert.deepEqual(artifact.partial_reasons, ['selected_range_unavailable']);
  assert.equal('downstream' in artifact, false);
});

test('buildFaultDetailsArtifact keeps known selected range and counts on late error', () => {
  const artifact = buildFaultDetailsArtifact({
    period: '2026-03',
    selectedRange: { start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' },
    detailRows: [],
    summaryRows: [],
    fatalError: 'summary_failed',
    partialReasons: ['summary_failed']
  });
  assert.equal(artifact.status, 'error');
  assert.deepEqual(artifact.selected_range, { start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' });
  assert.equal(artifact.counts.detail_rows, 0);
  assert.equal(artifact.counts.summary_rows, 0);
});

test('buildBrowserEntrypointResult returns blocked artifact when selected range is unavailable', async () => {
  const artifact = await buildBrowserEntrypointResult({
    period: '2026-03'
  }, {
    readSelectedRange: async () => null
  });
  assert.equal(artifact.status, 'blocked');
  assert.ok(artifact.partial_reasons.includes('selected_range_unavailable'));
});
```
- [ ] **Step 2: Run the staged-skill test file and verify it fails**
Run:
```bash
node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
```
Expected: FAIL because `collect_fault_details.js` does not export these helpers yet and still only returns an empty shell.
---
### Task 2: Implement staged-skill parity helpers and a valid browser entrypoint
**Files:**
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
- Test: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js`
- [ ] **Step 1: Implement the helper exports and browser entrypoint shape needed to satisfy the red tests**
Refactor `collect_fault_details.js` so the file remains a valid browser-eval script in page context while still supporting `node:test` through an environment-safe export guard.
Required implementation pieces:
```javascript
const DETAIL_COLUMNS = [/* existing canonical columns */];
const SUMMARY_COLUMNS = [/* existing summary columns */];

function normalizeDetailRow(raw, context) {
  // map qxdbh/gssgs/sgs/gddw/gds/slsj/clzt/bdz/line/pb
  // derive sxfl1/sxfl2/sxfl3/gzsb/gzyy from the original package rules
}

function deriveSummaryRows(detailRows, context) {
  // group by gds and compute all original package counters
}

function determineArtifactStatus({ blockedReason, fatalError, partialReasons, detailRows }) {
  // blocked > error > partial > empty > ok
}

function buildFaultDetailsArtifact({
  period,
  selectedRange,
  detailRows,
  summaryRows,
  partialReasons,
  blockedReason,
  fatalError,
  downstream
}) {
  // return report-artifact with columns, sections, counts, status, partial_reasons, downstream
}

async function buildBrowserEntrypointResult(input, deps = defaultBrowserDeps()) {
  // read selected range from page
  // collect raw rows from page query
  // normalize rows
  // derive summary
  // attempt export + report log
  // return final artifact
}

if (typeof module !== 'undefined' && module.exports) {
  module.exports = {
    DETAIL_COLUMNS,
    SUMMARY_COLUMNS,
    normalizeDetailRow,
    deriveSummaryRows,
    determineArtifactStatus,
    buildFaultDetailsArtifact,
    buildBrowserEntrypointResult
  };
}
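
// In page context the eval wrapper supplies `args`; under Node `args` is undefined,
// so only the export guard above takes effect.
// Returning the promise without a top-level `await` keeps the file parseable by `require()`;
// this assumes the eval wrapper is an async function, as `return await` would require.
if (typeof args !== 'undefined') {
  return buildBrowserEntrypointResult(args);
}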
```
Rules:
- keep `DETAIL_COLUMNS` and `SUMMARY_COLUMNS` canonical and stable
- keep helper functions self-contained in this file unless a separate pure helper file becomes necessary for runtime validity
- keep the browser entrypoint compatible with current `eval` wrapper
- keep browser runtime free of unguarded Node-only assumptions
- do **not** invent a new protocol or callback surface
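
The precedence rule pinned down by the Task 1 tests can be sketched as follows. This is a minimal illustration that assumes the input shape used in those tests; it is not taken from the original package, and treating a missing or non-array `detailRows` as an empty result is an assumption.

```javascript
// Sketch only: precedence is blocked > error > partial > empty > ok.
function determineArtifactStatus({ blockedReason, fatalError, partialReasons, detailRows }) {
  // A blocking condition wins over everything else, even if rows were collected.
  if (blockedReason) return 'blocked';
  // A fatal error outranks any partial reasons recorded alongside it.
  if (fatalError) return 'error';
  // Downstream or other non-fatal problems degrade the result to partial.
  if (Array.isArray(partialReasons) && partialReasons.length > 0) return 'partial';
  // Assumption: absent or non-array detailRows count as an empty result.
  if (!Array.isArray(detailRows) || detailRows.length === 0) return 'empty';
  return 'ok';
}
```

Checking `blockedReason` and `fatalError` before `partialReasons` matters because the blocked and error fixtures in Task 1 also carry a non-empty `partialReasons` array.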
- [ ] **Step 2: Re-run the staged-skill test file and verify it now reaches deeper failures or passes the initial helper coverage**
Run:
```bash
node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
```
Expected: either PASS for the Task 1 cases, or fail only on the still-missing full parity/export/history specifics added in Task 3.
---
### Task 3: Add red tests for full classification parity, downstream partials, and empty-result export semantics
**Files:**
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js`
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
- Read only: `D:/desk/智能体资料/大四区报告监测项/故障明细/index.html`
- [ ] **Step 1: Extend the staged-skill tests with failing parity and downstream cases**
Add focused failing tests such as:
```javascript
test('normalizeDetailRow derives gzyy from qxxcjl text heuristics', () => {
  const row = normalizeDetailRow({
    qxxcjl: '现场检查:客户表后线烧损,已恢复送电',
    ejflMc: '客户侧故障',
    sjflMc: '表后线'
  }, { companyName: '国网兰州供电公司' });
  assert.equal(row.gzsb, '表后线');
  assert.equal(row.gzyy, '表后线烧损');
});

test('buildBrowserEntrypointResult returns partial when export fails after detail collection succeeds', async () => {
  const artifact = await buildBrowserEntrypointResult({ period: '2026-03' }, {
    readSelectedRange: async () => ({ start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' }),
    queryFaultRows: async () => [{ qxdbh: 'QX-1', bxsj: '2026-03-09 08:00:00', maintGroupName: '抢修一班' }],
    readCompanyContext: () => ({ companyName: '国网兰州供电公司' }),
    exportWorkbook: async () => {
      throw new Error('export_failed');
    },
    writeReportLog: async () => ({ success: true })
  });
  assert.equal(artifact.status, 'partial');
  assert.ok(artifact.partial_reasons.includes('export_failed'));
  assert.equal(artifact.counts.detail_rows, 1);
  assert.equal(artifact.downstream.export.attempted, true);
  assert.equal(artifact.downstream.export.success, false);
});

test('buildBrowserEntrypointResult returns error when normalized detail rows cannot be produced', async () => {
  const artifact = await buildBrowserEntrypointResult({ period: '2026-03' }, {
    readSelectedRange: async () => ({ start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' }),
    queryFaultRows: async () => [{ qxdbh: '', bxsj: '' }],
    readCompanyContext: () => ({ companyName: '国网兰州供电公司' })
  });
  assert.equal(artifact.status, 'error');
  assert.ok(artifact.partial_reasons.includes('detail_normalization_failed'));
});

test('buildBrowserEntrypointResult keeps canonical rows empty for empty result and omits downstream before attempts', async () => {
  const artifact = await buildBrowserEntrypointResult({ period: '2026-03' }, {
    readSelectedRange: async () => ({ start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' }),
    queryFaultRows: async () => [],
    readCompanyContext: () => ({ companyName: '国网兰州供电公司' })
  });
  assert.equal(artifact.status, 'empty');
  assert.deepEqual(artifact.rows, []);
  assert.equal('downstream' in artifact, false);
});
```
Also add fixture cases derived from the original package's full classification table and summary counters so the staged skill is forced toward parity, not a subset implementation.
- [ ] **Step 2: Run the staged-skill test file and verify it fails on the new cases**
Run:
```bash
node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
```
Expected: FAIL on missing full classification parity or downstream partial/error behavior.
- [ ] **Step 3: Implement the full business logic needed to satisfy the new tests**
In `collect_fault_details.js`:
- port the original classification table and `qxxcjl` text heuristics for `sxfl1`, `sxfl2`, `sxfl3`, `gzsb`, `gzyy`
- port the original summary derivation rules and counters completely
- add required-field validation so structurally unusable normalized rows escalate to `error`
- add downstream `exportWorkbook` and `writeReportLog` stages that record `{attempted, success, path, error}`
- keep collection success distinct from downstream failures so export/logging failures become `partial`, not full failure
- keep placeholder rows, if needed for downstream empty-export payloads, downstream-only and never in canonical returned `rows`
- include both `period` and `selected_range` in the artifact
- omit `downstream` when export/report-log have not been attempted yet
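
One way to keep collection success separate from downstream failures is to convert each settled stage into a small record and derive the partial reasons from those records. The sketch below is illustrative only: `toDownstreamRecord` and `partialReasonsFromDownstream` are hypothetical helper names, and the input is assumed to have the `{ status, value | reason }` shape produced by `Promise.allSettled`.

```javascript
// Hypothetical helper: map one settled stage to an {attempted, success, path, error} record.
// Fields that do not apply (path on failure, error on success) are left out.
function toDownstreamRecord(settled) {
  if (settled.status === 'rejected') {
    const reason = settled.reason;
    return {
      attempted: true,
      success: false,
      error: reason instanceof Error ? reason.message : String(reason)
    };
  }
  const value = settled.value || {};
  const record = { attempted: true, success: value.success !== false };
  if (value.path) record.path = value.path;
  if (value.error) record.error = String(value.error);
  return record;
}

// Hypothetical helper: downstream failures become partial reasons, never a fatal error.
function partialReasonsFromDownstream(downstream) {
  const reasons = [];
  const failed = (stage) => Boolean(stage && stage.attempted && !stage.success);
  if (failed(downstream.export)) reasons.push('export_failed');
  if (failed(downstream.report_log)) reasons.push('report_log_failed');
  return reasons;
}
```

Because a stage that was never attempted contributes no reason, an artifact with an empty `downstream` object stays at whatever status the collection itself produced.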
- [ ] **Step 4: Re-run the staged-skill test file and verify it passes**
Run:
```bash
node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
```
Expected: PASS.
---
### Task 4: Align staged-skill metadata and reference docs with the implemented behavior
**Files:**
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.toml`
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.md`
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/references/collection-flow.md`
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/references/data-quality.md`
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/fault-details-report/scene.json`
- [ ] **Step 1: Update the staged metadata/docs to match the implemented runtime contract**
Required changes:
- `SKILL.toml`: description must say the tool collects rows, derives summary, attempts localhost export, and records report history
- `SKILL.md`: artifact example must include `selected_range`, `counts`, `status`, `partial_reasons`, and `downstream`
- `references/collection-flow.md`: sequence must explicitly include page-selected range -> raw query -> normalization -> summary -> export -> report-log
- `references/data-quality.md`: document the original classification tables, `qxxcjl` heuristics, summary rules, partial/error escalation rules, and empty-result semantics explicitly enough to match the implemented helpers
- `scene.json`: keep inputs/outputs/status semantics aligned with the richer artifact; do not add routing policy there
- [ ] **Step 2: Read the updated staged docs and verify they match the implemented JS behavior**
Read and confirm:
- descriptions no longer claim “artifact shell” behavior
- docs do not move routing ownership out of `claw-new`
- docs do not promise auto-opening/downloading behavior in this slice
- docs reflect blocked/error field-presence rules and downstream-attempt semantics
Expected: staged metadata/docs accurately reflect the implemented collector.
---
### Task 5: Add Rust red tests for artifact-status interpretation in the direct-submit runtime
**Files:**
- Modify: `tests/agent_runtime_test.rs`
- Modify: `tests/browser_script_skill_tool_test.rs`
- Modify: `src/compat/direct_skill_runtime.rs`
- Read only: `src/compat/browser_script_skill_tool.rs`
- [ ] **Step 1: Add failing direct-submit runtime tests for structured artifact statuses**
Extend `tests/agent_runtime_test.rs` with focused regressions that use the existing temp skill-root harness but return real `report-artifact` payloads:
```rust
#[test]
fn submit_task_treats_partial_report_artifact_as_success_with_warning_summary() {
    let skill_root = build_direct_runtime_skill_root();
    let runtime_context = direct_submit_runtime_context(&skill_root);
    let transport = Arc::new(MockTransport::new(vec![success_browser_response(
        1,
        serde_json::json!({
            "text": {
                "type": "report-artifact",
                "report_name": "fault-details-report",
                "period": "2026-03",
                "selected_range": { "start": "2026-03-08 16:00:00", "end": "2026-03-09 16:00:00" },
                "columns": ["qxdbh"],
                "rows": [{ "qxdbh": "QX-1" }],
                "sections": [{ "name": "summary-sheet", "columns": ["index"], "rows": [{ "index": 1 }] }],
                "counts": { "detail_rows": 1, "summary_rows": 1 },
                "status": "partial",
                "partial_reasons": ["report_log_failed"],
                "downstream": {
                    "export": { "attempted": true, "success": true, "path": "http://localhost/export.xlsx" },
                    "report_log": { "attempted": true, "success": false, "error": "500" }
                }
            }
        }),
    )]));

    // ... invoke handle_browser_message_with_context(...)
    // assert TaskComplete.success == true
    // assert summary contains partial/report_log_failed/detail_rows=1
}

#[test]
fn submit_task_treats_empty_report_artifact_as_success() { /* status=empty => success=true */ }

#[test]
fn submit_task_treats_blocked_report_artifact_as_failure() { /* status=blocked => success=false */ }

#[test]
fn submit_task_treats_error_report_artifact_as_failure() { /* status=error => success=false */ }
```
Also add one focused helper regression to `tests/browser_script_skill_tool_test.rs` that proves the browser-script helper can return a structured object payload used by the fault-details path without flattening required fields away.
Suggested test name:
```rust
#[tokio::test]
async fn execute_browser_script_tool_preserves_structured_report_artifact_payload() { /* ... */ }
```
- [ ] **Step 2: Run the focused Rust tests and verify they fail**
Run:
```bash
cargo test --test agent_runtime_test submit_task_treats_partial_report_artifact_as_success_with_warning_summary -- --nocapture
cargo test --test browser_script_skill_tool_test execute_browser_script_tool_preserves_structured_report_artifact_payload -- --nocapture
```
Expected: the new `agent_runtime_test` case fails because `execute_direct_submit_skill` still returns raw JSON text and `src/agent/mod.rs` still marks all direct-submit results as success when no Rust-side interpretation exists.
---
### Task 6: Implement narrow Rust artifact interpretation without moving business rules into Rust
**Files:**
- Modify: `src/compat/direct_skill_runtime.rs`
- Modify: `tests/agent_runtime_test.rs`
- Modify: `tests/browser_script_skill_tool_test.rs`
- [ ] **Step 1: Implement a narrow structured-artifact interpreter in `src/compat/direct_skill_runtime.rs`**
Add a small internal result type and parser, for example:
```rust
struct DirectSubmitOutcome {
    success: bool,
    summary: String,
}

fn interpret_direct_submit_output(output: &str) -> DirectSubmitOutcome {
    // parse JSON if possible
    // if type == "report-artifact", read status/counts/partial_reasons/downstream
    // map ok/partial/empty => success=true
    // map blocked/error => success=false
    // build concise summary with report_name, period, detail_rows, summary_rows, status, partial reasons
    // fall back to raw output text when payload is not a recognized artifact
}
```
Then change the public entrypoint shape from `Result<String, PipeError>` to a narrow result carrying `success` and `summary`, or add a second helper that `src/agent/mod.rs` can use without changing routing ownership.
Rules:
- do **not** reimplement fault normalization/classification/summary in Rust
- do **not** add fault-specific branching in `src/agent/mod.rs`
- keep unrecognized non-artifact outputs working as before
- keep explicit `YYYY-MM` derivation and configured `skill.tool` resolution unchanged
- [ ] **Step 2: Update the submit-path caller to use the interpreted success flag**
Adjust the direct-submit branch so `TaskComplete.success` comes from the artifact interpretation instead of blindly treating every `Ok(summary)` as success.
Implementation target:
- keep the direct path in `src/agent/mod.rs`
- keep error handling narrow
- if needed, return a dedicated direct-submit outcome from `execute_direct_submit_skill`
- [ ] **Step 3: Re-run the focused Rust tests and verify they pass**
Run:
```bash
cargo test --test agent_runtime_test submit_task_treats_partial_report_artifact_as_success_with_warning_summary -- --nocapture
cargo test --test agent_runtime_test submit_task_treats_empty_report_artifact_as_success -- --nocapture
cargo test --test agent_runtime_test submit_task_treats_blocked_report_artifact_as_failure -- --nocapture
cargo test --test agent_runtime_test submit_task_treats_error_report_artifact_as_failure -- --nocapture
cargo test --test browser_script_skill_tool_test execute_browser_script_tool_preserves_structured_report_artifact_payload -- --nocapture
```
Expected: PASS.
---
### Task 7: Run the full verification sweep for the staged skill and direct runtime
**Files:**
- Verify only
- [ ] **Step 1: Run the staged-skill deterministic test file**
Run:
```bash
node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
```
Expected: PASS.
- [ ] **Step 2: Run the relevant Rust regression suites**
Run:
```bash
cargo test --test browser_script_skill_tool_test -- --nocapture
cargo test --test agent_runtime_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 3: Run the broader compatibility coverage and build**
Run:
```bash
cargo test --test compat_runtime_test -- --nocapture
cargo test --test compat_config_test -- --nocapture
cargo build --bin sgclaw
```
Expected: PASS.
- [ ] **Step 4: Manually verify the requirements against the approved spec**
Checklist:
- staged skill now reads page-selected range instead of inventing a month window after entry
- staged skill returns canonical detail rows and summary rows
- staged skill ports the original classification table, `qxxcjl` heuristics, and summary counters with parity coverage
- staged skill records downstream export/report-log outcome
- staged skill distinguishes `ok` / `partial` / `empty` / `blocked` / `error`
- `blocked` / `error` artifacts keep the required top-level fields, and preserve known `selected_range` / `counts` when failure happens late enough
- `downstream` is omitted when export/report-log were not attempted and included with attempted/success flags once they were attempted
- empty-result canonical `rows` stay empty even if downstream export uses a placeholder transport row
- `claw-new` maps `ok` / `partial` / `empty` to success and `blocked` / `error` to failure
- no new routing metadata was added to `SKILL.toml` or `scene.json`
- no new browser protocol or opener/UI behavior was introduced
Expected: all checklist items satisfied before calling the work complete.
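
The field-presence items in this checklist can be illustrated with a reduced artifact builder. This is a sketch under stated assumptions, not the staged skill's implementation: `buildArtifactSketch` is a hypothetical name, `columns` and `sections` are left out for brevity, the status is passed in rather than derived, and emitting zero counts on a blocked artifact is an assumption.

```javascript
// Hypothetical, reduced artifact builder that shows only the field-presence rules.
function buildArtifactSketch({
  period,
  status,
  selectedRange,
  detailRows = [],
  summaryRows = [],
  partialReasons = [],
  downstream
}) {
  // Required top-level fields are always present, including for blocked/error artifacts.
  const artifact = {
    type: 'report-artifact',
    report_name: 'fault-details-report',
    period,
    status,
    partial_reasons: partialReasons,
    rows: detailRows,
    counts: { detail_rows: detailRows.length, summary_rows: summaryRows.length }
  };
  // A known selected range is preserved even when the failure happens late.
  if (selectedRange) artifact.selected_range = selectedRange;
  // `downstream` appears only once at least one stage has actually been attempted.
  const attempted = Boolean(
    downstream && Object.values(downstream).some((stage) => stage && stage.attempted)
  );
  if (attempted) artifact.downstream = downstream;
  return artifact;
}
```

Consumers can then test `'downstream' in artifact` to tell "not attempted" apart from "attempted and failed", which is the distinction the blocked and empty fixtures rely on.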
---
## Verification Checklist
### Staged skill behavior
```bash
node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
```
Expected: deterministic fixture coverage passes for normalization, full classification parity, summary derivation, artifact shape, empty semantics, and downstream partial semantics.
### Direct-submit runtime mapping
```bash
cargo test --test agent_runtime_test -- --nocapture
```
Expected:
- valid artifact `ok` / `partial` / `empty` completes successfully
- valid artifact `blocked` / `error` completes as failure
- existing invalid config regression still passes
- existing direct-submit happy path still passes
### Browser-script helper safety
```bash
cargo test --test browser_script_skill_tool_test -- --nocapture
```
Expected: current browser-script execution semantics remain intact while returning structured artifact payloads.
### Compatibility/build
```bash
cargo test --test compat_runtime_test -- --nocapture
cargo test --test compat_config_test -- --nocapture
cargo build --bin sgclaw
```
Expected: no regressions in compat execution/config loading; main binary builds cleanly.
---
## Notes For The Engineer
- The paired spec is `docs/superpowers/specs/2026-04-10-fault-details-full-skill-alignment-design.md`.
- Keep all fault business transforms in `skill_staging`, not in Rust.
- Keep direct routing config-owned via `skillsDir` + `directSubmitSkill`.
- Do **not** broaden this slice into LLM routing, generic dispatch policy, new browser opcodes, or export auto-open behavior.
- If the original package reveals extra classification rules that are needed for parity, add them only inside `collect_fault_details.js` and its staged references/tests, not in `claw-new`.


@@ -0,0 +1,551 @@
# Main → WS Merge v2 Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
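**Goal:** Merge the latest `origin/main` into `feature/claw-ws` so that the `ws` branch ends up keeping both **pipe + ws** communication capabilities, preserves Zhihu behavior, and replaces the old duplicate implementation that cleanup already deleted on `ws` with the official fault-details implementation from `main`.
**Architecture:** This merge does not mean "keep the cleanup state permanently, with no fault-details"; it means "first delete the old duplicate implementation on ws, then absorb the official implementation from main". Conflict-resolution priority: **protect pipe first, then ws, then Zhihu, while refusing to let the old duplicate scene/fault-details implementation from ws flow back in**. The whole process uses `git merge --no-commit --no-ff origin/main`; after conflicts are resolved, run focused verification only and stop in the uncommitted state.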
**Tech Stack:** Git, Rust 2021, Cargo test, sgClaw pipe transport, ws transport, compat/runtime/orchestration stack, Zhihu direct workflow tests.
---
## Preconditions
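- The current branch must be `feature/claw-ws`
- `2026-04-09-ws-branch-scene-cleanup-plan.md` has been completed
- No merge is currently in progress
- There are no uncommitted tracked changes
- **Do not create a worktree** for this run; work in the current repository
- The end point of this run is: **merged, verified, uncommitted**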
---
## Final Merge Principles
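### 1) `main` is the pipe mainline
The merge must not break the existing pipe communication on `main`.
### 2) The `ws` branch must end up keeping both pipe + ws
After the merge, the `ws` branch must not lose the websocket path, and must not be left with pipe only.
### 3) Both sides have Zhihu
The merge must not break existing Zhihu behavior, especially the ws→Zhihu keep-path.
### 4) fault-details follows the official `main` implementation
- The old duplicate implementation on `ws`: **must not flow back in**
- The official implementation on `main`: **should be merged in**
- The final result is not "no fault-details" but "no old ws fault-details, only the official main version"
### 5) Do not bring back old scene plumbing
The following old surfaces must not survive in the final result:
- ws's own old scene registry / old scene plumbing
- old duplicate routes/contracts already deleted by the ws cleanup
- leftover logic that only served the old `skill_staging` scene assembly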
---
## File Map
### A. Shared / high-risk files to watch during the merge
- `Cargo.toml`
- `Cargo.lock`
- `src/agent/mod.rs`
- `src/agent/task_runner.rs`
- `src/config/settings.rs`
- `src/compat/config_adapter.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
- `src/compat/browser_script_skill_tool.rs`
- `src/compat/direct_skill_runtime.rs`
- `src/compat/openxml_office_tool.rs`
### B. pipe / ws / Zhihu protection surface
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
- `src/agent/task_runner.rs`
- `tests/agent_runtime_test.rs`
- `tests/browser_ws_backend_test.rs`
- `tests/service_ws_session_test.rs`
- `tests/task_runner_test.rs`
### C. Files where old implementations must still be kept from flowing back after cleanup
- `src/runtime/mod.rs`
- `src/runtime/engine.rs`
- `src/config/settings.rs`
- `src/compat/config_adapter.rs`
- `tests/compat_runtime_test.rs`
- `tests/runtime_profile_test.rs`
- `tests/compat_config_test.rs`
### D. Test surface that may need updating along with main's official fault-details
- `tests/compat_runtime_test.rs`
- `tests/compat_config_test.rs`
- `tests/browser_script_skill_tool_test.rs`
- `tests/compat_openxml_office_tool_test.rs`
---
## Conflict Resolution Rule Table
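| Category | Final retention principle |
|---|---|
| pipe main path | **Prefer the working main version**; ws changes must not break it |
| ws path | **ws capability must be kept**; it must not be lost while absorbing main |
| Zhihu | Neither side's related capability may be broken; at minimum keep the existing keep-path |
| fault-details | **Keep main's official implementation**; do not keep the old ws duplicate |
| old scene/95598 cleanup leftovers | Must not flow back in the form of the old ws duplicate implementation |
| `skillsDir` / config | Follow what the final product needs; if main's official implementation does not require the old array-style/scene expansion, do not bring it back |
| Temporary merge patches | Never keep |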
---
### Task 1: Confirm Merge Preconditions And Diff Surface
**Files:**
- No code changes expected
- Observe repo state and branch diff only
- [ ] **Step 1: Confirm current branch**
Run:
```bash
git rev-parse --abbrev-ref HEAD
```
Expected:
```text
feature/claw-ws
```
- [ ] **Step 2: Confirm no merge is in progress**
Run:
```bash
git rev-parse -q --verify MERGE_HEAD
```
Expected: exit code `1`.
- [ ] **Step 3: Confirm no tracked local changes**
Run:
```bash
git diff --name-only && printf '\n---STAGED---\n' && git diff --cached --name-only
```
Expected:
```text
---STAGED---
```
- [ ] **Step 4: List current untracked files**
Run:
```bash
git status --short
```
Expected: only known local untracked items, or a clearly understood list.
- [ ] **Step 5: Update `origin/main`**
Run:
```bash
git fetch origin main
```
- [ ] **Step 6: Show ws vs main diff surface before merge**
Run:
```bash
git diff --name-status HEAD...origin/main
```
Expected: clear file list to compare likely merge surface.
- [ ] **Step 7: Stop if preconditions fail**
Stop if:
- branch is wrong
- merge is in progress
- tracked changes exist
- untracked file collision with `origin/main` is found and unresolved
---
### Task 2: Start The Merge Without Committing
**Files:**
- Merge index / working tree only
- [ ] **Step 1: Start no-commit merge**
Run:
```bash
git merge --no-commit --no-ff origin/main
```
Expected:
- either auto-merge pauses before commit
- or Git reports conflicts
- [ ] **Step 2: Capture merge surface immediately**
Run:
```bash
git status --short
```
- [ ] **Step 3: Separate results into three buckets**
Create a working list of conflicted files under:
1. pipe-critical
2. ws/Zhihu-critical
3. shared infra / tests
- [ ] **Step 4: If no conflicts, proceed directly to Task 4 verification**
- [ ] **Step 5: If conflicts exist, proceed to Task 3**
---
### Task 3: Resolve Conflicts By System Role, Not By Branch Bias
**Files:**
- Only files reported by Git as conflicted
#### Global conflict policy
For every conflicted hunk, answer these four questions in order:
1. Does this hunk affect **pipe** correctness?
2. Does this hunk affect **ws** correctness?
3. Does this hunk affect **Zhihu** correctness?
4. Is this hunk part of **ws old duplicate fault-details/scene logic** or **main official implementation**?
Then apply the rule:
- **pipe cannot break**
- **ws cannot break**
- **Zhihu cannot break**
- **ws old duplicate fault-details must stay deleted**
- **main official fault-details should come in**
---
#### Task 3A: Resolve pipe-critical shared runtime files
**Files:**
- `src/compat/runtime.rs`
- `src/agent/task_runner.rs`
- `src/agent/mod.rs`
- `src/config/settings.rs`
- `src/compat/config_adapter.rs`
- [ ] **Step 1: For each conflict, keep the side that preserves main's pipe behavior**
- [ ] **Step 2: Reject ws-only duplicate business logic that main already owns**
- [ ] **Step 3: Keep ws support if the file also serves ws path**
This is additive preservation, not “main wins everything”.
- [ ] **Step 4: Verify each resolved file has no conflict markers**
Run per file:
```bash
git diff --check -- <path>
```
---
#### Task 3B: Resolve ws / Zhihu-critical routing files
**Files:**
- `src/compat/workflow_executor.rs`
- `src/compat/orchestration.rs`
- [ ] **Step 1: Bring in main's official fault-details path if it lives here**
- [ ] **Step 2: Do not reintroduce ws's old duplicate fault-details path**
- [ ] **Step 3: Preserve ws submit/browser websocket path**
- [ ] **Step 4: Preserve Zhihu routing path**
- [ ] **Step 5: Verify each resolved file has no conflict markers**
Run per file:
```bash
git diff --check -- <path>
```
---
#### Task 3C: Resolve shared infra files minimally
**Files:**
- `Cargo.toml`
- `Cargo.lock`
- `src/compat/browser_script_skill_tool.rs`
- `src/compat/direct_skill_runtime.rs`
- `src/compat/openxml_office_tool.rs`
- [ ] **Step 1: Keep only the dependency/code shape needed by the merged result**
- [ ] **Step 2: Do not keep lines from prior failed merge attempts**
- [ ] **Step 3: Accept main fixes unless they break pipe/ws/Zhihu behavior**
- [ ] **Step 4: Verify each resolved file has no conflict markers**
Run per file:
```bash
git diff --check -- <path>
```
---
#### Task 3D: Resolve tests to reflect final intended product
**Files:**
- `tests/compat_runtime_test.rs`
- `tests/runtime_profile_test.rs`
- `tests/compat_config_test.rs`
- `tests/agent_runtime_test.rs`
- `tests/browser_script_skill_tool_test.rs`
- `tests/compat_openxml_office_tool_test.rs`
- [ ] **Step 1: Keep tests proving pipe path still works**
- [ ] **Step 2: Keep tests proving ws path still works**
- [ ] **Step 3: Keep Zhihu keep-path regression**
- [ ] **Step 4: Replace cleanup-only “fault-details absent” assertions if final intended state is now “fault-details present via main official implementation”**
- [ ] **Step 5: Do not keep assertions that only prove ws's old duplicate implementation is absent if they now contradict the intended merged product**
- [ ] **Step 6: Verify each resolved test file has no conflict markers**
Run per file:
```bash
git diff --check -- <path>
```
---
#### Task 3E: Confirm merge is fully resolved
**Files:**
- No code changes expected
- [ ] **Step 1: Confirm no unmerged entries remain**
Run:
```bash
git diff --name-only --diff-filter=U
```
Expected: no output.
- [ ] **Step 2: Show final resolved file list**
Run:
```bash
git diff --cached --name-only
```
---
### Task 4: Verify Final Product Behavior, Not Cleanup Intermediate State
**Files:**
- Test: `tests/agent_runtime_test.rs`
- Test: `tests/browser_ws_backend_test.rs`
- Test: `tests/service_ws_session_test.rs`
- Test: `tests/task_runner_test.rs`
- Test: `tests/compat_runtime_test.rs`
- Test: `tests/runtime_profile_test.rs`
- Test: `tests/compat_config_test.rs`
- Conditional: `tests/browser_script_skill_tool_test.rs`
- Conditional: `tests/compat_openxml_office_tool_test.rs`
#### Verification goals
This task must prove all four:
1. **pipe path still works**
2. **ws path still works**
3. **Zhihu still works**
4. **final fault-details implementation is the main version, not ws's old duplicate**
---
#### Task 4A: Verify pipe-related behavior
- [ ] **Step 1: Run task runner coverage**
Run:
```bash
cargo test --test task_runner_test -- --nocapture
```
- [ ] **Step 2: Run compat runtime suite relevant to main path**
Run:
```bash
cargo test --test compat_runtime_test -- --nocapture
```
- [ ] **Step 3: If pipe-specific tests fail, stop and fix merge resolution before continuing**
---
#### Task 4B: Verify ws-related behavior
- [ ] **Step 1: Run browser websocket backend suite**
Run:
```bash
cargo test --test browser_ws_backend_test -- --nocapture
```
- [ ] **Step 2: Run service websocket session suite**
Run:
```bash
cargo test --test service_ws_session_test -- --nocapture
```
- [ ] **Step 3: If ws-specific tests fail, stop and fix merge resolution before continuing**
---
#### Task 4C: Verify Zhihu behavior
- [ ] **Step 1: Re-run ws→Zhihu keep-path regression**
Run:
```bash
cargo test --test agent_runtime_test production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap -- --nocapture
```
Expected:
```text
1 passed; 0 failed
```
- [ ] **Step 2: If additional Zhihu tests were touched by conflicts, run the smallest affected test target**
Run as needed:
```bash
cargo test --test agent_runtime_test -- --nocapture
```
---
#### Task 4D: Verify config/runtime contracts
- [ ] **Step 1: Run runtime profile suite**
Run:
```bash
cargo test --test runtime_profile_test -- --nocapture
```
- [ ] **Step 2: Run compat config suite**
Run:
```bash
cargo test --test compat_config_test -- --nocapture
```
- [ ] **Step 3: Ensure contracts now reflect final merged product, not the cleanup-only intermediate**
---
#### Task 4E: Verify shared infra if touched
- [ ] **Step 1: If browser-script tool files were touched**
Run:
```bash
cargo test --test browser_script_skill_tool_test -- --nocapture
```
- [ ] **Step 2: If openxml files were touched**
Run:
```bash
cargo test --test compat_openxml_office_tool_test -- --nocapture
```
---
#### Task 4F: Compile guard
- [ ] **Step 1: Run compile-only full test build**
Run:
```bash
cargo test --no-run
```
Expected: exit code `0`.
---
### Task 5: Confirm The Merge Outcome Matches The Principle
**Files:**
- No code changes expected
- [ ] **Step 1: Show final status**
Run:
```bash
git status --short
```
Expected:
- no `UU` / `AA` / `DD`
- merged, validated, uncommitted state only
- [ ] **Step 2: Show final staged summary**
Run:
```bash
git diff --cached --stat
```
- [ ] **Step 3: Report the four required facts with command-backed evidence**
Only if verified:
1. pipe is not broken
2. ws is not broken
3. Zhihu is not broken
4. The final fault-details comes from main's official implementation, not the old ws duplicate
- [ ] **Step 4: Stop here**
Do **not** run:
```bash
git commit
git push
```
---
## Stop Conditions
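Stop immediately if any of the following occurs; do not expand the scope on your own:
- The official fault-details implementation on `origin/main` depends on contracts that cleanup already deleted, and this goes beyond a simple merge
- pipe and ws both depend on the same shared code, but their requirements conflict structurally
- The Zhihu keep-path fails
- `cargo test --no-run` fails and the problem lies outside this merge surface
- The pipe/ws coexistence design needs to be reworked rather than simply merged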
---
## One-line Execution Rule
**The final standard for this merge is not "keep ws without fault-details" but "protect pipe, protect ws, protect Zhihu, and let main's official fault-details replace the old ws duplicate".**


@@ -0,0 +1,808 @@
# TQ Lineloss Deterministic Skill Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add a staged `tq-lineloss-report.collect_lineloss` browser-script skill plus a `。。。` deterministic submit path in `claw-new` that extracts and normalizes company/month/week parameters without LLM, executes through the existing pipe browser-script seam, and does not regress Zhihu hotlist behavior.
**Architecture:** Keep the new behavior behind a narrow deterministic branch that activates only when the raw instruction ends with the exact suffix `。。。`. `claw-new` owns deterministic trigger detection, explicit scene matching, semantic extraction, canonical normalization, prompt-or-execute control flow, and artifact interpretation; the staged skill owns page inspection, source/API collection, row normalization, export/report-log behavior, and final artifact generation. Reuse the existing `browser_script` execution seam already used by the direct browser path so the backend can later swap from pipe to ws without changing the deterministic contract.
**Tech Stack:** Rust 2021, Cargo tests, existing `BrowserPipeTool` / `execute_browser_script_tool` seam, staged skill packaging under `claw/claw/skills/skill_staging`, browser-side JavaScript, deterministic string parsing and normalization.
---
## Execution Context
- Follow @superpowers:test-driven-development for every behavior change.
- Follow @superpowers:verification-before-completion before claiming each task is done.
- Do **not** create a git worktree unless the user explicitly asks.
- Keep the new behavior as a narrow branch; do **not** redesign the whole runtime into a general registry engine in this slice.
- Preserve `src/runtime/engine.rs:147-159` and `src/runtime/engine.rs:265-286` behavior unless a failing regression test proves a change is required.
- Do **not** add ws runtime requirements on `main`; keep ws-readiness isolated to backend-neutral contracts only.
- Never fall back to page defaults for missing company, mode, or period in deterministic mode.
- If a deterministic request does not match the lineloss whitelist scene, return a deterministic mismatch prompt instead of falling through to ordinary orchestration.
## File Map
### New or modified files in `claw-new`
- Create: `src/compat/deterministic_submit.rs`
- suffix detection, deterministic scene match, prompt-or-execute decision
- Create: `src/compat/tq_lineloss/mod.rs`
- public normalization and artifact helpers
- Create: `src/compat/tq_lineloss/contracts.rs`
- canonical request/result data structures and status semantics
- Create: `src/compat/tq_lineloss/org_resolver.rs`
- alias generation, canonical label/code resolution, ambiguity handling
- Create: `src/compat/tq_lineloss/period_resolver.rs`
- month/week extraction, contradiction detection, canonical payload building
- Create: `src/compat/tq_lineloss/org_units.rs`
- checked-in canonical unit dictionary derived from the real source tree data
- Modify: `src/compat/mod.rs`
- export the deterministic and lineloss modules
- Modify: `src/agent/mod.rs`
- insert the deterministic branch before ordinary LLM interpretation, but only when the exact suffix is present
- Modify only if code duplication would otherwise occur: `src/compat/direct_skill_runtime.rs`
- extract narrow shared browser-script execution helpers without changing current configured direct-submit behavior
- Read but avoid changing unless tests force it: `src/runtime/engine.rs`
- existing Zhihu hotlist routing/prompt logic must remain intact
### New staged skill package in `claw`
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/SKILL.md`
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/SKILL.toml`
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/references/collection-flow.md`
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/references/data-quality.md`
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/assets/scene-snapshot/index.html`
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js`
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js`
- Create if staging conventions require it: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/tq-lineloss-report/scene.json`
### Tests
- Create: `tests/deterministic_submit_test.rs`
- Modify: `tests/compat_runtime_test.rs`
- Modify only if end-to-end submit coverage requires it: `tests/runtime_task_flow_test.rs`
---
## Locked contracts
### Deterministic trigger contract
- Trigger only when the raw instruction ends with the exact suffix `。。。`.
- No suffix: current behavior unchanged.
- Suffix + unsupported scene: explicit deterministic mismatch prompt.
- Suffix is not permission for arbitrary browser actions; only fixed deterministic scenes are allowed.
- Negative cases must stay non-deterministic or mismatched exactly as designed:
- ASCII `...` is not the trigger
- `。。。。` is not the trigger
- `。。。` appearing in the middle of the instruction is not the trigger
- any trailing whitespace after `。。。` is not the trigger in this slice
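The trigger rules above can be sketched as a small pure function. This is only an illustrative sketch: the function name `has_deterministic_suffix` and the constant name are assumptions, not names taken from the codebase. The one non-obvious point is that `。。。。` also ends with `。。。`, so the character before the suffix has to be checked as well.

```rust
/// Hypothetical constant name; the suffix itself comes from the trigger contract.
const DETERMINISTIC_SUFFIX: &str = "。。。";

/// Returns true only when the raw instruction ends with exactly three
/// ideographic full stops. No trimming is applied, so trailing whitespace
/// disables the trigger, and a fourth `。` before the suffix is rejected.
pub fn has_deterministic_suffix(raw: &str) -> bool {
    match raw.strip_suffix(DETERMINISTIC_SUFFIX) {
        // Reject `。。。。` (and longer runs): the remainder must not end with `。`.
        Some(rest) => !rest.ends_with('。'),
        None => false,
    }
}
```

Because the function never trims or normalizes the input, the negative cases fall out of the implementation rather than needing special handling.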
### Canonical org contract
The resolver must output both display and backend values:
```rust
pub struct ResolvedOrg {
    pub label: String,
    pub code: String,
}
```
Required supported inputs include:
- `兰州公司`
- `天水公司`
- `国网兰州供电公司`
- `城关供电分公司`
- `榆中县供电公司`
- normalized shorthand such as `榆中县公司`
Rules:
- derive aliases from the real unit tree data
- require uniqueness before execution
- ambiguous aliases prompt and stop
- missing company prompts and stop
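One possible shape for the alias and uniqueness rules is sketched below. Everything beyond `ResolvedOrg` is an assumption for illustration: the `OrgResolution` enum, the `aliases_for` and `resolve_org` helpers, the alias-derivation rule (drop a `国网` prefix, shorten `供电公司` / `供电分公司` to `公司`), and the unit codes used in any example call. The real dictionary and alias rules must come from the source unit tree.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ResolvedOrg {
    pub label: String,
    pub code: String,
}

/// Hypothetical outcome type: exactly one hit executes, anything else prompts.
#[derive(Debug, PartialEq, Eq)]
pub enum OrgResolution {
    Resolved(ResolvedOrg),
    Ambiguous(Vec<String>),
    Missing,
}

/// Assumed alias rule: the formal name, the name without a `国网` prefix,
/// and a shorthand where `供电公司` / `供电分公司` collapse to `公司`.
fn aliases_for(label: &str) -> Vec<String> {
    let short = label.strip_prefix("国网").unwrap_or(label);
    let shorthand = short.replace("供电分公司", "公司").replace("供电公司", "公司");
    let mut out = vec![label.to_string(), short.to_string(), shorthand];
    out.sort();
    out.dedup();
    out
}

/// `units` is a slice of (canonical label, backend code) pairs.
pub fn resolve_org(instruction: &str, units: &[(&str, &str)]) -> OrgResolution {
    let mut hits: Vec<ResolvedOrg> = units
        .iter()
        .filter(|(label, _)| {
            aliases_for(label)
                .iter()
                .any(|alias| instruction.contains(alias.as_str()))
        })
        .map(|(label, code)| ResolvedOrg {
            label: label.to_string(),
            code: code.to_string(),
        })
        .collect();
    match hits.len() {
        0 => OrgResolution::Missing,
        1 => OrgResolution::Resolved(hits.remove(0)),
        _ => OrgResolution::Ambiguous(hits.into_iter().map(|h| h.label).collect()),
    }
}
```

Returning an explicit `Ambiguous` / `Missing` variant keeps the "prompt and stop" decision in the caller, so the resolver itself never guesses.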
### Canonical period contract
```rust
pub enum PeriodMode {
    Month,
    Week,
}

pub struct ResolvedPeriod {
    pub mode: PeriodMode,
    pub mode_code: String,
    pub value: String,
    pub payload: serde_json::Value,
}
```
Required supported inputs include:
- `月累计 2026-03`
- `月累计 2026年3月`
- `周累计 2026年第12周`
Rules:
- month and week intent are mutually exclusive
- missing mode prompts and stop
- missing period prompts and stop
- bare `第12周` is incomplete in this slice and must prompt for year instead of guessing
- derive the real backend `period_mode_code` values and request payload field names from the source page/API contract before implementation; do not ship placeholder enum echoes such as `month`/`week` unless the source materials prove those are the real backend codes
- never use page-selected defaults in deterministic mode
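The period rules above could be sketched with std-only string parsing as follows. This is a simplified illustration under stated assumptions: `PeriodOutcome`, `split_digits`, and `resolve_period` are hypothetical names, the prompt texts are placeholders, and the week value format `YYYY-Www` is an assumption. It deliberately leaves out `mode_code` and `payload`, since those must be derived from the real backend contract.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PeriodMode {
    Month,
    Week,
}

/// Hypothetical outcome type: either a canonical value or a prompt reason.
#[derive(Debug, PartialEq, Eq)]
pub enum PeriodOutcome {
    Resolved { mode: PeriodMode, value: String },
    Prompt(&'static str),
}

/// Splits a leading run of ASCII digits from the rest of the string.
fn split_digits(s: &str) -> (&str, &str) {
    let end = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    s.split_at(end)
}

pub fn resolve_period(text: &str) -> PeriodOutcome {
    // Month and week intent are mutually exclusive; neither one means "prompt".
    let mode = match (text.contains("月累计"), text.contains("周累计")) {
        (true, true) => return PeriodOutcome::Prompt("contradictory month/week intent"),
        (false, false) => return PeriodOutcome::Prompt("missing period mode"),
        (true, false) => PeriodMode::Month,
        (false, true) => PeriodMode::Week,
    };
    let Some(start) = text.find(|c: char| c.is_ascii_digit()) else {
        return PeriodOutcome::Prompt("missing period");
    };
    let (year, rest) = split_digits(&text[start..]);
    if year.len() != 4 {
        // A bare `第12周` lands here: digits exist, but no four-digit year.
        return PeriodOutcome::Prompt(match mode {
            PeriodMode::Week => "missing year for week period",
            PeriodMode::Month => "missing period",
        });
    }
    let (number, max, complete) = match mode {
        PeriodMode::Month => {
            if let Some(r) = rest.strip_prefix('-') {
                (split_digits(r).0, 12, true) // `2026-03`
            } else if let Some(r) = rest.strip_prefix('年') {
                let (digits, tail) = split_digits(r);
                (digits, 12, tail.starts_with('月')) // `2026年3月`
            } else {
                ("", 12, false)
            }
        }
        PeriodMode::Week => match rest.strip_prefix("年第").map(split_digits) {
            Some((digits, tail)) => (digits, 53, tail.starts_with('周')), // `2026年第12周`
            None => ("", 53, false),
        },
    };
    match number.parse::<u32>() {
        Ok(n) if complete && (1..=max).contains(&n) => {
            let value = match mode {
                PeriodMode::Month => format!("{year}-{n:02}"),
                PeriodMode::Week => format!("{year}-W{n:02}"), // assumed week format
            };
            PeriodOutcome::Resolved { mode, value }
        }
        _ => PeriodOutcome::Prompt("missing period"),
    }
}
```

The sketch never consults page state, which matches the rule that deterministic mode must not fall back to page-selected defaults.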
### Artifact contract
Lock the field names now so `claw-new` can interpret status without re-embedding business logic:
```json
{
  "type": "report-artifact",
  "report_name": "tq-lineloss-report",
  "status": "ok",
  "org": {
    "label": "国网兰州供电公司",
    "code": "008df5db70319f73e0508eoac23e0c3c"
  },
  "period": {
    "mode": "month",
    "mode_code": "<real-backend-mode-code>",
    "value": "2026-03",
    "payload": {
      "<real-backend-field>": "<real-backend-value>"
    }
  },
  "columns": [],
  "rows": [],
  "counts": {
    "rows": 0
  },
  "export": {
    "attempted": false,
    "status": "skipped",
    "message": null
  },
  "reasons": []
}
```
Status mapping in `claw-new`:
- `ok` -> task success
- `partial` -> task success with partial summary
- `blocked` -> task failure
- `error` -> task failure
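A minimal sketch of this mapping is shown below, assuming the artifact's `status` string has already been extracted. The `TaskOutcome` enum and `map_artifact_status` function are hypothetical names, and treating an unrecognized status as a failure is an assumption rather than something the contract above states.

```rust
/// Hypothetical task-level outcome used only for this illustration.
#[derive(Debug, PartialEq, Eq)]
pub enum TaskOutcome {
    Success,
    SuccessWithPartialSummary,
    Failure,
}

/// Maps the artifact `status` field to a task outcome.
pub fn map_artifact_status(status: &str) -> TaskOutcome {
    match status {
        "ok" => TaskOutcome::Success,
        "partial" => TaskOutcome::SuccessWithPartialSummary,
        "blocked" | "error" => TaskOutcome::Failure,
        // Assumption: any status outside the locked contract is a failure.
        _ => TaskOutcome::Failure,
    }
}
```

Keeping the mapping in one function means `claw-new` interprets only the status field and never re-derives business meaning from rows or reasons.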
---
### Task 1: Scaffold the staged skill package and written contract
**Files:**
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/SKILL.md`
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/SKILL.toml`
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/references/collection-flow.md`
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/references/data-quality.md`
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/assets/scene-snapshot/index.html`
- Create if required: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/tq-lineloss-report/scene.json`
- [ ] **Step 1: Write the failing package contract files**
Create the package using `fault-details-report` as the structure reference. Lock one tool only:
```toml
[[tools]]
name = "collect_lineloss"
kind = "browser_script"
description = "Collect 台区线损月/周累计线损率 rows using normalized company and period parameters and return a structured report artifact."
```
Declare required args in `SKILL.toml`:
- `expected_domain`
- `org_label`
- `org_code`
- `period_mode`
- `period_mode_code`
- `period_value`
- `period_payload`
- [ ] **Step 2: Write `SKILL.md` before implementation**
Document:
- when to use / when not to use
- required normalized args only
- blocked/error semantics
- exact returned artifact fields
- no raw natural-language values passed to backend requests
- [ ] **Step 3: Write the reference docs**
`references/collection-flow.md` must describe:
- relevant page state
- month request mapping
- week request mapping
- export/report-log flow if retained
`references/data-quality.md` must define:
- canonical output columns
- required field coverage
- status semantics
- partial/error rules
- org/period normalization assumptions
- [ ] **Step 4: Add scene metadata if the current staging registry needs it**
Keep it narrow: one scene, one tool, one artifact type.
- [ ] **Step 5: Add an automated staged-skill load/resolve check**
Add `tests/deterministic_submit_test.rs` coverage that loads the staged skills root used by runtime tests, resolves `tq-lineloss-report.collect_lineloss`, and asserts the tool is discoverable with the required args:
- `expected_domain`
- `org_label`
- `org_code`
- `period_mode`
- `period_mode_code`
- `period_value`
- `period_payload`
Run:
```bash
cargo test deterministic_submit_discovers_tq_lineloss_skill_contract -- --exact
```
Expected: FAIL before the package is fully wired, PASS once the staged skill contract is discoverable and complete.
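The required-args half of that check could be expressed with a helper along these lines. This is only a sketch: `REQUIRED_ARGS` and `missing_required_args` are hypothetical names, and how the declared arg names are actually loaded from `SKILL.toml` is left to the existing skill loader.

```rust
/// The seven args the plan requires `collect_lineloss` to declare.
pub const REQUIRED_ARGS: [&str; 7] = [
    "expected_domain",
    "org_label",
    "org_code",
    "period_mode",
    "period_mode_code",
    "period_value",
    "period_payload",
];

/// Returns the required args that are absent from the declared list,
/// preserving the contract order so failures are easy to read.
pub fn missing_required_args(declared: &[&str]) -> Vec<&'static str> {
    REQUIRED_ARGS
        .iter()
        .copied()
        .filter(|required| !declared.contains(required))
        .collect()
}
```

A test can then assert that the returned list is empty, which yields a readable failure message naming exactly which args the package still lacks.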
- [ ] **Step 6: Verify structural parity with `fault-details-report`**
Run a manual file-layout diff and confirm there are no placeholder descriptions or missing required docs.
- [ ] **Step 7: Commit**
```bash
git add "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report" "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/tq-lineloss-report/scene.json"
git commit -m "feat: scaffold tq lineloss staged skill contract"
```
---
### Task 2: Add browser-side JS red tests and implement the staged collector
**Files:**
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js`
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js`
- [ ] **Step 1: Write the failing JS tests first**
Cover deterministic pure helpers for:
- missing normalized args -> blocked/error contract
- month request shape uses `org_code` + canonical month payload
- week request shape uses `org_code` + canonical week payload
- artifact field names and counts
- partial/error status shaping
- no raw user-entered org text leakage into request fields
Example test skeleton:
```javascript
const test = require('node:test');
const assert = require('node:assert/strict');
const {
  validateArgs,
  buildMonthRequest,
  buildWeekRequest,
  normalizeRows,
  buildArtifact
} = require('./collect_lineloss.js');

test('buildMonthRequest uses canonical org code and month payload', () => {
  const request = buildMonthRequest({
    org_code: 'ORG-1',
    period_payload: { year: 2026, month: 3 }
  });
  assert.equal(request.orgCode, 'ORG-1');
  assert.equal(request.year, 2026);
  assert.equal(request.month, 3);
});

test('buildArtifact locks field names and partial semantics', () => {
  const artifact = buildArtifact({
    org_label: '国网兰州供电公司',
    org_code: 'ORG-1',
    period_mode: 'month',
    period_mode_code: 'month',
    period_value: '2026-03',
    period_payload: { year: 2026, month: 3 },
    rows: [{ id: 1 }],
    status: 'partial',
    reasons: ['export_failed']
  });
  assert.equal(artifact.report_name, 'tq-lineloss-report');
  assert.equal(artifact.org.code, 'ORG-1');
  assert.equal(artifact.period.value, '2026-03');
  assert.deepEqual(artifact.reasons, ['export_failed']);
});
```
- [ ] **Step 2: Run the JS test file to confirm failure**
Run:
```bash
node --test "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
```
Expected: FAIL because the script/helpers do not exist yet.
- [ ] **Step 3: Write the minimal browser-side implementation**
Required structure:
```javascript
function validateArgs(args) { /* require normalized canonical args */ }
function buildMonthRequest(args) { /* build month request from canonical values */ }
function buildWeekRequest(args) { /* build week request from canonical values */ }
function normalizeRows(rawRows) { /* canonical columns only */ }
function buildArtifact(input) { /* locked artifact shape */ }

return (async () => {
  const args = __SKILL_ARGS__;
  validateArgs(args);
  // validate page context
  // collect from page/API
  // normalize rows
  // optionally attempt export/report-log if the real business flow requires it
  return buildArtifact(result);
})();
```
Keep test exports behind an environment-safe guard so the file still works as browser-eval code.
- [ ] **Step 4: Re-run the JS tests until they pass**
Run:
```bash
node --test "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
```
Expected: PASS.
- [ ] **Step 5: Commit**
```bash
git add "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js" "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
git commit -m "feat: add tq lineloss browser collection script"
```
---
### Task 3: Add deterministic suffix detection and explicit scene routing
**Files:**
- Create: `src/compat/deterministic_submit.rs`
- Modify: `src/compat/mod.rs`
- Modify: `src/agent/mod.rs`
- Create: `tests/deterministic_submit_test.rs`
- [ ] **Step 1: Write failing routing tests**
Add Rust tests for:
- exact raw `。。。` suffix enables deterministic mode
- no suffix leaves current routing untouched
- suffix + unsupported deterministic request returns supported-scene prompt
- when page URL/title context is available and does not match the lineloss scene, deterministic routing returns mismatch/block prompt instead of proceeding
- Zhihu hotlist request without suffix keeps the current route
- ASCII `...` does not trigger deterministic mode
- `。。。。` does not trigger deterministic mode
- `。。。` in the middle of the instruction does not trigger deterministic mode
- trailing whitespace after `。。。` does not trigger deterministic mode in this slice
Suggested tests:
```rust
#[test]
fn deterministic_submit_requires_exact_suffix() {}
#[test]
fn deterministic_submit_nonmatch_returns_supported_scene_message() {}
#[test]
fn deterministic_submit_rejects_page_context_mismatch() {}
#[test]
fn zhihu_hotlist_request_without_suffix_keeps_existing_route() {}
#[test]
fn deterministic_submit_rejects_non_exact_suffix_variants() {}
```
- [ ] **Step 2: Run the targeted routing tests and confirm failure**
Run:
```bash
cargo test deterministic_submit_requires_exact_suffix -- --exact
cargo test deterministic_submit_nonmatch_returns_supported_scene_message -- --exact
cargo test zhihu_hotlist_request_without_suffix_keeps_existing_route -- --exact
```
Expected: FAIL because the deterministic routing seam does not exist yet.
- [ ] **Step 3: Implement the narrow deterministic routing module**
Recommended public shape:
```rust
pub enum DeterministicSubmitDecision {
    NotDeterministic,
    Prompt { summary: String },
    Execute(DeterministicExecutionPlan),
}
```
`src/agent/mod.rs` should:
1. detect deterministic suffix
2. if not deterministic, continue current flow untouched
3. if prompt, return `TaskComplete`
4. if execute, pass the plan into the browser-script execution seam
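The four-step flow above can be sketched as a single dispatch function. The `AgentStep` enum, the `next_step` function, and the single `tool` field on `DeterministicExecutionPlan` are placeholders invented for this illustration; the real plan type and the way `src/agent/mod.rs` emits `TaskComplete` will differ.

```rust
/// Placeholder plan type; the real one would carry normalized org/period args.
#[derive(Debug, PartialEq, Eq)]
pub struct DeterministicExecutionPlan {
    pub tool: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum DeterministicSubmitDecision {
    NotDeterministic,
    Prompt { summary: String },
    Execute(DeterministicExecutionPlan),
}

/// Hypothetical description of what the agent does next.
#[derive(Debug, PartialEq, Eq)]
pub enum AgentStep {
    /// Steps 1-2: no suffix, so the existing flow continues untouched.
    ContinueOrdinaryFlow,
    /// Step 3: a prompt ends the task with a summary and no browser action.
    TaskComplete { summary: String },
    /// Step 4: hand the plan to the browser-script execution seam.
    RunBrowserScript(DeterministicExecutionPlan),
}

pub fn next_step(decision: DeterministicSubmitDecision) -> AgentStep {
    match decision {
        DeterministicSubmitDecision::NotDeterministic => AgentStep::ContinueOrdinaryFlow,
        DeterministicSubmitDecision::Prompt { summary } => AgentStep::TaskComplete { summary },
        DeterministicSubmitDecision::Execute(plan) => AgentStep::RunBrowserScript(plan),
    }
}
```

Because the match is exhaustive, adding a new decision variant later forces the agent entry point to handle it explicitly instead of silently falling through to ordinary orchestration.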
- [ ] **Step 4: Re-run the routing tests**
Run:
```bash
cargo test deterministic_submit_requires_exact_suffix -- --exact
cargo test deterministic_submit_nonmatch_returns_supported_scene_message -- --exact
cargo test zhihu_hotlist_request_without_suffix_keeps_existing_route -- --exact
```
Expected: PASS.
- [ ] **Step 5: Commit**
```bash
git add src/compat/deterministic_submit.rs src/compat/mod.rs src/agent/mod.rs tests/deterministic_submit_test.rs
git commit -m "feat: add deterministic submit routing seam"
```
---
### Task 4: Implement company/unit normalization from real source data
**Files:**
- Create: `src/compat/tq_lineloss/mod.rs`
- Create: `src/compat/tq_lineloss/contracts.rs`
- Create: `src/compat/tq_lineloss/org_resolver.rs`
- Create: `src/compat/tq_lineloss/org_units.rs`
- Modify: `tests/deterministic_submit_test.rs`
- [ ] **Step 1: Write failing org resolver tests**
Cover:
- `兰州公司` -> canonical `国网兰州供电公司` + correct code
- `天水公司` -> canonical `国网天水供电公司` + correct code
- `城关供电分公司` -> lower-level direct match
- `榆中县公司` -> normalized county alias match
- ambiguous alias prompts instead of guessing
- missing company prompts instead of executing
Example skeleton:
```rust
#[test]
fn lineloss_org_resolver_matches_city_alias() {}
#[test]
fn lineloss_org_resolver_matches_county_alias() {}
#[test]
fn lineloss_org_resolver_prompts_on_ambiguity() {}
```
- [ ] **Step 2: Run the org tests and confirm failure**
Run:
```bash
cargo test lineloss_org_resolver_matches_city_alias -- --exact
cargo test lineloss_org_resolver_matches_county_alias -- --exact
cargo test lineloss_org_resolver_prompts_on_ambiguity -- --exact
```
Expected: FAIL because the resolver and checked-in unit dictionary do not exist yet.
- [ ] **Step 3: Check in the canonical unit dictionary and implement alias resolution**
Rules:
- derive data from the real source materials, not guessed literals
- keep canonical `label` and `code`
- generate normalized aliases from formal names
- support both city-company and district/county/sub-company levels
- require uniqueness before execution
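The rules above can be sketched as follows. This is a minimal illustration, not the checked-in resolver: `OrgUnit`, `OrgResolution`, `normalize_alias`, `build_alias_index`, and `resolve_org` are hypothetical names, the stripped tokens (`国网`, `供电`) are an assumed normalization rule, and the unit codes used with it must come from the real source materials.

```rust
use std::collections::HashMap;

/// One canonical dictionary entry. Real `code` values must come from source data.
#[derive(Debug, Clone, PartialEq)]
pub struct OrgUnit {
    pub label: &'static str,
    pub code: &'static str,
}

#[derive(Debug, PartialEq)]
pub enum OrgResolution {
    Matched(OrgUnit),
    /// More than one unit shares the alias: prompt instead of guessing.
    Ambiguous(Vec<&'static str>),
    /// Nothing matched: prompt instead of executing.
    Missing,
}

/// Assumed normalization: drop the formal prefix and the "供电" infix,
/// so "国网兰州供电公司" yields the alias "兰州公司".
fn normalize_alias(label: &str) -> String {
    label.replace("国网", "").replace("供电", "")
}

/// Index every unit under both its formal label and its generated alias.
pub fn build_alias_index(units: &[OrgUnit]) -> HashMap<String, Vec<OrgUnit>> {
    let mut index: HashMap<String, Vec<OrgUnit>> = HashMap::new();
    for unit in units {
        for key in [unit.label.to_string(), normalize_alias(unit.label)] {
            let bucket = index.entry(key).or_default();
            if !bucket.contains(unit) {
                bucket.push(unit.clone());
            }
        }
    }
    index
}

/// Exact lookup of an already-extracted candidate name; uniqueness is required.
pub fn resolve_org(index: &HashMap<String, Vec<OrgUnit>>, candidate: &str) -> OrgResolution {
    match index.get(candidate.trim()).map(Vec::as_slice) {
        None | Some([]) => OrgResolution::Missing,
        Some([only]) => OrgResolution::Matched(only.clone()),
        Some(many) => OrgResolution::Ambiguous(many.iter().map(|u| u.label).collect()),
    }
}
```

Keeping ambiguity as its own variant lets the caller map it directly to the explicit prompt message instead of picking the first hit.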
- [ ] **Step 4: Implement explicit prompt messages**
Examples:
- `已命中台区线损报表技能,但缺少供电单位,请补充如“兰州公司”或“城关供电分公司”。`
- `已命中台区线损报表技能,但供电单位存在歧义,请补充更完整名称。`
- [ ] **Step 5: Re-run the org tests**
Run:
```bash
cargo test lineloss_org_resolver_matches_city_alias -- --exact
cargo test lineloss_org_resolver_matches_county_alias -- --exact
cargo test lineloss_org_resolver_prompts_on_ambiguity -- --exact
```
Expected: PASS.
- [ ] **Step 6: Commit**
```bash
git add src/compat/tq_lineloss/mod.rs src/compat/tq_lineloss/contracts.rs src/compat/tq_lineloss/org_resolver.rs src/compat/tq_lineloss/org_units.rs tests/deterministic_submit_test.rs
git commit -m "feat: add tq lineloss org normalization"
```
---
### Task 5: Implement period extraction and canonical payload building
**Files:**
- Create: `src/compat/tq_lineloss/period_resolver.rs`
- Modify: `src/compat/tq_lineloss/mod.rs`
- Modify: `tests/deterministic_submit_test.rs`
- [ ] **Step 1: Write failing period resolver tests**
Cover:
- `月累计 2026-03`
- `月累计 2026年3月`
- `周累计 2026年第12周`
- contradictory month/week expressions prompt
- missing mode prompts
- missing period prompts
- bare `第12周` prompts for year in this slice
- real backend month/week mode codes and request payload field names are derived from source materials instead of placeholder values
Example skeleton:
```rust
#[test]
fn lineloss_period_resolver_parses_month_text() {}
#[test]
fn lineloss_period_resolver_parses_week_text() {}
#[test]
fn lineloss_period_resolver_prompts_for_missing_year_on_week() {}
#[test]
fn lineloss_period_resolver_rejects_contradictory_mode() {}
```
- [ ] **Step 2: Run the period tests and confirm failure**
Run:
```bash
cargo test lineloss_period_resolver_parses_month_text -- --exact
cargo test lineloss_period_resolver_parses_week_text -- --exact
cargo test lineloss_period_resolver_prompts_for_missing_year_on_week -- --exact
cargo test lineloss_period_resolver_rejects_contradictory_mode -- --exact
```
Expected: FAIL because the period resolver does not exist yet.
- [ ] **Step 3: Implement the minimal resolver**
Output contract:
```rust
pub struct ResolvedPeriod {
pub mode: PeriodMode,
pub mode_code: String,
pub value: String,
pub payload: serde_json::Value,
}
```
Rules:
- no page-default fallback
- no implicit current-year assumptions
- no mixed month/week execution
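One possible shape for these rules is sketched below. It is a simplified stand-in for `ResolvedPeriod`: the `PeriodOutcome` enum, the helper names, the whitespace-separated token assumption, and the `2026-W12` value format are all assumptions for illustration; real mode codes and payload fields must come from the source materials.

```rust
#[derive(Debug, PartialEq)]
pub enum PeriodMode {
    Month,
    Week,
}

#[derive(Debug, PartialEq)]
pub enum PeriodOutcome {
    Resolved { mode: PeriodMode, value: String },
    MissingMode,
    MissingPeriod,
    MissingYear,
    Contradictory,
}

fn parse_digits(s: &str) -> Option<u32> {
    if s.is_empty() || !s.chars().all(|c| c.is_ascii_digit()) {
        return None;
    }
    s.parse().ok()
}

/// Accepts "2026-03" or "2026年3月".
fn parse_month(token: &str) -> Option<(u32, u32)> {
    let (y, m) = if let Some((y, rest)) = token.split_once('年') {
        (y, rest.strip_suffix('月')?)
    } else {
        token.split_once('-')?
    };
    let (y, m) = (parse_digits(y)?, parse_digits(m)?);
    (y >= 1000 && (1..=12).contains(&m)).then_some((y, m))
}

/// Accepts "2026年第12周" (year present) or bare "第12周" (year absent).
fn parse_week(token: &str) -> Option<(Option<u32>, u32)> {
    let body = token.strip_suffix('周')?;
    let (year, week) = match body.split_once("年第") {
        Some((y, w)) => (Some(parse_digits(y)?), w),
        None => (None, body.strip_prefix('第')?),
    };
    let week = parse_digits(week)?;
    (1..=53).contains(&week).then_some((year, week))
}

pub fn resolve_period(text: &str) -> PeriodOutcome {
    let mode = match (text.contains("月累计"), text.contains("周累计")) {
        (true, true) => return PeriodOutcome::Contradictory,
        (true, false) => PeriodMode::Month,
        (false, true) => PeriodMode::Week,
        (false, false) => return PeriodOutcome::MissingMode,
    };
    let month = text.split_whitespace().find_map(parse_month);
    let week = text.split_whitespace().find_map(parse_week);
    match (mode, month, week) {
        // Both a month and a week expression: never execute a mixed request.
        (_, Some(_), Some(_)) => PeriodOutcome::Contradictory,
        (PeriodMode::Month, Some((y, m)), None) => PeriodOutcome::Resolved {
            mode: PeriodMode::Month,
            value: format!("{y}-{m:02}"),
        },
        (PeriodMode::Week, None, Some((Some(y), w))) => PeriodOutcome::Resolved {
            mode: PeriodMode::Week,
            value: format!("{y}-W{w:02}"),
        },
        // No implicit current-year assumption for a bare week number.
        (PeriodMode::Week, None, Some((None, _))) => PeriodOutcome::MissingYear,
        (PeriodMode::Month, None, Some(_)) | (PeriodMode::Week, Some(_), None) => {
            PeriodOutcome::Contradictory
        }
        (_, None, None) => PeriodOutcome::MissingPeriod,
    }
}
```

Every non-resolved variant corresponds to one prompt case from the test list, so no branch falls back to a page default.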
- [ ] **Step 4: Re-run the period tests**
Run:
```bash
cargo test lineloss_period_resolver_parses_month_text -- --exact
cargo test lineloss_period_resolver_parses_week_text -- --exact
cargo test lineloss_period_resolver_prompts_for_missing_year_on_week -- --exact
cargo test lineloss_period_resolver_rejects_contradictory_mode -- --exact
```
Expected: PASS.
- [ ] **Step 5: Commit**
```bash
git add src/compat/tq_lineloss/period_resolver.rs src/compat/tq_lineloss/mod.rs tests/deterministic_submit_test.rs
git commit -m "feat: add tq lineloss period normalization"
```
---
### Task 6: Wire deterministic execution through the existing browser-script seam
**Files:**
- Modify: `src/compat/deterministic_submit.rs`
- Modify: `src/agent/mod.rs`
- Modify if needed: `src/compat/direct_skill_runtime.rs`
- Modify: `tests/deterministic_submit_test.rs`
- Modify: `tests/compat_runtime_test.rs`
- [ ] **Step 1: Write failing execution tests**
Cover:
- successful deterministic lineloss request builds canonical tool args
- missing company/mode/period returns prompt without browser execution
- `partial` artifact maps to successful partial summary
- `blocked` and `error` artifacts map to failed completion
Example skeleton:
```rust
#[test]
fn deterministic_lineloss_execution_passes_canonical_args() {}
#[test]
fn deterministic_lineloss_missing_company_does_not_invoke_browser() {}
#[test]
fn deterministic_lineloss_partial_artifact_maps_to_partial_summary() {}
```
- [ ] **Step 2: Run the execution tests and confirm failure**
Run:
```bash
cargo test deterministic_lineloss_execution_passes_canonical_args -- --exact
cargo test deterministic_lineloss_missing_company_does_not_invoke_browser -- --exact
cargo test deterministic_lineloss_partial_artifact_maps_to_partial_summary -- --exact
```
Expected: FAIL because the deterministic execution plan is not wired yet.
- [ ] **Step 3: Implement execution via the existing `browser_script` seam**
Build tool args only from normalized values:
- `expected_domain`
- `org_label`
- `org_code`
- `period_mode`
- `period_mode_code`
- `period_value`
- `period_payload`
Resolve the tool explicitly to:
- `tq-lineloss-report.collect_lineloss`
Do not introduce a new browser opcode family or second browser protocol.
- [ ] **Step 4: Implement central artifact interpretation**
Recommended helper:
```rust
fn summarize_lineloss_artifact(artifact: &serde_json::Value) -> (bool, String)
```
Summary must include canonical org/period and row counts, and surface blocked/partial/error reasons.
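A rough sketch of the interpretation rule follows. To stay dependency-free it works on a typed view rather than `serde_json::Value`, so `LinelossArtifactView`, `ArtifactStatus`, `summarize_lineloss_view`, the `Complete` status name, and the summary text layout are all assumptions; only the mapping (`partial` succeeds, `blocked` and `error` fail) comes from this plan.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ArtifactStatus {
    Complete,
    Partial,
    Blocked,
    Error,
}

/// Hypothetical typed view over the fields the summary needs.
pub struct LinelossArtifactView<'a> {
    pub status: ArtifactStatus,
    pub org_label: &'a str,
    pub period_value: &'a str,
    pub row_count: usize,
    pub reason: Option<&'a str>,
}

/// Returns (success, summary). Partial results still count as success.
pub fn summarize_lineloss_view(artifact: &LinelossArtifactView<'_>) -> (bool, String) {
    let success = matches!(
        artifact.status,
        ArtifactStatus::Complete | ArtifactStatus::Partial
    );
    let status = match artifact.status {
        ArtifactStatus::Complete => "complete",
        ArtifactStatus::Partial => "partial",
        ArtifactStatus::Blocked => "blocked",
        ArtifactStatus::Error => "error",
    };
    let mut summary = format!(
        "status={status} org={} period={} rows={}",
        artifact.org_label, artifact.period_value, artifact.row_count
    );
    // Surface the reason for anything that is not a clean completion.
    if artifact.status != ArtifactStatus::Complete {
        if let Some(reason) = artifact.reason {
            summary.push_str(&format!(" reason={reason}"));
        }
    }
    (success, summary)
}
```

Centralizing this mapping in one function keeps the success/failure decision identical no matter which caller received the artifact.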
- [ ] **Step 5: Re-run the execution tests**
Run:
```bash
cargo test deterministic_lineloss_execution_passes_canonical_args -- --exact
cargo test deterministic_lineloss_missing_company_does_not_invoke_browser -- --exact
cargo test deterministic_lineloss_partial_artifact_maps_to_partial_summary -- --exact
```
Expected: PASS.
- [ ] **Step 6: Commit**
```bash
git add src/compat/deterministic_submit.rs src/agent/mod.rs src/compat/direct_skill_runtime.rs tests/deterministic_submit_test.rs tests/compat_runtime_test.rs
git commit -m "feat: execute deterministic tq lineloss skill through browser script seam"
```
---
### Task 7: Add Zhihu regression coverage and run the full verification set
**Files:**
- Modify: `tests/compat_runtime_test.rs`
- Modify only if required: `tests/runtime_task_flow_test.rs`
- Reuse: `tests/deterministic_submit_test.rs`
- [ ] **Step 1: Add focused Zhihu regression tests**
Required assertions:
- ordinary Zhihu hotlist requests without `。。。` still use the current path
- existing export/presentation requests still preserve their current behavior
- deterministic suffix does not silently route unmatched requests into Zhihu logic
- an existing non-lineloss direct `browser_script` path outside the new scene still behaves unchanged
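The routing split these assertions protect might look roughly like this. It is a sketch under stated assumptions: `Route` and `route` are hypothetical names, and the keyword check on `台区线损` stands in for whatever scene matcher the real deterministic branch uses.

```rust
/// The deterministic trigger suffix used throughout this plan.
pub const DETERMINISTIC_SUFFIX: &str = "。。。";

/// Routing outcome. Variant names are illustrative, not the real types.
#[derive(Debug, PartialEq)]
pub enum Route<'a> {
    /// No suffix: fall through to the existing routing untouched.
    Existing,
    /// Suffix present but no whitelisted scene matched: explicit message, no guessing.
    UnsupportedScene,
    /// Suffix present and a whitelisted scene matched.
    Deterministic { scene: &'static str, body: &'a str },
}

pub fn route(instruction: &str) -> Route<'_> {
    let Some(body) = instruction.trim_end().strip_suffix(DETERMINISTIC_SUFFIX) else {
        return Route::Existing;
    };
    // Assumption: a plain keyword check stands in for the real scene matcher.
    if body.contains("台区线损") {
        Route::Deterministic {
            scene: "tq-lineloss-report",
            body: body.trim(),
        }
    } else {
        Route::UnsupportedScene
    }
}
```

Because the suffix check runs first and returns early, a request without `。。。` never reaches deterministic code, which is what keeps the Zhihu path unchanged.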
- [ ] **Step 2: Add end-to-end deterministic submit coverage**
Required assertions:
- suffix detection
- scene match
- page-context mismatch prompt/block behavior when URL/title contradict the lineloss scene
- missing/ambiguous prompts
- canonical args passed to the browser-script tool
- returned summary shows canonical org and period
- execution stays on the existing pipe-backed browser-script seam with no ws-only dependency introduced on `main`
- [ ] **Step 3: Run the focused Rust tests**
Run:
```bash
cargo test --test deterministic_submit_test
cargo test --test compat_runtime_test
cargo test --test runtime_task_flow_test
```
Expected: PASS.
- [ ] **Step 4: Run the whole Rust suite**
Run:
```bash
cargo test
```
Expected: PASS.
- [ ] **Step 5: Re-run the staged skill JS tests**
Run:
```bash
node --test "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
```
Expected: PASS.
- [ ] **Step 6: Commit**
```bash
git add tests/deterministic_submit_test.rs tests/compat_runtime_test.rs tests/runtime_task_flow_test.rs
git commit -m "test: cover deterministic tq lineloss routing and zhihu regression"
```
---
## Final verification checklist
- [ ] `。。。` is the only deterministic trigger.
- [ ] Non-`。。。` requests preserve current routing.
- [ ] Deterministic page-context mismatch prompts or blocks before execution when URL/title contradict the lineloss scene.
- [ ] Zhihu hotlist behavior is unchanged.
- [ ] Existing non-lineloss direct `browser_script` behavior is unchanged.
- [ ] Deterministic non-match returns an explicit supported-scene message.
- [ ] Missing company prompts.
- [ ] Ambiguous company prompts.
- [ ] Missing mode prompts.
- [ ] Missing period prompts.
- [ ] Bare `第12周` prompts for year.
- [ ] Canonical org code is passed to the staged skill.
- [ ] Canonical period mode code and payload are passed to the staged skill.
- [ ] The staged skill returns the locked artifact shape.
- [ ] Execution uses the existing `browser_script` seam only.
- [ ] No ws-specific runtime dependency is added on `main`.
## Implementation notes
- Prefer extracting a tiny shared execution helper from `src/compat/direct_skill_runtime.rs` if needed instead of duplicating tool lookup or browser-script invocation code.
- Keep deterministic whitelist configuration in one place, but do not expand this slice into a full general scene-registry redesign.
- If a failing test suggests changing Zhihu behavior, fix the deterministic branch or test harness instead of weakening the existing Zhihu path.
- The checked-in unit dictionary is part of the deterministic contract; treat updates to that data as explicit behavior changes and cover them with tests.

@@ -0,0 +1,448 @@
# TQ Lineloss WS Dual-Transport Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add ws communication support for the existing `tq-lineloss-report.collect_lineloss` deterministic browser_script path on the `feature/claw-ws` branch while preserving the current pipe path and validated Zhihu ws behavior.
**Architecture:** Reuse the existing backend-neutral execution seam that already exists for deterministic submit and browser_script execution. Keep lineloss business parsing, canonical args, and artifact interpretation unchanged; only make the ws backend/protocol and submit-path verification complete enough for the same lineloss skill contract to run over both pipe and ws.
**Tech Stack:** Rust 2021, Cargo tests, existing `BrowserBackend` abstraction, `WsBrowserBackend`, `ws_protocol`, browser websocket contract in `docs/_tmp_sgbrowser_ws_api_doc.txt`, existing staged `browser_script` skill execution seam.
---
## Execution Context
- Follow @superpowers:test-driven-development for each behavior change.
- Follow @superpowers:verification-before-completion before claiming each task is done.
- Do **not** create a git worktree unless the user explicitly asks.
- This plan is **ws enablement only** for the already-added lineloss deterministic skill path.
- Do **not** redesign deterministic routing, org parsing, period parsing, staged skill packaging, or artifact contracts unless a failing ws-specific test proves a minimal compatibility fix is required.
- Do **not** modify validated Zhihu hotlist/export business behavior; only add regression coverage around it.
- Preserve the current pipe execution path as the control implementation.
- Preserve the current `BrowserBackend` seam; do not introduce a second lineloss-specific ws execution path.
## Scope Boundary
### In scope
- Make the existing lineloss deterministic `browser_script` skill path run through ws on this branch.
- Keep the same canonical tool args and returned artifact interpretation for both pipe and ws.
- Verify ws browser-script execution against the documented browser ws contract.
- Add focused tests for ws lineloss execution and regressions for Zhihu ws + pipe lineloss.
### Out of scope
- Changing lineloss trigger semantics (`。。。`).
- Changing org/unit normalization semantics or source dictionary shape.
- Changing period normalization semantics.
- Reworking staged skill docs or JS business collection logic beyond ws-compatibility necessities.
- Any Zhihu feature work.
- Any pipe-only cleanup/refactor.
- Any general scene-registry redesign.
## File Map
### Expected code changes
- Modify: `src/pipe/protocol.rs:49-78,130-165,192-209`
- keep `Action::Eval` encoding aligned with the current transport contract and lineloss skill expectations
- Modify: `src/pipe/browser_tool.rs:62-125`
- ensure eval response correlation and payload handling remain sufficient for deterministic lineloss execution
- Modify only if a focused test proves it is necessary: `src/compat/browser_script_skill_tool.rs:135-255`
- preserve browser_script contract; only make minimal output-shape handling fixes if eval payloads differ from the pipe baseline in a way current code cannot consume
- Modify only if a focused parity test proves it is necessary: `src/compat/direct_skill_runtime.rs:50-129`
- preserve shared backend-neutral execution helper behavior; no business logic changes
- Read and normally leave unchanged: `src/compat/deterministic_submit.rs:96-157`
- this is the business contract baseline and should not be rewritten for transport parity work
- Read and normally leave unchanged: `src/agent/mod.rs:242-285`
- this contains the current deterministic dispatch split used by this branch
### Expected test changes
- Modify: `tests/agent_runtime_test.rs`
- add/extend deterministic lineloss runtime coverage and parity assertions using the current runtime path
- Modify: `tests/compat_runtime_test.rs`
- add/extend focused pipe lineloss regression assertions so transport work cannot silently break pipe
- Modify only if end-to-end submit coverage truly needs it: `tests/runtime_task_flow_test.rs`
- verify broader submit-flow expectations remain intact
### Reference-only files
- Read only: `docs/superpowers/plans/2026-04-11-tq-lineloss-deterministic-skill-plan.md`
- Read only: `docs/superpowers/specs/2026-04-11-tq-lineloss-deterministic-skill-design.md`
- Read only: `docs/_tmp_sgbrowser_ws_api_doc.txt`
---
## Locked contracts
### Contract 1: Same lineloss deterministic business contract on both transports
The ws path must reuse the existing values produced by `src/compat/deterministic_submit.rs:84-95` and `src/compat/deterministic_submit.rs:135-166`:
- `expected_domain`
- `org_label`
- `org_code`
- `period_mode`
- `period_mode_code`
- `period_value`
- `period_payload`
No ws-specific lineloss args may be introduced in this slice.
### Contract 2: Same browser_script execution seam on both transports
The ws path must continue to use `execute_browser_script_skill_raw_output_with_browser_backend(...)` from `src/compat/direct_skill_runtime.rs:95-112`, which in turn uses the same browser_script tool path as pipe. Do not add a second lineloss-only ws runner.
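The idea behind this contract can be illustrated with a toy seam. Nothing here is the project's real API: `SketchBackend`, `PipeSketch`, `WsSketch`, and `run_skill` are hypothetical stand-ins for `BrowserBackend`, its two implementations, and the shared helper, whose actual signatures are not shown in this plan.

```rust
/// Hypothetical stand-in for the backend-neutral seam.
pub trait SketchBackend {
    fn eval(&self, script: &str) -> Result<String, String>;
}

/// Toy pipe transport that returns a canned artifact string.
pub struct PipeSketch {
    pub canned: String,
}

/// Toy ws transport that returns a canned artifact string.
pub struct WsSketch {
    pub canned: String,
}

impl SketchBackend for PipeSketch {
    fn eval(&self, _script: &str) -> Result<String, String> {
        Ok(self.canned.clone())
    }
}

impl SketchBackend for WsSketch {
    fn eval(&self, _script: &str) -> Result<String, String> {
        Ok(self.canned.clone())
    }
}

/// The single shared runner: it sees only the trait, never the transport,
/// so there is no place for a lineloss-only ws branch to hide.
pub fn run_skill(backend: &dyn SketchBackend, script: &str) -> Result<String, String> {
    let raw = backend.eval(script)?;
    if raw.trim().is_empty() {
        return Err("empty artifact".to_string());
    }
    Ok(raw)
}
```

With this shape, a parity test reduces to feeding the same artifact through both backends and asserting the runner's output is identical.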
### Contract 3: Same artifact interpretation on both transports
The ws path must produce output that remains consumable by `summarize_lineloss_output(...)` / `summarize_lineloss_artifact(...)` in `src/compat/deterministic_submit.rs:168-257` without transport-specific branching.
### Contract 4: Zhihu ws behavior must stay unchanged
The existing ws browser-script / export path already validated by `tests/agent_runtime_test.rs` and `tests/compat_runtime_test.rs` is a hard regression boundary. If a change breaks Zhihu tests, fix the ws seam instead of weakening Zhihu expectations.
### Contract 5: Pipe remains the baseline
For identical lineloss deterministic inputs, the pipe path should continue to succeed without requiring ws configuration.
---
### Task 1: Lock the ws contract with failing transport-level tests
**Files:**
- Modify: `tests/agent_runtime_test.rs`
- Modify: `tests/compat_runtime_test.rs`
- Read: `docs/_tmp_sgbrowser_ws_api_doc.txt`
- [ ] **Step 1: Add a failing ws lineloss deterministic runtime test**
Model it after the existing ws harness in `tests/agent_runtime_test.rs:69-166`, but target lineloss deterministic execution instead of Zhihu. The test should:
- configure `browserWsUrl`
- submit a deterministic lineloss instruction ending with `。。。`
- return a ws callback payload representing a lineloss `report-artifact`
- assert success summary includes canonical org, period, status, and rows
Suggested skeleton:
```rust
#[test]
fn ws_deterministic_lineloss_submit_executes_browser_script_and_summarizes_artifact() {
// arrange ws config + ws server + lineloss artifact callback
// act handle_browser_message_with_context(... SubmitTask ...)
// assert TaskComplete success summary contains canonical org/period/rows
}
```
- [ ] **Step 2: Add a failing pipe regression test for the same lineloss contract**
In `tests/compat_runtime_test.rs`, add a focused pipe-side assertion that the same deterministic lineloss instruction still succeeds through the current pipe seam and uses the same summary contract.
Suggested skeleton:
```rust
#[test]
fn pipe_deterministic_lineloss_submit_preserves_existing_summary_contract() {
// arrange MockTransport responses for browser_script eval
// act handle_browser_message_with_context(...)
// assert success summary matches canonical contract
}
```
- [ ] **Step 3: Add a failing ws regression assertion for Zhihu**
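Add or tighten a Zhihu ws assertion, named `ws_zhihu_submit_path_remains_unchanged_after_lineloss_transport_work` to match the Step 4 command, proving that ordinary Zhihu requests still use the existing ws path and are not intercepted by lineloss deterministic logic.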
- [ ] **Step 4: Run the three focused tests to confirm failure**
Run:
```bash
cargo test ws_deterministic_lineloss_submit_executes_browser_script_and_summarizes_artifact -- --exact
cargo test pipe_deterministic_lineloss_submit_preserves_existing_summary_contract -- --exact
cargo test ws_zhihu_submit_path_remains_unchanged_after_lineloss_transport_work -- --exact
```
Expected: at least the new ws lineloss test fails before the seam is completed.
- [ ] **Step 5: Commit**
```bash
git add tests/agent_runtime_test.rs tests/compat_runtime_test.rs
git commit -m "test: lock ws and pipe lineloss transport contracts"
```
---
### Task 2: Make the current eval transport contract explicitly satisfy browser-script requirements
**Files:**
- Modify: `src/pipe/protocol.rs:49-78,130-165,192-209`
- Modify: `src/pipe/browser_tool.rs:62-125`
- Modify only if tests prove necessary: `src/compat/browser_script_skill_tool.rs:99-180,214-255`
- Modify: `tests/pipe_protocol_test.rs`
- Modify: `tests/browser_tool_test.rs`
- Modify: `tests/browser_script_skill_tool_test.rs`
- [ ] **Step 1: Add failing protocol/result-contract tests first**
Extend or add focused tests to lock the current branch's real transport contract:
- `Action::Eval` remains supported by the line protocol and command encoding
- eval request/response correlation remains stable via `seq` matching for lineloss-style target URLs
- eval/browser_script result handling preserves the full JSON artifact string without truncation before deterministic lineloss summarization consumes it
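The correlation requirement can be pictured with a small queue-draining sketch. It assumes a simplified `EvalResponse` with only `seq`, `success`, and `text`; the real `BrowserMessage::Response` carries more fields, and `await_response` is a hypothetical helper rather than the actual `BrowserPipeTool` logic.

```rust
use std::collections::VecDeque;

/// Simplified response shape for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct EvalResponse {
    pub seq: u64,
    pub success: bool,
    pub text: String,
}

/// Drain queued responses until one matches `expected_seq`.
/// Stale responses for earlier requests are skipped, never mis-attributed.
pub fn await_response(
    queue: &mut VecDeque<EvalResponse>,
    expected_seq: u64,
) -> Option<EvalResponse> {
    while let Some(response) = queue.pop_front() {
        if response.seq == expected_seq {
            // The text is returned as-is: no trimming or truncation here.
            return Some(response);
        }
    }
    None
}
```

A test built on this shape can enqueue a stale response ahead of the real one and assert that the full artifact string of the matching `seq` comes back unchanged.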
Suggested skeletons:
```rust
#[test]
fn eval_action_remains_supported_in_protocol() {}
#[test]
fn browser_tool_matches_eval_response_by_seq_for_lineloss_flow() {}
#[test]
fn browser_script_tool_preserves_json_artifact_string_for_lineloss() {}
```
- [ ] **Step 2: Run the focused Task 2 tests to confirm failure**
Run:
```bash
cargo test eval_action_remains_supported_in_protocol -- --exact
cargo test browser_tool_matches_eval_response_by_seq_for_lineloss_flow -- --exact
cargo test browser_script_tool_preserves_json_artifact_string_for_lineloss -- --exact
```
Expected: at least one test fails if the current protocol/correlation/result handling is still insufficient for the lineloss artifact path.
- [ ] **Step 3: Implement the minimal transport-contract fix**
Allowed changes:
- adjust only the `Action::Eval` protocol/encoding support in `src/pipe/protocol.rs`
- adjust only request/response correlation in `src/pipe/browser_tool.rs`
- if and only if tests still prove it necessary, make a tiny result-shape/stringification fix in `src/compat/browser_script_skill_tool.rs`
- keep existing Zhihu-compatible behavior intact
Not allowed:
- adding lineloss-only transport fields
- adding a second lineloss-specific execution path
- changing deterministic lineloss business parsing or summary rules
- [ ] **Step 4: Re-run the focused Task 2 tests**
Run:
```bash
cargo test eval_action_remains_supported_in_protocol -- --exact
cargo test browser_tool_matches_eval_response_by_seq_for_lineloss_flow -- --exact
cargo test browser_script_tool_preserves_json_artifact_string_for_lineloss -- --exact
```
Expected: PASS.
- [ ] **Step 5: Re-run the focused ws lineloss runtime test from Task 1**
Run:
```bash
cargo test ws_deterministic_lineloss_submit_executes_browser_script_and_summarizes_artifact -- --exact
```
Expected: PASS.
- [ ] **Step 6: Commit**
```bash
git add src/pipe/protocol.rs src/pipe/browser_tool.rs src/compat/browser_script_skill_tool.rs tests/pipe_protocol_test.rs tests/browser_tool_test.rs tests/browser_script_skill_tool_test.rs
git commit -m "fix: align eval transport contract with lineloss browser script flow"
```
---
### Task 3: Make eval result-shape handling surface the lineloss artifact cleanly
**Files:**
- Modify: `src/pipe/browser_tool.rs:62-125`
- Modify only if tests prove necessary: `src/compat/browser_script_skill_tool.rs:159-180,248-255`
- Modify: `tests/browser_script_skill_tool_test.rs`
- [ ] **Step 1: Add a failing result-shape test**
Lock that an eval response carrying a JSON string report artifact is surfaced as the same browser_script tool output shape expected by `execute_browser_script_tool(...)`.
Suggested skeleton:
```rust
#[test]
fn ws_backend_eval_returns_text_payload_consumable_by_browser_script_tool() {
// arrange an eval response whose data.text is a JSON string artifact
// assert execute_browser_script_tool(...) returns the full artifact text without truncation
}
```
- [ ] **Step 2: Run the result-shape test to confirm failure**
Run:
```bash
cargo test ws_backend_eval_returns_text_payload_consumable_by_browser_script_tool -- --exact
```
Expected: FAIL only if current eval/result handling is not sufficient for full lineloss artifact output.
- [ ] **Step 3: Implement the minimal result-shape fix**
Allowed fixes:
- adjust `BrowserPipeTool::invoke(...)` only if response packaging itself is wrong
- if and only if still required, make a tiny output-shape compatibility fix in `src/compat/browser_script_skill_tool.rs` so JSON string `data.text` payloads are preserved identically to the pipe baseline
Not allowed:
- transport-specific lineloss parsing
- changes to deterministic business logic
- adding a second lineloss-specific execution path
- [ ] **Step 4: Re-run the result-shape test**
Run:
```bash
cargo test ws_backend_eval_returns_text_payload_consumable_by_browser_script_tool -- --exact
```
Expected: PASS.
- [ ] **Step 5: Re-run the focused ws lineloss runtime test from Task 1**
Run:
```bash
cargo test ws_deterministic_lineloss_submit_executes_browser_script_and_summarizes_artifact -- --exact
```
Expected: PASS.
- [ ] **Step 6: Commit**
```bash
git add src/pipe/browser_tool.rs src/compat/browser_script_skill_tool.rs tests/browser_script_skill_tool_test.rs
git commit -m "fix: make eval result shape match browser script contract"
```
---
### Task 4: Verify the current backend-neutral deterministic execution path without changing business rules
**Files:**
- Read baseline: `src/agent/mod.rs:242-285`
- Read baseline: `src/compat/deterministic_submit.rs:96-157`
- Modify only if a focused parity test proves it is necessary: `src/compat/direct_skill_runtime.rs:50-129`
- Modify: `tests/agent_runtime_test.rs`
- Modify: `tests/compat_runtime_test.rs`
- [ ] **Step 1: Add a failing integration test for backend-neutral parity**
Add a test proving these two current-branch paths produce the same lineloss summary contract for equivalent artifact payloads:
- pipe path via the existing deterministic submit flow in `tests/compat_runtime_test.rs`
- runtime path via `handle_browser_message_with_context(...)` deterministic submit routing in `tests/agent_runtime_test.rs`
Suggested skeleton:
```rust
#[test]
fn deterministic_lineloss_pipe_and_ws_paths_share_summary_contract() {}
```
- [ ] **Step 2: Run the parity test to confirm failure or gap**
Run:
```bash
cargo test deterministic_lineloss_pipe_and_ws_paths_share_summary_contract -- --exact
```
Expected: FAIL only if a remaining shared execution seam gap still exists.
- [ ] **Step 3: Apply the smallest shared execution fix if needed**
Allowed changes:
- tiny helper extraction or result handling in `src/compat/direct_skill_runtime.rs`
- no new lineloss-specific branch
- no change to deterministic lineloss business parsing or summary rules
- no change to configured direct-submit behavior for non-lineloss skills
- [ ] **Step 4: Re-run the parity test**
Run:
```bash
cargo test deterministic_lineloss_pipe_and_ws_paths_share_summary_contract -- --exact
```
Expected: PASS.
- [ ] **Step 5: Commit**
```bash
git add src/compat/direct_skill_runtime.rs tests/agent_runtime_test.rs tests/compat_runtime_test.rs
git commit -m "fix: preserve shared deterministic execution across pipe and ws"
```
---
### Task 5: Run the full focused verification set and stop if any Zhihu or pipe regression appears
**Files:**
- Reuse: `tests/agent_runtime_test.rs`
- Reuse: `tests/compat_runtime_test.rs`
- Reuse: `tests/runtime_task_flow_test.rs`
- [ ] **Step 1: Run focused ws + lineloss + Zhihu regression tests**
Run:
```bash
cargo test --test agent_runtime_test
cargo test --test compat_runtime_test
cargo test --test runtime_task_flow_test
```
Expected: PASS.
- [ ] **Step 2: Run targeted protocol/backend unit tests**
Run:
```bash
cargo test eval_action_remains_supported_in_protocol -- --exact
cargo test browser_tool_matches_eval_response_by_seq_for_lineloss_flow -- --exact
cargo test browser_script_tool_preserves_json_artifact_string_for_lineloss -- --exact
cargo test ws_backend_eval_returns_text_payload_consumable_by_browser_script_tool -- --exact
cargo test deterministic_lineloss_pipe_and_ws_paths_share_summary_contract -- --exact
```
Expected: PASS.
- [ ] **Step 3: Run the full Rust suite**
Run:
```bash
cargo test
```
Expected: PASS.
- [ ] **Step 4: Manual review of diff scope**
Confirm the diff only touches:
- current transport/result seam files (`src/pipe/protocol.rs`, `src/pipe/browser_tool.rs`)
- narrow shared browser_script/result compatibility helpers if strictly necessary
- tests
If diff includes Zhihu business logic, lineloss parsing rules, staged skill business JS, or unrelated cleanup, remove those changes before completion.
- [ ] **Step 5: Commit**
```bash
git add src/pipe/protocol.rs src/pipe/browser_tool.rs src/compat/browser_script_skill_tool.rs src/compat/direct_skill_runtime.rs tests/pipe_protocol_test.rs tests/browser_tool_test.rs tests/browser_script_skill_tool_test.rs tests/agent_runtime_test.rs tests/compat_runtime_test.rs
git commit -m "test: verify lineloss ws transport without regressing pipe or zhihu"
```
---
## Final verification checklist
- [ ] The same lineloss deterministic instruction works on pipe and ws.
- [ ] Pipe still works without any ws configuration.
- [ ] Eval transport support remains available for deterministic lineloss execution.
- [ ] Eval response payloads preserve the full lineloss artifact JSON string.
- [ ] `src/compat/deterministic_submit.rs` business rules remain transport-neutral.
- [ ] No ws-specific lineloss args were introduced.
- [ ] Zhihu ws tests still pass unchanged in behavior.
- [ ] No ordinary Zhihu request is intercepted by lineloss deterministic routing.
- [ ] No new transport-specific business branch was added for lineloss.
## Implementation notes
- Default to changing the current transport/result seam first: `src/pipe/protocol.rs` and `src/pipe/browser_tool.rs`.
- Treat `src/compat/browser_script_skill_tool.rs` and `src/compat/direct_skill_runtime.rs` as shared seams: change them only if a focused failing test shows a transport-neutral compatibility bug.
- If a proposed fix requires changing `src/compat/deterministic_submit.rs` business logic, stop and re-evaluate; that likely means the seam fix is happening at the wrong layer.
- If a proposed fix changes Zhihu expectations, stop and repair the seam instead.

@@ -0,0 +1,228 @@
# Async Browser Script Support Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
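**Goal:** Modify `build_eval_js` to support asynchronous scripts, fixing the bug where a Promise is serialized by `JSON.stringify` as `{}`.
**Architecture:** Change the JavaScript generated by `build_eval_js` from a synchronous IIFE to an async IIFE that uses `await` to wait for the script's result, then detects a Promise-like result and awaits it a second time.
**Tech Stack:** Rust, JavaScript (generated code)
---
## File structure
| File | Action | Notes |
|------|------|------|
| `src/browser/callback_backend.rs` | Modify | Update the `build_eval_js` function |
| `tests/browser_script_skill_tool_test.rs` | Add test | Add an async script test case |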
---
### Task 1: Make build_eval_js support async scripts
**Files:**
- Modify: `src/browser/callback_backend.rs:433-447`
**Current code:**
```rust
fn build_eval_js(source_url: &str, script: &str) -> String {
let escaped_source_url = escape_js_single_quoted(source_url);
let callback = EVAL_CALLBACK_NAME;
let events_url = escape_js_single_quoted(&events_endpoint_url(source_url));
format!(
"(function(){{try{{var v=(function(){{return {script}}})();\
var t=(typeof v==='string')?v:JSON.stringify(v);\
try{{callBackJsToCpp('{escaped_source_url}@_@'+window.location.href+'@_@{callback}@_@sgBrowserExcuteJsCodeByDomain@_@'+(t??''))}}catch(_){{}}\
var j=JSON.stringify({{type:'callback',callback:'{callback}',request_url:'{escaped_source_url}',payload:{{value:(t??'')}}}});\
try{{var r=new XMLHttpRequest();r.open('POST','{events_url}',true);r.setRequestHeader('Content-Type','application/json');r.send(j)}}catch(_){{}}\
try{{navigator.sendBeacon('{events_url}',new Blob([j],{{type:'application/json'}}))}}catch(_){{}}\
}}catch(e){{}}}})()"
)
}
```
**Updated code:**
```rust
fn build_eval_js(source_url: &str, script: &str) -> String {
let escaped_source_url = escape_js_single_quoted(source_url);
let callback = EVAL_CALLBACK_NAME;
let events_url = escape_js_single_quoted(&events_endpoint_url(source_url));
format!(
"(async function(){{try{{\
var v=await (async function(){{return {script}}})();\
if(v&&typeof v.then==='function'){{v=await v;}}\
var t=(typeof v==='string')?v:JSON.stringify(v);\
try{{callBackJsToCpp('{escaped_source_url}@_@'+window.location.href+'@_@{callback}@_@sgBrowserExcuteJsCodeByDomain@_@'+(t??''))}}catch(_){{}}\
var j=JSON.stringify({{type:'callback',callback:'{callback}',request_url:'{escaped_source_url}',payload:{{value:(t??'')}}}});\
try{{var r=new XMLHttpRequest();r.open('POST','{events_url}',true);r.setRequestHeader('Content-Type','application/json');r.send(j)}}catch(_){{}}\
try{{navigator.sendBeacon('{events_url}',new Blob([j],{{type:'application/json'}}))}}catch(_){{}}\
}}catch(e){{}}}})()"
)
}
```
**Key changes:**
1. `(function()` → `(async function()`: the whole IIFE becomes async.
2. `var v=(function(){return {script}})()` → `var v=await (async function(){return {script}})()`: the inner wrapper also becomes async and is awaited.
3. New `if(v&&typeof v.then==='function'){v=await v;}`: detects a Promise-like object and awaits it.
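To make the wrapper change easier to check in isolation, here is a reduced Rust sketch that generates only the async wrapper and can be asserted on as a string. `build_async_wrapper` is a hypothetical helper, not the real `build_eval_js`: it omits the callback, XHR, and beacon delivery, and it drops the explicit `.then` check because `await` already unwraps thenables (that check reads as a defensive extra).

```rust
/// Reduced stand-in for `build_eval_js`: async wrapper only, no result delivery.
/// `{{` and `}}` are escaped braces inside `format!`; `{script}` is interpolated.
fn build_async_wrapper(script: &str) -> String {
    format!(
        "(async function(){{try{{\
         var v=await (async function(){{return {script}}})();\
         var t=(typeof v==='string')?v:JSON.stringify(v);\
         return t;\
         }}catch(e){{}}}})()"
    )
}
```

A string-level unit test of this kind cannot prove the browser awaits the result, but it does catch accidental regressions of the wrapper back to a synchronous IIFE.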
- [ ] **Step 1: Modify the build_eval_js function**
Edit lines 433-447 of `src/browser/callback_backend.rs`, replacing them with the updated code above.
- [ ] **Step 2: Verify the build**
Run: `cargo build`
Expected: the build succeeds with no errors.
- [ ] **Step 3: Run the existing tests**
Run: `cargo test browser_script_skill_tool`
Expected: all tests pass.
- [ ] **Step 4: Commit**
```bash
git add src/browser/callback_backend.rs
git commit -m "fix: support async browser scripts in build_eval_js

Wrap eval script in async IIFE and await Promise-like results.
Fixes Promise serialization returning '{}' for async skill scripts.

🤖 Generated with [Qoder][https://qoder.com]"
```
---
### Task 2: Add an async script test case
**Files:**
- Modify: `tests/browser_script_skill_tool_test.rs`
- [ ] **Step 1: Add the async script test case**
Append a new test to the end of `tests/browser_script_skill_tool_test.rs`:
```rust
#[tokio::test]
async fn execute_browser_script_tool_awaits_async_script() {
let skill_dir = unique_temp_dir("sgclaw-browser-script-async");
let scripts_dir = skill_dir.join("scripts");
fs::create_dir_all(&scripts_dir).unwrap();
// Async script that returns a Promise
fs::write(
scripts_dir.join("async_extract.js"),
"return (async function() { return { async: true, args: args }; })();\n",
)
.unwrap();
let transport = Arc::new(MockTransport::new(vec![BrowserMessage::Response {
seq: 1,
success: true,
data: json!({
"text": {
"async": true,
"args": { "expected_domain": "example.com" }
}
}),
aom_snapshot: vec![],
timing: Timing {
queue_ms: 1,
exec_ms: 5,
},
}]));
let mut policy_json = test_policy();
// 允许 example.com
policy_json = MacPolicy::from_json_str(
r#"{
"version": "1.0",
"domains": { "allowed": ["www.zhihu.com", "example.com"] },
"pipe_actions": {
"allowed": ["click", "type", "navigate", "getText", "eval"],
"blocked": []
}
}"#,
)
.unwrap();
let browser_tool = BrowserPipeTool::new(
transport.clone(),
policy_json,
vec![1, 2, 3, 4, 5, 6, 7, 8],
)
.with_response_timeout(Duration::from_secs(1));
let skill_tool = SkillTool {
name: "async_extract".to_string(),
description: "Extract data asynchronously".to_string(),
kind: "browser_script".to_string(),
command: "scripts/async_extract.js".to_string(),
args: HashMap::new(),
};
let result = execute_browser_script_tool(
&skill_tool,
&skill_dir,
&PipeBrowserBackend::from_inner(browser_tool),
json!({
"expected_domain": "example.com"
}),
)
.await
.unwrap();
assert!(result.success);
let output = serde_json::from_str::<serde_json::Value>(&result.output).unwrap();
assert_eq!(output["async"], true);
}
```
- [ ] **Step 2: Run the new test**
Run: `cargo test execute_browser_script_tool_awaits_async_script`
Expected: The test passes.
- [ ] **Step 3: Commit**
```bash
git add tests/browser_script_skill_tool_test.rs
git commit -m "test: add async browser script test case

🤖 Generated with [Qoder][https://qoder.com]"
```
---
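### Task 3: End-to-end verification
**Files:**
- No file changes; verification only
- [ ] **Step 1: Full build**
Run: `cargo build`
Expected: Compiles successfully.
- [ ] **Step 2: Run the full test suite**
Run: `cargo test`
Expected: All tests pass.
- [ ] **Step 3: Manual end-to-end test**
Use the service console to test `tq-lineloss-report.collect_lineloss`:
1. Start sgclaw: `target/debug/sg_claw.exe`
2. Enter in the service console: `兰州公司 台区线损大数据 月累计线损率统计分析。。。`
3. Expected result: the actual report data is returned rather than `{}`
---
## Self-review checklist
- [x] Spec coverage: every point in the design document has a corresponding task
- [x] No placeholders: all code is complete
- [x] Type consistency: function signatures are unchanged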

@@ -0,0 +1,73 @@
# Async Eval .then() Fix Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Fix `build_eval_js` to handle async script return values using `.then()` instead of `async IIFE`.
**Architecture:** Extract callback-sending logic into a `_s` helper function inside the generated JS. If the script returns a Promise, call `_s` via `.then()`; otherwise call `_s` synchronously. This keeps the outer IIFE synchronous for C++ injection compatibility.
**Tech Stack:** Rust, JavaScript
---
## Files
- Modify: `src/browser/callback_backend.rs:433-447` - `build_eval_js` function
---
### Task 1: Modify build_eval_js to support async via .then()
**Files:**
- Modify: `src/browser/callback_backend.rs:433-447`
- [ ] **Step 1: Replace build_eval_js implementation**
Replace the entire `build_eval_js` function body (lines 433-447) with:
```rust
fn build_eval_js(source_url: &str, script: &str) -> String {
    let escaped_source_url = escape_js_single_quoted(source_url);
    let callback = EVAL_CALLBACK_NAME;
    let events_url = escape_js_single_quoted(&events_endpoint_url(source_url));
    // A trailing `\` joins the next line and skips its leading whitespace,
    // so the generated JS is a single line.
    // `_s` sends the result back; Promise-like results are forwarded via `.then(_s)`.
    format!(
        "(function(){{try{{\
        var v=(function(){{return {script}}})();\
        function _s(v){{\
        var t=(typeof v==='string')?v:JSON.stringify(v);\
        try{{callBackJsToCpp('{escaped_source_url}@_@'+window.location.href+'@_@{callback}@_@sgBrowserExcuteJsCodeByDomain@_@'+(t??''))}}catch(_){{}}\
        var j=JSON.stringify({{type:'callback',callback:'{callback}',request_url:'{escaped_source_url}',payload:{{value:(t??'')}}}});\
        try{{var r=new XMLHttpRequest();r.open('POST','{events_url}',true);r.setRequestHeader('Content-Type','application/json');r.send(j)}}catch(_){{}}\
        try{{navigator.sendBeacon('{events_url}',new Blob([j],{{type:'application/json'}}))}}catch(_){{}}\
        }}\
        if(v&&typeof v.then==='function'){{v.then(_s).catch(function(){{}});}}else{{_s(v);}}\
        }}catch(e){{}}}})()"
    )
}
```
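The doubled braces in the template above are `format!` escapes: `{{` and `}}` emit literal `{` and `}`, while `{script}` and the other named placeholders are interpolated. The following is a minimal sketch of that mechanism in isolation. The helper names `wrap_in_iife` and `dispatch_tail` are hypothetical and exist only for illustration; they are not part of `callback_backend.rs`.

```rust
/// Wraps a JavaScript expression in a synchronous IIFE, mirroring the
/// innermost wrapper of the template (hypothetical helper).
fn wrap_in_iife(script: &str) -> String {
    // `{{` / `}}` become literal braces; `{script}` is substituted.
    format!("(function(){{return {script}}})()")
}

/// Builds the Promise-aware dispatch: thenables are forwarded to `sender`
/// via `.then()`, anything else is passed to it synchronously.
fn dispatch_tail(sender: &str) -> String {
    format!(
        "if(v&&typeof v.then==='function'){{v.then({sender}).catch(function(){{}});}}else{{{sender}(v);}}"
    )
}
```

Because the dispatch only attaches `.then()`, the outer IIFE stays synchronous. Note that a rejected Promise is swallowed by the empty `.catch`, so in that case no callback is sent.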
- [ ] **Step 2: Run tests**
Run: `cargo test browser_script_skill_tool --no-fail-fast`
Expected: All tests pass.
- [ ] **Step 3: Run full test suite**
Run: `cargo test`
Expected: All tests pass, except for the pre-existing `lineloss_period_resolver_prompts_for_missing_period` failure, which is unrelated.
- [ ] **Step 4: Build**
Run: `cargo build`
Expected: Compiles with no errors.
- [ ] **Step 5: Commit**
```bash
git add src/browser/callback_backend.rs
git commit -m "fix: support async browser scripts via .then() in build_eval_js"
```

@@ -0,0 +1,52 @@
# Expected Domain Arg Fix Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Fix browser_script_skill_tool to pass expected_domain to wrapped JS scripts.
**Architecture:** Insert the normalized expected_domain back into args HashMap after domain normalization, before script wrapping.
**Tech Stack:** Rust, serde_json
---
## Files
- Modify: `src/compat/browser_script_skill_tool.rs:210` - Insert expected_domain back into args
---
### Task 1: Insert expected_domain into args
**Files:**
- Modify: `src/compat/browser_script_skill_tool.rs:210`
- [ ] **Step 1: Add expected_domain to args after normalization**
Edit `src/compat/browser_script_skill_tool.rs`, insert after line 209 (`eprintln!("[execute_browser_script_impl] expected_domain: {}", expected_domain);`):
```rust
args.insert("expected_domain".to_string(), Value::String(expected_domain.clone()));
```
The context around line 209-211 should look like this after the edit:
```rust
eprintln!("[execute_browser_script_impl] expected_domain: {}", expected_domain);
args.insert("expected_domain".to_string(), Value::String(expected_domain.clone()));
for required_arg in tool.args.keys() {
```
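The effect of the one-line fix can be sketched in isolation. This is a simplified model, not the real function: it assumes the original code removes `expected_domain` from `args` before normalizing it, uses `HashMap<String, String>` instead of `serde_json::Value`, and the names `normalize_domain` and `prepare_script_args` are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical normalizer: drops the scheme and path, lowercases the host.
fn normalize_domain(raw: &str) -> String {
    let trimmed = raw.trim();
    let without_scheme = trimmed
        .strip_prefix("https://")
        .or_else(|| trimmed.strip_prefix("http://"))
        .unwrap_or(trimmed);
    without_scheme
        .split('/')
        .next()
        .unwrap_or(without_scheme)
        .to_ascii_lowercase()
}

/// Takes `expected_domain` out of the args, normalizes it, and puts the
/// normalized value back so the wrapped script can still read it.
fn prepare_script_args(
    mut args: HashMap<String, String>,
) -> Result<HashMap<String, String>, String> {
    let raw = args
        .remove("expected_domain")
        .ok_or_else(|| "expected_domain is required".to_string())?;
    let expected_domain = normalize_domain(&raw);
    // Without this insert the script would see `args` without the domain.
    args.insert("expected_domain".to_string(), expected_domain);
    Ok(args)
}
```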
- [ ] **Step 2: Run tests to verify the fix**
Run: `cargo test browser_script_skill_tool --no-fail-fast -- --nocapture`
Expected: All tests pass, including `execute_browser_script_tool_runs_packaged_script_with_expected_domain`
- [ ] **Step 3: Commit**
```bash
git add src/compat/browser_script_skill_tool.rs
git commit -m "fix: pass expected_domain to wrapped browser scripts"
```

@@ -0,0 +1,163 @@
# Station-Area Line Loss (台区线损) requesturl Quick Fix Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add a lineloss URL mapping to `derive_request_url_from_instruction` so that the `sgHideBrowerserOpenPage` command executes correctly.
**Architecture:** Append a hardcoded mapping for the lineloss scenario after the existing Zhihu URL mapping pattern.
**Tech Stack:** Rust
---
### Task 1: Add a test case
**Files:**
- Modify: `src/service/server.rs:828` (tests module)
- [ ] **Step 1: Add a lineloss URL mapping test to the tests module**
Add the new test after the `initial_request_url_falls_back_to_zhihu_origin_for_generated_article_publish_routes` test:
```rust
#[test]
fn initial_request_url_falls_back_to_lineloss_origin_for_lineloss_instructions() {
    let request = SubmitTaskRequest {
        instruction: "兰州公司 台区线损大数据 月累计线损率统计分析。。。".to_string(),
        ..SubmitTaskRequest::default()
    };
    assert_eq!(
        initial_request_url_for_submit_task(&request),
        "http://20.76.57.61:18080"
    );
}
```
- [ ] **Step 2: Run the test and confirm it fails**
Run: `cargo test initial_request_url_falls_back_to_lineloss_origin_for_lineloss_instructions -- --nocapture`
Expected: FAIL. The test should fail because the mapping logic is not implemented yet.
- [ ] **Step 3: Commit the test**
```bash
git add src/service/server.rs
git commit -m "test: add lineloss requesturl mapping test"
```
---
### Task 2: Implement the lineloss URL mapping
**Files:**
- Modify: `src/service/server.rs:354-382` (`derive_request_url_from_instruction` function)
- [ ] **Step 1: Add the lineloss mapping to `derive_request_url_from_instruction`**
Add the following after the second Zhihu check block and before `None`:
```rust
    // Station-area line loss (台区线损)
    // TODO: temporary workaround; later this should come from the skill config
    // or from the deterministic_submit parse result.
    if instruction.contains("线损") || instruction.contains("lineloss") {
        return Some("http://20.76.57.61:18080".to_string());
    }
    None
```
The complete function should be:
```rust
fn derive_request_url_from_instruction(instruction: &str) -> Option<String> {
    if crate::compat::workflow_executor::detect_route(instruction, None, None)
        .is_some_and(|route| {
            matches!(
                route,
                crate::compat::workflow_executor::WorkflowRoute::ZhihuHotlistExportXlsx
                    | crate::compat::workflow_executor::WorkflowRoute::ZhihuHotlistScreen
                    | crate::compat::workflow_executor::WorkflowRoute::ZhihuArticleEntry
                    | crate::compat::workflow_executor::WorkflowRoute::ZhihuArticleAutoPublishGenerated
            )
        })
    {
        return Some("https://www.zhihu.com".to_string());
    }
    if crate::compat::workflow_executor::detect_route(instruction, None, None)
        .is_some_and(|route| {
            matches!(
                route,
                crate::compat::workflow_executor::WorkflowRoute::ZhihuArticleDraft
                    | crate::compat::workflow_executor::WorkflowRoute::ZhihuArticlePublish
            )
        })
    {
        return Some("https://zhuanlan.zhihu.com".to_string());
    }
    // Station-area line loss (台区线损)
    // TODO: temporary workaround; later this should come from the skill config
    // or from the deterministic_submit parse result.
    if instruction.contains("线损") || instruction.contains("lineloss") {
        return Some("http://20.76.57.61:18080".to_string());
    }
    None
}
```
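The keyword check is a substring match on the raw instruction. One possible way to keep such hardcoded mappings in a single place until the TODO is resolved is a small rule table. The sketch below is only an illustration under that assumption: `OriginRule`, `ORIGIN_RULES`, and `derive_origin_from_keywords` are hypothetical names and do not exist in `server.rs`.

```rust
/// One keyword-to-origin rule (hypothetical type for illustration).
struct OriginRule {
    keywords: &'static [&'static str],
    origin: &'static str,
}

/// Rules are checked in order; the first rule with a matching keyword wins.
const ORIGIN_RULES: &[OriginRule] = &[OriginRule {
    keywords: &["线损", "lineloss"],
    origin: "http://20.76.57.61:18080",
}];

/// Returns the origin of the first rule whose keyword occurs in the instruction.
fn derive_origin_from_keywords(instruction: &str) -> Option<String> {
    ORIGIN_RULES
        .iter()
        .find(|rule| rule.keywords.iter().any(|kw| instruction.contains(*kw)))
        .map(|rule| rule.origin.to_string())
}
```

Adding another scenario then means adding one table entry rather than another `if` block.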
- [ ] **Step 2: Run the test and confirm it passes**
Run: `cargo test initial_request_url_falls_back_to_lineloss_origin_for_lineloss_instructions -- --nocapture`
Expected: PASS
- [ ] **Step 3: Run all related tests**
Run: `cargo test initial_request_url -- --nocapture`
Expected: All tests pass.
- [ ] **Step 4: Build the project**
Run: `cargo build`
Expected: Compiles successfully with no errors.
- [ ] **Step 5: Commit the implementation**
```bash
git add src/service/server.rs
git commit -m "feat: add lineloss URL mapping in derive_request_url_from_instruction

Temporary workaround: return the lineloss platform URL when the instruction contains '线损' or 'lineloss'.

🤖 Generated with [Qoder][https://qoder.com]"
```
---
### Task 3: End-to-end verification
**Files:**
- No file changes; verification only
- [ ] **Step 1: Stop any running sgclaw process**
Make sure no `sg_claw.exe` is running.
- [ ] **Step 2: Start the sgclaw service**
Run: `target\debug\sg_claw.exe --config-path ..\sgclaw_config.json service`
- [ ] **Step 3: Send the test instruction from the service console**
Instruction: `兰州公司 台区线损大数据 月累计线损率统计分析。。。`
Expected: The log shows `bootstrap_url=http://20.76.57.61:18080` rather than `about:blank`.
- [ ] **Step 4: Verify that the helper page opens**
Expected: The log shows `helper_loaded=true, ready=true` and the request no longer times out.

@@ -0,0 +1,76 @@
# Station-Area Line Loss (台区线损) Missing target_url Fix Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add the `target_url` parameter when `browser_script_skill_tool.rs` invokes `Action::Eval`.
**Architecture:** Build a full URL from `expected_domain` (`http://{expected_domain}`) and add it to the params passed to `invoke`.
**Tech Stack:** Rust, serde_json
---
### Task 1: Add the target_url parameter
**Files:**
- Modify: `src/compat/browser_script_skill_tool.rs:238-241` (`invoke` call)
- [ ] **Step 1: Modify the `invoke` call to add target_url**
Change:
```rust
let result = match browser_tool.invoke(
    Action::Eval,
    json!({ "script": wrapped_script }),
    &expected_domain,
) {
```
to:
```rust
let target_url = format!("http://{}", expected_domain);
let result = match browser_tool.invoke(
    Action::Eval,
    json!({
        "script": wrapped_script,
        "target_url": target_url,
    }),
    &expected_domain,
) {
```
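The plan always prefixes `http://`, which fits the lineloss host but would produce a doubled scheme if `expected_domain` ever arrived with one already attached. The following is a minimal sketch of a more defensive construction. It assumes such input is possible, which the plan does not state, and `target_url_for` is a hypothetical helper name.

```rust
/// Builds the eval target URL from a domain, keeping an existing scheme
/// and defaulting to `http://` otherwise (hypothetical helper).
fn target_url_for(expected_domain: &str) -> String {
    let trimmed = expected_domain.trim();
    if trimmed.starts_with("http://") || trimmed.starts_with("https://") {
        trimmed.to_string()
    } else {
        format!("http://{trimmed}")
    }
}
```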
- [ ] **Step 2: Build the project**
Run: `cargo build`
Expected: Compiles successfully with no errors.
- [ ] **Step 3: Commit the change**
```bash
git add src/compat/browser_script_skill_tool.rs
git commit -m "fix: add target_url param for Action::Eval in browser_script_skill_tool

🤖 Generated with [Qoder][https://qoder.com]"
```
---
### Task 2: End-to-end verification
**Files:**
- No file changes; verification only
- [ ] **Step 1: Stop any running sgclaw process**
Make sure no `sg_claw.exe` is running.
- [ ] **Step 2: Start the sgclaw service**
Run: `target\debug\sg_claw.exe --config-path ..\sgclaw_config.json service`
- [ ] **Step 3: Send the test instruction from the service console**
Instruction: `兰州公司 台区线损大数据 月累计线损率统计分析。。。`
Expected: The log shows `invoke 成功` (invoke succeeded) and the `target_url is required for eval` error no longer appears.

@@ -0,0 +1,912 @@
# Rust-Side Lineloss XLSX Export Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Move XLSX export from browser JS (blocked by CORS) to Rust side, so `collect_lineloss.js` only collects data and Rust generates the `.xlsx` file locally.
**Architecture:** JS collects API data and returns a `report-artifact` JSON with `rows`, `column_defs`, and metadata. Rust parses the artifact, extracts rows + column definitions, and generates a standard `.xlsx` file using the `zip` crate + OpenXML XML strings (same pattern as `openxml_office_tool.rs`). Report log is deferred.
**Tech Stack:** Rust, `zip` 0.6.6, `serde_json`, OpenXML Spreadsheet ML, JavaScript (browser-injected)
**Spec:** `docs/superpowers/specs/2026-04-13-rust-side-lineloss-xlsx-export.md`
---
## File Structure
| File | Responsibility |
|------|---------------|
| `src/compat/lineloss_xlsx_export.rs` | **New.** Pure XLSX generation: takes column defs + row data, produces `.xlsx` file. No business logic. |
| `src/compat/deterministic_submit.rs` | **Modify.** After receiving JS artifact, extract rows + column_defs, call XLSX export, attach path to outcome. |
| `src/compat/mod.rs` | **Modify.** Register `lineloss_xlsx_export` module. |
| `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js` | **Modify.** Remove `exportWorkbook`/`writeReportLog` calls. Add `column_defs` to artifact. |
| `tests/lineloss_xlsx_export_test.rs` | **New.** Unit tests for XLSX generation. |
---
### Task 1: Create `lineloss_xlsx_export.rs` with Tests
**Files:**
- Create: `src/compat/lineloss_xlsx_export.rs`
- Create: `tests/lineloss_xlsx_export_test.rs`
- Modify: `src/compat/mod.rs`
- [ ] **Step 1: Register the new module in `src/compat/mod.rs`**
Add the module declaration in alphabetical order. In `src/compat/mod.rs`, insert after `pub mod event_bridge;`:
```rust
pub mod lineloss_xlsx_export;
```
The full file becomes:
```rust
pub mod artifact_open;
pub mod browser_script_skill_tool;
pub mod browser_tool_adapter;
pub mod config_adapter;
pub mod cron_adapter;
pub mod deterministic_submit;
pub mod direct_skill_runtime;
pub mod event_bridge;
pub mod lineloss_xlsx_export;
pub mod memory_adapter;
pub mod openxml_office_tool;
pub mod orchestration;
pub mod runtime;
pub mod screen_html_export_tool;
pub mod tq_lineloss;
pub mod workflow_executor;
```
- [ ] **Step 2: Write the failing test for XLSX generation**
Create `tests/lineloss_xlsx_export_test.rs`:
```rust
use std::fs;
use std::path::PathBuf;
use serde_json::json;
use sgclaw::compat::lineloss_xlsx_export::{export_lineloss_xlsx, LinelossExportRequest};
fn temp_output_path(name: &str) -> PathBuf {
let dir = std::env::temp_dir().join("sgclaw-test-xlsx");
fs::create_dir_all(&dir).unwrap();
dir.join(name)
}
#[test]
fn export_month_lineloss_produces_valid_xlsx() {
let output_path = temp_output_path("month-test.xlsx");
if output_path.exists() {
fs::remove_file(&output_path).unwrap();
}
let request = LinelossExportRequest {
sheet_name: "国网兰州供电公司月度线损分析报表(2026-03)".to_string(),
column_defs: vec![
("ORG_NAME".to_string(), "供电单位".to_string()),
("YGDL".to_string(), "累计供电量".to_string()),
("YYDL".to_string(), "累计售电量".to_string()),
("YXSL".to_string(), "线损完成率(%)".to_string()),
("RAT_SCOPE".to_string(), "线损率累计目标值".to_string()),
("BLANK3".to_string(), "目标完成率".to_string()),
("BLANK2".to_string(), "排行".to_string()),
],
rows: vec![
serde_json::from_value(json!({
"ORG_NAME": "城关供电",
"YGDL": "12345.67",
"YYDL": "11234.56",
"YXSL": "9.00",
"RAT_SCOPE": "9.50",
"BLANK3": "94.74",
"BLANK2": "1"
}))
.unwrap(),
serde_json::from_value(json!({
"ORG_NAME": "七里河供电",
"YGDL": "9876.54",
"YYDL": "8765.43",
"YXSL": "11.24",
"RAT_SCOPE": "10.00",
"BLANK3": "112.40",
"BLANK2": "2"
}))
.unwrap(),
],
output_path: output_path.clone(),
};
let result_path = export_lineloss_xlsx(&request).unwrap();
assert_eq!(result_path, output_path);
assert!(output_path.exists());
// Verify it's a valid ZIP (xlsx is a zip archive)
let file = fs::File::open(&output_path).unwrap();
let mut archive = zip::ZipArchive::new(file).unwrap();
// Must contain the standard OpenXML entries
let entry_names: Vec<String> = (0..archive.len())
.map(|i| archive.by_index(i).unwrap().name().to_string())
.collect();
assert!(entry_names.contains(&"[Content_Types].xml".to_string()));
assert!(entry_names.contains(&"xl/worksheets/sheet1.xml".to_string()));
assert!(entry_names.contains(&"xl/workbook.xml".to_string()));
// Read sheet1.xml and verify it contains our data
let mut sheet = archive.by_name("xl/worksheets/sheet1.xml").unwrap();
let mut xml = String::new();
std::io::Read::read_to_string(&mut sheet, &mut xml).unwrap();
assert!(xml.contains("供电单位"), "header row should contain 供电单位");
assert!(xml.contains("累计供电量"), "header row should contain 累计供电量");
assert!(xml.contains("城关供电"), "data should contain 城关供电");
assert!(xml.contains("12345.67"), "data should contain 12345.67");
assert!(xml.contains("七里河供电"), "data should contain second row");
// Cleanup
fs::remove_file(&output_path).unwrap();
}
#[test]
fn export_empty_rows_returns_error() {
let output_path = temp_output_path("empty-test.xlsx");
let request = LinelossExportRequest {
sheet_name: "test".to_string(),
column_defs: vec![("A".to_string(), "ColA".to_string())],
rows: vec![],
output_path: output_path.clone(),
};
let result = export_lineloss_xlsx(&request);
assert!(result.is_err());
assert!(
result.unwrap_err().to_string().contains("rows must not be empty"),
"should reject empty rows"
);
}
```
- [ ] **Step 3: Run the test to verify it fails**
Run: `cargo test --test lineloss_xlsx_export_test -- --nocapture`
Expected: compilation error — `lineloss_xlsx_export` module doesn't exist yet or `export_lineloss_xlsx` / `LinelossExportRequest` not defined.
- [ ] **Step 4: Implement `src/compat/lineloss_xlsx_export.rs`**
```rust
use std::fs;
use std::io::Write;
use std::path::{Path, PathBuf};
use serde_json::{Map, Value};
use zip::write::FileOptions;
use zip::{CompressionMethod, ZipWriter};
pub struct LinelossExportRequest {
pub sheet_name: String,
pub column_defs: Vec<(String, String)>,
pub rows: Vec<Map<String, Value>>,
pub output_path: PathBuf,
}
pub fn export_lineloss_xlsx(request: &LinelossExportRequest) -> anyhow::Result<PathBuf> {
if request.rows.is_empty() {
anyhow::bail!("rows must not be empty");
}
if request.column_defs.is_empty() {
anyhow::bail!("column_defs must not be empty");
}
let sheet_xml = build_worksheet_xml(&request.column_defs, &request.rows);
write_xlsx(
&request.output_path,
&request.sheet_name,
&sheet_xml,
)?;
Ok(request.output_path.clone())
}
fn build_worksheet_xml(
column_defs: &[(String, String)],
rows: &[Map<String, Value>],
) -> String {
let mut xml_rows = Vec::with_capacity(rows.len() + 1);
// Header row (row 1)
let header_cells: Vec<String> = column_defs
.iter()
.enumerate()
.map(|(col_idx, (_key, label))| {
let col_letter = column_letter(col_idx);
format!(
"<c r=\"{col_letter}1\" t=\"inlineStr\"><is><t>{}</t></is></c>",
xml_escape(label)
)
})
.collect();
xml_rows.push(format!("<row r=\"1\">{}</row>", header_cells.join("")));
// Data rows (row 2+)
for (row_idx, row) in rows.iter().enumerate() {
let excel_row = row_idx + 2;
let cells: Vec<String> = column_defs
.iter()
.enumerate()
.map(|(col_idx, (key, _label))| {
let col_letter = column_letter(col_idx);
let value = row
.get(key)
.map(|v| value_to_string(v))
.unwrap_or_default();
format!(
"<c r=\"{col_letter}{excel_row}\" t=\"inlineStr\"><is><t>{}</t></is></c>",
xml_escape(&value)
)
})
.collect();
xml_rows.push(format!("<row r=\"{excel_row}\">{}</row>", cells.join("")));
}
format!(
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\
<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">\
<sheetData>{}</sheetData>\
</worksheet>",
xml_rows.join("")
)
}
fn column_letter(index: usize) -> String {
let mut result = String::new();
let mut n = index;
loop {
result.insert(0, (b'A' + (n % 26) as u8) as char);
if n < 26 {
break;
}
n = n / 26 - 1;
}
result
}
fn value_to_string(value: &Value) -> String {
match value {
Value::String(text) => text.clone(),
Value::Number(number) => number.to_string(),
Value::Bool(flag) => flag.to_string(),
Value::Null => String::new(),
other => other.to_string(),
}
}
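fn xml_escape(value: &str) -> String {
    // `&` is escaped first so the entities produced below are not escaped again.
    // `"` is escaped as well because the sheet name is written into an XML attribute.
    value
        .replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}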
fn write_xlsx(output_path: &Path, sheet_name: &str, sheet_xml: &str) -> anyhow::Result<()> {
if let Some(parent) = output_path.parent() {
fs::create_dir_all(parent)?;
}
if output_path.exists() {
fs::remove_file(output_path)?;
}
let file = fs::File::create(output_path)?;
let mut zip = ZipWriter::new(file);
let options = FileOptions::default().compression_method(CompressionMethod::Stored);
zip.start_file("[Content_Types].xml", options)?;
zip.write_all(content_types_xml().as_bytes())?;
zip.start_file("_rels/.rels", options)?;
zip.write_all(root_rels_xml().as_bytes())?;
zip.start_file("docProps/app.xml", options)?;
zip.write_all(app_xml().as_bytes())?;
zip.start_file("docProps/core.xml", options)?;
zip.write_all(core_xml().as_bytes())?;
zip.start_file("xl/workbook.xml", options)?;
zip.write_all(workbook_xml(&xml_escape(sheet_name)).as_bytes())?;
zip.start_file("xl/_rels/workbook.xml.rels", options)?;
zip.write_all(workbook_rels_xml().as_bytes())?;
zip.start_file("xl/worksheets/sheet1.xml", options)?;
zip.write_all(sheet_xml.as_bytes())?;
zip.finish()?;
Ok(())
}
fn content_types_xml() -> &'static str {
r#"<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
<Default Extension="xml" ContentType="application/xml"/>
<Override PartName="/xl/workbook.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>
<Override PartName="/xl/worksheets/sheet1.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml"/>
<Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlformats-package.core-properties+xml"/>
<Override PartName="/docProps/app.xml" ContentType="application/vnd.openxmlformats-officedocument.extended-properties+xml"/>
</Types>"#
}
fn root_rels_xml() -> &'static str {
r#"<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="xl/workbook.xml"/>
<Relationship Id="rId2" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties" Target="docProps/core.xml"/>
<Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties" Target="docProps/app.xml"/>
</Relationships>"#
}
fn app_xml() -> &'static str {
r#"<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">
<Application>sgClaw</Application>
</Properties>"#
}
fn core_xml() -> &'static str {
r#"<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:dcmitype="http://purl.org/dc/dcmitype/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<dc:title>台区线损报表</dc:title>
</cp:coreProperties>"#
}
fn workbook_xml(sheet_name: &str) -> String {
format!(
r#"<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<sheets>
<sheet name="{sheet_name}" sheetId="1" r:id="rId1"/>
</sheets>
</workbook>"#
)
}
fn workbook_rels_xml() -> &'static str {
r#"<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" Target="worksheets/sheet1.xml"/>
</Relationships>"#
}
#[cfg(test)]
mod tests {
use super::column_letter;
#[test]
fn column_letter_maps_indices_correctly() {
assert_eq!(column_letter(0), "A");
assert_eq!(column_letter(1), "B");
assert_eq!(column_letter(6), "G");
assert_eq!(column_letter(25), "Z");
assert_eq!(column_letter(26), "AA");
}
}
```
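The `column_letter` helper in the listing above converts a zero-based column index into spreadsheet column letters. This is a bijective base-26 encoding: there is no zero digit, so `Z` (index 25) is followed by `AA` (index 26), which is why the loop subtracts 1 after each division. The sketch below repeats the function unchanged so that the rollover cases beyond the single `AA` assertion can be checked on their own.

```rust
/// Zero-based column index to spreadsheet letters (0 -> "A", 26 -> "AA").
fn column_letter(index: usize) -> String {
    let mut result = String::new();
    let mut n = index;
    loop {
        // The least significant letter is computed first and prepended.
        result.insert(0, (b'A' + (n % 26) as u8) as char);
        if n < 26 {
            break;
        }
        // Bijective base-26: shift down by one after dividing.
        n = n / 26 - 1;
    }
    result
}
```

The report in this plan has only seven columns, so only `A` through `G` are exercised in practice.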
- [ ] **Step 5: Run the tests to verify they pass**
Run: `cargo test --test lineloss_xlsx_export_test -- --nocapture`
Expected: both `export_month_lineloss_produces_valid_xlsx` and `export_empty_rows_returns_error` PASS.
Also run the internal unit test:
Run: `cargo test lineloss_xlsx_export -- --nocapture`
Expected: `column_letter_maps_indices_correctly` PASS.
- [ ] **Step 6: Commit**
```bash
git add src/compat/lineloss_xlsx_export.rs src/compat/mod.rs tests/lineloss_xlsx_export_test.rs
git commit -m "feat(lineloss): add Rust-side XLSX generation for lineloss reports"
```
---
### Task 2: Integrate XLSX Export into `deterministic_submit.rs`
**Files:**
- Modify: `src/compat/deterministic_submit.rs`
- [ ] **Step 1: Add imports and helper function to extract export data from artifact**
At the top of `src/compat/deterministic_submit.rs`, add the import:
```rust
use crate::compat::lineloss_xlsx_export::{export_lineloss_xlsx, LinelossExportRequest};
```
Then add a new helper function after `summarize_lineloss_artifact`:
```rust
struct LinelossArtifactExportData {
sheet_name: String,
column_defs: Vec<(String, String)>,
rows: Vec<Map<String, Value>>,
}
fn extract_export_data(output: &str) -> Option<LinelossArtifactExportData> {
let payload: Value = serde_json::from_str(output).ok()?;
let artifact = payload
.as_object()
.and_then(|object| object.get("text"))
.unwrap_or(&payload);
let artifact = artifact.as_object()?;
if artifact.get("type").and_then(Value::as_str) != Some("report-artifact") {
return None;
}
let status = artifact.get("status").and_then(Value::as_str).unwrap_or("");
if !matches!(status, "ok" | "partial") {
return None;
}
let rows = artifact
.get("rows")
.and_then(Value::as_array)?;
if rows.is_empty() {
return None;
}
let rows: Vec<Map<String, Value>> = rows
.iter()
.filter_map(|row| row.as_object().cloned())
.collect();
if rows.is_empty() {
return None;
}
let column_defs: Vec<(String, String)> = artifact
.get("column_defs")
.and_then(Value::as_array)
.map(|defs| {
defs.iter()
.filter_map(|def| {
let arr = def.as_array()?;
let key = arr.first()?.as_str()?.to_string();
let label = arr.get(1)?.as_str()?.to_string();
Some((key, label))
})
.collect()
})
.unwrap_or_default();
// Fallback: if column_defs not in artifact, try "columns" array as keys
let column_defs = if column_defs.is_empty() {
let columns = artifact
.get("columns")
.and_then(Value::as_array)?;
columns
.iter()
.filter_map(|col| {
let key = col.as_str()?.to_string();
Some((key.clone(), key))
})
.collect()
} else {
column_defs
};
if column_defs.is_empty() {
return None;
}
let org_label = artifact
.get("org")
.and_then(Value::as_object)
.and_then(|org| org.get("label"))
.and_then(Value::as_str)
.unwrap_or("lineloss");
let period_mode = artifact
.get("period")
.and_then(Value::as_object)
.and_then(|p| p.get("mode"))
.and_then(Value::as_str)
.unwrap_or("month");
let period_value = artifact
.get("period")
.and_then(Value::as_object)
.and_then(|p| p.get("value"))
.and_then(Value::as_str)
.unwrap_or("");
let mode_label = if period_mode == "week" { "周度" } else { "月度" };
let sheet_name = format!("{org_label}{mode_label}线损分析报表({period_value})");
Some(LinelossArtifactExportData {
sheet_name,
column_defs,
rows,
})
}
```
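The sheet name is assembled from values supplied by the JS artifact. Excel restricts worksheet names to 31 characters and rejects the characters `: \ / ? * [ ]`, so a long or unusual `org_label` could yield a workbook that Excel refuses to open cleanly. The sketch below shows one possible guard. `sanitize_sheet_name` and `build_sheet_name` are hypothetical helpers, not part of the plan, and the length is counted in Unicode scalar values, which matches Excel's count only for characters in the Basic Multilingual Plane.

```rust
/// Replaces characters Excel forbids in sheet names and truncates to 31 chars.
fn sanitize_sheet_name(raw: &str) -> String {
    const FORBIDDEN: [char; 7] = [':', '\\', '/', '?', '*', '[', ']'];
    let cleaned: String = raw
        .chars()
        .map(|c| if FORBIDDEN.contains(&c) { '_' } else { c })
        .take(31)
        .collect();
    if cleaned.is_empty() {
        // Fallback name chosen for this sketch only.
        "Sheet1".to_string()
    } else {
        cleaned
    }
}

/// Mirrors the sheet-name format used in `extract_export_data`, then sanitizes it.
fn build_sheet_name(org_label: &str, period_mode: &str, period_value: &str) -> String {
    let mode_label = if period_mode == "week" { "周度" } else { "月度" };
    sanitize_sheet_name(&format!(
        "{org_label}{mode_label}线损分析报表({period_value})"
    ))
}
```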
- [ ] **Step 2: Add the export-after-collection function**
Add a new function that wraps the existing flow with XLSX export:
```rust
fn try_export_lineloss_xlsx(
output: &str,
workspace_root: &Path,
) -> Option<PathBuf> {
let data = extract_export_data(output)?;
let nanos = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.map(|d| d.as_nanos())
.unwrap_or_default();
let out_dir = workspace_root.join("out");
let output_path = out_dir.join(format!("tq-lineloss-{nanos}.xlsx"));
let request = LinelossExportRequest {
sheet_name: data.sheet_name,
column_defs: data.column_defs,
rows: data.rows,
output_path,
};
match export_lineloss_xlsx(&request) {
Ok(path) => {
eprintln!("[deterministic_submit] XLSX exported to: {}", path.display());
Some(path)
}
Err(err) => {
eprintln!("[deterministic_submit] XLSX export failed: {err}");
None
}
}
}
```
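The output file name is derived from the current time in nanoseconds since the Unix epoch, which makes collisions between consecutive exports unlikely without tracking any state. Splitting the path construction from the clock read makes the naming rule easy to test. The function names `lineloss_output_path` and `now_nanos` below are hypothetical and shown only as a sketch of that split.

```rust
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Builds `<workspace_root>/out/tq-lineloss-<nanos>.xlsx`.
fn lineloss_output_path(workspace_root: &Path, nanos: u128) -> PathBuf {
    workspace_root
        .join("out")
        .join(format!("tq-lineloss-{nanos}.xlsx"))
}

/// Nanoseconds since the Unix epoch; falls back to 0 if the clock is before it.
fn now_nanos() -> u128 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or_default()
}
```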
- [ ] **Step 3: Modify `execute_deterministic_submit_with_browser_backend` to call export**
Replace the body of `execute_deterministic_submit_with_browser_backend` (lines 119-136 of the original file):
```rust
pub fn execute_deterministic_submit_with_browser_backend(
browser_backend: Arc<dyn BrowserBackend>,
plan: &DeterministicExecutionPlan,
workspace_root: &Path,
settings: &SgClawSettings,
) -> Result<DirectSubmitOutcome, PipeError> {
let args = deterministic_submit_args(plan);
let output =
crate::compat::direct_skill_runtime::execute_browser_script_skill_raw_output_with_browser_backend(
browser_backend,
&plan.tool_name,
workspace_root,
settings,
args,
)?;
let export_path = try_export_lineloss_xlsx(&output, workspace_root);
Ok(summarize_lineloss_output_with_export(&output, export_path.as_deref()))
}
```
Apply the same change to `execute_deterministic_submit` (the non-backend variant, lines 101-117):
```rust
pub fn execute_deterministic_submit<T: Transport + 'static>(
browser_tool: BrowserPipeTool<T>,
plan: &DeterministicExecutionPlan,
workspace_root: &Path,
settings: &SgClawSettings,
) -> Result<DirectSubmitOutcome, PipeError> {
let args = deterministic_submit_args(plan);
let output = crate::compat::direct_skill_runtime::execute_browser_script_skill_raw_output(
browser_tool,
&plan.tool_name,
workspace_root,
settings,
args,
)?;
let export_path = try_export_lineloss_xlsx(&output, workspace_root);
Ok(summarize_lineloss_output_with_export(&output, export_path.as_deref()))
}
```
- [ ] **Step 4: Add `summarize_lineloss_output_with_export` function**
Add this new function. It wraps the existing `summarize_lineloss_output` and appends the export path:
```rust
fn summarize_lineloss_output_with_export(output: &str, export_path: Option<&Path>) -> DirectSubmitOutcome {
let mut outcome = summarize_lineloss_output(output);
if let Some(path) = export_path {
outcome.summary.push_str(&format!(" export_path={}", path.display()));
}
outcome
}
```
- [ ] **Step 5: Run existing tests to ensure nothing breaks**
Run: `cargo test --test deterministic_submit_test -- --nocapture`
Expected: all existing tests PASS (the tests don't call `execute_deterministic_submit`, they test `decide_deterministic_submit` and parsing logic which is unchanged).
Run: `cargo test deterministic_submit -- --nocapture`
Expected: PASS.
- [ ] **Step 6: Commit**
```bash
git add src/compat/deterministic_submit.rs
git commit -m "feat(lineloss): integrate Rust-side XLSX export into deterministic submit pipeline"
```
---
### Task 3: Modify `collect_lineloss.js` to Skip Browser-Side Export
**Files:**
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js`
- [ ] **Step 1: Add `column_defs` to the artifact returned by `buildArtifact`**
In the `buildArtifact` function (around line 198), the `columns` field currently contains just column keys (e.g., `["ORG_NAME", "YGDL", ...]`). Add a `column_defs` field that includes the full key+label pairs. Change the `buildArtifact` function to also accept and emit `column_defs`:
Find this block in `buildArtifact` (lines 198-242):
```javascript
function buildArtifact({
status,
blockedReason = '',
fatalError = '',
org_label = '',
org_code = '',
period_mode = '',
period_mode_code = '',
period_value = '',
period_payload = {},
columns = [],
rows = [],
export: exportState,
reasons = []
}) {
```
Replace with:
```javascript
function buildArtifact({
status,
blockedReason = '',
fatalError = '',
org_label = '',
org_code = '',
period_mode = '',
period_mode_code = '',
period_value = '',
period_payload = {},
columns = [],
column_defs = [],
rows = [],
export: exportState,
reasons = []
}) {
```
In the returned object (the `return { ... }` block inside `buildArtifact`), add `column_defs` after `columns`:
```javascript
columns: [...columns],
column_defs: [...column_defs],
rows: [...rows],
```
- [ ] **Step 2: Pass `column_defs` from `buildBrowserEntrypointResult`**
In `buildBrowserEntrypointResult`, after the `columns` assignment (around line 452), add:
```javascript
const columns = normalizedArgs.period_mode === 'week' ? WEEK_COLUMNS : MONTH_COLUMNS;
const columnDefs = normalizedArgs.period_mode === 'week' ? WEEK_COLUMN_DEFS : MONTH_COLUMN_DEFS;
```
Then in every call to `buildArtifact` inside `buildBrowserEntrypointResult` that follows the `columns` assignment, add `column_defs: columnDefs` alongside `columns`. There are 4 such calls:
**Call 1** (API error, around line 466):
```javascript
columns,
column_defs: columnDefs,
rows: [],
```
**Call 2** (empty rows, around line 483):
```javascript
columns,
column_defs: columnDefs,
rows: []
```
**Call 3** (normalization failure, around line 497):
```javascript
columns,
column_defs: columnDefs,
rows: [],
```
**Call 4** (success, around line 558):
```javascript
columns,
column_defs: columnDefs,
rows,
```
Note: the two `buildArtifact` calls before the `columns` variable is assigned (validation failure and page context failure, around lines 422 and 439) don't need `column_defs` since they don't have data.
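As a rough illustration of why the Rust side wants key+label pairs rather than bare keys, the sketch below turns `column_defs` and `rows` into a header row plus ordered data rows. The `ColumnDef` field names, the `build_sheet_rows` function and the string-only cell type are assumptions for illustration; the real artifact is JSON and the real exporter may differ.

```rust
use std::collections::HashMap;

/// One entry of `column_defs`: the row key plus a human-readable label.
/// The field names here are assumptions for illustration.
struct ColumnDef {
    key: String,
    label: String,
}

/// Builds a header row from the labels, then one output row per record,
/// ordering cells by `column_defs` and leaving missing keys empty.
fn build_sheet_rows(
    column_defs: &[ColumnDef],
    rows: &[HashMap<String, String>],
) -> Vec<Vec<String>> {
    let mut sheet: Vec<Vec<String>> = Vec::with_capacity(rows.len() + 1);
    sheet.push(column_defs.iter().map(|c| c.label.clone()).collect());
    for row in rows {
        sheet.push(
            column_defs
                .iter()
                .map(|c| row.get(&c.key).cloned().unwrap_or_default())
                .collect(),
        );
    }
    sheet
}
```

With only `columns` (keys), the header row would have to show raw keys such as `ORG_NAME`; carrying the labels in the artifact avoids duplicating that mapping in Rust.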
- [ ] **Step 3: Remove the `exportWorkbook` and `writeReportLog` calls from the success path**
In `buildBrowserEntrypointResult`, replace the entire export block (lines 518-556) with a simplified version:
Find:
```javascript
const exportState = {
attempted: false,
status: 'skipped',
message: null
};
if (typeof deps.exportWorkbook === 'function') {
exportState.attempted = true;
try {
const exportPayload = buildExportPayload({
mode: normalizedArgs.period_mode,
orgLabel: normalizedArgs.org_label,
periodValue: normalizedArgs.period_value,
rows
});
const exportResult = await deps.exportWorkbook(exportPayload);
const exportPath = pickFirstNonEmpty(exportResult?.path, exportResult?.data?.path, exportResult?.data?.data);
if (!exportPath) {
throw new Error('export_failed');
}
exportState.status = 'ok';
exportState.message = exportPath;
if (typeof deps.writeReportLog === 'function') {
try {
const reportLog = await deps.writeReportLog(buildReportName(normalizedArgs), exportPath);
if (reportLog?.success === false) {
reasons.push('report_log_failed');
}
} catch (_error) {
reasons.push('report_log_failed');
}
}
} catch (error) {
reasons.push('export_failed');
exportState.status = 'failed';
exportState.message = pickFirstNonEmpty(error?.message, 'export_failed');
}
}
```
Replace with:
```javascript
// Export is handled by Rust side after receiving the artifact.
// JS only provides rows + column_defs in the artifact.
const exportState = {
attempted: false,
status: 'deferred_to_rust',
message: null
};
```
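On the receiving side, the new `deferred_to_rust` status can be treated as the signal to run the Rust exporter. The sketch below is one possible reading, not the project's code: `ExportStatus`, `parse_export_status` and the `should_export_in_rust` policy (export only when rows exist) are all hypothetical names and assumptions.

```rust
/// Export states that appear in the artifact's `export.status` field
/// in the JavaScript shown above.
#[derive(Debug, PartialEq, Eq)]
enum ExportStatus {
    Skipped,
    Ok,
    Failed,
    DeferredToRust,
}

/// Maps the raw status string to the enum; unknown values yield `None`.
fn parse_export_status(raw: &str) -> Option<ExportStatus> {
    match raw {
        "skipped" => Some(ExportStatus::Skipped),
        "ok" => Some(ExportStatus::Ok),
        "failed" => Some(ExportStatus::Failed),
        "deferred_to_rust" => Some(ExportStatus::DeferredToRust),
        _ => None,
    }
}

/// Hypothetical policy: export in Rust only when JS deferred and rows exist.
fn should_export_in_rust(status: &ExportStatus, row_count: usize) -> bool {
    *status == ExportStatus::DeferredToRust && row_count > 0
}
```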
- [ ] **Step 4: Remove unused constants and functions**
Remove these constants (lines 5-6) since they are no longer called from JS:
```javascript
const EXPORT_SERVICE_URL = 'http://localhost:13313/SurfaceServices/personalBread/export/faultDetailsExportXLSX';
const REPORT_LOG_URL = 'http://localhost:13313/ReportServices/Api/setReportLog';
```
Remove the `postJson` function (lines 264-294) — it is no longer needed since no JS-side HTTP calls are made to localhost.
Remove these functions from `defaultBrowserDeps()`:
- `exportWorkbook` (lines 350-373)
- `writeReportLog` (lines 375-409)
Remove these now-unused functions:
- `buildExportTitles` (lines 244-254)
- `buildExportPayload` (lines 256-262)
- `buildReportName` (lines 413-415)
- [ ] **Step 5: Update the module.exports to remove unused exports**
Update the `module.exports` block (lines 572-586), dropping any export that refers to code removed in Step 4. `buildBrowserEntrypointResult` stays exported for test compatibility. The final exports block:
```javascript
if (typeof module !== 'undefined' && module.exports) {
module.exports = {
MONTH_COLUMNS,
WEEK_COLUMNS,
MONTH_COLUMN_DEFS,
WEEK_COLUMN_DEFS,
validateArgs,
buildMonthRequest,
buildWeekRequest,
normalizeRows,
determineArtifactStatus,
buildArtifact,
buildBrowserEntrypointResult
};
} else {
return buildBrowserEntrypointResult(args);
}
```
- [ ] **Step 6: Verify the JS file has no syntax errors**
Run: `node -c "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js"`
Expected: no syntax errors. (Note: the file uses `return` at top level inside a wrapped IIFE when injected into the browser, so Node syntax check may warn — the important thing is no parse errors.)
Alternatively, check the test file still works:
Run: `node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"`
Expected: tests pass (or at least no JS parse errors).
- [ ] **Step 7: Commit**
```bash
git add "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js"
git commit -m "feat(lineloss): remove browser-side export, defer to Rust-side XLSX generation"
```
---
### Task 4: Full Build Verification
**Files:** None (verification only)
- [ ] **Step 1: Run full cargo build**
Run: `cargo build`
Expected: successful compilation with no errors.
- [ ] **Step 2: Run all tests**
Run: `cargo test -- --nocapture`
Expected: all tests pass, including:
- `lineloss_xlsx_export_test::export_month_lineloss_produces_valid_xlsx`
- `lineloss_xlsx_export_test::export_empty_rows_returns_error`
- `lineloss_xlsx_export::tests::column_letter_maps_indices_correctly`
- All existing `deterministic_submit_test` tests
- [ ] **Step 3: Commit (if any fixups needed)**
Only if compilation or test fixes were required in this step.


@@ -0,0 +1,117 @@
# Helper Page Lifecycle Fix v2 — Same-Connection Close + Open
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Prevent orphaned helper pages across process restarts by closing existing ones before opening new ones, all on the same WebSocket connection.
**Architecture:** In `bootstrap_helper_page`, after registering with the browser WS, send `sgHideBrowerserClosePage` (best-effort, silently ignored if no page exists), then send `sgHideBrowerserOpenPage`. Change `use_hidden_domain` to `true`.
**Tech Stack:** Rust, tungstenite, SuperRPA browser WS protocol
---
### Task 1: Add close-before-open in bootstrap_helper_page
**Files:**
- Modify: `src/browser/callback_host.rs:345-374` (bootstrap_helper_page function)
- [ ] **Step 1: Add close command before open command in bootstrap_helper_page**
Replace the current `bootstrap_helper_page` function. After `recv_bootstrap_prelude`, send the close command first, then the open command:
```rust
fn bootstrap_helper_page(
browser_ws_url: &str,
request_url: &str,
helper_url: &str,
use_hidden_domain: bool,
) -> Result<(), PipeError> {
let (mut websocket, _) = connect(browser_ws_url)
.map_err(|err| PipeError::Protocol(format!("browser websocket connect failed: {err}")))?;
configure_bootstrap_socket(&mut websocket)?;
websocket
.send(Message::Text(
r#"{"type":"register","role":"web"}"#.to_string().into(),
))
.map_err(|err| PipeError::Protocol(format!("browser websocket register failed: {err}")))?;
let _ = recv_bootstrap_prelude(&mut websocket);
// Close any orphaned helper page from a previous process run.
// Best-effort: if no page exists, the browser silently ignores this.
let (open_action, close_action) = if use_hidden_domain {
("sgHideBrowerserOpenPage", "sgHideBrowerserClosePage")
} else {
("sgBrowerserOpenPage", "sgBrowserClosePage")
};
let close_payload = json!([request_url, close_action, helper_url]).to_string();
let _ = websocket.send(Message::Text(close_payload.into()));
let payload = json!([
request_url,
open_action,
helper_url,
])
.to_string();
websocket
.send(Message::Text(payload.into()))
.map_err(|err| PipeError::Protocol(format!("helper bootstrap send failed: {err}")))?;
Ok(())
}
```
Key changes from current code:
- After `recv_bootstrap_prelude`, add the close command (best-effort, ignore errors)
- Compute both `open_action` and `close_action` from `use_hidden_domain` flag
- Send close first, then open on the same WebSocket connection
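The frame ordering described in the key changes above can be sketched without a WebSocket. `helper_page_actions` and `bootstrap_frames` are hypothetical helper names; the action strings are copied from the plan, and the `{:?}` quoting is a shortcut standing in for the `json!` macro used in the real function.

```rust
/// Returns `(open_action, close_action)` for the chosen domain mode.
/// The action names are copied verbatim from the plan.
fn helper_page_actions(use_hidden_domain: bool) -> (&'static str, &'static str) {
    if use_hidden_domain {
        ("sgHideBrowerserOpenPage", "sgHideBrowerserClosePage")
    } else {
        ("sgBrowerserOpenPage", "sgBrowserClosePage")
    }
}

/// Builds the frames in send order: close first, then open.
/// `{:?}` quoting is only adequate for plain ASCII URLs; it is used here
/// to keep the sketch dependency-free.
fn bootstrap_frames(request_url: &str, helper_url: &str, use_hidden_domain: bool) -> Vec<String> {
    let (open_action, close_action) = helper_page_actions(use_hidden_domain);
    [close_action, open_action]
        .iter()
        .map(|action| format!("[{:?},{:?},{:?}]", request_url, action, helper_url))
        .collect()
}
```

Keeping the action selection in one place means the open and close commands cannot drift into different domain modes.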
- [ ] **Step 2: Change `use_hidden_domain` to `true` in server.rs**
In `src/service/server.rs`, at the `start_with_browser_ws_url` call, change `false` to `true`:
```rust
match LiveBrowserCallbackHost::start_with_browser_ws_url(
browser_ws_url,
&bootstrap_url,
Duration::from_secs(15),
BROWSER_RESPONSE_TIMEOUT,
true, // use_hidden_domain: hidden domain for invisible helper
) {
```
- [ ] **Step 3: Build**
Run: `cargo build 2>&1`
Expected: 0 errors.
- [ ] **Step 4: Run callback_host tests**
Run: `cargo test --lib -- callback_host 2>&1`
Expected: 12 tests pass (including `live_callback_host_sends_bootstrap_open_page_command` which still checks for `sgBrowerserOpenPage` because the test passes `false`, and `live_callback_host_hidden_domain_sends_hide_open_page_command` which passes `true`).
Note: The test passes `false` for `use_hidden_domain`, so the close command will use `sgBrowserClosePage`. The test's fake WebSocket server will receive both the close and open frames. The test only checks that `sgBrowerserOpenPage` is present, which is still true.
- [ ] **Step 5: Commit**
```bash
git add src/browser/callback_host.rs src/service/server.rs
git commit -m "fix(callback_host): close orphaned helper page before opening new one on same WS"
```
---
### Task 2: Full verification
**Files:** None (verification only)
- [ ] **Step 1: Full test suite**
Run: `cargo test 2>&1`
Expected: All tests pass except pre-existing `lineloss_period_resolver_prompts_for_missing_period` failure.
- [ ] **Step 2: Verify key behavioral changes**
Manually confirm:
1. `bootstrap_helper_page` sends close command before open command (both on same WS connection)
2. `use_hidden_domain` is `true` in `server.rs` — helper page opens in hidden domain
3. `Drop for LiveBrowserCallbackHost` remains simple (shutdown only, no close attempt)
4. `cached_host` is still in `mod.rs` outer loop (process-internal deduplication)


@@ -0,0 +1,475 @@
# Helper Page Lifecycle Fix & Hidden Domain Support — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Fix duplicate browser-helper.html pages caused by WebSocket reconnections, add cleanup on Drop, and introduce a config switch for hidden-domain page opening.
**Architecture:** Three changes: (1) lift `cached_host` from `serve_client()` to the outer `run()` loop so reconnections share one host, (2) enhance `Drop for LiveBrowserCallbackHost` to send a close-page command via browser WS, (3) add `use_hidden_domain: bool` parameter that selects between `sgBrowerserOpenPage`/`sgHideBrowerserOpenPage` and their corresponding close APIs.
**Tech Stack:** Rust, tungstenite WebSocket crate, SuperRPA browser WS protocol
---
### Task 1: Add `use_hidden_domain` field and update `bootstrap_helper_page`
**Files:**
- Modify: `src/browser/callback_host.rs:28` (constant), `:44-51` (struct), `:215-252` (constructor), `:340-359` (bootstrap fn)
- [ ] **Step 1: Change `HELPER_BOOTSTRAP_ACTION` from constant to a function of `use_hidden_domain`**
Replace the constant and update `bootstrap_helper_page` to accept and use the flag:
```rust
// DELETE this line:
// const HELPER_BOOTSTRAP_ACTION: &str = "sgBrowerserOpenPage";
// REPLACE bootstrap_helper_page signature and body:
fn bootstrap_helper_page(
browser_ws_url: &str,
request_url: &str,
helper_url: &str,
use_hidden_domain: bool,
) -> Result<(), PipeError> {
let (mut websocket, _) = connect(browser_ws_url)
.map_err(|err| PipeError::Protocol(format!("browser websocket connect failed: {err}")))?;
configure_bootstrap_socket(&mut websocket)?;
websocket
.send(Message::Text(
r#"{"type":"register","role":"web"}"#.to_string().into(),
))
.map_err(|err| PipeError::Protocol(format!("browser websocket register failed: {err}")))?;
let _ = recv_bootstrap_prelude(&mut websocket);
let open_action = if use_hidden_domain {
"sgHideBrowerserOpenPage"
} else {
"sgBrowerserOpenPage"
};
let payload = json!([
request_url,
open_action,
helper_url,
])
.to_string();
websocket
.send(Message::Text(payload.into()))
.map_err(|err| PipeError::Protocol(format!("helper bootstrap send failed: {err}")))?;
Ok(())
}
```
- [ ] **Step 2: Add new fields to `LiveBrowserCallbackHost`**
```rust
#[derive(Debug)]
pub(crate) struct LiveBrowserCallbackHost {
host: Arc<BrowserCallbackHost>,
shutdown: Arc<AtomicBool>,
server_thread: Mutex<Option<JoinHandle<()>>>,
command_lock: Mutex<()>,
result_timeout: Duration,
browser_ws_url: String,
use_hidden_domain: bool,
}
```
- [ ] **Step 3: Update `start_with_browser_ws_url` to accept and store the new parameter**
```rust
impl LiveBrowserCallbackHost {
pub(crate) fn start_with_browser_ws_url(
browser_ws_url: &str,
bootstrap_request_url: &str,
ready_timeout: Duration,
result_timeout: Duration,
use_hidden_domain: bool,
) -> Result<Self, PipeError> {
let listener = TcpListener::bind("127.0.0.1:0").map_err(|err| {
PipeError::Protocol(format!("failed to bind callback host listener: {err}"))
})?;
listener.set_nonblocking(true).map_err(|err| {
PipeError::Protocol(format!("failed to configure callback host listener: {err}"))
})?;
let origin = format!(
"http://{}",
listener.local_addr().map_err(|err| {
PipeError::Protocol(format!(
"failed to resolve callback host listener address: {err}"
))
})?
);
let host = Arc::new(BrowserCallbackHost::with_urls(&origin, browser_ws_url));
let shutdown = Arc::new(AtomicBool::new(false));
let thread_host = host.clone();
let thread_shutdown = shutdown.clone();
let server_thread = thread::spawn(move || serve_loop(listener, thread_host, thread_shutdown));
bootstrap_helper_page(browser_ws_url, bootstrap_request_url, host.helper_url(), use_hidden_domain)?;
wait_for_helper_ready(host.as_ref(), ready_timeout)?;
let live_host = Self {
host,
shutdown,
server_thread: Mutex::new(Some(server_thread)),
command_lock: Mutex::new(()),
result_timeout,
browser_ws_url: browser_ws_url.to_string(),
use_hidden_domain,
};
Ok(live_host)
}
```
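The listener setup in this constructor relies on binding to port `0` so the OS picks a free port, then deriving the callback origin from the resulting address. A minimal standalone sketch of just that step follows; `bind_ephemeral_origin` is an illustrative name, not part of the codebase.

```rust
use std::net::TcpListener;

/// Binds to an OS-assigned loopback port and returns the listener plus
/// the `http://host:port` origin derived from its local address.
fn bind_ephemeral_origin() -> std::io::Result<(TcpListener, String)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let origin = format!("http://{}", listener.local_addr()?);
    Ok((listener, origin))
}
```

The listener must stay alive for as long as the origin is advertised, which is why the sketch returns both values together.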
- [ ] **Step 4: Fix the inline test struct literal that constructs `LiveBrowserCallbackHost` directly**
In the `live_callback_host_treats_simulated_mouse_command_as_fire_and_forget` test (around line 1110), add the new fields:
```rust
let host = LiveBrowserCallbackHost {
host: Arc::new(BrowserCallbackHost::new()),
shutdown: Arc::new(AtomicBool::new(false)),
server_thread: Mutex::new(None),
command_lock: Mutex::new(()),
result_timeout: Duration::from_millis(10),
browser_ws_url: "ws://127.0.0.1:12345".to_string(),
use_hidden_domain: false,
};
```
- [ ] **Step 5: Run build to verify compilation**
Run: `cargo build 2>&1`
Expected: 0 errors. The `HELPER_BOOTSTRAP_ACTION` constant removal and signature changes should all be internally consistent.
- [ ] **Step 6: Run tests to verify existing behavior is preserved**
Run: `cargo test -- callback_host 2>&1`
Expected: All existing callback_host tests pass (including `live_callback_host_sends_bootstrap_open_page_command` which still checks for `sgBrowerserOpenPage` since no caller passes `true` yet).
- [ ] **Step 7: Commit**
```bash
git add src/browser/callback_host.rs
git commit -m "feat(callback_host): add use_hidden_domain param to bootstrap_helper_page"
```
---
### Task 2: Enhance `Drop` to close the helper page
**Files:**
- Modify: `src/browser/callback_host.rs:321-328` (Drop impl)
- [ ] **Step 1: Add `close_helper_page` helper function**
Add this function near `bootstrap_helper_page` (after line ~360):
```rust
/// Best-effort attempt to close the helper page tab via browser WebSocket.
/// Silently ignores all errors — this runs during Drop and must not panic.
fn close_helper_page(browser_ws_url: &str, helper_url: &str, use_hidden_domain: bool) {
let close_action = if use_hidden_domain {
"sgHideBrowerserClosePage"
} else {
"sgBrowserClosePage"
};
let result: Result<(), Box<dyn std::error::Error>> = (|| {
// Use a raw TcpStream with timeouts instead of tungstenite::connect
// which does not expose a connection timeout.
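// Note: this assumes the URL is a bare `ws://IP:port` with no path and no
// hostname; otherwise the `SocketAddr` parse below fails and the close is skipped.
let addr = browser_ws_url
.trim_start_matches("ws://")
.trim_start_matches("wss://");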
let stream = TcpStream::connect_timeout(
&addr.parse().map_err(|e| format!("addr parse: {e}"))?,
Duration::from_millis(100),
)?;
stream.set_read_timeout(Some(Duration::from_millis(200)))?;
stream.set_write_timeout(Some(Duration::from_millis(200)))?;
let (mut websocket, _) = tungstenite::client(
browser_ws_url,
stream,
)?;
websocket.send(Message::Text(
r#"{"type":"register","role":"web"}"#.to_string().into(),
))?;
// Drain the welcome prelude (best-effort, ignore timeout).
let _ = websocket.read();
let payload = json!([helper_url, close_action, helper_url]).to_string();
websocket.send(Message::Text(payload.into()))?;
Ok(())
})();
if let Err(err) = result {
eprintln!("close_helper_page best-effort failed (harmless): {err}");
}
}
```
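The address handling in `close_helper_page` only works for URLs of the form `ws://IP:port`. If the browser WS URL could ever carry a path, a slightly more defensive extraction might look like the sketch below. `ws_authority` and `ws_socket_addr` are hypothetical helpers, and hostnames such as `localhost` are deliberately not resolved here.

```rust
use std::net::SocketAddr;

/// Extracts the `host:port` authority from a `ws://` or `wss://` URL,
/// dropping any path or query. Returns `None` for other schemes.
fn ws_authority(url: &str) -> Option<&str> {
    let rest = url
        .strip_prefix("ws://")
        .or_else(|| url.strip_prefix("wss://"))?;
    let end = rest
        .find(|c: char| c == '/' || c == '?')
        .unwrap_or(rest.len());
    let authority = &rest[..end];
    if authority.is_empty() {
        None
    } else {
        Some(authority)
    }
}

/// Parses the authority as a literal socket address (no DNS lookup),
/// which is the form `TcpStream::connect_timeout` requires.
fn ws_socket_addr(url: &str) -> Option<SocketAddr> {
    ws_authority(url)?.parse().ok()
}
```

A `None` result corresponds to the "best-effort failed (harmless)" branch of the original function: the close is simply skipped.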
- [ ] **Step 2: Update `Drop for LiveBrowserCallbackHost` to call `close_helper_page`**
```rust
impl Drop for LiveBrowserCallbackHost {
fn drop(&mut self) {
// Best-effort: tell the browser to close the helper page tab.
close_helper_page(
&self.browser_ws_url,
self.host.helper_url(),
self.use_hidden_domain,
);
self.shutdown.store(true, Ordering::Relaxed);
if let Some(handle) = self.server_thread.lock().unwrap().take() {
let _ = handle.join();
}
}
}
```
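The shutdown half of this `Drop` impl (set the flag, then join the server thread) can be exercised in isolation. The sketch below uses a hypothetical `Worker` type in place of `LiveBrowserCallbackHost` and, as one possible variation, avoids `lock().unwrap()` so that a poisoned mutex cannot panic inside `drop`.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for a host that owns a background server thread.
struct Worker {
    shutdown: Arc<AtomicBool>,
    finished: Arc<AtomicBool>,
    thread: Mutex<Option<JoinHandle<()>>>,
}

impl Worker {
    fn start() -> Self {
        let shutdown = Arc::new(AtomicBool::new(false));
        let finished = Arc::new(AtomicBool::new(false));
        let (stop, done) = (shutdown.clone(), finished.clone());
        // The thread polls the shutdown flag, like a non-blocking serve loop.
        let handle = thread::spawn(move || {
            while !stop.load(Ordering::Relaxed) {
                thread::sleep(Duration::from_millis(5));
            }
            done.store(true, Ordering::Relaxed);
        });
        Worker { shutdown, finished, thread: Mutex::new(Some(handle)) }
    }
}

impl Drop for Worker {
    fn drop(&mut self) {
        // Signal first, then join, so the join cannot block forever.
        self.shutdown.store(true, Ordering::Relaxed);
        if let Ok(mut guard) = self.thread.lock() {
            if let Some(handle) = guard.take() {
                let _ = handle.join();
            }
        }
    }
}
```

Any extra best-effort work placed before the flag is set, such as the close-page call above, delays the join by however long that work takes.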
- [ ] **Step 3: Run build to verify compilation**
Run: `cargo build 2>&1`
Expected: 0 errors. `close_helper_page` uses types already imported (`TcpStream`, `Duration`, `json!`, `Message`).
- [ ] **Step 4: Run tests**
Run: `cargo test -- callback_host 2>&1`
Expected: All pass. The Drop enhancement is best-effort and the test helper constructs hosts with `server_thread: Mutex::new(None)`, so Drop completes cleanly.
- [ ] **Step 5: Commit**
```bash
git add src/browser/callback_host.rs
git commit -m "feat(callback_host): close helper page on Drop via browser WS"
```
---
### Task 3: Lift `cached_host` to outer loop and update `serve_client` signature
**Files:**
- Modify: `src/service/mod.rs:72-96` (run loop)
- Modify: `src/service/server.rs:232-241` (serve_client signature and cached_host init)
- [ ] **Step 1: Change `serve_client` to accept `cached_host` as a parameter**
In `src/service/server.rs`, change the function signature and remove the local `cached_host` variable:
```rust
pub fn serve_client(
context: &AgentRuntimeContext,
session: &ServiceSession,
sink: Arc<ServiceEventSink>,
browser_ws_url: &str,
mac_policy: &MacPolicy,
cached_host: &mut Option<Arc<LiveBrowserCallbackHost>>,
) -> Result<(), PipeError> {
// DELETE the line: let mut cached_host: Option<Arc<LiveBrowserCallbackHost>> = None;
loop {
// ... rest of function body unchanged, `cached_host` is now the parameter
```
The body references to `cached_host` remain identical — they just operate on the borrowed mutable reference instead of a local variable.
- [ ] **Step 2: Update `start_with_browser_ws_url` call to pass `false` for `use_hidden_domain`**
In `src/service/server.rs`, at the `LiveBrowserCallbackHost::start_with_browser_ws_url` call (around line 288), add the `false` argument:
```rust
match LiveBrowserCallbackHost::start_with_browser_ws_url(
browser_ws_url,
&bootstrap_url,
Duration::from_secs(15),
BROWSER_RESPONSE_TIMEOUT,
false, // use_hidden_domain: visible tab for now
) {
```
- [ ] **Step 3: Lift `cached_host` into `run()` in `mod.rs`**
In `src/service/mod.rs`, declare `cached_host` before the loop and pass it to `serve_client`:
```rust
// Add this import at the top of the function or file:
use crate::browser::callback_host::LiveBrowserCallbackHost;
// Before the loop (after line 64, after `let session = ...`):
let mut cached_host: Option<Arc<LiveBrowserCallbackHost>> = None;
loop {
let (stream, _) = listener.accept()?;
let websocket = accept(stream)
.map_err(|err| PipeError::Protocol(format!("service websocket accept failed: {err}")))?;
let sink = Arc::new(ServiceEventSink::from_websocket(websocket));
match session.try_attach_client() {
Ok(()) => {
let result = serve_client(
&runtime_context,
&session,
sink.clone(),
browser_ws_url,
&mac_policy,
&mut cached_host,
);
session.detach_client();
match result {
Ok(()) | Err(PipeError::PipeClosed) => {}
Err(err) => return Err(err),
}
}
Err(message) => {
sink.send_service_message(message)?;
}
}
}
```
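The reason lifting `cached_host` fixes the duplication is that the cache now outlives each client session. A minimal sketch of that pattern, with a hypothetical `Host` type and a `serve_one_client` function standing in for the real `serve_client`:

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `LiveBrowserCallbackHost`.
struct Host {
    id: u32,
}

/// Simulates one client session: reuses the cached host if present,
/// otherwise creates one and counts the creation.
fn serve_one_client(cached_host: &mut Option<Arc<Host>>, created: &mut u32) -> Arc<Host> {
    cached_host
        .get_or_insert_with(|| {
            *created += 1;
            Arc::new(Host { id: *created })
        })
        .clone()
}
```

When the `Option` is owned by the outer loop, two consecutive sessions receive the same `Arc`, so only one helper page is ever bootstrapped per process. If the cache were a local inside the session function, each reconnect would start from `None` and create a new host.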
- [ ] **Step 4: Update the `pub use` export if needed**
Check `src/service/mod.rs:17`:
```rust
pub use server::{serve_client, ServiceEventSink, ServiceSession};
```
The signature change is compatible — `serve_client` is still public with an added parameter. Any external callers will get a compile error guiding them to add the parameter, which is the desired behavior.
- [ ] **Step 5: Run build to verify compilation**
Run: `cargo build 2>&1`
Expected: 0 errors. If there are external test files calling `serve_client`, they will fail here and need the new parameter added.
- [ ] **Step 6: Run full test suite**
Run: `cargo test 2>&1`
Expected: All tests pass. External test files that call `serve_client` indirectly through the service protocol tests should still work because they use the WS protocol layer, not `serve_client` directly. (Verified: grep found 0 test files referencing `serve_client` or `LiveBrowserCallbackHost`.)
- [ ] **Step 7: Commit**
```bash
git add src/service/mod.rs src/service/server.rs
git commit -m "fix(service): lift cached_host to outer loop to prevent duplicate helper pages"
```
---
### Task 4: Add tests for hidden domain bootstrap
**Files:**
- Modify: `src/browser/callback_host.rs` (inline tests module, around line 1071)
- [ ] **Step 1: Update existing `live_callback_host_sends_bootstrap_open_page_command` test**
The test currently calls `start_with_browser_ws_url` with 4 args. Add the 5th arg `false`:
```rust
#[test]
fn live_callback_host_sends_bootstrap_open_page_command() {
let (ws_url, frames, handle) = start_fake_browser_status_server();
let result = LiveBrowserCallbackHost::start_with_browser_ws_url(
&ws_url,
"https://www.zhihu.com",
Duration::from_millis(100),
Duration::from_millis(50),
false,
);
assert!(result.is_err(), "expected timeout because no real helper page loads");
drop(result);
handle.join().unwrap();
let sent = frames.lock().unwrap().clone();
assert!(
sent.iter().any(|frame| frame.contains("sgBrowerserOpenPage")),
"bootstrap should send sgBrowerserOpenPage to the browser WS; sent frames: {sent:?}"
);
assert!(
sent.iter().any(|frame| frame.contains("/sgclaw/browser-helper.html")),
"bootstrap should include the helper page URL; sent frames: {sent:?}"
);
assert!(
sent.iter().any(|frame| frame.contains("https://www.zhihu.com")),
"bootstrap requestUrl should be the provided page URL; sent frames: {sent:?}"
);
}
```
- [ ] **Step 2: Add new test for hidden domain bootstrap**
Add this test after the existing one:
```rust
#[test]
fn live_callback_host_hidden_domain_sends_hide_open_page_command() {
let (ws_url, frames, handle) = start_fake_browser_status_server();
let result = LiveBrowserCallbackHost::start_with_browser_ws_url(
&ws_url,
"https://www.zhihu.com",
Duration::from_millis(100),
Duration::from_millis(50),
true,
);
assert!(result.is_err(), "expected timeout because no real helper page loads");
drop(result);
handle.join().unwrap();
let sent = frames.lock().unwrap().clone();
assert!(
sent.iter().any(|frame| frame.contains("sgHideBrowerserOpenPage")),
"hidden domain bootstrap should send sgHideBrowerserOpenPage; sent frames: {sent:?}"
);
assert!(
!sent.iter().any(|frame| {
frame.contains("\"sgBrowerserOpenPage\"")
}),
"hidden domain bootstrap should NOT send visible sgBrowerserOpenPage; sent frames: {sent:?}"
);
assert!(
sent.iter().any(|frame| frame.contains("/sgclaw/browser-helper.html")),
"bootstrap should include the helper page URL; sent frames: {sent:?}"
);
}
```
- [ ] **Step 3: Run all callback_host tests**
Run: `cargo test -- callback_host 2>&1`
Expected: All 3 tests pass:
- `live_callback_host_sends_bootstrap_open_page_command` — regression, visible domain
- `live_callback_host_hidden_domain_sends_hide_open_page_command` — new, hidden domain
- `live_callback_host_treats_simulated_mouse_command_as_fire_and_forget` — unchanged
- [ ] **Step 4: Run full test suite**
Run: `cargo test 2>&1`
Expected: All tests pass.
- [ ] **Step 5: Commit**
```bash
git add src/browser/callback_host.rs
git commit -m "test(callback_host): add hidden domain bootstrap test"
```
---
### Task 5: Full build verification
**Files:** None (verification only)
- [ ] **Step 1: Clean build**
Run: `cargo build 2>&1`
Expected: 0 errors. Warnings about dead code in unrelated modules are acceptable.
- [ ] **Step 2: Full test suite**
Run: `cargo test 2>&1`
Expected: All tests pass. The pre-existing `lineloss_period_resolver_prompts_for_missing_period` failure (from previous work) is known and unrelated.
- [ ] **Step 3: Verify the key behavioral changes in code**
Manually confirm:
1. `src/service/mod.rs`: `cached_host` is declared BEFORE the `loop`, not inside `serve_client`
2. `src/browser/callback_host.rs`: `Drop::drop` calls `close_helper_page` before shutdown
3. `src/browser/callback_host.rs`: `bootstrap_helper_page` uses `"sgHideBrowerserOpenPage"` when `use_hidden_domain == true` and `"sgBrowerserOpenPage"` when `false`
4. `src/service/server.rs`: `start_with_browser_ws_url` call passes `false` as `use_hidden_domain`


@@ -0,0 +1,762 @@
# Service Console Enhancement Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add auto-connect on page load and a settings panel to sg_claw_service_console.html, with config save via WebSocket to the sgClaw service.
**Architecture:** The HTML page auto-connects on load and provides a settings modal. When user saves, the page sends an `update_config` WebSocket message. The Rust service receives it, merges with existing config, writes to `sgclaw_config.json`, and responds.
**Tech Stack:** Rust (serde, tungstenite), vanilla JavaScript/HTML/CSS
---
### Task 1: Add `UpdateConfig` and `ConfigUpdated` protocol types
**Files:**
- Modify: `src/service/protocol.rs`
- [ ] **Step 1: Add `ConfigUpdatePayload` struct and `UpdateConfig` variant to `ClientMessage`**
Add this struct above the `ClientMessage` enum, and add the `UpdateConfig` variant to the enum:
```rust
use std::path::PathBuf;
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct ConfigUpdatePayload {
#[serde(rename = "apiKey", default)]
pub api_key: Option<String>,
#[serde(rename = "baseUrl", default)]
pub base_url: Option<String>,
#[serde(default)]
pub model: Option<String>,
#[serde(rename = "skillsDir", default)]
pub skills_dir: Option<String>,
#[serde(rename = "directSubmitSkill", default)]
pub direct_submit_skill: Option<String>,
#[serde(rename = "runtimeProfile", default)]
pub runtime_profile: Option<String>,
#[serde(rename = "browserBackend", default)]
pub browser_backend: Option<String>,
}
```
Add `UpdateConfig` variant to `ClientMessage` enum (after `Ping`):
```rust
UpdateConfig {
config: ConfigUpdatePayload,
},
```
- [ ] **Step 2: Add `ConfigUpdated` variant to `ServiceMessage`**
Add after `Pong`:
```rust
ConfigUpdated {
success: bool,
message: String,
},
```
- [ ] **Step 3: Update `into_submit_task_request` to handle `UpdateConfig`**
In the match arm, add `ClientMessage::UpdateConfig { .. }` to the list that returns `None`:
```rust
ClientMessage::Connect
| ClientMessage::Start
| ClientMessage::Stop
| ClientMessage::Ping
| ClientMessage::UpdateConfig { .. } => None,
```
- [ ] **Step 4: Run tests to verify protocol compiles**
Run: `cargo test --lib service::protocol`
Expected: PASS (no protocol-specific tests yet, but it should compile)
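Since every field of `ConfigUpdatePayload` is an `Option`, the natural merge rule is "overwrite only what the client sent". The sketch below shows that rule on a reduced, hypothetical `Settings` struct; the real merge happens against `SgClawSettings` and may treat empty strings or additional fields differently.

```rust
/// Hypothetical subset of the stored settings.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Settings {
    api_key: String,
    base_url: String,
    model: String,
}

/// Mirrors a subset of the optional fields of `ConfigUpdatePayload`.
#[derive(Debug, Default)]
struct ConfigUpdate {
    api_key: Option<String>,
    base_url: Option<String>,
    model: Option<String>,
}

/// Overwrites only the fields the client actually sent; `None` keeps
/// the existing value.
fn merge_update(mut current: Settings, update: ConfigUpdate) -> Settings {
    if let Some(api_key) = update.api_key {
        current.api_key = api_key;
    }
    if let Some(base_url) = update.base_url {
        current.base_url = base_url;
    }
    if let Some(model) = update.model {
        current.model = model;
    }
    current
}
```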
### Task 2: Add `config_path()` getter to `AgentRuntimeContext`
**Files:**
- Modify: `src/agent/task_runner.rs`
- [ ] **Step 1: Add public getter method**
In the `impl AgentRuntimeContext` block, add after `load_sgclaw_settings()`:
```rust
pub fn config_path(&self) -> Option<&Path> {
self.config_path.as_deref()
}
```
Add the import at the top of the file if not present:
```rust
use std::path::Path;
```
- [ ] **Step 2: Run tests to verify**
Run: `cargo test agent::task_runner`
Expected: PASS
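The getter works because `Option<PathBuf>::as_deref()` borrows the owned path as `Option<&Path>`. A self-contained sketch, with a hypothetical `RuntimeContext` standing in for `AgentRuntimeContext` and assuming the field is an `Option<PathBuf>`:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `AgentRuntimeContext`; only the path is modeled.
struct RuntimeContext {
    config_path: Option<PathBuf>,
}

impl RuntimeContext {
    /// Borrows the stored path without cloning it.
    fn config_path(&self) -> Option<&Path> {
        self.config_path.as_deref()
    }
}
```

Returning `Option<&Path>` instead of `&Option<PathBuf>` keeps the storage type out of the public signature.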
### Task 3: Add `save_to_path()` method to `SgClawSettings`
**Files:**
- Modify: `src/config/settings.rs`
- [ ] **Step 1: Add Serialize derive to SgClawSettings and related types**
The `RawSgClawSettings` struct uses `Deserialize` only. We need to add `Serialize` to `SgClawSettings` for writing. Add `use serde::Serialize;` at the top.
Add `Serialize` derive to `SgClawSettings`:
```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize)]
pub struct SgClawSettings {
```
However, `SgClawSettings` has enum fields (`RuntimeProfile`, `SkillsPromptMode`, `PlannerMode`, `BrowserBackend`, `OfficeBackend`) that don't implement `Serialize`, so this derive only compiles if `Serialize` is added to each of those types as well.
The simpler approach is to leave `SgClawSettings` without the derive and write a `to_serializable()` method that converts it into a dedicated serializable struct, then serialize that.
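Converting an enum to its config string by hand, rather than through a `Serialize` derive, can be sketched as follows. The variant names and kebab-case strings are the ones this plan uses for `RuntimeProfile`; the `as_config_str` and `from_config_str` method names are hypothetical.

```rust
/// Local copy of the `RuntimeProfile` variants named in this plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RuntimeProfile {
    BrowserAttached,
    BrowserHeavy,
    GeneralAssistant,
}

impl RuntimeProfile {
    /// Maps each variant to the kebab-case string written to the config file.
    fn as_config_str(self) -> &'static str {
        match self {
            RuntimeProfile::BrowserAttached => "browser-attached",
            RuntimeProfile::BrowserHeavy => "browser-heavy",
            RuntimeProfile::GeneralAssistant => "general-assistant",
        }
    }

    /// Inverse mapping, included only to show the strings round-trip.
    fn from_config_str(raw: &str) -> Option<Self> {
        match raw {
            "browser-attached" => Some(RuntimeProfile::BrowserAttached),
            "browser-heavy" => Some(RuntimeProfile::BrowserHeavy),
            "general-assistant" => Some(RuntimeProfile::GeneralAssistant),
            _ => None,
        }
    }
}
```

An exhaustive `match` means adding a variant later produces a compile error until its config string is defined, which is the main advantage over ad hoc string formatting.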
- [ ] **Step 2: Create serializable raw config struct**
Add a new struct at the bottom of the file (before tests if any):
```rust
#[derive(Debug, Serialize)]
struct SerializableRawSgClawSettings {
#[serde(rename = "apiKey")]
api_key: String,
#[serde(rename = "baseUrl")]
base_url: String,
model: String,
#[serde(rename = "skillsDir", skip_serializing_if = "Option::is_none")]
skills_dir: Option<String>,
#[serde(rename = "directSubmitSkill", skip_serializing_if = "Option::is_none")]
direct_submit_skill: Option<String>,
#[serde(rename = "skillsPromptMode", skip_serializing_if = "Option::is_none")]
skills_prompt_mode: Option<String>,
#[serde(rename = "runtimeProfile", skip_serializing_if = "Option::is_none")]
runtime_profile: Option<String>,
#[serde(rename = "plannerMode", skip_serializing_if = "Option::is_none")]
planner_mode: Option<String>,
#[serde(rename = "activeProvider", skip_serializing_if = "Option::is_none")]
active_provider: Option<String>,
#[serde(rename = "browserBackend", skip_serializing_if = "Option::is_none")]
browser_backend: Option<String>,
#[serde(rename = "officeBackend", skip_serializing_if = "Option::is_none")]
office_backend: Option<String>,
#[serde(rename = "browserWsUrl", skip_serializing_if = "Option::is_none")]
browser_ws_url: Option<String>,
#[serde(rename = "serviceWsListenAddr", skip_serializing_if = "Option::is_none")]
service_ws_listen_addr: Option<String>,
#[serde(default)]
providers: Vec<SerializableProviderSettings>,
}
#[derive(Debug, Serialize)]
struct SerializableProviderSettings {
id: String,
provider: Option<String>,
#[serde(rename = "apiKey")]
api_key: String,
#[serde(rename = "baseUrl", skip_serializing_if = "Option::is_none")]
base_url: Option<String>,
model: String,
#[serde(rename = "apiPath", skip_serializing_if = "Option::is_none")]
api_path: Option<String>,
#[serde(rename = "wireApi", skip_serializing_if = "Option::is_none")]
wire_api: Option<String>,
#[serde(rename = "requiresOpenaiAuth")]
requires_openai_auth: bool,
}
```
Add `use serde::Serialize;` at the top of the file (combine with existing `use serde::Deserialize;`):
```rust
use serde::{Deserialize, Serialize};
```
- [ ] **Step 3: Add `to_serializable()` method to `SgClawSettings`**
In the `impl SgClawSettings` block, add:
```rust
fn to_serializable(&self) -> SerializableRawSgClawSettings {
let format_enum_value = |s: &str| s.to_string();
SerializableRawSgClawSettings {
api_key: self.provider_api_key.clone(),
base_url: self.provider_base_url.clone(),
model: self.provider_model.clone(),
skills_dir: self.skills_dir.as_ref().map(|p| p.to_string_lossy().into_owned()),
direct_submit_skill: self.direct_submit_skill.clone(),
skills_prompt_mode: Some(format_enum_value(match self.skills_prompt_mode {
SkillsPromptMode::Full => "full",
SkillsPromptMode::Compact => "compact",
})),
runtime_profile: Some(format_enum_value(match self.runtime_profile {
RuntimeProfile::BrowserAttached => "browser-attached",
RuntimeProfile::BrowserHeavy => "browser-heavy",
RuntimeProfile::GeneralAssistant => "general-assistant",
})),
planner_mode: Some(format_enum_value(match self.planner_mode {
PlannerMode::ZeroclawPlanFirst => "zeroclaw-plan-first",
PlannerMode::LegacyDeterministic => "legacy-deterministic",
})),
active_provider: Some(self.active_provider.clone()),
browser_backend: Some(format_enum_value(match self.browser_backend {
BrowserBackend::SuperRpa => "super-rpa",
BrowserBackend::AgentBrowser => "agent-browser",
BrowserBackend::RustNative => "rust-native",
BrowserBackend::ComputerUse => "computer-use",
BrowserBackend::Auto => "auto",
})),
office_backend: Some(format_enum_value(match self.office_backend {
OfficeBackend::OpenXml => "openxml",
OfficeBackend::Disabled => "disabled",
})),
browser_ws_url: self.browser_ws_url.clone(),
service_ws_listen_addr: self.service_ws_listen_addr.clone(),
providers: self
.providers
.iter()
.map(|p| SerializableProviderSettings {
id: p.id.clone(),
provider: Some(p.provider.clone()),
api_key: p.api_key.clone(),
base_url: p.base_url.clone(),
model: p.model.clone(),
api_path: p.api_path.clone(),
wire_api: p.wire_api.clone(),
requires_openai_auth: p.requires_openai_auth,
})
.collect(),
}
}
```
- [ ] **Step 4: Add `save_to_path()` method**
In the same `impl SgClawSettings` block, add:
```rust
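/// Serializes the settings as pretty-printed JSON and writes them to `path`,
/// overwriting any existing file.
///
/// Note: this reuses the existing `ConfigError::ConfigParse` and
/// `ConfigError::ConfigRead` variants for serialization and write failures,
/// so the error text may mention parsing or reading even though saving failed.
pub fn save_to_path(&self, path: &Path) -> Result<(), ConfigError> {
let serializable = self.to_serializable();
let json = serde_json::to_string_pretty(&serializable)
.map_err(|err| ConfigError::ConfigParse(path.to_path_buf(), err.to_string()))?;
std::fs::write(path, json)
.map_err(|err| ConfigError::ConfigRead(path.to_path_buf(), err.to_string()))
}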
```
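`std::fs::write` truncates the target before writing, so an interrupted save could leave a partially written `sgclaw_config.json`. The following is a minimal sketch of a write-then-rename variant, not part of the plan: `write_atomically` is a hypothetical helper name, and the sketch assumes the temporary file can live next to the target so the rename stays on one filesystem.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (not in the plan): writes `contents` to a sibling
/// temporary file first, then renames it over `path`.
pub fn write_atomically(path: &Path, contents: &str) -> io::Result<()> {
    // Build "<file name>.tmp" next to the target file.
    let mut tmp_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?
        .to_os_string();
    tmp_name.push(".tmp");
    let tmp_path = path.with_file_name(tmp_name);

    // The target is untouched until the temporary file is fully written.
    fs::write(&tmp_path, contents)?;

    // If the rename fails, try to remove the temporary file before reporting the error.
    fs::rename(&tmp_path, path).map_err(|err| {
        let _ = fs::remove_file(&tmp_path);
        err
    })
}
```

If something like this were adopted, `save_to_path()` could call it in place of `std::fs::write` and keep the same error mapping.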
- [ ] **Step 5: Run tests to verify compilation**
Run: `cargo test --lib config::settings`
Expected: PASS
### Task 4: Handle `UpdateConfig` in the service server
**Files:**
- Modify: `src/service/server.rs`
- Modify: `src/service/mod.rs` (if needed for imports)
- [ ] **Step 1: Add `UpdateConfig` match arm in `serve_client`**
In the `match message` block in `serve_client`, after the `SubmitTask` arm, add:
```rust
ClientMessage::UpdateConfig { config } => {
let Some(config_path) = context.config_path() else {
sink.send_service_message(ServiceMessage::ConfigUpdated {
success: false,
message: "未找到配置文件路径。请通过 --config-path 参数启动 sg_claw 后再使用此功能。".to_string(),
})?;
continue;
};
if !config_path.exists() {
sink.send_service_message(ServiceMessage::ConfigUpdated {
success: false,
message: format!("配置文件不存在: {}", config_path.display()),
})?;
continue;
}
let result = update_config_file(config_path, config);
match result {
Ok(()) => {
sink.send_service_message(ServiceMessage::ConfigUpdated {
success: true,
message: "配置已保存。重启 sg_claw 以应用新配置。".to_string(),
})?;
}
Err(err) => {
sink.send_service_message(ServiceMessage::ConfigUpdated {
success: false,
message: format!("保存配置失败: {}", err),
})?;
}
}
}
```
- [ ] **Step 2: Add `update_config_file` helper function**
Add this function above `serve_client` in `server.rs`:
```rust
use crate::config::settings::SgClawSettings; // `ConfigError` is not needed here: errors are converted to `String`
use crate::service::protocol::ConfigUpdatePayload;
use std::path::Path;
fn update_config_file(config_path: &Path, config: ConfigUpdatePayload) -> Result<(), String> {
let mut settings = SgClawSettings::load(Some(config_path))
.map_err(|e| e.to_string())?
.ok_or_else(|| "无法读取现有配置".to_string())?;
if let Some(v) = config.api_key {
settings.provider_api_key = v;
}
if let Some(v) = config.base_url {
settings.provider_base_url = v;
}
if let Some(v) = config.model {
settings.provider_model = v;
}
if let Some(v) = config.skills_dir {
settings.skills_dir = Some(PathBuf::from(&v));
}
if let Some(v) = config.direct_submit_skill {
settings.direct_submit_skill = Some(v);
}
if let Some(v) = config.runtime_profile {
settings.runtime_profile = match v.as_str() {
"browser-attached" => crate::config::settings::RuntimeProfile::BrowserAttached,
"browser-heavy" => crate::config::settings::RuntimeProfile::BrowserHeavy,
"general-assistant" => crate::config::settings::RuntimeProfile::GeneralAssistant,
_ => return Err(format!("无效的 runtimeProfile: {}", v)),
};
}
if let Some(v) = config.browser_backend {
settings.browser_backend = match v.as_str() {
"super-rpa" => crate::config::settings::BrowserBackend::SuperRpa,
"agent-browser" => crate::config::settings::BrowserBackend::AgentBrowser,
"rust-native" => crate::config::settings::BrowserBackend::RustNative,
"computer-use" => crate::config::settings::BrowserBackend::ComputerUse,
"auto" => crate::config::settings::BrowserBackend::Auto,
_ => return Err(format!("无效的 browserBackend: {}", v)),
};
}
settings
.save_to_path(config_path)
.map_err(|e| format!("写入配置文件失败: {}", e))
}
```
Add the import at the top of server.rs:
```rust
use std::path::PathBuf;
```
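The string-to-enum matches in `update_config_file` repeat, in reverse, the mapping already written in `to_serializable()`. One way to keep the two directions in sync is to put the wire names on the enum itself. This is a sketch under assumptions: the `RuntimeProfile` below is a local stand-in that assumes the real type is a plain fieldless enum, and `as_str` and the `FromStr` implementation are hypothetical additions.

```rust
use std::fmt;
use std::str::FromStr;

/// Stand-in for `crate::config::settings::RuntimeProfile` (assumed shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuntimeProfile {
    BrowserAttached,
    BrowserHeavy,
    GeneralAssistant,
}

impl RuntimeProfile {
    /// Every variant, used to search by wire name.
    const ALL: [RuntimeProfile; 3] = [
        RuntimeProfile::BrowserAttached,
        RuntimeProfile::BrowserHeavy,
        RuntimeProfile::GeneralAssistant,
    ];

    /// Wire name as written to `sgclaw_config.json`.
    pub fn as_str(self) -> &'static str {
        match self {
            RuntimeProfile::BrowserAttached => "browser-attached",
            RuntimeProfile::BrowserHeavy => "browser-heavy",
            RuntimeProfile::GeneralAssistant => "general-assistant",
        }
    }
}

impl fmt::Display for RuntimeProfile {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

impl FromStr for RuntimeProfile {
    type Err = String;

    /// Parses a wire name; unknown names become an error message.
    fn from_str(value: &str) -> Result<Self, Self::Err> {
        Self::ALL
            .iter()
            .copied()
            .find(|profile| profile.as_str() == value)
            .ok_or_else(|| format!("invalid runtimeProfile: {value}"))
    }
}
```

With this shape, the update path could reduce to `v.parse::<RuntimeProfile>()?` and the serialization path to `self.runtime_profile.to_string()`, so a new variant only needs one new `match` arm and one new `ALL` entry.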
- [ ] **Step 3: Build to verify compilation**
Run: `cargo build`
Expected: SUCCESS
### Task 5: Add auto-connect and settings UI to the service console HTML
**Files:**
- Modify: `frontend/service-console/sg_claw_service_console.html`
- [ ] **Step 1: Add auto-connect on page load**
At the very end of the `<script>` section, after the existing event listeners and `updateUiState()`, add:
```javascript
// Auto-connect on page load
window.addEventListener("DOMContentLoaded", () => {
connectOrDisconnectService(true);
});
```
- [ ] **Step 2: Add Settings button HTML**
In the sidebar section of the HTML, after the connect button and before the "Composer" section label, add:
```html
<button id="settingsBtn" class="ghost-btn" style="margin-top: 8px;">⚙ 设置</button>
```
- [ ] **Step 3: Add Settings modal HTML**
Before the closing `</body>` tag, add the modal HTML:
```html
<!-- Settings Modal -->
<div id="settingsModal" style="display: none; position: fixed; top: 0; left: 0; width: 100%; height: 100%; background: rgba(0,0,0,0.5); z-index: 1000; align-items: center; justify-content: center;">
<div style="background: var(--panel); border-radius: 20px; padding: 28px; width: min(520px, 90%); max-height: 85vh; overflow-y: auto; box-shadow: var(--shadow);">
<h3 style="margin: 0 0 20px; font-size: 1.2rem;">sgClaw 配置</h3>
<div class="field">
<label for="settingApiKey">API 密钥 *</label>
<input id="settingApiKey" type="password" placeholder="输入模型 API 密钥" />
</div>
<div class="field">
<label for="settingBaseUrl">模型服务地址 *</label>
<input id="settingBaseUrl" type="url" placeholder="例如https://api.deepseek.com" />
</div>
<div class="field">
<label for="settingModel">模型名称 *</label>
<input id="settingModel" type="text" placeholder="例如deepseek-chat" />
</div>
<div class="field">
<label for="settingSkillsDir">Skills 目录路径</label>
<input id="settingSkillsDir" type="text" placeholder="例如D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills" />
</div>
<div class="field">
<label for="settingDirectSubmitSkill">直接提交技能</label>
<input id="settingDirectSubmitSkill" type="text" placeholder="例如tq-lineloss-report.collect_lineloss" />
</div>
<div class="field">
<label for="settingRuntimeProfile">运行模式</label>
<select id="settingRuntimeProfile" style="width: 100%; border: 1px solid var(--line); border-radius: 16px; padding: 14px 16px; background: rgba(255, 255, 255, 0.92); color: var(--text); font: inherit;">
<option value="browser-attached">browser-attached</option>
<option value="browser-heavy">browser-heavy</option>
<option value="general-assistant">general-assistant</option>
</select>
</div>
<div class="field">
<label for="settingBrowserBackend">浏览器后端</label>
<select id="settingBrowserBackend" style="width: 100%; border: 1px solid var(--line); border-radius: 16px; padding: 14px 16px; background: rgba(255, 255, 255, 0.92); color: var(--text); font: inherit;">
<option value="super-rpa">super-rpa</option>
<option value="agent-browser">agent-browser</option>
<option value="rust-native">rust-native</option>
<option value="computer-use">computer-use</option>
<option value="auto">auto</option>
</select>
</div>
<div id="settingsValidation" style="color: var(--error); font-size: 0.92rem; min-height: 1.4em; margin: 10px 0;"></div>
<div style="display: flex; gap: 12px; margin-top: 16px;">
<button id="settingsSaveBtn" class="primary-btn" style="flex: 1;">保存</button>
<button id="settingsCancelBtn" class="ghost-btn" style="flex: 1;">取消</button>
</div>
</div>
</div>
```
- [ ] **Step 4: Add settings modal CSS**
Add these CSS rules inside the `<style>` block, before the `@media` query:
```css
/* Settings modal elements */
select {
width: 100%;
border: 1px solid var(--line);
border-radius: 16px;
padding: 14px 16px;
background: rgba(255, 255, 255, 0.92);
color: var(--text);
font: inherit;
outline: none;
cursor: pointer;
}
select:focus {
border-color: rgba(15, 118, 110, 0.5);
box-shadow: 0 0 0 4px rgba(15, 118, 110, 0.12);
}
```
- [ ] **Step 5: Add settings modal JavaScript logic**
Add this JavaScript at the end of the `<script>` section, before the closing `</script>` tag:
```javascript
// Settings modal state
const settingsElements = {
modal: document.getElementById("settingsModal"),
apiKey: document.getElementById("settingApiKey"),
baseUrl: document.getElementById("settingBaseUrl"),
model: document.getElementById("settingModel"),
skillsDir: document.getElementById("settingSkillsDir"),
directSubmitSkill: document.getElementById("settingDirectSubmitSkill"),
runtimeProfile: document.getElementById("settingRuntimeProfile"),
browserBackend: document.getElementById("settingBrowserBackend"),
validation: document.getElementById("settingsValidation"),
saveBtn: document.getElementById("settingsSaveBtn"),
cancelBtn: document.getElementById("settingsCancelBtn"),
};
let settingsOpenBtn = null; // assigned below, when the listeners are wired up
function openSettingsModal() {
// Reset every field to its default. The server does not send the current
// configuration to the console, so nothing is pre-filled.
settingsElements.apiKey.value = "";
settingsElements.baseUrl.value = "";
settingsElements.model.value = "";
settingsElements.skillsDir.value = "";
settingsElements.directSubmitSkill.value = "";
settingsElements.runtimeProfile.value = "browser-attached";
settingsElements.browserBackend.value = "super-rpa";
settingsElements.validation.textContent = "";
// Restore the error color in case a previous successful save switched it to the success color.
settingsElements.validation.style.color = "var(--error)";
settingsElements.modal.style.display = "flex";
}
function closeSettingsModal() {
settingsElements.modal.style.display = "none";
}
function validateSettings() {
const apiKey = settingsElements.apiKey.value.trim();
const baseUrl = settingsElements.baseUrl.value.trim();
const model = settingsElements.model.value.trim();
if (!apiKey) {
return "API 密钥不能为空";
}
if (!model) {
return "模型名称不能为空";
}
if (!baseUrl) {
return "模型服务地址不能为空";
}
try {
new URL(baseUrl);
} catch {
return "模型服务地址格式无效,请输入有效的 URL";
}
return "";
}
function saveSettings() {
const error = validateSettings();
if (error) {
settingsElements.validation.textContent = error;
return;
}
if (!socket || socket.readyState !== WebSocket.OPEN) {
settingsElements.validation.textContent = "请先连接服务";
return;
}
settingsElements.validation.textContent = "";
settingsElements.saveBtn.disabled = true;
settingsElements.saveBtn.textContent = "保存中...";
const config = {
apiKey: settingsElements.apiKey.value.trim(),
baseUrl: settingsElements.baseUrl.value.trim(),
model: settingsElements.model.value.trim(),
};
const skillsDir = settingsElements.skillsDir.value.trim();
if (skillsDir) config.skillsDir = skillsDir;
const directSubmitSkill = settingsElements.directSubmitSkill.value.trim();
if (directSubmitSkill) config.directSubmitSkill = directSubmitSkill;
config.runtimeProfile = settingsElements.runtimeProfile.value;
config.browserBackend = settingsElements.browserBackend.value;
socket.send(JSON.stringify({
type: "update_config",
config,
}));
}
function handleConfigResponse(message) {
settingsElements.saveBtn.disabled = false;
settingsElements.saveBtn.textContent = "保存";
if (message.success) {
settingsElements.validation.textContent = message.message;
settingsElements.validation.style.color = "var(--success)";
// Auto-close after 2 seconds on success
setTimeout(closeSettingsModal, 2000);
} else {
settingsElements.validation.textContent = message.message;
settingsElements.validation.style.color = "var(--error)";
}
}
// Event listeners for settings
settingsOpenBtn = document.getElementById("settingsBtn");
settingsOpenBtn.addEventListener("click", openSettingsModal);
settingsElements.cancelBtn.addEventListener("click", closeSettingsModal);
settingsElements.saveBtn.addEventListener("click", saveSettings);
// Close modal on background click
settingsElements.modal.addEventListener("click", (e) => {
if (e.target === settingsElements.modal) {
closeSettingsModal();
}
});
```
- [ ] **Step 6: Handle `config_updated` message in `handleMessage`**
In the existing `handleMessage` function, add a new case in the switch statement:
```javascript
case "config_updated":
handleConfigResponse(message);
break;
```
- [ ] **Step 7: Verify the HTML is well-formed**
Open the file in a browser and visually check that:
- The settings button appears below the connect button
- Clicking it opens the modal
- The modal closes on Cancel or background click
### Task 6: Add protocol tests for new message types
**Files:**
- Modify: `tests/service_console_html_test.rs`
- Create: `tests/service_protocol_update_config_test.rs`
- [ ] **Step 1: Create protocol serialization test**
Create `tests/service_protocol_update_config_test.rs`:
```rust
use sgclaw::service::protocol::{ClientMessage, ConfigUpdatePayload, ServiceMessage};
#[test]
fn update_config_serializes_correctly() {
let config = ConfigUpdatePayload {
api_key: Some("test-key".to_string()),
base_url: Some("https://api.example.com".to_string()),
model: Some("test-model".to_string()),
skills_dir: Some("/path/to/skills".to_string()),
direct_submit_skill: Some("my-skill.my-tool".to_string()),
runtime_profile: Some("browser-attached".to_string()),
browser_backend: Some("super-rpa".to_string()),
};
let msg = ClientMessage::UpdateConfig { config };
let json = serde_json::to_string(&msg).unwrap();
assert!(json.contains("\"type\":\"update_config\""));
assert!(json.contains("\"apiKey\":\"test-key\""));
assert!(json.contains("\"baseUrl\":\"https://api.example.com\""));
assert!(json.contains("\"model\":\"test-model\""));
}
#[test]
fn update_config_deserializes_correctly() {
let json = r#"{
"type": "update_config",
"config": {
"apiKey": "key123",
"baseUrl": "https://api.test.com",
"model": "gpt-4"
}
}"#;
let msg: ClientMessage = serde_json::from_str(json).unwrap();
match msg {
ClientMessage::UpdateConfig { config } => {
assert_eq!(config.api_key, Some("key123".to_string()));
assert_eq!(config.base_url, Some("https://api.test.com".to_string()));
assert_eq!(config.model, Some("gpt-4".to_string()));
assert!(config.skills_dir.is_none());
}
_ => panic!("expected UpdateConfig variant"),
}
}
#[test]
fn config_updated_serializes_correctly() {
let msg = ServiceMessage::ConfigUpdated {
success: true,
message: "配置已保存".to_string(),
};
let json = serde_json::to_string(&msg).unwrap();
assert!(json.contains("\"type\":\"config_updated\""));
assert!(json.contains("\"success\":true"));
assert!(json.contains("配置已保存"));
}
#[test]
fn config_updated_deserializes_correctly() {
let json = r#"{"type":"config_updated","success":false,"message":"保存失败"}"#;
let msg: ServiceMessage = serde_json::from_str(json).unwrap();
match msg {
ServiceMessage::ConfigUpdated { success, message } => {
assert!(!success);
assert_eq!(message, "保存失败");
}
_ => panic!("expected ConfigUpdated variant"),
}
}
```
- [ ] **Step 2: Update service console HTML test**
Add to `tests/service_console_html_test.rs`, at the end of the existing test:
```rust
// New enhancement assertions
assert!(source.contains("DOMContentLoaded"));
assert!(source.contains("settingsBtn"));
assert!(source.contains("settingsModal"));
assert!(source.contains("update_config"));
assert!(source.contains("config_updated"));
assert!(source.contains("settingApiKey"));
assert!(source.contains("settingBaseUrl"));
assert!(source.contains("settingModel"));
```
- [ ] **Step 3: Run all new tests**
Run: `cargo test --test service_protocol_update_config_test`
Run: `cargo test --test service_console_html_test`
Expected: All PASS
### Task 7: Full build and test verification
- [ ] **Step 1: Run full test suite**
Run: `cargo test 2>&1`
Expected: All tests pass (except pre-existing `lineloss_period_resolver_prompts_for_missing_period` which was already failing before our changes)
- [ ] **Step 2: Build release binary**
Run: `cargo build --release 2>&1`
Expected: SUCCESS
### Task 8: Manual smoke test instructions
After implementation, verify manually:
1. Start sg_claw with config path: `sg_claw.exe --config-path sgclaw_config.json`
2. Open `sg_claw_service_console.html` in browser
3. Verify: Page auto-connects (should show "已连接" within a few seconds)
4. Click "设置" button
5. Fill in API Key, Base URL, Model
6. Click "保存"
7. Verify: Modal shows "配置已保存。重启 sg_claw 以应用新配置。" and auto-closes after 2 seconds
8. Verify: `sgclaw_config.json` file contains the new values
9. Verify: Existing task submission still works (send a test instruction)


@@ -0,0 +1,506 @@
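# WS Browser Backend Authentication Replacement Design
## Background
The websocket service path of `sg_claw` can already accept `sg_claw_client` requests, reuse the shared task runner, connect to the real browser websocket address `browser_ws_url`, and enter the real skill execution chain. During live integration testing, however, every browser-related call fails and returns:
- `invalid hmac seed: session key must not be empty`
The root cause has been identified:
- pipe mode, in [src/lib.rs](src/lib.rs), obtains `session_key` from the handshake and uses it to construct `BrowserPipeTool`
- ws service mode, in [src/service/server.rs](src/service/server.rs), still constructs `BrowserPipeTool::new(..., vec![])`
- the authentication model of `BrowserPipeTool` requires a non-empty session key, so the ws service path, although it speaks the browser websocket protocol, still wrongly depends on the pipe-specific HMAC/session-key semantics
As a result:
1. the `sg_claw_client -> sg_claw` connection works
2. skill loading and model calls work
3. real browser actions start executing
4. but every browser tool call fails uniformly at the authentication layer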
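## Goal
On the premise that **only ws mode is changed**, switch the `sg_claw` service path to a **ws-native browser backend** that no longer depends on the pipe session-key authentication model of `BrowserPipeTool`, so that live browser integration works.
## Constraints
The following must hold:
- change only the ws-mode implementation
- do not break legacy pipe mode
- do not modify pipe handshake semantics
- do not modify the behavior of the pipe main entry in `src/lib.rs`
- do not introduce a temporary authentication bypass or a fake seed
- do not expand the scope to multiple clients, multiple tasks, queues, or daemon management
## Non-goals
Out of scope for this work:
- automatically launching sgBrowser
- browser process management
- support for multiple browser instances
- service/client UX improvements
- browser ws protocol extensions
- refactoring pipe mode
- a unified refactor that makes every runtime layer depend entirely on `BrowserBackend`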
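## Current State Analysis
### Working pipe path
Pipe mode currently does the following in [src/lib.rs](src/lib.rs):
1. reads the browser-side initialization message through `perform_handshake(...)`
2. obtains `session_key` from the handshake
3. constructs the browser tool with `BrowserPipeTool::new(transport.clone(), mac_policy, handshake.session_key)`
4. uses pipe/HMAC semantics for subsequent browser actions
This path already works and must not be touched by this work.
### Current ws service path
ws mode currently does the following in [src/service/server.rs](src/service/server.rs):
1. `sg_claw_client` sends the task to the `sg_claw` service
2. the service constructs `ServiceBrowserTransport`
3. the service calls `BrowserPipeTool::new(transport.clone(), mac_policy.clone(), vec![])`
4. browser actions are encoded by `ServiceBrowserTransport` into browser websocket requests and sent to `browser_ws_url`
The problem is step 3:
- the service speaks the browser websocket protocol
- yet it still uses `BrowserPipeTool`
- `BrowserPipeTool` still insists on a pipe session key internally
- so live ws integration fails immediately
### Existing ws-native capabilities
The code base already contains:
- [src/browser/ws_protocol.rs](src/browser/ws_protocol.rs): the codec for the fixed browser websocket protocol
- [src/browser/ws_backend.rs](src/browser/ws_backend.rs): `WsBrowserBackend`
- [src/browser/mod.rs](src/browser/mod.rs): already exports `WsBrowserBackend`
`WsBrowserBackend` itself does not depend on the pipe session key. Instead it:
- uses `WsClient` to send and receive text frames
- uses `MacPolicy` for action-level validation
- handles the ws protocol through `encode_v1_action(...)` and `decode_callback_frame(...)`
This is exactly the model that ws service mode should use.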
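## Key Integration Seam
The real seam in the shared runner has been confirmed:
- `run_submit_task(...)` in [src/agent/task_runner.rs](src/agent/task_runner.rs) still directly requires `&BrowserPipeTool<T>`
- [src/compat/runtime.rs](src/compat/runtime.rs) and [src/compat/orchestration.rs](src/compat/orchestration.rs) also keep using `BrowserPipeTool<T>` as the primary browser call object
- at the same time, the compat runtime already has an internal tool adapter layer based on `Arc<dyn BrowserBackend>`, but it is currently produced by wrapping `PipeBrowserBackend::from_inner(browser_tool)`
This means the implementation cannot just replace the construction logic in `src/service/server.rs`. It must add a minimal adapter seam on the **ws-only call surface** so that service mode can pass `WsBrowserBackend` into compat/runtime/orchestration, while pipe keeps `BrowserPipeTool` unchanged.
The minimal allowed seam is defined as follows:
1. the pipe version of `run_submit_task(...)` stays untouched and continues to serve the pipe entry
2. add a parallel entry **used only by the ws service**, for example:
- `run_submit_task_with_browser_backend(...)`
- or an equivalent ws-only adapter called from the service side
3. inside the ws-only entry, the browser dependency type may be lowered to `Arc<dyn BrowserBackend>`
4. the behavior of `src/lib.rs`, the pipe handshake, and the pipe `BrowserPipeTool` construction logic must not change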
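The seam described above can be sketched as a parallel entry that depends only on a trait object. This is a simplified illustration, not the repository's code: the `BrowserBackend` trait here is a reduced stand-in with string arguments, `FakeWsBackend` is a hypothetical test double, and the body of `run_submit_task_with_browser_backend` is a placeholder for the real runner logic.

```rust
use std::sync::Arc;

/// Simplified stand-in for the repo's `BrowserBackend` trait (assumed shape).
pub trait BrowserBackend: Send + Sync {
    fn invoke(&self, action: &str, params: &str) -> Result<String, String>;
}

/// Hypothetical stand-in for the ws-native backend; the real
/// `WsBrowserBackend` would talk to `browser_ws_url`.
pub struct FakeWsBackend;

impl BrowserBackend for FakeWsBackend {
    fn invoke(&self, action: &str, params: &str) -> Result<String, String> {
        // Echo the call so a test can see which path handled it.
        Ok(format!("ws:{action}:{params}"))
    }
}

/// ws-only parallel entry: it depends on the trait object and never on
/// `BrowserPipeTool`, so the pipe entry can stay exactly as it is.
pub fn run_submit_task_with_browser_backend(
    backend: Arc<dyn BrowserBackend>,
    instruction: &str,
) -> Result<String, String> {
    // A real runner would plan actions from `instruction`; this sketch
    // issues a single navigate call to show the wiring only.
    backend.invoke("navigate", instruction)
}
```

Because the pipe-side `run_submit_task(...)` is not touched in this arrangement, the pipe regression tests should not need to change.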
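## Design Decisions
### Decision 1: the ws service path stops using `BrowserPipeTool`
The ws service path no longer constructs `BrowserPipeTool`.
Replacement:
- the service side provides a `WsClient` implementation
- it constructs `WsBrowserBackend` directly
- browser actions of the ws service run through `WsBrowserBackend`
### Decision 2: the pipe path stays as is
Pipe mode continues to use:
- handshake
- `session_key`
- `BrowserPipeTool`
No semantic adjustments, no compatibility layer, and no changes to the existing validation path.
### Decision 3: the runner is wired minimally, on the ws call surface only
The shared task runner is already reused, and this work does not include a large refactor.
The strategy is:
- change only the call surface used by the ws service so that it can use `WsBrowserBackend`
- if the shared call interface must be extended, make only **minimal, compatible changes with zero impact on pipe**
- no change that alters pipe behavior is allowed
### Decision 4: keep the existing browser websocket connection lifecycle
This work does not rework the connection management architecture.
It keeps:
- a single client
- a single task at a time, executed serially
- the browser websocket connection maintained according to the existing service lifecycle
Only the execution path with the authentication error is replaced; lifecycle optimization is not done in passing.
## Target Architecture
### Target call chain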
```text
sg_claw_client
-> sg_claw service
-> ws-native browser backend
-> browser_ws_url
-> sgBrowser
```
### Parallel relationship with pipe
```text
pipe mode:
browser process <-> stdio/pipe <-> sgclaw::run() <-> BrowserPipeTool
ws mode:
sg_claw_client <-> sg_claw service <-> WsBrowserBackend <-> sgBrowser websocket
```
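The two paths coexist in parallel and do not share an authentication model.
## Module Design
### 1. `src/service/server.rs`
This is the core file changed in this work.
#### Current responsibilities
- manage sending and receiving on the service client websocket
- forward service requests into the shared runner
- maintain the service->browser websocket transport bridge
#### Changes in this work
- change the service->browser bridge from the `Transport + BrowserPipeTool` combination to `WsClient + WsBrowserBackend`
- remove the dependency on an empty `session_key` from the ws service path
- keep the service socket lifecycle and the session state machine
#### Target structure
An acceptable target shape:
- `ServiceBrowserWsClient`: implements `WsClient`
- it keeps maintaining the real browser websocket connection internally
- `serve_client(...)` constructs `WsBrowserBackend` when handling a task
- the shared runner, or its ws call wrapper layer, executes browser actions through that backend
### 2. Shared runner / ws call wrapper layer
This work does not require converting the whole project to `BrowserBackend`.
However, ws service mode must be able to route browser actions to `WsBrowserBackend`.
Minimal acceptable approach:
- introduce an adapter that serves only ws mode, in the layer used by the ws service
- the adapter delegates the browser call capabilities needed by the runner to `WsBrowserBackend`
Requirements:
- existing pipe call signatures stay unchanged; if they are extended, pipe behavior must remain exactly the same
- rewriting the pipe entry for the sake of ws is not allowed
### 3. `src/browser/ws_backend.rs`
In principle, reuse the existing implementation.
Minimal patches are allowed only under the following conditions:
- live service integration shows that it lacks a capability the ws service needs but that is not currently exposed
- the patch serves only the ws-native path
- the semantics of existing tests are not affected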
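## Connection Responsibilities and Boundaries
To avoid the service side and `WsBrowserBackend` implementing the same responsibilities twice, this work sets the following explicit constraints:
### `WsBrowserBackend` is responsible for
- request serialization within a single `invoke(...)` call
- calling `encode_v1_action(...)`
- sending websocket text frames
- waiting for the immediate status frame
- if there is a callback, waiting for the callback frame and matching it by name
- normalizing the result into `CommandOutput`
- producing timeout / protocol errors according to the existing `WsBrowserBackend` semantics
### The service-side `WsClient` adapter is responsible for
- holding the real browser websocket connection
- establishing the connection to `browser_ws_url` on the first request
- delegating `send_text(...)` / `recv_text_timeout(...)` to the real websocket
- mapping low-level close, reset, and timeout conditions to the existing `PipeError` semantics
- not implementing request/response correlation and not parsing browser ws protocol payloads
### Explicitly not allowed
- the service side continuing to hand-write callback polling logic
- the service side continuing to call `encode_v1_action(...)` directly to build frames as its main path
- duplicating the protocol handling logic of `WsBrowserBackend` on the service side
This guarantees that:
- `src/service/server.rs` is only responsible for "wiring"
- `src/browser/ws_backend.rs` remains responsible for "ws browser call semantics"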
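The "connect on the first request" responsibility of the service-side adapter can be sketched with a small lazily-connecting holder. This is an illustration under assumptions: `LazyConnection`, `get_or_connect`, and `reset` are hypothetical names, the connection type `C` is left generic, and the dialing function is injected so the sketch runs without a real websocket library.

```rust
/// Hypothetical lazily-connecting holder; `C` is whatever connection type
/// the service-side adapter wraps.
pub struct LazyConnection<C> {
    url: String,
    connection: Option<C>,
    connect_calls: usize,
}

impl<C> LazyConnection<C> {
    pub fn new(url: &str) -> Self {
        Self { url: url.to_string(), connection: None, connect_calls: 0 }
    }

    /// Returns the live connection, calling `dial` only when none exists yet.
    pub fn get_or_connect<F>(&mut self, dial: F) -> Result<&mut C, String>
    where
        F: FnOnce(&str) -> Result<C, String>,
    {
        if self.connection.is_none() {
            self.connect_calls += 1;
            let conn = dial(&self.url)
                .map_err(|err| format!("browser websocket connect failed: {err}"))?;
            self.connection = Some(conn);
        }
        // The branch above guarantees the connection is present here.
        Ok(self.connection.as_mut().expect("connection was just established"))
    }

    /// Drops the connection after a close or reset so the next call redials.
    pub fn reset(&mut self) {
        self.connection = None;
    }

    /// Number of dial attempts so far (useful in tests).
    pub fn connect_calls(&self) -> usize {
        self.connect_calls
    }
}
```

The holder only owns the connection; it does not interpret frames, which keeps protocol handling inside `WsBrowserBackend` as required above.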
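## Data Flow Design
### Success path
1. `sg_claw_client` sends `SubmitTask` to `sg_claw`
2. the service receives the task and enters the shared runner
3. when the runner needs a browser action:
- the ws service calls `WsBrowserBackend.invoke(...)`
4. `WsBrowserBackend`:
- validates the action with `MacPolicy`
- encodes the request with `encode_v1_action(...)`
- sends it to `browser_ws_url`
- waits for the status frame
- if there is a callback, keeps waiting for the callback frame
5. the result is returned to the runner
6. the runner continues and streams logs and the completion to the client
### Failure paths
#### Browser websocket cannot be connected
- return an explicit browser websocket connect error
- do not disguise it as an authentication error
#### Browser returns a non-zero status
- return an explicit protocol error: `browser returned non-zero status`
#### Callback timeout
- return timeout
#### Websocket disconnected
- return `PipeError::PipeClosed`
- handled by the service lifecycle logic
#### Error that is no longer allowed
- `invalid hmac seed: session key must not be empty`
This error should disappear completely in ws mode.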
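## Failure Semantics
To simplify testing and implementation, the outward error semantics of the ws-only path are fixed as follows:
### Browser websocket connect failure
- outward: `PipeError::Protocol("browser websocket connect failed: ...")`
### Browser returns a non-zero status code
- outward: `PipeError::Protocol("browser returned non-zero status: ...")`
### Callback timeout
- outward: `PipeError::Timeout`
- timeout source: reuse the current response timeout configuration of `WsBrowserBackend` / the ws service, 30 seconds by default
### Websocket closed normally by the peer, or reset
- outward: `PipeError::PipeClosed`
- imprecise wording such as "an equivalent error" is not allowed
### Error that this work must eliminate
- `invalid hmac seed: session key must not be empty`
If this error appears again on any ws service integration path, the implementation is considered incomplete.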
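The fixed outward semantics above can be sketched as a small set of mapping functions. This is an illustration under assumptions: the `PipeError` enum below is a local stand-in that models only the variants named in this design, and `map_connect_error`, `map_io_error`, and `check_status` are hypothetical helper names. The fallback message in `map_io_error` is also an assumption, since the design does not specify one for other I/O errors.

```rust
use std::io;

/// Stand-in for the repo's `PipeError`; only the variants named in this
/// design are modelled.
#[derive(Debug, PartialEq, Eq)]
pub enum PipeError {
    Protocol(String),
    Timeout,
    PipeClosed,
}

/// A failed connect always surfaces as a protocol error with a fixed prefix.
pub fn map_connect_error(err: &io::Error) -> PipeError {
    PipeError::Protocol(format!("browser websocket connect failed: {err}"))
}

/// Maps low-level I/O failures on an established connection.
pub fn map_io_error(err: &io::Error) -> PipeError {
    match err.kind() {
        io::ErrorKind::TimedOut | io::ErrorKind::WouldBlock => PipeError::Timeout,
        io::ErrorKind::ConnectionReset
        | io::ErrorKind::ConnectionAborted
        | io::ErrorKind::BrokenPipe
        | io::ErrorKind::UnexpectedEof => PipeError::PipeClosed,
        // Assumption: anything else is reported as a generic protocol error.
        _ => PipeError::Protocol(format!("browser websocket io error: {err}")),
    }
}

/// A non-zero status frame becomes a protocol error; zero is success.
pub fn check_status(status: i64) -> Result<(), PipeError> {
    if status == 0 {
        Ok(())
    } else {
        Err(PipeError::Protocol(format!("browser returned non-zero status: {status}")))
    }
}
```

Keeping the mapping in one place makes it straightforward for the deterministic tests to assert exact variants instead of matching on message text.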
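## Test Design
### Layered test strategy
To avoid depending on the non-deterministic behavior of the LLM/planner, tests for this work must be split into two layers, each asserting a different goal:
#### A. Backend / adapter layer tests (deterministic)
This layer does not go through `sg_claw_client` or real model planning; it verifies the ws-only technical behavior directly.
Goals:
1. the combination of `ServiceBrowserWsClient` and `WsBrowserBackend` can:
- send `Navigate`
- receive a `0` status
- read the callback text in callback scenarios
2. when the fake browser server actively closes or resets:
- assert at the `WsClient` / `WsBrowserBackend.invoke(...)` observation layer that the outward error is `PipeError::PipeClosed`
3. when the fake browser server does not return a callback:
- assert at the `WsBrowserBackend.invoke(...)` observation layer that the outward error is `PipeError::Timeout`
4. tests in this layer do not depend on the LLM, planner, or skills routing at all
Recommendations:
- add a focused ws service/backend test
- fix the input action as a direct code call such as `invoke(Action::Navigate, ...)` rather than a natural-language task
#### B. client -> service integration tests (chain verification)
This layer verifies that the ws-only wiring has replaced the empty-session-key path, but it does not carry fine-grained protocol semantics assertions.
Goals:
1. start a minimal natural-language task through the real `sg_claw_client -> sg_claw service` chain
2. the fake browser websocket server receives at least one text frame from the ws-only path
3. the client/service output no longer contains:
- `invalid hmac seed: session key must not be empty`
4. this layer proves only that:
- the ws service no longer goes through the pipe authentication path with an empty session key
- the real end-to-end chain can reach the browser websocket
This layer is not used to assert exact enum identity, nor to cover callback timeout / reset details.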
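### New red test 1: basic ws-only backend/adapter call works
Goals:
- do not use a natural-language task
- directly construct the `WsClient` + `WsBrowserBackend` used by the ws service
- call a fixed action: `Action::Navigate`, with the target url fixed to `https://www.zhihu.com/hot`
- the fake browser websocket server returns `0`
- assert:
- `invoke(...)` succeeds
- the first text frame received by the fake server can be interpreted as `Navigate` under `ws_protocol` semantics
### New red test 2: ws-only backend/adapter disconnect semantics are fixed
Goals:
- do not use a natural-language task
- the fake browser websocket server actively closes or resets after accepting the request
- assert at the `invoke(...)` observation layer:
- the outward error is fixed to `PipeError::PipeClosed`
### New red test 3: ws-only backend/adapter callback timeout semantics are fixed
Goals:
- do not use a natural-language task
- the fake browser websocket server returns `0` but no callback frame
- assert at the `invoke(...)` observation layer:
- the outward error is fixed to `PipeError::Timeout`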
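The three backend-level red tests above can be prototyped without sockets by scripting what the fake server returns. The sketch below is a simplified model, not the repository's API: `ScriptedWsClient`, `Scripted`, and this `invoke` function are hypothetical, `PipeError` is a local stand-in, and an exhausted script is treated as a server that never answers.

```rust
use std::collections::VecDeque;

/// Stand-in for the repo's `PipeError` (assumed variants).
#[derive(Debug, PartialEq, Eq)]
pub enum PipeError {
    Protocol(String),
    Timeout,
    PipeClosed,
}

/// One scripted outcome of a receive call on the fake server connection.
pub enum Scripted {
    Frame(String),
    Close,
}

/// Hypothetical fake client: records sent frames and replays a script.
pub struct ScriptedWsClient {
    pub sent: Vec<String>,
    script: VecDeque<Scripted>,
}

impl ScriptedWsClient {
    pub fn new(script: Vec<Scripted>) -> Self {
        Self { sent: Vec::new(), script: VecDeque::from(script) }
    }

    pub fn send_text(&mut self, frame: &str) {
        self.sent.push(frame.to_string());
    }

    pub fn recv_text_timeout(&mut self) -> Result<String, PipeError> {
        match self.script.pop_front() {
            Some(Scripted::Frame(text)) => Ok(text),
            Some(Scripted::Close) => Err(PipeError::PipeClosed),
            // An exhausted script models a server that stays silent.
            None => Err(PipeError::Timeout),
        }
    }
}

/// Simplified invoke: send, expect status "0", then optionally wait for a callback.
pub fn invoke(
    client: &mut ScriptedWsClient,
    request: &str,
    wants_callback: bool,
) -> Result<Option<String>, PipeError> {
    client.send_text(request);
    let status = client.recv_text_timeout()?;
    if status != "0" {
        return Err(PipeError::Protocol(format!(
            "browser returned non-zero status: {status}"
        )));
    }
    if wants_callback {
        client.recv_text_timeout().map(Some)
    } else {
        Ok(None)
    }
}
```

A real test would swap the scripted client for a fake websocket server, but the assertions on `PipeError::PipeClosed` and `PipeError::Timeout` would keep the same shape.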
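### New red test 4: the client->service chain no longer triggers the empty session key error
Goals:
- trigger a browser action through the real `sg_claw_client -> sg_claw service` chain
- catch the request with a fake browser websocket server
- fix the task input to: `打开知乎热榜并读取页面主区域文本` ("open the Zhihu hot list and read the text of the page's main area")
- assert that the client/service output no longer contains:
- `invalid hmac seed: session key must not be empty`
- assert that the fake browser server received at least one text frame
### Regression tests
The following must be re-run and keep passing:
#### pipe regression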
```bash
cargo test --test pipe_handshake_test -- --nocapture
```
If the implementation touches the upper-layer wiring of the browser tool, also run:
```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test -- --nocapture
```
#### ws regression
```bash
cargo test --test service_ws_session_test --test service_task_flow_test --test browser_ws_protocol_test --test browser_ws_backend_test -- --nocapture
```
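## Manual Acceptance
Use the real configuration and a real, already running sgBrowser:
1. start sgBrowser and make sure `browserWsUrl` is reachable
2. start `sg_claw`
3. run:
- `sg_claw_client`
4. send the minimal Zhihu task:
- `打开知乎热榜并读取页面主区域文本` (open the Zhihu hot list and read the text of the page's main area)
5. observe:
- `invalid hmac seed` no longer appears
- real browser action logs appear
- a single completion is returned
6. then run the old Zhihu skill:
- `读取知乎热榜数据,并导出 excel 文件` (read the Zhihu hot-list data and export an Excel file)
7. verify that the old Zhihu skill enters the real browser execution path
8. finally, confirm that the legacy pipe entry can still start (verification only; modifying the pipe implementation for this is not allowed)
## Risks
### Risk 1: the ws service is coupled too deeply to the shared runner interface
Mitigation:
- add an adapter only on the ws usage surface
- make no structural changes to the pipe main entry
### Risk 2: the pipe call chain is changed by mistake while adapting to the ws-native backend
Mitigation:
- all pipe regressions must be re-run after every round of changes
- the behavior of `src/lib.rs` must not change
### Risk 3: inline connection logic in the ws service duplicates the responsibilities of `WsBrowserBackend`
Mitigation:
- first remove the authentication blocker with minimal changes
- do not do large-scale cleanup in passing
## Acceptance Criteria
The work is complete only when all of the following hold:
1. the ws service path no longer depends on an empty session key
2. `invalid hmac seed: session key must not be empty` no longer appears
3. real browser websocket requests reach sgBrowser / the fake browser server
4. the old Zhihu skill at least enters the real browser action execution chain
5. zero regressions in pipe mode
6. all new and related tests pass
## Implementation Suggestions
Implement in this order:
1. first add the red test that locks in "ws no longer triggers invalid hmac seed"
2. then switch the ws service path to `WsBrowserBackend`
3. run the ws tests
4. run the pipe regressions
5. run the real minimal Zhihu task smoke test
6. then run the old Zhihu skill smoke test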


@@ -0,0 +1,276 @@
# WS Browser Bridge Path Design
## Background
The repository now has explicit live evidence that the real sgBrowser websocket endpoint at `ws://127.0.0.1:12345` is **reachable** but is **not validated as an external-control surface**.
The probe transcript in `docs/_tmp_sgbrowser_ws_probe_transcript.md` shows a stable outcome across the full bootstrap matrix:
- direct open-page frame
- `sgOpenAgent`
- `sgSetAuthInfo`
- `sgBrowserLogin`
- `sgBrowerserActiveTab`
- combined bootstrap attempts
- alternate `requesturl` values
Across all of those sequences, the endpoint behaved like this:
1. websocket connection succeeds
2. first inbound text frame is always the banner `Welcome! You are client #1`
3. no sequence produced a reproducible numeric status frame for a real business action
4. no sequence produced a reproducible callback frame for a real business action
5. follow-on business frames timed out or produced no further usable protocol traffic
That means the current project can no longer treat raw external websocket business frames as the default production integration surface.
## Why the raw websocket path is now considered non-validated
The decision is not based on a guess. It is based on both live evidence and repository evidence.
### Live evidence
`docs/_tmp_sgbrowser_ws_probe_transcript.md` proves that the real endpoint did **not** yield the one thing raw external control needs:
- a reproducible status/callback response for a real browser action
Because that never happened, the bootstrap hypothesis did not clear the acceptance bar.
### Repository evidence
The rest of the repository already points to a different product integration model.
#### 1. Historical frontend code uses browser-host bridge surfaces
In `frontend/archive/sgClaw验证-已归档/testRunner.js:15-26`:
- the runtime checks for `window.sgFunctionsUI`
- the runtime checks for `window.BrowserAction`
- the working path uses `window.sgFunctionsUI(action, params, callback)`
That is a host/browser bridge contract, not an external raw websocket RPC contract.
#### 2. Prior architecture docs make `CommandRouter` the execution entry
In `docs/superpowers/specs/2026-03-25-superrpa-sgclaw-browser-control-design.md:16-18` and `:36-50`:
- reuse SuperRPA `CommandRouter` as the browser execution entry
- keep browser-side hosting, security re-check, and dispatch in SuperRPA
- avoid building parallel browser automation APIs
That is directly incompatible with treating raw external websocket business frames as the primary control plane.
#### 3. Project planning docs describe FunctionsUI IPC as the supported frontend seam
In `docs/archive/项目管理与排期/协作时间表.md:419-430`:
- Vue/FunctionsUI calls browser-host methods such as `window.superrpa.sgclaw.start()` and `sendCommand(...)`
- browser host pushes callbacks such as `onStatusChange(...)` and `onLog(...)`
Again, this is a bridge and host IPC model.
#### 4. Floating-chat planning already preserves named bridge calls
In `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md:289-293`:
- `connect()` issues `sgclawConnect`
- `start()` issues `sgclawStart`
- `stop()` issues `sgclawStop`
- `submitTask()` issues `sgclawSubmitTask`
That design work assumes a named browser bridge, not direct raw websocket frames.
## Decision
**Authoritative browser integration surface: the browser-host bridge path, not the raw external sgBrowser websocket business-frame path.**
More concretely, sgClaw should target this chain:
```text
sgClaw runtime
-> existing browser-facing bridge contract
-> FunctionsUI / host IPC
-> BrowserAction / sgclaw host callbacks
-> existing SuperRPA CommandRouter dispatch
```
## Authoritative seams for future implementation
Because this repository does not contain the full SuperRPA browser host source tree, the bridge-first implementation must integrate at the **nearest validated seam available in this repo**, while staying aligned with the external browser-host contract already documented.
The future implementation must model **two different bridge layers** explicitly instead of mixing them together.
### Layer 1: session/lifecycle bridge contract
This layer is evidenced by the named calls already present in repo documentation:
- `sgclawConnect`
- `sgclawStart`
- `sgclawStop`
- `sgclawSubmitTask`
This layer manages session setup, task submission, and host/UI lifecycle behavior.
It is important evidence that a browser-host bridge exists, but it is **not** the per-browser-action contract that a new `BrowserBackend` implementation should target.
### Layer 2: browser-action execution contract
This is the authoritative target for the new browser backend.
It is evidenced by:
- `window.BrowserAction(...)` in archived frontend code
- `FunctionsUI` / host IPC integration in archived planning docs
- browser-side dispatch through `CommandRouter` in `docs/superpowers/specs/2026-03-25-superrpa-sgclaw-browser-control-design.md`
In this repository, the concrete boundary must be a **repo-local semantic transport seam** that can be implemented and tested without access to the external SuperRPA host code.
That seam should be a narrow Rust-side contract such as `BridgeActionTransport`:
- input: semantic browser action request (`navigate`, `click`, `getText`, etc.) plus params and expected domain
- output: semantic success/error reply that can be normalized back into `BrowserBackend` results
`BridgeBrowserBackend` should target **Layer 2 only**.
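
As a rough illustration of the Layer-2 seam described above, the sketch below shows one possible shape for `BridgeActionTransport`. Only the trait name and the action names (`navigate`, `click`, `getText`) come from this document; the request/reply types, the `send` method, and the `EchoTransport` test double (including its domain check) are hypothetical and would need to match the real repository types.

```rust
use std::collections::BTreeMap;

/// Semantic browser actions named in the design. The enum itself is a
/// hypothetical shape, not the repository's real type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BridgeAction {
    Navigate,
    Click,
    GetText,
}

/// One Layer-2 request: action, params, and the domain the caller expects.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BridgeActionRequest {
    pub action: BridgeAction,
    pub params: BTreeMap<String, String>,
    pub expected_domain: String,
}

/// Semantic reply that an adapter can normalize into `BrowserBackend` results.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BridgeActionReply {
    Success { data: Option<String> },
    Error { message: String },
}

/// The narrow repo-local seam: one semantic request in, one semantic reply out.
pub trait BridgeActionTransport {
    fn send(&self, request: &BridgeActionRequest) -> BridgeActionReply;
}

/// Deterministic stand-in for tests (assumption: illustrative only). It
/// rejects requests whose `url` param does not mention the expected domain
/// and reports success for everything else.
pub struct EchoTransport;

impl BridgeActionTransport for EchoTransport {
    fn send(&self, request: &BridgeActionRequest) -> BridgeActionReply {
        match request.params.get("url") {
            Some(url) if !url.contains(request.expected_domain.as_str()) => {
                BridgeActionReply::Error {
                    message: format!("domain mismatch for {}", url),
                }
            }
            _ => BridgeActionReply::Success { data: None },
        }
    }
}
```

Keeping the trait to a single method is one way to let fake transports stay trivial in tests while the real host integration remains outside this repository.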
### Explicit out-of-scope boundary
The following are outside this repository and therefore outside the immediate Rust implementation slice:
- actual SuperRPA C++ host/browser code
- actual `FunctionsUI` TypeScript host plumbing in the external browser repository
- actual `CommandRouter` implementation in the external browser repository
This repository should implement only:
- the Rust-side bridge contract types
- the Rust-side bridge transport/provider seam
- the Rust-side bridge-backed browser adapter
- deterministic tests against those seams
### What this means practically
The next implementation slice should **not** continue trying to make `WsBrowserBackend` drive the real browser endpoint directly.
Instead, the next implementation slice should introduce a **bridge-backed browser adapter** that:
- preserves the Rust-side `BrowserBackend` contract where practical
- translates browser actions onto the Layer-2 semantic bridge surface
- keeps lifecycle/session bridge calls separate from per-action browser execution
- leaves the raw websocket probe code as diagnostic infrastructure only
## Chosen architecture
Use a bridge-backed adapter design.
### Target shape
```text
compat/runtime/orchestration
-> Arc<dyn BrowserBackend>
-> BridgeBrowserBackend (new)
-> BridgeActionTransport (new repo-local seam)
-> external browser-host bridge / FunctionsUI IPC
-> BrowserAction / CommandRouter path
```
### Why this shape
- It preserves the already-useful Rust-side browser abstraction (`BrowserBackend`) instead of re-plumbing the entire runtime.
- It keeps raw websocket probing available for diagnostics without letting it dictate production architecture.
- It matches the architecture already documented for SuperRPA integration.
- It keeps future work narrow: one new adapter layer instead of rewriting all runtime behavior.
## What stays the same
### Pipe path remains unchanged
The existing pipe path must remain behaviorally unchanged:
- `src/lib.rs`
- pipe handshake behavior
- `BrowserPipeTool`
- existing HMAC/domain validation semantics
The bridge-first work is about the **ws service / real browser integration path**, not about replacing or weakening the pipe path.
### Existing compat/runtime abstractions should be preserved where practical
The next slice should reuse:
- `BrowserBackend`
- existing browser tool adapters in compat/runtime
- existing task runner/orchestration flow
The new work should be concentrated in a bridge adapter and its wiring, not spread through unrelated layers.
## What does not stay the same
### Raw websocket is no longer the mainline production assumption
The repository may keep:
- `src/browser/ws_backend.rs`
- `src/browser/ws_protocol.rs`
- `src/browser/ws_probe.rs`
- `src/bin/sgbrowser_ws_probe.rs`
But those should now be treated as:
- protocol tooling
- fake-server test tooling
- live diagnostic/probe tooling
- possibly constrained compatibility code
They should remain diagnostic-only in this repository and must not be treated as the production path for reaching the real browser.
## Design constraints for the bridge slice
The bridge-path implementation must follow these constraints:
1. **No parallel browser API invention.** Reuse the real bridge/browser action surface already evidenced in docs and archived frontend code.
2. **No pipe regression.** Do not alter the working pipe entry path.
3. **Adapter-first design.** Prefer one bridge-backed backend implementation over broad runtime rewrites.
4. **TDD first.** Add focused bridge adapter tests before production wiring.
5. **Repository-local seam only.** Where external SuperRPA browser-host code is unavailable here, encode the contract in narrow adapters and tests instead of guessing internals.
## Testing implications
The bridge path changes what “proof” looks like.
### Required proof for the next slice
The next implementation slice must prove:
- a browser action can be emitted onto the bridge contract deterministically
- the bridge adapter maps replies/errors back into `BrowserBackend` semantics
- compat/runtime can use the bridge-backed backend without pipe regression
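
The first two proof points could be exercised with a recording transport, roughly as sketched below. Everything here is an assumption made for illustration: the simplified `BrowserBackend` trait, `BridgeRequest`, `BridgeReply`, `BackendError`, and `RecordingTransport` are hypothetical stand-ins, not the repository's actual definitions.

```rust
use std::cell::RefCell;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BridgeRequest {
    pub action: String,
    pub target: String,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BridgeReply {
    Ok(String),
    Err(String),
}

pub trait BridgeActionTransport {
    fn send(&self, request: BridgeRequest) -> BridgeReply;
}

#[derive(Debug, PartialEq, Eq)]
pub enum BackendError {
    Bridge(String),
}

/// Hypothetical, simplified stand-in for the repo's `BrowserBackend` contract.
pub trait BrowserBackend {
    fn invoke(&self, action: &str, target: &str) -> Result<String, BackendError>;
}

/// Adapter: translates backend calls into bridge requests and maps replies back.
pub struct BridgeBrowserBackend<T: BridgeActionTransport> {
    transport: T,
}

impl<T: BridgeActionTransport> BridgeBrowserBackend<T> {
    pub fn new(transport: T) -> Self {
        Self { transport }
    }

    pub fn transport(&self) -> &T {
        &self.transport
    }
}

impl<T: BridgeActionTransport> BrowserBackend for BridgeBrowserBackend<T> {
    fn invoke(&self, action: &str, target: &str) -> Result<String, BackendError> {
        let request = BridgeRequest {
            action: action.to_string(),
            target: target.to_string(),
        };
        match self.transport.send(request) {
            BridgeReply::Ok(data) => Ok(data),
            BridgeReply::Err(message) => Err(BackendError::Bridge(message)),
        }
    }
}

/// Records every request and answers from a fixed reply script.
pub struct RecordingTransport {
    pub sent: RefCell<Vec<BridgeRequest>>,
    replies: RefCell<Vec<BridgeReply>>,
}

impl RecordingTransport {
    pub fn with_replies(mut replies: Vec<BridgeReply>) -> Self {
        replies.reverse(); // pop() then yields replies in script order
        Self {
            sent: RefCell::new(Vec::new()),
            replies: RefCell::new(replies),
        }
    }
}

impl BridgeActionTransport for RecordingTransport {
    fn send(&self, request: BridgeRequest) -> BridgeReply {
        self.sent.borrow_mut().push(request);
        self.replies
            .borrow_mut()
            .pop()
            .unwrap_or_else(|| BridgeReply::Err("script exhausted".to_string()))
    }
}
```

A test can then assert both the emitted requests (via `sent`) and the mapped `Result` values without touching any live browser.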
### No longer required for acceptance
The next slice does **not** need to prove that raw websocket business frames work directly against `ws://127.0.0.1:12345`, because the current evidence rejected that path as the mainline assumption.
## Acceptance criteria for this design decision
This design is correct only if future implementation follows all of these:
1. The next production slice targets the browser-host bridge path rather than raw external websocket business frames.
2. The raw websocket probe tooling remains diagnostic only.
3. Existing pipe behavior stays unchanged.
4. The next implementation plan identifies a narrow bridge-backed adapter, not a broad architecture rewrite.
5. Future success claims are based on bridge-path execution evidence, not on reinterpreting the existing raw-websocket transcript.
## Consequences
### Positive
- Aligns implementation with the strongest evidence already in the repo
- Stops further speculative coding on the wrong control surface
- Preserves existing ws probe work as useful diagnostics
- Keeps the next slice narrow and testable
### Trade-off
- Requires an additional adapter design step before more production code can land
- Defers any hope that a small websocket tweak alone will unlock the real browser path
That trade-off is correct, because the current blocker is no longer a small protocol bug. It is an integration-surface mismatch.


@@ -0,0 +1,288 @@
# WS Browser Integration Surface Correction Design
## Background
The current websocket service path already proved two things:
1. `sg_claw_client -> sg_claw` request handling works.
2. The ws-native backend/auth replacement removed the old pipe/HMAC mismatch that produced `invalid hmac seed: session key must not be empty`.
However, real sgBrowser smoke still does not work.
Manual probing against the configured real browser websocket endpoint (`ws://127.0.0.1:12345`) produced a stable pattern:
- the connection succeeds
- the server sends one banner text frame such as `Welcome! You are client #1`
- after that, business frames receive no status frame and no callback frame
- this remains true for:
  - valid-looking `sgBrowerserOpenPage` frames
  - callback-based APIs
  - no-arg/context-light APIs
  - malformed or obviously wrong frames
At the same time, local documentation and archived frontend code point to a different integration model:
- the websocket API doc describes the websocket service as a transport replacement for page-context JavaScript calls, and requires the current page URL (`requesturl`) in each message
- archived frontend/product code uses `window.sgFunctionsUI(...)` and `window.BrowserAction(...)`
- archived architecture docs describe the supported product path as `FunctionsUI -> browser host bridge -> BrowserAction/CommandRouter`, not an arbitrary external process speaking raw browser websocket frames
This means the current assumption is no longer acceptable as the default architecture hypothesis:
- **Rejected default assumption:** `sg_claw` can directly control the real browser by speaking raw business frames to `browserWsUrl` as an external client, with no additional browser-host bridge, page context, or bootstrap/session contract.
That assumption may still turn out to be partially true, but it is no longer justified enough to continue coding against as the mainline design.
## Problem Statement
The project currently has a functioning ws-native transport implementation, but it does **not** have a validated real integration surface for sgBrowser.
The unresolved question is now architectural rather than syntactic:
### Possibility A: raw websocket is valid, but requires hidden bootstrap/preconditions
Examples suggested by the local API document:
- a real browser page must already exist and `requesturl` must refer to that page
- one or more setup calls such as `sgSetAuthInfo`, `sgBrowserLogin`, `sgOpenAgent`, or `sgBrowerserActiveTab` must happen first
- callbacks may require a browser-side JS/page context that an external process does not automatically have
- some APIs may only work against agent/show/hide areas after browser-side initialization
### Possibility B: raw websocket is not the supported external control surface
Instead, the real product path may require:
- `FunctionsUI` / browser-host IPC
- host-side security and routing
- `BrowserAction` / `CommandRouter` dispatch
- page-injected or browser-embedded execution context
If this is true, continuing to invest in raw external websocket business-frame handling as the main integration surface would be architectural drift.
## Goal
Replace the current unvalidated ws-native-direct assumption with a decision-backed integration strategy.
The next implementation slice must do exactly one of these two things based on evidence:
1. **Bootstrap path:** prove that raw websocket control is real and supported once the missing bootstrap/precondition sequence is performed, then codify that bootstrap sequence and keep `WsBrowserBackend` as the execution surface.
2. **Bridge path:** prove that raw websocket is not the real supported surface for external control, then pivot the runtime design so sgClaw targets the actual browser-host bridge / `BrowserAction` surface instead of pretending the raw websocket is enough.
## Non-goals
This correction slice does **not** include:
- broad feature work on the floating chat UI
- multi-client service redesign
- browser process lifecycle management
- speculative protocol expansion
- generic reconnection/backoff work
- rewriting the entire compat/runtime stack without evidence
- landing both bootstrap and bridge implementations in one branch
The purpose of this slice is to choose the correct integration surface first.
## Evidence Summary
### Evidence that the current raw-ws-direct assumption is weak
1. Real endpoint accepts connections but stays silent after the welcome/banner frame.
2. Silence occurs even for malformed frames, which suggests the endpoint is not acting like an openly documented RPC surface for arbitrary external clients.
3. The API documentation frames websocket use as a replacement for page-side JS invocation, not as a standalone public automation API.
4. The documentation repeatedly depends on `requesturl`, callback function names, target pages, and browser areas (`show`, `hide`, `agent`).
5. Historical frontend/product code uses `window.sgFunctionsUI(...)` and `window.BrowserAction(...)`, not raw external websocket business calls.
6. Historical architecture docs emphasize `FunctionsUI`, `CommandRouter`, and browser-host bridge seams.
### Evidence that the current ws-native work is still useful
1. The ws-native auth replacement removed a real bug.
2. The ws backend now correctly carries forward the last navigated request URL.
3. `WsBrowserBackend` and `ws_protocol` remain valuable as deterministic protocol tooling for fake-server tests and any future bootstrap validation.
So the conclusion is **not** “delete ws-native work.”
The conclusion is:
- do not treat raw external websocket control as validated product architecture yet
- use the ws-native code only behind a decision gate
## Design Decision
Adopt a **decision-gated integration strategy**.
### Decision Gate 1: Validate bootstrap viability first
Before any more production architecture changes, add a focused, deterministic validation harness that can exercise a candidate raw-websocket bootstrap sequence against a live endpoint.
The harness must support:
- ordered frame scripts
- exact frame logging
- exact timeout/silence observation
- trying candidate setup sequences such as:
  - `sgSetAuthInfo`
  - `sgBrowserLogin`
  - `sgOpenAgent`
  - `sgBrowerserActiveTab`
  - then a minimal action such as `sgBrowerserOpenPage` or `sgBrowserExcuteJsCodeByArea`
- trying the same action with different `requesturl` assumptions
- distinguishing these outcomes:
  - numeric status returned
  - callback returned
  - welcome only, then silence
  - close/reset
  - protocol error
This harness is not product code. It is an evidence tool that prevents blind implementation.
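
The outcome categories listed above could be modeled along these lines. This is a minimal sketch under stated assumptions: `ProbeEvent`, `ProbeOutcome`, and `classify` are hypothetical names, the callback check (`starts_with('[')`) is only a rough heuristic for the JSON-array callback shape, and mapping "silence without any welcome frame" to a protocol error is a choice made here, not something the document specifies.

```rust
/// What the probe observed on the socket, in arrival order.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ProbeEvent {
    Text(String),
    Timeout,
    Closed,
}

/// The five outcomes the harness must distinguish.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ProbeOutcome {
    NumericStatus(i64),
    Callback(String),
    WelcomeOnlyThenSilence,
    CloseOrReset,
    ProtocolError(String),
}

const WELCOME_PREFIX: &str = "Welcome! You are client #";

/// Classifies one probe run from its recorded events.
pub fn classify(events: &[ProbeEvent]) -> ProbeOutcome {
    let mut saw_welcome = false;
    for event in events {
        match event {
            ProbeEvent::Text(text) if text.starts_with(WELCOME_PREFIX) => saw_welcome = true,
            ProbeEvent::Text(text) => {
                let trimmed = text.trim();
                if let Ok(code) = trimmed.parse::<i64>() {
                    return ProbeOutcome::NumericStatus(code);
                }
                // Assumption: callback frames are JSON arrays.
                if trimmed.starts_with('[') {
                    return ProbeOutcome::Callback(trimmed.to_string());
                }
                return ProbeOutcome::ProtocolError(trimmed.to_string());
            }
            ProbeEvent::Closed => return ProbeOutcome::CloseOrReset,
            ProbeEvent::Timeout => break,
        }
    }
    if saw_welcome {
        ProbeOutcome::WelcomeOnlyThenSilence
    } else {
        // Assumption: no frames at all is reported as a protocol error.
        ProbeOutcome::ProtocolError("no frames before timeout".to_string())
    }
}
```

Classifying from a recorded event list, rather than inside the socket loop, keeps the decision logic testable against a fake server.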
### Decision Gate 2: Make bridge pivot the default fallback
If the validation harness cannot demonstrate a reproducible bootstrap sequence that yields real status/callback frames from the live browser endpoint, then raw websocket must be considered **non-validated for external control**.
At that point, the design must pivot to the bridge path:
- sgClaw browser control targets the real browser-host integration surface
- use the bridge already evidenced in docs/code (`FunctionsUI`, browser host IPC, `BrowserAction`, `CommandRouter`)
- keep raw websocket support, if retained at all, as a diagnostic or highly constrained adapter rather than the primary product path
## Architecture Options
## Option A: Bootstrap-validated raw websocket path
Choose this only if the live validation harness produces repeatable evidence.
### Resulting architecture
```text
sg_claw_client
-> sg_claw service
-> bootstrap sequence executor
-> WsBrowserBackend
-> browserWsUrl
-> sgBrowser
```
### Required conditions
- a reproducible bootstrap sequence exists
- the sequence yields status/callback traffic for real business actions
- the sequence can be encoded as a narrow service-side precondition layer
- the sequence does not require unowned browser UI/manual setup outside a documented contract
### Allowed production changes if Option A wins
- add explicit bootstrap calls before first browser action
- persist validated session/context state needed by the real endpoint
- tighten `request_url` / target-page handling around the proven contract
### Not allowed even if Option A wins
- guessing bootstrap steps without evidence
- silently sprinkling many setup calls into random locations
- broadening the compat/runtime API before the bootstrap contract is known
## Option B: Bridge-first integration path
Choose this if live validation does not prove a workable raw websocket bootstrap.
### Resulting architecture
```text
sg_claw_client
-> sg_claw service
-> bridge adapter
-> browser host / FunctionsUI / BrowserAction / CommandRouter
-> sgBrowser page actions
```
### Required conditions
- local docs/code show a stable supported bridge path
- raw websocket remains non-validated or only page-context-scoped
- the bridge surface can be wrapped behind the existing `BrowserBackend` abstraction or a sibling adapter without weakening pipe behavior
### Allowed production changes if Option B wins
- add a new browser backend implementation that targets the real bridge surface
- redirect ws service/browser execution away from raw business frames
- preserve ws-native code only for tests, probes, or intentionally constrained cases
### Not allowed even if Option B wins
- pretending the old raw-ws mainline still works “well enough”
- leaving the service path ambiguously split between two competing primary surfaces
## Scope Guardrails for the Next Implementation Plan
The next implementation plan must obey these guardrails:
1. **One branch, one decision.** Do not implement both architecture options at once.
2. **Evidence before code.** If bootstrap is unproven, the next coding task is probe/validation tooling, not another speculative service/runtime refactor.
3. **Keep pipe untouched.** `src/lib.rs`, pipe handshake, and the pipe `BrowserPipeTool` path remain behaviorally unchanged.
4. **Do not delete ws-native code prematurely.** It still has value for protocol tests and validation tooling.
5. **Do not broaden success claims.** Removing `invalid hmac seed` did not make real browser control work.
## Testing Strategy
### Stage 1: Evidence tooling tests
Add deterministic tests for the live-probe/validation harness so it can:
- send an ordered frame script
- record exact received frames
- report silence/timeout precisely
- expose transcript output suitable for comparing candidate bootstrap sequences
These tests use a fake websocket server, not sgBrowser.
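
One way to make transcripts comparable across candidate sequences is a small recorder with a stable text rendering. The sketch below is illustrative only: `Transcript`, `TranscriptEntry`, and the `->` / `<-` / `!!` line prefixes are hypothetical choices, not an existing format in this repository.

```rust
use std::time::Duration;

/// One observed step in a probe run.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TranscriptEntry {
    Sent(String),
    Received(String),
    Silence(Duration),
}

/// Ordered record of everything sent, received, or timed out.
#[derive(Debug, Default)]
pub struct Transcript {
    entries: Vec<TranscriptEntry>,
}

impl Transcript {
    pub fn record(&mut self, entry: TranscriptEntry) {
        self.entries.push(entry);
    }

    /// Renders one line per entry so two runs can be diffed as plain text.
    pub fn render(&self) -> String {
        self.entries
            .iter()
            .map(|entry| match entry {
                TranscriptEntry::Sent(frame) => format!("-> {}", frame),
                TranscriptEntry::Received(frame) => format!("<- {}", frame),
                TranscriptEntry::Silence(waited) => {
                    format!("!! silence after {} ms", waited.as_millis())
                }
            })
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```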
### Stage 2: Live validation runs
Use the harness against the real endpoint with a fixed matrix of candidate sequences.
At minimum, compare:
1. no bootstrap -> minimal action
2. `sgOpenAgent` -> minimal action
3. `sgSetAuthInfo` -> minimal action
4. `sgBrowserLogin` -> minimal action
5. `sgBrowerserActiveTab` -> minimal action
6. combined documented bootstrap candidates -> minimal action
7. alternate `requesturl` values representing:
   - `about:blank`
   - target page URL
   - a currently open page URL if known
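
The fixed matrix above could be generated rather than hand-written, so every run covers the same cases. The sketch below assumes `sgBrowerserOpenPage` as the minimal action and an arbitrary order for the combined bootstrap candidate; `ProbeCase`, `bootstrap_candidates`, and `build_matrix` are hypothetical names.

```rust
/// Minimal action sent after each bootstrap candidate (assumption: open-page).
pub const MINIMAL_ACTION: &str = "sgBrowerserOpenPage";

/// Candidate bootstrap sequences 1-6 from the list above.
/// The order inside the combined candidate is a guess, not a documented contract.
pub fn bootstrap_candidates() -> Vec<Vec<&'static str>> {
    vec![
        vec![],
        vec!["sgOpenAgent"],
        vec!["sgSetAuthInfo"],
        vec!["sgBrowserLogin"],
        vec!["sgBrowerserActiveTab"],
        vec!["sgSetAuthInfo", "sgBrowserLogin", "sgOpenAgent", "sgBrowerserActiveTab"],
    ]
}

/// One live-probe case: a `requesturl` plus the ordered actions to send.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ProbeCase {
    pub request_url: String,
    pub actions: Vec<&'static str>,
}

/// Crosses every bootstrap candidate with every `requesturl` assumption.
pub fn build_matrix(request_urls: &[&str]) -> Vec<ProbeCase> {
    let mut cases = Vec::new();
    for url in request_urls {
        for bootstrap in bootstrap_candidates() {
            let mut actions = bootstrap;
            actions.push(MINIMAL_ACTION);
            cases.push(ProbeCase {
                request_url: (*url).to_string(),
                actions,
            });
        }
    }
    cases
}
```

With two `requesturl` values this yields twelve cases, which keeps a live validation run small enough to review transcript by transcript.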
### Stage 3: Architecture-branch acceptance
If Option A wins:
- add one automated regression that proves the validated bootstrap sequence produces the first real status frame in a controlled integration test
- then continue with the narrowest production implementation plan
If Option B wins:
- write a new bridge-integration implementation plan before changing production code
- base all production tasks on the documented bridge surface
## Acceptance Criteria for This Design Correction
This design correction is successful only if future work follows these rules:
1. The repository has an explicit design document recording that raw ws-native direct control is **not currently validated**.
2. The next engineering slice starts with validation or bridge selection, not another speculative runtime refactor.
3. Any future claim that raw websocket is the supported production path must be backed by a reproducible live bootstrap transcript.
4. If that evidence does not appear, the project pivots to the bridge path rather than continuing to guess.
## Consequences
### Positive
- stops further speculative coding against an unproven surface
- preserves useful ws-native work without over-committing to it
- creates a clean decision point for the next implementation branch
### Trade-off
- this does not immediately unblock real browser control
- it intentionally inserts an evidence phase before more production changes
That trade-off is acceptable because the current failure mode is architectural uncertainty, not a missing two-line fix.


@@ -0,0 +1,105 @@
# WS Browser Welcome Frame Compatibility Design
## Background
Manual smoke verification after the ws-native browser backend auth replacement showed that real `sgBrowser` sends a banner text frame immediately after the websocket connection is established:
- `Welcome! You are client #1`
The current ws-native path treats the first received text frame as a protocol status frame. In `src/browser/ws_backend.rs`, `WsBrowserBackend::invoke(...)` reads one text frame and immediately parses it as an integer status code. That works for the existing deterministic tests, but it fails against the real browser because the first frame is a human-readable welcome banner rather than `0` or another numeric status.
This means the auth replacement is working — the old `invalid hmac seed: session key must not be empty` error no longer appears — but real smoke still fails on protocol parsing.
## Goal
Make the ws service path tolerate exactly one initial welcome/banner text frame from the real browser websocket, without weakening the general ws protocol semantics.
## Non-goals
This change must not:
- Relax parsing of arbitrary non-protocol text frames
- Change `WsBrowserBackend` into a browser-specific parser for banners
- Affect the legacy pipe path
- Add retry loops or broader reconnection logic
- Change callback handling semantics
## Chosen approach
Handle the welcome banner only in `ServiceBrowserWsClient`.
### Why this layer
`ServiceBrowserWsClient` is already the real-browser adapter used only by the ws service path in `src/service/server.rs`. The welcome frame is a quirk of the real browser endpoint rather than a property of the shared ws protocol abstraction. Keeping the compatibility behavior in the service-side client preserves the stricter semantics of `WsBrowserBackend` for all other callers and test doubles.
## Behavioral rules
1. Only the first received text frame after establishing a browser websocket connection may be treated as a welcome/banner candidate.
2. If that first text frame matches the real banner shape (currently observed as `Welcome! You are client #1`), the client discards it and continues waiting for the actual protocol frame.
3. The welcome skip is one-time only per connection, not per request. Because `ServiceBrowserWsClient` holds a persistent socket, this state must survive multiple `invoke(...)` calls on the same underlying websocket.
4. After the welcome skip:
   - status frames must still be numeric strings
   - callback frames must still match the existing JSON-array callback protocol
   - any other malformed frame remains a protocol error
5. Timeout, close/reset, and connect-failure semantics remain unchanged.
## Matching strategy
Use a narrow string check in `ServiceBrowserWsClient` for a welcome/banner frame:
- starts with `Welcome! You are client #`
This is intentionally strict. We are adapting one known real-browser behavior, not introducing a generic “ignore garbage text” mode.
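
A minimal sketch of the one-time skip, assuming a small per-connection state object. `WelcomeGate` and `parse_status` are hypothetical names; only the prefix string and the `invalid browser status frame` error text come from this document.

```rust
const WELCOME_PREFIX: &str = "Welcome! You are client #";

/// Per-connection state: lives as long as the underlying websocket,
/// so it survives multiple `invoke(...)` calls on the same socket.
pub struct WelcomeGate {
    first_text_frame_seen: bool,
}

impl WelcomeGate {
    pub fn new() -> Self {
        Self { first_text_frame_seen: false }
    }

    /// Returns `None` only when `frame` is the first text frame on the
    /// connection and matches the welcome prefix. Every other frame is
    /// passed through unchanged for normal protocol parsing.
    pub fn filter(&mut self, frame: String) -> Option<String> {
        let is_first = !self.first_text_frame_seen;
        self.first_text_frame_seen = true;
        if is_first && frame.starts_with(WELCOME_PREFIX) {
            None
        } else {
            Some(frame)
        }
    }
}

/// Strict status parsing stays unchanged: anything non-numeric is an error.
pub fn parse_status(frame: &str) -> Result<i64, String> {
    frame
        .trim()
        .parse::<i64>()
        .map_err(|_| format!("invalid browser status frame: {}", frame))
}
```

Because the gate flips on the first text frame regardless of content, a later banner-shaped frame is not skipped and still fails in `parse_status`, which matches the non-goal of not ignoring arbitrary text.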
## Tests
### New red tests
Add focused unit tests under `src/service/server.rs` tests:
1. Positive case:
   - fake websocket server sends:
     1. `Welcome! You are client #1`
     2. `0`
   - then `WsBrowserBackend.invoke(Action::Navigate, ...)` succeeds
2. Negative case:
   - fake websocket server sends a different first text frame that does **not** match the known welcome prefix
   - assert the call still fails as a protocol error rather than silently skipping the frame
The positive test must fail before the implementation change and pass after it. The negative test guards the non-goal that we are not introducing a generic “ignore arbitrary text” mode.
### Regression coverage
Re-run:
- `cargo test service::server::tests -- --nocapture`
- `cargo test --test browser_ws_backend_test -- --nocapture`
- `cargo test --test service_task_flow_test -- --nocapture`
If those pass, re-run the earlier mixed ws+pipe sweep to confirm no unexpected regression escaped the targeted checks.
## Risks and controls
### Risk: swallowing a legitimate protocol error
Control:
- only allow the one-time skip on the first received text frame
- only skip frames matching the known welcome prefix
### Risk: broadening behavior beyond service ws path
Control:
- keep the change entirely inside `ServiceBrowserWsClient`
- do not modify `WsBrowserBackend` parsing rules
## Acceptance criteria
The fix is complete only if all of the following are true:
1. The positive welcome-banner test fails before the change and passes after it.
2. The negative malformed-first-frame test proves that non-matching first text frames still fail as protocol errors.
3. Real ws service smoke no longer fails with `invalid browser status frame: Welcome! You are client #1` when using the configured real sgBrowser endpoint.
4. Existing ws backend tests remain green.
5. Existing service task-flow regression remains green.
6. Pipe behavior remains unchanged, verified by the mixed ws+pipe regression suite.


@@ -0,0 +1,182 @@
# Zhihu WS Submit Realignment Design
## Background
The current Zhihu submit path drifted away from the documented browser websocket contract.
The authoritative contract for this repository is `docs/_tmp_sgbrowser_ws_api_doc.txt`.
For this slice, the spec anchors to these documented invariants only:
- connect to `ws://127.0.0.1:12345`
- send `{"type":"register","role":"web"}`
- send browser actions as JSON arrays `[requesturl, action, ...args]`
- let browser results come back through documented callback semantics such as `callBackJsToCpp(...)`
- keep the current page URL as the request owner instead of inventing an external helper page
The current production path does not follow that shape for Zhihu routes.
Instead, the submit path selects `BrowserCallbackBackend`, which starts `LiveBrowserCallbackHost` and attempts to bootstrap a local helper page at `/sgclaw/browser-helper.html`. That helper-page bootstrap is not part of the user's confirmed production model, and live evidence already shows it is the wrong assumption for the Release browser.
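
For reference, the documented frame shapes from the invariants above (the register message and `[requesturl, action, ...args]` action arrays) can be sketched as below. The helper names are hypothetical, and treating every argument as a JSON string is an assumption; the real API document may allow other argument types.

```rust
/// Escapes a string as a JSON string literal, quotes included.
fn json_string(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// The documented registration message.
pub fn register_frame() -> &'static str {
    r#"{"type":"register","role":"web"}"#
}

/// Builds `[requesturl, action, ...args]` as a JSON array of strings.
pub fn action_frame(request_url: &str, action: &str, args: &[&str]) -> String {
    let mut parts = vec![json_string(request_url), json_string(action)];
    parts.extend(args.iter().map(|arg| json_string(arg)));
    format!("[{}]", parts.join(","))
}
```

The key point is that the first array element is always the real page URL, never a helper-page URL.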
## Problem Statement
Zhihu submit currently fails before real work begins because the service path depends on a helper-page callback host bootstrap that the Release browser does not use.
That drift shows up in three ways:
1. Zhihu submit routes select the callback-host backend instead of the direct websocket backend.
2. The mainline request URL becomes the local helper page URL instead of the real browser page URL.
3. The submit path waits for helper-page readiness rather than proceeding through the documented websocket callback model.
This causes the observable failure:
- `timeout while waiting for browser message`
- no real Zhihu page open/action in the browser
## Goal
Realign the Zhihu submit path to the documented websocket callback model without changing the existing pipe/service contract.
Concretely, the target behavior is:
- Zhihu submit routes use the websocket browser backend directly
- browser messages keep the real page URL as `requesturl`
- browser actions continue to use documented websocket opcodes
- callback-bearing results continue to use the documented callback payload model
- the browser no longer depends on opening a local helper page before Zhihu work starts
## Non-goals
This slice does not include:
- changing `ClientMessage` or `ServiceMessage`
- changing `run_submit_task_with_browser_backend(...)`
- rewriting the Zhihu workflow itself
- adding a new browser bridge abstraction
- redesigning the pipe path
- deleting callback-host code that is outside the Zhihu submit mainline
- speculative protocol expansion beyond the documented websocket contract
## Chosen Approach
Choose **Option A**: withdraw Zhihu submit from the helper-page callback-host path and return it to the documented websocket callback model.
Rejected alternatives:
- Keep callback host but remove helper bootstrap: still preserves the wrong abstraction in the mainline.
- Build a new orchestration layer: exceeds the requested scope.
## Mainline Architecture After Realignment
```text
sg_claw_client
-> sg_claw service / runtime submit path
-> existing BrowserBackend seam
-> WsBrowserBackend
-> ws://127.0.0.1:12345
-> documented browser opcodes and callback semantics
```
For Zhihu submit routes, the callback-host helper page is no longer part of the mainline execution chain.
## Required Production Changes
### 1. Route selection
Update submit-route backend selection so these routes no longer instantiate `BrowserCallbackBackend`:
- `WorkflowRoute::ZhihuHotlistExportXlsx`
- `WorkflowRoute::ZhihuHotlistScreen`
- `WorkflowRoute::ZhihuArticleEntry`
- `WorkflowRoute::ZhihuArticleDraft`
- `WorkflowRoute::ZhihuArticlePublish`
The change applies in both:
- service submit path in `src/service/server.rs`
- direct runtime submit path in `src/agent/mod.rs`
Direct runtime fallback behavior stays unchanged when no browser websocket URL is configured:
- if a real browser websocket URL is configured, use `WsBrowserBackend` for the listed Zhihu routes
- if no browser websocket URL is configured, keep the existing pipe fallback instead of failing fast
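
The selection rule for the direct runtime path might look roughly like this. The five Zhihu variants are taken from the list above; the `Other` variant, `BackendChoice`, and `select_backend` are hypothetical, and the real `WorkflowRoute` enum almost certainly has more variants.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkflowRoute {
    ZhihuHotlistExportXlsx,
    ZhihuHotlistScreen,
    ZhihuArticleEntry,
    ZhihuArticleDraft,
    ZhihuArticlePublish,
    /// Hypothetical placeholder for every non-Zhihu route.
    Other,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BackendChoice {
    /// Use `WsBrowserBackend` against the configured browser websocket.
    Ws { browser_ws_url: String },
    /// Keep the existing pipe fallback instead of failing fast.
    PipeFallback,
    /// Routes outside this slice keep whatever they selected before.
    Unchanged,
}

pub fn select_backend(route: WorkflowRoute, browser_ws_url: Option<&str>) -> BackendChoice {
    use WorkflowRoute::*;
    match route {
        ZhihuHotlistExportXlsx
        | ZhihuHotlistScreen
        | ZhihuArticleEntry
        | ZhihuArticleDraft
        | ZhihuArticlePublish => match browser_ws_url {
            // Assumption: a blank URL counts as "not configured".
            Some(url) if !url.trim().is_empty() => BackendChoice::Ws {
                browser_ws_url: url.to_string(),
            },
            _ => BackendChoice::PipeFallback,
        },
        Other => BackendChoice::Unchanged,
    }
}
```

Note that no branch returns a callback-host choice for the Zhihu routes, which is the whole point of the change.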
### 2. Request URL ownership
Keep `requesturl` aligned with the real browser page instead of the helper page.
Expected behavior:
- initial request URL comes from the existing submit-path request context
- after a successful navigate call, the websocket backend continues to update its request URL to the navigated target page
- later `getText` and `eval` calls run against the real Zhihu page URL
This preserves the documented page-owned websocket model.
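
The ownership rule can be pictured as a tiny state holder. This is a sketch only: `RequestUrlTracker` is a hypothetical name, and treating status `0` as the success value is an assumption based on how status frames are described elsewhere in these documents.

```rust
/// Tracks which page URL owns the next websocket request.
pub struct RequestUrlTracker {
    current: String,
}

impl RequestUrlTracker {
    /// Starts from the request URL supplied by the submit-path context.
    pub fn new(initial: &str) -> Self {
        Self { current: initial.to_string() }
    }

    /// URL to place in the `requesturl` slot of the next frame.
    pub fn current(&self) -> &str {
        &self.current
    }

    /// After a navigate call, adopt the target URL only if the call succeeded.
    /// Assumption: status `0` means success.
    pub fn on_navigate_result(&mut self, target_url: &str, status: i64) {
        if status == 0 {
            self.current = target_url.to_string();
        }
    }
}
```

A failed navigate leaves the previous owner in place, so later `getText` or `eval` calls never point at a page that was not actually loaded.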
### 3. Callback semantics
Keep callback-bearing actions on the existing websocket protocol path, using the documented callback payload shape.
Required invariants:
- action frames remain `[requesturl, action, ...args]`
- navigate uses the documented opcode `sgHideBrowserCallAfterLoaded`
- `getText` and `eval` continue to emit `callBackJsToCpp(...)` payloads in the documented `sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt` form
- callback decoding remains on the websocket path instead of moving through localhost helper-page HTTP endpoints
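
Decoding the `@_@`-delimited payload named above could be sketched as follows. The field order follows the documented form; `CallbackPayload` and `parse_callback_payload` are hypothetical names, and allowing the delimiter to appear inside the response text is an assumption.

```rust
/// Decoded `sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt` payload.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CallbackPayload {
    pub source_url: String,
    pub target_url: String,
    pub callback: String,
    pub action_url: String,
    pub response_txt: String,
}

/// Splits on `@_@` into at most five parts, so a response body that
/// happens to contain the delimiter is kept intact in `response_txt`.
/// Returns `None` when fewer than five fields are present.
pub fn parse_callback_payload(raw: &str) -> Option<CallbackPayload> {
    let mut parts = raw.splitn(5, "@_@");
    Some(CallbackPayload {
        source_url: parts.next()?.to_string(),
        target_url: parts.next()?.to_string(),
        callback: parts.next()?.to_string(),
        action_url: parts.next()?.to_string(),
        response_txt: parts.next()?.to_string(),
    })
}
```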
### 4. Callback-host removal from the Zhihu mainline
For this slice, callback-host code is removed from the Zhihu submit mainline, not redesigned.
Practical meaning:
- Zhihu submit must not start `LiveBrowserCallbackHost`
- Zhihu submit must not emit `sgBrowerserOpenPage` for `/sgclaw/browser-helper.html`
- Zhihu submit must not block on `/sgclaw/callback/ready`
Code outside the Zhihu submit mainline can remain unchanged unless tests require cleanup.
## Test Strategy
This slice follows TDD and replaces the stale helper-page assumptions with direct websocket submit-path assertions.
### Red tests to add or rewrite
1. Rewrite the current submit regression that asserts helper-page bootstrap.
   - old behavior under test: Zhihu submit bootstraps callback host
   - new behavior under test: Zhihu submit does **not** bootstrap callback host and does **not** emit helper-page frames
2. Add or update a focused submit-path regression proving request ownership stays on the real page.
   - after navigate, subsequent Zhihu browser actions must use the real target page URL rather than `/sgclaw/browser-helper.html`
3. Remove or rewrite any newly added red test whose only purpose was to preserve callback-host-without-helper behavior.
   - that test belongs to the rejected alternative (keep callback host but remove helper bootstrap, referred to here as Option B), not the chosen Option A path
### Green verification
After the minimal code change, run focused verification in this order:
1. `agent_runtime_test` coverage for the submit path
2. relevant Zhihu `compat_runtime_test` coverage
3. submit/service websocket regressions impacted by route selection
4. stronger real-browser validation after focused tests pass
## Scope Guardrails
The implementation plan for this spec must obey all of the following:
1. Do not modify the pipe contract.
2. Do not add a new browser abstraction.
3. Do not broaden the change beyond the Zhihu submit path and its directly affected websocket protocol tests.
4. Do not keep the helper-page path as a second competing Zhihu mainline.
5. If live validation still reveals a callback-payload mismatch, only adjust the websocket protocol encoding/decoding at the exact mismatch point.
## Acceptance Criteria
The slice is complete when all of the following are true:
1. Zhihu submit routes no longer select the helper-page callback-host backend.
2. No Zhihu submit regression expects or observes `/sgclaw/browser-helper.html` bootstrap.
3. The websocket backend sends Zhihu follow-up actions with the real page URL as `requesturl`.
4. Focused automated tests covering the changed submit path pass.
5. Real-browser validation no longer fails at callback-host readiness timeout, emits no helper-page bootstrap frames, and emits at least one real-page follow-up browser action after navigate.
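Acceptance criteria 2, 3, and 5 can be phrased as a check over the frames recorded during one submit run. The sketch below is only an illustration: `SentFrame`, its fields, and `submit_path_violations` are hypothetical names, and it assumes the first recorded frame is the navigate itself.

```rust
/// Hypothetical record of one frame the submit path sent to the browser.
#[derive(Debug, Clone)]
pub struct SentFrame {
    /// Browser action name as recorded by the test harness (assumed field).
    pub action: String,
    /// The `requesturl` the frame was sent with.
    pub request_url: String,
    /// Target URL or payload argument of the action, if any.
    pub target: String,
}

const HELPER_PAGE_PATH: &str = "/sgclaw/browser-helper.html";

/// Returns one message per violated criterion; an empty list means the run passes.
/// Assumption: frames[0] is the navigate, so follow-up actions start at index 1.
pub fn submit_path_violations(frames: &[SentFrame], real_page_url: &str) -> Vec<String> {
    let mut violations = Vec::new();
    for (index, frame) in frames.iter().enumerate() {
        if frame.request_url.contains(HELPER_PAGE_PATH) || frame.target.contains(HELPER_PAGE_PATH) {
            violations.push(format!("frame {} references the helper page", index));
        }
    }
    let has_real_page_follow_up = frames
        .iter()
        .skip(1)
        .any(|frame| frame.request_url == real_page_url);
    if !has_real_page_follow_up {
        violations.push("no follow-up action used the real page URL as requesturl".to_string());
    }
    violations
}
```

A focused regression would record frames from a fake browser websocket and assert that the returned list is empty.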

@@ -0,0 +1,219 @@
# Service Chat Web Console Design
## Background
The current natural-language entrypoint is the terminal client in `src/bin/sg_claw_client.rs`.
That client already talks to the existing service websocket, sends `ClientMessage`, and prints
`ServiceMessage` responses.
The repository also contains a separate browser callback helper at
`http://127.0.0.1:61058/sgclaw/browser-helper.html`. That page is part of the browser backend
execution path and must remain untouched.
For this slice, the authoritative boundary is:
- the new page may talk to the existing service websocket only
- the page must not talk to the browser websocket directly
- the page must not reuse or replace `browser-helper.html`
- the page must not change the service protocol or browser execution logic
## Problem Statement
Running `cargo run --bin sg_claw_client` and typing into stdin works, but it is inconvenient for
routine usage. The user wants a simple local HTML page with a websocket connection field, a
natural-language input box, and a send button.
The risk is scope drift: if the new page reaches into the browser-helper flow or changes backend
logic, it could damage the working Zhihu/browser path.
## Goal
Add a standalone local HTML console that connects to the existing service websocket and submits
natural-language tasks using the current `submit_task` message shape.
The page should be usable without changing `sg_claw`, `sg_claw_client`, `browser-helper.html`, or
any existing service/browser runtime behavior.
## Non-goals
This slice does not include:
- serving the page from the Rust service
- changing `ClientMessage` or `ServiceMessage`
- changing `src/service/server.rs`
- changing `src/browser/callback_host.rs`
- changing `src/browser/callback_backend.rs`
- changing the helper-page bootstrap flow
- adding authentication, persistence, or multi-session orchestration
- replacing the terminal client
## Chosen Approach
Choose Option A: add one standalone HTML file that opens in a normal browser and talks to the
existing service websocket at `ws://127.0.0.1:42321` by default.
Why this option:
- it is the narrowest possible change
- it reuses the already-working service protocol
- it does not alter the browser-helper path
- it keeps all runtime ownership in the existing Rust service
Rejected alternatives:
- extend `browser-helper.html` into a chat UI: wrong boundary; that page belongs to browser
callback orchestration, not user task entry
- add a new HTTP server inside `sg_claw`: unnecessary for the requested scope
- replace the terminal client binary: not required; both clients can coexist
## File Placement
Create the page outside `frontend/runtime-host/`.
Chosen location:
- `frontend/service-console/sg_claw_service_console.html`
Reason:
- `frontend/runtime-host/` is reserved for SuperRPA runtime-host bundles
- the new page is a standalone local tool, not a Chromium-hosted bundle
- keeping it in its own directory makes the isolation explicit
## Page Architecture
The page is a single self-contained HTML file with inline CSS and inline JavaScript.
No build step and no frontend framework are required.
The page has three UI regions:
1. Connection bar
   - websocket URL input
   - connect/disconnect button
   - current connection state label
2. Message stream
   - appends service logs in arrival order
   - distinguishes connection info, task logs, errors, and final completion
   - keeps the current session visible until the page is refreshed
3. Task composer
   - one textarea for natural-language input
   - one send button
   - send disabled while the websocket is disconnected
   - while a task is in flight, keep the composer enabled and let repeated submits surface the
     existing service-side `busy` response rather than adding a new frontend queue
## Protocol Contract
The page must reuse the existing service protocol exactly.
### Outbound message
When the user clicks send, the page sends:
```json
{
"type": "submit_task",
"instruction": "<user input>",
"conversation_id": "",
"messages": [],
"page_url": "",
"page_title": ""
}
```
This matches the current terminal client shape in `src/bin/sg_claw_client.rs`.
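For reference, the same payload can be produced with nothing but the standard library. This is a minimal sketch under stated assumptions: `build_submit_task` and `json_escape` are hypothetical helpers, and the real client most likely serializes `ClientMessage` through a JSON library instead.

```rust
/// Escapes a string for embedding inside a JSON string literal.
fn json_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters must be written as \uXXXX.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds the `submit_task` message with every optional field left empty,
/// mirroring the shape shown above.
pub fn build_submit_task(instruction: &str) -> String {
    format!(
        "{{\"type\":\"submit_task\",\"instruction\":\"{}\",\"conversation_id\":\"\",\"messages\":[],\"page_url\":\"\",\"page_title\":\"\"}}",
        json_escape(instruction)
    )
}
```

Only `instruction` varies between submissions; the other fields stay empty because the console carries no conversation or page context.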
### Inbound messages
The page displays these existing `ServiceMessage` variants:
- `status_changed` -> render as a compact connection/runtime status row
- `log_entry` -> append as a chronological task log row
- `task_complete` -> append as the terminal result row for that submission
- `busy` -> append as a visible refusal/error row without automatic retry
No new message type is introduced.
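The inbound mapping above is small enough to express as a single lookup. The sketch below assumes the message carries its variant name in a `type` field, as the outbound message does; `RowKind` and `row_kind` are hypothetical names.

```rust
/// How the message stream renders one inbound service message.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum RowKind {
    /// Compact connection/runtime status row.
    Status,
    /// Chronological task log row.
    Log,
    /// Terminal result row for the submission.
    Result,
    /// Visible refusal/error row, shown without automatic retry.
    Refusal,
}

/// Maps an existing `ServiceMessage` type tag to its row kind.
/// Unknown tags return `None`, since no new message type is introduced.
pub fn row_kind(message_type: &str) -> Option<RowKind> {
    match message_type {
        "status_changed" => Some(RowKind::Status),
        "log_entry" => Some(RowKind::Log),
        "task_complete" => Some(RowKind::Result),
        "busy" => Some(RowKind::Refusal),
        _ => None,
    }
}
```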
## Interaction Flow
1. User opens the local HTML file with a normal browser, typically via `file://`.
2. User connects to the service websocket.
3. The page shows websocket connection status locally.
4. User enters a natural-language instruction and clicks send.
5. The page sends one `submit_task` payload over the service websocket.
6. The service continues to execute tasks exactly as it already does.
7. Incoming service messages are appended to the message stream.
8. After `task_complete`, the websocket remains open so the user can send another task.
## Error Handling
The page handles only UI-local failures:
- websocket connect failure -> show connection error and keep send disabled
- websocket disconnect mid-session -> mark disconnected and require reconnect
- empty instruction -> block send and show inline validation
- `busy` response -> show as a visible service-side refusal without retry logic
The page does not add retries, protocol fallbacks, or browser-runtime recovery logic.
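The two UI-local send guards can be sketched as one pure check. The names below are hypothetical, and treating a whitespace-only instruction as empty is an assumption; the spec only says "empty instruction".

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ConnectionState {
    Disconnected,
    Connected,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SendBlocked {
    /// Websocket is not open: send stays disabled until the user reconnects.
    NotConnected,
    /// Nothing to submit: show inline validation.
    EmptyInstruction,
}

/// Decides whether the composer may send. A task already in flight is
/// deliberately not checked here: the service answers with `busy` instead.
pub fn check_send(state: ConnectionState, instruction: &str) -> Result<(), SendBlocked> {
    if state != ConnectionState::Connected {
        return Err(SendBlocked::NotConnected);
    }
    // Assumption: whitespace-only input counts as empty.
    if instruction.trim().is_empty() {
        return Err(SendBlocked::EmptyInstruction);
    }
    Ok(())
}
```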
## Isolation From `browser-helper.html`
This is the critical constraint.
The new page must never:
- reference `/sgclaw/browser-helper.html`
- reference `/sgclaw/callback/ready`
- reference `/sgclaw/callback/events`
- reference `/sgclaw/callback/commands/next`
- reference `/sgclaw/callback/commands/ack`
- connect to `ws://127.0.0.1:12345`
The only network target owned by the page is the service websocket, defaulting to
`ws://127.0.0.1:42321`.
Because of that boundary, the page does not interfere with the helper-page bootstrap path.
## Test Strategy
This slice stays minimal, so the automated guard is also minimal.
### Automated regression
Add one focused integration test in `tests/service_console_html_test.rs` that reads the standalone
HTML source and asserts:
- the file exists at the agreed path and is resolved from `CARGO_MANIFEST_DIR` so the test is
stable across working directories
- it contains the service websocket default URL
- it contains `submit_task` payload construction
- it does not contain helper-page URLs or callback-host endpoints
- it does not contain the browser websocket URL
This test is a scope guard, not a browser-E2E suite.
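A minimal sketch of what that guard could look like follows. The function and constant names are hypothetical; the path, the default URL, and the forbidden strings are taken from this spec.

```rust
use std::path::PathBuf;

const CONSOLE_RELATIVE_PATH: &str = "frontend/service-console/sg_claw_service_console.html";
const SERVICE_WS_URL: &str = "ws://127.0.0.1:42321";

/// Strings the page must never contain: the helper page, the callback-host
/// endpoints, and the browser websocket URL.
const FORBIDDEN: [&str; 6] = [
    "browser-helper.html",
    "/sgclaw/callback/ready",
    "/sgclaw/callback/events",
    "/sgclaw/callback/commands/next",
    "/sgclaw/callback/commands/ack",
    "ws://127.0.0.1:12345",
];

/// Resolves the page path from the crate root so the test does not depend
/// on the current working directory.
pub fn console_path(manifest_dir: &str) -> PathBuf {
    PathBuf::from(manifest_dir).join(CONSOLE_RELATIVE_PATH)
}

/// Returns one message per violated assertion; empty means the guard passes.
pub fn source_guard_violations(html: &str) -> Vec<String> {
    let mut violations = Vec::new();
    if !html.contains(SERVICE_WS_URL) {
        violations.push(format!("missing default service URL {}", SERVICE_WS_URL));
    }
    if !html.contains("submit_task") {
        violations.push("missing submit_task payload construction".to_string());
    }
    for needle in FORBIDDEN.iter() {
        if html.contains(*needle) {
            violations.push(format!("forbidden reference: {}", needle));
        }
    }
    violations
}
```

Inside `tests/service_console_html_test.rs`, the source would typically be loaded with `std::fs::read_to_string(console_path(env!("CARGO_MANIFEST_DIR")))` and the violation list asserted empty.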
### Manual smoke verification
With the existing service binary running:
1. open the HTML file in a browser
2. connect to the service websocket
3. confirm local websocket open/close events and service `status_changed` messages both appear in the message stream
4. submit a natural-language task
5. confirm logs and completion render in the page
6. confirm the helper-page path remains unchanged because the page never references it
## Acceptance Criteria
The slice is complete when all of the following are true:
1. `frontend/service-console/sg_claw_service_console.html` exists.
2. The page connects to the existing service websocket without backend changes.
3. The page sends the existing `submit_task` shape and receives existing `ServiceMessage` events.
4. The page does not reference `browser-helper.html`, callback-host endpoints, or the browser
websocket URL.
5. Existing browser-helper logic remains untouched.
6. The automated source guard passes.
7. Manual smoke verification confirms a task can be submitted from the HTML page.

@@ -0,0 +1,373 @@
# Zhihu Hotlist Post-Export Auto-Open Design
## Background
The current Zhihu hotlist workflows already support two separate artifact outputs:
- `openxml_office` generates a local `.xlsx` file for hotlist export
- `screen_html_export` generates a local `.html` dashboard for presentation
Today, the workflow stops after artifact generation and returns a summary string such as:
- `已导出知乎热榜 Excel <path>`
- `已生成知乎热榜大屏 <path>`
That means the user still has to manually open the generated file.
The user wants one additional post-export action, but only one at a time:
1. for Excel-oriented tasks, automatically open the generated `.xlsx` with the system default spreadsheet application
2. for dashboard-oriented tasks, automatically open the generated local dashboard HTML inside the running sgBrowser session
This is an exclusive choice, not a combined mode.
## Current Runtime Facts
The implementation must match the current browser/runtime boundary that already exists in the repo:
- the active service submit path in `src/service/server.rs` constructs `BrowserCallbackBackend`
- `BrowserCallbackBackend::invoke(Action::Navigate, ...)` currently emits `sgBrowerserOpenPage`, which opens a new visible browser tab and keeps the helper page alive
- `WsBrowserBackend::invoke(Action::Navigate, ...)` has different semantics and a different transport path from the callback-host service path
- `MacPolicy::validate(...)` currently rejects empty or non-domain values, so a raw `file://...` navigation cannot pass through the normal domain validation path today
- `screen_html_export` already returns `presentation.url`, which is the existing `file://` presentation URL contract for the generated dashboard
Those facts mean the design must not promise "replace the helper page" or "reuse identical tab behavior across all backends". The required success path for this slice is narrower: open the generated dashboard automatically in the current callback-host-backed sgBrowser service session without adding a new user-facing surface.
## Problem Statement
The existing workflow logic in `src/compat/workflow_executor.rs` already separates hotlist export from dashboard generation, but it treats both routes as artifact-only flows. The last mile is missing:
- the Excel route does not auto-open the generated file
- the dashboard route does not consume the generated dashboard presentation URL and open it automatically in the browser runtime
The risk is scope drift. This change must not:
- turn Excel-open and dashboard-open into a combined workflow
- add new help/help-like user-visible surfaces
- move orchestration into `frontend/service-console/`
- modify the websocket protocol
- modify `browser-helper.html`
- modify callback-host HTTP endpoints or their contracts
- change the artifact-generation contract of `openxml_office` or `screen_html_export`
## Goal
Extend the existing Zhihu hotlist post-export behavior so that:
- Excel tasks generate `.xlsx` and then auto-open it with the local system default spreadsheet application
- dashboard tasks generate `.html` and then auto-open that generated dashboard inside sgBrowser
On the current callback-host service path, "inside sgBrowser" means opening the generated dashboard in a new visible browser tab while the helper page stays alive. The user does not need to open the file manually.
## Non-goals
This slice does not include:
- opening Excel and dashboard in the same run
- adding a new combined route that auto-opens both artifacts
- adding any new help, helper, or user-visible assistance surface
- modifying `frontend/service-console/sg_claw_service_console.html`
- modifying `src/service/protocol.rs`
- modifying `browser-helper.html`
- modifying `/sgclaw/callback/*` contracts
- turning the browser backend into a general-purpose local filesystem browser
- changing the artifact-generation JSON contract of `openxml_office` or `screen_html_export`
## Chosen Approach
Keep the current two workflow routes, but add one route-specific post-export action to each:
- `ZhihuHotlistExportXlsx` -> generate `.xlsx`, then open it locally with the OS default app
- `ZhihuHotlistScreen` -> generate `.html`, then open the generated dashboard presentation URL in the browser runtime
For the dashboard route, use the existing `presentation.url` returned by `screen_html_export` as the authoritative browser-open URL. Do not invent a separate normal-path URL conversion layer when the tool already returns the presentation contract.
The compat opener must emit one exact navigate request shape for this case.
- `action`: `Action::Navigate`
- `expected_domain`: the exact literal `__sgclaw_local_dashboard__`
- `params.url`: the exact `presentation.url` returned by `screen_html_export`
- `params.sgclaw_local_dashboard_open.source`: the exact literal `compat.workflow_executor`
- `params.sgclaw_local_dashboard_open.kind`: the exact literal `zhihu_hotlist_screen`
- `params.sgclaw_local_dashboard_open.output_path`: the generated local dashboard artifact path
- `params.sgclaw_local_dashboard_open.presentation_url`: the same `file://` URL stored in `params.url`
On the current callback-host-backed service path, only that exact request shape is approved for the local-dashboard special case. A plain `Action::Navigate` with an arbitrary `file://...` URL, or a request missing any one of the required marker fields above, must continue to be rejected.
Because normal `MacPolicy` domain validation cannot accept `file://...`, add a narrow local-dashboard presentation allowance in the browser backend/security boundary. That allowance must be limited to this one case:
- only for `Action::Navigate`
- only for generated local dashboard presentation URLs
- only for local HTML presentation, not arbitrary local paths or generic file browsing
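The request shape and the narrow allowance can be sketched together as a builder plus a validator. Everything below is a stand-in: the repository's real `Action` and params types are not shown in this spec, so the struct layout is hypothetical, and the `.html` suffix check is an assumed reading of "only for local HTML presentation".

```rust
pub const LOCAL_DASHBOARD_DOMAIN: &str = "__sgclaw_local_dashboard__";
pub const MARKER_SOURCE: &str = "compat.workflow_executor";
pub const MARKER_KIND: &str = "zhihu_hotlist_screen";

/// Stand-in for the repository's action enum.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Action {
    Navigate,
    Click,
}

/// Stand-in for `params.sgclaw_local_dashboard_open`.
#[derive(Debug, Clone)]
pub struct DashboardOpenMarker {
    pub source: String,
    pub kind: String,
    pub output_path: String,
    pub presentation_url: String,
}

#[derive(Debug, Clone)]
pub struct NavigateRequest {
    pub action: Action,
    pub expected_domain: String,
    /// Stand-in for `params.url`.
    pub url: String,
    pub marker: Option<DashboardOpenMarker>,
}

/// Builds the one approved request shape from the `screen_html_export` result.
pub fn approved_request(output_path: &str, presentation_url: &str) -> NavigateRequest {
    NavigateRequest {
        action: Action::Navigate,
        expected_domain: LOCAL_DASHBOARD_DOMAIN.to_string(),
        url: presentation_url.to_string(),
        marker: Some(DashboardOpenMarker {
            source: MARKER_SOURCE.to_string(),
            kind: MARKER_KIND.to_string(),
            output_path: output_path.to_string(),
            presentation_url: presentation_url.to_string(),
        }),
    }
}

/// Accepts only the exact marker shape; any missing or mismatched field rejects.
pub fn is_approved_local_dashboard(request: &NavigateRequest) -> bool {
    let marker = match &request.marker {
        Some(marker) => marker,
        None => return false,
    };
    request.action == Action::Navigate
        && request.expected_domain == LOCAL_DASHBOARD_DOMAIN
        && request.url.starts_with("file://")
        && request.url.ends_with(".html") // assumption: HTML presentation only
        && marker.source == MARKER_SOURCE
        && marker.kind == MARKER_KIND
        && !marker.output_path.is_empty()
        && marker.presentation_url == request.url
}
```

A real implementation would additionally confirm that `output_path` points at the artifact returned by the same export, which this sketch does not attempt.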
Why this approach:
- it preserves the existing mutual exclusivity between Excel export and dashboard presentation
- it keeps artifact generation in the existing tools
- it keeps browser opening inside the existing browser backend boundary
- it uses the existing `screen_html_export` presentation contract instead of duplicating it
- it avoids pushing orchestration into the service console or protocol layer
- it stays compatible with the current callback-host runtime, where visible navigation is new-tab based
- it limits the guaranteed browser-open behavior in this slice to the callback-host-backed service path that the user is using today
Rejected alternatives:
- add a combined "Excel + dashboard" route: explicitly rejected, because the user wants exactly one post-export action per task
- let `frontend/service-console/` decide when to open generated files: wrong layer; the console is only a submit/view surface
- add help UI to expose output choices: explicitly unwanted by the user
- change `browser-helper.html` so the helper page itself becomes the dashboard: this would break the current helper-page persistence model
- promise a backend-agnostic "replace the current page" behavior: inaccurate because callback-host and websocket backends do not share identical navigate semantics
- require the websocket backend to gain matching local-dashboard visible-open behavior in this slice: outside the narrow current-service-path goal
## File Responsibilities
### `src/compat/workflow_executor.rs`
Continue to own:
- route detection for Zhihu hotlist workflows
- artifact generation orchestration
- post-export summary construction
New responsibilities in this slice:
- parse the successful artifact payloads after `openxml_office` and `screen_html_export`
- call the route-specific post-export opener only after artifact creation succeeds
- for the dashboard route, consume `presentation.url` from the `screen_html_export` result payload
- keep generation success and post-export open success/failure distinct in the returned summary
### `src/compat/artifact_open.rs`
New helper module to keep side effects out of `workflow_executor.rs`.
Responsibilities:
- open a generated local `.xlsx` with the system default application
- open a generated local dashboard presentation URL through the existing `BrowserBackend`
- construct the exact approved dashboard navigate request shape used by this slice
- define the narrow local-dashboard presentation token/constants used by the compat layer and backend compatibility path
- return narrow success/failure results so `workflow_executor.rs` can produce accurate summaries
This module must stay small and focused. It is not a general launcher framework.
### `src/browser/callback_backend.rs`
New narrow responsibility in this slice:
- at the `BrowserCallbackBackend::invoke(Action::Navigate, params, expected_domain)` entrypoint, recognize only the exact approved local-dashboard presentation request shape
- preserve the current callback-host behavior of using `sgBrowerserOpenPage`, which opens a new visible tab and keeps the helper page alive
- reject local-file navigate attempts that do not include the exact post-export marker payload from the compat layer
This slice must not change callback-host polling, helper bootstrap, or callback endpoint behavior.
### `src/browser/ws_backend.rs`
No required behavior change in this slice.
Notes:
- websocket transport semantics differ from the callback-host service path
- this spec does not require websocket backend local-dashboard visible-open support
- websocket-specific parity can be designed later as a separate slice if needed
### `src/security/mac_policy.rs`
New narrow responsibility in this slice:
- expose a small validation helper for the approved local-dashboard presentation case
- validate the real local presentation URL and artifact path for that case rather than treating `file://` as a normal allowed domain
- keep the normal domain-based validation path unchanged for ordinary remote navigation
The policy layer must not turn `file://` into a generally allowed "domain". This is an explicit special case for generated local dashboard presentation only.
### `src/compat/mod.rs`
Expose the new helper module.
## Route Semantics
### Excel export route
Trigger examples:
- `读取知乎热榜数据,并导出 excel 文件`
- `导出知乎热榜 xlsx`
Expected behavior:
1. collect hotlist rows
2. call `openxml_office`
3. obtain `output_path`
4. open the generated `.xlsx` using the local OS default spreadsheet application
5. return a success summary reflecting both generation and open state
Summary rules:
- open succeeded -> `已导出并打开知乎热榜 Excel <path>`
- open failed but file exists -> `已导出知乎热榜 Excel <path>,但自动打开失败:<reason>`
The workflow still counts artifact generation as successful even if the post-export open step fails.
### Dashboard route
Trigger examples:
- `读取知乎热榜数据并生成领导演示大屏`
- `生成知乎热榜 dashboard`
- `展示知乎热榜大屏`
Expected behavior:
1. collect hotlist rows
2. call `screen_html_export`
3. obtain `output_path`
4. obtain `presentation.url` from the tool result payload
5. invoke the browser opener through the existing `BrowserBackend`
6. return a success summary reflecting both generation and browser-open state
Summary rules:
- browser open succeeded -> `已在浏览器中打开知乎热榜大屏 <path>`
- browser open failed but file exists -> `已生成知乎热榜大屏 <path>,但浏览器自动打开失败:<reason>`
The workflow still counts artifact generation as successful even if the browser-open step fails.
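The summary rules for both routes reduce to one match per route. In this sketch the function names and the `Result<(), String>` representation of the open step are assumptions; the summary literals are copied from the rules above.

```rust
/// Summary for the Excel route. Generation has already succeeded when this
/// is called; `open` only reflects the post-export open step.
pub fn excel_summary(path: &str, open: Result<(), String>) -> String {
    match open {
        Ok(()) => format!("已导出并打开知乎热榜 Excel {}", path),
        Err(reason) => format!("已导出知乎热榜 Excel {},但自动打开失败:{}", path, reason),
    }
}

/// Summary for the dashboard route, with the same separation between
/// generation success and browser-open success.
pub fn dashboard_summary(path: &str, open: Result<(), String>) -> String {
    match open {
        Ok(()) => format!("已在浏览器中打开知乎热榜大屏 {}", path),
        Err(reason) => format!("已生成知乎热榜大屏 {},但浏览器自动打开失败:{}", path, reason),
    }
}
```

Keeping the artifact path in both arms is what lets the user find the file even when the open step fails.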
## Browser Boundary
This slice must preserve the current browser/runtime boundary.
Allowed:
- use the existing `BrowserBackend`
- use the existing `Action::Navigate`
- use the existing `screen_html_export` `presentation.url`
- add a narrow compatibility path so local generated dashboard presentation can pass backend validation
Not allowed:
- change `browser-helper.html`
- introduce a new callback-host endpoint
- move file-opening responsibility into the frontend service console
- add a new browser-side bootstrap flow
- require websocket protocol changes
Important semantic note:
- on the current service callback-host path, dashboard open is expected to use `sgBrowerserOpenPage`, so the generated dashboard appears in a new visible browser tab while the helper page remains available for later tasks
- websocket-backed browser execution may continue to differ; this slice does not require matching visible-open semantics there
## Local Dashboard Presentation Allowance
The local dashboard browser-open path needs an explicit narrow validation rule because `file://...` cannot pass the normal domain allowlist.
Requirements for the narrow allowance:
- only approved for `Action::Navigate`
- only approved for the exact compat marker payload described above
- only approved for generated local dashboard presentation URLs
- only approved when the validated local artifact path points to the generated dashboard HTML artifact returned by the same `screen_html_export` success payload
- only approved for local HTML presentation, not arbitrary executables or unrelated local files
- ordinary remote navigation must continue using the existing `MacPolicy::validate(...)` domain rules unchanged
This keeps the behavior small and auditable while still satisfying the user-visible dashboard auto-open requirement.
## Local File Opening Boundary
The Excel auto-open action is a local runtime side effect, not a browser action.
Requirements:
- use the system default application for `.xlsx`
- support the current Windows environment first
- keep the implementation minimal and focused on the generated artifact path
Not required in this slice:
- a cross-platform abstraction beyond the minimal shape needed for the current repo environment
- opening arbitrary user-selected files
- exposing local file opening to the service websocket protocol
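On Windows, one common way to hand a file to its default application is `cmd /C start "" <path>`. The sketch below assumes that approach; the spec does not say which mechanism `artifact_open.rs` uses, and both function names are hypothetical.

```rust
use std::path::Path;
use std::process::Command;

/// Builds the Windows command that opens a file with its default application.
/// The empty string is the window-title argument that `start` expects before
/// a path that may be quoted.
pub fn default_app_command(artifact: &Path) -> Command {
    let mut command = Command::new("cmd");
    command.arg("/C").arg("start").arg("").arg(artifact);
    command
}

/// Opens the generated artifact, returning a narrow error string on failure
/// so the caller can report it without discarding the artifact path.
pub fn open_with_default_app(artifact: &Path) -> Result<(), String> {
    if !artifact.is_file() {
        return Err(format!("artifact not found: {}", artifact.display()));
    }
    let status = default_app_command(artifact)
        .status()
        .map_err(|error| error.to_string())?;
    if status.success() {
        Ok(())
    } else {
        Err(format!("opener exited with {}", status))
    }
}
```

Splitting command construction from execution keeps the construction testable without launching a spreadsheet application.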
## Error Handling
### Excel route
If `.xlsx` generation fails:
- return the existing export failure
If `.xlsx` generation succeeds but auto-open fails:
- keep the artifact path in the summary
- mark only the auto-open step as failed
- do not delete the generated file
### Dashboard route
If `.html` generation fails:
- return the existing screen export failure
If `.html` generation succeeds but browser open fails:
- keep the artifact path in the summary
- mark only the browser-open step as failed
- do not delete the generated file
If the tool result is missing `presentation.url`:
- treat that as a protocol error in the post-export open step for this route
- keep the generated artifact path in the summary if it is available
- do not silently invent a different contract in the normal path
## Test Strategy
### Workflow tests
Update or add focused workflow coverage so that:
- Excel workflow still calls `openxml_office`
- dashboard workflow still calls `screen_html_export`
- the two routes remain mutually exclusive
- dashboard workflow consumes the tool's existing `presentation.url`
### New Excel post-export test
Add a focused regression proving:
- an Excel-oriented hotlist request triggers export
- the generated `.xlsx` path is passed into the local default-app opener
- no browser dashboard navigate is triggered for that route
### New dashboard post-export test
Add a focused regression proving:
- a dashboard-oriented hotlist request triggers HTML generation
- the generated tool payload `presentation.url` is used for browser open
- the browser backend receives a local-dashboard navigate request through the approved compat path
- no local spreadsheet opener is triggered for that route
### Backend/security compatibility tests
Add focused regressions proving:
- callback backend accepts the approved local-dashboard navigate case and still emits `sgBrowerserOpenPage`
- the narrow local-dashboard allowance rejects non-local or malformed URLs
- ordinary domain validation behavior remains unchanged for normal remote navigation
### Existing boundary tests remain unchanged
Do not change the service-console boundary guard. This slice is runtime behavior only.
## Acceptance Criteria
The slice is complete when all of the following are true:
1. Excel hotlist export still generates a local `.xlsx` artifact.
2. Excel hotlist export auto-opens that `.xlsx` with the system default spreadsheet application.
3. Dashboard hotlist export still generates a local `.html` artifact.
4. Dashboard hotlist export consumes the existing `screen_html_export` `presentation.url` and auto-opens it in the current callback-host-backed sgBrowser service session.
5. On the current callback-host service path, the dashboard opens automatically in a visible browser tab without breaking the helper-page runtime.
6. Excel-open and dashboard-open remain separate user-chosen flows, not a combined mode.
7. No new help/help-like user-visible surface is added.
8. The service console, websocket protocol, `browser-helper.html`, and callback-host endpoint surface remain untouched.

@@ -0,0 +1,217 @@
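# 95598-repair-city-dispatch Operation Analysis
## 1. Scenario Overview
`95598-repair-city-dispatch` corresponds to the "95598抢修-市指" (95598 repair, city dispatch) scenario. Its goal is to monitor the repair work-order queue and, when necessary, trigger follow-up actions such as alerts, log writes, and automatic dispatch. Based on `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\scenes\95598-repair-city-dispatch\scene.json`, `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\95598-repair-city-dispatch\SKILL.md`, `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\95598-repair-city-dispatch\scripts\collect_repair_orders.js`, and the two rule assets, the most rigorous conclusion at present is this: the scenario's packaged JS collector already implements input-driven `monitor-snapshot` normalization and comparison logic. It buckets repair orders by status, parses monitor/dispose logs, derives `pending_ids` / `new_pending_ids`, produces a `success/partial/empty/blocked` status, and attaches metadata about the desk rule sources, the config base page, and known issues. The stronger evidence for the business monitoring, alerting, and automatic-dispatch workflows, however, still lives mainly in the desk rule assets. In each case the evidence level is `code-confirmed`.
Three layers of evidence must be distinguished explicitly:
1. Packaged runtime-snapshot-collector: `collect_repair_orders.js` directly implements repair-order classification, history comparison, status determination, and normalized snapshot output, and explicitly carries `workflow_rule_sources`, `config_base_page`, `config_base_role`, `packaged_collector_role`, and `known_issues`. Evidence level: `code-confirmed`.
2. Business monitoring logic: `D:\desk\智能体资料\大四区报告监测项\95598抢修-市指_业务检测配置.txt` directly shows queue collection, status classification, monitor-log comparison, audio alerts, and monitor-log writes. Evidence level: `code-confirmed`.
3. Automatic dispatch / alert logic: `D:\desk\智能体资料\大四区报告监测项\95598抢修-市指_自动处理配置.txt` directly shows deduplication, crew matching, automatic dispatch requests, audio alerts, SMS sending, outbound-call triggering, and dispose-log writes. Evidence level: `code-confirmed`.
These `code-confirmed` labels only mean that the implementation branches or action definitions exist in the code or rule assets. They do not mean the behavior has been verified to succeed at runtime. This document makes no inflated claims about runtime success.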
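## 2. Evidence Sources
This analysis uses exactly four evidence-level labels: `code-confirmed`, `contract-defined`, `implementation intent exists but not rigorous / buggy`, and `no direct evidence / candidate only`.
1. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\95598-repair-city-dispatch\scripts\collect_repair_orders.js`
   - Directly defines `STATUS_GROUPS`, `LOCAL_SERVICE_ENDPOINTS`, `WORKFLOW_RULE_SOURCES`, `CONFIG_BASE_PAGE`, and `KNOWN_ISSUES`, and implements repair-order classification, monitor/dispose log parsing and comparison, `new_pending_ids` derivation, `success/partial/empty/blocked` status determination, and `monitor-snapshot` output carrying `evidence` / `known_issues`. Evidence level: `code-confirmed`.
2. `D:\desk\智能体资料\大四区报告监测项\95598抢修-市指_业务检测配置.txt`
   - Directly implements work-order queue collection, bucketing by status, pending-list comparison, audio alerts, and monitor-log writes, and exposes a pending-classification bug. Evidence level: `code-confirmed`.
3. `D:\desk\智能体资料\大四区报告监测项\95598抢修-市指_自动处理配置.txt`
   - Directly implements dispose-log deduplication, crew scope matching, automatic dispatch requests, the success/failure/exception/unmatched branches of automatic dispatch, audio logs, SMS logs, outbound-call triggering, and `setDisposeLog` writes. Evidence level: `code-confirmed`.
4. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\95598-repair-city-dispatch\SKILL.md`
   - Defines the runtime contract: prefer the packaged collector, keep the monitoring snapshot separate from downstream actions, and allow partial results. Evidence level: `contract-defined`.
5. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\95598-repair-city-dispatch\references\collection-flow.md`
   - Defines a first-order flow that enters through page configuration, uses the rule assets to interpret semantics, collects statuses `00/01/06/08`, and compares monitor/dispose logs. Evidence level: `contract-defined`.
6. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\95598-repair-city-dispatch\references\data-quality.md`
   - Defines status classification, partial rules, the empty/failure distinction, and the boundary for downstream side effects. Evidence level: `contract-defined`.
7. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\scenes\95598-repair-city-dispatch\scene.json`
   - Declares the scene category, the `time` input, dependencies, and action types. Evidence level: `code-confirmed`.
8. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\scenes\95598-repair-city-dispatch\scene.draft.json`
   - Shows the hesitation in the early inference over whether `trigger-alert` and `configServices` should be split out. This is a candidate result only. Evidence level: `no direct evidence / candidate only`.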
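## 3. Actual Entrypoint and Runtime Boundary
The actual entrypoint is fixed in `scene.json`: the scene page entry is `index.html`, the skill tool name is `95598-repair-city-dispatch.collect_repair_orders`, the output type is `monitor-snapshot`, and the input is `time`. All of these are `code-confirmed`.
`assets/scene-snapshot/index.html` should be treated only as the configuration base page (for example, for maintaining crews, contacts, and scopes). It should not be treated as the primary execution evidence for the rule workflow.
For the runtime boundary, two distinctions are mandatory:
- Actual boundary of the packaged JS runtime collector: given the inputs `repair_orders`, `monitor_logs`, and `dispose_logs`, it can already perform status classification, history comparison, `new_pending_ids` derivation, and `success/partial/empty/blocked` determination, and return a standard `monitor-snapshot`. It remains an input-driven normalization collector: it neither issues browser requests itself nor carries the full business workflow. Evidence level: `code-confirmed`.
- Rule-asset behavior boundary: the business monitoring rule and the automatic handling rule show, respectively, browser requests, log comparison, alert side effects, and automatic-dispatch side effects. Evidence level: `code-confirmed`.
In other words, this scenario cannot be summarized in one sentence as "the unified packaged collector fully implements real-time queue monitoring and automatic dispatch". The more rigorous statement is that the packaged collector implements testable, input-driven snapshot normalization and comparison logic, while the stronger evidence for the real-time monitoring and automatic-handling chain still comes from the desk rule assets. Evidence level: `code-confirmed`.
In addition, `SKILL.md` and the references explicitly require that "snapshot collection succeeded" be reported separately from downstream effects such as audio, SMS, outbound calls, and automatic dispatch. This is a runtime-contract constraint. Evidence level: `contract-defined`.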
## 4. Code-Confirmed Actual Operation Flow
### 4.1 Confirmed flow of the packaged runtime-snapshot-collector
The following can now be strictly confirmed in `collect_repair_orders.js`:
1. `collectRepairOrders(input)` is called and reads inputs such as `input.repair_orders`, `input.monitor_logs || input.monitor_log`, `input.dispose_logs || input.dispose_log`, `input.local_write_failures`, and `input.blocked_reason`.
2. `classifyRepairOrders(...)` buckets repair orders according to `STATUS_GROUPS.pending = ["00", "01"]`, `STATUS_GROUPS.audit = ["06"]`, and `STATUS_GROUPS.processed = ["08"]`, and records unknown statuses.
3. `pending_ids` is extracted from the pending orders; the monitor/dispose logs are then parsed, malformed payloads are identified, and `new_pending_ids` is derived from the result.
4. `status` is computed with the priority `blocked > partial > empty > success`, and unknown statuses, missing logs, log parse failures, local write failures, and similar conditions are written into `partial_reasons`.
5. The function returns `type: "monitor-snapshot"`, `scene: "95598-repair-city-dispatch"`, `pending`, `audit`, `processed`, `pending_ids`, `new_pending_ids`, `status`, and `partial_reasons`.
6. The returned object also carries `evidence.workflow_rule_sources`, `evidence.config_base_page`, `evidence.config_base_role`, `evidence.packaged_collector_role = "runtime-snapshot-collector"`, and `known_issues`.
7. The module additionally exports `STATUS_GROUPS`, `LOCAL_SERVICE_ENDPOINTS`, `WORKFLOW_RULE_SOURCES`, `CONFIG_BASE_PAGE`, and `KNOWN_ISSUES`.
All of the above are `code-confirmed`.
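The collector itself is JavaScript. The sketch below restates steps 2 and 4 in Rust purely to make the bucketing and the status priority concrete. It is not the collector's source: the type and function names are hypothetical, and treating "no repair orders at all" as the `empty` condition is an assumption.

```rust
/// Order ids grouped the way `STATUS_GROUPS` describes, plus any status
/// codes that fall outside the known groups.
#[derive(Debug, Default)]
pub struct Buckets {
    pub pending: Vec<String>,
    pub audit: Vec<String>,
    pub processed: Vec<String>,
    pub unknown_statuses: Vec<String>,
}

/// Buckets `(order_id, status_code)` pairs: 00/01 pending, 06 audit, 08 processed.
pub fn classify(orders: &[(&str, &str)]) -> Buckets {
    let mut buckets = Buckets::default();
    for (id, status) in orders {
        match *status {
            "00" | "01" => buckets.pending.push(id.to_string()),
            "06" => buckets.audit.push(id.to_string()),
            "08" => buckets.processed.push(id.to_string()),
            other => buckets.unknown_statuses.push(other.to_string()),
        }
    }
    buckets
}

#[derive(Debug, PartialEq, Eq)]
pub enum Status {
    Blocked,
    Partial,
    Empty,
    Success,
}

/// Applies the priority blocked > partial > empty > success.
/// Assumption: "empty" means the input contained no repair orders.
pub fn snapshot_status(
    blocked_reason: Option<&str>,
    partial_reasons: &[String],
    total_orders: usize,
) -> Status {
    if blocked_reason.is_some() {
        Status::Blocked
    } else if !partial_reasons.is_empty() {
        Status::Partial
    } else if total_orders == 0 {
        Status::Empty
    } else {
        Status::Success
    }
}
```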
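### 4.2 Confirmed flow of the business monitoring rule
The following can be confirmed directly in `95598抢修-市指_业务检测配置.txt`:
1. It requests `repairOrder/list` through `BrowserAction("sgBrowerserJsAjax2", ...)`, with query conditions that include `statusName=00,01,06,08` and the current-day time window. Evidence level: `code-confirmed`.
2. It splits the returned list by status into `list`, `shlist`, and `ycjList`, and constructs `pending/audit/processed` and `pendingList`. Evidence level: `code-confirmed`.
3. It reads `getMonitorLog` and decides whether to play an audio alert based on a comparison against the pending list. Evidence level: `code-confirmed`.
4. It writes the monitoring result to `setMonitorData` and `setMonitorLog`. Evidence level: `code-confirmed`.
5. The audio alert outcome is written to `setAudioPlayLog` as one of three states: success, failure, or exception. Evidence level: `code-confirmed`.
There is also a directly visible bug here: the pending check is written as `item.status == "00" && item.status == "01"`, which can never be true for a single status value. The rule's `pending` list construction is therefore not rigorous. Evidence level: `implementation intent exists but not rigorous / buggy`.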
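The effect of that predicate is easy to demonstrate. The sketch below restates it in Rust next to the presumably intended form. The `||` version is an inference from `STATUS_GROUPS.pending = ["00", "01"]`, not something the rule asset contains.

```rust
/// The predicate as written in the rule asset: a single value cannot equal
/// both "00" and "01", so this is false for every input.
pub fn is_pending_as_written(status: &str) -> bool {
    status == "00" && status == "01"
}

/// The presumably intended predicate (assumption): either pending status matches.
pub fn is_pending_intended(status: &str) -> bool {
    status == "00" || status == "01"
}
```

With the predicate as written, a list built through it stays empty regardless of how many `00` or `01` orders the queue contains.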
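### 4.3 Confirmed flow of the automatic handling rule
The following can be confirmed directly in `95598抢修-市指_自动处理配置.txt`:
1. It first writes a monitor log entry marking "entering automatic dispatch" (进入自动派单), then reads `getDisposeLog` to deduplicate orders that were already dispatched. Evidence level: `code-confirmed`.
2. For pending orders that have not been dispatched, it reads `getClassList` and matches crews by applying `scope` to the fault location `gzdd`. Evidence level: `code-confirmed`.
3. On a successful match, it requests `repairOrder/initProcess` to perform the automatic dispatch. Evidence level: `code-confirmed`.
4. When automatic dispatch succeeds, it triggers a success audio announcement, SMS sending, and an outbound call, and writes `setDisposeLog(state="成功")` (success). Evidence level: `code-confirmed`.
5. When automatic dispatch fails, it triggers a failure audio announcement and writes `setDisposeLog(state="失败")` (failure). Evidence level: `code-confirmed`.
6. When automatic dispatch raises an exception, it triggers an exception audio announcement and writes `setDisposeLog(state="异常")` (exception). Evidence level: `code-confirmed`.
7. When no crew matches, it triggers an unmatched audio announcement and writes `setDisposeLog(state="未匹配")` (unmatched). Evidence level: `code-confirmed`.
All of these actions are direct evidence only that the rule-level implementation branches exist. They do not mean the actions have been verified to succeed at runtime.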
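The four outcome branches differ only in which side effects fire and which state string is logged. A compact way to see that is a lookup table; the sketch below is illustrative, with hypothetical type names, and encodes only what steps 4 to 7 state.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DispatchOutcome {
    Success,
    Failure,
    Exception,
    Unmatched,
}

/// Side effects the rule asset defines for one outcome branch.
#[derive(Debug, PartialEq, Eq)]
pub struct SideEffects {
    pub audio: bool,
    pub sms: bool,
    pub outbound_call: bool,
    /// Value passed as `state` to `setDisposeLog`.
    pub dispose_log_state: &'static str,
}

/// Every branch plays audio and writes a dispose log; only success also
/// sends SMS and triggers the outbound call.
pub fn side_effects(outcome: DispatchOutcome) -> SideEffects {
    let (sms_and_call, state) = match outcome {
        DispatchOutcome::Success => (true, "成功"),
        DispatchOutcome::Failure => (false, "失败"),
        DispatchOutcome::Exception => (false, "异常"),
        DispatchOutcome::Unmatched => (false, "未匹配"),
    };
    SideEffects {
        audio: true,
        sms: sms_and_call,
        outbound_call: sms_and_call,
        dispose_log_state: state,
    }
}
```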
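## 5. Standardized Abstract Flow
If a strict abstraction is made for the command-center, a more reasonable standardized flow for this scenario would be:
1. Receive the monitoring task input `time`.
2. Collect the 95598 repair queue using the browser requests defined by the rule assets.
3. Split the source data into `pending`, `audit`, and `processed`, preserving the pending-list semantics visible at the rule level.
4. Use the monitor log / dispose log as comparison context to derive the "newly pending" set or the set awaiting automatic handling.
5. If the flow enters the standard configuration normalization layer, map these results to canonical fields such as `pending_ids` and `new_pending_ids`.
6. Return or retain the monitoring snapshot semantics first.
7. Then execute downstream actions such as audio alerts, SMS, outbound calls, automatic dispatch, and log writes.
Step 1 is supported by the packaged collector's explicit `time` input, and steps 3, 4, 5, and 6 are supported by the packaged collector's input-driven normalization and comparison logic; evidence level: `code-confirmed`. Steps 2 and 7 are supported mainly and directly by the rule assets; evidence level: `code-confirmed`. The boundary that "the snapshot should be expressed before downstream side effects" comes from `SKILL.md` / the references; evidence level: `contract-defined`.
Going further and describing this abstract flow as "strictly carried by the unified packaged collector, including real-time browser collection and automatic-dispatch side effects" would not be rigorous, because the stronger workflow evidence still lives in the desk rule assets rather than in the packaged collector. For that claim the evidence level can only be lowered to `implementation intent exists but not rigorous / buggy`.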
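## 6. Inputs, context, and dependencies
### Inputs
- `time` is an explicit input declared by both the scene and the packaged script. Evidence level: `code-confirmed`.
- The logic that builds the same-day time window, from `00:00:00` to `23:59:59`, appears in the business monitoring rule. Evidence level: `code-confirmed`.
- The reference states explicitly that "the current queue window is usually the current day". Evidence level: `contract-defined`.
### Runtime context
- A platform session, org/user context, and a browser able to execute `BrowserAction` are preconditions that both the rule assets and the reference rely on. Evidence levels: `code-confirmed` and `contract-defined` respectively.
- The page itself is closer to a configuration page, while the real monitoring semantics come from the rule assets; `collection-flow.md` states this explicitly. Evidence level: `contract-defined`.
### Dependencies
- `scene.json` declares `browser`, `local-service`, `repair-order-source`, `history-log`, and `status-classification`. Evidence level: `code-confirmed`.
- The business monitoring rule directly uses `repairOrder/list`, `MonitorServices/getMonitorLog`, `setMonitorData`, `setMonitorLog`, and `setAudioPlayLog`. Evidence level: `code-confirmed`.
- The auto-processing rule directly uses `getDisposeLog`, `getClassList`, `repairOrder/initProcess`, `setDisposeLog`, `setSendMessageLog`, and the outbound-call trigger `mac.callOutLogin`. Evidence level: `code-confirmed`.
- Whether `configServices` needs to be promoted to a formal dependency is still an open item in `scene.draft.json`. Evidence level: `no direct evidence / candidate only`.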
## 7. Output structure
The current output structure has to be described in layers.
### 7.1 Output defined directly by the packaged runtime collector
`collect_repair_orders.js` directly defines:
- `type: "monitor-snapshot"`
- `scene: "95598-repair-city-dispatch"`
- `time`
- `pending`
- `audit`
- `processed`
- `pending_ids`
- `new_pending_ids`
- `status`
- `partial_reasons`
- `evidence.workflow_rule_sources`
- `evidence.config_base_page`
- `evidence.config_base_role`
- `evidence.packaged_collector_role`
- `known_issues`
All of the above is `code-confirmed`.
### 7.2 Snapshot field semantics actually shown by the rule assets
The business monitoring rule directly builds:
- `time`
- `type: "95598抢修-市指"`
- `pending`
- `pendingList`
- `audit`
- `processed`
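This shows that the snapshot object actually built at the rule layer does not fully match the field naming of the packaged stub, most notably `pendingList` vs `pending_ids` and `type` vs `scene`. Evidence level: `code-confirmed`.
### 7.3 Evidence strength of `new_pending_ids`
`SKILL.md`, the reference, and `data-quality.md` all treat `new_pending_ids` as part of the expected output; evidence level: `contract-defined`. In the rule assets read so far, however, what is directly visible is "compare against the monitor log / dispose log and decide whether to remind or enter auto-dispatch"; no explicit `new_pending_ids` field is written out. So "the intent to compare against history exists" is `code-confirmed`, while "`new_pending_ids` is rigorously produced by the current implementation" can only be labeled `implementation intent exists but not rigorous / buggy`.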
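One way to picture the gap between "a history comparison exists" and "an explicit `new_pending_ids` field is produced" is a plain set difference. The sketch below is hypothetical and is not taken from `collect_repair_orders.js` or from the rule assets: the function name `deriveNewPendingIds`, the `id` key on each pending item, and the idea that the monitor log reduces to a list of previously seen ids are all assumptions.

```javascript
// Derive canonical id fields from a rule-layer pending list.
// ASSUMPTION: each pending item exposes an `id`, and the monitor log can be
// reduced to an array of ids that were already seen in earlier runs.
function deriveNewPendingIds(pendingList, previouslySeenIds) {
  const seen = new Set(previouslySeenIds.map(String));
  const pendingIds = pendingList.map((item) => String(item.id));
  const newPendingIds = pendingIds.filter((id) => !seen.has(id));
  return { pending_ids: pendingIds, new_pending_ids: newPendingIds };
}

// Hypothetical sample: order 1001 was already logged, the others are new.
const derived = deriveNewPendingIds(
  [{ id: "1001" }, { id: "1002" }, { id: "1003" }],
  ["1001"]
);

console.log(derived.pending_ids); // [ '1001', '1002', '1003' ]
console.log(derived.new_pending_ids); // [ '1002', '1003' ]
```

Emitting both arrays explicitly would let a consumer check the comparison result without re-reading the logs, which is the property the contract documents ask for.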
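## 8. Downstream action evidence table
| Downstream action | Current evidence | Evidence level | Rigorous conclusion |
| --- | --- | --- | --- |
| Return the `monitor-snapshot` runtime collector output | `collect_repair_orders.js` returns the object directly | `code-confirmed` | The packaged JS directly proves that the standard snapshot fields, status determination, and collector metadata exist. |
| Queue collection request | The business monitoring rule calls `repairOrder/list` | `code-confirmed` | The queue collection logic exists directly in the rule assets. |
| Audio reminder call | Both the business monitoring rule and the auto-processing rule call `mac.audioPlay(...)` | `code-confirmed` | Only confirms that the audio reminder call exists at the rule layer; it does not show runtime-verified success. |
| SMS send call | The auto-processing rule calls `mac.sendMessages(request)` | `code-confirmed` | Only confirms that the SMS send call exists at the rule layer. |
| Phone / outbound-call trigger | The auto-processing rule calls `mac.callOutLogin(params)` | `code-confirmed` | Only confirms that the outbound-call trigger branch exists at the rule layer. |
| Auto-dispatch request call | The auto-processing rule requests `repairOrder/initProcess` | `code-confirmed` | The auto-dispatch request branch can be located directly. |
| `setDisposeLog` success write | The success branch of the auto-processing rule writes `state="成功"` | `code-confirmed` | The dispose log write on the success path is clearly defined. |
| `setDisposeLog` failure write | The failure branch of the auto-processing rule writes `state="失败"` | `code-confirmed` | The dispose log write on the failure path is clearly defined. |
| `setDisposeLog` exception write | The exception branch of the auto-processing rule writes `state="异常"` | `code-confirmed` | The dispose log write on the exception path is clearly defined. |
| `setDisposeLog` unmatched write | The unmatched branch of the auto-processing rule writes `state="未匹配"` | `code-confirmed` | The dispose log write on the unmatched path is clearly defined. |
| Strict production of `new_pending_ids` | Required only in skill/reference/data-quality | `implementation intent exists but not rigorous / buggy` | The target semantics are clear, but the rule assets read so far do not directly emit a field of that name. |
| Treating downstream action results as collection success | Explicitly forbidden by skill/reference | `contract-defined` | The contract requires separating snapshot success from side-effect success. |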
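## 9. Current code concerns / non-rigorous points
1. The most obvious known bug is the pending classification condition in the business monitoring rule, written as `item.status == "00" && item.status == "01"`. It prevents the `pending` bucket from working as the author intended. Evidence level: `implementation intent exists but not rigorous / buggy`.
2. Output naming still differs between the packaged collector and the rule assets: the collector uses `scene`, `pending_ids`, and `new_pending_ids`, while the rule object uses `type` and `pendingList`. Evidence level: `code-confirmed`.
3. `SKILL.md` treats `new_pending_ids` as a key output, but the stronger direct evidence today is "compare logs and decide whether to remind / auto-dispatch", not "explicitly emit a field of that name". Evidence level: `implementation intent exists but not rigorous / buggy`.
4. `scene.draft.json` is still undecided on whether `trigger-alert` should be split into three actions (audio-alert, message-alert, callout), which shows that the standard action model has not fully converged. Evidence level: `no direct evidence / candidate only`.
5. Although audio, SMS, outbound-call, auto-dispatch, and log-write definitions exist at the rule layer, this document cannot claim on that basis that those actions have been verified at runtime; any such escalation would not be rigorous.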
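## 10. Suggested revisions to the command-center standard configuration
1. Split the implementation evidence for this scene explicitly into two layers:
- `packaged_collector`: the runtime snapshot collector in `collect_repair_orders.js`, with its status determination, history comparison, and metadata (rule sources, config base page role, known issues). Evidence level: `code-confirmed`.
- `rule_asset_workflow`: the actual flow branches in the business monitoring and auto-processing rule assets. Evidence level: `code-confirmed`.
2. Split business monitoring and auto-processing into two sub-flows in the standard configuration:
- `monitoring_flow` maps to `95598抢修-市指_业务检测配置.txt`.
- `auto_processing_flow` maps to `95598抢修-市指_自动处理配置.txt`.
This avoids blending the two rules into a single collector.
3. The output schema should distinguish:
- `canonical_snapshot_fields`: standard fields such as `pending_ids` / `new_pending_ids`;
- `observed_rule_fields`: rule-layer fields such as `pendingList` / `type`.
Both naming schemes currently coexist. Evidence level: `code-confirmed`.
4. Add a `known_bug_note` to the status classification that records the `status == "00" && status == "01"` pending classification bug, so that later documents do not present the pending count as a stable fact. Evidence level: `implementation intent exists but not rigorous / buggy`.
5. Add an `effect_channels` breakdown for downstream actions, separating at least `audio-reminder`, `sms-send`, `callout-trigger`, `auto-dispatch-request`, and `dispose-log-write`, since all of these appear directly in the rule assets. Evidence level: `code-confirmed`.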
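## 11. Final rigorous conclusion
For `95598-repair-city-dispatch`, the most reliable conclusion at present is that the repository contains both a testable packaged JS runtime collector and two stronger desk rule scripts (`D:\desk\智能体资料\大四区报告监测项\95598抢修-市指_业务检测配置.txt` and `D:\desk\智能体资料\大四区报告监测项\95598抢修-市指_自动处理配置.txt`). The packaged collector directly implements repair-order classification, monitor/dispose log comparison, `new_pending_ids` derivation, and the `success/partial/empty/blocked` status determination. The business monitoring rule directly confirms queue collection, log comparison, audio reminders, and monitor log writes. The auto-processing rule directly confirms deduplication, crew matching, the auto-dispatch request, SMS sending, the outbound-call trigger, and the `setDisposeLog` write definitions on the success / failure / exception / unmatched paths. Evidence level: `code-confirmed`.
It must be stated just as strictly that these `code-confirmed` labels only prove that "these branches exist at the code or rule layer"; they do not prove that the runtime has been verified to succeed. In addition, the desk business monitoring rule contains the `status == "00" && status == "01"` pending classification bug, so the rule workflow itself cannot be described as rigorous and error-free. For command-center, this scene is best modeled as a monitor scene in which "the packaged collector already provides input-driven snapshot normalization, the desk rule-asset workflow carries the stronger evidence, and the monitoring flow and the auto-processing flow must be expressed separately".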


@@ -0,0 +1,155 @@
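# 95598-weekly-monitor-report operation analysis
## 1. Scene overview
`95598-weekly-monitor-report` corresponds to the scene "95598、12398及配网设备监控情况周统计" (weekly statistics on 95598, 12398, and distribution-network equipment monitoring). Its goal is to consolidate weekly statistics from the 95598, 12398, and distribution-network equipment sources into one unified weekly report. Based on `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\scenes\95598-weekly-monitor-report\scene.json`, `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\95598-weekly-monitor-report\SKILL.md`, and `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\95598-weekly-monitor-report\scripts\collect_weekly_metrics.js`, the hardest direct evidence at present is that the script defines six section templates, an empty artifact, `period`, `status: "ok"`, and `partial_reasons: []`. Evidence level: `code-confirmed`.
It must also be made clear that the packaged script defines the artifact schema / section templates far more strongly than it proves live browser collection, multi-source weekly aggregation, dual-period alignment, or export behavior. In other words, this scene is currently closer to a "weekly report structure template script" than to a "live browser collector that the code strictly proves can run end to end". Evidence level: `code-confirmed`.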
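## 2. Evidence sources
This analysis uses only four evidence level labels: `code-confirmed`, `contract-defined`, `implementation intent exists but not rigorous / buggy`, and `no direct evidence / candidate only`. The artifact schema / section templates defined directly by the script are classed as `code-confirmed`; dual-period semantics, collection logic, and downstream actions that the script does not directly implement are not raised above their corresponding weaker labels.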
1. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\95598-weekly-monitor-report\scripts\collect_weekly_metrics.js`
- Directly defines six section templates and returns an empty artifact. Evidence level: `code-confirmed`.
2. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\95598-weekly-monitor-report\SKILL.md`
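- Describes reading the current-period and cumulative-period, validating the session, collecting multiple source groups, normalizing section data, and returning both periods, the included source groups, period alignment issues, and so on in the output. This reads more like a runtime contract and implementation direction, so the evidence level is mainly `contract-defined` and `implementation intent exists but not rigorous / buggy`.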
3. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\scenes\95598-weekly-monitor-report\scene.json`
- Defines the scene input `period`, the dependencies `browser` / `multi-source` / `period-alignment` / `local-report-service`, and the actions `query` / `collect-report` / `aggregate-sections` / `align-periods`. Evidence level: `code-confirmed`.
4. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\95598-weekly-monitor-report\references\collection-flow.md`
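- States explicitly that the entry page provides two date ranges (current-period and cumulative-period), and that both ranges are read first, then source groups are collected, then results are normalized by section. Evidence level: `contract-defined`.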
5. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\95598-weekly-monitor-report\references\data-quality.md`
- Specifies complete results, partial rules, common weaknesses, and the distinction between empty and failure. Evidence level: `contract-defined`.
6. `D:\data\ideaSpace\rust\sgClaw\claw-new\docs\superpowers\specs\2026-04-08-command-center-virtual-employee-inventory.json`
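- Already organizes this scene into command-center views such as `workflow`, `status_model`, `hidden_dependencies`, and `open_questions`. Part of that is re-abstraction and should not be promoted back into implementation evidence. Evidence level: `no direct evidence / candidate only` (only for the parts where the inventory alone cannot prove that the packaged script implements them).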
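## 3. Actual entry point and runtime boundary
The actual entry point is fixed in `scene.json`: the browser scene `index.html`, the skill tool name `95598-weekly-monitor-report.collect_weekly_metrics`, and the output artifact `report-artifact`. All of these are `code-confirmed`.
The runtime boundary is where this scene is most visibly non-rigorous:
- The scene and the script both keep only a single `period` field. Evidence level: `code-confirmed`.
- `SKILL.md`, `collection-flow.md`, and the inventory all state explicitly that the page actually has two inputs, `current-period` and `cumulative-period`. Evidence level: `contract-defined`.
- The scene also declares `period-alignment` as a dependency and `align-periods` as an action, but the script contains no corresponding execution logic. `code-confirmed` holds for the existence of the metadata, while "period alignment is implemented" can only be labeled `implementation intent exists but not rigorous / buggy`.
The most rigorous boundary judgment is therefore: the upper-layer metadata and reference documents all describe this scene as a "dual-period, multi-source section weekly report that needs period alignment", but the packaged script actually provides only an empty artifact template shell and has not yet proven any live collection behavior.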
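## 4. Operation flow confirmed by code
Only the following flow can be confirmed directly in the current script:
1. Call `collectWeeklyMetrics(input)`.
2. Read `input.period || ""` into the `period` of the returned object.
3. Build an empty main table: `columns: []` and `rows: []`.
4. Copy six sections from `SECTION_TEMPLATES`, each starting with `rows: []`.
5. Return `type: "report-artifact"`, `report_name`, `status: "ok"`, and `partial_reasons: []`.
All of these are `code-confirmed`.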
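The five confirmed steps can be pictured with the reconstruction below. It is a sketch written from this analysis, not a copy of `collect_weekly_metrics.js`: the template shape (`name`, `columns`), the `sections` key, the guard against a missing `input`, and the sample period string are assumptions, and the real script may differ in all of them.

```javascript
// Column sets as described in this analysis; the real script may define them differently.
const PERIOD_COLUMNS = ["current_period", "cumulative", "year_over_year"];
const VALUE_COLUMNS = ["value"];

// ASSUMPTION: the template shape ({ name, columns }) is hypothetical.
const SECTION_TEMPLATES = [
  { name: "fault-repair", columns: PERIOD_COLUMNS },
  { name: "frequent-outage", columns: PERIOD_COLUMNS },
  { name: "full-aperture-work-orders", columns: PERIOD_COLUMNS },
  { name: "key-opinion-control", columns: VALUE_COLUMNS },
  { name: "device-monitoring", columns: VALUE_COLUMNS },
  { name: "proactive-dispatch", columns: VALUE_COLUMNS },
];

// Returns an empty artifact shell: no collection, no alignment, fixed status.
function collectWeeklyMetrics(input) {
  return {
    type: "report-artifact",
    report_name: "95598-weekly-monitor-report",
    period: (input && input.period) || "",
    columns: [],
    rows: [],
    // Copy each template so callers cannot mutate the shared definitions.
    sections: SECTION_TEMPLATES.map((template) => ({
      name: template.name,
      columns: [...template.columns],
      rows: [],
    })),
    status: "ok",
    partial_reasons: [],
  };
}

// Hypothetical period value, for illustration only.
const artifact = collectWeeklyMetrics({ period: "2026-W15" });
console.log(artifact.sections.length); // 6
console.log(artifact.status); // ok
```

Seen this way, the shell carries structure only: every `rows` array is empty and `status` never leaves `"ok"`, which matches the "template script" reading above.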
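Behaviors such as "reading the two date ranges current-period / cumulative-period", "validating multi-system sessions", "collecting 95598 / 12398 / distribution-network equipment metrics by source group", "performing period alignment", and "exporting the weekly report or writing a report log" are described only in `SKILL.md` and the reference. They do not appear as executable logic in the packaged script, so they cannot count as an "operation flow confirmed by code".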
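## 5. Normalized abstract flow
For a command-center normalized abstraction, this scene can be organized as:
1. Receive the weekly report task request.
2. Parse the current-period and cumulative-period.
3. Validate multi-system access and session context.
4. Collect weekly statistics by source group.
5. Merge the results into the six sections.
6. Check consistency of, or align, the current-period and cumulative-period.
7. Generate the `report-artifact`.
8. Run downstream actions such as export / logging where applicable.
Of these, "the six section schemas already exist" in step 5 and "return the artifact shell" in step 7 are `code-confirmed`. Steps 2, 3, 4, 6, and 8 come mainly from the target-flow descriptions in the skill/reference/scene, with evidence level `contract-defined`; a claim that the packaged script already implements them has to drop to `implementation intent exists but not rigorous / buggy`.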
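## 6. Inputs, context, and dependencies
### Inputs
- `period` is the explicit input shared by the scene and the script. Evidence level: `code-confirmed`.
- `currentPeriod` / `cumulativePeriod` (or current-period / cumulative-period) are the real business inputs that `SKILL.md`, the reference, and the inventory require, implicitly or explicitly. Evidence level: `contract-defined`.
- This means the current input model has an obvious conflict: the unified configuration exposes only `period`, while the scene semantics actually depend on two periods. Evidence level: `implementation intent exists but not rigorous / buggy`.
### Runtime context
- `session`, multi-source system accounts / cached tokens, the browser-visible page, period alignment context, and similar items are described in the scene/reference. The declarations at the scene metadata level are `code-confirmed`; the concrete business semantics are `contract-defined`.
- "period-alignment-context" is listed by the inventory as a runtime_context, which can be seen as a re-abstraction of the scene/reference. It can be kept as a suggested structure but should not be promoted to a capability the script already implements.
### Dependencies
- `browser`, `multi-source`, `period-alignment`, and `local-report-service` can be located directly in `scene.json`. Evidence level: `code-confirmed`.
- Concrete dependencies such as `/a_js/YPTAPI.js` and `http://localhost:13313/ReportServices/*` come from the reference. Evidence level: `contract-defined`.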
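## 7. Output structure
The output structure directly confirmed by the current script includes:
- `type: "report-artifact"`
- `report_name: "95598-weekly-monitor-report"`
- `period`
- `columns: []`
- `rows: []`
- 6 fixed section templates
- `status: "ok"`
- `partial_reasons: []`
All of the above is `code-confirmed`.
The six sections defined directly by the script are:
1. `fault-repair`
2. `frequent-outage`
3. `full-aperture-work-orders`
4. `key-opinion-control`
5. `device-monitoring`
6. `proactive-dispatch`
Of these sections, the first three use the three value columns `current_period`, `cumulative`, and `year_over_year`, and the last three use only `value`. Evidence level: `code-confirmed`. A modeling ambiguity also appears here:
- The top level of the output artifact keeps only a single `period`.
- The sections internally already imply a dual-period view through `current_period` and `cumulative`.
- The skill/reference text, in turn, stresses two inputs, current-period and cumulative-period.
So "how the dual-period input maps onto the artifact's top-level period and the section column structure" is currently not rigorous. Evidence level: `implementation intent exists but not rigorous / buggy`.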
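## 8. Downstream action evidence table
| Downstream action | Current evidence | Evidence level | Rigorous conclusion |
| --- | --- | --- | --- |
| Return a sectioned `report-artifact` | `collect_weekly_metrics.js` returns the object directly | `code-confirmed` | A weekly report artifact template shell exists, but it still holds no data. |
| Six section templates exist | The script directly defines `SECTION_TEMPLATES` | `code-confirmed` | Only confirms that the output section schema exists; it does not confirm real data collection. |
| Dual-period reading | Described only in `SKILL.md` / `collection-flow.md` | `contract-defined` | The contract clearly requires current-period and cumulative-period, but the script does not implement it. |
| Multi-source weekly statistics collection | Described only in skill/reference | `contract-defined` | The target flow is clear; the current packaged script does not directly prove it. |
| period alignment | Scene action/dependency plus skill/reference notes | `implementation intent exists but not rigorous / buggy` | Metadata and documents both express the need for alignment, but the script has no alignment logic and the model remains vague. |
| Export the weekly report | The reference mentions localhost report services | `contract-defined` | Only confirms that a downstream service constraint exists; it does not confirm that the current skill performs an export. |
| Report log write | skill/reference mention report-log | `contract-defined` | Only system-level conceptual evidence; the current script makes no direct call. |
| partial / blocked / empty status subdivision | Defined in the reference; the script fixes `status: "ok"` | `implementation intent exists but not rigorous / buggy` | The intent of the status model is clear, but the packaged script does not carry it yet. |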
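## 9. Current code concerns / non-rigorous points
1. The modeling conflict between `period` and `currentPeriod/cumulativePeriod` stands out most. The scene and script keep only `period` at the top level, yet the skill/reference explicitly requires dual-period input, and the column structure of the first three sections also implies two periods. This shows that the existing standard input design is not rigorous. Evidence level: `implementation intent exists but not rigorous / buggy`.
2. `period-alignment` is declared as a dependency and also listed as the action `align-periods`, but the script has no alignment implementation, so "the period alignment capability is implemented" does not hold. Evidence level: `implementation intent exists but not rigorous / buggy`.
3. The first three sections use the column name `cumulative`, while the skill/output description says `cumulative period`; column names, input names, and top-level field names share no unified model. Evidence level: `implementation intent exists but not rigorous / buggy`.
4. `status` is fixed at `"ok"`, which is inconsistent with the reference's requirement to distinguish partial / empty / blocked. `code-confirmed` holds for this current state.
5. Although the scene/skill is clearly a multi-source weekly statistic, the script has no source group collection or mapping logic at all, so "the weekly statistics collector has landed" cannot be raised to a current code fact. Evidence level: `no direct evidence / candidate only` (with respect to the live collection execution layer).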
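## 10. Suggested revisions to the command-center standard configuration
1. Revise the standard input of this scene from a single `period` to an explicit dual-period structure, for example `currentPeriod` and `cumulativePeriod`. If a unified routing entry is still needed, an upper-layer `period` summary can be kept in addition, but it cannot replace the execution-layer dual-period fields. Evidence level: `implementation intent exists but not rigorous / buggy`.
2. Split `period-alignment` into two parts in the standard configuration:
- `period_model`: the dual-period input structure;
- `alignment_rule`: how consistency between the two periods is checked.
The current scene only expresses that alignment is needed without giving a strict model, so this revision is necessary.
3. In the artifact configuration, distinguish:
- `implemented_section_templates`: the six sections are directly implemented by the script. Evidence level: `code-confirmed`.
- `implemented_collection_logic`: not directly proven by the packaged script; it must be explicitly graded lower.
4. Unify the column names of the first three sections into a more consistent configuration naming, such as `current_period` / `cumulative_period` / `year_over_year`, to avoid mixing three vocabularies across script column names, skill text, and the standard configuration.
5. Split the status model into a "contract layer" and an "implementation layer", so that command-center does not mistake `partial` / `blocked` for stable determinations the current collector already provides.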
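Suggestions 1 and 2 can be sketched as a `period_model` plus an `alignment_rule` check. This is a proposal-level illustration only: the `{ start, end }` shape, the `YYYY-MM-DD` format, the function names, and above all the alignment rule itself (the cumulative range must start no later than the current range and end on the same day) are assumptions, since the source documents do not define the rule.

```javascript
// Parse "YYYY-MM-DD" into a UTC timestamp; yields NaN for malformed input.
function toTime(dateText) {
  return Date.parse(`${dateText}T00:00:00Z`);
}

// ASSUMED alignment rule (not defined by the scene or the references):
// both ranges are well-formed, the cumulative range starts no later than
// the current range, and both ranges end on the same day.
function checkPeriodAlignment(periodModel) {
  const issues = [];
  const { currentPeriod, cumulativePeriod } = periodModel;
  const named = [
    ["currentPeriod", currentPeriod],
    ["cumulativePeriod", cumulativePeriod],
  ];
  for (const [label, range] of named) {
    if (!range || Number.isNaN(toTime(range.start)) || Number.isNaN(toTime(range.end))) {
      issues.push(`${label} is missing or malformed`);
    } else if (toTime(range.start) > toTime(range.end)) {
      issues.push(`${label} starts after it ends`);
    }
  }
  if (issues.length === 0) {
    if (toTime(cumulativePeriod.start) > toTime(currentPeriod.start)) {
      issues.push("cumulativePeriod starts after currentPeriod");
    }
    if (toTime(cumulativePeriod.end) !== toTime(currentPeriod.end)) {
      issues.push("period end dates differ");
    }
  }
  return { aligned: issues.length === 0, issues };
}

// Hypothetical sample ranges.
const alignedCheck = checkPeriodAlignment({
  currentPeriod: { start: "2026-04-06", end: "2026-04-12" },
  cumulativePeriod: { start: "2026-01-01", end: "2026-04-12" },
});
const misalignedCheck = checkPeriodAlignment({
  currentPeriod: { start: "2026-04-06", end: "2026-04-12" },
  cumulativePeriod: { start: "2026-01-01", end: "2026-04-05" },
});

console.log(alignedCheck.aligned); // true
console.log(misalignedCheck.issues); // [ 'period end dates differ' ]
```

Returning a list of issues rather than a bare boolean would let the result feed a `partial_reasons`-style field, though whether the real collector should do so is an open design question.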
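## 11. Final rigorous conclusion
For `95598-weekly-monitor-report`, the most reliable conclusion at present is that the repository contains a six-section weekly report artifact template implementation that explicitly provides section names, column schemas, a top-level `period` field, and basic status fields. Evidence level: `code-confirmed`.
The current evidence is not sufficient, however, to describe it as "a real browser weekly statistics collector that strictly implements dual periods, multiple sources, and period alignment". The related dual-period reading, source group collection, period alignment, export, and logging behaviors exist mainly in the target descriptions of `SKILL.md`, `collection-flow.md`, `data-quality.md`, and the scene metadata. In particular, the modeling of `period` vs `currentPeriod/cumulativePeriod` and of `period-alignment` remains clearly vague, which shows that this scene is best classified as a staged report scene where "the section schema is defined, but live browser collection behavior has not yet been strictly confirmed by the script".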


@@ -0,0 +1,203 @@
# 指挥中心规格文档证据分级规则
## 目的
这份文档用于统一指挥中心相关规格文档中的证据表达方式,明确区分:
- 已被代码或规则资产直接证实的事实
- 已被外部接口或文档契约明确约束的事实
- 代码中表达了实现方向,但实现质量、完整性或正确性仍不充分的内容
- 当前没有直接证据、只能作为候选判断的内容
目标不是让规格文档写得更保守,而是让“观察到的事实”“归纳后的结构”“目标态设计”之间的边界始终可追溯、可复核、可讨论。
## 为什么必须分级
如果不做证据分级,指挥中心文档很容易把三类内容混写在一起:
1. 代码里已经存在并可直接定位的行为
2. 为了便于抽象而做出的归一化整理
3. 未来希望达成、但当前未被运行时或资产严格证明的目标结构
混写的直接问题是:
- 读者会把“推断出的整理结果”误认为“当前已实现事实”
- 后续实现或重构时,无法判断某一条到底是在复述现状,还是在提出目标
- 多份规格文档之间会出现证据强弱不一致、措辞口径不一致的问题
因此,所有指挥中心规格文档都必须对关键判断显式标注证据等级。
## 证据标签
以下 4 个标签为唯一允许使用的标准标签,必须按原文书写,不得改写,不得替换为同义词。
### 1. `code-confirmed`
定义:该结论可由当前仓库中的代码、规则资产、静态配置或可直接定位的实现内容直接支持。
适用场景:
- 某个字段、流程步骤、状态分类、规则动作在代码或规则资产中可直接定位
- 某个输出结构、配置项、动作通道已被实现内容明确写出
- 某条成功路径虽然未证明线上真实跑通,但“存在该逻辑分支”这一事实已被代码直接证实
使用边界:
- `code-confirmed` 只证明“代码/资产中存在该实现或定义”
- 不自动等于“生产可用”“运行时已验证成功”“端到端已闭环”
### 2. `contract-defined`
定义:该结论不是直接来自仓库实现,而是由当前被认可的接口契约、协议文档、外部约束文档明确规定。
适用场景:
- 浏览器侧/服务侧接口字段、消息格式、状态码语义由契约文档定义
- 某一能力边界来自明确的外部 API 文档或经项目认可的集成约束
使用边界:
- `contract-defined` 证明“契约如此定义”
- 不自动等于“本仓库已实现”
- 如果代码实现与契约不一致,应分别描述,不得互相覆盖
### 3. `implementation intent exists but not rigorous / buggy`
定义:代码中已经出现实现意图、雏形或局部链路,但当前证据不足以把它写成稳定事实;或者已知实现不严谨、存在缺口、疑似有 bug、成功语义未被严格证明。
适用场景:
- 能看到相关函数、分支、调用点、配置项或动作名,但缺少足够证据证明其稳定成立
- 逻辑存在,但状态语义混乱、异常处理不足、前后约束不完整
- 只能证明“作者想做这件事”,不能证明“这件事已经被可靠实现”
使用边界:
- 该标签用于承认“实现方向存在”
- 同时明确指出“不能把它提升为已确认事实”
- 这是指挥中心文档中承接“代码里有影子,但证据不够硬”的唯一合法标签
### 4. `no direct evidence / candidate only`
定义:当前没有找到代码、规则资产、契约文档或其他直接证据;该内容只能作为候选结构、候选能力、候选拆分或待确认项。
适用场景:
- 为了统一配置结构而提出的候选字段
- 为了后续架构演进而提出的候选能力名称
- 仅由推测、命名习惯、经验归纳得到的判断
使用边界:
- 该标签明确表示“目前只是候选,不是事实”
- 不能把它写成“已有但待接入”“已支持但未启用”之类更强说法,除非另有直接证据
## 推荐表述模板
### `code-confirmed`
可用表述:
- “根据当前代码/规则资产,可直接确认……,证据等级:`code-confirmed`。”
- “文档中的……来自现有实现直接证据,证据等级:`code-confirmed`。”
- “这里只能确认代码层存在该成功路径/动作定义,证据等级:`code-confirmed`;不代表运行时已验证。”
### `contract-defined`
可用表述:
- “根据当前接口契约,……被定义为……,证据等级:`contract-defined`。”
- “该字段/消息结构来自认可的集成契约,证据等级:`contract-defined`。”
- “这里描述的是契约约束,不等于仓库内实现已完成,证据等级:`contract-defined`。”
### `implementation intent exists but not rigorous / buggy`
可用表述:
- “当前实现中可以看到……的意图,但证据尚不足以将其写成稳定事实,证据等级:`implementation intent exists but not rigorous / buggy`。”
- “代码存在相关链路,但实现不够严谨/疑似有缺口,因此仅标为 `implementation intent exists but not rigorous / buggy`。”
- “目前最多只能确认作者试图支持……,不能确认其已被可靠实现,证据等级:`implementation intent exists but not rigorous / buggy`。”
### `no direct evidence / candidate only`
可用表述:
- “……目前没有直接证据,只能作为候选项,证据等级:`no direct evidence / candidate only`。”
- “该拆分/命名属于归一化建议,不代表现状事实,证据等级:`no direct evidence / candidate only`。”
- “除非后续补到代码或契约证据,否则这里只能保持为 `no direct evidence / candidate only`。”
## 禁止表述模式
以下表述在指挥中心规格文档中禁止使用,除非同时给出更低证据等级并明确限定范围。
### 1. 禁止把代码存在误写为运行时已验证
禁止示例:
- “系统已经稳定支持……”
- “该链路已完成闭环……”
- “运行时已证明可以成功……”
问题:这些表述把“代码里有逻辑”错误提升成“真实运行已被验证”。
### 2. 禁止把推断结构误写为既有事实
禁止示例:
- “当前配置结构就是……”
- “系统已有统一能力模型……”
- “所有任务已经按该 schema 实现……”
问题:如果只是为了整理而归纳出的标准结构,应标为候选或目标态,不能写成现状。
### 3. 禁止使用模糊强化词替代证据标签
禁止示例:
- “基本可以认为……”
- “大概率就是……”
- “看起来已经支持……”
- “应该算是实现了……”
问题:模糊判断会绕开证据分级,导致读者无法判断结论强度。
### 4. 禁止自造同义标签或混用近义词
禁止示例:
- “代码已确认”
- “契约已定义”
- “半实现”
- “待验证”
- “候选”
问题:这些中文近义词会破坏跨文档一致性。必须使用本文规定的 4 个精确标签原文。
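## Example: `95598-repair-city-dispatch`
Example conclusions:
- For `95598-repair-city-dispatch`, success-path behaviors such as audio reminders, SMS / message reminders, outbound calls, and dispose logs should be written as `code-confirmed` if they can be located directly in the rule assets or implementation.
- That only shows that "the code or rules contain definitions of these success paths".
- It does not justify writing "audio / SMS / outbound calls / dispose logs have been triggered successfully and stably at runtime".
- If there is currently no end-to-end runtime verification evidence, "runtime success" can only be written as `implementation intent exists but not rigorous / buggy`, or as `no direct evidence / candidate only` when the evidence is weaker still; it cannot be raised to `code-confirmed`.
Recommended wording:
"In `95598-repair-city-dispatch`, the success paths for audio reminders, SMS / message reminders, outbound calls, and dispose logs can be located directly in the rule assets, so these 'success-path behaviors defined at the rule layer' can be labeled `code-confirmed`. There is currently no evidence of equal strength that these actions succeed stably at real runtime, so the conclusion 'runtime success has been verified' cannot be labeled `code-confirmed`; in the absence of strict runtime evidence it should be labeled `implementation intent exists but not rigorous / buggy`."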
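## Enforcement rules
- All command-center specification documents must use the 4 exact labels defined in this document.
- No synonyms, Chinese substitutes, abbreviations, or custom level names are allowed.
- A key conclusion without an evidence level is treated as non-compliant.
- When one topic involves both "code facts" and "target structure", split it into separate sentences and label each; do not merge them into one vague conclusion.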
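These rules lend themselves to a mechanical check. The sketch below is one possible lint, not an existing tool in the repository: the function names, the marker pattern (`证据等级：` or an English `Evidence level:` prefix followed by a backtick-quoted label), and the sample text are assumptions for illustration.

```javascript
// The four exact labels; anything else is non-standard by definition.
const EVIDENCE_LABELS = [
  "code-confirmed",
  "contract-defined",
  "implementation intent exists but not rigorous / buggy",
  "no direct evidence / candidate only",
];

// ASSUMPTION: labels appear as a backtick-quoted span right after the marker.
function extractLabels(text) {
  const pattern = /(?:证据等级|Evidence level)[:：]\s*`([^`]+)`/g;
  return Array.from(text.matchAll(pattern), (match) => match[1]);
}

// Exact, case-sensitive comparison: near-synonyms and case variants are flagged.
function findNonStandardLabels(labels) {
  return labels.filter((label) => !EVIDENCE_LABELS.includes(label));
}

// Hypothetical sample containing one valid label and two forbidden ones.
const sampleText =
  "Queue collection request. Evidence level: `code-confirmed`. " +
  "短信发送，证据等级：`半实现`。" +
  "Outbound call. Evidence level: `Code-Confirmed`.";

const foundLabels = extractLabels(sampleText);
const flaggedLabels = findNonStandardLabels(foundLabels);

console.log(foundLabels.length); // 3
console.log(flaggedLabels); // [ '半实现', 'Code-Confirmed' ]
```

A check like this covers only the "exact label" rule; whether a key conclusion is missing a label altogether still needs human review.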
## 最短落地准则
写每一条关键判断前,先问两个问题:
1. 我是在复述当前已被直接证据支持的事实,还是在做归一化整理/目标设计?
2. 我手上的证据,到底支撑的是代码存在、契约约束、实现意图,还是根本没有直接证据?
只有先回答这两个问题,指挥中心规格文档才能保持严格、可复核和可持续重写。


@@ -0,0 +1,639 @@
# 指挥中心虚拟员工标准配置清单建议结构
> 免责声明:本文件描述的是“未来可采用的规范化目标配置结构”,不是当前 staged runtime 已稳定实现的结构,也不是对现状的直接复述。文中所有“目标 schema 字段”都必须与当前证据分级文档一起阅读;凡缺乏静态资产直接支撑的字段,只能视为 normalization choice 或 open / candidate 字段,不能表述为当前已稳定存在。
## 目标
这份结构文档的用途,是把当前 evidence-graded 现状文档中的信息,逐步映射为后续可维护、可扩展、可复用的目标配置清单。
因此必须同时保持两条边界:
1. 当前已观察到的事实,来自 evidence-graded current-state docs。
2. The unified schema proposed here is a normalization proposal made for future command-center configuration governance.
它们不能混写,更不能把 normalization proposal 误写成当前实现事实。
---
## 一、当前证据文档与目标配置的关系
当前已经存在三类文档角色:
1. `2026-04-08-command-center-virtual-employee-inventory-table.md`
- 作用:给人读的 current-state 总览
- 性质:当前观察结果,不是配置 schema
2. `2026-04-08-command-center-virtual-employee-inventory.json`
- 作用:给机器读的 current-state inventory
- 性质:机器可消费的盘点结果,不是目标配置
3. 各 scene 的 `*-operation-analysis.md`
- 作用:记录每个场景的证据来源、强弱、已知问题和边界
- 性质:最关键的证据支撑层
本文件提出的目标配置结构,是在这些 current-state 文档之上的“规范化目标层”。
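### Mapping principles
- `code-confirmed` conclusions from the operation-analysis documents can map first to "evidence-derived fields" in the target schema.
- `contract-defined` conclusions can map to "declared / contract-backed fields", but cannot be assumed to be implemented by the current runtime.
- Content graded `implementation intent exists but not rigorous / buggy` should go into the target schema's `known_issues`, `implementation_gap`, `notes`, or `open_questions`, rather than being packaged as stable primary fields.
- Content graded `no direct evidence / candidate only` can be kept only as a normalization choice, candidate field, or future extension.
In short, the evidence-graded current-state docs tell us "what can rigorously be said now", while this document only explains "how a unified configuration could carry that information in the future".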
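The mapping principles can be read as a routing table from evidence grade to target slot. The sketch below is hypothetical: the slot names `evidence_derived_fields`, `declared_fields`, and `candidate_fields`, the choice of `known_issues` as the single destination for intent-grade content, and the sample findings are assumptions, not part of any existing schema.

```javascript
// ASSUMED slot names; only `known_issues` is a name used by the proposal itself.
const TARGET_SLOT_BY_GRADE = new Map([
  ["code-confirmed", "evidence_derived_fields"],
  ["contract-defined", "declared_fields"],
  ["implementation intent exists but not rigorous / buggy", "known_issues"],
  ["no direct evidence / candidate only", "candidate_fields"],
]);

// Group graded findings by target slot; reject any non-standard grade outright.
function groupFindingsBySlot(findings) {
  const grouped = {};
  for (const finding of findings) {
    const slot = TARGET_SLOT_BY_GRADE.get(finding.grade);
    if (slot === undefined) {
      throw new Error(`non-standard evidence grade: ${finding.grade}`);
    }
    if (!grouped[slot]) grouped[slot] = [];
    grouped[slot].push(finding.statement);
  }
  return grouped;
}

// Hypothetical findings paraphrased from the operation analyses.
const groupedFindings = groupFindingsBySlot([
  { grade: "code-confirmed", statement: "rule calls repairOrder/list" },
  {
    grade: "implementation intent exists but not rigorous / buggy",
    statement: "pending classification condition can never be true",
  },
  { grade: "no direct evidence / candidate only", statement: "email channel" },
]);

console.log(Object.keys(groupedFindings));
// [ 'evidence_derived_fields', 'known_issues', 'candidate_fields' ]
```

Throwing on an unknown grade keeps weaker or improvised labels from silently landing in a stable slot.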
---
## 二、推荐文件组织
```text
command-center/
  employee.json
  capabilities.json
  tasks/
    fault-details-report.json
    jinchang-business-environment-weekly-report.json
    95598-weekly-monitor-report.json
    95598-repair-city-dispatch.json
    jiayuguan-meter-outage.json
```
### 文件职责
- `employee.json`
- 描述这个虚拟员工是谁、职责范围是什么、默认采用什么证据口径
- `capabilities.json`
- 维护归一化能力词表
- 明确哪些能力来自现有证据,哪些只是规范化命名
- `tasks/*.json`
- 每个场景一份目标配置
- 承接当前证据与未来标准字段的映射关系
### 为什么仍然推荐三层拆分
这类拆分仍然成立,但要加一条限定:
- 这是一种 target architecture proposal
- 不是当前仓库已存在的稳定目录结构
- 尤其 `capabilities.json` 代表“统一能力词表”的目标态,而不是当前 staged assets 已实现的统一能力注册表
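The three-layer split itself is therefore a normalization choice; its evidence level should not be higher than `no direct evidence / candidate only` unless it is actually realized as a file structure in the future.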
---
## 三、`employee.json` 目标结构
### 3.1 推荐示例
```json
{
"id": "command-center-virtual-employee",
"name": "指挥中心虚拟员工",
"domain": "电力业务指挥中心",
"positioning": "负责业务监测、统计报表、异常识别与后续提醒/处置支撑的虚拟运营员工",
"mission": [
"采集业务数据并生成结构化报表",
"监测工单/事件并识别待处理对象",
"比较历史记录识别新增待办",
"为提醒、外呼、自动派单、自动处理等下游动作提供输入"
],
"task_ids": [
"fault-details-report",
"jinchang-business-environment-weekly-report",
"95598-weekly-monitor-report",
"95598-repair-city-dispatch",
"jiayuguan-meter-outage"
],
"default_evidence_model": [
"code-confirmed",
"contract-defined",
"implementation intent exists but not rigorous / buggy",
"no direct evidence / candidate only"
],
"default_status_model": [
"success",
"partial",
"empty",
"blocked"
]
}
```
### 3.2 字段分层说明
#### A. 可直接由当前证据承接的字段
- `name`
- `domain`
- `task_ids` (provided it maps only the 5 scenes inventoried so far)
- `default_evidence_model`
这些字段之所以较容易承接,是因为 current-state inventory 已经稳定整理出对应对象和场景清单。
但仍要注意:这只是“可从当前文档整理得到”,不是说仓库里已经存在一个运行中的 `employee.json`
#### B. normalization choices
- `id`
- `positioning`
- `mission`
- `default_status_model`
这些字段主要是为了让目标配置更易治理、更可复用,属于规范化整理,不应表述为 staged runtime 现状。
#### C. open / candidate 字段
建议预留但暂不稳定化:
- `default_runtime_requirements`
- `default_result_types`
- `default_downstream_policy`
- `org_scope`
- `region_scope`
原因是:当前不同 scene 在“上下文依赖、输出类型、地区语义、下游策略”上并不一致,过早把这些做成员工级稳定字段会拔高现状。
---
## 四、`capabilities.json` 目标结构
### 4.1 推荐示例
```json
{
"catalog_version": 1,
"evidence_method": "evidence-graded",
"core": [
{
"id": "browser-collection",
"name": "浏览器采集",
"kind": "normalized-capability",
"evidence_basis": "derived-from-multiple-scenes"
},
{
"id": "report-generation",
"name": "报表生成",
"kind": "normalized-capability",
"evidence_basis": "derived-from-report-scenes"
},
{
"id": "monitor-snapshot",
"name": "监测快照",
"kind": "normalized-capability",
"evidence_basis": "derived-from-monitor-scenes"
}
],
"channels": [
{
"id": "audio-remind",
"name": "音频提醒",
"kind": "normalized-channel",
"observed_in": [
"95598-repair-city-dispatch",
"jiayuguan-meter-outage"
]
},
{
"id": "message-remind",
"name": "消息提醒",
"kind": "normalized-channel",
"observed_in": [
"95598-repair-city-dispatch"
],
"notes": "在 jiayuguan-meter-outage 中只看到保留意图,不应等同视为稳定现状。"
}
],
"actions": [
{
"id": "auto-dispatch",
"name": "自动派单",
"kind": "normalized-action"
}
]
}
```
### 4.2 字段分层说明
#### A. 可由当前证据承接的字段
- `observed_in`
- `notes`
- `evidence_basis`
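If `capabilities.json` is ever actually created, the first thing to preserve is not "the capability names themselves" but the evidence mapping between capabilities and scenes, because capability evidence clearly differs in strength across the current scenes:
- The 3 report scenes are mostly schema/template stubs.
- For the 2 monitor scenes, the stronger workflow evidence comes mainly from rule assets.
- Channels such as `message-remind`, `callout`, and `auto-dispatch` differ in strength across scenes.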
#### B. normalization choices
- `core`
- `channels`
- `actions`
- `id`
- `name`
- `kind`
这些统一词表字段本身就是规范化选择。当前没有直接证据表明仓库中已经存在统一 capability registry。
#### C. open / candidate 字段
建议保持候选态:
- `required_contexts`
- `result_semantics`
- `stability_level`
- `implemented_by`
- `runtime_owner`
这些字段看起来很有用,但 staged assets 还不足以稳定支撑它们。
### 4.3 对能力词表的关键限制
- 不要把 `report-export``audio-remind``callout` 之类词条本身写成“已全局统一支持”。
- 不要因为某个规则资产里出现了调用,就把它提升为所有 scene 的稳定 capability。
- `email` should remain a candidate for now and should not enter the set of "supported channels".
---
## 五、`tasks/*.json` 目标结构
### 5.1 统一推荐骨架
```json
{
"id": "95598-repair-city-dispatch",
"name": "95598抢修-市指",
"category": "monitor",
"current_state": {
"primary_evidence_summary": "rule assets stronger than packaged JS stub",
"source_refs": [],
"known_issues": []
},
"binding": {
"scene_id": "95598-repair-city-dispatch",
"skill_package": "95598-repair-city-dispatch",
"tool": "collect_repair_orders"
},
"trigger": {
"observed": {},
"normalized": {},
"open_questions": []
},
"inputs": {
"observed": {},
"normalized": {},
"open_questions": []
},
"systems": {
"observed": {},
"normalized": {},
"open_questions": []
},
"workflow": {
"observed": [],
"normalized": [],
"open_questions": []
},
"result": {
"observed": {},
"normalized": {},
"open_questions": []
},
"downstream_effects": {
"observed": [],
"normalized": [],
"open_questions": []
},
"required_capabilities": {
"normalized": [],
"open_questions": []
},
"status_model": {
"declared": {},
"implemented_notes": []
},
"evidence_grades": {},
"open_questions": []
}
```
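The core goal of this skeleton is not to "fill in every field" but to force a distinction between:
- `observed`
- `normalized`
- `open_questions`
This prevents the future-facing target config from being miswritten as current state.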
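The three-way distinction can be enforced with a small structural check on a task draft. The sketch below is illustrative only: the function name `findMissingParts` and the sample draft are assumptions, and the list of groups is read off the skeleton above (the groups that carry all three keys).

```javascript
// Groups that carry observed / normalized / open_questions in the skeleton.
const THREE_PART_GROUPS = [
  "trigger",
  "inputs",
  "systems",
  "workflow",
  "result",
  "downstream_effects",
];
const REQUIRED_PARTS = ["observed", "normalized", "open_questions"];

// Return dotted paths for every required part that a draft does not define.
function findMissingParts(task) {
  const missing = [];
  for (const group of THREE_PART_GROUPS) {
    const raw = task[group];
    const value = raw !== null && typeof raw === "object" ? raw : {};
    for (const part of REQUIRED_PARTS) {
      if (!(part in value)) missing.push(`${group}.${part}`);
    }
  }
  return missing;
}

// Hypothetical draft: `workflow` forgets its open_questions list.
const draftTask = {
  trigger: { observed: {}, normalized: {}, open_questions: [] },
  inputs: { observed: {}, normalized: {}, open_questions: [] },
  systems: { observed: {}, normalized: {}, open_questions: [] },
  workflow: { observed: [], normalized: [] },
  result: { observed: {}, normalized: {}, open_questions: [] },
  downstream_effects: { observed: [], normalized: [], open_questions: [] },
};

console.log(findMissingParts(draftTask)); // [ 'workflow.open_questions' ]
```

The check is purely structural: an empty `observed` object passes, so it guards against dropping a layer, not against leaving it unfilled.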
---
## 六、报表类任务在目标 schema 中应如何表达
适用对象:
- `fault-details-report`
- `jinchang-business-environment-weekly-report`
- `95598-weekly-monitor-report`
### 6.1 当前证据对目标 schema 的约束
这 3 个任务当前最强直接证据主要是:
- 已有 `report-artifact` 结构壳
- 已有 section/template 定义
- 已有 `status` / `partial_reasons` 字段壳
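They all lack live collection evidence of equal strength, however. If this target schema is adopted, it is therefore advisable to keep an explicit current-state hint, for example: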
```json
"current_state": {
"primary_evidence_summary": "packaged script mainly confirms artifact schema / section template; live collection remains contract-defined or weaker"
}
```
### 6.2 报表类字段分层
#### A. evidence-derived fields
- `binding.scene_id`
- `binding.skill_package`
- `binding.tool`
- `result.observed.artifact_type`
- `result.observed.key_fields`
- `systems.observed.browser_pages`
- `source_refs`
#### B. normalization choices
- `trigger.normalized.natural_language_examples`
- `inputs.normalized.runtime_context`
- `workflow.normalized`
- `required_capabilities.normalized`
- `downstream_effects.normalized`
#### C. open / candidate fields
- `period_model`
- `section_semantics`
- `region_scope`
- `alignment_rule`
- `report_export_policy`
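### 6.3 Task-specific constraints for report tasks
#### `fault-details-report`
- If this target schema is adopted, keep `period` as the external field, but preferably let the execution layer expand it into `startTime/endTime`.
- `summary-sheet` should be marked "template confirmed", not miswritten as "summary derivation implemented".
#### `jinchang-business-environment-weekly-report`
- If this target schema is adopted, express "4 fixed section templates observed" separately from "real multi-source collection implemented".
- Whether `region` becomes a stable field is still an open item.
#### `95598-weekly-monitor-report`
- If this target schema is adopted, reserve `currentPeriod` and `cumulativePeriod`, but note that this is a proposal to correct the current modeling conflict.
- `period alignment` should be listed separately as a schema group or `alignment_rule`, rather than assumed to exist stably in the runtime.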
---
## 七、监测类任务在目标 schema 中应如何表达
适用对象:
- `95598-repair-city-dispatch`
- `jiayuguan-meter-outage`
### 7.1 当前证据对目标 schema 的约束
这两个任务与报表类不同:
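- The packaged JS collector already has input-driven `monitor-snapshot` normalization / comparison logic, and also attaches rule sources, the config base page role, and known-issue / identity-model notes.
- The stronger workflow evidence comes mainly from rule assets (under the current inventory convention, chiefly the `D:/desk/智能体资料/大四区报告监测项/*.txt` rule scripts).
- `assets/scene-snapshot/index.html` belongs only to the configuration base layer and should not count as primary workflow execution evidence.
If this target schema is adopted, it is therefore advisable to distinguish explicitly: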
```json
"current_state": {
"packaged_stub_strength": "code-confirmed",
"rule_asset_workflow_strength": "code-confirmed",
"notes": "workflow evidence is stronger in rule assets than in packaged JS stub"
}
```
### 7.2 监测类字段分层
#### A. evidence-derived fields
- `binding.*`
- `inputs.observed.explicit`
- `systems.observed.upstream_apis`
- `systems.observed.local_services`
- `workflow.observed`
- `result.observed`
- `downstream_effects.observed`
- `current_state.known_issues`
#### B. normalization choices
- `workflow.normalized`
- `required_capabilities.normalized`
- `canonical_snapshot_fields`
- `effect_channels`
#### C. open / candidate fields
- `identity_model`
- `downstream_policy`
- `alert_channel_split`
- `auto_processing_policy`
- `dependency_promotion_rules`
### 7.3 各监测任务的特别约束
#### `95598-repair-city-dispatch`
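If this target schema is adopted, it is advisable to keep the following notes:
- Strong workflow evidence comes mainly from rule assets (in the current inventory, chiefly `D:/desk/智能体资料/大四区报告监测项/95598抢修-市指_业务检测配置.txt` and `D:/desk/智能体资料/大四区报告监测项/95598抢修-市指_自动处理配置.txt`), not from the packaged JS stub.
- The `pending` classification has the `status == "00" && status == "01"` bug.
- `pending_ids/new_pending_ids` are closer to canonical target fields than to fields the current rule layer strictly emits under those names.
It is advisable to record this bug directly in: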
```json
"current_state": {
"known_issues": [
"pending classification bug: status == \"00\" && status == \"01\""
]
}
```
#### `jiayuguan-meter-outage`
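If this target schema is adopted, it is advisable to keep the following notes:
- Strong workflow evidence comes mainly from rule assets (in the current inventory, chiefly `D:/desk/智能体资料/大四区报告监测项/户表失电-嘉峪关_业务监测配置.txt` and `D:/desk/智能体资料/大四区报告监测项/户表失电-嘉峪关_自动处理配置.txt`), not from the packaged JS stub.
- The marketing token is a hard dependency of the auto-processing chain.
- The monitor pending list uses `consNo` while dispose dedupe uses `eventId`, so the identity model is inconsistent.
In this target schema proposal, it is therefore advisable to list separately: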
```json
"identity_model": {
"monitor_pending_identity": "consNo",
"dispose_dedupe_identity": "eventId",
"status": "implementation intent exists but not rigorous / buggy"
}
```
这类字段不应被伪装成“已经统一好的 snapshot identity model”。
---
## 八、推荐统一字段清单与证据边界
下面给出一个更严格的统一字段视图。
### 1. 元数据层
较适合作为稳定 target schema 的字段:
- `id`
- `name`
- `category`
- `binding.scene_id`
- `binding.skill_package`
- `binding.tool`
其中:
- `binding.*` 更偏 evidence-derived
- `id/name/category` 更偏 normalization choice
### 2. 现状映射层
建议新增并长期保留:
- `current_state.primary_evidence_summary`
- `current_state.source_refs`
- `current_state.known_issues`
- `current_state.notes`
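This is one of the most important new design points of this rewrite. Without this layer, the target schema can easily mix "target structure" with "current-state evidence" again.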
### 3. 触发层
- `trigger.observed`
- `trigger.normalized`
- `trigger.open_questions`
### 4. 输入层
- `inputs.observed`
- `inputs.normalized`
- `inputs.open_questions`
### 5. 系统层
- `systems.observed`
- `systems.normalized`
- `systems.open_questions`
### 6. 流程层
- `workflow.observed`
- `workflow.normalized`
- `workflow.open_questions`
### 7. 结果层
- `result.observed`
- `result.normalized`
- `result.open_questions`
### 8. 下游动作层
- `downstream_effects.observed`
- `downstream_effects.normalized`
- `downstream_effects.open_questions`
### 9. 能力层
- `required_capabilities.normalized`
- `required_capabilities.open_questions`
### 10. 证据层
- `evidence_grades`
- `source_refs`
### 11. 人工确认层
- `open_questions`
- `known_issues`
---
## 九、为什么这次建议在 target schema 中显式保留“现状层”
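The old structure was prone to the following problems:
- Writing the aggregate inventory directly as "the standard configuration already looks like this".
- Misreading normalized fields such as `required_capabilities` and `downstream_effects` as the runtime's current state.
- Equating the workflow in rule assets directly with the packaged script implementation.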
因此这次建议最关键的修订不是多加几个字段,而是要求 target schema 同时携带:
1. `observed current state`
2. `normalized target structure`
3. `open / candidate items`
只有这样,后续继续扩展新 scene 时,文档才不会再次把三类内容混在一起。
---
## 十、建议的落地顺序
1. 先把 current-state inventory 保持为证据分级后的事实盘点。
2. 再基于 inventory 生成目标态 `employee.json` / `capabilities.json` / `tasks/*.json` 草案。
3. 落地草案时,强制为每个 major group 补齐:
- `observed`
- `normalized`
- `open_questions`
4. First converge the known key non-rigorous points:
- `period` vs `startTime/endTime` in `fault-details-report`
- dual periods / period alignment in `95598-weekly-monitor-report`
- the pending classification bug in `95598-repair-city-dispatch`
- the `consNo` vs `eventId` identity inconsistency in `jiayuguan-meter-outage`
5. 最后再考虑是否把能力词表与 target config 接入真实消费链路。
注意:在这些问题未收敛前,不应把目标配置字段写成“已经稳定”。
---
## 十一、推荐结论
如果目标是形成“指挥中心虚拟员工的标准配置清单”,那么未来仍然可以采用:
- `employee.json`
- `capabilities.json`
- `tasks/*.json`
这样的三层结构。
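Unlike the old version, however, this structure must explicitly acknowledge that:
- It is a target architecture proposal, not a restatement of the current state.
- Every major schema group must distinguish evidence-derived fields, normalization choices, and open / candidate fields.
- The evidence-graded current-state docs are the basis for the current state.
- The 3 report scenes are currently mostly schema/template stubs.
- For `95598-repair-city-dispatch` and `jiayuguan-meter-outage`, strong workflow evidence lies mainly in rule assets.
- `95598-repair-city-dispatch` has a pending classification bug.
- `jiayuguan-meter-outage` has a `consNo` / `eventId` identity inconsistency.
- Runtime verification should not be claimed anywhere.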
只有在保持这些边界的前提下,这份“标准配置结构”才是严谨可持续的目标态提案,而不是再次把现状、推断和目标混写在一起。


@@ -0,0 +1,121 @@
# 指挥中心虚拟员工业务盘点清单(表格版)
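> Note: this file is a "current-state overview", not a target configuration schema. From this rewrite onward, all judgments use the four-level evidence model `code-confirmed`, `contract-defined`, `implementation intent exists but not rigorous / buggy`, and `no direct evidence / candidate only`. Conclusions are based only on static inspection of staged / checked-in assets and do not represent any runtime verification.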
## 盘点范围
本表覆盖当前已整理的 5 个 staged scene / skill
- `fault-details-report`
- `jinchang-business-environment-weekly-report`
- `95598-weekly-monitor-report`
- `95598-repair-city-dispatch`
- `jiayuguan-meter-outage`
## 虚拟员工定位
以下“虚拟员工定位”是对当前 5 个 scene 的归一化汇总视角,不是当前仓库里已存在统一员工对象的直接事实;证据等级:`no direct evidence / candidate only`。在这个归一化视角下,可把它理解为“面向电力业务指挥中心的任务型虚拟运营员工”,其职责边界可概括为:
- 以报表模板或监测快照形式承载结构化结果
- 对工单/事件队列做规则化监测与历史比较
- 为提醒、日志、外呼、自动派单、自动处理等下游动作提供输入语义
- 为未来统一配置清单提供归一化抽象基础
但必须强调:以上职责并不等于所有场景都已由统一 packaged runtime 严格实现,更不等于已完成运行时验证。
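## Evidence label quick reference
| Label | Strict meaning |
| --- | --- |
| `code-confirmed` | Facts that can be located directly in the current repository code, rule assets, or static configuration |
| `contract-defined` | Facts explicitly specified by scene descriptions, reference flows, or interface / documentation contracts |
| `implementation intent exists but not rigorous / buggy` | An implementation direction or partial chain is visible, but it is not rigorous, has gaps, or has known bugs |
| `no direct evidence / candidate only` | No direct evidence at present; usable only as a candidate abstraction, candidate structure, or item pending confirmation |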
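## Business inventory table
| Name | Scene ID | Category | Current task goal | Observed systems / evidence basis | Evidence grading summary | Strict notes / open issues | Analysis document |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 故障明细 (fault details) | `fault-details-report` | Report | Carries the fault details report result as a "fault details main table + summary-sheet section". | `scene.json`, `SKILL.md`, `scripts/collect_fault_details.js`, `references/collection-flow.md`, `references/data-quality.md` | `code-confirmed`: the `report-artifact` shell, main-table columns, `summary-sheet` template, and `status` / `partial_reasons` fields are directly defined. `contract-defined`: page time reading, fault query, field normalization, summary derivation, export / log semantics. `implementation intent exists but not rigorous / buggy`: the `period` vs `startTime/endTime` modeling is not rigorous, and status subdivision exists only at the contract layer. | Currently closer to a "report schema/template stub"; it must not be described as a strictly implemented live browser collector, nor as runtime-verified. | `D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-08-fault-details-report-operation-analysis.md` |
| 国网金昌供电公司营商环境周例会报告 (State Grid Jinchang Power Supply Company business environment weekly meeting report) | `jinchang-business-environment-weekly-report` | Report | Carries the business environment weekly report in four fixed section templates. | `scene.json`, `SKILL.md`, `scripts/collect_business_environment_metrics.js`, `references/collection-flow.md`, `references/data-quality.md` | `code-confirmed`: four section templates, an empty artifact, `period`, and basic status fields exist. `contract-defined`: multi-source metric collection, weekly range reading, section aggregation, export / log semantics. `implementation intent exists but not rigorous / buggy`: `region` appears only in descriptive text and has not entered a stable schema. | This is a "sectioned weekly report template", not a proven live collector; it must not be described as stably collecting from multiple business systems. | `D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-08-jinchang-business-environment-weekly-report-operation-analysis.md` |
| 95598、12398及配网设备监控情况周统计 (weekly statistics for 95598, 12398, and distribution-network equipment monitoring) | `95598-weekly-monitor-report` | Report | Carries weekly statistics in six fixed section templates. | `scene.json`, `SKILL.md`, `scripts/collect_weekly_metrics.js`, `references/collection-flow.md`, `references/data-quality.md` | `code-confirmed`: six section templates, an empty artifact, top-level `period`, and basic status fields exist. `contract-defined`: dual-period input, period alignment, multi-source weekly statistics collection. `implementation intent exists but not rigorous / buggy`: the conflict between `period` and `currentPeriod/cumulativePeriod` is evident, and period alignment is required only at the metadata / documentation layer. | All three report scenes are closer to "packaged schema/template stubs" and should not be described as implemented live collectors; this scene additionally has unclosed dual-period modeling. | `D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-08-95598-weekly-monitor-report-operation-analysis.md` |
| 95598抢修-市指 (95598 repair, city command) | `95598-repair-city-dispatch` | Monitoring | Monitors the repair work-order queue, identifies pending / under-review / processed orders, and provides input for alert, logging, and automatic dispatch chains. | `scene.json`, `SKILL.md`, `scripts/collect_repair_orders.js`, `D:/desk/智能体资料/大四区报告监测项/95598抢修-市指_业务检测配置.txt`, `D:/desk/智能体资料/大四区报告监测项/95598抢修-市指_自动处理配置.txt` | `code-confirmed`: the packaged JS now directly implements an input-driven `monitor-snapshot` collector that performs repair-order classification, monitor/dispose log comparison, `new_pending_ids` derivation, and `success/partial/empty/blocked` status determination, and carries `workflow_rule_sources`, `config_base_page/config_base_role`, and `known_issues` metadata; stronger evidence for queue collection, log comparison, audio alerts, SMS, outbound calls, automatic dispatch, and dispose-log writing exists directly in the desk rule scripts. `contract-defined`: snapshot semantics and downstream side effects must be expressed separately. `implementation intent exists but not rigorous / buggy`: the desk rules contain a `status == "00" && status == "01"` pending classification bug; at the rule layer `new_pending_ids` is still more a normalization target than a stable field of the same name. | For this scene the desk workflow evidence is still stronger than the packaged collector, and current evidence of actual scheduled execution rests mainly on the desk rule scripts; `assets/scene-snapshot/index.html` is only a configuration base page. No runtime success may be claimed. | `D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-08-95598-repair-city-dispatch-operation-analysis.md` |
| 户表失电-嘉峪关 (household meter outage, Jiayuguan) | `jiayuguan-meter-outage` | Monitoring | Monitors household meter outage events, combines service work-order status and historical logs to identify pending items, and provides input for the automatic processing chain. | `scene.json`, `SKILL.md`, `scripts/collect_outage_events.js`, `D:/desk/智能体资料/大四区报告监测项/户表失电-嘉峪关_业务监测配置.txt`, `D:/desk/智能体资料/大四区报告监测项/户表失电-嘉峪关_自动处理配置.txt` | `code-confirmed`: the packaged JS now directly implements an input-driven `monitor-snapshot` collector that computes `pending/audit/processed` from outage / service-order data, compares monitor/dispose logs, derives `new_pending_ids`, outputs `success/partial/empty/blocked`, and carries `workflow_rule_sources`, `config_base_page/config_base_role`, and `identity_model` metadata; stronger outage collection, service-order enrichment, monitor/dispose log comparison, and marketing-token-dependent automatic processing and dispatch branches exist directly in the desk rule scripts. `contract-defined`: the snapshot and downstream automatic processing must be understood separately. `implementation intent exists but not rigorous / buggy`: the monitoring pending list uses `consNo` while dispose dedupe uses `eventId`, so the identity model is inconsistent; for the SMS channel only reserved intent / commented-out code is visible. | Here too the desk workflow evidence is stronger than the packaged collector, and current evidence of actual scheduled execution rests mainly on the desk rule scripts; `assets/scene-snapshot/index.html` is only a configuration base page. The identity inconsistency must be retained; `pending_ids/new_pending_ids` must not be described as strictly and uniformly defined. | `D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-08-jiayuguan-meter-outage-operation-analysis.md` |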
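## Current-state summary
### 1. Common conclusions for report scenes
- `fault-details-report`
- `jinchang-business-environment-weekly-report`
- `95598-weekly-monitor-report`
For these 3 scenes, the strongest direct evidence is concentrated in "the packaged script defines the artifact schema / section templates / basic status fields".
A rigorous description of these 3 scenes is therefore:
- `code-confirmed`: structure templates, field shells, and section definitions exist
- `contract-defined`: a clear target collection flow and quality requirements exist
- `implementation intent exists but not rigorous / buggy`: runtime collection, period alignment, status subdivision, and export / log chains are not confirmed by the packaged JS with equal strength
In other words, they are currently mainly "structured report template scenes" and should not be described as "verified live collectors".
### 2. Common conclusions for monitoring scenes
- `95598-repair-city-dispatch`
- `jiayuguan-meter-outage`
These 2 scenes differ from the report scenes:
- The packaged JS collector already has input-driven `monitor-snapshot` normalization / comparison logic
- Stronger workflow evidence exists mainly in the desk rule assets
- The rule assets directly show collection, comparison, alert, logging, and dispatch branches
A rigorous description of these 2 scenes is therefore:
- `code-confirmed`: the rule assets do define fairly strong monitoring / automatic processing chains
- This still proves only that "these implementation branches exist at the rule layer"
- It must not be extended to "runtime is already stably successful"
### 3. Currently unclosed global issues
- `fault-details-report`: the relationship between `period` and `startTime/endTime` is unclosed
- `jinchang-business-environment-weekly-report`: `region` semantics appear only in descriptive text and have not become a stable field
- `95598-weekly-monitor-report`: the relationship among `period`, `currentPeriod/cumulativePeriod`, and period alignment is unclosed
- `95598-repair-city-dispatch`: the pending classification rule contains the `status == "00" && status == "01"` bug
- `jiayuguan-meter-outage`: monitor pending uses `consNo` while dispose dedupe uses `eventId`; the identity model is inconsistent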
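The pending classification bug in `95598-repair-city-dispatch` can be reproduced in isolation: a single value cannot equal both `"00"` and `"01"`, so the quoted condition never matches. The sketch below is illustrative only; the sample orders are made up, and replacing `&&` with `||` is an assumption about the intended rule, not a confirmed fix.

```javascript
// Condition as quoted from the desk rule: can never be true for one status value.
function isPendingAsWritten(status) {
  return status == "00" && status == "01";
}

// Assumed intent (not confirmed by the source): either code counts as pending.
function isPendingAssumedIntent(status) {
  return status === "00" || status === "01";
}

// Made-up sample orders for illustration.
const sampleOrders = [{ status: "00" }, { status: "01" }, { status: "02" }];

const pendingAsWritten = sampleOrders.filter((o) => isPendingAsWritten(o.status)).length;
const pendingAssumed = sampleOrders.filter((o) => isPendingAssumedIntent(o.status)).length;
console.log(pendingAsWritten, pendingAssumed);
```

Whatever the intended rule turns out to be, the as-written condition yields an empty pending set for every input, which is why the issue is graded as a bug rather than a modeling choice.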
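## Capability view by evidence grade
### `code-confirmed`
- Basic structural shells of the report artifact / monitor snapshot
- Fixed section/template definitions of the report scenes
- Existence of collection, comparison, logging, alert, and dispatch branches in the rule assets of the two monitoring scenes
### `contract-defined`
- Target collection flow, export semantics, and quality constraints of the report scenes
- The principle of separating "snapshot success" from "side-effect success" in the monitoring scenes
- Dual-period / multi-source / alignment semantics of the weekly reports
### `implementation intent exists but not rigorous / buggy`
- Implementation intent for live collector, period alignment, and status subdivision in the report scenes
- The pending classification bug in `95598-repair-city-dispatch`
- The identity key inconsistency in `jiayuguan-meter-outage`
- Several downstream channels exist as definitions or commented-out code, which is not enough to promote them to stable current state
### `no direct evidence / candidate only`
- The unified capability names themselves
- The field-splitting scheme of the future standard configuration
- Candidate channels such as `email` for which no direct evidence is currently visible
## Usage boundaries
This file is intended only to help readers quickly understand the "currently observed business inventory state". If you need to:
- See the evidence sources and grading rationale for each scene, read the corresponding operation-analysis document
- See the machine-readable inventory structure, read `2026-04-08-command-center-virtual-employee-inventory.json`
- See the proposed future target configuration structure, read `2026-04-08-command-center-standard-config-structure.md`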

File diff suppressed because it is too large


@@ -0,0 +1,142 @@
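# fault-details-report operation analysis
## 1. Scene overview
`fault-details-report` corresponds to the "故障明细" (fault details) scene, whose stated goal is to query fault details and generate a structured report containing detail and summary sections. Based on `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\scenes\fault-details-report\scene.json`, `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\fault-details-report\SKILL.md`, and `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\fault-details-report\scripts\collect_fault_details.js`, the strongest direct evidence at present is that the packaged script explicitly defines the column structure of the report artifact, the summary section name, the empty-result shape, and the default `status: "ok"`; evidence grade: `code-confirmed`.
The same evidence, however, shows no actual execution code for real browser page scraping, request triggering, row-level normalization, or summary derivation. In other words, the current packaged script defines the artifact schema / section template far more strongly than it proves live browser collection behavior; evidence grade: `code-confirmed`.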
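## 2. Evidence sources
This analysis uses only four evidence grade labels: `code-confirmed`, `contract-defined`, `implementation intent exists but not rigorous / buggy`, and `no direct evidence / candidate only`. The artifact schema / section template defined directly by the script is graded `code-confirmed`; runtime semantics and downstream actions not directly implemented by the script are not raised above their corresponding weaker labels.
1. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\fault-details-report\scripts\collect_fault_details.js`
- Directly defines `DETAIL_COLUMNS`, `SUMMARY_COLUMNS`, the returned object fields, empty `rows`, empty `sections[0].rows`, `status: "ok"`, and `partial_reasons: []`; evidence grade: `code-confirmed`.
2. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\fault-details-report\SKILL.md`
- States that the expected workflow is to read the time range, collect raw fault details, normalize them in the canonical column order, derive the summary sheet, and return the artifact. This is a skill description and target runtime contract: it proves intent and expected output but cannot by itself prove that the script implements every step. Overall it is most rigorously described as `contract-defined` coexisting with `implementation intent exists but not rigorous / buggy`.
3. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\scenes\fault-details-report\scene.json`
- Defines the scene input as `period`, the dependencies as `browser` / `report-history` / `local-report-service`, and the actions as `query` / `collect-report` / `build-summary-section`. This is scene metadata; evidence grade: `code-confirmed`.
4. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\fault-details-report\references\collection-flow.md`
- Gives the reference flow "read start and end time, trigger the repair-order query, collect details, normalize by `excleIni[0].cols`, derive the summary-sheet, then return the artifact". It defines the expected collection semantics; evidence grade: `contract-defined`.
5. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\fault-details-report\references\data-quality.md`
- Gives required columns, nullable columns, summary derivation expectations, partial rules, and the empty/failure distinction. This is a quality-constraint reference; evidence grade: `contract-defined`.
6. `D:\data\ideaSpace\rust\sgClaw\claw-new\docs\superpowers\specs\2026-04-08-command-center-virtual-employee-inventory.json`
- Already organizes this scene into `workflow`, `result.key_fields`, `status_model`, and `open_questions`. It can serve as the current command-center-side summary, but part of its content is a re-organization of scene/skill/reference material and must not be treated in reverse as new implementation evidence; evidence grade: `no direct evidence / candidate only` (only for the parts where the inventory alone cannot prove that the packaged script implements them).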
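## 3. Actual entry point and runtime boundaries
The actual entry point is declared in `scene.json` as the browser scene `index.html`; the skill package tool name is `fault-details-report.collect_fault_details`, and the artifact type is `report-artifact`. All of these are definitions that can be located directly in the current repository; evidence grade: `code-confirmed`.
Regarding runtime boundaries:
- The scene metadata declares only `inputs: ["period"]`; evidence grade: `code-confirmed`.
- The reference flow, however, explicitly requires reading the `start` / `end` times from the page's datetime range control; evidence grade: `contract-defined`.
- Hence the most rigorous summary at present is that "the external unified input is called `period`, but the page's real business input looks more like a `startTime/endTime` pair". The inventory file also lists this under `open_questions`; evidence grade: `implementation intent exists but not rigorous / buggy`.
It should also be stressed that the currently runnable packaged script contains no browser operations, request calls, page parsing, or localhost export calls. Its actual boundary is therefore closer to "a schema stub that returns a predefined empty artifact" than to "a strictly implemented end-to-end browser collector"; evidence grade: `code-confirmed`.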
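## 4. Actual operation flow confirmed by code
The only flow the current code strictly confirms is the following minimal loop:
1. Call `collectFaultDetails(input)`.
2. Read `input.period || ""` into the `period` field of the returned object.
3. Write `DETAIL_COLUMNS` into the main table `columns`.
4. Write an empty array into the main table `rows`.
5. Construct a section named `summary-sheet` and write `SUMMARY_COLUMNS` and empty `rows` into it.
6. Return `type: "report-artifact"`, `report_name: "fault-details-report"`, `status: "ok"`, and `partial_reasons: []`.
Every step above can be located directly in `collect_fault_details.js`; evidence grade: `code-confirmed`.
The following operations have no corresponding implementation code in the packaged script and appear only in the skill/reference text: reading the page time, triggering the repair-order query, scraping fault rows, normalizing fields, deriving the summary from details, determining partial/empty/blocked, and calling the export service or report log service. Their evidence grade is at most `contract-defined` or `implementation intent exists but not rigorous / buggy`, and they must not be described as an actual flow confirmed by current code.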
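The confirmed minimal loop (steps 1 to 6 in this section) can be sketched as follows. This is a reconstruction from the listed steps, not the contents of `collect_fault_details.js`: the column names in `DETAIL_COLUMNS` and `SUMMARY_COLUMNS` are placeholders, the sample `period` string is made up, and the exact nesting of the returned object is an assumption.

```javascript
// Placeholder column lists; the real values live in collect_fault_details.js.
const DETAIL_COLUMNS = ["detail_col_a", "detail_col_b"];
const SUMMARY_COLUMNS = ["summary_col_a"];

function collectFaultDetails(input) {
  return {
    type: "report-artifact",
    report_name: "fault-details-report",
    period: input.period || "",
    columns: DETAIL_COLUMNS,
    rows: [], // the stub collects no rows
    sections: [{ name: "summary-sheet", columns: SUMMARY_COLUMNS, rows: [] }],
    status: "ok", // fixed value; no partial / empty / blocked branches
    partial_reasons: [],
  };
}

// Illustrative call; the period format is not specified by the source.
const artifact = collectFaultDetails({ period: "2026-04-01/2026-04-07" });
console.log(artifact.status, artifact.rows.length, artifact.sections[0].name);
```

Seen this way, the function is a pure template: its output depends on `input.period` and nothing else, which matches the "schema stub" description.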
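## 5. Standardized abstract flow
For a strict command-center abstraction, the more reasonable standardized flow at present is:
1. Parse the external task input.
2. Map the business time range to the page query parameters.
3. Run the browser-side query and collect the fault detail rows.
4. Normalize the main table data in the agreed column order.
5. Derive the `summary-sheet` from the detail results.
6. Generate the `report-artifact`.
7. If needed, run downstream actions such as export / logging.
Of these, step 6, "generating an artifact structure with a main table plus summary-sheet", is directly supported by the script; evidence grade: `code-confirmed`. Steps 2, 3, 4, 5, and 7 come mainly from the scene description and reference documents and are not facts implemented by the current script; each should be labeled `contract-defined` or `implementation intent exists but not rigorous / buggy` as appropriate.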
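## 6. Inputs, context, and dependencies
### Inputs
- `period` is used directly by the scene metadata and the script input; evidence grade: `code-confirmed`.
- "The page actually reads a start time and an end time" comes from the workflow descriptions in `references/collection-flow.md` and `SKILL.md`; evidence grade: `contract-defined`.
- The relationship between `period` and `startTime/endTime` is therefore not rigorous at present: `period` is quite possibly just an upper-layer unified abstraction, while the underlying real collector needs two time fields; evidence grade: `implementation intent exists but not rigorous / buggy`.
### Runtime context
- The browser page being reachable, the page date control existing, and the session being logged in come from the combined description of scene/inventory/reference; the evidence grade is `code-confirmed` (the metadata exists) together with `contract-defined` (the specific semantics).
- `report-history` and `local-report-service` are declared as dependencies, but the reference also stresses that historical reports are not the primary data source and that the localhost service is a downstream dependency; evidence grade: `code-confirmed` plus `contract-defined`.
### Dependencies
- Dependency names or organized items such as `browser`, `fault-detail-query-source`, and `local-report-service` can be located directly in the scene or inventory; evidence grade: `code-confirmed`.
- More specific dependencies such as `/a_js/YPTAPI.js`, `http://localhost:13313/ReportServices/*`, and `faultDetailsExportXLSXS` come from the reference; evidence grade: `contract-defined`.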
## 7. Output structure
The current output structure is the hardest direct evidence for this scene. `collect_fault_details.js` directly defines:
- `type: "report-artifact"`
- `report_name: "fault-details-report"`
- `period`
- Main table `columns` = `DETAIL_COLUMNS`
- Main table `rows` = `[]`
- `sections[0].name = "summary-sheet"`
- `sections[0].columns = SUMMARY_COLUMNS`
- `sections[0].rows = []`
- `status = "ok"`
- `partial_reasons = []`
All of the above are `code-confirmed`.
`SKILL.md` and `data-quality.md` additionally require the output to reflect diagnostic information such as detail row count, summary row count, required column coverage, complete/partial status, missing columns, weak mappings, and downstream failures. Apart from the `status` and `partial_reasons` field shells, which already exist, none of this diagnostic content is implemented in the script; evidence grade: `implementation intent exists but not rigorous / buggy`.
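## 8. Downstream action evidence table
| Downstream action | Current evidence | Evidence grade | Rigorous conclusion |
| --- | --- | --- | --- |
| Generate the `report-artifact` returned upstream | `collect_fault_details.js` returns the object directly | `code-confirmed` | A stable artifact structure stub exists, but it currently returns an empty data template. |
| Detail column order standardization | `DETAIL_COLUMNS` is explicitly defined | `code-confirmed` | Only confirms that the column schema is defined; cannot confirm that real row data is mapped in this order. |
| `summary-sheet` section exists | `summary-sheet` is constructed directly in `sections` | `code-confirmed` | Only confirms that the section template exists; cannot confirm that real summary derivation logic is implemented. |
| Page collection of fault detail rows | Described only in `SKILL.md` / `collection-flow.md` | `contract-defined` | A clear target flow exists, but the current packaged script does not directly prove it is implemented. |
| Summary derivation | Described only in `SKILL.md` / `collection-flow.md` / `data-quality.md` | `contract-defined` | Contract and quality requirements exist, but there is no script-level derivation code as evidence. |
| Export to Excel | Scene dependencies and the reference mention the localhost export service | `contract-defined` | This is a downstream dependency definition and does not mean this skill currently performs the export. |
| Write report log | Scene depends on `report-history`; the reference mentions report-log | `contract-defined` | Only confirms that the concept exists in the system; the current script does not directly implement log writing. |
| partial / empty / blocked status subdivision | The skill/reference has rules; the script hard-codes `status: "ok"` | `implementation intent exists but not rigorous / buggy` | The status-model intent exists, but the packaged script does not yet strictly carry these branches. |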
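## 9. Current code concerns / rigor gaps
1. The modeling of `period` vs `startTime/endTime` is inconsistent. The scene and script keep only `period`, while the reference explicitly requires reading start/end times; this makes it hard for command-center to tell whether the standard input is a single string or two separate time fields; evidence grade: `implementation intent exists but not rigorous / buggy`.
2. The script hard-codes `status` to `"ok"`, while the reference and `SKILL.md` explicitly distinguish success / partial / empty / blocked, so the current implementation cannot carry these semantics. `code-confirmed` holds for the current state, while "subdivided statuses should be supported" is `contract-defined`.
3. `partial_reasons` exists as a field, but the script has no logic to populate it, so it counts only as a schema placeholder; evidence grade: `code-confirmed`.
4. `DETAIL_COLUMNS` and `SUMMARY_COLUMNS` are defined, but there is no code mapping page data to column values; the claim "field normalization capability has landed" cannot hold; the evidence grade is at most `implementation intent exists but not rigorous / buggy`.
5. Downstream export and logging exist in the reference material, but the current skill script does not call the related services, so "the report can directly produce Excel" cannot be stated as a fact of the current code; evidence grade: `no direct evidence / candidate only` (as far as actual execution inside the packaged script is concerned).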
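## 10. Suggested revisions to the command-center standard configuration
1. Revise this scene's input from a single `period` to a more rigorous two-layer expression:
- The external unified layer can keep `period` for routing;
- The execution layer should explicitly expand `startTime` / `endTime`.
The conclusion that "expansion is needed" comes from reconciling the conflict between scene and reference; evidence grade: `implementation intent exists but not rigorous / buggy`.
2. In the standard configuration, keep "artifact schema defined, live collector unproven" as a separate field or note, so that command-center does not mistake the schema stub for an implemented collector; evidence grade: `code-confirmed`.
3. Mark `summary-sheet` as `section template confirmed` rather than `summary derivation implemented`. The former is `code-confirmed`; the latter currently has no evidence of equal strength.
4. Split the status model into two layers:
- `declared_status_model`: success / partial / empty / blocked, sourced from skill/reference; evidence grade: `contract-defined`.
- `implemented_status_behavior`: only a fixed `ok` success shell is currently visible; evidence grade: `code-confirmed`.
5. Add an `evidence_note` to downstream actions, making explicit that report-export / report-log currently come mainly from scene and reference definitions and are not behavior confirmed by the current packaged script.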
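## 11. Final rigorous conclusion
For `fault-details-report`, the most reliable conclusion at present is that the repository already has a clear report artifact template implementation that stably returns the fault details main table column definitions, the `summary-sheet` summary section template, empty result arrays, and basic status fields; evidence grade: `code-confirmed`.
Raising the conclusion to "real browser fault detail collection, column normalization, summary derivation, export, and logging are implemented end to end" is not supported by sufficient evidence. The related behavior exists mainly in `SKILL.md`, `references/collection-flow.md`, `references/data-quality.md`, and the scene metadata, which prove the target flow and contract requirements rather than that the current packaged script strictly performs this logic. This scene should therefore be described as a staged report scene with "strong artifact schema / section template definition, weak evidence of live browser collection behavior", not as a rigorously implemented live collector.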


@@ -0,0 +1,225 @@
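# jiayuguan-meter-outage operation analysis
## 1. Scene overview
`jiayuguan-meter-outage` corresponds to the "户表失电-嘉峪关" (household meter outage, Jiayuguan) scene, whose goal is to collect household meter outage events, associate service work-order status, compare against historical monitoring / dispose logs, and trigger follow-up actions such as audio alerts or automatic dispatch when necessary. Based on `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\scenes\jiayuguan-meter-outage\scene.json`, `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\jiayuguan-meter-outage\SKILL.md`, `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\jiayuguan-meter-outage\scripts\collect_outage_events.js`, and the two rule assets, the most rigorous conclusion at present is that the packaged JS collector already implements input-driven `monitor-snapshot` normalization / comparison logic. It computes `pending/audit/processed` from outage events and service orders, parses monitor/dispose logs, derives `pending_ids` / `new_pending_ids`, outputs a `success/partial/empty/blocked` status, and attaches source endpoint constants, localhost endpoints, desk rule sources, the configuration base page marker, and identity model metadata. Stronger business workflow evidence exists mainly in the desk rule assets. Both are graded `code-confirmed`.
The following layers must be clearly distinguished:
1. packaged runtime-snapshot-collector: `collect_outage_events.js` directly implements outage/service-order normalization, historical comparison, identity model exposure, and standard snapshot output, and explicitly carries `workflow_rule_sources`, `config_base_page`, `config_base_role`, `packaged_collector_role`, and `identity_model` metadata; evidence grade: `code-confirmed`.
2. outage collection: the business monitoring rule directly requests `outage/dhsd/dhsdList` to collect outage events; evidence grade: `code-confirmed`.
3. service-order enrichment: the business monitoring rule then requests `gdgl/active/service/order/list` to collect service work-order status and fill in `audit` / `processed`; evidence grade: `code-confirmed`.
4. monitor-log comparison: the business monitoring rule uses `getMonitorLog` to compare against the historical pending list and decide whether to play an audio alert; evidence grade: `code-confirmed`.
5. dispose-log dedupe: the business monitoring rule uses `getDisposeLog` to deduplicate already-dispatched items and decide whether to enter automatic processing; evidence grade: `code-confirmed`.
6. marketing-token-dependent auto-processing and dispatch: the automatic processing rule explicitly reads the marketing system token and advances dispatch based on marketing system query results, team configuration, and the automatic dispatch interface; evidence grade: `code-confirmed`.
These `code-confirmed` grades still prove only that "these implementation chains exist in code or rule assets"; they do not mean runtime success has been verified. This document makes no runtime verification claims.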
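## 2. Evidence sources
This analysis uses only four evidence grade labels: `code-confirmed`, `contract-defined`, `implementation intent exists but not rigorous / buggy`, and `no direct evidence / candidate only`.
1. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\jiayuguan-meter-outage\scripts\collect_outage_events.js`
- Directly defines `SOURCE_GROUPS`, `LOCAL_SERVICE_ENDPOINTS`, `WORKFLOW_RULE_SOURCES`, `CONFIG_BASE_PAGE`, and `IDENTITY_MODEL`, and implements outage/service-order classification, monitor/dispose log parsing and comparison, `new_pending_ids` derivation, `success/partial/empty/blocked` status determination, and `monitor-snapshot` output carrying `evidence` / `identity_model`; evidence grade: `code-confirmed`.
2. `D:\desk\智能体资料\大四区报告监测项\户表失电-嘉峪关_业务监测配置.txt`
- Directly implements outage event collection, service work-order status enrichment, monitor log comparison, dispose log dedupe, audio alerts, and monitoring log writing; evidence grade: `code-confirmed`.
3. `D:\desk\智能体资料\大四区报告监测项\户表失电-嘉峪关_自动处理配置.txt`
- Directly implements marketing token reading, marketing system user query, work-order number retrieval, team assignment, the automatic dispatch request, audio alerts, dispose log writing, and the definition of a fallback SMS function; evidence grade: `code-confirmed`.
4. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\jiayuguan-meter-outage\SKILL.md`
- Defines the runtime contract that "outage event collection and work-order status collection must be separate and then combined into one snapshot; downstream alerts and automatic dispatch must not redefine collection success"; evidence grade: `contract-defined`.
5. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\jiayuguan-meter-outage\references\collection-flow.md`
- Defines a flow that uses the configuration page as the entry point and combines outage-event collection, service-order enrichment, historical comparison, and auto-processing context; evidence grade: `contract-defined`.
6. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\jiayuguan-meter-outage\references\data-quality.md`
- Defines the source semantics of pending / audit / processed, partial rules, and dependency warnings; evidence grade: `contract-defined`.
7. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\scenes\jiayuguan-meter-outage\scene.json`
- Declares the scene category, the input `time`, dependencies, and actions; evidence grade: `code-confirmed`.
8. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\scenes\jiayuguan-meter-outage\scene.draft.json`
- Exposes the pending question of whether marketing token context and `trigger-alert` / `auto-processing` should be split further; evidence grade: `no direct evidence / candidate only`.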
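## 3. Actual entry point and runtime boundaries
The actual entry point is fixed in `scene.json`: the scene page entry is `index.html`, the skill tool name is `jiayuguan-meter-outage.collect_outage_events`, the output type is `monitor-snapshot`, and the input is `time`. All of these are `code-confirmed`.
`assets/scene-snapshot/index.html` should be regarded only as a configuration base page (for example for maintaining teams, contacts, and scope) and not as primary execution evidence for the rule workflow.
Regarding runtime boundaries, the layering between the packaged collector and the rule workflow needs particular emphasis:
- Direct capability boundary of the packaged JS runtime collector: based on the inputs `outage_events`, `service_orders`, `monitor_logs`, and `dispose_logs`, it already performs `pending/audit/processed` normalization, historical comparison, `new_pending_ids` derivation, and `success/partial/empty/blocked` determination, and exposes two upstream source endpoints, a set of localhost endpoints, desk rule sources, the configuration base page role, and identity model metadata. It remains an input-driven normalization collector, however: it does not issue browser requests itself and does not carry the full business workflow; evidence grade: `code-confirmed`.
- The stronger business process boundary is reflected mainly in the desk rule assets: first collect household meter outage events, then request the service work-order list to enrich status, then compare monitor/dispose logs, and only then decide on alerts or automatic processing; evidence grade: `code-confirmed`.
This scene therefore cannot be described as "the packaged collector fully implements the Jiayuguan household meter outage live workflow". The more rigorous statement is that the packaged collector implements testable, input-driven snapshot normalization / comparison logic, while the stronger workflow evidence is mainly in the desk rule assets; evidence grade: `code-confirmed`.
In addition, both `collection-flow.md` and `SKILL.md` explicitly require outage collection, service-order enrichment, historical comparison, and downstream auto-processing to be understood separately; this is a runtime boundary contract; evidence grade: `contract-defined`.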
## 4. Actual operation flow confirmed by code
### 4.1 Flow confirmed in the packaged runtime-snapshot-collector
The following can now be strictly confirmed in `collect_outage_events.js`:
1. Call `collectOutageEvents(input)`, which reads inputs such as `input.outage_events`, `input.service_orders`, `input.monitor_logs || input.monitor_log`, `input.dispose_logs || input.dispose_log`, `input.local_write_failures`, and `input.blocked_reason`.
2. Use `buildOutageContext(...)` to extract `pending_ids`, `eventIds`, and `eventIdsByConsNo` from the outage events, and use `classifyServiceOrders(...)` to compute `audit` / `processed` based on `gdztmc`.
3. Parse the monitor/dispose logs, detect malformed payloads, and derive `new_pending_ids` using the mapping between `consNo` and `eventId`.
4. Record `partial_reasons` for unknown work-order statuses, missing logs, log parse failures, missing event identity, identity crosswalk ambiguity, local write failures, and similar cases.
5. Compute `status` with the priority `blocked > partial > empty > success`, and return `type: "monitor-snapshot"`, `scene: "jiayuguan-meter-outage"`, `pending`, `audit`, `processed`, `pending_ids`, `new_pending_ids`, `status`, and `partial_reasons`.
6. Attach `evidence.workflow_rule_sources`, `evidence.config_base_page`, `evidence.config_base_role`, `evidence.packaged_collector_role = "runtime-snapshot-collector"`, and `identity_model` to the returned object.
7. The module additionally exports `SOURCE_GROUPS`, `LOCAL_SERVICE_ENDPOINTS`, `WORKFLOW_RULE_SOURCES`, `CONFIG_BASE_PAGE`, and `IDENTITY_MODEL`.
All of the above are `code-confirmed`.
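The status priority `blocked > partial > empty > success` can be illustrated with a small function. This is a sketch under assumptions: the function name `resolveStatus`, its parameter names, and the definition of "empty" (nothing counted in any bucket) are hypothetical, since the source states only the ordering.

```javascript
// Hypothetical resolver for the documented priority: blocked > partial > empty > success.
function resolveStatus({ blockedReason, partialReasons, pending, audit, processed }) {
  if (blockedReason) return "blocked";
  if (partialReasons.length > 0) return "partial";
  // Assumption: "empty" means no item was counted in any bucket.
  if (pending + audit + processed === 0) return "empty";
  return "success";
}

// Made-up inputs, one per status.
const statusExamples = [
  resolveStatus({ blockedReason: "session expired", partialReasons: ["log missing"], pending: 2, audit: 0, processed: 0 }),
  resolveStatus({ blockedReason: "", partialReasons: ["log missing"], pending: 0, audit: 0, processed: 0 }),
  resolveStatus({ blockedReason: "", partialReasons: [], pending: 0, audit: 0, processed: 0 }),
  resolveStatus({ blockedReason: "", partialReasons: [], pending: 1, audit: 1, processed: 3 }),
];
console.log(statusExamples.join(","));
```

The first two examples show the ordering at work: a blocked reason wins even when partial reasons exist, and a partial reason wins even when all counts are zero.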
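### 4.2 Flow confirmed in the business monitoring rule
`户表失电-嘉峪关_业务监测配置.txt` directly confirms the following staged flow:
1. outage collection: queries outage events from the last two days through today via `BrowserAction(... outage/dhsd/dhsdList ...)` and puts each record's `consNo` into `idList`; evidence grade: `code-confirmed`.
2. service-order enrichment: then queries the current day's work-order list via `BrowserAction(... gdgl/active/service/order/list ...)` and accumulates `audit` and `processed` for `gdztmc == "待审核"` / `gdztmc == "已归档"` respectively; evidence grade: `code-confirmed`.
3. monitor-log comparison: reads the historical `pendingList` via `getMonitorLog`, compares it with the current `idList`, triggers an audio alert if new pending items are found, and writes the snapshot to `setMonitorData` / `setMonitorLog`; evidence grade: `code-confirmed`.
4. dispose-log dedupe: reads the historical dispose logs via `getDisposeLog`, parses `orderID` and extracts the `id` values inside it, then filters the current outage events by `eventId` to obtain the undisposed events `pendingList`; evidence grade: `code-confirmed`.
5. If undisposed events exist, assigns `pendingList` to `_this.queueObj.pendingList` and triggers `_this.autoTask()`; otherwise calls `_this.processQueue()` directly; evidence grade: `code-confirmed`.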
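The service-order bucketing by `gdztmc` can be sketched as below. The function name `countServiceOrders` and the handling of other status values are assumptions for illustration; only the two comparisons against `"待审核"` and `"已归档"` come from the rule text.

```javascript
// Hypothetical sketch of bucketing service orders by gdztmc.
function countServiceOrders(orders) {
  const result = { audit: 0, processed: 0, unknown: [] };
  for (const order of orders) {
    if (order.gdztmc === "待审核") {
      result.audit += 1; // pending review
    } else if (order.gdztmc === "已归档") {
      result.processed += 1; // archived
    } else {
      // Assumption: other values are only recorded, not classified.
      result.unknown.push(order.gdztmc);
    }
  }
  return result;
}

// Made-up sample data.
const buckets = countServiceOrders([
  { gdztmc: "待审核" },
  { gdztmc: "已归档" },
  { gdztmc: "已归档" },
  { gdztmc: "处理中" },
]);
console.log(buckets.audit, buckets.processed, buckets.unknown.length);
```

Keeping unmatched values in a separate list is one way to feed the "unknown work-order status" partial reason that the packaged collector is described as recording.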
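### 4.3 Flow confirmed in the automatic processing rule
`户表失电-嘉峪关_自动处理配置.txt` directly confirms:
1. Automatic processing depends on the marketing system token: the code reads `token` and `loginUserInfo` from `localStorage["markYXObj"]`; evidence grade: `code-confirmed`.
2. Automatic processing first merges events by `eqPsrName`, then reads `getClassList` to obtain the team configuration; evidence grade: `code-confirmed`.
3. It queries the customer's marketing ownership by `consNo` through the marketing system interface `queryEleCust` and determines `ecssMgtOrgCode` from the result; evidence grade: `code-confirmed`.
4. It then calls `gdgl/zdfw/tgforderzdfw/gdbh` to obtain the work-order number and `gdgl/active/service/order/saveAndSend` to issue the automatic dispatch; evidence grade: `code-confirmed`.
5. The success / failure / exception branches of automatic dispatch each trigger a different audio alert and write `setDisposeLog`; evidence grade: `code-confirmed`.
6. The SMS function `msgFC` is defined in the automatic processing rule, but the SMS sending code in the current success branch is entirely commented out, so "the SMS channel is part of the current effective workflow" cannot be stated as a stable fact; evidence grade: `implementation intent exists but not rigorous / buggy`.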
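## 5. Standardized abstract flow
For a strict command-center abstraction, the more reasonable standardized flow for this scene is:
1. Receive the monitoring task input `time`.
2. Collect outage events separately.
3. Collect service-order states separately and use them to fill in `audit` / `processed`.
4. Use the monitor log for pending comparison to determine alert semantics.
5. Use the dispose log for already-disposed dedupe to select the set of events that need automatic processing.
6. First form or retain the monitoring snapshot semantics.
7. If conditions are met, enter the marketing-token-dependent automatic processing / dispatch flow.
8. Record downstream actions such as audio, logs, and dispose results.
Step 1 is supported by the packaged collector's explicit input `time`, and steps 2 to 6 by the packaged collector's input-driven normalization / comparison logic; evidence grade: `code-confirmed`. Steps 7 and 8 are supported mainly and directly by the rule assets; evidence grade: `code-confirmed`. The boundary that "these steps should be understood separately and downstream actions must not override collection-success semantics" comes from `SKILL.md` / references; evidence grade: `contract-defined`.
Describing the flow above as "already strictly and uniformly carried by the packaged collector, including live outage requests, service-order queries, and automatic dispatch side effects" would not be rigorous, because the stronger workflow evidence comes mainly from the desk rule assets rather than the packaged collector; the evidence grade can only drop to `implementation intent exists but not rigorous / buggy`.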
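## 6. Inputs, context, and dependencies
### Inputs
- `time` is an explicit input declared by both the scene and the packaged script; evidence grade: `code-confirmed`.
- The business monitoring rule uses an `offTime` query window of "the last two days through today" for outage events and a `createTime` query window of "today" for service work orders; evidence grade: `code-confirmed`.
- The reference explicitly states that "the current outage and service-order query windows are both part of the actual input"; evidence grade: `contract-defined`.
### Runtime context
- The platform session, org/user context, and the browser `BrowserAction` capability are used directly in the rule assets; evidence grade: `code-confirmed`.
- The marketing token context is an actual dependency in the automatic processing rule, not merely a documentation statement; evidence grade: `code-confirmed`.
- The reference also explicitly lists the marketing token context as a downstream enrichment / dispatch dependency; evidence grade: `contract-defined`.
### Dependencies
- `scene.json` declares `browser`, `local-service`, `outage-source`, `service-order-source`, and `history-log`; evidence grade: `code-confirmed`.
- The business monitoring rule directly uses `outage/dhsd/dhsdList`, `gdgl/active/service/order/list`, `getMonitorLog`, `setMonitorData`, `setMonitorLog`, `getDisposeLog`, and `setAudioPlayLog`; evidence grade: `code-confirmed`.
- The automatic processing rule directly uses the marketing system `queryEleCust`, the work-order number interface `gdgl/zdfw/tgforderzdfw/gdbh`, the automatic dispatch interface `gdgl/active/service/order/saveAndSend`, `setDisposeLog`, and `setAudioPlayLog`; evidence grade: `code-confirmed`.
- Whether the marketing token context in `scene.draft.json` should be promoted to a formal dependency is still pending confirmation, so for standard configuration purposes it is `no direct evidence / candidate only`.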
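## 7. Output structure
The current output structure must be described in layers.
### 7.1 Output directly defined by the packaged runtime collector
`collect_outage_events.js` directly defines:
- `type: "monitor-snapshot"`
- `scene: "jiayuguan-meter-outage"`
- `time`
- `pending`
- `audit`
- `processed`
- `pending_ids`
- `new_pending_ids`
- `status`
- `partial_reasons`
- `evidence.workflow_rule_sources`
- `evidence.config_base_page`
- `evidence.config_base_role`
- `evidence.packaged_collector_role`
- `identity_model`
All of the above are `code-confirmed`.
### 7.2 Actual snapshot field semantics shown by the business monitoring rule
The business monitoring rule directly constructs:
- `time`
- `type: "户表失电-嘉峪关"`
- `pending`
- `pendingList`
- `audit`
- `processed`
This shows that the rule-layer snapshot object and the packaged stub's standard field naming are not fully consistent, notably `pendingList` vs `pending_ids` and `type` vs `scene`; evidence grade: `code-confirmed`.
### 7.3 Evidence strength of `new_pending_ids` and the identity inconsistency
`SKILL.md`, the reference, and `data-quality.md` treat `new_pending_ids` as part of the target output; evidence grade: `contract-defined`. The stronger direct facts in the current rule assets, however, are:
- The monitor pending list uses `consNo`, i.e. `idList.push(item.consNo)`; evidence grade: `code-confirmed`.
- Dispose dedupe uses `eventId`, i.e. it compares `resList.indexOf(y.eventId)`; evidence grade: `code-confirmed`.
This means the current implementation has a clear identity inconsistency: the monitoring pending list takes the `consNo` perspective, while dispose dedupe takes the `eventId` perspective. The claim that "`pending_ids` / `new_pending_ids` are rigorously and uniformly defined by the current implementation" therefore cannot hold; evidence grade: `implementation intent exists but not rigorous / buggy`.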
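The `consNo` / `eventId` mismatch can be made concrete with a small crosswalk. The sketch below assumes each outage event carries both `consNo` and `eventId`; the helper names, the sample data, and the rule "a `consNo` is undisposed if any of its events is absent from the dispose log" are hypothetical. It shows why a `consNo`-keyed pending list cannot be compared directly with an `eventId`-keyed dispose log.

```javascript
// Hypothetical crosswalk: maps each consNo to the eventIds observed for it.
function buildEventIdsByConsNo(outageEvents) {
  const map = new Map();
  for (const { consNo, eventId } of outageEvents) {
    if (!map.has(consNo)) map.set(consNo, []);
    map.get(consNo).push(eventId);
  }
  return map;
}

// Assumed rule: a consNo is undisposed if any of its eventIds is missing from the dispose log.
function undisposedConsNos(outageEvents, disposedEventIds) {
  const disposed = new Set(disposedEventIds);
  const result = [];
  for (const [consNo, eventIds] of buildEventIdsByConsNo(outageEvents)) {
    if (eventIds.some((id) => !disposed.has(id))) result.push(consNo);
  }
  return result;
}

// Made-up data: customer C1 has two events, only one of which was disposed.
const sampleEvents = [
  { consNo: "C1", eventId: "E1" },
  { consNo: "C1", eventId: "E2" },
  { consNo: "C2", eventId: "E3" },
];
const undisposed = undisposedConsNos(sampleEvents, ["E1", "E3"]);
console.log(undisposed.join(","));
```

In the example, comparing by `eventId` reports one open event while comparing by `consNo` reports one open customer who also has a disposed event; the two views agree only when each customer has exactly one event.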
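## 8. Downstream action evidence table
| Downstream action | Current evidence | Evidence grade | Rigorous conclusion |
| --- | --- | --- | --- |
| Return the `monitor-snapshot` runtime collector output | `collect_outage_events.js` returns the object directly | `code-confirmed` | The packaged JS directly proves that the standard snapshot fields, status determination, identity notes, and collector metadata exist. |
| Outage event collection | The business monitoring rule calls `outage/dhsd/dhsdList` | `code-confirmed` | Outage collection exists directly in the rule assets. |
| Service work-order status enrichment | The business monitoring rule calls `service/order/list` and buckets by `gdztmc` | `code-confirmed` | Service-order enrichment exists directly. |
| monitor-log comparison | The business monitoring rule calls `getMonitorLog` and compares the `consNo` list | `code-confirmed` | Historical comparison logic exists directly. |
| dispose-log dedupe | The business monitoring rule calls `getDisposeLog` and filters by `eventId` | `code-confirmed` | Dedupe logic exists directly, but its identity key is inconsistent with the monitor pending list. |
| Audio alert call | Both the business monitoring rule and the automatic processing rule call `mac.audioPlay(...)` | `code-confirmed` | Only confirms that audio alert calls exist at the rule layer. |
| Automatic dispatch request | The automatic processing rule calls `service/order/saveAndSend` | `code-confirmed` | The automatic dispatch request branch can be located directly. |
| Marketing-token-dependent user query | The automatic processing rule calls the marketing system `queryEleCust` with `auth_token` in the request header | `code-confirmed` | Automatic processing has an explicit hard dependency on the marketing token. |
| `setDisposeLog` writes on success / failure / exception | Every branch of the automatic processing rule writes `setDisposeLog` | `code-confirmed` | Dispose log write branches exist. |
| SMS sending channel | The automatic processing rule defines `msgFC`, but the SMS code in the success branch is commented out | `implementation intent exists but not rigorous / buggy` | SMS intent exists, but the effective workflow as currently read does not strictly enable it. |
| Strict unification of `pending_ids` / `new_pending_ids` | The skill/reference sets the target, but the rule layer mixes `consNo` and `eventId` | `implementation intent exists but not rigorous / buggy` | The current identity model is not unified and cannot be described as a rigorous established fact. |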
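## 9. Current code concerns / rigor gaps
1. The most critical rigor gap is the identity inconsistency: the monitor pending list uses `consNo` as the pending identifier, while dispose dedupe uses `eventId` as the dedupe identifier. This makes it hard to strictly align the semantics of `pending_ids`, `new_pending_ids`, and the "already-disposed set"; evidence grade: `implementation intent exists but not rigorous / buggy`.
2. Output naming still differs between the packaged collector and the rule assets: the collector uses `scene`, `pending_ids`, and `new_pending_ids`, while the rule object uses `type` and `pendingList`; evidence grade: `code-confirmed`.
3. `SKILL.md` explicitly requires outage collection and service-order enrichment to be understood separately. The current rules do this, but the packaged stub does not carry this structure, so a command-center that reads only the packaged stub would underestimate the real workflow; evidence grade: `code-confirmed`.
4. Automatic processing depends heavily on the marketing token, but the formal dependencies in `scene.json` do not list it explicitly; `scene.draft.json` already records this as pending confirmation, which shows that the standard dependency modeling is not yet closed; evidence grade: `implementation intent exists but not rigorous / buggy`.
5. The SMS sending function exists in the automatic processing rule, but the SMS code on the main success path is commented out, so the SMS channel is more a reserved intent than a currently reliable workflow; evidence grade: `implementation intent exists but not rigorous / buggy`.
6. This document cannot claim, merely because automatic dispatch and audio branches exist in the rules, that those branches have been runtime-verified; any such statement should be avoided.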
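## 10. Suggested revisions to the command-center standard configuration
1. Explicitly split this scene into two evidence layers:
- `packaged_collector`: the runtime snapshot collector in `collect_outage_events.js`, with status determination, historical comparison, and metadata (rule sources, configuration base page role, identity model); evidence grade: `code-confirmed`.
- `rule_asset_workflow`: the outage collection, service-order enrichment, historical comparison, and automatic processing flow in the rule assets; evidence grade: `code-confirmed`.
2. The standard workflow should be split into five mandatory stages:
- `outage_collection`
- `service_order_enrichment`
- `monitor_log_comparison`
- `dispose_log_dedupe`
- `marketing_token_dependent_auto_processing`
All of these splits are directly supported by the existing rule assets; evidence grade: `code-confirmed`.
3. Add a separate `identity_model_note` to the standard configuration, stating that the current monitoring pending list is based on `consNo` while dispose dedupe is based on `eventId` and that the two are not yet unified; evidence grade: `implementation intent exists but not rigorous / buggy`.
4. For dependencies, promote `marketing-token-context` to an explicit dependency, because the automatic processing rule does read and use the marketing token directly; evidence grade: `code-confirmed`. "How to express this in the standard scene schema" is still a configuration question; evidence grade: `implementation intent exists but not rigorous / buggy`.
5. For the output schema, distinguish:
- `canonical_snapshot_fields`: standard fields such as `pending_ids` / `new_pending_ids`;
- `observed_rule_fields`: rule fields such as `pendingList` / `type`.
Additionally record the difference `pending_identity = consNo` vs `dispose_identity = eventId` to avoid mis-modeling.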
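The split between canonical and observed fields can be illustrated by a mapping from a rule-layer snapshot object to the canonical field names. This is a sketch only: `toCanonicalSnapshot` and `identity_note` are hypothetical names, the sample values are made up, and `new_pending_ids` is omitted because deriving it requires a historical comparison.

```javascript
// Hypothetical mapping from the observed rule-layer object to canonical snapshot fields.
function toCanonicalSnapshot(ruleSnapshot) {
  return {
    type: "monitor-snapshot",
    scene: "jiayuguan-meter-outage",
    time: ruleSnapshot.time,
    pending: ruleSnapshot.pending,
    audit: ruleSnapshot.audit,
    processed: ruleSnapshot.processed,
    pending_ids: [...ruleSnapshot.pendingList], // rule field pendingList -> pending_ids
    // Records the identity difference instead of hiding it.
    identity_note: { pending_identity: "consNo", dispose_identity: "eventId" },
    // Keeps the rule-layer naming visible for traceability.
    observed_rule_fields: { type: ruleSnapshot.type },
  };
}

// Made-up rule-layer snapshot.
const canonical = toCanonicalSnapshot({
  time: "2026-04-08 10:00:00",
  type: "户表失电-嘉峪关",
  pending: 2,
  pendingList: ["C1", "C2"],
  audit: 1,
  processed: 3,
});
console.log(canonical.scene, canonical.pending_ids.length, canonical.observed_rule_fields.type);
```

Keeping `observed_rule_fields` next to the canonical fields lets a reader see which names were observed in the rule assets and which were introduced by normalization.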
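## 11. Final rigorous conclusion
For `jiayuguan-meter-outage`, the most reliable conclusion at present is that the repository already contains a testable packaged JS runtime collector and two stronger desk rule script implementations (`D:\desk\智能体资料\大四区报告监测项\户表失电-嘉峪关_业务监测配置.txt` and `D:\desk\智能体资料\大四区报告监测项\户表失电-嘉峪关_自动处理配置.txt`). The packaged collector directly implements outage/service-order normalization, monitor/dispose log comparison, `new_pending_ids` derivation, and `success/partial/empty/blocked` status determination. The business monitoring rule directly confirms outage collection, service-order enrichment, monitor-log comparison, dispose-log dedupe, and audio alerts / monitoring log writing. The automatic processing rule directly confirms the marketing-token-dependent customer ownership query, work-order number retrieval, the automatic dispatch request, and the audio / dispose log side-effect branches; evidence grade: `code-confirmed`.
It must equally be stated that the stronger workflow evidence is mainly in the desk rule assets rather than the packaged collector, so this scene cannot be described as "the packaged collector rigorously implements the entire live business flow". The current implementation also still has a key identity inconsistency: the monitor pending list uses `consNo` while dispose dedupe uses `eventId`. Although the workflow evidence for this scene is fairly strong, the unified identity model for `pending_ids` / `new_pending_ids` is therefore still not rigorous. The scene is best described as a monitor scene in which "the packaged collector has input-driven snapshot normalization capability, the desk rule-asset workflow is fairly strong, and the identity keys need explicit clarification in the command-center standard configuration".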


@@ -0,0 +1,143 @@
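# jinchang-business-environment-weekly-report operation analysis
## 1. Scene overview
`jinchang-business-environment-weekly-report` corresponds to the "国网金昌供电公司营商环境周例会报告" (State Grid Jinchang Power Supply Company business environment weekly meeting report) scene, whose goal is to collect metrics from multiple sources and assemble them into a sectioned, structured weekly report. Based on `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\scenes\jinchang-business-environment-weekly-report\scene.json`, `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\jinchang-business-environment-weekly-report\SKILL.md`, and `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\jinchang-business-environment-weekly-report\scripts\collect_business_environment_metrics.js`, what the code directly confirms at present is that the packaged script defines four section templates, an empty main table, the `period` field, `status: "ok"`, and `partial_reasons: []`; evidence grade: `code-confirmed`.
It must also be stated clearly that the current packaged script defines the artifact schema / section template more strongly, and does not define real browser collection, cross-system queries, period alignment, or export execution logic with equal strength. In other words, this scene is currently more a "structured weekly report template script" than a "multi-source live collector strictly implemented by the script"; evidence grade: `code-confirmed`.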
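## 2. Evidence sources
This analysis uses only four evidence grade labels: `code-confirmed`, `contract-defined`, `implementation intent exists but not rigorous / buggy`, and `no direct evidence / candidate only`. Anything involving the schema / section template directly defined by the script is labeled `code-confirmed`; anything involving the runtime semantics of mapping real collection results into these structures, where the script does not directly implement it, is labeled no higher than `contract-defined`.
1. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\jinchang-business-environment-weekly-report\scripts\collect_business_environment_metrics.js`
- Directly defines four section templates, `abnormal-transformer-monitoring`, `power-outage-monitoring`, `work-order-acceptance`, and `dispatch-summary`, and returns an empty artifact; evidence grade: `code-confirmed`.
2. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\jinchang-business-environment-weekly-report\SKILL.md`
- States that the skill should read the weekly range, validate the session, collect multiple metric groups, map them to report sections, mark partial where necessary, and return `region`, `period`, missing sections, period alignment issues, and so on in the output. It mainly defines the target contract and runtime intent; the evidence grade is mainly `contract-defined` and `implementation intent exists but not rigorous / buggy`.
3. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\scenes\jinchang-business-environment-weekly-report\scene.json`
- Declares the scene input as `period`, dependencies including `browser`, `multi-source`, and `local-report-service`, and actions including `query` / `collect-report` / `aggregate-sections`; evidence grade: `code-confirmed`.
4. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\jinchang-business-environment-weekly-report\references\collection-flow.md`
- Describes weekly range reading, cross-system session validation, multi-metric-group collection, section assembly, and the relationship to downstream export; evidence grade: `contract-defined`.
5. `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\jinchang-business-environment-weekly-report\references\data-quality.md`
- Describes complete results, partial rules, weak areas, and the empty/failure distinction; evidence grade: `contract-defined`.
6. `D:\data\ideaSpace\rust\sgClaw\claw-new\docs\superpowers\specs\2026-04-08-command-center-virtual-employee-inventory.json`
- Summarizes the command-center view of workflow, key_fields, status_model, and so on. It helps identify the current organized result but must not be treated as stronger implementation evidence than the original scene/skill/script; evidence grade: `no direct evidence / candidate only` (only for the parts where the inventory alone cannot prove that the packaged script implements them).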
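## 3. Actual Entry Point and Runtime Boundary
The actual entry point is fixed in `scene.json`: the scene page entry is `index.html`, the skill invocation is `jinchang-business-environment-weekly-report.collect_business_environment_metrics`, and the output artifact type is `report-artifact`. All of these are `code-confirmed`.
On the runtime boundary, what the repository can confirm today is:
- The external input is named `period`. Evidence level: `code-confirmed`.
- A browser page, multi-source system access, and the local report service are required. Evidence level: `code-confirmed`.
- The reference material requires collecting multiple metric groups by weekly range and assembling sections. Evidence level: `contract-defined`.
However, the claim that "the real collector already implements multi-source access, login-state validation, and period-consistency checks in the packaged script" does not hold. The script currently returns only empty section templates, so the runtime boundary it can directly prove is still a schema stub. Multi-source collection and assembly appear only as clear implementation intent, not as rigorously landed logic. Evidence level: `implementation intent exists but not rigorous / buggy`.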
## 4. Code-Confirmed Actual Operation Flow
The operation flow that the current code strictly confirms is:
1. Call `collectBusinessEnvironmentMetrics(input)`.
2. Read `input.period || ""` and write it to the artifact's `period`.
3. Build an empty main table: `columns: []` and `rows: []`.
4. Copy four sections from `SECTION_TEMPLATES` and ensure each section has `rows: []`.
5. Return `type: "report-artifact"`, `report_name`, `status: "ok"`, and `partial_reasons: []`.
All of these steps can be located directly in `collect_business_environment_metrics.js`. Evidence level: `code-confirmed`.
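The five steps can be pictured with the minimal sketch below. It is a hypothetical reconstruction from the description in this section, not the contents of `collect_business_environment_metrics.js`: the `placeholder_column` entries stand in for the real column lists, which this analysis does not reproduce, and the `safeInput` default is an addition of the sketch.

```javascript
// Hypothetical reconstruction of the stub's shape. The section names come from
// this analysis; the column lists are placeholders, NOT the real columns.
const SECTION_TEMPLATES = [
  { name: "abnormal-transformer-monitoring", columns: ["placeholder_column"] },
  { name: "power-outage-monitoring", columns: ["placeholder_column"] },
  { name: "work-order-acceptance", columns: ["placeholder_column"] },
  { name: "dispatch-summary", columns: ["placeholder_column"] },
];

function collectBusinessEnvironmentMetrics(input) {
  // Defensive default so a missing input does not throw (sketch-only addition).
  const safeInput = input || {};
  return {
    type: "report-artifact",
    report_name: "jinchang-business-environment-weekly-report",
    period: safeInput.period || "",
    // The main table is always empty in the stub.
    columns: [],
    rows: [],
    // Each section is a copy of its template with an empty row set.
    sections: SECTION_TEMPLATES.map((template) => ({
      name: template.name,
      columns: [...template.columns],
      rows: [],
    })),
    // Status is fixed; no partial / empty / blocked branch exists.
    status: "ok",
    partial_reasons: [],
  };
}
```

Whatever `period` is passed in, the result is the same empty four-section shell with `status: "ok"`, which is why the stub cannot be read as evidence of live collection.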
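The following steps appear repeatedly in `SKILL.md` and the references but are not implemented directly by the script: reading the page's weekly range, validating multi-source tokens/sessions, collecting real data for transformer monitoring / outage monitoring / work-order acceptance / dispatch summary, checking period alignment, and generating the final document or export result. These cannot be written up as "code-confirmed actual flow"; at most they can be marked `contract-defined` or `implementation intent exists but not rigorous / buggy`, as appropriate.
## 5. Standardized Abstract Flow
For command-center standardization, this scene can be abstracted as:
1. Receive the weekly-report task input.
2. Parse the page's weekly range and bind the session context.
3. Access multiple business sources and collect data by metric group.
4. Carry the results in the four section templates / column structures.
5. Produce a unified `report-artifact`.
6. Run downstream actions such as export or logging where applicable.
In step 4, only "the four section names and column structures exist" is `code-confirmed`; "real collected results have been mapped into the four sections" is still only a `contract-defined` flow convention. Steps 2, 3, and 6 come mainly from the runtime descriptions in the skill/reference files, so their evidence level should be `contract-defined`. Writing these steps up as "reliably executed by the current packaged script" would be over-inference, and the evidence level would have to drop to `implementation intent exists but not rigorous / buggy`.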
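## 6. Inputs, Context, and Dependencies
### Inputs
- `period` is the business input declared by both the scene and the script. Evidence level: `code-confirmed`.
- `SKILL.md` additionally requires `region` in the output, but neither the scene input nor the script's return structure explicitly declares a `region` field. Evidence level: `implementation intent exists but not rigorous / buggy`.
### Runtime context
- `session`, multi-source system reachability, cached-token availability, and similar items are described in the scene/reference files. Their existence at the scene-metadata level is `code-confirmed`; their more specific business semantics are `contract-defined`.
- The page's report-history area and execution-log area are mentioned in the references, but are explicitly described as downstream history / auxiliary areas rather than primary data sources. Evidence level: `contract-defined`.
### Dependencies
- `browser`, `multi-source`, and `local-report-service` can be located directly in the scene. Evidence level: `code-confirmed`.
- `/a_js/YPTAPI.js`, `http://localhost:13313/ReportServices/*`, and the export or surface services come from the references. Evidence level: `contract-defined`.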
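## 7. Output Structure
The output structure directly confirmed by the current script includes:
- `type: "report-artifact"`
- `report_name: "jinchang-business-environment-weekly-report"`
- `period`
- `columns: []`
- `rows: []`
- `sections` containing 4 fixed templates
- `status: "ok"`
- `partial_reasons: []`
All of these are `code-confirmed`.
The four fixed section templates are:
1. `abnormal-transformer-monitoring`
2. `power-outage-monitoring`
3. `work-order-acceptance`
4. `dispatch-summary`
Their column structures are also explicitly defined in the script. Evidence level: `code-confirmed`.
The output section of `SKILL.md` says the skill should return `region`, missing sections, period alignment issues, and downstream export/logging failures. Apart from `period` and the empty `partial_reasons` field, none of this diagnostic information is explicitly modeled in the script. In particular, `region` appears in the output wording but never enters the artifact schema; this is a scene-specific inconsistency. Evidence level: `implementation intent exists but not rigorous / buggy`.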
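## 8. Downstream Action Evidence Table
| Downstream action | Current evidence | Evidence level | Rigorous conclusion |
| --- | --- | --- | --- |
| Return a sectioned `report-artifact` | `collect_business_environment_metrics.js` returns the object directly | `code-confirmed` | A stable artifact shell exists, but its content is an empty template. |
| Four section templates exist | The script directly defines `SECTION_TEMPLATES` | `code-confirmed` | Only the section schema is confirmed; section data collection is not. |
| Multi-source metric collection | Described only in `SKILL.md` / `collection-flow.md` | `contract-defined` | The contract clearly requires multi-source collection, but the current packaged script does not prove it directly. |
| Period-consistency check | `SKILL.md` / `data-quality.md` mention period alignment | `contract-defined` | A quality requirement exists, but the script has no period alignment logic. |
| Export weekly-report document | References mention localhost export/surface services | `contract-defined` | This is a downstream dependency definition; it does not mean the current skill performs document export. |
| Report-log write | `SKILL.md` / references mention report-log | `contract-defined` | Only the downstream concept is confirmed; the current script shows no evidence of the call. |
| `partial` result modeling | The script keeps `partial_reasons`; the references define partial semantics | `implementation intent exists but not rigorous / buggy` | The field shell exists, but there is no real partial branch. |
| `region` output | Appears only in the output description of `SKILL.md` | `implementation intent exists but not rigorous / buggy` | Region semantics are expressed in wording but never enter the scene input or the artifact schema. |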
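## 9. Current Code Doubts / Non-Rigorous Points
1. `region` appears among the output items in `SKILL.md`, but neither `scene.json` nor the script schema has an explicit `region` field. This means "Jinchang" may only be semantics implied by the scene name rather than a traceable output field. Evidence level: `implementation intent exists but not rigorous / buggy`.
2. The script always returns empty `columns` and empty `rows`, which shows that the main table is not the core structure; the real core is the four section templates. If command-center still treats this as a generic main-table report, it is likely to be mis-modeled. Evidence level: `code-confirmed`.
3. `status` is fixed to `"ok"`, which is inconsistent with the partial / empty / blocked distinction required by the skill/reference files. Evidence level: `code-confirmed` for the current state, while the target status model is only `contract-defined`.
4. The references emphasize multi-source system sessions and token caching, but the script has no execution path for these dependencies at all, so "multi-source collection capability has landed" cannot be promoted to a fact about the current code. Evidence level: `implementation intent exists but not rigorous / buggy`.
5. Export and the report-history area exist in the references but are not wired into the script. Configuring the scene directly in command-center as "can export Word/Excel" would be over-inference given the current state. Evidence level: `no direct evidence / candidate only` (as far as the packaged script's execution layer is concerned).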
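## 10. Recommended Revisions to the command-center Standard Configuration
1. Model this scene's core output as a `section-based report artifact` rather than an ordinary two-dimensional table. The reason is that the script defines the four section templates far more strongly than it defines the main table. Evidence level: `code-confirmed`.
2. Add a `region_semantics` or `fixed_region` field to the standard configuration to state whether "Jinchang" is only part of the scene name or should become part of the displayable / auditable output. This is currently an open question. Evidence level: `implementation intent exists but not rigorous / buggy`.
3. Split the status model:
   - The contract layer declares success / partial / empty / blocked. Evidence level: `contract-defined`.
   - The implementation layer currently has only a fixed `ok` artifact stub. Evidence level: `code-confirmed`.
4. Add a `collection_evidence` note to the configuration stating that the current packaged script leans toward a section schema template rather than a live browser collector, so that a later scheduler does not mistake it for a finished real-time collection skill.
5. For `downstream_effects`, add a marker such as `implemented: false / not-directly-proven` to distinguish "the scene requires export" from "the script already performs export".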
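## 11. Final Rigorous Conclusion
For `jinchang-business-environment-weekly-report`, the most reliable assessment of the current state is that the repository already contains a four-section structured weekly-report artifact template, and that the names and column schemas of the four sections are defined directly by the packaged script. Evidence level: `code-confirmed`.
However, the stronger statement that "real multi-source browser collection, period-consistency validation, section data assembly, final export, and the logging loop have all been implemented" is not directly proven by the script. The relevant behavior is mainly defined as target flow and quality requirements by `SKILL.md`, `collection-flow.md`, `data-quality.md`, and the scene metadata, so it should be understood as "a clear contract and implementation intent exist, but the current packaged script is still mainly a schema/section stub". In addition, `region` appears in the output wording but never enters the artifact schema; this is the non-rigorous point that most needs clarifying in the command-center standard configuration.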


@@ -0,0 +1,125 @@
# Config-Owned Direct Skill Dispatch Design
**Goal:** Preserve the current minimal submit flow where sgClaw accepts natural-language input, directly invokes one configured staged browser skill without calling an LLM, and keeps dispatch ownership in sgClaw configuration rather than external skill metadata.
**Status:** Approved design direction for the next slice. The current minimal direct-submit path already works; this document records the ownership boundary that future dispatch-policy work should follow.
---
## Decision Summary
1. Keep direct-skill selection in sgClaw configuration.
2. Continue using `skillsDir` plus `directSubmitSkill` as the only control surface for the no-LLM direct path.
3. Do not add sgClaw-specific dispatch fields to files under `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging` in this slice.
4. Keep the currently bound skill as `fault-details-report.collect_fault_details`.
5. When dispatch expands beyond one fixed skill, add the next policy layer on the sgClaw side first, not in `scene.json` or `SKILL.toml`.
---
## Current Minimal Flow
The intended user experience stays unchanged:
- the user types natural language into the input box
- sgClaw receives `BrowserMessage::SubmitTask`
- sgClaw loads runtime config
- if `directSubmitSkill` is configured, sgClaw bypasses LLM routing and directly resolves the configured staged skill from `skillsDir`
- sgClaw executes the target `browser_script` tool through the browser runtime and returns the result
- if `directSubmitSkill` is absent, sgClaw falls back to the existing orchestration / compat behavior
This keeps the first slice small while preserving a clear seam for future expansion.
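The branch on `directSubmitSkill` can be sketched as follows. This is illustrative only: the real decision is made inside sgClaw, and the function name `chooseSubmitPath`, the returned shape, and the choice to treat a malformed value as an error are all assumptions of the sketch.

```javascript
// Sketch of the config-owned routing decision. All names are hypothetical.
function chooseSubmitPath(config) {
  const direct =
    config && typeof config.directSubmitSkill === "string"
      ? config.directSubmitSkill.trim()
      : "";

  // No direct skill configured: keep the existing orchestration / compat path.
  if (!direct) {
    return { path: "fallback" };
  }

  // The configured value is expected to look like "skill.tool".
  const dot = direct.indexOf(".");
  if (dot <= 0 || dot === direct.length - 1) {
    // Assumption: a malformed value is surfaced as an error, not silently ignored.
    return { path: "error", reason: `expected "skill.tool", got "${direct}"` };
  }

  return {
    path: "direct",
    skillsDir: config.skillsDir,
    skill: direct.slice(0, dot),
    tool: direct.slice(dot + 1),
  };
}
```

With the example config from this document, the sketch yields `path: "direct"`, skill `fault-details-report`, and tool `collect_fault_details`; with `directSubmitSkill` removed it yields `path: "fallback"`.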
---
## Ownership Boundary
### sgClaw configuration owns dispatch choice
sgClaw configuration is responsible for deciding whether submit-task should bypass the LLM path and which direct skill should run.
For the current slice, that means:
- `skillsDir` tells sgClaw where to load staged skills from
- `directSubmitSkill` tells sgClaw which `skill.tool` should be used for the direct path
Example:
```json
{
"skillsDir": "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging",
"directSubmitSkill": "fault-details-report.collect_fault_details"
}
```
### skill_staging owns skill identity and execution assets
Files under `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging` remain responsible for describing the skill package, tool identity, and browser-script implementation.
For the current bound skill:
- `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/fault-details-report/scene.json`
- `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.toml`
- `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
These files already provide enough information for sgClaw to locate the package and run the tool. This slice does not add a new dispatch field inside them.
---
## Why This Boundary Is Recommended
### One source of truth for routing
If sgClaw configuration owns the direct-skill decision, the operator can switch the direct skill by changing config only. There is no need to edit code and no need to mutate external skill assets just to change routing.
### Avoid freezing external manifest semantics too early
`skill_staging` is an external skill asset set. Adding sgClaw-specific dispatch metadata now would couple the staged-skill format to one integration strategy before the policy model is stable.
### Preserve a clean migration path
The current minimal path is intentionally narrow: one fixed configured direct skill, no LLM dispatch, no per-skill policy registry yet. Keeping dispatch control in sgClaw makes it easier to add a broader policy layer later without rewriting the staged-skill package format first.
---
## Explicit Non-Goals
This design does not do the following:
- redesign the submit-task protocol
- move dispatch control into `scene.json` or `SKILL.toml`
- require every staged skill to declare `direct_browser` or `llm_agent` right now
- expand the current direct path into generic natural-language intent classification
- change the browser-script execution model
- change the current fallback orchestration / compat execution semantics when `directSubmitSkill` is not configured
---
## Current Skill Contract
The current direct path remains intentionally deterministic.
For `fault-details-report.collect_fault_details`, sgClaw derives only the minimum required arguments:
- `expected_domain` from the current `page_url`
- `period` from an explicit `YYYY-MM` token in the user's natural-language input
That means the UX still looks like natural-language submission, but the runtime does not ask an LLM to infer intent or invent missing parameters. If the period is missing, sgClaw should return a clear error instead of guessing.
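A minimal sketch of this deterministic argument derivation, assuming `expected_domain` means the URL hostname and that the first standalone `YYYY-MM` token wins; the function name and error strings are hypothetical, and the production logic lives in sgClaw rather than in JavaScript.

```javascript
// Derive the two required arguments without any LLM involvement.
// Assumptions: "expected_domain" is the hostname of page_url, and the first
// YYYY-MM token (month 01-12, not embedded in a longer digit run) is used.
function deriveDirectSkillArgs(instruction, pageUrl) {
  const match = /(?<!\d)(\d{4})-(0[1-9]|1[0-2])(?!\d)/.exec(instruction);
  if (!match) {
    // No guessing: a missing period is an explicit error.
    return { ok: false, error: "missing explicit YYYY-MM period in instruction" };
  }

  let expectedDomain;
  try {
    expectedDomain = new URL(pageUrl).hostname;
  } catch {
    return { ok: false, error: "page_url is not a valid URL" };
  }

  return { ok: true, args: { expected_domain: expectedDomain, period: match[0] } };
}
```

An instruction that names no month produces `ok: false` with a clear message, which matches the rule that sgClaw should return an error rather than infer a period.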
---
## Future Dispatch Policy Direction
When more than one staged skill needs routing control, the next layer should still begin on the sgClaw side.
Recommended direction:
- keep `directSubmitSkill` as the current bootstrap switch for the minimal fixed-skill path
- introduce a sgClaw-owned registry or config mapping that can later express `skill.tool -> direct_browser | llm_agent`
- keep external skill manifests unchanged until the policy surface proves stable in real use
Only after the routing model is stable should we consider whether external skill metadata needs a default dispatch hint.
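One possible shape for such a sgClaw-owned mapping is sketched below. Nothing here exists yet: the registry object, `resolveDispatchMode`, and the default of `llm_agent` for unlisted tools are assumptions chosen only to show how `skill.tool -> direct_browser | llm_agent` could be expressed without touching external manifests.

```javascript
// Hypothetical sgClaw-side dispatch registry; not part of the current slice.
const DISPATCH_MODES = new Set(["direct_browser", "llm_agent"]);

function resolveDispatchMode(registry, skillTool, fallbackMode = "llm_agent") {
  const hasEntry = Object.prototype.hasOwnProperty.call(registry, skillTool);
  const mode = hasEntry ? registry[skillTool] : fallbackMode;
  if (!DISPATCH_MODES.has(mode)) {
    throw new Error(`unknown dispatch mode "${mode}" for ${skillTool}`);
  }
  return mode;
}

// Example registry: only the currently bound skill is routed directly.
const exampleRegistry = {
  "fault-details-report.collect_fault_details": "direct_browser",
};
```

Because the mapping is keyed by `skill.tool` on the sgClaw side, changing a routing decision stays a config edit and the staged-skill package format is left alone.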
---
## Resulting Design Rule
For this project, the direct-skill decision remains config-owned:
- sgClaw config decides whether submit-task bypasses the LLM path
- staged skill metadata identifies what the skill is and how its browser tool runs
- future per-skill dispatch policy should be added in sgClaw first, not in `skill_staging`
This is the approved baseline for the next dispatch-policy slice.


@@ -0,0 +1,495 @@
# Fault Details Full Skill Alignment Design
**Goal:** Upgrade `fault-details-report.collect_fault_details` from an empty artifact shell into a real staged business skill that matches the original fault-details package's collection, normalization, summary, export, and report-history behavior, while keeping direct-skill routing config-owned in `claw-new`.
**Status:** Approved design direction for the next remediation slice.
---
## Decision Summary
1. Keep direct-skill selection in `claw-new` via `skillsDir` + `directSubmitSkill`; do not move dispatch ownership into `skill_staging` manifests.
2. Put the fault-details business logic in `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging`, not in `claw-new`.
3. Align the staged skill with the original package's real behavior: query raw rows, normalize detail columns, derive summary rows, call localhost export, and write report history.
4. Keep the current browser-execution seam narrow: use the existing `browser_script` / browser-eval path, not a new browser protocol or new opcodes.
5. Add a narrow artifact interpreter in `claw-new` so structured fault-results map cleanly to `TaskComplete.success` and a readable completion summary.
---
## Why This Slice Exists
The current staged skill contract and the current staged skill implementation do not match.
### What the original package actually does
The original package under `D:/desk/智能体资料/大四区报告监测项/故障明细` does all of the following:
- reads the selected date range from the page UI
- queries the D4 repair-order data source
- filters and normalizes raw rows into the canonical detail export schema
- derives grouped summary rows by `gds`
- calls `http://localhost:13313/SurfaceServices/personalBread/export/faultDetailsExportXLSXS`
- auto-opens/downloads the generated file
- writes report history through `http://localhost:13313/ReportServices/Api/setReportLog`
### What the staged skill currently does
The current staged `collect_fault_details.js` only returns an empty `report-artifact` shell with empty `rows` and empty summary `sections`.
It also still uses a Node-style export shape instead of the browser-eval entrypoint shape that the current `browser_script` runtime expects. In practice, this means the staged script is not yet aligned with the real runtime contract even before business behavior is considered.
This slice closes that gap by making the staged skill actually perform the work the original package performs, but through the current sgClaw direct-skill runtime.
---
## Design Rules
### 1. `claw-new` owns routing, not business transforms
`claw-new` stays responsible for:
- loading config
- deciding whether submit-task takes the direct-skill path
- resolving the configured staged skill
- executing the staged browser-script tool
- turning the returned artifact into `TaskComplete.success` + human-readable summary
`claw-new` must **not** become the place where the original fault classification table, detail-row field mapping, or summary aggregation rules are reimplemented.
### 2. `skill_staging` owns fault-details business behavior
The staged skill package owns:
- query orchestration inside the browser page context
- raw-row extraction
- canonical detail-row normalization
- classification and derived fields
- summary-sheet derivation
- localhost export request
- localhost report-log request
- structured result payload
### 3. Keep the current browser seam narrow
Do not introduce a new browser bridge, callback protocol, or skill-specific browser opcode for this slice.
The implementation should continue using the current `browser_script` execution seam already wired through `claw-new/src/compat/browser_script_skill_tool.rs` and `claw-new/src/compat/direct_skill_runtime.rs`.
### 4. Match business behavior, not the original shell verbatim
The original package is a local HTML/Vue shell that uses `BrowserAction(...)`, timers, and hidden-browser choreography. That shell does **not** need to be recreated inside `claw-new`.
What must be preserved is the business outcome:
- same canonical detail columns
- same key field mappings
- same classification rules
- same summary metrics
- same downstream export/history behavior
- same distinction between empty, partial, blocked, and failed work
---
## Ownership Boundary and Landing Zones
### Staged skill changes
These changes land in `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging`.
Primary files:
- `skills/fault-details-report/scripts/collect_fault_details.js`
  - becomes the real browser-eval entrypoint
  - must directly `return` the final structured artifact from the wrapped browser script
  - may contain internal helper functions, but should remain self-contained for the current runtime
- `skills/fault-details-report/SKILL.toml`
  - keep `browser_script`
  - tighten the tool description so it matches the real behavior
  - do not turn `SKILL.toml` into the source of truth for classification rules or routing policy
- `skills/fault-details-report/SKILL.md`
  - align the written contract with the implemented runtime behavior
- `skills/fault-details-report/references/collection-flow.md`
  - align the staged flow with the implemented query/export/history sequence
- `skills/fault-details-report/references/data-quality.md`
  - stay authoritative for canonical columns, required fields, classification tables, `qxxcjl`-based reason heuristics, summary rules, and partial semantics
- `scenes/fault-details-report/scene.json`
  - keep the scene contract aligned with the actual output and state semantics
  - do not move classification or routing policy into scene metadata
### Caller/runtime changes
These changes land in `D:/data/ideaSpace/rust/sgClaw/claw-new`.
Primary files:
- `src/compat/direct_skill_runtime.rs`
  - keep configured direct-skill execution here
  - add narrow structured-artifact interpretation after the browser-script returns
- `src/agent/mod.rs`
  - keep the current direct-submit routing seam here
  - do not add fault-specific business logic here
- `src/compat/browser_script_skill_tool.rs`
  - keep the browser-script contract strict: browser-eval entrypoint, no Node-only assumptions
- `tests/agent_runtime_test.rs`
  - direct-submit path and result-surface regressions
- `tests/browser_script_skill_tool_test.rs`
  - browser-script execution-shape regressions
If a new helper is needed in `claw-new`, it should be a narrow artifact-format/parser helper, not a new business-rules module.
---
## Target Runtime Flow
### Step 1: Submit-task stays config-owned
The user still types natural language into the current sgClaw input.
`claw-new`:
- receives `BrowserMessage::SubmitTask`
- loads `SgClawSettings`
- sees `directSubmitSkill = "fault-details-report.collect_fault_details"`
- bypasses LLM routing exactly as it does now
- resolves the staged skill from `skillsDir`
This preserves the already approved config-owned routing boundary.
### Step 2: Browser-script tool executes as a true browser entrypoint
`collect_fault_details.js` must be shaped for the current runtime:
- the script runs inside the current browser page context through `eval`
- it must not rely on `module.exports`
- it must directly `return collectFaultDetails(args)` from the wrapped script body
This is required because the current sgClaw browser-script runtime reads one script file and wraps it in a browser-side IIFE.
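The entrypoint shape can be illustrated with a small simulation. The wrapper below only approximates "read one script file and wrap it in a browser-side IIFE"; `wrapBrowserScript` and the exact wrapper text are hypothetical, and the real wrapper in `claw-new` may differ in how it passes `args`.

```javascript
// Hypothetical approximation of the runtime wrapper: the script file's text
// becomes the body of an immediately invoked function that receives args.
function wrapBrowserScript(scriptBody, args) {
  return `(function (args) {\n${scriptBody}\n})(${JSON.stringify(args)})`;
}

// A staged script in the required shape: helpers first, then a top-level
// `return`, which is only legal because the body ends up inside a function.
const stagedScript = `
function collectFaultDetails(args) {
  return { type: "report-artifact", period: args.period, status: "empty" };
}
return collectFaultDetails(args);
`;

// Indirect eval evaluates the wrapped text in the global scope, similar in
// spirit to evaluating it in the page context.
const result = (0, eval)(wrapBrowserScript(stagedScript, { period: "2026-03" }));
```

A script body that ends with `module.exports = ...` instead of `return ...` gives the wrapper nothing to hand back, which is why the Node-style export shape does not fit this seam.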
### Step 3: The skill reads the page-selected time range
The source-of-truth query window should come from the current page state, matching the original package behavior.
Design rule:
- read the selected start and end time from the business page controls or page state
- include that exact selected range in the returned artifact
- keep `period` as a bootstrap label from `claw-new`, not as a license to silently guess a different business range
Compatibility rule with the current direct-submit seam:
- the current `claw-new` direct path still requires an explicit `YYYY-MM` token in the user's instruction in order to enter the configured direct-skill flow
- that requirement remains in place for this slice
- once inside the skill, the browser page's selected start/end range is the source of truth for collection
- the returned artifact should include both the user-visible `period` label and the exact selected page range so mismatches are observable instead of hidden
If the page-selected range cannot be read reliably, the skill should return `blocked` instead of inventing a month-wide query window from `period` alone.
### Step 4: The skill collects raw rows and normalizes detail fields
The staged skill must reproduce the original package's detail normalization logic inside the browser-executed script.
That includes preserving the canonical detail schema from the original `excleIni[0].cols`, including the key transforms already present in the original package, such as:
- `slsj = bxsj`
- `gssgs = "甘肃省电力公司"`
- `sgs` derived from the current company/city context
- `gddw = maintOrgName`
- `gds = maintGroupName`
- `clzt = "处理完成"`
- `bdz = bdzMc`
- `line = xlmc10`
- `pb = byqmc`
The staged skill must also port the original classification/derivation logic that fills:
- `sxfl1`
- `sxfl2`
- `sxfl3`
- `gzsb`
- `gzyy`
That includes the original matching table and the `qxxcjl`-based text extraction heuristics that derive the fault reason.
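The listed field transforms could be applied per row roughly as follows. This sketch covers only the mappings named above; the remaining canonical columns, the classification table, and the `qxxcjl` heuristics are deliberately left out because they are not reproduced in this document. The `context.cityCompanyName` parameter is a hypothetical stand-in for "the current company/city context".

```javascript
// Apply only the key transforms listed in this step. Other canonical columns
// and the sxfl1/sxfl2/sxfl3/gzsb/gzyy derivations are intentionally omitted.
function normalizeDetailRow(raw, context) {
  return {
    slsj: raw.bxsj,                // acceptance time mirrors report time
    gssgs: "甘肃省电力公司",         // fixed provincial company name
    sgs: context.cityCompanyName,  // assumption: supplied by the caller
    gddw: raw.maintOrgName,
    gds: raw.maintGroupName,
    clzt: "处理完成",               // fixed handling status
    bdz: raw.bdzMc,
    line: raw.xlmc10,
    pb: raw.byqmc,
  };
}
```

Keeping the transform a pure function of the raw row and a small context object also makes it straightforward to test against fixed raw-row fixtures without a live browser session.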
### Step 5: The skill derives summary rows from normalized detail rows
The staged skill must derive the summary sheet from grouped detail rows, keyed around the same business totals the original package computes.
At minimum that includes:
- `index`
- `gsName`
- `fwDept`
- `className`
- `allCount`
- `wxCount`
- `khcCount`
- `sbdSbCount`
- `gyGzCount`
- `dyGzCount`
- `tqdzCount`
- `tqbxCount`
- `dyxlCount`
- `bqxCount`
- `jllCount`
- `bhxCount`
- `qftdCount`
The summary derivation must stay in the staged skill so the same package can later be routed by LLM without moving business logic back into `claw-new`.
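As a rough illustration of the grouping step, the sketch below counts normalized detail rows per `gds` and emits one row per group. It computes only a running `index` and the total `allCount`; how the group key maps onto `gsName` / `fwDept` / `className`, and the rules behind the per-category counters, are not specified here and are left out rather than guessed.

```javascript
// Group normalized detail rows by `gds` and count them.
// Only `index` and `allCount` are derived; the category counters
// (wxCount, khcCount, ...) depend on rules not shown in this document.
function deriveSummaryRows(detailRows) {
  const countsByGroup = new Map(); // Map preserves first-seen group order
  for (const row of detailRows) {
    const key = row.gds || "";
    countsByGroup.set(key, (countsByGroup.get(key) || 0) + 1);
  }
  return Array.from(countsByGroup, ([gds, allCount], i) => ({
    index: i + 1,
    gds,
    allCount,
  }));
}
```

For three rows whose `gds` values are `A`, `B`, `A`, the sketch returns two summary rows: `A` with `allCount` 2 and `B` with `allCount` 1.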
### Step 6: The skill performs downstream export and report logging
After detail rows and summary rows are available, the staged skill should reproduce the original package's downstream behavior:
- build the export payload for `faultDetailsExportXLSXS`
- call the localhost export endpoint
- capture the returned export path/URL
- write report history via `setReportLog`
Important boundary:
- export/report-log are downstream side effects
- they do not redefine whether collection itself succeeded
- if collection succeeds but export/logging fails, the result is `partial`, not a full collection failure
- auto-opening/downloading the exported file is out of scope for this slice; this slice records the export path/result in the artifact but does not add new opener/UI behavior in `claw-new`
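The boundary can be made concrete with a pure function that folds already-known downstream outcomes into the artifact. The function name, the `*_failed` reason strings, and the shape of `outcomes` are assumptions of this sketch; the actual HTTP calls to the localhost services are outside it.

```javascript
// Fold downstream outcomes into the artifact without touching the collected
// rows. `outcomes` may contain `export` and/or `report_log`, each shaped like
// { success: boolean, path?: string, error?: string } (hypothetical shape).
function applyDownstreamOutcomes(artifact, outcomes) {
  const reasons = [...(artifact.partial_reasons || [])];
  const downstream = {};

  for (const stage of ["export", "report_log"]) {
    const outcome = outcomes[stage];
    if (!outcome) continue; // stage was not attempted: leave it out
    downstream[stage] = { attempted: true, ...outcome };
    if (!outcome.success) reasons.push(`${stage}_failed`);
  }

  // A downstream failure degrades `ok` to `partial`; it never turns a
  // successful collection into `error`, and it leaves `empty` unchanged.
  const status =
    artifact.status === "ok" && reasons.length > 0 ? "partial" : artifact.status;

  const result = { ...artifact, status, partial_reasons: reasons };
  if (Object.keys(downstream).length > 0) result.downstream = downstream;
  return result;
}
```

Separating the side-effect calls from this fold keeps the `partial` rule testable with plain objects.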
### Step 7: The skill returns one structured artifact
The staged skill should return one self-describing JSON artifact containing:
- business identity (`type`, `report_name`)
- selected period label
- exact selected start/end range
- canonical detail columns + normalized rows
- summary section columns + rows
- counts
- business status
- partial reasons if any
- downstream export outcome
- downstream report-log outcome
### Step 8: `claw-new` interprets the artifact, not the business rules
After the browser-script returns, `claw-new` should parse the JSON artifact and map it into final submit-task behavior.
Recommended mapping:
- `status = ok` -> `TaskComplete.success = true`
- `status = partial` -> `TaskComplete.success = true`, with warnings in summary
- `status = empty` -> `TaskComplete.success = true`, clearly reported as empty-result
- `status = blocked` -> `TaskComplete.success = false`
- `status = error` -> `TaskComplete.success = false`
This keeps business classification in the staged skill while preventing false-positive success in the direct path.
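The recommended mapping is small enough to sketch directly. The real interpreter would live in `claw-new` (Rust); it is shown in JavaScript here only for consistency with the staged-skill examples. The summary string format and the decision to treat an unrecognized status as a failure are assumptions.

```javascript
// Map a returned artifact to submit-task completion fields.
const SUCCESS_STATUSES = new Set(["ok", "partial", "empty"]);
const FAILURE_STATUSES = new Set(["blocked", "error"]);

function interpretArtifact(artifact) {
  const status = artifact ? artifact.status : undefined;

  // Assumption: anything outside the five known statuses is a failure,
  // so a malformed payload can never surface as a nominal success.
  if (!SUCCESS_STATUSES.has(status) && !FAILURE_STATUSES.has(status)) {
    return { success: false, summary: `unrecognized artifact status: ${String(status)}` };
  }

  const counts = artifact.counts || {};
  const parts = [`status=${status}`];
  if (typeof counts.detail_rows === "number") parts.push(`detail_rows=${counts.detail_rows}`);
  if (typeof counts.summary_rows === "number") parts.push(`summary_rows=${counts.summary_rows}`);

  const reasons = artifact.partial_reasons || [];
  if (reasons.length > 0) parts.push(`reasons=${reasons.join(",")}`);

  return { success: SUCCESS_STATUSES.has(status), summary: parts.join("; ") };
}
```

Note that the interpreter reads only `status`, `counts`, and `partial_reasons`; it never inspects row contents, which keeps business rules out of the caller.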
---
## Artifact Contract
The returned payload should stay `type = "report-artifact"`, but it must become rich enough to describe the real run.
Recommended contract:
```json
{
"type": "report-artifact",
"report_name": "fault-details-report",
"period": "2026-03",
"selected_range": {
"start": "2026-03-08 16:00:00",
"end": "2026-03-09 16:00:00"
},
"columns": ["qxdbh", "gssgs", "sgs", "gddw", "gds", "slsj", "yjflMc", "ejflMc", "sjflMc", "gzms", "yhbh", "yhmc", "lxr", "gzdd", "lxdh", "bxsj", "gdsj", "clzt", "qxxcjl", "bdz", "line", "pb", "sxfl1", "sxfl2", "sxfl3", "gzsb", "gzyy", "bz"],
"rows": [],
"sections": [
{
"name": "summary-sheet",
"columns": ["index", "gsName", "fwDept", "className", "allCount", "wxCount", "khcCount", "sbdSbCount", "gyGzCount", "dyGzCount", "tqdzCount", "tqbxCount", "dyxlCount", "bqxCount", "jllCount", "bhxCount", "qftdCount"],
"rows": []
}
],
"counts": {
"detail_rows": 0,
"summary_rows": 0
},
"status": "ok",
"partial_reasons": [],
"downstream": {
"export": {
"attempted": true,
"success": true,
"path": "http://localhost:13313/.../fault-details.xlsx"
},
"report_log": {
"attempted": true,
"success": true,
"report_name": "国网XX故障报修明细表(03月09日)",
"path": "http://localhost:13313/.../fault-details.xlsx"
}
}
}
```
### Contract notes
- `rows` is the canonical returned detail table, not the export-service transport payload.
- If the export service still requires a placeholder row for an empty spreadsheet, that placeholder should be synthesized only for the downstream export call, not as the canonical returned `rows` contract.
- `counts` should be computed from the canonical returned tables.
- `selected_range`, `columns`, `sections`, `counts`, `status`, and `partial_reasons` should always be present for `ok`, `partial`, and `empty`.
- For `blocked` and `error`, the artifact should still include `type`, `report_name`, `period`, `status`, and `partial_reasons`; `selected_range`, `columns`, `sections`, and `counts` should be included whenever they were already known before the failure point.
- `downstream` should be omitted only when export/report-log were not attempted yet; otherwise include it with `attempted` / `success` flags and any available path or failure detail.
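The presence rules in these notes lend themselves to a simple shape check. The sketch below assumes that `type`, `report_name`, and `period` are also required for `ok` / `partial` / `empty` (the notes list them explicitly only for `blocked` / `error`); the function name is hypothetical.

```javascript
// Report which required top-level fields are missing for a given status.
const ALWAYS_REQUIRED = ["type", "report_name", "period", "status", "partial_reasons"];
const REQUIRED_FOR_RESULTS = ["selected_range", "columns", "sections", "counts"];

function missingArtifactFields(artifact) {
  const hasResultStatus = ["ok", "partial", "empty"].includes(artifact.status);
  const required = hasResultStatus
    ? [...ALWAYS_REQUIRED, ...REQUIRED_FOR_RESULTS]
    : ALWAYS_REQUIRED;
  // `downstream` is optional by contract, so it is never reported as missing.
  return required.filter((field) => !(field in artifact));
}
```

A `blocked` artifact carrying only the five always-required fields passes, while an `ok` artifact without `counts` is reported as missing `counts`.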
---
## Error Handling and Status Semantics
### `ok`
Use `ok` when all of the following are true:
- raw collection succeeded
- required detail-field normalization succeeded
- summary derivation succeeded
- export succeeded
- report-log write succeeded
### `partial`
Use `partial` when detail collection succeeded but at least one downstream stage degraded, including:
- one or more required fields could not be normalized, but the row set still remains exportable and summary derivation can proceed with explicit gaps recorded
- summary derivation was incomplete, but the detail table is still available
- export failed after rows were available
- report-log write failed after rows/export were available
Escalation rule:
- if the raw query succeeds but required fields are missing so broadly that the canonical detail table cannot be produced at all, use `error`, not `partial`
- if summary derivation cannot even start because the normalized detail rows are structurally unusable, use `error`, not `partial`
`partial_reasons` must name the degraded stage instead of hiding it.
### `empty`
Use `empty` when:
- the query succeeds for the selected range
- zero real detail rows match
This is not a failure.
If the business flow still wants an empty export file or placeholder export payload, that happens downstream without changing the semantic meaning of the result.
### `blocked`
Use `blocked` when the page/session preconditions are not met, for example:
- expected page/session is not available
- required page controls cannot be read
- login/session state is missing or expired
- required browser-visible APIs are unavailable in the current page context
### `error`
Use `error` when the run starts but fails due to operational or parsing problems, for example:
- request failure
- page script failure
- raw response parse failure
- malformed export response
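Taken together, the five statuses form a precedence order that can be sketched as one decision function. The input flags (`preconditionsMet`, `operationalFailure`, `detailTableProduced`, `detailRowCount`, `degradedStages`) are hypothetical names for facts the skill would already know at the end of a run; the ordering follows the rules in this section.

```javascript
// Decide the business status from run facts. Order matters:
// blocked > error > empty > partial > ok.
function resolveStatus(run) {
  // Page/session preconditions failed before any real work could start.
  if (!run.preconditionsMet) return "blocked";

  // Operational or parsing failure, or the canonical detail table could not
  // be produced at all (the escalation rule: `error`, not `partial`).
  if (run.operationalFailure || !run.detailTableProduced) return "error";

  // The query succeeded but matched zero real rows: not a failure, and
  // downstream export behavior does not change this meaning.
  if (run.detailRowCount === 0) return "empty";

  // Rows exist, but some stage (normalization gaps, summary, export,
  // report-log) degraded.
  if (run.degradedStages.length > 0) return "partial";

  return "ok";
}
```

Encoding the precedence in one place makes the acceptance matrix easy to cover with a handful of fixture objects.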
### `claw-new` completion mapping
`claw-new` should convert structured status into final submit completion behavior:
- `ok` / `partial` / `empty`: return a success completion with a concise human summary
- `blocked` / `error`: return a failed completion with a concise human summary
This avoids the current risk where a structured error-like payload could still be surfaced as a nominal success string.
---
## Testing and Acceptance Strategy
### Skill-side deterministic coverage
Add deterministic coverage around the staged skill's business logic in `skill_staging` for:
- canonical detail field mapping
- classification table parity
- `gzyy` extraction heuristics
- summary aggregation parity
- empty-result handling
- partial-result generation when downstream export/logging fails
- browser-script entrypoint shape (`return ...`, not `module.exports`)
The classification/summary tests should use fixed raw-row fixtures so the business rules are validated without a live browser session.
### `claw-new` runtime regressions
Add Rust coverage in `claw-new` for:
- direct-submit success with a populated `report-artifact`
- `partial` artifact mapping to `TaskComplete.success = true`
- `empty` artifact mapping to `TaskComplete.success = true`
- `blocked` / `error` artifact mapping to `TaskComplete.success = false`
- browser-script helper behavior for a real browser-eval return payload
### Manual acceptance
The live manual acceptance bar for this slice should be:
1. Configure `skillsDir` to the staged skill root and `directSubmitSkill` to `fault-details-report.collect_fault_details`.
2. Attach sgClaw to the real target browser page/session.
3. Submit a natural-language fault-details request without LLM routing.
4. Verify the staged skill:
   - reads the selected page range
   - queries real fault rows
   - produces populated detail rows
   - produces populated summary rows
   - exports the workbook through localhost
   - writes report history
5. Verify the final sgClaw completion message reports the correct status, counts, and downstream file/log outcome.
### Acceptance matrix
At minimum, acceptance should cover:
- normal populated result
- empty result with no matching rows
- partial result where export or report-log fails after collection
- blocked result where page/session preconditions are missing
- error result where parsing/query execution fails
---
## Explicit Non-Goals
This slice does **not**:
- move routing ownership out of `claw-new`
- require LLM routing to be available first
- add per-skill dispatch metadata to external manifests for routing policy
- introduce a new browser protocol or browser opcode
- recreate the original Vue shell inside `claw-new`
- move fault classification logic into Rust
- redesign the submit-task protocol beyond better interpretation of the returned artifact
---
## Resulting Design Rule
For the fault-details path:
- `claw-new` decides whether to invoke the fixed staged skill
- the staged skill performs the real fault business workflow
- the staged skill returns a structured artifact that describes collection + downstream outcomes
- `claw-new` interprets that artifact for submit-task success/failure and summary output
That keeps routing config-owned, keeps business logic with the staged skill, and makes `fault-details-report.collect_fault_details` ready for both the current no-LLM path and a later LLM-routed path.
---
## Document Landing Zones
- Approved spec: `docs/superpowers/specs/2026-04-10-fault-details-full-skill-alignment-design.md`
- Follow-up implementation plan: `docs/superpowers/plans/2026-04-10-fault-details-full-skill-alignment-plan.md`


@@ -0,0 +1,618 @@
# TQ Line-Loss Deterministic Skill Design
**Goal:** Add a staged business skill for `台区线损大数据-月_周累计线损率统计分析` and a deterministic natural-language routing path in `claw-new` that can bypass LLM when the instruction ends with `。。。`, while preserving the existing Zhihu hotlist behavior and keeping the execution seam pipe-first but ws-ready.
**Status:** Approved design direction for implementation planning.
---
## Decision Summary
1. Add a new staged skill package `tq-lineloss-report` under `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/`, following the same packaging discipline as `fault-details-report`.
2. In `claw-new`, add a deterministic submit path triggered only when the instruction ends with the three-Chinese-dot suffix `。。。`.
3. In deterministic mode, route only through a fixed whitelist of staged skills; for this slice the new target is `tq-lineloss-report.collect_lineloss`.
4. Deterministic mode must extract business parameters from natural language without using an LLM: company/unit, month-vs-week mode, and period text.
5. Parsed natural-language parameters are not the final backend parameters. They must be normalized into the canonical codes required by the source page / source APIs (for example company code and period mode code).
6. If required parameters are missing or ambiguous, the runtime must stop and ask the user to provide them explicitly. It must **not** silently fall back to page defaults in this slice.
7. Skill execution must reuse the existing browser-script → pipe injection seam already proven by the Zhihu hotlist path. Do not create a second browser execution protocol.
8. The design must not regress or weaken the existing Zhihu hotlist direct path, browser-script path, export path, or current routing behavior.
9. The main branch implementation remains pipe-only, but all new deterministic-routing and skill contracts must stay backend-neutral so the execution backend can later be swapped to ws on the ws branch.
---
## Non-Negotiable Boundaries
### 1. Do not break the existing Zhihu hotlist flow
This is the top safety boundary for the slice.
The new deterministic routing for `tq-lineloss-report` must not break, narrow, or silently change:
- current Zhihu hotlist routing
- current Zhihu direct browser-script execution
- current Zhihu export behavior
- current browser-script skill loading/execution
- existing direct-submit configuration behavior
Design implication:
- The new deterministic path must be added as a narrow, explicit branch.
- Existing Zhihu logic must keep its current trigger semantics and current execution seam.
- Verification for this slice must include targeted Zhihu regression coverage before implementation is considered complete.
### 2. Current main branch is pipe-only
The implementation landing on `main` must execute browser-script skills through the current pipe-backed browser execution seam.
Do not introduce ws as an active runtime requirement for this slice.
### 3. Future ws migration must stay cheap
Although `main` remains pipe-only, the new work must leave a clean extension seam so that after this slice is merged into `ws`, the browser backend can be switched without redesigning:
- the staged skill package
- the deterministic trigger contract
- the parameter extraction contract
- the parameter normalization contract
- the returned artifact contract
---
## Why This Slice Exists
The user wants a staged business skill for `台区线损大数据-月_周累计线损率统计分析` that behaves like a deterministic business operation, not a free-form LLM task.
The desired operator experience is:
- ordinary instructions continue to use the current normal routing / LLM path
- an instruction ending in `。。。` switches to deterministic business execution
- deterministic execution targets a fixed staged skill
- business parameters are extracted from the instruction
- those parameters are normalized to the real coded values the source page/API needs
- the staged browser-script is injected into the third-party browser through the existing pipe seam
This provides an inner-network-safe path that can work without a model today, while reserving an upgrade path for future semantic fallback.
---
## Terminology
### Deterministic mode
A submit-task mode enabled only when the instruction ends with `。。。`.
### Natural-language business parameters
Values expressed by the user in text, such as:
- `兰州公司`
- `天水公司`
- `月累计`
- `周累计`
- `2026-03`
- `2026年第12周`
These are intermediate semantic values, not final backend parameters.
### Canonical execution parameters
The normalized values required by the source page / source API, such as:
- canonical company label
- canonical company code
- period mode code (month/week)
- canonical request period payload
---
## Ownership Boundary and Landing Zones
### Staged skill changes
These land in:
`D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging`
Primary landing zone:
- `skills/tq-lineloss-report/`
Target package structure:
- `SKILL.md`
- `SKILL.toml`
- `references/collection-flow.md`
- `references/data-quality.md`
- `assets/scene-snapshot/index.html`
- `scripts/collect_lineloss.js`
- `scripts/collect_lineloss.test.js`
Potential aligned scene metadata (if included in this slice):
- `scenes/tq-lineloss-report/scene.json`
- optional scene registry updates if the current staging conventions require it
### Caller/runtime changes
These land in:
`D:/data/ideaSpace/rust/sgClaw/claw-new`
Likely ownership areas:
- deterministic instruction detection and deterministic skill matching
- parameter extraction and normalization
- deterministic skill dispatch to the existing browser-script seam
- narrow result interpretation for the returned artifact
- focused regression tests
Design rule:
`claw-new` owns routing, extraction, normalization, and dispatch.
`claw-new` must **not** absorb the line-loss business logic itself.
The staged skill package owns:
- page inspection
- page-side state reading
- page/API data collection
- row normalization
- export/report-log behavior
- final artifact generation
---
## Target Runtime Flow
### Step 1: Submit-task enters deterministic mode only on `。。。`
When the user instruction does **not** end in `。。。`:
- keep the current runtime behavior unchanged
- preserve existing Zhihu hotlist behavior exactly
- preserve existing direct-submit and compat/LLM flows
When the instruction **does** end in `。。。`:
- enter deterministic mode
- do not run the ordinary LLM interpretation branch for this request
- evaluate only the deterministic skill whitelist
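A minimal sketch of the suffix check described in this step, using only the Rust standard library. The function name `deterministic_body` is hypothetical, and tolerating trailing whitespace before the check is an assumption; the design itself only requires that the instruction end with `。。。`.

```rust
/// The exact three-character suffix (U+3002 repeated three times).
const DETERMINISTIC_SUFFIX: &str = "。。。";

/// Returns the instruction body without the suffix when deterministic
/// mode applies, or `None` when the normal routing path should be used.
/// Assumption: trailing whitespace is ignored before the suffix check.
pub fn deterministic_body(instruction: &str) -> Option<&str> {
    instruction
        .trim_end()
        .strip_suffix(DETERMINISTIC_SUFFIX)
        .map(str::trim_end)
}
```

ASCII dots (`...`) and a two-character `。。` do not match, so ordinary instructions keep their current behavior.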
### Step 2: Deterministic whitelist match
The runtime should match the instruction against deterministic business scenes.
For this slice the new required deterministic scene is:
- `tq-lineloss-report.collect_lineloss`
The matching layer should remain narrow and explicit. It should not become a general scene-registry runtime in this slice.
Matching should use a deterministic combination of:
- instruction keywords
- optional page URL/title constraints when available
The runtime must not accidentally steal instructions that should still go down the Zhihu path.
### Step 3: Extract semantic business parameters from natural language
After `tq-lineloss-report` is matched, the runtime extracts semantic business parameters from the instruction.
Required semantic categories:
- company/unit expression
- period mode (`month` vs `week`)
- period text/value
Examples of accepted user-facing expressions include:
- `兰州公司`
- `天水公司`
- `国网兰州供电公司`
- `城关供电分公司`
- `2026-03`
- `2026年3月`
- `2026年第12周`
- `第12周`
- `月累计`
- `周累计`
### Step 4: Normalize semantic values into canonical coded values
This is a required separate design step.
The runtime must not pass raw natural-language company text directly to the business request layer.
Instead it must normalize semantic values into canonical execution parameters, including:
- `org_label` — canonical unit label
- `org_code` — the actual code/value required by the business page/API
- `period_mode` — canonical mode (`month` or `week`)
- `period_mode_code` — the page/API code (for example `timeChage`-style encoded mode)
- canonical time payload required by the source APIs/page state
This normalization should be derived from the actual source materials, including page-side dictionaries such as the existing unit tree data.
### Step 5: Missing and ambiguous parameters must stop execution
This slice must not silently infer missing parameters from page defaults.
If a required parameter is missing, execution must stop with an explicit prompt to the user.
If a parameter is ambiguous, execution must stop with an explicit ambiguity prompt.
Examples:
- no company matched
- no month/week mode matched
- no period value matched when required
- a short company alias matches multiple canonical units
- both monthly and weekly intent appear in the same instruction
This is preferable to silently using the wrong company code or the wrong query period.
### Step 6: Execute the staged skill through the existing pipe seam
If and only if parameters are present and successfully normalized:
- resolve `tq-lineloss-report.collect_lineloss`
- build the args object
- execute it through the current `browser_script` runtime
- inject the script into the browser through the existing pipe-backed browser tool seam
This slice must reuse the execution pattern already proven by the current browser-script/direct-skill infrastructure and the current Zhihu hotlist path.
Do not introduce a second browser protocol, new browser opcode family, or parallel execution harness.
### Step 7: Skill JS performs page-side work and returns one artifact
The staged script owns the actual line-loss business behavior:
- reading page-side state when needed
- validating the page context
- using normalized codes/parameters from args
- building source API requests
- collecting/normalizing rows
- export/report logging behavior if required by the final business contract
- returning a structured artifact
---
## Deterministic Trigger Contract
### Trigger rule
Deterministic mode is activated only when the raw instruction ends with the exact three-Chinese-dot suffix:
- `。。。`
This suffix is a user-controlled explicit mode switch.
### Why the suffix exists
It lets the user force business-deterministic behavior without relying on a model, while preserving the normal LLM path for ordinary requests.
### Scope rule
The suffix is not a free pass to run arbitrary browser actions.
It only selects among the deterministic skill whitelist.
If no deterministic scene matches, the runtime should return a deterministic-mode mismatch error that explains the currently supported deterministic scenes, rather than silently dropping into another behavior.
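The scope rule can be sketched as a fixed whitelist lookup that returns an explicit mismatch error. This is an illustration under assumptions: the `DeterministicScene` type, the `WHITELIST` constant, the keyword `线损`, and the error wording are all hypothetical, and optional page URL/title constraints are left out.

```rust
/// Hypothetical whitelist entry: a tool id plus keywords that must all appear.
struct DeterministicScene {
    tool: &'static str,
    keywords: &'static [&'static str],
}

/// Assumed whitelist for this slice; the keyword choice is illustrative.
const WHITELIST: &[DeterministicScene] = &[DeterministicScene {
    tool: "tq-lineloss-report.collect_lineloss",
    keywords: &["线损"],
}];

/// Matches the suffix-stripped instruction against the whitelist only.
/// On a miss, the error names the supported scenes instead of silently
/// falling back to another routing path.
pub fn match_deterministic_scene(body: &str) -> Result<&'static str, String> {
    WHITELIST
        .iter()
        .find(|scene| scene.keywords.iter().all(|k| body.contains(*k)))
        .map(|scene| scene.tool)
        .ok_or_else(|| {
            let supported: Vec<&str> = WHITELIST.iter().map(|s| s.tool).collect();
            format!(
                "deterministic mode: no supported scene matched; supported scenes: {}",
                supported.join(", ")
            )
        })
}
```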
---
## Company / Unit Matching Contract
### Accepted input style
The user does **not** need to type the exact full canonical label.
The runtime should support business shorthand such as:
- `兰州公司`
- `天水公司`
- `白银公司`
- `城关供电分公司`
- `榆中县供电公司`
### Matching approach
Do not use regex alone as the primary company-resolution mechanism.
Use a three-stage resolution strategy:
1. text normalization
2. alias/candidate generation from canonical unit names
3. uniqueness resolution against the real unit dictionary
### Normalization examples
Canonical names such as:
- `国网兰州供电公司`
- `国网天水供电公司`
- `国网榆中县供电公司`
should be matchable from business shorthand forms such as:
- `兰州公司`
- `天水公司`
- `榆中县公司`
- `榆中供电公司`
### Data source for canonical mapping
The company/unit resolver should derive canonical mappings from the real source materials used by the business page, such as the current unit tree dictionary embedded in the source page resources.
Design implication:
- the resolver should produce the real `value`/code required downstream
- the resolver should also keep the canonical label for display/auditability
### Ambiguity rule
If a short alias resolves to more than one valid unit, execution must stop and ask the user to be more specific.
Do not auto-guess.
### Supported granularity
The first implementation must support both:
- city-company level
- district/county/sub-company level
This includes forms like:
- `兰州公司`
- `天水公司`
- `城关供电分公司`
- `榆中县供电公司`
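The three-stage resolution strategy in this section can be sketched as below. It is a simplified illustration, not the real resolver: the `Unit` and `UnitResolution` types, the alias rules, and any codes passed in are hypothetical, and a real implementation would load its dictionary from the source page's unit tree.

```rust
/// One entry of the unit dictionary (label plus the code needed downstream).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Unit {
    pub label: String,
    pub code: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum UnitResolution {
    Unique(Unit),
    Missing,
    /// Canonical labels of every unit that matched.
    Ambiguous(Vec<String>),
}

/// Stage 1: drop whitespace so spacing differences do not affect matching.
fn normalize(text: &str) -> String {
    text.chars().filter(|c| !c.is_whitespace()).collect()
}

/// Stage 2: derive shorthand aliases from a canonical label, for example
/// "国网兰州供电公司" -> "兰州供电公司", "兰州公司".
fn aliases(label: &str) -> Vec<String> {
    let base = label.strip_prefix("国网").unwrap_or(label);
    let mut out = vec![label.to_string(), base.to_string()];
    if let Some(stem) = base.strip_suffix("供电公司") {
        out.push(format!("{stem}公司"));
        // "榆中县" also answers to "榆中供电公司".
        if let Some(short) = stem.strip_suffix('县') {
            out.push(format!("{short}供电公司"));
        }
    }
    out
}

/// Stage 3: require exactly one dictionary hit; never auto-guess.
pub fn resolve_unit(instruction: &str, dictionary: &[Unit]) -> UnitResolution {
    let text = normalize(instruction);
    let hits: Vec<&Unit> = dictionary
        .iter()
        .filter(|unit| {
            aliases(&unit.label)
                .iter()
                .any(|alias| text.contains(alias.as_str()))
        })
        .collect();
    match hits.as_slice() {
        [] => UnitResolution::Missing,
        [only] => UnitResolution::Unique((**only).clone()),
        many => UnitResolution::Ambiguous(many.iter().map(|u| u.label.clone()).collect()),
    }
}
```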
---
## Period Extraction and Normalization Contract
### Required period dimensions
The runtime must identify:
- mode: `month` or `week`
- actual requested period value in a canonical form
### Accepted user-facing patterns
At minimum the design should account for patterns such as:
- `月累计`
- `周累计`
- `2026-03`
- `2026年3月`
- `2026年第12周`
- `第12周`
### Normalization output
The resolver should produce:
- a canonical mode enum/string
- a mode code required by the page/API
- a canonical period payload consumable by the script/business request layer
### Ambiguity rule
If both month and week intent appear, stop and ask the user to clarify.
### Missing-period rule
If the selected line-loss query requires a time period and the instruction does not provide enough information to construct one, stop and ask the user to provide it.
Do not default to the page-selected period in this slice.
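One way to implement the extraction rules in this section without a regex dependency is sketched below. The `Period` and `PeriodError` types and all function names are hypothetical, the accepted patterns are limited to those listed above, and returning a week without a year for `第12周` is an assumption (a real resolver might instead prompt for the year). Mapping the result to the page/API mode code is left out because that code is defined by the source page.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Period {
    Month { year: u32, month: u32 },
    Week { year: Option<u32>, week: u32 },
}

#[derive(Debug, PartialEq, Eq)]
pub enum PeriodError {
    Missing,
    Ambiguous,
}

/// Parses the run of ASCII digits at the end of `text`, if any.
fn trailing_number(text: &str) -> Option<u32> {
    let reversed: String = text.chars().rev().take_while(char::is_ascii_digit).collect();
    reversed.chars().rev().collect::<String>().parse().ok()
}

/// Parses the run of ASCII digits at the start of `text`, if any.
fn leading_number(text: &str) -> Option<u32> {
    let digits: String = text.chars().take_while(char::is_ascii_digit).collect();
    digits.parse().ok()
}

/// First number that directly precedes an occurrence of `marker`.
fn number_before(text: &str, marker: char) -> Option<u32> {
    text.match_indices(marker)
        .find_map(|(idx, _)| trailing_number(&text[..idx]))
}

/// Matches the `YYYY-MM` form.
fn dashed_month(text: &str) -> Option<(u32, u32)> {
    text.match_indices('-').find_map(|(idx, _)| {
        Some((trailing_number(&text[..idx])?, leading_number(&text[idx + 1..])?))
    })
}

pub fn parse_period(instruction: &str) -> Result<Period, PeriodError> {
    let year = number_before(instruction, '年');
    let week = number_before(instruction, '周').filter(|w| (1..=53).contains(w));
    let month = dashed_month(instruction)
        .or_else(|| Some((year?, number_before(instruction, '月')?)))
        .filter(|(_, m)| (1..=12).contains(m));

    let wants_week = week.is_some() || instruction.contains("周累计");
    let wants_month = month.is_some() || instruction.contains("月累计");
    match (wants_month, wants_week) {
        // Both intents present: stop and ask the user to clarify.
        (true, true) => Err(PeriodError::Ambiguous),
        (true, false) => month
            .map(|(year, month)| Period::Month { year, month })
            .ok_or(PeriodError::Missing),
        (false, true) => week
            .map(|week| Period::Week { year, week })
            .ok_or(PeriodError::Missing),
        (false, false) => Err(PeriodError::Missing),
    }
}
```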
---
## Parameter Prompting Contract
When deterministic mode matches `tq-lineloss-report` but one or more required parameters are missing or ambiguous, the runtime should return a user-facing prompt rather than executing.
Expected prompting cases include:
- missing company/unit
- missing month/week mode
- missing period value
- ambiguous company alias
- contradictory period expressions
The prompt should be specific enough to let the user correct only the missing field(s).
Example style:
- `已命中台区线损报表技能,但缺少供电单位,请补充如“兰州公司”或“城关供电分公司”。`
- `已命中台区线损报表技能,但未识别到月/周类型,请补充“月累计”或“周累计”。`
---
## Skill Package Contract
### SKILL.toml
The new skill package must declare a single deterministic collection entrypoint:
- tool name: `collect_lineloss`
- kind: `browser_script`
The tool description must reflect the real staged behavior, not a placeholder shell.
### SKILL.md
The written contract should cover:
- when to use the skill
- when not to use it
- collection workflow
- runtime contract
- explicit missing/partial/error semantics
- returned artifact contract
### references/collection-flow.md
Must explain:
- the source page state used by the skill
- how company and period parameters map to business requests
- which page/API calls are used for month vs week
- export/report-log sequencing if retained in the business flow
### references/data-quality.md
Must define:
- canonical output columns
- required field coverage
- status semantics
- partial/error conditions
- company/period normalization assumptions that the script relies on
### scripts/collect_lineloss.js
This is the real browser-side entrypoint. It should:
- accept normalized args
- validate page context
- execute deterministic page/API data collection
- normalize rows
- perform downstream export/report-history behavior if required
- directly return the final artifact from the browser-script runtime entrypoint shape
### scripts/collect_lineloss.test.js
Must cover the business transforms that can be tested off-browser, especially:
- company normalization assumptions consumed by the script
- monthly vs weekly request-shape logic
- status semantics
- artifact shaping
---
## Returned Artifact Contract
The final line-loss skill should return one structured artifact object rather than free-form prose.
At minimum it should expose:
- artifact type
- report name
- canonical company label/code used for the query
- period mode and canonical period value used for the query
- columns
- rows
- status
- counts
- downstream export/report-log status when applicable
- clear reasons for blocked/partial/error states
The exact field names may be finalized during implementation planning, but the contract must be structured enough for `claw-new` to interpret success vs partial vs blocked without re-embedding business logic.
---
## Pipe-First / Ws-Ready Execution Seam
### Current requirement
The first implementation on `main` must use the existing pipe-backed browser execution path.
### Future requirement
The design must allow later ws adoption without redesigning the skill or routing contract.
### Practical design rule
Keep these backend-neutral:
- deterministic trigger contract
- skill matching contract
- parameter extraction contract
- parameter normalization contract
- tool args contract
- artifact contract
Keep backend-specific code isolated to the execution seam only.
That way the later ws migration can replace the browser backend beneath the same deterministic skill contract.
---
## Caller/Runtime Design Rules
### 1. Keep new business logic out of broad orchestration
Do not thread line-loss-specific business behavior through the general orchestration/runtime path.
### 2. Add a narrow deterministic-routing seam
This slice should add a narrow deterministic branch around submit-task routing, rather than rewriting the whole runtime decision tree.
### 3. Separate extraction from normalization
Do not mix “what the user typed” with “what the backend needs”.
There must be a distinct normalization step.
### 4. Keep the direct-skill browser seam narrow
Reuse the current `browser_script` execution seam instead of inventing a new browser bridge.
### 5. Preserve Zhihu behavior by design, not by hope
The design should assume new deterministic routing can accidentally steal or alter existing Zhihu behavior unless explicitly guarded against.
This is why focused Zhihu regression coverage is mandatory.
---
## Verification Requirements for the Future Implementation Plan
Implementation planning must include explicit verification for:
1. deterministic suffix detection
2. deterministic lineloss scene matching
3. company alias normalization to canonical code
4. support for both company-level and district/county/sub-company-level units
5. month/week extraction and normalization
6. missing-parameter prompt behavior
7. ambiguous-company prompt behavior
8. pipe-backed browser-script execution for the new skill
9. no regression to the existing Zhihu hotlist path
10. preserved direct-skill/browser-script behavior outside the new line-loss scene
---
## Out of Scope for This Slice
- enabling ws execution on `main`
- replacing the current Zhihu routing model
- general scene-registry runtime architecture redesign
- full free-form semantic understanding of arbitrary business language
- typo-tolerant fuzzy NLP beyond deterministic business-safe matching
- making page defaults the hidden source of truth when the user omitted parameters
---
## Planning Notes
The implementation plan should likely split into distinct work items for:
1. staged skill package creation and business contract definition
2. deterministic trigger + scene match in `claw-new`
3. company/unit normalization and ambiguity handling
4. period extraction/normalization and ambiguity handling
5. pipe-backed direct execution integration
6. returned artifact interpretation
7. Zhihu regression verification
8. ws-readiness seam verification
The plan should explicitly keep the “do not break Zhihu hotlist” boundary visible in every execution and verification stage.


@@ -0,0 +1,69 @@
# Async Browser Script Support Design
## Problem
`buildBrowserEntrypointResult` in `collect_lineloss.js` is an async function, but the execution code generated by `build_eval_js` is synchronous, so the Promise is serialized by `JSON.stringify` as `{}`.
**Log symptom:**
```
[execute_browser_script_impl] 返回成功, payload 长度: 4
```
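The log line reads "returned successfully, payload length: 4": the call returns `{}` (4 characters) instead of the actual report data.
## Root Cause
The `build_eval_js` function in `callback_backend.rs`:
```javascript
var v=(function(){return {script}})(); // executes synchronously
var t=(typeof v==='string')?v:JSON.stringify(v); // Promise -> "{}"
```
When the script returns a Promise, `JSON.stringify(Promise)` returns `{}`.
## Solution
Modify `build_eval_js` to support Promises:
1. Use `await` to wait for the script's result
2. Detect whether the result is a Promise and, if so, wait for it to resolve
3. Stay backward compatible with synchronous scripts
## Implementation Details
Modify the `build_eval_js` function in `src/browser/callback_backend.rs`:
```javascript
(async function(){
  try {
    var v = await (function(){return {script}})();
    // wait for the Promise to resolve
    if (v && typeof v.then === 'function') {
      v = await v;
    }
    var t = (typeof v === 'string') ? v : JSON.stringify(v);
    // ... callback logic unchanged
  } catch(e) {}
})()
```
Key points:
- Wrap the whole IIFE as async
- Use `await` to wait for script execution
- Detect Promise-like objects (`typeof v.then === 'function'`)
- Backward compatible: synchronous scripts return their value directly, while async scripts return a Promise that is then awaited
## Impact
- `src/browser/callback_backend.rs`: modify the `build_eval_js` function
- All `browser_script` skills automatically gain async support
## Testing and Verification
1. Run `cargo test` to confirm the existing tests pass
2. End-to-end test: `tq-lineloss-report.collect_lineloss` returns actual data rather than `{}`
3. Verify that synchronous scripts (such as the Zhihu hotlist) still work
## Out of Scope
- No changes to `wrap_browser_script` (the approach taken by option C)
- No changes to the skill scripts themselves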


@@ -0,0 +1,47 @@
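# Fix build_eval_js Async Support + validatePageContext Diagnostic Logging
## Problem Description
1. `buildBrowserEntrypointResult` in `collect_lineloss.js` is an async function and returns a Promise
2. With the current synchronous `build_eval_js`, `JSON.stringify(Promise)` yields `"{}"`
3. The earlier async IIFE approach caused `page_context_unavailable` (cause still under investigation)
## Approach
### Change 1: build_eval_js uses a .then() branch
File: `src/browser/callback_backend.rs`, `build_eval_js` function
Logic:
1. The outer IIFE stays synchronous (for compatibility with the C++ injection layer)
2. Extract the callback-sending logic into an `_s` function
3. If the return value is a Promise (it has a `.then` method), use `.then(_s)` to wait for the result asynchronously
4. Otherwise call `_s(v)` synchronously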
```javascript
(function(){try{
var v=(function(){return {script}})();
function _s(v){
var t=(typeof v==='string')?v:JSON.stringify(v);
try{callBackJsToCpp(...);}catch(_){}
var j=JSON.stringify({...});
try{XHR...}catch(_){}
try{sendBeacon...}catch(_){}
}
if(v&&typeof v.then==='function'){v.then(_s).catch(function(){});}
else{_s(v);}
}catch(e){}})()
```
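### Change 2: add diagnostic logging to validatePageContext
File: `D:\data\ideaSpace\rust\sgClaw\claw\claw\skills\skill_staging\skills\tq-lineloss-report\scripts\collect_lineloss.js`
Add a `console.log` at each checkpoint in `validatePageContext`, recording the host, expected_domain, and mac state.
## Verification
1. `cargo test` passes
2. After building, copy the exe to the production environment
3. Run the skill and confirm it no longer returns `{}`
4. If `page_context_unavailable` appears, check the browser console logs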


@@ -0,0 +1,55 @@
# Fix Lost expected_domain Argument in Browser Script Skill Tool
## Problem Description
The `tq-lineloss-report.collect_lineloss` skill returns the error `status=blocked row=0 reasons=missing_expected_domain` when executed.
## Root Cause
In the `execute_browser_script_impl` function of `src/compat/browser_script_skill_tool.rs`:
```rust
// Line 183: removes expected_domain from args
let raw_expected_domain = match args.remove("expected_domain") {
Some(Value::String(value)) if !value.trim().is_empty() => value,
// ...
};
// Line 200: normalizes the domain (strips scheme, port, etc.)
let expected_domain = match normalize_domain_like(&raw_expected_domain) {
Some(value) => value,
// ...
};
// Line 234: by the time the script is wrapped, args no longer contains expected_domain!
let wrapped_script = wrap_browser_script(&script_body, &Value::Object(args.clone()));
```
`args.remove()` deletes the key-value pair from the HashMap, so the args later passed to `wrap_browser_script()` do not contain `expected_domain`, and the `const args = {...}` in the JS script lacks that field.
## Solution
After normalizing the domain, re-insert `expected_domain` into args.
### Change Location
File: `src/compat/browser_script_skill_tool.rs`
Line: after line 209 (after `expected_domain` is assigned, before the `for required_arg` loop)
### Change
```rust
// Add after line 209:
args.insert("expected_domain".to_string(), Value::String(expected_domain.clone()));
```
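The remove-then-reinsert pattern can be shown in isolation with a standard-library-only sketch. It is not the code in `browser_script_skill_tool.rs`: `prepare_script_args` is a hypothetical helper, a `HashMap<String, String>` stands in for the JSON args map, and this `normalize_domain_like` is a simplified stand-in that only strips the scheme, path, and port.

```rust
use std::collections::HashMap;

/// Simplified stand-in for `normalize_domain_like`: strips the scheme,
/// any path, and the port. IPv6 literals are not handled.
fn normalize_domain_like(raw: &str) -> Option<String> {
    let without_scheme = raw.trim().split("://").last()?;
    let host_port = without_scheme.split('/').next()?;
    let host = host_port.split(':').next()?.trim();
    if host.is_empty() {
        None
    } else {
        Some(host.to_ascii_lowercase())
    }
}

/// Takes `expected_domain` out of the args for validation, then puts the
/// normalized value back so the wrapped script still receives it.
pub fn prepare_script_args(
    mut args: HashMap<String, String>,
) -> Result<(String, HashMap<String, String>), String> {
    let raw = args
        .remove("expected_domain")
        .filter(|value| !value.trim().is_empty())
        .ok_or("missing_expected_domain")?;
    let expected_domain = normalize_domain_like(&raw).ok_or("invalid_expected_domain")?;
    // The fix: re-insert the normalized value before the args are serialized.
    args.insert("expected_domain".to_string(), expected_domain.clone());
    Ok((expected_domain, args))
}
```

Without the `args.insert` line, the returned map would lack `expected_domain`, which is the failure mode described above.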
## Impact
- Only `browser_script_skill_tool.rs` is affected
- All browser_script skills that use `expected_domain` benefit
- No breaking changes
## Verification
1. Run the existing tests: `cargo test browser_script_skill_tool`
2. Intranet verification: run the `tq-lineloss-report.collect_lineloss` skill


@@ -0,0 +1,48 @@
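# TQ Line-Loss Skill - requesturl Quick Fix
## Background
The `sgHideBrowerserOpenPage` command requires a `requesturl` argument (the URL of the page that initiates the call), but parsing a TQ line-loss instruction currently returns `about:blank`, so the browser does not execute the command.
The Zhihu hotlist scenario works because `derive_request_url_from_instruction` returns `https://www.zhihu.com`.
## Design
**Approach: add a TQ line-loss URL mapping in `derive_request_url_from_instruction`**
### Change Location
`src/service/server.rs`, `derive_request_url_from_instruction` function
### Change
```rust
fn derive_request_url_from_instruction(instruction: &str) -> Option<String> {
// Existing: Zhihu-related (unchanged)
if crate::compat::workflow_executor::detect_route(instruction, None, None)
.is_some_and(|route| { ... })
{
return Some("https://www.zhihu.com".to_string());
}
// New: TQ line-loss related
// TODO: temporary; should later come from the skill config or the deterministic_submit parse result
if instruction.contains("线损") || instruction.contains("lineloss") {
return Some("http://20.76.57.61:18080".to_string());
}
None
}
```
### Constraints
- The URL is hard-coded and must later be refactored into a general solution
- Matches only instructions that contain "线损" or "lineloss"
## Follow-up Plan
A general solution will be implemented later:
- Build the full URL from `DeterministicExecutionPlan.expected_domain`
- Or read the target URL from the skill config file
- Reorder the flow so the skill is resolved before the helper page is opened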


@@ -0,0 +1,36 @@
# TQ Line-Loss Skill - Missing target_url Fix
## Background
When `browser_script_skill_tool.rs` calls `Action::Eval`, it passes only the `script` parameter and no `target_url`. The `target_url` method in `callback_backend.rs` needs to obtain the value from params or from `current_target_url`, and it returns an error when neither is present.
The Zhihu hotlist works because `Action::Navigate` runs first and sets `current_target_url`.
## Design
**Approach: add `target_url` to the params in `browser_script_skill_tool.rs`**
### Change Location
`src/compat/browser_script_skill_tool.rs`, `execute_browser_script_impl` function
### Change
When calling `browser_tool.invoke(Action::Eval, ...)`, build the full URL from `expected_domain` and add it to params:
```rust
let target_url = format!("http://{}", expected_domain);
let result = match browser_tool.invoke(
Action::Eval,
json!({
"script": wrapped_script,
"target_url": target_url,
}),
&expected_domain,
) {
```
### Constraints
- Uses the `http://` scheme prefix
- `expected_domain` may include a port (e.g. `20.76.57.61:18080`); it is concatenated as-is


@@ -0,0 +1,84 @@
# Remove mac Guard from validatePageContext
## Date
2026-04-13
## Problem
`tq-lineloss-report` skill execution reports `status=blocked rows=0 reasons=page_context_unavailable`.
Diagnostic instrumentation confirmed:
```
href=http://20.76.57.61:18080/gsllys
host=20.76.57.61
port=18080
title=台区线损大数据分析模块
mac=false
```
The script executes on the correct domain but `globalThis.mac` does not exist, triggering the `page_context_unavailable` guard.
## Root Cause
`window.mac` is a Vue instance created by the **original scene page** (`index.html`), assigned via `window.mac = this` in `mounted()`. The original scene page acts as a controller that injects JS into the business page via `BrowserAction('sgBrowserExcuteJsCode', exactURL, jsCode)`.
In the skill execution model, there is no scene page. The script is injected directly via `sgBrowserExcuteJsCodeByDomain` onto a page matching the domain. No Vue instance is created, so `globalThis.mac` is always `undefined`. The `mac` check is architecturally invalid for the skill model.
Additionally, `sgBrowserExcuteJsCodeByDomain("20.76.57.61")` matches the parent frame page (`/gsllys`) rather than the business sub-page (`/gsllys/tqLinelossStatis/tqQualifyRateMonitor`). This is acceptable because the skill script makes direct HTTP requests with absolute URLs and does not depend on page-local state.
## Design
Remove the `globalThis.mac` existence check from `validatePageContext` in `collect_lineloss.js`. Retain the `host` matching check as a basic domain guard.
Also clean up the temporary diagnostic code (`diag` variable, `console.log` statements, enriched reason strings) added during debugging.
### Before
```javascript
validatePageContext(args) {
const host = normalizeText(globalThis.location?.hostname);
const port = normalizeText(globalThis.location?.port);
const href = normalizeText(globalThis.location?.href);
const title = normalizeText(globalThis.document?.title);
const expected = normalizeText(args.expected_domain);
const hasMac = !!globalThis.mac;
const diag = 'href=' + href + '|host=' + host + '|port=' + port + '|title=' + title + '|mac=' + hasMac;
console.log('[validatePageContext] ' + diag);
if (!host) {
return { ok: false, reason: 'page_context_unavailable:host_empty|' + diag };
}
if (host !== expected) {
return { ok: false, reason: 'page_context_mismatch:host=' + host + ',expected=' + expected + '|' + diag };
}
if (!hasMac) {
return { ok: false, reason: 'page_context_unavailable:mac_missing|' + diag };
}
return { ok: true };
},
```
### After
```javascript
validatePageContext(args) {
const host = normalizeText(globalThis.location?.hostname);
const expected = normalizeText(args.expected_domain);
if (!host) {
return { ok: false, reason: 'page_context_unavailable' };
}
if (host !== expected) {
return { ok: false, reason: 'page_context_mismatch' };
}
return { ok: true };
},
```
## Files Changed
- `claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js`: `validatePageContext` function only
## No Recompilation Required
The JS file is read at runtime via `fs::read_to_string`. No Rust code changes.


@@ -0,0 +1,111 @@
# Rust-Side Lineloss XLSX Export
## Problem
`collect_lineloss.js` runs on a remote page (`http://20.76.57.61:18080/gsllys`).
The script successfully queries API data (12 rows), but cannot call
`http://localhost:13313/.../faultDetailsExportXLSX` because the browser blocks
cross-origin requests from a remote page to `localhost`.
The original scene architecture had a local scene page acting as a proxy,
but skill mode has no local page -- so export is architecturally impossible
from the browser side.
## Decision
Move XLSX generation to the Rust side. JS only collects data; Rust generates
the `.xlsx` file locally after receiving the artifact.
Report log (`setReportLog`) is deferred to a later iteration.
## Design
### JS Changes (`collect_lineloss.js`)
1. Remove `exportWorkbook()` call and `writeReportLog()` call
2. Return artifact with `rows` array and `column_defs` array
3. Status is `ok` when rows > 0, `empty` when rows == 0, `error`/`blocked` unchanged
Artifact shape:
```json
{
"type": "report-artifact",
"report_name": "tq-lineloss-report",
"status": "ok",
"org": { "label": "...", "code": "..." },
"period": { "mode": "month", "value": "2026-03" },
"column_defs": [["ORG_NAME","供电单位"], ["YGDL","累计供电量"], ...],
"rows": [
{"ORG_NAME":"xxx", "YGDL":"12345.67", ...}
],
"counts": { "rows": 12 }
}
```
### Rust Changes
#### New file: `src/compat/lineloss_xlsx_export.rs`
Generates a standard `.xlsx` file using `zip` crate + OpenXML XML strings.
Follows the pattern established in `openxml_office_tool.rs`.
Public API:
```rust
pub struct LinelossExportRequest {
pub column_defs: Vec<(String, String)>, // (key, chinese_header)
pub rows: Vec<Map<String, Value>>,
pub sheet_name: String,
pub output_path: PathBuf,
}
pub fn export_lineloss_xlsx(request: &LinelossExportRequest) -> anyhow::Result<PathBuf>;
```
Internals:
- Build header row from `column_defs[*].1` (chinese names)
- Build data rows by looking up `column_defs[*].0` keys in each row map
- Generate `worksheet_xml` with inline string cells
- Package with standard OpenXML boilerplate (content_types, rels, workbook)
- Write to `output_path`
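The header/row construction and the inline-string worksheet can be sketched as follows. This is an illustration, not the planned `lineloss_xlsx_export.rs`: it covers only the worksheet XML (not the zip packaging or the other OpenXML parts), `worksheet_xml`, `inline_cell`, and `xml_escape` are hypothetical names, and a `HashMap<String, String>` stands in for the `Map<String, Value>` rows of the real request type.

```rust
use std::collections::HashMap;

/// Escapes the characters that are not allowed verbatim in XML text nodes.
fn xml_escape(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

/// One cell stored as an inline string (no shared-strings part needed).
fn inline_cell(value: &str) -> String {
    format!("<c t=\"inlineStr\"><is><t>{}</t></is></c>", xml_escape(value))
}

/// Builds the worksheet part: a header row from the display names, then
/// one row per record, looking up each column key (missing keys -> empty).
pub fn worksheet_xml(
    column_defs: &[(String, String)],
    rows: &[HashMap<String, String>],
) -> String {
    let mut xml = String::from(
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n\
         <worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">\
         <sheetData>",
    );
    xml.push_str("<row>");
    for (_, header) in column_defs {
        xml.push_str(&inline_cell(header));
    }
    xml.push_str("</row>");
    for row in rows {
        xml.push_str("<row>");
        for (key, _) in column_defs {
            let value = row.get(key).map(String::as_str).unwrap_or("");
            xml.push_str(&inline_cell(value));
        }
        xml.push_str("</row>");
    }
    xml.push_str("</sheetData></worksheet>");
    xml
}
```

Driving the column order from `column_defs` rather than from each row's keys keeps the sheet layout stable even when a row omits a field.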
#### Modified: `src/compat/deterministic_submit.rs`
In `execute_deterministic_submit_with_browser_backend` (and the non-backend variant):
```
let output = execute_browser_script_skill_raw_output_with_browser_backend(...)?;
let artifact = parse_lineloss_artifact(&output);
if artifact has rows > 0 && column_defs present:
let export_path = workspace_root/out/tq-lineloss-{timestamp}.xlsx
export_lineloss_xlsx(LinelossExportRequest { ... })?
// attach export_path to outcome summary
Ok(summarize_lineloss_output_with_export(&output, export_path))
```
#### Modified: `src/compat/mod.rs`
Add `pub mod lineloss_xlsx_export;`
### Output Path
`{workspace_root}/out/tq-lineloss-{org_label}-{period}-{timestamp_nanos}.xlsx`
### Error Handling
- XLSX generation failure: outcome status = `partial`, reason = `xlsx_export_failed`
- Artifact parse failure: fall through to existing `summarize_lineloss_output`
## Files Changed
| File | Change Type |
|------|-------------|
| `collect_lineloss.js` | Modify: remove export/log calls, add rows+column_defs to artifact |
| `src/compat/lineloss_xlsx_export.rs` | New: XLSX generation |
| `src/compat/deterministic_submit.rs` | Modify: post-process artifact, call XLSX export |
| `src/compat/mod.rs` | Modify: register new module |
## Requires Recompilation
Yes. Rust code changes require `cargo build`.


@@ -0,0 +1,55 @@
# Helper Page Lifecycle Fix v2 — Same-Connection Close + Open
**Date:** 2026-04-14
**Status:** Approved
## Problem
Two issues remain after v1:
1. **Process restart leaves orphaned helper pages**: When the sg_claw process restarts, the old helper page tab remains open in the browser. The new process opens another one.
2. **Helper page is visible**: Uses `sgBrowerserOpenPage` (visible tab API) instead of `sgHideBrowerserOpenPage` (hidden domain API).
## Root Cause of v1 Failure
The v1 `close_helper_page` function created a **second** WebSocket connection to the browser during `Drop`. This likely conflicted with the existing bootstrap connection, causing the browser's WebSocket state to become confused.
## Solution
Send the close command on the **same** WebSocket connection used for bootstrap, before sending the open command:
1. Connect to browser WS
2. Register as "web" role
3. **Blindly send** `sgHideBrowerserClosePage(helper_url)` — closes any orphaned page from a previous process run
4. Send `sgHideBrowerserOpenPage(helper_url)` — opens the new helper page
5. Poll `/sgclaw/callback/ready` for page readiness
Both `use_hidden_domain = true` and the close+open logic are combined into a single change.
## Why This Works
- **Same connection**: Only one WebSocket connection to the browser. No conflict with existing connections.
- **Best-effort close**: If no orphaned page exists (first run ever), the close command is silently ignored by the browser. This does not affect the subsequent open command.
- **Fire-and-forget**: Both close and open commands use the same fire-and-forget semantics as the existing bootstrap command.
## API Reference
| API | Wire format | Effect |
|-----|------------|--------|
| `sgHideBrowerserOpenPage` (API #6) | `[requesturl, "sgHideBrowerserOpenPage", url]` | Opens in hidden domain |
| `sgHideBrowerserClosePage` (API #68) | `[requesturl, "sgHideBrowerserClosePage", url]` | Closes hidden domain page |
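The close-then-open sequence can be sketched as below. This assumes the wire format in the table is serialized as a JSON array of three strings; `json_string`, `command_frame`, and `bootstrap_frames` are hypothetical helpers, not functions from `callback_host.rs`.

```rust
/// Encode a string as a JSON string literal (quotes, backslashes, control characters).
fn json_string(value: &str) -> String {
    let mut out = String::from("\"");
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Build one `[requesturl, api, url]` frame.
fn command_frame(request_url: &str, api: &str, target_url: &str) -> String {
    format!(
        "[{},{},{}]",
        json_string(request_url),
        json_string(api),
        json_string(target_url)
    )
}

/// Frames in the order they go out on the single bootstrap connection:
/// best-effort close of any orphaned page first, then the open.
fn bootstrap_frames(request_url: &str, helper_url: &str) -> Vec<String> {
    vec![
        command_frame(request_url, "sgHideBrowerserClosePage", helper_url),
        command_frame(request_url, "sgHideBrowerserOpenPage", helper_url),
    ]
}
```

Because both frames are fire-and-forget, the caller sends them back to back and relies on the readiness poll to confirm that the new page loaded.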
## Affected Files
| File | Change |
|------|--------|
| `src/browser/callback_host.rs` | In `bootstrap_helper_page`: add close command before open command |
| `src/service/server.rs` | Change `use_hidden_domain` from `false` to `true` |
## What Does NOT Change
- `callback_backend.rs`: `SHOW_AREA` and `build_command` are unchanged
- `sgBrowserExcuteJsCodeByDomain` area parameter — stays `"show"`
- Helper page HTML content — unchanged
- `Drop for LiveBrowserCallbackHost` — remains simple (shutdown only, no close attempt)
- `cached_host` in `mod.rs` — remains lifted to outer loop

@@ -0,0 +1,99 @@
# Helper Page Lifecycle Fix & Hidden Domain Support
**Date:** 2026-04-14
**Status:** Approved
## Problem Statement
Two bugs in the browser-helper.html page management:
1. **Duplicate helper pages**: Every WebSocket client reconnection triggers a new `serve_client()` call, which creates a new `LiveBrowserCallbackHost` and opens a new helper page via `sgBrowerserOpenPage`. The old helper page tab is never closed, causing accumulation of orphaned tabs.
2. **Helper page is visible**: The bootstrap uses `sgBrowerserOpenPage` (visible tab API) instead of `sgHideBrowerserOpenPage` (hidden domain API). The helper page should not be visible to the user.
## Root Cause Analysis
### Duplicate pages
Call chain:
- `src/service/mod.rs:72`: outer `loop` accepts new WebSocket connections
- `src/service/mod.rs:79`: each connection calls `serve_client()`
- `src/service/server.rs:241`: `cached_host` declared as local variable, re-initialized to `None` each call
- `src/service/server.rs:288` → `callback_host.rs:241`: `bootstrap_helper_page()` opens a new helper tab
`Drop for LiveBrowserCallbackHost` (`callback_host.rs:321-328`) only shuts down the HTTP server thread. It does not send a browser close command for the helper tab.
### Visible page
`callback_host.rs:28`: `HELPER_BOOTSTRAP_ACTION = "sgBrowerserOpenPage"` — this is the visible-domain open API (API #7). The hidden-domain equivalent is `sgHideBrowerserOpenPage` (API #6).
## Solution: Approach C — Incremental Fix
### Step 1: Fix lifecycle (immediate, deterministic fix)
#### 1a. Lift `cached_host` to outer loop
Move `cached_host: Option<Arc<LiveBrowserCallbackHost>>` from inside `serve_client()` to before the `loop` in `run_service()` (`mod.rs`). Change `serve_client()` signature to accept `&mut Option<Arc<LiveBrowserCallbackHost>>` instead of creating its own.
Effect: Multiple WebSocket reconnections share the same host. Helper page opens once per process lifetime.
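A minimal sketch of the lifted cache, using a stand-in `HelperHost` type and an `opened` counter in place of the real `LiveBrowserCallbackHost` and its bootstrap side effect (both are assumptions made so the example runs on its own).

```rust
use std::sync::Arc;

/// Stand-in for `LiveBrowserCallbackHost`; `id` records which bootstrap created it.
struct HelperHost {
    id: usize,
}

/// Reuse the cached host if present; otherwise start one and count the "page open".
fn serve_client(cached_host: &mut Option<Arc<HelperHost>>, opened: &mut usize) -> Arc<HelperHost> {
    let host = cached_host.get_or_insert_with(|| {
        *opened += 1;
        Arc::new(HelperHost { id: *opened })
    });
    Arc::clone(host)
}

/// Simulates the outer accept loop; returns how many helper pages were opened.
fn run_service(connections: usize) -> usize {
    // Declared before the loop, so it survives client reconnections.
    let mut cached_host: Option<Arc<HelperHost>> = None;
    let mut opened = 0;
    for _ in 0..connections {
        let host = serve_client(&mut cached_host, &mut opened);
        // Every connection sees the host created by the first bootstrap.
        debug_assert_eq!(host.id, 1);
    }
    opened
}
```

With the cache owned by the loop's enclosing scope, three reconnections still produce a single bootstrap, which is the behavior the fix is after.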
#### 1b. Close helper page on Drop
Enhance `Drop for LiveBrowserCallbackHost`:
- Add `browser_ws_url: String` field to `LiveBrowserCallbackHost` (stored at construction time)
- Add `use_hidden_domain: bool` field (stored at construction time)
- In `Drop::drop`, before shutting down the server thread:
  1. Connect to `browser_ws_url` with 100ms connection timeout
  2. Send register message
  3. Send close command: `[helper_url, close_api, helper_url]`
     - `close_api` = `"sgBrowserClosePage"` when `use_hidden_domain == false`
     - `close_api` = `"sgHideBrowerserClosePage"` when `use_hidden_domain == true`
  4. All steps are best-effort: failures are silently ignored
  5. Total timeout cap: 500ms
### Step 2: Hidden domain config switch (for testing/gradual rollout)
#### 2a. Parameter plumbing
- `LiveBrowserCallbackHost::start_with_browser_ws_url` gains parameter `use_hidden_domain: bool`
- `bootstrap_helper_page` selects API based on this flag:
  - `true` → `"sgHideBrowerserOpenPage"`
  - `false` → `"sgBrowerserOpenPage"` (current behavior, default)
- `LiveBrowserCallbackHost` stores the flag for Drop close-command selection
#### 2b. Caller changes
- `mod.rs` / `server.rs` pass `false` as default
- To enable hidden domain, change the call site to pass `true`
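The flag-to-API mapping described in Steps 1b and 2a can be sketched as a single lookup. `HelperPageApis` and `helper_page_apis` are hypothetical names; the API strings are the ones listed in this document.

```rust
/// The open/close API pair that belongs to one domain mode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct HelperPageApis {
    open: &'static str,
    close: &'static str,
}

/// Select the browser APIs for the helper page based on `use_hidden_domain`.
fn helper_page_apis(use_hidden_domain: bool) -> HelperPageApis {
    if use_hidden_domain {
        HelperPageApis {
            open: "sgHideBrowerserOpenPage",
            close: "sgHideBrowerserClosePage",
        }
    } else {
        // Current behavior and default: visible tab APIs.
        HelperPageApis {
            open: "sgBrowerserOpenPage",
            close: "sgBrowserClosePage",
        }
    }
}
```

Keeping the pair together makes it harder to open a page in one domain and try to close it through the other domain's API.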
## What Does NOT Change
- `callback_backend.rs` `SHOW_AREA = "show"` — JS injection targets visible business pages, not the helper itself
- `sgBrowserExcuteJsCodeByDomain` area parameter — stays `"show"` regardless of helper domain
- Helper page HTML content — WebSocket connection and command polling JS remain the same
- `collect_lineloss.js` — not affected
## Affected Files
| File | Change |
|------|--------|
| `src/browser/callback_host.rs` | New fields on `LiveBrowserCallbackHost`, `start_with_browser_ws_url` signature change, `Drop` enhancement, new `close_helper_page` helper fn |
| `src/service/mod.rs` | `cached_host` lifted to outer loop, passed to `serve_client` |
| `src/service/server.rs` | `serve_client` signature change to accept `&mut Option<Arc<LiveBrowserCallbackHost>>` |
| Existing test files | Adapt `start_with_browser_ws_url` calls with new `use_hidden_domain` parameter |
## Testing
- Existing `callback_host` tests: adapt to new signature (add `false` parameter)
- New unit test: `use_hidden_domain = true` → bootstrap sends `sgHideBrowerserOpenPage`
- New unit test: `use_hidden_domain = false` → bootstrap sends `sgBrowerserOpenPage` (regression)
- `cargo build` + `cargo test` full verification
## Browser API Reference
| API | Wire format | Effect |
|-----|------------|--------|
| `sgBrowerserOpenPage` (API #7) | `[requesturl, "sgBrowerserOpenPage", url]` | Opens visible tab |
| `sgHideBrowerserOpenPage` (API #6) | `[requesturl, "sgHideBrowerserOpenPage", url]` | Opens in hidden domain |
| `sgBrowserClosePage` (API #64) | `[requesturl, "sgBrowserClosePage", url]` | Closes visible tab |
| `sgHideBrowerserClosePage` (API #68) | `[requesturl, "sgHideBrowerserClosePage", url]` | Closes hidden domain page |

@@ -0,0 +1,284 @@
# sgClaw Service Console Enhancement Design
## Background
The current `sg_claw_service_console.html` provides a basic UI for connecting to the sgClaw service WebSocket and submitting tasks. However, it requires manual connection on first load and has no way to configure the sgClaw settings (API key, model, base URL, skills directory) from the UI.
Users need to manually edit `sgclaw_config.json` before using the console, which is inconvenient for routine operations.
## Problem Statement
1. Page requires manual "Connect" button click on first load
2. No UI for configuring sgClaw runtime settings (model, API key, base URL, skills dir)
3. Users must manually edit `sgclaw_config.json` file to change configuration
## Goal
Enhance the service console page with:
1. **Auto-connect on page load** - attempt WebSocket connection immediately
2. **Settings panel** - edit sgClaw configuration fields through a friendly UI
3. **Config save via WebSocket** - send configuration updates to the running sgClaw service, which writes them to `sgclaw_config.json`
## Non-goals
- Auto-starting `sg_claw.exe` process (browser security limitation, deferred)
- Changing existing `submit_task` protocol or execution flow
- Modifying browser-helper.html or browser execution logic
- Adding authentication or multi-user support
- Configuration validation beyond basic field checks
## Architecture
### Component Overview
```
┌─────────────────────────────────────────┐
│  sg_claw_service_console.html           │
│  ┌───────────────────────────────────┐  │
│  │ Auto-connect on load              │  │
│  │ (ws://127.0.0.1:42321 default)    │  │
│  └───────────────────────────────────┘  │
│  ┌───────────────────────────────────┐  │
│  │ Settings Panel (Modal)            │  │
│  │ - API Key                         │  │
│  │ - Base URL                        │  │
│  │ - Model                           │  │
│  │ - Skills Directory                │  │
│  │ - Direct Submit Skill (optional)  │  │
│  │ - Runtime Profile (dropdown)      │  │
│  │ - Browser Backend (dropdown)      │  │
│  │ [Save] [Cancel]                   │  │
│  └───────────────────────────────────┘  │
│  ┌───────────────────────────────────┐  │
│  │ Existing: Connection + Composer   │  │
│  └───────────────────────────────────┘  │
└──────────────┬──────────────────────────┘
               │ WebSocket
               │ submit_task / update_config
┌─────────────────────────────────────────┐
│  sg_claw.exe (service)                  │
│  ┌───────────────────────────────────┐  │
│  │ ClientMessage handler             │  │
│  │ - SubmitTask (existing)           │  │
│  │ - UpdateConfig (new)              │  │
│  └───────────────────────────────────┘  │
│  ┌───────────────────────────────────┐  │
│  │ Config writer                     │  │
│  │ Writes to sgclaw_config.json      │  │
│  └───────────────────────────────────┘  │
└─────────────────────────────────────────┘
```
### Data Flow
1. **Auto-connect flow:**
   - Page loads → JavaScript calls `connect()` automatically
   - If WS opens → show "已连接" chip, enable send button
   - If WS fails → show "未连接" chip, keep send disabled
   - Reconnect logic remains unchanged (existing heartbeat/reconnect)
2. **Config save flow:**
   - User clicks "设置" button → modal opens with current config values
   - User edits fields → clicks "保存"
   - Page sends `update_config` message via WS:
     ```json
     {
       "type": "update_config",
       "config": {
         "apiKey": "...",
         "baseUrl": "...",
         "model": "...",
         "skillsDir": "...",
         "directSubmitSkill": "...",
         "runtimeProfile": "...",
         "browserBackend": "..."
       }
     }
     ```
   - sgClaw service receives message → validates → writes to `sgclaw_config.json`
   - Service responds with success/error → page shows notification
   - Service reloads config in-memory (or requires restart - see below)
### Protocol Changes
#### New ClientMessage variant
Add to `src/service/protocol.rs`:
```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ClientMessage {
    Connect,
    Start,
    Stop,
    SubmitTask { ... },
    Ping,
    UpdateConfig { // NEW
        config: ConfigUpdatePayload,
    },
}

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
// Field names map to the camelCase keys used in the `update_config` JSON (apiKey, baseUrl, ...)
#[serde(rename_all = "camelCase")]
pub struct ConfigUpdatePayload {
    pub api_key: Option<String>,
    pub base_url: Option<String>,
    pub model: Option<String>,
    pub skills_dir: Option<String>,
    pub direct_submit_skill: Option<String>,
    pub runtime_profile: Option<String>,
    pub browser_backend: Option<String>,
}
```
#### New ServiceMessage variant (optional)
Add to `src/service/protocol.rs`:
```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ServiceMessage {
    StatusChanged { state: String },
    LogEntry { level: String, message: String },
    TaskComplete { success: bool, summary: String },
    Busy { message: String },
    Pong,
    ConfigUpdated { success: bool, message: String }, // NEW
}
```
### Config Persistence
The service will:
1. Load current `sgclaw_config.json` from the config path (derived from process args)
2. Merge incoming `ConfigUpdatePayload` fields (only non-null fields are updated)
3. Write the merged config back to the same file
4. Respond with success/error message
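5. **Hot reload** (optional): ideally the service reloads config in-memory without requiring a restart; the "Hot Reload Consideration" section recommends starting with a restart-required approach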
**Important:** If the config file path cannot be resolved (no `--config-path` arg), the service should respond with an error message indicating that config updates are not supported in env-var-only mode.
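The merge rule and the missing-path error can be sketched as follows. `Settings` is a simplified stand-in for `SgClawSettings`, and `merge_update` / `plan_update` are hypothetical helpers; file I/O is left out so the block stays self-contained.

```rust
use std::path::{Path, PathBuf};

/// Simplified stand-in for `SgClawSettings` (only a few fields shown).
#[derive(Debug, Clone, Default, PartialEq)]
struct Settings {
    provider_api_key: String,
    provider_base_url: String,
    provider_model: String,
    skills_dir: Option<PathBuf>,
}

/// Subset of the payload; `None` means "leave the current value alone".
#[derive(Debug, Clone, Default)]
struct ConfigUpdatePayload {
    api_key: Option<String>,
    base_url: Option<String>,
    model: Option<String>,
    skills_dir: Option<String>,
}

/// Overwrite only the fields that are present in the update.
fn merge_update(mut current: Settings, update: ConfigUpdatePayload) -> Settings {
    if let Some(v) = update.api_key { current.provider_api_key = v; }
    if let Some(v) = update.base_url { current.provider_base_url = v; }
    if let Some(v) = update.model { current.provider_model = v; }
    if let Some(v) = update.skills_dir { current.skills_dir = Some(PathBuf::from(v)); }
    current
}

/// Refuse the update when no config path is known (env-var-only mode);
/// otherwise return where to write and what to write.
fn plan_update<'a>(
    config_path: Option<&'a Path>,
    current: Settings,
    update: ConfigUpdatePayload,
) -> Result<(&'a Path, Settings), String> {
    let path = config_path
        .ok_or_else(|| "config updates are not supported in env-var-only mode".to_string())?;
    Ok((path, merge_update(current, update)))
}
```

Checking the path before merging means an env-var-only service can report the error without touching any settings.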
### UI Design
#### Settings Button
- Add a "设置" button in the sidebar, below the existing connect button
- Styled as a ghost button with a gear icon (using unicode ⚙ or CSS-only icon)
#### Settings Modal
- Overlay modal with centered card
- Form fields with labels in Chinese:
  - `API 密钥` (apiKey) - password input type with show/hide toggle
  - `模型服务地址` (baseUrl) - text input
  - `模型名称` (model) - text input
  - `Skills 目录路径` (skillsDir) - text input with path validation
  - `直接提交技能` (directSubmitSkill) - text input (optional, can be empty)
  - `运行模式` (runtimeProfile) - dropdown: `browser-attached` / `service-standalone`
  - `浏览器后端` (browserBackend) - dropdown: `super-rpa` / `pipe` / `none`
- [保存] primary button, [取消] ghost button
- Validation:
  - API Key and Model are required (show red error if empty on save)
  - Base URL must be a valid URL format
  - Skills Dir must be a valid path format
  - Other fields are optional
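If the service mirrors these checks on its side, the rules could look like the sketch below. The function names and error strings are hypothetical, and "valid URL format" is approximated as "starts with `http://` or `https://` and has a non-empty host"; a real implementation might use a URL parser instead.

```rust
/// Very rough URL check: http(s) scheme followed by a non-empty host without whitespace.
fn looks_like_http_url(value: &str) -> bool {
    let rest = value
        .strip_prefix("https://")
        .or_else(|| value.strip_prefix("http://"));
    match rest {
        Some(rest) => {
            let host = rest.split('/').next().unwrap_or("");
            !host.is_empty() && !host.contains(char::is_whitespace)
        }
        None => false,
    }
}

/// Returns human-readable problems; an empty vector means the fields passed.
fn validate_settings(api_key: &str, model: &str, base_url: &str) -> Vec<String> {
    let mut errors = Vec::new();
    if api_key.trim().is_empty() {
        errors.push("apiKey is required".to_string());
    }
    if model.trim().is_empty() {
        errors.push("model is required".to_string());
    }
    // Base URL is only checked when provided.
    let base_url = base_url.trim();
    if !base_url.is_empty() && !looks_like_http_url(base_url) {
        errors.push("baseUrl must be an http(s) URL".to_string());
    }
    errors
}
```

Collecting all problems rather than stopping at the first lets the page show every invalid field in one pass.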
#### Connection State Auto-detection
- On page load, call `connect()` automatically
- Connection state chip updates as before
- Reconnect logic (existing) remains unchanged
### File Changes
| File | Change |
|------|--------|
| `frontend/service-console/sg_claw_service_console.html` | Add auto-connect on load, settings modal UI, save logic |
| `src/service/protocol.rs` | Add `UpdateConfig` variant and `ConfigUpdatePayload` struct |
| `src/service/protocol.rs` | Add `ConfigUpdated` service message variant |
| `src/service/server.rs` | Handle `UpdateConfig` message, merge config, write file |
| `src/agent/task_runner.rs` | Add `pub fn config_path(&self) -> Option<&Path>` getter to `AgentRuntimeContext` |
| `src/config/settings.rs` | Add `save_to_path()` method for writing config to file |
| `tests/service_console_html_test.rs` | Add assertions for settings modal and update_config message |
### Config Save Implementation
In `src/service/server.rs`, when handling `UpdateConfig`:
```rust
ClientMessage::UpdateConfig { config } => {
    // 1. Load current config from config_path
    let config_path = runtime_context.config_path(); // needs to be exposed
    let current = SgClawSettings::load(config_path.as_deref())?;

    // 2. Merge: only overwrite fields that are Some in the payload
    let mut merged = current.unwrap_or_default();
    if let Some(v) = config.api_key { merged.provider_api_key = v; }
    if let Some(v) = config.base_url { merged.provider_base_url = v; }
    if let Some(v) = config.model { merged.provider_model = v; }
    if let Some(v) = config.skills_dir { merged.skills_dir = Some(PathBuf::from(v)); }
    // ... etc for other fields

    // 3. Write back to file
    merged.save_to_path(config_path.as_ref().ok_or("no config path")?)?;

    // 4. Respond
    sink.send_service_message(ServiceMessage::ConfigUpdated {
        success: true,
        message: "配置已保存".to_string(),
    })?;
}
```
### Hot Reload Consideration
After saving config, the service should reload its in-memory settings. This requires:
1. Storing the loaded `SgClawSettings` in a reloadable container (e.g., `Arc<Mutex<SgClawSettings>>` or `Arc<RwLock<...>>`)
2. Or, the service can respond with "配置已保存,请重启 sg_claw 以应用更改" (simpler, avoids hot reload complexity)
**Recommended:** Start with "requires restart" approach. Hot reload can be added later if needed.
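Should hot reload be added later, option 1 could take roughly the following shape. This is a sketch under assumptions: `Settings` is a one-field stand-in for `SgClawSettings`, and `SharedSettings` is a hypothetical wrapper, not an existing type in the codebase.

```rust
use std::sync::{Arc, RwLock};

/// One-field stand-in for `SgClawSettings`.
#[derive(Debug, Clone, PartialEq)]
struct Settings {
    model: String,
}

/// Cheaply cloneable handle; all clones see the same settings.
#[derive(Clone)]
struct SharedSettings(Arc<RwLock<Settings>>);

impl SharedSettings {
    fn new(initial: Settings) -> Self {
        SharedSettings(Arc::new(RwLock::new(initial)))
    }

    /// Copy the current settings out so callers never hold the lock while working.
    fn snapshot(&self) -> Settings {
        self.0.read().expect("settings lock poisoned").clone()
    }

    /// Swap in freshly saved settings; later snapshots observe the new value.
    fn replace(&self, next: Settings) {
        *self.0.write().expect("settings lock poisoned") = next;
    }
}
```

A task that took a snapshot before `replace` keeps running with the old values, so a config save cannot change settings halfway through a task.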
### Error Handling
| Scenario | Response |
|----------|----------|
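| WS not connected when saving | Show inline error: "请先连接服务" ("Connect to the service first") |
| Config file not found | Service responds: "未找到配置文件,请通过 --config-path 指定" ("Config file not found; specify it with --config-path") |
| Invalid config values | Service validates and responds with specific error |
| Write permission denied | Service responds: "无法写入配置文件,请检查文件权限" ("Cannot write the config file; check file permissions") |
| WS disconnected during save | Show error: "连接断开,保存失败,请重试" ("Connection lost, save failed; please retry") |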
### Test Strategy
1. **Integration test** (`tests/service_console_html_test.rs`):
- Assert page contains settings modal HTML
- Assert page contains "设置" button
- Assert page sends `update_config` message shape
- Assert page auto-connects on load (contains `window.onload` or equivalent)
2. **Protocol test** (new or existing test file):
- Assert `ClientMessage::UpdateConfig` serializes correctly
- Assert `ServiceMessage::ConfigUpdated` deserializes correctly
3. **Config save test** (new test in `tests/compat_config_test.rs` or new file):
- Create temp config file
- Send UpdateConfig message
- Verify file contents match expected merged config
## Acceptance Criteria
1. Page auto-connects to WS on load without manual button click
2. Settings button visible in sidebar
3. Settings modal opens with form fields for all configurable options
4. Clicking "保存" sends `update_config` message via WS
5. Service receives message and writes to `sgclaw_config.json`
6. Service responds with success/error message
7. Page displays save result notification
8. Existing task submission flow unchanged
9. Existing heartbeat/reconnect logic unchanged
10. Automated tests pass

Binary file not shown.

@@ -3,12 +3,16 @@
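> Scope: joint integration development between P1a (Rust) and P2 (Chromium C++).
> Goal: if both sides implement strictly to this document, joint integration will be stable.
Additional notes:
- The browser host is one privileged execution surface of the sgClaw/zeroclaw runtime, not the definition of the runtime as a whole.
- Only actions that genuinely need browser execution should cross this pipe; do not assume every task is a browser task.
## 1. Protocol Boundaries and Responsibilities
- Single source of truth: `docs/L2-核心模块与接口契约层.md`, Chapter 5 (5.1~5.4).
- Protocol version is frozen at `1.0`; any change to fields, actions, or error codes counts as a protocol change.
- P1a is responsible for: `seq` generation, command assembly, HMAC computation, response correlation.
- P2 is responsible for: message parsing, Schema validation, MAC checking, CommandRouter execution, structured responses.
- P1a is responsible for: zeroclaw-first runtime, task interpretation, tool policy, `seq` generation, command assembly, HMAC computation, response correlation.
- P2 is responsible for: process host, message parsing, Schema validation, MAC checking, CommandRouter execution, structured responses.
## 2. Wire Contract (both sides MUST)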
@@ -36,6 +40,10 @@
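- A failure response must include `error.code` and `error.message` (plain-text errors are prohibited).
- `action` and `params` must pass the L2 enum and Schema validation.
Notes:
- This standard constrains the wire contract of the "browser privileged tool surface".
- It does not define the overall task semantics of sgClaw/zeroclaw, nor does it imply that every task must become a browser command.
**Standard command example**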
```json

@@ -0,0 +1,43 @@
# sgClaw Runtime Host Bundle
`frontendBundleDir` lets SuperRPA prefer an external `sgclaw-chat` bundle at
runtime while keeping the browser host, security boundary, and `chrome.send`
bridge inside Chromium.
## Contract
- SuperRPA remains the host and security boundary.
- sgClaw remains the runtime and planner/execution owner.
- The frontend bundle only renders state that comes from the existing host
bridge.
- Missing external files fall back to Chromium-bundled resources.
## Directory Layout
The external bundle root must mirror the Chromium resource paths under
`functions/sgclaw-chat/`.
Example:
```text
frontend-bundle/
  sgclaw-chat/
    sgclaw-chat.js
    sgclaw-chat.css.js
    sgclaw-chat.html.js
    components/
      sgclaw-chat-shell.js
      sgclaw-message-list.js
```
When `frontendBundleDir` points at `frontend-bundle/`, the browser will try to
serve `sgclaw-chat/*` from that directory first. Any file that is absent will
continue loading from the built-in Chromium resource pack.
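The lookup order can be illustrated with a short sketch. It is written in Rust purely for illustration (the actual host logic lives in Chromium), and `ResourceSource`, `resolve_resource`, and the `exists` callback are hypothetical; the `..` rejection is an added assumption, not something this document specifies.

```rust
use std::path::{Path, PathBuf};

/// Where a requested resource should be loaded from.
#[derive(Debug, PartialEq)]
enum ResourceSource {
    /// Served from the external bundle directory.
    External(PathBuf),
    /// Served from the built-in Chromium resource pack.
    Bundled,
}

/// Prefer `frontend_bundle_dir` for `sgclaw-chat/*`; fall back to bundled resources.
/// `exists` is injected so the decision can be tested without touching the file system.
fn resolve_resource(
    frontend_bundle_dir: Option<&Path>,
    resource: &str,
    exists: impl Fn(&Path) -> bool,
) -> ResourceSource {
    if let Some(dir) = frontend_bundle_dir {
        // Only the sgclaw-chat tree is overridable; reject parent-directory escapes.
        if resource.starts_with("sgclaw-chat/") && !resource.contains("..") {
            let candidate = dir.join(resource);
            if exists(&candidate) {
                return ResourceSource::External(candidate);
            }
        }
    }
    ResourceSource::Bundled
}
```

In production the callback would be something like `|p| p.is_file()`; the fallback branch is what lets a partial bundle override only the files it actually ships.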
## Packaging Notes
- Keep relative import paths inside the bundle identical to the bundled
`sgclaw-chat` tree.
- Do not move planner or execution logic into the bundle.
- Treat [manifest.example.json](/home/zyl/projects/sgClaw/claw/frontend/runtime-host/manifest.example.json)
as a packaging reference for humans and tooling.

@@ -0,0 +1,17 @@
{
  "bundleId": "sgclaw-chat",
  "version": "0.1.0",
  "description": "External sgclaw chat presentation bundle for SuperRPA runtime hosting.",
  "resourceRoot": ".",
  "resourcePrefix": "sgclaw-chat/",
  "entries": [
    "sgclaw-chat/sgclaw-chat.js",
    "sgclaw-chat/sgclaw-chat.css.js",
    "sgclaw-chat/sgclaw-chat.html.js"
  ],
  "notes": [
    "Paths should mirror Chromium bundled sgclaw-chat resources.",
    "Missing files fall back to bundled Chromium resources.",
    "The bundle must stay presentation-only and use the existing host bridge."
  ]
}

@@ -0,0 +1,835 @@
<!doctype html>
<html lang="zh-CN">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>sgClaw Service Console</title>
<style>
:root {
--bg: #f3efe4;
--panel: rgba(255, 252, 247, 0.88);
--panel-strong: #fffaf2;
--text: #1f2329;
--muted: #636b74;
--line: rgba(31, 35, 41, 0.12);
--accent: #0f766e;
--accent-strong: #115e59;
--warn: #b45309;
--error: #b42318;
--success: #166534;
--shadow: 0 24px 60px rgba(34, 42, 53, 0.14);
}
* {
box-sizing: border-box;
}
body {
margin: 0;
min-height: 100vh;
font-family: "Microsoft YaHei", "PingFang SC", "Segoe UI", sans-serif;
color: var(--text);
background:
radial-gradient(circle at top left, rgba(15, 118, 110, 0.18), transparent 32%),
radial-gradient(circle at right, rgba(180, 83, 9, 0.14), transparent 28%),
linear-gradient(160deg, #f5f0e6 0%, #eef5f4 56%, #f7f3eb 100%);
padding: 24px;
}
.shell {
width: min(1040px, 100%);
margin: 0 auto;
background: var(--panel);
backdrop-filter: blur(14px);
border: 1px solid rgba(255, 255, 255, 0.7);
border-radius: 28px;
box-shadow: var(--shadow);
overflow: hidden;
}
.hero {
padding: 28px 28px 18px;
border-bottom: 1px solid var(--line);
background: linear-gradient(135deg, rgba(255, 250, 242, 0.96), rgba(237, 246, 243, 0.92));
}
.hero h1 {
margin: 0;
font-size: clamp(1.8rem, 4vw, 2.6rem);
line-height: 1.05;
letter-spacing: 0.02em;
}
.hero p {
margin: 10px 0 0;
max-width: 60ch;
color: var(--muted);
line-height: 1.6;
}
.content {
display: grid;
grid-template-columns: minmax(280px, 320px) minmax(0, 1fr);
gap: 0;
}
.sidebar,
.stream-panel {
padding: 24px;
}
.sidebar {
border-right: 1px solid var(--line);
background: rgba(255, 255, 255, 0.38);
}
.section-label {
margin: 0 0 14px;
font-size: 0.83rem;
font-weight: 700;
letter-spacing: 0.14em;
text-transform: uppercase;
color: var(--muted);
}
.field {
margin-bottom: 18px;
}
.field label {
display: block;
margin-bottom: 8px;
font-size: 0.92rem;
color: var(--muted);
}
input,
textarea,
button {
width: 100%;
border: 1px solid var(--line);
border-radius: 16px;
font: inherit;
}
input,
textarea {
background: rgba(255, 255, 255, 0.92);
color: var(--text);
padding: 14px 16px;
outline: none;
transition: border-color 140ms ease, box-shadow 140ms ease, transform 140ms ease;
}
input:focus,
textarea:focus {
border-color: rgba(15, 118, 110, 0.5);
box-shadow: 0 0 0 4px rgba(15, 118, 110, 0.12);
}
textarea {
min-height: 180px;
resize: vertical;
line-height: 1.6;
}
button {
border: 0;
padding: 14px 16px;
font-weight: 700;
cursor: pointer;
transition: transform 140ms ease, opacity 140ms ease, background 140ms ease;
}
button:hover:not(:disabled) {
transform: translateY(-1px);
}
button:disabled {
cursor: not-allowed;
opacity: 0.45;
}
.primary-btn {
background: linear-gradient(135deg, var(--accent), var(--accent-strong));
color: #f6fffd;
box-shadow: 0 14px 28px rgba(15, 118, 110, 0.18);
}
.ghost-btn {
background: rgba(255, 255, 255, 0.9);
color: var(--text);
border: 1px solid var(--line);
}
.status-card {
display: grid;
gap: 8px;
padding: 16px;
border-radius: 20px;
background: var(--panel-strong);
border: 1px solid var(--line);
margin-bottom: 18px;
}
.state-chip {
display: inline-flex;
align-items: center;
width: fit-content;
padding: 6px 10px;
border-radius: 999px;
font-size: 0.85rem;
font-weight: 700;
background: rgba(99, 107, 116, 0.12);
color: var(--muted);
}
.state-chip[data-state="connected"] {
background: rgba(22, 101, 52, 0.12);
color: var(--success);
}
.state-chip[data-state="connecting"] {
background: rgba(180, 83, 9, 0.12);
color: var(--warn);
}
.validation {
min-height: 1.4em;
margin: 10px 0 14px;
color: var(--error);
font-size: 0.92rem;
}
.stream-panel {
display: grid;
grid-template-rows: auto minmax(320px, 1fr);
gap: 18px;
}
.stream-head {
display: flex;
justify-content: space-between;
align-items: end;
gap: 16px;
}
.stream-head h2 {
margin: 0;
font-size: 1.35rem;
}
.stream-head p {
margin: 6px 0 0;
color: var(--muted);
font-size: 0.94rem;
}
.stream {
display: grid;
gap: 12px;
align-content: start;
min-height: 320px;
max-height: 70vh;
overflow: auto;
padding: 4px;
}
.empty-state {
padding: 22px;
border-radius: 20px;
background: rgba(255, 255, 255, 0.52);
border: 1px dashed rgba(31, 35, 41, 0.16);
color: var(--muted);
line-height: 1.6;
}
.row {
display: grid;
grid-template-columns: auto 1fr;
gap: 12px;
align-items: start;
padding: 14px 16px;
border-radius: 18px;
background: rgba(255, 255, 255, 0.76);
border: 1px solid rgba(31, 35, 41, 0.08);
animation: rise 180ms ease;
}
.row-badge {
display: inline-flex;
align-items: center;
justify-content: center;
min-width: 76px;
padding: 6px 10px;
border-radius: 999px;
font-size: 0.76rem;
font-weight: 700;
text-transform: uppercase;
letter-spacing: 0.08em;
background: rgba(99, 107, 116, 0.14);
color: var(--muted);
}
.row.status .row-badge {
background: rgba(15, 118, 110, 0.14);
color: var(--accent-strong);
}
.row.log .row-badge {
background: rgba(57, 91, 163, 0.14);
color: #315aa2;
}
.row.complete .row-badge {
background: rgba(22, 101, 52, 0.14);
color: var(--success);
}
.row.error .row-badge {
background: rgba(180, 35, 24, 0.14);
color: var(--error);
}
.row-text {
margin: 0;
line-height: 1.6;
white-space: pre-wrap;
word-break: break-word;
}
@keyframes rise {
from {
opacity: 0;
transform: translateY(6px);
}
to {
opacity: 1;
transform: translateY(0);
}
}
/* Settings modal elements */
select {
width: 100%;
border: 1px solid var(--line);
border-radius: 16px;
padding: 14px 16px;
background: rgba(255, 255, 255, 0.92);
color: var(--text);
font: inherit;
outline: none;
cursor: pointer;
}
select:focus {
border-color: rgba(15, 118, 110, 0.5);
box-shadow: 0 0 0 4px rgba(15, 118, 110, 0.12);
}
@media (max-width: 900px) {
body {
padding: 16px;
}
.content {
grid-template-columns: 1fr;
}
.sidebar {
border-right: 0;
border-bottom: 1px solid var(--line);
}
.stream {
max-height: none;
}
}
</style>
</head>
<body>
<div class="shell" id="app">
<div class="hero">
<h1>sgClaw Service Console</h1>
<p>直接连接现有 service websocket,提交自然语言任务并持续查看 service 返回的状态、日志和完成结果。</p>
</div>
<div class="content">
<div class="sidebar">
<p class="section-label">Connection</p>
<div class="status-card">
<span id="connectionState" class="state-chip" data-state="disconnected">未连接</span>
<span>默认地址使用现有 service websocket。</span>
</div>
<div class="field">
<label for="wsUrl">WebSocket 地址</label>
<input id="wsUrl" value="ws://127.0.0.1:42321" />
</div>
<button id="connectBtn" class="ghost-btn">连接</button>
<button id="settingsBtn" class="ghost-btn" style="margin-top: 8px;">⚙ 设置</button>
<p class="section-label" style="margin-top: 26px;">Composer</p>
<div class="field">
<label for="instructionInput">任务内容</label>
<textarea id="instructionInput" placeholder="例如:打开百度"></textarea>
</div>
<div id="validationText" class="validation"></div>
<button id="sendBtn" class="primary-btn" disabled>发送任务</button>
</div>
<div class="stream-panel">
<div class="stream-head">
<div>
<p class="section-label">Service Stream</p>
<h2>消息流</h2>
<p>只展示本地连接状态与现有 service message。</p>
</div>
</div>
<div id="messageStream" class="stream">
<div class="empty-state" id="emptyState">尚无消息。先连接 service websocket,再发送一条自然语言任务。</div>
</div>
</div>
</div>
</div>
<!-- Settings Modal -->
<div id="settingsModal" style="display: none; position: fixed; top: 0; left: 0; width: 100%; height: 100%; background: rgba(0,0,0,0.5); z-index: 1000; align-items: center; justify-content: center;">
<div style="background: var(--panel); border-radius: 20px; padding: 28px; width: min(520px, 90%); max-height: 85vh; overflow-y: auto; box-shadow: var(--shadow);">
<h3 style="margin: 0 0 20px; font-size: 1.2rem;">sgClaw 配置</h3>
<div class="field">
<label for="settingApiKey">API 密钥 *</label>
<input id="settingApiKey" type="password" placeholder="输入模型 API 密钥" />
</div>
<div class="field">
<label for="settingBaseUrl">模型服务地址 *</label>
<input id="settingBaseUrl" type="url" placeholder="例如:https://api.deepseek.com" />
</div>
<div class="field">
<label for="settingModel">模型名称 *</label>
<input id="settingModel" type="text" placeholder="例如:deepseek-chat" />
</div>
<div class="field">
<label for="settingSkillsDir">Skills 目录路径</label>
<input id="settingSkillsDir" type="text" placeholder="例如:D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills" />
</div>
<div class="field">
<label for="settingDirectSubmitSkill">直接提交技能</label>
<input id="settingDirectSubmitSkill" type="text" placeholder="例如:tq-lineloss-report.collect_lineloss" />
</div>
<div class="field">
<label for="settingRuntimeProfile">运行模式</label>
<select id="settingRuntimeProfile" style="width: 100%; border: 1px solid var(--line); border-radius: 16px; padding: 14px 16px; background: rgba(255, 255, 255, 0.92); color: var(--text); font: inherit;">
<option value="browser-attached">browser-attached</option>
<option value="browser-heavy">browser-heavy</option>
<option value="general-assistant">general-assistant</option>
</select>
</div>
<div class="field">
<label for="settingBrowserBackend">浏览器后端</label>
<select id="settingBrowserBackend" style="width: 100%; border: 1px solid var(--line); border-radius: 16px; padding: 14px 16px; background: rgba(255, 255, 255, 0.92); color: var(--text); font: inherit;">
<option value="super-rpa">super-rpa</option>
<option value="agent-browser">agent-browser</option>
<option value="rust-native">rust-native</option>
<option value="computer-use">computer-use</option>
<option value="auto">auto</option>
</select>
</div>
<div id="settingsValidation" style="color: var(--error); font-size: 0.92rem; min-height: 1.4em; margin: 10px 0;"></div>
<div style="display: flex; gap: 12px; margin-top: 16px;">
<button id="settingsSaveBtn" class="primary-btn" style="flex: 1;">保存</button>
<button id="settingsCancelBtn" class="ghost-btn" style="flex: 1;">取消</button>
</div>
</div>
</div>
<script>
const defaultWsUrl = "ws://127.0.0.1:42321";
const elements = {
wsUrl: document.getElementById("wsUrl"),
connectBtn: document.getElementById("connectBtn"),
connectionState: document.getElementById("connectionState"),
messageStream: document.getElementById("messageStream"),
instructionInput: document.getElementById("instructionInput"),
validationText: document.getElementById("validationText"),
sendBtn: document.getElementById("sendBtn"),
emptyState: document.getElementById("emptyState")
};
let socket = null;
let reconnectTimer = null;
let connectTimeoutTimer = null;
let heartbeatTimer = null;
let shouldReconnect = false;
let lastHeartbeatAt = 0;
const reconnectDelayMs = 1500;
const reconnectCloseCode = 4000;
const reconnectCloseReason = "manual_disconnect";
const heartbeatIntervalMs = 15000;
const heartbeatTimeoutMs = 30000;
const connectTimeoutMs = 5000;
function appendRow(kind, text) {
if (elements.emptyState) {
elements.emptyState.remove();
elements.emptyState = null;
}
const row = document.createElement("div");
row.className = "row " + kind;
const badge = document.createElement("span");
badge.className = "row-badge";
badge.textContent = kind;
const content = document.createElement("p");
content.className = "row-text";
content.textContent = text;
row.appendChild(badge);
row.appendChild(content);
elements.messageStream.appendChild(row);
elements.messageStream.scrollTop = elements.messageStream.scrollHeight;
}
function clearReconnectTimer() {
if (reconnectTimer) {
clearTimeout(reconnectTimer);
reconnectTimer = null;
}
}
function clearConnectTimeoutTimer() {
if (connectTimeoutTimer) {
clearTimeout(connectTimeoutTimer);
connectTimeoutTimer = null;
}
}
function stopHeartbeat() {
if (heartbeatTimer) {
clearInterval(heartbeatTimer);
heartbeatTimer = null;
}
}
function startHeartbeat() {
stopHeartbeat();
lastHeartbeatAt = Date.now();
heartbeatTimer = setInterval(() => {
if (!socket || socket.readyState !== WebSocket.OPEN) {
return;
}
if (Date.now() - lastHeartbeatAt > heartbeatTimeoutMs) {
appendRow("error", "heartbeat missed, forcing reconnect");
const activeSocket = socket;
socket = null;
stopHeartbeat();
clearConnectTimeoutTimer();
activeSocket.close();
scheduleReconnect();
return;
}
socket.send(JSON.stringify({ type: "ping" }));
}, heartbeatIntervalMs);
}
function scheduleReconnect() {
clearReconnectTimer();
clearConnectTimeoutTimer();
if (!shouldReconnect) {
return;
}
appendRow("status", "service websocket disconnected, retrying");
reconnectTimer = setTimeout(() => connectOrDisconnectService(true), reconnectDelayMs);
updateUiState();
}
function setValidation(message) {
elements.validationText.textContent = message;
}
function updateUiState() {
const readyState = socket ? socket.readyState : WebSocket.CLOSED;
const connected = readyState === WebSocket.OPEN;
const connecting = readyState === WebSocket.CONNECTING || Boolean(reconnectTimer);
let stateText = "未连接";
let stateValue = "disconnected";
if (connected) {
stateText = "已连接";
stateValue = "connected";
} else if (connecting) {
stateText = "连接中";
stateValue = "connecting";
}
elements.connectBtn.textContent = connected || connecting ? "断开" : "连接";
elements.sendBtn.disabled = !connected;
elements.connectionState.textContent = stateText;
elements.connectionState.dataset.state = stateValue;
}
function connectOrDisconnectService(forceConnect = false) {
if (!forceConnect && socket && (socket.readyState === WebSocket.OPEN || socket.readyState === WebSocket.CONNECTING)) {
shouldReconnect = false;
clearReconnectTimer();
clearConnectTimeoutTimer();
stopHeartbeat();
socket.close(reconnectCloseCode, reconnectCloseReason);
return;
}
clearReconnectTimer();
clearConnectTimeoutTimer();
const url = elements.wsUrl.value.trim() || defaultWsUrl;
elements.wsUrl.value = url;
shouldReconnect = true;
const nextSocket = new WebSocket(url);
socket = nextSocket;
updateUiState();
connectTimeoutTimer = setTimeout(() => {
if (socket !== nextSocket || nextSocket.readyState !== WebSocket.CONNECTING) {
return;
}
appendRow("error", "service websocket connect timed out");
socket = null;
nextSocket.close();
scheduleReconnect();
}, connectTimeoutMs);
nextSocket.addEventListener("open", () => {
if (socket !== nextSocket) {
return;
}
clearReconnectTimer();
clearConnectTimeoutTimer();
lastHeartbeatAt = Date.now();
startHeartbeat();
appendRow("status", "service websocket connected");
updateUiState();
});
nextSocket.addEventListener("close", (event) => {
if (socket !== nextSocket) {
return;
}
socket = null;
clearConnectTimeoutTimer();
stopHeartbeat();
const manualClose = event.code === reconnectCloseCode || event.reason === reconnectCloseReason;
if (manualClose) {
shouldReconnect = false;
appendRow("status", "service websocket disconnected");
updateUiState();
return;
}
scheduleReconnect();
});
nextSocket.addEventListener("error", () => {
if (socket !== nextSocket) {
return;
}
appendRow("error", "service websocket error");
});
nextSocket.addEventListener("message", handleMessage);
}
function handleMessage(event) {
lastHeartbeatAt = Date.now();
let message;
try {
message = JSON.parse(event.data);
} catch (_error) {
appendRow("error", "invalid service message: " + event.data);
return;
}
switch (message.type) {
case "status_changed":
appendRow("status", message.state);
break;
case "log_entry":
appendRow("log", message.message);
break;
case "task_complete":
appendRow(message.success ? "complete" : "error", message.summary);
break;
case "busy":
appendRow("error", message.message);
break;
case "pong":
break;
case "config_updated":
handleConfigResponse(message);
break;
default:
appendRow("error", "unknown service message: " + event.data);
}
}
function sendTask() {
const instruction = elements.instructionInput.value.trim();
if (!socket || socket.readyState !== WebSocket.OPEN) {
return;
}
if (!instruction) {
setValidation("请输入任务内容。");
return;
}
setValidation("");
socket.send(JSON.stringify({
type: "submit_task",
instruction,
conversation_id: "",
messages: [],
page_url: "",
page_title: ""
}));
}
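// Wrap the handler so the click event object is not passed as a truthy forceConnect argument.
elements.connectBtn.addEventListener("click", () => connectOrDisconnectService());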
elements.sendBtn.addEventListener("click", sendTask);
elements.instructionInput.addEventListener("input", () => {
if (elements.instructionInput.value.trim()) {
setValidation("");
}
});
updateUiState();
// Auto-connect on page load
window.addEventListener("DOMContentLoaded", () => {
connectOrDisconnectService(true);
});
// Settings modal state
const settingsElements = {
modal: document.getElementById("settingsModal"),
apiKey: document.getElementById("settingApiKey"),
baseUrl: document.getElementById("settingBaseUrl"),
model: document.getElementById("settingModel"),
skillsDir: document.getElementById("settingSkillsDir"),
directSubmitSkill: document.getElementById("settingDirectSubmitSkill"),
runtimeProfile: document.getElementById("settingRuntimeProfile"),
browserBackend: document.getElementById("settingBrowserBackend"),
validation: document.getElementById("settingsValidation"),
saveBtn: document.getElementById("settingsSaveBtn"),
cancelBtn: document.getElementById("settingsCancelBtn"),
};
let settingsOpenBtn = null;
function openSettingsModal() {
settingsElements.apiKey.value = "";
settingsElements.baseUrl.value = "";
settingsElements.model.value = "";
settingsElements.skillsDir.value = "";
settingsElements.directSubmitSkill.value = "";
settingsElements.runtimeProfile.value = "browser-attached";
settingsElements.browserBackend.value = "super-rpa";
settingsElements.validation.textContent = "";
settingsElements.modal.style.display = "flex";
}
function closeSettingsModal() {
settingsElements.modal.style.display = "none";
}
function validateSettings() {
const apiKey = settingsElements.apiKey.value.trim();
const baseUrl = settingsElements.baseUrl.value.trim();
const model = settingsElements.model.value.trim();
if (!apiKey) {
return "API 密钥不能为空";
}
if (!model) {
return "模型名称不能为空";
}
if (!baseUrl) {
return "模型服务地址不能为空";
}
try {
new URL(baseUrl);
} catch {
return "模型服务地址格式无效,请输入有效的 URL";
}
return "";
}
function saveSettings() {
const error = validateSettings();
if (error) {
settingsElements.validation.textContent = error;
return;
}
if (!socket || socket.readyState !== WebSocket.OPEN) {
settingsElements.validation.textContent = "请先连接服务";
return;
}
settingsElements.validation.textContent = "";
settingsElements.saveBtn.disabled = true;
settingsElements.saveBtn.textContent = "保存中...";
const config = {
apiKey: settingsElements.apiKey.value.trim(),
baseUrl: settingsElements.baseUrl.value.trim(),
model: settingsElements.model.value.trim(),
};
const skillsDir = settingsElements.skillsDir.value.trim();
if (skillsDir) config.skillsDir = skillsDir;
const directSubmitSkill = settingsElements.directSubmitSkill.value.trim();
if (directSubmitSkill) config.directSubmitSkill = directSubmitSkill;
config.runtimeProfile = settingsElements.runtimeProfile.value;
config.browserBackend = settingsElements.browserBackend.value;
socket.send(JSON.stringify({
type: "update_config",
config,
}));
}
function handleConfigResponse(message) {
settingsElements.saveBtn.disabled = false;
settingsElements.saveBtn.textContent = "保存";
if (message.success) {
settingsElements.validation.textContent = message.message;
settingsElements.validation.style.color = "var(--success)";
setTimeout(closeSettingsModal, 2000);
} else {
settingsElements.validation.textContent = message.message;
settingsElements.validation.style.color = "var(--error)";
}
}
// Event listeners for settings
settingsOpenBtn = document.getElementById("settingsBtn");
settingsOpenBtn.addEventListener("click", openSettingsModal);
settingsElements.cancelBtn.addEventListener("click", closeSettingsModal);
settingsElements.saveBtn.addEventListener("click", saveSettings);
settingsElements.modal.addEventListener("click", (e) => {
if (e.target === settingsElements.modal) {
closeSettingsModal();
}
});
</script>
</body>
</html>


@@ -1,19 +1,22 @@
 {
   "version": "1.0",
-  "demo_only_domains": ["baidu.com", "www.baidu.com", "zhihu.com", "www.zhihu.com"],
+  "demo_only_domains": ["baidu.com", "www.baidu.com", "zhihu.com", "www.zhihu.com", "zhuanlan.zhihu.com"],
   "domains": {
     "allowed": [
       "oa.example.com",
       "erp.example.com",
       "hr.example.com",
+      "sgcc.example.invalid",
+      "95598.example.invalid",
       "baidu.com",
       "www.baidu.com",
       "zhihu.com",
-      "www.zhihu.com"
+      "www.zhihu.com",
+      "zhuanlan.zhihu.com"
     ]
   },
   "pipe_actions": {
-    "allowed": ["click", "type", "navigate", "getText"],
-    "blocked": ["eval", "executeJsInPage"]
+    "allowed": ["click", "type", "navigate", "getText", "eval"],
+    "blocked": ["executeJsInPage"]
   }
 }
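To make the effect of the rules change concrete, here is a minimal Python sketch of how a consumer might evaluate such a file. The matching semantics are assumptions, since the code that reads the rules is not shown here: hosts are compared by exact match, and a blocked action wins over an allowed one. The names `is_action_allowed` and `is_domain_allowed` are hypothetical.

```python
import json

# A trimmed copy of the new rules, limited to the keys this sketch reads.
RULES = json.loads("""
{
  "domains": {"allowed": ["zhihu.com", "www.zhihu.com", "zhuanlan.zhihu.com"]},
  "pipe_actions": {
    "allowed": ["click", "type", "navigate", "getText", "eval"],
    "blocked": ["executeJsInPage"]
  }
}
""")


def is_action_allowed(rules: dict, action: str) -> bool:
    """Assumed semantics: the blocked list takes precedence over the allowed list."""
    actions = rules.get("pipe_actions", {})
    if action in actions.get("blocked", []):
        return False
    return action in actions.get("allowed", [])


def is_domain_allowed(rules: dict, host: str) -> bool:
    """Assumed semantics: exact, case-insensitive host match with no subdomain wildcarding."""
    return host.lower() in rules.get("domains", {}).get("allowed", [])


print(is_action_allowed(RULES, "eval"), is_domain_allowed(RULES, "zhuanlan.zhihu.com"))
```

Under exact matching, `zhuanlan.zhihu.com` is not covered by `zhihu.com`, which would be consistent with the subdomain being listed explicitly in this change.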


@@ -0,0 +1,637 @@
<!doctype html>
<html lang="zh-CN">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>知乎热榜图表驾驶舱</title>
<script src="https://cdn.jsdelivr.net/npm/echarts@5/dist/echarts.min.js"></script>
<style>
:root {
--bg: #06111f;
--bg-2: #0a1f37;
--panel: rgba(8, 25, 42, 0.88);
--panel-strong: rgba(10, 32, 55, 0.95);
--line: rgba(101, 187, 255, 0.18);
--line-strong: rgba(236, 186, 81, 0.26);
--text: #eef6ff;
--muted: #8ea6c2;
--accent: #62d0ff;
--accent-2: #ecba51;
--accent-3: #6df0c2;
--danger: #ff8b7e;
--shadow: 0 20px 48px rgba(0, 0, 0, 0.34);
--font-heading: "DIN Alternate", "Bahnschrift", "Microsoft YaHei UI", sans-serif;
--font-body: "Segoe UI Variable Text", "Microsoft YaHei", "PingFang SC", sans-serif;
}
* {
box-sizing: border-box;
}
html,
body {
margin: 0;
min-height: 100%;
background:
radial-gradient(circle at 16% 10%, rgba(98, 208, 255, 0.18), transparent 22%),
radial-gradient(circle at 86% 12%, rgba(236, 186, 81, 0.14), transparent 18%),
linear-gradient(145deg, var(--bg) 0%, var(--bg-2) 42%, #030910 100%);
color: var(--text);
font-family: var(--font-body);
}
body::before {
content: "";
position: fixed;
inset: 0;
pointer-events: none;
background-image:
linear-gradient(rgba(101, 187, 255, 0.05) 1px, transparent 1px),
linear-gradient(90deg, rgba(101, 187, 255, 0.05) 1px, transparent 1px);
background-size: 44px 44px;
mask-image: radial-gradient(circle at center, black 34%, rgba(0, 0, 0, 0.22) 88%, transparent 100%);
}
.page {
min-height: 100vh;
padding: 18px;
display: grid;
grid-template-rows: auto auto 1fr auto;
gap: 14px;
}
.panel {
position: relative;
overflow: hidden;
background:
linear-gradient(180deg, rgba(255, 255, 255, 0.035), rgba(255, 255, 255, 0.01)),
linear-gradient(145deg, rgba(9, 30, 51, 0.97), rgba(6, 20, 34, 0.92));
border: 1px solid var(--line);
border-radius: 22px;
box-shadow: var(--shadow);
}
.panel::before {
content: "";
position: absolute;
left: 18px;
right: 18px;
top: 0;
height: 2px;
background: linear-gradient(90deg, transparent, var(--accent), var(--accent-2), transparent);
opacity: 0.95;
}
.hero {
padding: 18px 24px;
display: grid;
grid-template-columns: minmax(0, 1fr) 360px;
gap: 16px;
align-items: center;
}
.eyebrow {
color: var(--accent);
letter-spacing: 2px;
text-transform: uppercase;
font-size: 12px;
margin-bottom: 8px;
}
h1 {
margin: 0;
font-family: var(--font-heading);
font-size: 38px;
line-height: 1.08;
letter-spacing: 1px;
}
#snapshot-meta {
margin: 10px 0 0;
color: var(--muted);
font-size: 14px;
}
.hero-notes {
display: grid;
gap: 10px;
}
.note-card {
padding: 14px 16px;
border-radius: 16px;
background: linear-gradient(135deg, rgba(98, 208, 255, 0.08), rgba(236, 186, 81, 0.08));
border: 1px solid rgba(255, 255, 255, 0.05);
}
.note-card strong {
display: block;
margin-bottom: 6px;
font-size: 14px;
}
.note-card span {
color: var(--muted);
font-size: 12px;
line-height: 1.5;
}
.metrics {
display: grid;
grid-template-columns: repeat(4, 1fr);
gap: 12px;
}
.metric {
padding: 18px 18px 16px;
}
.metric-label {
color: var(--muted);
font-size: 12px;
letter-spacing: 1px;
text-transform: uppercase;
}
.metric-value {
margin-top: 10px;
font-family: var(--font-heading);
font-size: 34px;
color: var(--text);
}
.metric-sub {
margin-top: 8px;
color: var(--accent);
font-size: 12px;
}
.charts {
min-height: 0;
display: grid;
grid-template-columns: 1.2fr 1fr 0.95fr;
grid-template-rows: 360px 320px;
gap: 14px;
grid-template-areas:
"bar top pie"
"bubble table table";
}
.chart-panel {
padding: 14px 16px 12px;
}
.bar-panel { grid-area: bar; }
.top-panel { grid-area: top; }
.pie-panel { grid-area: pie; }
.bubble-panel { grid-area: bubble; }
.table-panel { grid-area: table; padding: 14px 16px 10px; }
.section-head {
display: flex;
align-items: end;
justify-content: space-between;
gap: 12px;
margin-bottom: 10px;
}
.section-head h2 {
margin: 0;
font-size: 22px;
font-family: var(--font-heading);
letter-spacing: 1px;
}
.section-head span {
color: var(--muted);
font-size: 12px;
}
.chart {
width: 100%;
height: calc(100% - 42px);
}
.table-wrap {
height: calc(100% - 42px);
overflow: auto;
padding-right: 4px;
}
table {
width: 100%;
border-collapse: collapse;
}
thead th {
position: sticky;
top: 0;
z-index: 1;
background: rgba(6, 19, 32, 0.96);
padding: 10px 8px;
text-align: left;
font-size: 12px;
color: var(--muted);
letter-spacing: 1px;
text-transform: uppercase;
border-bottom: 1px solid var(--line-strong);
}
tbody td {
padding: 11px 8px;
border-bottom: 1px solid rgba(255, 255, 255, 0.05);
font-size: 13px;
vertical-align: top;
}
tbody tr:nth-child(odd) {
background: rgba(255, 255, 255, 0.016);
}
.rank {
font-family: var(--font-heading);
color: var(--accent-2);
white-space: nowrap;
}
.heat {
color: var(--accent-3);
font-family: var(--font-heading);
white-space: nowrap;
}
.tag {
display: inline-flex;
align-items: center;
padding: 4px 10px;
border-radius: 999px;
background: rgba(98, 208, 255, 0.12);
color: var(--accent);
font-size: 12px;
}
.footer {
padding: 10px 16px;
color: var(--muted);
font-size: 12px;
}
@media (max-width: 1440px) {
.hero {
grid-template-columns: 1fr;
}
.metrics {
grid-template-columns: repeat(2, 1fr);
}
.charts {
grid-template-columns: 1fr;
grid-template-rows: 320px 320px 320px 320px 420px;
grid-template-areas:
"bar"
"top"
"pie"
"bubble"
"table";
}
}
@media (max-width: 760px) {
.page {
padding: 12px;
}
h1 {
font-size: 28px;
}
.metrics {
grid-template-columns: 1fr;
}
}
</style>
</head>
<body>
<div class="page">
<section class="panel hero">
<div>
<div class="eyebrow">Zhihu Hotlist Visual Command Center</div>
<h1>知乎热榜图表驾驶舱</h1>
<p id="snapshot-meta">由 sgClaw screen_html_export 生成的本地静态展示页</p>
</div>
<div class="hero-notes">
<div class="note-card">
<strong>图表表达</strong>
<span>同一份热榜数据同时映射为分类热度、头部热点、结构占比和热度散点,适合现场讲解图表能力。</span>
</div>
<div class="note-card">
<strong>演示建议</strong>
<span id="lead-summary">优先讲解榜首热点、分类分布与热度层级,再向下展开全量榜单细节。</span>
</div>
</div>
</section>
<section class="metrics">
<article class="panel metric">
<div class="metric-label">热榜条目数</div>
<div id="metric-total" class="metric-value">0</div>
<div class="metric-sub">Tracked items</div>
</article>
<article class="panel metric">
<div class="metric-label">主题分类数</div>
<div id="metric-categories" class="metric-value">0</div>
<div class="metric-sub">Topic groups</div>
</article>
<article class="panel metric">
<div class="metric-label">累计热度</div>
<div id="metric-heat" class="metric-value">0</div>
<div class="metric-sub">Total heat</div>
</article>
<article class="panel metric">
<div class="metric-label">头部峰值</div>
<div id="metric-peak" class="metric-value">0</div>
<div class="metric-sub">Peak topic heat</div>
</article>
</section>
<section class="charts">
<section class="panel chart-panel bar-panel">
<div class="section-head">
<h2>分类总热度</h2>
<span>横向对比</span>
</div>
<div id="bar-chart" class="chart"></div>
</section>
<section class="panel chart-panel top-panel">
<div class="section-head">
<h2>Top10 热点</h2>
<span>柱状排行</span>
</div>
<div id="top-chart" class="chart"></div>
</section>
<section class="panel chart-panel pie-panel">
<div class="section-head">
<h2>分类占比</h2>
<span>环形结构</span>
</div>
<div id="pie-chart" class="chart"></div>
</section>
<section class="panel chart-panel bubble-panel">
<div class="section-head">
<h2>热度分层</h2>
<span>散点气泡</span>
</div>
<div id="bubble-chart" class="chart"></div>
</section>
<section class="panel table-panel">
<div class="section-head">
<h2>热榜明细</h2>
<span id="table-note">按原始顺序保留</span>
</div>
<div class="table-wrap">
<table>
<thead>
<tr>
<th>排名</th>
<th>标题</th>
<th>分类</th>
<th>热度</th>
</tr>
</thead>
<tbody id="table-body"></tbody>
</table>
</div>
</section>
</section>
<section class="panel footer">
本页由 `screen_html_export` 生成,适合在系统浏览器中直接打开进行展示。
</section>
</div>
<script>
const defaultPayload = {
"snapshot_id": "template-snapshot",
"generated_at_ms": 0,
"categories": [],
"table": []
}
const themeMeta = {
title: "知乎热榜图表驾驶舱",
renderer: "screen_html_export"
};
const chartColors = ["#62d0ff", "#ecba51", "#6df0c2", "#7f8cff", "#ff8b7e", "#9fcbff", "#58a6ff"];
const charts = {};
function formatNumber(value) {
return new Intl.NumberFormat("zh-CN").format(Number(value || 0));
}
function getTotalHeat(categories) {
return (categories || []).reduce((sum, item) => sum + Number(item.total_heat || 0), 0);
}
function getPeakHeat(table) {
return (table || []).reduce((max, row) => Math.max(max, Number(row.heat_value || 0)), 0);
}
function buildLeadSummary(table, categories) {
const top = (table || [])[0];
const category = (categories || []).slice().sort((a, b) => (b.total_heat || 0) - (a.total_heat || 0))[0];
const parts = [];
if (top) {
parts.push(`榜首是“${top.title}”`);
}
if (category) {
parts.push(`主导分类为“${category.category_label}”`);
}
parts.push(`共覆盖 ${(table || []).length} 条热点`);
// Join the clauses with a comma so the summary reads as one sentence.
return parts.join(",") + "。";
}
function ensureCharts() {
if (!window.echarts) {
return;
}
charts.bar = charts.bar || echarts.init(document.getElementById("bar-chart"));
charts.top = charts.top || echarts.init(document.getElementById("top-chart"));
charts.pie = charts.pie || echarts.init(document.getElementById("pie-chart"));
charts.bubble = charts.bubble || echarts.init(document.getElementById("bubble-chart"));
}
function renderBarChart(categories) {
const sorted = (categories || []).slice().sort((a, b) => Number(a.total_heat || 0) - Number(b.total_heat || 0));
charts.bar.setOption({
animationDuration: 700,
grid: {left: 90, right: 18, top: 10, bottom: 20},
xAxis: {
type: "value",
axisLabel: {color: "#8ea6c2"},
splitLine: {lineStyle: {color: "rgba(255,255,255,0.06)"}}
},
yAxis: {
type: "category",
data: sorted.map((item) => item.category_label),
axisLabel: {color: "#eef6ff"},
axisLine: {lineStyle: {color: "rgba(255,255,255,0.1)"}}
},
tooltip: {trigger: "axis", axisPointer: {type: "shadow"}},
series: [{
type: "bar",
data: sorted.map((item, index) => ({
value: Number(item.total_heat || 0),
itemStyle: {color: chartColors[index % chartColors.length], borderRadius: [0, 8, 8, 0]}
})),
label: {show: true, position: "right", color: "#dfeeff"}
}]
});
}
function renderTopChart(table) {
const top = (table || []).slice(0, 10);
charts.top.setOption({
animationDuration: 700,
grid: {left: 42, right: 12, top: 26, bottom: 46},
tooltip: {trigger: "axis", axisPointer: {type: "shadow"}},
xAxis: {
type: "category",
data: top.map((row) => `#${row.rank}`),
axisLabel: {color: "#8ea6c2"},
axisLine: {lineStyle: {color: "rgba(255,255,255,0.1)"}}
},
yAxis: {
type: "value",
axisLabel: {color: "#8ea6c2"},
splitLine: {lineStyle: {color: "rgba(255,255,255,0.06)"}}
},
series: [{
type: "bar",
data: top.map((row, index) => ({
value: Number(row.heat_value || 0),
itemStyle: {color: chartColors[index % chartColors.length], borderRadius: [8, 8, 0, 0]}
})),
label: {show: true, position: "top", color: "#eef6ff", formatter: ({dataIndex}) => top[dataIndex].heat_text}
}]
});
}
function renderPieChart(categories) {
charts.pie.setOption({
animationDuration: 700,
color: chartColors,
tooltip: {trigger: "item"},
legend: {
bottom: 2,
textStyle: {color: "#8ea6c2", fontSize: 11},
itemWidth: 12,
itemHeight: 8
},
series: [{
type: "pie",
radius: ["44%", "72%"],
center: ["50%", "44%"],
itemStyle: {borderColor: "#081a2c", borderWidth: 2},
label: {
color: "#eef6ff",
formatter: "{b}\n{d}%"
},
data: (categories || []).map((item) => ({
name: item.category_label,
value: Number(item.total_heat || 0)
}))
}]
});
}
function renderBubbleChart(table) {
const top = (table || []).slice(0, 12);
charts.bubble.setOption({
animationDuration: 700,
color: chartColors,
grid: {left: 44, right: 18, top: 16, bottom: 36},
xAxis: {
type: "value",
name: "排名",
inverse: true,
min: 0,
max: Math.max(...top.map((row) => Number(row.rank || 0)), 10) + 1,
nameTextStyle: {color: "#8ea6c2"},
axisLabel: {color: "#8ea6c2"},
splitLine: {lineStyle: {color: "rgba(255,255,255,0.06)"}}
},
yAxis: {
type: "value",
name: "热度值",
nameTextStyle: {color: "#8ea6c2"},
axisLabel: {color: "#8ea6c2"},
splitLine: {lineStyle: {color: "rgba(255,255,255,0.06)"}}
},
tooltip: {
formatter: (params) => {
const row = params.data.raw;
return `${row.title}<br/>排名 #${row.rank}<br/>热度 ${row.heat_text}<br/>分类 ${row.category_label}`;
}
},
series: [{
type: "scatter",
symbolSize: (value) => Math.max(18, Math.min(56, value[2] / 80000)),
data: top.map((row, index) => ({
value: [Number(row.rank || 0), Number(row.heat_value || 0), Number(row.heat_value || 0)],
raw: row,
itemStyle: {color: chartColors[index % chartColors.length], opacity: 0.82}
}))
}]
});
}
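function escapeHtml(value) {
// Escape scraped text before it is interpolated into innerHTML.
return String(value ?? "").replace(/[&<>"']/g, (ch) => ({
"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;"
}[ch]));
}
function renderTable(table) {
document.getElementById("table-body").innerHTML = (table || []).map((row) => `
<tr>
<td class="rank">#${escapeHtml(row.rank)}</td>
<td>${escapeHtml(row.title)}</td>
<td><span class="tag">${escapeHtml(row.category_label)}</span></td>
<td class="heat">${escapeHtml(row.heat_text)}</td>
</tr>
`).join("");
}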
function render(payload) {
const data = payload || defaultPayload;
const categories = data.categories || [];
const table = data.table || [];
document.title = themeMeta.title;
document.getElementById("snapshot-meta").textContent =
`${data.snapshot_id} | 生成时间 ${new Date(data.generated_at_ms || 0).toLocaleString()}`;
document.getElementById("metric-total").textContent = formatNumber(table.length);
document.getElementById("metric-categories").textContent = formatNumber(categories.length);
document.getElementById("metric-heat").textContent = formatNumber(getTotalHeat(categories));
document.getElementById("metric-peak").textContent = formatNumber(getPeakHeat(table));
document.getElementById("lead-summary").textContent = buildLeadSummary(table, categories);
document.getElementById("table-note").textContent =
table.length > 0 ? `当前展示 ${table.length} 条热点` : "暂无热榜数据";
renderTable(table);
ensureCharts();
if (window.echarts) {
renderBarChart(categories);
renderTopChart(table);
renderPieChart(categories);
renderBubbleChart(table);
}
}
window.addEventListener("resize", () => {
Object.values(charts).forEach((chart) => chart && chart.resize());
});
render(defaultPayload);
</script>
</body>
</html>
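The template's `render()` reads a payload with `snapshot_id`, `generated_at_ms`, `categories` (each with `category_label` and `total_heat`) and `table` (each row with `rank`, `title`, `category_label`, `heat_text`, `heat_value`). The Python sketch below shows one way such a payload could be assembled from table rows. It is an illustration under assumptions: the real `screen_html_export` code and how it injects the payload into the page are not shown here, and `build_payload`, `total_heat`, `peak_heat`, and the sample rows are hypothetical.

```python
from collections import defaultdict


def build_payload(rows: list[dict], snapshot_id: str, generated_at_ms: int) -> dict:
    """Aggregate hotlist rows into the payload shape that render() reads."""
    totals: dict[str, int] = defaultdict(int)
    for row in rows:
        totals[row["category_label"]] += int(row.get("heat_value") or 0)
    categories = [
        {"category_label": label, "total_heat": heat} for label, heat in totals.items()
    ]
    return {
        "snapshot_id": snapshot_id,
        "generated_at_ms": generated_at_ms,
        "categories": categories,
        "table": rows,
    }


def total_heat(payload: dict) -> int:
    """Same aggregate as getTotalHeat(): sum of total_heat over categories."""
    return sum(int(item.get("total_heat") or 0) for item in payload["categories"])


def peak_heat(payload: dict) -> int:
    """Same aggregate as getPeakHeat(): largest heat_value in the table, 0 when empty."""
    return max((int(row.get("heat_value") or 0) for row in payload["table"]), default=0)


# Hypothetical sample rows; the values are made up for illustration.
SAMPLE_ROWS = [
    {"rank": 1, "title": "A", "category_label": "tech", "heat_text": "300", "heat_value": 300},
    {"rank": 2, "title": "B", "category_label": "tech", "heat_text": "200", "heat_value": 200},
    {"rank": 3, "title": "C", "category_label": "sports", "heat_text": "100", "heat_value": 100},
]
payload = build_payload(SAMPLE_ROWS, "demo-snapshot", 0)
print(total_heat(payload), peak_heat(payload))
```

Keeping `heat_value` numeric and `heat_text` as the display string matches how the template uses them: the charts plot `heat_value`, while labels and the table show `heat_text`.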


@@ -0,0 +1,532 @@
import argparse
import re
import sys
import tomllib
from pathlib import Path
from typing import NamedTuple

MAX_TEXT_FILE_BYTES = 512 * 1024
SCRIPT_SUFFIXES = (
    ".sh",
    ".bash",
    ".zsh",
    ".ksh",
    ".fish",
    ".ps1",
    ".bat",
    ".cmd",
)
HIGH_RISK_PATTERNS = (
    (re.compile(r"(?im)\bcurl\b[^\n|]{0,200}\|\s*(?:sh|bash|zsh)\b"), "curl-pipe-shell"),
    (re.compile(r"(?im)\bwget\b[^\n|]{0,200}\|\s*(?:sh|bash|zsh)\b"), "wget-pipe-shell"),
    (re.compile(r"(?im)\b(?:invoke-expression|iex)\b"), "powershell-iex"),
    (re.compile(r"(?im)\brm\s+-rf\s+/"), "destructive-rm-rf-root"),
    (re.compile(r"(?im)\bnc(?:at)?\b[^\n]{0,120}\s-e\b"), "netcat-remote-exec"),
    (re.compile(r"(?im)\bdd\s+if="), "disk-overwrite-dd"),
    (re.compile(r"(?im)\bmkfs(?:\.[a-z0-9]+)?\b"), "filesystem-format"),
    (re.compile(r"(?im):\(\)\s*\{\s*:\|\:&\s*\};:"), "fork-bomb"),
)
MARKDOWN_LINK_RE = re.compile(r"\[[^\]]*\]\(([^)]+)\)")
REPO_ROOT = Path(__file__).resolve().parents[1]
SKILL_LIB_ROOT = REPO_ROOT.parent / "skill_lib"
SKILLS_DIR = SKILL_LIB_ROOT / "skills"


class SkillRecord(NamedTuple):
    name: str
    description: str
    version: str
    author: str | None
    tags: list[str]
    prompt_body: str
    location: Path


class AuditReport(NamedTuple):
    files_scanned: int
    findings: list[str]


class ValidationResult(NamedTuple):
    record: SkillRecord
    report: AuditReport
    ok: bool
def discover_skill_dirs(skills_dir: Path | None = None) -> list[Path]:
    root = skills_dir or SKILLS_DIR
    if not root.exists():
        return []
    return sorted(path for path in root.iterdir() if path.is_dir())


def load_skill(skill_dir: Path) -> SkillRecord:
    manifest_path = skill_dir / "SKILL.toml"
    markdown_path = skill_dir / "SKILL.md"
    if manifest_path.is_file():
        manifest = tomllib.loads(manifest_path.read_text(encoding="utf-8"))
        skill_meta = manifest.get("skill", {})
        prompts = manifest.get("prompts", [])
        body = ""
        if markdown_path.is_file():
            _, body = parse_skill_markdown(markdown_path.read_text(encoding="utf-8"))
        elif prompts:
            body = "\n\n".join(str(prompt) for prompt in prompts)
        description = skill_meta.get("description")
        if not description or not str(description).strip():
            description = extract_description(body)
        return SkillRecord(
            name=skill_meta.get("name") or skill_dir.name,
            description=str(description),
            version=str(skill_meta.get("version") or "0.1.0"),
            author=skill_meta.get("author") or None,
            tags=list(skill_meta.get("tags", [])),
            prompt_body=body,
            location=manifest_path,
        )
    skill_path = markdown_path
    content = skill_path.read_text(encoding="utf-8")
    meta, body = parse_skill_markdown(content)
    name = meta["name"] or skill_dir.name
    description = meta["description"]
    if not description or not description.strip():
        description = extract_description(body)
    version = meta["version"] or "0.1.0"
    author = meta["author"] or None
    tags = list(meta["tags"])
    return SkillRecord(
        name=name,
        description=description,
        version=version,
        author=author,
        tags=tags,
        prompt_body=body,
        location=skill_path,
    )


def validate_all_skills(allow_scripts: bool = False) -> list[ValidationResult]:
    results = []
    for skill_dir in discover_skill_dirs():
        record = load_skill(skill_dir)
        report = audit_skill_directory(skill_dir, allow_scripts=allow_scripts)
        results.append(ValidationResult(record=record, report=report, ok=not report.findings))
    return results
def parse_skill_markdown(content: str) -> tuple[dict[str, object], str]:
    frontmatter = split_skill_frontmatter(content)
    if frontmatter is None:
        return empty_meta(), content
    raw_frontmatter, body = frontmatter
    return parse_simple_frontmatter(raw_frontmatter), body


def split_skill_frontmatter(content: str) -> tuple[str, str] | None:
    normalized = content.replace("\r\n", "\n")
    if not normalized.startswith("---\n"):
        return None
    rest = normalized[len("---\n") :]
    marker = "\n---\n"
    idx = rest.find(marker)
    if idx != -1:
        return rest[:idx], rest[idx + len(marker) :]
    if rest.endswith("\n---"):
        return rest[:-4], ""
    return None


def parse_simple_frontmatter(frontmatter: str) -> dict[str, object]:
    meta = empty_meta()
    collecting_tags = False
    for raw_line in frontmatter.splitlines():
        if collecting_tags:
            trimmed = raw_line.strip()
            if trimmed.startswith("- "):
                tag = trimmed[2:].strip().strip('"').strip("'")
                if tag:
                    meta["tags"].append(tag)
                continue
            collecting_tags = False
        if ":" not in raw_line:
            continue
        key, value = raw_line.split(":", 1)
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        if key == "name":
            meta["name"] = value
        elif key == "description":
            meta["description"] = value
        elif key == "version":
            meta["version"] = value
        elif key == "author":
            meta["author"] = value
        elif key == "tags":
            if not value:
                collecting_tags = True
            else:
                cleaned = value.lstrip("[").rstrip("]")
                meta["tags"] = [
                    item.strip().strip('"').strip("'")
                    for item in cleaned.split(",")
                    if item.strip().strip('"').strip("'")
                ]
    return meta


def empty_meta() -> dict[str, object]:
    return {
        "name": None,
        "description": None,
        "version": None,
        "author": None,
        "tags": [],
    }


def extract_description(body: str) -> str:
    for line in body.splitlines():
        if line.startswith("#"):
            continue
        if not line.strip():
            continue
        return line.strip()
    return "No description"
def audit_skill_directory(skill_dir: Path, allow_scripts: bool = False) -> AuditReport:
    if not skill_dir.exists():
        raise FileNotFoundError(f"Skill source does not exist: {skill_dir}")
    if not skill_dir.is_dir():
        raise NotADirectoryError(f"Skill source must be a directory: {skill_dir}")
    canonical_root = skill_dir.resolve()
    findings: list[str] = []
    files_scanned = 0
    has_manifest = (canonical_root / "SKILL.md").is_file() or (canonical_root / "SKILL.toml").is_file()
    if not has_manifest:
        findings.append(
            "Skill root must include SKILL.md or SKILL.toml for deterministic auditing."
        )
    for path in collect_paths_depth_first(canonical_root):
        files_scanned += 1
        findings.extend(audit_path(canonical_root, path, allow_scripts=allow_scripts))
    return AuditReport(files_scanned=files_scanned, findings=findings)


def collect_paths_depth_first(root: Path) -> list[Path]:
    stack = [root]
    discovered: list[Path] = []
    while stack:
        current = stack.pop()
        discovered.append(current)
        if not current.is_dir():
            continue
        children = sorted(current.iterdir())
        for child in reversed(children):
            stack.append(child)
    return discovered


def audit_path(root: Path, path: Path, allow_scripts: bool) -> list[str]:
    findings: list[str] = []
    metadata = path.lstat()
    rel = relative_display(root, path)
    if path.is_symlink():
        findings.append(f"{rel}: symlinks are not allowed in installed skills.")
        return findings
    if path.is_dir():
        return findings
    if not allow_scripts and is_unsupported_script_file(path):
        findings.append(f"{rel}: script-like files are blocked by skill security policy.")
    if metadata.st_size > MAX_TEXT_FILE_BYTES and (is_markdown_file(path) or is_toml_file(path)):
        findings.append(f"{rel}: file is too large for static audit (>{MAX_TEXT_FILE_BYTES} bytes).")
        return findings
    if is_markdown_file(path):
        findings.extend(audit_markdown_file(root, path))
    elif is_toml_file(path):
        findings.extend(audit_manifest_file(root, path))
    return findings
def audit_markdown_file(root: Path, path: Path) -> list[str]:
    findings: list[str] = []
    content = path.read_text(encoding="utf-8")
    rel = relative_display(root, path)
    pattern = detect_high_risk_snippet(content)
    if pattern:
        findings.append(f"{rel}: detected high-risk command pattern ({pattern}).")
    for target in extract_markdown_links(content):
        findings.extend(audit_markdown_link_target(root, path, target))
    return findings


def audit_manifest_file(root: Path, path: Path) -> list[str]:
    findings: list[str] = []
    content = path.read_text(encoding="utf-8")
    rel = relative_display(root, path)
    pattern = detect_high_risk_snippet(content)
    if pattern:
        findings.append(f"{rel}: detected high-risk command pattern ({pattern}).")
    if any(operator in content for operator in ("&&", "||", ";", "`", "$(")):
        findings.append(f"{rel}: manifest content uses shell chaining operators, which are blocked.")
    return findings


def extract_markdown_links(content: str) -> list[str]:
    return [match.group(1).strip() for match in MARKDOWN_LINK_RE.finditer(content)]


def audit_markdown_link_target(root: Path, source: Path, raw_target: str) -> list[str]:
    findings: list[str] = []
    normalized = normalize_markdown_target(raw_target)
    if not normalized or normalized.startswith("#"):
        return findings
    rel = relative_display(root, source)
    scheme = url_scheme(normalized)
    if scheme:
        if scheme in {"http", "https", "mailto"}:
            if has_markdown_suffix(normalized):
                findings.append(
                    f"{rel}: remote markdown links are blocked by skill security audit ({normalized})."
                )
            return findings
        findings.append(f"{rel}: unsupported URL scheme in markdown link ({normalized}).")
        return findings
    stripped = strip_query_and_fragment(normalized)
    if not stripped:
        return findings
    if looks_like_absolute_path(stripped):
        findings.append(f"{rel}: absolute markdown link paths are not allowed ({normalized}).")
        return findings
    if has_script_suffix(stripped):
        findings.append(f"{rel}: markdown links to script files are blocked ({normalized}).")
    if not has_markdown_suffix(stripped):
        return findings
    base_dir = source.parent
    linked_path = base_dir / stripped
    try:
        canonical_target = linked_path.resolve(strict=True)
    except FileNotFoundError:
        if is_cross_skill_reference(stripped):
            return findings
        findings.append(f"{rel}: markdown link points to a missing file ({normalized}).")
        return findings
    if not is_subpath(canonical_target, root):
        skills_root = skills_root_for(root)
        if skills_root and is_subpath(canonical_target, skills_root):
            if not canonical_target.is_file():
                findings.append(f"{rel}: markdown link must point to a file ({normalized}).")
            return findings
        findings.append(f"{rel}: markdown link escapes skill root ({normalized}).")
        return findings
    if not canonical_target.is_file():
        findings.append(f"{rel}: markdown link must point to a file ({normalized}).")
    return findings


def detect_high_risk_snippet(content: str) -> str | None:
    for pattern, label in HIGH_RISK_PATTERNS:
        if pattern.search(content):
            return label
    return None
def normalize_markdown_target(raw_target: str) -> str:
trimmed = raw_target.strip()
if trimmed.startswith("<"):
trimmed = trimmed[1:]
if trimmed.endswith(">"):
trimmed = trimmed[:-1]
parts = trimmed.split()
return parts[0] if parts else ""
def strip_query_and_fragment(target: str) -> str:
end = len(target)
hash_idx = target.find("#")
if hash_idx != -1:
end = min(end, hash_idx)
query_idx = target.find("?")
if query_idx != -1:
end = min(end, query_idx)
return target[:end]
def url_scheme(target: str) -> str | None:
if ":" not in target:
return None
scheme, rest = target.split(":", 1)
if not scheme or not rest:
return None
if not all(ch.isalnum() or ch in "+-." for ch in scheme):
return None
return scheme
def looks_like_absolute_path(target: str) -> bool:
if Path(target).is_absolute():
return True
if len(target) >= 3 and target[0].isalpha() and target[1] == ":" and target[2] in "\\/":
return True
return target.startswith("~/")
def is_cross_skill_reference(target: str) -> bool:
normalized = target[2:] if target.startswith("./") else target
path = Path(target)
if ".." in path.parts:
return True
return "/" not in normalized and "\\" not in normalized and has_markdown_suffix(normalized)
def skills_root_for(root: Path) -> Path | None:
current = root
while True:
if current.name == "skills":
return current
if current.parent == current:
return None
current = current.parent
def relative_display(root: Path, path: Path) -> str:
try:
rel = path.relative_to(root)
except ValueError:
return str(path)
return "." if str(rel) == "." else str(rel)
def is_markdown_file(path: Path) -> bool:
return path.suffix.lower() in {".md", ".markdown"}
def is_toml_file(path: Path) -> bool:
return path.suffix.lower() == ".toml"
def is_unsupported_script_file(path: Path) -> bool:
return has_script_suffix(str(path).lower()) or has_shell_shebang(path)
def has_script_suffix(raw: str) -> bool:
lowered = raw.lower()
return any(lowered.endswith(suffix) for suffix in SCRIPT_SUFFIXES)
def has_shell_shebang(path: Path) -> bool:
    try:
        prefix = path.read_bytes()[:128]
    except OSError:
        return False
    # Decoding with errors="ignore" can yield an empty string even for a
    # non-empty prefix (e.g. only invalid bytes), so guard the [0] lookup.
    lines = prefix.decode("utf-8", errors="ignore").splitlines()
    first_line = lines[0].strip().lower() if lines else ""
    interpreter = shebang_interpreter(first_line)
    return interpreter in {"sh", "bash", "zsh", "ksh", "fish", "pwsh", "powershell"}


def shebang_interpreter(line: str) -> str | None:
    if not line.startswith("#!"):
        return None
    shebang = line[2:].strip()
    if not shebang:
        return None
    parts = shebang.split()
    first = Path(parts[0]).name
    if first == "env":
        for part in parts[1:]:
            if part.startswith("-"):
                continue
            return Path(part).name
        return None
    return first


def has_markdown_suffix(target: str) -> bool:
    lowered = target.lower()
    return lowered.endswith(".md") or lowered.endswith(".markdown")


def is_subpath(path: Path, root: Path) -> bool:
    try:
        path.relative_to(root)
        return True
    except ValueError:
        return False


def main(argv: list[str] | None = None) -> int:
    parser = argparse.ArgumentParser(description="Validate the sibling skill_lib against ZeroClaw-like rules.")
    parser.add_argument(
        "--allow-scripts",
        action="store_true",
        help="Allow shell-script files during auditing.",
    )
    args = parser.parse_args(argv)
    results = validate_all_skills(allow_scripts=args.allow_scripts)
    if not results:
        print(f"FAIL no skills discovered under {SKILLS_DIR}")
        return 1
    all_ok = True
    for result in results:
        status = "PASS" if result.ok else "FAIL"
        print(f"{status} {result.record.name}")
        for finding in result.report.findings:
            print(f" - {finding}")
        all_ok = all_ok and result.ok
    print(f"Checked {len(results)} skills in {SKILL_LIB_ROOT}")
    return 0 if all_ok else 1


if __name__ == "__main__":
    sys.exit(main())
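
Review note: the first stage of audit_markdown_link_target (normalize the raw target, skip anchors, check the URL scheme, then strip query and fragment) can be tried in isolation with the sketch below. The three helpers mirror the ones in this file, with strip_query_and_fragment condensed into a loop; classify_target and its return labels are hypothetical, introduced only for illustration, and are not part of the audited script.

```python
from typing import Optional


def normalize_markdown_target(raw_target: str) -> str:
    # Drop surrounding whitespace, optional <...> wrapping, and any link title.
    trimmed = raw_target.strip()
    if trimmed.startswith("<"):
        trimmed = trimmed[1:]
    if trimmed.endswith(">"):
        trimmed = trimmed[:-1]
    parts = trimmed.split()
    return parts[0] if parts else ""


def strip_query_and_fragment(target: str) -> str:
    # Cut at the first "#" or "?", whichever comes first.
    end = len(target)
    for marker in ("#", "?"):
        idx = target.find(marker)
        if idx != -1:
            end = min(end, idx)
    return target[:end]


def url_scheme(target: str) -> Optional[str]:
    # Return the scheme only if both sides of the first ":" are non-empty
    # and the scheme uses URL-scheme characters.
    if ":" not in target:
        return None
    scheme, rest = target.split(":", 1)
    if not scheme or not rest:
        return None
    if not all(ch.isalnum() or ch in "+-." for ch in scheme):
        return None
    return scheme


def classify_target(raw_target: str) -> str:
    """Hypothetical helper: label a link target the way the audit branches on it."""
    normalized = normalize_markdown_target(raw_target)
    if not normalized or normalized.startswith("#"):
        return "skip"  # empty target or in-page anchor
    scheme = url_scheme(normalized)
    if scheme in {"http", "https", "mailto"}:
        return "remote"
    if scheme:
        return "unsupported-scheme"
    return "local:" + strip_query_and_fragment(normalized)


if __name__ == "__main__":
    for sample in ("<docs/guide.md>", "docs/guide.md?raw=1#intro", "#section",
                   "https://example.com/a.md", "ftp://example.com/a.md"):
        print(sample, "->", classify_target(sample))
```

Only targets labelled local continue to the filesystem checks (absolute path, script suffix, resolve and containment) in the real function.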


@@ -1,165 +1,63 @@
pub mod planner;
pub mod runtime;
pub mod task_runner;
use std::ffi::OsString;
use std::path::PathBuf;
use std::sync::Arc;
use crate::compat::runtime::CompatTaskContext;
use crate::config::DeepSeekSettings;
use crate::pipe::{
AgentMessage, BrowserMessage, BrowserPipeTool, ConversationMessage, PipeError, Transport,
use crate::browser::ws_backend::WsBrowserBackend;
use crate::browser::{BrowserBackend, PipeBrowserBackend};
use crate::pipe::{BrowserMessage, BrowserPipeTool, PipeError, Transport};
pub use task_runner::{
run_submit_task, run_submit_task_with_browser_backend, AgentEventSink, AgentRuntimeContext,
SubmitTaskRequest,
};
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AgentRuntimeContext {
config_path: Option<PathBuf>,
workspace_root: PathBuf,
fn normalize_optional_submit_field(value: String) -> Option<String> {
let trimmed = value.trim();
(!trimmed.is_empty()).then(|| trimmed.to_string())
}
impl AgentRuntimeContext {
pub fn new(config_path: Option<PathBuf>, workspace_root: PathBuf) -> Self {
Self {
config_path,
workspace_root,
}
fn browser_backend_for_submit<T: Transport + 'static>(
browser_tool: &BrowserPipeTool<T>,
context: &AgentRuntimeContext,
request: &SubmitTaskRequest,
) -> Result<Arc<dyn BrowserBackend>, PipeError> {
if let Some(browser_ws_url) = configured_browser_ws_url(context) {
return Ok(Arc::new(
WsBrowserBackend::new(
Arc::new(crate::service::browser_ws_client::ServiceWsClient::connect(
&browser_ws_url,
)?),
browser_tool.mac_policy().clone(),
crate::service::browser_ws_client::initial_request_url_for_submit_task(request),
)
.with_response_timeout(browser_tool.response_timeout()),
));
}
pub fn from_process_args<I, S>(args: I) -> Result<Self, PipeError>
where
I: IntoIterator<Item = S>,
S: Into<OsString>,
{
let mut config_path = None;
let mut args = args.into_iter().map(Into::into);
let _ = args.next();
while let Some(arg) = args.next() {
if arg == OsString::from("--config-path") {
let Some(value) = args.next() else {
return Err(PipeError::Protocol(
"missing value for --config-path".to_string(),
));
};
config_path = Some(PathBuf::from(value));
continue;
}
let arg_string = arg.to_string_lossy();
if let Some(value) = arg_string.strip_prefix("--config-path=") {
config_path = Some(PathBuf::from(value));
}
}
let workspace_root = config_path
.as_ref()
.and_then(|path| path.parent().map(|parent| parent.to_path_buf()))
.unwrap_or_else(default_workspace_root);
Ok(Self::new(config_path, workspace_root))
}
fn load_deepseek_settings(&self) -> Result<Option<DeepSeekSettings>, PipeError> {
DeepSeekSettings::load(self.config_path.as_deref())
.map_err(|err| PipeError::Protocol(err.to_string()))
}
fn deepseek_source_label(&self) -> String {
match &self.config_path {
Some(path) if path.exists() => path.display().to_string(),
_ => "environment".to_string(),
}
}
Ok(Arc::new(PipeBrowserBackend::from_inner(browser_tool.clone())))
}
impl Default for AgentRuntimeContext {
fn default() -> Self {
Self::new(None, default_workspace_root())
}
fn configured_browser_ws_url(context: &AgentRuntimeContext) -> Option<String> {
std::env::var("SGCLAW_BROWSER_WS_URL")
.ok()
.filter(|value| !value.trim().is_empty())
.or_else(|| {
context
.load_sgclaw_settings()
.ok()
.flatten()
.and_then(|settings| settings.browser_ws_url)
.map(|value| value.trim().to_string())
.filter(|value| !value.is_empty())
})
}
fn default_workspace_root() -> PathBuf {
std::env::current_dir().unwrap_or_else(|_| PathBuf::from("."))
}
fn send_mode_log<T: Transport>(transport: &T, mode: &str) -> Result<(), PipeError> {
transport.send(&AgentMessage::LogEntry {
level: "mode".to_string(),
message: mode.to_string(),
fn send_status_changed<T: Transport>(transport: &T, state: &str) -> Result<(), PipeError> {
transport.send(&crate::pipe::AgentMessage::StatusChanged {
state: state.to_string(),
})
}
fn explicit_non_task_response(history: &[ConversationMessage], instruction: &str) -> Option<String> {
if !history.is_empty() {
return None;
}
let trimmed = instruction.trim();
if trimmed.is_empty() {
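// English: "sgClaw currently handles only browser tasks; please describe directly the web operation to open, search, click, or extract."
return Some("sgClaw 目前只处理浏览器任务,请直接描述要打开、搜索、点击或提取的网页操作。".to_string());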
}
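// Chinese hints, in order: open, search, click, type, navigate, jump, visit, extract, get, web page, page,
// tab, Baidu, Zhihu.
const TASK_HINTS: &[&str] = &[
"打开", "搜索", "点击", "输入", "导航", "跳转", "访问", "提取", "获取", "网页", "页面",
"标签页", "百度", "知乎", "google", "open", "search", "click", "type", "navigate",
];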
if TASK_HINTS.iter().any(|hint| trimmed.contains(hint)) {
return None;
}
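// Chinese entries: "hello", "hello" (polite), "are you there", "who are you", "introduce yourself".
const CHITCHAT_INPUTS: &[&str] = &[
"hi", "hello", "hey", "你好", "您好", "", "在吗", "你是谁", "介绍一下你自己",
];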
if CHITCHAT_INPUTS
.iter()
.any(|candidate| trimmed.eq_ignore_ascii_case(candidate) || trimmed == *candidate)
{
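// English: "sgClaw is now a browser-task entry point and does not do general chit-chat. Say directly what you want to do on a web page, e.g. 'open Baidu and search for weather'."
return Some("sgClaw 现在是浏览器任务入口,不做通用闲聊。请直接说你想在网页上执行什么操作,例如“打开百度搜索天气”。".to_string());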
}
if trimmed.chars().count() <= 8 {
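// English: "sgClaw now handles only browser tasks. Describe the web operation goal directly, e.g. 'open Zhihu and search for weather' or 'extract the current page title'."
return Some("sgClaw 现在只处理浏览器任务。请直接描述网页操作目标,例如“打开知乎搜索天气”或“提取当前页面标题”。".to_string());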
}
None
}
fn execute_plan<T: Transport>(
transport: &T,
browser_tool: &BrowserPipeTool<T>,
plan: &planner::TaskPlan,
) -> Result<String, PipeError> {
for step in &plan.steps {
transport.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: step.log_message.clone(),
})?;
let result = browser_tool.invoke(
step.action.clone(),
step.params.clone(),
&step.expected_domain,
)?;
if !result.success {
return Err(PipeError::Protocol(format!(
"browser action failed: {}",
result.data
)));
}
}
Ok(plan.summary.clone())
}
pub fn execute_task<T: Transport>(
transport: &T,
browser_tool: &BrowserPipeTool<T>,
instruction: &str,
) -> Result<String, PipeError> {
let plan = planner::plan_instruction(instruction)
.map_err(|err| PipeError::Protocol(err.to_string()))?;
execute_plan(transport, browser_tool, &plan)
}
pub fn handle_browser_message<T: Transport + 'static>(
transport: &T,
browser_tool: &BrowserPipeTool<T>,
@@ -180,6 +78,9 @@ pub fn handle_browser_message_with_context<T: Transport + 'static>(
message: BrowserMessage,
) -> Result<(), PipeError> {
match message {
BrowserMessage::Connect => send_status_changed(transport, "connected"),
BrowserMessage::Start => send_status_changed(transport, "started"),
BrowserMessage::Stop => send_status_changed(transport, "stopped"),
BrowserMessage::SubmitTask {
instruction,
conversation_id,
@@ -187,90 +88,25 @@ pub fn handle_browser_message_with_context<T: Transport + 'static>(
page_url,
page_title,
} => {
if let Some(summary) = explicit_non_task_response(&messages, &instruction) {
return transport.send(&AgentMessage::TaskComplete {
success: false,
summary,
});
}
let task_context = CompatTaskContext {
conversation_id: (!conversation_id.trim().is_empty())
.then_some(conversation_id.clone()),
let request = SubmitTaskRequest {
instruction,
conversation_id: normalize_optional_submit_field(conversation_id),
messages,
page_url: (!page_url.trim().is_empty()).then_some(page_url),
page_title: (!page_title.trim().is_empty()).then_some(page_title),
page_url: normalize_optional_submit_field(page_url),
page_title: normalize_optional_submit_field(page_title),
};
if !task_context.messages.is_empty() {
let _ = transport.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: format!(
"continuing conversation with {} prior turns",
task_context.messages.len()
),
});
if configured_browser_ws_url(context).is_some() {
let browser_backend = browser_backend_for_submit(browser_tool, context, &request)?;
run_submit_task_with_browser_backend(
transport,
transport,
browser_backend,
context,
request,
)
} else {
run_submit_task(transport, transport, browser_tool, context, request)
}
let completion = match context.load_deepseek_settings() {
Ok(Some(settings)) => {
let _ = transport.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: format!(
"DeepSeek config loaded from {} model={} base_url={}",
context.deepseek_source_label(),
settings.model,
settings.base_url
),
});
let _ = send_mode_log(transport, "compat_llm_primary");
match crate::compat::runtime::execute_task(
transport,
browser_tool.clone(),
&instruction,
&task_context,
&context.workspace_root,
&settings,
) {
Ok(summary) => AgentMessage::TaskComplete {
success: true,
summary,
},
Err(err) => AgentMessage::TaskComplete {
success: false,
summary: err.to_string(),
},
}
}
Ok(None) => match planner::plan_instruction(&instruction) {
Ok(plan) => {
let _ = send_mode_log(transport, "deterministic_planner");
match execute_plan(transport, browser_tool, &plan) {
Ok(summary) => AgentMessage::TaskComplete {
success: true,
summary,
},
Err(err) => AgentMessage::TaskComplete {
success: false,
summary: err.to_string(),
},
}
}
Err(err) => AgentMessage::TaskComplete {
success: false,
summary: PipeError::Protocol(err.to_string()).to_string(),
},
},
Err(err) => {
let _ = transport.send(&AgentMessage::LogEntry {
level: "error".to_string(),
message: format!("failed to load DeepSeek config: {err}"),
});
AgentMessage::TaskComplete {
success: false,
summary: err.to_string(),
}
}
};
transport.send(&completion)
}
BrowserMessage::Init { .. } => {
eprintln!("ignoring duplicate init after handshake");
@@ -282,3 +118,36 @@ pub fn handle_browser_message_with_context<T: Transport + 'static>(
}
}
}
#[cfg(test)]
mod tests {
use super::normalize_optional_submit_field;
use std::fs;
use std::path::PathBuf;
#[test]
fn normalize_optional_submit_field_trims_and_drops_blank_values() {
assert_eq!(normalize_optional_submit_field(" \n\t ".to_string()), None);
assert_eq!(
normalize_optional_submit_field(" https://example.com/page ".to_string()),
Some("https://example.com/page".to_string())
);
}
#[test]
fn agent_module_cleanup_removes_legacy_runtime_and_planner_sources() {
let manifest_dir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
let agent_module = fs::read_to_string(manifest_dir.join("src/agent/mod.rs")).unwrap();
let top_lines = agent_module
.lines()
.take(10)
.map(str::trim)
.collect::<Vec<_>>();
assert!(!manifest_dir.join("src/agent/runtime.rs").exists());
assert!(!manifest_dir.join("src/agent/planner.rs").exists());
assert!(!top_lines.iter().any(|line| *line == "pub mod runtime;"));
assert!(!top_lines.iter().any(|line| *line == "pub mod planner;"));
assert!(top_lines.iter().any(|line| *line == "pub mod task_runner;"));
}
}


@@ -1,112 +0,0 @@
use reqwest::Url;
use serde_json::{json, Value};
use thiserror::Error;
use crate::pipe::Action;
const BAIDU_URL: &str = "https://www.baidu.com";
const BAIDU_DOMAIN: &str = "www.baidu.com";
const BAIDU_INPUT_SELECTOR: &str = "#kw";
const BAIDU_SEARCH_BUTTON_SELECTOR: &str = "#su";
const ZHIHU_URL: &str = "https://www.zhihu.com/search";
const ZHIHU_DOMAIN: &str = "www.zhihu.com";
#[derive(Debug, Clone, PartialEq)]
pub struct PlannedStep {
pub action: Action,
pub params: Value,
pub expected_domain: String,
pub log_message: String,
}
#[derive(Debug, Clone, PartialEq)]
pub struct TaskPlan {
pub summary: String,
pub steps: Vec<PlannedStep>,
}
#[derive(Debug, Error, Clone, PartialEq, Eq)]
pub enum PlannerError {
#[error("unsupported instruction: {0}")]
UnsupportedInstruction(String),
#[error("missing search query in instruction")]
MissingQuery,
}
pub fn plan_instruction(instruction: &str) -> Result<TaskPlan, PlannerError> {
let trimmed = instruction.trim();
// Prefixes: "open Baidu search" / "open Baidu and search".
if let Some(query) = extract_query(trimmed, &["打开百度搜索", "打开百度并搜索"])? {
return Ok(plan_baidu_search(query));
}
// Prefixes: "open Zhihu search" / "open Zhihu and search".
if let Some(query) = extract_query(trimmed, &["打开知乎搜索", "打开知乎并搜索"])? {
return Ok(plan_zhihu_search(query));
}
Err(PlannerError::UnsupportedInstruction(trimmed.to_string()))
}
fn extract_query<'a>(
instruction: &'a str,
prefixes: &[&str],
) -> Result<Option<&'a str>, PlannerError> {
let Some(query) = prefixes
.iter()
.find_map(|prefix| instruction.strip_prefix(prefix))
else {
return Ok(None);
};
let query = query.trim();
if query.is_empty() {
return Err(PlannerError::MissingQuery);
}
Ok(Some(query))
}
fn plan_baidu_search(query: &str) -> TaskPlan {
TaskPlan {
// English: "Searched Baidu for {query}".
summary: format!("已在百度搜索{query}"),
steps: vec![
PlannedStep {
action: Action::Navigate,
params: json!({ "url": BAIDU_URL }),
expected_domain: BAIDU_DOMAIN.to_string(),
log_message: "navigate https://www.baidu.com".to_string(),
},
PlannedStep {
action: Action::Type,
params: json!({
"selector": BAIDU_INPUT_SELECTOR,
"text": query,
"clear_first": true
}),
expected_domain: BAIDU_DOMAIN.to_string(),
log_message: format!("type {query} into {BAIDU_INPUT_SELECTOR}"),
},
PlannedStep {
action: Action::Click,
params: json!({ "selector": BAIDU_SEARCH_BUTTON_SELECTOR }),
expected_domain: BAIDU_DOMAIN.to_string(),
log_message: format!("click {BAIDU_SEARCH_BUTTON_SELECTOR}"),
},
],
}
}
fn plan_zhihu_search(query: &str) -> TaskPlan {
let url = Url::parse_with_params(ZHIHU_URL, &[("type", "content"), ("q", query)])
.expect("valid Zhihu search URL");
let url: String = url.into();
TaskPlan {
// English: "Searched Zhihu for {query}".
summary: format!("已在知乎搜索{query}"),
steps: vec![PlannedStep {
action: Action::Navigate,
params: json!({ "url": url }),
expected_domain: ZHIHU_DOMAIN.to_string(),
log_message: format!("navigate {url}"),
}],
}
}


@@ -1,152 +0,0 @@
use serde_json::{json, Map, Value};
use crate::llm::{ChatMessage, LlmError, LlmProvider, ToolDefinition, ToolFunctionCall};
use crate::pipe::{Action, AgentMessage, BrowserPipeTool, PipeError, Transport};
const BROWSER_ACTION_TOOL_NAME: &str = "browser_action";
#[derive(Debug, Clone, PartialEq)]
struct BrowserActionCall {
action: Action,
expected_domain: String,
params: Value,
}
pub fn execute_task_with_provider<P: LlmProvider, T: Transport>(
transport: &T,
browser_tool: &BrowserPipeTool<T>,
provider: &P,
instruction: &str,
) -> Result<String, PipeError> {
let messages = vec![
ChatMessage {
role: "system".to_string(),
content: "You are sgClaw. Use browser_action to complete the browser task."
.to_string(),
},
ChatMessage {
role: "user".to_string(),
content: instruction.to_string(),
},
];
let tools = vec![browser_action_tool_definition()];
let calls = provider
.chat(&messages, &tools)
.map_err(map_llm_error_to_pipe_error)?;
for call in calls {
let browser_call = parse_browser_action_call(call)
.map_err(|err| PipeError::Protocol(err.to_string()))?;
transport.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: format!(
"{} {}",
browser_call.action.as_str(),
browser_call.expected_domain
),
})?;
let result = browser_tool.invoke(
browser_call.action,
browser_call.params,
&browser_call.expected_domain,
)?;
if !result.success {
return Err(PipeError::Protocol(format!(
"browser action failed: {}",
result.data
)));
}
}
// English: "Task executed via Agent: {instruction}".
Ok(format!("已通过 Agent 执行任务: {instruction}"))
}
pub fn browser_action_tool_definition() -> ToolDefinition {
ToolDefinition {
name: BROWSER_ACTION_TOOL_NAME.to_string(),
description: "Execute browser actions in SuperRPA".to_string(),
parameters: json!({
"type": "object",
"required": ["action", "expected_domain"],
"properties": {
"action": { "type": "string", "enum": ["click", "type", "navigate", "getText"] },
"expected_domain": { "type": "string" },
"selector": { "type": "string" },
"text": { "type": "string" },
"url": { "type": "string" },
"clear_first": { "type": "boolean" }
}
}),
}
}
fn parse_browser_action_call(call: ToolFunctionCall) -> Result<BrowserActionCall, RuntimeError> {
if call.name != BROWSER_ACTION_TOOL_NAME {
return Err(RuntimeError::UnsupportedTool(call.name));
}
let mut args = match call.arguments {
Value::Object(args) => args,
other => {
return Err(RuntimeError::InvalidArguments(format!(
"expected object arguments, got {other}"
)))
}
};
let action_name = take_required_string(&mut args, "action")?;
let expected_domain = take_required_string(&mut args, "expected_domain")?;
let action = parse_action(&action_name)?;
let params = Value::Object(action_params_from_args(args));
Ok(BrowserActionCall {
action,
expected_domain,
params,
})
}
fn map_llm_error_to_pipe_error(err: LlmError) -> PipeError {
PipeError::Protocol(err.to_string())
}
fn parse_action(action_name: &str) -> Result<Action, RuntimeError> {
match action_name {
"click" => Ok(Action::Click),
"type" => Ok(Action::Type),
"navigate" => Ok(Action::Navigate),
"getText" => Ok(Action::GetText),
other => Err(RuntimeError::UnsupportedAction(other.to_string())),
}
}
fn take_required_string(
args: &mut Map<String, Value>,
key: &'static str,
) -> Result<String, RuntimeError> {
match args.remove(key) {
Some(Value::String(value)) if !value.trim().is_empty() => Ok(value),
Some(other) => Err(RuntimeError::InvalidArguments(format!(
"{key} must be a non-empty string, got {other}"
))),
None => Err(RuntimeError::MissingField(key)),
}
}
fn action_params_from_args(args: Map<String, Value>) -> Map<String, Value> {
args
}
#[derive(Debug, thiserror::Error)]
enum RuntimeError {
#[error("unsupported tool: {0}")]
UnsupportedTool(String),
#[error("unsupported action: {0}")]
UnsupportedAction(String),
#[error("missing required field: {0}")]
MissingField(&'static str),
#[error("invalid tool arguments: {0}")]
InvalidArguments(String),
}

src/agent/task_runner.rs (new file, 572 lines)

@@ -0,0 +1,572 @@
use std::ffi::OsString;
use std::path::{Path, PathBuf};
use std::sync::Arc;
use crate::browser::BrowserBackend;
use crate::compat::config_adapter::resolve_skills_dir_from_sgclaw_settings;
use crate::compat::runtime::CompatTaskContext;
use crate::config::SgClawSettings;
use crate::pipe::{
AgentMessage, BrowserPipeTool, ConversationMessage, PipeError, Transport,
};
use crate::runtime::RuntimeEngine;
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AgentRuntimeContext {
config_path: Option<PathBuf>,
workspace_root: PathBuf,
}
impl AgentRuntimeContext {
pub fn new(config_path: Option<PathBuf>, workspace_root: PathBuf) -> Self {
Self {
config_path,
workspace_root,
}
}
pub fn from_process_args<I, S>(args: I) -> Result<Self, PipeError>
where
I: IntoIterator<Item = S>,
S: Into<OsString>,
{
let mut config_path = None;
let mut args = args.into_iter().map(Into::into);
let _ = args.next();
while let Some(arg) = args.next() {
if arg == OsString::from("--config-path") {
let Some(value) = args.next() else {
return Err(PipeError::Protocol(
"missing value for --config-path".to_string(),
));
};
config_path = Some(resolve_process_path(PathBuf::from(value)));
continue;
}
let arg_string = arg.to_string_lossy();
if let Some(value) = arg_string.strip_prefix("--config-path=") {
config_path = Some(resolve_process_path(PathBuf::from(value)));
}
}
let workspace_root = config_path
.as_ref()
.and_then(|path| path.parent().map(|parent| parent.to_path_buf()))
.unwrap_or_else(default_workspace_root);
Ok(Self::new(config_path, workspace_root))
}
pub(crate) fn load_sgclaw_settings(&self) -> Result<Option<SgClawSettings>, PipeError> {
SgClawSettings::load(self.config_path.as_deref())
.map_err(|err| PipeError::Protocol(err.to_string()))
}
pub fn config_path(&self) -> Option<&Path> {
self.config_path.as_deref()
}
fn settings_source_label(&self) -> String {
match &self.config_path {
Some(path) if path.exists() => path.display().to_string(),
_ => "environment".to_string(),
}
}
}
impl Default for AgentRuntimeContext {
fn default() -> Self {
Self::new(None, default_workspace_root())
}
}
fn default_workspace_root() -> PathBuf {
std::env::current_dir().unwrap_or_else(|_| PathBuf::from("."))
}
fn resolve_process_path(path: PathBuf) -> PathBuf {
if path.is_absolute() {
path
} else {
default_workspace_root().join(path)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn from_process_args_resolves_relative_config_path_against_current_dir() {
let current_dir = std::env::current_dir().unwrap();
let context = AgentRuntimeContext::from_process_args([
OsString::from("sg_claw"),
OsString::from("--config-path"),
OsString::from("../tmp/sgclaw_config.json"),
])
.unwrap();
assert_eq!(
context.config_path,
Some(current_dir.join("../tmp/sgclaw_config.json"))
);
assert_eq!(context.workspace_root, current_dir.join("../tmp"));
assert!(context.workspace_root.is_absolute());
}
}
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct SubmitTaskRequest {
pub instruction: String,
pub conversation_id: Option<String>,
pub messages: Vec<ConversationMessage>,
pub page_url: Option<String>,
pub page_title: Option<String>,
}
pub trait AgentEventSink: Send + Sync {
fn send(&self, message: &AgentMessage) -> Result<(), PipeError>;
}
impl<T: Transport + ?Sized> AgentEventSink for T {
fn send(&self, message: &AgentMessage) -> Result<(), PipeError> {
Transport::send(self, message)
}
}
fn resolve_submit_instruction(
instruction: String,
page_url: Option<&str>,
page_title: Option<&str>,
) -> Result<(String, Option<crate::compat::deterministic_submit::DeterministicExecutionPlan>), AgentMessage> {
let raw_instruction = instruction;
let trimmed_instruction = raw_instruction.trim().to_string();
if trimmed_instruction.is_empty() {
return Err(AgentMessage::TaskComplete {
success: false,
// English: "Please enter the task content."
summary: "请输入任务内容。".to_string(),
});
}
match crate::compat::deterministic_submit::decide_deterministic_submit(
&raw_instruction,
page_url,
page_title,
) {
crate::compat::deterministic_submit::DeterministicSubmitDecision::NotDeterministic => {
Ok((trimmed_instruction, None))
}
crate::compat::deterministic_submit::DeterministicSubmitDecision::Prompt { summary } => {
Err(AgentMessage::TaskComplete {
success: false,
summary,
})
}
crate::compat::deterministic_submit::DeterministicSubmitDecision::Execute(plan) => {
Ok((plan.instruction.clone(), Some(plan)))
}
}
}
pub fn run_submit_task<T: Transport + 'static>(
transport: &T,
sink: &dyn AgentEventSink,
browser_tool: &BrowserPipeTool<T>,
context: &AgentRuntimeContext,
request: SubmitTaskRequest,
) -> Result<(), PipeError> {
let SubmitTaskRequest {
instruction,
conversation_id,
messages,
page_url,
page_title,
} = request;
let task_context = CompatTaskContext {
conversation_id,
messages,
page_url,
page_title,
};
let (instruction, deterministic_plan) = match resolve_submit_instruction(
instruction,
task_context.page_url.as_deref(),
task_context.page_title.as_deref(),
) {
Ok(resolved) => resolved,
Err(completion) => return sink.send(&completion),
};
let _ = sink.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: runtime_version_log_message(),
});
if !task_context.messages.is_empty() {
let _ = sink.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: format!(
"continuing conversation with {} prior turns",
task_context.messages.len()
),
});
}
let completion = match context.load_sgclaw_settings() {
Ok(Some(settings)) => {
let resolved_skills_dir =
resolve_skills_dir_from_sgclaw_settings(&context.workspace_root, &settings);
let _ = sink.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: format!(
"DeepSeek config loaded from {} model={} base_url={}",
context.settings_source_label(),
settings.provider_model,
settings.provider_base_url
),
});
let _ = sink.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: format!("skills dir resolved to {}", resolved_skills_dir.display()),
});
let _ = sink.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: format!(
"runtime profile={:?} skills_prompt_mode={:?}",
settings.runtime_profile, settings.skills_prompt_mode
),
});
if let Some(plan) = deterministic_plan.as_ref() {
let _ = send_mode_log(sink, "direct_skill_primary");
let completion =
match crate::compat::deterministic_submit::execute_deterministic_submit(
browser_tool.clone(),
plan,
&context.workspace_root,
&settings,
) {
Ok(outcome) => AgentMessage::TaskComplete {
success: outcome.success,
summary: outcome.summary,
},
Err(err) => AgentMessage::TaskComplete {
success: false,
summary: err.to_string(),
},
};
return sink.send(&completion);
}
if RuntimeEngine::new(settings.runtime_profile).browser_surface_enabled()
&& crate::compat::orchestration::should_use_primary_orchestration(
&instruction,
task_context.page_url.as_deref(),
task_context.page_title.as_deref(),
)
{
let _ = send_mode_log(sink, "zeroclaw_process_message_primary");
match crate::compat::orchestration::execute_task_with_sgclaw_settings(
transport,
browser_tool.clone(),
&instruction,
&task_context,
&context.workspace_root,
&settings,
) {
Ok(summary) => {
return sink.send(&AgentMessage::TaskComplete {
success: true,
summary,
});
}
Err(err) => {
return sink.send(&AgentMessage::TaskComplete {
success: false,
summary: err.to_string(),
});
}
}
}
if settings
.direct_submit_skill
.as_deref()
.map(str::trim)
.is_some_and(|value| !value.is_empty())
{
match crate::compat::direct_skill_runtime::execute_direct_submit_skill(
browser_tool.clone(),
&instruction,
&task_context,
&context.workspace_root,
&settings,
) {
Ok(outcome) => {
let _ = send_mode_log(sink, "direct_skill_primary");
return sink.send(&AgentMessage::TaskComplete {
success: outcome.success,
summary: outcome.summary,
});
}
Err(PipeError::Protocol(message))
if message.contains("must use skill.tool format") =>
{
return sink.send(&AgentMessage::TaskComplete {
success: false,
summary: message,
});
}
Err(err) => {
return sink.send(&AgentMessage::TaskComplete {
success: false,
summary: err.to_string(),
});
}
}
}
let _ = send_mode_log(sink, "compat_llm_primary");
match crate::compat::runtime::execute_task_with_sgclaw_settings(
transport,
browser_tool.clone(),
&instruction,
&task_context,
&context.workspace_root,
&settings,
) {
Ok(summary) => AgentMessage::TaskComplete {
success: true,
summary,
},
Err(err) => AgentMessage::TaskComplete {
success: false,
summary: err.to_string(),
},
}
}
Ok(None) => AgentMessage::TaskComplete {
success: false,
summary: missing_llm_configuration_summary(),
},
Err(err) => {
let _ = sink.send(&AgentMessage::LogEntry {
level: "error".to_string(),
message: format!("failed to load DeepSeek config: {err}"),
});
AgentMessage::TaskComplete {
success: false,
summary: err.to_string(),
}
}
};
sink.send(&completion)
}
pub fn run_submit_task_with_browser_backend<T: Transport + 'static>(
_transport: &T,
sink: &dyn AgentEventSink,
browser_backend: Arc<dyn BrowserBackend>,
context: &AgentRuntimeContext,
request: SubmitTaskRequest,
) -> Result<(), PipeError> {
let SubmitTaskRequest {
instruction,
conversation_id,
messages,
page_url,
page_title,
} = request;
let task_context = CompatTaskContext {
conversation_id,
messages,
page_url,
page_title,
};
let (instruction, deterministic_plan) = match resolve_submit_instruction(
instruction,
task_context.page_url.as_deref(),
task_context.page_title.as_deref(),
) {
Ok(resolved) => resolved,
Err(completion) => return sink.send(&completion),
};
let _ = sink.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: runtime_version_log_message(),
});
if !task_context.messages.is_empty() {
let _ = sink.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: format!(
"continuing conversation with {} prior turns",
task_context.messages.len()
),
});
}
let completion = match context.load_sgclaw_settings() {
Ok(Some(settings)) => {
let resolved_skills_dir =
resolve_skills_dir_from_sgclaw_settings(&context.workspace_root, &settings);
let _ = sink.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: format!(
"DeepSeek config loaded from {} model={} base_url={}",
context.settings_source_label(),
settings.provider_model,
settings.provider_base_url
),
});
let _ = sink.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: format!("skills dir resolved to {}", resolved_skills_dir.display()),
});
let _ = sink.send(&AgentMessage::LogEntry {
level: "info".to_string(),
message: format!(
"runtime profile={:?} skills_prompt_mode={:?}",
settings.runtime_profile, settings.skills_prompt_mode
),
});
            // A resolved deterministic plan takes priority over every other execution path.
            if let Some(plan) = deterministic_plan.as_ref() {
                let _ = send_mode_log(sink, "direct_skill_primary");
                let completion = match crate::compat::deterministic_submit::execute_deterministic_submit_with_browser_backend(
                    browser_backend.clone(),
                    plan,
                    &context.workspace_root,
                    &settings,
                ) {
                    Ok(outcome) => AgentMessage::TaskComplete {
                        success: outcome.success,
                        summary: outcome.summary,
                    },
                    Err(err) => AgentMessage::TaskComplete {
                        success: false,
                        summary: err.to_string(),
                    },
                };
                return sink.send(&completion);
            }
            // Otherwise use primary orchestration when the runtime profile enables the
            // browser surface and the instruction and page context qualify for it.
            if RuntimeEngine::new(settings.runtime_profile).browser_surface_enabled()
                && crate::compat::orchestration::should_use_primary_orchestration(
                    &instruction,
                    task_context.page_url.as_deref(),
                    task_context.page_title.as_deref(),
                )
            {
                let _ = send_mode_log(sink, "zeroclaw_process_message_primary");
                match crate::compat::orchestration::execute_task_with_browser_backend(
                    sink,
                    browser_backend.clone(),
                    &instruction,
                    &task_context,
                    &context.workspace_root,
                    &settings,
                ) {
                    Ok(summary) => {
                        return sink.send(&AgentMessage::TaskComplete {
                            success: true,
                            summary,
                        });
                    }
                    Err(err) => {
                        return sink.send(&AgentMessage::TaskComplete {
                            success: false,
                            summary: err.to_string(),
                        });
                    }
                }
            }
            // Next, run the configured direct-submit skill if the setting is non-blank.
            if settings
                .direct_submit_skill
                .as_deref()
                .map(str::trim)
                .is_some_and(|value| !value.is_empty())
            {
                match crate::compat::direct_skill_runtime::execute_direct_submit_skill_with_browser_backend(
                    browser_backend.clone(),
                    &instruction,
                    &task_context,
                    &context.workspace_root,
                    &settings,
                ) {
                    Ok(outcome) => {
                        let _ = send_mode_log(sink, "direct_skill_primary");
                        return sink.send(&AgentMessage::TaskComplete {
                            success: outcome.success,
                            summary: outcome.summary,
                        });
                    }
                    // For skill.tool format errors, report the protocol message itself
                    // instead of the error's Display output.
                    Err(PipeError::Protocol(message))
                        if message.contains("must use skill.tool format") =>
                    {
                        return sink.send(&AgentMessage::TaskComplete {
                            success: false,
                            summary: message,
                        });
                    }
                    Err(err) => {
                        return sink.send(&AgentMessage::TaskComplete {
                            success: false,
                            summary: err.to_string(),
                        });
                    }
                }
            }
            // No earlier path applied: run the compat LLM runtime.
            let _ = send_mode_log(sink, "compat_llm_primary");
            match crate::compat::runtime::execute_task_with_browser_backend(
                sink,
                browser_backend,
                &instruction,
                &task_context,
                &context.workspace_root,
                &settings,
            ) {
                Ok(summary) => AgentMessage::TaskComplete {
                    success: true,
                    summary,
                },
                Err(err) => AgentMessage::TaskComplete {
                    success: false,
                    summary: err.to_string(),
                },
            }
        }
        // Settings loaded cleanly but no LLM configuration was found.
        Ok(None) => AgentMessage::TaskComplete {
            success: false,
            summary: missing_llm_configuration_summary(),
        },
        Err(err) => {
            let _ = sink.send(&AgentMessage::LogEntry {
                level: "error".to_string(),
                message: format!("failed to load DeepSeek config: {err}"),
            });
            AgentMessage::TaskComplete {
                success: false,
                summary: err.to_string(),
            }
        }
    };
    sink.send(&completion)
}
fn send_mode_log(sink: &dyn AgentEventSink, mode: &str) -> Result<(), PipeError> {
    sink.send(&AgentMessage::LogEntry {
        level: "mode".to_string(),
        message: mode.to_string(),
    })
}
// English translation of the user-facing message below: "No large language model
// is configured. Please configure apiKey, baseUrl, and model in sgclaw_config.json
// or in environment variables first."
fn missing_llm_configuration_summary() -> String {
    "未配置大语言模型。请先在 sgclaw_config.json 或环境变量中配置 apiKey、baseUrl 与 model。"
        .to_string()
}
fn runtime_version_log_message() -> String {
    format!(
        "sgclaw runtime version={} protocol={}",
        env!("CARGO_PKG_VERSION"),
        crate::pipe::protocol::PROTOCOL_VERSION
    )
}
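
The branch order in the submit handler above can be read as a small priority selector. The following is a minimal, self-contained sketch of that ordering, not code from the repository: `ExecutionMode`, `DispatchInputs`, `select_mode`, and `mode_label` are hypothetical names introduced only for illustration, and the real function runs each backend inline rather than choosing a mode first.

```rust
/// Hypothetical summary of the four execution paths, in priority order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExecutionMode {
    DeterministicSubmit,
    PrimaryOrchestration,
    DirectSubmitSkill,
    CompatLlm,
}

/// Hypothetical, flattened view of the conditions the handler checks.
struct DispatchInputs<'a> {
    /// True when instruction resolution produced a deterministic plan.
    has_deterministic_plan: bool,
    /// Stands in for `RuntimeEngine::browser_surface_enabled()`.
    browser_surface_enabled: bool,
    /// Stands in for `should_use_primary_orchestration(...)`.
    wants_primary_orchestration: bool,
    /// Stands in for `settings.direct_submit_skill`.
    direct_submit_skill: Option<&'a str>,
}

/// Picks the first path whose condition holds, mirroring the order of the
/// `if` blocks in the handler; the compat LLM runtime is the fallback.
fn select_mode(inputs: &DispatchInputs<'_>) -> ExecutionMode {
    if inputs.has_deterministic_plan {
        return ExecutionMode::DeterministicSubmit;
    }
    if inputs.browser_surface_enabled && inputs.wants_primary_orchestration {
        return ExecutionMode::PrimaryOrchestration;
    }
    // A blank or whitespace-only skill name counts as "not configured".
    if inputs
        .direct_submit_skill
        .map(str::trim)
        .is_some_and(|value| !value.is_empty())
    {
        return ExecutionMode::DirectSubmitSkill;
    }
    ExecutionMode::CompatLlm
}

/// Maps each path to the mode label string that the handler logs for it.
/// Both skill-based paths log the same label in the source.
fn mode_label(mode: ExecutionMode) -> &'static str {
    match mode {
        ExecutionMode::DeterministicSubmit | ExecutionMode::DirectSubmitSkill => {
            "direct_skill_primary"
        }
        ExecutionMode::PrimaryOrchestration => "zeroclaw_process_message_primary",
        ExecutionMode::CompatLlm => "compat_llm_primary",
    }
}
```

With this sketch, a request that has a deterministic plan selects `DeterministicSubmit` even when the other conditions also hold, and a whitespace-only `direct_submit_skill` falls through to `CompatLlm`. One difference worth keeping in mind: the handler logs the label for the direct-submit skill path only after the skill returns `Ok`, which a pure selector like this does not capture.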

Some files were not shown because too many files have changed in this diff.