Compare commits
132 Commits
d315c13f66...feature/ge
| Author | SHA1 | Date | |
|---|---|---|---|
| | 6122b521a8 | | |
| | 8162118e6d | | |
| | 6cdd71b682 | | |
| | 6c3f3a4002 | | |
| | 2428e975eb | | |
| | 956f0c2b68 | | |
| | 118fc77935 | | |
| | 4215d49f3f | | |
| | 475e460eb1 | | |
| | 6591a0d849 | | |
| | 56ae03f3f9 | | |
| | 0fcdfb1787 | | |
| | f374334f25 | | |
| | e6bab40e5b | | |
| | 2cab25952e | | |
| | 6fee4e2083 | | |
| | 46005d9718 | | |
| | d1711a3db3 | | |
| | b1647cd865 | | |
| | 1c964c3e70 | | |
| | a6aa18c6d9 | | |
| | 7420af8f31 | | |
| | 42eb716b7e | | |
| | a325add167 | | |
| | d95b8aaf26 | | |
| | 847f2c62c6 | | |
| | ea9147defb | | |
| | 5ff6e05911 | | |
| | eead8f7802 | | |
| | d123ee0aca | | |
| | d26d96ef64 | | |
| | d996b511f6 | | |
| | b8d2eb9faa | | |
| | 4167639231 | | |
| | 78a36a73b4 | | |
| | 74c42af717 | | |
| | bb15d14749 | | |
| | 7289cc5779 | | |
| | 689abf08ec | | |
| | 2ffb42c181 | | |
| | 614e9a3a45 | | |
| | 517ac6bf39 | | |
| | dd7b3c582a | | |
| | f268668713 | | |
| | ce072c2ebe | | |
| | 464f18c672 | | |
| | b5131c858a | | |
| | 2e69fa7239 | | |
| | f84e11c631 | | |
| | 73edf1e5cf | | |
| | 87cee36173 | | |
| | 67fe17302e | | |
| | 45b54ab007 | | |
| | af8f261b79 | | |
| | f168f9f375 | | |
| | 23845413c5 | | |
| | ea6be128e7 | | |
| | 6c1865eb1c | | |
| | d00086a70b | | |
| | e7a4179513 | | |
| | 15d4b0dcc1 | | |
| | 294426ced9 | | |
| | ead9ea76fa | | |
| | e8d7d6b796 | | |
| | bd83d92480 | | |
| | 044d38003d | | |
| | f07f7d63ef | | |
| | c60cd308ca | | |
| | 6aa0c110bd | | |
| | 390a431a4b | | |
| | 0f70702914 | | |
| | 8decd9554c | | |
| | adb64429ee | | |
| | 32e2c59a40 | | |
| | fae2fd57d6 | | |
| | 899c670e5c | | |
| | 583bb117cb | | |
| | ad3778d4c5 | | |
| | 4d1070dff0 | | |
| | 0303111d5b | | |
| | 7320fb7f79 | | |
| | dbbc5d030b | | |
| | ce6b3e6749 | | |
| | a957712590 | | |
| | 0ebe060484 | | |
| | 695a888840 | | |
| | 733aee1e9a | | |
| | f8f822e1f3 | | |
| | 3b156e4bd1 | | |
| | 645dc60bae | | |
| | 007959b903 | | |
| | a8a470481d | | |
| | 447457b7d3 | | |
| | 45b60e37f7 | | |
| | d230ff0389 | | |
| | 72b79feca9 | | |
| | dd7805d341 | | |
| | 883647dffc | | |
| | b454fa3f54 | | |
| | 311cc1fee6 | | |
| | 7443b9da7f | | |
| | 34035cdc9c | | |
| | 4becf81066 | | |
| | 81de162756 | | |
| | 630190e4d3 | | |
| | 57b9be733d | | |
| | 96c3bf1dee | | |
| | bdf8e12246 | | |
| | 0dd655712c | | |
| | 6068a8228b | | |
| | 3e18350320 | | |
| | 2ae71fb1c9 | | |
| | 5bccd02d6f | | |
| | f51d6b7659 | | |
| | c793bfc6a1 | | |
| | 305b6d5110 | | |
| | 4c4f45581f | | |
| | cd94904329 | | |
| | bf09de6700 | | |
| | dbb18a094c | | |
| | 5db25b513e | | |
| | d2c9902966 | | |
| | f7e2ff256e | | |
| | c7d3d45c68 | | |
| | 0fc6fe0c8e | | |
| | ef88487f4a | | |
| | e294fbb9b1 | | |
| | 7d9036b2d4 | | |
| | 54049a1e1e | | |
| | 3844f2c34c | | |
| | 5ed81e5f0c | | |
| | b87968632a | | |
5
.cargo/config.toml
Normal file
@@ -0,0 +1,5 @@
[source.crates-io]
replace-with = "rsproxy-sparse"

[source.rsproxy-sparse]
registry = "sparse+https://rsproxy.cn/index/"
18
.gitignore
vendored
@@ -1,2 +1,20 @@
.worktrees/
target/
.claude/
.idea/
.playwright-mcp/
.qoder/
.sgclaw_workspace/
.sgclaw_workspace_dev1/
.sgclaw-zeroclaw-workspace/
sgclaw_config.json
nul
target-test/
target-zhihu-nav/
dist/
.codex/
node_modules/
tmp_*
vibe_images/
sgclaw_config_zhihu_probe.json
examples/
@@ -3,6 +3,8 @@
## Project Structure & Module Organization
`docs/` is the main source of product, architecture, integration, and team-process documentation. Keep active engineering documents in `docs/*.md`; presentation exports belong under `docs/archive/领导演示资料/`. `frontend/archive/sgClaw验证-已归档/` contains the historical Vue 2 verification page (`index.html`, `index.vue`) plus helper scripts (`serve.sh`, `download-libs.sh`, `testRunner.js`). `frontend/README.md` and `docs/README.md` describe what is active versus archived.

This repository only manages the sgClaw runtime, compatibility layers, skills, and architecture docs. The frontend overlay and Chromium-side code that embed sgClaw into SuperRPA live in the `superRPA` checkout; see `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/` (overlay HTML/JS/CSS) together with the surrounding Chromium resources and host bootstrap code for the actual browser-integrated UI.

## Build, Test, and Development Commands
There is no formal build system in the repository today. Use the local verification page directly:
407
Cargo.lock
generated
@@ -26,18 +26,6 @@ dependencies = [
"generic-array",
]

[[package]]
name = "ahash"
version = "0.8.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5a15f179cd60c4584b8a8c596927aadc462e27f2ca70c04e0071964a73ba7a75"
dependencies = [
"cfg-if",
"once_cell",
"version_check",
"zerocopy",
]

[[package]]
name = "aho-corasick"
version = "1.1.4"
@@ -47,12 +35,6 @@ dependencies = [
"memchr",
]

[[package]]
name = "allocator-api2"
version = "0.2.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "683d7910e743518b0e34f1186f92494becacb047c7b6bf616c96772180fef923"

[[package]]
name = "android_system_properties"
version = "0.1.5"
@@ -118,15 +100,6 @@ version = "1.0.102"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7f202df86484c868dbad7eaa557ef785d5c66295e41b460ef922eca0723b842c"

[[package]]
name = "ar_archive_writer"
version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7eb93bbb63b9c227414f6eb3a0adfddca591a8ce1e9b60661bb08969b87e340b"
dependencies = [
"object",
]

[[package]]
name = "async-channel"
version = "1.9.0"
@@ -220,9 +193,9 @@ dependencies = [

[[package]]
name = "aws-lc-sys"
version = "0.39.0"
version = "0.39.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1fa7e52a4c5c547c741610a2c6f123f3881e409b714cd27e6798ef020c514f0a"
checksum = "83a25cf98105baa966497416dbd42565ce3a8cf8dbfd59803ec9ad46f3126399"
dependencies = [
"cc",
"cmake",
@@ -328,6 +301,12 @@ version = "1.25.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c8efb64bd706a16a1bdde310ae86b351e4d21550d98d056f22f8a7f7a2183fec"

[[package]]
name = "byteorder"
version = "1.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1fd0f2584146f6f2ef48085050886acf353beff7305ebd1ae69500e27c67f64b"

[[package]]
name = "byteorder-lite"
version = "0.1.0"
@@ -342,9 +321,9 @@ checksum = "1e748733b7cbc798e1434b6ac524f0c1ff2ab456fe201501e6497c8417a4fc33"

[[package]]
name = "cc"
version = "1.2.57"
version = "1.2.59"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7a0dd1ca384932ff3641c8718a02769f1698e7563dc6974ffd03346116310423"
checksum = "b7a4d3ec6524d28a329fc53654bbadc9bdd7b0431f5d65f1a56ffb28a1ee5283"
dependencies = [
"find-msvc-tools",
"jobserver",
@@ -421,16 +400,6 @@ dependencies = [
"phf",
]

[[package]]
name = "chumsky"
version = "0.9.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8eebd66744a15ded14960ab4ccdbfb51ad3b81f51f3f04a80adac98c985396c9"
dependencies = [
"hashbrown 0.14.5",
"stacker",
]

[[package]]
name = "cipher"
version = "0.4.4"
@@ -493,9 +462,9 @@ checksum = "c8d4a3bb8b1e0c1050499d1815f5ab16d04f0959b233085fb31653fbfc9d98f9"

[[package]]
name = "cmake"
version = "0.1.57"
version = "0.1.58"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "75443c44cd6b379beb8c5b45d85d0773baf31cce901fe7bb252f4eff3008ef7d"
checksum = "c0f78a02292a74a88ac736019ab962ece0bc380e3f977bf72e376c5d78ff0678"
dependencies = [
"cc",
]
@@ -772,9 +741,9 @@ checksum = "7360491ce676a36bf9bb3c56c1aa791658183a54d2744120f27285738d90465a"

[[package]]
name = "fastrand"
version = "2.3.0"
version = "2.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "37909eebbb50d72f9059c3b6d82c0463f2ff062c9e95845c43a6c9c0355411be"
checksum = "9f1f227452a390804cdb637b74a86990f2a7d7ba4b7d5693aac9b4dd6defd8d6"

[[package]]
name = "fdeflate"
@@ -1012,16 +981,6 @@ dependencies = [
"tracing",
]

[[package]]
name = "hashbrown"
version = "0.14.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e5274423e17b7c9fc20b6e7e208532f9b19825d82dfd615708b70edd83df41f1"
dependencies = [
"ahash",
"allocator-api2",
]

[[package]]
name = "hashbrown"
version = "0.15.5"
@@ -1039,10 +998,11 @@ checksum = "841d1cc9bed7f9236f321df977030373f4a4163ae1a7dbfe1a51a2c1a51d9100"

[[package]]
name = "hashify"
version = "0.2.7"
version = "0.2.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "149e3ea90eb5a26ad354cfe3cb7f7401b9329032d0235f2687d03a35f30e5d4c"
checksum = "dd1246c0e5493286aeb2dde35b1f4eb9c4ce00e628641210a5e553fc001a1f26"
dependencies = [
"indexmap",
"proc-macro2",
"quote",
"syn",
@@ -1136,9 +1096,9 @@ checksum = "df3b46402a9d5adb4c86a0cf463f42e19994e3ee891101b1841f30a545cb49a9"

[[package]]
name = "hyper"
version = "1.8.1"
version = "1.9.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2ab2d4f250c3d7b1c9fcdff1cece94ea4e2dfbec68614f7b87cb205f24ca9d11"
checksum = "6299f016b246a94207e63da54dbe807655bf9e00044f73ded42c3ac5305fbcca"
dependencies = [
"atomic-waker",
"bytes",
@@ -1151,7 +1111,6 @@ dependencies = [
"httpdate",
"itoa",
"pin-project-lite",
"pin-utils",
"smallvec",
"tokio",
"want",
@@ -1223,12 +1182,13 @@ dependencies = [

[[package]]
name = "icu_collections"
version = "2.1.1"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4c6b649701667bbe825c3b7e6388cb521c23d88644678e83c0c4d0a621a34b43"
checksum = "2984d1cd16c883d7935b9e07e44071dca8d917fd52ecc02c04d5fa0b5a3f191c"
dependencies = [
"displaydoc",
"potential_utf",
"utf8_iter",
"yoke",
"zerofrom",
"zerovec",
@@ -1236,9 +1196,9 @@ dependencies = [

[[package]]
name = "icu_locale_core"
version = "2.1.1"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "edba7861004dd3714265b4db54a3c390e880ab658fec5f7db895fae2046b5bb6"
checksum = "92219b62b3e2b4d88ac5119f8904c10f8f61bf7e95b640d25ba3075e6cac2c29"
dependencies = [
"displaydoc",
"litemap",
@@ -1249,9 +1209,9 @@ dependencies = [

[[package]]
name = "icu_normalizer"
version = "2.1.1"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5f6c8828b67bf8908d82127b2054ea1b4427ff0230ee9141c54251934ab1b599"
checksum = "c56e5ee99d6e3d33bd91c5d85458b6005a22140021cc324cea84dd0e72cff3b4"
dependencies = [
"icu_collections",
"icu_normalizer_data",
@@ -1263,15 +1223,15 @@ dependencies = [

[[package]]
name = "icu_normalizer_data"
version = "2.1.1"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7aedcccd01fc5fe81e6b489c15b247b8b0690feb23304303a9e560f37efc560a"
checksum = "da3be0ae77ea334f4da67c12f149704f19f81d1adf7c51cf482943e84a2bad38"

[[package]]
name = "icu_properties"
version = "2.1.2"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "020bfc02fe870ec3a66d93e677ccca0562506e5872c650f893269e08615d74ec"
checksum = "bee3b67d0ea5c2cca5003417989af8996f8604e34fb9ddf96208a033901e70de"
dependencies = [
"icu_collections",
"icu_locale_core",
@@ -1283,15 +1243,15 @@ dependencies = [

[[package]]
name = "icu_properties_data"
version = "2.1.2"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "616c294cf8d725c6afcd8f55abc17c56464ef6211f9ed59cccffe534129c77af"
checksum = "8e2bbb201e0c04f7b4b3e14382af113e17ba4f63e2c9d2ee626b720cbce54a14"

[[package]]
name = "icu_provider"
version = "2.1.1"
version = "2.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "85962cf0ce02e1e0a629cc34e7ca3e373ce20dda4c4d7294bbd0bf1fdb59e614"
checksum = "139c4cf31c8b5f33d7e199446eff9c1e02decfc2f0eec2c8d71f65befa45b421"
dependencies = [
"displaydoc",
"icu_locale_core",
@@ -1355,9 +1315,9 @@ dependencies = [

[[package]]
name = "indexmap"
version = "2.13.0"
version = "2.13.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7714e70437a7dc3ac8eb7e6f8df75fd8eb422675fc7678aff7364301092b1017"
checksum = "45a8a2b9cb3e0b0c1803dbb0758ffac5de2f425b23c28f518faabd9d805342ff"
dependencies = [
"equivalent",
"hashbrown 0.16.1",
@@ -1395,9 +1355,9 @@ checksum = "d98f6fed1fde3f8c21bc40a1abb88dd75e67924f9cffc3ef95607bad8017f8e2"

[[package]]
name = "iri-string"
version = "0.7.11"
version = "0.7.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d8e7418f59cc01c88316161279a7f665217ae316b388e58a0d10e29f54f1e5eb"
checksum = "25e659a4bb38e810ebc252e53b5814ff908a8c58c2a9ce2fae1bbec24cbf4e20"
dependencies = [
"memchr",
"serde",
@@ -1427,10 +1387,12 @@ dependencies = [

[[package]]
name = "js-sys"
version = "0.3.91"
version = "0.3.94"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b49715b7073f385ba4bc528e5747d02e66cb39c6146efb66b781f131f0fb399c"
checksum = "2e04e2ef80ce82e13552136fabeef8a5ed1f985a96805761cbb9a2c34e7664d9"
dependencies = [
"cfg-if",
"futures-util",
"once_cell",
"wasm-bindgen",
]
@@ -1449,12 +1411,11 @@ checksum = "09edd9e8b54e49e587e4f6295a7d29c3ea94d469cb40ab8ca70b288248a81db2"

[[package]]
name = "lettre"
version = "0.11.19"
version = "0.11.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9e13e10e8818f8b2a60f52cb127041d388b89f3a96a62be9ceaffa22262fef7f"
checksum = "dabda5859ee7c06b995b9d1165aa52c39110e079ef609db97178d86aeb051fa7"
dependencies = [
"base64",
"chumsky",
"email-encoding",
"email_address",
"fastrand",
@@ -1473,9 +1434,9 @@ dependencies = [

[[package]]
name = "libc"
version = "0.2.183"
version = "0.2.184"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b5b646652bf6661599e1da8901b3b9522896f01e736bad5f723fe7a3a27f899d"
checksum = "48f5d2a454e16a5ea0f4ced81bd44e4cfc7bd3a507b61887c99fd3538b28e4af"

[[package]]
name = "libloading"
@@ -1518,9 +1479,9 @@ checksum = "32a66949e030da00e8c7d4434b251670a91556f4144941d37452769c25d58a53"

[[package]]
name = "litemap"
version = "0.8.1"
version = "0.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6373607a59f0be73a39b6fe456b8192fcc3585f602af20751600e974dd455e77"
checksum = "92daf443525c4cce67b150400bc2316076100ce0b3686209eb8cf3c31612e6f0"

[[package]]
name = "lock_api"
@@ -1607,9 +1568,9 @@ dependencies = [

[[package]]
name = "mio"
version = "1.1.1"
version = "1.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a69bcab0ad47271a0234d9422b131806bf3968021e5dc9328caf2d4cd58557fc"
checksum = "50b7e5b27aa02a74bac8c3f23f448f8d87ff11f92d3aac1a6ed369ee08cc56c1"
dependencies = [
"libc",
"wasi",
@@ -1669,15 +1630,6 @@ dependencies = [
"autocfg",
]

[[package]]
name = "object"
version = "0.37.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ff76201f031d8863c38aa7f905eca4f53abbfa15f609db4277d44cd8938f33fe"
dependencies = [
"memchr",
]

[[package]]
name = "once_cell"
version = "1.21.4"
@@ -1831,9 +1783,9 @@ checksum = "c33a9471896f1c69cecef8d20cbe2f7accd12527ce60845ff44c153bb2a21b49"

[[package]]
name = "potential_utf"
version = "0.1.4"
version = "0.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b73949432f5e2a09657003c25bca5e19a0e9c84f8058ca374f49e0ebe605af77"
checksum = "0103b1cef7ec0cf76490e969665504990193874ea05c85ff9bab8b911d0a0564"
dependencies = [
"zerovec",
]
@@ -1866,16 +1818,6 @@ dependencies = [
"unicode-ident",
]

[[package]]
name = "psm"
version = "0.1.30"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3852766467df634d74f0b2d7819bf8dc483a0eb2e3b0f50f756f9cfe8b0d18d8"
dependencies = [
"ar_archive_writer",
"cc",
]

[[package]]
name = "pxfm"
version = "0.1.28"
@@ -2204,9 +2146,9 @@ dependencies = [

[[package]]
name = "rustc-hash"
version = "2.1.1"
version = "2.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "357703d41365b4b27c590e3ed91eabb1b663f07c4c084095e60cbed4362dff0d"
checksum = "94300abf3f1ae2e2b8ffb7b58043de3d399c73fa6f4b73826402a5c457614dbe"

[[package]]
name = "rustix"
@@ -2328,9 +2270,9 @@ checksum = "b12e76d157a900eb52e81bc6e9f3069344290341720e9178cde2407113ac8d89"

[[package]]
name = "semver"
version = "1.0.27"
version = "1.0.28"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d767eb0aabc880b29956c35734170f26ed551a859dbd361d140cdbeca61ab1e2"
checksum = "8a7852d02fc848982e0c167ef163aaff9cd91dc640ba85e263cb1ce46fae51cd"

[[package]]
name = "serde"
@@ -2399,9 +2341,18 @@ dependencies = [

[[package]]
name = "serde_spanned"
version = "1.1.0"
version = "0.6.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "876ac351060d4f882bb1032b6369eb0aef79ad9df1ea8bc404874d8cc3d0cd98"
checksum = "bf41e0cfaf7226dca15e8197172c295a782857fcb97fad1808a166870dee75a3"
dependencies = [
"serde",
]

[[package]]
name = "serde_spanned"
version = "1.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6662b5879511e06e8999a8a235d848113e942c9124f211511b16466ee2995f26"
dependencies = [
"serde_core",
]
@@ -2420,7 +2371,7 @@ dependencies = [

[[package]]
name = "sgclaw"
version = "0.1.0"
version = "0.1.0-2026.4.9"
dependencies = [
"anyhow",
"async-trait",
@@ -2428,14 +2379,19 @@ dependencies = [
"futures-util",
"hex",
"hmac",
"regex",
"reqwest",
"serde",
"serde_json",
"sha2",
"tempfile",
"thiserror 1.0.69",
"tokio",
"toml 0.8.23",
"tungstenite 0.29.0",
"uuid",
"zeroclawlabs",
"zip 0.6.6",
]

[[package]]
@@ -2502,9 +2458,9 @@ dependencies = [

[[package]]
name = "simd-adler32"
version = "0.3.8"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e320a6c5ad31d271ad523dcf3ad13e2767ad8b1cb8f047f75a8aeaf8da139da2"
checksum = "703d5c7ef118737c72f1af64ad2f6f8c5e1921f818cdcb97b8fe6fc69bf66214"

[[package]]
name = "siphasher"
@@ -2540,20 +2496,6 @@ version = "1.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6ce2be8dc25455e1f91df71bfa12ad37d7af1092ae736f3a6cd0e37bc7810596"

[[package]]
name = "stacker"
version = "0.1.23"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "08d74a23609d509411d10e2176dc2a4346e3b4aea2e7b1869f19fdedbc71c013"
dependencies = [
"cc",
"cfg-if",
"libc",
"psm",
"windows-sys 0.52.0",
"windows-sys 0.59.0",
]

[[package]]
name = "stop-token"
version = "0.7.0"
@@ -2684,9 +2626,9 @@ dependencies = [

[[package]]
name = "tinystr"
version = "0.8.2"
version = "0.8.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "42d3e9c45c09de15d06dd8acf5f4e0e399e85927b7f00711024eb7ae10fa4869"
checksum = "c8323304221c2a851516f22236c5722a72eaa19749016521d6dff0824447d96d"
dependencies = [
"displaydoc",
"zerovec",
@@ -2709,9 +2651,9 @@ checksum = "1f3ccbac311fea05f86f61904b462b55fb3df8837a366dfc601a0161d0532f20"

[[package]]
name = "tokio"
version = "1.50.0"
version = "1.51.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "27ad5e34374e03cfffefc301becb44e9dc3c17584f414349ebe29ed26661822d"
checksum = "f66bf9585cda4b724d3e78ab34b73fb2bbaba9011b9bfdf69dc836382ea13b8c"
dependencies = [
"bytes",
"libc",
@@ -2725,9 +2667,9 @@ dependencies = [

[[package]]
name = "tokio-macros"
version = "2.6.1"
version = "2.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5c55a2eff8b69ce66c84f85e1da1c233edc36ceb85a2058d11b0d6a3c7e7569c"
checksum = "385a6cb71ab9ab790c5fe8d67f1645e6c450a7ce006a33de03daa956cf70a496"
dependencies = [
"proc-macro2",
"quote",
@@ -2811,42 +2753,83 @@ dependencies = [

[[package]]
name = "toml"
version = "1.1.0+spec-1.1.0"
version = "0.8.23"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f8195ca05e4eb728f4ba94f3e3291661320af739c4e43779cbdfae82ab239fcc"
checksum = "dc1beb996b9d83529a9e75c17a1686767d148d70663143c7854d8b4a09ced362"
dependencies = [
"serde",
"serde_spanned 0.6.9",
"toml_datetime 0.6.11",
"toml_edit",
]

[[package]]
name = "toml"
version = "1.1.2+spec-1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "81f3d15e84cbcd896376e6730314d59fb5a87f31e4b038454184435cd57defee"
dependencies = [
"indexmap",
"serde_core",
"serde_spanned",
"toml_datetime",
"serde_spanned 1.1.1",
"toml_datetime 1.1.1+spec-1.1.0",
"toml_parser",
"toml_writer",
"winnow 1.0.0",
"winnow 1.0.1",
]

[[package]]
name = "toml_datetime"
version = "1.1.0+spec-1.1.0"
version = "0.6.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "97251a7c317e03ad83774a8752a7e81fb6067740609f75ea2b585b569a59198f"
checksum = "22cddaf88f4fbc13c51aebbf5f8eceb5c7c5a9da2ac40a13519eb5b0a0e8f11c"
dependencies = [
"serde",
]

[[package]]
name = "toml_datetime"
version = "1.1.1+spec-1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3165f65f62e28e0115a00b2ebdd37eb6f3b641855f9d636d3cd4103767159ad7"
dependencies = [
"serde_core",
]

[[package]]
name = "toml_parser"
version = "1.1.0+spec-1.1.0"
name = "toml_edit"
version = "0.22.27"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2334f11ee363607eb04df9b8fc8a13ca1715a72ba8662a26ac285c98aabb4011"
checksum = "41fe8c660ae4257887cf66394862d21dbca4a6ddd26f04a3560410406a2f819a"
dependencies = [
"winnow 1.0.0",
"indexmap",
"serde",
"serde_spanned 0.6.9",
"toml_datetime 0.6.11",
"toml_write",
"winnow 0.7.15",
]

[[package]]
name = "toml_writer"
version = "1.1.0+spec-1.1.0"
name = "toml_parser"
version = "1.1.2+spec-1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d282ade6016312faf3e41e57ebbba0c073e4056dab1232ab1cb624199648f8ed"
checksum = "a2abe9b86193656635d2411dc43050282ca48aa31c2451210f4202550afb7526"
dependencies = [
"winnow 1.0.1",
]

[[package]]
name = "toml_write"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5d99f8c9a7727884afe522e9bd5edbfc91a3312b36a77b5fb8926e4c31a41801"

[[package]]
name = "toml_writer"
version = "1.1.1+spec-1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "756daf9b1013ebe47a8776667b466417e2d4c5679d441c26230efd9ef78692db"

[[package]]
name = "tower"
@@ -3067,9 +3050,9 @@ checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821"

[[package]]
name = "uuid"
version = "1.22.0"
version = "1.23.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a68d3c8f01c0cfa54a75291d83601161799e4a89a39e0929f4b0354d88757a37"
checksum = "5ac8b6f42ead25368cf5b098aeb3dc8a1a2c05a3eee8a9a1a68c640edbfc79d9"
dependencies = [
"getrandom 0.4.2",
"js-sys",
@@ -3133,9 +3116,9 @@ dependencies = [

[[package]]
name = "wasm-bindgen"
version = "0.2.114"
version = "0.2.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6532f9a5c1ece3798cb1c2cfdba640b9b3ba884f5db45973a6f442510a87d38e"
checksum = "0551fc1bb415591e3372d0bc4780db7e587d84e2a7e79da121051c5c4b89d0b0"
dependencies = [
"cfg-if",
"once_cell",
@@ -3146,23 +3129,19 @@ dependencies = [

[[package]]
name = "wasm-bindgen-futures"
version = "0.4.64"
version = "0.4.67"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e9c5522b3a28661442748e09d40924dfb9ca614b21c00d3fd135720e48b67db8"
checksum = "03623de6905b7206edd0a75f69f747f134b7f0a2323392d664448bf2d3c5d87e"
dependencies = [
"cfg-if",
"futures-util",
"js-sys",
"once_cell",
"wasm-bindgen",
"web-sys",
]

[[package]]
name = "wasm-bindgen-macro"
version = "0.2.114"
version = "0.2.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "18a2d50fcf105fb33bb15f00e7a77b772945a2ee45dcf454961fd843e74c18e6"
checksum = "7fbdf9a35adf44786aecd5ff89b4563a90325f9da0923236f6104e603c7e86be"
dependencies = [
"quote",
"wasm-bindgen-macro-support",
@@ -3170,9 +3149,9 @@ dependencies = [

[[package]]
name = "wasm-bindgen-macro-support"
version = "0.2.114"
version = "0.2.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "03ce4caeaac547cdf713d280eda22a730824dd11e6b8c3ca9e42247b25c631e3"
checksum = "dca9693ef2bab6d4e6707234500350d8dad079eb508dca05530c85dc3a529ff2"
dependencies = [
"bumpalo",
"proc-macro2",
@@ -3183,9 +3162,9 @@ dependencies = [

[[package]]
name = "wasm-bindgen-shared"
version = "0.2.114"
version = "0.2.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "75a326b8c223ee17883a4251907455a2431acc2791c98c26279376490c378c16"
checksum = "39129a682a6d2d841b6c429d0c51e5cb0ed1a03829d8b3d1e69a011e62cb3d3b"
dependencies = [
"unicode-ident",
]
@@ -3239,9 +3218,9 @@ dependencies = [

[[package]]
name = "web-sys"
version = "0.3.91"
version = "0.3.94"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "854ba17bb104abfb26ba36da9729addc7ce7f06f5c0f90f3c391f8461cca21f9"
checksum = "cd70027e39b12f0849461e08ffc50b9cd7688d942c1c8e3c7b22273236b4dd0a"
dependencies = [
"js-sys",
"wasm-bindgen",
@@ -3361,15 +3340,6 @@ dependencies = [
"windows-targets 0.52.6",
]

[[package]]
name = "windows-sys"
version = "0.59.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1e38bc4d79ed67fd075bcc251a1c39b32a1776bbe92e5bef1f0bf1f8c531853b"
dependencies = [
"windows-targets 0.52.6",
]

[[package]]
name = "windows-sys"
version = "0.60.2"
@@ -3528,9 +3498,18 @@ dependencies = [

[[package]]
name = "winnow"
version = "1.0.0"
version = "0.7.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a90e88e4667264a994d34e6d1ab2d26d398dcdca8b7f52bec8668957517fc7d8"
checksum = "df79d97927682d2fd8adb29682d1140b343be4ac0f08fd68b7765d9c059d3945"
dependencies = [
"memchr",
]

[[package]]
name = "winnow"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "09dac053f1cd375980747450bfc7250c264eaae0583872e845c0c7cd578872b5"

[[package]]
name = "wit-bindgen"
@@ -3622,9 +3601,9 @@ dependencies = [

[[package]]
name = "writeable"
version = "0.6.2"
version = "0.6.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9edde0db4769d2dc68579893f2306b26c6ecfbe0ef499b013d731b7b9247e0b9"
checksum = "1ffae5123b2d3fc086436f8834ae3ab053a283cfac8fe0a0b8eaae044768a4c4"

[[package]]
name = "xattr"
@@ -3638,9 +3617,9 @@ dependencies = [

[[package]]
name = "yoke"
version = "0.8.1"
version = "0.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "72d6e5c6afb84d73944e5cedb052c4680d5657337201555f9f2a16b7406d4954"
checksum = "abe8c5fda708d9ca3df187cae8bfb9ceda00dd96231bed36e445a1a48e66f9ca"
dependencies = [
"stable_deref_trait",
"yoke-derive",
@@ -3649,9 +3628,9 @@ dependencies = [
|
||||
|
||||
[[package]]
|
||||
name = "yoke-derive"
|
||||
version = "0.8.1"
|
||||
version = "0.8.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "b659052874eb698efe5b9e8cf382204678a0086ebf46982b79d6ca3182927e5d"
|
||||
checksum = "de844c262c8848816172cef550288e7dc6c7b7814b4ee56b3e1553f275f1858e"
|
||||
dependencies = [
|
||||
"proc-macro2",
|
||||
"quote",
|
||||
@@ -3719,7 +3698,7 @@ dependencies = [
|
||||
"tokio-stream",
|
||||
"tokio-tungstenite 0.29.0",
|
||||
"tokio-util",
|
||||
"toml",
|
||||
"toml 1.1.2+spec-1.1.0",
|
||||
"tower",
|
||||
"tower-http",
|
||||
"tracing",
|
||||
@@ -3728,23 +3707,23 @@ dependencies = [
|
||||
"uuid",
|
||||
"webpki-roots 1.0.6",
|
||||
"which",
|
||||
"zip",
|
||||
"zip 8.5.1",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zerocopy"
|
||||
version = "0.8.47"
|
||||
version = "0.8.48"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "efbb2a062be311f2ba113ce66f697a4dc589f85e78a4aea276200804cea0ed87"
|
||||
checksum = "eed437bf9d6692032087e337407a86f04cd8d6a16a37199ed57949d415bd68e9"
|
||||
dependencies = [
|
||||
"zerocopy-derive",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zerocopy-derive"
|
||||
version = "0.8.47"
|
||||
version = "0.8.48"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "0e8bc7269b54418e7aeeef514aa68f8690b8c0489a06b0136e5f57c4c5ccab89"
|
||||
checksum = "70e3cd084b1788766f53af483dd21f93881ff30d7320490ec3ef7526d203bad4"
|
||||
dependencies = [
|
||||
"proc-macro2",
|
||||
"quote",
|
||||
@@ -3753,18 +3732,18 @@ dependencies = [
|
||||
|
||||
[[package]]
|
||||
name = "zerofrom"
|
||||
version = "0.1.6"
|
||||
version = "0.1.7"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "50cc42e0333e05660c3587f3bf9d0478688e15d870fab3346451ce7f8c9fbea5"
|
||||
checksum = "69faa1f2a1ea75661980b013019ed6687ed0e83d069bc1114e2cc74c6c04c4df"
|
||||
dependencies = [
|
||||
"zerofrom-derive",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zerofrom-derive"
|
||||
version = "0.1.6"
|
||||
version = "0.1.7"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "d71e5d6e06ab090c67b5e44993ec16b72dcbaabc526db883a360057678b48502"
|
||||
checksum = "11532158c46691caf0f2593ea8358fed6bbf68a0315e80aae9bd41fbade684a1"
|
||||
dependencies = [
|
||||
"proc-macro2",
|
||||
"quote",
|
||||
@@ -3780,9 +3759,9 @@ checksum = "b97154e67e32c85465826e8bcc1c59429aaaf107c1e4a9e53c8d8ccd5eff88d0"
|
||||
|
||||
[[package]]
|
||||
name = "zerotrie"
|
||||
version = "0.2.3"
|
||||
version = "0.2.4"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "2a59c17a5562d507e4b54960e8569ebee33bee890c70aa3fe7b97e85a9fd7851"
|
||||
checksum = "0f9152d31db0792fa83f70fb2f83148effb5c1f5b8c7686c3459e361d9bc20bf"
|
||||
dependencies = [
|
||||
"displaydoc",
|
||||
"yoke",
|
||||
@@ -3791,9 +3770,9 @@ dependencies = [
|
||||
|
||||
[[package]]
|
||||
name = "zerovec"
|
||||
version = "0.11.5"
|
||||
version = "0.11.6"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "6c28719294829477f525be0186d13efa9a3c602f7ec202ca9e353d310fb9a002"
|
||||
checksum = "90f911cbc359ab6af17377d242225f4d75119aec87ea711a880987b18cd7b239"
|
||||
dependencies = [
|
||||
"yoke",
|
||||
"zerofrom",
|
||||
@@ -3802,9 +3781,9 @@ dependencies = [
|
||||
|
||||
[[package]]
|
||||
name = "zerovec-derive"
|
||||
version = "0.11.2"
|
||||
version = "0.11.3"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "eadce39539ca5cb3985590102671f2567e659fca9666581ad3411d59207951f3"
|
||||
checksum = "625dc425cab0dca6dc3c3319506e6593dcb08a9f387ea3b284dbd52a92c40555"
|
||||
dependencies = [
|
||||
"proc-macro2",
|
||||
"quote",
|
||||
@@ -3813,9 +3792,21 @@ dependencies = [
|
||||
|
||||
[[package]]
|
||||
name = "zip"
|
||||
version = "8.4.0"
|
||||
version = "0.6.6"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "7756d0206d058333667493c4014f545f4b9603c4330ccd6d9b3f86dcab59f7d9"
|
||||
checksum = "760394e246e4c28189f19d488c058bf16f564016aefac5d32bb1f3b51d5e9261"
|
||||
dependencies = [
|
||||
"byteorder",
|
||||
"crc32fast",
|
||||
"crossbeam-utils",
|
||||
"flate2",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zip"
|
||||
version = "8.5.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "dcab981e19633ebcf0b001ddd37dd802996098bc1864f90b7c5d970ce76c1d59"
|
||||
dependencies = [
|
||||
"crc32fast",
|
||||
"flate2",
|
||||
@@ -3838,9 +3829,9 @@ checksum = "cb8a0807f7c01457d0379ba880ba6322660448ddebc890ce29bb64da71fb40f9"
|
||||
|
||||
[[package]]
|
||||
name = "zune-jpeg"
|
||||
version = "0.5.14"
|
||||
version = "0.5.15"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "0b7a1c0af6e5d8d1363f4994b7a091ccf963d8b694f7da5b0b9cceb82da2c0a6"
|
||||
checksum = "27bc9d5b815bc103f142aa054f561d9187d191692ec7c2d1e2b4737f8dbd7296"
|
||||
dependencies = [
|
||||
"zune-core",
|
||||
]
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
[package]
|
||||
name = "sgclaw"
|
||||
version = "0.1.0"
|
||||
version = "0.1.0-2026.4.9"
|
||||
edition = "2021"
|
||||
|
||||
[dependencies]
|
||||
@@ -11,10 +11,17 @@ futures-util = "0.3"
|
||||
hex = "0.4"
|
||||
hmac = "0.12"
|
||||
reqwest = { version = "0.12", default-features = false, features = ["blocking", "json", "rustls-tls"] }
|
||||
regex = "1.10"
|
||||
serde = { version = "1", features = ["derive"] }
|
||||
serde_json = "1"
|
||||
sha2 = "0.10"
|
||||
thiserror = "1"
|
||||
tokio = { version = "1", default-features = false, features = ["rt-multi-thread", "macros"] }
|
||||
toml = "0.8"
|
||||
tungstenite = "0.29"
|
||||
uuid = { version = "1", features = ["v4"] }
|
||||
zip = { version = "0.6.6", default-features = false, features = ["deflate"] }
|
||||
zeroclaw = { package = "zeroclawlabs", path = "third_party/zeroclaw", default-features = false }
|
||||
|
||||
[dev-dependencies]
|
||||
tempfile = "3"
|
||||
|
||||
BIN
docs/2026-04-18-102-scenes-validation-overview.xlsx
Normal file
Binary file not shown.
@@ -8,15 +8,15 @@
|
||||
|
||||
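## 1. Product definition

sgClaw is a browser-agent execution kernel embedded in an enterprise browser runtime environment. Its job is not to replace the whole platform, nor to promise a "fully automated digital employee", but to turn natural-language tasks into controlled browser operations and carry them out on the page through the existing browser host.
sgClaw is not a browser-only product in the "browser-agent execution kernel" sense. More precisely, it is a security-hardened zeroclaw runtime embedded in an enterprise browser runtime environment: zeroclaw provides the agent capabilities themselves, and sgClaw wraps those capabilities inside a controlled execution boundary.

After the ZeroClaw refactor, sgClaw's product shape comes down to three things:
Its job is not to replace the whole platform, nor to promise a "fully automated digital employee", but to:

1. Route user tasks into a unified Agent execution entry point.
2. Translate intent into browser commands through the fixed `browser_action` tool.
3. Carry out auditable page operations under protocol, domain, and action allowlist constraints.
1. Provide, with zeroclaw at the core, a unified Agent execution entry point plus prompt/skills/memory/tool routing capabilities.
2. Use sgClaw's pipe, HMAC, MAC Policy, and host-side re-validation to narrow the high-risk execution surface into a protected tool surface.
3. In browser scenarios, expose page-operation capability to the runtime as a privileged tool surface, rather than letting the browser define the entire runtime.

The sgClaw in this repository is neither a complete frontend product nor the browser distribution itself; it is the product core made up of "browser Agent Runtime + Pipe protocol + ZeroClaw compatibility layer".
The sgClaw in this repository is neither a complete frontend product nor the browser distribution itself; it is the product core made up of "vendored zeroclaw core + sgClaw security wrapper + browser host integration".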
|
||||
|
||||
---
|
||||
|
||||
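@@ -24,17 +24,20 @@ After the ZeroClaw refactor, sgClaw's product shape comes down to three things:

### 2.1 Capabilities already in place

- `zeroclaw` is vendored, and the Rust runtime hooks into its core entry points for provider, skills, memory, and related capabilities.
- The browser side communicates with the Rust process over a STDIO JSON Line protocol.
- At startup an `init -> init_ack` handshake runs and establishes a session-level HMAC key.
- All task input goes through the `submit_task` message.
- The Rust side supports two execution paths:
  - With no LLM configured, it uses the repository's built-in planner/fallback logic.
  - With the `DEEPSEEK_*` environment variables set, it switches to the ZeroClaw compatibility runtime.
- The effective tool surface currently narrows to a single tool: `browser_action`.
- The privileged execution surface that is actually stable today narrows to a single tool: `browser_action`.
- Only 4 actions are actually exposed to the model today: `click`, `type`, `navigate`, `getText`.
- All browser actions are constrained by the domain and action allowlists in `resources/rules.json`.
- During execution, structured logs and the final task result are sent to the host.

Note: the "browser-first + compat-only" state above describes the current implementation, not the target product definition. The mainline goal remains a "security-hardened zeroclaw runtime", not "an agent that can only drive a browser".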
|
||||
|
||||
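### 2.2 Capabilities explicitly not claimed at present

The following items were described at length as plans in older documents, but they are not implemented facts in the current repository:
@@ -45,6 +48,7 @@ After the ZeroClaw refactor, sgClaw's product shape comes down to three things:
- A complete browser Side Panel product UI.
- All 40+ page actions exposed on the Agent side.
- Production-grade multi-tenancy, an audit backend, or a task orchestration center.
- Any definition along the lines of "the browser is sgClaw's entire runtime".

These capabilities can be kept as future extension directions, but they should no longer appear in L0-L4 as descriptions of the current state.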
|
||||
|
||||
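@@ -52,7 +56,7 @@ After the ZeroClaw refactor, sgClaw's product shape comes down to three things:

## 3. Product value proposition

After the ZeroClaw refactor, sgClaw's core value lies not in "feature stacking" but in consolidating previously scattered browser automation capabilities into a controllable, replaceable, and verifiable agent execution base.
After the ZeroClaw refactor, sgClaw's core value lies neither in "feature stacking" nor in "plain browser automation", but in placing zeroclaw's agent capabilities inside a controllable, replaceable, and verifiable secure execution base.

### 3.1 For the business side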
|
||||
|
||||
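@@ -62,15 +66,15 @@ After the ZeroClaw refactor, sgClaw's core value lies not in "feature stacking", but in

### 3.2 For the integration side

- The browser host only needs to implement a fixed protocol and does not need to understand the model's internals.
- The Agent Runtime can switch implementation strategies while keeping the host protocol.
- The ZeroClaw compatibility layer reserves, on the Rust side, the upgrade entry points for future models, memory, and tool scheduling.
- The browser host only needs to implement a fixed protocol and host-side security rechecks, and does not need to understand the model's internals.
- The sgClaw Runtime can evolve zeroclaw configuration, skills, and tool policy while keeping the host protocol.
- The browser is just one protected execution surface; other future clients should reuse the same runtime rather than start a separate architecture.

### 3.3 For the security side

- Not "the model can operate the browser at will", but "the model can only invoke permitted actions".
- The security boundary is moved forward into the protocol and MAC Policy instead of being left to the prompt.
- Domain, action, and HMAC controls together form the minimal trusted execution surface.
- Not "the model can operate the browser at will", but "the runtime can trigger high-risk actions only through the authorized privileged tool surface".
- The security boundary is moved forward into the protocol, MAC Policy, host-side re-validation, and runtime tool policy instead of being left to the prompt.
- Domain, action, HMAC, and tool-exposure policy together form the minimal trusted execution surface.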
|
||||
|
||||
---
|
||||
|
||||
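@@ -80,15 +84,16 @@ After the ZeroClaw refactor, sgClaw's core value lies not in "feature stacking", but in
|---|---|---|
| Task intake | Implemented | Receives `submit_task` instructions from the browser host |
| Protocol handshake | Implemented | Agrees on version and session identifier, exchanges the HMAC seed |
| Agent execution | Implemented | planner fallback and ZeroClaw compat coexist |
| Browser tool | Implemented | A single `browser_action` tool |
| zeroclaw Core | Integrated but not fully unlocked | The capabilities themselves (prompt, skills, memory, provider routing) are already in the repository |
| Agent execution | Implemented but still transitional | planner fallback and browser-first ZeroClaw compat coexist |
| Privileged browser tool surface | Implemented | Exposes protected browser execution through `browser_action` |
| Core actions | Implemented | `click/type/navigate/getText` |
| Domain allowlist | Implemented | Only domains in the rules file are allowed |
| Action allowlist | Implemented | Only actions in the rules file are allowed |
| Structured logs | Implemented | `log_entry` and `task_complete` are sent back |
| Extended action enum | Reserved | Defined in the protocol enum but not exposed by default |
| Skill engine | Not implemented separately | Currently only a semantic entry point "extensible through tools and prompts" is kept |
| MCP ecosystem | Not enabled on the main path | The ZeroClaw compatibility layer reserves a slot for later |
| Skill system | Vendored but only partly used at runtime | Should reuse zeroclaw's native mechanism rather than a separate browser-specific skill system |
| MCP ecosystem | Not enabled on the main path | May be integrated later, but must still obey the sgClaw security wrapper |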
|
||||
|
||||
---
|
||||
|
||||
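@@ -109,13 +114,13 @@ After the ZeroClaw refactor, sgClaw's core value lies not in "feature stacking", but in
When element locator rules are explicit, the system can combine `click` and `type` to fill in forms, click buttons, and perform simple submissions.
Whether a full business flow can be covered depends on whether the browser host provides the corresponding pages, selectors, and response data, not on a documentation-level assumption that "every flow can run end to end".

### 5.3 As the Agent execution core inside a larger product
### 5.3 As the secure execution core inside a larger product

sgClaw is better understood as an execution core within a product base:

- The upper layer can plug in a task input box, an approval entry point, or a business orchestrator.
- The lower layer executes through the existing browser control surface.
- In between, sgClaw connects natural language to browser actions.
- The lower layer can execute through the existing browser control surface and may later plug in other protected tool surfaces.
- In between, sgClaw connects the zeroclaw runtime to externally controlled execution surfaces.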
|
||||
|
||||
---
|
||||
|
||||
|
||||
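@@ -1,32 +1,43 @@
# L1 - System Architecture and Security Model Layer

**Document version**: 2.0
**Project**: sgClaw (ZeroClaw refactor edition)
**Date**: 2026-03-26
**Document version**: 2.1<br>
**Project**: sgClaw (ZeroClaw refactor edition)<br>
**Date**: 2026-03-29

---

## 1. Architecture overview

The gist of the refactored sgClaw architecture is simple: the browser host handles page execution, the Rust process handles task interpretation and protocol orchestration, and ZeroClaw is plugged into the Rust side as a compatibility runtime rather than directly replacing the whole system.
The refactored sgClaw architecture should be understood as follows: `host` (SuperRPA) is the protected host security boundary, `sgClaw` is the runtime capability orchestration layer, `zeroclaw` is the planner / model / skill capability itself, and the `frontend bundle` is only a presentation surface. The current code has not fully reached this target structure, but the mainline architectural vocabulary must be unified first.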
|
||||
|
||||
```
|
||||
┌──────────────────────────────┐
│ Browser Host / Chromium Side │
│ - Read launch config         │
│ - Spawn sgClaw subprocess    │
│ - Send init / submit_task    │
│ - Run command, send response │
│ - Recheck HMAC/domain/params │
│ - Run browser cmd, reply     │
│ - Select frontend bundle     │
└──────────────┬───────────────┘
               │ STDIO + JSON Line
┌──────────────▼───────────────┐
│ sgClaw Rust Runtime          │
│ sgClaw Security Envelope     │
│ - Handshake and message loop │
│ - MAC Policy                 │
│ - BrowserPipeTool            │
│ - Planner fallback           │
│ - ZeroClaw compat runtime    │
│ - Runtime / Tool Policy      │
│ - Config Adaptation          │
└──────────────┬───────────────┘
               │ Provider API / Local Config
               │ zeroclaw APIs / Local Config
┌──────────────▼───────────────┐
│ ZeroClaw Core Runtime        │
│ - Prompt Builder             │
│ - Skills / Memory            │
│ - Tool Loop / Routing        │
│ - Provider Dispatch          │
└──────────────┬───────────────┘
               │ Provider API / Optional Tools
┌──────────────▼───────────────┐
│ Model Provider               │
│ - DeepSeek/OpenAI-compatible │
|
||||
@@ -34,23 +45,25 @@
|
||||
└──────────────────────────────┘
|
||||
```
|
||||
|
||||
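The most important architectural change is that the current system is not a "complete ZeroClaw product"; rather, it "introduces ZeroClaw as a compatible execution kernel while keeping the existing browser protocol".
The most important architectural change is that sgClaw should not be defined as a "browser-specific agent" but as "a runtime distribution that wraps zeroclaw in security hardening while keeping the existing browser protocol". In addition, `host` no longer holds frequently changing business policy; `launch config`, `runtime config`, and `frontend bundle` should become objects that can be swapped at run time.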
|
||||
|
||||
---
|
||||
|
||||
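## 2. Runtime layering

### 2.1 Browser host layer
### 2.1 host (browser host) layer

The host has three kinds of responsibility:
`host` has four kinds of responsibility inside the protected boundary:

- Read and validate the `launch config` to decide how the process is launched.
- Start and supervise the sgClaw Rust subprocess.
- Send `init`, `submit_task`, and `response` according to the protocol.
- Execute browser commands sent by Rust and return the responses.
- Choose an external `frontend bundle` or built-in resources as the presentation surface.

The sgClaw repository itself contains no Chromium/C++ implementation code, so L1 only defines the host's responsibility boundary and no longer describes file structures assumed to exist in external repositories as "the current state of this repository".
`host` does not own the planner, model routing, skills orchestration, or business policy definitions. The sgClaw repository itself contains no Chromium/C++ implementation code, so L1 only defines the host's responsibility boundary and no longer describes file structures assumed to exist in external repositories as "the current state of this repository".

### 2.2 Rust control layer
### 2.2 sgClaw security/control layer

The Rust side is the factual core of the current repository. Its responsibilities include:

@@ -58,10 +71,40 @@ The Rust side is the factual core of the current repository. Its responsibilities include:
- Complete the handshake, load `rules.json`, and create the `BrowserPipeTool`.
- Receive browser messages in the message loop and dispatch them to the execution layer.
- Send execution logs and task results back to the host.
- Read the `runtime config` to decide run-time behavior such as planner, provider, skills prompt mode, and backend selection.
- Decide which zeroclaw capabilities may be exposed to the current runtime environment.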
|
||||
|
||||
### 2.3 Execution layer
### 2.3 zeroclaw core layer

The execution layer currently has two paths:
In the mainline target, zeroclaw should own:

- planner / executor decisions
- prompt/system sections assembly
- skills / memory / routing
- tool loop and provider coordination
- general agent capabilities, not only browser capabilities

In other words, `planner-first` is sgClaw / zeroclaw runtime behavior, not `frontend bundle` or `host` page logic.

### 2.4 Presentation layer (frontend bundle)

The frontend presentation layer has display rights only, not execution rights. Its responsibilities should be limited to:

- Display session state, logs, messages, and acceptance results.
- Turn user input into events the host can accept.
- Render the visual state of `planner-first`, for example "show the plan first, then execute".

The frontend should not decide:

- Whether to invoke the planner
- Whether to switch provider / model
- Whether to bypass sgClaw / zeroclaw execution

These decisions must stay in the sgClaw / zeroclaw runtime.

### 2.5 Transitional state of the current implementation

The current execution layer still has two paths:

1. `planner fallback`
   Note: when environment variables such as `DEEPSEEK_API_KEY` are not set, the repository's built-in lightweight planner executes the task.
|
||||
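@@ -69,23 +112,38 @@ The Rust side is the factual core of the current repository. Its responsibilities include:
2. `ZeroClaw compat runtime`
   Note: once model configuration is provided, [`src/compat/runtime.rs`](/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs) builds the provider, memory, and the `browser_action` tool, then hands the task to the vendored ZeroClaw Agent.

These two paths coexist, and that is the core reality of the current refactoring period. The documentation must preserve this point, otherwise it will mislead implementation and integration testing.
These two paths are the current code reality, but neither should be written up as the long-term product definition. The long-term goal is "zeroclaw-first runtime + sgClaw security layer", not browser-only compat.

### 2.6 Runtime configuration boundaries

After this round of architecture freeze, the following unified terms must be used:

- `launch config`
  Read by `host`; describes launch-time boundaries such as `binary`, `args`, `env`, `working_dir`, `runtime_config_path`, and `frontend_bundle_dir`.
- `runtime config`
  Read by sgClaw; describes run-time behavior such as provider, model, planner mode, backend selection, and skills policy.
- `frontend bundle`
  A presentation resource directory loaded by `host`; an external bundle takes priority and built-in resources serve as the fallback.

All three kinds of object should support run-time adjustment; only the security boundary and capability exposure are fixed by `host`.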
|
||||
|
||||
---
|
||||
|
||||
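## 3. Architectural significance of the ZeroClaw refactor

ZeroClaw's role in this project is not "a do-everything framework that takes over everything"; it solves three concrete problems:
ZeroClaw's role in this project is neither "a do-everything framework that takes over everything" nor "whatever is left as a single browser_action tool after sgClaw wraps it"; it is the system's capability core. sgClaw should solve three concrete problems on top of it:

- Unify the model Provider abstraction.
- Leave standard extension points for future memory, tool scheduling, and observability.
- Replace the task execution kernel without changing the browser protocol.
- Unify the model Provider abstraction, skills, memory, and tool loop.
- Confine high-risk execution to a protected tool surface without changing the browser protocol.
- Make the browser a privileged execution surface, rather than letting the browser define the entire runtime.

The limits of the current compatibility layer must also be stated clearly:

- Only one tool is registered: `browser_action`.
- Only 4 actions are exposed: `click/type/navigate/getText`.
- ZeroClaw's full tool ecosystem is not claimed as an externally offered capability.
- ZeroClaw's full tool ecosystem is not claimed as a currently offered external capability.

These limits describe the current implementation state, not target architecture principles.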
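
As a rough illustration of how the two allowlists combine, the following is a minimal sketch and not the code in `src/security/mac_policy.rs`. The `MacPolicy` struct layout, the `PolicyError` enum, and `default_policy` are hypothetical names introduced here; only the four action names and the example domains come from the document.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory form of the allowlists kept in `resources/rules.json`.
/// The field names are illustrative and do not reflect the real schema.
pub struct MacPolicy {
    allowed_domains: HashSet<String>,
    allowed_actions: HashSet<String>,
}

#[derive(Debug, PartialEq)]
pub enum PolicyError {
    DomainDenied(String),
    ActionDenied(String),
}

impl MacPolicy {
    pub fn new(domains: &[&str], actions: &[&str]) -> Self {
        Self {
            // Domains are compared case-insensitively in this sketch.
            allowed_domains: domains.iter().map(|d| d.to_ascii_lowercase()).collect(),
            allowed_actions: actions.iter().map(|a| a.to_string()).collect(),
        }
    }

    /// Both checks must pass before a command may be sent over the pipe.
    pub fn check(&self, domain: &str, action: &str) -> Result<(), PolicyError> {
        if !self.allowed_actions.contains(action) {
            return Err(PolicyError::ActionDenied(action.to_string()));
        }
        if !self.allowed_domains.contains(&domain.to_ascii_lowercase()) {
            return Err(PolicyError::DomainDenied(domain.to_string()));
        }
        Ok(())
    }
}

/// Assumed defaults: the four exposed actions plus the example domains
/// that the document lists for the default rules file.
pub fn default_policy() -> MacPolicy {
    MacPolicy::new(
        &["oa.example.com", "erp.example.com", "hr.example.com"],
        &["click", "type", "navigate", "getText"],
    )
}
```

The point of the sketch is that the decision depends only on the rules data, never on what the model asked for, which matches the "rules file, not model intent" stance described for the MAC Policy.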
|
||||
|
||||
---
|
||||
|
||||
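@@ -142,21 +200,38 @@ ZeroClaw's role in this project is not "a do-everything framework that takes over everything", but
Reasons:

- Lowest integration-testing cost for the browser host.
- The Rust side can iterate on the planner and ZeroClaw paths independently.
- The Rust side can evolve the zeroclaw runtime and security policy independently without breaking host integration.
- Product documentation, tests, and protocol standards can converge on the same contract.

### 5.3 Build the minimal tool surface first, then add actions
### 5.3 Build the minimal privileged tool surface first, then add actions

Reasons:

- The most stable set today is `click/type/navigate/getText`.
- The more actions there are, the harder it is to stabilize the contract between host and model.
- The more actions there are, the harder it is to stabilize the contract between host and runtime.
- While the rules file still exposes only 4 actions, the documentation should not inflate the capability scope ahead of time.

### 5.4 Make the launch strategy configurable instead of hard-coding it at compile time

Reasons:

- Swapping the sgClaw binary should not require rebuilding Chromium.
- Switching the `runtime config` should not require rebuilding Chromium.
- Swapping the `frontend bundle` should not require rebuilding Chromium.
- The security boundary is still controlled by `host`; being configurable does not mean giving up validation.

### 5.5 planner-first belongs to the runtime only, not the presentation layer

Reasons:

- "Show the plan first, then execute" is execution-engine behavior, not a frontend animation effect.
- The frontend may only display planner state; it must not trigger or skip the planner on its own.
- Acceptance must cover both visual presentation and functional execution.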
|
||||
|
||||
---
|
||||
|
||||
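## 6. Architecture conclusion

At the L1 level, sgClaw can be defined as a browser agent runtime that connects to the host through a fixed browser protocol, uses Rust as the control layer, ZeroClaw as the compatible execution core, and MAC Policy as the minimal security boundary.
At the L1 level, sgClaw can be defined as a security-hardened runtime distribution that connects to `host` through a fixed browser protocol, uses Rust as the security and control layer, zeroclaw as the capability core, `runtime config` to drive execution behavior, the `frontend bundle` as a display-only interface, and MAC Policy plus host-side rechecks as the protected execution boundary.

This definition matches the current repository implementation and also keeps clear boundaries for later extension of actions, tools, and the memory system.
This definition both acknowledges that the current repository still contains a transitional browser-first compat implementation and keeps clear boundaries for later converging the runtime onto the zeroclaw-first, planner-first mainline.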
|
||||
|
||||
@@ -1,8 +1,8 @@
|
||||
# L2 - Core Modules and Interface Contract Layer

**Document version**: 2.0
**Project**: sgClaw (ZeroClaw refactor edition)
**Date**: 2026-03-26
**Document version**: 2.1<br>
**Project**: sgClaw (ZeroClaw refactor edition)<br>
**Date**: 2026-03-29

**Audience**: architects, implementation engineers, integration engineers
|
||||
|
||||
@@ -24,14 +24,28 @@ src/
|
||||
└── security/
|
||||
```
|
||||
|
||||
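Module boundaries are divided by responsibility into four layers:
Module boundaries are divided by responsibility into five layers:

| Layer | Module | Responsibility |
|---|---|---|
| Transport layer | `pipe` | Defines messages, handshake, sequence numbers, send/receive, and command waiting |
| Control layer | `lib.rs`, `agent` | Receives tasks, selects the execution path, returns logs and results |
| Compatibility layer | `compat` | Bridges to the vendored ZeroClaw and exposes a single `browser_action` |
| Security layer | `security`, `resources/rules.json` | Domain and action allowlist control |
| Security layer | `security`, `resources/rules.json` | Domain, action, and pipe command boundary control |
| Runtime entry layer | `lib.rs`, `agent` | Receives tasks, selects the execution path, returns logs and results |
| Adaptation layer | `compat`, `config` | Maps the sgClaw host environment onto the zeroclaw runtime; still carries transitional limits |
| Core capability layer | `third_party/zeroclaw` | prompt, skills, memory, tool loop, provider abstraction |

After this freeze, a set of cross-repository unified terms is also needed:

- `host`
  The SuperRPA browser host; it owns only the security boundary, process supervision, and presentation assembly.
- `launch config`
  The launch description file read by `host`.
- `runtime config`
  The runtime policy file read by sgClaw.
- `frontend bundle`
  The frontend presentation resources loaded by `host`.
- `planner-first`
  The runtime behavior in which sgClaw / zeroclaw produces a plan first, the frontend displays it first, and execution follows.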
|
||||
|
||||
---
|
||||
|
||||
@@ -51,26 +65,27 @@ src/
|
||||
|
||||
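### 2.2 `src/agent/mod.rs`

[`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs) decides the execution path:
[`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs) currently decides the execution path:

- On `BrowserMessage::SubmitTask`, it first tries to read `DeepSeekSettings`.
- If the environment configuration exists, it goes through `compat::runtime::execute_task`.
- If the environment configuration does not exist, it goes through the built-in planner fallback.
- On `BrowserMessage::SubmitTask`, it first tries to read `SgClawSettings`.
- If the environment configuration exists, it goes through `compat::runtime::execute_task_with_sgclaw_settings`.
- If the environment configuration does not exist, it returns "no large language model configured" directly, instead of routing production submit traffic back to the old planner.

This is the current system's "router".
This is the current system's "router". In the documentation, however, it should be understood as a transitional runtime selector, not the center of the final product architecture. In the long run, `planner-first`, provider selection, and backend selection should all be driven explicitly by the `runtime config`.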
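
The routing rule in the newer version of this list can be sketched as below. This is an assumption-laden illustration, not the contents of `src/agent/mod.rs`: `Settings` is a hypothetical stand-in for `SgClawSettings`, and `Route`, `select_route`, and `handle_submit` are names invented for the example.

```rust
/// Hypothetical stand-in for `SgClawSettings`; only its presence matters here.
#[derive(Debug, Clone, PartialEq)]
pub struct Settings {
    pub model: String,
}

/// The two outcomes described for a `submit_task` message.
#[derive(Debug, PartialEq)]
pub enum Route {
    /// Settings were found: hand the task to the compat runtime.
    Compat(Settings),
    /// No settings: report the missing model configuration to the host.
    NotConfigured,
}

pub fn select_route(settings: Option<Settings>) -> Route {
    match settings {
        Some(s) => Route::Compat(s),
        None => Route::NotConfigured,
    }
}

/// Illustrative handler: it never falls back to the legacy planner.
pub fn handle_submit(instruction: &str, settings: Option<Settings>) -> Result<String, String> {
    match select_route(settings) {
        Route::Compat(s) => Ok(format!("compat runtime ({}) accepted: {}", s.model, instruction)),
        Route::NotConfigured => Err("no large language model configured".to_string()),
    }
}
```

Keeping the selection in one small function makes it easier to later replace the "settings present or absent" test with an explicit `runtime config` field, which is the direction the paragraph above describes.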
|
||||
|
||||
### 2.3 `src/agent/runtime.rs`

This file keeps the repository's lightweight LLM/tool invocation logic. Key characteristics:
This file should now be treated as a `legacy/dev-only` module that keeps only the repository's lightweight LLM/tool invocation logic for local verification. Key characteristics:

- The tool name is fixed to `browser_action`.
- The schema allows only `click/type/navigate/getText`.
- A `log_entry` is sent before and after each tool call.
- On a failed result it returns `PipeError::Protocol` directly.
- It does not take part in the current production browser submit routing.

### 2.4 `src/compat/runtime.rs`

[`src/compat/runtime.rs`](/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs) is the key module of the ZeroClaw refactor:
[`src/compat/runtime.rs`](/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs) is the key module of the current ZeroClaw integration:

- Builds the ZeroClaw config.
- Creates the provider.
@@ -81,11 +96,64 @@ src/

- The compat layer currently registers only one tool with ZeroClaw.
- `allowed_tools` is narrowed to `browser_action`.
- This means ZeroClaw is a "compatible executor" in this project, not a "multi-tool platform".
- This means the current code has not yet made sgClaw a zeroclaw-first runtime.
- This is an implementation limit and should not be elevated to a product principle by the documentation.
|
||||
|
||||
### 2.7 host / launch config / runtime config contract

Once the cross-repository interfaces are consolidated, the minimal contract between `host` and sgClaw should be:

| Object | Read by | Responsibility | Fallback on failure |
|---|---|---|---|
| `launch config` | `host` | Parse `binary`, `args`, `env`, `working_dir`, `runtime_config_path`, `frontend_bundle_dir` | Fall back to the browser's built-in default launch strategy |
| `runtime config` | sgClaw | Parse provider, model, planner mode, backend, skills policy | Fall back to sgClaw's default runtime configuration |
| `frontend bundle` | `host` | Provide floating-window/panel presentation resources | Fall back to the browser's built-in WebUI resources |

The key point here is that `host` only validates and assembles; it does not decide the planner, model routing, or business behavior.

### 2.8 launch config file paths and fallback rules

The frozen design is as follows:

- profile-local `launch config` path: `<profile>/superrpa/sgclaw_launch_config.json`
- profile-local `runtime config` path: `<profile>/superrpa/sgclaw_config.json`
- profile-local hooks / rules continue to use:
  - `<profile>/superrpa/hooks.json`
  - `<profile>/superrpa/rules.json`

The fallback rules for `launch config` must remain stable:

1. Read `<profile>/superrpa/sgclaw_launch_config.json` first
2. If it is missing or fails to parse, fall back to the browser's current built-in default launch parameters
3. If `runtime_config_path` is missing, fall back to `<profile>/superrpa/sgclaw_config.json`
4. If `frontend_bundle_dir` is missing, invalid, or fails validation, fall back to the bundled frontend resources
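
The four fallback rules can be expressed as a small pure function. The sketch below is written in Rust for consistency with this repository, although the real `host` lives in another codebase; `LaunchConfig`, `FrontendSource`, `ResolvedLaunch`, and `resolve_launch` are hypothetical names, and bundle validation is passed in as a closure because the document does not specify how it is performed.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical parsed form of `sgclaw_launch_config.json` (subset of fields).
#[derive(Debug, Default, Clone)]
pub struct LaunchConfig {
    pub binary: Option<PathBuf>,
    pub runtime_config_path: Option<PathBuf>,
    pub frontend_bundle_dir: Option<PathBuf>,
}

#[derive(Debug, PartialEq)]
pub enum FrontendSource {
    External(PathBuf),
    Bundled,
}

#[derive(Debug, PartialEq)]
pub struct ResolvedLaunch {
    pub runtime_config_path: PathBuf,
    pub frontend: FrontendSource,
    /// True when the launch config was missing or failed to parse (rule 2).
    pub used_builtin_defaults: bool,
}

/// `parsed` is `None` when the profile-local file is missing or unparsable.
pub fn resolve_launch(
    profile: &Path,
    parsed: Option<LaunchConfig>,
    bundle_is_valid: impl Fn(&Path) -> bool,
) -> ResolvedLaunch {
    let used_builtin_defaults = parsed.is_none();
    let cfg = parsed.unwrap_or_default();

    // Rule 3: a missing runtime_config_path falls back to the profile-local file.
    let runtime_config_path = cfg
        .runtime_config_path
        .unwrap_or_else(|| profile.join("superrpa").join("sgclaw_config.json"));

    // Rule 4: a missing or invalid bundle directory falls back to bundled resources.
    let frontend = match cfg.frontend_bundle_dir {
        Some(dir) if bundle_is_valid(dir.as_path()) => FrontendSource::External(dir),
        _ => FrontendSource::Bundled,
    };

    ResolvedLaunch { runtime_config_path, frontend, used_builtin_defaults }
}
```

Separating "parse the file" from "apply the fallbacks" keeps the rules testable without touching the filesystem, which helps when the stability of the fallback order is itself part of the contract.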
|
||||
|
||||
### 2.9 frontend bundle and planner-first contract

The `frontend bundle` may only consume the following runtime state forwarded by `host`:

- Current status and logs
- planner output
- Session messages
- Final execution result

The `frontend bundle` must not directly own:

- The decision to switch providers
- Control over the planner switch
- Any executor bypass capability

The `planner-first` contract should therefore be:

1. sgClaw / zeroclaw produces the plan
2. The `frontend bundle` displays the plan first
3. Execution starts only after the runtime confirms

This is a runtime contract, not an internal frontend convention.
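
One way to picture the three-step contract is as a small state machine in which the frontend can only report that the plan was shown, while confirmation and execution remain runtime-only transitions. Everything below (`PlanPhase`, `Actor`, `Event`, `advance`) is a hypothetical model for illustration; the document does not define these types or the wire messages behind them.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlanPhase {
    Planned,   // step 1: the runtime has produced a plan
    Displayed, // step 2: the frontend bundle has shown it
    Confirmed, // step 3a: the runtime has confirmed
    Executing, // step 3b: execution has started
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Actor {
    Runtime,
    Frontend,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    PlanShown,
    Confirm,
    StartExecution,
}

/// Returns the next phase, or an error when the actor or ordering is not allowed.
pub fn advance(phase: PlanPhase, actor: Actor, event: Event) -> Result<PlanPhase, &'static str> {
    match (phase, actor, event) {
        (PlanPhase::Planned, Actor::Frontend, Event::PlanShown) => Ok(PlanPhase::Displayed),
        (PlanPhase::Displayed, Actor::Runtime, Event::Confirm) => Ok(PlanPhase::Confirmed),
        (PlanPhase::Confirmed, Actor::Runtime, Event::StartExecution) => Ok(PlanPhase::Executing),
        // The frontend has display rights only: it cannot confirm or start execution.
        (_, Actor::Frontend, Event::Confirm) | (_, Actor::Frontend, Event::StartExecution) => {
            Err("frontend may only report that the plan was shown")
        }
        _ => Err("transition not allowed in this phase"),
    }
}
```

Encoding the actor in the transition table makes the "runtime contract, not frontend convention" point checkable: a frontend that tries to confirm or execute is rejected regardless of the current phase.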
|
||||
|
||||
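### 2.5 `src/pipe/browser_tool.rs`

This module is responsible for actually sending browser commands:
This module is responsible for actually sending browser commands, and it is also the most important privileged-tool-surface bridge in the current system:

- Assigns a `seq` to each command.
- Computes the HMAC.
@@ -93,11 +161,11 @@ src/
- Blocks waiting for the matching `BrowserMessage::Response`.
- Returns an error on timeout, response mismatch, or validation failure.

It is the most important protocol execution point on the Rust side.
It is the most important protocol execution point on the Rust side. Architecturally it should be understood as a "privileged browser surface adapter", not "the whole runtime itself".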
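
The sequence-number and blocking-wait behavior can be sketched with the standard library alone. This is not the code in `src/pipe/browser_tool.rs`: `Response`, `WaitError`, `SeqCounter`, and `wait_for_response` are hypothetical, the channel stands in for the STDIO pipe, and HMAC computation is deliberately left out because it needs an external crate.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical, trimmed-down response message; the real one carries more fields.
#[derive(Debug, PartialEq)]
pub struct Response {
    pub seq: u64,
    pub success: bool,
}

#[derive(Debug, PartialEq)]
pub enum WaitError {
    Timeout,
    SeqMismatch { expected: u64, got: u64 },
    Closed,
}

/// Monotonic per-session sequence allocator, starting at 1.
pub struct SeqCounter(u64);

impl SeqCounter {
    pub fn new() -> Self {
        SeqCounter(0)
    }

    pub fn allocate(&mut self) -> u64 {
        self.0 += 1;
        self.0
    }
}

/// Blocks until a response arrives, then checks that it answers `expected_seq`.
pub fn wait_for_response(
    rx: &Receiver<Response>,
    expected_seq: u64,
    timeout: Duration,
) -> Result<Response, WaitError> {
    match rx.recv_timeout(timeout) {
        Ok(resp) if resp.seq == expected_seq => Ok(resp),
        Ok(resp) => Err(WaitError::SeqMismatch { expected: expected_seq, got: resp.seq }),
        Err(RecvTimeoutError::Timeout) => Err(WaitError::Timeout),
        Err(RecvTimeoutError::Disconnected) => Err(WaitError::Closed),
    }
}
```

Treating a mismatched `seq` as a hard error rather than skipping ahead is one possible design; it keeps the command/response pairing strict, at the cost of failing the task if the host ever replies out of order.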
|
||||
|
||||
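### 2.6 `src/security/mac_policy.rs`

The security policy trusts only the rules file, not the model's intent.
The security policy trusts only the rules file and the pipe contract, not the model's intent.
The rules come from [`resources/rules.json`](/home/zyl/projects/sgClaw/claw/resources/rules.json); the current default constraints are:

- Allowed domains: `oa.example.com`, `erp.example.com`, `hr.example.com`, and the demo domains.
@@ -235,6 +303,8 @@ L2 is the contract description from the product kernel's point of view. The two relate as follows:
2. The tool definition in `src/agent/runtime.rs`
3. `parameters_schema` and `parse_action` in `src/compat/browser_tool_adapter.rs`

These three express the exposure scope of "the current privileged browser tool surface", not the long-term capability ceiling of the sgClaw runtime as a whole.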
|
||||
|
||||
---
|
||||
|
||||
## 5. `browser_action` tool contract

@@ -10,32 +10,32 @@

## 1. End-to-end data flow

The data flow of the current main path is as follows:
The data flow in the mainline target should be as follows:
|
||||
|
||||
```
|
||||
Browser Host
|
||||
└─ submit_task
|
||||
Client Surface
|
||||
└─ submit_task (+ optional browser context)
|
||||
↓
|
||||
sgClaw Transport / Handshake
|
||||
└─ handle_browser_message
|
||||
└─ sgClaw security envelope
|
||||
↓
|
||||
Execution Path Select
|
||||
├─ planner fallback
|
||||
└─ zeroclaw compat runtime
|
||||
ZeroClaw-first runtime
|
||||
└─ runtime profile / tool policy
|
||||
↓
|
||||
browser_action
|
||||
↓
|
||||
AgentMessage::Command
|
||||
↓
|
||||
Browser executes action
|
||||
↓
|
||||
BrowserMessage::Response
|
||||
Tool execution
|
||||
├─ browser_action -> pipe -> browser host -> response
|
||||
└─ non-browser-safe future surfaces only when policy allows
|
||||
↓
|
||||
log_entry / task_complete
|
||||
```
|
||||
|
||||
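This path has no standalone Skill executor and no standalone task orchestration database.
So the focus of L3 is no longer "describing an idealized agent platform" but explaining the data-flow state machine that actually exists in the current repository.
There are still transitional deviations between the current code and the target above:

- The browser is currently the only mature privileged tool surface.
- The old `planner/runtime` path is still in the repository but has been reduced to a `legacy/dev-only` helper module.
- `zeroclaw` is vendored, but the runtime does not yet unlock its capabilities in a fully zeroclaw-first way.

L3 must therefore both describe the target data flow and state clearly that the current code is still in a transitional consolidation phase.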
|
||||
|
||||
---
|
||||
|
||||
@@ -52,26 +52,27 @@ log_entry / task_complete
|
||||
|
||||
### 2.2 Task intake phase

The host sends:
The browser host currently sends:
|
||||
|
||||
```json
|
||||
{ "type": "submit_task", "instruction": "..." }
|
||||
```
|
||||
|
||||
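After receiving it in [`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs), the Rust side does not execute page commands directly; it first decides which execution path to take.
After receiving it in [`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs), the Rust side should not be understood as "starting web automation right away"; it first decides which runtime/profile the task uses and judges whether browser context is really necessary.

### 2.3 Execution path selection
### 2.3 Current execution path selection (transitional)

#### Path A: planner fallback
#### Path A: legacy planner/runtime (not on the production submit main path)

Condition: no usable `DEEPSEEK_*` environment configuration.
Behavior: the repository's built-in planner produces a number of steps directly and calls `BrowserPipeTool` for each one.
Condition: used only for dev/test verification or to keep regression coverage.
Behavior: the repository's built-in planner or lightweight runtime produces a number of steps directly and calls `BrowserPipeTool` for each one.

Characteristics:

- Fewer dependencies.
- Predictable logic.
- Suitable for protocol integration testing and minimal feature verification.
- No longer carries production browser submit traffic.

#### Path B: ZeroClaw compat runtime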
|
||||
|
||||
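@@ -108,7 +109,7 @@ On the Rust side, in [`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs
6. Form the observation
   Note: the next input or the final result is formed from `success`, `data`, `aom_snapshot`, and `timing`.

This means the interface between "agent behavior" and "browser action execution" has been compressed into a very thin layer, which is the most valuable structural change of the ZeroClaw refactor.
This means the interface between "runtime decisions" and "browser action execution" has been compressed into a very thin layer, which is the most valuable structural change of sgClaw as zeroclaw's security wrapper layer.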
|
||||
|
||||
---
|
||||
|
||||
@@ -126,6 +127,7 @@ On the Rust side, in [`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs
Typical content:

- The action about to be executed.
- The plan preview sgClaw emits first when `planner_mode=zeroclaw_plan_first`.
- Event summaries translated in the compat runtime.
- Informational hints during execution.
|
||||
|
||||
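@@ -144,26 +146,28 @@ On the Rust side, in [`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs

The premise of "L3 is the soul" is not to make Skill sound ever more mystical, but to state clearly what Skill really means at the current stage.

### 5.1 No standalone Skill engine exists today
### 5.1 A Skill engine independent of zeroclaw should no longer be designed

The current repository has no standalone:
In the current repository, Skill should no longer be understood as a browser-specific add-on subsystem. sgClaw already vendors zeroclaw, and zeroclaw's own skill system is the mainline. Note that the current runtime still makes limited use of it.

The parts the current code still lacks or has not fully unlocked include:

- Skill script directory loading flow
- Skill registry
- Skill sandbox executor
- Skill versioning and signature verification main path

So Skill can no longer be described as a subsystem that has landed.
So the documentation can no longer describe Skill as "a separate engine on the browser side"; it should describe it as "a capability layer that should reuse zeroclaw-native mechanisms, whose current implementation is still being consolidated".

### 5.2 Skill semantics that can be kept today

In the ZeroClaw refactor edition, the more accurate meaning of Skill is:
In sgClaw's mainline architecture, the more accurate meaning of Skill is:

- A forward-looking abstraction of "reusable task patterns".
- May be carried by prompts, templates, preset tool combinations, or planner rules.
- Ultimately it still has to land on the unified `browser_action` contract.
- Reusable task patterns, prompt conventions, and callable tool combinations managed by zeroclaw.
- In compact/full mode, they enter the system prompt or are read on demand through `read_skill`.
- When a task needs the browser, it can ultimately land on the unified `browser_action` contract; when it does not, it should not be forced to detour through the browser.

In other words, Skill today is not a runtime directory but an abstraction language between the product and the execution layer.
In other words, Skill is not an alias for "a browser script directory"; it is part of the zeroclaw runtime.

### 5.3 Skill evolution constraints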
|
||||
|
||||
@@ -179,13 +183,23 @@ On the Rust side, in [`src/agent/mod.rs`](/home/zyl/projects/sgClaw/claw/src/agent/mod.rs

### 6.1 Configuration

The key configuration that actually takes part in execution today comes from [`src/config/settings.rs`](/home/zyl/projects/sgClaw/claw/src/config/settings.rs):
The key configuration that actually takes part in execution today comes from [`src/config/settings.rs`](/home/zyl/projects/sgClaw/claw/src/config/settings.rs). It is no longer just a single `DEEPSEEK_*` shim; it has started to carry sgClaw's own runtime policy:

- `DEEPSEEK_API_KEY`
- `DEEPSEEK_BASE_URL`
- `DEEPSEEK_MODEL`
- `providers` / `active_provider`
- `planner_mode`
- `browser_backend`
- `office_backend`
- `skills_prompt_mode`
- `runtime_profile`

These settings decide whether the compat runtime is enabled and how model requests are routed.
The current default semantics are:

- When `providers` is empty, the legacy single-model DeepSeek configuration `apiKey/baseUrl/model` is still supported.
- When `planner_mode=zeroclaw_plan_first`, sgClaw sends a displayable plan preview to the host before real execution; the frontend only renders it.
- When `browser_backend=superrpa`, high-privilege browser actions still use the host pipe as the boundary; sgClaw only decides runtime policy and does not move privilege up into the frontend.
- When `office_backend=openxml`, export-type tasks still have the actual export tool chosen by the sgClaw runtime.

The goal of this configuration is clear: bring model switching, planner policy, and runtime backend selection back into sgClaw itself, rather than leaving them scattered across SuperRPA compile-time constants or frontend logic.
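
The first default rule (an empty `providers` list still honors the legacy `apiKey/baseUrl/model` fields) could look roughly like the sketch below. It is not the code in `src/config/settings.rs`: the `Provider` and `RuntimeSettings` layouts, the `"deepseek"` label, the requirement that all three legacy fields be present, and the "first provider wins when `active_provider` is unset" behavior are all assumptions made for illustration.

```rust
/// Hypothetical provider entry; the field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct Provider {
    pub name: String,
    pub api_key: String,
    pub base_url: String,
    pub model: String,
}

/// Hypothetical subset of the runtime settings discussed above.
#[derive(Debug, Clone, Default)]
pub struct RuntimeSettings {
    pub providers: Vec<Provider>,
    pub active_provider: Option<String>,
    // Legacy single-model fields (`apiKey` / `baseUrl` / `model`).
    pub api_key: Option<String>,
    pub base_url: Option<String>,
    pub model: Option<String>,
}

/// Picks the provider that would serve model requests, if any.
pub fn effective_provider(s: &RuntimeSettings) -> Option<Provider> {
    if s.providers.is_empty() {
        // Legacy path: assumed to require all three fields.
        return Some(Provider {
            name: "deepseek".to_string(),
            api_key: s.api_key.clone()?,
            base_url: s.base_url.clone()?,
            model: s.model.clone()?,
        });
    }
    match &s.active_provider {
        Some(name) => s.providers.iter().find(|p| &p.name == name).cloned(),
        // Assumption: without an explicit choice, the first entry is used.
        None => s.providers.first().cloned(),
    }
}
```

Returning `None` instead of silently guessing gives the caller a single place to report "no model configured" to the host.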
|
||||
|
||||
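### 6.2 Memory

@@ -199,10 +213,10 @@ A memory adapter is already wired into the ZeroClaw compat path, but at the product capability level

## 7. L3 conclusion

The core of L3 is not "putting every future capability into one grand data-flow diagram" but explaining how the current system compresses natural-language tasks into verifiable, response-backed, and controlled browser actions.
The core of L3 is neither "putting every future capability into one grand data-flow diagram" nor "interpreting every task as browser actions"; it is explaining how sgClaw first hands a task to the zeroclaw runtime and then compresses the parts that require high-risk external execution into verifiable, response-backed, and controlled browser actions.

The soul of the refactor comes down to three points:

- A unified task entry point.
- A unified action contract.
- Replaceable execution paths, with the protocol and security boundary unchanged.
- A runtime core unified on zeroclaw.
- Replaceable privileged tool surfaces, with the protocol and security boundary unchanged.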
|
||||
|
||||
@@ -1,8 +1,8 @@
|
||||
# L4 - Engineering Implementation and Deployment Topology Layer

**Document version**: 2.0
**Project**: sgClaw (ZeroClaw refactor edition)
**Date**: 2026-03-26
**Document version**: 2.1<br>
**Project**: sgClaw (ZeroClaw refactor edition)<br>
**Date**: 2026-03-29

**Audience**: developers, test engineers, integration engineers
|
||||
|
||||
@@ -44,7 +44,7 @@ claw/
|
||||
└── archive/
|
||||
```
|
||||
|
||||
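In engineering terms, `third_party/zeroclaw` should be understood as "a vendored compatibility dependency", not a separately maintained sibling project.
In engineering terms, `third_party/zeroclaw` should be understood as "the vendored capability core", neither a separately maintained sibling project nor an auxiliary dependency used only for compatibility.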
|
||||
|
||||
---
|
||||
|
||||
@@ -83,10 +83,15 @@ claw/
|
||||
|
||||
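Responsibilities:

- Decide between fallback and compat execution.
- Map the unified tool contract onto the browser protocol.
- Currently decides between fallback and compat execution.
- Map the protected browser tool contract onto the browser protocol.
- Bridge between ZeroClaw turn events and host logs.

Notes:

- `src/agent/runtime.rs` and `src/agent/planner.rs` are transitional lightweight paths and should no longer be written up as the long-term product mainline.
- The mainline target should be "sgClaw security layer + zeroclaw core runtime", not a long-lived browser-only compat fork.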
|
||||
|
||||
### 2.4 安全与配置
|
||||
|
||||
- [`src/security/mac_policy.rs`](/home/zyl/projects/sgClaw/claw/src/security/mac_policy.rs)
|
||||
@@ -96,7 +101,7 @@ claw/
|
||||
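Responsibilities:

- Maintain the runtime security boundary.
- Read provider configuration from environment variables.
- Read runtime configuration such as provider / skills, and gradually converge on a zeroclaw-first configuration model.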
|
||||
|
||||
---
|
||||
|
||||
@@ -130,7 +135,7 @@ sgClaw is not a standalone interactive CLI product; normal operation requires:

### 3.3 Model configuration

Key environment variables for enabling the ZeroClaw compat runtime:
Key environment variables that currently enable the ZeroClaw compat runtime:
|
||||
|
||||
```bash
|
||||
DEEPSEEK_API_KEY=...
|
||||
@@ -138,7 +143,70 @@ DEEPSEEK_BASE_URL=...
|
||||
DEEPSEEK_MODEL=...
|
||||
```
|
||||
|
||||
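If these variables are missing or incomplete, the system falls back to the planner fallback.
If these variables are missing or incomplete, the system falls back to the planner fallback. This behavior reflects the current implementation state, not a long-term architectural recommendation.

### 3.4 runtime config file

When `host` launches `sgclaw` with `--config-path=<workspace_root>/sgclaw_config.json`, `sgclaw` reads that JSON file itself instead of requiring the host to copy skills separately.

Key fields currently supported: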
|
||||
|
||||
```json
|
||||
{
|
||||
"apiKey": "sk-...",
|
||||
"baseUrl": "https://api.deepseek.com",
|
||||
"model": "deepseek-chat",
|
||||
"skillsDir": "skill_lib"
|
||||
}
|
||||
```
|
||||
|
||||
说明:
|
||||
|
||||
- `skillsDir` 可省略。
|
||||
- 若省略,则默认使用 `<workspace_root>/.sgclaw-zeroclaw-workspace/skills`。
|
||||
- 若为相对路径,则相对于 `sgclaw_config.json` 所在目录解析。
|
||||
- 若指向某个 skill repo 根目录,且其下存在 `skills/` 子目录,运行时会自动落到该 `skills/` 目录。
|
||||
- 因此 `host` 只需要负责传递 `runtime config` 路径,skill 查找策略由 `sgclaw` 自己控制。
|
||||
- 长期看,这个文件应表达 zeroclaw-first、`planner-first` 的 runtime/profile/tool policy 配置,而不仅是 provider shim。
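
The `skillsDir` resolution rules listed above can be sketched as follows. This is a minimal JavaScript illustration under stated assumptions, not sgclaw's actual (Rust) implementation: `resolveSkillsDir` and the injected `dirExists` predicate are hypothetical names introduced only for this example.

```javascript
const path = require('path').posix;

// Hypothetical helper (not part of sgclaw): returns the effective skills directory.
//   configPath    - absolute path of sgclaw_config.json
//   workspaceRoot - absolute workspace root
//   skillsDir     - optional value of the "skillsDir" field
//   dirExists     - predicate injected so the sketch can run without a real filesystem
function resolveSkillsDir({ configPath, workspaceRoot, skillsDir, dirExists }) {
  // Rule 1: omitted -> default location under the workspace root.
  if (!skillsDir) {
    return path.join(workspaceRoot, '.sgclaw-zeroclaw-workspace', 'skills');
  }
  // Rule 2: relative paths are resolved against the config file's directory.
  const base = path.isAbsolute(skillsDir)
    ? skillsDir
    : path.resolve(path.dirname(configPath), skillsDir);
  // Rule 3: a skill repo root containing skills/ resolves to that subdirectory.
  const nested = path.join(base, 'skills');
  return dirExists(nested) ? nested : base;
}
```

Injecting `dirExists` keeps the rules testable in isolation; a real implementation would check the filesystem directly.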

### 3.5 Launch Config File and Fallback

The `launch config` is read by the `host`; sgClaw does not parse it itself. The recommended path after the design freeze is:

```text
<profile>/superrpa/sgclaw_launch_config.json
```

The file should carry the following fields:

- `binary`
- `args`
- `env`
- `working_dir`
- `runtime_config_path`
- `frontend_bundle_dir`

The loading rules must remain stable:

1. The `host` reads the profile-local `launch config` first
2. If `binary` is missing or invalid, fall back to the default sgClaw path that the browser knows can be launched
3. If `runtime_config_path` is missing, fall back to `<profile>/superrpa/sgclaw_config.json`
4. If `frontend_bundle_dir` is missing or invalid, fall back to the bundled frontend resources

The purpose is not to weaken host control, but to turn frequently changing items from compile-time constants into objects that can be replaced at run time.
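
The four loading rules can be sketched as a single resolution function. This is an illustrative JavaScript sketch only: the host's real implementation is not shown in this document, and `resolveLaunchConfig`, `defaultBinary`, `isValidBinary`, and `isValidBundleDir` are hypothetical names and inputs.

```javascript
// Hypothetical sketch of the host-side fallback rules.
// launchConfig: parsed sgclaw_launch_config.json, or null if the file is absent.
// deps: values and validators the host would supply (all assumed here).
function resolveLaunchConfig(launchConfig, deps) {
  const cfg = launchConfig || {};
  const { profileDir, defaultBinary, isValidBinary, isValidBundleDir } = deps;

  // Rule 2: missing or invalid binary -> known-good default path.
  const binary = isValidBinary(cfg.binary) ? cfg.binary : defaultBinary;

  // Rule 3: missing runtime_config_path -> profile-local sgclaw_config.json.
  const runtimeConfigPath =
    cfg.runtime_config_path || `${profileDir}/superrpa/sgclaw_config.json`;

  // Rule 4: missing or invalid bundle dir -> bundled frontend resources.
  const frontend = isValidBundleDir(cfg.frontend_bundle_dir)
    ? { kind: 'external', dir: cfg.frontend_bundle_dir }
    : { kind: 'bundled' };

  return {
    binary,
    runtimeConfigPath,
    frontend,
    args: cfg.args || [],
    env: cfg.env || {},
    workingDir: cfg.working_dir,
  };
}
```

Each field falls back independently, so a partially filled launch config still yields a launchable result.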

### 3.6 Frontend Bundle Loading Topology

The `frontend bundle` should be deployed as "external bundle first, built-in resources as the fallback":

```text
host
  ├─ validate frontend_bundle_dir
  ├─ if valid: load external frontend bundle
  └─ else: load bundled frontend resources
```

This means that later changes to the floating-window UI, the acceptance pages, or the planner display logic should no longer require rebuilding Chromium by default.

---

@@ -165,7 +233,7 @@ DEEPSEEK_MODEL=...
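cargo test
```

These tests express an important engineering fact: the stable core of the current system is the protocol, runtime selection, and compat adaptation, not the legacy frontend verification page.
These tests express an important engineering fact: the stable core of the current system is the protocol, the security boundary, runtime selection, and zeroclaw integration, not the legacy frontend verification page.

---
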
@@ -179,9 +247,11 @@ cargo test
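
### 5.2 Responsibilities of the External Host

- Read and validate the `launch config`.
- Launch and supervise the sgClaw process.
- Provide page execution capability.
- Implement command execution, response return, and host-side validation.
- Load the `frontend bundle`, falling back to built-in resources when it is invalid.

### 5.3 Content Not Delivered in This Repository
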
@@ -199,13 +269,17 @@ The L4 engineering boundary must be stated according to the repository as it actually is; otherwise "external dependencies

```
Browser Host Process
  ├─ reads launch config
  ├─ launches sgclaw binary
  ├─ writes init / submit_task to stdin
  ├─ reads command / log / task_complete from stdout
  └─ executes page actions in host environment
  ├─ executes page actions in host environment
  └─ loads external frontend bundle or bundled resources

sgclaw binary
  ├─ loads runtime config
  ├─ loads resources/rules.json
  ├─ runs planner-first execution
  ├─ verifies action/domain
  ├─ optionally calls provider API
  └─ waits for browser response
@@ -220,4 +294,5 @@ sgclaw binary
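The core conclusions at the L4 level come down to just two points:

1. This repository has converged from a "mixed directory with demo pages" into a kernel repository of "Rust runtime + protocol documentation + tests".
2. After the ZeroClaw refactor, the engineering focus is keeping the compat runtime, the fallback runtime, and the browser protocol working against the same contract.
2. After the ZeroClaw refactor, the engineering focus is converging the engineering shape from "browser-first compat" to "zeroclaw-first runtime + sgClaw security envelope" while keeping the browser protocol stable.
3. `host`, `launch config`, `runtime config`, `frontend bundle`, and `planner-first` must use one shared vocabulary across documentation, code, and acceptance, so that frontend logic is not once again moved outside sgClaw.
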
@@ -5,24 +5,25 @@
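- Goal: provide an actionable engineering remediation path and a record of what was implemented

## 1. Conclusion (up front)
The project has at least two main prompt-construction chains:
The project currently has at least two main prompt-construction chains, but the long-term mainline can keep only one authoritative chain:

1) **Lightweight runtime chain** (`src/agent/runtime.rs`)
1) **Lightweight runtime chain** (`src/agent/runtime.rs`, transitional)
- Has only a very basic fixed system prompt.
- Suited to local or minimal execution scenarios outside the full flow.
- Should not be extended further into the mainline product prompt system.

2) **ZeroClaw main chain** (`third_party/zeroclaw/*`)
- This chain is the main body of the "system prompt" and consists of:
  - The structured builder inside `Agent` (`SystemPromptBuilder`)
  - Unified string assembly on the `channels` side
  - Multiple injection sources such as `skills / personality / identity / bootstrap files / tool descriptions`
- This is also the main security surface you need to watch.
- This is also the main security surface you need to watch, and the only mainline that should be kept going forward.

---

## 2. Prompt Distribution (by file/module)

### 2.1 Fixed system prompt (lightweight chain)
### 2.1 Fixed system prompt (lightweight chain, to be consolidated)
- `src/agent/runtime.rs`
- The `ChatMessage { role: "system" ... }` in `execute_task_with_provider`
- Current content: `You are sgClaw. Use browser_action to complete the browser task.`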
@@ -108,6 +109,10 @@
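## 4. Security Remediation Recommendations (by priority)

### P0 (recommended immediately)
0) Consolidate the dual main chains
- `sgclaw` should not maintain both a lightweight custom prompt chain and a zeroclaw main chain over the long term.
- The goal: keep the zeroclaw main chain, with sgClaw adding only a security summary, browser context, and a description of the controlled tool surface.

1) Integrate `PromptGuard`
- `third_party/zeroclaw/src/security/prompt_guard.rs` already exists
- Add scanning with truncation/alerting at the following entry points: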
@@ -148,7 +153,8 @@
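## 5. Key Risks Confirmed in This Review
- `PromptGuard` is not yet mounted uniformly at the main entry points (the detection capability exists, but no mandatory interception chain has been formed).
- workspace/skills content can enter the prompt directly, so the injection surface is wide.
- The two system-prompt construction chains (agent builder and channel builder) differ in wording and need to be unified.
- Two system-prompt construction chains (the lightweight chain and the zeroclaw main chain) exist at the same time, which easily leads to security-policy drift.
- If `sgclaw` keeps placing browser-specific prompt patches outside the main chain, it will recreate a third prompt source.

---
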
@@ -161,4 +167,3 @@
- Reproduce prompt-override attacks
- System-prompt duplication/continuation scenarios (seed/reseed)
- Comparison of compact vs. full skill injection

@@ -10,6 +10,16 @@
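- `L5-提示词分布与安全改造方案.md`: prompt governance and risk-control hardening strategy.
- `浏览器对接标准.md`: the protocol baseline for Rust-Chromium integration.

Supplementary runtime conventions currently in effect:

- In mainline terms, `sgclaw` should be understood as a "security-hardened zeroclaw runtime", not a browser-side UI product and not a browser-only agent.
- The browser host and the pipe protocol provide a protected execution surface; the browser is an important tool and frontend carrier, but it does not define the runtime as a whole.
- When documents mention `planner fallback`, `compat runtime`, or `browser_action only`, read them first as "current implementation state / transitional state", not as long-term product boundaries.
- The browser-side configuration file for `sgclaw` is `sgclaw_config.json`.
- `sgclaw` is responsible for parsing the model configuration and the optional `skillsDir`.
- When `skillsDir` is not set, it falls back to `<workspace_root>/.sgclaw-zeroclaw-workspace/skills`.
- When `skillsDir` is set, it may point directly at a skills directory or at the root of a skill repo that contains a `skills/` subdirectory.

## Archived Documents

### Project Management and Scheduling (archived)
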
1572
docs/_tmp_sgbrowser_ws_api_doc.txt
Normal file
File diff suppressed because it is too large
145
docs/_tmp_sgbrowser_ws_probe_transcript.md
Normal file
@@ -0,0 +1,145 @@
|
||||
# sgBrowser websocket probe transcript
|
||||
|
||||
Endpoint: `ws://127.0.0.1:12345`
|
||||
Timeout: `1500ms`
|
||||
Cargo target dir override: `D:/data/ideaSpace/rust/sgClaw/claw-new/target_task4`
|
||||
|
||||
## baseline-open
|
||||
|
||||
```text
|
||||
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baseline-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.95s
|
||||
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'baseline-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
|
||||
STEP 1 baseline-open
|
||||
SEND: ["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
|
||||
RECV: Welcome! You are client #1
|
||||
OUTCOME: received
|
||||
```
|
||||
|
||||
## open-agent
|
||||
|
||||
```text
|
||||
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "post-open-agent-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.98s
|
||||
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'open-agent::["about:blank","sgOpenAgent"]' --step 'post-open-agent-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
|
||||
STEP 1 open-agent
|
||||
SEND: ["about:blank","sgOpenAgent"]
|
||||
RECV: Welcome! You are client #1
|
||||
OUTCOME: received
|
||||
|
||||
STEP 2 post-open-agent-open
|
||||
SEND: ["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
|
||||
RECV: <none>
|
||||
OUTCOME: timeout
|
||||
```
|
||||
|
||||
## set-auth
|
||||
|
||||
```text
|
||||
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "set-auth::[\"about:blank\",\"sgSetAuthInfo\",\"probe-user\",\"probe-token\"]" --step "post-set-auth-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.95s
|
||||
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'set-auth::["about:blank","sgSetAuthInfo","probe-user","probe-token"]' --step 'post-set-auth-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
|
||||
STEP 1 set-auth
|
||||
SEND: ["about:blank","sgSetAuthInfo","probe-user","probe-token"]
|
||||
RECV: Welcome! You are client #1
|
||||
OUTCOME: received
|
||||
|
||||
STEP 2 post-set-auth-open
|
||||
SEND: ["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
|
||||
RECV: <none>
|
||||
OUTCOME: timeout
|
||||
```
|
||||
|
||||
## browser-login
|
||||
|
||||
```text
|
||||
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step 'browser-login::["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]' --step 'post-browser-login-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'
|
||||
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.95s
|
||||
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'browser-login::["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]' --step 'post-browser-login-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
|
||||
STEP 1 browser-login
|
||||
SEND: ["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]
|
||||
RECV: Welcome! You are client #1
|
||||
OUTCOME: received
|
||||
|
||||
STEP 2 post-browser-login-open
|
||||
SEND: ["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
|
||||
RECV: <none>
|
||||
OUTCOME: timeout
|
||||
```
|
||||
|
||||
## active-tab
|
||||
|
||||
```text
|
||||
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "post-active-tab-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.95s
|
||||
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'active-tab::["about:blank","sgBrowerserActiveTab","https://www.zhihu.com/hot","probeCallback"]' --step 'post-active-tab-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
|
||||
STEP 1 active-tab
|
||||
SEND: ["about:blank","sgBrowerserActiveTab","https://www.zhihu.com/hot","probeCallback"]
|
||||
RECV: Welcome! You are client #1
|
||||
OUTCOME: received
|
||||
|
||||
STEP 2 post-active-tab-open
|
||||
SEND: ["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
|
||||
RECV: <none>
|
||||
OUTCOME: timeout
|
||||
```
|
||||
|
||||
## combined-bootstrap
|
||||
|
||||
```text
|
||||
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "combined-open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "combined-active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "combined-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.95s
|
||||
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'combined-open-agent::["about:blank","sgOpenAgent"]' --step 'combined-active-tab::["about:blank","sgBrowerserActiveTab","https://www.zhihu.com/hot","probeCallback"]' --step 'combined-open::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
|
||||
STEP 1 combined-open-agent
|
||||
SEND: ["about:blank","sgOpenAgent"]
|
||||
RECV: Welcome! You are client #1
|
||||
OUTCOME: received
|
||||
|
||||
STEP 2 combined-active-tab
|
||||
SEND: ["about:blank","sgBrowerserActiveTab","https://www.zhihu.com/hot","probeCallback"]
|
||||
RECV: <none>
|
||||
OUTCOME: timeout
|
||||
|
||||
STEP 3 combined-open
|
||||
SEND: ["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
|
||||
RECV: <none>
|
||||
OUTCOME: timeout
|
||||
```
|
||||
|
||||
## requesturl-variants
|
||||
|
||||
```text
|
||||
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "target-as-requesturl::[\"https://www.zhihu.com/hot\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.94s
|
||||
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'target-as-requesturl::["https://www.zhihu.com/hot","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
|
||||
STEP 1 target-as-requesturl
|
||||
SEND: ["https://www.zhihu.com/hot","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
|
||||
RECV: Welcome! You are client #1
|
||||
OUTCOME: received
|
||||
```
|
||||
|
||||
```text
|
||||
$ CARGO_TARGET_DIR="/d/data/ideaSpace/rust/sgClaw/claw-new/target_task4" cargo run --manifest-path "/d/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baidu-requesturl::[\"https://www.baidu.com\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.94s
|
||||
Running `target_task4\debug\sgbrowser_ws_probe.exe --ws-url 'ws://127.0.0.1:12345' --timeout-ms 1500 --step 'baidu-requesturl::["https://www.baidu.com","sgBrowerserOpenPage","https://www.zhihu.com/hot"]'`
|
||||
STEP 1 baidu-requesturl
|
||||
SEND: ["https://www.baidu.com","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
|
||||
RECV: Welcome! You are client #1
|
||||
OUTCOME: received
|
||||
```
|
||||
|
||||
| Sequence | Sent frames | First reply | Final outcome | Decision signal |
| --- | --- | --- | --- | --- |
| baseline-open | `["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | received only welcome banner; no numeric status or callback frame captured | does not satisfy Option A rule |
| open-agent | `["about:blank","sgOpenAgent"]` then `["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | step 2 timed out with no reply | does not satisfy Option A rule |
| set-auth | `["about:blank","sgSetAuthInfo","probe-user","probe-token"]` then `["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | step 2 timed out with no reply | does not satisfy Option A rule |
| browser-login | `["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]` then `["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | step 2 timed out with no reply | does not satisfy Option A rule |
| active-tab | `["about:blank","sgBrowerserActiveTab","https://www.zhihu.com/hot","probeCallback"]` then `["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | step 2 timed out with no reply | does not satisfy Option A rule |
| combined-bootstrap | `["about:blank","sgOpenAgent"]` then `["about:blank","sgBrowerserActiveTab","https://www.zhihu.com/hot","probeCallback"]` then `["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | steps 2 and 3 timed out with no reply | does not satisfy Option A rule |
| requesturl-variants | `["https://www.zhihu.com/hot","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` and `["https://www.baidu.com","sgBrowerserOpenPage","https://www.zhihu.com/hot"]` | `Welcome! You are client #1` | each one-shot run received only the welcome banner; no numeric status or callback frame captured | does not satisfy Option A rule |
|
||||
|
||||
## Final decision
|
||||
|
||||
**Option B wins.**
|
||||
|
||||
Reason: the strict rule says Option A wins only if at least one sequence reproducibly yields real numeric status and/or callback frames for a real business action. Across the full required matrix, the reachable endpoint consistently returned only the websocket welcome banner on the first reply for each fresh connection, and every follow-on business-action step either timed out or produced no numeric status/callback frame. Therefore the evidence does not validate a raw-websocket bootstrap contract, so Option B is the required outcome.
|
||||
89
docs/acceptance/2026-03-29-zhihu-hotlist-excel.md
Normal file
@@ -0,0 +1,89 @@
|
||||
# Zhihu Hotlist Excel Acceptance
|
||||
|
||||
- Date: 2026-03-30 03:46:51 +0800
|
||||
- Mode: real provider + live Zhihu hotlist API + simulated browser pipe
|
||||
- Workspace: `/tmp/sgclaw-live-acceptance-85j8m_dq`
|
||||
- Final success: `True`
|
||||
- Total score: `90/100`
|
||||
|
||||
## Rubric
|
||||
|
||||
- skill selection: `30/30`
|
||||
- tool discipline: `25/25`
|
||||
- hotlist data correctness: `20/20`
|
||||
- xlsx export success: `20/20`
|
||||
- final response quality: `5/5`
|
||||
|
||||
## Final Output
|
||||
|
||||
- exported_path: `/tmp/sgclaw-live-acceptance-85j8m_dq/.sgclaw-zeroclaw-workspace/out/zhihu-hotlist-1774813602601179684.xlsx`
|
||||
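- final_summary: `Perfect! I have successfully read and exported the Zhihu hotlist data. Here is a summary of the completed task:

## Task Completion

1. **Navigated to the Zhihu hotlist page**: successfully navigated from the Zhihu home page to the hotlist page (https://www.zhihu.com/hot)

2. **Extracted the hotlist data**: fetched the current top 10 hotlist entries through the browser, including:
- Rank (1-10)
- Title (full question text)
- Heat value (in units of 万, i.e. ten thousand)

3. **Exported the Excel file**: successfully exported the structured data to an Excel file with the following columns:
- rank
- title
- heat

## Exported Data

The top 10 hotlist topics cover several areas:
- Entertainment (the concert licensing dispute between 李荣浩 and 单依纯)
- Sports (张雪机车 winning at WSBK)
- International politics and economics (Japan's foreign-reserve operations, the Iran conflict)
- Technology industry (ByteDance's growth, DDR5 memory prices)
- Social phenomena (the "weekend couple" pattern, top students' low-profile parents)
- Everyday safety (fire-blanket safety issues)

## Generated File

**Excel file path**: `/tmp/sgclaw-live-acceptance-85j8m_dq/.sgclaw-zeroclaw-workspace/out/zhihu-hotlist-1774813602601179684.xlsx`

The file contains the Zhihu hotlist data collected at 03:46:02 on 2026-03-30; you can open it with Excel or another spreadsheet application to view the full hotlist.`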
|
||||
|
||||
## Skill Logs
|
||||
|
||||
- `sgclaw runtime version=0.1.0 protocol=1.0`
|
||||
- `DeepSeek config loaded from /tmp/sgclaw-live-acceptance-85j8m_dq/sgclaw_config.json model=deepseek-chat base_url=https://api.deepseek.com`
|
||||
- `skills dir resolved to /home/zyl/projects/sgClaw/skill_lib/skills`
|
||||
- `runtime profile=BrowserAttached skills_prompt_mode=Compact`
|
||||
- `zeroclaw_process_message_primary`
|
||||
- `先规划再执行知乎热榜 Excel 导出
|
||||
navigate https://www.zhihu.com/hot
|
||||
getText main
|
||||
call openxml_office
|
||||
return generated local .xlsx path`
|
||||
- `loaded skills: office-export-xlsx@0.1.0, zhihu-hotlist@0.1.0, zhihu-hotlist-screen@0.1.0, zhihu-navigate@0.1.0, zhihu-write@0.1.0`
|
||||
- `navigate https://www.zhihu.com/hot`
|
||||
- `call zhihu-hotlist.extract_hotlist`
|
||||
- `browser script failed: {"unsupported_action":"eval"}`
|
||||
- `getText body`
|
||||
- `getText .HotList-list`
|
||||
- `call openxml_office`
|
||||
- `unsupported columns: expected [rank, title, heat]`
|
||||
- `call openxml_office`
|
||||
|
||||
## Live Hotlist Sample
|
||||
|
||||
- 1. 李荣浩摆证据 4 连质问单依纯,为什么没有授权的歌曲也能放进演唱会?演唱会筹备中可能出了什么问题? | 1220万
|
||||
- 2. 如何看待张雪机车在 2026 年 WSBK 葡萄牙站夺冠?这对国内的摩托赛事发展有什么影响? | 370万
|
||||
- 3. 日本拟动用外储做空国际原油,以挽救日元汇率,对此你怎么看,其会重演 96 年「住友铜事件」么? | 356万
|
||||
- 4. 字节跳动是怎么短短数年就能单挑所有互联网巨头的? | 277万
|
||||
- 5. 如何看待张雪机车 820rr 拿下 wsbk 葡萄牙站第一回合冠军?这个冠军含金量如何? | 241万
|
||||
- 6. 伊朗科技大学遭袭后,伊朗将美以大学列为「合法袭击目标」,如果战争扩大到教育机构,冲突还有回头路吗? | 202万
|
||||
- 7. 「周末情侣」模式日渐兴起,工作日通过消息视频联系,仅周末相聚,如何看待这种模式?你有过类似的经历吗? | 163万
|
||||
- 8. 男孩玩灭火毯全身扎满超细玻璃纤维,又痒又痛取不出来,灭火毯为什么会「扎人」?怎么处理才不遭罪? | 158万
|
||||
- 9. DDR5 内存价格 3 月出现明显下降,请问这是短期现象,还是内存供需紧张真的缓和了? | 151万
|
||||
- 10. 为什么越厉害的学霸,她们家长越低调?从来不在朋友圈晒孩子成绩? | 139万
|
||||
|
||||
## Stderr
|
||||
|
||||
- `sgclaw ready: agent_id=4b984e63-3254-4518-a75a-127e7dad6474`
|
||||
422
docs/collect_lineloss_troubleshooting_guide.md
Normal file
@@ -0,0 +1,422 @@
# collect_lineloss.js: Complete Troubleshooting Record from Generation to Working State

This document records the full troubleshooting process that took the `tq-lineloss-report` skill script from its initial generated form to a working state, including every error encountered, the root-cause analysis, and the fix. It can serve as a troubleshooting template for developing similar skills.

---

## Background

### Architecture Overview

```
User input "兰州公司 月累计 2026-03。。。"
    │
    ▼
sgClaw Rust process
  ├── Parse the command → DeterministicExecutionPlan
  ├── Read the collect_lineloss.js script
  ├── Wrap it as an IIFE: (function(){ const args = {...}; <script body> })()
  ├── Call sgBrowserExcuteJsCodeByDomain(domain, wrappedJs)
  │     Injected into and executed in the browser page matching domain
  ├── Wait for the callback: the script returns a JSON result via callBackJsToCpp
  ├── Parse the artifact JSON → extract status/rows/reasons
  └── Generate XLSX (Rust side) → return outcome
```
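
The IIFE wrapping step in the diagram can be illustrated with a small sketch. The real wrapper is built on the Rust side and its exact output is not shown in this document; `wrapSkillScript` is a hypothetical name, and the newlines around the script body are a design choice of this sketch.

```javascript
// Hypothetical sketch of the wrapping step: embeds args as a JSON literal
// and places the script body inside an immediately invoked function.
function wrapSkillScript(scriptBody, args) {
  // JSON.stringify produces a valid JS object literal for plain data.
  // The newline before the closing brace prevents a trailing `// comment`
  // in the script body from commenting out the `})()` terminator.
  return `(function(){\nconst args = ${JSON.stringify(args)};\n${scriptBody}\n})()`;
}

// Example: the wrapped source evaluates to whatever the script body returns.
const wrappedExample = wrapSkillScript(
  'return args.org + ":" + args.period;',
  { org: 'demo-org', period: '2026-03' }
);
```

Because the body runs inside a function, a top-level `return` in the script becomes the value of the whole expression, which is what the callback layer then serializes.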

### Key Differences: Original Scenario vs. Skill Mode

| Aspect | Original scenario (index.html) | Skill mode |
|--------|----------------------|------------|
| Script injection | `sgBrowserExcuteJsCode(exactURL, js)`: exact URL | `sgBrowserExcuteJsCodeByDomain(domain, js)`: domain match only |
| Execution page | Business subpage `/tqLinelossStatis/tqQualifyRateMonitor` | May hit the parent frame page `/gsllys` |
| `window.mac` | Present (Vue instance; `mounted()` sets `window.mac = this`) | Absent (no Vue instance) |
| Excel export | JS calls `localhost:13313` (reachable from the local scenario page) | JS cannot call `localhost:13313` (blocked by CORS) |
| Result return | Rust only needs the result from the `.then()` callback | Same, but the script is an async function and needs `.then()` handling |

---

## Troubleshooting Timeline

### Stage 1: Basic Pipeline Problems

#### Problem 1: `missing_expected_domain`

**Symptom**: `status=blocked reasons=missing_expected_domain`

**Root cause**: On the Rust side, `deterministic_submit.rs` did not pass the `expected_domain` field when building args. `derive_expected_domain()` extracted only the hostname (without the port) from `page_url`, but the key did not match when the value was written into args.

**Fix**: Ensure `deterministic_submit_args()` correctly inserts `expected_domain` into the args map.

**Files involved**: `src/compat/deterministic_submit.rs`

**Rebuild required**: Yes
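
The intended behavior (derive the hostname without the port, then insert it under the exact key the script checks) can be sketched as below. The real logic lives in Rust in `deterministic_submit.rs`; this JavaScript version is only an illustration, and `deriveExpectedDomain` and `buildArgs` are hypothetical names.

```javascript
// Hypothetical sketch: extract the hostname (no port) from a page URL.
function deriveExpectedDomain(pageUrl) {
  try {
    return new URL(pageUrl).hostname;
  } catch {
    return ''; // not a parseable absolute URL
  }
}

// Hypothetical sketch: fail early instead of letting the script report
// missing_expected_domain later.
function buildArgs(pageUrl, extra) {
  const expectedDomain = deriveExpectedDomain(pageUrl);
  if (!expectedDomain) {
    throw new Error('missing_expected_domain');
  }
  // The key name must match what the script reads: args.expected_domain.
  return { ...extra, expected_domain: expectedDomain };
}
```

Failing on the Rust side (or here, in `buildArgs`) surfaces the problem before injection rather than as a `blocked` status from the browser.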

---

#### Problem 2: `target_url` Missing the Port

**Symptom**: Script injection fails or lands on the wrong page.

**Root cause**: `target_url` was set to `http://20.76.57.61` (no port), but the actual business page is at `http://20.76.57.61:18080/gsllys/...`. `sgBrowserExcuteJsCodeByDomain` must be able to match the correct tab.

**Fix**: Set the full `target_url` in `deterministic_submit.rs`:
```rust
const LINELLOSS_TARGET_URL: &str = "http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor";
```

**Files involved**: `src/compat/deterministic_submit.rs`

**Rebuild required**: Yes

---

#### Problem 3: Script Returns an Empty `{}` Object

**Symptom**: The artifact received on the Rust side is `{}`, with no data at all.

**Root cause**: The entry point of `collect_lineloss.js`, `buildBrowserEntrypointResult()`, is an `async` function and returns a Promise. The Rust-side `build_eval_js` wrapper originally called `_s(v)` directly to send the result, but `v` was a Promise object, which becomes `{}` after `JSON.stringify`.

**Fix**: Add Promise detection in `build_eval_js` (`callback_backend.rs`):
```rust
// Old code
"_s(v);"

// New code
"if(v&&typeof v.then==='function'){v.then(_s).catch(function(){});}else{_s(v);}"
```

If the return value is a thenable (Promise), the callback is sent only after it resolves.

**Files involved**: the `build_eval_js` function in `src/browser/callback_backend.rs`

**Rebuild required**: Yes

**Lesson**: Every browser_script skill whose entry function is async (returns a Promise) needs this `.then()` handling. This is a general fix at the pipeline layer.
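
The thenable check can be exercised outside the browser with a small sketch. `deliver` and `send` are hypothetical names standing in for the generated wrapper and the `_s` callback. Unlike the wrapper string above, which swallows rejections with an empty `.catch`, this sketch forwards a rejection as an error artifact; that difference is a choice of this example, not the project's current behavior.

```javascript
// Why the bug produced `{}`: a Promise has no enumerable own properties.
const serializedPromise = JSON.stringify(Promise.resolve({ rows: 12 }));

// Hypothetical sketch of the delivery step.
// value: whatever the skill script's entry point returned.
// send:  callback that serializes and posts the result back to the host.
function deliver(value, send) {
  if (value && typeof value.then === 'function') {
    // Thenable: wait for it to settle before sending anything.
    value.then(send, (err) =>
      send({
        status: 'error',
        reasons: ['script_rejected:' + String((err && err.message) || err)],
      })
    );
  } else {
    // Plain value: send immediately.
    send(value);
  }
}
```

Checking for a callable `then` rather than `instanceof Promise` also covers promises created in another frame or by a polyfill.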

---

### Stage 2: Page Context Problems

#### Problem 4: `page_context_unavailable` (mac_missing)

**Symptom**:
```
tq-lineloss-report 国网兰州供电公司 2026-03 status=blocked rows=0 reasons=page_context_unavailable
```

**Investigation**:

1. Add diagnostic information in `validatePageContext`:
```javascript
// Temporary diagnostic code
const diag = 'href=' + href + '|host=' + host + '|port=' + port + '|title=' + title + '|mac=' + hasMac;
return { ok: false, reason: 'page_context_unavailable:mac_missing|' + diag };
```

2. Diagnostic result returned by the page:
```
href=http://20.76.57.61:18080/gsllys
host=20.76.57.61
port=18080
title=台区线损大数据分析模块
mac=false
```

**Root cause**: `sgBrowserExcuteJsCodeByDomain("20.76.57.61")` matched the parent frame page `/gsllys` instead of the business subpage. `window.mac` is the Vue instance of the business subpage, set in `mounted()` via `window.mac = this`; the parent frame page has no such instance.

**Key insight**: In Skill mode there is no Vue instance, so the `window.mac` check does not apply architecturally. The script sends AJAX requests to absolute URLs and does not depend on page-local state.

**Fix**: Remove the `globalThis.mac` check and keep only the host match:
```javascript
// Before the fix
validatePageContext(args) {
  // ... includes the mac check + diagnostic code
  if (!hasMac) {
    return { ok: false, reason: 'page_context_unavailable:mac_missing|' + diag };
  }
}

// After the fix
validatePageContext(args) {
  const host = normalizeText(globalThis.location?.hostname);
  const expected = normalizeText(args.expected_domain);
  if (!host) {
    return { ok: false, reason: 'page_context_unavailable' };
  }
  if (host !== expected) {
    return { ok: false, reason: 'page_context_mismatch' };
  }
  return { ok: true };
},
```

**Files involved**: `collect_lineloss.js`, `validatePageContext` function

**Rebuild required**: No (the JS file is read at run time)

**Troubleshooting tip**: Append diagnostic information (href/host/port/title/mac) to reasons. No F12 console is needed; the information is visible directly in the Rust-side summary output.
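
A host-only check that still carries diagnostics can be sketched with the location passed in explicitly, which makes it runnable outside a browser. This is an illustration, not the script's actual code: the `loc` parameter, the `expected=` suffix, and this `normalizeText` implementation (trim plus lowercase) are assumptions of the example.

```javascript
// Assumed normalization; the real normalizeText is not shown in this document.
function normalizeText(value) {
  return typeof value === 'string' ? value.trim().toLowerCase() : '';
}

// Hypothetical variant of validatePageContext with an injected location
// object (in the browser this would be globalThis.location).
function validatePageContext(args, loc) {
  const host = normalizeText(loc && loc.hostname);
  const expected = normalizeText(args && args.expected_domain);
  // Diagnostics travel inside the reason string, the only channel back to Rust.
  const diag = 'href=' + String((loc && loc.href) || '') + '|host=' + host;
  if (!host) {
    return { ok: false, reason: 'page_context_unavailable|' + diag };
  }
  if (host !== expected) {
    return { ok: false, reason: 'page_context_mismatch|' + diag + '|expected=' + expected };
  }
  return { ok: true };
}
```

Keeping the diagnostics in the failure path permanently, rather than as temporary code, means the next mismatch is explained by the first failing run.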

---

### Stage 3: API Request Problems

#### Problem 5: `api_query_failed` (HTML Returned Instead of JSON)

**Symptom**:
```
status=error rows=0 reasons=api_query_failed:month_api_failed: SyntaxError: Unexpected token '<', "<!DOCTYPE "... is not valid JSON
```

**Root cause**: The backend detected that the request lacked the `X-Requested-With: XMLHttpRequest` header, concluded that it was not an AJAX request, and returned the HTML login page. jQuery's `$.ajax` does not add this header automatically for requests it treats as cross-domain, which is the case here because `crossDomain: true` is set.

**Fix**: Add the request header to the `$.ajax` calls in `queryMonthData` and `queryWeekData`:
```javascript
$.ajax({
  url,
  type: 'POST',
  dataType: 'json',
  crossDomain: true,
  headers: { 'X-Requested-With': 'XMLHttpRequest' },  // <-- added
  data: request,
  contentType: 'application/x-www-form-urlencoded;charset=UTF-8',
  success: resolve,
  error: (xhr, _status, err) => reject(new Error(
    `month_api_failed(${xhr.status}): ${String(err)}|body=${String(xhr.responseText || '').substring(0, 200)}`
  ))
});
```

**Files involved**: `collect_lineloss.js`, `queryMonthData` and `queryWeekData`

**Rebuild required**: No

**Troubleshooting tip**: In the error handler, append the first 200 characters of `xhr.responseText` to reasons. If it starts with `<!DOCTYPE`, the backend returned HTML rather than JSON.

**General rule**: Intranet Java backends commonly rely on `X-Requested-With: XMLHttpRequest` to distinguish page requests from AJAX requests. Every `$.ajax` call to an intranet API should include this header.
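
The "starts with `<!DOCTYPE`" check from the tip above can be made explicit before parsing, so the failure reason names the real problem instead of a generic `SyntaxError`. This is a minimal sketch; `parseApiBody` and the reason codes `api_returned_html` and `api_invalid_json` are hypothetical and do not appear in the script.

```javascript
// Hypothetical helper: classify a raw response body before JSON parsing.
function parseApiBody(text) {
  const body = String(text || '').trim();
  // An HTML document here usually means a login page or an error page.
  if (/^<(!doctype|html)/i.test(body)) {
    return { ok: false, reason: 'api_returned_html|body=' + body.substring(0, 200) };
  }
  try {
    return { ok: true, data: JSON.parse(body) };
  } catch (err) {
    return { ok: false, reason: 'api_invalid_json:' + err.message };
  }
}
```

With jQuery, this would be applied to `xhr.responseText` in the error handler, or used with `dataType: 'text'` so the script controls parsing itself.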

---

### Stage 4: Data Normalization Problems

#### Problem 6: `row_normalization_failed` (Column Name Mismatch)

**Symptom**:
```
status=error rows=0 reasons=row_normalization_failed:rawRows=12|keys=YGDL,ORG_NO,YXSL,TG_NUM...
```

**Root cause**: The initially generated `MONTH_COLUMN_DEFS` used guessed column names:
```javascript
// Wrong column names
['LINE_LOSS_RATE', '线损完成率(%)'],  // line-loss completion rate (%)
['PPQ', '累计供电量'],                // cumulative power supplied
['UPQ', '累计售电量'],                // cumulative power sold
```

The column names actually returned by the API are (see `cols2` in the original scenario's `index.html`):
```javascript
// Correct column names
['ORG_NAME', '供电单位'],             // power supply unit
['YGDL', '累计供电量'],               // cumulative power supplied
['YYDL', '累计售电量'],               // cumulative power sold
['YXSL', '线损完成率(%)'],            // line-loss completion rate (%)
['RAT_SCOPE', '线损率累计目标值'],     // cumulative line-loss rate target
['BLANK3', '目标完成率'],             // target completion rate
['BLANK2', '排行']                    // ranking
```

**Fix**: Correct `MONTH_COLUMN_DEFS` according to the `cols2` definition in the original scenario's `index.html`.

**Troubleshooting tip**: Appending `rawRows.length` and `Object.keys(rawRows[0]).join(',')` to `reasons` shows directly which fields the API returned.

**General rule**: When generating a skill script, column definitions must be copied exactly from the original scenario code, never guessed. Look for `cols1`/`cols2` or the table-rendering code.
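
A mismatch between column definitions and the API response can be reported directly rather than inferred from the raw key list. The helper below is a hypothetical sketch (`missingColumns` is not part of the script); it assumes column definitions are `[key, label]` pairs as in the examples above.

```javascript
// Hypothetical helper: list the column keys that the first raw row lacks.
// columnDefs: array of [key, label] pairs
// rawRow:     one row object as returned by the API
function missingColumns(columnDefs, rawRow) {
  const present = new Set(Object.keys(rawRow || {}));
  return columnDefs.map(([key]) => key).filter((key) => !present.has(key));
}

// Example with the guessed names from Problem 6 (row values are illustrative):
const guessedDefs = [['LINE_LOSS_RATE', 'rate'], ['PPQ', 'supplied'], ['YGDL', 'supplied']];
const sampleRow = { YGDL: 1, ORG_NO: 'x', YXSL: 2, TG_NUM: 3 };
const missingExample = missingColumns(guessedDefs, sampleRow);
```

Including the missing keys in `reasons` points straight at the wrong definitions instead of requiring a manual comparison.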

---

#### Problem 7: `row_normalization_failed` (Incompatible Value Types)

**Symptom**: After the column names were corrected, `row_normalization_failed:rawRows=12` was still reported, and all 12 rows were filtered out.

**Root cause**: The `pickFirstNonEmpty()` function recognizes only string values:
```javascript
function pickFirstNonEmpty(...values) {
  for (const value of values) {
    if (isNonEmptyString(value)) {  // isNonEmptyString: typeof value === 'string'
      return value.trim();
    }
  }
  return '';  // the API returns the number 12345.67; typeof === 'number', so it is treated as empty
}
```

The field values returned by the API are numbers (for example `YGDL: 12345.67`), not strings. `pickFirstNonEmpty` returns `''` for numbers, so every field of every row ends up empty and all rows are filtered out.

**Fix**: `normalizeMonthRow` no longer uses `pickFirstNonEmpty` and instead handles values of any type directly:
```javascript
// Before the fix
function normalizeMonthRow(rawRow) {
  const row = {};
  for (const key of MONTH_COLUMNS) {
    row[key] = pickFirstNonEmpty(rawRow?.[key]);  // number → ''
  }
  return MONTH_COLUMNS.every((key) => row[key] !== '') ? row : null;
}

// After the fix
function normalizeMonthRow(rawRow) {
  const row = {};
  for (const key of MONTH_COLUMNS) {
    const v = rawRow?.[key];
    row[key] = (v === null || v === undefined || v === '') ? '' : String(v).trim();
  }
  return MONTH_COLUMNS.every((key) => row[key] !== '') ? row : null;
}
```

**Files involved**: `collect_lineloss.js`, `normalizeMonthRow`

**Rebuild required**: No

**General rule**: Numeric fields in JSON returned by intranet APIs are usually of type `number`, not strings. Row normalization functions must convert with `String(v)` and must not rely on a `typeof === 'string'` check.
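
The type-tolerant normalization can be checked in isolation with a generalized sketch. `normalizeCell` and `normalizeRow` are hypothetical names; the logic mirrors the fixed `normalizeMonthRow` but takes the column list as a parameter.

```javascript
// Hypothetical helper: convert any non-null value to a trimmed string.
function normalizeCell(value) {
  return value === null || value === undefined ? '' : String(value).trim();
}

// Hypothetical helper: normalize one row; returns null if any column is empty.
function normalizeRow(rawRow, columns) {
  const row = {};
  for (const key of columns) {
    row[key] = normalizeCell(rawRow ? rawRow[key] : undefined);
  }
  return columns.every((key) => row[key] !== '') ? row : null;
}

// Numbers survive, including 0, which a truthiness check would wrongly drop.
const normalizedExample = normalizeRow(
  { ORG_NAME: ' demo ', YGDL: 12345.67, BLANK2: 0 },
  ['ORG_NAME', 'YGDL', 'BLANK2']
);
```

Comparing explicitly against `null` and `undefined` instead of using `!value` is what keeps legitimate zero values in the output.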
|
||||
|
||||
---
|
||||
|
||||
### Phase 5: Export Problems (Architecture Level)

#### Problem 8: Export Hangs Forever

**Symptom**:
```
tq-lineloss-report 国网兰州供电公司 2026-03 status=pl rows=12
```
Data collection succeeded (12 rows), but the call never returned afterwards; the script hung at the export step.

**Investigation**:

1. `exportWorkbook` calls `fetch('http://localhost:13313/...')`: blocked by CORS
2. Switched to `$.ajax({ crossDomain: true })`: blocked as well
3. Confirmed that this is a limitation of the browser security model, not a configuration issue

**Root cause**: The script runs on the remote page `http://20.76.57.61:18080`, and the browser forbids requests from a remote page to `localhost:13313` (same-origin policy + Mixed Content). `crossDomain: true` only tells jQuery to use cross-domain mode; it cannot bypass the browser's security policy.

How the original scenario solved this: a local scenario page (`index.html` on `localhost`) acts as a proxy. Data is first collected on the remote page, then passed back to the local page via `postMessage` or a callback, and the local page calls `localhost:13313`.

Skill mode has no local scenario page, so this proxy mechanism does not exist.

**Solution**: Move the export logic from browser JS to the Rust side (Option A2: Rust generates the XLSX locally).

**Final architecture**:
```
JS (browser): collect data → return artifact { rows, column_defs, status }
    ↓
Rust (local): parse artifact → extract rows + column_defs → generate the XLSX file
```

**Specific changes**:

1. **JS side**: Removed export-related code such as `exportWorkbook()`, `writeReportLog()`, `postJson()`, and `buildExportPayload()`. Added a `column_defs` field to the artifact and set the export status to `deferred_to_rust`.

2. **Rust side**: Added `lineloss_xlsx_export.rs`, which generates the XLSX with the `zip` crate plus OpenXML XML. In `deterministic_submit.rs`, XLSX generation is invoked once the artifact has been received.

**Files involved**:
- `collect_lineloss.js`: removed export code, added `column_defs`
- `src/compat/lineloss_xlsx_export.rs`: new
- `src/compat/deterministic_submit.rs`: added export integration
- `src/compat/mod.rs`: registered the new module

**Recompile required**: Yes

**General rule**: Any operation that calls `localhost` from a remote page is infeasible in Skill mode. Features that need access to local services, such as exporting or writing logs, must be implemented on the Rust side.

---

## Troubleshooting Methodology Summary

### 1. Diagnostic Information Injection Pattern

The script runs inside the browser, where the F12 console cannot be inspected. The only information channel is the `reasons` field of the artifact JSON.

```javascript
// Inject detailed errors in the catch block
reasons: ['api_query_failed:' + String(error?.message || error || 'unknown')]

// Inject a summary of the raw data when normalization fails
// (`|| {}` guards against an empty rawRows array)
reasons: ['row_normalization_failed:rawRows=' + rawRows.length + '|keys=' + Object.keys(rawRows[0] || {}).join(',')]

// Inject environment information in the page-context check
reason: 'page_context_unavailable:mac_missing|href=' + href + '|host=' + host + '|port=' + port
```

The summary output on the Rust side includes these reasons, so they are directly visible in the logs.

### 2. Layer-by-Layer Troubleshooting Order

```
Layer 1: Pipeline layer (Rust)
├── Are args passed in correctly? (expected_domain, target_url, org_code, etc.)
├── Is the script file read correctly?
├── Is the async return value handled correctly? (.then() pattern)
└── Does the callback return successfully?

Layer 2: Page context (JS)
├── Which page was the script injected into? (href, title)
├── Does the page have the required globals? (window.mac, etc.)
└── Is the domain match correct?

Layer 3: API request (JS)
├── Are the request headers complete? (X-Requested-With)
├── Is the response format correct? (JSON vs HTML)
└── What is the response status code?

Layer 4: Data processing (JS)
├── Do the field names returned by the API match the column definitions?
├── Are the field value types compatible? (number vs string)
└── Are there any valid rows after normalization?

Layer 5: Export (architecture)
├── Does it involve cross-origin requests?
├── Is localhost reachable?
└── Does it need to be handled on the Rust side?
```

### 3. Post-Change Verification Checklist

- [ ] JS file syntax check: `node -e "require('./collect_lineloss.js')"`
- [ ] If Rust code was changed: `cargo build` compiles successfully
- [ ] `cargo test` passes in full (excluding known pre-existing failures)
- [ ] Copy the updated JS file into the deployment directory
- [ ] If Rust was changed: redeploy the rebuilt sgclaw binary

---

## Final File List

### JS File: `collect_lineloss.js`

**Location**: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js`

**Purpose**: Pure data collection. It is injected into the browser, queries the line-loss platform API, and returns a structured artifact.

**Out of scope**: It does not call localhost:13313, does not export Excel, and does not write the report log.

### Rust Files: Change List

| File | Change | Change type |
|------|--------|-------------|
| `src/browser/callback_backend.rs` | `build_eval_js` adds `.then()` handling for async return values | General pipeline-layer fix |
| `src/compat/deterministic_submit.rs` | Full `target_url`; calls XLSX export after parsing the artifact | Business integration |
| `src/compat/lineloss_xlsx_export.rs` | XLSX generation (zip + OpenXML) | New |
| `src/compat/mod.rs` | Registers the `lineloss_xlsx_export` module | New |

---

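## Quick Reuse Template

When creating a similar skill, check the following points directly:

1. **Does `build_eval_js` support async?** If the entry function is `async`, confirm that `callback_backend.rs` has the `.then()` handling.
2. **`validatePageContext` must not check page-local state**: check only the host, not scenario-page-specific variables such as `window.mac` or `window.app`.
3. **API requests must carry `X-Requested-With: XMLHttpRequest`**: this is standard for intranet Java backends.
4. **Copy column definitions exactly from the original scenario code**: look for `cols1`/`cols2` or the table's `columns` config.
5. **`normalizeRow` uses `String(v)` rather than `pickFirstNonEmpty`**: the API returns numbers, not strings.
6. **Export goes through the Rust side, not the browser**: JS returns rows + column_defs, and Rust generates the XLSX.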
551
docs/plans/2026-03-26-zeroclaw-prompt-safety-hardening-plan.md
Normal file
@@ -0,0 +1,551 @@
# ZeroClaw Prompt Safety Hardening Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Harden ZeroClaw prompt handling and tool execution so non-skill freeform operations degrade to read-only or business-approved execution, while trusted skill-defined operations retain bounded execution privileges.

**Architecture:** Build a security gate around the existing prompt and tool-entry paths instead of rewriting the full prompt architecture. The gate classifies prompt-injection risk, records operation provenance (`trusted_skill` vs `non_skill`), sanitizes injected workspace/skill content, and enforces execution mode transitions (`clean`, `suspect_readonly`, `suspect_waiting_approval`, `suspect_business_approved`). Trusted skills gain structured business-operation metadata; non-skill operations require business-level approval before any privileged capability is released.

**Tech Stack:** Rust, vendored ZeroClaw (`third_party/zeroclaw`), existing approval/autonomy system, current prompt guard and prompt builder tests, `cargo test`.

### Task 1: Create an Isolated Worktree and Verify a Clean Baseline

**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.gitignore`
- Create: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/**`

**Step 1: Verify the worktree directory is safe to use**

Run:
```bash
cd /home/zyl/projects/sgClaw/claw
ls -d .worktrees
git check-ignore -v .worktrees
```

Expected: `.worktrees` exists and is ignored by git.

**Step 2: Create the implementation worktree**

Run:
```bash
cd /home/zyl/projects/sgClaw/claw
git worktree add .worktrees/zeroclaw-prompt-safety-hardening -b zeroclaw-prompt-safety-hardening
```

Expected: a new branch and worktree are created.

**Step 3: Build the baseline in the worktree**

Run:
```bash
cd /home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening
cargo test -p zeroclawlabs prompt_guard -- --nocapture
cargo test -p zeroclawlabs build_system_prompt -- --nocapture
```

Expected: existing relevant tests pass before any code changes.

**Step 4: Commit the clean worktree setup if `.gitignore` changed**

Run:
```bash
git add .gitignore
git commit -m "chore: prepare worktree for prompt safety hardening"
```

Expected: commit only if `.gitignore` required an adjustment.

### Task 2: Add the Core Security-Mode Data Model

**Files:**
- Create: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/operation_policy.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/mod.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/operation_policy.rs`

**Step 1: Write the failing policy tests**

Add tests that prove:
- suspicious non-skill input maps to `suspect_readonly`
- trusted skill operations can request bounded privileged execution
- any out-of-scope capability request downgrades the operation

Use concrete enums and assertions, for example:
```rust
assert_eq!(
    ExecutionMode::from_guard_and_provenance(GuardRisk::Suspicious, OperationProvenance::NonSkill),
    ExecutionMode::SuspectReadOnly
);
```

**Step 2: Run the tests to verify RED**

Run:
```bash
cd /home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening
cargo test -p zeroclawlabs operation_policy -- --nocapture
```

Expected: fail because the new types do not exist yet.

**Step 3: Implement the minimal policy model**

Define:
- `GuardRisk` (`Clean`, `Suspicious`, `Dangerous`)
- `OperationProvenance` (`TrustedSkill`, `NonSkill`, `Mixed`)
- `ExecutionMode` (`Clean`, `SuspectReadOnly`, `SuspectWaitingApproval`, `SuspectBusinessApproved`)
- `CapabilityClass` for privileged business actions

Add small helper functions that do only state mapping. Do not pull prompt-building logic into this module.
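
A minimal sketch of what this policy model could look like. Only the type and variant names and the `from_guard_and_provenance` test case come from this plan; the remaining mapping rules and the `approve` helper are assumptions made for illustration, not the final design.

```rust
// Hypothetical sketch of `operation_policy.rs`. Type and variant names follow
// the plan; every mapping rule other than (Suspicious, NonSkill) is an assumption.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GuardRisk {
    Clean,
    Suspicious,
    Dangerous,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OperationProvenance {
    TrustedSkill,
    NonSkill,
    Mixed,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionMode {
    Clean,
    SuspectReadOnly,
    SuspectWaitingApproval,
    SuspectBusinessApproved,
}

impl ExecutionMode {
    /// Pure state mapping: no prompt building and no I/O.
    pub fn from_guard_and_provenance(risk: GuardRisk, provenance: OperationProvenance) -> Self {
        match (risk, provenance) {
            // Assumption: clean input keeps the normal mode for any provenance.
            (GuardRisk::Clean, _) => ExecutionMode::Clean,
            // Assumption: a suspicious turn driven purely by a trusted skill
            // may still wait for approval of its bounded operation.
            (GuardRisk::Suspicious, OperationProvenance::TrustedSkill) => {
                ExecutionMode::SuspectWaitingApproval
            }
            // Suspicious non-skill or mixed input, and dangerous input that
            // was not already blocked upstream, fall back to read-only.
            _ => ExecutionMode::SuspectReadOnly,
        }
    }

    /// Assumed helper: approval only moves a waiting operation forward.
    pub fn approve(self) -> Self {
        match self {
            ExecutionMode::SuspectWaitingApproval => ExecutionMode::SuspectBusinessApproved,
            other => other,
        }
    }
}
```

Keeping the mapping in one exhaustive `match` means a new variant forces a compile-time decision instead of silently inheriting a permissive default.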

**Step 4: Re-run the policy tests to verify GREEN**

Run:
```bash
cargo test -p zeroclawlabs operation_policy -- --nocapture
```

Expected: the new policy tests pass.

**Step 5: Commit**

Run:
```bash
git add third_party/zeroclaw/src/security/mod.rs third_party/zeroclaw/src/security/operation_policy.rs
git commit -m "feat: add prompt security execution mode model"
```

### Task 3: Add Structured Skill Trust Metadata

**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/skills/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/tools/read_skill.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/skills/mod.rs`

**Step 1: Write failing skill metadata tests**

Add tests that prove:
- `SKILL.toml` can declare a business operation type, capability list, argument constraints, and `step_budget`
- markdown-only skills default to unprivileged metadata
- malformed privileged metadata is rejected or downgraded safely

Use a manifest shape like:
```toml
[skill]
name = "export-report"
description = "Export the monthly report"

[security]
operation_type = "browser_export_data"
allowed_capabilities = ["browser_read", "browser_export"]
step_budget = 6
approval_mode = "trusted_skill"
```

**Step 2: Run the tests to verify RED**

Run:
```bash
cargo test -p zeroclawlabs skill -- --nocapture
```

Expected: fail because the structured metadata fields are missing.

**Step 3: Implement minimal structured metadata**

Extend `Skill` with a structured security block, for example:
- `operation_type`
- `business_description`
- `allowed_capabilities`
- `arg_constraints`
- `step_budget`
- `approval_mode`

Default markdown-only skills to unprivileged metadata so existing skills remain compatible.
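
One possible shape for that security block, as a sketch. The field names mirror the list above; the `SkillSecurity` and `ApprovalMode` type names, the field types, and the `normalized` downgrade rule are assumptions, and manifest parsing is left out so the example needs only the standard library.

```rust
// Hypothetical security block for `Skill`. Field names follow the plan;
// type names, field types, and the downgrade rule are assumptions.

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ApprovalMode {
    Unprivileged,
    TrustedSkill,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SkillSecurity {
    pub operation_type: Option<String>,
    pub business_description: Option<String>,
    pub allowed_capabilities: Vec<String>,
    /// (argument name, constraint expression) pairs; the format is a placeholder.
    pub arg_constraints: Vec<(String, String)>,
    pub step_budget: u32,
    pub approval_mode: ApprovalMode,
}

impl Default for SkillSecurity {
    /// Markdown-only skills get this unprivileged metadata.
    fn default() -> Self {
        SkillSecurity {
            operation_type: None,
            business_description: None,
            allowed_capabilities: Vec::new(),
            arg_constraints: Vec::new(),
            step_budget: 0,
            approval_mode: ApprovalMode::Unprivileged,
        }
    }
}

impl SkillSecurity {
    pub fn is_privileged(&self) -> bool {
        self.approval_mode == ApprovalMode::TrustedSkill
    }

    /// Downgrade malformed privileged metadata to the unprivileged default
    /// instead of trusting a partially filled block.
    pub fn normalized(self) -> Self {
        let well_formed = self
            .operation_type
            .as_deref()
            .map_or(false, |op| !op.trim().is_empty())
            && !self.allowed_capabilities.is_empty()
            && self.step_budget > 0;
        if self.is_privileged() && !well_formed {
            SkillSecurity::default()
        } else {
            self
        }
    }
}
```

Downgrading rather than rejecting keeps a skill with a broken `[security]` table loadable as a plain unprivileged skill; rejecting outright is the stricter alternative the tests in Step 1 also allow.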

**Step 4: Make `read_skill` expose the metadata**

Return or prepend enough structured metadata so the runtime can distinguish trusted skill operations from plain prompt text.

**Step 5: Re-run the tests to verify GREEN**

Run:
```bash
cargo test -p zeroclawlabs skill -- --nocapture
```

Expected: skill parsing and `read_skill` tests pass.

**Step 6: Commit**

Run:
```bash
git add third_party/zeroclaw/src/skills/mod.rs third_party/zeroclaw/src/tools/read_skill.rs
git commit -m "feat: add trusted skill security metadata"
```

### Task 4: Sanitize Injected Workspace and Skill Content Before Prompt Assembly

**Files:**
- Create: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/prompt_sanitizer.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/channels/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/prompt.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/channels/mod.rs`

**Step 1: Write failing sanitizer tests**

Add tests that prove:
- dangerous bootstrap phrases are removed, escaped, or summarized before prompt injection
- control characters are stripped
- overlong files are truncated with an audit-friendly marker
- safe business content remains readable

**Step 2: Run the tests to verify RED**

Run:
```bash
cargo test -p zeroclawlabs build_system_prompt -- --nocapture
```

Expected: fail because injected files are still copied verbatim.

**Step 3: Implement the sanitizer**

Create a small sanitizer that:
- strips control characters
- caps content length
- flags prompt-override phrases
- emits sanitized content plus metadata such as `truncated` and matched rules

Use this sanitizer in:
- `load_openclaw_bootstrap_files`
- any shared path in `agent/prompt.rs` that renders workspace or skill text into the system prompt
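
The sanitizer described in Step 3 could start as small as the sketch below. The function name, the `SanitizedContent` struct, the truncation marker, and the two override phrases are all illustrative assumptions; the real rule set would presumably be aligned with the existing `PromptGuard` patterns. This version only flags override phrases and leaves removal or escaping to the caller.

```rust
// Hypothetical sketch of `prompt_sanitizer.rs`. All names and rules here are
// placeholders chosen for illustration.

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SanitizedContent {
    pub text: String,
    pub truncated: bool,
    pub matched_rules: Vec<&'static str>,
}

/// (rule name, lowercase phrase) pairs; illustrative only.
const OVERRIDE_PHRASES: &[(&str, &str)] = &[
    ("ignore_previous", "ignore previous instructions"),
    ("ignore_all_previous", "ignore all previous instructions"),
];

const TRUNCATION_MARKER: &str = "\n[truncated by prompt sanitizer]";

pub fn sanitize_injected_content(raw: &str, max_chars: usize) -> SanitizedContent {
    // Strip control characters but keep newlines and tabs so text stays readable.
    let cleaned: String = raw
        .chars()
        .filter(|c| !c.is_control() || *c == '\n' || *c == '\t')
        .collect();

    // Flag prompt-override phrases with a case-insensitive substring match.
    let lowered = cleaned.to_lowercase();
    let matched_rules: Vec<&'static str> = OVERRIDE_PHRASES
        .iter()
        .filter(|rule| lowered.contains(rule.1))
        .map(|rule| rule.0)
        .collect();

    // Cap the length on character boundaries and leave an audit-friendly marker.
    let truncated = cleaned.chars().count() > max_chars;
    let mut text: String = cleaned.chars().take(max_chars).collect();
    if truncated {
        text.push_str(TRUNCATION_MARKER);
    }

    SanitizedContent { text, truncated, matched_rules }
}
```

Truncating with `chars().take(..)` rather than byte slicing avoids panics on multi-byte content such as the Chinese text found in existing skill files.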

**Step 4: Re-run the tests to verify GREEN**

Run:
```bash
cargo test -p zeroclawlabs build_system_prompt -- --nocapture
```

Expected: prompt-building tests pass with the new sanitized behavior.

**Step 5: Commit**

Run:
```bash
git add third_party/zeroclaw/src/security/mod.rs third_party/zeroclaw/src/security/prompt_sanitizer.rs third_party/zeroclaw/src/channels/mod.rs third_party/zeroclaw/src/agent/prompt.rs
git commit -m "feat: sanitize injected workspace prompt content"
```

### Task 5: Wire `PromptGuard` into Main Agent and Gateway Entry Points

**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/prompt_guard.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/agent.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/gateway/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/gateway/ws.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/agent.rs`

**Step 1: Write failing entry-point tests**

Add tests that prove:
- suspicious input marks the turn as degraded instead of silently continuing
- dangerous input is blocked
- clean input remains unchanged

Prefer tests that assert on a security decision object instead of brittle prompt strings.

**Step 2: Run the tests to verify RED**

Run:
```bash
cargo test -p zeroclawlabs prompt_guard -- --nocapture
cargo test -p zeroclawlabs agent -- --nocapture
```

Expected: fail because no entry path consumes the guard result.

**Step 3: Implement guarded entry evaluation**

Before each turn:
- scan the inbound user content
- map the guard result into `GuardRisk`
- create an execution context carrying risk and provenance
- attach audit details for later logging

Keep the existing `PromptGuard` regexes unless a test demands a specific adjustment.

**Step 4: Re-run the tests to verify GREEN**

Run:
```bash
cargo test -p zeroclawlabs prompt_guard -- --nocapture
cargo test -p zeroclawlabs agent -- --nocapture
```

Expected: suspicious and blocked paths now behave deterministically.

**Step 5: Commit**

Run:
```bash
git add third_party/zeroclaw/src/security/prompt_guard.rs third_party/zeroclaw/src/agent/agent.rs third_party/zeroclaw/src/gateway/mod.rs third_party/zeroclaw/src/gateway/ws.rs
git commit -m "feat: enforce prompt guard at runtime entry points"
```

### Task 6: Add Business-Level Privileged Operation Registry and Approval Tokens

**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/approval/mod.rs`
- Create: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/business_approval.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/mod.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/security/business_approval.rs`

**Step 1: Write failing business approval tests**

Add tests that prove:
- only operations in the privileged registry can request approval
- approval tokens bind to `session_id`, `operation_type`, `allowed_capabilities`, `step_budget`, and expiration
- a mismatched or expired approval token is rejected

**Step 2: Run the tests to verify RED**

Run:
```bash
cargo test -p zeroclawlabs business_approval -- --nocapture
```

Expected: fail because the business approval registry does not exist yet.

**Step 3: Implement the registry and token model**

Create:
- a privileged business operation registry
- a single-operation approval token
- helper checks for `can_request_approval` and `matches_execution_request`

Model approval at the business-operation level, not raw tool calls.
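
A sketch of the registry and token under stated assumptions: the bound fields and the two helper names come from this plan, while the struct names, the `ExecutionRequest` shape, and the use of Unix seconds for expiration are hypothetical choices that keep the example deterministic and dependency-free.

```rust
// Hypothetical sketch of `business_approval.rs`. Bound fields and helper names
// follow the plan; everything else is assumed for illustration.

pub struct PrivilegedRegistry {
    operations: Vec<String>,
}

impl PrivilegedRegistry {
    pub fn new(operations: &[&str]) -> Self {
        PrivilegedRegistry {
            operations: operations.iter().map(|op| op.to_string()).collect(),
        }
    }

    /// Only registered business operations may ask for approval at all.
    pub fn can_request_approval(&self, operation_type: &str) -> bool {
        self.operations.iter().any(|op| op.as_str() == operation_type)
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ApprovalToken {
    pub session_id: String,
    pub operation_type: String,
    pub allowed_capabilities: Vec<String>,
    pub step_budget: u32,
    pub expires_at_unix_secs: u64,
}

/// What the dispatcher is about to do; the caller supplies the clock value.
pub struct ExecutionRequest<'a> {
    pub session_id: &'a str,
    pub operation_type: &'a str,
    pub capability: &'a str,
    pub now_unix_secs: u64,
}

impl ApprovalToken {
    /// A token is valid only for the exact session and operation it was
    /// issued for, within its capability list, budget, and lifetime.
    pub fn matches_execution_request(&self, req: &ExecutionRequest) -> bool {
        req.now_unix_secs < self.expires_at_unix_secs
            && self.step_budget > 0
            && self.session_id.as_str() == req.session_id
            && self.operation_type.as_str() == req.operation_type
            && self
                .allowed_capabilities
                .iter()
                .any(|cap| cap.as_str() == req.capability)
    }
}
```

Passing `now_unix_secs` in the request, instead of reading the system clock inside the check, keeps the expiry tests from Step 1 deterministic.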

**Step 4: Extend the existing approval module**

Teach the approval module to carry business-level fields through the current request/response flow without breaking old call sites.

**Step 5: Re-run the tests to verify GREEN**

Run:
```bash
cargo test -p zeroclawlabs business_approval -- --nocapture
```

Expected: the token validation and registry tests pass.

**Step 6: Commit**

Run:
```bash
git add third_party/zeroclaw/src/approval/mod.rs third_party/zeroclaw/src/security/mod.rs third_party/zeroclaw/src/security/business_approval.rs
git commit -m "feat: add business-level approval registry"
```

### Task 7: Enforce Execution Modes in Tool Dispatch

**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/dispatcher.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/agent.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/loop_.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/dispatcher.rs`

**Step 1: Write failing dispatcher tests**

Add tests that prove:
- `suspect_readonly` allows only safe read capabilities
- `trusted_skill` can execute capabilities listed in its metadata within `step_budget`
- `mixed` or non-skill privileged calls require a matching business approval token

**Step 2: Run the tests to verify RED**

Run:
```bash
cargo test -p zeroclawlabs dispatcher -- --nocapture
```

Expected: fail because the dispatcher does not yet know about execution modes.

**Step 3: Implement capability enforcement**

Before dispatching any tool:
- resolve the operation context
- map the tool call to a capability class
- reject calls outside the current execution mode
- decrement or validate `step_budget` for approved bounded flows

Do not rely on prompt text for enforcement.
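
The enforcement steps above can be sketched as a single pre-dispatch check. The `CapabilityClass` variants, the tool names in `capability_for_tool`, the `OperationContext` struct, and the rejection reason strings are hypothetical; only `ExecutionMode` and the idea of a `step_budget` come from this plan.

```rust
// Hypothetical pre-dispatch check. Tool names, capability classes, and reason
// strings are placeholders, not the real dispatcher API.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CapabilityClass {
    Read,
    Export,
    Write,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionMode {
    Clean,
    SuspectReadOnly,
    SuspectWaitingApproval,
    SuspectBusinessApproved,
}

#[derive(Debug, PartialEq, Eq)]
pub enum DispatchDecision {
    Allow,
    Reject(&'static str),
}

pub struct OperationContext {
    pub mode: ExecutionMode,
    /// Privileged capabilities granted by skill metadata or an approval token.
    pub allowed: Vec<CapabilityClass>,
    pub steps_left: u32,
}

/// Assumed mapping; unknown tools are treated as the most privileged class.
pub fn capability_for_tool(tool: &str) -> CapabilityClass {
    match tool {
        "browser_read" | "read_skill" => CapabilityClass::Read,
        "browser_export" => CapabilityClass::Export,
        _ => CapabilityClass::Write,
    }
}

impl OperationContext {
    /// Decide from structured state only; prompt text is never consulted.
    pub fn check_and_consume(&mut self, tool: &str) -> DispatchDecision {
        let capability = capability_for_tool(tool);
        if capability == CapabilityClass::Read {
            return DispatchDecision::Allow; // reads are allowed in every mode
        }
        match self.mode {
            ExecutionMode::SuspectReadOnly | ExecutionMode::SuspectWaitingApproval => {
                DispatchDecision::Reject("mode_forbids_privileged_capability")
            }
            ExecutionMode::Clean | ExecutionMode::SuspectBusinessApproved => {
                if !self.allowed.contains(&capability) {
                    DispatchDecision::Reject("capability_out_of_scope")
                } else if self.steps_left == 0 {
                    DispatchDecision::Reject("step_budget_exhausted")
                } else {
                    self.steps_left -= 1; // consume one step of the bounded flow
                    DispatchDecision::Allow
                }
            }
        }
    }
}
```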

**Step 4: Re-run the tests to verify GREEN**

Run:
```bash
cargo test -p zeroclawlabs dispatcher -- --nocapture
```

Expected: dispatch now respects read-only, trusted skill, and business-approved modes.

**Step 5: Commit**

Run:
```bash
git add third_party/zeroclaw/src/agent/dispatcher.rs third_party/zeroclaw/src/agent/agent.rs third_party/zeroclaw/src/agent/loop_.rs
git commit -m "feat: enforce execution mode in tool dispatch"
```

### Task 8: Default Skills Prompt Injection to Compact for Safer Runtime Behavior

**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/config/schema.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/prompt.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/channels/mod.rs`
- Test: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/config/schema.rs`

**Step 1: Write the failing configuration test**

Add a test that asserts the default skill prompt injection mode is `Compact` unless explicitly configured otherwise.

**Step 2: Run the test to verify RED**

Run:
```bash
cargo test -p zeroclawlabs skills_prompt_injection_mode -- --nocapture
```

Expected: fail because defaults still point to `Full`.

**Step 3: Implement the default flip**

Update config defaults and any prompt-builder defaults that currently assume `Full`. Keep explicit user config backward compatible.
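
As an illustration of the default flip, assuming the mode is a plain enum: the `Full` and `Compact` variants come from this plan, while the enum name `SkillsPromptInjectionMode` and the `resolve_injection_mode` helper are guesses at how the schema might express it.

```rust
// Hypothetical config enum; only the `Full` / `Compact` variant names are
// taken from the plan.

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum SkillsPromptInjectionMode {
    Full,
    /// New default: inject only compact skill summaries into the prompt.
    #[default]
    Compact,
}

/// An explicit user setting always wins; only an absent setting picks up the
/// new default, which keeps existing configs backward compatible.
pub fn resolve_injection_mode(
    configured: Option<SkillsPromptInjectionMode>,
) -> SkillsPromptInjectionMode {
    configured.unwrap_or_default()
}
```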

**Step 4: Re-run the test to verify GREEN**

Run:
```bash
cargo test -p zeroclawlabs skills_prompt_injection_mode -- --nocapture
```

Expected: default configuration now resolves to `Compact`.

**Step 5: Commit**

Run:
```bash
git add third_party/zeroclaw/src/config/schema.rs third_party/zeroclaw/src/agent/prompt.rs third_party/zeroclaw/src/channels/mod.rs
git commit -m "feat: default skills prompt injection to compact"
```

### Task 9: Add Audit Logging and Regression Coverage

**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/observability/mod.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/agent/agent.rs`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/src/channels/mod.rs`
- Create: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/third_party/zeroclaw/tests/prompt_safety_regression.rs`

**Step 1: Write the failing regression tests**

Cover:
- prompt override attack from user content
- malicious `AGENTS.md` bootstrap content
- trusted skill execution within bounds
- non-skill privileged request requiring business approval
- approval token mismatch
- session history restore preserving degraded mode

**Step 2: Run the tests to verify RED**

Run:
```bash
cargo test -p zeroclawlabs --test prompt_safety_regression -- --nocapture
```

Expected: fail because the end-to-end behavior is not wired together yet.

**Step 3: Implement audit logging**

Record:
- input hash
- matched guard rules
- risk level
- provenance
- execution mode transitions
- approval decisions

Avoid logging raw sensitive content.
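
A sketch of an audit record that carries the fields listed above without the raw input. The `AuditRecord` struct, `hash_input`, and the log line format are hypothetical. The standard library's `DefaultHasher` is used only to keep the example dependency-free; it is not cryptographic and its output is not guaranteed stable across Rust releases, so a real implementation would likely use a proper digest.

```rust
// Hypothetical audit record. `DefaultHasher` is a stand-in for a real digest
// and is NOT suitable as a security-grade hash.

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

pub struct AuditRecord {
    pub input_hash: u64,
    pub matched_rules: Vec<String>,
    pub risk: String,
    pub provenance: String,
    /// (previous mode, new mode)
    pub mode_transition: (String, String),
    /// None when no approval was requested for this turn.
    pub approval_decision: Option<bool>,
}

/// Hash the raw input so log lines can be correlated without storing content.
pub fn hash_input(raw: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    raw.hash(&mut hasher);
    hasher.finish()
}

impl AuditRecord {
    /// Render a single log line; the raw input never appears in it.
    pub fn to_log_line(&self) -> String {
        format!(
            "input_hash={:016x} risk={} provenance={} mode={}->{} rules={} approval={:?}",
            self.input_hash,
            self.risk,
            self.provenance,
            self.mode_transition.0,
            self.mode_transition.1,
            self.matched_rules.join(","),
            self.approval_decision
        )
    }
}
```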

**Step 4: Re-run the regression tests to verify GREEN**

Run:
```bash
cargo test -p zeroclawlabs --test prompt_safety_regression -- --nocapture
```

Expected: the regression suite passes.

**Step 5: Commit**

Run:
```bash
git add third_party/zeroclaw/src/observability/mod.rs third_party/zeroclaw/src/agent/agent.rs third_party/zeroclaw/src/channels/mod.rs third_party/zeroclaw/tests/prompt_safety_regression.rs
git commit -m "test: add prompt safety regression coverage"
```

### Task 10: Final Verification and Integration Review

**Files:**
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/docs/L5-提示词分布与安全改造方案.md`
- Modify: `/home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening/docs/README.md`

**Step 1: Run targeted verification**

Run:
```bash
cd /home/zyl/projects/sgClaw/claw/.worktrees/zeroclaw-prompt-safety-hardening
cargo test -p zeroclawlabs prompt_guard -- --nocapture
cargo test -p zeroclawlabs build_system_prompt -- --nocapture
cargo test -p zeroclawlabs dispatcher -- --nocapture
cargo test -p zeroclawlabs --test prompt_safety_regression -- --nocapture
```

Expected: all prompt safety and dispatcher tests pass.

**Step 2: Run a broad ZeroClaw package test pass if time permits**

Run:
```bash
cargo test -p zeroclawlabs -- --nocapture
```

Expected: no regressions in the vendored package test suite, or a documented list of unrelated existing failures.

**Step 3: Update the security design docs**

Document:
- execution modes
- trusted skill metadata contract
- business approval flow
- why non-skill privileged actions are gated

**Step 4: Commit the docs**

Run:
```bash
git add docs/L5-提示词分布与安全改造方案.md docs/README.md
git commit -m "docs: record prompt safety hardening design"
```

**Step 5: Prepare merge review notes**

Write a short integration summary covering:
- changed entry points
- backward-compatibility expectations
- any skills that need metadata upgrades
- rollout recommendation for existing integrators
179
docs/plans/2026-03-27-sgclaw-chat-first-ui-refactor-plan.md
Normal file
@@ -0,0 +1,179 @@
# sgClaw Chat-First UI Refactor Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Rebuild the sgClaw floating chat UI into a chat-first plugin-style product where the message timeline is primary, `执行摘要` (execution summary) is folded into the conversation, and `调试` (debug) opens as a full-window overlay instead of occupying persistent space.

**Architecture:** Keep `chrome://superrpa-functions/sgclaw-chat` as the first verified host because it already has Lit-based unit tests, then mirror the same information architecture and visual hierarchy into the ordinary-page injected `sgclaw_overlay.js`. Do not introduce a new backend contract; only rearrange presentation, panel semantics, and message/result composition around the existing runtime state.

**Tech Stack:** Chromium WebUI, Lit templates/components, injected Shadow DOM overlay JavaScript, existing SuperRPA bridge/runtime callbacks, mainline TS unit tests.

### Task 1: Lock The New Information Architecture In Tests

**Files:**
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts`
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state_mainline_unittest.ts`

**Step 1: Write the failing test**

Add assertions for these exact product rules:
- `getHtml()` must no longer emit the legacy `debug-note`.
- the main chat template must define a dedicated overlay/sheet container for `history`, `settings`, and `debug`.
- the debug panel must be described as a full-window overlay rather than a side drawer/log block.
- the result presentation must be part of the message stream, not a standalone persistent secondary panel.

**Step 2: Run test to verify it fails**

Run:
```bash
node --test /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts
```

Expected: FAIL because current template still includes `debug-note`, side-by-side panel layout, and standalone result panel semantics.

**Step 3: Write minimal implementation**

Change only template/component strings and assertions needed to express the new structure, without touching styling yet.

**Step 4: Run test to verify it passes**

Run the same command.

Expected: PASS.

### Task 2: Refactor `chrome://` sgClaw Into Chat-First Structure
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-header.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-composer.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-debug-drawer.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-history-panel.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-settings-panel.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-list.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-result.ts`
|
||||
|
||||
**Step 1: Keep the header narrow**
|
||||
|
||||
Make the header carry only:
|
||||
- brand
|
||||
- current page label
|
||||
- compact runtime status
|
||||
- actions for `新对话 / 历史 / 设置 / 调试 / 收起`
|
||||
|
||||
Remove the large subtitle/debug framing and the separate heavy runtime action row feel.
|
||||
|
||||
**Step 2: Make the message timeline primary**
|
||||
|
||||
Turn the main shell body into:
|
||||
- a single timeline container
|
||||
- optional empty-state presets
|
||||
- no persistent secondary summary card
|
||||
|
||||
`finalResult` should render as a folded result card appended in the stream.
|
||||
|
||||
**Step 3: Convert secondary panels into full overlays**
|
||||
|
||||
Render `history`, `settings`, and `debug` inside a full-window overlay/sheet that covers the chat content area instead of sitting above or beside it.
|
||||
|
||||
**Step 4: Re-skin toward the approved direction**
|
||||
|
||||
Use:
|
||||
- soft neutral surfaces
|
||||
- restrained accent usage
|
||||
- thinner borders
|
||||
- calmer shadows
|
||||
- clearer assistant/user card contrast
|
||||
|
||||
Avoid:
|
||||
- debug-workbench feeling
|
||||
- large gradient blocks
|
||||
- heavy explanatory copy in the main flow
|
||||
|
||||
**Step 5: Run the unit tests**
|
||||
|
||||
Run:
|
||||
```bash
node --test /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
### Task 3: Mirror The Same Structure Into Ordinary-Page Overlay
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js`
|
||||
|
||||
**Step 1: Remove the standalone result panel**
|
||||
|
||||
Delete the always-visible `执行摘要` block from the main window body.
|
||||
|
||||
**Step 2: Introduce overlay panels**
|
||||
|
||||
Change panel rendering so `history`, `settings`, and `debug` appear in a dedicated full-window overlay layer within the floating window instead of as sibling blocks consuming vertical space.
|
||||
|
||||
**Step 3: Rebuild the shell**
|
||||
|
||||
Match the `chrome://` layout:
|
||||
- compact header
|
||||
- primary message timeline
|
||||
- folded result card inside conversation
|
||||
- sticky composer
|
||||
|
||||
**Step 4: Preserve behavior**
|
||||
|
||||
Do not break:
|
||||
- `sgclaw.newConversation`
|
||||
- `sgclaw.restoreConversation`
|
||||
- runtime polling
|
||||
- config save/load
|
||||
- unread badge behavior
|
||||
|
||||
**Step 5: Run a syntax sanity check**
|
||||
|
||||
Run:
|
||||
```bash
node --check /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
### Task 4: Verify Browser Resource Integration
|
||||
|
||||
**Files:**
|
||||
- No new source files; verification only
|
||||
|
||||
**Step 1: Build the TS / mainline test target**
|
||||
|
||||
Run:
|
||||
```bash
bash -lc "autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease functions_ui_mainline_unittests"
```
|
||||
|
||||
Expected: build succeeds.
|
||||
|
||||
**Step 2: Run targeted mainline unit tests**
|
||||
|
||||
Run:
|
||||
```bash
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests --gtest_filter='FunctionsUiMainlineTest.*sgclaw*'
```
|
||||
|
||||
If the filter matches no tests, run the full binary and confirm it exits with status `0`.
|
||||
|
||||
**Step 3: Rebuild browser resources if needed**
|
||||
|
||||
Run:
|
||||
```bash
bash -lc "autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome"
```
|
||||
|
||||
**Step 4: Manually verify product behavior**
|
||||
|
||||
Check:
|
||||
- ordinary webpage floating window
|
||||
- `chrome://superrpa-functions/sgclaw-chat`
|
||||
- `调试` opens as full overlay
|
||||
- `执行摘要` no longer blocks the main conversation
|
||||
- `历史` and `设置` do not consume persistent layout space
|
||||
148
docs/plans/2026-03-27-sgclaw-configurable-skills-dir-plan.md
Normal file
@@ -0,0 +1,148 @@
|
||||
# SGClaw Configurable Skills Directory Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Let `sgclaw` own skill-directory resolution and allow users to set a custom skills directory in `sgclaw_config.json` without relying on SuperRPA to copy skills into the runtime workspace.
|
||||
|
||||
**Architecture:** Extend the existing browser JSON config parser so `sgclaw` can read an optional `skillsDir` field alongside DeepSeek settings. Keep the current embedded ZeroClaw workspace for memory/config internals, but decouple skill loading from that fixed path by resolving a configurable skills root at runtime. Preserve backward compatibility by defaulting to `<workspace_root>/.sgclaw-zeroclaw-workspace/skills` when `skillsDir` is absent or empty.
|
||||
|
||||
**Tech Stack:** Rust, serde JSON parsing, existing ZeroClaw compatibility runtime, cargo test
|
||||
|
||||
### Task 1: Capture browser config requirements
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/config/settings.rs`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_config_test.rs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
Add tests that load `sgclaw_config.json` containing:
|
||||
- no `skillsDir`
|
||||
- a relative `skillsDir`
|
||||
- an absolute `skillsDir`
|
||||
|
||||
Assert that:
|
||||
- a missing `skillsDir` falls back to the default workspace skills path
|
||||
- relative values resolve against the browser config directory
|
||||
- absolute values are preserved
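
The three assertions above can be sketched as one small resolver. The real implementation belongs in Rust (`settings.rs`); this TypeScript version only illustrates the intended rules. `BrowserConfig`, `joinPosix`, and `resolveSkillsDir` are hypothetical names, and the sketch assumes POSIX-style paths and performs no normalization of `.` or `..` segments.

```typescript
// Hypothetical shape of the parsed browser config; only `skillsDir` comes from the plan.
interface BrowserConfig {
  skillsDir?: string;
}

// Default skills location relative to the workspace root, as stated in the plan.
const DEFAULT_SKILLS_SUFFIX = '.sgclaw-zeroclaw-workspace/skills';

// Joins two POSIX path segments with exactly one separator between them.
function joinPosix(base: string, rest: string): string {
  return base.replace(/\/+$/, '') + '/' + rest.replace(/^\/+/, '');
}

// Returns the effective skills directory for the three cases in the test list.
function resolveSkillsDir(
  config: BrowserConfig,
  configDir: string,
  workspaceRoot: string,
): string {
  const raw = config.skillsDir ?? '';
  if (raw === '') {
    // Absent or empty: fall back to the default workspace skills path.
    return joinPosix(workspaceRoot, DEFAULT_SKILLS_SUFFIX);
  }
  if (raw.startsWith('/')) {
    // Absolute values are preserved as-is.
    return raw;
  }
  // Relative values resolve against the directory holding the browser config.
  return joinPosix(configDir, raw);
}
```

Passing the config directory and workspace root as parameters keeps the resolver free of filesystem access, so each of the three test cases can be driven with plain strings.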
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `cargo test compat_config -- --nocapture`
|
||||
|
||||
Expected: FAIL because neither `DeepSeekSettings` nor the config adapter exposes a skills directory override yet.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Add a browser-config structure that parses `skillsDir` and expose a resolver function that returns the effective skills directory for `sgclaw`.
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `cargo test compat_config -- --nocapture`
|
||||
|
||||
Expected: PASS for the new parsing and path-resolution cases.
|
||||
|
||||
### Task 2: Route compat runtime skill loading through sgclaw-owned resolution
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/config_adapter.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
Add a compat runtime test that creates:
|
||||
- a default workspace skill package under `.sgclaw-zeroclaw-workspace/skills`
|
||||
- a custom skill package under another directory configured via `skillsDir`
|
||||
|
||||
Assert that the provider request payload contains only the configured skill name when `skillsDir` is set, and still contains the workspace skill names when the override is absent.
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `cargo test compat_runtime -- --nocapture`
|
||||
|
||||
Expected: FAIL because the runtime currently always loads skills from `config.workspace_dir`.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Keep `config.workspace_dir` for ZeroClaw internal state, but load skills from the resolved effective skills directory by calling `load_skills_from_directory` directly when a custom directory is configured.
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `cargo test compat_runtime -- --nocapture`
|
||||
|
||||
Expected: PASS, and the provider request payload lists the expected skills under `Available Skills`.
|
||||
|
||||
### Task 3: Document and verify backward compatibility
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/docs/README.md`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/docs/L5-提示词分布与安全改造方案.md`
|
||||
|
||||
**Step 1: Write the failing check**
|
||||
|
||||
Record the expected runtime behavior:
|
||||
- `sgclaw` owns skill lookup
|
||||
- SuperRPA only passes `--config-path`
|
||||
- `skillsDir` is optional
|
||||
|
||||
**Step 2: Run verification**
|
||||
|
||||
Run: `rg -n "skillsDir|sgclaw owns skill lookup|config-path" docs`
|
||||
|
||||
Expected: no matches, because the docs have not been updated yet.
|
||||
|
||||
**Step 3: Write minimal documentation**
|
||||
|
||||
Document:
|
||||
- JSON field name
|
||||
- relative-path resolution base
|
||||
- default fallback
|
||||
- operational implication for SuperRPA integration
|
||||
|
||||
**Step 4: Run verification**
|
||||
|
||||
Run: `rg -n "skillsDir|sgclaw owns skill lookup|config-path" docs`
|
||||
|
||||
Expected: PASS, with matches found in the updated docs.
|
||||
|
||||
### Task 4: Final verification
|
||||
|
||||
**Files:**
|
||||
- Review only: `/home/zyl/projects/sgClaw/claw/src/config/settings.rs`
|
||||
- Review only: `/home/zyl/projects/sgClaw/claw/src/compat/config_adapter.rs`
|
||||
- Review only: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
|
||||
- Review only: `/home/zyl/projects/sgClaw/claw/tests/compat_config_test.rs`
|
||||
- Review only: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
|
||||
|
||||
**Step 1: Run targeted tests**
|
||||
|
||||
Run: `cargo test compat_config -- --nocapture`
|
||||
|
||||
Expected: PASS
|
||||
|
||||
**Step 2: Run runtime tests**
|
||||
|
||||
Run: `cargo test compat_runtime -- --nocapture`
|
||||
|
||||
Expected: PASS
|
||||
|
||||
**Step 3: Run skill-lib structural validation**
|
||||
|
||||
Run: `python3 -m unittest tests.skill_lib_validation_test -v`
|
||||
|
||||
Expected: PASS
|
||||
|
||||
**Step 4: Commit**
|
||||
|
||||
```bash
git add docs/plans/2026-03-27-sgclaw-configurable-skills-dir-plan.md \
  src/config/settings.rs \
  src/compat/config_adapter.rs \
  src/compat/runtime.rs \
  tests/compat_config_test.rs \
  tests/compat_runtime_test.rs \
  docs/README.md \
  docs/L5-提示词分布与安全改造方案.md
git commit -m "feat: make sgclaw skills directory configurable"
```
|
||||
624
docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md
Normal file
@@ -0,0 +1,624 @@
|
||||
# sgClaw Floating Chat Frontend Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Replace the current debug-style `sgclaw-chat` UI with a complete floating-chat frontend that matches the product structure of Doubao's side panel while preserving the current SuperRPA bridge and configuration capabilities.
|
||||
|
||||
**Architecture:** Keep `chrome://superrpa-functions/sgclaw-chat` as the first delivery host so the new UI can be built and verified without waiting for the final page-floating container. Split the current monolithic Lit component into host adapter, state modules, typed message model, presentational components, and secondary panels so the same UI can later be mounted in a real injected floating window on normal web pages. Preserve the existing browser bridge (`sgclawConnect`, `sgclawStart`, `sgclawStop`, `sgclawSubmitTask`) and re-home logs/configuration into secondary panels instead of deleting them.
|
||||
|
||||
**Tech Stack:** Chromium WebUI, Lit, existing `FunctionsUI` router, SuperRPA browser bridge callbacks, current `sgclaw-config` config page logic, future floating host injection in SuperRPA.
|
||||
|
||||
## Product Target
|
||||
|
||||
The frontend target is a single-column chat product, not a multi-card debug workstation.
|
||||
|
||||
Final visual structure:
|
||||
|
||||
```text
Collapsed Fab
┌────────────┐
│ sgClaw ●2 │
└────────────┘

Expanded Chat
┌──────────────────────────────────────────┐
│ sgClaw | 当前网页:example.com           │
│ [新对话] [历史] [设置] [收起]            │
│ 状态:待命 / 执行中 / 出错               │
├──────────────────────────────────────────┤
│ 欢迎区 / 推荐动作                        │
│ [总结当前页面] [提取表格] [执行网页操作] │
├──────────────────────────────────────────┤
│ 消息流                                   │
│ 用户消息                                 │
│ 助手消息                                 │
│ 步骤卡 / 结果卡 / 错误卡                 │
├──────────────────────────────────────────┤
│ [网页执行] [页面问答] [页面总结]         │
│ [上下文开关] [调试] [更多]               │
│ ┌──────────────────────────────────────┐ │
│ │ 输入任务...                          │ │
│ └──────────────────────────────────────┘ │
│                                    [发送]│
└──────────────────────────────────────────┘
```
|
||||
|
||||
Core UX rules:
|
||||
|
||||
- The primary content area is always the message stream.
|
||||
- `finalResult` becomes a result card inside the message stream.
|
||||
- `logs` move into a hidden debug drawer.
|
||||
- `start/stop` remain available but move to the header status area.
|
||||
- Configuration remains available but opens inside a settings panel first, with route-navigation fallback to `chrome://superrpa-functions/sgclaw-config`.
|
||||
- The same component tree must work in `FunctionsUI` first and later inside a real injected floating host.
|
||||
|
||||
## Scope
|
||||
|
||||
### In Scope For This Frontend Plan
|
||||
|
||||
- Complete visual redesign of `sgclaw-chat`
|
||||
- Empty state, active chat state, running state, success state, error state
|
||||
- Local conversation history UI
|
||||
- Embedded settings panel
|
||||
- Debug drawer
|
||||
- Stable typed message model
|
||||
- Separation of host bridge code from UI code
|
||||
- Floating launcher state model
|
||||
|
||||
### Explicitly Out Of Scope For First Frontend Delivery
|
||||
|
||||
- Real attachment upload execution
|
||||
- Deep-thinking or multi-skill plugin ecosystem
|
||||
- Provider/protocol redesign on the Rust side
|
||||
- Full page-injected floating host implementation
|
||||
- New backend APIs beyond the current bridge
|
||||
|
||||
## Existing Baseline To Reuse
|
||||
|
||||
The implementation should reuse these existing assets instead of replacing them blindly:
|
||||
|
||||
- Host page routing: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/functions.ts`
|
||||
- Existing chat entry registration: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/functions_manifest.json`
|
||||
- Current chat page bridge logic: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
|
||||
- Current floating state prototype: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state.ts`
|
||||
- Current config UI and bridge: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config.ts`
|
||||
|
||||
## Final File Layout
|
||||
|
||||
All implementation paths below are exact and rooted in `/home/zyl/projects/superRpa/src`.
|
||||
|
||||
### Core Chat Entry
|
||||
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts`
|
||||
|
||||
### State Modules
|
||||
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_window_state.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_conversation_state.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts`
|
||||
|
||||
### Host Adapter
|
||||
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_host_adapter.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_runtime_bridge.ts`
|
||||
|
||||
### Message Model And Rendering
|
||||
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-list.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-user.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-assistant.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-step.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-result.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-error.ts`
|
||||
|
||||
### Shell Components
|
||||
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-shell.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-header.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-composer.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-history-panel.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-settings-panel.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-debug-drawer.ts`
|
||||
|
||||
### Build And Host Wiring
|
||||
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/BUILD.gn`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/functions.html.ts`
|
||||
|
||||
### Tests
|
||||
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state_mainline_unittest.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_window_state_mainline_unittest.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state_mainline_unittest.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state_mainline_unittest.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_runtime_bridge_mainline_unittest.ts`
|
||||
|
||||
## Target State Model
|
||||
|
||||
Use a typed model instead of the current loose shape.
|
||||
|
||||
```ts
interface SgClawChatWindowState {
  windowOpen: boolean;
  activePanel: 'chat' | 'history' | 'settings' | 'debug';
  unreadCount: number;
}

interface SgClawChatConversationState {
  conversationId: string;
  draftInput: string;
  mode: 'web-action' | 'page-qa' | 'page-summary';
  contextEnabled: boolean;
  messages: SgClawMessage[];
}

interface SgClawMessage {
  id: string;
  type: 'user_text' | 'assistant_text' | 'task_step' | 'task_result' | 'task_error' | 'system_notice';
  role: 'user' | 'assistant' | 'system';
  content: string;
  status?: 'pending' | 'running' | 'done' | 'failed';
  timestamp: number;
  meta?: Record<string, unknown>;
}
```
|
||||
|
||||
The current `logs`, `messages`, `finalResult`, `pendingReply`, and `busy` state should be re-expressed through these typed stores instead of being owned directly by the entry component.
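
As a rough illustration of that re-expression, the sketch below converts a snapshot of the legacy fields into `SgClawMessage` entries. `LegacyChatState` and `toTypedMessages` are hypothetical: the real legacy field shapes are not shown in this plan and may differ. `logs` and `busy` are left out on purpose, on the assumption that logs belong to the debug drawer and `busy` can be derived from message status.

```typescript
type SgClawMessageType =
  | 'user_text' | 'assistant_text' | 'task_step'
  | 'task_result' | 'task_error' | 'system_notice';

interface SgClawMessage {
  id: string;
  type: SgClawMessageType;
  role: 'user' | 'assistant' | 'system';
  content: string;
  status?: 'pending' | 'running' | 'done' | 'failed';
  timestamp: number;
  meta?: Record<string, unknown>;
}

// Hypothetical snapshot of the legacy entry-component fields (assumed shapes).
interface LegacyChatState {
  messages: Array<{role: 'user' | 'assistant'; text: string}>;
  finalResult: string;
  pendingReply: boolean;
}

// Converts the loose legacy fields into a single typed message stream.
function toTypedMessages(legacy: LegacyChatState, now: number): SgClawMessage[] {
  const out: SgClawMessage[] = legacy.messages.map(
    (m, i): SgClawMessage => ({
      id: `legacy-${i}`,
      type: m.role === 'user' ? 'user_text' : 'assistant_text',
      role: m.role,
      content: m.text,
      status: 'done',
      timestamp: now,
    }),
  );
  if (legacy.finalResult !== '') {
    // `finalResult` becomes a result card inside the stream.
    out.push({
      id: 'legacy-result',
      type: 'task_result',
      role: 'assistant',
      content: legacy.finalResult,
      status: 'done',
      timestamp: now,
    });
  }
  if (legacy.pendingReply) {
    // A pending reply becomes an assistant message with `pending` status.
    out.push({
      id: 'legacy-pending',
      type: 'assistant_text',
      role: 'assistant',
      content: '',
      status: 'pending',
      timestamp: now,
    });
  }
  return out;
}
```

Taking `now` as a parameter instead of calling `Date.now()` keeps the conversion deterministic, which makes it easy to assert on in a state test.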
|
||||
|
||||
## Task 1: Freeze The Current Entry And Enable Real Template/CSS Modules
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts`
|
||||
|
||||
**Step 1: Write the failing structure test**
|
||||
|
||||
Add assertions that the entry no longer hardcodes the full DOM layout in `render()` and imports its shell template/style helpers.
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run:
|
||||
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease sgclaw-chat_build_ts
```
|
||||
|
||||
Expected: fail because `sgclaw-chat.html.ts` and `sgclaw-chat.css.ts` are empty and the new test expects real exports.
|
||||
|
||||
**Step 3: Write the minimal implementation**
|
||||
|
||||
- Move root shell markup to `getHtml()`
|
||||
- Move root style tokens/layout to `getCss()`
|
||||
- Keep `sgclaw-chat.ts` focused on state + events
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run the same build target.
|
||||
|
||||
Expected: TS build succeeds and the entry uses external template/style helpers.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git -C /home/zyl/projects/superRpa/src add \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "refactor: extract sgclaw chat shell template"
```
|
||||
|
||||
## Task 2: Build The Window, Conversation, History, And Settings State Modules
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_window_state.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_conversation_state.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state_mainline_unittest.ts`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_window_state_mainline_unittest.ts`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state_mainline_unittest.ts`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state_mainline_unittest.ts`
|
||||
|
||||
**Step 1: Write the failing pure-state tests**
|
||||
|
||||
Cover:
|
||||
- open/close/switch panel transitions
|
||||
- unread count clear on open
|
||||
- create/reset conversation
|
||||
- local history push/select/remove
|
||||
- settings draft dirty detection
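
The window-state transitions in this list could look roughly like the sketch below. The function names (`openWindow`, `closeWindow`, `switchPanel`, `notifyUnread`) are hypothetical, and incrementing the unread count only while the window is closed is an assumption about the badge behavior, not something this plan specifies.

```typescript
type SgClawPanel = 'chat' | 'history' | 'settings' | 'debug';

interface SgClawChatWindowState {
  windowOpen: boolean;
  activePanel: SgClawPanel;
  unreadCount: number;
}

// Opening the window clears the unread count.
function openWindow(state: SgClawChatWindowState): SgClawChatWindowState {
  return {...state, windowOpen: true, unreadCount: 0};
}

// Closing leaves the active panel and unread count untouched.
function closeWindow(state: SgClawChatWindowState): SgClawChatWindowState {
  return {...state, windowOpen: false};
}

// Switching panels never changes whether the window is open.
function switchPanel(
  state: SgClawChatWindowState,
  panel: SgClawPanel,
): SgClawChatWindowState {
  return {...state, activePanel: panel};
}

// Assumption: new messages count as unread only while the window is closed.
function notifyUnread(state: SgClawChatWindowState): SgClawChatWindowState {
  return state.windowOpen ? state : {...state, unreadCount: state.unreadCount + 1};
}
```

Each function returns a new object and never mutates its input, so a test can compare the state before and after a transition directly.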
|
||||
|
||||
**Step 2: Run tests to verify RED**
|
||||
|
||||
Run:
|
||||
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease sgclaw-chat_build_ts
```
|
||||
|
||||
Expected: the build fails because the new state modules imported by the tests do not exist yet.
|
||||
|
||||
**Step 3: Write the minimal implementation**
|
||||
|
||||
Implement pure functions only. Do not mix DOM work into these modules.
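
A minimal sketch of what "pure functions only" might mean for the history module follows. `HistoryEntry`, `SgClawChatHistoryState`, and the three functions are hypothetical names; the entry fields, the newest-first ordering, and the de-duplication by `conversationId` are assumptions made for illustration.

```typescript
// Hypothetical history entry; only the idea of local conversation history comes from the plan.
interface HistoryEntry {
  conversationId: string;
  title: string;
  updatedAt: number;
}

interface SgClawChatHistoryState {
  entries: readonly HistoryEntry[];
  selectedId: string | null;
}

// Adds an entry at the front, replacing any older entry with the same id.
function pushHistory(
  state: SgClawChatHistoryState,
  entry: HistoryEntry,
): SgClawChatHistoryState {
  const rest = state.entries.filter((e) => e.conversationId !== entry.conversationId);
  return {...state, entries: [entry, ...rest]};
}

// Selects an entry only if it exists; otherwise returns the state unchanged.
function selectHistory(
  state: SgClawChatHistoryState,
  conversationId: string,
): SgClawChatHistoryState {
  const exists = state.entries.some((e) => e.conversationId === conversationId);
  return exists ? {...state, selectedId: conversationId} : state;
}

// Removes an entry and clears the selection if it pointed at the removed entry.
function removeHistory(
  state: SgClawChatHistoryState,
  conversationId: string,
): SgClawChatHistoryState {
  return {
    entries: state.entries.filter((e) => e.conversationId !== conversationId),
    selectedId: state.selectedId === conversationId ? null : state.selectedId,
  };
}
```

None of these functions touch the DOM or storage, so persistence and rendering can be layered on by the entry component without changing the tests.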
|
||||
|
||||
**Step 4: Run tests to verify GREEN**
|
||||
|
||||
Run the same build target.
|
||||
|
||||
Expected: all pure-state modules compile and their tests pass.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git -C /home/zyl/projects/superRpa/src add \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_window_state.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_conversation_state.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state_mainline_unittest.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_window_state_mainline_unittest.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state_mainline_unittest.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "feat: add sgclaw chat state modules"
```
|
||||
|
||||
## Task 3: Introduce A Host Adapter So UI Stops Owning Bridge Details
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_host_adapter.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_runtime_bridge.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_runtime_bridge_mainline_unittest.ts`
|
||||
|
||||
**Step 1: Write the failing bridge test**
|
||||
|
||||
Test that:
|
||||
- `connect()` issues `sgclawConnect`
|
||||
- `start()` issues `sgclawStart`
|
||||
- `stop()` issues `sgclawStop`
|
||||
- `submitTask()` issues `sgclawSubmitTask`
|
||||
- callback payload parsing is handled in one place
|
||||
|
||||
**Step 2: Run test to verify RED**
|
||||
|
||||
Run:
|
||||
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease sgclaw-chat_build_ts
```
|
||||
|
||||
Expected: the build fails because the adapter modules do not exist yet.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
- Wrap `chrome.send`
|
||||
- Centralize callback registration
|
||||
- Return typed runtime events/state to the UI layer
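
The three points above can be sketched as a bridge that receives its transport by injection, so a unit test can substitute a recorder for `chrome.send`. The four message names come from this plan; `SendFn`, `RuntimeEvent`, `parseRuntimeEvent`, `createRuntimeBridge`, the JSON payload shape, and the argument format of `sgclawSubmitTask` are all assumptions for illustration.

```typescript
// Injected transport; in the WebUI host this would wrap `chrome.send`.
type SendFn = (message: string, args?: unknown[]) => void;

// Hypothetical event shapes; the real callback payloads may differ.
type RuntimeEvent =
  | {kind: 'task_complete'; result: string}
  | {kind: 'error'; message: string};

type RuntimeListener = (event: RuntimeEvent) => void;

// Single place where raw callback payloads are parsed into typed events.
function parseRuntimeEvent(raw: string): RuntimeEvent | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== 'object' || data === null) {
    return null;
  }
  const record = data as Record<string, unknown>;
  const type = record['type'];
  const result = record['result'];
  const message = record['message'];
  if (type === 'task_complete' && typeof result === 'string') {
    return {kind: 'task_complete', result};
  }
  if (type === 'error' && typeof message === 'string') {
    return {kind: 'error', message};
  }
  return null;
}

function createRuntimeBridge(send: SendFn) {
  const listeners = new Set<RuntimeListener>();
  return {
    connect: (): void => send('sgclawConnect'),
    start: (): void => send('sgclawStart'),
    stop: (): void => send('sgclawStop'),
    // Assumption: the task text is passed as the only argument.
    submitTask: (instruction: string): void => send('sgclawSubmitTask', [instruction]),
    // Registers a listener and returns an unsubscribe function.
    onEvent(listener: RuntimeListener): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
    // Entry point for host callbacks; unparseable payloads are dropped.
    handleCallback(raw: string): void {
      const event = parseRuntimeEvent(raw);
      if (event !== null) {
        listeners.forEach((listener) => listener(event));
      }
    },
  };
}
```

With this shape, the bridge test only needs to record the `send` calls and feed strings into `handleCallback`; no browser globals are required.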
|
||||
|
||||
**Step 4: Run test to verify GREEN**
|
||||
|
||||
Run the same build target.
|
||||
|
||||
Expected: adapter tests compile and pass.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git -C /home/zyl/projects/superRpa/src add \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_host_adapter.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_runtime_bridge.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_runtime_bridge_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "refactor: isolate sgclaw chat host bridge"
```
|
||||
|
||||
## Task 4: Replace The Loose Message Format With Typed Cards In The Message Stream
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-list.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-user.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-assistant.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-step.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-result.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-error.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts`
|
||||
|
||||
**Step 1: Write the failing rendering test**
|
||||
|
||||
Add expectations that:
|
||||
- empty state shows guidance instead of a blank box
|
||||
- `task_complete` renders a result card in the message stream
|
||||
- `error` renders an error card in the message stream
|
||||
- `pendingReply` renders an assistant pending card
|
||||
|
||||
**Step 2: Run test to verify RED**
|
||||
|
||||
Run the same `sgclaw-chat_build_ts` build target used in the earlier tasks.
|
||||
|
||||
Expected: fail because message types and card components do not exist.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
- Keep the message list single-column
|
||||
- Preserve current user/assistant turn behavior
|
||||
- Move `finalResult` handling into result-card rendering
|
||||
- Move error display into message flow
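
One way to keep the list single-column while still rendering typed cards is a lookup from message type to card component, plus a mapping from runtime event names to message types. The sketch below is illustrative only: the tag names mirror the component file names in this plan but the registered custom-element names are an assumption, as are `messageTypeForEvent`, `cardTagFor`, and the fallback choices for `system_notice` and unknown events.

```typescript
type SgClawMessageType =
  | 'user_text' | 'assistant_text' | 'task_step'
  | 'task_result' | 'task_error' | 'system_notice';

// Assumed tag names, derived from the component file names in the plan.
const CARD_TAG_BY_TYPE: Record<SgClawMessageType, string> = {
  user_text: 'sgclaw-message-card-user',
  assistant_text: 'sgclaw-message-card-assistant',
  task_step: 'sgclaw-message-card-step',
  task_result: 'sgclaw-message-card-result',
  task_error: 'sgclaw-message-card-error',
  // Assumption: notices reuse the assistant card because no notice card is planned.
  system_notice: 'sgclaw-message-card-assistant',
};

// Maps the runtime event names from the rendering test to message types.
function messageTypeForEvent(eventName: string): SgClawMessageType {
  switch (eventName) {
    case 'task_complete':
      return 'task_result';
    case 'error':
      return 'task_error';
    default:
      // Assumption: unknown events surface as a neutral system notice.
      return 'system_notice';
  }
}

// Returns the card element tag that should render a given message type.
function cardTagFor(type: SgClawMessageType): string {
  return CARD_TAG_BY_TYPE[type];
}
```

Because `CARD_TAG_BY_TYPE` is typed as a `Record` over the full union, adding a new message type without a card entry fails at compile time instead of rendering a blank row.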
|
||||
|
||||
**Step 4: Run test to verify GREEN**
|
||||
|
||||
Run the same build target.
|
||||
|
||||
Expected: cards render correctly and the old standalone result area is no longer required.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git -C /home/zyl/projects/superRpa/src add \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-list.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-user.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-assistant.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-step.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-result.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-message-card-error.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "feat: add sgclaw chat message cards"
```
|
||||
|
||||
## Task 5: Build The Real Header, Empty State, And Composer
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-shell.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-header.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-composer.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts`
|
||||
|
||||
**Step 1: Write the failing shell test**
|
||||
|
||||
Assert that the rendered page now contains:
|
||||
- header with title, current page label, and status pill
|
||||
- empty state recommendation buttons
|
||||
- fixed composer at the bottom
|
||||
- no standalone `实时日志` (live log) or `最终结果` (final result) primary sections
|
||||
|
||||
**Step 2: Run test to verify RED**
|
||||
|
||||
Run the TS build target.
|
||||
|
||||
Expected: fail because the shell components do not exist.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
- Header: title, page context, new-chat/history/settings/collapse actions
|
||||
- Empty state: 3 to 4 recommended actions
|
||||
- Composer: text input, send button, mode toggles, context switch
|
||||
|
||||
**Step 4: Run test to verify GREEN**
|
||||
|
||||
Run the same build target.
|
||||
|
||||
Expected: the page renders as a product-style chat shell.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git -C /home/zyl/projects/superRpa/src add \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-shell.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-header.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-chat-composer.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "feat: add sgclaw chat shell and composer"
```
|
||||
|
||||
## Task 6: Embed Settings And Move Raw Logs Into A Debug Drawer
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-settings-panel.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-debug-drawer.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts`
|
||||
- Reuse Read-Only Reference: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config.ts`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state_mainline_unittest.ts`
|
||||
|
||||
**Step 1: Write the failing panel tests**
|
||||
|
||||
Cover:
|
||||
- opening settings panel from header
|
||||
- editing embedded config draft
|
||||
- opening debug drawer and showing logs
|
||||
- closing secondary panels without destroying the chat draft
|
||||
|
||||
**Step 2: Run test to verify RED**
|
||||
|
||||
Run the TS build target.
|
||||
|
||||
Expected: fail because secondary panel components do not exist.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
- Reuse config field structure from `sgclaw-config`
|
||||
- Keep raw logs in debug only
|
||||
- Preserve route-navigation fallback for full config page if embedded save/load fails
|
||||
|
||||
**Step 4: Run test to verify GREEN**
|
||||
|
||||
Run the same build target.
|
||||
|
||||
Expected: settings and debug layers behave as secondary panels instead of separate pages.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git -C /home/zyl/projects/superRpa/src add \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-settings-panel.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-debug-drawer.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_settings_state_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "feat: add sgclaw settings panel and debug drawer"
```
|
||||
|
||||
## Task 7: Add Local Conversation History And New-Chat Recovery
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-history-panel.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state_mainline_unittest.ts`
|
||||
|
||||
**Step 1: Write the failing history tests**
|
||||
|
||||
Cover:
|
||||
- saving a conversation preview to local history
|
||||
- creating a fresh conversation resets message stream but keeps config
|
||||
- reopening a history item restores messages and draft
|
||||
|
||||
**Step 2: Run test to verify RED**
|
||||
|
||||
Run the TS build target.
|
||||
|
||||
Expected: fail because history panel and persistence behavior do not exist.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
- Store history locally in the browser (for example `localStorage` or another browser storage API)
|
||||
- Keep only small metadata + message snapshot for first version
|
||||
- No backend schema change in this phase
|
||||
|
||||
**Step 4: Run test to verify GREEN**
|
||||
|
||||
Run the same build target.
|
||||
|
||||
Expected: local conversation switching works fully in the frontend.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git -C /home/zyl/projects/superRpa/src add \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/components/sgclaw-history-panel.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_history_state_mainline_unittest.ts
git -C /home/zyl/projects/superRpa/src commit -m "feat: add sgclaw local conversation history"
```
|
||||
|
||||
## Task 8: Wire New Shell Assets Into BUILD And Polish The Host Page
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/BUILD.gn`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/functions.html.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/functions.css`
|
||||
|
||||
**Step 1: Write the failing host expectation**
|
||||
|
||||
Add a small host-level check that:
|
||||
- `sgclaw-chat` still loads from the manifest
|
||||
- host quick actions still work
|
||||
- the function page provides enough room for the new chat shell
|
||||
|
||||
**Step 2: Run test/build to verify RED**
|
||||
|
||||
Run:
|
||||
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease sgclaw-chat_build_ts
```
|
||||
|
||||
Expected: fail or render incorrectly because new component files are not all wired into build/host styling yet.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
- Add all new TS modules to `BUILD.gn`
|
||||
- Keep `sgclaw-chat` and `sgclaw-config` quick actions
|
||||
- Adjust host layout so the new shell is not boxed into the old debug-page proportions
|
||||
|
||||
**Step 4: Run verification**
|
||||
|
||||
Run:
|
||||
```bash
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease sgclaw-chat_build_ts
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease superrpa_resources
```
|
||||
|
||||
Expected: build completes with all new chat modules wired in.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git -C /home/zyl/projects/superRpa/src add \
  chrome/browser/resources/superrpa/devtools/BUILD.gn \
  chrome/browser/resources/superrpa/devtools/functions/functions.html.ts \
  chrome/browser/resources/superrpa/devtools/functions/functions.css
git -C /home/zyl/projects/superRpa/src commit -m "chore: wire sgclaw chat frontend modules"
```
|
||||
|
||||
## Manual Verification Matrix
|
||||
|
||||
Run all manual checks in `chrome://superrpa-functions/sgclaw-chat` after the full frontend plan lands.
|
||||
|
||||
### UX States
|
||||
|
||||
- Empty state appears on first open.
|
||||
- Recommended actions generate user messages.
|
||||
- Composer stays visible while history/settings/debug panels switch.
|
||||
- Message stream auto-scrolls to the latest item.
|
||||
- Result cards and error cards appear inline.
|
||||
|
||||
### Runtime
|
||||
|
||||
- `启动` (Start) works from the header area.
|
||||
- `停止` (Stop) works from the header area.
|
||||
- submit creates an immediate user message.
|
||||
- pending assistant card appears while waiting.
|
||||
- result card replaces the old standalone result behavior.
|
||||
|
||||
### Settings
|
||||
|
||||
- embedded settings loads existing values
|
||||
- save updates status and clears dirty state
|
||||
- fallback route to `chrome://superrpa-functions/sgclaw-config` still works
|
||||
|
||||
### Debug
|
||||
|
||||
- logs are not visible in the main chat view
|
||||
- debug drawer shows raw logs when opened
|
||||
|
||||
### History
|
||||
|
||||
- new conversation starts clean
|
||||
- previous conversation can be restored from local history
|
||||
- unread badge clears when reopening the window
|
||||
|
||||
## Execution Notes
|
||||
|
||||
- Keep the current backend/runtime bridge unchanged until the new frontend shell is stable.
|
||||
- Do not combine page-injected floating host work into this same branch. The first milestone is a complete product-grade frontend inside the existing `FunctionsUI` host.
|
||||
- When this frontend plan is complete, the next plan should focus only on mounting the same component tree inside a real page floating container.
|
||||
|
||||
Plan complete and saved to `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`. Two execution options:
|
||||
|
||||
**1. Subagent-Driven (this session)** - I dispatch a fresh subagent per task and review between tasks, which keeps iteration fast
|
||||
|
||||
**2. Parallel Session (separate)** - Open a new session with executing-plans and run the tasks in batches with checkpoints
|
||||
|
||||
**Which approach?**
|
||||
@@ -0,0 +1,85 @@
|
||||
# sgClaw Overlay And Basic Navigation Fixes Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Make ordinary webpages render the new sgClaw floating chat design and support basic navigation instructions like `打开知乎` ("open Zhihu").
|
||||
|
||||
**Architecture:** Keep the ordinary-page injection entrypoint unchanged, but replace its in-shadow DOM layout with the same floating-window shell used by the new debug page. On the runtime side, extend the deterministic planner with explicit homepage navigation plans for supported sites so freeform open-site commands do not fail before the compat runtime can help.
|
||||
|
||||
**Tech Stack:** Chromium WebUI resource pipeline, injected Shadow DOM overlay JavaScript, Rust planner tests
|
||||
|
||||
### Task 1: Lock the current regressions with failing tests
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tests/planner_test.rs`
|
||||
|
||||
**Step 1: Write the failing smoke expectations**
|
||||
|
||||
Add assertions that the ordinary webpage overlay shows the new subtitle `面向当前网页的悬浮式对话与自动执行` ("floating chat and automated execution for the current webpage") and no longer exposes old card titles such as `聊天记录` ("chat history").
|
||||
|
||||
**Step 2: Run the smoke to verify it fails**
|
||||
|
||||
Run: `node /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs`
|
||||
Expected: FAIL because ordinary webpages still render the old overlay shell.
|
||||
|
||||
**Step 3: Write the failing planner test**
|
||||
|
||||
Add a test asserting `plan_instruction("打开知乎")` returns one `Navigate` step to `https://www.zhihu.com`.
|
||||
|
||||
**Step 4: Run the planner test to verify it fails**
|
||||
|
||||
Run: `cargo test planner_supports_open_zhihu_homepage_instruction --test planner_test`
|
||||
Expected: FAIL with `unsupported instruction: 打开知乎`.
|
||||
|
||||
### Task 2: Migrate the ordinary webpage overlay to the new shell
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs`
|
||||
|
||||
**Step 1: Replace the old card layout with the new floating shell**
|
||||
|
||||
Keep bridge calls, ids, and polling behavior intact, but render the new header, message pane, composer, settings panel, and debug drawer structure inside the existing injected Shadow DOM.
|
||||
|
||||
**Step 2: Keep runtime visibility without reintroducing the old layout**
|
||||
|
||||
Move logs and final result into secondary panels or inline cards so the ordinary webpage still exposes execution details without the old four-card layout.
|
||||
|
||||
**Step 3: Run the smoke again**
|
||||
|
||||
Run: `node /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs`
|
||||
Expected: PASS once rebuilt resources are being served by the browser binary.
|
||||
|
||||
### Task 3: Extend planner support for basic open-site commands
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/planner.rs`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/planner_test.rs`
|
||||
|
||||
**Step 1: Implement the minimal homepage plans**
|
||||
|
||||
Support `打开知乎` ("open Zhihu") and `打开百度` ("open Baidu") by returning single-step `Navigate` plans to their homepages.
|
||||
|
||||
**Step 2: Run planner tests**
|
||||
|
||||
Run: `cargo test --test planner_test`
|
||||
Expected: PASS.
|
||||
|
||||
### Task 4: Build and verify the integrated behavior
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/AGENTS.md`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/docs/handoffs/2026-03-27-sgclaw-runtime-verification.md`
|
||||
|
||||
**Step 1: Rebuild impacted targets**
|
||||
|
||||
Run: `autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome/browser/resources/superrpa:resources sgclaw`
|
||||
|
||||
**Step 2: Re-run targeted verification**
|
||||
|
||||
Run the smoke and a focused `sgclaw` task submission check for `打开知乎`.
|
||||
|
||||
**Step 3: Document the final runtime path**
|
||||
|
||||
Record that ordinary webpages and `chrome://superrpa-functions/sgclaw-chat` now share the same floating shell, and that homepage navigation commands are handled by the planner.
|
||||
158
docs/plans/2026-03-27-skill-lib-testing-plan.md
Normal file
@@ -0,0 +1,158 @@
|
||||
# Skill Lib Testing Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Add an in-project, repeatable test harness that validates `/home/zyl/projects/sgClaw/skill_lib` against the current ZeroClaw `SKILL.md` loader and security-audit expectations.
|
||||
|
||||
**Architecture:** Keep the test runner inside the SGClaw repository and target the sibling `skill_lib` directory by relative path. Implement a small Python validator that mirrors the ZeroClaw markdown frontmatter parser and the relevant skill-audit checks, then cover it with a Python `unittest` suite that exercises the actual three migrated Zhihu skills.
|
||||
|
||||
**Tech Stack:** Python 3 standard library, `unittest`, local file-system inspection, ZeroClaw source code as behavioral reference, Markdown/YAML-like frontmatter parsing.
|
||||
|
||||
### Task 1: Freeze The Test Contract
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/claw/docs/plans/2026-03-27-skill-lib-testing-plan.md`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/skills/mod.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/skills/audit.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/skill_lib/skills/*/SKILL.md`
|
||||
|
||||
**Step 1: Capture the loader semantics to preserve**
|
||||
|
||||
Document and implement tests for:
|
||||
- `SKILL.md` frontmatter splitting on `---`
|
||||
- supported metadata keys: `name`, `description`, `version`, `author`, `tags`
|
||||
- fallback rules for name, description, and version
|
||||
- prompt body must exclude the frontmatter block
|
||||
|
||||
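As a rough illustration of the loader semantics listed above, the following is a minimal sketch, not a port of ZeroClaw's parser. The function names `split_frontmatter` and `parse_simple_frontmatter` are hypothetical, and the handling of a missing closing `---` (treat the whole file as body) is an assumption that should be checked against `third_party/zeroclaw/src/skills/mod.rs`.

```python
from typing import Dict, Tuple

# Metadata keys the plan lists as supported for this repo.
SUPPORTED_KEYS = {"name", "description", "version", "author", "tags"}


def split_frontmatter(text: str) -> Tuple[str, str]:
    """Return (frontmatter, body); frontmatter is '' when no leading block exists."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return frontmatter, body
    # Assumption: an unterminated block is treated as plain body text.
    return "", text


def parse_simple_frontmatter(frontmatter: str) -> Dict[str, object]:
    """Parse flat `key: value` pairs and `- item` lists; this is not full YAML."""
    meta: Dict[str, object] = {}
    current_list_key = None
    for raw in frontmatter.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("- ") and current_list_key is not None:
            meta[current_list_key].append(line[2:].strip())
            continue
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or key not in SUPPORTED_KEYS:
            current_list_key = None
            continue
        value = value.strip()
        if value:
            meta[key] = value.strip("\"'")
            current_list_key = None
        else:
            # A key with no inline value starts a list (for example `tags:`).
            meta[key] = []
            current_list_key = key
    return meta
```

Keeping the split and the key parsing separate makes it straightforward to test "prompt body excludes the frontmatter block" independently of metadata parsing.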
**Step 2: Capture the audit semantics to preserve**
|
||||
|
||||
Document and implement tests for:
|
||||
- skill root must contain `SKILL.md` or `SKILL.toml`
|
||||
- symlinks are rejected
|
||||
- shell-script files are blocked when `allow_scripts` is false
|
||||
- markdown links must not escape the skill root
|
||||
- high-risk command snippets inside markdown are rejected
|
||||
|
||||
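The "markdown links must not escape the skill root" rule is the least obvious of the audit checks, so here is one possible way to express it. This is a sketch under stated assumptions: `escaping_links` and `LINK_PATTERN` are hypothetical names, only inline `[text](target)` links are considered, and URL schemes and in-page anchors are skipped. The real rule set lives in `audit.rs` and may differ.

```python
import re
from pathlib import Path
from typing import List

# Matches inline markdown links and captures the target up to the first space.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)[^)]*\)")
# Matches targets such as https://..., mailto:..., file:...
SCHEME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def escaping_links(markdown: str, file_path: Path, skill_root: Path) -> List[str]:
    """Return link targets that resolve to a location outside skill_root."""
    root = skill_root.resolve()
    findings: List[str] = []
    for target in LINK_PATTERN.findall(markdown):
        if SCHEME_PATTERN.match(target) or target.startswith("#"):
            continue  # assumption: URLs and anchors are handled elsewhere
        relative = target.split("#", 1)[0]
        candidate = (file_path.parent / relative).resolve()
        if candidate != root and root not in candidate.parents:
            findings.append(target)
    return findings
```

Resolving both paths before comparing them means `../` segments are normalized first, so a link like `references/../../other/SKILL.md` is judged by where it ends up rather than by how it is spelled.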
**Step 3: Define the migrated-skill expectations**
|
||||
|
||||
The test suite must verify:
|
||||
- exactly three skill packages exist
|
||||
- the loaded names are `zhihu-hotlist`, `zhihu-navigate`, `zhihu-write`
|
||||
- each package has both `references/` and `assets/`
|
||||
- each description stays trigger-oriented and starts with `Use when`
|
||||
|
||||
### Task 2: Write The Failing Tests First
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/claw/tests/skill_lib_validation_test.py`
|
||||
|
||||
**Step 1: Write a failing import-level test**
|
||||
|
||||
Import a not-yet-created validator module from:
|
||||
- `/home/zyl/projects/sgClaw/claw/scripts/validate_skill_lib.py`
|
||||
|
||||
Expected initial failure:
|
||||
- `ModuleNotFoundError` or `FileNotFoundError`
|
||||
|
||||
**Step 2: Encode the project expectations**
|
||||
|
||||
Add tests for:
|
||||
- skill discovery count and names
|
||||
- parsed metadata for each current skill
|
||||
- audit cleanliness for each skill with `allow_scripts=False`
|
||||
- package shape (`SKILL.md`, `references/`, `assets/`)
|
||||
|
||||
**Step 3: Run the tests and watch them fail**
|
||||
|
||||
Run:
|
||||
```bash
python3 -m unittest tests.skill_lib_validation_test -v
```
|
||||
|
||||
Expected:
|
||||
- failure because the validator module does not exist yet
|
||||
|
||||
### Task 3: Implement The Minimal Validator
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/claw/scripts/validate_skill_lib.py`
|
||||
|
||||
**Step 1: Implement discovery helpers**
|
||||
|
||||
Implement:
|
||||
- repo root resolution
|
||||
- sibling `skill_lib` root resolution
|
||||
- `skills/` directory enumeration
|
||||
|
||||
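The three discovery helpers might look something like the sketch below. The function names are hypothetical, and the layout assumption is the one this plan describes: the validator lives at `<repo>/scripts/validate_skill_lib.py` and `skill_lib` is a sibling of the repo directory.

```python
from pathlib import Path
from typing import List, Optional


def resolve_repo_root(start: Optional[Path] = None) -> Path:
    """Return the repo root, assumed to be the parent of the scripts/ directory."""
    here = (start or Path(__file__)).resolve()
    return here.parent.parent


def resolve_skill_lib_root(repo_root: Path) -> Path:
    """Return the sibling skill_lib directory next to the repo root."""
    return repo_root.parent / "skill_lib"


def list_skill_dirs(skill_lib_root: Path) -> List[Path]:
    """Return skill package directories in a stable, sorted order."""
    skills_dir = skill_lib_root / "skills"
    if not skills_dir.is_dir():
        return []
    return sorted(p for p in skills_dir.iterdir() if p.is_dir())
```

Returning an empty list when `skills/` is missing lets the caller report "no skills found" as a validation failure instead of crashing with a traceback.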
**Step 2: Implement the markdown loader**
|
||||
|
||||
Implement:
|
||||
- frontmatter split
|
||||
- lightweight frontmatter parsing
|
||||
- description fallback extraction
|
||||
- metadata normalization into a `SkillRecord`
|
||||
|
||||
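A possible shape for the normalization step is sketched below. The `SkillRecord` fields follow the metadata keys named in this plan, but the fallback values are assumptions made for illustration (directory name for `name`, first non-heading line for `description`, `0.1.0` for `version`). They should be confirmed against the ZeroClaw loader before the tests rely on them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SkillRecord:
    name: str
    description: str
    version: str
    author: str = ""
    tags: List[str] = field(default_factory=list)
    prompt: str = ""


def first_prose_line(body: str) -> str:
    """Return the first non-empty line that is not a markdown heading."""
    for line in body.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            return stripped
    return ""


def build_record(meta: Dict[str, object], body: str, dir_name: str) -> SkillRecord:
    """Normalize parsed frontmatter plus body into a SkillRecord."""
    tags = meta.get("tags") or []
    if isinstance(tags, str):
        # Tolerate an inline comma-separated form such as `tags: a, b`.
        tags = [t.strip() for t in tags.split(",") if t.strip()]
    return SkillRecord(
        name=str(meta.get("name") or dir_name),            # assumed fallback
        description=str(meta.get("description") or first_prose_line(body)),
        version=str(meta.get("version") or "0.1.0"),       # assumed fallback
        author=str(meta.get("author") or ""),
        tags=list(tags),
        prompt=body,
    )
```

For example, `build_record({}, "# Title\n\nUse when testing.", "zhihu-write")` would produce a record named after the directory with the first prose line as its description.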
**Step 3: Implement the relevant audit checks**
|
||||
|
||||
Implement:
|
||||
- symlink detection
|
||||
- blocked shell-script detection
|
||||
- markdown link boundary checks
|
||||
- high-risk snippet detection
|
||||
- deterministic findings collection
|
||||
|
||||
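The file-level checks can be combined into a single walk over the package, as in the sketch below. Everything named here is hypothetical: `audit_skill_dir`, the `SCRIPT_SUFFIXES` set, and especially `HIGH_RISK_PATTERNS`, which contains two illustrative patterns and is not ZeroClaw's actual list. The markdown link boundary check is left out and would be added as a further finding source.

```python
import os
import re
from pathlib import Path
from typing import List

SCRIPT_SUFFIXES = {".sh", ".bash", ".zsh"}  # assumption: what counts as a shell script
HIGH_RISK_PATTERNS = [  # illustrative only
    re.compile(r"rm\s+-rf\s+/"),
    re.compile(r"curl[^\n|]*\|\s*(sh|bash)"),
]


def audit_skill_dir(skill_root: Path, allow_scripts: bool = False) -> List[str]:
    """Return a sorted list of findings; an empty list means the package is clean."""
    findings: List[str] = []
    if not any((skill_root / name).is_file() for name in ("SKILL.md", "SKILL.toml")):
        findings.append("missing SKILL.md or SKILL.toml")
    for dirpath, dirnames, filenames in os.walk(skill_root, followlinks=False):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            rel = path.relative_to(skill_root).as_posix()
            if path.is_symlink():
                findings.append(f"{rel}: symlink not allowed")
                continue
            if path.is_dir():
                continue
            if not allow_scripts and path.suffix in SCRIPT_SUFFIXES:
                findings.append(f"{rel}: shell script blocked")
            if path.suffix == ".md":
                text = path.read_text(encoding="utf-8", errors="replace")
                for pattern in HIGH_RISK_PATTERNS:
                    if pattern.search(text):
                        findings.append(f"{rel}: high-risk snippet /{pattern.pattern}/")
    # Sorting makes the output independent of filesystem enumeration order.
    return sorted(findings)
```

Sorting the findings before returning is what makes the collection deterministic, so tests can compare against an exact expected list.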
**Step 4: Implement a small CLI**
|
||||
|
||||
Running:
|
||||
```bash
python3 scripts/validate_skill_lib.py
```
|
||||
|
||||
Should:
|
||||
- print one summary line per skill
|
||||
- exit `0` when all skills pass
|
||||
- exit non-zero when any skill fails
|
||||
|
||||
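The CLI contract above (one summary line per skill, exit `0` only when everything passes) can be kept separate from the checks themselves. The sketch below assumes a hypothetical `report` function that receives a mapping from skill name to its list of findings. The `PASS`/`FAIL` line format is an illustration, apart from `PASS`, which the verification step later in this plan expects.

```python
import sys
from typing import Dict, List, TextIO


def report(findings_by_skill: Dict[str, List[str]], out: TextIO = sys.stdout) -> int:
    """Print one summary line per skill and return the process exit code."""
    exit_code = 0
    for name in sorted(findings_by_skill):
        findings = findings_by_skill[name]
        if findings:
            exit_code = 1
            print(f"FAIL {name}: {'; '.join(findings)}", file=out)
        else:
            print(f"PASS {name}", file=out)
    return exit_code
```

A `main()` entry point would then collect findings for each discovered skill and finish with `sys.exit(report(findings))`, which keeps the exit-code rule testable without spawning a subprocess.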
### Task 4: Run The Tests Green
|
||||
|
||||
**Files:**
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/skill_lib_validation_test.py`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/scripts/validate_skill_lib.py`
|
||||
|
||||
**Step 1: Re-run the unit tests**
|
||||
|
||||
Run:
|
||||
```bash
python3 -m unittest tests.skill_lib_validation_test -v
```
|
||||
|
||||
Expected:
|
||||
- all tests pass
|
||||
|
||||
**Step 2: Run the CLI validator**
|
||||
|
||||
Run:
|
||||
```bash
python3 scripts/validate_skill_lib.py
```
|
||||
|
||||
Expected:
|
||||
- all three skills print `PASS`
|
||||
- process exits `0`
|
||||
|
||||
### Task 5: Document The Verification Entry Point
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/skill_lib/VERIFY.md`
|
||||
|
||||
**Step 1: Add the project-local validation command**
|
||||
|
||||
Add:
|
||||
- `python3 /home/zyl/projects/sgClaw/claw/scripts/validate_skill_lib.py`
|
||||
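- `python3 -m unittest /home/zyl/projects/sgClaw/claw/tests/skill_lib_validation_test.py` (run from `/home/zyl/projects/sgClaw/claw`, because `unittest` converts the file path into a module name relative to the current directory)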
|
||||
|
||||
**Step 2: Re-run both commands after the doc update**
|
||||
|
||||
Expected:
|
||||
- validator still exits `0`
|
||||
- unit tests still pass
|
||||
411
docs/plans/2026-03-27-skill-lib-zeroclaw-plan.md
Normal file
@@ -0,0 +1,411 @@
|
||||
# Skill Lib ZeroClaw Migration Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Create `/home/zyl/projects/sgClaw/skill_lib` as a dedicated skill library directory and restructure the current Zhihu browser capabilities into ZeroClaw-style skill packages.
|
||||
|
||||
**Architecture:** Treat `skill_lib` as a standalone skill repository, not as an embedded Rust module tree. Use the ZeroClaw/open-skills layout `skill_lib/skills/<skill-name>/SKILL.md`, keep each skill self-contained, and move long operational detail into `references/` plus any preserved source artifacts into `assets/`. Map the current four exposed Rust capabilities into three end-user skills: `zhihu-navigate`, `zhihu-write`, and `zhihu-hotlist`.
|
||||
|
||||
**Tech Stack:** Markdown `SKILL.md`, YAML frontmatter, directory-based ZeroClaw skill packaging, existing SGClaw Zhihu Rust/JSON source material, shell validation commands.
|
||||
|
||||
### Task 1: Freeze The Target Layout
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/README.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/skills/mod.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/skills/browser/SKILL.md`
|
||||
|
||||
**Step 1: Create the top-level repository skeleton**
|
||||
|
||||
Create:
|
||||
- `/home/zyl/projects/sgClaw/skill_lib/README.md`
|
||||
- `/home/zyl/projects/sgClaw/skill_lib/skills/`
|
||||
|
||||
The README should state:
|
||||
- this directory is a dedicated ZeroClaw-style skill library
|
||||
- runtime skill packages live under `skills/<name>/`
|
||||
- each skill package uses `SKILL.md` plus optional `references/`, `scripts/`, and `assets/`
|
||||
|
||||
**Step 2: Document the package contract in the README**
|
||||
|
||||
Include:
|
||||
- required file: `SKILL.md`
|
||||
- supported frontmatter for this repo: `name`, `description`, `version`, `author`, `tags`
|
||||
- design rule: `description` must be trigger-oriented and not a workflow dump
|
||||
- design rule: keep `SKILL.md` concise and push long detail into `references/`
|
||||
|
||||
**Step 3: Run structural sanity checks**
|
||||
|
||||
Run:
|
||||
```bash
test -d /home/zyl/projects/sgClaw/skill_lib
test -d /home/zyl/projects/sgClaw/skill_lib/skills
test -f /home/zyl/projects/sgClaw/skill_lib/README.md
```
|
||||
|
||||
Expected: all commands exit `0`.
|
||||
|
||||
### Task 2: Define The Skill Inventory And Source Mapping
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skill_inventory.md`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/mod.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/router.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu_hotlist.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu_hotlist_store.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu_navigation.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_write_flow.json`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_hotlist_flow.json`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_navigation_pages.json`
|
||||
|
||||
**Step 1: Write the migration inventory**
|
||||
|
||||
Create `/home/zyl/projects/sgClaw/skill_lib/skill_inventory.md` with a three-row mapping:
|
||||
- `zhihu-navigate` ← current `zhihu_navigate`
|
||||
- `zhihu-write` ← current `zhihu_write`
|
||||
- `zhihu-hotlist` ← current `zhihu_hotlist_collect` + `zhihu_hotlist_report`
|
||||
|
||||
**Step 2: Capture the non-migrated code responsibilities**
|
||||
|
||||
Document explicitly that this migration does **not** port:
|
||||
- Rust router dispatch
|
||||
- browser pipe transport code
|
||||
- snapshot persistence implementation detail
|
||||
|
||||
Document that the new repo is a skill library, not a Rust runtime.
|
||||
|
||||
**Step 3: Record source artifacts per target skill**
|
||||
|
||||
For each target skill, list:
|
||||
- source Rust module(s)
|
||||
- source JSON flow/catalog file(s)
|
||||
- important risk notes discovered during analysis
|
||||
|
||||
**Step 4: Validate the inventory**
|
||||
|
||||
Run:
|
||||
```bash
rg -n "zhihu-navigate|zhihu-write|zhihu-hotlist" /home/zyl/projects/sgClaw/skill_lib/skill_inventory.md
```
|
||||
|
||||
Expected: all three skill names appear exactly once as top-level migration targets.
|
||||
|
||||
### Task 3: Author The `zhihu-navigate` Skill Package
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/SKILL.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/references/routes-and-targets.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/references/selector-strategy.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/assets/zhihu_navigation_pages.source.json`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu_navigation.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_navigation_pages.json`
|
||||
|
||||
**Step 1: Preserve the raw source artifact**
|
||||
|
||||
Copy the current navigation catalog into:
|
||||
- `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/assets/zhihu_navigation_pages.source.json`
|
||||
|
||||
This file is for traceability only, not for frontmatter or prompt injection.
|
||||
|
||||
**Step 2: Write the `SKILL.md`**
|
||||
|
||||
Use ZeroClaw-style frontmatter:
|
||||
```yaml
---
name: zhihu-navigate
description: Use when the user wants to open, switch, or navigate to a Zhihu page, tab, menu, profile area, notifications area, message area, or creator area through browser actions.
version: 0.1.0
author: sgclaw
tags:
  - zhihu
  - browser
  - navigation
---
```
|
||||
|
||||
The body should include:
|
||||
- overview
|
||||
- when to use
|
||||
- workflow for route vs component vs flow navigation
|
||||
- ambiguity handling rules
|
||||
- output contract
|
||||
- common mistakes
|
||||
|
||||
**Step 3: Write `routes-and-targets.md`**
|
||||
|
||||
Summarize:
|
||||
- route/component/flow/target model
|
||||
- representative target names
|
||||
- known alias conflicts
|
||||
- preferred disambiguation wording for future prompts
|
||||
|
||||
**Step 4: Write `selector-strategy.md`**
|
||||
|
||||
Document:
|
||||
- why selectors should prefer semantic hooks over CSS hash classes
|
||||
- fallback ordering
|
||||
- known brittle selectors from the current source
|
||||
|
||||
**Step 5: Validate the package**
|
||||
|
||||
Run:
|
||||
```bash
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/SKILL.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/references/routes-and-targets.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/references/selector-strategy.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/assets/zhihu_navigation_pages.source.json
```
|
||||
|
||||
Expected: all commands exit `0`.
|
||||
|
||||
### Task 4: Author The `zhihu-write` Skill Package
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/SKILL.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/references/editor-flow.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/references/publish-safety.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/assets/zhihu_write_flow.source.json`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_write_flow.json`
|
||||
|
||||
**Step 1: Preserve the raw source artifact**
|
||||
|
||||
Copy:
|
||||
- `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_write_flow.json`
|
||||
|
||||
to:
|
||||
- `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/assets/zhihu_write_flow.source.json`
|
||||
|
||||
**Step 2: Write the `SKILL.md`**
|
||||
|
||||
The frontmatter should name a single skill:
|
||||
- `name: zhihu-write`
|
||||
- description focused on when article drafting or publishing is requested
|
||||
|
||||
The body should include:
|
||||
- prerequisites before touching the editor
|
||||
- workflow for draft-only vs publish
|
||||
- explicit confirmation gate before publish
|
||||
- required final report fields: title, mode, final URL if published, unresolved issues
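The confirmation gate and the required report fields listed above could be modeled roughly as follows. This is a minimal sketch only: `WriteMode`, `FinalReport`, `may_publish`, and `build_report` are hypothetical names chosen for illustration and do not come from the existing sgClaw source.

```rust
/// Hypothetical mode flag: draft-only runs never reach the publish step.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteMode {
    DraftOnly,
    Publish,
}

/// Hypothetical final report carrying the four required fields.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FinalReport {
    pub title: String,
    pub mode: WriteMode,
    /// Only set when the article was actually published.
    pub final_url: Option<String>,
    pub unresolved_issues: Vec<String>,
}

/// Publishing is allowed only in publish mode and only after the user
/// has explicitly confirmed.
pub fn may_publish(mode: WriteMode, user_confirmed: bool) -> bool {
    mode == WriteMode::Publish && user_confirmed
}

/// Builds the report; the URL is dropped whenever the gate was not passed.
pub fn build_report(
    title: &str,
    mode: WriteMode,
    user_confirmed: bool,
    published_url: Option<&str>,
    unresolved_issues: Vec<String>,
) -> FinalReport {
    let final_url = if may_publish(mode, user_confirmed) {
        published_url.map(str::to_owned)
    } else {
        None
    };
    FinalReport {
        title: title.to_owned(),
        mode,
        final_url,
        unresolved_issues,
    }
}
```

Keeping the gate in one small function makes it easy to state the rule in `SKILL.md` in a single sentence: no confirmation, no publish, no URL in the report.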
|
||||
|
||||
**Step 3: Write `editor-flow.md`**
|
||||
|
||||
Document:
|
||||
- entry page
|
||||
- editor readiness checks
|
||||
- title/body fill rules
|
||||
- publish confirmation sequence
|
||||
- URL capture rules
|
||||
|
||||
**Step 4: Write `publish-safety.md`**
|
||||
|
||||
Document:
|
||||
- when to stop and ask for confirmation
|
||||
- what to do if title verification fails
|
||||
- what to do if the URL remains on edit mode
|
||||
- brittle selectors that must be revalidated first
|
||||
|
||||
**Step 5: Validate the package**
|
||||
|
||||
Run:
|
||||
```bash
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/SKILL.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/references/editor-flow.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/references/publish-safety.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/assets/zhihu_write_flow.source.json
```
|
||||
|
||||
Expected: all commands exit `0`.
|
||||
|
||||
### Task 5: Author The `zhihu-hotlist` Skill Package
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/references/collection-flow.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/references/report-format.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/references/data-quality.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/assets/zhihu_hotlist_flow.source.json`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu_hotlist.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/src/skill/zhihu_hotlist_store.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_hotlist_flow.json`
|
||||
|
||||
**Step 1: Preserve the raw source artifact**
|
||||
|
||||
Copy:
|
||||
- `/home/zyl/projects/sgClaw/claw/resources/skills/zhihu_hotlist_flow.json`
|
||||
|
||||
to:
|
||||
- `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/assets/zhihu_hotlist_flow.source.json`
|
||||
|
||||
**Step 2: Write the `SKILL.md`**
|
||||
|
||||
Use one skill to cover:
|
||||
- hotlist collection
|
||||
- comment metric collection
|
||||
- snapshot-style reporting
|
||||
|
||||
The body should clearly separate:
|
||||
- collection workflow
|
||||
- report workflow
|
||||
- partial-failure handling
|
||||
- output contract
|
||||
|
||||
**Step 3: Write `collection-flow.md`**
|
||||
|
||||
Include:
|
||||
- hotlist page detection
|
||||
- hotlist HTML capture strategy
|
||||
- top N extraction
|
||||
- detail-page comment collection flow
|
||||
- metric parsing notes
|
||||
|
||||
**Step 4: Write `report-format.md`**
|
||||
|
||||
Define:
|
||||
- report header line
|
||||
- per-item summary line
|
||||
- field names and order
|
||||
- handling when comment metrics are missing
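One way to make the report rules above concrete is to format each line from a small record type and render missing comment metrics explicitly rather than as zero. The sketch below is an assumption-laden illustration: `HotlistItem`, its fields, the `unavailable` marker, and the exact line layout are placeholders, since the real field names and order are what `report-format.md` is meant to define.

```rust
/// Hypothetical hotlist entry; the real field set belongs in `report-format.md`.
pub struct HotlistItem {
    pub rank: u32,
    pub title: String,
    /// `None` when comment metrics could not be collected.
    pub comment_count: Option<u64>,
}

/// Formats one per-item summary line with a fixed field order.
pub fn format_item_line(item: &HotlistItem) -> String {
    let comments = match item.comment_count {
        Some(count) => count.to_string(),
        // Missing data is shown as missing, never silently rendered as 0.
        None => "unavailable".to_string(),
    };
    format!("{}. {} | comments: {}", item.rank, item.title, comments)
}

/// Formats the header line followed by one line per item.
pub fn format_report(items: &[HotlistItem]) -> String {
    let missing = items
        .iter()
        .filter(|item| item.comment_count.is_none())
        .count();
    let mut lines = vec![format!(
        "Zhihu hotlist snapshot: {} items, {} missing comment metrics",
        items.len(),
        missing
    )];
    lines.extend(items.iter().map(format_item_line));
    lines.join("\n")
}
```

Counting the missing metrics in the header line keeps partial failures visible even when a reader only skims the first line of the report.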
|
||||
|
||||
**Step 5: Write `data-quality.md`**
|
||||
|
||||
Document:
|
||||
- why partial success must be surfaced
|
||||
- what counts as incomplete data
|
||||
- known parser risks
|
||||
- recommended caution language in outputs
|
||||
|
||||
**Step 6: Validate the package**
|
||||
|
||||
Run:
|
||||
```bash
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/references/collection-flow.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/references/report-format.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/references/data-quality.md
test -f /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/assets/zhihu_hotlist_flow.source.json
```
|
||||
|
||||
Expected: all commands exit `0`.
|
||||
|
||||
### Task 6: Normalize Frontmatter And Trigger Quality
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-navigate/SKILL.md`
|
||||
- Modify: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-write/SKILL.md`
|
||||
- Modify: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md`
|
||||
|
||||
**Step 1: Normalize frontmatter keys**
|
||||
|
||||
Ensure each `SKILL.md` contains exactly these frontmatter keys in this order:
|
||||
- `name`
|
||||
- `description`
|
||||
- `version`
|
||||
- `author`
|
||||
- `tags`
|
||||
|
||||
Do not add Rust-only or unofficial parser fields as required metadata.
|
||||
|
||||
**Step 2: Check naming rules**
|
||||
|
||||
Ensure skill names are:
|
||||
- lowercase
|
||||
- hyphenated
|
||||
- stable
|
||||
|
||||
Names to keep:
|
||||
- `zhihu-navigate`
|
||||
- `zhihu-write`
|
||||
- `zhihu-hotlist`
|
||||
|
||||
**Step 3: Tighten descriptions**
|
||||
|
||||
Each description must:
|
||||
- begin with `Use when`
|
||||
- describe triggering conditions
|
||||
- mention Zhihu/browser context
|
||||
- avoid dumping full workflow detail
|
||||
|
||||
**Step 4: Validate frontmatter**
|
||||
|
||||
Run:
|
||||
```bash
rg -n "^name: |^description: |^version: |^author: |^tags:" /home/zyl/projects/sgClaw/skill_lib/skills/*/SKILL.md
```
|
||||
|
||||
Expected: every `SKILL.md` produces one match for each of the five keys (`name`, `description`, `version`, `author`, `tags`).
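The `rg` check only shows that the keys exist. The ordering, naming, and `Use when` rules from Steps 1 to 3 could additionally be checked with a small validator along these lines. This is a hedged sketch: it assumes frontmatter is delimited by `---` lines and uses simple `key: value` pairs, and `validate_frontmatter` is a hypothetical helper, not an existing ZeroClaw or sgClaw function.

```rust
const REQUIRED_KEYS: [&str; 5] = ["name", "description", "version", "author", "tags"];

/// Returns the top-level `key: value` pairs between the first pair of `---` lines.
fn frontmatter_entries(skill_md: &str) -> Vec<(String, String)> {
    let mut lines = skill_md.lines();
    if lines.next().map(str::trim) != Some("---") {
        return Vec::new();
    }
    lines
        .take_while(|line| line.trim() != "---")
        // Skip indented continuation lines and list items such as `- zhihu`.
        .filter(|line| !line.starts_with(' ') && !line.starts_with('-'))
        .filter_map(|line| line.split_once(':'))
        .map(|(key, value)| (key.trim().to_owned(), value.trim().to_owned()))
        .collect()
}

/// Checks key order, the lowercase-hyphenated name rule, and the `Use when` prefix.
pub fn validate_frontmatter(skill_md: &str) -> Result<(), String> {
    let entries = frontmatter_entries(skill_md);
    let keys: Vec<&str> = entries.iter().map(|(key, _)| key.as_str()).collect();
    if keys != REQUIRED_KEYS {
        return Err(format!("expected keys {:?}, found {:?}", REQUIRED_KEYS, keys));
    }

    let name = &entries[0].1;
    let name_ok = !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-');
    if !name_ok {
        return Err(format!("name `{}` must be lowercase and hyphenated", name));
    }

    // Tolerate optional surrounding quotes around the description value.
    let description = entries[1].1.trim_matches(|c: char| c == '"' || c == '\'');
    if !description.starts_with("Use when") {
        return Err("description must begin with `Use when`".to_owned());
    }
    Ok(())
}
```

The validator deliberately rejects extra keys as well as missing ones, which matches the "exactly these frontmatter keys in this order" wording of Step 1.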
|
||||
|
||||
### Task 7: Add Repository-Level Verification Notes
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/VERIFY.md`
|
||||
- Modify: `/home/zyl/projects/sgClaw/skill_lib/README.md`
|
||||
|
||||
**Step 1: Create `VERIFY.md`**
|
||||
|
||||
Document the manual verification checklist:
|
||||
- all skill packages are under `skill_lib/skills/`
|
||||
- each package has `SKILL.md`
|
||||
- long details live in `references/`
|
||||
- preserved source JSON is in `assets/`
|
||||
- no Rust source is copied into the skill repo
|
||||
|
||||
**Step 2: Link verification from the README**
|
||||
|
||||
Add a short section in `README.md` pointing to `VERIFY.md`.
|
||||
|
||||
**Step 3: Run repository-level checks**
|
||||
|
||||
Run:
|
||||
```bash
find /home/zyl/projects/sgClaw/skill_lib/skills -mindepth 2 -maxdepth 2 -name SKILL.md | sort
find /home/zyl/projects/sgClaw/skill_lib/skills -type d \( -name references -o -name assets \) | sort
```
|
||||
|
||||
Expected:
|
||||
- exactly three `SKILL.md` files
|
||||
- each skill has `references/`
|
||||
- each skill has `assets/`
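The three expectations above can also be checked mechanically instead of by reading `find` output. The following is a minimal sketch using only the Rust standard library; `check_skill_layout` is a hypothetical helper, and the hard-coded count of three simply mirrors the expectation stated in this plan.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns one message per layout problem found under `skills_root`.
/// An empty vector means all three expectations hold.
pub fn check_skill_layout(skills_root: &Path) -> io::Result<Vec<String>> {
    let mut problems = Vec::new();
    let mut skill_count = 0;

    for entry in fs::read_dir(skills_root)? {
        let dir = entry?.path();
        if !dir.is_dir() {
            continue;
        }
        skill_count += 1;

        if !dir.join("SKILL.md").is_file() {
            problems.push(format!("{}: missing SKILL.md", dir.display()));
        }
        for sub in ["references", "assets"] {
            if !dir.join(sub).is_dir() {
                problems.push(format!("{}: missing {}/", dir.display(), sub));
            }
        }
    }

    if skill_count != 3 {
        problems.push(format!(
            "expected exactly 3 skill packages, found {}",
            skill_count
        ));
    }
    Ok(problems)
}
```

Returning a list of problems rather than failing on the first one fits the manual checklist in `VERIFY.md`, where a reviewer wants to see every gap at once.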
|
||||
|
||||
### Task 8: Final Review Before Claiming Completion
|
||||
|
||||
**Files:**
|
||||
- Review only: `/home/zyl/projects/sgClaw/skill_lib/`
|
||||
- Review only: `/home/zyl/projects/sgClaw/claw/docs/plans/2026-03-27-skill-lib-zeroclaw-plan.md`
|
||||
|
||||
**Step 1: Review against ZeroClaw runtime constraints**
|
||||
|
||||
Check that the final library respects the currently observed runtime facts:
|
||||
- directory-based skills
|
||||
- `SKILL.md` supported
|
||||
- simple frontmatter fields
|
||||
- optional `references/`, `scripts/`, `assets/`
|
||||
|
||||
**Step 2: Review against authoring quality**
|
||||
|
||||
Check that each skill:
|
||||
- is self-contained
|
||||
- has a narrow trigger boundary
|
||||
- avoids copying Rust internals into the prompt body
|
||||
- surfaces ambiguity and failure modes
|
||||
|
||||
**Step 3: Produce the implementation report**
|
||||
|
||||
The completion report must include:
|
||||
- created directories
|
||||
- created skill packages
|
||||
- any deliberate deviations from upstream ZeroClaw examples
|
||||
- verification commands actually run
|
||||
- unresolved risks
|
||||
|
||||
**Step 4: Stop before unrelated expansion**
|
||||
|
||||
Do not add:
|
||||
- extra skills beyond the three mapped ones
|
||||
- generic utility libraries
|
||||
- unrelated automation scripts
|
||||
- runtime code changes in `/home/zyl/projects/sgClaw/claw/src/skill/`
|
||||
|
||||
598
docs/plans/2026-03-28-sgclaw-zeroclaw-core-realignment-plan.md
Normal file
@@ -0,0 +1,598 @@
|
||||
# SGClaw ZeroClaw Core Realignment Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Rebuild `sgclaw` as a security-hardened zeroclaw runtime where the browser is a privileged tool surface and client host, not the product's defining execution model.
|
||||
|
||||
**Architecture:** `zeroclaw` remains the real agent core for prompt building, skills, memory, tool routing, autonomy, and execution loops. `sgclaw` adds a security envelope around zeroclaw: hardened configuration defaults, MAC-validated pipe communication, browser-host revalidation, and constrained exposure of privileged tools. The browser stays important, but only as one protected execution surface.
|
||||
|
||||
**Tech Stack:** Rust, vendored `zeroclaw`, SuperRPA browser pipe protocol, Chromium WebUI/overlay, JSON config, HMAC/MAC policy rules.
|
||||
|
||||
## Scope Guard
|
||||
|
||||
- In scope:
  - zeroclaw-first runtime restructuring
  - security and tool policy layering
  - browser pipe re-framing as privileged tool surface
  - documentation correction before code changes
  - browser UI wording and observability alignment
- Out of scope:
  - broad UI redesign before the runtime contract is fixed
  - indiscriminately enabling every zeroclaw built-in tool
  - keeping parallel browser-only and zeroclaw-first runtime stacks long-term
|
||||
|
||||
## Task 1: Rewrite The Product Docs Before Touching Runtime Code
|
||||
|
||||
**Files:**
|
||||
- Modify: `docs/README.md`
|
||||
- Modify: `docs/L0-产品白皮书与能力全景层.md`
|
||||
- Modify: `docs/L1-系统架构与安全模型层.md`
|
||||
- Modify: `docs/L2-核心模块与接口契约层.md`
|
||||
- Modify: `docs/L3-数据流与Skill体系层.md`
|
||||
- Modify: `docs/L4-工程实现与部署拓扑层.md`
|
||||
- Modify: `docs/L5-提示词分布与安全改造方案.md`
|
||||
- Modify: `docs/浏览器对接标准.md`
|
||||
|
||||
**Intent:**
|
||||
- Make the docs the single source of truth before implementation starts.
|
||||
- Remove the misleading framing that `sgclaw` is fundamentally a browser execution core.
|
||||
- Replace it with the correct framing: `sgclaw` is a security-hardened zeroclaw distribution with a privileged browser execution surface.
|
||||
|
||||
**Step 1: Add a docs checklist in the plan branch**
|
||||
|
||||
Checklist to apply consistently across the docs:
|
||||
|
||||
```text
1. zeroclaw is the runtime core
2. sgclaw adds security policy and protected execution surfaces
3. browser pipe is one privileged tool surface, not the whole runtime
4. docs must distinguish current implementation gaps from target architecture
5. no doc may imply that browser-only compat is the desired end state
```
|
||||
|
||||
**Step 2: Update the architecture docs**
|
||||
|
||||
Required wording changes:
|
||||
|
||||
- Replace phrases equivalent to “浏览器智能体执行内核” with wording equivalent to “安全加固后的 zeroclaw runtime”.
|
||||
- Reframe `compat` as a temporary adaptation layer, not the final product identity.
|
||||
- Clarify that `browser_action` is a protected tool contract, not the only capability sgClaw should ever have.
|
||||
- Clarify that prompt governance, security summaries, skills, memory, and routing should stay aligned with zeroclaw-native mechanisms.
|
||||
|
||||
**Step 3: Run doc consistency checks**
|
||||
|
||||
Run:
|
||||
|
||||
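```bash
# Single quotes keep the backticks literal; inside double quotes the shell
# would treat `browser_action` as a command substitution.
rg -n '浏览器智能体执行内核|单一 `browser_action` 工具|兼容执行器' docs
rg -n "zeroclaw.*核心|特权工具面|安全加固" docs
```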
|
||||
|
||||
Expected:
|
||||
- The first command should only return historical or explicitly marked current-state references.
|
||||
- The second command should show the new target framing in the mainline docs.
|
||||
|
||||
**Step 4: Commit**
|
||||
|
||||
```bash
git add docs/README.md docs/L0-产品白皮书与能力全景层.md docs/L1-系统架构与安全模型层.md docs/L2-核心模块与接口契约层.md docs/L3-数据流与Skill体系层.md docs/L4-工程实现与部署拓扑层.md docs/L5-提示词分布与安全改造方案.md docs/浏览器对接标准.md
git commit -m "docs: redefine sgclaw as hardened zeroclaw runtime"
```
|
||||
|
||||
## Task 2: Introduce A Zeroclaw-First Runtime Module
|
||||
|
||||
**Files:**
|
||||
- Create: `src/runtime/mod.rs`
|
||||
- Create: `src/runtime/profile.rs`
|
||||
- Create: `src/runtime/tool_policy.rs`
|
||||
- Create: `src/runtime/engine.rs`
|
||||
- Modify: `src/lib.rs`
|
||||
- Test: `tests/runtime_profile_test.rs`
|
||||
|
||||
**Intent:**
|
||||
- Create a real runtime namespace that represents sgClaw’s zeroclaw-first architecture.
|
||||
- Stop letting `compat/runtime.rs` be the place where product architecture is defined.
|
||||
|
||||
**Step 1: Write the failing tests**
|
||||
|
||||
Create `tests/runtime_profile_test.rs` with at least:
|
||||
|
||||
```rust
// Assumed import path: adjust if the crate is not named `sgclaw`
// or exposes these types from a different module.
use sgclaw::runtime::{RuntimeProfile, ToolPolicy};

#[test]
fn browser_attached_profile_exposes_browser_surface_without_becoming_browser_only() {
    let profile = RuntimeProfile::BrowserAttached;
    let policy = ToolPolicy::for_profile(profile);

    // `allowed_tools` is a `Vec<String>`, so compare by value; calling
    // `contains` with a `&str` would not type-check.
    assert!(policy.allowed_tools.iter().any(|tool| tool == "browser_action"));
    assert!(policy.may_use_non_browser_tools);
}

#[test]
fn general_assistant_profile_does_not_require_browser_surface() {
    let profile = RuntimeProfile::GeneralAssistant;
    let policy = ToolPolicy::for_profile(profile);

    assert!(!policy.requires_browser_surface);
}
```
|
||||
|
||||
**Step 2: Run the tests to confirm failure**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test runtime_profile_test -- --nocapture
```
|
||||
|
||||
Expected:
|
||||
- Fail with unresolved imports or missing types for `RuntimeProfile` and `ToolPolicy`.
|
||||
|
||||
**Step 3: Add the minimal runtime module**
|
||||
|
||||
Implement the first-pass skeleton:
|
||||
|
||||
```rust
// `Debug` and `PartialEq` are required by the `assert_eq!` checks on
// `runtime_profile` in the settings tests of Task 3.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuntimeProfile {
    BrowserAttached,
    BrowserHeavy,
    GeneralAssistant,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToolPolicy {
    pub requires_browser_surface: bool,
    pub may_use_non_browser_tools: bool,
    pub allowed_tools: Vec<String>,
}
```
|
||||
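The tests in Step 1 call `ToolPolicy::for_profile`, which the skeleton does not define, so Step 4 can only pass once a constructor exists. One possible shape is sketched below. The per-profile values are assumptions made for illustration: in particular, whether `BrowserAttached` sets `requires_browser_surface`, what `BrowserHeavy` permits, and the `allows` helper are not specified by this plan. The only tool names used are `browser_action` and `read_skill`, both of which the plan mentions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuntimeProfile {
    BrowserAttached,
    BrowserHeavy,
    GeneralAssistant,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToolPolicy {
    pub requires_browser_surface: bool,
    pub may_use_non_browser_tools: bool,
    pub allowed_tools: Vec<String>,
}

impl ToolPolicy {
    /// Maps a runtime profile to a tool policy (illustrative values only).
    pub fn for_profile(profile: RuntimeProfile) -> Self {
        match profile {
            // Browser is available, but the runtime is not browser-only.
            RuntimeProfile::BrowserAttached => Self {
                requires_browser_surface: true,
                may_use_non_browser_tools: true,
                allowed_tools: vec!["browser_action".to_owned(), "read_skill".to_owned()],
            },
            // Assumed: a browser-centric profile restricted to the browser tool.
            RuntimeProfile::BrowserHeavy => Self {
                requires_browser_surface: true,
                may_use_non_browser_tools: false,
                allowed_tools: vec!["browser_action".to_owned()],
            },
            // No browser surface needed at all.
            RuntimeProfile::GeneralAssistant => Self {
                requires_browser_surface: false,
                may_use_non_browser_tools: true,
                allowed_tools: vec!["read_skill".to_owned()],
            },
        }
    }

    /// Hypothetical convenience check used instead of `Vec::contains`.
    pub fn allows(&self, tool: &str) -> bool {
        self.allowed_tools.iter().any(|allowed| allowed == tool)
    }
}
```

An exhaustive `match` means that adding a fourth profile later fails to compile until its policy is written down, which is a cheap guard against an accidentally permissive default.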
|
||||
**Step 4: Re-run the focused tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test runtime_profile_test -- --nocapture
```
|
||||
|
||||
Expected:
|
||||
- Both new tests pass.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add src/runtime/mod.rs src/runtime/profile.rs src/runtime/tool_policy.rs src/runtime/engine.rs src/lib.rs tests/runtime_profile_test.rs
git commit -m "feat: add zeroclaw-first runtime module skeleton"
```
|
||||
|
||||
## Task 3: Replace DeepSeek-Only Settings With Zeroclaw-First SGClaw Settings
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/config/settings.rs`
|
||||
- Modify: `src/config/mod.rs`
|
||||
- Modify: `src/compat/config_adapter.rs`
|
||||
- Test: `tests/compat_config_test.rs`
|
||||
- Test: `tests/compat_memory_test.rs`
|
||||
- Test: `tests/compat_cron_test.rs`
|
||||
|
||||
**Intent:**
|
||||
- Stop treating browser config as only a DeepSeek shim.
|
||||
- Introduce sgClaw settings that can express zeroclaw-first runtime behavior while staying backward-compatible with the existing `sgclaw_config.json`.
|
||||
|
||||
**Step 1: Add the failing config tests**
|
||||
|
||||
Add tests beside the existing ones in `tests/compat_config_test.rs`:
|
||||
|
||||
```rust
#[test]
fn sgclaw_settings_default_to_compact_skills_and_browser_attached_profile() {
    let settings = SgClawSettings::from_legacy_deepseek_fields(
        "sk-test".into(),
        "https://api.deepseek.com".into(),
        "deepseek-chat".into(),
        None,
    ).unwrap();

    assert_eq!(settings.runtime_profile, RuntimeProfile::BrowserAttached);
    assert_eq!(settings.skills_prompt_mode, SkillsPromptMode::Compact);
}

#[test]
fn sgclaw_settings_load_new_runtime_fields_from_browser_config() {
    // write config with runtimeProfile / skillsPromptMode / allowedToolProfiles
}
```
|
||||
|
||||
**Step 2: Run the focused tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test compat_config_test -- --nocapture
```
|
||||
|
||||
Expected:
|
||||
- Fail because `SgClawSettings`, `RuntimeProfile`, or new parsing fields do not exist yet.
|
||||
|
||||
**Step 3: Implement backward-compatible settings expansion**
|
||||
|
||||
Minimal target shape:
|
||||
|
||||
```rust
use std::path::PathBuf;

pub struct SgClawSettings {
    pub provider_api_key: String,
    pub provider_base_url: String,
    pub provider_model: String,
    pub skills_dir: Option<PathBuf>,
    pub skills_prompt_mode: SkillsPromptMode,
    pub runtime_profile: RuntimeProfile,
}
```
|
||||
|
||||
Compatibility rules:
|
||||
- Existing `apiKey/baseUrl/model/skillsDir` continue to load.
|
||||
- New fields such as `skillsPromptMode` and `runtimeProfile` are optional.
|
||||
- Defaults should be hardened, not legacy-full-prompt by accident.
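The compatibility rules above amount to "legacy fields are required, new fields are optional, and absent fields fall back to hardened defaults". A minimal sketch of that behavior follows. It is written under several assumptions: the fourth constructor argument is taken to be `skills_dir`, the string values `compact`, `full`, `browser_attached`, `browser_heavy`, and `general_assistant` are invented spellings, and `with_optional_fields` is a hypothetical helper standing in for whatever JSON parsing the real loader uses.

```rust
use std::path::PathBuf;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuntimeProfile { BrowserAttached, BrowserHeavy, GeneralAssistant }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SkillsPromptMode { Compact, Full }

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SgClawSettings {
    pub provider_api_key: String,
    pub provider_base_url: String,
    pub provider_model: String,
    pub skills_dir: Option<PathBuf>,
    pub skills_prompt_mode: SkillsPromptMode,
    pub runtime_profile: RuntimeProfile,
}

impl SgClawSettings {
    /// Legacy configs carry only provider fields; everything else is defaulted.
    pub fn from_legacy_deepseek_fields(
        api_key: String,
        base_url: String,
        model: String,
        skills_dir: Option<PathBuf>,
    ) -> Result<Self, String> {
        if api_key.trim().is_empty() {
            return Err("apiKey must not be empty".to_owned());
        }
        Ok(Self {
            provider_api_key: api_key,
            provider_base_url: base_url,
            provider_model: model,
            skills_dir,
            // Hardened defaults: compact prompts, browser attached.
            skills_prompt_mode: SkillsPromptMode::Compact,
            runtime_profile: RuntimeProfile::BrowserAttached,
        })
    }

    /// Applies optional new-style fields; `None` keeps the default.
    pub fn with_optional_fields(
        mut self,
        skills_prompt_mode: Option<&str>,
        runtime_profile: Option<&str>,
    ) -> Result<Self, String> {
        if let Some(raw) = skills_prompt_mode {
            self.skills_prompt_mode = match raw {
                "compact" => SkillsPromptMode::Compact,
                "full" => SkillsPromptMode::Full,
                other => return Err(format!("unknown skillsPromptMode: {}", other)),
            };
        }
        if let Some(raw) = runtime_profile {
            self.runtime_profile = match raw {
                "browser_attached" => RuntimeProfile::BrowserAttached,
                "browser_heavy" => RuntimeProfile::BrowserHeavy,
                "general_assistant" => RuntimeProfile::GeneralAssistant,
                other => return Err(format!("unknown runtimeProfile: {}", other)),
            };
        }
        Ok(self)
    }
}
```

Rejecting unknown values instead of silently falling back keeps a typo in `sgclaw_config.json` from quietly selecting a less restrictive mode than the operator intended.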
|
||||
|
||||
**Step 4: Run the config and adapter tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test compat_config_test -- --nocapture
cargo test --test compat_memory_test -- --nocapture
cargo test --test compat_cron_test -- --nocapture
```
|
||||
|
||||
Expected:
|
||||
- All pass.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add src/config/settings.rs src/config/mod.rs src/compat/config_adapter.rs tests/compat_config_test.rs tests/compat_memory_test.rs tests/compat_cron_test.rs
git commit -m "feat: add zeroclaw-first sgclaw settings model"
```
|
||||
|
||||
## Task 4: Rebuild The Execution Path Around The New Runtime Engine
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/compat/runtime.rs`
|
||||
- Modify: `src/agent/mod.rs`
|
||||
- Modify: `src/compat/event_bridge.rs`
|
||||
- Test: `tests/compat_runtime_test.rs`
|
||||
|
||||
**Intent:**
|
||||
- Make `compat/runtime.rs` a thin bridge into the new runtime engine instead of the place where the product’s core execution policy lives.
|
||||
- Keep browser-originated tasks attached to a browser surface, but do not collapse the runtime into a browser-only tool list.
|
||||
|
||||
**Step 1: Add the failing runtime tests**
|
||||
|
||||
Extend `tests/compat_runtime_test.rs` with at least:
|
||||
|
||||
```rust
#[test]
fn compat_runtime_can_complete_a_text_only_turn_without_browser_tool_calls() {
    // provider returns direct assistant content
    // no BrowserMessage::Response is queued
    // summary should still succeed
}

#[test]
fn compat_runtime_allows_read_skill_under_compact_mode_policy() {
    // configure compact skills mode
    // verify the runtime tool policy includes read_skill
}
```
|
||||
|
||||
**Step 2: Run the focused runtime test file**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test compat_runtime_test -- --nocapture
```
|
||||
|
||||
Expected:
|
||||
- Fail because the current runtime still hardcodes browser-only exposure or cannot express the new policy.
|
||||
|
||||
**Step 3: Refactor the runtime path**
|
||||
|
||||
Implementation target:
|
||||
|
||||
- `src/agent/mod.rs` decides the runtime profile and passes browser context into the runtime engine.
|
||||
- `src/compat/runtime.rs` becomes a compatibility bridge, not the architecture center.
|
||||
- `src/runtime/engine.rs` owns:
  - zeroclaw agent creation
  - tool registration
  - skill loading
  - runtime profile application
  - browser contract prompt injection only when browser surface is actually present
|
||||
|
||||
**Step 4: Re-run the focused runtime tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test compat_runtime_test -- --nocapture
```
|
||||
|
||||
Expected:
|
||||
- Existing compat runtime tests still pass.
|
||||
- New text-only / compact-skill tests pass.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add src/compat/runtime.rs src/agent/mod.rs src/compat/event_bridge.rs tests/compat_runtime_test.rs
git commit -m "refactor: route browser requests through zeroclaw-first runtime engine"
```
|
||||
|
||||
## Task 5: Treat The Browser Pipe As A Privileged Tool Surface
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/compat/browser_tool_adapter.rs`
|
||||
- Modify: `src/pipe/protocol.rs`
|
||||
- Modify: `src/pipe/browser_tool.rs`
|
||||
- Modify: `src/security/mac_policy.rs`
|
||||
- Test: `tests/compat_browser_tool_test.rs`
|
||||
- Test: `tests/browser_tool_test.rs`
|
||||
- Test: `tests/pipe_protocol_test.rs`
|
||||
- Test: `tests/pipe_handshake_test.rs`
|
||||
|
||||
**Intent:**
|
||||
- Keep the browser powerful, but explicitly as a protected external execution surface.
|
||||
- Preserve the current HMAC/MAC/rules boundary while making it obvious in code that browser execution is not synonymous with runtime execution.
|
||||
|
||||
**Step 1: Add the failing tests**
|
||||
|
||||
Examples:
|
||||
|
||||
```rust
#[test]
fn browser_tool_contract_is_marked_as_privileged_surface_in_policy_metadata() {
    // assert runtime metadata treats browser tool separately from generic tools
}

#[test]
fn browser_submit_task_can_carry_browser_context_without_forcing_browser_only_execution() {
    // protocol/adapter level test
}
```
|
||||
|
||||
**Step 2: Run the safety-related tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test compat_browser_tool_test -- --nocapture
cargo test --test browser_tool_test -- --nocapture
cargo test --test pipe_protocol_test -- --nocapture
cargo test --test pipe_handshake_test -- --nocapture
```
|
||||
|
||||
Expected:
|
||||
- Fail on new privilege/metadata expectations before implementation.
|
||||
|
||||
**Step 3: Implement the policy split**
|
||||
|
||||
Implementation target:
|
||||
- keep `browser_action` schema constrained
|
||||
- keep `MacPolicy` as the final guard on outbound browser commands
|
||||
- annotate browser surface metadata in the runtime/tool policy layer
|
||||
- avoid leaking browser-only assumptions into generic runtime config
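The policy split described above could look roughly like the sketch below: tools carry surface metadata, and only the privileged browser surface is routed through a final outbound guard. Everything here is illustrative. `ToolSurface`, `ToolDescriptor`, `OutboundGuard`, `DomainAllowList`, and `authorize` are hypothetical names, and the guard trait is only a stand-in, because the real `MacPolicy` API is not shown in this plan.

```rust
/// Hypothetical surface annotation kept in the runtime/tool policy layer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolSurface {
    Generic,
    PrivilegedBrowser,
}

pub struct ToolDescriptor {
    pub name: &'static str,
    pub surface: ToolSurface,
}

/// Stand-in for the final guard on outbound browser commands.
pub trait OutboundGuard {
    fn permits(&self, action: &str, domain: &str) -> bool;
}

/// Toy guard for the example: allows any action on listed domains.
pub struct DomainAllowList(pub Vec<String>);

impl OutboundGuard for DomainAllowList {
    fn permits(&self, _action: &str, domain: &str) -> bool {
        self.0.iter().any(|allowed| allowed == domain)
    }
}

/// Generic tools skip the browser guard; privileged browser calls must pass it.
pub fn authorize(
    tool: &ToolDescriptor,
    action: &str,
    domain: &str,
    guard: &dyn OutboundGuard,
) -> Result<(), String> {
    match tool.surface {
        ToolSurface::Generic => Ok(()),
        ToolSurface::PrivilegedBrowser if guard.permits(action, domain) => Ok(()),
        ToolSurface::PrivilegedBrowser => Err(format!(
            "{}: `{}` on `{}` rejected by outbound guard",
            tool.name, action, domain
        )),
    }
}
```

Keeping the surface tag on the descriptor rather than in generic runtime config is one way to avoid leaking browser-only assumptions into settings that non-browser profiles also read.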
|
||||
|
||||
**Step 4: Re-run the focused browser safety tests**
|
||||
|
||||
Run the same four commands from Step 2.
|
||||
|
||||
Expected:
|
||||
- All pass.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add src/compat/browser_tool_adapter.rs src/pipe/protocol.rs src/pipe/browser_tool.rs src/security/mac_policy.rs tests/compat_browser_tool_test.rs tests/browser_tool_test.rs tests/pipe_protocol_test.rs tests/pipe_handshake_test.rs
git commit -m "feat: model browser pipe as privileged runtime surface"
```
|
||||
|
||||
## Task 6: Restore Zeroclaw-Native Skills Instead Of Browser-Specific Skill Prompt Hacks
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/runtime/engine.rs`
|
||||
- Modify: `src/compat/event_bridge.rs`
|
||||
- Modify: `src/agent/mod.rs`
|
||||
- Test: `tests/compat_runtime_test.rs`
|
||||
- Test: `tests/skill_lib_validation_test.py`
|
||||
|
||||
**Intent:**
|
||||
- Make skill loading and skill usage follow zeroclaw-native behavior first.
|
||||
- Prefer compact mode plus `read_skill` when safe.
|
||||
- Make skill usage observable in runtime logs so the browser UI can explain what happened.
|
||||
|
||||
**Step 1: Add the failing skill-observability tests**
|
||||
|
||||
Add tests for:
|
||||
- compact mode includes `read_skill`
|
||||
- runtime logs show when `read_skill` or skill-defined tools are invoked
|
||||
- configured `skillsDir` still resolves both repo-root and nested `skills/` layouts
|
||||
|
||||
**Step 2: Run the skill-related tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test compat_runtime_test -- --nocapture
python3 tests/skill_lib_validation_test.py
```
|
||||
|
||||
Expected:
|
||||
- New observability assertions fail before implementation.
|
||||
|
||||
**Step 3: Implement the minimal skill-first runtime behavior**
|
||||
|
||||
Implementation target:
|
||||
- compact skills mode becomes the sgClaw default unless explicitly overridden
|
||||
- `read_skill` is allowed when the active tool policy permits it
|
||||
- runtime logs include loaded skill names and actual skill/tool usage
|
||||
|
||||
**Step 4: Re-run the skill tests**
|
||||
|
||||
Run the same two commands from Step 2.
|
||||
|
||||
Expected:
|
||||
- Both pass.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add src/runtime/engine.rs src/compat/event_bridge.rs src/agent/mod.rs tests/compat_runtime_test.rs tests/skill_lib_validation_test.py
git commit -m "feat: restore zeroclaw-native skill flow and observability"
```
|
||||
|
||||
## Task 7: Remove Or Quarantine The Legacy Parallel Runtime Paths
|
||||
|
||||
**Files:**
|
||||
- Modify or delete: `src/agent/runtime.rs`
|
||||
- Modify or delete: `src/agent/planner.rs`
|
||||
- Modify: `tests/agent_runtime_test.rs`
|
||||
- Modify: `tests/planner_test.rs`
|
||||
- Modify: `docs/L2-核心模块与接口契约层.md`
|
||||
- Modify: `docs/L3-数据流与Skill体系层.md`
|
||||
|
||||
**Intent:**
|
||||
- Stop shipping multiple conceptual runtimes.
|
||||
- If the light runtime/planner path still has value, mark it as legacy/dev-only.
|
||||
- If it has no production value, remove it after the zeroclaw-first runtime is stable.
|
||||
|
||||
**Step 1: Decide the disposition**
|
||||
|
||||
Choose one:
|
||||
|
||||
```text
A. delete planner/runtime legacy path
B. keep as explicit legacy/dev-only module with zero production routing
```
|
||||
|
||||
Recommendation: `B` first, then `A` after one clean release cycle.
|
||||
|
||||
**Step 2: Add the failing cleanup tests**
|
||||
|
||||
Examples:
|
||||
- production browser path never routes into planner fallback
|
||||
- docs no longer describe planner fallback as the primary model-enabled path
|
||||
|
||||
**Step 3: Run the legacy-path tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test agent_runtime_test -- --nocapture
cargo test --test planner_test -- --nocapture
```
|
||||
|
||||
Expected:
|
||||
- Fail once the cleanup assertions are added.
|
||||
|
||||
**Step 4: Apply the cleanup**
|
||||
|
||||
Implementation target:
|
||||
- remove production routing dependency
|
||||
- rename or mark legacy modules clearly
|
||||
- update docs to match the new reality
|
||||
|
||||
**Step 5: Re-run the legacy-path tests and commit**
|
||||
|
||||
Run the same two commands, then:
|
||||
|
||||
```bash
git add src/agent/runtime.rs src/agent/planner.rs tests/agent_runtime_test.rs tests/planner_test.rs docs/L2-核心模块与接口契约层.md docs/L3-数据流与Skill体系层.md
git commit -m "refactor: quarantine legacy browser-only runtime paths"
```
|
||||
|
||||
## Task 8: Align The Browser UI With The New Runtime Truth
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/AGENTS.md`
|
||||
|
||||
**Intent:**
|
||||
- Make the browser UI an honest client of the runtime.
|
||||
- Stop labeling all tasks as “网页执行” when the runtime may have solved them through non-browser zeroclaw capabilities.
|
||||
|
||||
**Step 1: Add the UI contract checklist**
|
||||
|
||||
Checklist:
|
||||
|
||||
```text
1. UI shows runtime profile or capability mode
2. UI can show whether browser tools were actually used
3. UI can show when skills were read or invoked
4. wording does not imply browser is the whole runtime
```
|
||||
|
||||
**Step 2: Implement the label and observability changes**
|
||||
|
||||
Examples:
|
||||
- replace fixed “网页执行” copy with runtime-derived capability wording
|
||||
- add a compact “本轮调用”/“能力来源” section
|
||||
- avoid implying every successful task came from page automation
|
||||
|
||||
**Step 3: Verify browser resources**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
node --check /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome
```
|
||||
|
||||
Expected:
|
||||
- All pass.
|
||||
|
||||
**Step 4: Commit**
|
||||
|
||||
```bash
git -C /home/zyl/projects/superRpa/src add chrome/browser/resources/superrpa/sgclaw_overlay.js chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.html.ts chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.css.ts AGENTS.md
git -C /home/zyl/projects/superRpa/src commit -m "feat: align browser ui with zeroclaw-first runtime"
```
|
||||
|
||||
## Final Verification Matrix
|
||||
|
||||
Run, in order:
|
||||
|
||||
```bash
cargo test --test runtime_profile_test -- --nocapture
cargo test --test compat_config_test -- --nocapture
cargo test --test compat_runtime_test -- --nocapture
cargo test --test compat_browser_tool_test -- --nocapture
cargo test --test browser_tool_test -- --nocapture
cargo test --test pipe_protocol_test -- --nocapture
cargo test --test pipe_handshake_test -- --nocapture
cargo test --test agent_runtime_test -- --nocapture
cargo test --test planner_test -- --nocapture
python3 tests/skill_lib_validation_test.py
```
|
||||
|
||||
If local Cargo registry state is unstable, use the Chromium hermetic wrapper instead:
|
||||
|
||||
```bash
python3 /home/zyl/projects/superRpa/src/tools/crates/run_cargo.py test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml -- --nocapture
```
|
||||
|
||||
Then verify browser integration:
|
||||
|
||||
```bash
node --check /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome
```
|
||||
|
||||
## Success Criteria
|
||||
|
||||
- The docs say the right thing before the code change begins.
|
||||
- `sgclaw` behaves like a security-hardened zeroclaw runtime, not a browser-only compat shell.
|
||||
- The browser pipe remains central for protected execution, but it is no longer mistaken for the whole product architecture.
|
||||
- Skills, prompt building, memory, routing, and tool policy all flow through zeroclaw-native mechanisms first.
|
||||
- The UI becomes a thin, honest client of the shared runtime.
|
||||
482
docs/plans/2026-03-29-sgclaw-superrpa-decoupled-runtime-plan.md
Normal file
@@ -0,0 +1,482 @@
|
||||
# sgClaw SuperRPA Decoupled Runtime Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Keep the SuperRPA parent-child security model, while moving high-frequency sgClaw startup, model, skill, and frontend presentation changes out of SuperRPA compile-time code and into runtime-managed configuration.
|
||||
|
||||
**Architecture:** SuperRPA remains the trusted host that owns process spawning, pipe security, browser/office capability gates, and frontend host contracts. sgClaw becomes the runtime-configured execution engine that reads launch/runtime policy from files, with SuperRPA preferring external launch descriptors and external frontend bundles before falling back to bundled defaults. This preserves the security boundary while removing the need to rebuild the browser for routine sgClaw iteration.
|
||||
|
||||
**Tech Stack:** Chromium C++ WebUI, TypeScript/Lit frontend, Rust sgClaw runtime, JSON config files, local filesystem-based runtime assets, existing pipe protocol and Zeroclaw planner-first execution path.
|
||||
|
||||
### Task 1: Freeze the design in docs before further code changes
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/docs/L1-系统架构与安全模型层.md`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/docs/L2-核心模块与接口契约层.md`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/docs/L4-工程实现与部署拓扑层.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/claw/docs/plans/2026-03-29-sgclaw-superrpa-runtime-config-design.md`
|
||||
|
||||
**Step 1: Write the design delta doc**
|
||||
|
||||
Document these decisions explicitly:
|
||||
- SuperRPA owns host security and capability exposure only.
|
||||
- sgClaw owns planner, model routing, skill orchestration, and business behavior.
|
||||
- Launch behavior is described by runtime files, not hardcoded browser-side constants.
|
||||
- The frontend has display rights only; planner/executor decisions stay in sgClaw/Zeroclaw.
|
||||
|
||||
**Step 2: Add the failing doc checklist**
|
||||
|
||||
Create a checklist inside the design doc with these questions and mark them initially unresolved:
|
||||
- Can browser startup switch sgClaw binary without rebuilding Chromium?
|
||||
- Can model/provider selection change without rebuilding Chromium?
|
||||
- Can floating UI be replaced without rebuilding Chromium?
|
||||
- Can acceptance flows prove planner-first behavior visually and functionally?
|
||||
|
||||
**Step 3: Update the core architecture docs**
|
||||
|
||||
Add short sections showing:
|
||||
- Launch config file path and fallback rules.
|
||||
- Runtime config ownership split between SuperRPA and sgClaw.
|
||||
- External frontend bundle loading path and fallback to bundled assets.
|
||||
|
||||
**Step 4: Review docs for consistency**
|
||||
|
||||
Check that `L1`, `L2`, `L4`, and the new design doc all use the same terms:
|
||||
- `host`
|
||||
- `launch config`
|
||||
- `runtime config`
|
||||
- `frontend bundle`
|
||||
- `planner-first`
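The terminology review in Step 4 can be partly automated. The following is a minimal sketch, assuming a plain case-insensitive substring check is good enough for a first pass; the helper name `missing_terms` is hypothetical and not part of either repository.

```python
from typing import Iterable, List

# The five terms that L1, L2, L4, and the design doc are expected to share.
REQUIRED_TERMS = (
    "host",
    "launch config",
    "runtime config",
    "frontend bundle",
    "planner-first",
)


def missing_terms(doc_text: str, terms: Iterable[str] = REQUIRED_TERMS) -> List[str]:
    """Return the required terms that never appear in doc_text.

    The match is a case-insensitive substring test, so it only proves that a
    term is present somewhere, not that it is used consistently.
    """
    lowered = doc_text.lower()
    return [term for term in terms if term.lower() not in lowered]
```

Running `missing_terms` over the text of each doc and reviewing any non-empty result by hand keeps the check cheap; a substring match will not catch synonyms that drift in alongside the agreed terms.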
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git -C /home/zyl/projects/sgClaw/claw add \
|
||||
docs/L1-系统架构与安全模型层.md \
|
||||
docs/L2-核心模块与接口契约层.md \
|
||||
docs/L4-工程实现与部署拓扑层.md \
|
||||
docs/plans/2026-03-29-sgclaw-superrpa-runtime-config-design.md
|
||||
git -C /home/zyl/projects/sgClaw/claw commit -m "docs: define superrpa sgclaw runtime boundary"
|
||||
```
|
||||
|
||||
### Task 2: Finish and lock down the current stale-backend fix
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.h`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui_mainline_unittest.cc`
|
||||
- Create: `/home/zyl/projects/sgClaw/claw/tools/browser_runtime/sgclaw_browser_entry.sh`
|
||||
|
||||
**Step 1: Write the failing regression test**
|
||||
|
||||
Add internal tests for binary resolution priority:
|
||||
1. `SUPERRPA_SGCLAW_BINARY` override wins.
|
||||
2. `skillsDir`-inferred source checkout wrapper wins over bundled binary.
|
||||
3. Bundled `out/.../sgclaw` is only a fallback.
|
||||
|
||||
**Step 2: Run the failing test**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease functions_ui_mainline_unittests
|
||||
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests --gtest_filter="SgClawSessionServiceInternalTest.*"
|
||||
```
|
||||
|
||||
Expected: the new tests fail until the test helper wiring is complete.
|
||||
|
||||
**Step 3: Write the minimal implementation**
|
||||
|
||||
Expose a testable internal resolver function that accepts:
|
||||
- config path
|
||||
- bundled binary path
|
||||
- optional env override string
|
||||
- output detail string
|
||||
|
||||
Keep production `Start()` calling the same shared resolver to avoid divergence.
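The priority order that the regression tests pin down can be sketched as follows. This is an illustration in Python, not the C++ resolver: the function name, the choice to pass `skills_dir` directly instead of a config path, and the walk up the parent directories to find the wrapper are all assumptions. The detail strings reuse the log messages listed under Task 7.

```python
from pathlib import Path
from typing import Optional, Tuple

# Wrapper location relative to the sgClaw checkout root (path taken from Task 2).
WRAPPER_RELATIVE = Path("tools/browser_runtime/sgclaw_browser_entry.sh")


def resolve_sgclaw_binary(
    skills_dir: Optional[Path],
    bundled_binary: Path,
    env_override: Optional[str],
) -> Tuple[Path, str]:
    """Return (binary path, detail string) using the three-level priority."""
    # 1. An explicit SUPERRPA_SGCLAW_BINARY override always wins.
    if env_override:
        return Path(env_override), f"using SUPERRPA_SGCLAW_BINARY override: {env_override}"

    # 2. Assumption: skills_dir lives inside the source checkout, so the
    #    wrapper is searched for in skills_dir and each of its parents.
    if skills_dir is not None:
        for root in (skills_dir, *skills_dir.parents):
            wrapper = root / WRAPPER_RELATIVE
            if wrapper.is_file():
                return wrapper, f"using source checkout sgclaw inferred from skillsDir: {wrapper}"

    # 3. The bundled binary is only a fallback.
    return bundled_binary, f"using bundled sgclaw from browser output dir: {bundled_binary}"
```

Returning the detail string together with the path mirrors the "output detail string" parameter above, so tests and production logging can assert on the same value.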
|
||||
|
||||
**Step 4: Run tests to verify they pass**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease functions_ui_mainline_unittests
|
||||
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests --gtest_filter="SgClawSessionServiceInternalTest.*:FunctionsUiMainlineTest.StartPublishesDetailedRulesDiagnosticsToUiLogs"
|
||||
```
|
||||
|
||||
Expected: all targeted tests pass.
|
||||
|
||||
**Step 5: Run browser compile verification**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome
|
||||
```
|
||||
|
||||
Expected: `LINK ./chrome` with exit code `0`.
|
||||
|
||||
**Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git -C /home/zyl/projects/superRpa/src add \
|
||||
chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc \
|
||||
chrome/browser/ui/webui/superrpa/sgclaw_session_service.h \
|
||||
chrome/browser/ui/webui/superrpa/functions_ui_mainline_unittest.cc
|
||||
git -C /home/zyl/projects/sgClaw/claw add \
|
||||
tools/browser_runtime/sgclaw_browser_entry.sh
|
||||
git -C /home/zyl/projects/superRpa/src commit -m "superrpa: resolve sgclaw binary from runtime config"
git -C /home/zyl/projects/sgClaw/claw commit -m "sgclaw: add browser entry wrapper script"
|
||||
```
|
||||
|
||||
### Task 3: Add a real launch descriptor so SuperRPA no longer hardcodes sgClaw startup policy
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_webui_config.h`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_webui_config.cc`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui.cc`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config_state.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config_mainline_unittest.ts`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_launch_config.h`
|
||||
- Create: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_launch_config.cc`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui_mainline_unittest.cc`
|
||||
|
||||
**Step 1: Write the failing config tests**
|
||||
|
||||
Cover:
|
||||
- missing launch config falls back safely
|
||||
- explicit `binary`, `args`, `env`, `working_dir`, `runtime_config_path` parse correctly
|
||||
- unsafe or nonexistent paths are rejected with clear UI-visible errors
|
||||
|
||||
**Step 2: Run the failing tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease functions_ui_mainline_unittests
|
||||
/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests --gtest_filter="*SgClaw*Config*"
|
||||
```
|
||||
|
||||
Expected: launch-config cases fail before parser/consumer code is added.
|
||||
|
||||
**Step 3: Implement minimal launch config support**
|
||||
|
||||
Define a host-side launch descriptor with fields:
|
||||
- `binary`
|
||||
- `args`
|
||||
- `env`
|
||||
- `working_dir`
|
||||
- `runtime_config_path`
|
||||
- `frontend_bundle_dir`
|
||||
|
||||
Load it from a predictable profile-local path, with safe defaults and fallback to existing behavior.
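To make the descriptor shape concrete, here is a minimal parsing sketch. It is written in Python for brevity, whereas the host-side parser in this plan is C++; the class and function names are hypothetical, and the type rules (string paths, a list of string args, a string-to-string env map) are assumptions that the failing tests in Step 1 would need to confirm.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LaunchDescriptor:
    """Fields of the launch config; every field is optional."""
    binary: Optional[str] = None
    args: List[str] = field(default_factory=list)
    env: Dict[str, str] = field(default_factory=dict)
    working_dir: Optional[str] = None
    runtime_config_path: Optional[str] = None
    frontend_bundle_dir: Optional[str] = None


def parse_launch_descriptor(text: str) -> LaunchDescriptor:
    """Parse launch config JSON, raising ValueError with a readable message."""
    try:
        raw = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"launch config is not valid JSON: {exc}") from exc
    if not isinstance(raw, dict):
        raise ValueError("launch config must be a JSON object")

    desc = LaunchDescriptor()
    for key in ("binary", "working_dir", "runtime_config_path", "frontend_bundle_dir"):
        value = raw.get(key)
        if value is not None and not isinstance(value, str):
            raise ValueError(f"launch config field '{key}' must be a string")
        setattr(desc, key, value)

    args = raw.get("args", [])
    if not isinstance(args, list) or not all(isinstance(a, str) for a in args):
        raise ValueError("launch config field 'args' must be a list of strings")
    desc.args = list(args)

    env = raw.get("env", {})
    if not isinstance(env, dict) or not all(
        isinstance(k, str) and isinstance(v, str) for k, v in env.items()
    ):
        raise ValueError("launch config field 'env' must map strings to strings")
    desc.env = dict(env)
    return desc
```

Raising a single error type with a human-readable message makes it straightforward for the caller to fall back to existing behavior while still surfacing the reason in the UI.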
|
||||
|
||||
**Step 4: Wire startup to the descriptor**
|
||||
|
||||
Have `SgClawSessionService::Start()` resolve:
|
||||
- executable path
|
||||
- process args
|
||||
- working dir
|
||||
- env
|
||||
- runtime config path
|
||||
|
||||
without requiring browser recompilation for routine changes.
|
||||
|
||||
**Step 5: Wire config UI to persist supported fields**
|
||||
|
||||
Make `sgclaw-config` save and load the new fields so local users can adjust launch behavior from the UI or by editing the JSON file directly.
|
||||
|
||||
**Step 6: Run tests and browser compile**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease functions_ui_mainline_unittests chrome
|
||||
```
|
||||
|
||||
Expected: config tests pass and browser still links.
|
||||
|
||||
**Step 7: Commit**
|
||||
|
||||
```bash
|
||||
git -C /home/zyl/projects/superRpa/src add \
|
||||
chrome/browser/ui/webui/superrpa/sgclaw_launch_config.h \
|
||||
chrome/browser/ui/webui/superrpa/sgclaw_launch_config.cc \
|
||||
chrome/browser/ui/webui/superrpa/sgclaw_webui_config.h \
|
||||
chrome/browser/ui/webui/superrpa/sgclaw_webui_config.cc \
|
||||
chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc \
|
||||
chrome/browser/ui/webui/superrpa/functions_ui.cc \
|
||||
chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config.ts \
|
||||
chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config_state.ts \
|
||||
chrome/browser/resources/superrpa/devtools/functions/sgclaw-config/sgclaw-config_mainline_unittest.ts
|
||||
git -C /home/zyl/projects/superRpa/src commit -m "superrpa: add runtime launch config for sgclaw"
|
||||
```
|
||||
|
||||
### Task 4: Expand sgClaw runtime config so model/provider/skill policy lives in sgClaw, not SuperRPA
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/config/settings.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/config/mod.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/config_adapter.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/runtime.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/planner.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_config_test.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tests/runtime_profile_test.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tests/planner_test.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/docs/L3-数据流与Skill体系层.md`
|
||||
|
||||
**Step 1: Write failing Rust tests**
|
||||
|
||||
Cover config-driven behavior for:
|
||||
- planner-first mode
|
||||
- provider list / active provider
|
||||
- browser backend selection
|
||||
- office backend selection
|
||||
- skills prompt mode
|
||||
- runtime profile
|
||||
|
||||
**Step 2: Run the failing tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_config_test --test runtime_profile_test --test planner_test
|
||||
```
|
||||
|
||||
Expected: new config fields are missing or ignored.
|
||||
|
||||
**Step 3: Implement minimal config schema changes**
|
||||
|
||||
Add fields that let sgClaw choose behavior without host rebuild:
|
||||
- `planner_mode`
|
||||
- `providers`
|
||||
- `active_provider`
|
||||
- `browser_backend`
|
||||
- `office_backend`
|
||||
- `skills_prompt_mode`
|
||||
- `runtime_profile`
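The new fields can be pictured as one flat config object. The sketch below is in Python purely for illustration (the real schema lives in `src/config/settings.rs`); the default values are placeholders, and the assumption that each provider entry is an object with a `name` key is not confirmed by this plan.

```python
import json
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class RuntimeConfig:
    # Defaults are placeholders chosen for the sketch, not real sgClaw values.
    planner_mode: str = ""
    providers: List[dict] = field(default_factory=list)
    active_provider: str = ""
    browser_backend: str = ""
    office_backend: str = ""
    skills_prompt_mode: str = ""
    runtime_profile: str = ""


def load_runtime_config(text: str) -> RuntimeConfig:
    """Build a RuntimeConfig from JSON text, ignoring unknown keys."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("runtime config must be a JSON object")
    known = {f.name for f in fields(RuntimeConfig)}
    cfg = RuntimeConfig(**{k: v for k, v in raw.items() if k in known})

    # Assumption: providers are objects with a "name" key, and the active
    # provider has to be one of them when both are given.
    if cfg.active_provider and cfg.providers:
        names = {p.get("name") for p in cfg.providers if isinstance(p, dict)}
        if cfg.active_provider not in names:
            raise ValueError(f"active_provider '{cfg.active_provider}' is not in providers")
    return cfg
```

Ignoring unknown keys keeps older sgClaw builds usable against a newer config file, which matters when the file changes more often than the binary.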
|
||||
|
||||
**Step 4: Keep Zeroclaw-first execution**
|
||||
|
||||
Ensure the planner reads config before execution and produces a visible plan event for the frontend, but the frontend still only renders what sgClaw emits.
|
||||
|
||||
**Step 5: Re-run Rust tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_config_test --test runtime_profile_test --test planner_test --test runtime_task_flow_test
|
||||
```
|
||||
|
||||
Expected: planner/config tests pass.
|
||||
|
||||
**Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git -C /home/zyl/projects/sgClaw/claw add \
|
||||
src/config/settings.rs \
|
||||
src/config/mod.rs \
|
||||
src/compat/config_adapter.rs \
|
||||
src/agent/runtime.rs \
|
||||
src/agent/planner.rs \
|
||||
tests/compat_config_test.rs \
|
||||
tests/runtime_profile_test.rs \
|
||||
tests/planner_test.rs \
|
||||
docs/L3-数据流与Skill体系层.md
|
||||
git -C /home/zyl/projects/sgClaw/claw commit -m "sgclaw: move runtime policy into config"
|
||||
```
|
||||
|
||||
### Task 5: Decouple the floating UI so visual iteration stops depending on Chromium rebuilds
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui.cc`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
|
||||
- Create: `/home/zyl/projects/sgClaw/claw/frontend/runtime-host/README.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/claw/frontend/runtime-host/manifest.example.json`
|
||||
|
||||
**Step 1: Write failing UI host tests**
|
||||
|
||||
Cover:
|
||||
- external frontend bundle dir is preferred when declared in launch config
|
||||
- bundled frontend assets still load when external assets are absent
|
||||
- planner events are rendered as plan cards/log lines before execution
|
||||
|
||||
**Step 2: Run the failing frontend/browser tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome/test/data/webui_test_resources
|
||||
```
|
||||
|
||||
Then run the relevant TypeScript tests already wired for the sgClaw chat surface.
|
||||
|
||||
**Step 3: Implement the minimal external bundle loader**
|
||||
|
||||
SuperRPA should:
|
||||
- keep the host shell and JS bridge fixed
|
||||
- optionally load external `sgclaw-chat` assets from the runtime-configured directory
|
||||
- fall back to bundled assets when missing
|
||||
|
||||
**Step 4: Surface planner output early**
|
||||
|
||||
Use existing runtime event flow so the frontend shows:
|
||||
- plan summary
|
||||
- current step
|
||||
- execution logs
|
||||
|
||||
without moving control logic into the frontend.
|
||||
|
||||
**Step 5: Re-run tests**
|
||||
|
||||
Run the existing sgClaw chat WebUI tests and a browser smoke.
|
||||
|
||||
**Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git -C /home/zyl/projects/superRpa/src add \
|
||||
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
|
||||
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts \
|
||||
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_mainline_unittest.ts \
|
||||
chrome/browser/ui/webui/superrpa/functions_ui.cc \
|
||||
chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc
|
||||
git -C /home/zyl/projects/sgClaw/claw add \
|
||||
frontend/runtime-host/README.md \
|
||||
frontend/runtime-host/manifest.example.json
|
||||
git -C /home/zyl/projects/superRpa/src commit -m "superrpa: support external sgclaw frontend bundle"
git -C /home/zyl/projects/sgClaw/claw commit -m "sgclaw: document external frontend runtime host"
|
||||
```
|
||||
|
||||
### Task 6: Close the current remaining behavioral gaps before new feature work
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tests/live_acceptance_score_test.py`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tools/live_acceptance/run_zhihu_hotlist_excel_acceptance.py`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/docs/acceptance/2026-03-29-zhihu-hotlist-excel.md`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts`
|
||||
|
||||
**Step 1: Write failing acceptance assertions**
|
||||
|
||||
Add explicit checks for:
|
||||
- no repeated assistant paragraphs
|
||||
- no fake fallback data when browser path exists
|
||||
- planner-first output appears before tool execution
|
||||
- Zhihu hotlist extraction returns structured rows
|
||||
- office export returns a real output path
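The first assertion, no repeated assistant paragraphs, is easy to express as a pure function. This is a minimal sketch, assuming paragraphs are separated by blank lines and that whitespace differences should not hide a repeat; the helper name is hypothetical.

```python
import re
from collections import Counter
from typing import List


def repeated_paragraphs(text: str) -> List[str]:
    """Return each paragraph that occurs more than once in text.

    Paragraphs are split on blank lines and compared after collapsing runs
    of whitespace, so re-wrapped copies of the same paragraph still match.
    """
    paragraphs = [" ".join(p.split()) for p in re.split(r"\n\s*\n", text)]
    counts = Counter(p for p in paragraphs if p)
    return [p for p, n in counts.items() if n > 1]
```

An acceptance check can then assert that `repeated_paragraphs(assistant_text)` is empty and print the offending paragraphs when it is not.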
|
||||
|
||||
**Step 2: Run the failing acceptance flow**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
python3 /home/zyl/projects/sgClaw/claw/tools/live_acceptance/run_zhihu_hotlist_excel_acceptance.py
|
||||
```
|
||||
|
||||
Expected: current score exposes the remaining regressions if they still exist.
|
||||
|
||||
**Step 3: Fix the smallest issue set first**
|
||||
|
||||
Order:
|
||||
1. repeated message rendering / repeated summary emission
|
||||
2. planner event visibility
|
||||
3. structured hotlist extraction handoff
|
||||
4. office export path propagation
|
||||
|
||||
**Step 4: Re-run acceptance**
|
||||
|
||||
Run the same command until:
|
||||
- `hotlist_data_correctness > 0`
|
||||
- `xlsx_export_success > 0`
|
||||
- repeated text is absent
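The exit condition for this loop can be written down as a small gate. The sketch assumes the acceptance score is available as a flat mapping keyed by the metric names above and that repeated-text detection is computed separately; both are assumptions about the acceptance runner, and the function name is hypothetical.

```python
from typing import List, Mapping


def acceptance_failures(score: Mapping[str, float], repeated_text_found: bool) -> List[str]:
    """Return a list of unmet exit conditions; an empty list means done."""
    failures = []
    # A missing metric counts as 0 and therefore as a failure.
    if not score.get("hotlist_data_correctness", 0) > 0:
        failures.append("hotlist_data_correctness must be > 0")
    if not score.get("xlsx_export_success", 0) > 0:
        failures.append("xlsx_export_success must be > 0")
    if repeated_text_found:
        failures.append("repeated assistant text is still present")
    return failures
```

Returning the list of unmet conditions, instead of a single boolean, tells the operator which item in the Step 3 order to look at next.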
|
||||
|
||||
**Step 5: Record fresh evidence**
|
||||
|
||||
Update the acceptance markdown with:
|
||||
- timestamp
|
||||
- score
|
||||
- exact exported path
|
||||
- screenshot/log snippets
|
||||
|
||||
**Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git -C /home/zyl/projects/sgClaw/claw add \
|
||||
tests/live_acceptance_score_test.py \
|
||||
tools/live_acceptance/run_zhihu_hotlist_excel_acceptance.py \
|
||||
docs/acceptance/2026-03-29-zhihu-hotlist-excel.md
|
||||
git -C /home/zyl/projects/superRpa/src add \
|
||||
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
|
||||
chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts
|
||||
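git -C /home/zyl/projects/sgClaw/claw commit -m "acceptance: stabilize zhihu hotlist excel flow"
git -C /home/zyl/projects/superRpa/src commit -m "superrpa: fix repeated sgclaw chat message rendering"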
|
||||
```
|
||||
|
||||
### Task 7: Final integrated verification
|
||||
|
||||
**Files:**
|
||||
- Verify only: `/home/zyl/projects/sgClaw/claw/docs/acceptance/2026-03-29-zhihu-hotlist-excel.md`
|
||||
- Verify only: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
|
||||
- Verify only: `/home/zyl/projects/sgClaw/claw/tools/browser_runtime/sgclaw_browser_entry.sh`
|
||||
|
||||
**Step 1: Build all affected binaries**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease chrome functions_ui_mainline_unittests
|
||||
cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml
|
||||
```
|
||||
|
||||
Expected: both complete successfully.
|
||||
|
||||
**Step 2: Do the live browser smoke**
|
||||
|
||||
Run the browser with the local profile and verify the logs include one of:
|
||||
- `using SUPERRPA_SGCLAW_BINARY override: ...`
|
||||
- `using source checkout sgclaw inferred from skillsDir: ...`
|
||||
- `using bundled sgclaw from browser output dir: ...`
|
||||
|
||||
The expected dev mode result is the source checkout path, not the stale bundled fallback.
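The log check in this step can be scripted so the smoke result is not judged by eye. A minimal sketch, assuming the three messages appear verbatim somewhere in the captured log lines; the branch labels and the function name are hypothetical.

```python
from typing import Iterable, Optional

# Message prefixes copied from the list above; the dict keys are labels
# invented for this sketch.
RESOLVER_LOG_PREFIXES = {
    "env_override": "using SUPERRPA_SGCLAW_BINARY override:",
    "source_checkout": "using source checkout sgclaw inferred from skillsDir:",
    "bundled": "using bundled sgclaw from browser output dir:",
}


def resolver_branch(log_lines: Iterable[str]) -> Optional[str]:
    """Return the label of the first resolver message found, or None."""
    for line in log_lines:
        for branch, prefix in RESOLVER_LOG_PREFIXES.items():
            if prefix in line:
                return branch
    return None
```

In dev mode the smoke would expect `resolver_branch(lines) == "source_checkout"`; a result of `"bundled"` indicates the stale fallback is still being used.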
|
||||
|
||||
**Step 3: Run the final business acceptance**
|
||||
|
||||
Ask sgClaw to:
|
||||
1. read Zhihu hotlist
|
||||
2. export Excel
|
||||
3. open the screen presentation in a new tab
|
||||
|
||||
Verify:
|
||||
- planner appears first
|
||||
- skills are actually used
|
||||
- exported file path is returned
|
||||
- new-tab presentation opens
|
||||
|
||||
**Step 4: Record the result**
|
||||
|
||||
Append the final evidence to:
|
||||
- `/home/zyl/projects/sgClaw/claw/docs/acceptance/2026-03-29-zhihu-hotlist-excel.md`
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git -C /home/zyl/projects/sgClaw/claw add docs/acceptance/2026-03-29-zhihu-hotlist-excel.md
git -C /home/zyl/projects/sgClaw/claw commit -m "chore: record final sgclaw superrpa runtime verification"
|
||||
```
|
||||
|
||||
## Remaining Items Explicitly Carried Into This Plan
|
||||
|
||||
- The current stale-backend risk is not considered closed until the resolver has automated regression coverage.
|
||||
- The current local edit in `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui_mainline_unittest.cc` must be either completed under Task 2 or replaced with the final tested version.
|
||||
- The current wrapper script `/home/zyl/projects/sgClaw/claw/tools/browser_runtime/sgclaw_browser_entry.sh` is still untracked and must be committed as part of Task 2.
|
||||
- The Zhihu hotlist to Excel acceptance still has unresolved correctness and export-path gaps and remains part of the critical path.
|
||||
- The repeated assistant text regression remains part of the critical path because it corrupts operator trust during demos.
|
||||
|
||||
Plan complete and saved to `docs/plans/2026-03-29-sgclaw-superrpa-decoupled-runtime-plan.md`. Two execution options:
|
||||
|
||||
**1. Subagent-Driven (this session)** - I dispatch a fresh subagent per task and review between tasks, which allows fast iteration
|
||||
|
||||
**2. Parallel Session (separate)** - Open a new session with executing-plans for batch execution with checkpoints
|
||||
|
||||
**Which approach?**
|
||||
137
docs/plans/2026-03-29-sgclaw-superrpa-runtime-config-design.md
Normal file
@@ -0,0 +1,137 @@
|
||||
# sgClaw SuperRPA Runtime Config Design
|
||||
|
||||
**Status**: Draft frozen before implementation
|
||||
**Date**: 2026-03-29
|
||||
|
||||
## 1. Goal
|
||||
|
||||
Freeze the runtime-boundary design before further implementation so that Task 2 and later tasks do not drift back into browser-compiled behavior.
|
||||
|
||||
The design line is fixed:
|
||||
|
||||
- `host` keeps the security boundary.
|
||||
- sgClaw keeps runtime behavior.
|
||||
- `frontend bundle` keeps display rights only.
|
||||
- High-frequency changes move to runtime-managed files whenever possible.
|
||||
|
||||
## 2. Ownership Split
|
||||
|
||||
### 2.1 host
|
||||
|
||||
SuperRPA as `host` owns only the trusted boundary:
|
||||
|
||||
- process spawning
|
||||
- pipe lifecycle and session security
|
||||
- browser / office capability exposure
|
||||
- path validation for runtime-managed files
|
||||
- fallback to bundled defaults when external files are missing or unsafe
|
||||
|
||||
`host` does not own planner policy, model routing, provider selection, skill orchestration, or business behavior.
|
||||
|
||||
### 2.2 sgClaw runtime
|
||||
|
||||
sgClaw owns runtime behavior:
|
||||
|
||||
- planner / executor orchestration
|
||||
- provider list and active provider selection
|
||||
- skill loading and prompt mode
|
||||
- browser / office backend selection
|
||||
- runtime profile behavior
|
||||
- planner-first execution sequencing
|
||||
|
||||
### 2.3 frontend bundle
|
||||
|
||||
`frontend bundle` owns presentation only:
|
||||
|
||||
- render runtime state, logs, and conversation
|
||||
- render planner output before execution
|
||||
- collect user input and forward it through host events
|
||||
|
||||
`frontend bundle` must not:
|
||||
|
||||
- decide whether planner runs
|
||||
- directly select provider/backend outside runtime contract
|
||||
- bypass sgClaw / zeroclaw execution
|
||||
|
||||
## 3. Runtime-Managed Files
|
||||
|
||||
### 3.1 launch config
|
||||
|
||||
Owned by `host`, preferred path:
|
||||
|
||||
```text
|
||||
<profile>/superrpa/sgclaw_launch_config.json
|
||||
```
|
||||
|
||||
Fields:
|
||||
|
||||
- `binary`
|
||||
- `args`
|
||||
- `env`
|
||||
- `working_dir`
|
||||
- `runtime_config_path`
|
||||
- `frontend_bundle_dir`
|
||||
|
||||
Fallback rules:
|
||||
|
||||
1. Prefer external `launch config`
|
||||
2. Fall back to bundled browser defaults when the file is missing or invalid
|
||||
3. Fall back to profile-local `runtime config` if `runtime_config_path` is absent
|
||||
4. Fall back to bundled frontend resources if `frontend_bundle_dir` is absent or invalid
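The four fallback rules can be read as one resolution function. The sketch below is a Python illustration under stated assumptions: the function names and the shape of the returned dictionary are hypothetical, "invalid" is taken to mean unreadable or non-object JSON, and a `frontend_bundle_dir` counts as valid here only if it is an existing directory.

```python
import json
from pathlib import Path
from typing import Optional


def load_launch_config(profile_dir: Path) -> Optional[dict]:
    """Rules 1 and 2: return the external launch config, or None to use defaults."""
    path = profile_dir / "superrpa" / "sgclaw_launch_config.json"
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):  # missing file, unreadable file, or bad JSON
        return None
    return data if isinstance(data, dict) else None


def effective_settings(profile_dir: Path) -> dict:
    launch = load_launch_config(profile_dir)
    fields = launch or {}

    # Rule 3: use the profile-local runtime config when no path is given.
    runtime_cfg = fields.get("runtime_config_path")
    if not isinstance(runtime_cfg, str) or not runtime_cfg:
        runtime_cfg = str(profile_dir / "superrpa" / "sgclaw_config.json")

    # Rule 4: None means "use bundled frontend resources".
    bundle = fields.get("frontend_bundle_dir")
    if not (isinstance(bundle, str) and Path(bundle).is_dir()):
        bundle = None

    return {
        "launch_source": "external" if launch is not None else "bundled-defaults",
        "runtime_config_path": runtime_cfg,
        "frontend_bundle_dir": bundle,
    }
```

Each rule degrades independently, so a launch config with a bad `frontend_bundle_dir` still contributes its `runtime_config_path`.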
|
||||
|
||||
### 3.2 runtime config
|
||||
|
||||
Owned by sgClaw, current default path:
|
||||
|
||||
```text
|
||||
<profile>/superrpa/sgclaw_config.json
|
||||
```
|
||||
|
||||
This file should carry runtime behavior instead of browser compile-time constants, including:
|
||||
|
||||
- planner mode
|
||||
- providers and active provider
|
||||
- browser backend
|
||||
- office backend
|
||||
- skills prompt mode
|
||||
- runtime profile
|
||||
|
||||
### 3.3 frontend bundle
|
||||
|
||||
Owned by `host` for loading, but externally replaceable at runtime:
|
||||
|
||||
- prefer `frontend_bundle_dir`
|
||||
- validate path and allowed loading rules
|
||||
- fall back to bundled resources if invalid
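One possible reading of "validate path and allowed loading rules" is a containment check. The sketch assumes the host only accepts absolute bundle directories that resolve inside a single allowed root; that rule, the `allowed_root` parameter, and the function name are assumptions, since this design does not define the loading rules.

```python
from pathlib import Path
from typing import Optional


def choose_frontend_bundle(frontend_bundle_dir: Optional[str], allowed_root: Path) -> Optional[Path]:
    """Return the validated external bundle dir, or None to use bundled resources."""
    if not frontend_bundle_dir:
        return None
    candidate = Path(frontend_bundle_dir)
    if not candidate.is_absolute():
        return None

    # Resolve symlinks and ".." segments before comparing, so a path cannot
    # escape the allowed root through traversal.
    resolved = candidate.resolve()
    root = allowed_root.resolve()
    if resolved != root and root not in resolved.parents:
        return None
    if not resolved.is_dir():
        return None
    return resolved
```

Returning `None` for every rejection keeps the fallback to bundled resources in one place at the call site.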
|
||||
|
||||
## 4. Planner-First Rule
|
||||
|
||||
`planner-first` is a runtime contract, not a frontend trick.
|
||||
|
||||
The sequence must be:
|
||||
|
||||
1. sgClaw / zeroclaw produces a plan
|
||||
2. `frontend bundle` displays the plan
|
||||
3. runtime continues into execution
|
||||
4. acceptance verifies both visible plan rendering and actual execution ordering
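The ordering half of step 4 can be checked against a recorded event stream. This is a minimal sketch, assuming events are mappings with a `type` key and that the types `plan` and `tool_call` exist; those names are placeholders for whatever the runtime actually emits.

```python
from typing import Iterable, Mapping


def plan_precedes_execution(events: Iterable[Mapping[str, str]]) -> bool:
    """Return True when a plan event is seen before the first tool call.

    A stream with no tool calls passes only if it contains a plan event.
    """
    seen_plan = False
    for event in events:
        kind = event.get("type")
        if kind == "plan":
            seen_plan = True
        elif kind == "tool_call":
            # The first tool call decides the outcome.
            return seen_plan
    return seen_plan
```

Because the check runs on what the runtime emitted, not on what the frontend rendered, it verifies the contract itself; visible rendering still needs a separate UI assertion.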
|
||||
|
||||
## 5. Failing Checklist
|
||||
|
||||
The following questions remain intentionally unresolved at design-freeze time and must be closed by implementation plus verification:
|
||||
|
||||
- [ ] Can browser startup switch sgClaw binary without rebuilding Chromium?
|
||||
- [ ] Can model/provider selection change without rebuilding Chromium?
|
||||
- [ ] Can floating UI be replaced without rebuilding Chromium?
|
||||
- [ ] Can acceptance flows prove planner-first behavior visually and functionally?
|
||||
|
||||
## 6. Terminology Guardrail
|
||||
|
||||
All related docs and code reviews must use the same terms:
|
||||
|
||||
- `host`
|
||||
- `launch config`
|
||||
- `runtime config`
|
||||
- `frontend bundle`
|
||||
- `planner-first`
|
||||
|
||||
Any proposal that moves planner or executor logic back into browser-side presentation code is out of bounds for this design.
|
||||
@@ -0,0 +1,451 @@
|
||||
# SGClaw ZeroClaw Planner-First Realignment Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Realign the browser submit path so `sgclaw` uses `zeroclaw` as the primary planner/executor, with `sgclaw` acting only as the secure SuperRPA host plus custom tool bridge.
|
||||
|
||||
**Architecture:** Stop treating `zeroclaw` as a thin LLM wrapper. The browser message path should enter a `zeroclaw`-native orchestration entry point first, let `zeroclaw` perform planning/tool-loop control, and expose SuperRPA-specific browser/office/screen capabilities as regular tools inside that runtime. Any deterministic fast paths for Zhihu/Office must be implemented as `zeroclaw`-aligned execution components, not as frontend-owned control flow. The frontend may display the generated plan and current stage for UX, but it must not own planning or execution decisions.
|
||||
|
||||
**Tech Stack:** Rust, `sgclaw` compat bridge, `third_party/zeroclaw` agent loop, SuperRPA browser pipe, local skill library, OpenXML office export, HTML screen export, cargo tests, Python live acceptance.
|
||||
|
||||
### Task 1: Freeze The Current Architecture Gap With Characterization Tests
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/agent/loop_.rs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
Add a test that submits `读取知乎热榜前10,并导出 excel 文件` through `handle_browser_message_with_context(...)` and asserts the browser submit path does **not** terminate inside the current thin `Agent::turn_streamed(...)` compat bridge.
|
||||
|
||||
The test should check for one of these observable signals:
|
||||
- a new orchestration mode log such as `zeroclaw_process_message_primary`
|
||||
- absence of the old `compat_llm_primary` mode log
|
||||
- absence of selector-thrashing logs like repeated `getText .HotList-item`, `[data-hot-item]`, `ol li`
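The three signals can be summarized from captured logs with a small helper. The characterization test itself belongs in the Rust test file; this Python sketch only illustrates the check, and it assumes the mode names and `getText <selector>` entries appear verbatim in the log text. The function name and the returned keys are hypothetical.

```python
from typing import Dict, Iterable, List, Union

# Selectors named in the list above as symptoms of selector thrashing.
THRASH_SELECTORS = (".HotList-item", "[data-hot-item]", "ol li")


def characterize_submit_path(log_lines: Iterable[str]) -> Dict[str, Union[bool, List[str]]]:
    """Summarize which orchestration signals appear in the captured logs."""
    joined = "\n".join(log_lines)
    return {
        "new_mode_seen": "zeroclaw_process_message_primary" in joined,
        "old_mode_seen": "compat_llm_primary" in joined,
        "selectors_tried": [s for s in THRASH_SELECTORS if f"getText {s}" in joined],
    }
```

A passing run would show `new_mode_seen` as true, `old_mode_seen` as false, and an empty `selectors_tried` list.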
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_runtime_test browser_submit_path_prefers_zeroclaw_process_message_orchestrator -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the current implementation still enters `src/compat/runtime.rs` and drives `agent.turn_streamed(...)` directly.
|
||||
|
||||
**Step 3: Write the smallest additional characterization test**
|
||||
|
||||
Add a second failing test that proves SuperRPA-specific tools remain available after the orchestration switch:
|
||||
- browser host tool
|
||||
- `openxml_office`
|
||||
- `screen_html_export`
|
||||
|
||||
This test should not require real network calls.
|
||||
|
||||
**Step 4: Run both failing tests**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_runtime_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: at least the new characterization tests fail for the expected reason.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add tests/compat_runtime_test.rs
|
||||
git commit -m "test: characterize browser path bypass of zeroclaw orchestrator"
|
||||
```
|
||||
|
||||
### Task 2: Introduce A ZeroClaw-Native Browser Orchestration Entry Point
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/claw/src/compat/orchestration.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/mod.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/agent/loop_.rs:4752`
|
||||
|
||||
**Step 1: Write the failing unit test for the new entry point**
|
||||
|
||||
Add a test for a new helper in `src/compat/orchestration.rs` that:
|
||||
- receives browser task context
|
||||
- builds a `zeroclaw` config
|
||||
- returns a browser-safe orchestration handle or result
|
||||
|
||||
The test should prove the new helper is chosen by `handle_browser_message_with_context(...)`.
|
||||
|
||||
**Step 2: Run the new test to verify it fails**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_runtime_test browser_submit_path_prefers_zeroclaw_process_message_orchestrator -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the helper does not exist yet.
|
||||
|
||||
**Step 3: Implement the minimal entry point**
|
||||
|
||||
Create `src/compat/orchestration.rs` with one responsibility:
|
||||
- bridge browser submit tasks into a `zeroclaw`-native orchestration path
|
||||
|
||||
Do not implement Zhihu-specific logic here. This layer must only:
|
||||
- map config
|
||||
- map task context/history
|
||||
- inject SuperRPA tools
|
||||
- call the chosen `zeroclaw` orchestration function
|
||||
|
||||
**Step 4: Switch `handle_browser_message_with_context(...)` to the new entry point**
|
||||
|
||||
Modify:
|
||||
- `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
|
||||
|
||||
Replace the direct `compat::runtime::execute_task_with_sgclaw_settings(...)` primary path with the new orchestration bridge.
|
||||
|
||||
**Step 5: Run the test to verify it passes**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_runtime_test browser_submit_path_prefers_zeroclaw_process_message_orchestrator -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
**Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git add src/compat/orchestration.rs src/compat/mod.rs src/agent/mod.rs src/compat/runtime.rs tests/compat_runtime_test.rs
|
||||
git commit -m "refactor: route browser submit flow through zeroclaw orchestration bridge"
|
||||
```
|
||||
|
||||
### Task 3: Register SuperRPA Browser/Office/Screen Capabilities As Native ZeroClaw Tools
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/browser_tool_adapter.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/openxml_office_tool.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/screen_html_export_tool.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/engine.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/orchestration.rs`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_openxml_office_tool_test.rs`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_screen_html_export_tool_test.rs`
|
||||
|
||||
**Step 1: Write the failing tool-registration test**
|
||||
|
||||
Add a test that asserts the `zeroclaw` orchestration path exposes:
|
||||
- the preferred SuperRPA browser tool
|
||||
- `openxml_office` when Excel export is requested
|
||||
- `screen_html_export` when screen export is requested
|
||||
|
||||
The test must verify this through the new orchestration path, not the old compat path.
|
||||
|
||||
**Step 2: Run the test to verify it fails**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_runtime_test browser_orchestration_registers_superrpa_tools_natively -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL until tool wiring is complete.
|
||||
|
||||
**Step 3: Implement minimal native tool registration**
|
||||
|
||||
Ensure the new orchestration bridge injects `sgclaw` tools into the `zeroclaw` runtime without changing frontend code. Keep tool naming stable:
|
||||
- `superrpa_browser`
|
||||
- `openxml_office`
|
||||
- `screen_html_export`
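
As a rough sketch of the registration rule from Step 1 (the browser tool is always present, the export tools only when requested), the following could serve as a starting point. `ExportIntent` and `registered_tools` are hypothetical names introduced for illustration; only the three tool names come from this plan.

```rust
/// What the submitted task asks for (hypothetical flags, not a real sgclaw type).
#[derive(Clone, Copy, Default)]
pub struct ExportIntent {
    pub excel_export: bool,
    pub screen_export: bool,
}

/// Returns the stable tool names the orchestration path would expose.
pub fn registered_tools(intent: ExportIntent) -> Vec<&'static str> {
    // The preferred browser tool is always registered.
    let mut tools = vec!["superrpa_browser"];
    if intent.excel_export {
        tools.push("openxml_office");
    }
    if intent.screen_export {
        tools.push("screen_html_export");
    }
    tools
}
```

A registration test can then compare the returned names against the stable list, which keeps the frontend unaffected by how the tools are wired internally.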
|
||||
|
||||
**Step 4: Verify tool-level tests still pass**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_openxml_office_tool_test -- --nocapture
|
||||
cargo test --test compat_screen_html_export_tool_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
**Step 5: Run the new orchestration registration test**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_runtime_test browser_orchestration_registers_superrpa_tools_natively -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
**Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git add src/compat/browser_tool_adapter.rs src/compat/openxml_office_tool.rs src/compat/screen_html_export_tool.rs src/runtime/engine.rs src/compat/orchestration.rs tests/compat_runtime_test.rs tests/compat_openxml_office_tool_test.rs tests/compat_screen_html_export_tool_test.rs
|
||||
git commit -m "feat: expose superrpa browser and export tools through zeroclaw orchestration"
|
||||
```
|
||||
|
||||
### Task 4: Remove Frontend-Owned Or Custom Compat Mainline Control Flow
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/skill_runner.rs`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/claw/docs/plans/2026-03-29-sgclaw-zeroclaw-planner-first-execution-plan.md`
|
||||
|
||||
**Step 1: Write the failing regression test**
|
||||
|
||||
Add a test that proves Zhihu hotlist export no longer depends on a frontend-owned mainline such as:
|
||||
- `compat_skill_runner_primary`
|
||||
- direct `sgclaw`-local branching before `zeroclaw`
|
||||
|
||||
The expected primary mode should be a `zeroclaw`-owned orchestration mode.
|
||||
|
||||
**Step 2: Run the regression test to verify it fails**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_runtime_test zhihu_export_does_not_use_frontend_owned_mainline -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL while `src/compat/skill_runner.rs` still owns primary control flow.
|
||||
|
||||
**Step 3: Remove or demote the custom mainline**
|
||||
|
||||
Change the code so:
|
||||
- `src/compat/skill_runner.rs` becomes either a helper invoked inside the `zeroclaw` tool/runtime ecosystem, or is removed if redundant
|
||||
- `src/agent/mod.rs` no longer branches to a custom primary executor before `zeroclaw`
|
||||
|
||||
Do not leave two competing primary modes.
|
||||
|
||||
**Step 4: Run the regression test**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_runtime_test zhihu_export_does_not_use_frontend_owned_mainline -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
**Step 5: Run the broader compat suite**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_runtime_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
**Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git add src/compat/runtime.rs src/agent/mod.rs src/compat/skill_runner.rs tests/compat_runtime_test.rs
|
||||
git commit -m "refactor: remove frontend-owned primary control flow from browser submit path"
|
||||
```
|
||||
|
||||
### Task 5: Align Skills With ZeroClaw Execution Semantics Instead Of Prompt-Only Semantics
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/engine.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/third_party/zeroclaw/src/tools/read_skill.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tests/read_skill_tool_test.rs`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/skill_lib/skills/office-export-xlsx/SKILL.md`
|
||||
- Reference only: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist-screen/SKILL.md`
|
||||
|
||||
**Step 1: Write the failing skill-execution regression test**
|
||||
|
||||
Add a test that proves skill usage in the browser submit path is not just:
|
||||
- prompt injection
|
||||
- `read_skill` text stuffing
|
||||
- model-led selector wandering
|
||||
|
||||
Instead, the test should verify the task produces:
|
||||
- a plan-driven collection/execution flow
|
||||
- a real `.xlsx` or `.html` artifact path
|
||||
- no selector-thrashing loop
|
||||
|
||||
**Step 2: Run the test to verify it fails**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_runtime_test browser_skill_usage_is_execution_not_prompt_only -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL until skill semantics are aligned with execution.
|
||||
|
||||
**Step 3: Implement the minimal alignment**
|
||||
|
||||
Change the orchestration so `read_skill` is a fallback for missing context, not the primary means of making high-frequency browser workflows executable.
|
||||
|
||||
Keep:
|
||||
- skill discovery
|
||||
- skill references
|
||||
- artifact contract wording
|
||||
|
||||
Reduce:
|
||||
- over-reliance on prompt stuffing
|
||||
- over-reliance on model-led selector discovery for known workflows
|
||||
|
||||
**Step 4: Re-run the skill regression tests**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_runtime_test browser_skill_usage_is_execution_not_prompt_only -- --nocapture
|
||||
cargo test --test read_skill_tool_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add src/runtime/engine.rs src/compat/runtime.rs third_party/zeroclaw/src/tools/read_skill.rs tests/compat_runtime_test.rs tests/read_skill_tool_test.rs
|
||||
git commit -m "refactor: align browser skill execution with zeroclaw-native workflow semantics"
|
||||
```
|
||||
|
||||
### Task 6: Verify The Planner-First Path End-To-End
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/docs/acceptance/2026-03-29-zhihu-hotlist-excel.md`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/runtime_profile_test.rs`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_config_test.rs`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/live_acceptance_score_test.py`
|
||||
- Reference only: `/home/zyl/projects/superRpa/src/out/KylinRelease/sgclaw`
|
||||
|
||||
**Step 1: Run the Rust regression suites**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test runtime_profile_test -- --nocapture
|
||||
cargo test --test compat_config_test -- --nocapture
|
||||
cargo test --test compat_runtime_test -- --nocapture
|
||||
cargo test --test read_skill_tool_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
**Step 2: Run the Python scoring test**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
python3 -m unittest tests/live_acceptance_score_test.py
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
**Step 3: Run the live Zhihu hotlist Excel acceptance**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
python3 tools/live_acceptance/run_zhihu_hotlist_excel_acceptance.py
|
||||
```
|
||||
|
||||
Expected:
|
||||
- total score returns to `100`
|
||||
- logs show planner-first `zeroclaw` orchestration instead of selector-thrashing
|
||||
- logs contain no `shell`, `web_fetch`, or `web_search_tool` calls
|
||||
- final summary includes a real `.xlsx` path
|
||||
|
||||
**Step 4: Update the acceptance note**
|
||||
|
||||
Record:
|
||||
- new orchestration mode
|
||||
- tool sequence
|
||||
- timing notes
|
||||
- any remaining selector or latency risk
|
||||
|
||||
**Step 5: Rebuild and sync the runtime binary used by SuperRPA**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo build
|
||||
cp /home/zyl/projects/sgClaw/claw/target/debug/sgclaw /home/zyl/projects/superRpa/src/out/KylinRelease/sgclaw
|
||||
sha256sum /home/zyl/projects/sgClaw/claw/target/debug/sgclaw /home/zyl/projects/superRpa/src/out/KylinRelease/sgclaw
|
||||
```
|
||||
|
||||
Expected: the two hashes match exactly.
|
||||
|
||||
**Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git add docs/acceptance/2026-03-29-zhihu-hotlist-excel.md tests/runtime_profile_test.rs tests/compat_config_test.rs tests/compat_runtime_test.rs tests/live_acceptance_score_test.py
|
||||
git commit -m "test: verify planner-first zeroclaw browser orchestration end to end"
|
||||
```
|
||||
|
||||
### Task 7: Surface The Generated Plan In The Chat UI Without Giving Frontend Control
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/event_bridge.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/pipe/protocol.rs`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/` (the active sgClaw chat UI files that render task progress)
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/pipe_protocol_test.rs`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
|
||||
|
||||
**Step 1: Write the failing protocol/UI test**
|
||||
|
||||
Add a test that proves the backend can emit a structured planning event before tool execution starts. The event must carry:
|
||||
- a short plan title
|
||||
- a flat ordered step list
|
||||
- the current phase, such as `planning`, `executing`, or `completed`
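
One possible shape for such an event is sketched here. `PlanEvent`, `PlanPhase`, and `summary` are hypothetical names; the real pipe protocol may encode the event differently, for example as JSON, and the text rendering is only an assumption about what a compact chat summary could look like.

```rust
/// Execution phase carried by the event (names taken from the list above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlanPhase {
    Planning,
    Executing,
    Completed,
}

impl PlanPhase {
    pub fn as_str(self) -> &'static str {
        match self {
            PlanPhase::Planning => "planning",
            PlanPhase::Executing => "executing",
            PlanPhase::Completed => "completed",
        }
    }
}

/// Hypothetical read-only plan event: title, flat ordered steps, phase.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PlanEvent {
    pub title: String,
    pub steps: Vec<String>,
    pub phase: PlanPhase,
}

impl PlanEvent {
    /// Renders a compact text summary a chat UI could display as-is.
    pub fn summary(&self) -> String {
        let mut out = format!("[{}] {}", self.phase.as_str(), self.title);
        for (index, step) in self.steps.iter().enumerate() {
            out.push_str(&format!("\n{}. {}", index + 1, step));
        }
        out
    }
}
```

The struct has no setters or callbacks, which matches the intent that the frontend only displays the plan.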
|
||||
|
||||
The frontend test or fixture should verify the chat can render the plan summary without waiting for final completion.
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test pipe_protocol_test -- --nocapture
|
||||
cargo test --test compat_runtime_test plan_events_are_emitted_before_browser_execution -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the protocol does not yet expose a dedicated plan-progress event.
|
||||
|
||||
**Step 3: Add the minimal backend event shape**
|
||||
|
||||
Extend the `sgclaw` pipe/event bridge so the orchestration layer can emit:
|
||||
- planner summary
|
||||
- execution stage transitions
|
||||
|
||||
Keep the event read-only from the frontend’s perspective. The UI may display it, but must not edit it or use it to branch execution.
|
||||
|
||||
**Step 4: Render the plan in the active chat UI**
|
||||
|
||||
Update the SuperRPA sgClaw chat UI so it:
|
||||
- prints the generated plan immediately after planning completes
|
||||
- keeps the plan compact and collapsible
|
||||
- highlights the current phase while waiting
|
||||
|
||||
Do not add frontend-owned retry logic, decision logic, or browser action generation.
|
||||
|
||||
**Step 5: Run verification**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test pipe_protocol_test -- --nocapture
|
||||
cargo test --test compat_runtime_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
**Step 6: Manual browser validation**
|
||||
|
||||
Submit:
|
||||
```text
|
||||
读取知乎热榜前10,并导出 excel 文件
|
||||
```
|
||||
|
||||
Expected:
|
||||
- the chat first shows a short generated plan
|
||||
- the user sees stage transitions instead of a blank wait
|
||||
- execution still follows the backend-owned `zeroclaw` path
|
||||
|
||||
**Step 7: Commit**
|
||||
|
||||
```bash
|
||||
git add src/compat/event_bridge.rs src/pipe/protocol.rs tests/pipe_protocol_test.rs tests/compat_runtime_test.rs
|
||||
git commit -m "feat: surface backend-generated execution plans in sgclaw chat ui"
|
||||
```
|
||||
444
docs/plans/2026-03-29-zhihu-hotlist-office-export-plan.md
Normal file
@@ -0,0 +1,444 @@
|
||||
# Zhihu Hotlist To Excel Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Make sgClaw reliably read Zhihu hotlist data through a Zhihu browser skill and export the collected structured result into a local `.xlsx` file through an independent Office skill.
|
||||
|
||||
**Architecture:** Keep zeroclaw as the core planner, but stop it from wandering across unrelated tools once a browser-attached skill is selected. The hotlist skill must produce a strict structured artifact, and the Office skill must consume that artifact through a dedicated `openxml_office` tool that wraps the sibling `openxml_cli` project. For the first delivery, reuse `openxml_cli template render` with a bundled `.xlsx` template instead of inventing a new workbook-construction API.
|
||||
|
||||
**Tech Stack:** Rust, vendored zeroclaw, sgClaw browser pipe, skill packages under `/home/zyl/projects/sgClaw/skill_lib`, sibling `openxml_cli`, JSON payload handoff, `.xlsx` template render, Python/Rust regression tests, real-provider smoke verification.
|
||||
|
||||
## Scope Guard
|
||||
|
||||
- In scope:
  - browser-attached skill execution discipline
  - `zhihu-hotlist` structured export artifact
  - new `office-export-xlsx` skill
  - new `openxml_office` runtime tool
  - end-to-end acceptance for "读取知乎热榜数据,并导出 excel 文件"
- Out of scope:
  - generic Office authoring platform
  - arbitrary shell-based export flows
  - browser-side file generation as the main export path
  - broad multi-site data export before Zhihu hotlist is stable
|
||||
|
||||
## Current Findings To Preserve
|
||||
|
||||
- Real-provider validation already proved that `zhihu-hotlist`, `zhihu-navigate`, and `zhihu-write` can be selected through `read_skill`.
|
||||
- The current failure mode is not "skill missing" but "tool discipline collapse":
  - `file_read`, `glob_search`, and `shell` are attempted after `read_skill`
  - `zhihu-write` can fill title/body but still exceeds max tool iterations
  - `zhihu-navigate` succeeds for some intents but still detours through non-browser tools
|
||||
- The sibling Office project already exists at `/home/zyl/projects/sgClaw/openxml_cli`.
|
||||
- `openxml_cli` currently exposes `capabilities`, `template inspect`, `template validate`, and `template render`; it does not yet expose a direct "create workbook from scratch" command.
|
||||
|
||||
## Final Acceptance Contract
|
||||
|
||||
Input:
|
||||
|
||||
```text
|
||||
读取知乎热榜数据,并导出 excel 文件
|
||||
```
|
||||
|
||||
Required behavior:
|
||||
|
||||
1. sgClaw selects `zhihu-hotlist`.
|
||||
2. sgClaw gathers hotlist rows through the SuperRPA browser interface only.
|
||||
3. sgClaw converts the result into a structured JSON export payload.
|
||||
4. sgClaw selects `office-export-xlsx`.
|
||||
5. sgClaw calls `openxml_office`.
|
||||
6. A local `.xlsx` file is produced and its path is returned.
|
||||
|
||||
Required logs:
|
||||
|
||||
- `read_skill zhihu-hotlist`
|
||||
- browser actions only: `navigate`, `getText`, optionally `click`
|
||||
- `read_skill office-export-xlsx`
|
||||
- `call openxml_office`
|
||||
|
||||
Forbidden logs during the mainline path:
|
||||
|
||||
- `call shell`
|
||||
- `call glob_search`
|
||||
- `call file_read` on skill references or skill roots
|
||||
- `docker run`
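
A log check for the forbidden entries could look like the sketch below. It assumes the log is plain text with one entry per line, which this plan does not specify, and `forbidden_log_lines` is a hypothetical helper. Note that it is stricter than the contract: it flags every `call file_read`, not only reads of skill references or skill roots.

```rust
/// Log fragments the mainline path must not contain (from the list above).
const FORBIDDEN: [&str; 4] = ["call shell", "call glob_search", "call file_read", "docker run"];

/// Returns every log line that contains a forbidden fragment.
/// Assumes one log entry per line; an empty result means the run is clean.
pub fn forbidden_log_lines(log: &str) -> Vec<&str> {
    log.lines()
        .filter(|line| FORBIDDEN.iter().any(|needle| line.contains(needle)))
        .collect()
}
```

Returning the offending lines rather than a boolean makes it easier to paste the evidence into an acceptance note.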
|
||||
|
||||
Required Excel content:
|
||||
|
||||
- one sheet named `知乎热榜`
|
||||
- columns: `rank`, `title`, `heat`
|
||||
- at least 10 hotlist rows
|
||||
- exported values match the collected rows
|
||||
|
||||
## Task 1: Lock Browser-Attached Skill Runs To The Right Tools
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/engine.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/tool_policy.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
|
||||
|
||||
**Intent:**
|
||||
- Once the task is clearly in a browser-attached Zhihu skill flow, the runtime must stop offering unrelated tools such as `shell`, `glob_search`, and arbitrary `file_read`.
|
||||
|
||||
**Step 1: Write the failing regression tests**
|
||||
|
||||
Add focused tests in `tests/compat_runtime_test.rs` for:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn zhihu_hotlist_skill_flow_does_not_expose_shell_or_glob_tools() {}
|
||||
|
||||
#[test]
|
||||
fn browser_attached_export_flow_exposes_browser_and_office_tools_only() {}
|
||||
```
|
||||
|
||||
Assertions to include:
|
||||
|
||||
- request tool list contains `superrpa_browser`
|
||||
- request tool list contains `read_skill`
|
||||
- request tool list does not contain `shell`
|
||||
- request tool list does not contain `glob_search`
|
||||
- request tool list does not contain generic `file_read` during the constrained browser skill phase
|
||||
|
||||
**Step 2: Run the focused tests to verify failure**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_runtime_test zhihu_hotlist_skill_flow_does_not_expose_shell_or_glob_tools -- --nocapture
|
||||
cargo test --test compat_runtime_test browser_attached_export_flow_exposes_browser_and_office_tools_only -- --nocapture
|
||||
```
|
||||
|
||||
Expected:
|
||||
- fail because current runtime still exposes too many tools in browser-attached mode
|
||||
|
||||
**Step 3: Implement minimal constrained-tool policy**
|
||||
|
||||
Implement a browser-skill execution mode that:
|
||||
|
||||
- keeps `superrpa_browser`
|
||||
- keeps compatibility alias `browser_action`
|
||||
- keeps `read_skill`
|
||||
- keeps the new `openxml_office` tool only when the task is an export task
|
||||
- removes `shell`, `glob_search`, and free-form `file_read` from the allowed tool list for these phases
|
||||
|
||||
Do this in `src/runtime/engine.rs` by deriving a narrower `allowed_tools` set from:
|
||||
|
||||
- runtime profile
|
||||
- browser surface present flag
|
||||
- instruction intent
|
||||
- whether export mode is active
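
A minimal sketch of that derivation follows, assuming the inputs have already been reduced to booleans. `ToolPolicyInput` and `allowed_tools` are hypothetical names, and the runtime profile is left out for brevity; the real `engine.rs` logic may weigh these inputs differently.

```rust
/// Inputs for the policy decision (hypothetical field names).
#[derive(Clone, Copy)]
pub struct ToolPolicyInput {
    pub browser_surface_present: bool,
    pub browser_skill_intent: bool,
    pub export_mode: bool,
}

/// Narrows `offered` to the tools allowed in a constrained browser-skill phase.
/// Outside that phase the offered list is returned unchanged.
pub fn allowed_tools<'a>(offered: &[&'a str], input: ToolPolicyInput) -> Vec<&'a str> {
    let constrained = input.browser_surface_present && input.browser_skill_intent;
    if !constrained {
        return offered.to_vec();
    }
    offered
        .iter()
        .copied()
        .filter(|tool| match *tool {
            // Browser tool, its compatibility alias, and skill loading stay.
            "superrpa_browser" | "browser_action" | "read_skill" => true,
            // The Office tool is only kept for export tasks.
            "openxml_office" => input.export_mode,
            // Everything else (shell, glob_search, file_read, ...) is dropped.
            _ => false,
        })
        .collect()
}
```

Using an allowlist rather than a denylist means a newly added tool is excluded from constrained phases by default.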
|
||||
|
||||
**Step 4: Re-run the focused tests**
|
||||
|
||||
Run the same commands.
|
||||
|
||||
Expected:
|
||||
- both pass
|
||||
|
||||
## Task 2: Convert Zhihu Hotlist Skill To Structured Output First
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tests/skill_lib_validation_test.py`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
|
||||
|
||||
**Intent:**
|
||||
- The hotlist skill should stop ending with prose-only summaries. Its primary output must be a stable export artifact the Office skill can consume.
|
||||
|
||||
**Step 1: Write the failing tests**
|
||||
|
||||
Add tests that enforce:
|
||||
|
||||
- `zhihu-hotlist` prompt body contains an explicit `Export Artifact` section
|
||||
- the artifact schema includes `sheet_name`, `columns`, and `rows`
|
||||
- runtime regression checks can find those fields in the skill content when `read_skill` is used
|
||||
|
||||
**Step 2: Run tests to verify failure**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
python3 -m unittest tests.skill_lib_validation_test
|
||||
cargo test --test compat_runtime_test handle_browser_message_executes_real_zhihu_hotlist_skill_flow -- --nocapture
|
||||
```
|
||||
|
||||
Expected:
|
||||
- validation fails because `zhihu-hotlist` does not yet define the `Export Artifact` section
|
||||
|
||||
**Step 3: Update `zhihu-hotlist`**
|
||||
|
||||
Add an `Export Artifact` section that requires this shape:
|
||||
|
||||
```json
|
||||
{
|
||||
"source": "https://www.zhihu.com/hot",
|
||||
"sheet_name": "知乎热榜",
|
||||
"columns": ["rank", "title", "heat"],
|
||||
"rows": [[1, "标题", "344万"]]
|
||||
}
|
||||
```
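
For reference, the artifact shape above could be mirrored in Rust roughly as follows. This is a sketch, not the project's actual type: `ExportArtifact`, `Cell`, and `validate` are hypothetical, and the rule that every row must have as many cells as there are columns is an assumption the plan implies but does not state.

```rust
/// A single cell value; the example artifact mixes integers and strings.
#[derive(Debug, Clone, PartialEq)]
pub enum Cell {
    Int(i64),
    Text(String),
}

/// Hypothetical in-memory mirror of the export artifact.
pub struct ExportArtifact {
    pub source: String,
    pub sheet_name: String,
    pub columns: Vec<String>,
    pub rows: Vec<Vec<Cell>>,
}

impl ExportArtifact {
    /// Checks the structural rules an Office export step would rely on.
    pub fn validate(&self) -> Result<(), String> {
        if self.sheet_name.trim().is_empty() {
            return Err("sheet_name is empty".to_string());
        }
        if self.columns.is_empty() {
            return Err("columns is empty".to_string());
        }
        if self.rows.is_empty() {
            return Err("rows is empty".to_string());
        }
        for (index, row) in self.rows.iter().enumerate() {
            if row.len() != self.columns.len() {
                return Err(format!(
                    "row {} has {} cells, expected {}",
                    index,
                    row.len(),
                    self.columns.len()
                ));
            }
        }
        Ok(())
    }
}
```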
|
||||
|
||||
Also add hard rules:
|
||||
|
||||
- no extra exploratory tools after the browser data is collected
|
||||
- prose summary is secondary, structured artifact is primary
|
||||
|
||||
**Step 4: Re-run tests**
|
||||
|
||||
Expected:
|
||||
- validation passes
|
||||
|
||||
## Task 3: Create The Office Export Skill Package
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/office-export-xlsx/SKILL.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/office-export-xlsx/references/export-flow.md`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/office-export-xlsx/assets/zhihu_hotlist_template.xlsx`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tests/skill_lib_validation_test.py`
|
||||
|
||||
**Intent:**
|
||||
- Add a fully separate Office skill that knows nothing about browser scraping and only turns structured table data into a local Excel file.
|
||||
|
||||
**Step 1: Write the failing validation test**
|
||||
|
||||
Extend `tests/skill_lib_validation_test.py` so discovery expects:
|
||||
|
||||
```python
|
||||
EXPECTED_SKILL_NAMES = [
|
||||
"office-export-xlsx",
|
||||
"zhihu-hotlist",
|
||||
"zhihu-navigate",
|
||||
"zhihu-write",
|
||||
]
|
||||
```
|
||||
|
||||
Also require the new skill to mention:
|
||||
|
||||
- `openxml_office`
|
||||
- `.xlsx`
|
||||
- `sheet_name`
|
||||
- `columns`
|
||||
- `rows`
|
||||
|
||||
**Step 2: Run the validation test to verify failure**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
python3 -m unittest tests.skill_lib_validation_test
|
||||
```
|
||||
|
||||
Expected:
|
||||
- fail because the new skill package does not exist yet
|
||||
|
||||
**Step 3: Create the skill package**
|
||||
|
||||
`SKILL.md` must define:
|
||||
|
||||
- when to use: local Office export from structured rows
|
||||
- required input schema
|
||||
- output: exported file path
|
||||
- tool rule: only call `openxml_office`, do not use browser tools
|
||||
|
||||
`export-flow.md` must define:
|
||||
|
||||
- validate payload shape
|
||||
- choose output path
|
||||
- invoke `openxml_office`
|
||||
- return file path and row count
|
||||
|
||||
The first workbook template should be a fixed `zhihu_hotlist_template.xlsx` with:
|
||||
|
||||
- sheet `知乎热榜`
|
||||
- row 1 headers already present
|
||||
- table fill anchored to a stable name or placeholder expected by `openxml_cli`
|
||||
|
||||
**Step 4: Re-run validation**
|
||||
|
||||
Expected:
|
||||
- new skill passes audit
|
||||
|
||||
## Task 4: Add The `openxml_office` Runtime Tool
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/claw/src/compat/openxml_office_tool.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/mod.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/tool_policy.rs`
|
||||
- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_openxml_office_tool_test.rs`
|
||||
|
||||
**Intent:**
|
||||
- Wrap sibling `openxml_cli` as a first-class local tool instead of leaking Office export through shell prompting.
|
||||
|
||||
**Step 1: Write the failing tool test**
|
||||
|
||||
Create `tests/compat_openxml_office_tool_test.rs` with cases for:
|
||||
|
||||
- capability probe
|
||||
- render request assembly for xlsx export
|
||||
- rejection when rows/columns are missing
|
||||
- stable JSON output containing `output_path`
|
||||
|
||||
**Step 2: Run the test to verify failure**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_openxml_office_tool_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected:
|
||||
- fail because the tool does not exist
|
||||
|
||||
**Step 3: Implement minimal tool**
|
||||
|
||||
Tool contract:
|
||||
|
||||
```json
|
||||
{
|
||||
"action": "export_hotlist_xlsx",
|
||||
"template_path": ".../zhihu_hotlist_template.xlsx",
|
||||
"output_path": "/tmp/zhihu_hotlist.xlsx",
|
||||
"sheet_name": "知乎热榜",
|
||||
"columns": ["rank", "title", "heat"],
|
||||
"rows": [[1, "标题", "344万"]]
|
||||
}
|
||||
```
|
||||
|
||||
Implementation rules:
|
||||
|
||||
- write the payload JSON to a temp file
|
||||
- invoke sibling `openxml_cli template render --request <file> --json`
|
||||
- return parsed JSON result and normalized `output_path`
|
||||
- no free-form shell composition from model text
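
The implementation rules above might translate into something like the following sketch. The helper names `write_request_file` and `render_command` are hypothetical, as are the temp-file naming scheme and the assumption that the caller knows where the `openxml_cli` binary lives; only the `template render --request <file> --json` argument sequence comes from this plan.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Writes the already-serialized payload JSON into `dir` and returns the path.
/// The file name scheme is an assumption made for this sketch.
pub fn write_request_file(dir: &Path, payload_json: &str) -> io::Result<PathBuf> {
    let path = dir.join(format!("openxml_request_{}.json", std::process::id()));
    fs::write(&path, payload_json)?;
    Ok(path)
}

/// Builds the fixed `openxml_cli template render` invocation.
/// Arguments are passed as separate argv entries, never through a shell,
/// so model-provided text cannot add extra commands.
pub fn render_command(cli_path: &Path, request_file: &Path) -> Command {
    let mut command = Command::new(cli_path);
    command
        .arg("template")
        .arg("render")
        .arg("--request")
        .arg(request_file)
        .arg("--json");
    command
}
```

The caller would then run the returned `Command`, parse its standard output as JSON, and normalize `output_path` before returning it to the model.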
|
||||
|
||||
**Step 4: Re-run the focused tests**
|
||||
|
||||
Expected:
|
||||
- pass
|
||||
|
||||
## Task 5: Wire Export Tasks To Use Two Skills In Sequence
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/src/runtime/engine.rs`
|
||||
- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
|
||||
|
||||
**Intent:**
|
||||
- The single user instruction must naturally flow from hotlist capture into Office export, not end after the first skill.
|
||||
|
||||
**Step 1: Write the failing runtime test**
|
||||
|
||||
Add a focused regression test for:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn zhihu_hotlist_export_task_reads_hotlist_skill_then_office_skill() {}
|
||||
```
|
||||
|
||||
Assertions:
|
||||
|
||||
- request stream includes `read_skill zhihu-hotlist`
|
||||
- later includes `read_skill office-export-xlsx`
|
||||
- office phase exposes `openxml_office`
|
||||
- no `shell` is exposed in the constrained task path
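
The ordering assertions could be backed by a small helper like this one. It is a sketch that assumes the request stream has been flattened into a list of event strings; `appears_before` is a hypothetical name, and it compares only the first occurrence of each fragment.

```rust
/// Returns true when an event containing `first` occurs strictly before
/// an event containing `second`. Only first occurrences are compared, and
/// a missing fragment makes the check fail.
pub fn appears_before(events: &[&str], first: &str, second: &str) -> bool {
    let first_at = events.iter().position(|event| event.contains(first));
    let second_at = events.iter().position(|event| event.contains(second));
    match (first_at, second_at) {
        (Some(a), Some(b)) => a < b,
        _ => false,
    }
}
```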
|
||||
|
||||
**Step 2: Run the test to verify failure**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_runtime_test zhihu_hotlist_export_task_reads_hotlist_skill_then_office_skill -- --nocapture
|
||||
```
|
||||
|
||||
Expected:
|
||||
- fail because the task currently has no structured handoff to Office export
|
||||
|
||||
**Step 3: Implement minimal chaining support**
|
||||
|
||||
Do not add a hard-coded workflow engine.
|
||||
|
||||
Minimal implementation:
|
||||
|
||||
- strengthen prompt contract so export tasks require structured hotlist artifact
|
||||
- include `openxml_office` in allowed tools for export intent
|
||||
- keep browser-only tools for the collection phase and the Office-only tool for the export phase
|
||||
|
||||
**Step 4: Re-run the test**
|
||||
|
||||
Expected:
|
||||
- pass
|
||||
|
||||
## Task 6: Add Real Acceptance Harness And Scoring
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/claw/tools/live_acceptance/run_zhihu_hotlist_excel_acceptance.py`
|
||||
- Create: `/home/zyl/projects/sgClaw/claw/docs/acceptance/2026-03-29-zhihu-hotlist-excel.md`
|
||||
|
||||
**Intent:**
|
||||
- Make the final acceptance repeatable with the real user config and a transparent score.
|
||||
|
||||
**Step 1: Write the script**
|
||||
|
||||
The script must:
|
||||
|
||||
- use `/home/zyl/.config/superrpa/Default/superrpa/sgclaw_config.json`
|
||||
- boot local `target/debug/sgclaw`
|
||||
- send one browser `submit_task`
|
||||
- respond to browser commands with controlled fixture responses
|
||||
- capture:
  - loaded skills
  - selected skills
  - forbidden tool calls
  - final summary
  - exported file path
|
||||
|
||||
**Step 2: Define score rubric**
|
||||
|
||||
Rubric:
|
||||
|
||||
- `skill selection`: 30
|
||||
- `tool discipline`: 25
|
||||
- `hotlist data correctness`: 20
|
||||
- `xlsx export success`: 20
|
||||
- `final response quality`: 5
|
||||
|
||||
Automatic deductions:
|
||||
|
||||
- `shell` called: `-15`
|
||||
- `glob_search` called: `-10`
|
||||
- `file_read` on skill references: `-10`
|
||||
- wrong skill selected first: `-15`
|
||||
- export missing output path: `-20`
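
The rubric arithmetic can be sketched as follows. The dimension maxima and deduction values come from the lists above; the struct names, the clamping of each dimension to its maximum, and the floor at zero are assumptions, since the plan does not say how deductions interact with the per-dimension scores.

```rust
/// Points earned per dimension before deductions (hypothetical struct).
pub struct DimensionScores {
    pub skill_selection: i32,          // max 30
    pub tool_discipline: i32,          // max 25
    pub hotlist_data_correctness: i32, // max 20
    pub xlsx_export_success: i32,      // max 20
    pub final_response_quality: i32,   // max 5
}

/// Which automatic deductions apply to this run (hypothetical struct).
#[derive(Default)]
pub struct Violations {
    pub shell_called: bool,
    pub glob_search_called: bool,
    pub file_read_on_skill_references: bool,
    pub wrong_skill_selected_first: bool,
    pub export_missing_output_path: bool,
}

/// Sums the clamped dimension scores, subtracts deductions, floors at 0.
pub fn total_score(scores: &DimensionScores, violations: &Violations) -> i32 {
    let earned = scores.skill_selection.clamp(0, 30)
        + scores.tool_discipline.clamp(0, 25)
        + scores.hotlist_data_correctness.clamp(0, 20)
        + scores.xlsx_export_success.clamp(0, 20)
        + scores.final_response_quality.clamp(0, 5);
    let mut deduction = 0;
    if violations.shell_called { deduction += 15; }
    if violations.glob_search_called { deduction += 10; }
    if violations.file_read_on_skill_references { deduction += 10; }
    if violations.wrong_skill_selected_first { deduction += 15; }
    if violations.export_missing_output_path { deduction += 20; }
    (earned - deduction).max(0)
}
```

Under these assumptions, a run that earns full marks in every dimension but calls `shell` once ends at 85.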
|
||||
|
||||
**Step 3: Run acceptance**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
python3 tools/live_acceptance/run_zhihu_hotlist_excel_acceptance.py
|
||||
```
|
||||
|
||||
Expected:
|
||||
- prints total score and per-dimension breakdown
|
||||
- stores final evidence in `docs/acceptance/2026-03-29-zhihu-hotlist-excel.md`
|
||||
|
||||
## Delivery Sequence
|
||||
|
||||
Execute in this order:
|
||||
|
||||
1. Task 1: constrain tools
|
||||
2. Task 2: structure hotlist output
|
||||
3. Task 3: add office skill package
|
||||
4. Task 4: add `openxml_office`
|
||||
5. Task 5: chain the two skills
|
||||
6. Task 6: run acceptance and score
|
||||
|
||||
## Definition Of Done
|
||||
|
||||
- browser-attached hotlist tasks no longer wander into `shell`, `glob_search`, or ad-hoc `file_read`
|
||||
- `office-export-xlsx` exists as an independent skill
|
||||
- `openxml_office` exists as an explicit tool
|
||||
- a single user task can collect hotlist data and export `.xlsx`
|
||||
- acceptance score is at least `85/100`
|
||||
175
docs/plans/2026-03-30-browser-script-skill-capability-plan.md
Normal file
@@ -0,0 +1,175 @@
|
||||
# Browser Script Skill Capability Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Let zeroclaw-backed sgclaw skills call deterministic browser-context scripts through existing SuperRPA page-execution capabilities.
|
||||
|
||||
**Architecture:** Reuse SuperRPA's existing page script execution path instead of inventing a parallel workflow executor. Extend sgclaw's browser pipe and zeroclaw skill-tool mapping so a skill can declare a browser script tool, execute a packaged script file with parameters, and receive structured JSON back in the current page context.
|
||||
|
||||
**Tech Stack:** Rust (`sgclaw`, vendored `zeroclaw`), Chromium/SuperRPA C++, existing sgclaw browser pipe protocol, skill package `SKILL.toml` / `scripts/`.
|
||||
|
||||
### Task 1: Define the browser-script contract
|
||||
|
||||
**Files:**
|
||||
- Modify: `third_party/zeroclaw/src/skills/mod.rs`
|
||||
- Modify: `third_party/zeroclaw/src/tools/mod.rs`
|
||||
- Create: `third_party/zeroclaw/src/tools/browser_script_skill.rs`
|
||||
- Test: `tests/compat_browser_tool_test.rs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
Add an sgclaw-side test proving that the browser tool schema accepts a script execution action carrying the required browser-script fields and rejects a request whose script payload is missing.
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_browser_tool_test`
|
||||
Expected: FAIL because the browser tool still exposes only `click`, `type`, `navigate`, and `getText`.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Extend the browser tool adapter and protocol-facing action parsing to support a deterministic script execution action carrying:
|
||||
- `action`
|
||||
- `expected_domain`
|
||||
- `script`
|
||||
- optional `args`
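
The contract above can be sketched as a small validator. This is only an illustration under stated assumptions: the action name `executeScript`, the `BrowserScriptRequest` and `ContractError` types, and the flat string map standing in for the decoded JSON payload (with `args.`-prefixed keys for arguments) are all hypothetical and not taken from the sgclaw or zeroclaw code.

```rust
use std::collections::BTreeMap;

/// Placeholder action name (assumption); the plan does not fix the identifier.
pub const SCRIPT_ACTION: &str = "executeScript";

/// Hypothetical in-memory form of a browser-script request.
#[derive(Debug, Clone, PartialEq)]
pub struct BrowserScriptRequest {
    pub action: String,
    pub expected_domain: String,
    pub script: String,
    /// Optional arguments, kept as raw strings for simplicity.
    pub args: Option<BTreeMap<String, String>>,
}

#[derive(Debug, Clone, PartialEq)]
pub enum ContractError {
    MissingField(&'static str),
    UnexpectedAction(String),
}

/// Validate a flat key/value view of the payload against the contract.
pub fn parse_request(
    fields: &BTreeMap<String, String>,
) -> Result<BrowserScriptRequest, ContractError> {
    // A required field must be present and non-blank.
    let required = |name: &'static str| -> Result<String, ContractError> {
        match fields.get(name) {
            Some(value) if !value.trim().is_empty() => Ok(value.clone()),
            _ => Err(ContractError::MissingField(name)),
        }
    };
    let action = required("action")?;
    if action != SCRIPT_ACTION {
        return Err(ContractError::UnexpectedAction(action));
    }
    let expected_domain = required("expected_domain")?;
    let script = required("script")?;
    // Keys prefixed with "args." are collected as optional arguments.
    let args: BTreeMap<String, String> = fields
        .iter()
        .filter_map(|(key, value)| {
            key.strip_prefix("args.")
                .map(|name| (name.to_string(), value.clone()))
        })
        .collect();
    Ok(BrowserScriptRequest {
        action,
        expected_domain,
        script,
        args: if args.is_empty() { None } else { Some(args) },
    })
}
```

A failing test for Step 1 could then assert that a payload without `script` yields `ContractError::MissingField("script")`.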
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_browser_tool_test`
|
||||
Expected: PASS for the new browser script action coverage.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add tests/compat_browser_tool_test.rs src/compat/browser_tool_adapter.rs src/pipe/protocol.rs third_party/zeroclaw/src/skills/mod.rs third_party/zeroclaw/src/tools/mod.rs third_party/zeroclaw/src/tools/browser_script_skill.rs
|
||||
git commit -m "feat: add browser script skill tool support"
|
||||
```
|
||||
|
||||
### Task 2: Wire browser script execution through the sgclaw host bridge
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/pipe/protocol.rs`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol.cc`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.cc`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol_mainline_unittest.cc`
|
||||
- Test: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_process_host_mainline_unittest.cc`
|
||||
|
||||
**Step 1: Write the failing tests**
|
||||
|
||||
Add host-side tests proving:
|
||||
- sgclaw protocol accepts the new browser script action
|
||||
- handshake advertises the action
|
||||
- the security gate allows it only through the sgclaw host allowlist
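
The three properties above can be pictured with a small allowlist gate. The real host gate is C++ (`sgclaw_security_gate.cc`) and its API is not shown in this plan, so the `SecurityGate` type, its method names, and the action name `executeScript` below are assumptions made purely for illustration.

```rust
use std::collections::HashSet;

/// Illustrative stand-in for the host security gate (hypothetical type).
pub struct SecurityGate {
    allowed_actions: HashSet<String>,
    allowed_domains: HashSet<String>,
}

#[derive(Debug, Clone, PartialEq)]
pub enum GateDecision {
    Allow,
    DenyAction,
    DenyDomain,
}

impl SecurityGate {
    pub fn new(actions: &[&str], domains: &[&str]) -> Self {
        SecurityGate {
            allowed_actions: actions.iter().map(|a| a.to_string()).collect(),
            // Domains are stored lowercase so the comparison is case-insensitive.
            allowed_domains: domains.iter().map(|d| d.to_ascii_lowercase()).collect(),
        }
    }

    /// Actions to advertise in the handshake: exactly the allowlisted ones, sorted.
    pub fn advertised_actions(&self) -> Vec<String> {
        let mut actions: Vec<String> = self.allowed_actions.iter().cloned().collect();
        actions.sort();
        actions
    }

    /// An action passes only if both the action and its expected domain are allowlisted.
    pub fn check(&self, action: &str, expected_domain: &str) -> GateDecision {
        if !self.allowed_actions.contains(action) {
            return GateDecision::DenyAction;
        }
        if !self.allowed_domains.contains(&expected_domain.to_ascii_lowercase()) {
            return GateDecision::DenyDomain;
        }
        GateDecision::Allow
    }
}
```

Deriving the advertised actions from the same allowlist that `check` consults keeps the handshake and the gate from drifting apart, which is the property the host-side tests are meant to pin down.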
|
||||
|
||||
**Step 2: Run tests to verify they fail**
|
||||
|
||||
Run the SuperRPA unit tests in `sgclaw_pipe_protocol_mainline_unittest.cc` and `sgclaw_process_host_mainline_unittest.cc`.
|
||||
Expected: FAIL because the host protocol and allowlist do not yet include browser script execution.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Reuse existing SuperRPA router support by bridging the new sgclaw action to the host's page script execution capability instead of inventing a second execution path.
|
||||
|
||||
**Step 4: Run tests to verify they pass**
|
||||
|
||||
Run the same SuperRPA unit tests.
|
||||
Expected: PASS with the new action accepted end-to-end.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add src/pipe/protocol.rs  # sgclaw checkout; SuperRPA is assumed to be a separate Git checkout, hence the second commit
git commit -m "feat: expose browser script execution to sgclaw"
git -C /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw add sgclaw_pipe_protocol.cc sgclaw_security_gate.cc sgclaw_pipe_protocol_mainline_unittest.cc sgclaw_process_host_mainline_unittest.cc
git -C /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw commit -m "feat: expose browser script execution to sgclaw"
|
||||
```
|
||||
|
||||
### Task 3: Convert Zhihu hotlist collection into a script-backed skill path
|
||||
|
||||
**Files:**
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.toml`
|
||||
- Create: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/scripts/extract_hotlist.js`
|
||||
- Modify: `/home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md`
|
||||
- Modify: `tests/compat_runtime_test.rs`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
Add runtime coverage proving the Zhihu hotlist export flow uses the skill-backed browser script tool instead of repeated `getText` probing.
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_runtime_test`
|
||||
Expected: FAIL because the current flow still uses `read_skill` plus generic `getText`.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Package the hotlist extractor as a deterministic browser script tool returning structured rows and update the skill description to require that tool before export.
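
One way to picture "a packaged script file with parameters" is a wrapper that turns the script body into a function of `args` before it is handed to the page. The sketch below is an assumption about how such wrapping could look, not the actual sgclaw code: `wrap_script` and `json_string` are hypothetical helpers, and arguments are limited to string values.

```rust
/// Escape a string as a JSON string literal (minimal, std-only helper).
pub fn json_string(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Wrap a script body in an immediately invoked function that receives
/// the arguments as a single `args` object.
pub fn wrap_script(script_body: &str, args: &[(&str, &str)]) -> String {
    let fields: Vec<String> = args
        .iter()
        .map(|(key, value)| format!("{}:{}", json_string(key), json_string(value)))
        .collect();
    format!("(function(args){{{}}})({{{}}})", script_body, fields.join(","))
}
```

Because the arguments are serialized as JSON rather than spliced into the script text, the same packaged `extract_hotlist.js` body can be reused with different parameters without editing the file.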
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_runtime_test`
|
||||
Expected: PASS with the script-backed tool visible in logs and the generic repeated `getText` fallback removed from the primary path.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.toml /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/scripts/extract_hotlist.js /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md tests/compat_runtime_test.rs
|
||||
git commit -m "feat: make zhihu hotlist extraction script-backed"
|
||||
```
|
||||
|
||||
### Task 4: Fix plan visibility in the sgclaw frontend
|
||||
|
||||
**Files:**
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts`
|
||||
- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
|
||||
- Test: frontend or host tests covering `[sgclaw/plan]`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
Add coverage proving `[sgclaw/plan]` logs are treated as planner logs and preserve multiline task-step rendering.
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run the relevant frontend or browser-side tests.
|
||||
Expected: FAIL because the current frontend only recognizes `[browser/plan]` or `[plan]`.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
Accept `[sgclaw/plan]` as a planner log source and keep the plan text rendered as multiple visible lines.
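
The recognition rule can be sketched as follows. The frontend itself is TypeScript; the sketch is written in Rust only to stay consistent with the sgclaw side, and the `parse_plan_log` function and `PLAN_PREFIXES` constant are hypothetical names, not existing code.

```rust
/// Prefixes treated as planner output. `[browser/plan]` and `[plan]` are the
/// ones this plan says are already recognized; `[sgclaw/plan]` is the addition.
pub const PLAN_PREFIXES: [&str; 3] = ["[sgclaw/plan]", "[browser/plan]", "[plan]"];

/// If `log` is a planner log, return its steps, one entry per non-empty line,
/// so the UI can render them as separate visible lines.
pub fn parse_plan_log(log: &str) -> Option<Vec<String>> {
    let trimmed = log.trim_start();
    // The first matching prefix is stripped; anything else is not a plan log.
    let body = PLAN_PREFIXES
        .iter()
        .find_map(|prefix| trimmed.strip_prefix(*prefix))?;
    Some(
        body.lines()
            .map(str::trim)
            .filter(|line| !line.is_empty())
            .map(String::from)
            .collect(),
    )
}
```

Returning the steps as a list, rather than one joined string, is what keeps multi-line plans from collapsing into a single line when rendered.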
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run the same tests.
|
||||
Expected: PASS with sgclaw planner logs rendered correctly.
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts
|
||||
git commit -m "fix: render sgclaw planner logs in chat ui"
|
||||
```
|
||||
|
||||
### Task 5: Full verification
|
||||
|
||||
**Files:**
|
||||
- None (verification only)
|
||||
|
||||
**Step 1: Run sgclaw Rust tests**
|
||||
|
||||
Run: `cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml`
|
||||
Expected: PASS.
|
||||
|
||||
**Step 2: Run relevant SuperRPA unit tests**
|
||||
|
||||
Run the updated sgclaw host and frontend tests.
|
||||
Expected: PASS.
|
||||
|
||||
**Step 3: Run a manual browser acceptance flow**
|
||||
|
||||
Verify that:
|
||||
- startup logs include runtime and skill versions
|
||||
- `[sgclaw/plan]` is shown as a real multi-line plan
|
||||
- Zhihu hotlist export invokes the script-backed skill tool
|
||||
- the flow returns a real `.xlsx` path without repeated self-referential output
|
||||
|
||||
**Step 4: Commit verification artifacts if needed**
|
||||
|
||||
Keep verification commits separate from behavior changes.
|
||||
309
docs/sgClaw技术路线总览.html
Normal file
@@ -0,0 +1,309 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="zh-CN">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>sgClaw Intelligent Browser Automation Platform - Technical Roadmap Overview</title>
|
||||
<script src="https://cdn.jsdelivr.net/npm/mermaid@10.9.5/dist/mermaid.min.js"></script>
|
||||
<style>
|
||||
*{margin:0;padding:0;box-sizing:border-box}
|
||||
body{font-family:-apple-system,BlinkMacSystemFont,"Segoe UI","PingFang SC","Hiragino Sans GB","Microsoft YaHei",sans-serif;background:#0d1117;color:#c9d1d9;line-height:1.8}
|
||||
.header{background:linear-gradient(135deg,#0a1628,#16213e,#1a3a5c);padding:3rem 2rem;text-align:center;border-bottom:3px solid #e65100}
|
||||
.header h1{font-size:2.2rem;color:#e6edf3;margin-bottom:.5rem}
|
||||
.header .subtitle{color:#8b949e;font-size:1rem}
|
||||
.container{max-width:1300px;margin:0 auto;padding:2rem}
|
||||
.section{background:#161b22;border:1px solid #30363d;border-radius:12px;margin-bottom:2rem;overflow:hidden}
|
||||
.section-header{background:linear-gradient(90deg,#1a1a2e,#16213e);padding:1rem 1.5rem;border-bottom:1px solid #30363d;display:flex;align-items:center;gap:.8rem}
|
||||
.section-number{background:#e65100;color:#fff;width:32px;height:32px;border-radius:50%;display:flex;align-items:center;justify-content:center;font-weight:700;flex-shrink:0}
|
||||
.section-title{font-size:1.2rem;color:#e6edf3;font-weight:600}
|
||||
.section-body{padding:1.5rem;overflow-x:auto}
|
||||
.mermaid{display:flex;justify-content:center;padding:1rem 0}
|
||||
.mermaid svg{max-width:100%;height:auto}
|
||||
.desc{background:#1a1a2e;border-left:3px solid #e65100;padding:1rem 1.2rem;margin:1rem 0;border-radius:0 8px 8px 0;font-size:.95rem;color:#8b949e}
|
||||
.desc strong{color:#e6edf3}
|
||||
.value-grid{display:grid;grid-template-columns:repeat(auto-fit,minmax(280px,1fr));gap:1rem;margin:1rem 0}
|
||||
.value-card{background:#1a1a2e;border:1px solid #30363d;border-radius:10px;padding:1.2rem}
|
||||
.value-card h3{color:#e65100;font-size:1rem;margin-bottom:.5rem}
|
||||
.value-card p{color:#8b949e;font-size:.9rem}
|
||||
.phase-list{display:flex;flex-direction:column;gap:.8rem;margin:1rem 0}
|
||||
.phase-item{display:flex;gap:1rem;align-items:flex-start;background:#1a1a2e;border-radius:10px;padding:1rem;border-left:4px solid #e65100}
|
||||
.phase-badge{background:#e65100;color:#fff;padding:.3rem .8rem;border-radius:20px;font-size:.85rem;font-weight:600;white-space:nowrap}
|
||||
.phase-item h4{color:#e6edf3;font-size:1rem;margin-bottom:.2rem}
|
||||
.phase-item p{color:#8b949e;font-size:.9rem}
|
||||
.footer{text-align:center;padding:2rem;color:#484f58;font-size:.85rem;border-top:1px solid #21262d}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="header">
|
||||
<h1>sgClaw Intelligent Browser Automation Platform</h1>
<div class="subtitle">Drive the browser with natural language and let business workflows run automatically</div>
|
||||
</div>
|
||||
<div class="container">
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">1</div>
|
||||
<div class="section-title">sgClaw in One Sentence</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="desc">
|
||||
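<strong>sgClaw is an "intelligent browser assistant".</strong> The user states a need in natural language (for example, "check this month's line loss rate for me"), and sgClaw carries out the clicks, input, queries, and exports in the browser, then presents the result. No step-by-step manual browser operation is needed at any point.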
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">2</div>
|
||||
<div class="section-title">End-to-End Business Flow - From User Instruction to Result</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="desc">
|
||||
The complete flow of using sgClaw is shown below. The user only has to <strong>enter one sentence</strong>; everything else is done automatically.
|
||||
</div>
|
||||
<div class="mermaid">
|
||||
graph LR
|
||||
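A["User enters a natural-language instruction\nExample: check this month's line loss rate"] --> B["sgClaw interprets the intent\nand identifies the business scenario"]
B --> C{"Known scenario?"}
C -->|"Yes, known scenario"| D["Run the preset flow directly\nFast path, no AI needed"]
C -->|"No, new scenario"| E["LLM analyzes the request\nand breaks it into concrete steps"]
D --> F["Operate the browser automatically\nclick, type, query, export"]
E --> F
F --> G["Present the result to the user\ngenerate report, open Excel"]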
|
||||
|
||||
classDef userInput fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||
classDef ai fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||
classDef fast fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
|
||||
classDef action fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||
classDef result fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||
|
||||
class A userInput
|
||||
class B ai
|
||||
class C ai
|
||||
class D fast
|
||||
class E ai
|
||||
class F action
|
||||
class G result
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">3</div>
|
||||
<div class="section-title">How the Platform Works with Existing Business Systems</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="desc">
|
||||
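sgClaw <strong>requires no changes to existing business systems</strong>. It works like an employee sitting at a computer, operating the browser directly to get the job done.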
|
||||
</div>
|
||||
<div class="mermaid">
|
||||
graph TB
|
||||
User["Business staff\nBranch deputy director, line loss specialist, team leader"]

subgraph Platform["Unified business platform"]
S1["Line loss big-data system\nLine loss rate queries, statistical analysis"]
S2["95598 customer service system\nFault reports, work order handling"]
S3["Other business subsystems\n..."]
end

subgraph sgClaw["sgClaw intelligent assistant"]
SG1["Understand the user's natural-language instruction"]
SG2["Operate the browser automatically to complete the task"]
SG3["Security safeguards and permission control"]
end

Result["Final result\nExcel report, Word document, data display"]

User -->|"States the need"| SG1
SG1 --> SG2
SG1 --> SG3
SG2 -->|"Clicks and queries automatically"| S1
SG2 -->|"Fills in forms automatically"| S2
SG2 -->|"Exports reports automatically"| S3
S1 -->|"Data returned"| SG2
S2 -->|"Data returned"| SG2
S3 -->|"Data returned"| SG2
SG2 -->|"Generates the report file"| Result
Result -->|"Shown to the user"| User
|
||||
|
||||
classDef people fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||
classDef plat fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||
classDef sg fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||
classDef out fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
|
||||
|
||||
class User people
|
||||
class Platform,S1,S2,S3 plat
|
||||
class sgClaw,SG1,SG2,SG3 sg
|
||||
class Result out
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">4</div>
|
||||
<div class="section-title">Security Control System</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="desc">
|
||||
sgClaw sets up <strong>three lines of defense</strong> so that every operation stays within controlled bounds, even when driven by AI.
|
||||
</div>
|
||||
<div class="mermaid">
|
||||
graph TB
|
||||
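A["First line of defense\nIdentity confirmation: both ends of the channel are trusted"] --> B["Second line of defense\nRule validation: only permitted systems and pages can be accessed"]
B --> C["Third line of defense\nSecondary review: legitimacy is rechecked before each operation"]
C --> D["Outcome\nAll operations are traceable and auditable"]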
|
||||
|
||||
classDef l1 fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||
classDef l2 fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||
classDef l3 fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||
classDef ok fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
|
||||
|
||||
class A l1
|
||||
class B l2
|
||||
class C l3
|
||||
class D ok
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">5</div>
|
||||
<div class="section-title">Two Operating Modes</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="desc">
|
||||
sgClaw supports two operating modes to suit different scenarios.
|
||||
</div>
|
||||
<div class="mermaid">
|
||||
graph LR
|
||||
subgraph Mode1["Mode 1: Embedded (browser child-process mode)"]
M1A["Browser launches sgClaw"]
M1B["Request-response communication"]
M1C["Suited to one-off task execution"]
M1A --> M1B --> M1C
end

subgraph Mode2["Mode 2: Standalone service mode"]
M2A["sgClaw runs as a persistent service"]
M2B["Frontend web pages connect at any time"]
M2C["Suited to frequent interaction"]
M2A --> M2B --> M2C
end
|
||||
|
||||
classDef m1 fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||
classDef m2 fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||
|
||||
class Mode1,M1A,M1B,M1C m1
|
||||
class Mode2,M2A,M2B,M2C m2
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">6</div>
|
||||
<div class="section-title">Technical Evolution Roadmap</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="phase-list">
|
||||
<div class="phase-item">
|
||||
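<div class="phase-badge">Phase 1</div>
<div>
<h4>Building Core Capabilities</h4>
<p>Deliver basic browser operations (click, type, navigate, read page content), establish the security control system, and integrate with the existing business platform.</p>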
|
||||
</div>
|
||||
</div>
|
||||
<div class="phase-item">
|
||||
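<div class="phase-badge">Phase 2</div>
<div>
<h4>AI-Driven Intelligence</h4>
<p>Integrate an LLM for natural-language understanding: users describe their needs in everyday language, and the AI breaks them into operation steps and executes them.</p>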
|
||||
</div>
|
||||
</div>
|
||||
<div class="phase-item">
|
||||
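<div class="phase-badge">Phase 3</div>
<div>
<h4>Consolidating Business Scenarios</h4>
<p>Consolidate frequently used scenarios into standardized skill packages (such as line loss queries, fault statistics, and weekly report generation) for fast execution with less reliance on AI.</p>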
|
||||
</div>
|
||||
</div>
|
||||
<div class="phase-item">
|
||||
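<div class="phase-badge">Phase 4</div>
<div>
<h4>Platform Services</h4>
<p>Move from one-off task execution to a persistent service that supports concurrent multi-user use, with a complete skill marketplace and task orchestration system.</p>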
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">7</div>
|
||||
<div class="section-title">Core Value</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="value-grid">
|
||||
<div class="value-card">
|
||||
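<h3>Higher Efficiency</h3>
<p>Tasks that used to require step-by-step manual browser operation now take a single sentence; querying, exporting, and report generation run end to end automatically.</p>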
|
||||
</div>
|
||||
<div class="value-card">
|
||||
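<h3>Zero-Modification Integration</h3>
<p>No changes to existing business systems are needed. sgClaw operates the browser directly, like an employee, and is non-intrusive to existing systems.</p>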
|
||||
</div>
|
||||
<div class="value-card">
|
||||
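<h3>Secure and Controllable</h3>
<p>Three lines of defense keep every operation within permitted bounds: domain allowlist, action control, and secondary review, with full traceability.</p>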
|
||||
</div>
|
||||
<div class="value-card">
|
||||
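<h3>Flexible Extension</h3>
<p>New business scenarios are onboarded quickly by writing skill packages, while known scenarios take the fast path without AI, balancing efficiency and flexibility.</p>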
|
||||
</div>
|
||||
<div class="value-card">
|
||||
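<h3>Technological Independence</h3>
<p>The core code is independently controlled and built in Rust for strong performance, with no dependence on external SaaS services, which helps keep data secure.</p>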
|
||||
</div>
|
||||
<div class="value-card">
|
||||
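<h3>Continuous Evolution</h3>
<p>From single-task execution to a persistent service, and from manual instructions to AI-driven operation, the roadmap is clear and builds platform capabilities step by step.</p>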
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">8</div>
|
||||
<div class="section-title">Typical Usage Scenarios</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="desc">
|
||||
The following are real scenarios in which business staff use sgClaw day to day.
|
||||
</div>
|
||||
<div class="mermaid">
|
||||
graph TB
|
||||
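U1["Line loss specialist\nQueries line loss rate statistics every month"] -->|"Input: check this month's line loss rate"| SG1["sgClaw completes automatically\nOpen line loss system, select month, query data, export Excel"]
U2["Power supply station team leader\nProduces a weekly line loss analysis report"] -->|"Input: generate last week's line loss report"| SG2["sgClaw completes automatically\nQuery weekly data, summarize and analyze, generate Word report"]
U3["Customer service specialist\nCompiles 95598 fault work order statistics"] -->|"Input: count this week's fault work orders"| SG3["sgClaw completes automatically\nLog in to customer service system, filter work orders, generate statistics table"]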
|
||||
|
||||
classDef user fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||
classDef sg fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||
|
||||
class U1,U2,U3 user
|
||||
class SG1,SG2,SG3 sg
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
</div>
|
||||
<div class="footer">sgClaw Intelligent Browser Automation Platform - Technical Roadmap Overview - April 2026</div>
|
||||
<script>
|
||||
mermaid.initialize({ startOnLoad:true, theme:'dark', securityLevel:'loose', logLevel:'warn' });
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
413
docs/sgClaw系统架构全景图.html
Normal file
@@ -0,0 +1,413 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="zh-CN">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>sgClaw System Architecture Panorama</title>
|
||||
<script src="https://cdn.jsdelivr.net/npm/mermaid@10.9.5/dist/mermaid.min.js"></script>
|
||||
<style>
|
||||
*{margin:0;padding:0;box-sizing:border-box}
|
||||
body{font-family:-apple-system,BlinkMacSystemFont,"Segoe UI","PingFang SC","Hiragino Sans GB","Microsoft YaHei",sans-serif;background:#0d1117;color:#c9d1d9;line-height:1.6}
|
||||
.header{background:linear-gradient(135deg,#1a1a2e,#16213e,#0f3460);padding:3rem 2rem;text-align:center;border-bottom:3px solid #e65100}
|
||||
.header h1{font-size:2.5rem;color:#e6edf3;margin-bottom:.5rem}
|
||||
.header .subtitle{color:#8b949e;font-size:.95rem}
|
||||
.container{max-width:1400px;margin:0 auto;padding:2rem}
|
||||
.section{background:#161b22;border:1px solid #30363d;border-radius:12px;margin-bottom:2rem;overflow:hidden}
|
||||
.section-header{background:linear-gradient(90deg,#1a1a2e,#16213e);padding:1.2rem 1.5rem;border-bottom:1px solid #30363d;display:flex;align-items:center;gap:1rem}
|
||||
.section-number{background:#e65100;color:#fff;width:36px;height:36px;border-radius:50%;display:flex;align-items:center;justify-content:center;font-weight:700;font-size:1.1rem}
|
||||
.section-title{font-size:1.3rem;color:#e6edf3;font-weight:600}
|
||||
.section-body{padding:1.5rem;overflow-x:auto}
|
||||
.mermaid{display:flex;justify-content:center;padding:1rem 0}
|
||||
.mermaid svg{max-width:100%;height:auto}
|
||||
.file-table{width:100%;border-collapse:collapse;margin-top:1rem}
|
||||
.file-table th,.file-table td{padding:.75rem 1rem;text-align:left;border-bottom:1px solid #21262d}
|
||||
.file-table th{background:#1a1a2e;color:#e65100;font-weight:600}
|
||||
.file-table td:first-child{color:#58a6ff;font-family:"SF Mono",Monaco,Consolas,monospace;font-size:.85rem}
|
||||
.file-table tr:hover{background:rgba(230,81,0,.05)}
|
||||
.footer{text-align:center;padding:2rem;color:#484f58;font-size:.85rem;border-top:1px solid #21262d}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="header">
|
||||
<h1>sgClaw System Architecture Panorama</h1>
<div class="subtitle">Browser host x Rust security control layer x ZeroClaw capability core - dual deployment modes, three-layer security defense, Skill system</div>
|
||||
</div>
|
||||
<div class="container">
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">1</div>
|
||||
<div class="section-title">System Boundary Overview - Four Regions and Data Flow</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="mermaid">
|
||||
graph TB
|
||||
BH["Browser host\nProtected security boundary\nLaunches and hosts the sgClaw child process"]
SP["sgClaw process\nRust security control layer\nZeroClaw as the capability core"]
ZC["ZeroClaw core\nvendored crate\nTask decomposition, tool loop, LLM routing"]
ES["External services\nLLM API and business browser pages"]
BH <-- "STDIO JSON Line inter-process protocol" --> SP
SP <-- "Rust API calls into the vendored library" --> ZC
ZC <-- "HTTP API or internal calls" --> ES
SP <-- "Browser Backend over Pipe or WS" --> ES
|
||||
classDef hostClass fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||
classDef sgclawClass fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||
classDef zcClass fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||
classDef extClass fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
|
||||
class BH hostClass
|
||||
class SP sgclawClass
|
||||
class ZC zcClass
|
||||
class ES extClass
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">2</div>
|
||||
<div class="section-title">Dual Deployment Modes - Pipe Mode (STDIO request-response)</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="mermaid">
|
||||
sequenceDiagram
|
||||
participant Host as Browser host
participant Pipe as StdioTransport
participant MAC as MAC Policy
participant Agent as Agent/TaskRunner
participant ZC as ZeroClaw Runtime
participant Tool as BrowserPipeTool
participant Exec as Host command executor
Note over Host,Exec: Pipe Mode request-response STDIO communication
Host->>Pipe: Init handshake with version, HMAC seed, capability list
Pipe->>Pipe: derive_session_key derives the session key
Pipe-->>Host: InitAck returns agent_id and supported actions
Host->>Agent: SubmitTask
Agent->>Agent: Detect deterministic submit mode
alt Deterministic submit
Agent->>Agent: Build execution plan
Agent->>Tool: Execute Skill directly
else LLM-driven
Agent->>ZC: Construct ZeroClaw Agent
ZC->>Tool: tool loop call
end
Tool->>MAC: Validate domain and action
MAC-->>Tool: Allow or deny
Tool->>Pipe: Write Command JSON
Pipe-->>Host: Browser receives the command
Host->>Exec: Execute browser command
Exec-->>Host: Return execution result
Host->>Pipe: Response packet
Pipe-->>Tool: Result passed back
Tool-->>ZC: ToolResult
ZC-->>Agent: Continue or finish
Agent-->>Host: TaskComplete
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">3</div>
|
||||
<div class="section-title">Dual Deployment Modes - Service Mode (TCP + WebSocket + Helper Page bridge)</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="mermaid">
|
||||
sequenceDiagram
|
||||
participant Console as Frontend console
participant WS as WebSocket Server
participant Agent as Agent/TaskRunner
participant CB as BrowserCallbackBackend
participant HTTP as Callback HTTP Server
participant Helper as Helper Page
participant Target as Target business page
Note over Console,Target: Service Mode persistent service + Helper Page bridge
Console->>WS: WebSocket Connect
WS->>CB: Create session
Console->>WS: SubmitTask
WS->>Agent: Dispatch task
Agent->>CB: invoke
CB->>HTTP: POST Command to queue
HTTP-->>Helper: long-poll returns Command
Helper->>Target: sgBrowserExcuteJsCodeByDomain runs JS
Target-->>Helper: callBackJsToCpp callback
Helper->>HTTP: POST event back
HTTP-->>CB: Callback event
CB-->>Agent: CommandOutput
Agent-->>WS: TaskComplete
WS-->>Console: Push result
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">4</div>
|
||||
<div class="section-title">sgClaw Internal Module Relationships</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="mermaid">
|
||||
graph LR
|
||||
E1["main.rs Pipe mode entry point"]
E2["Service mode entry point"]
P1["StdioTransport STDIO read/write"]
P2["Message enum definitions"]
P3["Handshake protocol"]
P4["BrowserPipeTool send and await response"]
P5["HMAC signing for tamper protection"]
M1["MacPolicy loading and parsing"]
M2["Domain allowlist with normalized comparison"]
M3["Action allowlist and denylist double filter"]
A1["Message dispatch handle_browser_message"]
A2["TaskRunner task parsing"]
A3["Deterministic Submit instruction detection"]
C1["RuntimeEngine builds the Agent"]
C2["ToolPolicy tool permissions"]
C3["BrowserScriptSkillTool executor"]
C4["DeterministicSubmit line loss fast path"]
C5["BrowserToolAdapter tool adaptation"]
B1["BrowserBackend unified interface"]
B2["PipeBrowserBackend implementation"]
B3["WsBrowserBackend implementation"]
B4["BrowserCallbackBackend implementation"]
SV1["WebSocket Server listener"]
SV2["Session Manager single client, single task"]
SV3["Callback HTTP Server listener"]
CF1["SgClawSettings loading"]
CF2["Provider Config"]
CF3["Backend Selection"]
|
||||
E1 --> P1 --> P2 --> P3 --> P4 --> P5 --> M1
|
||||
M1 --> M2
|
||||
M1 --> M3 --> A1 --> A2 --> A3
|
||||
A3 --> C1 --> C2 --> C5 --> B1
|
||||
A3 --> C4 --> B1
|
||||
CF1 --> C1
|
||||
B1 --> B2
|
||||
B1 --> B3
|
||||
B1 --> B4
|
||||
E2 --> SV1 --> SV2 --> B4
|
||||
SV1 --> SV3
|
||||
CF1 --> CF2
|
||||
CF1 --> CF3 --> A1
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">5</div>
|
||||
<div class="section-title">Security Model - Three Layers of Defense</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="mermaid">
|
||||
graph TB
|
||||
L1A["Browser sends Init carrying hmac_seed"]
L1B["sgClaw replies InitAck and assigns agent_id"]
L1C["Derive Session Key with SHA256"]
L1D["Refuse to run until the handshake completes"]
L1A --> L1B --> L1C --> L1D
L2A["Load rules.json and parse rules"]
L2B["Domain allowlist check after stripping scheme, path, and port"]
L2C["Action allowlist and denylist double filter"]
L2D["Special handling for the local dashboard"]
L2A --> L2B
L2A --> L2C
L2A --> L2D
L3A["Sequence number correlation check"]
L3B["HMAC-SHA256 signature verification"]
L3C["Domain must match the page context"]
L3D["Reject execution on invalid parameters"]
|
||||
L3A --> L3B --> L3C --> L3D
|
||||
L1D ==> L2A
|
||||
L2B ==> L3A
|
||||
L2C ==> L3A
|
||||
L2D ==> L3A
|
||||
classDef l1Class fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||
classDef l2Class fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||
classDef l3Class fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||
class L1A,L1B,L1C,L1D l1Class
|
||||
class L2A,L2B,L2C,L2D l2Class
|
||||
class L3A,L3B,L3C,L3D l3Class
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">6</div>
|
||||
<div class="section-title">Skill System and Execution Paths</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="mermaid">
|
||||
graph TB
|
||||
SD1["SKILL.toml metadata"]
SD2["tools array with kind definitions"]
SD3["prompts array with trigger conditions"]
SD4["scripts directory with JS scripts"]
SL1["ZeroClaw Skill Loader scan"]
SL2["BrowserScriptSkillTool creates the executor"]
SL3["Naming convention skill.tool"]
EP1["Path A LLM-driven"]
EP2["Path B Deterministic Submit"]
EP3["Path C Direct Skill Runtime"]
BE1["Eval wraps the script and injects args"]
BE2["Action Eval execution"]
BE3["Returns ToolResult as structured JSON"]
|
||||
SD1 --> SD2 --> SD4
|
||||
SD2 --> SD3
|
||||
SD1 --> SL1 --> SL2 --> SL3
|
||||
SL3 --> EP1
|
||||
SL3 --> EP2
|
||||
SL3 --> EP3
|
||||
EP1 --> BE1 --> BE2 --> BE3
|
||||
EP2 --> BE1
|
||||
EP3 --> BE1
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">7</div>
|
||||
<div class="section-title">Helper Page Mechanism - Core Bridge of Service Mode</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="mermaid">
|
||||
graph TB
|
||||
WS["WebSocket Server listening on 42321"]
HTTP["Callback HTTP Server listening on 17888"]
CB["BrowserCallbackBackend interaction"]
Helper["Helper Page Tab"]
Target1["Business page 1 line loss system"]
Target2["Business page 2 platform page"]
HP1["WebSocket connection to privileged API"]
HP2["Command polling via long-poll"]
HP3["Event push via POST callback"]
HP4["Callback function registration"]
|
||||
WS --> CB --> HTTP --> HP2
|
||||
HP1 --> Target1
|
||||
HP1 --> Target2
|
||||
HP2 --> Target1
|
||||
HP2 --> Target2
|
||||
Target1 --> HP4 --> HP3 --> HTTP
|
||||
HTTP --> CB --> WS
|
||||
classDef svcClass fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||
classDef tabClass fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||
classDef hpClass fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||
class WS,HTTP,CB svcClass
|
||||
class Helper,Target1,Target2 tabClass
|
||||
class HP1,HP2,HP3,HP4 hpClass
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">8</div>
|
||||
<div class="section-title">Line Loss Deterministic Submit Flow - From User Input to Excel Export</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="mermaid">
|
||||
sequenceDiagram
|
||||
participant User as User
participant Host as Browser host
participant Agent as Agent/TaskRunner
participant DS as DeterministicSubmit
participant Skill as collect_lineloss
participant Backend as BrowserBackend
participant Browser as Line loss browser page
participant Rust as Rust xlsx export
User->>Host: Enter the instruction to check this month's line loss rate
Host->>Agent: SubmitTask
Agent->>DS: decide_deterministic_submit
Note over DS: Instruction ends with a full stop and contains a line loss keyword
DS-->>Agent: Execute (execution plan)
Agent->>Skill: execute_browser_script
Skill->>Backend: Action Eval
Backend->>Browser: sgBrowserExcuteJsCodeByDomain
Browser->>Browser: validatePageContext
Browser->>Browser: buildRequest
Browser->>Browser: ajax query API
Browser-->>Backend: Return JSON
Backend-->>Skill: ToolResult
Skill-->>Agent: artifact
Agent->>Rust: export_lineloss_xlsx
Rust->>Rust: Generate xlsx file
Rust-->>Agent: Export complete
Agent-->>Host: TaskComplete
Host-->>User: Show result and open Excel
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">9</div>
|
||||
<div class="section-title">Interaction Boundary Between the Platform Browser and sgClaw</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="mermaid">
|
||||
graph TB
|
||||
PlatformBrowser["Platform browser (Chromium)"]
sgClawProcess["sgClaw process (Rust)"]
PP1["Scenario page Vue instance window.mac"]
PP2["mutableSystemList subsystem account pool"]
PP3["getLogint login orchestration method"]
TP1["Line loss system 20.76.57.61"]
TP2["Other subsystems"]
BC1["sgBrowserExcuteJsCodeByDomain runs JS by domain"]
BC2["sgHideBrowerserOpenPage opens a hidden page"]
BC3["sgBrowserCallAfterLoaded runs JS after load"]
BC4["callBackJsToCpp JS-to-C++ callback"]
T1["Transport layer STDIO transport"]
T2["MAC Policy plus HMAC security checks"]
T3["Agent/TaskRunner task dispatcher"]
T4["Compat layer ZeroClaw compatibility"]
T5["Browser Backend"]
|
||||
PP1 --> PP2
|
||||
PP1 --> PP3
|
||||
PP3 -.-> TP1
|
||||
T1 --> PlatformBrowser
|
||||
PlatformBrowser --> T1
|
||||
T3 --> T4 --> T5
|
||||
T5 --> BC1
|
||||
T5 --> BC2
|
||||
T5 --> BC3
|
||||
BC4 -.-> T5
|
||||
PlatformBrowser -.-> sgClawProcess
|
||||
classDef browserSide fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||
classDef sgclawSide fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||
class PlatformBrowser,PP1,PP2,PP3,TP1,TP2,BC1,BC2,BC3,BC4 browserSide
|
||||
class sgClawProcess,T1,T2,T3,T4,T5 sgclawSide
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">10</div>
|
||||
<div class="section-title">Module-to-File Mapping</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<table class="file-table">
|
||||
<thead><tr><th>Module</th><th>Main source files</th><th>Responsibilities</th></tr></thead>
<tbody>
<tr><td>pipe transport layer</td><td>src/pipe/mod.rs, transport.rs, handshake.rs, browser_tool.rs</td><td>STDIO read/write, handshake flow, message encoding/decoding, HMAC signing</td></tr>
<tr><td>security layer</td><td>src/security/mod.rs, mac_policy.rs, hmac.rs</td><td>MAC Policy loading, domain allowlist, action allowlist/denylist, HMAC signing</td></tr>
<tr><td>agent message routing</td><td>src/agent/mod.rs, task_runner.rs</td><td>Message dispatch, task parsing, Deterministic Submit detection</td></tr>
<tr><td>browser backend abstraction</td><td>src/browser/mod.rs, callback_backend.rs, callback_host.rs, ws_protocol.rs</td><td>BrowserBackend interface, Pipe/WS/Callback implementations</td></tr>
<tr><td>compat layer</td><td>src/compat/mod.rs, runtime.rs, deterministic_submit.rs, browser_script_skill_tool.rs</td><td>ZeroClaw runtime construction, line loss fast path, Skill execution</td></tr>
<tr><td>service mode</td><td>src/service/mod.rs, session.rs</td><td>WS server, single-client single-task model</td></tr>
<tr><td>config</td><td>src/config/mod.rs, settings.rs</td><td>Settings loading, Provider configuration, Backend selection</td></tr>
<tr><td>runtime engine</td><td>src/runtime/mod.rs, engine.rs, tool_policy.rs</td><td>Agent instance construction, ToolPolicy permission control</td></tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
</div>
|
||||
<div class="footer">sgClaw System Architecture Panorama - 2026-04-15 - built with Mermaid.js 10.9.5</div>
|
||||
<script>
|
||||
mermaid.initialize({ startOnLoad:true, theme:'dark', securityLevel:'loose', logLevel:'warn' });
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
494
docs/sgClaw系统架构全景图.md
Normal file
@@ -0,0 +1,494 @@
|
||||
# sgClaw System Architecture Panorama

**Document version**: 1.0<br>
**Project**: sgClaw<br>
**Date prepared**: 2026-04-15
|
||||
|
||||
---
|
||||
|
||||
## 1. System Boundary Overview

```mermaid
graph TB
subgraph BrowserHost["Browser host (SuperRPA / Chromium)"]
direction TB
H1["Launch Config<br/>Startup configuration"]
H2["Chromium child-process management<br/>Launches and monitors sgClaw"]
H3["Browser Command executor<br/>click/type/navigate/eval/..."]
H4["HMAC re-check + domain validation<br/>Host-side security boundary"]
H5["Frontend Bundle<br/>Presentation layer (Vue 2 pages)"]

H1 --> H2
H2 --> H3
H3 --> H4
H4 -.display.-> H5
end

subgraph sgClawProcess["sgClaw process (Rust)"]
direction TB
S1["Transport layer<br/>STDIO / WebSocket"]
S2["Security layer<br/>MAC Policy + HMAC signing"]
S3["Agent layer<br/>Message routing + task dispatch"]
S4["Compat layer<br/>ZeroClaw runtime + Skill toolchain"]
S5["Browser Backend abstraction<br/>Pipe / WS / Callback / Bridge"]
S6["Config layer<br/>Runtime Config + environment variables"]

S1 --> S2
S2 --> S3
S3 --> S4
S4 --> S5
S6 -.config injection.-> S4
end

subgraph ZeroClawCore["ZeroClaw core (vendored)"]
direction TB
Z1["Planner / Executor<br/>Task decomposition and execution"]
Z2["Tool Loop<br/>Tool-call loop"]
Z3["Skills / Memory<br/>Skill loading and memory"]
Z4["Provider Dispatch<br/>LLM routing"]
Z5["Prompt Builder<br/>System Prompt assembly"]

Z1 --> Z2
Z2 --> Z3
Z3 --> Z4
Z5 --> Z1
end

subgraph ExternalServices["External services"]
direction TB
E1["LLM Provider<br/>DeepSeek / OpenAI / Claude"]
E2["Platform browser pages<br/>Business pages + hidden domains"]
end

BrowserHost <-->|"STDIO JSON Line<br/>AgentMessage / BrowserMessage"| sgClawProcess
sgClawProcess <-->|"Rust API calls (vendored)"| ZeroClawCore
ZeroClawCore <-->|"HTTP API / internal calls"| ExternalServices
sgClawProcess <-->|"Pipe Mode: STDIO<br/>Service Mode: WS<br/>Browser Backend"| ExternalServices
```
|
||||
|
||||
---
|
||||
|
||||
## 2. Dual Deployment Mode Architecture

### 2.1 Pipe Mode (STDIO): Traditional Embedded Mode

```mermaid
sequenceDiagram
participant Host as Browser host (Chromium)
participant Pipe as StdioTransport
participant MAC as MAC Policy
participant Agent as Agent / TaskRunner
participant ZC as ZeroClaw Runtime
participant Backend as PipeBrowserBackend
participant Tool as BrowserPipeTool
participant HostExec as Host Command executor

Note over Host,HostExec: Pipe Mode: request/response over STDIO

Host->>Pipe: Init {version, hmac_seed, capabilities}
Pipe->>Pipe: derive_session_key(hmac_seed)
Pipe-->>Host: InitAck {version, agent_id, supported_actions}

Host->>Agent: SubmitTask {instruction, page_url, page_title}
Agent->>Agent: resolve_submit_instruction()
alt deterministic_submit (e.g. 线损。。。)
Agent->>Agent: Build DeterministicExecutionPlan
Agent->>Tool: execute_browser_script_skill_raw_output
else General LLM-driven path
Agent->>ZC: Construct ZeroClaw Agent
ZC->>Tool: tool loop: browser_action
end

Tool->>MAC: validate(domain, action)
MAC-->>Tool: allow / deny

Tool->>Backend: invoke(action, params)
Backend->>Pipe: AgentMessage::Command {seq, action, params, hmac}
Pipe-->>Host: stdout: Command JSON

Host->>HostExec: Execute browser command
HostExec-->>Host: Execution result
Host->>Pipe: BrowserMessage::Response {seq, success, data}
Pipe-->>Backend: Response returned
Backend-->>Tool: CommandOutput
Tool-->>ZC: ToolResult
ZC-->>Agent: tool loop continues or completes
Agent-->>Host: TaskComplete {success, summary}
```
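
The `seq` field carried by `Command` and `Response` in the diagram above is what lets the agent pair each reply with the command that caused it. A minimal sketch of that pairing follows. `PendingCommands`, `register`, and `resolve` are hypothetical names introduced here for illustration and are not taken from the sgClaw source.

```rust
use std::collections::HashMap;

/// Hypothetical tracker for commands that are awaiting a host response.
struct PendingCommands {
    next_seq: u64,
    in_flight: HashMap<u64, String>,
}

impl PendingCommands {
    fn new() -> Self {
        Self { next_seq: 1, in_flight: HashMap::new() }
    }

    /// Allocate the next seq and remember which action it belongs to.
    fn register(&mut self, action: &str) -> u64 {
        let seq = self.next_seq;
        self.next_seq += 1;
        self.in_flight.insert(seq, action.to_string());
        seq
    }

    /// Pair a response with its command; unknown or repeated seqs yield None.
    fn resolve(&mut self, seq: u64) -> Option<String> {
        self.in_flight.remove(&seq)
    }
}

fn main() {
    let mut pending = PendingCommands::new();
    let seq = pending.register("click");
    println!("sent seq {seq}");
    println!("first response: {:?}", pending.resolve(seq));
    println!("replayed response: {:?}", pending.resolve(seq));
}
```

Removing the entry on first match means a replayed or stray response is dropped rather than matched twice, which is one plausible reading of the sequence-number correlation check listed in the security model.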
|
||||
|
||||
### 2.2 Service Mode (TCP + WebSocket): Standalone Service Mode

```mermaid
sequenceDiagram
participant Console as Frontend console (browser)
participant WSS as WebSocket Server<br/>(127.0.0.1:42321)
participant Agent as Agent / TaskRunner
participant Callback as BrowserCallbackBackend
participant HTTP as Callback HTTP Server<br/>(127.0.0.1:17888)
participant Helper as Helper Page<br/>(helper page embedded in the browser)
participant Target as Target business page

Note over Console,Target: Service Mode: persistent service + Helper Page bridge

Console->>WSS: WebSocket Connect
WSS->>Callback: Create session

Console->>WSS: ClientMessage::SubmitTask
WSS->>Agent: Dispatch task
Agent->>Callback: BrowserBackend::invoke()

Note over Callback,Helper: Callback backend internal flow
Callback->>Helper: Push Command via HTTP Server
Helper->>Target: sgBrowserExcuteJsCodeByDomain<br/>Run JS in the target domain

Target-->>Helper: callBackJsToCpp / XHR POST
Helper->>HTTP: POST /sgclaw/callback/events
HTTP-->>Callback: Callback event returned

Callback-->>Agent: CommandOutput
Agent-->>WSS: ServiceMessage::TaskComplete
WSS-->>Console: Result pushed over WebSocket
```
|
||||
|
||||
---
|
||||
|
||||
## 3. sgClaw Internal Module Relationships

```mermaid
graph LR
subgraph EntryPoints["Entry points"]
E1["src/main.rs<br/>sgclaw::run()"]
E2["src/service/mod.rs<br/>service::run()"]
end

subgraph PipeLayer["pipe layer: transport and protocol"]
P1["StdioTransport<br/>STDIO read/write"]
P2["BrowserMessage / AgentMessage<br/>Message enum definitions"]
P3["Handshake<br/>Handshake protocol"]
P4["BrowserPipeTool<br/>Sends Command / awaits Response"]
P5["HMAC signing<br/>sign_command"]
end

subgraph SecurityLayer["security layer: security policy"]
M1["MacPolicy<br/>Loads rules from rules.json"]
M2["Domain Allowlist<br/>Domain allowlist check"]
M3["Action Allowlist/Blocklist<br/>Action allow and block lists"]
end

subgraph AgentLayer["agent layer: message routing and task dispatch"]
A1["handle_browser_message_with_context<br/>Message dispatch"]
A2["TaskRunner<br/>Task parsing and execution"]
A3["resolve_submit_instruction<br/>Deterministic Submit detection"]
end

subgraph CompatLayer["compat layer: ZeroClaw compatibility"]
C1["RuntimeEngine<br/>Builds the Agent instance"]
C2["ToolPolicy<br/>Tool permission control"]
C3["BrowserScriptSkillTool<br/>Runs Skill browser_script tools"]
C4["DeterministicSubmit<br/>Line-loss deterministic submit"]
C5["BrowserToolAdapter<br/>ZeroClaw tool adapter"]
C6["ConfigAdapter<br/>Config conversion"]
end

subgraph BrowserLayer["browser layer: browser backends"]
B1["BrowserBackend trait<br/>Unified interface"]
B2["PipeBrowserBackend<br/>Pipe Mode implementation"]
B3["WsBrowserBackend<br/>Direct WebSocket connection"]
B4["BrowserCallbackBackend<br/>Helper Page bridge"]
B5["BridgeBrowserBackend<br/>Bridge mode"]
end

subgraph ServiceLayer["service layer: service mode"]
SV1["WebSocket Server<br/>TCP listener"]
SV2["Session Manager<br/>Single client, single task"]
SV3["Callback HTTP Server<br/>Helper page communication"]
end

subgraph ConfigLayer["config layer: runtime configuration"]
CF1["SgClawSettings<br/>Loaded from JSON / environment variables"]
CF2["Provider Config<br/>API Key / Model"]
CF3["Backend Selection<br/>Pipe vs Service"]
end

E1 --> P1
E2 --> SV1

P1 --> P2
P2 --> P3
P3 --> P4
P4 --> P5

P5 --> M1
M1 --> M2
M1 --> M3

M3 --> A1
A1 --> A2
A2 --> A3

A3 --> C1
A3 --> C4
C1 --> C2
C1 --> C3
C2 --> C5
C6 --> C1

C3 --> B1
C4 --> B1
C5 --> B1

B1 --> B2
B1 --> B3
B1 --> B4
B1 --> B5

SV1 --> SV2
SV1 --> SV3
SV2 --> B4

CF1 --> CF2
CF1 --> CF3
CF3 --> A1
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Security Model: Three Lines of Defense

```mermaid
graph TB
subgraph Layer1["Layer 1: Handshake and session integrity"]
L1A["Browser sends Init<br/>carrying hmac_seed"]
L1B["sgClaw replies InitAck<br/>assigns agent_id"]
L1C["Derive Session Key<br/>SHA256(hmac_seed + salt)"]
L1D["Handshake incomplete<br/>refuse to enter the running state"]

L1A --> L1B --> L1C --> L1D
end

subgraph Layer2["Layer 2: Rust-side MAC Policy"]
L2A["Load rules.json<br/>version, domains, actions"]
L2B["Domain allowlist check<br/>strip scheme/path/port"]
L2C["Action allow/block lists<br/>allowed + blocked double filter"]
L2D["Local dashboard special case<br/>__sgclaw_local_dashboard__"]

L2A --> L2B
L2A --> L2C
L2A --> L2D
end

subgraph Layer3["Layer 3: Host-side command execution constraints"]
L3A["Sequence-number correlation check"]
L3B["HMAC-SHA256 signature verification"]
L3C["Domain and page-context match"]
L3D["Reject commands with invalid parameters"]

L3A --> L3B --> L3C --> L3D
end

Layer1 ==>|"Session Key"| Layer2
Layer2 ==>|"Command + HMAC"| Layer3
```
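
The second layer can be illustrated with a small sketch: reduce the target to a bare host by stripping scheme, path, and port, then apply the domain allowlist and the allowed-plus-blocked action filter. This is an assumption-laden illustration, not the actual `MacPolicy` code. The struct fields, `normalize_domain`, and the rule that a blocked action wins over an allowed one are hypothetical, and the `__sgclaw_local_dashboard__` special case and IPv6 hosts are left out.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for the real MacPolicy.
struct MacPolicy {
    allowed_domains: HashSet<String>,
    allowed_actions: HashSet<String>,
    blocked_actions: HashSet<String>,
}

fn to_set(items: &[&str]) -> HashSet<String> {
    items.iter().map(|s| s.to_string()).collect()
}

/// Strip the scheme, then path/query/fragment, then the port.
fn normalize_domain(input: &str) -> String {
    let without_scheme = match input.find("://") {
        Some(idx) => &input[idx + 3..],
        None => input,
    };
    let host_port = without_scheme
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");
    let host = host_port.split(':').next().unwrap_or("");
    host.trim().to_ascii_lowercase()
}

impl MacPolicy {
    fn new(domains: &[&str], allowed: &[&str], blocked: &[&str]) -> Self {
        Self {
            allowed_domains: to_set(domains),
            allowed_actions: to_set(allowed),
            blocked_actions: to_set(blocked),
        }
    }

    /// Assumed ordering: domain check first, then blocked, then allowed.
    fn validate(&self, target: &str, action: &str) -> Result<(), String> {
        let domain = normalize_domain(target);
        if !self.allowed_domains.contains(&domain) {
            return Err(format!("domain not allowed: {domain}"));
        }
        if self.blocked_actions.contains(action) {
            return Err(format!("action blocked: {action}"));
        }
        if !self.allowed_actions.contains(action) {
            return Err(format!("action not allowed: {action}"));
        }
        Ok(())
    }
}

fn main() {
    let policy = MacPolicy::new(&["20.76.57.61"], &["click", "eval"], &["eval"]);
    println!("{:?}", policy.validate("http://20.76.57.61:18080/app", "click"));
    println!("{:?}", policy.validate("http://20.76.57.61:18080/app", "eval"));
}
```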
|
||||
|
||||
---
|
||||
|
||||
## 5. Skill System and Execution Paths

```mermaid
graph TB
subgraph SkillDefinition["Skill definition (SKILL.toml)"]
SD1["skill metadata<br/>name, version, description"]
SD2["tools array<br/>kind: browser_script / http_request / ..."]
SD3["prompts array<br/>Trigger-condition descriptions"]
SD4["scripts/ directory<br/>JS script files"]
end

subgraph SkillLoading["Skill loading"]
SL1["ZeroClaw Skill Loader<br/>Scans skillsDir"]
SL2["BrowserScriptSkillTool<br/>Creates an executor for each tool"]
SL3["Naming: {skill_name}.{tool_name}"]
end

subgraph ExecutionPaths["Execution paths"]
EP1["Path A: LLM-driven<br/>Agent tool loop → browser_action"]
EP2["Path B: Deterministic Submit<br/>Instruction match → direct execution (no LLM)"]
EP3["Path C: Direct Skill Runtime<br/>Configured skill → direct execution"]
end

subgraph BrowserExecution["Browser-side execution"]
BE1["Eval wrapper<br/>(function() { const args = {...}; ... })()"]
BE2["Action::Eval<br/>Executed through BrowserBackend"]
BE3["Returns ToolResult<br/>Structured result"]
end

SD1 --> SD2 --> SD4
SD2 --> SD3

SD1 --> SL1 --> SL2 --> SL3

SL3 --> EP1
SL3 --> EP2
SL3 --> EP3

EP1 --> BE1
EP2 --> BE1
EP3 --> BE1

BE1 --> BE2 --> BE3
```
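
Two details in the diagram above are easy to show concretely: the `{skill_name}.{tool_name}` naming scheme and the immediately invoked function wrapper that exposes `args` to a script before it is sent as `Action::Eval`. The sketch below is illustrative only. `qualified_tool_name`, `wrap_for_eval`, and the skill name `lineloss` are hypothetical, and the caller is assumed to pass already-valid JSON since no escaping is performed.

```rust
/// Build the tool name exposed to the agent: {skill_name}.{tool_name}.
fn qualified_tool_name(skill_name: &str, tool_name: &str) -> String {
    format!("{skill_name}.{tool_name}")
}

/// Wrap a script body in an IIFE that binds the JSON arguments to `args`.
/// `args_json` must already be valid JSON; this sketch does not escape it.
fn wrap_for_eval(args_json: &str, script_body: &str) -> String {
    format!("(function() {{ const args = {args_json}; {script_body} }})()")
}

fn main() {
    let name = qualified_tool_name("lineloss", "collect_lineloss");
    let script = wrap_for_eval(r#"{"period":"month"}"#, "return args.period;");
    println!("{name}");
    println!("{script}");
}
```

Wrapping the script in a function keeps `args` and any local variables out of the page's global scope, which matters when the same page is reused across several tool calls.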
|
||||
|
||||
---
|
||||
|
||||
## 6. Helper Page Mechanism (Service Mode)

```mermaid
graph TB
subgraph sgClawService["sgClaw Service process"]
WS["WebSocket Server<br/>:42321"]
HTTP["HTTP Server<br/>:17888"]
CB["BrowserCallbackBackend"]
end

subgraph BrowserTabs["Browser tabs"]
Helper["Helper Page Tab<br/>/sgclaw/browser-helper.html"]
Target1["Business page 1<br/>20.76.57.61:18080/..."]
Target2["Business page 2<br/>25.215.213.128:18080/..."]
end

subgraph HelperPage["Inside the Helper Page"]
HP1["WebSocket connection<br/>ws://127.0.0.1:12345"]
HP2["Poll for Commands<br/>GET /sgclaw/callback/commands/next"]
HP3["Push Events<br/>POST /sgclaw/callback/events"]
HP4["Callback registration<br/>sgclawOnClickProbe / sgclawOnEval / ..."]
end

WS -->|"WebSocket"| CB
CB -->|"Push Command"| HTTP
HTTP -->|long-poll| HP2

HP1 -->|"Browser WebSocket API"| Target1
HP1 -->|"Browser WebSocket API"| Target2

HP2 -->|"Run JS command<br/>sgBrowserExcuteJsCodeByDomain"| Target1
HP2 -->|"Run JS command<br/>sgBrowserExcuteJsCodeByDomain"| Target2

Target1 -->|"callBackJsToCpp"| HP4
HP3 -->|"XHR POST"| HTTP
HP4 --> HP3

HTTP -->|"Callback event"| CB
CB -->|"ToolResult"| WS
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Line-Loss Deterministic Submit Flow (Deterministic Submit)

```mermaid
sequenceDiagram
participant User as User
participant Host as Browser host
participant Agent as Agent / TaskRunner
participant DS as DeterministicSubmit
participant Skill as BrowserScriptSkillTool<br/>(collect_lineloss)
participant Backend as BrowserBackend
participant Browser as Browser page<br/>(line-loss domain)
participant Rust as Rust side<br/>xlsx export

User->>Host: Input: "帮我查本月线损率。。。" (check this month's line-loss rate)
Host->>Agent: SubmitTask {instruction}

Agent->>DS: decide_deterministic_submit()
Note over DS: Instruction ends with "。。。"<br/>and contains the keyword "线损" (line loss)
DS-->>Agent: Execute(DeterministicExecutionPlan)

Agent->>Skill: execute_browser_script_skill_raw_output()
Skill->>Backend: Action::Eval {script: collect_lineloss.js}
Backend->>Browser: sgBrowserExcuteJsCodeByDomain<br/>(20.76.57.61, js_code)

Browser->>Browser: validatePageContext(args)
Browser->>Browser: buildMonthRequest / buildWeekRequest
Browser->>Browser: $.ajax call to the line-loss API
Browser-->>Backend: Return report-artifact JSON
Backend-->>Skill: ToolResult
Skill-->>Agent: artifact {status, rows, column_defs}

Agent->>Rust: export_lineloss_xlsx(artifact)
Rust->>Rust: Generate .xlsx file
Rust-->>Agent: Export complete
Agent-->>Host: TaskComplete {success: true}
Host-->>User: Show results + open Excel
```
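
The detection rule noted in the diagram (the instruction ends with "。。。" and contains "线损") is simple enough to sketch directly. The code below is a hedged illustration and not the real `decide_deterministic_submit()`: `SubmitDecision` and `detect_lineloss_submit` are hypothetical names, and the real function returns a full `DeterministicExecutionPlan` rather than the trimmed instruction.

```rust
/// Hypothetical outcome type; the real code returns an execution plan.
#[derive(Debug, PartialEq)]
enum SubmitDecision {
    Deterministic { instruction: String },
    NotDeterministic,
}

const TRIGGER_SUFFIX: &str = "。。。";
const KEYWORD: &str = "线损";

/// Take the fast path only when the trimmed instruction ends with the
/// trigger suffix and the remaining text mentions the keyword.
fn detect_lineloss_submit(raw: &str) -> SubmitDecision {
    match raw.trim().strip_suffix(TRIGGER_SUFFIX) {
        Some(body) if body.contains(KEYWORD) => SubmitDecision::Deterministic {
            instruction: body.trim().to_string(),
        },
        _ => SubmitDecision::NotDeterministic,
    }
}

fn main() {
    println!("{:?}", detect_lineloss_submit("帮我查本月线损率。。。"));
    println!("{:?}", detect_lineloss_submit("帮我查本月线损率"));
}
```

Requiring both the suffix and the keyword keeps ordinary questions about line loss on the LLM-driven path, so the fast path only runs when the user opts in explicitly.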
|
||||
|
||||
---
|
||||
|
||||
## 8. Interaction Boundary Between the Platform Browser and sgClaw

```mermaid
graph TB
subgraph PlatformBrowser["Platform browser (Chromium)"]
direction TB
subgraph PlatformPages["Platform scene pages"]
PP1["Scene page Vue instance<br/>window.mac"]
PP2["mutableSystemList<br/>Subsystem account pool"]
PP3["getLogint / loginStatusTing<br/>Subsystem login orchestration"]
end

subgraph TargetPages["Target business pages"]
TP1["Line-loss system<br/>20.76.57.61:18080"]
TP2["Other subsystems"]
end

subgraph BrowserCapabilities["Privileged browser capabilities"]
BC1["sgBrowserExcuteJsCodeByDomain<br/>Run JS by domain"]
BC2["sgHideBrowerserOpenPage<br/>Open a hidden page"]
BC3["sgBrowserCallAfterLoaded<br/>Run JS after page load"]
BC4["callBackJsToCpp<br/>JS → C++ callback"]
end

PP1 --> PP2
PP1 --> PP3
end

subgraph sgClawProcess["sgClaw process"]
direction TB
subgClawTransport["Transport layer"]
subgClawSecurity["MAC Policy + HMAC"]
subgClawAgent["Agent / TaskRunner"]
subgClawCompat["Compat layer"]
subgClawBackend["Browser Backend"]
end

subgClawTransport <-->|"STDIO JSON Line<br/>AgentMessage / BrowserMessage"| PlatformBrowser
subgClawAgent --> subgClawCompat
subgClawCompat --> subgClawBackend
subgClawBackend -->|"BrowserAction<br/>sgBrowserExcuteJsCodeByDomain"| BC1
subgClawBackend -->|"BrowserAction<br/>sgHideBrowerserOpenPage"| BC2
subgClawBackend -->|"BrowserAction<br/>sgBrowserCallAfterLoaded"| BC3

BC4 -.callback.-> subgClawBackend

PlatformBrowser -.security boundary.-> sgClawProcess

classDef browserSide fill:#e3f2fd,stroke:#1565c0,color:#000
classDef sgclawSide fill:#fff3e0,stroke:#e65100,color:#000
classDef interaction fill:#f3e5f5,stroke:#7b1fa2,color:#000

class PlatformBrowser,PlatformPages,TargetPages,BrowserCapabilities browserSide
class sgClawProcess,subgClawTransport,subgClawSecurity,subgClawAgent,subgClawCompat,subgClawBackend sgclawSide
```
|
||||
|
||||
---
|
||||
|
||||
## 9. Module-to-File Mapping

| Module | Main files | Responsibilities |
|---|---|---|
| **pipe transport layer** | `src/pipe/mod.rs`, `src/pipe/transport.rs`, `src/pipe/handshake.rs`, `src/pipe/browser_tool.rs` | STDIO read/write, handshake, message encoding/decoding, HMAC signing, sending Commands and awaiting Responses |
| **security layer** | `src/security/mod.rs`, `src/security/mac_policy.rs`, `src/security/hmac.rs` | MAC Policy loading and validation, Session Key derivation, command signing |
| **agent message routing** | `src/agent/mod.rs`, `src/agent/task_runner.rs` | Receiving and dispatching BrowserMessage, task parsing, Deterministic Submit detection |
| **browser backend abstraction** | `src/browser/mod.rs`, `src/browser/callback_backend.rs`, `src/browser/callback_host.rs`, `src/browser/ws_protocol.rs` | BrowserBackend trait definition; four implementations: Pipe, WS, Callback, Bridge |
| **compat compatibility layer** | `src/compat/mod.rs`, `src/compat/runtime.rs`, `src/compat/deterministic_submit.rs`, `src/compat/browser_script_skill_tool.rs` | ZeroClaw runtime construction, line-loss deterministic submit, Skill browser_script execution |
| **service mode** | `src/service/mod.rs`, `src/service/session.rs` | WebSocket server, client session management, single-task concurrency model |
| **config runtime configuration** | `src/config/mod.rs`, `src/config/settings.rs` | SgClawSettings loading, Provider configuration, Backend selection |
| **runtime engine** | `src/runtime/mod.rs`, `src/runtime/engine.rs`, `src/runtime/tool_policy.rs` | RuntimeEngine builds the Agent; ToolPolicy controls tool permissions |
|
||||
645
docs/sgClaw组件职责与流转全景图.html
Normal file
@@ -0,0 +1,645 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="zh-CN">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>sgClaw Intelligent Browser Automation Platform - Component Responsibilities and Flow Panorama</title>
|
||||
<script src="https://cdn.jsdelivr.net/npm/mermaid@10.9.5/dist/mermaid.min.js"></script>
|
||||
<style>
|
||||
*{margin:0;padding:0;box-sizing:border-box}
|
||||
body{font-family:-apple-system,BlinkMacSystemFont,"Segoe UI","PingFang SC","Hiragino Sans GB","Microsoft YaHei",sans-serif;background:#0d1117;color:#c9d1d9;line-height:1.8}
|
||||
.header{background:linear-gradient(135deg,#0a1628,#16213e,#1a3a5c);padding:3rem 2rem;text-align:center;border-bottom:3px solid #e65100}
|
||||
.header h1{font-size:2.2rem;color:#e6edf3;margin-bottom:.5rem}
|
||||
.header .subtitle{color:#8b949e;font-size:1rem}
|
||||
.container{max-width:1400px;margin:0 auto;padding:2rem}
|
||||
.section{background:#161b22;border:1px solid #30363d;border-radius:12px;margin-bottom:2rem;overflow:hidden}
|
||||
.section-header{background:linear-gradient(90deg,#1a1a2e,#16213e);padding:1rem 1.5rem;border-bottom:1px solid #30363d;display:flex;align-items:center;gap:.8rem}
|
||||
.section-number{background:#e65100;color:#fff;width:32px;height:32px;border-radius:50%;display:flex;align-items:center;justify-content:center;font-weight:700;flex-shrink:0}
|
||||
.section-title{font-size:1.2rem;color:#e6edf3;font-weight:600}
|
||||
.section-body{padding:1.5rem;overflow-x:auto}
|
||||
.mermaid{display:flex;justify-content:center;padding:1rem 0}
|
||||
.mermaid svg{max-width:100%;height:auto}
|
||||
.desc{background:#1a1a2e;border-left:3px solid #e65100;padding:1rem 1.2rem;margin:1rem 0;border-radius:0 8px 8px 0;font-size:.95rem;color:#8b949e}
|
||||
.desc strong{color:#e6edf3}
|
||||
.component-grid{display:grid;grid-template-columns:repeat(auto-fit,minmax(300px,1fr));gap:1.2rem;margin:1.5rem 0}
|
||||
.component-card{background:#1a1a2e;border:1px solid #30363d;border-radius:10px;padding:1.3rem;transition:all .2s}
|
||||
.component-card:hover{border-color:#e65100;transform:translateY(-2px)}
|
||||
.component-card h3{color:#e65100;font-size:1.05rem;margin-bottom:.6rem;display:flex;align-items:center;gap:.5rem}
|
||||
.component-card .badge{background:#e65100;color:#fff;padding:.15rem .5rem;border-radius:12px;font-size:.75rem;font-weight:600}
|
||||
.component-card .badge.external{background:#4a9eff}
|
||||
.component-card p{color:#8b949e;font-size:.9rem;margin-bottom:.5rem}
|
||||
.component-card .meta{display:flex;flex-direction:column;gap:.3rem;margin-top:.8rem;padding-top:.8rem;border-top:1px solid #30363d}
|
||||
.component-card .meta-item{display:flex;gap:.5rem;font-size:.85rem}
|
||||
.component-card .meta-label{color:#6e7681;white-space:nowrap;min-width:60px}
|
||||
.component-card .meta-value{color:#c9d1d9}
|
||||
.flow-container{background:#1a1a2e;border-radius:10px;padding:1.5rem;margin:1rem 0}
|
||||
.flow-step{display:flex;gap:1rem;align-items:flex-start;margin-bottom:1rem;padding:.8rem;background:#161b22;border-radius:8px;border-left:3px solid #e65100}
|
||||
.flow-step:last-child{margin-bottom:0}
|
||||
.flow-step-number{background:#e65100;color:#fff;width:28px;height:28px;border-radius:50%;display:flex;align-items:center;justify-content:center;font-weight:700;flex-shrink:0}
|
||||
.flow-step-content{flex:1}
|
||||
.flow-step-content h4{color:#e6edf3;font-size:1rem;margin-bottom:.3rem}
|
||||
.flow-step-content p{color:#8b949e;font-size:.9rem}
|
||||
.flow-step-content .highlight{color:#4a9eff;font-weight:600}
|
||||
.footer{text-align:center;padding:2rem;color:#484f58;font-size:.85rem;border-top:1px solid #21262d}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="header">
|
||||
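<h1>sgClaw Intelligent Browser Automation Platform</h1>
<div class="subtitle">Core component responsibilities and flow panorama - what each component is, what it does, and when it is invoked</div>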
|
||||
</div>
|
||||
<div class="container">
|
||||
|
||||
<!-- Section 1: Overview -->
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">1</div>
|
||||
<div class="section-title">Overview - The Complete Path from User Instruction to Browser Execution</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="desc">
|
||||
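When a user says "帮我查本月线损率" ("check this month's line-loss rate for me"), several sgClaw components work together. The diagram below shows the <strong>complete execution path</strong>: the stage at which each component is invoked and the responsibility it carries.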
|
||||
</div>
|
||||
<div class="mermaid">
|
||||
graph TB
U["User\nEnters a natural-language instruction"] -->|"1. SubmitTask"| GW["Communication gateway\nSTDIO Pipe / Service WS\nReceives requests and opens sessions"]
GW -->|"2. Load config"| CFG["SgClawSettings\nLoads sgclaw_config.json\nLLM Provider RuntimeProfile SkillsDir"]
CFG -->|"3. Four-level routing decision"| RT["Agent Runtime\ntask_runner task scheduling"]

RT -->|"3a. Scene match"| DS["Deterministic execution\ndeterministic_submit\nscene_platform matches the scene manifest\nRuns preset scripts directly without an LLM"]
RT -->|"3b. Primary orchestration"| PO["Primary orchestration path\nzeroclaw_process_message_primary\nFull Agent tool loop with LLM-driven planning"]
RT -->|"3c. Direct skill"| DSK["Direct skill path\ndirect_skill_primary\nRuns the configured skill.tool directly"]
RT -->|"3d. Standard LLM"| ZC["Standard LLM path\ncompat_llm_primary\nzeroclaw agent turn as the default fallback"]

DS -->|"4. Execute action"| BB["Browser backend\nBrowserBackend trait\nPipeBrowser / WsBrowser"]
PO -->|"4. Call tool"| BB
DSK -->|"4. Execute action"| BB
ZC -->|"4. Call tool"| BB

BB -->|"5. Security check"| SC["MAC Policy\nChecks rules.json\nDomain allowlist, action allowlist, HMAC"]
SC -->|"6. Execute command"| EXT["SuperRPA Chromium\nPerforms the actual DOM operations\nNavigate, click, type, read"]
EXT -->|"7. Return result"| BB
BB -->|"8. Relay result"| RT
RT -->|"9. Post-processing"| PH["Report Artifact\nopenxml_office generates Excel\nscreen_html_export generates the dashboard"]
PH -->|"10. TaskComplete"| GW
GW -->|"11. Result"| U

classDef userNode fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
classDef coreNode fill:#4a2c17,stroke:#e65100,color:#e6edf3
classDef routeNode fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
classDef extNode fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
classDef cfgNode fill:#484f58,stroke:#8b949e,color:#e6edf3

class U userNode
class GW,RT,BB,PH coreNode
class DS,PO,DSK,ZC routeNode
class SC,EXT extNode
class CFG cfgNode
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Section 2: Core Components Detail -->
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">2</div>
|
||||
<div class="section-title">Core Components in Detail - Responsibilities, Invocation Timing, Inputs and Outputs</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="desc">
|
||||
Each core component is described below. Every card lists <strong>when it is invoked</strong>, <strong>what it takes as input</strong>, and <strong>what it outputs</strong>.
|
||||
</div>
|
||||
<div class="component-grid">
|
||||
<div class="component-card">
<h3><span class="badge">Internal</span>Communication Gateway</h3>
<p>Receives user requests, establishes sessions, and returns the final result. It supports two modes: STDIO Pipe (the default, exchanging JSON Lines with the browser host over stdin/stdout) and Service WS (WebSocket service mode, accepting connections from external clients).</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">Invoked:</span><span class="meta-value">As soon as the user issues a request</span></div>
<div class="meta-item"><span class="meta-label">Input:</span><span class="meta-value">SubmitTask message (instruction, conversationId, pageUrl, pageTitle)</span></div>
<div class="meta-item"><span class="meta-label">Output:</span><span class="meta-value">TaskComplete, LogEntry, and StatusChanged messages</span></div>
</div>
</div>
|
||||
|
||||
<div class="component-card">
<h3><span class="badge">Internal</span>Agent Runtime Task Scheduling</h3>
<p>run_submit_task() is the entry point for task execution. It applies a four-level routing decision in order: (1) deterministic_submit, deterministic scene matching; (2) primary_orchestration, primary orchestration; (3) direct_submit_skill, direct skill; (4) compat_llm_primary, the standard LLM fallback.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">Invoked:</span><span class="meta-value">After a SubmitTask message arrives</span></div>
<div class="meta-item"><span class="meta-label">Input:</span><span class="meta-value">Instruction, AgentRuntimeContext, BrowserPipeTool</span></div>
<div class="meta-item"><span class="meta-label">Output:</span><span class="meta-value">AgentMessage::TaskComplete</span></div>
</div>
</div>
|
||||
|
||||
<div class="component-card">
<h3><span class="badge">Internal</span>Scene Platform</h3>
<p>Scans the scene manifests (scene.toml) under the skills/ directory and parses the keyword rules in their deterministic sections. When the user instruction matches, it builds a DeterministicExecutionPlan (with execution parameters such as target_url, org_code, and period_mode) and runs the preset script directly.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">Invoked:</span><span class="meta-value">First step of the four-level routing decision</span></div>
<div class="meta-item"><span class="meta-label">Input:</span><span class="meta-value">User instruction, pageUrl, pageTitle, skills directory</span></div>
<div class="meta-item"><span class="meta-label">Output:</span><span class="meta-value">DeterministicExecutionPlan or NotDeterministic</span></div>
</div>
</div>
|
||||
|
||||
<div class="component-card">
<h3><span class="badge">Internal</span>SgClawSettings Configuration Management</h3>
<p>Loads runtime configuration from a JSON config file or environment variables: multi-Provider management (apiKey, baseUrl, model), Runtime Profile, SkillsDir, BrowserBackend type, OfficeBackend, the Service WS listen address, and so on.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">Invoked:</span><span class="meta-value">Loaded on every task submission</span></div>
<div class="meta-item"><span class="meta-label">Input:</span><span class="meta-value">sgclaw_config.json or environment variables</span></div>
<div class="meta-item"><span class="meta-label">Output:</span><span class="meta-value">SgClawSettings struct</span></div>
</div>
</div>
|
||||
|
||||
<div class="component-card">
<h3><span class="badge">Internal</span>Runtime Engine</h3>
<p>Based on the Runtime Profile (BrowserAttached/BrowserHeavy/GeneralAssistant), builds the Tool Policy allowlist, loads skill packages, injects Memory, and constructs the Agent instance. It also handles instruction enhancement (appending browser contract hints and detecting specific task types).</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">Invoked:</span><span class="meta-value">When the primary orchestration path or the standard LLM path builds an Agent</span></div>
<div class="meta-item"><span class="meta-label">Core methods:</span><span class="meta-value">build_agent() build_instruction()</span></div>
<div class="meta-item"><span class="meta-label">Profile:</span><span class="meta-value">BrowserAttached / BrowserHeavy / GeneralAssistant</span></div>
</div>
</div>
|
||||
|
||||
<div class="component-card">
<h3><span class="badge external">External</span>ZeroClaw Core Agent Core</h3>
<p>The vendored Agent core library located in third_party/zeroclaw/. It provides Agent construction, Provider management, the tool loop, the Memory interface, skill loading, and Prompt assembly. sgClaw adds a security envelope layer on top of it.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">Invoked:</span><span class="meta-value">In the primary orchestration and standard LLM paths</span></div>
<div class="meta-item"><span class="meta-label">Location:</span><span class="meta-value">third_party/zeroclaw/</span></div>
<div class="meta-item"><span class="meta-label">Core capabilities:</span><span class="meta-value">Agent Provider ToolLoop Memory Skills</span></div>
</div>
</div>
|
||||
|
||||
<div class="component-card">
<h3><span class="badge">Internal</span>Browser Backend</h3>
<p>The unified browser-operation interface (BrowserBackend trait). Two implementations: PipeBrowserBackend (communicates with the host over STDIO) and WsBrowserBackend (connects directly to DevTools over WebSocket). Several backend types are supported: SuperRpa, AgentBrowser, RustNative, and ComputerUse.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">Invoked:</span><span class="meta-value">Whenever the browser needs to be operated</span></div>
<div class="meta-item"><span class="meta-label">Supported operations:</span><span class="meta-value">15 in total, including navigate, click, type, getText, eval, select, and scrollTo</span></div>
</div>
</div>
|
||||
|
||||
<div class="component-card">
<h3><span class="badge">Internal</span>MAC Policy Security Policy</h3>
<p>Loads security rules from resources/rules.json. The security model has three layers: (1) HMAC seed exchange and session-key derivation during the handshake; (2) Rust-side domain and action allowlist checks; (3) a second HMAC verification on the host side. Domains outside the allowlist and disabled actions are rejected.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">Invoked:</span><span class="meta-value">Before every browser operation is executed</span></div>
<div class="meta-item"><span class="meta-label">Checks:</span><span class="meta-value">Domain allowlist, action type, HMAC verification</span></div>
</div>
</div>
|
||||
|
||||
<div class="component-card">
<h3><span class="badge external">External</span>SuperRPA Chromium Browser Host</h3>
<p>The external system that actually performs DOM operations. It receives Commands (with HMAC) from sgClaw, verifies them, performs operations such as navigate/click/type/getText, and returns a Response (operation result plus HMAC). In STDIO mode it communicates with the sgClaw process over a pipe.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">Invoked:</span><span class="meta-value">When BrowserBackend sends a command</span></div>
<div class="meta-item"><span class="meta-label">Protocol:</span><span class="meta-value">STDIO JSON Line or WebSocket</span></div>
</div>
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Section 3: LLM Detail Flow -->
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">3</div>
|
||||
<div class="section-title">End-to-End LLM Workflow - From Semantic Recognition to Task Planning</div>
|
||||
</div>
|
||||
<div class="section-body">
|
||||
<div class="desc">
|
||||
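When a user instruction does not match any known skill, the LLM takes over. The steps below trace <strong>how the model goes from understanding the user's intent to producing an executable plan</strong>.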
|
||||
</div>
|
||||
<div class="flow-container">
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">1</div>
|
||||
<div class="flow-step-content">
|
||||
<h4>语义识别 - 理解用户说了什么</h4>
|
||||
<p>LLM 接收用户自然语言指令,识别用户的<strong>真实意图</strong>。例如"帮我查本月线损率" → 识别为"查询线损率数据"。</p>
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">2</div>
|
||||
<div class="flow-step-content">
|
||||
<h4>场景匹配 - 判断是否为已知场景</h4>
|
||||
<p>结合 <span class="highlight">Memory(记忆模块)</span>中存储的历史任务记录,判断该指令是否与已有技能匹配。如果匹配,转交快速通道执行。</p>
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">3</div>
|
||||
<div class="flow-step-content">
|
||||
<h4>任务拆解 - 将大目标分解为小步骤</h4>
|
||||
<p>如果是新场景,LLM 将用户目标拆解为具体的、可操作的步骤序列。例如:打开系统 → 选择月份 → 点击查询 → 读取数据 → 导出Excel。</p>
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">4</div>
|
||||
<div class="flow-step-content">
|
||||
<h4>工具选择 - 决定用什么能力完成任务</h4>
|
||||
<p>LLM 根据步骤需求,从可用工具库中选择合适的工具。例如:需要打开网页选择"导航工具",需要点击按钮选择"点击工具",需要读取数据选择"读取工具"。</p>
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">5</div>
|
||||
<div class="flow-step-content">
|
||||
<h4>参数填充 - 确定每个工具的具体参数</h4>
|
||||
<p>LLM 为每个工具填充具体参数。例如点击工具需要知道"点击哪个按钮",导航工具需要知道"打开哪个URL"。这些参数从用户指令和上下文中提取。</p>
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">6</div>
|
||||
<div class="flow-step-content">
|
||||
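<h4>Plan generation: emit executable JSON / structured instructions</h4>
<p>The LLM combines the decomposed steps, the selected tools, and the filled-in arguments into a <strong>structured execution plan</strong>, which the tool execution engine runs step by step.</p>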
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">7</div>
|
||||
<div class="flow-step-content">
|
||||
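<h4>Iteration: adjust dynamically based on execution results</h4>
<p>If a step fails or returns an unexpected result, the LLM receives that feedback and replans the remaining steps. For example, it tries a fallback URL when a page will not open, or a different selector when an element cannot be found.</p>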
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Section 4: Memory, Skills & Runtime Engine -->
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">4</div>
|
||||
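<div class="section-title">Memory, Skill Management, and the Runtime Engine: the Runtime Core</div>
</div>
<div class="section-body">
<div class="desc">
The sgClaw runtime core consists of three cooperating components: <strong>Memory</strong> persists conversation history and task state, the <strong>skill management system</strong> loads skill packages and injects them into the Agent, and the <strong>Runtime Engine</strong> builds the complete Agent environment from the Runtime Profile (tool policy + skill loading + instruction augmentation).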
|
||||
</div>
|
||||
<div class="mermaid">
|
||||
graph TB
|
||||
subgraph Memory["Memory 记忆模块 zeroclaw::memory"]
|
||||
M1["SQLite 存储 brain.db\n对话历史 任务状态 执行结果"]
|
||||
M2["Memory Trait 接口\ncreateMemoryWithStorage\n支持多种后端 SQLite/文件"]
|
||||
M1 -.->|"读写"| M2
|
||||
end
|
||||
|
||||
subgraph SkillMgmt["技能管理 Skills Management"]
|
||||
S1["技能加载器\nloadSkillsFromDirectory\n按目录扫描技能包"]
|
||||
S2["技能过滤器\n按浏览器可用性过滤\nbrowser_script 工具裁剪"]
|
||||
S3["ReadSkill Tool\n运行时按需读取技能详情\n支持 open_skills 配置"]
|
||||
S4["技能目录解析\nskills/ 默认目录\n自定义 skillsDir"]
|
||||
S1 --> S2
|
||||
S4 --> S1
|
||||
S1 --> S3
|
||||
end
|
||||
|
||||
subgraph RuntimeEngine["Runtime Engine 运行时引擎"]
|
||||
R1["Runtime Profile\nBrowserAttached / BrowserHeavy / GeneralAssistant"]
|
||||
R2["Tool Policy 工具策略\n按 Profile 维护工具白名单\nallowed_tools 列表"]
|
||||
R3["Agent Builder\n组装 Provider + Tools + Memory + Skills\n构建完整 Agent 实例"]
|
||||
R4["指令增强器\n附加浏览器合约提示\n检测知乎热榜/Excel导出/大屏任务"]
|
||||
R1 -->|"决定"| R2
|
||||
R2 -->|"约束"| R3
|
||||
R3 -->|"使用"| R4
|
||||
end
|
||||
|
||||
Memory -->|"注入"| RuntimeEngine
|
||||
SkillMgmt -->|"注入"| RuntimeEngine
|
||||
|
||||
classDef memFill fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||
classDef skillFill fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||
classDef runtimeFill fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||
|
||||
class Memory,M1,M2 memFill
|
||||
class SkillMgmt,S1,S2,S3,S4 skillFill
|
||||
class RuntimeEngine,R1,R2,R3,R4 runtimeFill
|
||||
</div>
|
||||
<div class="component-grid" style="margin-top:1.5rem">
|
||||
<div class="component-card">
|
||||
<h3><span class="badge">Internal</span>Memory module</h3>
<p><strong>Responsibility:</strong> Persists conversation history, task state, and execution results in SQLite (brain.db). Exposes a unified interface through the zeroclaw::memory::Memory trait and supports multiple storage backends.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">Created when the Agent is built; read and written before and after each LLM call</span></div>
<div class="meta-item"><span class="meta-label">Caller:</span><span class="meta-value">Runtime Engine (build_agent method)</span></div>
<div class="meta-item"><span class="meta-label">Storage path:</span><span class="meta-value">workspace/memory/brain.db</span></div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="component-card">
|
||||
<h3><span class="badge">Internal</span>Skill management system</h3>
<p><strong>Responsibility:</strong> Scans and loads skill packages from the skills/ directory (or a custom skillsDir), filters browser_script tools according to whether a browser is available, and lets the Agent read skill details on demand through the ReadSkill Tool. A separate open_skills skill directory can also be configured.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">The skill list is loaded each time an Agent is built</span></div>
<div class="meta-item"><span class="meta-label">Caller:</span><span class="meta-value">Runtime Engine (load_skills_for_surface)</span></div>
<div class="meta-item"><span class="meta-label">Skill source:</span><span class="meta-value">workspace/skills/ or the skillsDir setting</span></div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="component-card">
|
||||
<h3><span class="badge">Internal</span>Runtime Engine</h3>
<p><strong>Responsibility:</strong> The core runtime orchestrator. It derives the tool allowlist from the Runtime Profile, loads skills, injects Memory, and builds the Agent instance. It also performs instruction augmentation: appending the browser contract prompt and detecting specific task types such as Zhihu hot list, Excel export, and dashboard display.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">On every task submission, before the Agent is built</span></div>
<div class="meta-item"><span class="meta-label">Core methods:</span><span class="meta-value">build_agent() build_instruction()</span></div>
<div class="meta-item"><span class="meta-label">Profile:</span><span class="meta-value">BrowserAttached / BrowserHeavy / GeneralAssistant</span></div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Section 5: Task Routing - 4 Execution Paths -->
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">5</div>
|
||||
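<div class="section-title">Task Routing: a Decision Tree with Four Execution Paths</div>
</div>
<div class="section-body">
<div class="desc">
After a task is submitted to sgClaw, the <strong>Agent Runtime</strong> evaluates the execution paths in priority order. This is not a simple "fast vs. AI" choice but a <strong>four-level decision tree</strong>.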
|
||||
</div>
|
||||
<div class="mermaid">
|
||||
graph TB
|
||||
A["SubmitTask 用户指令进入"] --> B["1. deterministic_submit\n场景平台匹配"]
|
||||
B -->|"匹配已知确定场景"| C["确定性执行路径\ndeterministic_submit\n直接执行预设场景脚本"]
|
||||
B -->|"未匹配 非确定性"| D["2. Primary Orchestration\nzeroclaw process_message"]
|
||||
|
||||
D -->|"browser_surface_enabled\n且 should_use_primary"| E["主编排路径\nzeroclaw_process_message_primary\n完整 Agent 工具循环"]
|
||||
D -->|"不满足条件"| F["3. direct_submit_skill\n配置了直连技能"]
|
||||
|
||||
F -->|"directSubmitSkill已配置"| G["直连技能路径\ndirect_skill_primary\n绕过Agent直接执行"]
|
||||
F -->|"未配置"| H["4. compat_llm_primary\n标准LLM路径\nzeroclaw agent turn"]
|
||||
|
||||
C --> I["TaskComplete 返回结果"]
|
||||
E --> I
|
||||
G --> I
|
||||
H --> I
|
||||
|
||||
classDef routeFill fill:#e65100,stroke:#ff6d00,color:#fff
|
||||
classDef path1Fill fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
|
||||
classDef path2Fill fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||
classDef path3Fill fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||
classDef path4Fill fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||
classDef endFill fill:#484f58,stroke:#8b949e,color:#e6edf3
|
||||
|
||||
class B,D,F routeFill
|
||||
class C path1Fill
|
||||
class E path2Fill
|
||||
class G path3Fill
|
||||
class H path4Fill
|
||||
class I endFill
|
||||
</div>
|
||||
<div class="flow-container" style="margin-top:1.5rem">
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">1</div>
|
||||
<div class="flow-step-content">
|
||||
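<h4>Deterministic scenario matching: deterministic_submit</h4>
<p>The <span class="highlight">scene_platform</span> module scans the scene manifests (scene.toml) under skills/ and matches on instruction keywords, URL, and page title. On a match it builds a <span class="highlight">DeterministicExecutionPlan</span> and runs the scenario's preset browser script directly, <strong>with no LLM involvement</strong>. Typical cases are fixed flows such as line-loss queries and report exports.</p>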
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">2</div>
|
||||
<div class="flow-step-content">
|
||||
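<h4>Primary orchestration path: zeroclaw_process_message_primary</h4>
<p>When the Runtime Profile enables browser tools (browser_surface_enabled) and <span class="highlight">orchestration::should_use_primary</span> selects primary orchestration, sgClaw calls zeroclaw's process_message, the full Agent loop. The LLM may call every allowed tool (browser operations, skill tools, and so on), with multi-turn tool calls and dynamic planning.</p>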
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">3</div>
|
||||
<div class="flow-step-content">
|
||||
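<h4>Direct skill path: direct_skill_primary</h4>
<p>When <span class="highlight">directSubmitSkill</span> is set in the configuration (format: skillName.toolName), the normal Agent loop is bypassed and the named skill tool runs directly. This suits in-between cases that need a fixed flow but do not fit a deterministic scenario.</p>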
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">4</div>
|
||||
<div class="flow-step-content">
|
||||
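<h4>Standard LLM path: compat_llm_primary</h4>
<p>The default fallback when none of the three paths above applies. A standard zeroclaw Agent turn is created and the LLM decides on its own which tools to use. This is the most flexible path and also the slowest.</p>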
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Section 6: Browser Execution Full Process -->
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">6</div>
|
||||
<div class="section-title">Browser Execution End to End: How Commands Travel from sgClaw to the SuperRPA Browser</div>
</div>
<div class="section-body">
<div class="desc">
sgClaw has two browser backend modes: <strong>STDIO Pipe mode</strong> (the sgClaw process talks to the browser host over stdin/stdout) and <strong>WebSocket mode</strong> (a direct connection to the browser's DevTools WebSocket). In both modes, security checks are performed by the MAC Policy layer.
|
||||
</div>
|
||||
<div class="mermaid">
|
||||
graph TB
|
||||
subgraph PipeMode["STDIO Pipe 模式(嵌入SuperRPA)"]
|
||||
TE1["ZeroClawBrowserTool\n实现 zeroclaw::tools::Tool trait\n暴露 browser_action / superrpa_browser"]
|
||||
SC1["MAC Policy 安全策略\n检查 rules.json 域名白名单\n动作白名单 HMAC验证"]
|
||||
BC1["BrowserPipeTool\n分配 seq 计算 HMAC\n发送Command 等待Response"]
|
||||
TP1["StdioTransport\nJSON Line 协议\nstdin/stdout 1MB限制"]
|
||||
HOST1["浏览器宿主进程\nSuperRPA Chromium\n验证HMAC 执行DOM操作"]
|
||||
|
||||
TE1 -->|"tool call"| SC1
|
||||
SC1 -->|"校验通过"| BC1
|
||||
BC1 -->|"Command + HMAC"| TP1
|
||||
TP1 -->|"JSON Line"| HOST1
|
||||
HOST1 -->|"Response + HMAC"| TP1
|
||||
TP1 -->|"匹配 seq 返回"| BC1
|
||||
BC1 -->|"结果"| TE1
|
||||
end
|
||||
|
||||
subgraph WsMode["WebSocket 模式(独立运行)"]
|
||||
TE2["ZeroClawBrowserTool\n相同的 Tool 接口"]
|
||||
SC2["MAC Policy 相同的安全检查"]
|
||||
BC2["WsBrowserBackend\nWebSocket 连接\nDevTools Protocol"]
|
||||
WS1["WebSocket 协议层\ntungstenite 库"]
|
||||
HOST2["浏览器 DevTools\nChrome DevTools Protocol"]
|
||||
|
||||
TE2 -->|"tool call"| SC2
|
||||
SC2 -->|"校验通过"| BC2
|
||||
BC2 -->|"CDP Command"| WS1
|
||||
WS1 -->|"ws://host:port"| HOST2
|
||||
HOST2 -->|"CDP Response"| WS1
|
||||
WS1 -->|"结果"| BC2
|
||||
BC2 -->|"结果"| TE2
|
||||
end
|
||||
|
||||
classDef teFill fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||
classDef scFill fill:#2d1f3d,stroke:#9c27b0,color:#e6edf3
|
||||
classDef bcFill fill:#1a3a5c,stroke:#4a9eff,color:#e6edf3
|
||||
classDef tpFill fill:#484f58,stroke:#8b949e,color:#e6edf3
|
||||
classDef hostFill fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
|
||||
|
||||
class TE1,TE2 teFill
|
||||
class SC1,SC2 scFill
|
||||
class BC1,BC2 bcFill
|
||||
class TP1,WS1 tpFill
|
||||
class HOST1,HOST2 hostFill
|
||||
</div>
|
||||
<div class="component-grid" style="margin-top:1.5rem">
|
||||
<div class="component-card">
|
||||
<h3><span class="badge">Internal</span>ZeroClawBrowserTool</h3>
<p><strong>Responsibility:</strong> Implements the zeroclaw::tools::Tool trait and adapts BrowserBackend into a tool the LLM can call. It exposes two tool names: browser_action (legacy alias) and superrpa_browser (SuperRPA-specific, preferred).</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">When the LLM decides to operate the browser</span></div>
<div class="meta-item"><span class="meta-label">File:</span><span class="meta-value">compat/browser_tool_adapter.rs</span></div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="component-card">
|
||||
<h3><span class="badge">Internal</span>MAC Policy</h3>
<p><strong>Responsibility:</strong> Loads security rules from resources/rules.json and applies three layers of checks: (1) HMAC seed exchange during the handshake, (2) domain and action allowlist checks on the Rust side, and (3) a second HMAC verification on the host side. Domains outside the allowlist and disabled actions are rejected.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">Before every browser tool call</span></div>
<div class="meta-item"><span class="meta-label">Rules file:</span><span class="meta-value">resources/rules.json</span></div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="component-card">
|
||||
<h3><span class="badge">Internal</span>BrowserBackend</h3>
<p><strong>Responsibility:</strong> The unified browser operation interface (BrowserBackend trait). It has two implementations: PipeBrowserBackend (talks to the host through StdioTransport) and WsBrowserBackend (connects to DevTools directly over WebSocket). The BrowserBackend configuration setting selects which one is used.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">Backend types:</span><span class="meta-value">SuperRpa / AgentBrowser / RustNative / ComputerUse / Auto</span></div>
<div class="meta-item"><span class="meta-label">Files:</span><span class="meta-value">browser/pipe_backend.rs browser/ws_backend.rs</span></div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="component-card">
|
||||
<h3><span class="badge">Internal</span>BrowserPipeTool</h3>
<p><strong>Responsibility:</strong> The privileged browser tool for STDIO Pipe mode. It assigns each command a monotonically increasing seq, computes an HMAC with the derived session key, sends the Command message, and then blocks until the matching Response arrives or a timeout expires.</p>
<div class="meta">
<div class="meta-item"><span class="meta-label">When called:</span><span class="meta-value">On every browser operation in Pipe mode</span></div>
<div class="meta-item"><span class="meta-label">File:</span><span class="meta-value">pipe/browser_tool.rs</span></div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Section 7: External Systems -->
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">7</div>
|
||||
<div class="section-title">External Systems: What sgClaw Interacts With</div>
</div>
<div class="section-body">
<div class="desc">
sgClaw does not run in isolation; it works together with several <strong>external systems</strong>. The diagram below shows how sgClaw interacts with them.
|
||||
</div>
|
||||
<div class="mermaid">
|
||||
graph TB
|
||||
subgraph External["外部系统 - sgClaw不控制这些系统"]
|
||||
E1["LLM 提供商\nDeepSeek OpenAI Claude\nHTTP API 调用"]
|
||||
E2["SuperRPA Chromium\n浏览器宿主进程\nSTDIO 或 WebSocket"]
|
||||
E3["业务系统\n线损系统 客服系统\n通过浏览器访问"]
|
||||
E4["客户端\nsg_claw_client CLI\nService WebSocket 连接"]
|
||||
end
|
||||
|
||||
subgraph sgClawInternal["sgClaw 内部"]
|
||||
S1["通信网关\nSTDIO Pipe / Service WS"]
|
||||
S2["Agent Runtime\ntask_runner 任务调度"]
|
||||
S3["Runtime Engine\n构建Agent 工具策略"]
|
||||
S4["ZeroClaw Core\nthird_party/zeroclaw\nAgent循环 工具循环"]
|
||||
S5["MAC Policy\n安全策略 rules.json"]
|
||||
S6["Browser Backend\nPipeBrowser / WsBrowser"]
|
||||
end
|
||||
|
||||
E4 -->|"SubmitTask"| S1
|
||||
S1 -->|"TaskComplete / LogEntry"| E4
|
||||
|
||||
S2 -->|"构建 Agent"| S3
|
||||
S3 -->|"build_agent"| S4
|
||||
|
||||
S4 -->|"发送Prompt 接收响应"| E1
|
||||
S4 -->|"调用工具"| S5
|
||||
S5 -->|"校验通过"| S6
|
||||
S6 -->|"浏览器命令"| E2
|
||||
E2 -->|"DOM操作"| E3
|
||||
E3 -->|"页面数据"| E2
|
||||
E2 -->|"命令结果"| S6
|
||||
S6 -->|"结果"| S4
|
||||
S4 -->|"事件桥接 log_entry"| S1
|
||||
|
||||
classDef extFill fill:#1f3d2d,stroke:#4caf50,color:#e6edf3
|
||||
classDef intFill fill:#4a2c17,stroke:#e65100,color:#e6edf3
|
||||
|
||||
class External,E1,E2,E3,E4 extFill
|
||||
class sgClawInternal,S1,S2,S3,S4,S5,S6 intFill
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Section 8: Complete Lifecycle -->
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-number">8</div>
|
||||
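<div class="section-title">Full Lifecycle: One Task from Start to Finish</div>
</div>
<div class="section-body">
<div class="desc">
Using a real scenario, <strong>"帮我查本月线损率并导出Excel"</strong> (check this month's line-loss rate and export it to Excel), this section walks through the full lifecycle from receiving the instruction to returning the result.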
|
||||
</div>
|
||||
<div class="flow-container">
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">1</div>
|
||||
<div class="flow-step-content">
|
||||
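<h4><span class="highlight">Communication gateway</span> receives the instruction</h4>
<p>The browser host process sends a SubmitTask message over STDIO (JSON Lines protocol). sgClaw creates a session and parses the instruction, page_url, page_title, and conversation_id.</p>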
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">2</div>
|
||||
<div class="flow-step-content">
|
||||
<h4><span class="highlight">Load configuration</span> SgClawSettings</h4>
<p>Configuration is loaded from sgclaw_config.json or environment variables: the LLM provider (apiKey/baseUrl/model), runtimeProfile, skillsDir, directSubmitSkill, and so on.</p>
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">3</div>
|
||||
<div class="flow-step-content">
|
||||
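<h4><span class="highlight">Deterministic scenario matching</span> deterministic_submit</h4>
<p>sgClaw scans the scene manifests (scene.toml) under skills/, finds that the instruction contains the keywords "线损率" (line-loss rate) and "本月" (this month), and matches the "line-loss query" scenario. It then builds a DeterministicExecutionPlan with parameters such as target_url, org_code, and period_mode.</p>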
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">4</div>
|
||||
<div class="flow-step-content">
|
||||
<h4><span class="highlight">MAC Policy</span> security check</h4>
<p>Checks that the target domain is in the rules.json allowlist → pass. Checks that the operation types (navigate, click, getText) are in the action allowlist → pass.</p>
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">5</div>
|
||||
<div class="flow-step-content">
|
||||
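<h4><span class="highlight">BrowserPipeTool</span> runs the browser commands</h4>
<p>Each command gets a monotonically increasing seq and an HMAC computed with the derived session key, and is sent to the browser host as a Command message through StdioTransport. The sequence executed is: navigate to the line-loss system → select the month → click Query → read the table data.</p>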
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">6</div>
|
||||
<div class="flow-step-content">
|
||||
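<h4><span class="highlight">SuperRPA Chromium</span> performs the DOM operations</h4>
<p>The browser host receives the Command, verifies the HMAC, performs the actual DOM operations (navigation, dropdown selection, button clicks, reading table content), and returns a Response containing the result plus an HMAC.</p>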
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">7</div>
|
||||
<div class="flow-step-content">
|
||||
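<h4><span class="highlight">Report Artifact</span> post-processing</h4>
<p>The table data returned by the browser is parsed into a structured form. Following the scenario's postprocess configuration, the openxml_office tool generates an .xlsx file. The result includes the local file path.</p>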
|
||||
</div>
|
||||
</div>
|
||||
<div class="flow-step">
|
||||
<div class="flow-step-number">8</div>
|
||||
<div class="flow-step-content">
|
||||
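<h4><span class="highlight">Communication gateway</span> returns the result</h4>
<p>A TaskComplete message is sent to the browser host through StdioTransport, carrying success=true and an execution summary that includes the path of the generated .xlsx file. The browser host then tells the user the file is ready.</p>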
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
</div>
|
||||
<div class="footer">sgClaw Intelligent Browser Automation Platform - Component Responsibilities and Flow Overview - April 2026</div>
|
||||
<script>
|
||||
mermaid.initialize({ startOnLoad:true, theme:'dark', securityLevel:'loose', logLevel:'warn' });
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
BIN
docs/sgClaw组件职责与流转全景图.pdf
Normal file
Binary file not shown.
314
docs/sgClaw项目现有优势与下一步计划汇报稿.md
Normal file
@@ -0,0 +1,314 @@
|
||||
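# sgClaw Project: Current Strengths and Next-Step Plan (Briefing)

## I. Current Strengths

Compared with traditional openclaw-style automation projects, the strength of this project is not merely that it "can operate pages"; it already has the foundation to evolve toward enterprise-grade, long-term use. Put simply, traditional solutions resemble "a collection of automation scripts that happen to run", whereas this project resembles "an intelligent execution base with security boundaries, unified control, and room for long-term evolution".

Judging from the commits that have landed since, these strengths are no longer only a matter of "the right architectural direction". They now show up as delivered capabilities, including:

- Runtime policy has moved from hard-coded logic to configuration-driven behavior.
- A planner-first mode (plan first, then execute) is supported.
- Skill-package-driven browser script execution is supported.
- Task-specific Excel export and screen-display export are supported.
- Version-level logs, skill-version logs, and real acceptance records are in place.

This means the project has moved beyond a "prototype base" into a stage that is verifiable, extensible, and deliverable.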
|
||||
|
||||
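### 1. From a "script tool" to a "unified execution base"

Traditional projects usually keep one set of scripts per scenario and one set of rules per system. That solves the immediate problem, but it is hard to reuse, manage, or evolve: when a page changes the script must change, and maintenance cost climbs quickly as systems multiply.

This project already places task intake, runtime control, browser execution, log reporting, and result reporting in one framework. Adding capabilities, extending scenarios, or swapping models and policies therefore builds on the same base instead of starting over. For the business side, the value no longer stops at "getting one flow to work"; it accumulates into shared capability that can support more business lines.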
|
||||
|
||||
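### 2. A more complete security design, better suited to enterprise intranets

This is the project's most prominent strength. The biggest problem with traditional automation is that once a script can operate pages, it tends to "click whatever it wants": the security boundary is unclear and risk is hard to control. This project no longer works that way; browser capability sits inside a strictly controlled execution boundary.

For briefing purposes, the current security design can be summarized as the following groups of capabilities: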
|
||||
|
||||
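#### 3 layers of security

Layer 1 is the startup gate. The browser host and the runtime must complete a handshake first; on a version mismatch, timeout, or out-of-order step, the system never enters normal operation.

Layer 2 is the runtime policy check. Before any page action actually runs, the runtime checks that the action is legal and that the target page is within the allowed scope.

Layer 3 is the host-side second check. Even after the runtime has issued a command, the browser host validates it again locally, so an abnormal command cannot take effect.

Stacked together, these three layers mean that the model does not simply do what it wants; every step has to pass a checkpoint.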
|
||||
|
||||
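#### 6 hard protocol constraints

The protocol layer currently enforces six rigid requirements:

1. The transport format is fixed; arbitrary messages are not accepted.
2. The encoding is fixed, which avoids parsing anomalies.
3. Each message has a size limit, so abnormal data cannot overwhelm the system.
4. Sequence numbers must increase strictly: no repeats, no reordering.
5. Every critical command must carry its security fields.
6. Each request maps to exactly one response; replies cannot be mixed up.

In other words, the system does not "send a command and see what happens". Every message follows strict rules, which suits stable operation in an enterprise environment.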
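
As an illustration of constraints 2 to 4, here is a minimal sketch of an admission check for incoming frames. It is not the project's implementation: `FrameGate`, `FrameError`, and `MAX_FRAME_BYTES` are hypothetical names, UTF-8 is an assumed encoding, and the 1 MB limit is taken from the StdioTransport note in the architecture diagram.

```rust
/// Assumed per-message limit (1 MB, per the StdioTransport note); the name is hypothetical.
const MAX_FRAME_BYTES: usize = 1024 * 1024;

#[derive(Debug, PartialEq)]
enum FrameError {
    TooLarge { len: usize },
    NotUtf8,
    SeqNotIncreasing { last: u64, got: u64 },
}

/// Tracks the last accepted sequence number for one session.
struct FrameGate {
    last_seq: Option<u64>,
}

impl FrameGate {
    fn new() -> Self {
        FrameGate { last_seq: None }
    }

    /// Accepts a raw frame only if it is within the size limit, is valid UTF-8
    /// (assumed encoding), and has a seq strictly greater than the last accepted one.
    fn admit(&mut self, seq: u64, raw: &[u8]) -> Result<(), FrameError> {
        if raw.len() > MAX_FRAME_BYTES {
            return Err(FrameError::TooLarge { len: raw.len() });
        }
        if std::str::from_utf8(raw).is_err() {
            return Err(FrameError::NotUtf8);
        }
        if let Some(last) = self.last_seq {
            if seq <= last {
                return Err(FrameError::SeqNotIncreasing { last, got: seq });
            }
        }
        // Only a fully valid frame advances the sequence counter.
        self.last_seq = Some(seq);
        Ok(())
    }
}
```

A rejected frame leaves the counter untouched, so a later valid frame with the same seq is still admitted.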
|
||||
|
||||
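#### 2 kinds of allowlist

At least two core allowlists are in effect at the same time.

The first is the domain allowlist. Only permitted business domains can be operated on; not every page in the browser is fair game.

The second is the action allowlist. Only permitted action types can run; a script being able to express an action does not mean it may execute it.

The point of the allowlists is that the system decides in advance which pages may be touched and which actions may be performed, rather than leaving that decision entirely to the model or the script.

#### 1 explicit blocklist

Beyond the allowlists, the project also keeps explicit block entries. The rule is not only "anything not allowed is refused" but also "high-risk actions are explicitly forbidden".

This matters a great deal in enterprise settings, because some capabilities are not "unused for now" but "not to be opened up on principle". With a blocklist, the design prevents high-risk capabilities from leaking out.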
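
The interplay of the two allowlists and the blocklist can be sketched as follows. This is an assumption-laden illustration, not the actual MAC Policy code: the `Policy` and `Verdict` types, the `set` helper, and the order of checks are hypothetical.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum Verdict {
    Allow,
    DenyBlockedAction,
    DenyUnknownAction,
    DenyDomain,
}

/// Hypothetical in-memory form of the rules; the real rules file layout is not shown here.
struct Policy {
    allowed_domains: HashSet<String>,
    allowed_actions: HashSet<String>,
    blocked_actions: HashSet<String>,
}

impl Policy {
    fn check(&self, action: &str, domain: &str) -> Verdict {
        // An explicit block wins even if the action also appears in the allowlist.
        if self.blocked_actions.contains(action) {
            return Verdict::DenyBlockedAction;
        }
        if !self.allowed_actions.contains(action) {
            return Verdict::DenyUnknownAction;
        }
        if !self.allowed_domains.contains(domain) {
            return Verdict::DenyDomain;
        }
        Verdict::Allow
    }
}

/// Small helper for building string sets in examples.
fn set(items: &[&str]) -> HashSet<String> {
    items.iter().map(|s| s.to_string()).collect()
}
```

Checking the blocklist first keeps a misconfigured allowlist from re-enabling an action that is forbidden on principle.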
|
||||
|
||||
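#### 5 actions allowed by default

Five actions are currently open to the runtime by default:

1. Click
2. Type
3. Navigate
4. Read text
5. Controlled script execution

The key point is not that "one more action was added" but that the new capability does not break the security boundary. It does not open up arbitrary page scripting; it gives skill packages a stronger yet still controllable way to execute, inside the existing controlled protocol and validation chain.

This may look like fewer actions than some traditional solutions offer, but its value lies precisely in the clear boundary. Getting the most stable, controllable, and auditable core actions right first and extending gradually is safer than opening a large set of high-risk actions from the start.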
|
||||
|
||||
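#### 7 domains allowed by default

The domains allowed by default in the current rules form a finite set; not every web page in the browser can be touched. This matches the real needs of an enterprise intranet. For office systems, ERP, OA, and similar scenarios, what is needed is not "automation across the whole web" but "controllable automation within a clearly defined scope".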
|
||||
|
||||
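#### 1 HMAC signing mechanism

Critical commands are never sent bare; they carry a signature that is verified. Think of it as an "anti-counterfeit mark" on every critical operation.

The benefit is that commands are hard to forge, tamper with, or replay in transit, which makes the chain far more secure than ordinary scripts calling page interfaces directly.

#### 1 sequence-number mechanism

Every command has a strictly increasing sequence number, and one sequence number maps to exactly one result.

This lets the system know exactly which operation a given result belongs to, avoiding mixed-up, out-of-order, or mismatched replies and improving stability and traceability.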
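
The one-result-per-sequence-number rule can be sketched with a pending-command table. This is a simplified illustration under assumptions: `PendingCommands` and `MatchError` are hypothetical names, and HMAC verification and timeouts are left out.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum MatchError {
    /// The seq was never issued, or its response was already consumed.
    UnknownSeq(u64),
}

/// Issues strictly increasing seq values and matches each response exactly once.
struct PendingCommands {
    next_seq: u64,
    pending: HashMap<u64, String>,
}

impl PendingCommands {
    fn new() -> Self {
        PendingCommands { next_seq: 1, pending: HashMap::new() }
    }

    /// Registers an outgoing command and returns the seq assigned to it.
    fn send(&mut self, action: &str) -> u64 {
        let seq = self.next_seq;
        self.next_seq += 1;
        self.pending.insert(seq, action.to_string());
        seq
    }

    /// Consumes the pending entry, so a duplicate response for the same seq is rejected.
    fn complete(&mut self, seq: u64) -> Result<String, MatchError> {
        self.pending.remove(&seq).ok_or(MatchError::UnknownSeq(seq))
    }
}
```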
|
||||
|
||||
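#### 3 constraints on script execution

Later commits added skill-package-driven browser scripting. This does not "open page execution wide"; it adds one controlled capability inside the existing security boundary.

It can be understood as three constraints:

1. Scripts must come from a managed path inside the skill package and cannot read files outside the skill directory.
2. Execution must declare a target domain and cannot run detached from a page context.
3. Scripts still run through the existing browser pipe and action allowlist instead of bypassing the host.

This design matters because it shows that, while adding capability, the project still insists that new capability stays inside the security boundary rather than widening the opening for convenience.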
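
Constraint 1 (scripts must stay inside the skill directory) can be illustrated with a purely lexical path check. This is a sketch under assumptions: `resolve_skill_script` is a hypothetical helper, and it does not resolve symlinks, which a real implementation would also need to consider.

```rust
use std::path::{Component, Path, PathBuf};

/// Joins `relative` onto `skill_root`, refusing anything that could leave the root:
/// absolute paths, drive prefixes, and `..` segments all return `None`.
fn resolve_skill_script(skill_root: &Path, relative: &str) -> Option<PathBuf> {
    if relative.is_empty() {
        return None;
    }
    let mut resolved = skill_root.to_path_buf();
    for component in Path::new(relative).components() {
        match component {
            Component::Normal(part) => resolved.push(part),
            Component::CurDir => {} // "./" is harmless and simply skipped
            // ParentDir, RootDir, and Prefix could escape the skill directory.
            _ => return None,
        }
    }
    Some(resolved)
}
```

Rejecting `..` outright, instead of normalizing it, keeps the check simple and errs on the side of refusing ambiguous input.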
|
||||
|
||||
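#### 5 error-handling strategies

The system does not simply crash on failure; it classifies errors and handles each class differently.

- Some errors are not retried and fail immediately.
- Some errors may be retried a limited number of times.
- Some errors wait for configuration or manual confirmation.
- Some errors trip a circuit breaker.
- Every failure must be returned in a structured form so the problem can be located.

This is far more mature than the traditional "it errored, rerun it by hand" approach.

#### 1 circuit-breaker threshold

If the same action fails consecutively more times than the threshold, the system stops trying and notifies the UI instead of repeating indefinitely.

This prevents repeated clicks, submissions, and operations while in an error state, reducing business risk and the cost of mistaken operations.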
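
The per-action breaker described above can be sketched as a consecutive-failure counter. The names `ActionBreaker` and `BreakerState` are hypothetical, and the real threshold value is not stated in this document, so it is left as a constructor argument.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum BreakerState {
    /// Further attempts are allowed.
    Closed,
    /// Consecutive failures exceeded the threshold; stop and notify the UI.
    Open,
}

struct ActionBreaker {
    threshold: u32,
    consecutive_failures: HashMap<String, u32>,
}

impl ActionBreaker {
    fn new(threshold: u32) -> Self {
        ActionBreaker { threshold, consecutive_failures: HashMap::new() }
    }

    /// Records one outcome for `action` and reports whether the breaker is now open.
    fn record(&mut self, action: &str, succeeded: bool) -> BreakerState {
        let count = self.consecutive_failures.entry(action.to_string()).or_insert(0);
        if succeeded {
            *count = 0; // any success resets the streak
        } else {
            *count += 1;
        }
        if *count > self.threshold {
            BreakerState::Open
        } else {
            BreakerState::Closed
        }
    }
}
```

Counting per action name means a failing `click` does not block an unrelated `navigate`.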
|
||||
|
||||
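#### 7 integration acceptance criteria

The project has written down what counts as a successful integration, including:

1. Handshake success-rate requirement
2. Failure handling on version mismatch
3. Handling of abnormal sequence-number cases
4. Interception of oversized messages
5. Success-rate requirement for core actions
6. Structured error-return requirement
7. End-to-end log traceability

This shows the project does not "get by on experience"; it has begun to form engineering standards that can be replicated, accepted, and delivered.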
|
||||
|
||||
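### 3. The browser is only an execution surface and no longer defines the system

A common problem in traditional openclaw-style projects is that browser capability becomes so dominant that the whole system revolves around page scripts, and the scripts effectively become the system.

This project explicitly defines the browser as a "protected, privileged execution surface" rather than the runtime itself. Extending to other tool surfaces or execution surfaces later therefore does not require overturning the architecture; the system has more room to evolve and a clearer overall structure.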
|
||||
|
||||
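### 4. Runtime behavior has moved from hard-coded logic to configuration

This is an important advance in the later commits. Traditional projects often hard-code models, policies, modes, and environment differences, so any later change ripples through the whole system.

This project has now moved a set of key decisions into runtime configuration, including:

- Which model provider to use
- Which provider is currently active
- Which planner mode to use
- Which runtime profile to apply
- Which backend serves browser capability
- Which backend serves Office export
- Which directory skills are loaded from

For briefing purposes, this can be summarized as:

- 1 unified runtime config
- 3 runtime profiles
- Multiple switchable runtime policies

The system no longer just "runs however the code was written"; it is entering a stage where it can switch flexibly by environment, task, and scenario, which suits real enterprise deployment.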
|
||||
|
||||
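### 5. A closed loop of "plan, execute, then produce results"

Traditional automation projects tend to operate on pages immediately, with little explainability in between, which hinders later audit and governance.

Later commits strengthened the planner-first mode: before executing, the system first presents a plan, then executes according to it, then outputs the result. For business and management the value is direct:

- It is easier to understand what the system is about to do
- It is easier to check whether execution drifts from the goal
- It is easier to link plan, execution, and result into a closed loop

The project is also no longer a single simple browser tool; a clearer division of capabilities is emerging, for example:

- `superrpa_browser` handles controlled browser operations
- `openxml_office` handles result export
- `screen_html_export` handles export of display-oriented artifacts

This shows the project is moving from "one browser operation entry point" toward "a toolchain organized around business results".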
|
||||
|
||||
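### 6. Skills are moving from "prompt descriptions" to "executable capability packages"

This is an important practical advantage over traditional openclaw-style projects. In many of those projects a "skill" is mostly a prompt template, and real execution falls back to piles of page scripts.

Later commits support skill-package-driven browser script execution. Simply put, a skill no longer just "tells the model how to do it"; it can ship with deterministic script capability. This has several clear benefits:

1. More stable capability
   Key steps need not depend entirely on the model improvising; packaged scripts can carry them out.

2. Better reusability
   The same skill package can be reused across similar scenarios without reorganizing page operations each time.

3. Better suited to building enterprise assets
   Many high-value flows can gradually be distilled from "prompt know-how" into "reusable skill packages".

The project is thus moving from an "intelligent execution framework" to an "intelligent execution framework plus reusable skill assets".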
|
||||
|
||||
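### 7. The frontend only displays; it holds no execution authority

In traditional projects the frontend, scripts, and execution logic are often mixed together, and the UI ends up hiding many business decisions. That is quick to build in the short term but risky and costly to maintain in the long term.

This project restricts the frontend to a presentation layer. It shows status, logs, plans, and results, and does not decide whether to execute, how to switch models, or how to get around the security boundary. The structure is clearer, and later maintenance and upgrades are less likely to get out of control.

In addition, later commits support loading an external frontend bundle first, with built-in resources as a fallback. Changing the UI or display logic no longer requires rebuilding the browser host each time, which improves both development and delivery efficiency.

### 8. Stronger configurability, better suited to business rollout

Traditional projects tend to hard-code key logic in scripts, so every change means changing code. That is inefficient, and a local change can easily ripple through the whole.

This project has begun pulling runtime, model, and policy configuration out of the code, making responsibilities among host, runtime, and frontend clearer. Future business adjustments, model switches, and policy upgrades can therefore proceed smoothly instead of requiring a large rework each time.

Later commits also strengthened source-checkout launch wrappers and rules synchronization, which matters for the development team. The project suits not only binary delivery but also continuous integration testing and fast iteration from source.

### 9. Better suited to long-term asset accumulation

A common problem with traditional automation is that once a flow is done its value largely ends, and experience rarely accumulates into assets.

This project is different. Its direction is to turn execution capability, rules, security boundaries, logging, and later element recognition into reusable assets step by step. For an enterprise this is worth far more than "getting one flow to run today", because it determines whether future work becomes faster, more stable, and cheaper.

This now has concrete support. The project does more than "operate pages"; it can already solidify skills, scripts, export flows, runtime policies, and logging standards. When the "hybrid self-healing selector" and the element fingerprint library are built later, these will naturally form the base layer for asset accumulation.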
|
||||
|
||||
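### 10. Stronger observability and a basis for run-level audit

Traditional projects often print a log only on failure, so when something goes wrong it is hard to know what the system actually did.

Later commits added a set of important run-level log fields, including:

- runtime version
- protocol version
- configuration source
- resolved skills directory
- runtime profile
- skills prompt mode
- loaded skills and their versions
- current execution mode

The value is direct: the system can begin to explain how it is running. For development, testing, acceptance, and later audit, this is an essential foundation.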
|
||||
|
||||
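### 11. "Real acceptance" rather than "concept demos"

In later commits the project has left fuller acceptance records instead of stopping at capability descriptions in documents. Taking the Zhihu hot-list Excel export as an example, a real acceptance result now exists, covering:

- A run in real provider mode
- Live hot-list data collection
- Structured result export
- Excel file generation
- Acceptance scoring

The project is therefore no longer "possible in theory"; it has shown on a real task chain that it can run end to end, produce output, and leave an acceptance record.

This matters for external reporting, because it marks the move from "envisioned capability" to "verified capability".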
|
||||
|
||||
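### 12. A stronger engineering foundation

The project is no longer a validation page or a prototype; its structure centers on the runtime kernel, protocol, rules, and tests. It is closer to "a product kernel that can be built on sustainably" than to "an automation effect that demos well".

In the current repository there are more than 20 top-level test files covering the protocol, handshake, runtime, configuration, compatibility layer, export tools, skill execution, and acceptance scoring. The project is heading toward sustainable delivery and verification rather than remaining an ad hoc script project.

In the long run, this engineering capability determines whether the project can reach production, be used collaboratively by more teams, and keep extending its capabilities.

### 13. The current strengths in one sentence

In non-technical terms, the project's biggest current strength can be summarized as:

Not "better at clicking web pages", but "already equipped with the security boundaries, control, real delivery capability, stable foundation, and room for long-term evolution that an enterprise-grade intelligent execution system should have".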
|
||||
|
||||
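## II. Next-Step Plan

The focus of the next phase is not to pile up more scripts but to tackle the old problem of "the page changes and everything breaks", while raising the project from "able to execute tasks" to "able to keep accumulating enterprise-grade automation assets".

### 1. Develop the "Hybrid Self-Healing Selector"

In the intranet environment, gradually remove the dependence on a single XPath and build an enterprise-grade element fingerprint library, so the system can still find the target element reliably after a page changes.

### 2. Define the element fingerprint data structure (JSON)

Create a "digital profile" for every operable element, with these core fields:

- Semantic text
- A11y role
- Relative spatial position
- Attribute hash
- Fallback XPath
- Visual snippet (small Base64 image)

When locating an element, the system then no longer relies on a single path; it works more like multi-feature recognition.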
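
To make the "multi-feature recognition" idea concrete, here is a hedged sketch of a fingerprint record and a naive matching score. The field names, the position encoding, the 0.05 tolerance, and the equal weighting are all assumptions for illustration; the plan above fixes only the list of features, and the visual snippet is stored but not compared here.

```rust
/// Hypothetical in-memory form of one element fingerprint.
#[derive(Debug, Clone, Default)]
struct ElementFingerprint {
    semantic_text: Option<String>,
    a11y_role: Option<String>,
    /// Position relative to the viewport, each coordinate in 0.0..=1.0 (assumed encoding).
    relative_position: Option<(f32, f32)>,
    attribute_hash: Option<u64>,
    fallback_xpath: Option<String>,
    /// Base64-encoded thumbnail; not used by `match_score` in this sketch.
    visual_snippet_b64: Option<String>,
}

/// Counts how many features of `stored` are also present and agreeing in `candidate`.
fn match_score(stored: &ElementFingerprint, candidate: &ElementFingerprint) -> u32 {
    // A feature scores only when both sides have it and the values are equal.
    fn same<T: PartialEq>(a: &Option<T>, b: &Option<T>) -> u32 {
        match (a, b) {
            (Some(x), Some(y)) if x == y => 1,
            _ => 0,
        }
    }
    // Positions agree when both coordinates are within an assumed tolerance of 0.05.
    let position = match (stored.relative_position, candidate.relative_position) {
        (Some((x1, y1)), Some((x2, y2)))
            if (x1 - x2).abs() <= 0.05 && (y1 - y2).abs() <= 0.05 => 1,
        _ => 0,
    };
    same(&stored.semantic_text, &candidate.semantic_text)
        + same(&stored.a11y_role, &candidate.a11y_role)
        + same(&stored.attribute_hash, &candidate.attribute_hash)
        + same(&stored.fallback_xpath, &candidate.fallback_xpath)
        + position
}
```

With a score like this, a candidate whose XPath and attribute hash changed can still be preferred over other elements if its text, role, and position agree.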
|
||||
|
||||
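### 3. Advance the "Shadow Recording" mechanism

While existing traditional RPA runs normally, enable a shadow mode underneath. Whenever an old script successfully hits an element through XPath and completes the operation, the full fingerprint of that element is captured silently in the background and written to a local database.

This keeps building a high-value element asset library without adding much manual recording cost.

### 4. Build penetration-layer capability

Use the low-level privileges of the customized Chromium to solve locating problems in complex page structures such as `iframe` and closed `Shadow DOM`, giving the self-healing selector stronger support.

## III. Expected Results

After the next phase, the project will go beyond "able to execute tasks" and become:

- More stable on complex enterprise pages
- Less sensitive to page changes
- Better at continuously accumulating high-value assets
- Better suited to long-term rollout in enterprise environments

From a business perspective, the project's value will also rise from "automating individual flows" to "building an enterprise-grade intelligent execution base".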
|
||||
BIN
docs/sgClaw项目现有优势与下一步计划汇报稿.pdf
Normal file
Binary file not shown.
425
docs/superpowers/plans/2026-04-01-claw-ws-execution-cards.md
Normal file
@@ -0,0 +1,425 @@
|
||||
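# Claw-WS Development Execution Order Cards

> Companion plan: [`2026-04-01-claw-ws-parallel-transport.md`](./2026-04-01-claw-ws-parallel-transport.md)
>
> How to use: follow the cards strictly in order. After finishing a card, run that card's tests before moving on. Do not skip cards, do not wire things up early, and do not write service/client first and then go back to abstract the lower layers.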
|
||||
|
||||
---
|
||||
|
||||
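## Card 0: Pre-execution constraints

**Goal**
Lock the boundaries first so that pipe mode is not broken during implementation.

**Must follow**
- The existing pipe mode must stay usable
- What is added is a parallel `claw-ws` mode, not a replacement for pipe
- v1 supports a single client and serial single-task execution only
- The external names `browser_action` / `superrpa_browser` stay stable
- If WS `Eval` is incomplete, disable exposure of the related browser-script skills first
- Do not build multi-client support, task queues, or management APIs ahead of time

**Done when**
- The developer is clear that all later changes revolve around "abstract for reuse + add in parallel"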
|
||||
|
||||
---
|
||||
|
||||
## Card 1: Extract the shared SubmitTask runner

**Goal**
Pull the main execution logic of the current `BrowserMessage::SubmitTask` out of the pipe entry point and turn it into a shared runner.

**Do first**
1. Add `tests/task_runner_test.rs`
2. Write the failing cases first:
   - Empty instruction
   - No LLM configuration
   - Log order is still `LogEntry` -> `TaskComplete`

**Files to change**
- `src/agent/mod.rs`
- `src/lib.rs`
- `src/agent/task_runner.rs`
- `tests/task_runner_test.rs`

**Implementation steps**
1. Create `SubmitTaskRequest`
2. Create `AgentEventSink`
3. Create `run_submit_task(...)`
4. Make the pipe entry point do only this:
   - Unpack `BrowserMessage::SubmitTask`
   - Convert it into a `SubmitTaskRequest`
   - Call the shared runner
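
The shape of the shared runner can be sketched as below. Only the names `SubmitTaskRequest`, `AgentEventSink`, and `run_submit_task` come from this card; the fields, the `AgentEvent` enum, the `RecordingSink` test double, and the messages are assumptions made for illustration, and the actual agent work is omitted.

```rust
#[derive(Debug, Clone, PartialEq)]
enum AgentEvent {
    LogEntry(String),
    TaskComplete { success: bool, summary: String },
}

/// Transport-independent request; the fields here are assumed.
struct SubmitTaskRequest {
    instruction: String,
    llm_configured: bool,
}

/// Anything that can receive events: a pipe writer, a WS session, or a test recorder.
trait AgentEventSink {
    fn emit(&mut self, event: AgentEvent);
}

/// Test double that records events in arrival order.
#[derive(Default)]
struct RecordingSink {
    events: Vec<AgentEvent>,
}

impl AgentEventSink for RecordingSink {
    fn emit(&mut self, event: AgentEvent) {
        self.events.push(event);
    }
}

/// Emits a log entry first and exactly one `TaskComplete` last, whatever the outcome.
fn run_submit_task(request: &SubmitTaskRequest, sink: &mut dyn AgentEventSink) {
    sink.emit(AgentEvent::LogEntry("task received".to_string()));
    let (success, summary) = if request.instruction.trim().is_empty() {
        (false, "instruction is empty")
    } else if !request.llm_configured {
        (false, "no LLM provider configured")
    } else {
        // The real agent loop would run here; this sketch just reports success.
        (true, "task finished")
    };
    sink.emit(AgentEvent::TaskComplete { success, summary: summary.to_string() });
}
```

Because the runner only sees the sink trait, the pipe entry point and a later service entry point can both call it without knowing about each other.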
|
||||
|
||||
**Do not, under any circumstances**
- Introduce the ws backend in this card
- Change the tool adapter
- Touch service/client

**Test command for this card**

```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```

**Pass criteria**
- The existing `runtime_task_flow_test` stays green
- The new `task_runner_test` passes
- Pipe behavior is unchanged

**Commit when done**

```bash
git commit -m "refactor: extract shared submit-task runner"
```
|
||||
|
||||
---
|
||||
|
||||
## Card 2: Extract the BrowserBackend abstraction

**Goal**
Decouple the upper-layer runtime / orchestration / tool adapter from `BrowserPipeTool<T>` so they all depend on a browser backend interface.

**Do first**
1. Add `tests/browser_backend_capability_test.rs`
2. Write the failing cases first:
   - Pipe backend metadata is unchanged
   - Pipe backend supports `Eval`
   - Browser-script tools are not exposed when `supports_eval() == false`

**Files to change**
- `src/browser/mod.rs`
- `src/browser/backend.rs`
- `src/browser/pipe_backend.rs`
- `src/compat/browser_tool_adapter.rs`
- `src/compat/browser_script_skill_tool.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
- `src/lib.rs`
- `tests/browser_backend_capability_test.rs`

**Implementation steps**
1. Define `BrowserBackend`
2. Write `pipe_backend` as a wrapper around the current `BrowserPipeTool`
3. Change upper-layer signatures to `Arc<dyn BrowserBackend>`
4. Keep the tool names unchanged:
   - `browser_action`
   - `superrpa_browser`
5. Add `supports_eval()` gating
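
The gating in step 5 might look roughly like this. The trait is cut down to the two methods needed for the illustration, and `PipeBackend`, `LimitedWsBackend`, `exposed_tools`, and the `browser_script` tool name as used here are assumptions rather than the real definitions.

```rust
use std::sync::Arc;

/// Reduced stand-in for the real trait: only capability reporting is shown.
trait BrowserBackend {
    fn name(&self) -> &'static str;
    fn supports_eval(&self) -> bool;
}

struct PipeBackend;
/// Hypothetical WS backend whose `Eval` support is not finished yet.
struct LimitedWsBackend;

impl BrowserBackend for PipeBackend {
    fn name(&self) -> &'static str { "pipe" }
    fn supports_eval(&self) -> bool { true }
}

impl BrowserBackend for LimitedWsBackend {
    fn name(&self) -> &'static str { "ws" }
    fn supports_eval(&self) -> bool { false }
}

/// The two stable tool names are always exposed; script tools only when eval works.
fn exposed_tools(backend: &dyn BrowserBackend) -> Vec<&'static str> {
    let mut tools = vec!["browser_action", "superrpa_browser"];
    if backend.supports_eval() {
        tools.push("browser_script");
    }
    tools
}

/// Upper layers hold the backend behind `Arc<dyn BrowserBackend>`, as the card requires.
fn describe(backend: Arc<dyn BrowserBackend>) -> String {
    format!("{}: {}", backend.name(), exposed_tools(&*backend).join(", "))
}
```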
|
||||
|
||||
**Do not, under any circumstances**
- Hook up the browser ws protocol in this card
- Build the service
- Add the client protocol

**Test command for this card**

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```

**Pass criteria**
- Existing browser tool tests do not regress
- The new capability test passes
- Upper-layer logic is no longer hard-coupled to `BrowserPipeTool<T>`

**Commit when done**

```bash
git commit -m "refactor: abstract browser backend from pipe transport"
```
|
||||
|
||||
---
|
||||
|
||||
## Card 3: Pin down the browser WS protocol codec

**Goal**
Build the encode/decode layer for the browser's fixed WebSocket protocol on its own, so protocol details are not scattered across the backend and the service.

**Do first**
1. Add `tests/browser_ws_protocol_test.rs`
2. Write the failing cases first:
   - Outbound frames are encoded exactly
   - Callback payloads are parsed
   - Malformed input is rejected
   - v1 actions are fully covered

**Files to change**
- `src/browser/ws_protocol.rs`
- `tests/browser_ws_protocol_test.rs`

**Implementation steps**
1. Encode array messages as the browser documentation specifies
2. Support only the actions v1 requires:
   - `Navigate`
   - `GetText`
   - `Click`
   - `Type`
   - `Eval`
3. Define callback parsing and correlation rules
4. Fail early on unsupported / malformed input

**Do not, under any circumstances**
- Connect to a real browser in this card
- Write the service protocol
- Put network connection logic into the codec

**Test command for this card**

```bash
cargo test --test browser_ws_protocol_test
```

**Pass criteria**
- All codec unit tests are green
- No network dependency
- Usable as the pure protocol layer underneath the backend

**Commit when done**

```bash
git commit -m "test: codify fixed browser websocket protocol"
```
|
||||
|
||||
---
|
||||
|
||||
## Card 4: Implement the Browser WS backend

**Goal**
On top of the codec, provide a blocking `invoke(...)` similar to the pipe backend's.

**Do first**
1. Add `tests/browser_ws_backend_test.rs`
2. Write the failing cases first:
   - `0` with no callback succeeds
   - Non-`0` fails
   - `0` with a callback succeeds
   - Callback timeout
   - Socket drop

**Files to change**
- `src/browser/mod.rs`
- `src/browser/ws_backend.rs`
- `tests/browser_ws_backend_test.rs`

**Implementation steps**
1. Build the long-lived connection manager
2. Start with serial, single-flight requests
3. Pass through `MacPolicy` before sending
4. Unify immediate returns and callback returns
5. Emit a unified `CommandOutput`
|
||||
|
||||
**绝对不要做**
|
||||
- 不要在这一张卡里接 service 层
|
||||
- 不要做多并发 browser request
|
||||
- 不要直接把浏览器 ws 代码散进 runtime
|
||||
|
||||
**本卡测试命令**
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_backend_test
|
||||
```
|
||||
|
||||
**通过标准**
|
||||
- backend 在 mocks/fakes 下稳定通过
|
||||
- invoke 语义与 pipe backend 接近
|
||||
- 可供上层 runtime 直接替换使用
|
||||
|
||||
**完成后提交**
|
||||
|
||||
```bash
|
||||
git commit -m "feat: add browser websocket backend"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Card 5: Implement the sg_claw Service

**Goal**
Add a long-running local service that accepts client requests and reuses the shared task runner.

**Do first**
1. Add `tests/service_ws_session_test.rs`
2. Write the failing cases first:
   - the first client attaches successfully
   - the second client receives busy
   - state is released after disconnect
   - task re-entry is rejected

**Files to change**
- `src/service/mod.rs`
- `src/service/protocol.rs`
- `src/service/server.rs`
- `src/bin/sg_claw.rs`
- `src/lib.rs`
- `Cargo.toml`
- `tests/service_ws_session_test.rs`

**Implementation steps**
1. Define the client/service protocol
2. Implement the service-side event sink
3. Build the single-active-session state machine:
   - `Idle`
   - `ClientAttached`
   - `TaskRunning`
4. Route `SubmitTask` to the shared runner
5. Keep the pipe entrypoint unchanged

**Never do**
- Do not polish the client interaction experience in this card
- Do not add a task queue
- Do not support concurrent clients

**Test command for this card**

```bash
cargo test --test service_ws_session_test
```

**Pass criteria**
- The server-side session lock works
- The shared runner can be reused by the service
- The pipe-mode entrypoint is unaffected

**Commit when done**

```bash
git commit -m "feat: add claw-ws service entrypoint"
```

---

## Card 6: Implement sg_claw_client

**Goal**
Add a thin client that offers an interactive command-line experience similar to `claude/codex`.

**Do first**
1. Add `tests/service_task_flow_test.rs`
2. Write the failing cases first:
   - submit-task reaches the service
   - logs stream back in order
   - completion arrives exactly once
   - disconnect after completion is handled cleanly

**Files to change**
- `src/bin/sg_claw_client.rs`
- `Cargo.toml`
- `tests/service_task_flow_test.rs`

**Implementation steps**
1. Open a ws connection to the local service
2. Read user input
3. Send `SubmitTask`
4. Print logs in real time
5. End the turn on `TaskComplete`

**Never do**
- Do not copy the runtime, skills, or browser backend into the client
- Do not let the client connect to the browser directly
- Do not let the client carry business logic

**Test command for this card**

```bash
cargo test --test service_task_flow_test
cargo build --bin sg_claw --bin sg_claw_client
```

**Pass criteria**
- The client is a thin shell
- The task flow works
- Both new binaries compile

**Commit when done**

```bash
git commit -m "feat: add interactive claw-ws client"
```

---

## Card 7: Final wiring and regression verification

**Goal**
Wire up the ws path while confirming zero regressions on the pipe path.

**Do first**
1. Add only the minimum config options:
   - `browser_ws_url`
   - `service_ws_listen_addr`
2. Check that external tool names remain stable

**Files to change**
- `Cargo.toml`
- `src/lib.rs`
- `src/config/settings.rs`
- `src/runtime/engine.rs` (only if truly necessary)

**Implementation steps**
1. Wire in the minimal config surface
2. Ensure tool names are identical under pipe / ws
3. Run the old pipe regressions
4. Run the new ws tests
5. Run the full Rust test suite
6. Build all binaries
7. Do one real local smoke test

**Pipe regression command for this card**

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test --test pipe_handshake_test --test pipe_protocol_test --test task_protocol_test
```

**WS test command for this card**

```bash
cargo test --test task_runner_test --test browser_ws_protocol_test --test browser_ws_backend_test --test browser_backend_capability_test --test service_ws_session_test --test service_task_flow_test
```

**Full commands for this card**

```bash
cargo test --tests
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```

**Manual verification**
1. Start the browser and confirm `ws://127.0.0.1:12345` is available
2. `cargo run --bin sg_claw`
3. In a new terminal, run `cargo run --bin sg_claw_client`
4. Submit a simple browser task
5. Confirm the log stream and a single completion
6. Confirm the old `cargo run` pipe entrypoint still starts

**Pass criteria**
- Zero regressions in pipe mode
- WS mode works independently
- Both modes coexist side by side

**Commit when done**

```bash
git commit -m "feat: wire parallel claw-ws transport"
```

---

## Execution order in one line

Follow this order strictly:

1. shared runner
2. browser backend abstraction
3. ws protocol codec
4. ws backend
5. service
6. client
7. config wiring + regression

If the order is broken, the most likely problems are:
- duplicated implementations in the upper layers
- the pipe path being damaged by accident
- ws protocol details spreading across the whole project
- service/client being written early and then forced into a refactor

687
docs/superpowers/plans/2026-04-01-claw-ws-parallel-transport.md
Normal file
@@ -0,0 +1,687 @@
# Claw-WS Parallel Transport Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add a parallel `claw-ws` transport path that keeps the current pipe mode intact while introducing a long-lived `sg_claw` local service, an interactive `sg_claw_client`, and a browser WebSocket backend at `ws://127.0.0.1:12345`.

**Architecture:** First extract a transport-agnostic submit-task runner and browser backend abstraction from the current pipe-coupled flow. Keep the existing pipe path as one adapter/backend, then add a fixed-protocol browser WebSocket backend plus a small service/session layer and an interactive CLI client that reuse the same runtime, orchestration, and browser-facing tool adapters.

**Tech Stack:** Rust 2021, current sgclaw compat runtime, zeroclaw runtime engine, `serde`/`serde_json`, existing `MacPolicy`, and a blocking WebSocket crate for v1 (`tungstenite` preferred over a broad async rewrite).

---

## Scope Guardrails

- Keep the current pipe mode entrypoint and behavior working.
- Do **not** replace the existing browser pipe path.
- Add a **parallel** WebSocket path only.
- v1 supports **one active client session** only.
- Reuse existing tool names and runtime behavior whenever possible.
- If WS `Eval` support is incomplete, disable eval-dependent browser-script skill exposure in WS mode rather than shipping partial behavior.
- Do not broaden v1 with task queues, multi-client support, or admin endpoints.

---

## File Structure

### Existing files to reuse

- Modify: `src/lib.rs` — current pipe bootstrap and receive loop; keep as the legacy pipe entrypoint.
- Modify: `src/agent/mod.rs` — current `BrowserMessage::SubmitTask` entrypoint and config-loading flow.
- Modify: `src/compat/runtime.rs` — compat runtime and tool assembly.
- Modify: `src/compat/orchestration.rs` — direct workflow vs compat runtime routing.
- Modify: `src/compat/browser_tool_adapter.rs` — exposes `browser_action` and `superrpa_browser`.
- Modify: `src/compat/browser_script_skill_tool.rs` — browser-script skill execution.
- Modify: `src/compat/workflow_executor.rs` — direct browser workflows such as Zhihu flows.
- Reuse: `src/pipe/browser_tool.rs` — current browser command executor; retain as the pipe backend implementation.
- Reuse: `src/pipe/protocol.rs` — `BrowserMessage`, `AgentMessage`, `Action`, `ExecutionSurfaceMetadata`.
- Reuse: `src/security/mac_policy.rs` — local action/domain guardrails.
- Modify: `src/config/settings.rs` — minimal new config surface for WS mode.
- Optional modify: `src/runtime/engine.rs` — only if backend capability wiring requires it.

### New files to create

- Create: `src/agent/task_runner.rs` — shared submit-task execution entrypoint.
- Create: `src/browser/mod.rs` — browser backend exports.
- Create: `src/browser/backend.rs` — `BrowserBackend` trait and helpers.
- Create: `src/browser/pipe_backend.rs` — wrapper around existing `BrowserPipeTool`.
- Create: `src/browser/ws_protocol.rs` — fixed browser WS request/response codec.
- Create: `src/browser/ws_backend.rs` — browser WS backend with blocking invoke semantics.
- Create: `src/service/mod.rs` — service exports.
- Create: `src/service/protocol.rs` — client/service WS message types.
- Create: `src/service/server.rs` — single-session `sg_claw` server.
- Create: `src/bin/sg_claw.rs` — service binary.
- Create: `src/bin/sg_claw_client.rs` — interactive CLI client.
- Create: `tests/task_runner_test.rs` — shared submit-task runner regressions.
- Create: `tests/browser_backend_capability_test.rs` — backend capability/tool exposure tests.
- Create: `tests/browser_ws_protocol_test.rs` — browser WS protocol tests.
- Create: `tests/browser_ws_backend_test.rs` — browser WS backend tests.
- Create: `tests/service_ws_session_test.rs` — single-session server tests.
- Create: `tests/service_task_flow_test.rs` — client/service task flow tests.

---

## Task 1: Extract a shared submit-task runner

**Files:**
- Create: `src/agent/task_runner.rs`
- Modify: `src/agent/mod.rs`
- Modify: `src/lib.rs`
- Test: `tests/task_runner_test.rs`
- Reuse: `src/compat/runtime.rs`, `src/compat/orchestration.rs`

- [ ] **Step 1: Write a failing runner regression test**

Create `tests/task_runner_test.rs` covering:
- empty instruction returns the same `TaskComplete` failure summary
- missing LLM config still returns the same summary shape
- the pipe adapter still emits `LogEntry` before `TaskComplete`

- [ ] **Step 2: Run the targeted regression tests first**

Run:

```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```

Expected: `task_runner_test` fails because the shared runner does not exist yet.

- [ ] **Step 3: Define the transport-neutral request model**

Create `src/agent/task_runner.rs` with a request struct that mirrors the current pipe payload:

```rust
pub struct SubmitTaskRequest {
    pub instruction: String,
    pub conversation_id: Option<String>,
    pub messages: Vec<ConversationMessage>,
    pub page_url: Option<String>,
    pub page_title: Option<String>,
}
```

Normalize empty strings to `None` at the adapter boundary.
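
The normalization rule above can be sketched as a small helper. This is only an illustration: `normalize_optional` is a hypothetical name, and treating whitespace-only input as empty is an assumption the plan does not state.

```rust
/// Maps empty (or, by assumption, whitespace-only) strings to `None`
/// and keeps every other value untouched.
/// `normalize_optional` is a hypothetical helper name, not part of the plan.
pub fn normalize_optional(raw: String) -> Option<String> {
    if raw.trim().is_empty() {
        None
    } else {
        Some(raw)
    }
}
```

An adapter would call this once per optional field (for example `page_url` and `page_title`) while building `SubmitTaskRequest`, so the runner itself never has to distinguish `""` from "absent".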

- [ ] **Step 4: Define an event sink abstraction**

Add a small trait that can emit the current agent events without depending on a specific transport:

```rust
pub trait AgentEventSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError>;
}
```

The existing pipe transport should implement this first.
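
For tests, an in-memory sink is one possible second implementation of this trait. The sketch below is self-contained, so `AgentMessage` and `PipeError` are simplified stand-ins with assumed fields, and `RecordingSink` is a hypothetical name.

```rust
use std::sync::Mutex;

// Stand-ins for the real `AgentMessage` and `PipeError`;
// the field layout is assumed for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum AgentMessage {
    LogEntry { level: String, message: String },
    TaskComplete { success: bool, summary: String },
}

#[derive(Debug)]
pub struct PipeError(pub String);

pub trait AgentEventSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError>;
}

/// Hypothetical in-memory sink that records every event in arrival order.
#[derive(Default)]
pub struct RecordingSink {
    events: Mutex<Vec<AgentMessage>>,
}

impl RecordingSink {
    /// Returns a snapshot of everything sent so far.
    pub fn events(&self) -> Vec<AgentMessage> {
        self.events.lock().unwrap().clone()
    }
}

impl AgentEventSink for RecordingSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError> {
        self.events
            .lock()
            .map_err(|_| PipeError("sink poisoned".to_string()))?
            .push(message.clone());
        Ok(())
    }
}
```

A sink like this would let `tests/task_runner_test.rs` assert event ordering (for instance `LogEntry` before `TaskComplete`) without any pipe or socket.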

- [ ] **Step 5: Move submit-task execution into a shared function**

Extract the body currently inside `BrowserMessage::SubmitTask` handling from `src/agent/mod.rs` into a shared function such as:

```rust
pub fn run_submit_task(
    sink: &dyn AgentEventSink,
    browser_backend: Arc<dyn BrowserBackend>,
    context: &AgentRuntimeContext,
    request: SubmitTaskRequest,
) -> Result<(), PipeError>
```

This function must still:
- validate empty instruction
- load sgclaw settings
- log runtime/config info
- choose orchestration vs compat runtime
- emit `AgentMessage::TaskComplete`

- [ ] **Step 6: Keep pipe mode as a thin adapter**

Refactor `handle_browser_message_with_context(...)` in `src/agent/mod.rs` so it only:
- pattern matches `BrowserMessage`
- converts `SubmitTask` into `SubmitTaskRequest`
- forwards into `run_submit_task(...)`

- [ ] **Step 7: Re-run the runner regressions**

Run:

```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```

Expected: both tests pass and pipe behavior remains unchanged.

- [ ] **Step 8: Commit**

```bash
git add src/agent/mod.rs src/agent/task_runner.rs src/lib.rs tests/task_runner_test.rs
git commit -m "refactor: extract shared submit-task runner"
```

---

## Task 2: Introduce a browser backend abstraction and wrap the current pipe implementation

**Files:**
- Create: `src/browser/mod.rs`
- Create: `src/browser/backend.rs`
- Create: `src/browser/pipe_backend.rs`
- Modify: `src/lib.rs`
- Modify: `src/compat/browser_tool_adapter.rs`
- Modify: `src/compat/browser_script_skill_tool.rs`
- Modify: `src/compat/runtime.rs`
- Modify: `src/compat/orchestration.rs`
- Modify: `src/compat/workflow_executor.rs`
- Test: `tests/browser_backend_capability_test.rs`
- Reuse: `src/pipe/browser_tool.rs`, `src/security/mac_policy.rs`

- [ ] **Step 1: Add a failing backend capability test**

Create `tests/browser_backend_capability_test.rs` to verify:
- pipe backend still exposes privileged surface metadata
- pipe backend still supports `Eval`
- browser-script tool exposure is disabled when `supports_eval()` is false

- [ ] **Step 2: Run the current browser adapter tests first**

Run:

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```

Expected: new capability test fails because the backend abstraction does not exist yet.

- [ ] **Step 3: Define the shared browser interface**

Create `src/browser/backend.rs`:

```rust
pub trait BrowserBackend: Send + Sync {
    fn invoke(
        &self,
        action: Action,
        params: Value,
        expected_domain: &str,
    ) -> Result<CommandOutput, PipeError>;

    fn surface_metadata(&self) -> ExecutionSurfaceMetadata;

    fn supports_eval(&self) -> bool {
        true
    }
}
```

- [ ] **Step 4: Implement the pipe backend as a wrapper**

Create `src/browser/pipe_backend.rs` that stores the current `BrowserPipeTool<T>` and forwards `invoke(...)` and `surface_metadata()` unchanged.

Pipe mode must continue using:
- `perform_handshake(...)`
- `MacPolicy::load_from_path(...)`
- `BrowserPipeTool::new(...).with_response_timeout(...)`

- [ ] **Step 5: Refactor runtime and tool adapters to depend on `Arc<dyn BrowserBackend>`**

Update:
- `src/compat/browser_tool_adapter.rs`
- `src/compat/browser_script_skill_tool.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`

Preserve external tool names:
- `browser_action`
- `superrpa_browser`

- [ ] **Step 6: Add capability gating for eval-dependent script tools**

If `supports_eval()` is false, do **not** expose browser-script skill tools from `build_browser_script_skill_tools(...)` in that backend mode.
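
The gating rule can be sketched in isolation as follows. This is an assumption-heavy illustration: the trait is reduced to the capability flag, `PipeLikeBackend`, `NoEvalBackend`, and `exposed_tools` are hypothetical names, and plain tool-name strings stand in for whatever `build_browser_script_skill_tools(...)` really returns.

```rust
use std::sync::Arc;

// Reduced stand-in for `BrowserBackend`: only the capability flag matters
// here, so `invoke` and `surface_metadata` are omitted.
pub trait BrowserBackend: Send + Sync {
    fn supports_eval(&self) -> bool {
        true
    }
}

pub struct PipeLikeBackend; // hypothetical: keeps the default (Eval supported)
pub struct NoEvalBackend; // hypothetical: a backend without working Eval

impl BrowserBackend for PipeLikeBackend {}

impl BrowserBackend for NoEvalBackend {
    fn supports_eval(&self) -> bool {
        false
    }
}

/// Returns the tool names to expose for a backend. The two stable tool names
/// are always present; script tools are added only when Eval is supported.
pub fn exposed_tools(backend: &Arc<dyn BrowserBackend>, script_tools: &[&str]) -> Vec<String> {
    let mut tools = vec!["browser_action".to_string(), "superrpa_browser".to_string()];
    if backend.supports_eval() {
        tools.extend(script_tools.iter().map(|name| name.to_string()));
    }
    tools
}
```

Keeping the check at tool-assembly time means a model running against a no-Eval backend never sees a script tool it could not actually use.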

- [ ] **Step 7: Re-run browser adapter tests**

Run:

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```

Expected: all three pass.

- [ ] **Step 8: Commit**

```bash
git add src/browser src/lib.rs src/compat/browser_tool_adapter.rs src/compat/browser_script_skill_tool.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/browser_backend_capability_test.rs
git commit -m "refactor: abstract browser backend from pipe transport"
```

---

## Task 3: Implement the fixed browser WebSocket protocol codec in isolation

**Files:**
- Create: `src/browser/ws_protocol.rs`
- Test: `tests/browser_ws_protocol_test.rs`
- Reuse: `docs/_tmp_sgbrowser_ws_api_doc.txt`

- [ ] **Step 1: Write failing protocol codec tests**

Create `tests/browser_ws_protocol_test.rs` covering:
- exact outbound frame encoding
- callback payload decoding
- unknown callback format rejection
- mapping coverage for every supported v1 action

- [ ] **Step 2: Run the protocol tests first**

Run:

```bash
cargo test --test browser_ws_protocol_test
```

Expected: fail because the WS protocol codec does not exist yet.

- [ ] **Step 3: Encode the exact browser frame shapes**

Create `src/browser/ws_protocol.rs` so it can build exact array-form payloads such as:

```rust
[requesturl, "sgBrowserExcuteJsCodeByArea", target_url, js_code, area]
```

Serialize to the JSON string format required by the browser service.
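
A dependency-free sketch of that encoding is shown below. In the real codec `serde_json` would presumably do the serialization; the hand-written escaper here only keeps the example self-contained. Treating every element, including `area`, as a JSON string is an assumption, and `encode_eval_frame` is a hypothetical name.

```rust
/// Escapes a string as a JSON string literal (surrounding quotes included).
fn json_string(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len() + 2);
    out.push('"');
    for ch in raw.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Builds `[requesturl, "sgBrowserExcuteJsCodeByArea", target_url, js_code, area]`
/// as a JSON array of strings. The element types are assumed; the browser
/// WS document remains the source of truth.
pub fn encode_eval_frame(request_url: &str, target_url: &str, js_code: &str, area: &str) -> String {
    let parts = [request_url, "sgBrowserExcuteJsCodeByArea", target_url, js_code, area];
    let encoded: Vec<String> = parts.iter().map(|part| json_string(part)).collect();
    format!("[{}]", encoded.join(","))
}
```

Because the function is pure, the "exact outbound frame encoding" test can compare its output byte for byte against a fixture string.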

- [ ] **Step 4: Define the v1 action mapping table**

Support only the actions already needed by current sgclaw flows:
- `Navigate`
- `GetText`
- `Click`
- `Type`
- `Eval`

Document which browser functions each one maps to and what assumptions they rely on.

- [ ] **Step 5: Define callback parsing and correlation rules**

Represent callback-bearing operations explicitly, including the callback function naming or request-correlation strategy the backend will depend on.

- [ ] **Step 6: Reject unsupported or malformed shapes early**

Fail fast for:
- unsupported actions
- malformed callback payloads
- missing request correlation metadata
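
The three fail-fast cases can be sketched as typed errors. Everything here is illustrative: `V1Action`, `CodecError`, and both functions are hypothetical names, and the `<request_id>|<payload>` callback layout is invented purely for the example, since the real callback format is defined by the browser WS document.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum V1Action {
    Navigate,
    GetText,
    Click,
    Type,
    Eval,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CodecError {
    UnsupportedAction(String),
    MalformedCallback(String),
    MissingCorrelation,
}

/// Accepts only the five v1 actions; anything else is rejected up front.
pub fn parse_v1_action(name: &str) -> Result<V1Action, CodecError> {
    match name {
        "Navigate" => Ok(V1Action::Navigate),
        "GetText" => Ok(V1Action::GetText),
        "Click" => Ok(V1Action::Click),
        "Type" => Ok(V1Action::Type),
        "Eval" => Ok(V1Action::Eval),
        other => Err(CodecError::UnsupportedAction(other.to_string())),
    }
}

/// Splits an (invented) `<request_id>|<payload>` callback into its parts.
pub fn parse_callback(raw: &str) -> Result<(String, String), CodecError> {
    let (id, payload) = raw
        .split_once('|')
        .ok_or_else(|| CodecError::MalformedCallback(raw.to_string()))?;
    if id.is_empty() {
        return Err(CodecError::MissingCorrelation);
    }
    Ok((id.to_string(), payload.to_string()))
}
```

Distinct error variants let the backend tests assert which rule rejected an input rather than matching on message text.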

- [ ] **Step 7: Re-run the protocol tests**

Run:

```bash
cargo test --test browser_ws_protocol_test
```

Expected: pass with no network dependency.

- [ ] **Step 8: Commit**

```bash
git add src/browser/ws_protocol.rs tests/browser_ws_protocol_test.rs
git commit -m "test: codify fixed browser websocket protocol"
```

---

## Task 4: Build the browser WS backend with synchronous invoke semantics

**Files:**
- Create: `src/browser/ws_backend.rs`
- Modify: `src/browser/mod.rs`
- Test: `tests/browser_ws_backend_test.rs`
- Reuse: `CommandOutput`, `PipeError`, `ExecutionSurfaceMetadata`, `MacPolicy`

- [ ] **Step 1: Write failing backend behavior tests**

Create `tests/browser_ws_backend_test.rs` covering:
- zero return + no callback => success
- non-zero return => failure
- zero return + callback => success with normalized `CommandOutput`
- callback timeout => timeout error
- dropped socket => clear failure

- [ ] **Step 2: Run backend tests first**

Run:

```bash
cargo test --test browser_ws_backend_test
```

Expected: fail because the WS backend does not exist yet.

- [ ] **Step 3: Build a long-lived browser connection manager**

Implement `src/browser/ws_backend.rs` to connect to `ws://127.0.0.1:12345` and expose blocking `invoke(...)` calls.

Use a dedicated connection loop plus request/response coordination instead of scattering raw socket calls through the runtime.

- [ ] **Step 4: Preserve local guardrails before send**

Validate `MacPolicy` before translating an action into the browser WS protocol, matching current pipe backend behavior.

- [ ] **Step 5: Normalize immediate status returns and delayed callbacks**

For each `invoke(...)` call:
- fail immediately on non-zero return codes
- succeed immediately for operations with no data callback
- wait for the matching callback for result-bearing operations
- convert the final outcome into `CommandOutput`
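
The four rules above can be sketched with a standard-library channel standing in for the connection loop's callback delivery. This is a sketch under assumptions: `finish_invoke` and `InvokeError` are hypothetical names, `CommandOutput` is a simplified stand-in for the real type, and a closed channel is used to model a dropped socket.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

#[derive(Debug, PartialEq, Eq)]
pub enum InvokeError {
    BrowserRejected(i64),
    CallbackTimeout,
    SocketDropped,
}

/// Stand-in for the real `CommandOutput`; only a text payload is modelled.
#[derive(Debug, PartialEq, Eq)]
pub struct CommandOutput {
    pub data: Option<String>,
}

/// Turns the immediate status code plus an optional delayed callback
/// into one blocking result.
pub fn finish_invoke(
    status: i64,
    expects_callback: bool,
    callbacks: &Receiver<String>,
    timeout: Duration,
) -> Result<CommandOutput, InvokeError> {
    if status != 0 {
        return Err(InvokeError::BrowserRejected(status));
    }
    if !expects_callback {
        return Ok(CommandOutput { data: None });
    }
    match callbacks.recv_timeout(timeout) {
        Ok(payload) => Ok(CommandOutput { data: Some(payload) }),
        Err(RecvTimeoutError::Timeout) => Err(InvokeError::CallbackTimeout),
        // The sender side is gone: treat it as a dropped connection.
        Err(RecvTimeoutError::Disconnected) => Err(InvokeError::SocketDropped),
    }
}
```

With this shape, each of the five failing cases in Step 1 maps onto one branch, and a fake connection only needs to hold the sending half of the channel.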

- [ ] **Step 6: Keep v1 concurrency intentionally serialized**

Allow only one in-flight browser request at a time unless the browser callback protocol proves a stable request-id guarantee.
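
One way to enforce the single in-flight rule is a small gate around the send/wait sequence. The sketch below rejects an overlapping request instead of queueing it; that choice, and the names `SingleFlight` and `GateError`, are assumptions rather than something the plan prescribes.

```rust
use std::sync::{Mutex, TryLockError};

#[derive(Debug, PartialEq, Eq)]
pub enum GateError {
    Busy,
}

/// Hypothetical gate that admits one browser request at a time.
#[derive(Default)]
pub struct SingleFlight {
    slot: Mutex<()>,
}

impl SingleFlight {
    /// Runs `request` if nothing else is in flight; otherwise returns `Busy`.
    pub fn run<T>(&self, request: impl FnOnce() -> T) -> Result<T, GateError> {
        let _guard = match self.slot.try_lock() {
            Ok(guard) => guard,
            // A poisoned lock only means an earlier request panicked;
            // the slot itself is free again.
            Err(TryLockError::Poisoned(poisoned)) => poisoned.into_inner(),
            Err(TryLockError::WouldBlock) => return Err(GateError::Busy),
        };
        Ok(request())
    }
}
```

Swapping `try_lock()` for `lock()` would make callers wait their turn instead of failing, which may suit a blocking `invoke(...)` better if the runtime ever issues requests from more than one thread.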

- [ ] **Step 7: Re-run backend tests**

Run:

```bash
cargo test --test browser_ws_backend_test
```

Expected: pass using mocks/fakes, not the real browser.

- [ ] **Step 8: Commit**

```bash
git add src/browser/mod.rs src/browser/ws_backend.rs tests/browser_ws_backend_test.rs
git commit -m "feat: add browser websocket backend"
```

---

## Task 5: Add the `sg_claw` service protocol and single-session server

**Files:**
- Create: `src/service/mod.rs`
- Create: `src/service/protocol.rs`
- Create: `src/service/server.rs`
- Create: `src/bin/sg_claw.rs`
- Modify: `src/lib.rs`
- Modify: `Cargo.toml`
- Test: `tests/service_ws_session_test.rs`
- Reuse: `AgentMessage::LogEntry`, `AgentMessage::TaskComplete`, `SubmitTaskRequest`, `run_submit_task(...)`

- [ ] **Step 1: Write failing service session tests**

Create `tests/service_ws_session_test.rs` to verify:
- first client attaches
- second client gets `Busy`
- disconnect resets session state
- overlapping task submission is rejected clearly

- [ ] **Step 2: Run the session tests first**

Run:

```bash
cargo test --test service_ws_session_test
```

Expected: fail because the service layer does not exist yet.

- [ ] **Step 3: Define a thin client/service WS protocol**

In `src/service/protocol.rs`, reuse existing task/event shapes as much as possible:

```rust
ClientMessage::SubmitTask { instruction, conversation_id, messages, page_url, page_title }
ClientMessage::Ping
ServiceMessage::LogEntry { level, message }
ServiceMessage::TaskComplete { success, summary }
ServiceMessage::Busy { message }
```

- [ ] **Step 4: Add the service event sink adapter**

Implement `AgentEventSink` for the service session writer so the shared task runner can stream `LogEntry` and `TaskComplete` over the service WebSocket.

- [ ] **Step 5: Implement single-active-client session state**

Model explicit states such as:
- `Idle`
- `ClientAttached`
- `TaskRunning`

Reject a second client with `ServiceMessage::Busy` and close the socket. Reject overlapping tasks instead of queueing them.
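
The session rules can be sketched as a small state machine that is independent of any socket code. The three state names come from the plan; `Session`, `SessionError`, and the method names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    Idle,
    ClientAttached,
    TaskRunning,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SessionError {
    Busy,
    NoClient,
    TaskAlreadyRunning,
}

pub struct Session {
    state: SessionState,
}

impl Session {
    pub fn new() -> Self {
        Self { state: SessionState::Idle }
    }

    pub fn state(&self) -> SessionState {
        self.state
    }

    /// A client may attach only while the service is idle.
    pub fn attach(&mut self) -> Result<(), SessionError> {
        match self.state {
            SessionState::Idle => {
                self.state = SessionState::ClientAttached;
                Ok(())
            }
            _ => Err(SessionError::Busy),
        }
    }

    /// A task may start only when a client is attached and nothing is running.
    pub fn start_task(&mut self) -> Result<(), SessionError> {
        match self.state {
            SessionState::Idle => Err(SessionError::NoClient),
            SessionState::ClientAttached => {
                self.state = SessionState::TaskRunning;
                Ok(())
            }
            SessionState::TaskRunning => Err(SessionError::TaskAlreadyRunning),
        }
    }

    /// Returns to `ClientAttached` once the running task completes.
    pub fn finish_task(&mut self) {
        if self.state == SessionState::TaskRunning {
            self.state = SessionState::ClientAttached;
        }
    }

    /// Any disconnect releases the session.
    pub fn disconnect(&mut self) {
        self.state = SessionState::Idle;
    }
}
```

In the server, an `Err(SessionError::Busy)` from `attach` would translate into sending `ServiceMessage::Busy` and closing the new socket, while the existing session stays untouched.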

- [ ] **Step 6: Add the service binary**

Create `src/bin/sg_claw.rs` that:
- loads config
- creates the browser WS backend
- listens for local client connections
- routes `SubmitTask` into `run_submit_task(...)`

Keep `src/main.rs` and the existing `sgclaw::run()` pipe path unchanged.

- [ ] **Step 7: Re-run the session tests**

Run:

```bash
cargo test --test service_ws_session_test
```

Expected: pass without the real browser.

- [ ] **Step 8: Commit**

```bash
git add src/service src/bin/sg_claw.rs src/lib.rs Cargo.toml tests/service_ws_session_test.rs
git commit -m "feat: add claw-ws service entrypoint"
```

---

## Task 6: Add the `sg_claw_client` interactive CLI

**Files:**
- Create: `src/bin/sg_claw_client.rs`
- Modify: `Cargo.toml`
- Test: `tests/service_task_flow_test.rs`
- Reuse: `src/service/protocol.rs`

- [ ] **Step 1: Write failing client/service task flow tests**

Create `tests/service_task_flow_test.rs` to verify:
- the submit-task request reaches the service
- log entries stream in order
- the final summary arrives exactly once
- disconnect after task completion is handled cleanly

- [ ] **Step 2: Run the flow tests first**

Run:

```bash
cargo test --test service_task_flow_test
```

Expected: fail because the client does not exist yet.

- [ ] **Step 3: Implement a thin interactive client loop**

Create `src/bin/sg_claw_client.rs` that:
- connects to the local `sg_claw` service
- reads a line of user input
- sends `ClientMessage::SubmitTask`
- prints streamed `LogEntry` events as they arrive
- ends the turn on `TaskComplete`
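
The receive half of that loop can be sketched without any socket by driving it from an iterator of already-decoded messages. `ServiceMessage` here is a stand-in with assumed field types, and `run_turn` and `TurnOutcome` are hypothetical names.

```rust
// Stand-in for the service protocol type; field types are assumed.
#[derive(Debug, Clone, PartialEq)]
pub enum ServiceMessage {
    LogEntry { level: String, message: String },
    TaskComplete { success: bool, summary: String },
    Busy { message: String },
}

#[derive(Debug, PartialEq)]
pub enum TurnOutcome {
    Completed { success: bool, summary: String },
    Busy(String),
    Disconnected,
}

/// Drains one turn: forwards log lines to `print` until a terminal message
/// arrives. Messages after the terminal one are left unread.
pub fn run_turn(
    incoming: impl IntoIterator<Item = ServiceMessage>,
    mut print: impl FnMut(&str),
) -> TurnOutcome {
    for message in incoming {
        match message {
            ServiceMessage::LogEntry { level, message } => {
                print(&format!("[{}] {}", level, message));
            }
            ServiceMessage::TaskComplete { success, summary } => {
                return TurnOutcome::Completed { success, summary };
            }
            ServiceMessage::Busy { message } => return TurnOutcome::Busy(message),
        }
    }
    // The stream ended without a terminal message.
    TurnOutcome::Disconnected
}
```

The real client would feed this from the WebSocket read side and pass a closure that writes to stdout; the loop itself holds no runtime or browser logic.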

- [ ] **Step 4: Keep the client intentionally dumb**

Do **not** duplicate runtime logic in the client. Browser access, skills, orchestration, and task execution remain entirely inside the service.

- [ ] **Step 5: Re-run the flow tests**

Run:

```bash
cargo test --test service_task_flow_test
```

Expected: pass without the real browser.

- [ ] **Step 6: Build the new binaries explicitly**

Run:

```bash
cargo build --bin sg_claw --bin sg_claw_client
```

Expected: both binaries compile successfully.

- [ ] **Step 7: Commit**

```bash
git add src/bin/sg_claw_client.rs Cargo.toml tests/service_task_flow_test.rs
git commit -m "feat: add interactive claw-ws client"
```

---

## Task 7: Finish wiring, preserve pipe mode, and verify end-to-end behavior

**Files:**
- Modify: `Cargo.toml`
- Modify: `src/lib.rs`
- Modify: `src/config/settings.rs`
- Optional modify: `src/runtime/engine.rs`
- Reuse: `tests/browser_tool_test.rs`, `tests/runtime_task_flow_test.rs`, `tests/compat_runtime_test.rs`

- [ ] **Step 1: Add only the minimum config surface for v1**

Add settings such as:
- `browser_ws_url` defaulting to `ws://127.0.0.1:12345`
- `service_ws_listen_addr` defaulting to local loopback

Do **not** change the meaning of existing browser backend/profile settings just to represent service mode.
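
A minimal sketch of those two settings with defaults might look like the following. `WsSettings` and `listens_on_loopback` are hypothetical names, the real fields would live in `src/config/settings.rs`, and because the plan only says "local loopback", the listen port below (`0`, meaning OS-assigned) is a placeholder rather than a proposed default.

```rust
/// Hypothetical settings fragment for WS mode.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WsSettings {
    pub browser_ws_url: String,
    pub service_ws_listen_addr: String,
}

impl Default for WsSettings {
    fn default() -> Self {
        Self {
            browser_ws_url: "ws://127.0.0.1:12345".to_string(),
            // Placeholder port: the plan specifies loopback but no port number.
            service_ws_listen_addr: "127.0.0.1:0".to_string(),
        }
    }
}

impl WsSettings {
    /// Returns true only when the listen address parses and is a loopback IP.
    pub fn listens_on_loopback(&self) -> bool {
        self.service_ws_listen_addr
            .parse::<std::net::SocketAddr>()
            .map(|addr| addr.ip().is_loopback())
            .unwrap_or(false)
    }
}
```

A startup check along the lines of `listens_on_loopback()` is one way to keep the v1 service local-only even if the setting is later edited by hand.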

- [ ] **Step 2: Keep external browser tool naming stable**

Verify that the runtime still exposes:
- `superrpa_browser`
- `browser_action`

under both pipe and WS modes where the backend supports them.

- [ ] **Step 3: Re-run the current pipe regression suite**

Run:

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test --test pipe_handshake_test --test pipe_protocol_test --test task_protocol_test
```

Expected: all existing pipe-oriented tests still pass unchanged.

- [ ] **Step 4: Run the new WS-focused suite**

Run:

```bash
cargo test --test task_runner_test --test browser_ws_protocol_test --test browser_ws_backend_test --test browser_backend_capability_test --test service_ws_session_test --test service_task_flow_test
```

Expected: all new tests pass without launching the real browser.

- [ ] **Step 5: Run a full Rust test sweep**

Run:

```bash
cargo test --tests
```

Expected: all Rust tests pass.

- [ ] **Step 6: Build all three binaries**

Run:

```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```

Expected: all three binaries compile.

- [ ] **Step 7: Perform a manual local smoke test**

Manual test:
1. Start the browser app so `ws://127.0.0.1:12345` is available.
2. Run `cargo run --bin sg_claw`.
3. In another terminal, run `cargo run --bin sg_claw_client`.
4. Submit a simple browser task such as opening a page or fetching visible text.
5. Confirm the client prints streaming logs and exactly one final completion summary.
6. Confirm the old pipe-mode entry still starts via `cargo run`.

Expected: both modes work side-by-side.

- [ ] **Step 8: Commit**

```bash
git add Cargo.toml src/lib.rs src/config/settings.rs src/runtime/engine.rs
git commit -m "feat: wire parallel claw-ws transport"
```

---

## Verification Checklist

### Fast regression checks

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test
```

Expected: current pipe/browser runtime behavior remains green.

### Full Rust test sweep

```bash
cargo test --tests
```

Expected: all Rust tests pass.

### Binary build verification

```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```

Expected: all three binaries compile.

### Manual end-to-end verification

- Browser app listening on `ws://127.0.0.1:12345`
- `cargo run --bin sg_claw`
- `cargo run --bin sg_claw_client`
- submit one browser task
- verify streaming logs, final completion, and single-client lock behavior
- verify `cargo run` still preserves old pipe bootstrap

---

## Notes for Implementation

- Keep the current pipe bootstrap in `src/lib.rs` intact until the shared runner and pipe backend wrapper are both green.
- Prefer small commits at each task boundary.
- Keep the new WS path additive and isolated.
- Do not ship partial browser capabilities under stable tool names.
- Treat `docs/_tmp_sgbrowser_ws_api_doc.txt` as the browser WS protocol source of truth while implementing `src/browser/ws_protocol.rs`.

@@ -0,0 +1,607 @@
# WS Browser Backend Auth Replacement Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Replace the ws service path’s empty-session-key `BrowserPipeTool` dependency with a ws-native browser backend path so real browser websocket calls work, while preserving legacy pipe behavior exactly.

**Architecture:** Keep the existing pipe entry untouched and add a ws-only parallel execution seam. The ws service path will construct a `ServiceBrowserWsClient` plus `WsBrowserBackend`, pass that backend through a new ws-only submit-task entry, and let the existing compat/runtime stack consume `Arc<dyn BrowserBackend>` instead of requiring `BrowserPipeTool` on the ws path.

**Tech Stack:** Rust 2021, current sgclaw agent/task runner, compat runtime/orchestration stack, `tungstenite`, `serde_json`, existing `MacPolicy`, existing `BrowserBackend`/`WsBrowserBackend`, and the current Rust test suite.

---

## Scope Guardrails

- Only change the ws service path.
- Do **not** change `src/lib.rs` pipe runtime behavior.
- Do **not** change pipe handshake semantics.
- Do **not** introduce fake session keys, fake HMAC seeds, or auth bypasses.
- Keep legacy `run_submit_task(...)` available for the pipe entry.
- If a shared layer must change, add a parallel ws-only entry instead of weakening the pipe path.
- Keep the current single-client, single-task service model.
- Do not broaden this slice into browser process launch, queueing, multi-client support, or protocol extensions.

---

## File Structure

### Existing files to modify

- Modify: `src/agent/task_runner.rs` — keep the current pipe-oriented submit path and add the ws-only backend-based submit path.
- Modify: `src/compat/runtime.rs` — add a backend-driven execution entry that accepts `Arc<dyn BrowserBackend>` directly, while keeping the current pipe-oriented public functions behaviorally unchanged.
- Modify: `src/compat/orchestration.rs` — add a matching backend-driven execution entry for orchestration/direct-route flows, while keeping the current pipe-oriented public functions behaviorally unchanged.
- Modify: `src/compat/workflow_executor.rs` — add backend-driven sibling APIs for direct-route/fallback execution, while keeping the current pipe-oriented public functions behaviorally unchanged.
- Modify: `src/service/server.rs` — replace the ws service’s `BrowserPipeTool::new(..., vec![])` path with a ws-native `WsClient` + `WsBrowserBackend` path.
- Modify: `src/service/mod.rs` — only if minimal re-export or call-signature changes are needed around the new ws-only submit path.
- Modify: `src/browser/mod.rs` — only if export cleanup is truly needed for the service wiring.
- Reuse: `src/agent/mod.rs` — keep the current pipe routing unchanged unless a tiny internal refactor is strictly needed to reuse shared code.
|
||||
- Reuse: `src/browser/backend.rs` — existing shared browser backend trait.
|
||||
- Reuse: `src/browser/ws_backend.rs` — existing ws-native browser backend implementation.
|
||||
- Reuse: `src/browser/ws_protocol.rs` — existing browser websocket protocol codec.
|
||||
- Reuse: `src/compat/browser_tool_adapter.rs` — should already speak `BrowserBackend`; only touch if a narrow ws regression forces it.
|
||||
- Reuse: `src/compat/browser_script_skill_tool.rs` — eval-capability gating already exists; only touch if a narrow ws regression forces it.
|
||||
- Reuse: `src/lib.rs` — pipe entrypoint must remain behaviorally unchanged; verify only.
|
||||
|
||||
### Existing tests to extend
|
||||
|
||||
- Modify: `tests/browser_ws_backend_test.rs` — keep existing ws backend coverage green after the service adapter wiring lands.
|
||||
- Modify: `tests/browser_script_skill_tool_test.rs` — re-verify eval-gating and browser-script behavior after the shared compat/runtime seam changes.
|
||||
- Modify: `tests/service_ws_session_test.rs` — update service-side unit/session tests to exercise the ws-only submit path.
|
||||
- Modify: `tests/service_task_flow_test.rs` — add client→service chain coverage proving the ws path reaches a browser websocket and no longer emits `invalid hmac seed`.
|
||||
- Modify: `src/service/server.rs` under `#[cfg(test)]` if the private service-side ws adapter cannot be exercised from an integration test crate without changing production visibility.
|
||||
|
||||
### New files to create
|
||||
|
||||
- Create: `tests/browser_ws_service_adapter_test.rs` if the adapter can be exercised through a public seam; otherwise keep the deterministic adapter tests as unit tests in `src/service/server.rs` so no production visibility changes are required.
|
||||
|
||||
---
|
||||
|
||||
## Task 1: Lock the ws-only behavior with deterministic failing tests
|
||||
|
||||
**Files:**
|
||||
- Create: `tests/browser_ws_service_adapter_test.rs`
|
||||
- Modify: `tests/service_task_flow_test.rs`
|
||||
- Reuse: `tests/browser_ws_backend_test.rs`, `src/browser/ws_backend.rs`, `src/service/server.rs`
|
||||
|
||||
- [ ] **Step 1: Write the first failing backend/adapter test**
|
||||
|
||||
Create `tests/browser_ws_service_adapter_test.rs` with one focused test that directly exercises the ws-service adapter layer, without `sg_claw_client`, without LLM planning, and without natural-language tasks.
|
||||
|
||||
Start with the smallest real behavior from the spec:
|
||||
- fake browser websocket server accepts one connection
|
||||
- the ws-service adapter builds the same kind of client the service will use
|
||||
- `WsBrowserBackend.invoke(Action::Navigate, ...)` succeeds on status `0`
|
||||
- the fake server receives one text frame that decodes as a ws `Navigate` call
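
A minimal sketch of the shape this first test could take, using an in-memory fake in place of a real socket. Everything here is an assumption made for illustration: the `WsClient` method names, the `FakeWsClient` type, the `navigate` helper, and the placeholder frame text. The real trait lives next to `WsBrowserBackend`, and the real frame layout comes from `src/browser/ws_protocol.rs`.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;

/// Hypothetical stand-in for the real `WsClient` trait (signature assumed).
trait WsClient {
    fn send_text(&self, frame: &str) -> Result<(), String>;
    fn recv_text(&self) -> Result<String, String>;
}

/// In-memory fake: records outgoing frames and replays scripted replies.
struct FakeWsClient {
    sent: RefCell<Vec<String>>,
    replies: RefCell<VecDeque<String>>,
}

impl FakeWsClient {
    fn with_replies(replies: &[&str]) -> Self {
        Self {
            sent: RefCell::new(Vec::new()),
            replies: RefCell::new(replies.iter().map(|r| r.to_string()).collect()),
        }
    }
}

impl WsClient for FakeWsClient {
    fn send_text(&self, frame: &str) -> Result<(), String> {
        self.sent.borrow_mut().push(frame.to_string());
        Ok(())
    }

    fn recv_text(&self) -> Result<String, String> {
        self.replies
            .borrow_mut()
            .pop_front()
            .ok_or_else(|| "no scripted reply".to_string())
    }
}

/// Illustrative navigate call: send one frame, then require status `0`.
/// The frame text is a placeholder, not the real protocol encoding.
fn navigate(client: &dyn WsClient, url: &str) -> Result<(), String> {
    client.send_text(&format!("navigate {url}"))?;
    match client.recv_text()?.trim() {
        "0" => Ok(()),
        other => Err(format!("unexpected status: {other}")),
    }
}
```

In the real test the fake would sit behind a local websocket listener so that the service-side client type is exercised too; the in-memory version only shows which assertions matter (one frame sent, status `0` accepted).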
|
||||
|
||||
- [ ] **Step 2: Run that single new test and watch it fail**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_service_adapter_test ws_service_backend_navigate_reaches_browser_websocket -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the service-side ws client/adapter does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Add the second failing deterministic test**
|
||||
|
||||
In the same file, add a test for the forced-close path:
|
||||
- fake browser websocket server accepts a request, then closes or resets the socket before returning a status frame
|
||||
- observe the error at the `WsBrowserBackend.invoke(...)` call site
|
||||
- assert the outward error is exactly `PipeError::PipeClosed`
|
||||
|
||||
- [ ] **Step 4: Run only the forced-close test and watch it fail**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_service_adapter_test ws_service_backend_maps_browser_disconnect_to_pipe_closed -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the service-side ws client/adapter still does not exist.
|
||||
|
||||
- [ ] **Step 5: Add the third failing deterministic test**
|
||||
|
||||
In the same file, add a callback-timeout test:
|
||||
- fake browser websocket server returns status `0`
|
||||
- it never returns the callback frame
|
||||
- assert the outward error at `invoke(...)` is exactly `PipeError::Timeout`
|
||||
|
||||
Use a tiny response timeout in the backend under test.
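
The forced-close and callback-timeout expectations can be illustrated with a standard-library channel standing in for the browser socket. This is only a sketch under assumptions: the local `PipeError` enum mirrors just the two variants named above, and `wait_for_callback` is a hypothetical helper, not a function from this repository.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Local mirror of the two `PipeError` variants named above (assumed shape).
#[derive(Debug, PartialEq)]
enum PipeError {
    Timeout,
    PipeClosed,
}

/// Wait for a callback frame, keeping "nothing arrived in time" and
/// "the peer went away" as two distinct outward errors.
fn wait_for_callback(
    frames: &Receiver<String>,
    timeout: Duration,
) -> Result<String, PipeError> {
    match frames.recv_timeout(timeout) {
        Ok(frame) => Ok(frame),
        Err(RecvTimeoutError::Timeout) => Err(PipeError::Timeout),
        Err(RecvTimeoutError::Disconnected) => Err(PipeError::PipeClosed),
    }
}
```

A timeout of a few milliseconds keeps the deterministic tests fast while still exercising the same branch a production timeout would hit.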
|
||||
|
||||
- [ ] **Step 6: Run only the callback-timeout test and watch it fail**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_service_adapter_test ws_service_backend_times_out_waiting_for_callback -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the service-side ws client/adapter still does not exist.
|
||||
|
||||
- [ ] **Step 7: Add the end-to-end failing regression for the auth bug**
|
||||
|
||||
Extend `tests/service_task_flow_test.rs` with one client→service integration test that:
|
||||
- starts a fake browser websocket server
|
||||
- starts the real `sg_claw` service binary with a temp config pointing `browserWsUrl` to that fake server
|
||||
- starts the real `sg_claw_client`
|
||||
- submits the fixed instruction `打开知乎热榜并读取页面主区域文本` ("Open the Zhihu hot list and read the text of the page's main area")
|
||||
- captures service/client output
|
||||
- asserts the fake browser server received at least one text frame
|
||||
- asserts output does **not** contain `invalid hmac seed: session key must not be empty`
|
||||
|
||||
Do not assert planner details here. This test only proves the service path no longer goes through the empty-session-key auth failure.
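
The two assertions that matter in this regression could be factored into a small checker like the one below. It is a sketch: `check_auth_regression` and its parameters are hypothetical names, and how the output and frame count get captured is left to the test harness.

```rust
/// The exact auth failure string this regression guards against.
const EMPTY_SESSION_KEY_ERROR: &str = "invalid hmac seed: session key must not be empty";

/// Returns `Ok(())` when the captured run shows that the ws path reached the
/// fake browser and never hit the empty-session-key failure.
fn check_auth_regression(output: &str, browser_text_frames: usize) -> Result<(), String> {
    if browser_text_frames == 0 {
        return Err("fake browser server received no text frames".to_string());
    }
    if output.contains(EMPTY_SESSION_KEY_ERROR) {
        return Err(format!("output still contains `{}`", EMPTY_SESSION_KEY_ERROR));
    }
    Ok(())
}
```

Returning a `Result` with a message, rather than panicking inside the helper, keeps the failure reason visible when the test prints captured service and client output.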
|
||||
|
||||
- [ ] **Step 8: Run the integration regression and watch it fail**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test service_task_flow_test ws_service_submit_task_no_longer_hits_invalid_hmac_seed -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL on the current code because the ws service still constructs `BrowserPipeTool::new(..., vec![])`.
|
||||
|
||||
- [ ] **Step 9: Commit the red tests only after they are all in place**
|
||||
|
||||
Do not commit yet if any required red test was skipped. The next task will make them pass.
|
||||
|
||||
---
|
||||
|
||||
## Task 2: Add a ws-only browser-backend execution seam without changing the pipe path
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/agent/task_runner.rs`
|
||||
- Modify: `src/compat/runtime.rs`
|
||||
- Modify: `src/compat/orchestration.rs`
|
||||
- Modify: `src/compat/workflow_executor.rs`
|
||||
- Reuse: `src/agent/mod.rs`, `src/browser/backend.rs`
|
||||
- Test: `tests/task_runner_test.rs`, `tests/browser_script_skill_tool_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the smallest failing runner-level ws entry test**
|
||||
|
||||
Extend `tests/task_runner_test.rs` with a focused test that proves there is a ws-only submit entry accepting `Arc<dyn BrowserBackend>` and an arbitrary event sink, while the old `run_submit_task(...)` signature still exists for pipe mode.
|
||||
|
||||
The test can stay on the missing-LLM-config path so it does not need a real browser call. It should compile only once the new ws-only function exists.
|
||||
|
||||
- [ ] **Step 2: Run the targeted runner test and watch it fail**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test task_runner_test ws_only_submit_task_entry_accepts_browser_backend -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL to compile because the ws-only entry does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Add the new ws-only submit-task entry in `src/agent/task_runner.rs`**
|
||||
|
||||
Keep the current pipe function intact:
|
||||
|
||||
```rust
|
||||
pub fn run_submit_task<T: Transport + 'static>(... browser_tool: &BrowserPipeTool<T>, ...)
|
||||
```
|
||||
|
||||
Add a parallel entry for the service path, for example:
|
||||
|
||||
```rust
pub fn run_submit_task_with_browser_backend(
    sink: &dyn AgentEventSink,
    browser_backend: Arc<dyn BrowserBackend>,
    context: &AgentRuntimeContext,
    request: SubmitTaskRequest,
) -> Result<(), PipeError>
```
|
||||
|
||||
Rules:
|
||||
- share as much internal logic as possible with the pipe path
|
||||
- do not change `run_submit_task(...)` behavior
|
||||
- do not change `src/agent/mod.rs` pipe wiring except, at most, small internal refactoring to reuse common code
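
One possible way to satisfy these rules is to keep both public entries and have them delegate to a shared core. The sketch below uses heavily simplified, hypothetical stand-ins (`PipeTool`, `PipeToolBackend`, `run_core`, `RecordingSink`, `FakeWsBackend`, and one-method versions of `BrowserBackend` and `AgentEventSink`); the real signatures carry more parameters and a real error type.

```rust
use std::sync::{Arc, Mutex};

/// Simplified stand-ins for the real traits (signatures assumed).
trait BrowserBackend {
    fn invoke(&self, action: &str) -> Result<String, String>;
}

trait AgentEventSink {
    fn emit(&self, event: String);
}

/// Stand-in for the pipe tool; the real type is `BrowserPipeTool<T>`.
struct PipeTool;

impl PipeTool {
    fn call(&self, action: &str) -> Result<String, String> {
        Ok(format!("pipe:{action}"))
    }
}

/// Thin adapter so the pipe tool can feed the shared core unchanged.
struct PipeToolBackend<'a>(&'a PipeTool);

impl BrowserBackend for PipeToolBackend<'_> {
    fn invoke(&self, action: &str) -> Result<String, String> {
        self.0.call(action)
    }
}

/// Shared core: logic that does not care which transport is in use.
fn run_core(
    sink: &dyn AgentEventSink,
    backend: &dyn BrowserBackend,
    instruction: &str,
) -> Result<(), String> {
    sink.emit(format!("start {instruction}"));
    let reply = backend.invoke("navigate")?;
    sink.emit(format!("done {reply}"));
    Ok(())
}

/// Existing pipe entry: same name and parameters as before.
fn run_submit_task(
    sink: &dyn AgentEventSink,
    tool: &PipeTool,
    instruction: &str,
) -> Result<(), String> {
    run_core(sink, &PipeToolBackend(tool), instruction)
}

/// New ws-only entry: same core, backend supplied as a trait object.
fn run_submit_task_with_browser_backend(
    sink: &dyn AgentEventSink,
    backend: Arc<dyn BrowserBackend>,
    instruction: &str,
) -> Result<(), String> {
    run_core(sink, &*backend, instruction)
}

/// Recording sink and fake ws backend for deterministic tests.
#[derive(Default)]
struct RecordingSink(Mutex<Vec<String>>);

impl AgentEventSink for RecordingSink {
    fn emit(&self, event: String) {
        self.0.lock().unwrap().push(event);
    }
}

struct FakeWsBackend;

impl BrowserBackend for FakeWsBackend {
    fn invoke(&self, action: &str) -> Result<String, String> {
        Ok(format!("ws:{action}"))
    }
}
```

Comparing the recorded events from both entries is a cheap way to check that event order stays identical across the two paths. Whether the pipe entry should delegate to a shared core at all, or stay completely untouched, is a judgment call the plan leaves open.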
|
||||
|
||||
- [ ] **Step 4: Add a backend-driven compat runtime entry**
|
||||
|
||||
In `src/compat/runtime.rs`, add a parallel entry that accepts `Arc<dyn BrowserBackend>` directly instead of `BrowserPipeTool<T>`.
|
||||
|
||||
Keep the existing pipe-oriented public function in place.
|
||||
|
||||
The backend-driven entry must preserve:
|
||||
- existing log emission order
|
||||
- tool names (`superrpa_browser`, `browser_action`)
|
||||
- existing browser-script tool gating behavior
|
||||
- existing office/screen tool attachment logic
|
||||
- existing conversation seeding and provider setup
|
||||
|
||||
- [ ] **Step 5: Add backend-driven orchestration and workflow-executor entries**
|
||||
|
||||
In `src/compat/orchestration.rs`, add the matching backend-driven entry so direct-route flows and fallback flows can run with `Arc<dyn BrowserBackend>` on the ws path.
|
||||
|
||||
In `src/compat/workflow_executor.rs`, add backend-driven sibling APIs for any direct-route/fallback execution that is currently hard-wired to `BrowserPipeTool<T>`.
|
||||
|
||||
Keep the existing pipe-oriented orchestration and workflow-executor public functions in place.
|
||||
|
||||
- [ ] **Step 6: Route the new ws-only submit entry through the backend-driven compat/orchestration/workflow-executor path**
|
||||
|
||||
Inside `src/agent/task_runner.rs`, make the new ws-only submit entry call the new backend-based compat/orchestration functions, while the old pipe entry keeps calling the old pipe-based functions.
|
||||
|
||||
This is the core compatibility seam, and it must cover both normal compat-runtime execution and direct-route/fallback workflow execution.
|
||||
|
||||
- [ ] **Step 7: Re-run the new runner test**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test task_runner_test ws_only_submit_task_entry_accepts_browser_backend -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 8: Re-run the full runner, workflow, and browser-script regressions**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test task_runner_test --test browser_script_skill_tool_test -- --nocapture
|
||||
```
|
||||
|
||||
Then run the workflow executor unit coverage that protects direct-route behavior:
|
||||
|
||||
```bash
|
||||
cargo test compat::workflow_executor::tests -- --nocapture
|
||||
```
|
||||
|
||||
Expected: all existing runner, workflow, and browser-script tests still pass, proving the pipe-facing path, direct-route behavior, and eval-gating stayed stable.
|
||||
|
||||
- [ ] **Step 9: Commit**
|
||||
|
||||
```bash
git add src/agent/task_runner.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/task_runner_test.rs tests/browser_script_skill_tool_test.rs
git commit -m "refactor: add ws-only browser backend submit path"
```
|
||||
|
||||
---
|
||||
|
||||
## Task 3: Replace the ws service’s empty-session-key browser tool with a ws-native backend
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs`
|
||||
- Modify: `src/service/mod.rs` only if minimal re-export or signature cleanup is required
|
||||
- Modify: `src/browser/mod.rs` only if export cleanup is needed
|
||||
- Test: `tests/browser_ws_service_adapter_test.rs`
|
||||
- Reuse: `src/browser/ws_backend.rs`, `src/browser/ws_protocol.rs`
|
||||
|
||||
- [ ] **Step 1: Write the smallest failing service-side adapter compile target**
|
||||
|
||||
Add a compile-level or construction-level assertion in `tests/browser_ws_service_adapter_test.rs` that the service path can construct the new service-side ws client type used by `serve_client(...)`.
|
||||
|
||||
This should fail until the type exists in `src/service/server.rs`.
|
||||
|
||||
- [ ] **Step 2: Run the adapter test group and watch the constructor test fail**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_service_adapter_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the service-side ws client type does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Introduce `ServiceBrowserWsClient` in `src/service/server.rs`**
|
||||
|
||||
Create a narrow client type that owns the real websocket connection to `browser_ws_url` and implements `WsClient`.
|
||||
|
||||
Required responsibilities only:
|
||||
- lazily connect on first use
|
||||
- send raw text frames
|
||||
- receive raw text frames with timeout
|
||||
- map close/reset to exactly `PipeError::PipeClosed`
|
||||
- map connect failure to exactly `PipeError::Protocol("browser websocket connect failed: ...")`
|
||||
- map timeouts to exactly `PipeError::Timeout`
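
The three error mappings above could be centralized in two small functions, sketched here with `std::io::Error` as the input. Several things are assumptions: the local `PipeError` enum only mirrors the variants this plan names, the text after each message prefix (elided as `...` in the plan) is filled with the underlying error purely for illustration, and the fallback message for unclassified I/O errors is invented because the plan does not specify one.

```rust
use std::io;

/// Local mirror of the `PipeError` variants this plan names (assumed shape).
#[derive(Debug, PartialEq)]
enum PipeError {
    Timeout,
    PipeClosed,
    Protocol(String),
}

/// Connect failures surface as a protocol error with the fixed prefix.
/// Appending the underlying error is an assumption about the elided suffix.
fn map_connect_error(err: &io::Error) -> PipeError {
    PipeError::Protocol(format!("browser websocket connect failed: {err}"))
}

/// Read/write failures on an already established connection.
fn map_io_error(err: &io::Error) -> PipeError {
    match err.kind() {
        // A blocking socket read that hits its timeout reports one of these
        // two kinds, depending on the platform.
        io::ErrorKind::TimedOut | io::ErrorKind::WouldBlock => PipeError::Timeout,
        io::ErrorKind::ConnectionReset
        | io::ErrorKind::ConnectionAborted
        | io::ErrorKind::BrokenPipe
        | io::ErrorKind::UnexpectedEof => PipeError::PipeClosed,
        // Hypothetical fallback; not specified by the plan.
        _ => PipeError::Protocol(format!("browser websocket i/o failed: {err}")),
    }
}
```

A websocket library will report close and timeout through its own error type; the same classification would then be applied after unwrapping whatever I/O error it carries.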
|
||||
|
||||
Do **not** duplicate `WsBrowserBackend` responsibilities here.
|
||||
|
||||
- [ ] **Step 4: Remove ws-path use of `BrowserPipeTool::new(..., vec![])`**
|
||||
|
||||
In `serve_client(...)`, replace this shape:
|
||||
|
||||
```rust
let transport = Arc::new(ServiceBrowserTransport::new(...));
let browser_tool = BrowserPipeTool::new(transport.clone(), mac_policy.clone(), vec![]);
```
|
||||
|
||||
with the ws-native shape:
|
||||
|
||||
```rust
let ws_client = Arc::new(ServiceBrowserWsClient::new(...));
let browser_backend: Arc<dyn BrowserBackend> = Arc::new(
    WsBrowserBackend::new(ws_client, mac_policy.clone(), initial_request_url(...))
        .with_response_timeout(BROWSER_RESPONSE_TIMEOUT)
);
```
|
||||
|
||||
Then route the task through the new ws-only submit entry from Task 2.
|
||||
|
||||
- [ ] **Step 5: Delete or narrow old ws-path transport code that duplicated protocol handling**
|
||||
|
||||
Remove the service-only callback polling / response queue logic that existed solely to feed `BrowserPipeTool`.
|
||||
|
||||
Keep only what is still needed for:
|
||||
- service client websocket I/O (`sg_claw_client` ↔ `sg_claw`)
|
||||
- browser websocket I/O (`sg_claw` ↔ `browser_ws_url`)
|
||||
|
||||
Do not leave two competing ws protocol implementations in `src/service/server.rs`.
|
||||
|
||||
- [ ] **Step 6: Re-run deterministic adapter/backend tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_service_adapter_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS, including:
|
||||
- navigate success
|
||||
- disconnect => `PipeError::PipeClosed`
|
||||
- callback timeout => `PipeError::Timeout`
|
||||
|
||||
- [ ] **Step 7: Re-run existing ws backend tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_backend_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS, confirming the service adapter change did not break the existing ws backend semantics.
|
||||
|
||||
- [ ] **Step 8: Commit**
|
||||
|
||||
```bash
git add src/service/server.rs src/service/mod.rs src/browser/mod.rs tests/browser_ws_service_adapter_test.rs
git commit -m "feat: switch ws service to ws-native browser backend"
```
|
||||
|
||||
---
|
||||
|
||||
## Task 4: Prove the auth bug is gone and pipe mode is unchanged
|
||||
|
||||
**Files:**
|
||||
- Modify: `tests/service_ws_session_test.rs`
|
||||
- Modify: `tests/service_task_flow_test.rs`
|
||||
- Reuse: `src/lib.rs`, `src/service/mod.rs`, `src/compat/workflow_executor.rs`
|
||||
|
||||
- [ ] **Step 1: Update service session tests for the new ws-only call path**
|
||||
|
||||
Adjust any service session tests that still call `handle_client_message(...)` through the old ws-path `BrowserPipeTool` assumption.
|
||||
|
||||
Prefer one of these narrow approaches:
|
||||
- add a backend-based sibling of `handle_client_message(...)` (Rust has no function overloading, so it needs its own name) used only in ws tests, or
|
||||
- keep `handle_client_message(...)` pipe-oriented and test the ws path through `serve_client(...)` and the real service binary instead
|
||||
|
||||
Choose the option that changes the fewest existing tests and leaves the pipe path simplest.
|
||||
|
||||
- [ ] **Step 2: Run the focused service session file**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test service_ws_session_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 3: Make the auth-regression integration test pass**
|
||||
|
||||
Re-run the exact end-to-end regression from Task 1:
|
||||
|
||||
```bash
|
||||
cargo test --test service_task_flow_test ws_service_submit_task_no_longer_hits_invalid_hmac_seed -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS, with evidence that:
|
||||
- the fake browser websocket server received at least one frame
|
||||
- output no longer contains `invalid hmac seed: session key must not be empty`
|
||||
|
||||
- [ ] **Step 4: Add one explicit mandatory assertion for browser websocket connect failures**
|
||||
|
||||
Add one focused assertion that a browser websocket connect failure surfaces outward as:
|
||||
|
||||
```rust
|
||||
PipeError::Protocol("browser websocket connect failed: ...")
|
||||
```
|
||||
|
||||
Do not leave this semantic implied.
|
||||
|
||||
- [ ] **Step 5: Add one explicit ws direct-route regression**
|
||||
|
||||
Add one focused regression that proves a ws-backed browser backend can traverse a direct-route/fallback path that currently flows through `src/compat/workflow_executor.rs`.
|
||||
|
||||
Keep it deterministic and narrow. Prefer a fake backend plus direct function invocation over a planner-dependent natural-language end-to-end test.
|
||||
|
||||
- [ ] **Step 6: Run the ws-focused regression suite**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Then run the workflow-executor direct-route coverage:
|
||||
|
||||
```bash
|
||||
cargo test compat::workflow_executor::tests -- --nocapture
|
||||
```
|
||||
|
||||
Expected: all ws-focused and direct-route workflow tests pass.
|
||||
|
||||
- [ ] **Step 7: Run the required pipe and browser-script regression suite**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: all required pipe and browser-script regressions pass unchanged.
|
||||
|
||||
- [ ] **Step 8: Run the full relevant verification sweep**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Then run:
|
||||
|
||||
```bash
|
||||
cargo test compat::workflow_executor::tests -- --nocapture
|
||||
```
|
||||
|
||||
Expected: full mixed ws+pipe verification passes in fresh runs.
|
||||
|
||||
- [ ] **Step 9: Build the affected binaries**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
|
||||
```
|
||||
|
||||
Expected: all three binaries compile.
|
||||
|
||||
- [ ] **Step 10: Commit**
|
||||
|
||||
```bash
git add tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/browser_ws_service_adapter_test.rs src/compat/workflow_executor.rs
git commit -m "test: verify ws auth replacement and pipe regressions"
```
|
||||
|
||||
---
|
||||
|
||||
## Task 5: Manual smoke verification against the real browser
|
||||
|
||||
**Files:**
|
||||
- Reuse only: no code changes unless a verified bug is found during smoke work
|
||||
|
||||
- [ ] **Step 1: Start the real browser websocket target**
|
||||
|
||||
Confirm the real sgBrowser endpoint is reachable at the configured `browserWsUrl`.
|
||||
|
||||
- [ ] **Step 2: Start the real ws service**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
|
||||
```
|
||||
|
||||
Expected: service prints the resolved listen address and browser websocket URL.
|
||||
|
||||
- [ ] **Step 3: Run the minimal browser task through the real client**
|
||||
|
||||
Run from a separate terminal with UTF-8-safe input:
|
||||
|
||||
```bash
|
||||
cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
|
||||
```
|
||||
|
||||
Submit (in English: "Open the Zhihu hot list and read the text of the page's main area"):
|
||||
|
||||
```text
|
||||
打开知乎热榜并读取页面主区域文本
|
||||
```
|
||||
|
||||
Expected:
|
||||
- browser actions start executing
|
||||
- no `invalid hmac seed: session key must not be empty`
|
||||
- one final completion is returned
|
||||
|
||||
- [ ] **Step 4: Run the old Zhihu skill smoke**
|
||||
|
||||
Submit (in English: "Read the Zhihu hot-list data and export it to an Excel file"):
|
||||
|
||||
```text
|
||||
读取知乎热榜数据,并导出 excel 文件
|
||||
```
|
||||
|
||||
Expected: the task enters the real browser action path instead of dying at auth initialization.
|
||||
|
||||
- [ ] **Step 5: Re-check the legacy pipe entry without modifying it**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo run
|
||||
```
|
||||
|
||||
Only verify startup behavior appropriate for the current pipe environment. Do not change pipe code during this smoke step.
|
||||
|
||||
- [ ] **Step 6: If a smoke failure appears, stop and debug before editing**
|
||||
|
||||
Any failure found here must be handled with:
|
||||
- a fresh reproducer
|
||||
- a failing automated test if feasible
|
||||
- the smallest scoped fix
|
||||
|
||||
Do not fold speculative smoke fixes into this slice.
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
### Deterministic ws-only tests
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: ws-native backend and service adapter semantics are green without LLM/planner dependencies.
|
||||
|
||||
### Client→service ws chain tests
|
||||
|
||||
```bash
|
||||
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: the ws service path reaches the browser websocket and no longer emits the empty-session-key auth failure.
|
||||
|
||||
### Required pipe and browser-script regressions
|
||||
|
||||
```bash
|
||||
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: legacy pipe behavior and browser-script eval-gating remain unchanged.
|
||||
|
||||
### Binary build verification
|
||||
|
||||
```bash
|
||||
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
|
||||
```
|
||||
|
||||
Expected: all affected binaries compile.
|
||||
|
||||
### Manual end-to-end verification
|
||||
|
||||
- real sgBrowser running at configured `browserWsUrl`
|
||||
- `cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"`
|
||||
- `cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"`
|
||||
- run the Zhihu minimal task
|
||||
- run the old Zhihu export task
|
||||
- verify no `invalid hmac seed` appears
|
||||
- verify pipe startup still behaves as before
|
||||
|
||||
---
|
||||
|
||||
## Notes for Implementation
|
||||
|
||||
- Keep the current pipe bootstrap in `src/lib.rs` untouched.
|
||||
- Prefer adding ws-only functions over changing existing pipe signatures.
|
||||
- Reuse `WsBrowserBackend` for protocol semantics; do not re-implement callback handling inside the service.
|
||||
- Keep `ServiceBrowserWsClient` narrow: connection lifecycle + raw websocket I/O only.
|
||||
- Preserve exact outward error semantics from the spec:
|
||||
- connect failure => `PipeError::Protocol("browser websocket connect failed: ...")`
|
||||
- non-zero status => `PipeError::Protocol("browser returned non-zero status: ...")`
|
||||
- callback timeout => `PipeError::Timeout`
|
||||
- close/reset => `PipeError::PipeClosed`
|
||||
- Do not claim success until the mixed ws+pipe verification commands have been run fresh.
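
For the non-zero-status rule in the error semantics listed above, a status check in the backend might look like the sketch below. The local `PipeError` enum, the `check_browser_status` name, and the `i64` status type are assumptions; the text after the message prefix is elided in the plan and is filled with the raw status here only for illustration.

```rust
/// Local mirror of the relevant `PipeError` variant (assumed shape).
#[derive(Debug, PartialEq)]
enum PipeError {
    Protocol(String),
}

/// Status `0` means the browser accepted the call; anything else is an error.
fn check_browser_status(status: i64) -> Result<(), PipeError> {
    if status == 0 {
        Ok(())
    } else {
        Err(PipeError::Protocol(format!(
            "browser returned non-zero status: {status}"
        )))
    }
}
```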
|
||||
482
docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md
Normal file
@@ -0,0 +1,482 @@
|
||||
# WS Browser Bridge Path Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Replace the raw-ws-direct browser execution assumption with a bridge-backed browser integration path that matches the validated FunctionsUI / BrowserAction / CommandRouter model while preserving existing pipe behavior.
|
||||
|
||||
**Architecture:** Keep the current Rust-side browser orchestration flow centered on `Arc<dyn BrowserBackend>`, but stop treating `WsBrowserBackend` as the real production browser surface. Model the validated bridge as two explicit layers: Layer 1 session/lifecycle calls (`sgclawConnect`, `sgclawStart`, `sgclawStop`, `sgclawSubmitTask`) and Layer 2 browser-action execution (`window.sgFunctionsUI(...)`, `window.BrowserAction(...)`, `CommandRouter`). The new backend targets Layer 2 only through a narrow repo-local `BridgeActionTransport` seam, while lifecycle/session concerns stay separate from per-action browser execution.
|
||||
|
||||
**Tech Stack:** Rust 2021, existing `BrowserBackend` abstraction, compat/runtime/orchestration stack, current service/task runner integration, existing bridge-oriented design docs, existing Rust unit/integration test suite.
|
||||
|
||||
---
|
||||
|
||||
## Scope Guardrails
|
||||
|
||||
- Do **not** continue extending raw external sgBrowser websocket business-frame handling as the mainline path.
|
||||
- Do **not** modify `src/lib.rs`, pipe handshake behavior, or the working `BrowserPipeTool` path.
|
||||
- Do **not** invent a parallel browser-command contract unrelated to the documented bridge surface.
|
||||
- Do **not** rewrite the whole compat/runtime stack when a narrow adapter will do.
|
||||
- Do **not** assume access to the full SuperRPA browser-host codebase from this repository; encode the validated contract at the nearest seam available here.
|
||||
|
||||
---
|
||||
|
||||
## File Structure
|
||||
|
||||
### Existing files to modify
|
||||
|
||||
- Modify: `src/browser/mod.rs`
|
||||
- export the new bridge contract/transport/backend modules explicitly
|
||||
- Modify: `src/browser/backend.rs`
|
||||
- only if a tiny shared helper or trait documentation update is needed for the new bridge-backed backend
|
||||
- Modify: `src/compat/browser_tool_adapter.rs`
|
||||
- ensure existing browser action mapping remains reusable with the new backend implementation
|
||||
- Modify: `src/compat/runtime.rs`
|
||||
- wire the bridge-backed browser backend into the ws service/browser execution path without changing the pipe path
|
||||
- Modify: `src/compat/orchestration.rs`
|
||||
- only where browser backend wiring requires the bridge-backed path to flow through orchestration
|
||||
- Modify: `src/compat/workflow_executor.rs`
|
||||
- preserve direct-route/fallback use of `BrowserBackend` when the backend is bridge-backed instead of websocket-backed
|
||||
- Modify: `src/service/server.rs`
|
||||
- replace the current real-browser execution assumption with bridge-backend construction plus a repo-local bridge transport provider seam for the relevant service path
|
||||
- Modify: `tests/compat_browser_tool_test.rs`
|
||||
- extend browser tool mapping coverage if needed for bridge-backed execution
|
||||
- Modify: `tests/service_task_flow_test.rs`
|
||||
- replace raw-ws-direct expectations with bridge-path expectations where appropriate
|
||||
- Modify: `tests/service_ws_session_test.rs`
|
||||
- update service-side tests if they currently assume the real browser path is raw websocket driven
|
||||
|
||||
### New files to create
|
||||
|
||||
- Create: `src/browser/bridge_contract.rs`
|
||||
- narrow, explicit contract types that keep lifecycle/session bridge calls separate from browser-action execution requests/replies
|
||||
- Create: `src/browser/bridge_transport.rs`
|
||||
- repo-local `BridgeActionTransport` seam used by the backend and injected by service/runtime wiring
|
||||
- Create: `src/browser/bridge_backend.rs`
|
||||
- new `BrowserBackend` implementation that maps browser actions onto the Layer-2 bridge action contract through `BridgeActionTransport`
|
||||
- Create: `tests/browser_bridge_backend_test.rs`
|
||||
- deterministic unit tests for action-to-bridge mapping and reply/error normalization using a fake bridge transport
|
||||
- Create: `tests/browser_bridge_contract_test.rs`
|
||||
- narrow tests proving the two bridge layers stay explicit and browser-action requests remain semantic rather than raw-websocket-shaped
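
The new backend and transport files could fit together roughly as sketched below. Only `BridgeActionTransport` and `BrowserBackend` are names from this plan; the method signatures, the request and reply fields, `BridgeBrowserBackend`, `FakeBridgeTransport`, and the error text are hypothetical placeholders.

```rust
use std::sync::{Arc, Mutex};

/// Semantic browser-action request (hypothetical fields).
#[derive(Debug, Clone, PartialEq)]
struct BridgeActionRequest {
    action: String,
    target: String,
}

/// Reply from the bridge (hypothetical fields).
#[derive(Debug, Clone, PartialEq)]
struct BridgeActionReply {
    ok: bool,
    payload: String,
}

/// Narrow seam: the backend only needs "execute one browser action".
trait BridgeActionTransport: Send + Sync {
    fn execute(&self, request: BridgeActionRequest) -> Result<BridgeActionReply, String>;
}

/// Simplified stand-in for the existing `BrowserBackend` trait.
trait BrowserBackend {
    fn invoke(&self, action: &str, target: &str) -> Result<String, String>;
}

/// Backend that targets Layer 2 only, through the injected transport.
struct BridgeBrowserBackend {
    transport: Arc<dyn BridgeActionTransport>,
}

impl BrowserBackend for BridgeBrowserBackend {
    fn invoke(&self, action: &str, target: &str) -> Result<String, String> {
        let reply = self.transport.execute(BridgeActionRequest {
            action: action.to_string(),
            target: target.to_string(),
        })?;
        if reply.ok {
            Ok(reply.payload)
        } else {
            Err(format!("bridge action failed: {}", reply.payload))
        }
    }
}

/// Fake transport for deterministic tests: records requests, always succeeds.
#[derive(Default)]
struct FakeBridgeTransport {
    seen: Mutex<Vec<BridgeActionRequest>>,
}

impl BridgeActionTransport for FakeBridgeTransport {
    fn execute(&self, request: BridgeActionRequest) -> Result<BridgeActionReply, String> {
        let payload = format!("{} ok", request.action);
        self.seen.lock().unwrap().push(request);
        Ok(BridgeActionReply { ok: true, payload })
    }
}
```

Because the backend depends only on the trait, service wiring can inject a real bridge transport while `tests/browser_bridge_backend_test.rs` injects the fake.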
|
||||
|
||||
### Evidence files to consult during implementation
|
||||
|
||||
- Read: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
|
||||
- Read: `frontend/archive/sgClaw验证-已归档/testRunner.js`
|
||||
- Read: `docs/superpowers/specs/2026-03-25-superrpa-sgclaw-browser-control-design.md`
|
||||
- Read: `docs/archive/项目管理与排期/协作时间表.md`
|
||||
- Read: `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`
|
||||
|
||||
---
|
||||
|
||||
## Task 1: Lock the bridge contract in deterministic tests before adding the backend
|
||||
|
||||
**Files:**
|
||||
- Create: `src/browser/bridge_contract.rs`
|
||||
- Create: `tests/browser_bridge_contract_test.rs`
|
||||
- Reuse as design evidence:
|
||||
- `frontend/archive/sgClaw验证-已归档/testRunner.js`
|
||||
- `docs/archive/项目管理与排期/协作时间表.md`
|
||||
- `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`
|
||||
|
||||
- [ ] **Step 1: Write the first failing contract test for named bridge calls**
|
||||
|
||||
Create `tests/browser_bridge_contract_test.rs` with one focused test that encodes the bridge naming expectations already evidenced in the repo.
|
||||
|
||||
Start with a test shape like:
|
||||
|
||||
```rust
#[test]
fn bridge_contract_names_match_documented_bridge_surface() {
    // assert the contract contains the exact bridge action names
}
```
|
||||
|
||||
Required expectations:
|
||||
- `sgclawConnect`
|
||||
- `sgclawStart`
|
||||
- `sgclawStop`
|
||||
- `sgclawSubmitTask`
|
||||
- these names live in an explicit lifecycle/session contract type, not in the browser-action request type
|
||||
|
||||
Do **not** invent additional action names in this first test.
|
||||
|
||||
- [ ] **Step 2: Run the single contract test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_bridge_contract_test bridge_contract_names_match_documented_bridge_surface -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because `src/browser/bridge_contract.rs` does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Add the second failing contract test for browser-action request shaping**
|
||||
|
||||
In the same file, add one focused test proving the bridge contract can represent a browser action request without leaking raw websocket business-frame semantics.
|
||||
|
||||
Test shape:
|
||||
|
||||
```rust
#[test]
fn bridge_contract_represents_browser_action_requests_without_ws_business_frames() {
    // create a click/navigate/getText style action request and assert shape
}
```
|
||||
|
||||
Required assertions:
|
||||
- request shape identifies the intended browser action semantically
|
||||
- request shape is distinct from the lifecycle/session bridge call type
|
||||
- request shape does **not** embed `sgBrowerserOpenPage`, `callBackJsToCpp`, or other raw websocket business-frame names
|
||||
|
||||
- [ ] **Step 4: Run the second contract test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_bridge_contract_test bridge_contract_represents_browser_action_requests_without_ws_business_frames -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the bridge contract does not exist yet.
|
||||
|
||||
- [ ] **Step 5: Implement the minimal bridge contract module**
|
||||
|
||||
Create `src/browser/bridge_contract.rs` with only the types needed by the tests.
|
||||
|
||||
Recommended shape:
|
||||
|
||||
```rust
pub enum BridgeLifecycleCall {
    Connect,
    Start,
    Stop,
    SubmitTask,
}

impl BridgeLifecycleCall {
    pub fn bridge_name(&self) -> &'static str {
        match self {
            Self::Connect => "sgclawConnect",
            Self::Start => "sgclawStart",
            Self::Stop => "sgclawStop",
            Self::SubmitTask => "sgclawSubmitTask",
        }
    }
}

pub struct BridgeBrowserActionRequest {
    pub action: String,
    pub params: serde_json::Value,
    pub expected_domain: String,
}
```
|
||||
|
||||
Rules:
|
||||
- model the documented bridge/lifecycle naming explicitly
|
||||
- keep the browser action request semantic, not websocket-frame-shaped
|
||||
- keep the module small and repository-local
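
As a rough illustration of how the two contract tests could pin this shape down, the sketch below restates the recommended types using only the standard library. It makes two assumptions that are not in the plan: `params` is simplified to a plain `String` holding JSON text (instead of `serde_json::Value`), and `leaks_raw_ws_frame_name` and `BridgeLifecycleCall::ALL` are hypothetical helpers introduced only for the example.

```rust
/// Lifecycle/session bridge calls, named as in the plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BridgeLifecycleCall {
    Connect,
    Start,
    Stop,
    SubmitTask,
}

impl BridgeLifecycleCall {
    /// Hypothetical helper: every lifecycle call, in a fixed order.
    pub const ALL: [BridgeLifecycleCall; 4] =
        [Self::Connect, Self::Start, Self::Stop, Self::SubmitTask];

    pub fn bridge_name(&self) -> &'static str {
        match self {
            Self::Connect => "sgclawConnect",
            Self::Start => "sgclawStart",
            Self::Stop => "sgclawStop",
            Self::SubmitTask => "sgclawSubmitTask",
        }
    }
}

/// Semantic browser-action request, kept separate from the lifecycle type.
/// Assumption: `params` is JSON text here to avoid a serde_json dependency.
#[derive(Debug, Clone, PartialEq)]
pub struct BridgeBrowserActionRequest {
    pub action: String,
    pub params: String,
    pub expected_domain: String,
}

/// Raw websocket business-frame names the request must not embed.
const RAW_WS_FRAME_NAMES: [&str; 2] = ["sgBrowerserOpenPage", "callBackJsToCpp"];

/// Hypothetical check: true if the action or params mention a raw frame name.
pub fn leaks_raw_ws_frame_name(request: &BridgeBrowserActionRequest) -> bool {
    RAW_WS_FRAME_NAMES
        .iter()
        .any(|name| request.action.contains(name) || request.params.contains(name))
}
```

A contract test could then assert on `bridge_name()` for each lifecycle variant and on `leaks_raw_ws_frame_name` returning `false` for a semantic `navigate` request.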
|
||||
|
||||
- [ ] **Step 6: Re-run the contract tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_bridge_contract_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
|
||||
git add src/browser/bridge_contract.rs tests/browser_bridge_contract_test.rs
|
||||
git commit -m "test: define sgClaw bridge contract surface"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 2: Add the repo-local transport seam and bridge-backed `BrowserBackend`
|
||||
|
||||
**Files:**
|
||||
- Create: `src/browser/bridge_transport.rs`
|
||||
- Create: `src/browser/bridge_backend.rs`
|
||||
- Create: `tests/browser_bridge_backend_test.rs`
|
||||
- Reuse: `src/browser/backend.rs`
|
||||
- Reuse: `src/browser/bridge_contract.rs`
|
||||
- Reuse: `src/compat/browser_tool_adapter.rs`
|
||||
|
||||
- [ ] **Step 1: Write the first failing backend test for action mapping**
|
||||
|
||||
Create `tests/browser_bridge_backend_test.rs` with one focused test proving a `BrowserBackend` action is translated into the bridge contract request shape.
|
||||
|
||||
Start with a narrow action such as `Action::Navigate`.
|
||||
|
||||
Required assertions:
|
||||
- `Action::Navigate` becomes one semantic bridge browser-action request
|
||||
- the request preserves action parameters and expected domain
|
||||
- the test does **not** assert any raw websocket payload strings
|
||||
|
||||
- [ ] **Step 2: Run the first backend test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_bridge_backend_test bridge_backend_maps_navigate_to_bridge_action_request -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because `src/browser/bridge_backend.rs` does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Add the second failing backend test for reply normalization**
|
||||
|
||||
Add one focused test proving the backend can normalize a successful bridge reply into the existing `CommandOutput` shape expected by `BrowserBackend` callers.
|
||||
|
||||
- [ ] **Step 4: Run the second backend test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_bridge_backend_test bridge_backend_normalizes_successful_bridge_reply -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the backend does not exist yet.
|
||||
|
||||
- [ ] **Step 5: Add the third failing backend test for bridge-side errors**
|
||||
|
||||
Add one focused test proving a bridge-side error normalizes into the correct outward `PipeError` semantics for backend callers.
|
||||
|
||||
- [ ] **Step 6: Run the error-path test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_bridge_backend_test bridge_backend_maps_bridge_failure_to_pipe_error -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the backend does not exist yet.
|
||||
|
||||
- [ ] **Step 7: Implement the minimal transport seam and bridge backend**
|
||||
|
||||
Create `src/browser/bridge_transport.rs` and `src/browser/bridge_backend.rs`.
|
||||
|
||||
The transport seam must:
|
||||
- define the repo-local `BridgeActionTransport` contract used for Layer-2 browser-action execution only
|
||||
- accept semantic `BridgeBrowserActionRequest` values and return semantic success/error replies
|
||||
- remain small, explicit, and easy to fake in tests
|
||||
|
||||
The backend must:
|
||||
- implement the existing `BrowserBackend` trait
|
||||
- translate supported actions into `BridgeBrowserActionRequest`
|
||||
- depend on `BridgeActionTransport` instead of raw websocket payload building
|
||||
- normalize success/error replies into existing backend-facing result types
|
||||
|
||||
Rules:
|
||||
- do not embed raw websocket business-frame names
|
||||
- do not change `BrowserBackend` semantics for existing callers
|
||||
- do not pull lifecycle/session bridge calls into this backend layer
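
The seam and backend described above might look roughly like the following sketch. The real `Action`, `CommandOutput`, `PipeError`, and `BrowserBackend` definitions live in the repository and are not shown in this plan, so the versions here are simplified stand-ins; `BridgeActionReply`, `BridgeBackend::invoke`, and `FakeTransport` are likewise hypothetical names chosen for illustration, and `params` is a plain `String` to keep the example dependency-free.

```rust
use std::cell::RefCell;

// Stand-ins for repo types whose real definitions are not shown in the plan.
#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    Navigate { url: String },
}

#[derive(Debug, Clone, PartialEq)]
pub struct CommandOutput {
    pub success: bool,
    pub data: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum PipeError {
    Protocol(String),
}

#[derive(Debug, Clone, PartialEq)]
pub struct BridgeBrowserActionRequest {
    pub action: String,
    pub params: String,
    pub expected_domain: String,
}

/// Hypothetical semantic reply type returned by the transport.
#[derive(Debug, Clone, PartialEq)]
pub enum BridgeActionReply {
    Success { data: String },
    Error { message: String },
}

/// Layer-2 seam: executes one semantic browser-action request.
pub trait BridgeActionTransport {
    fn execute(&self, request: BridgeBrowserActionRequest) -> BridgeActionReply;
}

pub struct BridgeBackend<T: BridgeActionTransport> {
    pub transport: T,
}

impl<T: BridgeActionTransport> BridgeBackend<T> {
    pub fn new(transport: T) -> Self {
        Self { transport }
    }

    /// Maps an action to a bridge request and normalizes the reply.
    pub fn invoke(&self, action: Action, expected_domain: &str) -> Result<CommandOutput, PipeError> {
        let request = match action {
            Action::Navigate { url } => BridgeBrowserActionRequest {
                action: "navigate".to_string(),
                params: url,
                expected_domain: expected_domain.to_string(),
            },
        };
        match self.transport.execute(request) {
            BridgeActionReply::Success { data } => Ok(CommandOutput { success: true, data }),
            BridgeActionReply::Error { message } => Err(PipeError::Protocol(message)),
        }
    }
}

/// Fake transport for tests: records requests and returns a canned reply.
pub struct FakeTransport {
    reply: BridgeActionReply,
    pub seen: RefCell<Vec<BridgeBrowserActionRequest>>,
}

impl FakeTransport {
    pub fn replying(reply: BridgeActionReply) -> Self {
        Self { reply, seen: RefCell::new(Vec::new()) }
    }
}

impl BridgeActionTransport for FakeTransport {
    fn execute(&self, request: BridgeBrowserActionRequest) -> BridgeActionReply {
        self.seen.borrow_mut().push(request);
        self.reply.clone()
    }
}
```

Because the backend only sees the trait, the three backend tests can inject `FakeTransport` and assert on the recorded request and the normalized result without touching any websocket payloads.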
|
||||
|
||||
- [ ] **Step 8: Re-run the bridge backend tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_bridge_backend_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 9: Re-run browser tool adapter coverage**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_browser_tool_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS, proving the existing browser action mapping remains reusable with the new backend.
|
||||
|
||||
- [ ] **Step 10: Commit**
|
||||
|
||||
```bash
|
||||
git add src/browser/bridge_transport.rs src/browser/bridge_backend.rs tests/browser_bridge_backend_test.rs src/compat/browser_tool_adapter.rs src/browser/mod.rs
|
||||
git commit -m "feat: add bridge-backed browser backend"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 3: Wire the bridge-backed backend into the real-browser service path
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs`
|
||||
- Modify: `src/compat/runtime.rs`
|
||||
- Modify: `src/compat/orchestration.rs`
|
||||
- Modify: `src/compat/workflow_executor.rs`
|
||||
- Modify: `tests/service_task_flow_test.rs`
|
||||
- Modify: `tests/service_ws_session_test.rs`
|
||||
- Reuse: `src/browser/bridge_backend.rs`
|
||||
- Reuse: `src/browser/bridge_contract.rs`
|
||||
- Reuse: `src/browser/bridge_transport.rs`
|
||||
|
||||
- [ ] **Step 1: Write the first failing service-path test for bridge backend construction**
|
||||
|
||||
Add or update one focused service test proving the real-browser execution path constructs and uses the bridge-backed backend instead of the raw websocket backend assumption.
|
||||
|
||||
The test should observe backend selection at the nearest possible seam.
|
||||
|
||||
- [ ] **Step 2: Run the focused service test and verify it fails**
|
||||
|
||||
Run the narrowest affected service test command.
|
||||
|
||||
Expected: FAIL because the service path is not wired to the bridge backend yet.
|
||||
|
||||
- [ ] **Step 3: Add the minimal service/runtime wiring**
|
||||
|
||||
Change the relevant service/browser execution path so it constructs the new bridge-backed backend, injects the repo-local bridge transport provider at the nearest seam, and passes the backend through the existing runtime/orchestration flow.
|
||||
|
||||
Rules:
|
||||
- keep the pipe path unchanged
|
||||
- keep changes localized
|
||||
- keep lifecycle/session bridge handling separate from per-action browser execution
|
||||
- preserve existing runtime log and task flow behavior where possible
|
||||
|
||||
- [ ] **Step 4: Add one direct-route/fallback regression**
|
||||
|
||||
Add one focused regression proving a bridge-backed backend still works through the direct-route or fallback path exercised by `src/compat/workflow_executor.rs`.
|
||||
|
||||
- [ ] **Step 5: Run the bridge-focused service tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Re-run workflow/runtime regressions**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test compat::workflow_executor::tests -- --nocapture
|
||||
cargo test --test compat_browser_tool_test --test browser_script_skill_tool_test --test task_runner_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
|
||||
git add src/service/server.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/service_ws_session_test.rs tests/service_task_flow_test.rs
|
||||
git commit -m "refactor: route real browser path through bridge backend"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 4: Verify bridge-path behavior without pipe regression
|
||||
|
||||
**Files:**
|
||||
- Reuse existing files only, unless a failing test proves a minimal fix is still needed
|
||||
|
||||
- [ ] **Step 1: Run bridge/backend unit coverage**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_bridge_contract_test --test browser_bridge_backend_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 2: Run service/runtime bridge-path regressions**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 3: Run required pipe regressions**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 4: Build the affected binaries**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Stop if any regression points back to raw websocket assumptions**
|
||||
|
||||
If any test still encodes raw external websocket business-frame assumptions as the real-browser path, update that test to the bridge-backed design rather than patching production code to satisfy the old assumption.
|
||||
|
||||
- [ ] **Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git add tests/browser_bridge_contract_test.rs tests/browser_bridge_backend_test.rs tests/service_ws_session_test.rs tests/service_task_flow_test.rs
|
||||
git commit -m "test: verify bridge path and preserve pipe behavior"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
### Bridge contract tests
|
||||
|
||||
```bash
|
||||
cargo test --test browser_bridge_contract_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: documented bridge names and semantic browser-action request shaping are locked.
|
||||
|
||||
### Bridge backend tests
|
||||
|
||||
```bash
|
||||
cargo test --test browser_bridge_backend_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: backend action mapping and reply/error normalization are green.
|
||||
|
||||
### Service/runtime integration tests
|
||||
|
||||
```bash
|
||||
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
|
||||
cargo test compat::workflow_executor::tests -- --nocapture
|
||||
```
|
||||
|
||||
Expected: real-browser path uses the bridge-backed backend and direct-route/fallback behavior remains intact.
|
||||
|
||||
### Pipe regressions
|
||||
|
||||
```bash
|
||||
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: pipe path remains unchanged.
|
||||
|
||||
### Binary build verification
|
||||
|
||||
```bash
|
||||
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
|
||||
```
|
||||
|
||||
Expected: affected binaries compile.
|
||||
|
||||
---
|
||||
|
||||
## Notes for Implementation
|
||||
|
||||
- The websocket probe work stays in the repository as diagnostic tooling; do not repurpose it into the bridge adapter.
|
||||
- `docs/_tmp_sgbrowser_ws_probe_transcript.md` is evidence that rejected the raw-ws-direct assumption, not a contract to keep satisfying.
|
||||
- Favor one narrow bridge-backed backend over broad runtime rewrites.
|
||||
- If the nearest repo-local seam is still slightly abstract because the external SuperRPA host code is outside this repository, make that abstraction explicit and test it rather than guessing hidden behavior.
|
||||
@@ -0,0 +1,566 @@
|
||||
# WS Browser Integration Surface Correction Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Replace the unvalidated raw-ws-direct assumption with an evidence-backed decision: either prove a minimal sgBrowser bootstrap sequence for raw websocket control, or pivot to the real browser bridge surface.
|
||||
|
||||
**Architecture:** Treat the existing ws-native backend as a protocol/testing asset, not as a validated production integration surface. First build a narrow probe/validation harness that can run candidate bootstrap sequences and capture exact live transcripts from the real endpoint. Then branch decisively: if a reproducible bootstrap sequence yields real status/callback frames, implement that bootstrap path; otherwise stop raw-ws speculation and write the bridge-first implementation slice.
|
||||
|
||||
**Tech Stack:** Rust 2021, existing `src/browser/ws_protocol.rs` / `src/browser/ws_backend.rs`, service websocket infrastructure, `tungstenite`, `serde_json`, current Rust test suite, local sgBrowser websocket documentation.
|
||||
|
||||
---
|
||||
|
||||
## Scope Guardrails
|
||||
|
||||
- Do **not** add more speculative production fixes to `src/service/server.rs` just to “try one more thing.”
|
||||
- Do **not** claim raw websocket is the supported path unless the live probe transcript proves it.
|
||||
- Do **not** modify `src/lib.rs`, pipe handshake behavior, or the pipe browser-tool path.
|
||||
- Do **not** implement both the bootstrap architecture and the bridge architecture in the same branch.
|
||||
- Keep the ws-native code unless and until the bridge decision makes specific pieces obsolete.
|
||||
- Prefer a dedicated probe surface over embedding validation logic into production request handling.
|
||||
|
||||
---
|
||||
|
||||
## File Structure
|
||||
|
||||
### Existing files to modify
|
||||
|
||||
- Modify: `src/browser/mod.rs`
|
||||
- export the new `ws_probe` module so both tests and the probe binary use the same library surface
|
||||
- Modify: `src/browser/ws_protocol.rs`
|
||||
- only if a tiny helper extraction is required for test/probe readability
|
||||
- do not change existing protocol semantics in this slice
|
||||
- Modify: `tests/browser_ws_protocol_test.rs`
|
||||
- add deterministic coverage for any extracted helper used by the probe harness
|
||||
|
||||
### New files to create
|
||||
|
||||
- Create: `src/bin/sgbrowser_ws_probe.rs`
|
||||
- standalone diagnostic binary for ordered frame-script probing against a live sgBrowser websocket endpoint
|
||||
- Create: `src/browser/ws_probe.rs`
|
||||
- small reusable probe/transcript module, if needed, to keep the binary and tests focused
|
||||
- Create: `tests/browser_ws_probe_test.rs`
|
||||
- deterministic fake-server tests for transcript capture, timeout reporting, and scripted sequence execution
|
||||
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md` **only if Option B wins after probing**
|
||||
- follow-up bridge design, not part of the initial coding slice
|
||||
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md` **only if Option B wins after probing**
|
||||
- follow-up bridge implementation plan, not part of the initial coding slice
|
||||
- Create: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
|
||||
- temporary evidence artifact capturing the real endpoint probe matrix and outcomes
|
||||
|
||||
### Files deliberately not changed in the initial slice
|
||||
|
||||
- `src/lib.rs`
|
||||
- `src/agent/task_runner.rs`
|
||||
- `src/compat/runtime.rs`
|
||||
- `src/compat/orchestration.rs`
|
||||
- `src/compat/workflow_executor.rs`
|
||||
- `src/browser/ws_backend.rs`
|
||||
|
||||
Unless the probe results prove a real bootstrap contract, these files stay untouched.
|
||||
|
||||
---
|
||||
|
||||
## Task 1: Build a deterministic websocket probe harness before touching production behavior
|
||||
|
||||
**Files:**
|
||||
- Create: `src/browser/ws_probe.rs`
|
||||
- Create: `tests/browser_ws_probe_test.rs`
|
||||
- Reuse: `src/browser/ws_protocol.rs`
|
||||
|
||||
- [ ] **Step 1: Write the first failing transcript test**
|
||||
|
||||
Create `tests/browser_ws_probe_test.rs` with one focused fake-server test that executes a scripted sequence of outgoing text frames and records all received text frames in order.
|
||||
|
||||
Start with this shape:
|
||||
|
||||
```rust
#[test]
fn probe_records_welcome_then_silence_transcript() {
    // fake server sends one welcome frame and then stays silent
    // probe result should preserve that exact transcript and mark timeout/silence explicitly
}
```
|
||||
|
||||
Required assertions:
|
||||
- the probe can connect to the fake websocket server
|
||||
- it can send a scripted first frame
|
||||
- it records the first inbound text frame exactly
|
||||
- it returns a transcript/result object that distinguishes timeout from protocol parse failure
|
||||
|
||||
- [ ] **Step 2: Run the single new test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_probe_test probe_records_welcome_then_silence_transcript -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the probe harness does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Add the second failing probe test for ordered multi-step scripts**
|
||||
|
||||
In the same file, add a test proving the harness can run multiple outgoing frames in a fixed order and keep the transcript segmented by step.
|
||||
|
||||
Test shape:
|
||||
|
||||
```rust
#[test]
fn probe_runs_ordered_frame_script_and_records_per_step_results() {
    // send bootstrap frame 1, bootstrap frame 2, then minimal action
    // fake server replies differently at each step
    // probe result preserves exact order and outcomes
}
```
|
||||
|
||||
Required assertions:
|
||||
- outgoing frames are sent in the configured order
|
||||
- inbound frames are attached to the correct step
|
||||
- the probe can stop the sequence on timeout/close if configured
|
||||
|
||||
- [ ] **Step 4: Run the ordered-script test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_probe_test probe_runs_ordered_frame_script_and_records_per_step_results -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the probe harness does not exist yet.
|
||||
|
||||
- [ ] **Step 5: Add the third failing probe test for close/reset visibility**
|
||||
|
||||
Add one focused fake-server test that closes the connection after a script step and asserts the transcript reports close/reset rather than generic timeout.
|
||||
|
||||
- [ ] **Step 6: Run the close/reset test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_probe_test probe_reports_socket_close_separately_from_timeout -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the probe harness does not exist yet.
|
||||
|
||||
- [ ] **Step 7: Implement the minimal probe module**
|
||||
|
||||
Create `src/browser/ws_probe.rs` with only the types and behavior needed by the tests.
|
||||
|
||||
Recommended shape:
|
||||
|
||||
```rust
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
    ConnectFailed(String),
}

pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

pub fn run_probe_script(/* ws url, timeout, steps */) -> Result<Vec<ProbeStepResult>, ProbeError> {
    // connect, send ordered frames, collect exact transcript
}
```
|
||||
|
||||
Rules:
|
||||
- do not parse business meaning yet
|
||||
- do not mix this into normal task execution
|
||||
- preserve exact raw text frames in transcript results
|
||||
- keep the module small and diagnostic-oriented
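
The per-step loop behind `run_probe_script` can be sketched without a real websocket by putting the connection behind a small trait. Everything beyond the three recommended types is an assumption made for this example: `FrameChannel`, `RecvEvent`, `run_probe_over`, and `ScriptedChannel` are hypothetical names, the loop reads at most one inbound event per step, and a step with `expect_reply == false` is recorded as `Received` with no frames. A real implementation would wrap a `tungstenite` connection and a read timeout behind the same trait.

```rust
use std::collections::VecDeque;

pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

impl ProbeStep {
    pub fn new(label: &str, payload: &str) -> Self {
        Self { label: label.to_string(), payload: payload.to_string(), expect_reply: true }
    }
}

#[derive(Debug, Clone, PartialEq)]
pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
    ConnectFailed(String),
}

#[derive(Debug, Clone, PartialEq)]
pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

/// Hypothetical: what one read attempt on the channel produced.
pub enum RecvEvent {
    Frame(String),
    Timeout,
    Closed,
}

/// Hypothetical seam over the websocket connection.
pub trait FrameChannel {
    fn send(&mut self, payload: &str);
    fn recv(&mut self) -> RecvEvent;
}

/// Sends each step in order, records raw text verbatim, stops on close.
pub fn run_probe_over<C: FrameChannel>(channel: &mut C, steps: &[ProbeStep]) -> Vec<ProbeStepResult> {
    let mut results = Vec::new();
    for step in steps {
        channel.send(&step.payload);
        let outcome = if !step.expect_reply {
            ProbeOutcome::Received(Vec::new())
        } else {
            match channel.recv() {
                RecvEvent::Frame(text) => ProbeOutcome::Received(vec![text]),
                RecvEvent::Timeout => ProbeOutcome::TimedOut,
                RecvEvent::Closed => ProbeOutcome::Closed,
            }
        };
        let closed = outcome == ProbeOutcome::Closed;
        results.push(ProbeStepResult {
            label: step.label.clone(),
            sent: step.payload.clone(),
            outcome,
        });
        if closed {
            break; // nothing more can be sent on a closed socket
        }
    }
    results
}

/// Fake channel for tests: replays scripted events, then reports timeouts.
pub struct ScriptedChannel {
    pub sent: Vec<String>,
    events: VecDeque<RecvEvent>,
}

impl ScriptedChannel {
    pub fn new(events: Vec<RecvEvent>) -> Self {
        Self { sent: Vec::new(), events: events.into() }
    }
}

impl FrameChannel for ScriptedChannel {
    fn send(&mut self, payload: &str) {
        self.sent.push(payload.to_string());
    }

    fn recv(&mut self) -> RecvEvent {
        self.events.pop_front().unwrap_or(RecvEvent::Timeout)
    }
}
```

Keeping timeout and close as separate variants is what lets the close/reset test tell the two apart.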
|
||||
|
||||
- [ ] **Step 8: Re-run the new probe tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_probe_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 9: Commit**
|
||||
|
||||
```bash
|
||||
git add src/browser/ws_probe.rs tests/browser_ws_probe_test.rs
|
||||
git commit -m "test: add sgBrowser websocket probe harness"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 2: Add a standalone probe binary for live sgBrowser evidence collection
|
||||
|
||||
**Files:**
|
||||
- Create: `src/bin/sgbrowser_ws_probe.rs`
- Modify: `src/browser/ws_probe.rs` (created in Task 1)
- Modify: `src/browser/mod.rs`
- Modify: `tests/browser_ws_probe_test.rs` (created in Task 1)
|
||||
|
||||
- [ ] **Step 1: Write the failing helper parser test**
|
||||
|
||||
In `tests/browser_ws_probe_test.rs`, add one focused test for a new helper function in `src/browser/ws_probe.rs`:
|
||||
|
||||
```rust
#[test]
fn parse_probe_args_accepts_ws_url_timeout_and_ordered_steps() {
    // parse a fixed argv-style slice into a ProbeCliConfig
}
```
|
||||
|
||||
Create and use this exact helper shape:
|
||||
|
||||
```rust
pub struct ProbeCliConfig {
    pub ws_url: String,
    pub timeout_ms: u64,
    pub steps: Vec<ProbeStep>,
}

pub fn parse_probe_args(args: &[String]) -> Result<ProbeCliConfig, ProbeError>
```
|
||||
|
||||
The test must assert that these exact arguments parse successfully and preserve step order:
|
||||
|
||||
```text
|
||||
--ws-url ws://127.0.0.1:12345
|
||||
--timeout-ms 1500
|
||||
--step open-agent::["about:blank","sgOpenAgent"]
|
||||
--step open-hot::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
|
||||
```
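
One way the argument format above could be parsed is sketched below, using only the standard library. The `label::payload` split on the first `::`, the `ProbeError::InvalidArgs` variant, and the choice to default `expect_reply` to `true` are assumptions for illustration; the plan fixes only the `ProbeCliConfig` fields and the `parse_probe_args` signature.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ProbeCliConfig {
    pub ws_url: String,
    pub timeout_ms: u64,
    pub steps: Vec<ProbeStep>,
}

/// Hypothetical error shape; the plan does not define `ProbeError`.
#[derive(Debug, Clone, PartialEq)]
pub enum ProbeError {
    InvalidArgs(String),
}

pub fn parse_probe_args(args: &[String]) -> Result<ProbeCliConfig, ProbeError> {
    let mut ws_url = None;
    let mut timeout_ms = None;
    let mut steps = Vec::new();

    let mut iter = args.iter();
    while let Some(flag) = iter.next() {
        // Every supported flag takes exactly one value.
        let value = iter
            .next()
            .ok_or_else(|| ProbeError::InvalidArgs(format!("missing value for {flag}")))?;
        match flag.as_str() {
            "--ws-url" => ws_url = Some(value.clone()),
            "--timeout-ms" => {
                let parsed = value
                    .parse::<u64>()
                    .map_err(|e| ProbeError::InvalidArgs(format!("bad --timeout-ms: {e}")))?;
                timeout_ms = Some(parsed);
            }
            "--step" => {
                // Split on the first "::" only; the payload is kept verbatim.
                let (label, payload) = value
                    .split_once("::")
                    .ok_or_else(|| ProbeError::InvalidArgs(format!("step needs label::payload: {value}")))?;
                steps.push(ProbeStep {
                    label: label.to_string(),
                    payload: payload.to_string(),
                    expect_reply: true, // assumption: every CLI step waits for a reply
                });
            }
            other => return Err(ProbeError::InvalidArgs(format!("unknown flag {other}"))),
        }
    }

    Ok(ProbeCliConfig {
        ws_url: ws_url.ok_or_else(|| ProbeError::InvalidArgs("--ws-url is required".to_string()))?,
        timeout_ms: timeout_ms
            .ok_or_else(|| ProbeError::InvalidArgs("--timeout-ms is required".to_string()))?,
        steps,
    })
}
```

Pushing steps in iteration order is what preserves the ordering the test asserts on. The binary would pass `std::env::args().skip(1).collect::<Vec<_>>()` so the program name is not treated as a flag.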
|
||||
|
||||
- [ ] **Step 2: Run the parser test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_probe_test parse_probe_args_accepts_ws_url_timeout_and_ordered_steps -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because `parse_probe_args(...)` and `ProbeCliConfig` do not exist yet.
|
||||
|
||||
- [ ] **Step 3: Implement the helper and binary together**
|
||||
|
||||
In `src/browser/ws_probe.rs`, add `ProbeCliConfig` and `parse_probe_args(...)`.
|
||||
|
||||
In `src/browser/mod.rs`, add the module export:
|
||||
|
||||
```rust
|
||||
pub mod ws_probe;
|
||||
```
|
||||
|
||||
In `src/bin/sgbrowser_ws_probe.rs`, implement the binary using only `std::env::args()` plus `parse_probe_args(...)`.
|
||||
|
||||
Required behavior:
|
||||
- accepts a websocket URL
|
||||
- accepts a timeout in milliseconds
|
||||
- accepts repeated ordered steps
|
||||
- runs the probe harness
|
||||
- prints a markdown-friendly transcript including:
|
||||
- step label
|
||||
- exact sent payload
|
||||
- exact received frames, if any
|
||||
- timeout/close outcome
|
||||
|
||||
Output shape can be simple, for example:
|
||||
|
||||
```text
|
||||
STEP 1 bootstrap-open-agent
|
||||
SEND: ["about:blank","sgOpenAgent"]
|
||||
RECV: Welcome! You are client #1
|
||||
OUTCOME: timeout
|
||||
```
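
A transcript in roughly this shape could be rendered from the per-step results as sketched here. `render_transcript` is a hypothetical helper, and the `received`, `closed`, and `connect-failed` outcome labels are assumptions; only `timeout` appears in the example output above. With `ProbeOutcome` as recommended in Task 1, a step carries either received frames or a timeout, so this sketch never prints a `RECV` line together with `OUTCOME: timeout`.

```rust
pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
    ConnectFailed(String),
}

pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

/// Renders one STEP/SEND/RECV/OUTCOME group per step, frames kept verbatim.
pub fn render_transcript(results: &[ProbeStepResult]) -> String {
    let mut out = String::new();
    for (index, result) in results.iter().enumerate() {
        out.push_str(&format!("STEP {} {}\n", index + 1, result.label));
        out.push_str(&format!("SEND: {}\n", result.sent));
        let outcome = match &result.outcome {
            ProbeOutcome::Received(frames) => {
                for frame in frames {
                    out.push_str(&format!("RECV: {frame}\n"));
                }
                "received".to_string()
            }
            ProbeOutcome::TimedOut => "timeout".to_string(),
            ProbeOutcome::Closed => "closed".to_string(),
            ProbeOutcome::ConnectFailed(reason) => format!("connect-failed: {reason}"),
        };
        out.push_str(&format!("OUTCOME: {outcome}\n"));
    }
    out
}
```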
|
||||
|
||||
Rules:
|
||||
- no production/browser-runtime integration
|
||||
- no hidden fallback logic
|
||||
- no “best effort” guessing of next steps
|
||||
|
||||
- [ ] **Step 4: Re-run the parser/helper test**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_probe_test parse_probe_args_accepts_ws_url_timeout_and_ordered_steps -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Build the probe binary**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo build --bin sgbrowser_ws_probe
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git add src/bin/sgbrowser_ws_probe.rs src/browser/ws_probe.rs src/browser/mod.rs tests/browser_ws_probe_test.rs
|
||||
git commit -m "feat: add live sgBrowser websocket probe binary"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 3: Run the real endpoint probe matrix and write the evidence transcript
|
||||
|
||||
**Files:**
|
||||
- Create: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
|
||||
- Reuse only: `src/bin/sgbrowser_ws_probe.rs`, `docs/_tmp_sgbrowser_ws_api_doc.txt`
|
||||
|
||||
- [ ] **Step 1: Run the no-bootstrap baseline probe**
|
||||
|
||||
Run exactly:
|
||||
|
||||
```bash
|
||||
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baseline-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
```
|
||||
|
||||
Append the exact output under a `## baseline-open` heading in `docs/_tmp_sgbrowser_ws_probe_transcript.md`.
|
||||
|
||||
- [ ] **Step 2: Run the documented `sgOpenAgent` candidate**
|
||||
|
||||
Run exactly:
|
||||
|
||||
```bash
|
||||
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "post-open-agent-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
```
|
||||
|
||||
Append the exact output under a `## open-agent` heading.
|
||||
|
||||
- [ ] **Step 3: Run the documented `sgSetAuthInfo` candidate**
|
||||
|
||||
Run exactly:
|
||||
|
||||
```bash
|
||||
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "set-auth::[\"about:blank\",\"sgSetAuthInfo\",\"probe-user\",\"probe-token\"]" --step "post-set-auth-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
```
|
||||
|
||||
Append the exact output under a `## set-auth` heading.
|
||||
|
||||
- [ ] **Step 4: Run the documented `sgBrowserLogin` candidate**
|
||||
|
||||
Run the following command, after first replacing the placeholder payload as described below:
|
||||
|
||||
```bash
|
||||
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "browser-login::{\"request\":\"use-json-helper\"}"
|
||||
```
|
||||
|
||||
Before running, replace the placeholder payload with the exact JSON-array frame produced by the helper for:
|
||||
|
||||
```json
|
||||
["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]
|
||||
```
|
||||
|
||||
Then add a second step in the same command:
|
||||
|
||||
```json
|
||||
["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
|
||||
```
|
||||
|
||||
Append the exact output under a `## browser-login` heading.
|
||||
|
||||
- [ ] **Step 5: Run the documented `sgBrowerserActiveTab` candidate**
|
||||
|
||||
Run exactly:
|
||||
|
||||
```bash
|
||||
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "post-active-tab-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
```
|
||||
|
||||
Append the exact output under a `## active-tab` heading.

- [ ] **Step 6: Run one combined bootstrap candidate**

Run exactly:

```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "combined-open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "combined-active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "combined-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```

Append the exact output under a `## combined-bootstrap` heading.

- [ ] **Step 7: Run `requesturl` variants for the minimal action**

Run exactly these two additional commands:

```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "target-as-requesturl::[\"https://www.zhihu.com/hot\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```

```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baidu-requesturl::[\"https://www.baidu.com\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```

Append the exact outputs under `## requesturl-variants`.

- [ ] **Step 8: Summarize the matrix in the transcript file**

At the end of `docs/_tmp_sgbrowser_ws_probe_transcript.md`, add this exact table template and fill it in:

```markdown
| Sequence | Sent frames | First reply | Final outcome | Decision signal |
| --- | --- | --- | --- | --- |
```

- [ ] **Step 9: Determine which architecture option wins**

Decision rule:
- if at least one sequence reproducibly yields real numeric status and/or callback frames for a real business action, Option A (bootstrap-validated raw websocket) wins
- otherwise, Option B (bridge-first) wins

Do not weaken this decision rule.
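
The decision rule can be read as a single predicate over the filled-in matrix. The sketch below is only an illustration of that reading; `SequenceEvidence`, its fields, and `decide` are hypothetical names, and how "reproducibly" is established is left to whoever fills in the transcript.

```rust
/// Hypothetical summary of one probed sequence from the transcript matrix.
struct SequenceEvidence {
    /// The outcome was observed again on repeated runs.
    reproducible: bool,
    /// A real numeric status frame came back.
    numeric_status: bool,
    /// A callback frame came back.
    callback_frame: bool,
    /// The step was a real business action, not just a connection banner.
    real_business_action: bool,
}

#[derive(Debug, PartialEq)]
enum ArchitectureOption {
    /// Option A: bootstrap-validated raw websocket.
    BootstrapValidatedRawWebsocket,
    /// Option B: bridge-first.
    BridgeFirst,
}

/// Option A wins only if at least one sequence satisfies every condition of the rule.
fn decide(matrix: &[SequenceEvidence]) -> ArchitectureOption {
    let option_a_proven = matrix.iter().any(|s| {
        s.reproducible && s.real_business_action && (s.numeric_status || s.callback_frame)
    });
    if option_a_proven {
        ArchitectureOption::BootstrapValidatedRawWebsocket
    } else {
        ArchitectureOption::BridgeFirst
    }
}
```

An empty matrix, or one that contains only welcome banners followed by silence, falls through to Option B under this reading.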

- [ ] **Step 10: Commit the evidence artifact**

```bash
git add docs/_tmp_sgbrowser_ws_probe_transcript.md
git commit -m "docs: capture sgBrowser websocket probe evidence"
```

---

## Task 4A: If Option A wins, write the narrow bootstrap implementation slice

**Files:**
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md`
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md`
- Reuse as evidence input:
  - `docs/_tmp_sgbrowser_ws_probe_transcript.md`
  - `docs/_tmp_sgbrowser_ws_api_doc.txt`
  - `src/browser/ws_backend.rs`
  - `src/browser/ws_protocol.rs`

- [ ] **Step 1: Write one new design doc capturing the proven bootstrap contract**

Create:

```text
docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md
```

Include:
- exact validated sequence
- exact required state (`requesturl`, active tab, agent page, auth payload)
- exact failure semantics
- why this is sufficient evidence to keep raw websocket as the product surface

- [ ] **Step 2: Write one new implementation plan for the bootstrap path**

Create:

```text
docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md
```

Plan only the minimal production changes required to embed the validated bootstrap sequence into the service/browser path.

- [ ] **Step 3: Commit the bootstrap decision docs**

```bash
git add docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md
git commit -m "docs: capture ws browser bootstrap contract"
```

- [ ] **Step 4: Stop after writing the bootstrap plan**

Do not begin production implementation in the same slice unless the user explicitly asks for execution.

---

## Task 4B: If Option B wins, write the bridge-first implementation slice

**Files:**
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md`
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md`
- Reuse as evidence input:
  - `docs/_tmp_sgbrowser_ws_probe_transcript.md`
  - `frontend/archive/sgClaw验证-已归档/testRunner.js`
  - `docs/superpowers/specs/2026-03-25-superrpa-sgclaw-browser-control-design.md`
  - `docs/archive/项目管理与排期/协作时间表.md`
  - `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`

- [ ] **Step 1: Write the bridge-path design doc**

Create `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md`.

The design must specify:
- why raw websocket is considered non-validated for external control
- which bridge surface becomes authoritative
- where sgClaw should integrate (`FunctionsUI`, host bridge, `BrowserAction`, `CommandRouter`, or the nearest validated seam in this repo)
- how to preserve pipe behavior and existing abstractions where practical

- [ ] **Step 2: Write the bridge-path implementation plan**

Create `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md`.

The plan must:
- identify exact files to touch
- describe the narrowest adapter implementation
- keep TDD/task granularity as in this document
- avoid speculative work outside the bridge slice

- [ ] **Step 3: Commit the bridge decision docs**

```bash
git add docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md
git commit -m "docs: define bridge-first sgBrowser integration"
```

- [ ] **Step 4: Stop after writing the bridge plan**

Do not start the bridge implementation in the same slice unless the user explicitly asks for execution.

---

## Verification Checklist

### Deterministic probe harness tests

```bash
cargo test --test browser_ws_probe_test -- --nocapture
```

Expected: transcript capture, ordered scripts, timeout reporting, and close/reset reporting all pass.

### Probe binary build

```bash
cargo build --bin sgbrowser_ws_probe
```

Expected: PASS.

### Live evidence collection

- run the probe matrix against the real configured endpoint
- save exact transcripts to `docs/_tmp_sgbrowser_ws_probe_transcript.md`
- make the architecture decision using the documented rule

### Follow-up branch condition

- if Option A wins, the repository contains a bootstrap-contract design + plan
- if Option B wins, the repository contains a bridge-path design + plan
- no production runtime changes are made until that decision is written down

---

## Notes for Implementation

- The existing `WsBrowserBackend` fix that remembers the last navigated URL remains valid; do not revert it.
- The previous auth-replacement work also remains valid; it removed a real bug but did not prove the raw websocket architecture.
- Keep the probe tool brutally literal: exact sent frames, exact received frames, explicit timeout/close outcomes.
- Resist the temptation to make the probe “smart.” Smart probes hide evidence.
- If the real endpoint still replies only with the welcome banner and then silence across the matrix, treat that as a decision, not as an excuse for more guessing.
@@ -0,0 +1,362 @@
# WS Browser Welcome Frame Compatibility Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Make the ws service path tolerate the real sgBrowser welcome banner (`Welcome! You are client #...`) without weakening general ws protocol validation or changing pipe behavior.

**Architecture:** Keep the shared `WsBrowserBackend` strict and implement the compatibility shim only in `ServiceBrowserWsClient`, which is already the real-browser adapter for the ws service path. Add one positive red test for the known welcome frame and one negative red test proving non-matching first text frames still fail as protocol errors, then make the minimal stateful change in `src/service/server.rs` and verify ws + pipe regressions.

**Tech Stack:** Rust 2021, tungstenite websocket client/server, existing `WsBrowserBackend`, existing `ServiceBrowserWsClient`, existing Rust unit/integration test suite.

---

## File Structure

### Existing files to modify

- Modify: `src/service/server.rs`
  - Add the one-time per-connection welcome-skip state to `ServiceBrowserWsClient`
  - Add the minimal helper(s) for detecting and discarding the first known welcome frame
  - Add focused service-adapter unit tests in the existing `#[cfg(test)]` module
- Reuse: `src/browser/ws_backend.rs`
  - Do not change protocol parsing rules; only verify behavior remains strict for all non-service callers
- Reuse: `tests/service_task_flow_test.rs`
  - Re-run to confirm the ws service path still reaches the browser websocket after the service-side shim
- Reuse: `tests/browser_ws_backend_test.rs`
  - Re-run to prove the shared backend semantics remain unchanged

### Files deliberately not changed

- `src/browser/ws_backend.rs`
- `src/browser/ws_protocol.rs`
- `src/agent/task_runner.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
- `src/lib.rs`

The design explicitly keeps the welcome-banner workaround out of the shared backend and out of the pipe path.

---

## Task 1: Reproduce the real welcome-frame failure with focused unit tests

**Files:**
- Modify: `src/service/server.rs`

- [ ] **Step 1: Add the positive failing test for the known welcome frame**

In the existing `#[cfg(test)] mod tests` inside `src/service/server.rs`, add one focused test next to the current ws adapter tests.

Test shape:

```rust
#[test]
fn future_server_side_ws_native_adapter_skips_initial_known_welcome_frame() {
    // fake server sends:
    // 1. "Welcome! You are client #1"
    // 2. "0"
    // backend.invoke(Action::Navigate, ...) should succeed
}
```

Required assertions:
- the fake websocket server accepts one connection
- it sends the welcome banner first, then the numeric success status
- `WsBrowserBackend.invoke(Action::Navigate, ...)` returns `Ok(CommandOutput { success: true, .. })`

- [ ] **Step 2: Run only the positive new test and watch it fail**

Run:

```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_skips_initial_known_welcome_frame -- --nocapture
```

Expected: FAIL with a protocol error containing `invalid browser status frame: Welcome! You are client #1`.

- [ ] **Step 3: Add the negative failing test for arbitrary first text**

In the same `#[cfg(test)]` module, add one negative test proving we do **not** silently skip arbitrary first text frames.

Test shape:

```rust
#[test]
fn future_server_side_ws_native_adapter_does_not_skip_unknown_first_text_frame() {
    // fake server sends:
    // 1. "Hello from server"
    // assert invoke(...) fails as PipeError::Protocol(...)
}
```

Required assertions:
- the first frame is a non-matching text frame such as `Hello from server`
- `invoke(...)` fails
- the failure remains a protocol error rather than success or timeout

- [ ] **Step 4: Run only the negative new test and verify the current behavior is already strict**

Run:

```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_does_not_skip_unknown_first_text_frame -- --nocapture
```

Expected: PASS, proving the current implementation already treats arbitrary first text as a protocol error. Keep that assertion in place before any production change.

- [ ] **Step 5: Confirm the TDD gate before implementation**

Do not implement production code before both tests exist and the positive test has failed on current behavior.

---

## Task 2: Add the minimal per-connection welcome-skip state in the service adapter

**Files:**
- Modify: `src/service/server.rs`

- [ ] **Step 1: Add one-time per-connection state to `ServiceBrowserWsClient`**

Extend `ServiceBrowserWsClient` with one extra state field that tracks whether the initial welcome candidate has already been consumed for the current websocket connection.

Allowed shape:

```rust
struct ServiceBrowserWsClient {
    browser_ws_url: String,
    browser_socket: Mutex<Option<WebSocket<MaybeTlsStream<TcpStream>>>>,
    initial_text_frame_checked: Mutex<bool>,
}
```

or an equally small equivalent.

Rules:
- state is per connection, not per request
- state must survive multiple `invoke(...)` calls while reusing the same socket
- do not add broader protocol state machines

- [ ] **Step 2: Add a narrow welcome-frame matcher**

In `src/service/server.rs`, add one small helper that recognizes only the known banner prefix:

```rust
fn is_known_welcome_frame(frame: &str) -> bool {
    frame.starts_with("Welcome! You are client #")
}
```

Rules:
- no regex needed
- no generic “ignore arbitrary text” logic
- keep the matcher local to `src/service/server.rs`

- [ ] **Step 3: Update `recv_text_timeout(...)` to skip at most one initial known banner**

Modify `impl WsClient for ServiceBrowserWsClient` so that the first text frame received after connection establishment is handled like this:

1. read the next text frame
2. if the initial-frame state is still false:
   - mark the first-frame check as consumed
   - if the frame matches `is_known_welcome_frame(...)`, read the next frame and return that next frame instead
3. otherwise, return the frame unchanged

Rules:
- skip only once per connection
- do not loop indefinitely over multiple text frames
- do not swallow unknown first text frames
- do not change timeout / close / reset / connect-failure behavior
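
The skip-once behavior described above can be sketched independently of the socket type. This is a minimal illustration under stated assumptions, not the adapter's actual code: `WelcomeSkip` and `next_frame` are hypothetical names, a plain `bool` stands in for the `Mutex<bool>` field, and `read_frame` is a stand-in closure for the real timed socket read. The `reset` method reflects the per-connection rule: a fresh socket gets a fresh check.

```rust
/// Matcher from Step 2: recognizes only the known banner prefix.
fn is_known_welcome_frame(frame: &str) -> bool {
    frame.starts_with("Welcome! You are client #")
}

/// Hypothetical per-connection state; stands in for the extra field on the client.
struct WelcomeSkip {
    initial_text_frame_checked: bool,
}

impl WelcomeSkip {
    fn new() -> Self {
        Self { initial_text_frame_checked: false }
    }

    /// Intended to be called whenever a brand-new socket is established.
    fn reset(&mut self) {
        self.initial_text_frame_checked = false;
    }

    /// Reads one frame; skips at most one known banner, and only as the first frame.
    /// Errors from `read_frame` (timeout, close, reset) pass through untouched.
    fn next_frame<E>(
        &mut self,
        mut read_frame: impl FnMut() -> Result<String, E>,
    ) -> Result<String, E> {
        let frame = read_frame()?;
        if !self.initial_text_frame_checked {
            self.initial_text_frame_checked = true;
            if is_known_welcome_frame(&frame) {
                // No loop here: whatever follows the banner is returned as-is.
                return read_frame();
            }
        }
        Ok(frame)
    }
}
```

Because the flag is set before the matcher runs, an unknown first frame is returned unchanged and still reaches the strict status parsing, and a second banner later on the same connection is not skipped.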

- [ ] **Step 4: Reset the one-time state when a fresh socket is created**

When `with_socket(...)` establishes a brand-new websocket connection, ensure the one-time banner-check state is reset so a new connection can tolerate its own first welcome frame.

- [ ] **Step 5: Add one reconnect regression in the service adapter tests**

Add one focused test proving the welcome skip resets on a fresh connection after socket close/reset.

Test shape:

```rust
#[test]
fn future_server_side_ws_native_adapter_skips_welcome_again_after_reconnect() {
    // first connection closes after use
    // second fresh connection sends the same welcome banner again
    // both invocations succeed
}
```

Required assertion:
- the one-time skip is per connection, not global for the client instance

- [ ] **Step 6: Run the positive new test**

Run:

```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_skips_initial_known_welcome_frame -- --nocapture
```

Expected: PASS.

- [ ] **Step 7: Run the negative new test**

Run:

```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_does_not_skip_unknown_first_text_frame -- --nocapture
```

Expected: PASS, proving unknown first text is still treated as a protocol error.

- [ ] **Step 8: Run the reconnect regression**

Run:

```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_skips_welcome_again_after_reconnect -- --nocapture
```

Expected: PASS.

- [ ] **Step 9: Run the full service adapter unit group**

Run:

```bash
cargo test service::server::tests -- --nocapture
```

Expected: PASS, including the existing tests for:
- status `0` success
- connect failure => `PipeError::Protocol("browser websocket connect failed: ...")`
- disconnect/reset => `PipeError::PipeClosed`
- callback timeout => `PipeError::Timeout`
- new known-welcome success path
- new unknown-first-frame strictness path
- new reconnect reset behavior
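
The three failure mappings in this list can be pictured as one classification function. The sketch below is an assumption-heavy illustration: `PipeError` here is a local stand-in for the crate's real enum, `Phase` and `classify` are hypothetical, and the choice of which `std::io::ErrorKind` values count as a timeout or a close is a guess rather than the adapter's confirmed behavior.

```rust
use std::io::ErrorKind;

/// Local stand-in for the three `PipeError` variants named above.
#[derive(Debug, PartialEq)]
enum PipeError {
    Protocol(String),
    PipeClosed,
    Timeout,
}

/// Where in the exchange the I/O error happened (hypothetical).
enum Phase {
    Connect,
    Receive,
}

/// Maps an I/O failure to the error the tests expect.
/// Assumption: read timeouts surface as `TimedOut` or `WouldBlock`.
fn classify(phase: Phase, kind: ErrorKind, detail: &str) -> PipeError {
    match (phase, kind) {
        (Phase::Connect, _) => {
            PipeError::Protocol(format!("browser websocket connect failed: {}", detail))
        }
        (Phase::Receive, ErrorKind::TimedOut | ErrorKind::WouldBlock) => PipeError::Timeout,
        (
            Phase::Receive,
            ErrorKind::ConnectionReset
            | ErrorKind::ConnectionAborted
            | ErrorKind::BrokenPipe
            | ErrorKind::UnexpectedEof,
        ) => PipeError::PipeClosed,
        (Phase::Receive, _) => PipeError::Protocol(detail.to_string()),
    }
}
```

The point of the sketch is that the welcome-skip change sits above this mapping and should leave every arm unchanged.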

---

## Task 3: Verify the shared backend stayed strict and the ws service path still works

**Files:**
- Reuse: `tests/browser_ws_backend_test.rs`
- Reuse: `tests/service_task_flow_test.rs`
- Reuse: `src/browser/ws_backend.rs`

- [ ] **Step 1: Re-run the shared ws backend tests unchanged**

Run:

```bash
cargo test --test browser_ws_backend_test -- --nocapture
```

Expected: PASS. This proves `WsBrowserBackend` semantics remain unchanged for its existing deterministic callers.

- [ ] **Step 2: Re-run the service task-flow regression**

Run:

```bash
cargo test --test service_task_flow_test -- --nocapture
```

Expected: PASS, including the auth-regression test that proves the ws service path reaches the browser websocket and no longer emits `invalid hmac seed: session key must not be empty`.

- [ ] **Step 3: Re-run the ws-focused mixed verification**

Run:

```bash
cargo test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```

Expected: PASS.

---

## Task 4: Re-run the real manual smoke that originally failed

**Files:**
- Reuse only: no code changes unless a fresh reproducer proves another bug

- [ ] **Step 1: Confirm real browser websocket reachability**

Run a reachability check for `ws://127.0.0.1:12345` (or the configured `browserWsUrl`) before starting the smoke test.

Expected: reachable.

- [ ] **Step 2: Start the real ws service**

Run:

```bash
cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```

Expected: the service prints:
- `sg_claw ready: ...`
- the resolved `service_ws_listen_addr`
- the configured `browser_ws_url`

- [ ] **Step 3: Re-run the original failing manual smoke**

Run:

```bash
printf '打开知乎热榜并读取页面主区域文本\n' | cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```

Expected:
- no `invalid browser status frame: Welcome! You are client #1`
- browser actions proceed past the first status frame
- if the browser later fails for another reason, capture that new reason exactly

- [ ] **Step 4: Re-run the old Zhihu export task smoke**

Run:

```bash
printf '读取知乎热榜数据,并导出 excel 文件\n' | cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```

Expected:
- no `invalid browser status frame: Welcome! You are client #1`
- the task reaches the real browser action path beyond connection banner handling

- [ ] **Step 5: Stop and debug if a new real-browser issue appears**

If the smoke test now fails for a different reason, do not piggyback a second fix into this slice without:
- capturing the exact new output
- writing a new focused spec/plan if the issue is materially different

---

## Verification Checklist

### Service adapter unit tests

```bash
cargo test service::server::tests -- --nocapture
```

Expected: all service-side ws adapter tests pass, including the new welcome-frame positive/negative cases and reconnect reset case.

### Shared ws backend + ws service regressions

```bash
cargo test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```

Expected: PASS.

### Real smoke verification

- `browserWsUrl` reachable
- `sg_claw` starts with real config
- `sg_claw_client` no longer fails on `Welcome! You are client #...`
- Zhihu minimal read task gets past the first status frame
- Zhihu export task gets past the first status frame
@@ -0,0 +1,564 @@
# Zhihu Release WS Function-Callback Migration Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Move only the Zhihu direct-execution path to the new Release browser websocket interaction style while keeping the existing pipe protocol and non-Zhihu submit behavior unchanged.

**Architecture:** Keep `ClientMessage` / `ServiceMessage`, `run_submit_task_with_browser_backend(...)`, and the high-level Zhihu workflow steps unchanged. First prove the exact Release browser interaction contract with transcript-backed probes. Then implement the smallest Zhihu-scoped backend path that follows that proven contract. Do not globally rewire the submit path unless the probe evidence proves there is no narrower safe seam.

**Tech Stack:** Rust, tungstenite, existing sgclaw service/client pipe protocol, `docs/_tmp_sgbrowser_ws_api_doc.txt`, Release browser websocket at `ws://127.0.0.1:12345`, current Zhihu direct-execution workflow.

---

## Context

The user has now made the target behavior explicit:

- the browser has changed and the working reference behavior is the user-provided HTML page that connects to `ws://127.0.0.1:12345`
- that page sends a bootstrap registration frame: `{"type":"register","role":"web"}`
- browser requests are still JSON arrays such as `[window.location.href, "sgBrowserSetTheme", "1"]` and `[window.location.href, "sgBrowerserGetUrls", "showUrls"]`
- callback-bearing browser behavior is now centered on page-defined JS callback functions like `showUrls`, not on Rust directly reading a websocket callback frame as the final business result
- the existing sgclaw pipe protocol must remain unchanged
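
On the Rust side, the two frame shapes listed above (the register object and the array request) could be produced by small builders. The sketch below is illustrative only: `json_string`, `register_frame`, and `request_frame` are hypothetical helpers, the hand-rolled escaping is there just to keep the example dependency-free, and it assumes every array element is sent as a JSON string, as in the quoted examples.

```rust
/// Minimal JSON string escaping for the sketch (quotes, backslashes, control characters).
fn json_string(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for c in value.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// The bootstrap registration frame quoted in the context list.
fn register_frame() -> String {
    r#"{"type":"register","role":"web"}"#.to_string()
}

/// Builds `[requesturl, action, ...args]` with every element as a JSON string.
fn request_frame(request_url: &str, action: &str, args: &[&str]) -> String {
    let mut parts = vec![json_string(request_url), json_string(action)];
    parts.extend(args.iter().map(|a| json_string(a)));
    format!("[{}]", parts.join(","))
}
```

For example, `request_frame("about:blank", "sgBrowerserGetUrls", &["showUrls"])` yields the same array shape the reference page sends, with the callback passed by function name.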

The current sgclaw drift that must be corrected is visible in:

- `src/browser/ws_protocol.rs`
  - `Action::Navigate` currently emits `sgHideBrowserCallAfterLoaded` with an inline `callBackJsToCpp(...)` string
- `src/browser/ws_backend.rs`
  - Rust currently waits for a browser websocket callback frame and treats that as the action result
- `tests/service_ws_session_test.rs:498-605`
- `tests/service_task_flow_test.rs:499-635`
  - existing **generic submit-flow** regressions still lock in the old direct raw-websocket callback-frame assumption
  - these are useful as non-regression guardrails, but they are not themselves Zhihu-specific regressions

Zhihu-specific verification must therefore be added explicitly instead of assuming those Baidu-path tests already cover Zhihu.

The new browser style proves these facts and only these facts so far:

1. sgclaw must handle a register-first websocket handshake
2. browser requests are still `[requesturl, action, ...args]`
3. some browser capabilities now return through page-defined callback functions like `showUrls`
4. the current direct raw-websocket callback expectation in the Zhihu path is no longer a safe assumption

The production seam is **not** pre-decided here. Task 1 must determine whether Zhihu can be integrated by:
- a direct Zhihu-scoped backend with no helper page, or
- a helper page / relay design because named page callbacks are the only reliable result path

Until Task 1 evidence is captured, both remain hypotheses.

## Evidence to preserve in the implementation

### Browser websocket API doc
From `docs/_tmp_sgbrowser_ws_api_doc.txt`:
- `ws://localhost:12345` is the browser websocket endpoint
- request frames are array payloads with `requesturl`
- `sgBrowerserGetUrls(callback)` uses a callback **function name**: `[requesturl,"sgBrowerserGetUrls", callback]`
- `sgBrowserCallAfterLoaded(targetUrl, callback)` and `sgHideBrowserCallAfterLoaded(targetUrl, callback)` use callback strings with parentheses
- `callBackJsToCpp(param)` uses `sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt`
- `sgBrowserRegJsFun(targeturl, funContent)` and `sgBrowserExcuteJsFun(targeturl, funName)` exist and may be useful when the helper page needs durable callback helpers
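
The `callBackJsToCpp(param)` layout listed above is a five-field string joined by `@_@`. A minimal parsing sketch follows; `CallbackParam` and `parse_callback_param` are hypothetical names, and the sketch assumes the response text is the last field and may itself contain `@_@`, which the API doc excerpt does not confirm.

```rust
/// Fields of a `callBackJsToCpp(param)` payload, in the documented order.
#[derive(Debug, PartialEq)]
struct CallbackParam<'a> {
    source_url: &'a str,
    target_url: &'a str,
    callback: &'a str,
    action_url: &'a str,
    response_txt: &'a str,
}

/// Returns `None` when fewer than five fields are present.
fn parse_callback_param(param: &str) -> Option<CallbackParam<'_>> {
    // splitn(5, ..) leaves any later `@_@` inside the response text intact.
    let mut parts = param.splitn(5, "@_@");
    Some(CallbackParam {
        source_url: parts.next()?,
        target_url: parts.next()?,
        callback: parts.next()?,
        action_url: parts.next()?,
        response_txt: parts.next()?,
    })
}
```

Borrowing slices from the input keeps the parser allocation-free, which is convenient if the raw frame also needs to be written to a transcript unchanged.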

### Current working HTML pattern from the user
The now-working reference interaction is:

```html
const socket = new WebSocket('ws://127.0.0.1:12345');
socket.onopen = () => {
  socket.send(JSON.stringify({type: 'register', role: 'web'}));
};
socket.send(JSON.stringify([window.location.href,"sgBrowerserGetUrls","showUrls"]));
function showUrls(urls) {
  // browser invokes this page-defined callback
}
```

That is the browser behavior sgclaw now needs to follow.

---

## Critical files

### Production files to modify
- `src/browser/ws_protocol.rs`
- `src/compat/workflow_executor.rs` (only if a narrow Zhihu-specific correction is required after backend swap)
- `src/service/server.rs` (only if the chosen Zhihu-scoped integration seam must be wired here)
- `src/service/mod.rs` (only if startup plumbing changes are truly required)
- `src/browser/mod.rs`

### New production files likely needed
- `src/browser/zhihu_release_backend.rs`
  - a Zhihu-scoped `BrowserBackend` adapter that follows the proven Release browser interaction style without changing non-Zhihu routes
- `src/service/browser_callback_host.rs` **only if the probe proves a service-controlled helper page is actually required**
  - service-local helper-page lifecycle and callback relay, if evidence shows the browser cannot be driven safely without it

### Existing files to preserve
- `src/agent/task_runner.rs`
- `src/service/protocol.rs`
- `src/compat/orchestration.rs`
- `src/compat/runtime.rs`
- `src/pipe/*`

### Existing direct-ws files to review explicitly
- `src/browser/ws_backend.rs`
- `tests/browser_ws_backend_test.rs`

These files currently encode the old direct raw-websocket callback expectation. The implementation must either:
- leave them untouched as legacy/direct-contract coverage with no Zhihu production callers, or
- update/remove the Zhihu-specific assumptions they currently lock in.

### Primary test files
- `tests/browser_ws_probe_test.rs`
- `tests/browser_ws_protocol_test.rs`
- `tests/service_ws_session_test.rs`
- `tests/service_task_flow_test.rs`
- `tests/task_runner_test.rs`
- `tests/browser_ws_backend_test.rs`

---

## File structure decisions

### `src/browser/zhihu_release_backend.rs`
Prefer a Zhihu-scoped backend first.

Responsibilities:
- keep the same `BrowserBackend` trait surface
- implement only the behavior needed by the current Zhihu direct-execution route
- translate `Action::Navigate`, `Action::GetText`, and `Action::Eval` into the proven Release-browser interaction style
- normalize results back into `CommandOutput`
- avoid affecting non-Zhihu callers

This is the preferred seam because the user asked to change the current Zhihu flow, not to redesign the whole submit pipeline.

### `src/service/browser_callback_host.rs` (conditional)
Create this file only if Task 1 probe evidence proves that sgclaw must host or control a page in order to receive named callback-function results.

If it is needed, the plan must keep the design minimal and specific:
- one concrete transport only (choose websocket or HTTP, not “websocket or HTTP”)
- explicit readiness handshake
- explicit request correlation by `request_id`
- explicit cleanup when the submit task ends
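
If the callback host does turn out to be necessary, correlation by `request_id` and cleanup at task end could be as small as a map of pending slots. The sketch below is a hypothetical shape, not a design decision: `PendingCallbacks` and its methods are invented for illustration, and using a `u64` counter for `request_id` is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory table of callback results awaited by request id.
#[derive(Default)]
struct PendingCallbacks {
    next_id: u64,
    waiting: HashMap<u64, Option<String>>,
}

impl PendingCallbacks {
    /// Registers a new request and returns the id the helper page must echo back.
    fn register(&mut self) -> u64 {
        self.next_id += 1;
        self.waiting.insert(self.next_id, None);
        self.next_id
    }

    /// Stores a callback payload; returns false for unknown ids so stray replies are rejected.
    fn resolve(&mut self, request_id: u64, payload: String) -> bool {
        match self.waiting.get_mut(&request_id) {
            Some(slot) => {
                *slot = Some(payload);
                true
            }
            None => false,
        }
    }

    /// Removes and returns the result if it has arrived; leaves the slot alone otherwise.
    fn take(&mut self, request_id: u64) -> Option<String> {
        match self.waiting.get(&request_id) {
            Some(Some(_)) => self.waiting.remove(&request_id).flatten(),
            _ => None,
        }
    }

    /// Explicit cleanup when the submit task ends.
    fn clear(&mut self) {
        self.waiting.clear();
    }
}
```

Rejecting unknown ids in `resolve` keeps a late callback from an earlier task from being mistaken for the current task's result.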

If Task 1 shows a simpler seam, do not create this file.

### `src/browser/ws_protocol.rs`
Do not let this file keep only the old direct-callback assumption.

It should become the shared place for doc-native request builders such as:
- browser bootstrap frames proven by the transcript
- `sgBrowserCallAfterLoaded` / `sgHideBrowserCallAfterLoaded`
- `sgBrowserExcuteJsCodeByArea`
- optional `sgBrowserRegJsFun` / `sgBrowserExcuteJsFun`

But do **not** let `ws_protocol.rs` absorb service-host lifecycle logic.

### `src/browser/ws_backend.rs` and `tests/browser_ws_backend_test.rs`
Handle these explicitly in the implementation:
- if they still describe a valid direct browser contract, keep them as isolated legacy/direct-ws coverage only
- if their current navigate/callback assumptions conflict with the proven Release Zhihu path, update or narrow those tests so they no longer describe the active Zhihu integration path

Do not leave the old direct-callback assumptions ambiguously “reviewed”; the implementation must make their status explicit.

---

## Task 1: Capture the new Release browser contract in a reproducible probe transcript

**Files:**
- Review/modify: `src/browser/ws_probe.rs`
- Review/modify: `src/bin/sgbrowser_ws_probe.rs`
- Review/modify: `tests/browser_ws_probe_test.rs`
- Create: `docs/_tmp_release_ws_callback_host_transcript.md`

- [ ] **Step 1: Verify current probe coverage against the Release-browser questions**

Read the existing probe module and tests and check whether they already prove all of the following:
- a register-first websocket script can be expressed
- a later array action frame can be expressed in the same script
- per-step inbound frames/outcomes are preserved separately
- timeout/close remain distinguishable in the transcript

Required result:
- identify the exact existing tests that already prove these behaviors
- identify the smallest missing Release-specific coverage, if any
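
The four coverage questions above imply a transcript model in which each step keeps its own inbound frames and a distinct outcome. As a hedged illustration of that model (the real types in `src/browser/ws_probe.rs` are not shown in this plan), `StepOutcome`, `StepRecord`, and `render_transcript` below are hypothetical.

```rust
/// How a step ended; timeout and close stay distinct on purpose.
#[derive(Debug, PartialEq)]
enum StepOutcome {
    Replied,
    TimedOut,
    Closed,
}

/// One scripted step with exactly what was sent and what came back for it.
struct StepRecord {
    label: String,
    sent: String,
    received: Vec<String>,
    outcome: StepOutcome,
}

/// Renders the steps in script order, one section per step, with no interpretation.
fn render_transcript(steps: &[StepRecord]) -> String {
    let mut out = String::new();
    for step in steps {
        out.push_str(&format!("## {}\n", step.label));
        out.push_str(&format!("sent: {}\n", step.sent));
        for frame in &step.received {
            out.push_str(&format!("recv: {}\n", frame));
        }
        out.push_str(&format!("outcome: {:?}\n", step.outcome));
    }
    out
}
```

A register step followed by an array action step would then produce two separate sections, so a reply to the register frame cannot be mistaken for a reply to the later action.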
|
||||
|
||||
- [ ] **Step 2: Add only the missing regression coverage**
|
||||
|
||||
If current tests do **not** already prove the Release-browser bootstrap shape, add the narrowest failing regression in `tests/browser_ws_probe_test.rs`.
|
||||
|
||||
Preferred shape if coverage is missing:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn probe_supports_register_then_array_action_script() {
|
||||
// fake server expects:
|
||||
// 1. {"type":"register","role":"web"}
|
||||
// 2. ["http://127.0.0.1/helper.html","sgBrowerserGetUrls","showUrls"]
|
||||
}
|
||||
```
|
||||
|
||||
And, if still missing, add one regression proving per-step transcript separation for the register reply and later action reply.
|
||||
|
||||
If those behaviors are already covered, skip new test creation and record the exact test names to rely on.
|
||||
|
||||
- [ ] **Step 3: Run the relevant probe tests**
|
||||
|
||||
Run the narrowest exact tests that prove the Release bootstrap behavior, or the full file if multiple areas changed:
|
||||
|
||||
```bash
cargo test --test browser_ws_probe_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 4: Make the probe binary ergonomic for the Release transcript if needed**
|
||||
|
||||
Only if the current CLI cannot conveniently express the real Release-browser script, make the smallest change needed in `src/bin/sgbrowser_ws_probe.rs` / `src/browser/ws_probe.rs` so it can capture:
|
||||
- register frame behavior
|
||||
- minimal `sgBrowserSetTheme`
|
||||
- minimal `sgBrowerserGetUrls`
|
||||
- exact inbound websocket text per step
|
||||
|
||||
Do not redesign the probe if it already supports this.
|
||||
|
||||
- [ ] **Step 5: Run the live probe against the Release browser and record the real bootstrap**
|
||||
|
||||
Use the probe binary against the real endpoint to capture at minimum:
|
||||
- register frame behavior
|
||||
- minimal `sgBrowserSetTheme`
|
||||
- minimal `sgBrowerserGetUrls`
|
||||
- whether replies come back as websocket text, page-function invocation only, or both
|
||||
|
||||
Save the exact transcript in `docs/_tmp_release_ws_callback_host_transcript.md`.
|
||||
|
||||
Required output in that temp doc:
|
||||
- exact sent frames
|
||||
- exact received websocket frames
|
||||
- the observed rule for when named callback functions are invoked
|
||||
- whether Option A or Option B is supported by evidence
|
||||
|
||||
- [ ] **Step 6: Commit the probe-only slice if code changed**
|
||||
|
||||
If probe code/tests changed:
|
||||
|
||||
```bash
git add src/browser/ws_probe.rs src/bin/sgbrowser_ws_probe.rs tests/browser_ws_probe_test.rs docs/_tmp_release_ws_callback_host_transcript.md
git commit -m "test: capture release browser ws bootstrap contract"
```
|
||||
|
||||
If only the transcript doc changed, stage only that file and use a docs/test-appropriate commit message.
|
||||
|
||||
---
|
||||
|
||||
## Task 2: Choose the narrowest Zhihu-only production seam from the probe evidence
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs` (only if required)
|
||||
- Modify: `src/browser/mod.rs`
|
||||
- Modify: `src/compat/workflow_executor.rs` (only if required)
|
||||
- Create: `src/browser/zhihu_release_backend.rs`
|
||||
- Create: `src/service/browser_callback_host.rs` **only if required**
|
||||
- Test: `tests/service_ws_session_test.rs`
|
||||
- Test: `tests/service_task_flow_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write down the seam decision in the plan notes before coding**
|
||||
|
||||
Based on the transcript from Task 1, record which one of these is supported by evidence:
|
||||
- Option A: a Zhihu-scoped backend can talk to the Release browser directly with no service-hosted helper page
|
||||
- Option B: a Zhihu-scoped backend needs a service-controlled helper page because named page callbacks are the only reliable way to get business results
|
||||
|
||||
Do not proceed until one option is chosen explicitly from evidence.
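The decision rule can be written down mechanically. The following is a minimal sketch, assuming the Task 1 transcript can be summarized as two observations; `ReplyEvidence`, `Seam`, and `choose_seam` are hypothetical names, and treating "results arrive as websocket text" as sufficient for Option A is one reading of the criteria above, not a rule stated elsewhere in this plan.

```rust
/// Hypothetical summary of what the Task 1 transcript showed.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ReplyEvidence {
    /// Business results arrived as websocket text frames.
    pub results_in_ws_text: bool,
    /// Business results arrived through named page callback functions.
    pub results_via_page_callbacks: bool,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Seam {
    /// Option A: direct backend, no service-hosted helper page.
    DirectBackend,
    /// Option B: service-controlled helper page.
    HelperPage,
}

/// Returns `None` when the transcript supports neither option yet,
/// which corresponds to "do not proceed".
pub fn choose_seam(evidence: ReplyEvidence) -> Option<Seam> {
    match (evidence.results_in_ws_text, evidence.results_via_page_callbacks) {
        // Websocket text is enough, so page callbacks are not the only way.
        (true, _) => Some(Seam::DirectBackend),
        // Page callbacks are the only observed way to get results.
        (false, true) => Some(Seam::HelperPage),
        // No usable evidence.
        (false, false) => None,
    }
}
```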
|
||||
|
||||
- [ ] **Step 2: Add a failing service/task-flow regression that proves only the Zhihu path changes**
|
||||
|
||||
Update or add focused tests so that:
|
||||
- Zhihu submit flow uses the new Release-browser interaction seam
|
||||
- non-Zhihu behavior is unchanged
|
||||
- pipe messages remain unchanged
|
||||
|
||||
Required assertions:
|
||||
- the new path is activated only for Zhihu route detection
|
||||
- `ClientMessage` / `ServiceMessage` stay identical
|
||||
- existing non-Zhihu submit behavior is not accidentally rerouted
|
||||
|
||||
- [ ] **Step 3: Run the new focused regression and confirm failure first**
|
||||
|
||||
Run only the tests you added, by exact name:
|
||||
```bash
cargo test --test service_ws_session_test <new_test_name> -- --nocapture
cargo test --test service_task_flow_test <new_test_name> -- --nocapture
```
|
||||
|
||||
Expected: FAIL because the Zhihu-specific seam does not exist yet.
|
||||
|
||||
- [ ] **Step 4: Implement the chosen seam with the smallest blast radius**
|
||||
|
||||
If Option A won:
|
||||
- add `src/browser/zhihu_release_backend.rs`
|
||||
- wire it only where the Zhihu direct-execution route is selected
|
||||
- leave global submit-path wiring alone
|
||||
|
||||
If Option B won:
|
||||
- add `src/service/browser_callback_host.rs` with one specific transport and one explicit readiness/correlation model
|
||||
- add `src/browser/zhihu_release_backend.rs` to talk to that helper path
|
||||
- wire it only for the Zhihu route
|
||||
|
||||
In both cases:
|
||||
- do not change non-Zhihu callers
|
||||
- do not redesign `run_submit_task_with_browser_backend(...)`
|
||||
- do not change the pipe protocol
|
||||
|
||||
- [ ] **Step 5: Make the status of old direct-ws code explicit**
|
||||
|
||||
Update `src/browser/ws_backend.rs` / `tests/browser_ws_backend_test.rs` only as needed so they no longer ambiguously describe the active Zhihu path.
|
||||
|
||||
Allowed outcomes:
|
||||
- keep them untouched as legacy/direct-ws coverage with no Zhihu production caller
|
||||
- narrow/update the tests so they no longer claim the active Zhihu integration path
|
||||
|
||||
Not allowed:
|
||||
- leaving the plan and code in a state where both old and new paths appear to be the active Zhihu contract
|
||||
|
||||
- [ ] **Step 6: Run focused integration tests**
|
||||
|
||||
Run:
|
||||
```bash
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 7: Commit the seam-selection slice**
|
||||
|
||||
Adjust staged files to match the option actually implemented, for example:
|
||||
|
||||
```bash
git add src/browser/zhihu_release_backend.rs src/browser/mod.rs src/service/server.rs src/service/browser_callback_host.rs tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/browser_ws_backend_test.rs
git commit -m "feat: route zhihu flow through release browser ws contract"
```
|
||||
|
||||
Only stage files that were truly changed.
|
||||
|
||||
---
|
||||
|
||||
## Task 3: Implement Zhihu action mapping on the chosen Release-browser seam
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/browser/ws_protocol.rs`
|
||||
- Modify: `src/browser/zhihu_release_backend.rs`
|
||||
- Test: `tests/browser_ws_protocol_test.rs`
|
||||
- Create: `tests/browser_zhihu_release_backend_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the first failing backend test for Zhihu navigate mapping**
|
||||
|
||||
Create `tests/browser_zhihu_release_backend_test.rs` with a fake transport/relay and assert that `Action::Navigate` for the Zhihu path becomes the exact browser request shape proven by Task 1.
|
||||
|
||||
Start with this shape:
|
||||
|
||||
```rust
#[test]
fn zhihu_release_backend_maps_navigate_to_proven_release_frame() {
    // invoke Action::Navigate
    // assert exact outbound frame/opcode chosen from transcript evidence
}
```
|
||||
|
||||
Required assertions:
|
||||
- the call site still uses `BrowserBackend::invoke(...)`
|
||||
- the exact outbound frame matches the recorded Release-browser evidence
|
||||
- request correlation stays deterministic
|
||||
|
||||
- [ ] **Step 2: Run the single new backend test and verify it fails**
|
||||
|
||||
Run:
|
||||
```bash
cargo test --test browser_zhihu_release_backend_test zhihu_release_backend_maps_navigate_to_proven_release_frame -- --nocapture
```
|
||||
|
||||
Expected: FAIL because the backend does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Implement minimal `Navigate` support**
|
||||
|
||||
In `src/browser/zhihu_release_backend.rs`:
|
||||
- implement `BrowserBackend`
|
||||
- support `Action::Navigate` first
|
||||
- use `ws_protocol.rs` helpers for exact browser-frame construction
|
||||
- do not hardcode speculative opcodes; follow the transcript from Task 1
|
||||
|
||||
- [ ] **Step 4: Add failing tests for `GetText` and `Eval`**
|
||||
|
||||
Add tests proving:
|
||||
- `Action::GetText` returns `CommandOutput.data == {"text": "..."}`
|
||||
- `Action::Eval` returns `CommandOutput.data == {"text": "..."}`
|
||||
- callback or relay failures become `PipeError::Protocol(...)`
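The expected result shape can be sketched with stand-in types. Everything below is an assumption made for illustration: the local `PipeError`, `CommandOutput`, and `RelayReply` definitions only imitate the crate's real types (whose `data` field is presumably a JSON value rather than a map), and `text_output` is a hypothetical helper.

```rust
use std::collections::BTreeMap;

/// Stand-in for the crate's error type; only the variant named in this plan.
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Protocol(String),
}

/// Stand-in for the crate's output type; a string map imitates a JSON object.
#[derive(Debug, PartialEq)]
pub struct CommandOutput {
    pub data: BTreeMap<String, String>,
}

/// Hypothetical reply from the callback or relay layer.
pub enum RelayReply {
    Text(String),
    Failure(String),
}

/// Maps a reply to the `{"text": "..."}` shape, or to a protocol error.
pub fn text_output(reply: RelayReply) -> Result<CommandOutput, PipeError> {
    match reply {
        RelayReply::Text(text) => {
            let mut data = BTreeMap::new();
            data.insert("text".to_string(), text);
            Ok(CommandOutput { data })
        }
        RelayReply::Failure(reason) => Err(PipeError::Protocol(reason)),
    }
}
```

Routing `GetText` and `Eval` through the same mapping keeps the two assertions identical in shape, so one pair of tests can cover both actions.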
|
||||
|
||||
- [ ] **Step 5: Implement `GetText` and `Eval` on the chosen seam**
|
||||
|
||||
Use the smallest proven mechanism:
|
||||
- if the transcript proves page-defined callback functions are required, route through them
|
||||
- if `callBackJsToCpp(...)` to a page context is still part of the proven path, use it deliberately
|
||||
- if `sgBrowserRegJsFun` / `sgBrowserExcuteJsFun` becomes necessary, add it only with test coverage and only for the Zhihu path
|
||||
|
||||
- [ ] **Step 6: Run focused backend/protocol tests**
|
||||
|
||||
Run:
|
||||
```bash
cargo test --test browser_zhihu_release_backend_test -- --nocapture
cargo test --test browser_ws_protocol_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 7: Commit the Zhihu backend slice**
|
||||
|
||||
```bash
git add src/browser/ws_protocol.rs src/browser/zhihu_release_backend.rs src/browser/mod.rs tests/browser_ws_protocol_test.rs tests/browser_zhihu_release_backend_test.rs
git commit -m "feat: add zhihu release ws backend"
```
|
||||
|
||||
---
|
||||
|
||||
## Task 4: Keep the Zhihu workflow logic stable and patch only proven mismatches
|
||||
|
||||
**Files:**
|
||||
- Review: `src/compat/workflow_executor.rs`
|
||||
- Test: `tests/service_task_flow_test.rs`
|
||||
- Test: `tests/compat_runtime_test.rs` (only if a focused direct-execution regression is needed)
|
||||
|
||||
- [ ] **Step 1: Write a failing Zhihu-specific regression only if the chosen seam changes route assumptions**
|
||||
|
||||
If the new Zhihu backend changes request-url or target-url handling enough to break hotlist flow, add one focused failing regression for that exact behavior.
|
||||
|
||||
Candidate assertions:
|
||||
- hotlist navigate still logs `navigate https://www.zhihu.com/hot`
|
||||
- follow-up `GetText body` still targets the Zhihu page, not any helper page
|
||||
- extractor `Eval` still runs against Zhihu, not any helper page
|
||||
|
||||
- [ ] **Step 2: Keep the current high-level Zhihu action sequence unless a test proves otherwise**
|
||||
|
||||
`src/compat/workflow_executor.rs` currently does the right high-level work:
|
||||
- navigate to Zhihu hotlist
|
||||
- poll body text until ready
|
||||
- run the extractor script
|
||||
|
||||
Prefer to keep this file unchanged. Only patch it if the new backend needs a narrow explicit `target_url` fix or similar evidence-backed adjustment.
|
||||
|
||||
- [ ] **Step 3: Run the smallest Zhihu-focused verification sweep**
|
||||
|
||||
Run:
|
||||
```bash
cargo test --test service_task_flow_test -- --nocapture
cargo test --test compat_runtime_test zhihu -- --nocapture
```
|
||||
|
||||
If the `compat_runtime_test zhihu` filter is too broad or unstable, run the exact focused Zhihu cases that cover hotlist extraction.
|
||||
|
||||
- [ ] **Step 4: Commit only if a Zhihu-specific code change was actually required**
|
||||
|
||||
```bash
git add src/compat/workflow_executor.rs tests/service_task_flow_test.rs tests/compat_runtime_test.rs
git commit -m "fix: keep zhihu workflow aligned with release ws backend"
```
|
||||
|
||||
Skip this commit if no production change in `workflow_executor.rs` was needed.
|
||||
|
||||
---
|
||||
|
||||
## Task 5: Prove that pipe behavior and non-Zhihu behavior stayed unchanged
|
||||
|
||||
**Files:**
|
||||
- Test: `tests/service_ws_session_test.rs`
|
||||
- Test: `tests/service_task_flow_test.rs`
|
||||
- Test: `tests/task_runner_test.rs`
|
||||
|
||||
- [ ] **Step 1: Add or update one regression that proves pipe messages are unchanged**
|
||||
|
||||
Use the smallest existing test seam to assert that `ClientMessage` / `ServiceMessage` payloads remain unchanged while the Zhihu route uses the new browser integration path internally.
|
||||
|
||||
- [ ] **Step 2: Add or update one regression that proves non-Zhihu behavior is unchanged**
|
||||
|
||||
Use a non-Zhihu submit or service-session case and assert it does not take the new Zhihu-specific backend path.
|
||||
|
||||
- [ ] **Step 3: Preserve current runtime regression guards**
|
||||
|
||||
The end-to-end tests must continue asserting that output does **not** contain:
|
||||
- `invalid hmac seed: session key must not be empty`
|
||||
- `Cannot drop a runtime in a context where blocking is not allowed`
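A guard like this can be kept in one small helper so every end-to-end test checks the same strings. This is a sketch, not existing test code: `FORBIDDEN_OUTPUT` and `forbidden_hits` are hypothetical names, and only the two error strings themselves come from this plan.

```rust
/// Error strings that must never appear in end-to-end output.
pub const FORBIDDEN_OUTPUT: [&str; 2] = [
    "invalid hmac seed: session key must not be empty",
    "Cannot drop a runtime in a context where blocking is not allowed",
];

/// Returns every forbidden string found in `output`.
/// An empty result means the regression guard passes.
pub fn forbidden_hits(output: &str) -> Vec<&'static str> {
    FORBIDDEN_OUTPUT
        .iter()
        .copied()
        .filter(|needle| output.contains(*needle))
        .collect()
}
```

A test would then assert `forbidden_hits(&captured_output).is_empty()`, where `captured_output` stands for whatever the existing test already collects; returning the hits rather than a boolean makes a failure message name the offending string.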
|
||||
|
||||
- [ ] **Step 4: Run the final focused verification sweep**
|
||||
|
||||
Run:
|
||||
```bash
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Commit the verification sweep**
|
||||
|
||||
```bash
git add tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/task_runner_test.rs tests/browser_ws_backend_test.rs
git commit -m "test: constrain zhihu release ws migration scope"
```
|
||||
|
||||
Only stage files that were truly changed.
|
||||
|
||||
---
|
||||
|
||||
## Out of scope
|
||||
|
||||
Do **not** do these in this slice:
|
||||
- change the pipe protocol
|
||||
- change `ClientMessage` / `ServiceMessage`
|
||||
- redesign `run_submit_task_with_browser_backend(...)`
|
||||
- reintroduce any browser bridge surface
|
||||
- keep adding speculative direct-raw-websocket callback patches to `ws_backend.rs`
|
||||
- redesign non-Zhihu workflows unless the new backend abstraction forces a shared fix
|
||||
- create a long-lived external dependency or third-party server just to host the helper page
|
||||
|
||||
---
|
||||
|
||||
## Verification checklist
|
||||
|
||||
Run at minimum:
|
||||
|
||||
```bash
cargo test --test browser_ws_probe_test -- --nocapture
cargo test --test browser_zhihu_release_backend_test -- --nocapture
cargo test --test browser_ws_protocol_test -- --nocapture
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```
|
||||
|
||||
If Task 2 chose the helper-page / relay design, also run the helper-page-specific backend tests you added for that path.
|
||||
|
||||
Manual verification after code changes:
|
||||
|
||||
1. start the real Release browser/runtime that exposes `ws://127.0.0.1:12345`
2. start `sg_claw` with real config
3. start `sg_claw_client`
4. submit:
   - `打开知乎热榜,获取前10条数据,并导出 Excel` ("Open the Zhihu hot list, fetch the top 10 entries, and export to Excel")
5. confirm the Zhihu path uses the exact Release-browser interaction seam proven by Task 1
6. if Task 2 chose Option B, confirm the helper page / relay path is used only for the Zhihu integration seam
7. confirm non-Zhihu behavior is unchanged
8. confirm the task completes without:
   - `timeout while waiting for browser message`
   - `invalid browser status frame: Welcome! You are client #1`
   - `invalid hmac seed: session key must not be empty`
   - `Cannot drop a runtime in a context where blocking is not allowed`
|
||||
|
||||
---
|
||||
|
||||
## Expected outcome
|
||||
|
||||
After this slice:
|
||||
- sgclaw still exposes the same pipe/service contract
|
||||
- Zhihu hotlist execution uses the Release-browser websocket contract proven by Task 1
|
||||
- non-Zhihu behavior remains unchanged
|
||||
- old direct-ws Zhihu assumptions are no longer ambiguous in production/tests
|
||||
- if Option A won, Zhihu uses a direct Release-browser backend
|
||||
- if Option B won, Zhihu uses the minimal helper-page / relay seam justified by the probe evidence
|
||||
322
docs/superpowers/plans/2026-04-04-zhihu-ws-submit-realignment.md
Normal file
@@ -0,0 +1,322 @@
|
||||
# Zhihu WS Submit Realignment Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Realign Zhihu submit routes to the documented websocket callback model, removing helper-page bootstrap from the mainline while keeping the existing pipe/service contract unchanged.
|
||||
|
||||
**Architecture:** The change stays inside the existing submit-path backend selection and websocket protocol flow. Zhihu routes stop choosing `BrowserCallbackBackend` and instead use `WsBrowserBackend` when a real browser websocket is configured, preserving the existing pipe fallback in direct runtime when no websocket URL is available.
|
||||
|
||||
**Tech Stack:** Rust, tungstenite websocket client/server, serde_json, cargo test
|
||||
|
||||
---
|
||||
|
||||
## File Map
|
||||
|
||||
- Modify: `src/service/server.rs`
|
||||
- Change only the Zhihu route-gated submit-path backend selection
|
||||
- Remove Zhihu submit mainline use of `LiveBrowserCallbackHost` / `BrowserCallbackBackend`
|
||||
- Keep service submit path on `WsBrowserBackend`
|
||||
- Preserve initial request URL derivation for Zhihu routes
|
||||
- Modify: `src/agent/mod.rs`
|
||||
- Change only the Zhihu route-gated submit-path backend selection
|
||||
- Remove Zhihu submit mainline use of `LiveBrowserCallbackHost` / `BrowserCallbackBackend`
|
||||
- Keep direct runtime pipe fallback when browser websocket URL is absent
|
||||
- Modify: `tests/agent_runtime_test.rs`
|
||||
- Replace helper-page bootstrap regression with direct websocket submit regression
|
||||
- Assert no `/sgclaw/browser-helper.html` bootstrap frames are emitted
|
||||
- Assert real-page request ownership on follow-up Zhihu actions
|
||||
- Modify: `src/browser/callback_host.rs`
|
||||
- Remove or rewrite the now-stale red test that preserves Option-B callback-host startup behavior, but only if it blocks this slice (see Task 4)
|
||||
- Verify: `tests/browser_ws_backend_test.rs`
|
||||
- Reuse existing websocket request-url behavior coverage; extend only if the new regression proves insufficient
|
||||
- Reference: `docs/superpowers/specs/2026-04-04-zhihu-ws-submit-realignment-design.md`
|
||||
|
||||
### Task 1: Rewrite the stale submit regression around the real websocket mainline
|
||||
|
||||
**Files:**
|
||||
- Modify: `tests/agent_runtime_test.rs:507-660`
|
||||
- Test: `tests/agent_runtime_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the failing test**
|
||||
|
||||
Rename and rewrite the existing helper-page regression so it asserts the new behavior:
|
||||
|
||||
```rust
#[test]
fn production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap() {
    // arrange runtime context and fake browser ws server
    // submit Zhihu hotlist request
    // assert ws frames never contain "/sgclaw/browser-helper.html"
    // assert first action is navigate to https://www.zhihu.com/hot
    // assert follow-up action uses real-page requesturl instead of helper page
}
```
|
||||
|
||||
Use the existing fake ws helpers in the file where possible. Do not add localhost callback-host HTTP plumbing to this rewritten test.
|
||||
|
||||
- [ ] **Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
|
||||
|
||||
Expected: FAIL because current production code still routes Zhihu submit into `BrowserCallbackBackend` and emits helper-page bootstrap frames.
|
||||
|
||||
- [ ] **Step 3: Keep the regression focused**
|
||||
|
||||
Before touching production code, confirm the rewritten test checks only these behaviors:
|
||||
|
||||
```text
- no callback-host bootstrap frame
- no helper-page URL
- navigate frame still targets https://www.zhihu.com/hot
- follow-up websocket action uses real-page request ownership
```
|
||||
|
||||
Do not assert unrelated workflow details beyond what is needed to prove the route correction.
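The first three checks can be expressed as one helper over the frames a fake websocket server recorded. This is a sketch under assumptions: `check_route_correction` is a hypothetical name, the frames are assumed to be raw outbound text frames with the navigate action recorded first, and the request-ownership check for follow-up actions is deliberately left out because its exact frame shape is not given here.

```rust
pub const HELPER_PAGE_PATH: &str = "/sgclaw/browser-helper.html";
pub const ZHIHU_HOT_URL: &str = "https://www.zhihu.com/hot";

/// Checks recorded outbound frames for the route correction:
/// no helper-page frame anywhere, and the first frame targets the hotlist.
pub fn check_route_correction(frames: &[String]) -> Result<(), String> {
    if let Some(bad) = frames.iter().find(|f| f.contains(HELPER_PAGE_PATH)) {
        return Err(format!("helper-page frame emitted: {}", bad));
    }
    match frames.first() {
        Some(first) if first.contains(ZHIHU_HOT_URL) => Ok(()),
        Some(first) => Err(format!("first frame does not target the hotlist: {}", first)),
        None => Err("no frames recorded".to_string()),
    }
}
```

Returning a `Result` with a message, rather than asserting inside the helper, keeps the failing frame visible in the test output.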
|
||||
|
||||
- [ ] **Step 4: Commit the red test**
|
||||
|
||||
```bash
git add tests/agent_runtime_test.rs
git commit -m "test: rewrite zhihu submit ws routing regression"
```
|
||||
|
||||
### Task 2: Switch service Zhihu submit routes off the callback-host backend
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs:287-328`
|
||||
- Test: `tests/agent_runtime_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the minimal production change**
|
||||
|
||||
Replace only the Zhihu-route callback-host branch with direct websocket backend selection.
|
||||
|
||||
Minimal target shape:
|
||||
|
||||
```rust
fn browser_backend_for_submit(
    browser_ws_url: &str,
    mac_policy: &MacPolicy,
    request: &SubmitTaskRequest,
) -> Result<Arc<dyn BrowserBackend>, PipeError> {
    // Zhihu routes: no callback-host backend any more, use the websocket backend.
    if should_use_callback_host_backend(request) {
        return Ok(Arc::new(WsBrowserBackend::new(
            Arc::new(ServiceWsClient::connect(browser_ws_url)?),
            mac_policy.clone(),
            initial_request_url_for_submit_task(request),
        )));
    }

    // All other routes: unchanged. Both branches now build the same backend.
    Ok(Arc::new(WsBrowserBackend::new(
        Arc::new(ServiceWsClient::connect(browser_ws_url)?),
        mac_policy.clone(),
        initial_request_url_for_submit_task(request),
    )))
}
```
|
||||
|
||||
After the route-gated branch is removed, simplify further only if the branch becomes redundant without changing non-Zhihu behavior.
|
||||
|
||||
- [ ] **Step 2: Run the rewritten regression**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
|
||||
|
||||
Expected: the test still FAILs, or fails at a later assertion, until the direct-runtime path is also corrected in Task 3.
|
||||
|
||||
- [ ] **Step 3: Add or update a service-specific regression if needed**
|
||||
|
||||
If the rewritten `agent_runtime_test` does not exercise the service submit path directly, add one narrow service regression before continuing.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
#[test]
fn service_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap() {
    // fake browser ws
    // submit Zhihu route through service path
    // assert no helper bootstrap frame
}
```
|
||||
|
||||
Run the exact test you end up using:
|
||||
|
||||
`cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test <exact test file> -- --nocapture`
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 4: Commit the service-path fix**
|
||||
|
||||
```bash
git add src/service/server.rs tests/agent_runtime_test.rs
git commit -m "fix: route zhihu submit through ws backend"
```
|
||||
|
||||
### Task 3: Switch direct runtime Zhihu submit routes off the callback-host backend while keeping pipe fallback
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/agent/mod.rs:49-100`
|
||||
- Test: `tests/agent_runtime_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the minimal production change**
|
||||
|
||||
Remove callback-host backend selection from `browser_backend_for_submit(...)`.
|
||||
|
||||
Minimal target behavior:
|
||||
|
||||
```rust
// A configured browser websocket selects the websocket backend for every route.
if let Some(browser_ws_url) = configured_browser_ws_url(context) {
    return Ok(Arc::new(WsBrowserBackend::new(
        Arc::new(ServiceWsClient::connect(&browser_ws_url)?),
        browser_tool.mac_policy().clone(),
        initial_request_url_for_submit_task(request),
    ).with_response_timeout(browser_tool.response_timeout())));
}

// No websocket URL configured: keep the existing pipe fallback.
Ok(Arc::new(PipeBrowserBackend::from_inner(browser_tool.clone())))
```
|
||||
|
||||
If `ServiceWsClient` is not reusable from `src/service/server.rs`, extract the smallest shared websocket client helper into the browser module instead of inventing a new abstraction.
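The selection rule itself is small enough to test in isolation. The sketch below assumes a stand-in `BackendChoice` enum in place of the real `Arc<dyn BrowserBackend>` return value, and it assumes a blank URL should count as "not configured"; both `BackendChoice` and `choose_backend` are hypothetical names.

```rust
/// Stand-in for the backend choice; the real code returns a trait object.
#[derive(Debug, PartialEq)]
pub enum BackendChoice {
    /// Use the websocket backend against this URL.
    Ws { url: String },
    /// Fall back to the existing pipe backend.
    Pipe,
}

/// Picks the websocket backend only when a non-blank URL is configured.
pub fn choose_backend(configured_ws_url: Option<&str>) -> BackendChoice {
    match configured_ws_url.map(str::trim) {
        Some(url) if !url.is_empty() => BackendChoice::Ws { url: url.to_string() },
        _ => BackendChoice::Pipe,
    }
}
```

Note that the route (Zhihu or not) is not an input here: after this task, only the presence of a websocket URL decides the backend in the direct runtime.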
|
||||
|
||||
- [ ] **Step 2: Add a focused fallback assertion only if needed**
|
||||
|
||||
If the rewritten regression does not cover the direct-runtime no-websocket case, add one small test:
|
||||
|
||||
```rust
#[test]
fn production_submit_task_keeps_pipe_fallback_when_browser_ws_url_is_unset() {
    // no SGCLAW_BROWSER_WS_URL
    // blank/no ws config
    // assert no websocket bootstrap attempt occurs
}
```
|
||||
|
||||
Only add this test if current coverage is insufficient.
|
||||
|
||||
- [ ] **Step 3: Run tests to verify green**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
|
||||
|
||||
Expected: PASS
|
||||
|
||||
If a fallback test was added, run it immediately after and expect PASS.
|
||||
|
||||
- [ ] **Step 4: Commit the direct-runtime fix**
|
||||
|
||||
```bash
git add src/agent/mod.rs tests/agent_runtime_test.rs
git commit -m "fix: align runtime zhihu submit with ws contract"
```
|
||||
|
||||
### Task 4: Reassess stale callback-host regression coverage only if it blocks the approved slice
|
||||
|
||||
**Files:**
|
||||
- Maybe modify: `src/browser/callback_host.rs:793-810`
|
||||
- Test: `src/browser/callback_host.rs`
|
||||
|
||||
- [ ] **Step 1: Check whether the callback-host red test still blocks the approved Option A slice**
|
||||
|
||||
Inspect whether this test still preserves rejected Option-B behavior and whether it fails or becomes misleading after Tasks 1-3:
|
||||
|
||||
```rust
#[test]
fn live_callback_host_starts_without_bootstrapping_external_helper_page() {
    // inspect before editing
}
```
|
||||
|
||||
If the test is unrelated to the approved Zhihu mainline or remains harmless, leave it unchanged in this slice.
|
||||
|
||||
- [ ] **Step 2: Remove or rewrite only if required by the changed submit-path behavior**
|
||||
|
||||
If the test blocks the approved slice, make the smallest change needed:
|
||||
|
||||
- delete it if it exists only to preserve rejected Option B behavior, or
|
||||
- rewrite it so it no longer asserts callback-host startup as the accepted Zhihu mainline
|
||||
|
||||
- [ ] **Step 3: Run focused callback-host tests only if Step 2 changed code**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_host --lib -- --nocapture`
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 4: Commit only if Step 2 changed code**
|
||||
|
||||
```bash
git add src/browser/callback_host.rs
git commit -m "test: clean up stale callback host regression"
```
|
||||
|
||||
### Task 5: Run the focused verification sweep
|
||||
|
||||
**Files:**
|
||||
- Verify: `tests/agent_runtime_test.rs`
|
||||
- Verify: `tests/compat_runtime_test.rs`
|
||||
- Verify: any directly affected service/browser websocket tests
|
||||
|
||||
- [ ] **Step 1: Run submit-path regression coverage**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 2: Run websocket backend request-url coverage**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" ws_backend_reuses_last_navigated_url_for_followup_requests --test browser_ws_backend_test -- --nocapture`
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 3: Run Zhihu compat runtime coverage**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" zhihu --test compat_runtime_test -- --nocapture`
|
||||
|
||||
Expected: PASS for the changed submit-path surface; any failures that remain must be clearly and directly related to this change.
|
||||
|
||||
- [ ] **Step 4: Run affected service submit regression coverage**
|
||||
|
||||
Run the exact service-specific regression from Task 2 if you added one.
|
||||
|
||||
Otherwise, run the narrowest existing service submit test that covers `ClientMessage::SubmitTask` handling for browser routes.
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 5: Commit the verified slice**
|
||||
|
||||
```bash
git add src/service/server.rs src/agent/mod.rs tests/agent_runtime_test.rs src/browser/callback_host.rs
git commit -m "fix: realign zhihu submit with browser ws callbacks"
```
|
||||
|
||||
### Task 6: Run stronger real-browser validation
|
||||
|
||||
**Files:**
|
||||
- Verify live behavior through existing binaries and real config only
|
||||
|
||||
- [ ] **Step 1: Start the service with the real config**
|
||||
|
||||
Run: `cargo run --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"`
|
||||
|
||||
Expected: service starts without failing at callback-host readiness timeout.
|
||||
|
||||
- [ ] **Step 2: Run the client against the started service**
|
||||
|
||||
Run: `cargo run --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sg_claw_client`
|
||||
|
||||
Expected: for `打开知乎热榜,获取前10条数据,并导出 Excel` ("Open the Zhihu hot list, fetch the top 10 entries, and export to Excel"), the browser proceeds to real Zhihu page work instead of stalling before the page opens.
|
||||
|
||||
- [ ] **Step 3: Capture the narrow acceptance evidence**
|
||||
|
||||
Verify all of the following from logs/frames/observed behavior:
|
||||
|
||||
```text
- no callback-host readiness timeout
- no helper-page bootstrap frame
- at least one real-page follow-up browser action after navigate
```
|
||||
|
||||
- [ ] **Step 4: Commit only if live verification required code changes**
|
||||
|
||||
```bash
git add <only files changed during live-fix follow-up>
git commit -m "fix: tighten zhihu ws submit live validation follow-up"
```
|
||||
|
||||
If no further code changes were needed, do not create an extra commit.
|
||||
@@ -0,0 +1,406 @@
|
||||
# Service Chat Web Console Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Add a standalone local HTML console that connects to the existing service websocket, submits natural-language tasks with the current `submit_task` payload, and leaves the browser-helper/runtime path untouched.
|
||||
|
||||
**Architecture:** The change stays fully at the presentation edge. A new self-contained HTML file under `frontend/service-console/` reuses the current websocket protocol from `src/service/protocol.rs`, while one narrow Rust integration test guards the page's protocol shape and forbids any reference to `browser-helper.html`, callback-host endpoints, or the browser websocket. No Rust runtime logic changes are part of this slice.
|
||||
|
||||
**Tech Stack:** HTML, CSS, vanilla JavaScript, Rust integration tests, std::fs, Cargo test
|
||||
|
||||
---
|
||||
|
||||
## File Map
|
||||
|
||||
- Create: `frontend/service-console/sg_claw_service_console.html`
|
||||
- Standalone local page with inline CSS and JavaScript
|
||||
- Connects to the existing service websocket at `ws://127.0.0.1:42321` by default
|
||||
- Sends existing `ClientMessage::SubmitTask` JSON
|
||||
- Renders inbound `ServiceMessage` rows only
|
||||
- Create: `tests/service_console_html_test.rs`
|
||||
- Source guard for the standalone page
|
||||
- Verifies file location, allowed protocol usage, and forbidden helper/callback references
|
||||
- Reference: `src/service/protocol.rs`
|
||||
- Existing websocket message shape to mirror exactly
|
||||
- Reference: `src/bin/sg_claw_client.rs`
|
||||
- Existing terminal client behavior to mirror for `submit_task`
|
||||
- Reference: `docs/superpowers/specs/2026-04-06-service-chat-web-console-design.md`
|
||||
|
||||
## Scope Guardrails
|
||||
|
||||
- Do not modify `src/service/server.rs`.
|
||||
- Do not modify `src/browser/callback_host.rs`.
|
||||
- Do not modify `src/browser/callback_backend.rs`.
|
||||
- Do not modify `src/bin/sg_claw_client.rs`.
|
||||
- Do not add an HTTP server.
|
||||
- Do not connect the new page to `ws://127.0.0.1:12345`.
|
||||
- Do not reference `/sgclaw/browser-helper.html` or `/sgclaw/callback/*` anywhere in the new page.
|
||||
|
||||
### Task 1: Add a failing source-guard test for the standalone page
|
||||
|
||||
**Files:**
|
||||
- Create: `tests/service_console_html_test.rs`
|
||||
- Reference: `docs/superpowers/specs/2026-04-06-service-chat-web-console-design.md`
|
||||
|
||||
- [ ] **Step 1: Write the failing test**
|
||||
|
||||
Create a focused integration test that resolves the HTML path from `CARGO_MANIFEST_DIR` and asserts the file contract.
|
||||
|
||||
```rust
use std::fs;
use std::path::PathBuf;

#[test]
fn service_console_html_stays_on_service_ws_boundary() {
    let manifest_dir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
    let html_path = manifest_dir
        .join("frontend")
        .join("service-console")
        .join("sg_claw_service_console.html");
    let source = fs::read_to_string(&html_path)
        .expect("service console html should exist");

    assert!(source.contains("ws://127.0.0.1:42321"));
    assert!(source.contains("submit_task"));
    assert!(!source.contains("/sgclaw/browser-helper.html"));
    assert!(!source.contains("/sgclaw/callback/ready"));
    assert!(!source.contains("/sgclaw/callback/events"));
    assert!(!source.contains("/sgclaw/callback/commands/next"));
    assert!(!source.contains("/sgclaw/callback/commands/ack"));
    assert!(!source.contains("ws://127.0.0.1:12345"));
}
```
|
||||
|
||||
- [ ] **Step 2: Run test to verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_console_html_stays_on_service_ws_boundary --test service_console_html_test -- --exact
```
|
||||
|
||||
Expected: FAIL because the HTML file does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Keep the test narrow**
|
||||
|
||||
Before writing production code, confirm the test guards only the approved boundary:
|
||||
|
||||
```text
- file exists at frontend/service-console/sg_claw_service_console.html
- service websocket default is present
- submit_task payload marker is present
- no helper-page path
- no callback-host endpoints
- no browser websocket URL
```
|
||||
|
||||
Do not turn this into an end-to-end browser test.
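
The approved boundary above can also be read as one pure function over the page source, which keeps the guard easy to reason about. The following is only a sketch: `boundary_violations`, `REQUIRED_MARKERS`, and `FORBIDDEN_MARKERS` are hypothetical names that do not appear in the plan, and the marker strings are copied from the test in Step 1.

```rust
/// Markers the page source must contain (taken from the Step 1 test).
const REQUIRED_MARKERS: [&str; 2] = ["ws://127.0.0.1:42321", "submit_task"];

/// Markers the page source must never contain (taken from the Step 1 test).
const FORBIDDEN_MARKERS: [&str; 6] = [
    "/sgclaw/browser-helper.html",
    "/sgclaw/callback/ready",
    "/sgclaw/callback/events",
    "/sgclaw/callback/commands/next",
    "/sgclaw/callback/commands/ack",
    "ws://127.0.0.1:12345",
];

/// Hypothetical helper: returns one human-readable entry per broken rule.
/// An empty vector means the source stays on the service websocket boundary.
fn boundary_violations(source: &str) -> Vec<String> {
    let mut violations = Vec::new();
    for marker in REQUIRED_MARKERS {
        if !source.contains(marker) {
            violations.push(format!("missing required marker: {marker}"));
        }
    }
    for marker in FORBIDDEN_MARKERS {
        if source.contains(marker) {
            violations.push(format!("forbidden marker present: {marker}"));
        }
    }
    violations
}
```

If the integration test were written against such a helper, a single `assert!(boundary_violations(&source).is_empty())` would report every broken rule at once rather than stopping at the first failed assertion. Whether that is worth the extra indirection is a judgment call for this slice.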
|
||||
|
||||
- [ ] **Step 4: Commit the red test**
|
||||
|
||||
```bash
git add tests/service_console_html_test.rs
git commit -m "test: add service console html boundary guard"
```
|
||||
|
||||
### Task 2: Implement the standalone HTML console with the approved boundary
|
||||
|
||||
**Files:**
|
||||
- Create: `frontend/service-console/sg_claw_service_console.html`
|
||||
- Reference: `src/service/protocol.rs:6`
|
||||
- Reference: `src/bin/sg_claw_client.rs:16`
|
||||
- Test: `tests/service_console_html_test.rs`
|
||||
|
||||
- [ ] **Step 1: Create the HTML file with the minimal structure**
|
||||
|
||||
Write one self-contained page with:
|
||||
|
||||
```html
<!doctype html>
<html lang="zh-CN">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1" />
  <title>sgClaw Service Console</title>
</head>
<body>
  <div id="app">
    <input id="wsUrl" value="ws://127.0.0.1:42321" />
    <button id="connectBtn">连接</button>
    <div id="connectionState">未连接</div>
    <div id="messageStream"></div>
    <textarea id="instructionInput"></textarea>
    <div id="validationText"></div>
    <button id="sendBtn" disabled>发送任务</button>
  </div>
</body>
</html>
```
|
||||
|
||||
Keep all CSS and JavaScript inline. Do not add external assets or a build step.
|
||||
|
||||
- [ ] **Step 2: Implement websocket connect/disconnect behavior**
|
||||
|
||||
Add the smallest possible JS behavior, including explicit disconnect on the same button so the UI matches the approved connect/disconnect contract:
|
||||
|
||||
```javascript
let socket = null;

function appendRow(kind, text) {
  // append a visible row to #messageStream
}

function updateUiState() {
  const connected = socket && socket.readyState === WebSocket.OPEN;
  document.getElementById('connectBtn').textContent = connected ? '断开' : '连接';
  document.getElementById('sendBtn').disabled = !connected;
  document.getElementById('connectionState').textContent = connected ? '已连接' : '未连接';
}

function connectOrDisconnectService() {
  if (socket && socket.readyState === WebSocket.OPEN) {
    socket.close();
    return;
  }

  const url = document.getElementById('wsUrl').value.trim() || 'ws://127.0.0.1:42321';
  socket = new WebSocket(url);
  updateUiState();
  socket.addEventListener('open', () => {
    appendRow('status', 'service websocket connected');
    updateUiState();
  });
  socket.addEventListener('close', () => {
    appendRow('status', 'service websocket disconnected');
    updateUiState();
  });
  socket.addEventListener('error', () => appendRow('error', 'service websocket error'));
  socket.addEventListener('message', handleMessage);
}
```
|
||||
|
||||
Do not add retry loops or background reconnect logic.
|
||||
|
||||
- [ ] **Step 3: Implement submit_task sending with the current message shape**
|
||||
|
||||
Mirror the terminal client payload shape exactly and show inline validation for empty input:
|
||||
|
||||
```javascript
function setValidation(message) {
  document.getElementById('validationText').textContent = message;
}

function sendTask() {
  const instruction = document.getElementById('instructionInput').value.trim();
  if (!socket || socket.readyState !== WebSocket.OPEN) {
    return;
  }
  if (!instruction) {
    setValidation('请输入任务内容。');
    return;
  }

  setValidation('');
  socket.send(JSON.stringify({
    type: 'submit_task',
    instruction,
    conversation_id: '',
    messages: [],
    page_url: '',
    page_title: ''
  }));
}
```
|
||||
|
||||
Do not add new fields. Do not add conversation replay logic in this slice.
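
For comparison with the terminal client, the same six-field payload can be sketched on the Rust side. This is an illustration only: `submit_task_json` and `escape_json` are hypothetical helpers written against the standard library, the field names are copied from the JavaScript above, and the real client presumably serializes `ClientMessage::SubmitTask` through its own serializer rather than by hand.

```rust
/// Hypothetical helper: escapes a string for embedding in a JSON string literal.
fn escape_json(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Hypothetical helper: builds the submit_task frame with exactly the six
/// fields the page sends. Returns None for empty input, mirroring the page's
/// inline validation rule.
fn submit_task_json(instruction: &str) -> Option<String> {
    let instruction = instruction.trim();
    if instruction.is_empty() {
        return None;
    }
    Some(format!(
        "{{\"type\":\"submit_task\",\"instruction\":\"{}\",\"conversation_id\":\"\",\"messages\":[],\"page_url\":\"\",\"page_title\":\"\"}}",
        escape_json(instruction)
    ))
}
```

The point of the sketch is the shape: one populated field (`instruction`) and five empty placeholders, with nothing added for conversation replay.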
|
||||
|
||||
- [ ] **Step 4: Render existing inbound service messages only**
|
||||
|
||||
Handle the current `ServiceMessage` variants with a minimal dispatcher:
|
||||
|
||||
```javascript
function handleMessage(event) {
  const message = JSON.parse(event.data);
  switch (message.type) {
    case 'status_changed':
      appendRow('status', message.state);
      break;
    case 'log_entry':
      appendRow('log', message.message);
      break;
    case 'task_complete':
      appendRow(message.success ? 'complete' : 'error', message.summary);
      break;
    case 'busy':
      appendRow('error', message.message);
      break;
    default:
      appendRow('error', 'unknown service message: ' + event.data);
  }
}
```
|
||||
|
||||
Keep the composer enabled during in-flight work so repeated submits surface the existing `busy` response instead of inventing a frontend queue.
|
||||
|
||||
- [ ] **Step 5: Keep the helper boundary explicit in the source**
|
||||
|
||||
Before running tests, inspect the HTML source and confirm:
|
||||
|
||||
```text
- no /sgclaw/browser-helper.html
- no /sgclaw/callback/*
- no ws://127.0.0.1:12345
- no browser websocket register frame logic
```
|
||||
|
||||
If any such string appears, remove it before testing.
|
||||
|
||||
- [ ] **Step 6: Run the source-guard test to verify green**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_console_html_stays_on_service_ws_boundary --test service_console_html_test -- --exact
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 7: Commit the standalone page**
|
||||
|
||||
```bash
git add frontend/service-console/sg_claw_service_console.html tests/service_console_html_test.rs
git commit -m "feat: add standalone service chat console"
```
|
||||
|
||||
### Task 3: Run the focused verification sweep
|
||||
|
||||
**Files:**
|
||||
- Verify: `tests/service_console_html_test.rs`
|
||||
- Reference: `src/service/protocol.rs`
|
||||
- Reference: `src/bin/sg_claw_client.rs`
|
||||
|
||||
- [ ] **Step 1: Re-run the source-guard test**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_console_html_stays_on_service_ws_boundary --test service_console_html_test -- --exact
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 2: Manually inspect disconnected-send and validation markers in the HTML source**
|
||||
|
||||
Before broader verification, confirm the page source clearly contains all three UI-local rules:
|
||||
|
||||
```text
- connect button can disconnect an open websocket
- send button starts disabled while disconnected
- empty instruction shows inline validation text
```
|
||||
|
||||
This inspection stays source-level; do not add extra backend tests for it in this slice.
|
||||
|
||||
- [ ] **Step 3: Run an existing service protocol regression for safety**
|
||||
|
||||
Run the narrow existing protocol coverage to prove the page did not require backend changes:
|
||||
|
||||
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" submit_task_client_message_converts_into_shared_runner_request --test service_ws_session_test -- --exact
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 4: Run an existing terminal-client regression for safety**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" client_submits_first_user_line_to_service --test service_task_flow_test -- --exact
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 5: Commit only if verification required any code change**
|
||||
|
||||
```bash
git add frontend/service-console/sg_claw_service_console.html tests/service_console_html_test.rs
git commit -m "test: tighten service console verification"
```
|
||||
|
||||
If verification required no code changes, do not create an extra commit.
|
||||
|
||||
### Task 4: Perform the manual smoke check
|
||||
|
||||
**Files:**
|
||||
- Verify live behavior only; no new code required
|
||||
|
||||
- [ ] **Step 1: Start the existing service binary**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo run --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```
|
||||
|
||||
Expected: service starts and prints its ready line with the service websocket listen address.
|
||||
|
||||
- [ ] **Step 2: Open the standalone page directly**
|
||||
|
||||
Open:
|
||||
|
||||
```text
|
||||
D:/data/ideaSpace/rust/sgClaw/claw-new/frontend/service-console/sg_claw_service_console.html
|
||||
```
|
||||
|
||||
Expected: the page loads through the browser as a local file and shows the default websocket URL `ws://127.0.0.1:42321`.
|
||||
|
||||
- [ ] **Step 3: Connect, disconnect, and reconnect once**
|
||||
|
||||
Expected:
|
||||
|
||||
```text
- message stream shows websocket connected
- clicking the same button disconnects the websocket cleanly
- message stream shows websocket disconnected
- send button is disabled again while disconnected
- reconnect succeeds without reloading the page
```
|
||||
|
||||
- [ ] **Step 4: Submit one natural-language task**
|
||||
|
||||
Use a small harmless instruction such as:
|
||||
|
||||
```text
|
||||
打开百度
|
||||
```
|
||||
|
||||
Expected:
|
||||
|
||||
```text
- empty textarea send attempt first shows inline validation without sending a websocket frame
- page sends one submit_task payload after valid input
- page receives and renders status/log/task_complete or busy rows
```
|
||||
|
||||
- [ ] **Step 5: Confirm the helper boundary stayed untouched**
|
||||
|
||||
Verify from the page source and observed behavior:
|
||||
|
||||
```text
- the page never loads /sgclaw/browser-helper.html
- the page never calls /sgclaw/callback/*
- the page never connects to ws://127.0.0.1:12345
```
|
||||
|
||||
If the task itself triggers browser automation, that remains owned by the existing Rust runtime rather than by the page.
|
||||
|
||||
- [ ] **Step 6: Commit only if the manual pass required code changes**
|
||||
|
||||
```bash
git add frontend/service-console/sg_claw_service_console.html tests/service_console_html_test.rs
git commit -m "fix: tighten standalone service console smoke flow"
```
|
||||
|
||||
If the manual pass required no code changes, do not create an extra commit.
|
||||
@@ -0,0 +1,637 @@
|
||||
# Zhihu Hotlist Post-Export Auto-Open Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Extend the existing Zhihu hotlist Excel and dashboard routes so each route can auto-open its own generated artifact after export, while preserving the current callback-host-backed browser boundary and route exclusivity.
|
||||
|
||||
**Architecture:** Keep orchestration in `src/compat/workflow_executor.rs`, but move post-export side effects into a new `src/compat/artifact_open.rs` helper so workflow routing stays readable. Excel auto-open is a local OS-launch side effect; dashboard auto-open reuses `screen_html_export`'s existing `presentation.url` and sends one narrow, marker-based `Action::Navigate` request through `BrowserCallbackBackend`, with a matching special-case validator in `MacPolicy` so arbitrary `file://` navigation remains blocked.
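
One way to picture the narrow `MacPolicy` special case described above is as a pure check over the navigate URL, the expected-domain marker, and the marker payload. The sketch below is an assumption-laden illustration, not the approved validator: `LocalDashboardOpen` and `validate_local_dashboard_open` are hypothetical names, the field names are taken from the marker payload used later in this plan, and the path comparison ignores percent-encoding.

```rust
/// Hypothetical struct; fields are named after the marker payload keys in this plan.
pub struct LocalDashboardOpen<'a> {
    pub source: &'a str,
    pub kind: &'a str,
    pub output_path: &'a str,
    pub presentation_url: &'a str,
}

/// Hypothetical validator: accepts only the one approved local-dashboard shape.
pub fn validate_local_dashboard_open(
    url: &str,
    expected_domain: &str,
    marker: &LocalDashboardOpen<'_>,
) -> Result<(), String> {
    if expected_domain != "__sgclaw_local_dashboard__" {
        return Err(format!("unexpected domain marker: {expected_domain}"));
    }
    if marker.source != "compat.workflow_executor" || marker.kind != "zhihu_hotlist_screen" {
        return Err("marker source/kind is not the approved pair".to_string());
    }
    if url != marker.presentation_url {
        return Err("url does not match presentation_url".to_string());
    }
    let local = url
        .strip_prefix("file://")
        .ok_or_else(|| "only file:// URLs qualify".to_string())?;
    if !local.to_ascii_lowercase().ends_with(".html") {
        return Err("local target is not an .html file".to_string());
    }
    // Assumption: paths are compared after normalising separators and the
    // leading slash; percent-encoding is deliberately not handled here.
    let from_url = local.trim_start_matches('/');
    let from_marker = marker.output_path.replace('\\', "/");
    if from_url != from_marker.trim_start_matches('/') {
        return Err("file:// URL and output_path point at different files".to_string());
    }
    Ok(())
}
```

Because every branch rejects by default, an ordinary `file://` navigate request without the full marker never reaches `Ok(())`, which is the property the architecture note asks for.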
|
||||
|
||||
**Tech Stack:** Rust, serde_json, std::process::Command, std::path, Cargo tests
|
||||
|
||||
---
|
||||
|
||||
## File Map
|
||||
|
||||
- Create: `src/compat/artifact_open.rs`
|
||||
- Define the narrow post-export helper surface for this slice only
|
||||
- Parse and validate generated artifact payload fields passed in by the workflow layer
|
||||
- Open generated `.xlsx` files with the local default app
|
||||
- Build the exact approved local-dashboard navigate payload
|
||||
- Keep one testable internal seam, `open_exported_xlsx_with(output_path, opener)`, so unit tests can prove the generated `.xlsx` path is handed to the launcher without starting a real spreadsheet app
|
||||
- Include unit tests in the same file for exact Excel path handoff and launcher-failure reporting
|
||||
- Modify: `src/compat/mod.rs`
|
||||
- Export the new `artifact_open` module
|
||||
- Modify: `src/compat/workflow_executor.rs`
|
||||
- Keep route detection and artifact generation where they are now
|
||||
- Change `export_xlsx(...)` and `export_screen(...)` so they parse tool payloads, call the route-specific opener, and produce the new success/failure summaries
|
||||
- Modify: `src/browser/callback_backend.rs`
|
||||
- Recognize only the approved local-dashboard navigate request shape at `Action::Navigate`
|
||||
- Keep normal remote navigate behavior unchanged
|
||||
- Continue emitting `sgBrowerserOpenPage` for the approved local-dashboard case so the helper page stays alive and the dashboard opens in a new visible tab
|
||||
- Add focused callback-backend unit tests in the existing test module for approved and malformed local-dashboard requests
|
||||
- Modify: `src/security/mac_policy.rs`
|
||||
- Add a narrow validator for the approved local-dashboard presentation case
|
||||
- Keep `validate(...)` unchanged for ordinary remote-domain flow
|
||||
- Reject malformed marker payloads, non-HTML local paths, and mismatched `file://` / output-path combinations
|
||||
- Modify: `tests/compat_runtime_test.rs`
|
||||
- Keep the concrete hotlist workflow regressions in this existing integration test file
|
||||
- Extend existing Zhihu hotlist export/screen regressions to assert the new summaries and the dashboard marker payload
|
||||
- Keep the Excel route workflow assertion limited to summary plus “no dashboard navigate marker,” because exact launcher handoff is covered in `src/compat/artifact_open.rs` unit tests
|
||||
- Modify: `tests/browser_tool_test.rs`
|
||||
- Add `MacPolicy` coverage for approved local-dashboard presentation, rejected malformed presentation, and unchanged normal-domain validation in one exact file
|
||||
- Extend the existing `default_rules_allow_zhihu_navigation` area with the new local-dashboard validation tests rather than creating a second policy test location
|
||||
- Reference only if summary wording ripples outward: `tests/agent_runtime_test.rs:173-258`
|
||||
- Existing direct-runtime user-visible summary assertion for Zhihu Excel export
|
||||
- Reference only if summary wording ripples outward: `tests/service_task_flow_test.rs:704-839`
|
||||
- Existing CLI-to-service user-visible summary assertion for Zhihu Excel export
|
||||
- Reference only if summary wording ripples outward: `tests/service_ws_session_test.rs:755-869`
|
||||
- Existing service-binary user-visible summary assertion for Zhihu Excel export
|
||||
- Reference: `tests/compat_screen_html_export_tool_test.rs`
|
||||
- Reuse the exact test seam `screen_html_export_tool_renders_dashboard_html_with_presentation_contract`
|
||||
- Existing proof that `screen_html_export` already returns `presentation.url`
|
||||
- Reference: `docs/superpowers/specs/2026-04-06-zhihu-hotlist-post-export-auto-open-design.md`
|
||||
|
||||
## Scope Guardrails
|
||||
|
||||
- Do not modify `frontend/service-console/sg_claw_service_console.html`.
|
||||
- Do not modify `src/service/protocol.rs`.
|
||||
- Do not modify `browser-helper.html`.
|
||||
- Do not modify `/sgclaw/callback/*` endpoint contracts.
|
||||
- Do not modify websocket protocol framing or `src/browser/ws_protocol.rs`.
|
||||
- Do not turn Excel-open and dashboard-open into a combined mode.
|
||||
- Do not add a general-purpose local file browser or generic `file://` allowlist.
|
||||
- Do not move post-export decisions into the frontend service console.
|
||||
- Do not require websocket-backend parity in this slice.
|
||||
|
||||
### Task 1: Add failing workflow tests for route-specific post-export actions
|
||||
|
||||
**Files:**
|
||||
- Modify: `tests/compat_runtime_test.rs:2154-2304`
|
||||
- Reference: `src/compat/workflow_executor.rs:375-446`
|
||||
- Reference: `docs/superpowers/specs/2026-04-06-zhihu-hotlist-post-export-auto-open-design.md`
|
||||
|
||||
- [ ] **Step 1: Rewrite the Excel hotlist assertion as a red test for the new summary only**
|
||||
|
||||
Keep the current flow setup, but tighten the expectation so it proves the workflow route now reports post-export open success while staying exclusive from the dashboard path.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
#[test]
fn handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open() {
    // existing setup
    assert!(summary.contains("已导出并打开知乎热榜 Excel"));
    assert!(generated.exists());
    assert!(!sent.iter().any(|message| {
        matches!(
            message,
            AgentMessage::Command { action, params, .. }
                if action == &Action::Navigate
                    && params.get("sgclaw_local_dashboard_open").is_some()
        )
    }));
}
```
|
||||
|
||||
Do not try to prove real OS launching in this workflow test. The exact `.xlsx` path handoff to the launcher belongs in `src/compat/artifact_open.rs` unit tests from Task 2.
|
||||
|
||||
- [ ] **Step 2: Rewrite the dashboard hotlist assertion as a red test for browser auto-open**
|
||||
|
||||
Tighten the existing dashboard test so it proves the workflow consumes `presentation.url` and emits the approved compat marker payload.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
#[test]
fn handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open() {
    // existing setup
    assert!(summary.contains("已在浏览器中打开知乎热榜大屏"));
    let navigate = sent.iter().find_map(|message| match message {
        AgentMessage::Command { action, params, security, .. }
            if action == &Action::Navigate
                && security.expected_domain == "__sgclaw_local_dashboard__" => Some((params, security)),
        _ => None,
    }).expect("dashboard route should emit local-dashboard navigate request");

    assert!(navigate.0["url"].as_str().unwrap().starts_with("file://"));
    assert_eq!(navigate.0["sgclaw_local_dashboard_open"]["source"], json!("compat.workflow_executor"));
    assert_eq!(navigate.0["sgclaw_local_dashboard_open"]["kind"], json!("zhihu_hotlist_screen"));
    assert_eq!(navigate.0["sgclaw_local_dashboard_open"]["presentation_url"], navigate.0["url"]);
}
```
|
||||
|
||||
Also assert that this route still logs `call screen_html_export` and does not invoke the Excel opener path.
|
||||
|
||||
- [ ] **Step 3: Add a missing-`presentation.url` regression in the workflow test module if none exists**
|
||||
|
||||
Put this close to the existing hotlist tests and keep it narrow:
|
||||
|
||||
```rust
#[test]
fn handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing() {
    // mock screen_html_export success payload with output_path but no presentation.url
    // assert summary contains 已生成知乎热榜大屏 <path>,但浏览器自动打开失败:
}
```
|
||||
|
||||
Use the existing summary/path helpers in the file instead of inventing new parsing helpers.
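
The behaviour this regression pins down can be sketched as a small decision function: a missing `presentation.url` is treated as a failed open, so the user still sees where the dashboard was generated. The names here are hypothetical (`dashboard_summary` and the `open` closure are not in the plan), the exact failure reason text is an assumption, and the two summary prefixes are copied from this plan.

```rust
/// Hypothetical helper: picks the dashboard summary from the parsed payload.
/// `open` stands in for whatever performs the browser navigate.
fn dashboard_summary<F>(output_path: &str, presentation_url: Option<&str>, open: F) -> String
where
    F: FnOnce(&str) -> Result<(), String>,
{
    let outcome = match presentation_url {
        Some(url) if !url.trim().is_empty() => open(url),
        // Assumed wording for the protocol error; only the summary prefix is specified.
        _ => Err("screen_html_export payload is missing presentation.url".to_string()),
    };
    match outcome {
        Ok(()) => format!("已在浏览器中打开知乎热榜大屏 {output_path}"),
        Err(reason) => {
            format!("已生成知乎热榜大屏 {output_path},但浏览器自动打开失败:{reason}")
        }
    }
}
```

Note that the opener is never called when the URL is absent, so no navigate request can be emitted for a malformed payload.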
|
||||
|
||||
- [ ] **Step 4: Run the focused compat runtime tests to verify they fail**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing --test compat_runtime_test -- --exact
```
|
||||
|
||||
Expected: FAIL because the workflow still returns artifact-only summaries and has no post-export open handling.
|
||||
|
||||
- [ ] **Step 5: Commit the red workflow tests**
|
||||
|
||||
```bash
git add tests/compat_runtime_test.rs
git commit -m "test: add hotlist post-export auto-open regressions"
```
|
||||
|
||||
### Task 2: Implement the compat post-export opener and update workflow summaries
|
||||
|
||||
**Files:**
|
||||
- Create: `src/compat/artifact_open.rs`
|
||||
- Modify: `src/compat/mod.rs`
|
||||
- Modify: `src/compat/workflow_executor.rs:375-446`
|
||||
- Test: `src/compat/artifact_open.rs`
|
||||
- Test: `tests/compat_runtime_test.rs`
|
||||
|
||||
- [ ] **Step 1: Add the red unit tests in `src/compat/artifact_open.rs` before writing production code**
|
||||
|
||||
Create the new module with a `#[cfg(test)]` block first so the Excel opener has an exact, non-UI verification seam.
|
||||
|
||||
Target tests:
|
||||
|
||||
```rust
// Lives in the `#[cfg(test)] mod tests` block of src/compat/artifact_open.rs.
use super::*;
use std::path::{Path, PathBuf};

#[test]
fn open_exported_xlsx_with_passes_generated_path_to_launcher() {
    let mut seen = None;
    let result = open_exported_xlsx_with(Path::new("C:/tmp/zhihu-hotlist.xlsx"), |path| {
        seen = Some(path.to_path_buf());
        Ok(())
    });
    assert!(matches!(result, PostExportOpen::Opened));
    assert_eq!(seen.unwrap(), PathBuf::from("C:/tmp/zhihu-hotlist.xlsx"));
}

#[test]
fn open_exported_xlsx_with_reports_launcher_failure() {
    let result = open_exported_xlsx_with(Path::new("C:/tmp/zhihu-hotlist.xlsx"), |_path| {
        Err("launcher failed".to_string())
    });
    assert!(matches!(result, PostExportOpen::Failed(reason) if reason.contains("launcher failed")));
}
```
|
||||
|
||||
Add one matching dashboard payload test in the same file:
|
||||
|
||||
```rust
#[test]
fn open_local_dashboard_uses_exact_approved_marker_payload() {
    // FakeBrowserBackend records invoke(action, params, expected_domain)
    // assert expected_domain == "__sgclaw_local_dashboard__"
    // assert params.url == params.sgclaw_local_dashboard_open.presentation_url
    // assert source/kind/output_path all match the approved contract
}
```
|
||||
|
||||
This step is mandatory so the Excel route is proven to hand the generated path to the opener without launching a real application.
|
||||
|
||||
- [ ] **Step 2: Run the new unit tests to verify they fail**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_passes_generated_path_to_launcher --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_reports_launcher_failure --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_local_dashboard_uses_exact_approved_marker_payload --lib -- --exact
```
|
||||
|
||||
Expected: FAIL because `src/compat/artifact_open.rs` does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Create the small compat opener module**
|
||||
|
||||
Add one focused helper module rather than embedding side effects directly into `workflow_executor.rs`.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
pub const LOCAL_DASHBOARD_EXPECTED_DOMAIN: &str = "__sgclaw_local_dashboard__";
pub const LOCAL_DASHBOARD_SOURCE: &str = "compat.workflow_executor";
pub const LOCAL_DASHBOARD_KIND_ZHIHU_HOTLIST_SCREEN: &str = "zhihu_hotlist_screen";

pub enum PostExportOpen {
    Opened,
    Failed(String),
}

pub fn open_exported_xlsx(output_path: &Path) -> PostExportOpen {
    open_exported_xlsx_with(output_path, launch_with_default_xlsx_app)
}

fn open_exported_xlsx_with<F>(output_path: &Path, opener: F) -> PostExportOpen
where
    F: FnOnce(&Path) -> Result<(), String>,
{ /* test seam */ }

pub fn open_local_dashboard(
    browser_backend: &dyn BrowserBackend,
    output_path: &Path,
    presentation_url: &str,
) -> PostExportOpen { /* invoke Action::Navigate with exact marker payload */ }
```
|
||||
|
||||
Keep the module tiny. The only dedicated test seam in this file should be `open_exported_xlsx_with(...)`; do not introduce a general launcher trait.
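
A possible body for the test seam is sketched below. It assumes the signature from the target shape above and adds two things the plan does not specify: `Debug`/`PartialEq` derives on `PostExportOpen` (for convenient assertions) and an extension guard that refuses anything that is not an `.xlsx` file. Treat both as suggestions.

```rust
use std::path::Path;

// Derives are an addition for easier assertions; the plan's enum has none.
#[derive(Debug, PartialEq)]
pub enum PostExportOpen {
    Opened,
    Failed(String),
}

/// Sketch of the seam: validates the artifact, then delegates to `opener`.
fn open_exported_xlsx_with<F>(output_path: &Path, opener: F) -> PostExportOpen
where
    F: FnOnce(&Path) -> Result<(), String>,
{
    // Assumed guard: only generated .xlsx artifacts are handed to the launcher.
    let is_xlsx = output_path
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| ext.eq_ignore_ascii_case("xlsx"))
        .unwrap_or(false);
    if !is_xlsx {
        return PostExportOpen::Failed(format!(
            "not an .xlsx artifact: {}",
            output_path.display()
        ));
    }
    // The launcher's error text is surfaced unchanged so the summary can show it.
    match opener(output_path) {
        Ok(()) => PostExportOpen::Opened,
        Err(reason) => PostExportOpen::Failed(reason),
    }
}
```

Taking the opener as a plain `FnOnce` closure keeps the seam local to this function, which is in line with the instruction not to introduce a general launcher trait.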
|
||||
|
||||
- [ ] **Step 4: Implement the Windows-first `.xlsx` opener minimally**
|
||||
|
||||
Use a focused local launcher that targets the current environment first.
|
||||
|
||||
Preferred target shape:
|
||||
|
||||
```rust
Command::new("cmd")
    .args(["/C", "start", "", output_path_as_windows_string])
```
|
||||
|
||||
Requirements:
|
||||
|
||||
```text
- fail if the path does not exist
- do not swallow command-spawn errors
- do not open arbitrary user-selected files from outside this workflow
- keep cross-platform behavior minimal; only add a fallback branch if required to keep tests/build portable
```
|
||||
|
||||
If you need a non-Windows fallback for compilation, keep it obviously minimal and out of the hot path.
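
Putting the requirements together, the launcher might look roughly like the sketch below. `launch_with_default_xlsx_app` is the name used in the target shape above; `spawn_default_app` is a hypothetical helper, converting forward slashes to backslashes is an assumption about what `output_path_as_windows_string` means, and waiting on `cmd` via `status()` rather than `spawn()` is a design guess aimed at surfacing failures.

```rust
use std::path::Path;

/// Sketch: refuses missing files, then hands the path to the platform launcher.
fn launch_with_default_xlsx_app(output_path: &Path) -> Result<(), String> {
    if !output_path.is_file() {
        return Err(format!(
            "exported file not found: {}",
            output_path.display()
        ));
    }
    spawn_default_app(output_path)
}

#[cfg(windows)]
fn spawn_default_app(output_path: &Path) -> Result<(), String> {
    use std::process::Command;

    // Assumption: `start` is given a backslash-separated path.
    let windows_path = output_path.to_string_lossy().replace('/', "\\");
    // `start` treats its first quoted argument as a window title, hence the "".
    let status = Command::new("cmd")
        .args(["/C", "start", ""])
        .arg(&windows_path)
        .status()
        .map_err(|err| format!("failed to run cmd: {err}"))?;
    if status.success() {
        Ok(())
    } else {
        Err(format!("cmd /C start exited with {status}"))
    }
}

#[cfg(not(windows))]
fn spawn_default_app(output_path: &Path) -> Result<(), String> {
    // Minimal fallback so the crate still builds elsewhere; it never launches anything.
    Err(format!(
        "auto-open is only sketched for Windows: {}",
        output_path.display()
    ))
}
```

The non-Windows branch only exists to keep the build portable, matching the "obviously minimal and out of the hot path" guidance.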
|
||||
|
||||
- [ ] **Step 5: Parse payloads in `workflow_executor.rs` and call the new helper**
|
||||
|
||||
Refactor `export_xlsx(...)` and `export_screen(...)` just enough to separate:
|
||||
|
||||
```text
- tool execution
- payload parsing
- route-specific post-export open
- summary formatting
```
|
||||
|
||||
Minimal target behavior:
|
||||
|
||||
```rust
// Assumes `output_path` is a `PathBuf`; `Path` has no `Display` impl, so format it via `.display()`.
match open_exported_xlsx(&output_path) {
    PostExportOpen::Opened => format!("已导出并打开知乎热榜 Excel {}", output_path.display()),
    PostExportOpen::Failed(reason) => format!(
        "已导出知乎热榜 Excel {},但自动打开失败:{reason}",
        output_path.display()
    ),
}
```
|
||||
|
||||
```rust
// Assumes `output_path` is a `PathBuf`; `Path` has no `Display` impl, so format it via `.display()`.
match open_local_dashboard(browser_backend, &output_path, &presentation_url) {
    PostExportOpen::Opened => format!("已在浏览器中打开知乎热榜大屏 {}", output_path.display()),
    PostExportOpen::Failed(reason) => format!(
        "已生成知乎热榜大屏 {},但浏览器自动打开失败:{reason}",
        output_path.display()
    ),
}
```
|
||||
|
||||
Change signatures only as much as needed to pass `browser_backend` into the dashboard route. Do not broaden unrelated call chains.
|
||||
|
||||
- [ ] **Step 6: Export the helper module**
|
||||
|
||||
Update `src/compat/mod.rs`:
|
||||
|
||||
```rust
|
||||
pub mod artifact_open;
|
||||
```
|
||||
|
||||
Do not reorder unrelated module exports unless rustfmt does it.
|
||||
|
||||
- [ ] **Step 7: Run the focused library and workflow regressions to verify green**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_passes_generated_path_to_launcher --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_reports_launcher_failure --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_local_dashboard_uses_exact_approved_marker_payload --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing --test compat_runtime_test -- --exact
```
|
||||
|
||||
Expected: PASS for the new library tests and the workflow regressions, unless the dashboard-open path still fails at backend/policy validation.
|
||||
|
||||
- [ ] **Step 8: Commit the compat opener and workflow changes**
|
||||
|
||||
```bash
git add src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs tests/compat_runtime_test.rs
git commit -m "feat: auto-open zhihu hotlist export artifacts"
```
|
||||
|
||||
### Task 3: Add failing backend and security tests for the narrow local-dashboard allowance
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/browser/callback_backend.rs:536-840`
|
||||
- Modify: `tests/browser_tool_test.rs` (`default_rules_allow_zhihu_navigation` section plus new local-dashboard validation tests)
|
||||
- Reference: `src/security/mac_policy.rs:56-132`
|
||||
|
||||
- [ ] **Step 1: Add a red callback-backend acceptance test for the approved local-dashboard request shape**
|
||||
|
||||
Extend the existing `src/browser/callback_backend.rs` test module with one focused navigate test.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
#[test]
fn callback_backend_accepts_approved_local_dashboard_navigate_request() {
    let host = Arc::new(FakeCallbackHost::new(vec![success_reply(json!({ "navigated": true }))]));
    let backend = BrowserCallbackBackend::new(
        host.clone(),
        test_policy(),
        "http://127.0.0.1:17888/sgclaw/browser-helper.html",
    );

    let output = backend.invoke(
        Action::Navigate,
        json!({
            "url": "file:///C:/tmp/zhihu-hotlist-screen.html",
            "sgclaw_local_dashboard_open": {
                "source": "compat.workflow_executor",
                "kind": "zhihu_hotlist_screen",
                "output_path": "C:/tmp/zhihu-hotlist-screen.html",
                "presentation_url": "file:///C:/tmp/zhihu-hotlist-screen.html"
            }
        }),
        "__sgclaw_local_dashboard__",
    );

    assert!(output.unwrap().success);
    assert_eq!(host.requests()[0].command, json!([
        "http://127.0.0.1:17888/sgclaw/browser-helper.html",
        "sgBrowerserOpenPage",
        "file:///C:/tmp/zhihu-hotlist-screen.html"
    ]));
}
```
|
||||
|
||||
Do not weaken any existing normal-domain tests.
|
||||
|
||||
- [ ] **Step 2: Add red rejection tests in their designated files**
|
||||
|
||||
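Put malformed-request rejection in `src/browser/callback_backend.rs` next to the acceptance test. The empty test bodies shown in this step are name-only placeholders; give each one real assertions so that it fails until Task 4 lands: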
|
||||
|
||||
```rust
#[test]
fn callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields() {}
```
|
||||
|
||||
Put policy-only validation in `tests/browser_tool_test.rs` so all public `MacPolicy` assertions stay in one place:
|
||||
|
||||
```rust
#[test]
fn mac_policy_rejects_non_html_local_dashboard_presentation() {}

#[test]
fn default_rules_allow_zhihu_navigation() {
    let policy = MacPolicy::load_from_path(...).unwrap();
    policy.validate(&Action::Navigate, "www.zhihu.com").unwrap();
}
```
|
||||
|
||||
Do not create a second `MacPolicy` regression location.
|
||||
|
||||
- [ ] **Step 3: Run the focused backend/policy tests to verify red**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_accepts_approved_local_dashboard_navigate_request --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" mac_policy_rejects_non_html_local_dashboard_presentation --test browser_tool_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" default_rules_allow_zhihu_navigation --test browser_tool_test -- --exact
```
|
||||
|
||||
Expected: the new local-dashboard tests FAIL; `default_rules_allow_zhihu_navigation` should still PASS.
|
||||
|
||||
- [ ] **Step 4: Commit the red backend/security tests**
|
||||
|
||||
```bash
git add src/browser/callback_backend.rs tests/browser_tool_test.rs
git commit -m "test: lock local dashboard navigate boundary"
```
|
||||
|
||||
### Task 4: Implement the narrow callback-backend and MacPolicy allowance
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/browser/callback_backend.rs:300-351`
|
||||
- Modify: `src/security/mac_policy.rs:56-132`
|
||||
- Maybe modify: `src/security/mod.rs:9-27`
|
||||
- Test: `src/browser/callback_backend.rs:536-840`
|
||||
- Test: `tests/browser_tool_test.rs` (`default_rules_allow_zhihu_navigation` section plus new local-dashboard validation tests)
|
||||
|
||||
- [ ] **Step 1: Add a narrow local-dashboard validation helper in `MacPolicy`**
|
||||
|
||||
Keep `validate(...)` unchanged for ordinary domain flow. Add one small explicit helper instead.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
pub fn validate_local_dashboard_presentation(
    &self,
    action: &Action,
    expected_domain: &str,
    presentation_url: &str,
    output_path: &str,
) -> Result<(), SecurityError> {
    // require Action::Navigate
    // require expected_domain == "__sgclaw_local_dashboard__"
    // require file:// URL
    // require .html path
    // require normalized file URL path matches output_path
}
```
|
||||
|
||||
If you need a new `SecurityError` variant for malformed local-dashboard input, add the smallest one that keeps error text clear.
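The five checks listed in the target shape could be sketched as follows. This is a minimal, standalone illustration and not the project's implementation: `Action` and `LocalDashboardError` are hypothetical stand-ins for the real `Action` and `SecurityError` types, the function is free-standing rather than a `MacPolicy` method, and the URL handling assumes a plain `file:///` URL with no host component and no percent-encoding.

```rust
/// Hypothetical stand-in for the project's `Action` enum.
#[derive(Debug, PartialEq)]
pub enum Action {
    Navigate,
    Click,
}

/// Hypothetical stand-in for a `SecurityError` variant set.
#[derive(Debug, PartialEq)]
pub enum LocalDashboardError {
    WrongAction,
    WrongDomainMarker,
    NotFileUrl,
    NotHtml,
    PathMismatch,
}

const LOCAL_DASHBOARD_MARKER: &str = "__sgclaw_local_dashboard__";

/// Turn `file:///C:/tmp/x.html` into `C:/tmp/x.html` (or `file:///tmp/x.html`
/// into `/tmp/x.html`). Assumption: no host part and no percent-decoding.
fn file_url_to_path(url: &str) -> Option<String> {
    let rest = url.strip_prefix("file://")?;
    let bytes = rest.as_bytes();
    // Drop the slash that precedes a Windows drive letter such as `/C:`.
    let has_drive = bytes.len() >= 3
        && bytes[0] == b'/'
        && bytes[1].is_ascii_alphabetic()
        && bytes[2] == b':';
    let path = if has_drive { &rest[1..] } else { rest };
    Some(path.replace('\\', "/"))
}

pub fn validate_local_dashboard_presentation(
    action: &Action,
    expected_domain: &str,
    presentation_url: &str,
    output_path: &str,
) -> Result<(), LocalDashboardError> {
    if *action != Action::Navigate {
        return Err(LocalDashboardError::WrongAction);
    }
    if expected_domain != LOCAL_DASHBOARD_MARKER {
        return Err(LocalDashboardError::WrongDomainMarker);
    }
    let url_path = file_url_to_path(presentation_url).ok_or(LocalDashboardError::NotFileUrl)?;
    if !url_path.to_ascii_lowercase().ends_with(".html") {
        return Err(LocalDashboardError::NotHtml);
    }
    // Compare with separators normalized on both sides.
    if url_path != output_path.replace('\\', "/") {
        return Err(LocalDashboardError::PathMismatch);
    }
    Ok(())
}
```

Returning a distinct error per failed check keeps the rejection tests specific; the real helper may well collapse these into fewer `SecurityError` variants.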
|
||||
|
||||
- [ ] **Step 2: Recognize only the exact approved request shape in `BrowserCallbackBackend::invoke(...)`**
|
||||
|
||||
Before the normal `self.mac_policy.validate(&action, expected_domain)?` path runs, detect the one approved special case.
|
||||
|
||||
Minimal target behavior:
|
||||
|
||||
```rust
if let Some(local_dashboard) = approved_local_dashboard_request(&action, &params, expected_domain) {
    self.mac_policy.validate_local_dashboard_presentation(
        &action,
        expected_domain,
        &local_dashboard.presentation_url,
        &local_dashboard.output_path,
    )?;
} else {
    self.mac_policy.validate(&action, expected_domain)?;
}
```
|
||||
|
||||
The helper should return a match only when all of these conditions hold exactly:
|
||||
|
||||
```text
- action == Action::Navigate
- expected_domain == "__sgclaw_local_dashboard__"
- params.url exists
- params.sgclaw_local_dashboard_open.source == "compat.workflow_executor"
- params.sgclaw_local_dashboard_open.kind == "zhihu_hotlist_screen"
- params.sgclaw_local_dashboard_open.output_path exists
- params.sgclaw_local_dashboard_open.presentation_url exists and equals params.url
```
|
||||
|
||||
Anything else must continue down the normal rejection path.
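As a rough illustration of the request-shape check, the sketch below models the navigate params as typed structs instead of the JSON value the real backend receives. `NavigateParams`, `DashboardMarker`, `LocalDashboardRequest`, and the stand-in `Action` enum are hypothetical names introduced only for this example; the real `approved_local_dashboard_request` signature may differ.

```rust
/// Hypothetical stand-in for the project's `Action` enum.
#[derive(Debug, PartialEq)]
pub enum Action {
    Navigate,
    Click,
}

/// Hypothetical typed view of `params.sgclaw_local_dashboard_open`.
#[derive(Debug, Default, Clone)]
pub struct DashboardMarker {
    pub source: Option<String>,
    pub kind: Option<String>,
    pub output_path: Option<String>,
    pub presentation_url: Option<String>,
}

/// Hypothetical typed view of the navigate params.
#[derive(Debug, Default, Clone)]
pub struct NavigateParams {
    pub url: Option<String>,
    pub sgclaw_local_dashboard_open: Option<DashboardMarker>,
}

#[derive(Debug, PartialEq)]
pub struct LocalDashboardRequest {
    pub presentation_url: String,
    pub output_path: String,
}

/// Returns `Some` only when every listed condition holds; any deviation
/// yields `None` so the caller falls through to the normal validation path.
pub fn approved_local_dashboard_request(
    action: &Action,
    params: &NavigateParams,
    expected_domain: &str,
) -> Option<LocalDashboardRequest> {
    if *action != Action::Navigate || expected_domain != "__sgclaw_local_dashboard__" {
        return None;
    }
    let url = params.url.as_deref()?;
    let marker = params.sgclaw_local_dashboard_open.as_ref()?;
    if marker.source.as_deref() != Some("compat.workflow_executor") {
        return None;
    }
    if marker.kind.as_deref() != Some("zhihu_hotlist_screen") {
        return None;
    }
    let output_path = marker.output_path.as_deref()?;
    let presentation_url = marker.presentation_url.as_deref()?;
    if presentation_url != url {
        return None;
    }
    Some(LocalDashboardRequest {
        presentation_url: presentation_url.to_string(),
        output_path: output_path.to_string(),
    })
}
```

Returning `Option` rather than `Result` matches the `if let Some(...) else` call site in this step: a malformed marker is not an error at this point, it simply does not qualify for the special case.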
|
||||
|
||||
- [ ] **Step 3: Keep `build_command(Action::Navigate, ...)` simple**
|
||||
|
||||
Do not add a second browser opcode or change the callback-host runtime contract. The approved local-dashboard case should still flow into the existing navigate command builder so the emitted command stays:
|
||||
|
||||
```rust
json!([
    self.helper_page_url,
    "sgBrowerserOpenPage",
    target_url,
])
```
|
||||
|
||||
- [ ] **Step 4: Run the focused backend/security tests to verify green**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_accepts_approved_local_dashboard_navigate_request --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" mac_policy_rejects_non_html_local_dashboard_presentation --test browser_tool_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" default_rules_allow_zhihu_navigation --test browser_tool_test -- --exact
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 5: Re-run the dashboard workflow regression after backend validation lands**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 6: Commit the backend/security implementation**
|
||||
|
||||
```bash
git add src/browser/callback_backend.rs src/security/mac_policy.rs src/security/mod.rs tests/browser_tool_test.rs tests/compat_runtime_test.rs
git commit -m "fix: allow approved local dashboard auto-open"
```
|
||||
|
||||
If `src/security/mod.rs` did not change, omit it from the commit.
|
||||
|
||||
### Task 5: Run the focused verification sweep
|
||||
|
||||
**Files:**
|
||||
- Verify: `src/compat/artifact_open.rs`
|
||||
- Verify: `tests/compat_runtime_test.rs`
|
||||
- Verify: `tests/compat_screen_html_export_tool_test.rs`
|
||||
- Verify: `tests/browser_tool_test.rs`
|
||||
- Verify: `src/browser/callback_backend.rs` test module
|
||||
- Reference only if summary wording ripples outward: `tests/agent_runtime_test.rs:173-258`
|
||||
- Reference only if summary wording ripples outward: `tests/service_task_flow_test.rs:704-839`
|
||||
- Reference only if summary wording ripples outward: `tests/service_ws_session_test.rs:755-869`
|
||||
|
||||
- [ ] **Step 1: Re-run the library and workflow regressions**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_passes_generated_path_to_launcher --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_reports_launcher_failure --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_local_dashboard_uses_exact_approved_marker_payload --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing --test compat_runtime_test -- --exact
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 2: Re-run the tool contract regression that the dashboard route depends on**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" screen_html_export_tool_renders_dashboard_html_with_presentation_contract --test compat_screen_html_export_tool_test -- --exact
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 3: Re-run the callback-backend and policy boundary tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_accepts_approved_local_dashboard_navigate_request --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" mac_policy_rejects_non_html_local_dashboard_presentation --test browser_tool_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" default_rules_allow_zhihu_navigation --test browser_tool_test -- --exact
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 4: Re-run outward-facing summary regressions only if needed**
|
||||
|
||||
Only if the updated summary text breaks existing assertions, run exactly these existing regressions and adjust only the affected expectation text:
|
||||
|
||||
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" client_to_service_regression_routes_zhihu_without_helper_bootstrap_or_invalid_hmac_seed_output --test service_task_flow_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_binary_submit_flow_routes_zhihu_without_helper_bootstrap --test service_ws_session_test -- --exact
```
|
||||
|
||||
Expected: PASS for any test you had to touch. Skip this step entirely if those files needed no edits.
|
||||
|
||||
- [ ] **Step 5: Inspect scope with exact git commands before finishing**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
git diff --name-only -- src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs src/browser/callback_backend.rs src/security/mac_policy.rs src/security/mod.rs tests/compat_runtime_test.rs tests/browser_tool_test.rs tests/agent_runtime_test.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs
git diff --stat -- src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs src/browser/callback_backend.rs src/security/mac_policy.rs src/security/mod.rs tests/compat_runtime_test.rs tests/browser_tool_test.rs tests/agent_runtime_test.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs
```
|
||||
|
||||
Confirm the diff only touches:
|
||||
|
||||
```text
- compat workflow/orchestration
- compat post-export helper module
- callback backend narrow local-dashboard acceptance
- MacPolicy narrow local-dashboard validation
- focused related tests
```
|
||||
|
||||
Confirm it does **not** touch:
|
||||
|
||||
```text
- frontend/service-console/
- src/service/protocol.rs
- browser-helper.html
- callback-host endpoint contracts
- websocket transport/protocol files
```
|
||||
|
||||
- [ ] **Step 6: Commit only if verification required additional code changes**
|
||||
|
||||
```bash
git add src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs src/browser/callback_backend.rs src/security/mac_policy.rs tests/compat_runtime_test.rs tests/browser_tool_test.rs tests/agent_runtime_test.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs
git commit -m "test: tighten hotlist post-export auto-open verification"
```
|
||||
|
||||
If verification required no further code changes, do not create an extra commit.
|
||||
@@ -0,0 +1,281 @@
|
||||
# Config-Owned Direct Skill Contract Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Validate the `directSubmitSkill` control surface early and prevent malformed direct-skill configs from entering the submit routing path, without changing the current happy-path direct execution behavior.
|
||||
|
||||
**Architecture:** Keep the existing direct-submit runtime and submit-task seam intact for valid configs. Move `directSubmitSkill` format validation into the normal `SgClawSettings` load path so malformed config fails before routing begins, while leaving valid-but-unresolvable `skill.tool` targets as direct runtime errors in the current direct path.
|
||||
|
||||
**Tech Stack:** Rust 2021, `serde` config parsing, current `BrowserMessage::SubmitTask` path, current direct skill runtime, Rust integration tests.
|
||||
|
||||
---
|
||||
|
||||
## Execution Context
|
||||
|
||||
- Follow @superpowers:test-driven-development for the Rust code changes in this plan.
|
||||
- Follow @superpowers:verification-before-completion before claiming any task is done.
|
||||
- Do **not** create a git worktree unless the user explicitly asks. This project prefers staying in the current checkout.
|
||||
- Keep scope tight: this plan does **not** add per-skill dispatch metadata, docs changes, intent classification, or LLM routing changes.
|
||||
|
||||
## File Map
|
||||
|
||||
### Existing files to modify
|
||||
|
||||
- Modify: `src/config/settings.rs`
|
||||
- validate `directSubmitSkill` during config normalization
|
||||
- keep the stored field as `Option<String>` so the current direct runtime API stays stable
|
||||
- Modify: `tests/compat_config_test.rs`
|
||||
- add a failing config-load regression for malformed `directSubmitSkill`
|
||||
- Modify: `tests/agent_runtime_test.rs`
|
||||
- add a failing submit-path regression proving malformed config is rejected before direct routing begins
|
||||
|
||||
### Existing files to read but not broaden
|
||||
|
||||
- Reuse without redesign: `src/agent/mod.rs`
|
||||
- Reuse without redesign: `src/compat/direct_skill_runtime.rs`
|
||||
- Reuse without redesign: `docs/superpowers/specs/2026-04-09-config-owned-direct-skill-dispatch-design.md`
|
||||
|
||||
### No new files expected
|
||||
|
||||
This slice should fit within the existing config and test surfaces only.
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Validate `directSubmitSkill` Before Submit Routing
|
||||
|
||||
**Files:**
|
||||
- Modify: `tests/compat_config_test.rs`
|
||||
- Modify: `tests/agent_runtime_test.rs`
|
||||
- Modify: `src/config/settings.rs`
|
||||
- Read only: `src/agent/mod.rs`
|
||||
- Read only: `src/compat/direct_skill_runtime.rs`
|
||||
|
||||
- [ ] **Step 1: Write the failing config test for malformed `directSubmitSkill`**
|
||||
|
||||
Add this focused test to `tests/compat_config_test.rs`:
|
||||
|
||||
```rust
#[test]
fn sgclaw_settings_reject_invalid_direct_submit_skill_format() {
    let root = std::env::temp_dir().join(format!(
        "sgclaw-invalid-direct-submit-skill-{}",
        Uuid::new_v4()
    ));
    fs::create_dir_all(&root).unwrap();
    let config_path = root.join("sgclaw_config.json");

    fs::write(
        &config_path,
        r#"{
            "providers": [],
            "skillsDir": "skill_lib",
            "directSubmitSkill": "fault-details-report"
        }"#,
    )
    .unwrap();

    let err = SgClawSettings::load(Some(config_path.as_path()))
        .expect_err("expected invalid directSubmitSkill format");
    let message = err.to_string();

    assert!(message.contains("directSubmitSkill"));
    assert!(message.contains("skill.tool"));
}
```
|
||||
|
||||
- [ ] **Step 2: Run the focused config test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_config_test sgclaw_settings_reject_invalid_direct_submit_skill_format -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the current config loader accepts the malformed string instead of rejecting it early.
|
||||
|
||||
- [ ] **Step 3: Write the failing agent regression for malformed config**
|
||||
|
||||
Add this focused test to `tests/agent_runtime_test.rs`:
|
||||
|
||||
```rust
#[test]
fn submit_task_rejects_invalid_direct_submit_skill_config_before_routing() {
    std::env::remove_var("DEEPSEEK_API_KEY");
    std::env::remove_var("DEEPSEEK_BASE_URL");
    std::env::remove_var("DEEPSEEK_MODEL");

    let skill_root = build_direct_runtime_skill_root();
    let workspace_root = std::env::temp_dir().join(format!(
        "sgclaw-invalid-direct-submit-workspace-{}",
        Uuid::new_v4()
    ));
    fs::create_dir_all(&workspace_root).unwrap();
    let config_path = workspace_root.join("sgclaw_config.json");
    fs::write(
        &config_path,
        serde_json::json!({
            "providers": [],
            "skillsDir": skill_root,
            "directSubmitSkill": "fault-details-report"
        })
        .to_string(),
    )
    .unwrap();

    let runtime_context = AgentRuntimeContext::new(Some(config_path), workspace_root);
    let transport = Arc::new(MockTransport::new(vec![]));
    let browser_tool = BrowserPipeTool::new(
        transport.clone(),
        direct_runtime_test_policy(),
        vec![1, 2, 3, 4, 5, 6, 7, 8],
    )
    .with_response_timeout(Duration::from_secs(1));

    handle_browser_message_with_context(
        transport.as_ref(),
        &browser_tool,
        &runtime_context,
        submit_fault_details_message(),
    )
    .unwrap();

    let sent = transport.sent_messages();
    assert!(matches!(
        sent.last(),
        Some(AgentMessage::TaskComplete { success, summary })
            if !success && summary.contains("skill.tool")
    ));
    assert!(direct_submit_mode_logs(&sent).is_empty());
    assert!(!sent.iter().any(|message| matches!(message, AgentMessage::Command { .. })));
}
```
|
||||
|
||||
- [ ] **Step 4: Run the focused agent test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test agent_runtime_test submit_task_rejects_invalid_direct_submit_skill_config_before_routing -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the malformed config currently loads, enters the direct-submit branch, and emits `direct_skill_primary` before failing later.
|
||||
|
||||
- [ ] **Step 5: Implement the minimal config validation**
|
||||
|
||||
In `src/config/settings.rs`, add a small helper that validates the normalized `directSubmitSkill` string during `SgClawSettings::new(...)`.
|
||||
|
||||
Recommended implementation shape:
|
||||
|
||||
```rust
fn normalize_direct_submit_skill(raw: Option<String>) -> Result<Option<String>, ConfigError> {
    let value = normalize_optional_value(raw);
    let Some(value) = value.as_deref() else {
        return Ok(None);
    };

    let Some((skill_name, tool_name)) = value.split_once('.') else {
        return Err(ConfigError::InvalidValue(
            "directSubmitSkill",
            format!("must use skill.tool format, got {value}"),
        ));
    };

    if skill_name.trim().is_empty() || tool_name.trim().is_empty() {
        return Err(ConfigError::InvalidValue(
            "directSubmitSkill",
            format!("must use skill.tool format, got {value}"),
        ));
    }

    Ok(Some(value.to_string()))
}
```
|
||||
|
||||
Then use it here:
|
||||
|
||||
```rust
|
||||
let direct_submit_skill = normalize_direct_submit_skill(direct_submit_skill)?;
|
||||
```
|
||||
|
||||
Rules:
|
||||
- do not change the public field type from `Option<String>`
|
||||
- do not move parsing responsibility into `src/agent/mod.rs`
|
||||
- do not redesign `src/compat/direct_skill_runtime.rs`
|
||||
- keep valid-but-unresolvable `skill.tool` targets as runtime errors in the direct path
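To make the accepted and rejected inputs concrete, here is a compact standalone variant of the same validation. It is a sketch under assumptions: `ConfigError` is reduced to a single hypothetical `InvalidValue(&'static str, String)` variant, and `normalize_optional_value` is assumed to trim the string and map empty results to `None`, which the real helper in `src/config/settings.rs` may or may not do in exactly this way.

```rust
/// Hypothetical stand-in for the project's `ConfigError`.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    InvalidValue(&'static str, String),
}

/// Assumed behavior of the existing helper: trim, then treat empty as absent.
fn normalize_optional_value(raw: Option<String>) -> Option<String> {
    raw.map(|v| v.trim().to_string()).filter(|v| !v.is_empty())
}

pub fn normalize_direct_submit_skill(raw: Option<String>) -> Result<Option<String>, ConfigError> {
    let Some(value) = normalize_optional_value(raw) else {
        return Ok(None);
    };

    // Well-formed means: a dot is present and both sides are non-blank.
    let well_formed = matches!(
        value.split_once('.'),
        Some((skill, tool)) if !skill.trim().is_empty() && !tool.trim().is_empty()
    );
    if !well_formed {
        return Err(ConfigError::InvalidValue(
            "directSubmitSkill",
            format!("must use skill.tool format, got {value}"),
        ));
    }

    Ok(Some(value))
}
```

Under these assumptions, `fault-details-report.collect_fault_details` is accepted, while `fault-details-report`, `.tool`, and `skill.` are rejected. Because `split_once` splits at the first dot only, a value such as `a.b.c` passes format validation with `b.c` as the tool name and is left for the runtime to resolve.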
|
||||
|
||||
- [ ] **Step 6: Re-run the two focused tests and verify they pass**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test compat_config_test sgclaw_settings_reject_invalid_direct_submit_skill_format -- --nocapture
cargo test --test agent_runtime_test submit_task_rejects_invalid_direct_submit_skill_config_before_routing -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 7: Re-run the broader regression suites**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test compat_config_test -- --nocapture
cargo test --test agent_runtime_test -- --nocapture
cargo test --test browser_script_skill_tool_test -- --nocapture
cargo build --bin sgclaw
```
|
||||
|
||||
Expected: PASS, including:
|
||||
- the direct-submit happy path
|
||||
- the existing no-LLM fallback behavior when `directSubmitSkill` is absent
|
||||
- unchanged browser-script helper semantics
|
||||
- clean binary build
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
### Config validation
|
||||
|
||||
```bash
|
||||
cargo test --test compat_config_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: malformed `directSubmitSkill` is rejected early, while the existing direct-only config shape still loads.
|
||||
|
||||
### Submit-path behavior
|
||||
|
||||
```bash
|
||||
cargo test --test agent_runtime_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected:
|
||||
- malformed `directSubmitSkill` never reaches direct routing
|
||||
- valid configured direct skill still succeeds without LLM config
|
||||
- no direct skill configured still returns the existing no-LLM message
|
||||
|
||||
### Browser-script helper safety
|
||||
|
||||
```bash
|
||||
cargo test --test browser_script_skill_tool_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: current browser-script execution semantics remain unchanged.
|
||||
|
||||
### Build
|
||||
|
||||
```bash
|
||||
cargo build --bin sgclaw
|
||||
```
|
||||
|
||||
Expected: the main binary compiles cleanly.
|
||||
|
||||
---
|
||||
|
||||
## Notes For The Engineer
|
||||
|
||||
- The paired spec is `docs/superpowers/specs/2026-04-09-config-owned-direct-skill-dispatch-design.md`.
|
||||
- Do **not** add sgClaw-specific dispatch metadata under `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging` in this slice.
|
||||
- Do **not** turn this into a per-skill registry task yet. This plan only hardens the current config-owned bootstrap contract.
|
||||
- Keep the current direct target example as `fault-details-report.collect_fault_details`; avoid hard-coding that name into new generic APIs.
|
||||
- If you discover a need for broader policy routing (`direct_browser` / `llm_agent` by skill), stop and write a new spec/plan instead of expanding this one.
|
||||
@@ -0,0 +1,520 @@
|
||||
# Direct Skill Invocation Without LLM Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Let the current pipe submit-task flow accept natural-language input but directly invoke one fixed staged browser skill without calling any model, while reserving a clean switch back to LLM-based routing later.
|
||||
|
||||
**Architecture:** Keep the existing `BrowserMessage::SubmitTask` entrypoint and add one narrow pre-routing seam before the current compat/LLM chain. When a new config field points to a fixed direct-submit skill, sgClaw loads that skill package from the configured external skills root, finds the target `browser_script` tool, executes it through the existing browser-script wrapper, and returns the result directly. When the field is absent, the current behavior stays unchanged. This preserves a future path where each skill can later declare `direct_browser` or `llm_agent` dispatch without rewriting the submit pipeline again.
|
||||
|
||||
**Tech Stack:** Rust 2021, existing `BrowserPipeTool`, current submit-task agent entrypoint, current browser-script skill executor, current sgClaw JSON config loader, `zeroclaw` skill manifest loader.
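The pre-routing seam described in the Architecture paragraph can be pictured as a single decision made before the existing chain runs. The sketch below is illustrative only: `SubmitRoute` and `choose_submit_route` are hypothetical names, and the real branch in `src/agent/mod.rs` will also need the browser tool, runtime context, and skill loading that are omitted here.

```rust
/// Hypothetical routing outcome for one submit-task message.
#[derive(Debug, PartialEq)]
pub enum SubmitRoute<'a> {
    /// A direct-submit target is configured; bypass the model entirely.
    DirectSkill(&'a str),
    /// No target configured; keep the current compat/LLM behavior.
    CompatLlm,
}

/// Decide the route from the optional `directSubmitSkill` config value.
/// Blank values are treated the same as an absent field.
pub fn choose_submit_route(direct_submit_skill: Option<&str>) -> SubmitRoute<'_> {
    match direct_submit_skill.map(str::trim).filter(|v| !v.is_empty()) {
        Some(target) => SubmitRoute::DirectSkill(target),
        None => SubmitRoute::CompatLlm,
    }
}
```

Keeping the decision in one small function is one way to leave room for the later per-skill `direct_browser` / `llm_agent` dispatch without touching the submit pipeline again.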
|
||||
|
||||
---
|
||||
|
||||
## Recommended First Skill
|
||||
|
||||
Use `fault-details-report.collect_fault_details` from:
|
||||
- `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/fault-details-report/scene.json`
|
||||
- `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.toml`
|
||||
- `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
|
||||
|
||||
Why this one first:
|
||||
- it is clearly a report/export skill
|
||||
- it exposes exactly one browser-script tool: `collect_fault_details`
|
||||
- it has the smallest contract surface (`period` only)
|
||||
- its current JS is deterministic and simple, so the first slice can focus on plumbing instead of browser scraping complexity
|
||||
|
||||
## Scope Guardrails
|
||||
|
||||
- Do **not** redesign the existing submit-task protocol.
|
||||
- Do **not** remove or rewrite the current LLM/compat path; leave it as the fallback/default path.
|
||||
- Do **not** introduce generic NL intent routing in this slice; this is one fixed direct skill only.
|
||||
- Do **not** modify `third_party/zeroclaw` skill manifest schema in phase 1.
|
||||
- Do **not** add Excel export wiring in the first slice unless a test explicitly requires it.
|
||||
- Do **not** invent a new browser-script execution model; reuse the existing wrapper semantics.
|
||||
|
||||
---
|
||||
|
||||
## File Map
|
||||
|
||||
### Existing files to modify
|
||||
|
||||
- Modify: `src/config/settings.rs`
|
||||
- add a minimal config field for one direct-submit skill name
|
||||
- Modify: `src/agent/mod.rs`
|
||||
- add a narrow pre-routing branch before the current compat/LLM path
|
||||
- Modify: `src/compat/browser_script_skill_tool.rs`
|
||||
- expose the smallest reusable helper for direct browser-script execution
|
||||
- Modify: `src/compat/mod.rs` or the nearest module export surface
|
||||
- export the new narrow direct-skill runtime module if needed
|
||||
- Modify: `tests/compat_config_test.rs`
|
||||
- add config coverage for the new direct-submit field
|
||||
- Modify: `tests/browser_script_skill_tool_test.rs`
|
||||
- add coverage for the reusable direct-execution helper
|
||||
- Modify: `tests/agent_runtime_test.rs`
|
||||
- prove submit-task can bypass the model and directly invoke the fixed skill
|
||||
|
||||
### New files to create
|
||||
|
||||
- Create: `src/compat/direct_skill_runtime.rs`
|
||||
- small runtime for loading one configured skill, resolving one configured tool, deriving minimal args, and executing it directly
|
||||
|
||||
### Files to reuse without changing behavior
|
||||
|
||||
- Reuse: `src/compat/runtime.rs`
|
||||
- Reuse: `src/compat/orchestration.rs`
|
||||
- Reuse: `src/compat/config_adapter.rs`
|
||||
- Reuse: `third_party/zeroclaw/src/skills/mod.rs`
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Add A Minimal Direct-Submit Skill Config Field
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/config/settings.rs`
|
||||
- Modify: `tests/compat_config_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the failing config test for the new field**
|
||||
|
||||
In `tests/compat_config_test.rs`, add a focused config-load test proving the browser config file can declare one fixed direct-submit skill.
|
||||
|
||||
Test shape:
|
||||
|
||||
```rust
#[test]
fn sgclaw_settings_load_direct_submit_skill_from_browser_config() {
    let root = std::env::temp_dir().join(format!("sgclaw-direct-skill-{}", uuid::Uuid::new_v4()));
    std::fs::create_dir_all(&root).unwrap();
    let config_path = root.join("sgclaw_config.json");

    std::fs::write(
        &config_path,
        r#"{
            "apiKey": "sk-runtime",
            "baseUrl": "https://api.deepseek.com",
            "model": "deepseek-chat",
            "skillsDir": "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging",
            "directSubmitSkill": "fault-details-report.collect_fault_details"
        }"#,
    )
    .unwrap();

    let settings = sgclaw::config::SgClawSettings::load(Some(config_path.as_path()))
        .unwrap()
        .expect("expected sgclaw settings from config file");

    assert_eq!(
        settings.direct_submit_skill.as_deref(),
        Some("fault-details-report.collect_fault_details")
    );
}
```
|
||||
|
||||
- [ ] **Step 2: Run the focused config test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_config_test sgclaw_settings_load_direct_submit_skill_from_browser_config -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the config field does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Implement the minimal config field**
|
||||
|
||||
In `src/config/settings.rs`, add:
|
||||
- `direct_submit_skill: Option<String>` to `SgClawSettings`
|
||||
- `direct_submit_skill: Option<String>` to `RawSgClawSettings`
|
||||
- field normalization in `SgClawSettings::new(...)`
|
||||
|
||||
Recommended JSON key shape:
|
||||
|
||||
```rust
|
||||
#[serde(rename = "directSubmitSkill", alias = "direct_submit_skill", default)]
|
||||
direct_submit_skill: Option<String>,
|
||||
```
|
||||
|
||||
Rules:
|
||||
- trim surrounding whitespace and map empty or whitespace-only values to `None`
|
||||
- keep `DeepSeekSettings` unchanged for this slice unless a compile error proves it must mirror the field
|
||||
- do not alter unrelated config semantics
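
The trimming rule above can be sketched as a small pure function. This is a minimal illustration, not the project's actual normalization code: the name `normalize_optional_string` is hypothetical, and where it is called inside `SgClawSettings::new(...)` is left to the implementer.

```rust
/// Hypothetical helper (not part of the existing codebase): trims a raw
/// config value and maps blank or whitespace-only input to `None`.
fn normalize_optional_string(raw: Option<String>) -> Option<String> {
    raw.map(|value| value.trim().to_string())
        .filter(|value| !value.is_empty())
}
```

Keeping the rule in one function means the config test only has to cover three cases: a padded value, a blank value, and a missing value.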
|
||||
|
||||
- [ ] **Step 4: Re-run the focused config test**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_config_test sgclaw_settings_load_direct_submit_skill_from_browser_config -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Re-run the broader config file tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_config_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Commit Task 1**
|
||||
|
||||
```bash
|
||||
git add src/config/settings.rs tests/compat_config_test.rs
|
||||
git commit -m "feat: add direct submit skill config"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Extract A Reusable Browser-Script Direct Execution Helper
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/compat/browser_script_skill_tool.rs`
|
||||
- Modify: `tests/browser_script_skill_tool_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the first failing helper test**
|
||||
|
||||
In `tests/browser_script_skill_tool_test.rs`, add a focused test proving that a direct caller can execute a packaged browser script without first constructing a full `Tool` object.
|
||||
|
||||
Test shape:
|
||||
|
||||
```rust
#[tokio::test]
async fn execute_browser_script_tool_runs_packaged_script_with_expected_domain() {
    // build temp skill script
    // call the helper directly
    // assert Action::Eval was sent with wrapped args and normalized domain
}
```
|
||||
|
||||
Required assertions:
|
||||
- the helper reads the packaged JS file
|
||||
- it wraps args with `const args = ...`
|
||||
- it normalizes URL-like `expected_domain`
|
||||
- it returns the serialized payload string on success
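
The wrapping and normalization assertions above can be illustrated with two pure functions. This is a sketch under stated assumptions: the plan does not spell out what "normalized" means, so the version below assumes it means reducing a URL-like value to a bare lowercase host, and both function names are hypothetical. The real logic already lives in `BrowserScriptSkillTool::execute` and should be reused rather than rewritten.

```rust
/// Hypothetical: reduce a URL-like `expected_domain` to a bare lowercase host.
/// Assumes no userinfo (`user@host`) and no bracketed IPv6 literals.
fn normalize_expected_domain(raw: &str) -> Option<String> {
    let trimmed = raw.trim();
    // Drop an optional scheme such as `https://`.
    let without_scheme = match trimmed.split_once("://") {
        Some((_, rest)) => rest,
        None => trimmed,
    };
    // Keep only the authority part (before any path, query, or fragment).
    let authority = without_scheme
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");
    // Drop an optional port and lowercase the host.
    let host = authority.split(':').next().unwrap_or("").to_ascii_lowercase();
    if host.is_empty() {
        None
    } else {
        Some(host)
    }
}

/// Hypothetical: prefix the packaged script with a `const args = ...` binding.
/// `args_json` is assumed to be an already serialized JSON value.
fn wrap_script_with_args(args_json: &str, script_body: &str) -> String {
    format!("const args = {};\n{}", args_json, script_body)
}
```

Returning `None` for an empty host gives the validation test from Step 3 a natural failure point before any browser command is sent.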
|
||||
|
||||
- [ ] **Step 2: Run the helper test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_script_skill_tool_test execute_browser_script_tool_runs_packaged_script_with_expected_domain -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the helper does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Add the second failing helper test for required-domain validation**
|
||||
|
||||
Add a focused failure-path test proving the helper rejects missing or invalid `expected_domain` before any browser command is sent.
|
||||
|
||||
- [ ] **Step 4: Run the validation test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_script_skill_tool_test execute_browser_script_tool_rejects_missing_expected_domain -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the helper does not exist yet.
|
||||
|
||||
- [ ] **Step 5: Implement the minimal reusable helper**
|
||||
|
||||
In `src/compat/browser_script_skill_tool.rs`, extract the smallest reusable function, for example:
|
||||
|
||||
```rust
pub async fn execute_browser_script_tool<T: Transport + 'static>(
    tool: &SkillTool,
    skill_root: &Path,
    browser_tool: BrowserPipeTool<T>,
    args: Value,
) -> anyhow::Result<ToolResult>
```
|
||||
|
||||
Rules:
|
||||
- reuse the current path validation, script loading, wrapping, `Action::Eval`, and payload formatting logic already used by `BrowserScriptSkillTool::execute`
|
||||
- do not change outward behavior of `BrowserScriptSkillTool`
|
||||
- keep the helper narrow and browser-script-only
|
||||
|
||||
- [ ] **Step 6: Refactor `BrowserScriptSkillTool::execute` to call the helper**
|
||||
|
||||
Keep existing behavior and tests green while removing duplicate execution logic.
|
||||
|
||||
- [ ] **Step 7: Re-run the browser-script tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_script_skill_tool_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 8: Commit Task 2**
|
||||
|
||||
```bash
|
||||
git add src/compat/browser_script_skill_tool.rs tests/browser_script_skill_tool_test.rs
|
||||
git commit -m "refactor: extract direct browser script execution helper"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Add A Narrow Direct Skill Runtime For One Fixed Skill
|
||||
|
||||
**Files:**
|
||||
- Create: `src/compat/direct_skill_runtime.rs`
|
||||
- Modify: `src/compat/mod.rs` or nearest module export point
|
||||
- Reuse: `src/compat/config_adapter.rs`
|
||||
- Reuse: `third_party/zeroclaw/src/skills/mod.rs`
|
||||
|
||||
- [ ] **Step 1: Write the first failing direct-runtime test**
|
||||
|
||||
Add a focused test in `tests/agent_runtime_test.rs` or a new narrow compat test proving code can resolve the configured external skills root, load `fault-details-report`, find `collect_fault_details`, and execute it directly.
|
||||
|
||||
Recommended shape:
|
||||
|
||||
```rust
#[test]
fn direct_skill_runtime_executes_fault_details_report_without_provider() {
    // config points at skill_staging root
    // direct_submit_skill points at fault-details-report.collect_fault_details
    // browser response returns report-artifact payload
    // assert no provider/http path is touched
}
```
|
||||
|
||||
- [ ] **Step 2: Run the focused direct-runtime test and verify it fails**
|
||||
|
||||
Run the narrowest test command for the new test.
|
||||
|
||||
Expected: FAIL because the direct runtime does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Implement `src/compat/direct_skill_runtime.rs`**
|
||||
|
||||
Add a narrow runtime with responsibilities only to:
|
||||
- resolve the configured skills dir with `resolve_skills_dir_from_sgclaw_settings(...)`
|
||||
- load skills from that directory with `load_skills_from_directory(...)`
|
||||
- parse the configured tool name into `skill_name` + `tool_name`
|
||||
- find the matching skill and matching tool
|
||||
- verify `tool.kind == "browser_script"`
|
||||
- derive the minimal argument object
|
||||
- call the new browser-script helper
|
||||
- return the output string or a clear `PipeError`
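
The name-parsing responsibility in the list above could look roughly like the following. This is a sketch, not the real runtime: the function name is hypothetical, the error is a plain `String` rather than the project's `PipeError`, and it assumes the skill name never contains a dot, so the value is split at the first `.`.

```rust
/// Hypothetical: split a configured `skill.tool` value into its two parts.
/// Assumes skill names contain no `.`; the split happens at the first dot.
fn parse_direct_submit_skill(configured: &str) -> Result<(&str, &str), String> {
    let invalid = || {
        format!(
            "directSubmitSkill must look like `skill_name.tool_name`, got `{}`",
            configured
        )
    };
    let (skill_name, tool_name) = configured.trim().split_once('.').ok_or_else(invalid)?;
    if skill_name.is_empty() || tool_name.is_empty() {
        return Err(invalid());
    }
    Ok((skill_name, tool_name))
}
```

For the bootstrap value `fault-details-report.collect_fault_details`, this yields `fault-details-report` as the skill name and `collect_fault_details` as the tool name.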
|
||||
|
||||
Do **not** add generic routing, scenes, or model fallback here.
|
||||
|
||||
- [ ] **Step 4: Keep argument derivation intentionally minimal**
|
||||
|
||||
For the first slice, derive only:
|
||||
- `expected_domain` from `page_url` when present, otherwise fail with a clear message
|
||||
- `period` from the instruction using a narrow deterministic pattern such as `YYYY-MM`
|
||||
|
||||
If the period cannot be derived, return a concise error telling the user to provide it explicitly. Do not guess.
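
A narrow, deterministic `YYYY-MM` extraction can be written without a regex dependency. The sketch below is one possible shape, under the assumptions that the first well-formed match wins, that months outside `01` to `12` are rejected, and that a match embedded in a longer digit run is ignored; the function name is hypothetical.

```rust
/// Hypothetical: find the first `YYYY-MM` token in a free-form instruction.
/// Returns `None` instead of guessing when no valid period is present.
fn extract_period(instruction: &str) -> Option<String> {
    let bytes = instruction.as_bytes();
    if bytes.len() < 7 {
        return None;
    }
    for start in 0..=bytes.len() - 7 {
        let window = &bytes[start..start + 7];
        let shape_ok = window[..4].iter().all(u8::is_ascii_digit)
            && window[4] == b'-'
            && window[5..].iter().all(u8::is_ascii_digit);
        if !shape_ok {
            continue;
        }
        // Skip matches glued to neighbouring digits, e.g. `12026-03` or `2026-031`.
        let digit_before = start > 0 && bytes[start - 1].is_ascii_digit();
        let digit_after = bytes.get(start + 7).map_or(false, |b| b.is_ascii_digit());
        if digit_before || digit_after {
            continue;
        }
        let month = (window[5] - b'0') * 10 + (window[6] - b'0');
        if (1..=12).contains(&month) {
            // The window is pure ASCII, so this conversion is lossless.
            return Some(String::from_utf8_lossy(window).into_owned());
        }
    }
    None
}
```

Scanning bytes is safe for non-ASCII instructions because UTF-8 continuation bytes are never ASCII digits or `-`, so multi-byte characters cannot produce a false match.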
|
||||
|
||||
- [ ] **Step 5: Re-run the focused direct-runtime test**
|
||||
|
||||
Run the same test command again.
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Commit Task 3**
|
||||
|
||||
```bash
|
||||
git add src/compat/direct_skill_runtime.rs src/compat/mod.rs tests/agent_runtime_test.rs
|
||||
git commit -m "feat: add fixed direct skill runtime"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 4: Insert The Pre-Routing Seam In Submit-Task Entry
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/agent/mod.rs`
|
||||
- Modify: `tests/agent_runtime_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the first failing submit-path bypass test**
|
||||
|
||||
In `tests/agent_runtime_test.rs`, add a focused regression proving that when `directSubmitSkill` is configured, `BrowserMessage::SubmitTask` can succeed without any model/provider being configured.
|
||||
|
||||
Test shape:
|
||||
|
||||
```rust
#[test]
fn submit_task_uses_direct_skill_mode_without_llm_configuration() {
    // config contains skillsDir + directSubmitSkill, but no reachable provider
    // natural-language instruction includes period and page_url
    // expect TaskComplete success from direct skill result
}
```
|
||||
|
||||
Required assertions:
|
||||
- task succeeds even if provider would be unavailable
|
||||
- output contains the report artifact payload
|
||||
- no summary like `未配置大语言模型` ("no large language model configured")
|
||||
|
||||
- [ ] **Step 2: Run the bypass test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test agent_runtime_test submit_task_uses_direct_skill_mode_without_llm_configuration -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because submit-task still goes into the current LLM-oriented path.
|
||||
|
||||
- [ ] **Step 3: Add the second failing priority test**
|
||||
|
||||
Add one focused test proving the direct-submit branch runs before the existing compat/LLM branch.
|
||||
|
||||
The easiest assertion is that the mode log reports a new value such as `direct_skill_primary` and that the normal mode logs do not appear for that turn.
|
||||
|
||||
- [ ] **Step 4: Run the priority test and verify it fails**
|
||||
|
||||
Run the narrow test command for the new test.
|
||||
|
||||
Expected: FAIL because the mode does not exist yet.
|
||||
|
||||
- [ ] **Step 5: Add the narrow pre-routing branch in `src/agent/mod.rs`**
|
||||
|
||||
In `handle_browser_message_with_context(...)`, after config load/logging and before the existing `should_use_primary_orchestration(...)` / `compat::runtime` path:
|
||||
- check `settings.direct_submit_skill`
|
||||
- if present, emit mode log `direct_skill_primary`
|
||||
- call the new direct runtime
|
||||
- send `TaskComplete` and return immediately
|
||||
|
||||
Rules:
|
||||
- if `direct_submit_skill` is absent, keep existing behavior byte-for-byte where possible
|
||||
- do not modify `compat::runtime.rs` or `compat::orchestration.rs` for this slice
|
||||
- do not silently fall through to LLM when direct execution fails; return the direct error clearly so the first slice is debuggable
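
The control flow described in this step can be sketched in isolation. Everything here is a simplified stand-in: `Settings`, `TaskOutcome`, and `route_submit_task` are hypothetical names, and the real branch lives inside `handle_browser_message_with_context(...)` and sends `TaskComplete` over the browser pipe instead of returning a value.

```rust
/// Hypothetical, trimmed-down stand-in for the loaded sgClaw settings.
struct Settings {
    direct_submit_skill: Option<String>,
}

/// Hypothetical stand-in for the mode log plus the `TaskComplete` reply.
#[derive(Debug, PartialEq)]
struct TaskOutcome {
    mode: &'static str,
    success: bool,
    summary: String,
}

/// Runs the direct branch first; the existing path is entered only when no
/// direct skill is configured. A direct failure is reported, never retried
/// through the existing (LLM-oriented) path.
fn route_submit_task<D, E>(settings: &Settings, run_direct: D, run_existing: E) -> TaskOutcome
where
    D: FnOnce(&str) -> Result<String, String>,
    E: FnOnce() -> TaskOutcome,
{
    if let Some(skill) = settings.direct_submit_skill.as_deref() {
        let (success, summary) = match run_direct(skill) {
            Ok(output) => (true, output),
            Err(error) => (false, error),
        };
        return TaskOutcome { mode: "direct_skill_primary", success, summary };
    }
    run_existing()
}
```

Passing the two paths as closures keeps the sketch testable: a test can hand in an existing-path closure that panics and so prove the direct branch never falls through.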
|
||||
|
||||
- [ ] **Step 6: Re-run the focused submit-path tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test agent_runtime_test submit_task_uses_direct_skill_mode_without_llm_configuration -- --nocapture
|
||||
cargo test --test agent_runtime_test direct_skill_mode_logs_direct_skill_primary -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 7: Re-run existing no-LLM submit regression coverage**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test agent_runtime_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS, including existing cases where no direct skill is configured and the old no-LLM failure still applies.
|
||||
|
||||
- [ ] **Step 8: Commit Task 4**
|
||||
|
||||
```bash
|
||||
git add src/agent/mod.rs tests/agent_runtime_test.rs
|
||||
git commit -m "feat: route submit tasks through fixed direct skill mode"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 5: Lock The Future Migration Seam Without Implementing LLM Dispatch Yet
|
||||
|
||||
**Files:**
|
||||
- Modify only if needed: `src/config/settings.rs`
|
||||
- Modify only if needed: `src/compat/direct_skill_runtime.rs`
|
||||
- Otherwise: docs/plan changes only, unless the code needs one tiny naming fix
|
||||
|
||||
- [ ] **Step 1: Keep the config naming compatible with future per-skill dispatch**
|
||||
|
||||
Document and preserve this future meaning in code naming:
|
||||
- current field: one fixed direct skill for submit-task bootstrap
|
||||
- future model: each skill can declare dispatch mode such as `direct_browser` or `llm_agent`
|
||||
|
||||
Prefer neutral names in helper code like:
|
||||
- `direct skill mode`
|
||||
- `direct submit skill`
|
||||
|
||||
Avoid hard-coding `fault_details` into generic APIs.
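
To make the naming guidance concrete, here is one way the future per-skill dispatch could be modelled while staying compatible with today's single field. This is purely illustrative and not something to implement in this slice: `SkillDispatchMode` and `bootstrap_dispatch_mode` are hypothetical names, and only the two mode strings come from the plan.

```rust
/// Hypothetical future dispatch modes, named after the strings in this plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SkillDispatchMode {
    DirectBrowser,
    LlmAgent,
}

impl SkillDispatchMode {
    /// Parses the assumed wire names `direct_browser` and `llm_agent`.
    fn parse(raw: &str) -> Option<Self> {
        match raw.trim() {
            "direct_browser" => Some(Self::DirectBrowser),
            "llm_agent" => Some(Self::LlmAgent),
            _ => None,
        }
    }
}

/// Bootstrap rule: the one configured direct-submit skill is dispatched
/// directly; every other skill keeps the LLM-oriented path.
fn bootstrap_dispatch_mode(direct_submit_skill: Option<&str>, skill_id: &str) -> SkillDispatchMode {
    if direct_submit_skill == Some(skill_id) {
        SkillDispatchMode::DirectBrowser
    } else {
        SkillDispatchMode::LlmAgent
    }
}
```

Nothing in the sketch mentions `fault_details`, which is the property the generic helper names should preserve.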
|
||||
|
||||
- [ ] **Step 2: Add one small negative test for fallback behavior**
|
||||
|
||||
Add a focused test proving that when `directSubmitSkill` is not configured, submit-task still behaves exactly as before and can still return the existing no-LLM message.
|
||||
|
||||
If an existing test already proves this, keep it and do not add another.
|
||||
|
||||
- [ ] **Step 3: Re-run the focused end-to-end verification set**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_config_test -- --nocapture
|
||||
cargo test --test browser_script_skill_tool_test -- --nocapture
|
||||
cargo test --test agent_runtime_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 4: Build the main binary**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo build --bin sgclaw
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Commit Task 5**
|
||||
|
||||
```bash
|
||||
git add src/config/settings.rs src/compat/direct_skill_runtime.rs src/compat/browser_script_skill_tool.rs src/agent/mod.rs tests/compat_config_test.rs tests/browser_script_skill_tool_test.rs tests/agent_runtime_test.rs
|
||||
git commit -m "test: verify fixed direct skill submit path"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
### Config loading
|
||||
|
||||
```bash
|
||||
cargo test --test compat_config_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: `directSubmitSkill` loads correctly and existing config behavior remains intact.
|
||||
|
||||
### Browser-script helper
|
||||
|
||||
```bash
|
||||
cargo test --test browser_script_skill_tool_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: direct helper preserves the existing browser-script execution semantics.
|
||||
|
||||
### Submit-path bypass
|
||||
|
||||
```bash
|
||||
cargo test --test agent_runtime_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: configured direct skill bypasses the model path, while unconfigured submit-task behavior stays unchanged.
|
||||
|
||||
### Build
|
||||
|
||||
```bash
|
||||
cargo build --bin sgclaw
|
||||
```
|
||||
|
||||
Expected: the binary compiles cleanly.
|
||||
|
||||
---
|
||||
|
||||
## Notes For The Engineer
|
||||
|
||||
- The key to keeping this slice small is to avoid changing `compat::runtime.rs` and `compat::orchestration.rs`; they remain the future LLM path.
|
||||
- `fault-details-report.collect_fault_details` is only the bootstrap skill. The plumbing must stay generic enough that the configured tool name can later point to another staged browser skill.
|
||||
- Phase 1 should not add per-skill dispatch metadata to the external skill manifests yet. Keep that decision in sgClaw config first; move it into skill metadata only after the direct path is proven useful.
|
||||
- Once the intranet model is ready, the clean next step is to add a dispatch policy layer that chooses between `direct_browser` and `llm_agent` before the current compat path is entered, reusing this same pre-routing seam.
|
||||
@@ -0,0 +1,666 @@
|
||||
# WS Branch Scene Cleanup Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Strip `feature/claw-ws` back to websocket plus Zhihu execution only by removing staged scene-skill routing, `skill_staging`-aware loading, and array-style `skillsDir` config behavior from this branch.
|
||||
|
||||
**Architecture:** Treat `feature/claw-ws` as a transport-focused branch, not a business-scene branch. Keep the browser websocket/callback submit path and the existing Zhihu direct workflows, but delete the fault-details / `95598` scene registry, scene-specific prompt injection, staged scene directory expansion, and scene-only docs/tests so the branch stays small and merges cleanly after the real scene implementation lands on `main`.
|
||||
|
||||
**Tech Stack:** Rust 2021, existing sgClaw compat/runtime/orchestration stack, websocket browser backend, callback-host service path, existing `cargo test` suite.
|
||||
|
||||
---
|
||||
|
||||
## Preconditions
|
||||
|
||||
- Execute this plan **only after** `main` already contains the desired clean scene-skill implementation.
|
||||
- Run it on `feature/claw-ws`, not on `main`.
|
||||
- Keep websocket and Zhihu behavior intact; this plan is cleanup, not a redesign.
|
||||
- Keep `docs/_tmp_sgbrowser_ws_api_doc.txt`; it remains the browser integration contract for this branch.
|
||||
|
||||
## Scope Guardrails
|
||||
|
||||
- Do **not** change the working Zhihu websocket flow in `tests/agent_runtime_test.rs`.
|
||||
- Do **not** remove `src/browser/ws_backend.rs`, `src/service/server.rs`, or Zhihu routes from `src/compat/workflow_executor.rs`.
|
||||
- Do **not** add a replacement scene abstraction on this branch.
|
||||
- Do **not** keep partial scene plumbing “for future use”; delete it completely if it is scene-only.
|
||||
- Do **not** keep array-style `skillsDir` tests or docs on this branch once the single-path cleanup is complete.
|
||||
|
||||
---
|
||||
|
||||
## File Map
|
||||
|
||||
### Delete
|
||||
|
||||
- `src/runtime/scene_registry.rs`
|
||||
- staged scene registry, hard-coded `skill_staging` scene root, scene matching helpers
|
||||
- `tests/scene_registry_test.rs`
|
||||
- scene-registry-specific coverage that should disappear with the feature
|
||||
- `docs/superpowers/specs/2026-04-06-scene-skill-runtime-routing-design.md`
|
||||
- scene-routing design doc that no longer belongs on the ws-only branch
|
||||
- `docs/superpowers/plans/2026-04-06-scene-skill-runtime-routing-plan.md`
|
||||
- scene-routing implementation plan that no longer belongs on the ws-only branch
|
||||
|
||||
### Modify
|
||||
|
||||
- `src/runtime/mod.rs`
|
||||
- stop exporting deleted scene registry APIs
|
||||
- `src/runtime/engine.rs`
|
||||
- remove scene-contract prompt injection and staged scene skill loading
|
||||
- `src/compat/workflow_executor.rs`
|
||||
- remove `FaultDetailsReport` route detection/execution while keeping Zhihu routes
|
||||
- `src/compat/orchestration.rs`
|
||||
- keep direct Zhihu orchestration only; remove scene-driven primary routing triggers
|
||||
- `src/config/settings.rs`
|
||||
- collapse `skillsDir` config handling back to single-path semantics
|
||||
- `src/compat/config_adapter.rs`
|
||||
- remove scene-specific skills-dir helpers and keep one resolved skills dir
|
||||
- `src/compat/runtime.rs`
|
||||
- stop carrying scene-expanded skills dirs through compat runtime
|
||||
- `src/agent/task_runner.rs`
|
||||
- update runtime logging and runtime calls to the single skills-dir contract
|
||||
- `tests/compat_runtime_test.rs`
|
||||
- remove fault-details / `95598` assertions and keep Zhihu/direct-route coverage
|
||||
- `tests/runtime_profile_test.rs`
|
||||
- remove `95598` scene-contract expectations and keep normal browser-profile coverage
|
||||
- `tests/compat_config_test.rs`
|
||||
- remove scene-dir / array-config coverage and add single-path cleanup coverage
|
||||
- `tests/agent_runtime_test.rs`
|
||||
- only extend if one extra Zhihu keep-path regression is needed after the config cleanup
|
||||
|
||||
### Keep As-Is Unless A Signature Change Forces A Tiny Edit
|
||||
|
||||
- `src/browser/ws_backend.rs`
|
||||
- `src/browser/callback_backend.rs`
|
||||
- `src/browser/callback_host.rs`
|
||||
- `src/service/server.rs`
|
||||
- `src/agent/mod.rs`
|
||||
- `tests/browser_ws_backend_test.rs`
|
||||
- `tests/service_ws_session_test.rs`
|
||||
- `tests/task_runner_test.rs`
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Lock The Cleanup Contract In Failing Tests
|
||||
|
||||
**Files:**
|
||||
- Modify: `tests/compat_runtime_test.rs`
|
||||
- Modify: `tests/runtime_profile_test.rs`
|
||||
- Modify: `tests/compat_config_test.rs`
|
||||
- Reuse: `tests/agent_runtime_test.rs`
|
||||
|
||||
- [ ] **Step 1: Add the first failing route-removal test**
|
||||
|
||||
In `tests/compat_runtime_test.rs`, add a focused assertion proving the ws branch no longer recognizes the fault-details scene as a direct route:
|
||||
|
||||
```rust
#[test]
fn ws_cleanup_no_longer_detects_fault_details_scene_route() {
    use sgclaw::compat::workflow_executor::detect_route;

    assert_eq!(
        detect_route(
            // Instruction: "Export fault details"
            "导出故障明细",
            Some("https://example.invalid/workbench"),
            // Page title: "Business ledger"
            Some("业务台账"),
        ),
        None,
    );
}
```
|
||||
|
||||
- [ ] **Step 2: Run the focused route test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_runtime_test ws_cleanup_no_longer_detects_fault_details_scene_route -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because `FaultDetailsReport` is still detected today.
|
||||
|
||||
- [ ] **Step 3: Add the second failing orchestration-gate test**
|
||||
|
||||
In `tests/compat_runtime_test.rs`, add one focused assertion proving scene keywords no longer open the primary direct-orchestration path:
|
||||
|
||||
```rust
#[test]
fn ws_cleanup_scene_keywords_do_not_trigger_primary_orchestration() {
    assert!(!sgclaw::compat::orchestration::should_use_primary_orchestration(
        // Instruction: "Please handle 95598 emergency-repair city-dispatch monitoring"
        "请处理95598抢修市指监测",
        Some("https://95598.example.invalid/dispatch"),
        // Page title: "95598 emergency-repair city-dispatch monitoring"
        Some("95598抢修市指监测"),
    ));
}
```
|
||||
|
||||
- [ ] **Step 4: Run the orchestration-gate test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_runtime_test ws_cleanup_scene_keywords_do_not_trigger_primary_orchestration -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the scene matcher still feeds primary orchestration today.
|
||||
|
||||
- [ ] **Step 5: Add the third failing runtime-instruction test**
|
||||
|
||||
In `tests/runtime_profile_test.rs`, add a focused negative assertion proving browser-attached turns no longer receive the `95598` scene execution contract:
|
||||
|
||||
```rust
#[test]
fn ws_cleanup_browser_profile_does_not_inject_95598_scene_contract() {
    let engine = RuntimeEngine::new(RuntimeProfile::BrowserAttached);
    let instruction = engine.build_instruction(
        // Instruction: "Please handle the 95598-repair-city-dispatch scene: review the
        // repair city-dispatch orders and summarize the current queue"
        "请处理95598-repair-city-dispatch场景,查看抢修市指派单并汇总当前队列",
        Some("https://95598.example.invalid/dispatch"),
        // Page title: "95598 emergency-repair city-dispatch monitoring"
        Some("95598抢修市指监测"),
        true,
    );

    assert!(!instruction.contains("95598-repair-city-dispatch.collect_repair_orders"));
}
```
|
||||
|
||||
- [ ] **Step 6: Run the runtime-profile test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test runtime_profile_test ws_cleanup_browser_profile_does_not_inject_95598_scene_contract -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because `src/runtime/engine.rs` still injects the scene contract today.
|
||||
|
||||
- [ ] **Step 7: Add the fourth failing config-shape test**
|
||||
|
||||
In `tests/compat_config_test.rs`, add one focused assertion proving ws cleanup goes back to a single configured skills path and no longer accepts array-style `skillsDir` JSON:
|
||||
|
||||
```rust
#[test]
fn ws_cleanup_rejects_array_style_skills_dir_config() {
    let root = std::env::temp_dir().join(format!("sgclaw-config-{}", uuid::Uuid::new_v4()));
    std::fs::create_dir_all(&root).unwrap();
    let config_path = root.join("sgclaw_config.json");
    std::fs::write(
        &config_path,
        r#"{
            "apiKey": "sk-test",
            "baseUrl": "https://api.deepseek.com",
            "model": "deepseek-chat",
            "skillsDir": ["skill_lib", "skill_staging"]
        }"#,
    )
    .unwrap();

    assert!(sgclaw::config::SgClawSettings::load(Some(config_path.as_path())).is_err());
}
```
|
||||
|
||||
- [ ] **Step 8: Run the config-shape test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_config_test ws_cleanup_rejects_array_style_skills_dir_config -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL because the current parser still accepts string-or-array `skillsDir` input.
|
||||
|
||||
- [ ] **Step 9: Re-run the existing Zhihu keep-path test as a safety baseline**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test agent_runtime_test production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS, proving the behavior we want to keep is already covered before deletion starts.
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Remove Scene Registry, Scene Prompt Injection, And Fault-Details Routing
|
||||
|
||||
**Files:**
|
||||
- Delete: `src/runtime/scene_registry.rs`
|
||||
- Modify: `src/runtime/mod.rs`
|
||||
- Modify: `src/runtime/engine.rs`
|
||||
- Modify: `src/compat/workflow_executor.rs`
|
||||
- Modify: `src/compat/orchestration.rs`
|
||||
- Modify: `tests/compat_runtime_test.rs`
|
||||
- Modify: `tests/runtime_profile_test.rs`
|
||||
- Delete: `tests/scene_registry_test.rs`
|
||||
|
||||
- [ ] **Step 1: Remove the runtime scene module export surface**
|
||||
|
||||
Update `src/runtime/mod.rs` so it no longer declares or re-exports scene registry items.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
mod engine;
mod profile;
mod tool_policy;

pub use engine::{
    is_zhihu_hotlist_task,
    is_zhihu_write_task,
    task_requests_zhihu_article_publish,
    RuntimeEngine,
};
pub use profile::RuntimeProfile;
pub use tool_policy::ToolPolicy;
```
|
||||
|
||||
- [ ] **Step 2: Delete `src/runtime/scene_registry.rs`**
|
||||
|
||||
Remove the file entirely. Do not leave a stub module or comments about future scene support.
|
||||
|
||||
- [ ] **Step 3: Remove scene-aware prompt injection from `src/runtime/engine.rs`**
|
||||
|
||||
Delete:
|
||||
- the `resolve_scene_skills_dir_path` import
|
||||
- the `DispatchMode` / `match_scene_instruction` imports
|
||||
- `REPAIR_CITY_DISPATCH_EXECUTION_PROMPT`
|
||||
- `build_scene_execution_contract(...)`
|
||||
- the `if let Some(scene_contract) = ...` block inside `RuntimeEngine::build_instruction(...)`
|
||||
- staged scene directory loading inside `load_runtime_skills(...)`
|
||||
|
||||
The resulting instruction assembly should keep:
|
||||
- browser tool contract
|
||||
- Zhihu hotlist/export prompts
|
||||
- Zhihu publish guard
|
||||
- page context
|
||||
|
||||
Do **not** change Zhihu prompt text.
|
||||
|
||||
- [ ] **Step 4: Remove the fault-details route from `src/compat/workflow_executor.rs`**
|
||||
|
||||
Shrink `WorkflowRoute` back to Zhihu-only variants:
|
||||
|
||||
```rust
pub enum WorkflowRoute {
    ZhihuHotlistExportXlsx,
    ZhihuHotlistScreen,
    ZhihuArticleEntry,
    ZhihuArticleDraft,
    ZhihuArticlePublish,
    ZhihuArticleAutoPublishGenerated,
}
```
|
||||
|
||||
Delete:
|
||||
- `FAULT_DETAILS_SCENE_ID`
|
||||
- the scene check at the top of `detect_route(...)`
|
||||
- `WorkflowRoute::FaultDetailsReport`
|
||||
- `execute_fault_details_route(...)`
|
||||
- any scene-only helpers used only by that path
|
||||
|
||||
Keep the Zhihu route order unchanged.
|
||||
|
||||
- [ ] **Step 5: Simplify `src/compat/orchestration.rs` to Zhihu-only direct routing**
|
||||
|
||||
After the fault-details route is gone, keep `should_use_primary_orchestration(...)` and the two execute functions focused on:
|
||||
- Zhihu direct routes detected by `detect_route(...)`
|
||||
- existing Zhihu export/dashboard fallback behavior
|
||||
|
||||
Do not add new conditions.
|
||||
|
||||
- [ ] **Step 6: Remove scene-only tests and replace them with cleanup assertions**
|
||||
|
||||
In `tests/compat_runtime_test.rs` and `tests/runtime_profile_test.rs`:
|
||||
- delete `fault-details` assertions that require the old route to exist
|
||||
- delete `95598` scene-contract assertions that require the old prompt injection to exist
|
||||
- keep the new negative cleanup tests from Task 1
|
||||
- keep the existing Zhihu assertions intact
|
||||
|
||||
Delete `tests/scene_registry_test.rs` completely.
|
||||
|
||||
- [ ] **Step 7: Run the focused cleanup tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_runtime_test ws_cleanup_no_longer_detects_fault_details_scene_route -- --nocapture && cargo test --test compat_runtime_test ws_cleanup_scene_keywords_do_not_trigger_primary_orchestration -- --nocapture && cargo test --test runtime_profile_test ws_cleanup_browser_profile_does_not_inject_95598_scene_contract -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 8: Re-run the focused Zhihu runtime tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_runtime_test zhihu_ -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS, proving the Zhihu direct routes still work after the scene deletion.
|
||||
|
||||
- [ ] **Step 9: Commit Task 2**
|
||||
|
||||
```bash
|
||||
git add src/runtime/mod.rs src/runtime/engine.rs src/compat/workflow_executor.rs src/compat/orchestration.rs tests/compat_runtime_test.rs tests/runtime_profile_test.rs
|
||||
git rm src/runtime/scene_registry.rs tests/scene_registry_test.rs
|
||||
git commit -m "refactor: remove scene routing from ws branch"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Collapse `skillsDir` Back To Single-Path Semantics
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/config/settings.rs`
|
||||
- Modify: `src/compat/config_adapter.rs`
|
||||
- Modify: `src/compat/runtime.rs`
|
||||
- Modify: `src/agent/task_runner.rs`
|
||||
- Modify if needed: `tests/agent_runtime_test.rs`
|
||||
- Modify: `tests/compat_config_test.rs`
|
||||
|
||||
- [ ] **Step 1: Change config parsing to a single configured skills path**
|
||||
|
||||
In `src/config/settings.rs`, replace the string-or-array parser with a single optional string field.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DeepSeekSettings {
    pub api_key: String,
    pub base_url: String,
    pub model: String,
    pub skills_dir: Option<PathBuf>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SgClawSettings {
    // ...
    pub skills_dir: Option<PathBuf>,
    // ...
}
```
|
||||
|
||||
And in `RawSgClawSettings`:
|
||||
|
||||
```rust
|
||||
#[serde(rename = "skillsDir", alias = "skills_dir", default)]
|
||||
skills_dir: Option<String>,
|
||||
```
|
||||
|
||||
Delete `deserialize_skills_dirs(...)` entirely.
|
||||
|
||||
- [ ] **Step 2: Keep relative-path resolution, but only for one path**
|
||||
|
||||
Replace `resolve_configured_skills_dirs(...) -> Vec<PathBuf>` with a single-path helper such as:
|
||||
|
||||
```rust
fn resolve_configured_skills_dir(raw: Option<String>, config_dir: &Path) -> Option<PathBuf> {
    raw.map(|value| value.trim().to_string())
        .filter(|value| !value.is_empty())
        .map(PathBuf::from)
        .map(|path| if path.is_absolute() { path } else { config_dir.join(path) })
}
```
|
||||
|
||||
- [ ] **Step 3: Collapse compat config adapter back to one resolved skills dir**

In `src/compat/config_adapter.rs`:
- keep `zeroclaw_default_skills_dir(...)`
- change `resolve_skills_dir(...)` and `resolve_skills_dir_from_sgclaw_settings(...)` to return a single `PathBuf`
- delete `resolve_scene_skills_dir_from_sgclaw_settings(...)`
- delete `resolve_scene_skills_dir_path(...)`
- delete any helper branches that append `skill_staging/skills`

Recommended shape:

```rust
pub fn resolve_skills_dir_from_sgclaw_settings(
    workspace_root: &Path,
    settings: &SgClawSettings,
) -> PathBuf {
    settings
        .skills_dir
        .as_ref()
        .map(|dir| normalize_configured_skills_dir(dir))
        .unwrap_or_else(|| zeroclaw_default_skills_dir(workspace_root))
}
```

- [ ] **Step 4: Update runtime callers to the single-path contract**

In `src/compat/runtime.rs` and `src/agent/task_runner.rs`:
- stop passing vectors of skills dirs around
- update logging from `skills dirs resolved to [...]` to a single-path message such as `skills dir resolved to ...`
- keep the rest of the runtime behavior unchanged

In `src/runtime/engine.rs`, if the method still needs a collection internally, convert the one path at the call site instead of preserving public multi-root plumbing.

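One way to convert a single path at the call site, as Step 4 suggests for `src/runtime/engine.rs`, is `std::slice::from_ref`, which views one value as a one-element slice without allocating. This is only a sketch: `describe_skill_roots` and `load_skills` are hypothetical stand-ins, not functions from the repository.

```rust
use std::path::PathBuf;
use std::slice;

/// Hypothetical stand-in for an internal engine method that still iterates
/// over a list of skill roots.
fn describe_skill_roots(roots: &[PathBuf]) -> Vec<String> {
    roots.iter().map(|root| root.display().to_string()).collect()
}

/// Hypothetical public entry point that exposes only the single-path contract.
/// The one path is viewed as a one-element slice at the call site, so no
/// `Vec<PathBuf>` appears in any public signature.
fn load_skills(skills_dir: &PathBuf) -> Vec<String> {
    describe_skill_roots(slice::from_ref(skills_dir))
}

fn main() {
    let skills_dir = PathBuf::from("skill_lib/skills");
    for root in load_skills(&skills_dir) {
        println!("skills dir resolved to {root}");
    }
}
```

The collection stays an implementation detail of the engine, which matches the note that the public config contract must go back to a single configured path.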
- [ ] **Step 5: Replace config tests with single-path cleanup coverage**

In `tests/compat_config_test.rs`:
- keep single-string `skillsDir` resolution tests
- remove `resolve_scene_skills_dir_path_*` coverage
- remove array-acceptance expectations
- keep the new rejecting-array test from Task 1

Add one focused positive test like:

```rust
#[test]
fn ws_cleanup_resolves_single_configured_skills_dir() {
    let root = std::env::temp_dir().join(format!("sgclaw-skills-{}", uuid::Uuid::new_v4()));
    std::fs::create_dir_all(root.join("skill_lib/skills")).unwrap();

    let settings = DeepSeekSettings {
        api_key: "key".to_string(),
        base_url: "https://api.deepseek.com".to_string(),
        model: "deepseek-chat".to_string(),
        skills_dir: Some(root.join("skill_lib")),
    };

    assert_eq!(
        resolve_skills_dir(&root, &settings),
        root.join("skill_lib/skills"),
    );
}
```

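The positive test above expects a configured `skill_lib` directory to resolve to `skill_lib/skills`. The sketch below shows one normalization rule that would satisfy that expectation. It is an assumption, not the repository's implementation: the body of `normalize_configured_skills_dir`, the `default_skills_dir` fallback location, and the `Option<&Path>` signature are all hypothetical.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical rule: prefer a nested `skills` directory when it exists,
/// otherwise use the configured directory as given.
fn normalize_configured_skills_dir(dir: &Path) -> PathBuf {
    let nested = dir.join("skills");
    if nested.is_dir() { nested } else { dir.to_path_buf() }
}

/// Hypothetical stand-in for `zeroclaw_default_skills_dir(...)`.
fn default_skills_dir(workspace_root: &Path) -> PathBuf {
    workspace_root.join("skills")
}

/// Single-path resolution: the configured directory wins, else the default.
fn resolve_skills_dir(workspace_root: &Path, configured: Option<&Path>) -> PathBuf {
    configured
        .map(normalize_configured_skills_dir)
        .unwrap_or_else(|| default_skills_dir(workspace_root))
}

fn main() {
    // Unique-enough scratch directory for the demonstration.
    let root = std::env::temp_dir().join(format!("sgclaw-skills-sketch-{}", std::process::id()));
    fs::create_dir_all(root.join("skill_lib/skills")).unwrap();

    let configured = root.join("skill_lib");
    println!("configured -> {:?}", resolve_skills_dir(&root, Some(configured.as_path())));
    println!("default    -> {:?}", resolve_skills_dir(&root, None));

    fs::remove_dir_all(&root).unwrap();
}
```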
- [ ] **Step 6: Run the focused config tests**

Run:

```bash
cargo test --test compat_config_test ws_cleanup_ -- --nocapture
```

Expected: PASS.

- [ ] **Step 7: Re-run the Zhihu websocket keep-path test**

Run:

```bash
cargo test --test agent_runtime_test production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap -- --nocapture
```

Expected: PASS.

- [ ] **Step 8: Commit Task 3**

```bash
git add src/config/settings.rs src/compat/config_adapter.rs src/compat/runtime.rs src/agent/task_runner.rs tests/compat_config_test.rs tests/agent_runtime_test.rs
git commit -m "refactor: restore single skills dir on ws branch"
```

---

### Task 4: Remove Scene-Only Docs And Residual Test References

**Files:**
- Delete: `docs/superpowers/specs/2026-04-06-scene-skill-runtime-routing-design.md`
- Delete: `docs/superpowers/plans/2026-04-06-scene-skill-runtime-routing-plan.md`
- Modify: `tests/compat_runtime_test.rs`
- Modify: `tests/runtime_profile_test.rs`
- Modify: `tests/compat_config_test.rs`

- [ ] **Step 1: Delete the two scene-only planning documents**

Remove exactly these files:
- `docs/superpowers/specs/2026-04-06-scene-skill-runtime-routing-design.md`
- `docs/superpowers/plans/2026-04-06-scene-skill-runtime-routing-plan.md`

Keep the websocket/browser docs and Zhihu docs.

- [ ] **Step 2: Sweep remaining tests for scene-only names**

Remove or rewrite any remaining test blocks that still require:
- `fault-details-report`
- `95598-repair-city-dispatch`
- `resolve_scene_skills_dir_path`
- `resolve_scene_skills_dir_from_sgclaw_settings`
- `scene_registry`

Do not delete Zhihu-related assertions during this sweep.

- [ ] **Step 3: Run a focused grep-style audit from the shell**

Run:

```bash
git grep -n "fault-details-report\|95598-repair-city-dispatch\|resolve_scene_skills_dir_path\|resolve_scene_skills_dir_from_sgclaw_settings\|scene_registry" -- src tests docs
```

Expected: no matches in `src/` or `tests/`; doc matches should be gone after the deletions.

- [ ] **Step 4: Commit Task 4**

```bash
git add tests/compat_runtime_test.rs tests/runtime_profile_test.rs tests/compat_config_test.rs
git rm docs/superpowers/specs/2026-04-06-scene-skill-runtime-routing-design.md docs/superpowers/plans/2026-04-06-scene-skill-runtime-routing-plan.md
git commit -m "docs: remove ws-only scene planning artifacts"
```

---

### Task 5: Verify The Branch Is Back To WS Plus Zhihu Only

**Files:**
- Verify only unless a failing test proves one tiny follow-up fix is needed

- [ ] **Step 1: Run the retained Zhihu websocket regression**

Run:

```bash
cargo test --test agent_runtime_test production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap -- --nocapture
```

Expected: PASS.

- [ ] **Step 2: Run websocket/backend focused coverage**

Run:

```bash
cargo test --test browser_ws_backend_test -- --nocapture && cargo test --test service_ws_session_test -- --nocapture
```

Expected: PASS.

- [ ] **Step 3: Run direct-route/runtime Zhihu coverage**

Run:

```bash
cargo test --test compat_runtime_test zhihu_ -- --nocapture && cargo test --test task_runner_test -- --nocapture
```

Expected: PASS.

- [ ] **Step 4: Run config/runtime verification after the single-dir cleanup**

Run:

```bash
cargo test --test compat_config_test -- --nocapture && cargo test --test runtime_profile_test -- --nocapture
```

Expected: PASS.

- [ ] **Step 5: Build the affected binaries**

Run:

```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```

Expected: PASS.

- [ ] **Step 6: Audit the remaining branch diff against `main`**

Run:

```bash
git diff --stat main...HEAD
```

Expected: the remaining meaningful differences are websocket/browser transport work and Zhihu-related behavior, not scene-routing or staged-scene config churn.

- [ ] **Step 7: Commit the final verification pass**

```bash
git add src/config/settings.rs src/compat/config_adapter.rs src/compat/runtime.rs src/compat/workflow_executor.rs src/compat/orchestration.rs src/runtime/mod.rs src/runtime/engine.rs tests/compat_config_test.rs tests/runtime_profile_test.rs tests/compat_runtime_test.rs tests/agent_runtime_test.rs tests/task_runner_test.rs
git commit -m "test: verify ws branch cleanup preserves zhihu websocket flow"
```

---

## Verification Checklist

### Cleanup regressions

```bash
cargo test --test compat_runtime_test ws_cleanup_ -- --nocapture
cargo test --test runtime_profile_test ws_cleanup_ -- --nocapture
cargo test --test compat_config_test ws_cleanup_ -- --nocapture
```

Expected: scene detection, scene prompt injection, and array-style `skillsDir` behavior are gone.

### Retained Zhihu websocket behavior

```bash
cargo test --test agent_runtime_test production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap -- --nocapture
cargo test --test browser_ws_backend_test -- --nocapture
cargo test --test service_ws_session_test -- --nocapture
cargo test --test compat_runtime_test zhihu_ -- --nocapture
```

Expected: websocket submit path and Zhihu direct workflows still pass.

### Runtime/config verification

```bash
cargo test --test compat_config_test -- --nocapture
cargo test --test runtime_profile_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```

Expected: runtime/config plumbing is stable after the single-dir cleanup.

### Build verification

```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```

Expected: the branch still compiles cleanly.

---

## Notes For The Engineer

- The current scene support touches three different seams: runtime prompt injection, direct route detection/execution, and multi-root `skillsDir` plumbing. Remove all three; deleting only one leaves conflict-prone leftovers.
- If collapsing `skillsDir` to `Option<PathBuf>` creates more churn than expected, keep the internal representation temporarily as a one-element collection, but the public config contract and tests on this branch must still go back to a single configured path.
- Do not delete browser websocket or callback-host code just because it is adjacent to the scene work; this plan is about stripping scene behavior, not reworking transport.
- If `git diff --stat main...HEAD` still shows scene-specific files after Task 5, stop and remove them before merging `main` back into this branch.
@@ -0,0 +1,672 @@
# Fault Details Full Skill Alignment Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Upgrade `fault-details-report.collect_fault_details` into a real staged browser skill that matches the original fault-details workflow, and make `claw-new` interpret the returned artifact status correctly in the direct-submit path.

**Architecture:** Keep routing and direct-skill selection in `claw-new`, but move all fault-details collection, normalization, classification, summary, export, and report-log behavior into the staged skill under `skill_staging`. Implement the staged skill as a true browser-eval entrypoint that remains valid in page context, while exposing testable pure helpers through an environment-safe export guard for `node:test`; then add a narrow Rust artifact interpreter in `src/compat/direct_skill_runtime.rs` so `ok` / `partial` / `empty` map to successful task completion while `blocked` / `error` map to failed completion.

**Tech Stack:** Rust 2021, `serde_json`, existing `BrowserPipeTool` / `browser_script` runtime, `node:test`, staged skill fixtures, Cargo integration tests.

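The status mapping named in the Architecture paragraph can be sketched in Rust as follows. This is a minimal sketch under assumptions: `ArtifactStatus`, `parse`, `is_success`, and `task_succeeded` are hypothetical names, the real interpreter in `src/compat/direct_skill_runtime.rs` would presumably read the `status` field through `serde_json` rather than take a bare string, and treating an unknown status as failure is an assumption the plan does not state.

```rust
/// Hypothetical enum for the five artifact statuses named in the plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArtifactStatus {
    Ok,
    Partial,
    Empty,
    Blocked,
    Error,
}

impl ArtifactStatus {
    /// Parse the `status` string carried by a `report-artifact` payload.
    fn parse(raw: &str) -> Option<Self> {
        match raw {
            "ok" => Some(Self::Ok),
            "partial" => Some(Self::Partial),
            "empty" => Some(Self::Empty),
            "blocked" => Some(Self::Blocked),
            "error" => Some(Self::Error),
            _ => None,
        }
    }

    /// `ok` / `partial` / `empty` count as successful task completion;
    /// `blocked` / `error` count as failed completion.
    fn is_success(self) -> bool {
        matches!(self, Self::Ok | Self::Partial | Self::Empty)
    }
}

/// Assumption: a status string the interpreter does not recognize is a failure.
fn task_succeeded(status_field: &str) -> bool {
    ArtifactStatus::parse(status_field)
        .map(ArtifactStatus::is_success)
        .unwrap_or(false)
}

fn main() {
    for status in ["ok", "partial", "empty", "blocked", "error", "unknown"] {
        println!("{status}: success = {}", task_succeeded(status));
    }
}
```

Keeping the mapping in one small function keeps the Rust side free of fault classification logic, which the Execution Context section rules out.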
---

## Execution Context

- Follow @superpowers:test-driven-development for every behavior change.
- Follow @superpowers:verification-before-completion before claiming each task is done.
- Do **not** create a git worktree unless the user explicitly asks. This repo preference is already established.
- Keep scope tight. Do **not** add a new browser protocol, new dispatch metadata, new UI opener behavior, or Rust-side fault classification logic.
- Keep the current direct path bootstrap requirement intact: the user instruction must still include an explicit `YYYY-MM`, but the staged skill must treat the page-selected range as the source of truth for collection once execution begins.
- Preserve parity with the original package’s real behavior: port the original classification table, `qxxcjl`-based reason heuristics, canonical detail mapping, summary aggregation rules, localhost export call, and report-log call into the staged skill rather than implementing a fixture-only subset.

## File Map

### Existing files to modify in `claw-new`

- Modify: `src/compat/direct_skill_runtime.rs`
  - add narrow structured artifact parsing and status-to-summary mapping
  - keep direct-skill routing/config ownership unchanged
- Modify: `tests/agent_runtime_test.rs`
  - add direct-submit regressions for `ok`, `partial`, `empty`, `blocked`, and `error`
- Modify: `tests/browser_script_skill_tool_test.rs`
  - add browser-script execution-shape regression for browser-eval return payloads used by fault-details

### Existing files to modify in `skill_staging`

- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
  - replace empty shell with browser-eval entrypoint plus parity helpers
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js`
  - deterministic fixture coverage for normalization, classification, summary, artifact contract, export/logging degradation, and entrypoint shape helpers
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.toml`
  - align tool description with real collection/export/report-log behavior
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.md`
  - align written contract with actual runtime behavior and artifact fields
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/references/collection-flow.md`
  - align flow with page-range/query/export/report-log sequence
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/references/data-quality.md`
  - make canonical columns, original classification tables, reason heuristics, summary rules, and partial semantics explicit
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/fault-details-report/scene.json`
  - keep scene output/state contract aligned with real staged artifact behavior

### Existing files to read but not redesign

- Read only: `docs/superpowers/specs/2026-04-10-fault-details-full-skill-alignment-design.md`
- Read only: `src/agent/mod.rs`
- Read only: `src/compat/browser_script_skill_tool.rs`
- Read only: `D:/desk/智能体资料/大四区报告监测项/故障明细/index.html`

---

### Task 1: Add staged-skill red tests for normalization, summary, and artifact-contract semantics

**Files:**
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js`
- Read only: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
- Read only: `D:/desk/智能体资料/大四区报告监测项/故障明细/index.html`

- [ ] **Step 1: Write the failing staged-skill test file**

Add `collect_fault_details.test.js` using `node:test` and `assert/strict`. Cover these behaviors with fixed fixtures:

```javascript
const test = require('node:test');
const assert = require('node:assert/strict');

const {
  DETAIL_COLUMNS,
  SUMMARY_COLUMNS,
  normalizeDetailRow,
  deriveSummaryRows,
  determineArtifactStatus,
  buildFaultDetailsArtifact,
  buildBrowserEntrypointResult
} = require('./collect_fault_details.js');

test('normalizeDetailRow maps canonical detail fields from raw repair rows', () => {
  const row = normalizeDetailRow({
    qxdbh: 'QX-1',
    bxsj: '2026-03-09 08:00:00',
    cityName: '国网兰州供电公司',
    maintOrgName: '城关供电服务班',
    maintGroupName: '抢修一班',
    bdzMc: '110kV东岗变',
    xlmc10: '10kV东岗线',
    byqmc: '东岗1号变',
    yjflMc: '电网故障',
    ejflMc: '线路故障',
    sjflMc: '低压线路',
    qxxcjl: '现场检查:低压线路断线,已处理完成',
    gzms: '客户报修停电'
  }, {
    companyName: '国网兰州供电公司'
  });

  assert.equal(row.slsj, '2026-03-09 08:00:00');
  assert.equal(row.gssgs, '甘肃省电力公司');
  assert.equal(row.gddw, '城关供电服务班');
  assert.equal(row.gds, '抢修一班');
  assert.equal(row.clzt, '处理完成');
  assert.equal(row.bdz, '110kV东岗变');
  assert.equal(row.line, '10kV东岗线');
  assert.equal(row.pb, '东岗1号变');
});

test('deriveSummaryRows groups normalized rows by gds and computes counters', () => {
  const rows = [
    { gds: '抢修一班', gddw: '城关供电服务班', sgs: '国网兰州供电公司', sxfl1: '无效', sxfl2: '无效', gzsb: '' },
    { gds: '抢修一班', gddw: '城关供电服务班', sgs: '国网兰州供电公司', sxfl1: '有效', sxfl2: '用户侧', gzsb: '表后线' },
    { gds: '抢修一班', gddw: '城关供电服务班', sgs: '国网兰州供电公司', sxfl1: '有效', sxfl2: '电网侧', dwcFl: '低压故障', gzsb: '低压线路' }
  ];

  const summaryRows = deriveSummaryRows(rows, { companyName: '国网兰州供电公司' });
  assert.equal(summaryRows.length, 1);
  assert.equal(summaryRows[0].className, '抢修一班');
  assert.equal(summaryRows[0].allCount, 3);
  assert.equal(summaryRows[0].wxCount, 1);
  assert.equal(summaryRows[0].khcCount, 0);
  assert.equal(summaryRows[0].dyGzCount, 1);
  assert.equal(summaryRows[0].dyxlCount, 1);
  assert.equal(summaryRows[0].bhxCount, 1);
});

test('determineArtifactStatus follows blocked > error > partial > empty > ok precedence', () => {
  assert.equal(determineArtifactStatus({ blockedReason: 'missing_session', fatalError: null, partialReasons: [], detailRows: [{}] }), 'blocked');
  assert.equal(determineArtifactStatus({ blockedReason: null, fatalError: 'parse_failed', partialReasons: [], detailRows: [{}] }), 'error');
  assert.equal(determineArtifactStatus({ blockedReason: null, fatalError: null, partialReasons: ['export_failed'], detailRows: [{}] }), 'partial');
  assert.equal(determineArtifactStatus({ blockedReason: null, fatalError: null, partialReasons: [], detailRows: [] }), 'empty');
  assert.equal(determineArtifactStatus({ blockedReason: null, fatalError: null, partialReasons: [], detailRows: [{}] }), 'ok');
});

test('buildFaultDetailsArtifact keeps canonical fields, selected range, counts, and downstream results', () => {
  const artifact = buildFaultDetailsArtifact({
    period: '2026-03',
    selectedRange: { start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' },
    detailRows: [{ qxdbh: 'QX-1' }],
    summaryRows: [{ index: 1 }],
    partialReasons: ['report_log_failed'],
    downstream: {
      export: { attempted: true, success: true, path: 'http://localhost/export.xlsx' },
      report_log: { attempted: true, success: false, error: '500' }
    }
  });

  assert.equal(artifact.type, 'report-artifact');
  assert.equal(artifact.status, 'partial');
  assert.deepEqual(artifact.selected_range, { start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' });
  assert.equal(artifact.counts.detail_rows, 1);
  assert.equal(artifact.counts.summary_rows, 1);
  assert.deepEqual(artifact.partial_reasons, ['report_log_failed']);
});

test('buildFaultDetailsArtifact keeps required top-level fields for blocked artifact', () => {
  const artifact = buildFaultDetailsArtifact({
    period: '2026-03',
    blockedReason: 'selected_range_unavailable',
    partialReasons: ['selected_range_unavailable']
  });

  assert.equal(artifact.type, 'report-artifact');
  assert.equal(artifact.report_name, 'fault-details-report');
  assert.equal(artifact.period, '2026-03');
  assert.equal(artifact.status, 'blocked');
  assert.deepEqual(artifact.partial_reasons, ['selected_range_unavailable']);
  assert.equal('downstream' in artifact, false);
});

test('buildFaultDetailsArtifact keeps known selected range and counts on late error', () => {
  const artifact = buildFaultDetailsArtifact({
    period: '2026-03',
    selectedRange: { start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' },
    detailRows: [],
    summaryRows: [],
    fatalError: 'summary_failed',
    partialReasons: ['summary_failed']
  });

  assert.equal(artifact.status, 'error');
  assert.deepEqual(artifact.selected_range, { start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' });
  assert.equal(artifact.counts.detail_rows, 0);
  assert.equal(artifact.counts.summary_rows, 0);
});

test('buildBrowserEntrypointResult returns blocked artifact when selected range is unavailable', async () => {
  const artifact = await buildBrowserEntrypointResult({
    period: '2026-03'
  }, {
    readSelectedRange: async () => null
  });

  assert.equal(artifact.status, 'blocked');
  assert.ok(artifact.partial_reasons.includes('selected_range_unavailable'));
});
```

- [ ] **Step 2: Run the staged-skill test file and verify it fails**

Run:

```bash
node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
```

Expected: FAIL because `collect_fault_details.js` does not export these helpers yet and still only returns an empty shell.

---

### Task 2: Implement staged-skill parity helpers and a valid browser entrypoint

**Files:**
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
- Test: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js`

- [ ] **Step 1: Implement the helper exports and browser entrypoint shape needed to satisfy the red tests**

Refactor `collect_fault_details.js` so the file remains a valid browser-eval script in page context while still supporting `node:test` through an environment-safe export guard.

Required implementation pieces:

```javascript
const DETAIL_COLUMNS = [/* existing canonical columns */];
const SUMMARY_COLUMNS = [/* existing summary columns */];

function normalizeDetailRow(raw, context) {
  // map qxdbh/gssgs/sgs/gddw/gds/slsj/clzt/bdz/line/pb
  // derive sxfl1/sxfl2/sxfl3/gzsb/gzyy from the original package rules
}

function deriveSummaryRows(detailRows, context) {
  // group by gds and compute all original package counters
}

function determineArtifactStatus({ blockedReason, fatalError, partialReasons, detailRows }) {
  // blocked > error > partial > empty > ok
}

function buildFaultDetailsArtifact({
  period,
  selectedRange,
  detailRows,
  summaryRows,
  partialReasons,
  blockedReason,
  fatalError,
  downstream
}) {
  // return report-artifact with columns, sections, counts, status, partial_reasons, downstream
}

async function buildBrowserEntrypointResult(input, deps = defaultBrowserDeps()) {
  // read selected range from page
  // collect raw rows from page query
  // normalize rows
  // derive summary
  // attempt export + report log
  // return final artifact
}

if (typeof module !== 'undefined' && module.exports) {
  // Node (`node:test`): export the helpers and do not run the entrypoint.
  module.exports = {
    DETAIL_COLUMNS,
    SUMMARY_COLUMNS,
    normalizeDetailRow,
    deriveSummaryRows,
    determineArtifactStatus,
    buildFaultDetailsArtifact,
    buildBrowserEntrypointResult
  };
} else {
  // Browser eval: hand the promise back to the wrapper. A top-level `await`
  // here would be a syntax error when the file is loaded through `require`,
  // and `args` is expected to be supplied by the eval wrapper, not by Node.
  return buildBrowserEntrypointResult(args);
}
```

Rules:
- keep `DETAIL_COLUMNS` and `SUMMARY_COLUMNS` canonical and stable
- keep helper functions self-contained in this file unless a separate pure helper file becomes necessary for runtime validity
- keep the browser entrypoint compatible with current `eval` wrapper
- keep browser runtime free of unguarded Node-only assumptions
- do **not** invent a new protocol or callback surface

- [ ] **Step 2: Re-run the staged-skill test file and verify it now reaches deeper failures or passes the initial helper coverage**

Run:

```bash
node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
```

Expected: either PASS for the Task 1 cases, or fail only on the still-missing full parity/export/history specifics added in Task 3.

---

### Task 3: Add red tests for full classification parity, downstream partials, and empty-result export semantics

**Files:**
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js`
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.js`
- Read only: `D:/desk/智能体资料/大四区报告监测项/故障明细/index.html`

- [ ] **Step 1: Extend the staged-skill tests with failing parity and downstream cases**

Add focused failing tests such as:

```javascript
test('normalizeDetailRow derives gzyy from qxxcjl text heuristics', () => {
  const row = normalizeDetailRow({
    qxxcjl: '现场检查:客户表后线烧损,已恢复送电',
    ejflMc: '客户侧故障',
    sjflMc: '表后线'
  }, { companyName: '国网兰州供电公司' });

  assert.equal(row.gzsb, '表后线');
  assert.equal(row.gzyy, '表后线烧损');
});

test('buildBrowserEntrypointResult returns partial when export fails after detail collection succeeds', async () => {
  const artifact = await buildBrowserEntrypointResult({ period: '2026-03' }, {
    readSelectedRange: async () => ({ start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' }),
    queryFaultRows: async () => [{ qxdbh: 'QX-1', bxsj: '2026-03-09 08:00:00', maintGroupName: '抢修一班' }],
    readCompanyContext: () => ({ companyName: '国网兰州供电公司' }),
    exportWorkbook: async () => {
      throw new Error('export_failed');
    },
    writeReportLog: async () => ({ success: true })
  });

  assert.equal(artifact.status, 'partial');
  assert.ok(artifact.partial_reasons.includes('export_failed'));
  assert.equal(artifact.counts.detail_rows, 1);
  assert.equal(artifact.downstream.export.attempted, true);
  assert.equal(artifact.downstream.export.success, false);
});

test('buildBrowserEntrypointResult returns error when normalized detail rows cannot be produced', async () => {
  const artifact = await buildBrowserEntrypointResult({ period: '2026-03' }, {
    readSelectedRange: async () => ({ start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' }),
    queryFaultRows: async () => [{ qxdbh: '', bxsj: '' }],
    readCompanyContext: () => ({ companyName: '国网兰州供电公司' })
  });

  assert.equal(artifact.status, 'error');
  assert.ok(artifact.partial_reasons.includes('detail_normalization_failed'));
});

test('buildBrowserEntrypointResult keeps canonical rows empty for empty result and omits downstream before attempts', async () => {
  const artifact = await buildBrowserEntrypointResult({ period: '2026-03' }, {
    readSelectedRange: async () => ({ start: '2026-03-08 16:00:00', end: '2026-03-09 16:00:00' }),
    queryFaultRows: async () => [],
    readCompanyContext: () => ({ companyName: '国网兰州供电公司' })
  });

  assert.equal(artifact.status, 'empty');
  assert.deepEqual(artifact.rows, []);
  assert.equal('downstream' in artifact, false);
});
```

Also add fixture cases derived from the original package’s full classification table and summary counters so the staged skill is forced toward parity, not a subset implementation.

- [ ] **Step 2: Run the staged-skill test file and verify it fails on the new cases**

Run:

```bash
node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
```

Expected: FAIL on missing full classification parity or downstream partial/error behavior.

- [ ] **Step 3: Implement the full business logic needed to satisfy the new tests**

In `collect_fault_details.js`:
- port the original classification table and `qxxcjl` text heuristics for `sxfl1`, `sxfl2`, `sxfl3`, `gzsb`, `gzyy`
- port the original summary derivation rules and counters completely
- add required-field validation so structurally unusable normalized rows escalate to `error`
- add downstream `exportWorkbook` and `writeReportLog` stages that record `{attempted, success, path, error}`
- keep collection success distinct from downstream failures so export/logging failures become `partial`, not full failure
- keep placeholder rows, if needed for downstream empty-export payloads, downstream-only and never in canonical returned `rows`
- include both `period` and `selected_range` in the artifact
- omit `downstream` when export/report-log have not been attempted yet

- [ ] **Step 4: Re-run the staged-skill test file and verify it passes**

Run:

```bash
node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
```

Expected: PASS.

---

### Task 4: Align staged-skill metadata and reference docs with the implemented behavior

**Files:**
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.toml`
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.md`
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/references/collection-flow.md`
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/references/data-quality.md`
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/fault-details-report/scene.json`

- [ ] **Step 1: Update the staged metadata/docs to match the implemented runtime contract**

Required changes:
- `SKILL.toml`: description must say the tool collects rows, derives summary, attempts localhost export, and records report history
- `SKILL.md`: artifact example must include `selected_range`, `counts`, `status`, `partial_reasons`, and `downstream`
- `references/collection-flow.md`: sequence must explicitly include page-selected range -> raw query -> normalization -> summary -> export -> report-log
- `references/data-quality.md`: document the original classification tables, `qxxcjl` heuristics, summary rules, partial/error escalation rules, and empty-result semantics explicitly enough to match the implemented helpers
- `scene.json`: keep inputs/outputs/status semantics aligned with the richer artifact; do not add routing policy there

- [ ] **Step 2: Read the updated staged docs and verify they match the implemented JS behavior**

Read and confirm:
- descriptions no longer claim “artifact shell” behavior
- docs do not move routing ownership out of `claw-new`
- docs do not promise auto-opening/downloading behavior in this slice
- docs reflect blocked/error field-presence rules and downstream-attempt semantics

Expected: staged metadata/docs accurately reflect the implemented collector.

---

### Task 5: Add Rust red tests for artifact-status interpretation in the direct-submit runtime
|
||||
|
||||
**Files:**
|
||||
- Modify: `tests/agent_runtime_test.rs`
|
||||
- Modify: `tests/browser_script_skill_tool_test.rs`
|
||||
- Modify: `src/compat/direct_skill_runtime.rs`
|
||||
- Read only: `src/compat/browser_script_skill_tool.rs`
|
||||
|
||||
- [ ] **Step 1: Add failing direct-submit runtime tests for structured artifact statuses**
|
||||
|
||||
Extend `tests/agent_runtime_test.rs` with focused regressions that use the existing temp skill-root harness but return real `report-artifact` payloads:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn submit_task_treats_partial_report_artifact_as_success_with_warning_summary() {
|
||||
let skill_root = build_direct_runtime_skill_root();
|
||||
let runtime_context = direct_submit_runtime_context(&skill_root);
|
||||
let transport = Arc::new(MockTransport::new(vec![success_browser_response(
|
||||
1,
|
||||
serde_json::json!({
|
||||
"text": {
|
||||
"type": "report-artifact",
|
||||
"report_name": "fault-details-report",
|
||||
"period": "2026-03",
|
||||
"selected_range": { "start": "2026-03-08 16:00:00", "end": "2026-03-09 16:00:00" },
|
||||
"columns": ["qxdbh"],
|
||||
"rows": [{ "qxdbh": "QX-1" }],
|
||||
"sections": [{ "name": "summary-sheet", "columns": ["index"], "rows": [{ "index": 1 }] }],
|
||||
"counts": { "detail_rows": 1, "summary_rows": 1 },
|
||||
"status": "partial",
|
||||
"partial_reasons": ["report_log_failed"],
|
||||
"downstream": {
|
||||
"export": { "attempted": true, "success": true, "path": "http://localhost/export.xlsx" },
|
||||
"report_log": { "attempted": true, "success": false, "error": "500" }
|
||||
}
|
||||
}
|
||||
}),
|
||||
)]));
|
||||
// ... invoke handle_browser_message_with_context(...)
|
||||
// assert TaskComplete.success == true
|
||||
// assert summary contains partial/report_log_failed/detail_rows=1
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn submit_task_treats_empty_report_artifact_as_success() { /* status=empty => success=true */ }
|
||||
|
||||
#[test]
|
||||
fn submit_task_treats_blocked_report_artifact_as_failure() { /* status=blocked => success=false */ }
|
||||
|
||||
#[test]
|
||||
fn submit_task_treats_error_report_artifact_as_failure() { /* status=error => success=false */ }
|
||||
```
|
||||
|
||||
Also add one focused helper regression to `tests/browser_script_skill_tool_test.rs` that proves the browser-script helper can return a structured object payload used by the fault-details path without flattening required fields away.
|
||||
|
||||
Suggested test name:
|
||||
|
||||
```rust
|
||||
#[tokio::test]
|
||||
async fn execute_browser_script_tool_preserves_structured_report_artifact_payload() { /* ... */ }
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Run the focused Rust tests and verify they fail**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test agent_runtime_test submit_task_treats_partial_report_artifact_as_success_with_warning_summary -- --nocapture
|
||||
cargo test --test browser_script_skill_tool_test execute_browser_script_tool_preserves_structured_report_artifact_payload -- --nocapture
|
||||
```
|
||||
|
||||
Expected: the new `agent_runtime_test` case fails because `execute_direct_submit_skill` still returns raw JSON text and `src/agent/mod.rs` still marks all direct-submit results as success when no Rust-side interpretation exists.
|
||||
|
||||
---
|
||||
|
||||
### Task 6: Implement narrow Rust artifact interpretation without moving business rules into Rust
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/compat/direct_skill_runtime.rs`
|
||||
- Modify: `tests/agent_runtime_test.rs`
|
||||
- Modify: `tests/browser_script_skill_tool_test.rs`
|
||||
|
||||
- [ ] **Step 1: Implement a narrow structured-artifact interpreter in `src/compat/direct_skill_runtime.rs`**
|
||||
|
||||
Add a small internal result type and parser, for example:
|
||||
|
||||
```rust
|
||||
struct DirectSubmitOutcome {
|
||||
success: bool,
|
||||
summary: String,
|
||||
}
|
||||
|
||||
fn interpret_direct_submit_output(output: &str) -> DirectSubmitOutcome {
|
||||
// parse JSON if possible
|
||||
// if type == "report-artifact", read status/counts/partial_reasons/downstream
|
||||
// map ok/partial/empty => success=true
|
||||
// map blocked/error => success=false
|
||||
// build concise summary with report_name, period, detail_rows, summary_rows, status, partial reasons
|
||||
// fall back to raw output text when payload is not a recognized artifact
|
||||
}
|
||||
```
|
||||
|
||||
Then change the public entrypoint shape from `Result<String, PipeError>` to a narrow result carrying `success` and `summary`, or add a second helper that `src/agent/mod.rs` can use without changing routing ownership.
|
||||
|
||||
Rules:
|
||||
- do **not** reimplement fault normalization/classification/summary in Rust
|
||||
- do **not** add fault-specific branching in `src/agent/mod.rs`
|
||||
- keep unrecognized non-artifact outputs working as before
|
||||
- keep explicit `YYYY-MM` derivation and configured `skill.tool` resolution unchanged
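
The status mapping and summary shape described in this step could be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: `ArtifactStatus`, `parse`, `is_success`, `as_str`, and `build_summary` are hypothetical names, the summary format is assumed, and JSON parsing is left out so the block depends only on the standard library.

```rust
/// The five artifact statuses named in the plan. The enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArtifactStatus {
    Ok,
    Partial,
    Empty,
    Blocked,
    Error,
}

impl ArtifactStatus {
    /// Returns `None` for unknown strings so the caller can fall back to raw output.
    fn parse(raw: &str) -> Option<Self> {
        match raw {
            "ok" => Some(Self::Ok),
            "partial" => Some(Self::Partial),
            "empty" => Some(Self::Empty),
            "blocked" => Some(Self::Blocked),
            "error" => Some(Self::Error),
            _ => None,
        }
    }

    /// ok / partial / empty map to success; blocked / error map to failure.
    fn is_success(self) -> bool {
        matches!(self, Self::Ok | Self::Partial | Self::Empty)
    }

    fn as_str(self) -> &'static str {
        match self {
            Self::Ok => "ok",
            Self::Partial => "partial",
            Self::Empty => "empty",
            Self::Blocked => "blocked",
            Self::Error => "error",
        }
    }
}

/// Builds a one-line summary. The exact layout is an assumption for illustration.
fn build_summary(
    report_name: &str,
    period: &str,
    detail_rows: u64,
    summary_rows: u64,
    status: ArtifactStatus,
    partial_reasons: &[&str],
) -> String {
    let mut summary = format!(
        "{report_name} {period}: status={} detail_rows={detail_rows} summary_rows={summary_rows}",
        status.as_str()
    );
    if !partial_reasons.is_empty() {
        summary.push_str(" partial_reasons=");
        summary.push_str(&partial_reasons.join(","));
    }
    summary
}
```

Keeping the mapping in one small type means `src/agent/mod.rs` only needs the resulting `success` flag and summary string, which is consistent with the rule against fault-specific branching there.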
|
||||
|
||||
- [ ] **Step 2: Update the submit-path caller to use the interpreted success flag**
|
||||
|
||||
Adjust the direct-submit branch so `TaskComplete.success` comes from the artifact interpretation instead of blindly treating every `Ok(summary)` as success.
|
||||
|
||||
Implementation target:
|
||||
- keep the direct path in `src/agent/mod.rs`
|
||||
- keep error handling narrow
|
||||
- if needed, return a dedicated direct-submit outcome from `execute_direct_submit_skill`
|
||||
|
||||
- [ ] **Step 3: Re-run the focused Rust tests and verify they pass**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test agent_runtime_test submit_task_treats_partial_report_artifact_as_success_with_warning_summary -- --nocapture
|
||||
cargo test --test agent_runtime_test submit_task_treats_empty_report_artifact_as_success -- --nocapture
|
||||
cargo test --test agent_runtime_test submit_task_treats_blocked_report_artifact_as_failure -- --nocapture
|
||||
cargo test --test agent_runtime_test submit_task_treats_error_report_artifact_as_failure -- --nocapture
|
||||
cargo test --test browser_script_skill_tool_test execute_browser_script_tool_preserves_structured_report_artifact_payload -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
---
|
||||
|
||||
### Task 7: Run the full verification sweep for the staged skill and direct runtime
|
||||
|
||||
**Files:**
|
||||
- Verify only
|
||||
|
||||
- [ ] **Step 1: Run the staged-skill deterministic test file**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 2: Run the relevant Rust regression suites**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_script_skill_tool_test -- --nocapture
|
||||
cargo test --test agent_runtime_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 3: Run the broader compatibility coverage and build**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test compat_runtime_test -- --nocapture
|
||||
cargo test --test compat_config_test -- --nocapture
|
||||
cargo build --bin sgclaw
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 4: Manually verify the requirements against the approved spec**
|
||||
|
||||
Checklist:
|
||||
- staged skill now reads page-selected range instead of inventing a month window after entry
|
||||
- staged skill returns canonical detail rows and summary rows
|
||||
- staged skill ports the original classification table, `qxxcjl` heuristics, and summary counters with parity coverage
|
||||
- staged skill records downstream export/report-log outcome
|
||||
- staged skill distinguishes `ok` / `partial` / `empty` / `blocked` / `error`
|
||||
- `blocked` / `error` artifacts keep the required top-level fields, and preserve known `selected_range` / `counts` when failure happens late enough
|
||||
- `downstream` is omitted when export/report-log were not attempted and included with attempted/success flags once they were attempted
|
||||
- empty-result canonical `rows` stay empty even if downstream export uses a placeholder transport row
|
||||
- `claw-new` maps `ok` / `partial` / `empty` to success and `blocked` / `error` to failure
|
||||
- no new routing metadata was added to `SKILL.toml` or `scene.json`
|
||||
- no new browser protocol or opener/UI behavior was introduced
|
||||
|
||||
Expected: all checklist items satisfied before calling the work complete.
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
### Staged skill behavior
|
||||
|
||||
```bash
|
||||
node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/scripts/collect_fault_details.test.js"
|
||||
```
|
||||
|
||||
Expected: deterministic fixture coverage passes for normalization, full classification parity, summary derivation, artifact shape, empty semantics, and downstream partial semantics.
|
||||
|
||||
### Direct-submit runtime mapping
|
||||
|
||||
```bash
|
||||
cargo test --test agent_runtime_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected:
|
||||
- valid artifact `ok` / `partial` / `empty` completes successfully
|
||||
- valid artifact `blocked` / `error` completes as failure
|
||||
- existing invalid config regression still passes
|
||||
- existing direct-submit happy path still passes
|
||||
|
||||
### Browser-script helper safety
|
||||
|
||||
```bash
|
||||
cargo test --test browser_script_skill_tool_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: current browser-script execution semantics remain intact while returning structured artifact payloads.
|
||||
|
||||
### Compatibility/build
|
||||
|
||||
```bash
|
||||
cargo test --test compat_runtime_test -- --nocapture
|
||||
cargo test --test compat_config_test -- --nocapture
|
||||
cargo build --bin sgclaw
|
||||
```
|
||||
|
||||
Expected: no regressions in compat execution/config loading; main binary builds cleanly.
|
||||
|
||||
---
|
||||
|
||||
## Notes For The Engineer
|
||||
|
||||
- The paired spec is `docs/superpowers/specs/2026-04-10-fault-details-full-skill-alignment-design.md`.
|
||||
- Keep all fault business transforms in `skill_staging`, not in Rust.
|
||||
- Keep direct routing config-owned via `skillsDir` + `directSubmitSkill`.
|
||||
- Do **not** broaden this slice into LLM routing, generic dispatch policy, new browser opcodes, or export auto-open behavior.
|
||||
- If the original package reveals extra classification rules that are needed for parity, add them only inside `collect_fault_details.js` and its staged references/tests, not in `claw-new`.
|
||||
551
docs/superpowers/plans/2026-04-11-main-into-ws-merge-v2-plan.md
Normal file
@@ -0,0 +1,551 @@
|
||||
# Main → WS Merge v2 Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
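**Goal:** Merge the latest `origin/main` into `feature/claw-ws` so that the `ws` branch ends up keeping both communication paths (**pipe + ws**), keeps the Zhihu behavior, and replaces the old duplicate fault-details implementation (already deleted on `ws` by the cleanup) with the official implementation from `main`.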
|
||||
|
||||
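**Architecture:** This merge does not mean "keep the cleanup state with no fault-details forever"; it means "first delete the old duplicate implementation on ws, then absorb the official implementation from main". The conflict-resolution priority is: **preserve pipe first, then ws, then Zhihu, while refusing to let the old duplicate scene/fault-details implementation from ws flow back in**. The whole process uses `git merge --no-commit --no-ff origin/main`; once conflicts are resolved, run only focused verification and stop in the uncommitted state.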
|
||||
|
||||
**Tech Stack:** Git, Rust 2021, Cargo test, sgClaw pipe transport, ws transport, compat/runtime/orchestration stack, Zhihu direct workflow tests.
|
||||
|
||||
---
|
||||
|
||||
## Preconditions
|
||||
|
||||
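- The current branch must be `feature/claw-ws`
- `2026-04-09-ws-branch-scene-cleanup-plan.md` has been completed
- No merge is currently in progress
- There are no uncommitted tracked changes
- This run does **not create a worktree**; execute in the current repository
- The end point of this run is: **merged, verified, uncommitted**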
|
||||
|
||||
---
|
||||
|
||||
## Final Merge Principles
|
||||
|
||||
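### 1) `main` is the pipe mainline
After the merge, the existing pipe communication on `main` must not be broken.

### 2) The `ws` branch must end up keeping both pipe + ws
After the merge, the `ws` branch must not lose the websocket path, and must not be left with pipe only.

### 3) Both sides have Zhihu
The merge must not break existing Zhihu behavior, especially the ws→Zhihu keep-path.

### 4) fault-details follows the official `main` implementation

- The old duplicate implementation on `ws`: **must not flow back**
- The official implementation on `main`: **should be merged in**
- The end result is not "no fault-details" but "no old ws fault-details; only the official main version remains"

### 5) Do not bring back old scene plumbing
The following old surfaces must not be kept in the final result:

- ws's own old scene registry / old scene plumbing
- old duplicate routes/contracts already deleted by the ws cleanup
- leftover logic that exists only to serve the old `skill_staging` scene assembly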
|
||||
|
||||
---
|
||||
|
||||
## File Map
|
||||
|
||||
### A. Shared / high-risk files to watch closely during the merge
|
||||
- `Cargo.toml`
|
||||
- `Cargo.lock`
|
||||
- `src/agent/mod.rs`
|
||||
- `src/agent/task_runner.rs`
|
||||
- `src/config/settings.rs`
|
||||
- `src/compat/config_adapter.rs`
|
||||
- `src/compat/runtime.rs`
|
||||
- `src/compat/orchestration.rs`
|
||||
- `src/compat/workflow_executor.rs`
|
||||
- `src/compat/browser_script_skill_tool.rs`
|
||||
- `src/compat/direct_skill_runtime.rs`
|
||||
- `src/compat/openxml_office_tool.rs`
|
||||
|
||||
### B. pipe / ws / Zhihu protection surface
|
||||
- `src/compat/runtime.rs`
|
||||
- `src/compat/orchestration.rs`
|
||||
- `src/compat/workflow_executor.rs`
|
||||
- `src/agent/task_runner.rs`
|
||||
- `tests/agent_runtime_test.rs`
|
||||
- `tests/browser_ws_backend_test.rs`
|
||||
- `tests/service_ws_session_test.rs`
|
||||
- `tests/task_runner_test.rs`
|
||||
|
||||
### C. Files where the old implementation must still be kept from flowing back after the cleanup
|
||||
- `src/runtime/mod.rs`
|
||||
- `src/runtime/engine.rs`
|
||||
- `src/config/settings.rs`
|
||||
- `src/compat/config_adapter.rs`
|
||||
- `tests/compat_runtime_test.rs`
|
||||
- `tests/runtime_profile_test.rs`
|
||||
- `tests/compat_config_test.rs`
|
||||
|
||||
### D. Test surfaces that may need updating together with main's official fault-details
|
||||
- `tests/compat_runtime_test.rs`
|
||||
- `tests/compat_config_test.rs`
|
||||
- `tests/browser_script_skill_tool_test.rs`
|
||||
- `tests/compat_openxml_office_tool_test.rs`
|
||||
|
||||
---
|
||||
|
||||
## Conflict Resolution Rule Table
|
||||
|
||||
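| Category | Final retention rule |
|---|---|
| pipe main path | **Prefer the working main version**; ws must not break it |
| ws path | **ws capability must be kept**; it must not be lost while absorbing main |
| Zhihu | Related capability on both sides must not break; at minimum keep the existing keep-path |
| fault-details | **Keep main's official implementation**; do not keep ws's old duplicate implementation |
| old scene/95598 cleanup leftovers | Must not flow back in the form of ws's old duplicate implementation |
| `skillsDir` / config | Follow what the final product needs; if main's official implementation does not require the old array-style/scene expansion, do not bring it back |
| Temporary merge patches | Never keep |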
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Confirm Merge Preconditions And Diff Surface
|
||||
|
||||
**Files:**
|
||||
- No code changes expected
|
||||
- Observe repo state and branch diff only
|
||||
|
||||
- [ ] **Step 1: Confirm current branch**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git rev-parse --abbrev-ref HEAD
|
||||
```
|
||||
|
||||
Expected:
|
||||
```text
|
||||
feature/claw-ws
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Confirm no merge is in progress**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git rev-parse -q --verify MERGE_HEAD
|
||||
```
|
||||
|
||||
Expected: exit code `1`.
|
||||
|
||||
- [ ] **Step 3: Confirm no tracked local changes**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git diff --name-only && printf '\n---STAGED---\n' && git diff --cached --name-only
|
||||
```
|
||||
|
||||
Expected:
|
||||
```text
|
||||
|
||||
---STAGED---
|
||||
```
|
||||
|
||||
- [ ] **Step 4: List current untracked files**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git status --short
|
||||
```
|
||||
|
||||
Expected: only known local untracked items, or a clearly understood list.
|
||||
|
||||
- [ ] **Step 5: Update `origin/main`**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git fetch origin main
|
||||
```
|
||||
|
||||
- [ ] **Step 6: Show ws vs main diff surface before merge**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git diff --name-status HEAD...origin/main
|
||||
```
|
||||
|
||||
Expected: clear file list to compare likely merge surface.
|
||||
|
||||
- [ ] **Step 7: Stop if preconditions fail**
|
||||
|
||||
Stop if:
|
||||
- branch is wrong
|
||||
- merge is in progress
|
||||
- tracked changes exist
|
||||
- untracked file collision with `origin/main` is found and unresolved
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Start The Merge Without Committing
|
||||
|
||||
**Files:**
|
||||
- Merge index / working tree only
|
||||
|
||||
- [ ] **Step 1: Start no-commit merge**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git merge --no-commit --no-ff origin/main
|
||||
```
|
||||
|
||||
Expected:
|
||||
- either auto-merge pauses before commit
|
||||
- or Git reports conflicts
|
||||
|
||||
- [ ] **Step 2: Capture merge surface immediately**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git status --short
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Separate results into three buckets**
|
||||
Create a working list of conflicted files under:
|
||||
1. pipe-critical
|
||||
2. ws/Zhihu-critical
|
||||
3. shared infra / tests
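
One way to make the three buckets mechanical is a small path classifier. The sketch below is only an illustration: the `Bucket` enum and `bucket_for` function are hypothetical, and the path assignments are an assumption taken from the file lists of Tasks 3A and 3B in this plan, with everything else falling into the shared infra / tests bucket.

```rust
/// Hypothetical bucket labels mirroring the three categories in this step.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Bucket {
    PipeCritical,
    WsZhihuCritical,
    SharedInfraOrTests,
}

/// Classifies a conflicted path. The assignments follow the Task 3A / 3B file
/// lists; any other path is treated as shared infra or tests.
fn bucket_for(path: &str) -> Bucket {
    match path {
        "src/compat/runtime.rs"
        | "src/agent/task_runner.rs"
        | "src/agent/mod.rs"
        | "src/config/settings.rs"
        | "src/compat/config_adapter.rs" => Bucket::PipeCritical,
        "src/compat/workflow_executor.rs" | "src/compat/orchestration.rs" => {
            Bucket::WsZhihuCritical
        }
        _ => Bucket::SharedInfraOrTests,
    }
}
```

A file that serves more than one role (for example `src/compat/runtime.rs`, which also appears in the pipe / ws / Zhihu protection surface) still needs the per-hunk questions from Task 3; the classifier only decides which sub-task picks it up first.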
|
||||
|
||||
- [ ] **Step 4: If no conflicts, proceed directly to Task 4 verification**
|
||||
|
||||
- [ ] **Step 5: If conflicts exist, proceed to Task 3**
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Resolve Conflicts By System Role, Not By Branch Bias
|
||||
|
||||
**Files:**
|
||||
- Only files reported by Git as conflicted
|
||||
|
||||
#### Global conflict policy
|
||||
For every conflicted hunk, answer these four questions in order:
|
||||
|
||||
1. Does this hunk affect **pipe** correctness?
|
||||
2. Does this hunk affect **ws** correctness?
|
||||
3. Does this hunk affect **Zhihu** correctness?
|
||||
4. Is this hunk part of **ws old duplicate fault-details/scene logic** or **main official implementation**?
|
||||
|
||||
Then apply the rule:
|
||||
- **pipe cannot break**
|
||||
- **ws cannot break**
|
||||
- **Zhihu cannot break**
|
||||
- **ws old duplicate fault-details must stay deleted**
|
||||
- **main official fault-details should come in**
|
||||
|
||||
---
|
||||
|
||||
#### Task 3A: Resolve pipe-critical shared runtime files
|
||||
|
||||
**Files:**
|
||||
- `src/compat/runtime.rs`
|
||||
- `src/agent/task_runner.rs`
|
||||
- `src/agent/mod.rs`
|
||||
- `src/config/settings.rs`
|
||||
- `src/compat/config_adapter.rs`
|
||||
|
||||
- [ ] **Step 1: For each conflict, keep the side that preserves main’s pipe behavior**
|
||||
|
||||
- [ ] **Step 2: Reject ws-only duplicate business logic that main already owns**
|
||||
|
||||
- [ ] **Step 3: Keep ws support if the file also serves ws path**
|
||||
This is additive preservation, not “main wins everything”.
|
||||
|
||||
- [ ] **Step 4: Verify each resolved file has no conflict markers**
|
||||
|
||||
Run per file:
|
||||
```bash
|
||||
git diff --check -- <path>
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
#### Task 3B: Resolve ws / Zhihu-critical routing files
|
||||
|
||||
**Files:**
|
||||
- `src/compat/workflow_executor.rs`
|
||||
- `src/compat/orchestration.rs`
|
||||
|
||||
- [ ] **Step 1: Bring in main’s official fault-details path if it lives here**
|
||||
|
||||
- [ ] **Step 2: Do not reintroduce ws’s old duplicate fault-details path**
|
||||
|
||||
- [ ] **Step 3: Preserve ws submit/browser websocket path**
|
||||
|
||||
- [ ] **Step 4: Preserve Zhihu routing path**
|
||||
|
||||
- [ ] **Step 5: Verify each resolved file has no conflict markers**
|
||||
|
||||
Run per file:
|
||||
```bash
|
||||
git diff --check -- <path>
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
#### Task 3C: Resolve shared infra files minimally
|
||||
|
||||
**Files:**
|
||||
- `Cargo.toml`
|
||||
- `Cargo.lock`
|
||||
- `src/compat/browser_script_skill_tool.rs`
|
||||
- `src/compat/direct_skill_runtime.rs`
|
||||
- `src/compat/openxml_office_tool.rs`
|
||||
|
||||
- [ ] **Step 1: Keep only the dependency/code shape needed by the merged result**
|
||||
|
||||
- [ ] **Step 2: Do not keep lines from prior failed merge attempts**
|
||||
|
||||
- [ ] **Step 3: Accept main fixes unless they break pipe/ws/Zhihu behavior**
|
||||
|
||||
- [ ] **Step 4: Verify each resolved file has no conflict markers**
|
||||
|
||||
Run per file:
|
||||
```bash
|
||||
git diff --check -- <path>
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
#### Task 3D: Resolve tests to reflect final intended product
|
||||
|
||||
**Files:**
|
||||
- `tests/compat_runtime_test.rs`
|
||||
- `tests/runtime_profile_test.rs`
|
||||
- `tests/compat_config_test.rs`
|
||||
- `tests/agent_runtime_test.rs`
|
||||
- `tests/browser_script_skill_tool_test.rs`
|
||||
- `tests/compat_openxml_office_tool_test.rs`
|
||||
|
||||
- [ ] **Step 1: Keep tests proving pipe path still works**
|
||||
|
||||
- [ ] **Step 2: Keep tests proving ws path still works**
|
||||
|
||||
- [ ] **Step 3: Keep Zhihu keep-path regression**
|
||||
|
||||
- [ ] **Step 4: Replace cleanup-only “fault-details absent” assertions if final intended state is now “fault-details present via main official implementation”**
|
||||
|
||||
- [ ] **Step 5: Do not keep assertions that only prove ws’s old duplicate implementation is absent if they now contradict the intended merged product**
|
||||
|
||||
- [ ] **Step 6: Verify each resolved test file has no conflict markers**
|
||||
|
||||
Run per file:
|
||||
```bash
|
||||
git diff --check -- <path>
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
#### Task 3E: Confirm merge is fully resolved
|
||||
|
||||
**Files:**
|
||||
- No code changes expected
|
||||
|
||||
- [ ] **Step 1: Confirm no unmerged entries remain**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git diff --name-only --diff-filter=U
|
||||
```
|
||||
|
||||
Expected: no output.
|
||||
|
||||
- [ ] **Step 2: Show final resolved file list**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git diff --cached --name-only
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 4: Verify Final Product Behavior, Not Cleanup Intermediate State
|
||||
|
||||
**Files:**
|
||||
- Test: `tests/agent_runtime_test.rs`
|
||||
- Test: `tests/browser_ws_backend_test.rs`
|
||||
- Test: `tests/service_ws_session_test.rs`
|
||||
- Test: `tests/task_runner_test.rs`
|
||||
- Test: `tests/compat_runtime_test.rs`
|
||||
- Test: `tests/runtime_profile_test.rs`
|
||||
- Test: `tests/compat_config_test.rs`
|
||||
- Conditional: `tests/browser_script_skill_tool_test.rs`
|
||||
- Conditional: `tests/compat_openxml_office_tool_test.rs`
|
||||
|
||||
#### Verification goals
|
||||
This task must prove all four:
|
||||
|
||||
1. **pipe path still works**
|
||||
2. **ws path still works**
|
||||
3. **Zhihu still works**
|
||||
4. **final fault-details implementation is the main version, not ws’s old duplicate**
|
||||
|
||||
---
|
||||
|
||||
#### Task 4A: Verify pipe-related behavior
|
||||
|
||||
- [ ] **Step 1: Run task runner coverage**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test task_runner_test -- --nocapture
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Run compat runtime suite relevant to main path**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_runtime_test -- --nocapture
|
||||
```
|
||||
|
||||
- [ ] **Step 3: If pipe-specific tests fail, stop and fix merge resolution before continuing**
|
||||
|
||||
---
|
||||
|
||||
#### Task 4B: Verify ws-related behavior
|
||||
|
||||
- [ ] **Step 1: Run browser websocket backend suite**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test browser_ws_backend_test -- --nocapture
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Run service websocket session suite**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test service_ws_session_test -- --nocapture
|
||||
```
|
||||
|
||||
- [ ] **Step 3: If ws-specific tests fail, stop and fix merge resolution before continuing**
|
||||
|
||||
---
|
||||
|
||||
#### Task 4C: Verify Zhihu behavior
|
||||
|
||||
- [ ] **Step 1: Re-run ws→Zhihu keep-path regression**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test agent_runtime_test production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap -- --nocapture
|
||||
```
|
||||
|
||||
Expected:
|
||||
```text
|
||||
1 passed; 0 failed
|
||||
```
|
||||
|
||||
- [ ] **Step 2: If additional Zhihu tests were touched by conflicts, run the smallest affected test target**
|
||||
|
||||
Run as needed:
|
||||
```bash
|
||||
cargo test --test agent_runtime_test -- --nocapture
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
#### Task 4D: Verify config/runtime contracts
|
||||
|
||||
- [ ] **Step 1: Run runtime profile suite**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test runtime_profile_test -- --nocapture
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Run compat config suite**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_config_test -- --nocapture
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Ensure contracts now reflect final merged product, not the cleanup-only intermediate**
|
||||
|
||||
---
|
||||
|
||||
#### Task 4E: Verify shared infra if touched
|
||||
|
||||
- [ ] **Step 1: If browser-script tool files were touched**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test browser_script_skill_tool_test -- --nocapture
|
||||
```
|
||||
|
||||
- [ ] **Step 2: If openxml files were touched**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test compat_openxml_office_tool_test -- --nocapture
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
#### Task 4F: Compile guard
|
||||
|
||||
- [ ] **Step 1: Run compile-only full test build**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --no-run
|
||||
```
|
||||
|
||||
Expected: exit code `0`.
|
||||
|
||||
---
|
||||
|
||||
### Task 5: Confirm The Merge Outcome Matches The Principle
|
||||
|
||||
**Files:**
|
||||
- No code changes expected
|
||||
|
||||
- [ ] **Step 1: Show final status**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git status --short
|
||||
```
|
||||
|
||||
Expected:
|
||||
- no `UU` / `AA` / `DD`
|
||||
- merged, validated, uncommitted state only
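
The "no unmerged entries" expectation can also be checked programmatically against the `git status --short` output. The following is a minimal sketch, assuming the output has already been captured as a string; `unmerged_paths` is a hypothetical helper, and the set of two-letter codes is the unmerged set from the `git status` short format (`DD`, `AU`, `UD`, `UA`, `DU`, `AA`, `UU`), which is wider than the three codes named above.

```rust
/// Returns the paths that `git status --short` reports as unmerged.
/// Each short-format line is "XY PATH"; XY is the two-character status code.
fn unmerged_paths(status_short: &str) -> Vec<&str> {
    status_short
        .lines()
        .filter_map(|line| {
            // Lines shorter than "XY P" cannot carry a path and are skipped.
            let code = line.get(..2)?;
            let path = line.get(3..)?;
            let unmerged = matches!(code, "DD" | "AU" | "UD" | "UA" | "DU" | "AA" | "UU");
            if unmerged {
                Some(path.trim())
            } else {
                None
            }
        })
        .collect()
}
```

An empty result corresponds to the expected state of this step; any returned path means Task 3 is not finished.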
|
||||
|
||||
- [ ] **Step 2: Show final staged summary**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git diff --cached --stat
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Report the four required facts with command-backed evidence**
|
||||
Only if verified:
|
||||
1. pipe is not broken
2. ws is not broken
3. Zhihu is not broken
4. The final fault-details comes from main's official implementation, not from ws's old duplicate implementation
|
||||
|
||||
- [ ] **Step 4: Stop here**
|
||||
Do **not** run:
|
||||
```bash
|
||||
git commit
|
||||
git push
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Stop Conditions
|
||||
|
||||
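Stop immediately if any of the following occurs; do not expand the scope on your own:

- The official fault-details implementation on `origin/main` depends on contracts that the cleanup deleted, and fixing this exceeds the scope of a simple merge
- pipe and ws both depend on the same shared code, but their requirements conflict structurally
- The Zhihu keep-path fails
- `cargo test --no-run` fails and the problem lies outside this merge surface
- The pipe/ws coexistence approach would need to be redesigned rather than simply merged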
|
||||
|
||||
---
|
||||
|
||||
## One-line Execution Rule
|
||||
|
||||
**The final standard for this merge is not "keep ws without fault-details" but "preserve pipe, preserve ws, preserve Zhihu, and let main's official fault-details replace ws's old duplicate implementation".**
|
||||
@@ -0,0 +1,808 @@
|
||||
# TQ Lineloss Deterministic Skill Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Add a staged `tq-lineloss-report.collect_lineloss` browser-script skill plus a `。。。` deterministic submit path in `claw-new` that extracts and normalizes company/month/week parameters without LLM, executes through the existing pipe browser-script seam, and does not regress Zhihu hotlist behavior.
|
||||
|
||||
**Architecture:** Keep the new behavior behind a narrow deterministic branch that activates only when the raw instruction ends with the exact suffix `。。。`. `claw-new` owns deterministic trigger detection, explicit scene matching, semantic extraction, canonical normalization, prompt-or-execute control flow, and artifact interpretation; the staged skill owns page inspection, source/API collection, row normalization, export/report-log behavior, and final artifact generation. Reuse the existing `browser_script` execution seam already used by the direct browser path so the backend can later swap from pipe to ws without changing the deterministic contract.
|
||||
|
||||
**Tech Stack:** Rust 2021, Cargo tests, existing `BrowserPipeTool` / `execute_browser_script_tool` seam, staged skill packaging under `claw/claw/skills/skill_staging`, browser-side JavaScript, deterministic string parsing and normalization.
|
||||
|
||||
---
|
||||
|
||||
## Execution Context
|
||||
|
||||
- Follow @superpowers:test-driven-development for every behavior change.
|
||||
- Follow @superpowers:verification-before-completion before claiming each task is done.
|
||||
- Do **not** create a git worktree unless the user explicitly asks.
|
||||
- Keep the new behavior as a narrow branch; do **not** redesign the whole runtime into a general registry engine in this slice.
|
||||
- Preserve `src/runtime/engine.rs:147-159` and `src/runtime/engine.rs:265-286` behavior unless a failing regression test proves a change is required.
|
||||
- Do **not** add ws runtime requirements on `main`; keep ws-readiness isolated to backend-neutral contracts only.
|
||||
- Never fall back to page defaults for missing company, mode, or period in deterministic mode.
|
||||
- If a deterministic request does not match the lineloss whitelist scene, return a deterministic mismatch prompt instead of falling through to ordinary orchestration.
|
||||
|
||||
## File Map
|
||||
|
||||
### New or modified files in `claw-new`
|
||||
|
||||
- Create: `src/compat/deterministic_submit.rs`
|
||||
- suffix detection, deterministic scene match, prompt-or-execute decision
|
||||
- Create: `src/compat/tq_lineloss/mod.rs`
|
||||
- public normalization and artifact helpers
|
||||
- Create: `src/compat/tq_lineloss/contracts.rs`
|
||||
- canonical request/result data structures and status semantics
|
||||
- Create: `src/compat/tq_lineloss/org_resolver.rs`
|
||||
- alias generation, canonical label/code resolution, ambiguity handling
|
||||
- Create: `src/compat/tq_lineloss/period_resolver.rs`
|
||||
- month/week extraction, contradiction detection, canonical payload building
|
||||
- Create: `src/compat/tq_lineloss/org_units.rs`
|
||||
- checked-in canonical unit dictionary derived from the real source tree data
|
||||
- Modify: `src/compat/mod.rs`
|
||||
- export the deterministic and lineloss modules
|
||||
- Modify: `src/agent/mod.rs`
|
||||
- insert the deterministic branch before ordinary LLM interpretation, but only when the exact suffix is present
|
||||
- Modify only if code duplication would otherwise occur: `src/compat/direct_skill_runtime.rs`
|
||||
- extract narrow shared browser-script execution helpers without changing current configured direct-submit behavior
|
||||
- Read but avoid changing unless tests force it: `src/runtime/engine.rs`
|
||||
- existing Zhihu hotlist routing/prompt logic must remain intact
|
||||
|
||||
### New staged skill package in `claw`
|
||||
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/SKILL.md`
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/SKILL.toml`
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/references/collection-flow.md`
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/references/data-quality.md`
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/assets/scene-snapshot/index.html`
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js`
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js`
|
||||
- Create if staging conventions require it: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/tq-lineloss-report/scene.json`
|
||||
|
||||
### Tests
|
||||
|
||||
- Create: `tests/deterministic_submit_test.rs`
|
||||
- Modify: `tests/compat_runtime_test.rs`
|
||||
- Modify only if end-to-end submit coverage requires it: `tests/runtime_task_flow_test.rs`
|
||||
|
||||
---
|
||||
|
||||
## Locked contracts
|
||||
|
||||
### Deterministic trigger contract
|
||||
|
||||
- Trigger only when the raw instruction ends with the exact suffix `。。。`.
|
||||
- No suffix: current behavior unchanged.
|
||||
- Suffix + unsupported scene: explicit deterministic mismatch prompt.
|
||||
- Suffix is not permission for arbitrary browser actions; only fixed deterministic scenes are allowed.
|
||||
- Negative cases must stay non-deterministic or mismatched exactly as designed:
|
||||
- ASCII `...` is not the trigger
|
||||
- `。。。。` is not the trigger
|
||||
- `。。。` appearing in the middle of the instruction is not the trigger
|
||||
- any trailing whitespace after `。。。` is not the trigger in this slice
|
||||
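The trigger rules above can be sketched as a single pure function. This is a minimal illustration, not the project's actual code: the names `DETERMINISTIC_SUFFIX` and `has_deterministic_suffix` are hypothetical, and the sketch assumes the raw instruction is passed in untrimmed.

```rust
/// The full-width suffix that switches a request into deterministic mode.
const DETERMINISTIC_SUFFIX: &str = "。。。";

/// Returns true only when `raw` ends with exactly three full-width periods.
/// Nothing is trimmed here, so trailing whitespace disqualifies the input.
pub fn has_deterministic_suffix(raw: &str) -> bool {
    match raw.strip_suffix(DETERMINISTIC_SUFFIX) {
        // A fourth `。` directly before the suffix means `。。。。`, which is not the trigger.
        Some(head) => !head.ends_with('。'),
        // ASCII `...`, a mid-instruction `。。。`, or no suffix at all.
        None => false,
    }
}
```

Keeping the check free of trimming or normalization is what makes the four negative cases fall out without special handling.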
|
||||
### Canonical org contract
|
||||
|
||||
The resolver must output both display and backend values:
|
||||
|
||||
```rust
pub struct ResolvedOrg {
    pub label: String,
    pub code: String,
}
```
|
||||
|
||||
Required supported inputs include:
|
||||
- `兰州公司`
|
||||
- `天水公司`
|
||||
- `国网兰州供电公司`
|
||||
- `城关供电分公司`
|
||||
- `榆中县供电公司`
|
||||
- normalized shorthand such as `榆中县公司`
|
||||
|
||||
Rules:
|
||||
- derive aliases from the real unit tree data
|
||||
- require uniqueness before execution
|
||||
- ambiguous aliases prompt and stop
|
||||
- a missing company prompts and stops
|
||||
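One way to read the org rules above is sketched here. The alias derivation (dropping a leading `国网` and the infix `供电`) is an assumption chosen to cover the listed inputs; the real rules must come from the unit tree data. `OrgResolution`, `aliases_for`, and `resolve_org` are hypothetical names, and `input` is assumed to be the already-extracted org phrase.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ResolvedOrg {
    pub label: String,
    pub code: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum OrgResolution {
    Resolved(ResolvedOrg),
    /// More than one unit shares the alias; carries the candidate labels.
    Ambiguous(Vec<String>),
    /// Empty input or no matching unit (this sketch does not distinguish them).
    Missing,
}

/// Assumed alias rule: formal name, name without `国网`, and name without `供电`.
fn aliases_for(label: &str) -> Vec<String> {
    let short = label.strip_prefix("国网").unwrap_or(label);
    vec![label.to_string(), short.to_string(), short.replace("供电", "")]
}

pub fn resolve_org(input: &str, units: &[ResolvedOrg]) -> OrgResolution {
    let wanted = input.trim();
    if wanted.is_empty() {
        return OrgResolution::Missing;
    }
    // Collect every unit whose alias set contains the requested text.
    let hits: Vec<&ResolvedOrg> = units
        .iter()
        .filter(|unit| aliases_for(&unit.label).iter().any(|alias| alias.as_str() == wanted))
        .collect();
    match hits.as_slice() {
        [] => OrgResolution::Missing,
        // Exactly one hit is the only case that may proceed to execution.
        [only] => OrgResolution::Resolved((*only).clone()),
        many => OrgResolution::Ambiguous(many.iter().map(|unit| unit.label.clone()).collect()),
    }
}
```

Returning the candidate labels in the ambiguous case lets the prompt name the options instead of only asking for a fuller name.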
|
||||
### Canonical period contract
|
||||
|
||||
```rust
pub enum PeriodMode {
    Month,
    Week,
}

pub struct ResolvedPeriod {
    pub mode: PeriodMode,
    pub mode_code: String,
    pub value: String,
    pub payload: serde_json::Value,
}
```
|
||||
|
||||
Required supported inputs include:
|
||||
- `月累计 2026-03`
|
||||
- `月累计 2026年3月`
|
||||
- `周累计 2026年第12周`
|
||||
|
||||
Rules:
|
||||
- month and week intent are mutually exclusive
|
||||
- a missing mode prompts and stops
|
||||
- a missing period prompts and stops
|
||||
- bare `第12周` is incomplete in this slice and must prompt for year instead of guessing
|
||||
- derive the real backend `period_mode_code` values and request payload field names from the source page/API contract before implementation; do not ship placeholder enum echoes such as `month`/`week` unless the source materials prove those are the real backend codes
|
||||
- never use page-selected defaults in deterministic mode
|
||||
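The period rules above could be prototyped with the standard library alone, as in the sketch below. It only extracts the year and the month or week. It deliberately does not produce `mode_code` or `payload`, because those must come from the real backend contract. `ParsedPeriod`, `digits_before`, `digits_after`, and `parse_period` are hypothetical names, and the accepted text shapes are limited to the three required inputs.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ParsedPeriod {
    Month { year: u32, month: u32 },
    Week { year: u32, week: u32 },
    /// Stop and ask the user; the string is a short reason.
    Prompt(&'static str),
}

/// Parses the run of ASCII digits that ends right before `marker`.
fn digits_before(text: &str, marker: &str) -> Option<u32> {
    let head = &text[..text.find(marker)?];
    let start = head.trim_end_matches(|c: char| c.is_ascii_digit()).len();
    head[start..].parse().ok()
}

/// Parses the run of ASCII digits that starts right after `marker`.
fn digits_after(text: &str, marker: &str) -> Option<u32> {
    let tail = &text[text.find(marker)? + marker.len()..];
    let end = tail.find(|c: char| !c.is_ascii_digit()).unwrap_or(tail.len());
    tail[..end].parse().ok()
}

pub fn parse_period(text: &str) -> ParsedPeriod {
    let (is_month, is_week) = (text.contains("月累计"), text.contains("周累计"));
    if is_month && is_week {
        return ParsedPeriod::Prompt("month and week intent are mutually exclusive");
    }
    if !is_month && !is_week {
        return ParsedPeriod::Prompt("missing mode");
    }
    // Drop the mode keyword so its 月/周 character is not mistaken for a period unit.
    let rest = text.replace("月累计", " ").replace("周累计", " ");
    if is_month {
        if rest.contains('周') {
            return ParsedPeriod::Prompt("month mode contradicts a week period");
        }
        // Accept either `2026年3月` or `2026-03`.
        let sep = if rest.contains('年') { "年" } else { "-" };
        return match (digits_before(&rest, sep), digits_after(&rest, sep)) {
            (Some(year), Some(month)) if (1..=12).contains(&month) => {
                ParsedPeriod::Month { year, month }
            }
            _ => ParsedPeriod::Prompt("missing or invalid month period"),
        };
    }
    if rest.contains('月') {
        return ParsedPeriod::Prompt("week mode contradicts a month period");
    }
    match (digits_before(&rest, "年"), digits_after(&rest, "第")) {
        (Some(year), Some(week)) if (1..=53).contains(&week) => ParsedPeriod::Week { year, week },
        // A bare `第12周` never falls back to the current year.
        (None, Some(_)) => ParsedPeriod::Prompt("week period needs an explicit year"),
        _ => ParsedPeriod::Prompt("missing or invalid week period"),
    }
}
```

Every failure path returns a prompt, so there is no branch in which a page default or the current year can be substituted.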
|
||||
### Artifact contract
|
||||
|
||||
Lock the field names now so `claw-new` can interpret status without re-embedding business logic:
|
||||
|
||||
```json
{
  "type": "report-artifact",
  "report_name": "tq-lineloss-report",
  "status": "ok",
  "org": {
    "label": "国网兰州供电公司",
    "code": "008df5db70319f73e0508eoac23e0c3c"
  },
  "period": {
    "mode": "month",
    "mode_code": "<real-backend-mode-code>",
    "value": "2026-03",
    "payload": {
      "<real-backend-field>": "<real-backend-value>"
    }
  },
  "columns": [],
  "rows": [],
  "counts": {
    "rows": 0
  },
  "export": {
    "attempted": false,
    "status": "skipped",
    "message": null
  },
  "reasons": []
}
```
|
||||
|
||||
Status mapping in `claw-new`:
|
||||
- `ok` -> task success
|
||||
- `partial` -> task success with partial summary
|
||||
- `blocked` -> task failure
|
||||
- `error` -> task failure
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Scaffold the staged skill package and written contract
|
||||
|
||||
**Files:**
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/SKILL.md`
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/SKILL.toml`
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/references/collection-flow.md`
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/references/data-quality.md`
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/assets/scene-snapshot/index.html`
|
||||
- Create if required: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/tq-lineloss-report/scene.json`
|
||||
|
||||
- [ ] **Step 1: Write the failing package contract files**
|
||||
|
||||
Create the package using `fault-details-report` as the structure reference. Lock one tool only:
|
||||
|
||||
```toml
[[tools]]
name = "collect_lineloss"
kind = "browser_script"
description = "Collect 台区线损月/周累计线损率 rows using normalized company and period parameters and return a structured report artifact."
```
|
||||
|
||||
Declare required args in `SKILL.toml`:
|
||||
- `expected_domain`
|
||||
- `org_label`
|
||||
- `org_code`
|
||||
- `period_mode`
|
||||
- `period_mode_code`
|
||||
- `period_value`
|
||||
- `period_payload`
|
||||
|
||||
- [ ] **Step 2: Write `SKILL.md` before implementation**
|
||||
|
||||
Document:
|
||||
- when to use / when not to use
|
||||
- required normalized args only
|
||||
- blocked/error semantics
|
||||
- exact returned artifact fields
|
||||
- no raw natural-language values passed to backend requests
|
||||
|
||||
- [ ] **Step 3: Write the reference docs**
|
||||
|
||||
`references/collection-flow.md` must describe:
|
||||
- relevant page state
|
||||
- month request mapping
|
||||
- week request mapping
|
||||
- export/report-log flow if retained
|
||||
|
||||
`references/data-quality.md` must define:
|
||||
- canonical output columns
|
||||
- required field coverage
|
||||
- status semantics
|
||||
- partial/error rules
|
||||
- org/period normalization assumptions
|
||||
|
||||
- [ ] **Step 4: Add scene metadata if the current staging registry needs it**
|
||||
|
||||
Keep it narrow: one scene, one tool, one artifact type.
|
||||
|
||||
- [ ] **Step 5: Add an automated staged-skill load/resolve check**
|
||||
|
||||
Add `tests/deterministic_submit_test.rs` coverage that loads the staged skills root used by runtime tests, resolves `tq-lineloss-report.collect_lineloss`, and asserts the tool is discoverable with the required args:
|
||||
- `expected_domain`
|
||||
- `org_label`
|
||||
- `org_code`
|
||||
- `period_mode`
|
||||
- `period_mode_code`
|
||||
- `period_value`
|
||||
- `period_payload`
|
||||
|
||||
Run:
|
||||
```bash
cargo test deterministic_submit_discovers_tq_lineloss_skill_contract -- --exact
```
|
||||
|
||||
Expected: FAIL before the package is fully wired, PASS once the staged skill contract is discoverable and complete.
|
||||
|
||||
- [ ] **Step 6: Verify structural parity with `fault-details-report`**
|
||||
|
||||
Run a manual file-layout diff and confirm there are no placeholder descriptions or missing required docs.
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
git add "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report"
# Run the next line only if Step 4 created scene metadata; `git add` fails on a path that does not exist.
git add "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/scenes/tq-lineloss-report/scene.json"
git commit -m "feat: scaffold tq lineloss staged skill contract"
```
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Add browser-side JS red tests and implement the staged collector
|
||||
|
||||
**Files:**
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js`
|
||||
- Create: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js`
|
||||
|
||||
- [ ] **Step 1: Write the failing JS tests first**
|
||||
|
||||
Cover deterministic pure helpers for:
|
||||
- missing normalized args -> blocked/error contract
|
||||
- month request shape uses `org_code` + canonical month payload
|
||||
- week request shape uses `org_code` + canonical week payload
|
||||
- artifact field names and counts
|
||||
- partial/error status shaping
|
||||
- no raw user-entered org text leakage into request fields
|
||||
|
||||
Example test skeleton:
|
||||
|
||||
```javascript
const test = require('node:test');
const assert = require('node:assert/strict');

const {
  validateArgs,
  buildMonthRequest,
  buildWeekRequest,
  normalizeRows,
  buildArtifact
} = require('./collect_lineloss.js');

// Fixture values below (ORG-1, the 'month' mode code, the year/month payload
// fields) are placeholders until the real backend contract is confirmed.
test('buildMonthRequest uses canonical org code and month payload', () => {
  const request = buildMonthRequest({
    org_code: 'ORG-1',
    period_payload: { year: 2026, month: 3 }
  });

  assert.equal(request.orgCode, 'ORG-1');
  assert.equal(request.year, 2026);
  assert.equal(request.month, 3);
});

test('buildArtifact locks field names and partial semantics', () => {
  const artifact = buildArtifact({
    org_label: '国网兰州供电公司',
    org_code: 'ORG-1',
    period_mode: 'month',
    period_mode_code: 'month',
    period_value: '2026-03',
    period_payload: { year: 2026, month: 3 },
    rows: [{ id: 1 }],
    status: 'partial',
    reasons: ['export_failed']
  });

  assert.equal(artifact.report_name, 'tq-lineloss-report');
  assert.equal(artifact.org.code, 'ORG-1');
  assert.equal(artifact.period.value, '2026-03');
  assert.deepEqual(artifact.reasons, ['export_failed']);
});
```
|
||||
|
||||
- [ ] **Step 2: Run the JS test file to confirm failure**
|
||||
|
||||
Run:
|
||||
```bash
node --test "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
```
|
||||
|
||||
Expected: FAIL because the script/helpers do not exist yet.
|
||||
|
||||
- [ ] **Step 3: Write the minimal browser-side implementation**
|
||||
|
||||
Required structure:
|
||||
|
||||
```javascript
function validateArgs(args) { /* require normalized canonical args */ }
function buildMonthRequest(args) { /* build month request from canonical values */ }
function buildWeekRequest(args) { /* build week request from canonical values */ }
function normalizeRows(rawRows) { /* canonical columns only */ }
function buildArtifact(input) { /* locked artifact shape */ }

return (async () => {
  const args = __SKILL_ARGS__;
  validateArgs(args);
  // validate page context
  // collect from page/API
  // normalize rows
  // optionally attempt export/report-log if the real business flow requires it
  return buildArtifact(result);
})();
```
|
||||
|
||||
Keep test exports behind an environment-safe guard so the file still works as browser-eval code.
|
||||
|
||||
- [ ] **Step 4: Re-run the JS tests until they pass**
|
||||
|
||||
Run:
|
||||
```bash
node --test "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Commit**
|
||||
|
||||
```bash
git add "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js" "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
git commit -m "feat: add tq lineloss browser collection script"
```
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Add deterministic suffix detection and explicit scene routing
|
||||
|
||||
**Files:**
|
||||
- Create: `src/compat/deterministic_submit.rs`
|
||||
- Modify: `src/compat/mod.rs`
|
||||
- Modify: `src/agent/mod.rs`
|
||||
- Create: `tests/deterministic_submit_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write failing routing tests**
|
||||
|
||||
Add Rust tests for:
|
||||
- exact raw `。。。` suffix enables deterministic mode
|
||||
- no suffix leaves current routing untouched
|
||||
- suffix + unsupported deterministic request returns supported-scene prompt
|
||||
- when page URL/title context is available and does not match the lineloss scene, deterministic routing returns mismatch/block prompt instead of proceeding
|
||||
- Zhihu hotlist request without suffix keeps the current route
|
||||
- ASCII `...` does not trigger deterministic mode
|
||||
- `。。。。` does not trigger deterministic mode
|
||||
- `。。。` in the middle of the instruction does not trigger deterministic mode
|
||||
- trailing whitespace after `。。。` does not trigger deterministic mode in this slice
|
||||
|
||||
Suggested tests:
|
||||
|
||||
```rust
#[test]
fn deterministic_submit_requires_exact_suffix() {}

#[test]
fn deterministic_submit_nonmatch_returns_supported_scene_message() {}

#[test]
fn deterministic_submit_rejects_page_context_mismatch() {}

#[test]
fn zhihu_hotlist_request_without_suffix_keeps_existing_route() {}

#[test]
fn deterministic_submit_rejects_non_exact_suffix_variants() {}
```
|
||||
|
||||
- [ ] **Step 2: Run the targeted routing tests and confirm failure**
|
||||
|
||||
Run:
|
||||
```bash
cargo test deterministic_submit_requires_exact_suffix -- --exact
cargo test deterministic_submit_nonmatch_returns_supported_scene_message -- --exact
cargo test zhihu_hotlist_request_without_suffix_keeps_existing_route -- --exact
```
|
||||
|
||||
Expected: FAIL because the deterministic routing seam does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Implement the narrow deterministic routing module**
|
||||
|
||||
Recommended public shape:
|
||||
|
||||
```rust
pub enum DeterministicSubmitDecision {
    NotDeterministic,
    Prompt { summary: String },
    Execute(DeterministicExecutionPlan),
}
```
|
||||
|
||||
`src/agent/mod.rs` should:
|
||||
1. detect deterministic suffix
|
||||
2. if not deterministic, continue current flow untouched
|
||||
3. if prompt, return `TaskComplete`
|
||||
4. if execute, pass the plan into the browser-script execution seam
|
||||
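The four dispatch steps above amount to one exhaustive `match`. The sketch below is illustrative only: `NextStep` is a hypothetical stand-in for what the agent loop actually returns, and `DeterministicExecutionPlan` is reduced to a single assumed field.

```rust
/// Placeholder for the real plan type; only a tool id is modelled here.
#[derive(Debug, PartialEq, Eq)]
pub struct DeterministicExecutionPlan {
    pub tool: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum DeterministicSubmitDecision {
    NotDeterministic,
    Prompt { summary: String },
    Execute(DeterministicExecutionPlan),
}

/// Hypothetical stand-in for whatever the agent loop does next.
#[derive(Debug, PartialEq, Eq)]
pub enum NextStep {
    /// Leave the ordinary (non-deterministic) routing untouched.
    ContinueExistingFlow,
    /// Finish the task immediately with a prompt for the user.
    TaskComplete { summary: String },
    /// Hand the plan to the existing browser-script execution seam.
    RunBrowserScript(DeterministicExecutionPlan),
}

pub fn next_step(decision: DeterministicSubmitDecision) -> NextStep {
    match decision {
        DeterministicSubmitDecision::NotDeterministic => NextStep::ContinueExistingFlow,
        DeterministicSubmitDecision::Prompt { summary } => NextStep::TaskComplete { summary },
        DeterministicSubmitDecision::Execute(plan) => NextStep::RunBrowserScript(plan),
    }
}
```

Because the match has no wildcard arm, adding a fourth decision variant later would fail to compile until the dispatch handles it.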
|
||||
- [ ] **Step 4: Re-run the routing tests**
|
||||
|
||||
Run:
|
||||
```bash
cargo test deterministic_submit_requires_exact_suffix -- --exact
cargo test deterministic_submit_nonmatch_returns_supported_scene_message -- --exact
cargo test zhihu_hotlist_request_without_suffix_keeps_existing_route -- --exact
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Commit**
|
||||
|
||||
```bash
git add src/compat/deterministic_submit.rs src/compat/mod.rs src/agent/mod.rs tests/deterministic_submit_test.rs
git commit -m "feat: add deterministic submit routing seam"
```
|
||||
|
||||
---
|
||||
|
||||
### Task 4: Implement company/unit normalization from real source data
|
||||
|
||||
**Files:**
|
||||
- Create: `src/compat/tq_lineloss/mod.rs`
|
||||
- Create: `src/compat/tq_lineloss/contracts.rs`
|
||||
- Create: `src/compat/tq_lineloss/org_resolver.rs`
|
||||
- Create: `src/compat/tq_lineloss/org_units.rs`
|
||||
- Modify: `tests/deterministic_submit_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write failing org resolver tests**
|
||||
|
||||
Cover:
|
||||
- `兰州公司` -> canonical `国网兰州供电公司` + correct code
|
||||
- `天水公司` -> canonical `国网天水供电公司` + correct code
|
||||
- `城关供电分公司` -> lower-level direct match
|
||||
- `榆中县公司` -> normalized county alias match
|
||||
- ambiguous alias prompts instead of guessing
|
||||
- missing company prompts instead of executing
|
||||
|
||||
Example skeleton:
|
||||
|
||||
```rust
#[test]
fn lineloss_org_resolver_matches_city_alias() {}

#[test]
fn lineloss_org_resolver_matches_county_alias() {}

#[test]
fn lineloss_org_resolver_prompts_on_ambiguity() {}
```
|
||||
|
||||
- [ ] **Step 2: Run the org tests and confirm failure**
|
||||
|
||||
Run:
|
||||
```bash
cargo test lineloss_org_resolver_matches_city_alias -- --exact
cargo test lineloss_org_resolver_matches_county_alias -- --exact
cargo test lineloss_org_resolver_prompts_on_ambiguity -- --exact
```
|
||||
|
||||
Expected: FAIL because the resolver and checked-in unit dictionary do not exist yet.
|
||||
|
||||
- [ ] **Step 3: Check in the canonical unit dictionary and implement alias resolution**
|
||||
|
||||
Rules:
|
||||
- derive data from the real source materials, not guessed literals
|
||||
- keep canonical `label` and `code`
|
||||
- generate normalized aliases from formal names
|
||||
- support both city-company and district/county/sub-company levels
|
||||
- require uniqueness before execution
|
||||
|
||||
- [ ] **Step 4: Implement explicit prompt messages**
|
||||
|
||||
Examples:
|
||||
- `已命中台区线损报表技能,但缺少供电单位,请补充如“兰州公司”或“城关供电分公司”。`
|
||||
- `已命中台区线损报表技能,但供电单位存在歧义,请补充更完整名称。`
|
||||
|
||||
- [ ] **Step 5: Re-run the org tests**
|
||||
|
||||
Run:
|
||||
```bash
cargo test lineloss_org_resolver_matches_city_alias -- --exact
cargo test lineloss_org_resolver_matches_county_alias -- --exact
cargo test lineloss_org_resolver_prompts_on_ambiguity -- --exact
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Commit**
|
||||
|
||||
```bash
git add src/compat/tq_lineloss/mod.rs src/compat/tq_lineloss/contracts.rs src/compat/tq_lineloss/org_resolver.rs src/compat/tq_lineloss/org_units.rs tests/deterministic_submit_test.rs
git commit -m "feat: add tq lineloss org normalization"
```
|
||||
|
||||
---
|
||||
|
||||
### Task 5: Implement period extraction and canonical payload building
|
||||
|
||||
**Files:**
|
||||
- Create: `src/compat/tq_lineloss/period_resolver.rs`
|
||||
- Modify: `src/compat/tq_lineloss/mod.rs`
|
||||
- Modify: `tests/deterministic_submit_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write failing period resolver tests**
|
||||
|
||||
Cover:
|
||||
- `月累计 2026-03`
|
||||
- `月累计 2026年3月`
|
||||
- `周累计 2026年第12周`
|
||||
- contradictory month/week expressions prompt
|
||||
- missing mode prompts
|
||||
- missing period prompts
|
||||
- bare `第12周` prompts for year in this slice
|
||||
- real backend month/week mode codes and request payload field names are derived from source materials instead of placeholder values
|
||||
|
||||
Example skeleton:
|
||||
|
||||
```rust
#[test]
fn lineloss_period_resolver_parses_month_text() {}

#[test]
fn lineloss_period_resolver_parses_week_text() {}

#[test]
fn lineloss_period_resolver_prompts_for_missing_year_on_week() {}

#[test]
fn lineloss_period_resolver_rejects_contradictory_mode() {}
```
|
||||
|
||||
- [ ] **Step 2: Run the period tests and confirm failure**
|
||||
|
||||
Run:
|
||||
```bash
cargo test lineloss_period_resolver_parses_month_text -- --exact
cargo test lineloss_period_resolver_parses_week_text -- --exact
cargo test lineloss_period_resolver_prompts_for_missing_year_on_week -- --exact
cargo test lineloss_period_resolver_rejects_contradictory_mode -- --exact
```
|
||||
|
||||
Expected: FAIL because the period resolver does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Implement the minimal resolver**
|
||||
|
||||
Output contract:
|
||||
|
||||
```rust
pub struct ResolvedPeriod {
    pub mode: PeriodMode,
    pub mode_code: String,
    pub value: String,
    pub payload: serde_json::Value,
}
```
|
||||
|
||||
Rules:
|
||||
- no page-default fallback
|
||||
- no implicit current-year assumptions
|
||||
- no mixed month/week execution
|
||||
|
||||
- [ ] **Step 4: Re-run the period tests**
|
||||
|
||||
Run:
|
||||
```bash
cargo test lineloss_period_resolver_parses_month_text -- --exact
cargo test lineloss_period_resolver_parses_week_text -- --exact
cargo test lineloss_period_resolver_prompts_for_missing_year_on_week -- --exact
cargo test lineloss_period_resolver_rejects_contradictory_mode -- --exact
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Commit**
|
||||
|
||||
```bash
git add src/compat/tq_lineloss/period_resolver.rs src/compat/tq_lineloss/mod.rs tests/deterministic_submit_test.rs
git commit -m "feat: add tq lineloss period normalization"
```
|
||||
|
||||
---
|
||||
|
||||
### Task 6: Wire deterministic execution through the existing browser-script seam
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/compat/deterministic_submit.rs`
|
||||
- Modify: `src/agent/mod.rs`
|
||||
- Modify if needed: `src/compat/direct_skill_runtime.rs`
|
||||
- Modify: `tests/deterministic_submit_test.rs`
|
||||
- Modify: `tests/compat_runtime_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write failing execution tests**
|
||||
|
||||
Cover:
|
||||
- successful deterministic lineloss request builds canonical tool args
|
||||
- missing company/mode/period returns prompt without browser execution
|
||||
- `partial` artifact maps to successful partial summary
|
||||
- `blocked` and `error` artifacts map to failed completion
|
||||
|
||||
Example skeleton:
|
||||
|
||||
```rust
#[test]
fn deterministic_lineloss_execution_passes_canonical_args() {}

#[test]
fn deterministic_lineloss_missing_company_does_not_invoke_browser() {}

#[test]
fn deterministic_lineloss_partial_artifact_maps_to_partial_summary() {}
```
|
||||
|
||||
- [ ] **Step 2: Run the execution tests and confirm failure**
|
||||
|
||||
Run:
|
||||
```bash
cargo test deterministic_lineloss_execution_passes_canonical_args -- --exact
cargo test deterministic_lineloss_missing_company_does_not_invoke_browser -- --exact
cargo test deterministic_lineloss_partial_artifact_maps_to_partial_summary -- --exact
```
|
||||
|
||||
Expected: FAIL because the deterministic execution plan is not wired yet.
|
||||
|
||||
- [ ] **Step 3: Implement execution via the existing `browser_script` seam**
|
||||
|
||||
Build tool args only from normalized values:
|
||||
- `expected_domain`
|
||||
- `org_label`
|
||||
- `org_code`
|
||||
- `period_mode`
|
||||
- `period_mode_code`
|
||||
- `period_value`
|
||||
- `period_payload`
|
||||
|
||||
Resolve the tool explicitly to:
|
||||
- `tq-lineloss-report.collect_lineloss`
|
||||
|
||||
Do not introduce a new browser opcode family or second browser protocol.
|
||||
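As a rough illustration of building tool args only from normalized values, the sketch below maps an already-resolved request onto the seven locked arg names. `CanonicalRequest` and `build_tool_args` are hypothetical, and the sketch assumes `period_payload` is passed as pre-serialized JSON text; the real seam may well use `serde_json::Value` instead.

```rust
use std::collections::BTreeMap;

/// The only tool the deterministic lineloss scene may resolve to.
pub const LINELOSS_TOOL: &str = "tq-lineloss-report.collect_lineloss";

/// Hypothetical container for values that have already been normalized.
pub struct CanonicalRequest {
    pub expected_domain: String,
    pub org_label: String,
    pub org_code: String,
    pub period_mode: String,
    pub period_mode_code: String,
    pub period_value: String,
    /// Assumption: the payload arrives as JSON text produced by the period resolver.
    pub period_payload_json: String,
}

/// Builds the arg map from canonical fields only; the raw instruction is never an input.
pub fn build_tool_args(req: &CanonicalRequest) -> BTreeMap<&'static str, String> {
    BTreeMap::from([
        ("expected_domain", req.expected_domain.clone()),
        ("org_label", req.org_label.clone()),
        ("org_code", req.org_code.clone()),
        ("period_mode", req.period_mode.clone()),
        ("period_mode_code", req.period_mode_code.clone()),
        ("period_value", req.period_value.clone()),
        ("period_payload", req.period_payload_json.clone()),
    ])
}
```

Since the function signature has no parameter for the user's raw text, raw org or period wording cannot leak into the request by accident.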
|
||||
- [ ] **Step 4: Implement central artifact interpretation**
|
||||
|
||||
Recommended helper:
|
||||
|
||||
```rust
fn summarize_lineloss_artifact(artifact: &serde_json::Value) -> (bool, String)
```
|
||||
|
||||
Summary must include canonical org/period and row counts, and surface blocked/partial/error reasons.
|
||||
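A dependency-free sketch of the interpretation step follows. Instead of `serde_json::Value` it takes a hypothetical `ArtifactView` holding the already-extracted fields, and the summary wording is an assumption. The status-to-success mapping follows the locked contract (`ok` and `partial` succeed, `blocked` and `error` fail); treating any unknown status as a failure is an extra assumption of this sketch.

```rust
/// Hypothetical, already-extracted view of the locked artifact fields.
pub struct ArtifactView<'a> {
    pub status: &'a str,
    pub org_label: &'a str,
    pub period_value: &'a str,
    pub row_count: usize,
    pub reasons: &'a [&'a str],
}

/// Returns (task_succeeded, human-readable summary).
pub fn summarize_lineloss(view: &ArtifactView) -> (bool, String) {
    // `ok` and `partial` count as success; everything else is a failure.
    let succeeded = matches!(view.status, "ok" | "partial");
    let mut summary = format!(
        "{} {} [{}]: {} rows",
        view.org_label, view.period_value, view.status, view.row_count
    );
    // Surface reasons for partial, blocked, and error artifacts.
    if view.status != "ok" && !view.reasons.is_empty() {
        summary.push_str("; reasons: ");
        summary.push_str(&view.reasons.join(", "));
    }
    (succeeded, summary)
}
```

Keeping this mapping in one helper means pipe and ws callers cannot drift apart in how they read the same artifact.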
|
||||
- [ ] **Step 5: Re-run the execution tests**
|
||||
|
||||
Run:
|
||||
```bash
cargo test deterministic_lineloss_execution_passes_canonical_args -- --exact
cargo test deterministic_lineloss_missing_company_does_not_invoke_browser -- --exact
cargo test deterministic_lineloss_partial_artifact_maps_to_partial_summary -- --exact
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Commit**
|
||||
|
||||
```bash
git add src/compat/deterministic_submit.rs src/agent/mod.rs src/compat/direct_skill_runtime.rs tests/deterministic_submit_test.rs tests/compat_runtime_test.rs
git commit -m "feat: execute deterministic tq lineloss skill through browser script seam"
```
|
||||
|
||||
---
|
||||
|
||||
### Task 7: Add Zhihu regression coverage and run the full verification set
|
||||
|
||||
**Files:**
|
||||
- Modify: `tests/compat_runtime_test.rs`
|
||||
- Modify only if required: `tests/runtime_task_flow_test.rs`
|
||||
- Reuse: `tests/deterministic_submit_test.rs`
|
||||
|
||||
- [ ] **Step 1: Add focused Zhihu regression tests**
|
||||
|
||||
Required assertions:
|
||||
- ordinary Zhihu hotlist requests without `。。。` still use the current path
|
||||
- existing export/presentation requests still preserve their current behavior
|
||||
- deterministic suffix does not silently route unmatched requests into Zhihu logic
|
||||
- an existing non-lineloss direct `browser_script` path outside the new scene still behaves unchanged
|
||||
|
||||
- [ ] **Step 2: Add end-to-end deterministic submit coverage**
|
||||
|
||||
Required assertions:
|
||||
- suffix detection
|
||||
- scene match
|
||||
- page-context mismatch prompt/block behavior when URL/title contradict the lineloss scene
|
||||
- missing/ambiguous prompts
|
||||
- canonical args passed to the browser-script tool
|
||||
- returned summary shows canonical org and period
|
||||
- execution stays on the existing pipe-backed browser-script seam with no ws-only dependency introduced on `main`
|
||||
|
||||
- [ ] **Step 3: Run the focused Rust tests**
|
||||
|
||||
Run:
|
||||
```bash
cargo test --test deterministic_submit_test
cargo test --test compat_runtime_test
cargo test --test runtime_task_flow_test
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 4: Run the whole Rust suite**
|
||||
|
||||
Run:
|
||||
```bash
cargo test
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Re-run the staged skill JS tests**
|
||||
|
||||
Run:
|
||||
```bash
node --test "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Commit**
|
||||
|
||||
```bash
git add tests/deterministic_submit_test.rs tests/compat_runtime_test.rs tests/runtime_task_flow_test.rs
git commit -m "test: cover deterministic tq lineloss routing and zhihu regression"
```
|
||||
|
||||
---
|
||||
|
||||
## Final verification checklist
|
||||
|
||||
- [ ] `。。。` is the only deterministic trigger.
|
||||
- [ ] Non-`。。。` requests preserve current routing.
|
||||
- [ ] In deterministic mode, a page URL/title that contradicts the lineloss scene returns a block or mismatch prompt before any execution.
|
||||
- [ ] Zhihu hotlist behavior is unchanged.
|
||||
- [ ] Existing non-lineloss direct `browser_script` behavior is unchanged.
|
||||
- [ ] Deterministic non-match returns an explicit supported-scene message.
|
||||
- [ ] Missing company prompts.
|
||||
- [ ] Ambiguous company prompts.
|
||||
- [ ] Missing mode prompts.
|
||||
- [ ] Missing period prompts.
|
||||
- [ ] Bare `第12周` prompts for year.
|
||||
- [ ] Canonical org code is passed to the staged skill.
|
||||
- [ ] Canonical period mode code and payload are passed to the staged skill.
|
||||
- [ ] The staged skill returns the locked artifact shape.
|
||||
- [ ] Execution uses the existing `browser_script` seam only.
|
||||
- [ ] No ws-specific runtime dependency is added on `main`.
|
||||
|
||||
## Implementation notes
|
||||
|
||||
- Prefer extracting a tiny shared execution helper from `src/compat/direct_skill_runtime.rs` if needed instead of duplicating tool lookup or browser-script invocation code.
|
||||
- Keep deterministic whitelist configuration in one place, but do not expand this slice into a full general scene-registry redesign.
|
||||
- If a failing test suggests changing Zhihu behavior, fix the deterministic branch or test harness instead of weakening the existing Zhihu path.
|
||||
- The checked-in unit dictionary is part of the deterministic contract; treat updates to that data as explicit behavior changes and cover them with tests.
|
||||
@@ -0,0 +1,448 @@
|
||||
# TQ Lineloss WS Dual-Transport Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Add ws communication support for the existing `tq-lineloss-report.collect_lineloss` deterministic browser_script path on the `feature/claw-ws` branch while preserving the current pipe path and validated Zhihu ws behavior.
|
||||
|
||||
**Architecture:** Reuse the existing backend-neutral execution seam that already exists for deterministic submit and browser_script execution. Keep lineloss business parsing, canonical args, and artifact interpretation unchanged; only make the ws backend/protocol and submit-path verification complete enough for the same lineloss skill contract to run over both pipe and ws.
|
||||
|
||||
**Tech Stack:** Rust 2021, Cargo tests, existing `BrowserBackend` abstraction, `WsBrowserBackend`, `ws_protocol`, browser websocket contract in `docs/_tmp_sgbrowser_ws_api_doc.txt`, existing staged `browser_script` skill execution seam.
|
||||
|
||||
---
|
||||
|
||||
## Execution Context
|
||||
|
||||
- Follow @superpowers:test-driven-development for each behavior change.
|
||||
- Follow @superpowers:verification-before-completion before claiming each task is done.
|
||||
- Do **not** create a git worktree unless the user explicitly asks.
|
||||
- This plan is **ws enablement only** for the already-added lineloss deterministic skill path.
|
||||
- Do **not** redesign deterministic routing, org parsing, period parsing, staged skill packaging, or artifact contracts unless a failing ws-specific test proves a minimal compatibility fix is required.
|
||||
- Do **not** modify validated Zhihu hotlist/export business behavior; only add regression coverage around it.
|
||||
- Preserve the current pipe execution path as the control implementation.
|
||||
- Preserve the current `BrowserBackend` seam; do not introduce a second lineloss-specific ws execution path.
|
||||
|
||||
## Scope Boundary
|
||||
|
||||
### In scope
|
||||
- Make the existing lineloss deterministic `browser_script` skill path run through ws on this branch.
|
||||
- Keep the same canonical tool args and returned artifact interpretation for both pipe and ws.
|
||||
- Verify ws browser-script execution against the documented browser ws contract.
|
||||
- Add focused tests for ws lineloss execution and regressions for Zhihu ws + pipe lineloss.
|
||||
|
||||
### Out of scope
|
||||
- Changing lineloss trigger semantics (`。。。`).
|
||||
- Changing org/unit normalization semantics or source dictionary shape.
|
||||
- Changing period normalization semantics.
|
||||
- Reworking staged skill docs or JS business collection logic beyond ws-compatibility necessities.
|
||||
- Any Zhihu feature work.
|
||||
- Any pipe-only cleanup/refactor.
|
||||
- Any general scene-registry redesign.

## File Map

### Expected code changes

- Modify: `src/pipe/protocol.rs:49-78,130-165,192-209`
  - keep `Action::Eval` encoding aligned with the current transport contract and lineloss skill expectations
- Modify: `src/pipe/browser_tool.rs:62-125`
  - ensure eval response correlation and payload handling remain sufficient for deterministic lineloss execution
- Modify only if a focused test proves it is necessary: `src/compat/browser_script_skill_tool.rs:135-255`
  - preserve browser_script contract; only make minimal output-shape handling fixes if eval payloads differ from the pipe baseline in a way current code cannot consume
- Modify only if a focused parity test proves it is necessary: `src/compat/direct_skill_runtime.rs:50-129`
  - preserve shared backend-neutral execution helper behavior; no business logic changes
- Read and normally leave unchanged: `src/compat/deterministic_submit.rs:96-157`
  - this is the business contract baseline and should not be rewritten for transport parity work
- Read and normally leave unchanged: `src/agent/mod.rs:242-285`
  - this contains the current deterministic dispatch split used by this branch

### Expected test changes

- Modify: `tests/agent_runtime_test.rs`
  - add/extend deterministic lineloss runtime coverage and parity assertions using the current runtime path
- Modify: `tests/compat_runtime_test.rs`
  - add/extend focused pipe lineloss regression assertions so transport work cannot silently break pipe
- Modify only if end-to-end submit coverage truly needs it: `tests/runtime_task_flow_test.rs`
  - verify broader submit-flow expectations remain intact

### Reference-only files

- Read only: `docs/superpowers/plans/2026-04-11-tq-lineloss-deterministic-skill-plan.md`
- Read only: `docs/superpowers/specs/2026-04-11-tq-lineloss-deterministic-skill-design.md`
- Read only: `docs/_tmp_sgbrowser_ws_api_doc.txt`

---

## Locked contracts

### Contract 1: Same lineloss deterministic business contract on both transports

The ws path must reuse the existing values produced by `src/compat/deterministic_submit.rs:84-95` and `src/compat/deterministic_submit.rs:135-166`:
- `expected_domain`
- `org_label`
- `org_code`
- `period_mode`
- `period_mode_code`
- `period_value`
- `period_payload`

No ws-specific lineloss args may be introduced in this slice.
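
One way to keep Contract 1 honest in a test is a small guard that rejects any arg key outside the canonical set listed above. The sketch below is only an illustration: `CANONICAL_LINELOSS_ARGS` and `unexpected_keys` are hypothetical names, not functions from the codebase, and the real args may be carried in a different container.

```rust
/// Canonical lineloss arg names, copied from Contract 1.
/// The constant name itself is hypothetical.
pub const CANONICAL_LINELOSS_ARGS: [&str; 7] = [
    "expected_domain",
    "org_label",
    "org_code",
    "period_mode",
    "period_mode_code",
    "period_value",
    "period_payload",
];

/// Return every key that is not part of the canonical set, in input order.
/// An empty result means no transport-specific arg slipped in.
pub fn unexpected_keys<'a>(keys: &[&'a str]) -> Vec<&'a str> {
    keys.iter()
        .copied()
        .filter(|key| !CANONICAL_LINELOSS_ARGS.contains(key))
        .collect()
}
```

A ws-side test could collect the keys actually sent to the browser script and assert that `unexpected_keys(...)` is empty, which would fail as soon as a ws-only field is added.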

### Contract 2: Same browser_script execution seam on both transports

The ws path must continue to use `execute_browser_script_skill_raw_output_with_browser_backend(...)` from `src/compat/direct_skill_runtime.rs:95-112`, which in turn uses the same browser_script tool path as pipe. Do not add a second lineloss-only ws runner.
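
The single-seam idea can be sketched with a simplified trait. Everything here is an assumption made for illustration: `ScriptBackend`, `PipeStub`, `WsStub`, and `run_browser_script` are hypothetical stand-ins, the real `BrowserBackend` trait very likely has a different shape, and the artifact string is a placeholder rather than real output.

```rust
/// Hypothetical, simplified stand-in for the backend abstraction.
pub trait ScriptBackend {
    fn eval(&self, script: &str) -> Result<String, String>;
}

/// Stand-in for the pipe transport.
pub struct PipeStub;
/// Stand-in for the ws transport.
pub struct WsStub;

// Placeholder artifact text; the real payload comes from the browser script.
const PLACEHOLDER_ARTIFACT: &str = r#"{"type":"report-artifact"}"#;

impl ScriptBackend for PipeStub {
    fn eval(&self, _script: &str) -> Result<String, String> {
        Ok(PLACEHOLDER_ARTIFACT.to_string())
    }
}

impl ScriptBackend for WsStub {
    fn eval(&self, _script: &str) -> Result<String, String> {
        Ok(PLACEHOLDER_ARTIFACT.to_string())
    }
}

/// The one shared runner: it accepts any backend and contains no
/// transport-specific branch, which is the property Contract 2 protects.
pub fn run_browser_script(backend: &dyn ScriptBackend, script: &str) -> Result<String, String> {
    backend.eval(script)
}
```

Because the runner only sees the trait, enabling ws means supplying a different backend value, not writing a second runner.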

### Contract 3: Same artifact interpretation on both transports

The ws path must produce output that remains consumable by `summarize_lineloss_output(...)` / `summarize_lineloss_artifact(...)` in `src/compat/deterministic_submit.rs:168-257` without transport-specific branching.

### Contract 4: Zhihu ws behavior must stay unchanged

The existing ws browser-script / export path already validated by `tests/agent_runtime_test.rs` and `tests/compat_runtime_test.rs` is a hard regression boundary. If a change breaks Zhihu tests, fix the ws seam instead of weakening Zhihu expectations.

### Contract 5: Pipe remains the baseline

For identical lineloss deterministic inputs, the pipe path should continue to succeed without requiring ws configuration.

---

### Task 1: Lock the ws contract with failing transport-level tests

**Files:**
- Modify: `tests/agent_runtime_test.rs`
- Modify: `tests/compat_runtime_test.rs`
- Read: `docs/_tmp_sgbrowser_ws_api_doc.txt`

- [ ] **Step 1: Add a failing ws lineloss deterministic runtime test**

Model it after the existing ws harness in `tests/agent_runtime_test.rs:69-166`, but target lineloss deterministic execution instead of Zhihu. The test should:
- configure `browserWsUrl`
- submit a deterministic lineloss instruction ending with `。。。`
- return a ws callback payload representing a lineloss `report-artifact`
- assert success summary includes canonical org, period, status, and rows

Suggested skeleton:

```rust
#[test]
fn ws_deterministic_lineloss_submit_executes_browser_script_and_summarizes_artifact() {
    // arrange ws config + ws server + lineloss artifact callback
    // act handle_browser_message_with_context(... SubmitTask ...)
    // assert TaskComplete success summary contains canonical org/period/rows
}
```

- [ ] **Step 2: Add a failing pipe regression test for the same lineloss contract**

In `tests/compat_runtime_test.rs`, add a focused pipe-side assertion that the same deterministic lineloss instruction still succeeds through the current pipe seam and uses the same summary contract.

Suggested skeleton:

```rust
#[test]
fn pipe_deterministic_lineloss_submit_preserves_existing_summary_contract() {
    // arrange MockTransport responses for browser_script eval
    // act handle_browser_message_with_context(...)
    // assert success summary matches canonical contract
}
```

- [ ] **Step 3: Add a failing ws regression assertion for Zhihu**

Add or tighten a Zhihu ws assertion (test name: `ws_zhihu_submit_path_remains_unchanged_after_lineloss_transport_work`) proving ordinary Zhihu requests still use the existing ws path and do not get intercepted by lineloss deterministic logic.

- [ ] **Step 4: Run the three focused tests to confirm failure**

Run:
```bash
cargo test ws_deterministic_lineloss_submit_executes_browser_script_and_summarizes_artifact -- --exact
cargo test pipe_deterministic_lineloss_submit_preserves_existing_summary_contract -- --exact
cargo test ws_zhihu_submit_path_remains_unchanged_after_lineloss_transport_work -- --exact
```

Expected: at least the new ws lineloss test fails before the seam is completed.

- [ ] **Step 5: Commit**

```bash
git add tests/agent_runtime_test.rs tests/compat_runtime_test.rs
git commit -m "test: lock ws and pipe lineloss transport contracts"
```

---

### Task 2: Make the current eval transport contract explicitly satisfy browser-script requirements

**Files:**
- Modify: `src/pipe/protocol.rs:49-78,130-165,192-209`
- Modify: `src/pipe/browser_tool.rs:62-124`
- Modify only if tests prove necessary: `src/compat/browser_script_skill_tool.rs:99-180,214-255`
- Modify: `tests/pipe_protocol_test.rs`
- Modify: `tests/browser_tool_test.rs`
- Modify: `tests/browser_script_skill_tool_test.rs`

- [ ] **Step 1: Add failing protocol/result-contract tests first**

Extend or add focused tests to lock the current branch's real transport contract:
- `Action::Eval` remains supported by the line protocol and command encoding
- eval request/response correlation remains stable via `seq` matching for lineloss-style target URLs
- eval/browser_script result handling preserves the full JSON artifact string without truncation before deterministic lineloss summarization consumes it

Suggested skeletons:

```rust
#[test]
fn eval_action_remains_supported_in_protocol() {}

#[test]
fn browser_tool_matches_eval_response_by_seq_for_lineloss_flow() {}

#[test]
fn browser_script_tool_preserves_json_artifact_string_for_lineloss() {}
```
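
The `seq`-based correlation named in the bullets above can be sketched as follows. This is a minimal illustration under stated assumptions: `EvalResponse` and `take_response_for` are hypothetical names, and the real correlation in `src/pipe/browser_tool.rs` may be structured differently.

```rust
use std::collections::VecDeque;

/// Hypothetical, trimmed-down response carrying only what correlation needs.
#[derive(Debug, Clone, PartialEq)]
pub struct EvalResponse {
    pub seq: u64,
    pub text: String,
}

/// Remove and return the first queued response whose `seq` equals the
/// request's `seq`. Responses for other requests stay in the queue untouched.
pub fn take_response_for(queue: &mut VecDeque<EvalResponse>, seq: u64) -> Option<EvalResponse> {
    let index = queue.iter().position(|response| response.seq == seq)?;
    queue.remove(index)
}
```

A test built on this shape would queue a stale response ahead of the lineloss artifact and assert that only the matching `seq` is consumed, with its text returned unmodified.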

- [ ] **Step 2: Run the focused Task 2 tests to confirm failure**

Run:
```bash
cargo test eval_action_remains_supported_in_protocol -- --exact
cargo test browser_tool_matches_eval_response_by_seq_for_lineloss_flow -- --exact
cargo test browser_script_tool_preserves_json_artifact_string_for_lineloss -- --exact
```

Expected: at least one test fails if the current protocol/correlation/result handling is still insufficient for the lineloss artifact path.

- [ ] **Step 3: Implement the minimal transport-contract fix**

Allowed changes:
- adjust only the `Action::Eval` protocol/encoding support in `src/pipe/protocol.rs`
- adjust only request/response correlation in `src/pipe/browser_tool.rs`
- if and only if tests still prove it necessary, make a tiny result-shape/stringification fix in `src/compat/browser_script_skill_tool.rs`
- keep existing Zhihu-compatible behavior intact

Not allowed:
- adding lineloss-only transport fields
- adding a second lineloss-specific execution path
- changing deterministic lineloss business parsing or summary rules

- [ ] **Step 4: Re-run the focused Task 2 tests**

Run:
```bash
cargo test eval_action_remains_supported_in_protocol -- --exact
cargo test browser_tool_matches_eval_response_by_seq_for_lineloss_flow -- --exact
cargo test browser_script_tool_preserves_json_artifact_string_for_lineloss -- --exact
```

Expected: PASS.

- [ ] **Step 5: Re-run the focused ws lineloss runtime test from Task 1**

Run:
```bash
cargo test ws_deterministic_lineloss_submit_executes_browser_script_and_summarizes_artifact -- --exact
```

Expected: PASS.

- [ ] **Step 6: Commit**

```bash
git add src/pipe/protocol.rs src/pipe/browser_tool.rs src/compat/browser_script_skill_tool.rs tests/pipe_protocol_test.rs tests/browser_tool_test.rs tests/browser_script_skill_tool_test.rs
git commit -m "fix: align eval transport contract with lineloss browser script flow"
```

---

### Task 3: Make eval result-shape handling surface the lineloss artifact cleanly

**Files:**
- Modify: `src/pipe/browser_tool.rs:62-125`
- Modify only if tests prove necessary: `src/compat/browser_script_skill_tool.rs:159-180,248-255`
- Modify: `tests/browser_script_skill_tool_test.rs`

- [ ] **Step 1: Add a failing result-shape test**

Lock that an eval response carrying a JSON string report artifact is surfaced as the same browser_script tool output shape expected by `execute_browser_script_tool(...)`.

Suggested skeleton:

```rust
#[test]
fn ws_backend_eval_returns_text_payload_consumable_by_browser_script_tool() {
    // arrange an eval response whose data.text is a JSON string artifact
    // assert execute_browser_script_tool(...) returns the full artifact text without truncation
}
```

- [ ] **Step 2: Run the result-shape test to confirm failure**

Run:
```bash
cargo test ws_backend_eval_returns_text_payload_consumable_by_browser_script_tool -- --exact
```

Expected: FAIL only if current eval/result handling is not sufficient for full lineloss artifact output.

- [ ] **Step 3: Implement the minimal result-shape fix**

Allowed fixes:
- adjust `BrowserPipeTool::invoke(...)` only if response packaging itself is wrong
- if and only if still required, make a tiny output-shape compatibility fix in `src/compat/browser_script_skill_tool.rs` so JSON string `data.text` payloads are preserved identically to the pipe baseline

Not allowed:
- transport-specific lineloss parsing
- changes to deterministic business logic
- adding a second lineloss-specific execution path

- [ ] **Step 4: Re-run the result-shape test**

Run:
```bash
cargo test ws_backend_eval_returns_text_payload_consumable_by_browser_script_tool -- --exact
```

Expected: PASS.

- [ ] **Step 5: Re-run the focused ws lineloss runtime test from Task 1**

Run:
```bash
cargo test ws_deterministic_lineloss_submit_executes_browser_script_and_summarizes_artifact -- --exact
```

Expected: PASS.

- [ ] **Step 6: Commit**

```bash
git add src/pipe/browser_tool.rs src/compat/browser_script_skill_tool.rs tests/browser_script_skill_tool_test.rs
git commit -m "fix: make eval result shape match browser script contract"
```

---

### Task 4: Verify the current backend-neutral deterministic execution path without changing business rules

**Files:**
- Read baseline: `src/agent/mod.rs:242-285`
- Read baseline: `src/compat/deterministic_submit.rs:96-157`
- Modify only if a focused parity test proves it is necessary: `src/compat/direct_skill_runtime.rs:50-129`
- Modify: `tests/agent_runtime_test.rs`
- Modify: `tests/compat_runtime_test.rs`

- [ ] **Step 1: Add a failing integration test for backend-neutral parity**

Add a test proving these two current-branch paths produce the same lineloss summary contract for equivalent artifact payloads:
- pipe path via the existing deterministic submit flow in `tests/compat_runtime_test.rs`
- runtime path via `handle_browser_message_with_context(...)` deterministic submit routing in `tests/agent_runtime_test.rs`

Suggested skeleton:

```rust
#[test]
fn deterministic_lineloss_pipe_and_ws_paths_share_summary_contract() {}
```

- [ ] **Step 2: Run the parity test to confirm failure or gap**

Run:
```bash
cargo test deterministic_lineloss_pipe_and_ws_paths_share_summary_contract -- --exact
```

Expected: FAIL only if a remaining shared execution seam gap still exists.

- [ ] **Step 3: Apply the smallest shared execution fix if needed**

Allowed changes:
- tiny helper extraction or result handling in `src/compat/direct_skill_runtime.rs`
- no new lineloss-specific branch
- no change to deterministic lineloss business parsing or summary rules
- no change to configured direct-submit behavior for non-lineloss skills

- [ ] **Step 4: Re-run the parity test**

Run:
```bash
cargo test deterministic_lineloss_pipe_and_ws_paths_share_summary_contract -- --exact
```

Expected: PASS.

- [ ] **Step 5: Commit**

```bash
git add src/compat/direct_skill_runtime.rs tests/agent_runtime_test.rs tests/compat_runtime_test.rs
git commit -m "fix: preserve shared deterministic execution across pipe and ws"
```

---

### Task 5: Run the full focused verification set and stop if any Zhihu or pipe regression appears

**Files:**
- Reuse: `tests/agent_runtime_test.rs`
- Reuse: `tests/compat_runtime_test.rs`
- Reuse: `tests/runtime_task_flow_test.rs`

- [ ] **Step 1: Run focused ws + lineloss + Zhihu regression tests**

Run:
```bash
cargo test --test agent_runtime_test
cargo test --test compat_runtime_test
cargo test --test runtime_task_flow_test
```

Expected: PASS.

- [ ] **Step 2: Run targeted protocol/backend unit tests**

Run:
```bash
cargo test eval_action_remains_supported_in_protocol -- --exact
cargo test browser_tool_matches_eval_response_by_seq_for_lineloss_flow -- --exact
cargo test browser_script_tool_preserves_json_artifact_string_for_lineloss -- --exact
cargo test ws_backend_eval_returns_text_payload_consumable_by_browser_script_tool -- --exact
cargo test deterministic_lineloss_pipe_and_ws_paths_share_summary_contract -- --exact
```

Expected: PASS.

- [ ] **Step 3: Run the full Rust suite**

Run:
```bash
cargo test
```

Expected: PASS.

- [ ] **Step 4: Manual review of diff scope**

Confirm the diff only touches:
- current transport/result seam files (`src/pipe/protocol.rs`, `src/pipe/browser_tool.rs`)
- narrow shared browser_script/result compatibility helpers if strictly necessary
- tests

If diff includes Zhihu business logic, lineloss parsing rules, staged skill business JS, or unrelated cleanup, remove those changes before completion.

- [ ] **Step 5: Commit**

```bash
git add src/pipe/protocol.rs src/pipe/browser_tool.rs src/compat/browser_script_skill_tool.rs src/compat/direct_skill_runtime.rs tests/pipe_protocol_test.rs tests/browser_tool_test.rs tests/browser_script_skill_tool_test.rs tests/agent_runtime_test.rs tests/compat_runtime_test.rs
git commit -m "test: verify lineloss ws transport without regressing pipe or zhihu"
```

---

## Final verification checklist

- [ ] The same lineloss deterministic instruction works on pipe and ws.
- [ ] Pipe still works without any ws configuration.
- [ ] Eval transport support remains available for deterministic lineloss execution.
- [ ] Eval response payloads preserve the full lineloss artifact JSON string.
- [ ] `src/compat/deterministic_submit.rs` business rules remain transport-neutral.
- [ ] No ws-specific lineloss args were introduced.
- [ ] Zhihu ws tests still pass unchanged in behavior.
- [ ] No ordinary Zhihu request is intercepted by lineloss deterministic routing.
- [ ] No new transport-specific business branch was added for lineloss.

## Implementation notes

- Default to changing the current transport/result seam first: `src/pipe/protocol.rs` and `src/pipe/browser_tool.rs`.
- Treat `src/compat/browser_script_skill_tool.rs` and `src/compat/direct_skill_runtime.rs` as shared seams: change them only if a focused failing test shows a transport-neutral compatibility bug.
- If a proposed fix requires changing `src/compat/deterministic_submit.rs` business logic, stop and re-evaluate; that likely means the seam fix is happening at the wrong layer.
- If a proposed fix changes Zhihu expectations, stop and repair the seam instead.
@@ -0,0 +1,228 @@
# Async Browser Script Support Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Modify the `build_eval_js` function to support async scripts, fixing the problem where a Promise is serialized by JSON.stringify as `{}`.

**Architecture:** Change the JavaScript generated by `build_eval_js` from a synchronous IIFE to an async IIFE, use `await` to wait for the script's result, and detect Promise-like objects so they can be awaited a second time.

**Tech Stack:** Rust, JavaScript (generated code)

---

## File Structure

| File | Action | Description |
|------|------|------|
| `src/browser/callback_backend.rs` | Modify | Modify the `build_eval_js` function |
| `tests/browser_script_skill_tool_test.rs` | Add test | Add an async script test case |

---

### Task 1: Modify build_eval_js to support async scripts

**Files:**
- Modify: `src/browser/callback_backend.rs:433-447`

**Current code:**
```rust
fn build_eval_js(source_url: &str, script: &str) -> String {
    let escaped_source_url = escape_js_single_quoted(source_url);
    let callback = EVAL_CALLBACK_NAME;
    let events_url = escape_js_single_quoted(&events_endpoint_url(source_url));

    format!(
        "(function(){{try{{var v=(function(){{return {script}}})();\
        var t=(typeof v==='string')?v:JSON.stringify(v);\
        try{{callBackJsToCpp('{escaped_source_url}@_@'+window.location.href+'@_@{callback}@_@sgBrowserExcuteJsCodeByDomain@_@'+(t??''))}}catch(_){{}}\
        var j=JSON.stringify({{type:'callback',callback:'{callback}',request_url:'{escaped_source_url}',payload:{{value:(t??'')}}}});\
        try{{var r=new XMLHttpRequest();r.open('POST','{events_url}',true);r.setRequestHeader('Content-Type','application/json');r.send(j)}}catch(_){{}}\
        try{{navigator.sendBeacon('{events_url}',new Blob([j],{{type:'application/json'}}))}}catch(_){{}}\
        }}catch(e){{}}}})()"
    )
}
```

**Modified code:**
```rust
fn build_eval_js(source_url: &str, script: &str) -> String {
    let escaped_source_url = escape_js_single_quoted(source_url);
    let callback = EVAL_CALLBACK_NAME;
    let events_url = escape_js_single_quoted(&events_endpoint_url(source_url));

    format!(
        "(async function(){{try{{\
        var v=await (async function(){{return {script}}})();\
        if(v&&typeof v.then==='function'){{v=await v;}}\
        var t=(typeof v==='string')?v:JSON.stringify(v);\
        try{{callBackJsToCpp('{escaped_source_url}@_@'+window.location.href+'@_@{callback}@_@sgBrowserExcuteJsCodeByDomain@_@'+(t??''))}}catch(_){{}}\
        var j=JSON.stringify({{type:'callback',callback:'{callback}',request_url:'{escaped_source_url}',payload:{{value:(t??'')}}}});\
        try{{var r=new XMLHttpRequest();r.open('POST','{events_url}',true);r.setRequestHeader('Content-Type','application/json');r.send(j)}}catch(_){{}}\
        try{{navigator.sendBeacon('{events_url}',new Blob([j],{{type:'application/json'}}))}}catch(_){{}}\
        }}catch(e){{}}}})()"
    )
}
```

**Key changes:**
1. `(function()` → `(async function()` - the whole IIFE becomes async
2. `var v=(function(){return {script}})()` → `var v=await (async function(){return {script}})()` - the inner wrapper also becomes async and is awaited
3. Added `if(v&&typeof v.then==='function'){v=await v;}` - detects and awaits Promise-like objects

- [ ] **Step 1: Modify the build_eval_js function**

Edit lines 433-447 of `src/browser/callback_backend.rs`, replacing them with the new code above.

- [ ] **Step 2: Verify compilation**

Run: `cargo build`
Expected: builds successfully with no errors

- [ ] **Step 3: Run the existing tests**

Run: `cargo test browser_script_skill_tool`
Expected: all tests pass

- [ ] **Step 4: Commit**

```bash
git add src/browser/callback_backend.rs
git commit -m "fix: support async browser scripts in build_eval_js

Wrap eval script in async IIFE and await Promise-like results.
Fixes Promise serialization returning '{}' for async skill scripts.

🤖 Generated with [Qoder][https://qoder.com]"
```

---

### Task 2: Add an async script test case

**Files:**
- Modify: `tests/browser_script_skill_tool_test.rs`

- [ ] **Step 1: Add the async script test case**

Add the new test at the end of `tests/browser_script_skill_tool_test.rs`:

```rust
#[tokio::test]
async fn execute_browser_script_tool_awaits_async_script() {
    let skill_dir = unique_temp_dir("sgclaw-browser-script-async");
    let scripts_dir = skill_dir.join("scripts");
    fs::create_dir_all(&scripts_dir).unwrap();
    // Async script that returns a Promise
    fs::write(
        scripts_dir.join("async_extract.js"),
        "return (async function() { return { async: true, args: args }; })();\n",
    )
    .unwrap();

    let transport = Arc::new(MockTransport::new(vec![BrowserMessage::Response {
        seq: 1,
        success: true,
        data: json!({
            "text": {
                "async": true,
                "args": { "expected_domain": "example.com" }
            }
        }),
        aom_snapshot: vec![],
        timing: Timing {
            queue_ms: 1,
            exec_ms: 5,
        },
    }]));

    // Policy that allows example.com in addition to www.zhihu.com
    let policy_json = MacPolicy::from_json_str(
        r#"{
            "version": "1.0",
            "domains": { "allowed": ["www.zhihu.com", "example.com"] },
            "pipe_actions": {
                "allowed": ["click", "type", "navigate", "getText", "eval"],
                "blocked": []
            }
        }"#,
    )
    .unwrap();

    let browser_tool = BrowserPipeTool::new(
        transport.clone(),
        policy_json,
        vec![1, 2, 3, 4, 5, 6, 7, 8],
    )
    .with_response_timeout(Duration::from_secs(1));

    let skill_tool = SkillTool {
        name: "async_extract".to_string(),
        description: "Extract data asynchronously".to_string(),
        kind: "browser_script".to_string(),
        command: "scripts/async_extract.js".to_string(),
        args: HashMap::new(),
    };

    let result = execute_browser_script_tool(
        &skill_tool,
        &skill_dir,
        &PipeBrowserBackend::from_inner(browser_tool),
        json!({
            "expected_domain": "example.com"
        }),
    )
    .await
    .unwrap();

    assert!(result.success);
    let output = serde_json::from_str::<serde_json::Value>(&result.output).unwrap();
    assert_eq!(output["async"], true);
}
```

- [ ] **Step 2: Run the new test**

Run: `cargo test execute_browser_script_tool_awaits_async_script`
Expected: the test passes

- [ ] **Step 3: Commit**

```bash
git add tests/browser_script_skill_tool_test.rs
git commit -m "test: add async browser script test case

🤖 Generated with [Qoder][https://qoder.com]"
```

---

### Task 3: End-to-end verification

**Files:**
- No file changes; verification only

- [ ] **Step 1: Full build**

Run: `cargo build`
Expected: builds successfully

- [ ] **Step 2: Run all tests**

Run: `cargo test`
Expected: all tests pass

- [ ] **Step 3: Manual end-to-end test**

Use the service console to test `tq-lineloss-report.collect_lineloss`:
1. Start sgclaw: `target/debug/sg_claw.exe`
2. In the service console, enter: `兰州公司 台区线损大数据 月累计线损率统计分析。。。`
3. Expected result: actual report data is returned rather than `{}`

---

## Self-check checklist

- [x] Spec coverage: every point in the design doc has a corresponding task
- [x] No placeholders: all code is complete
- [x] Type consistency: function signatures are unchanged
73
docs/superpowers/plans/2026-04-13-async-eval-then-fix.md
Normal file
@@ -0,0 +1,73 @@
# Async Eval .then() Fix Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Fix `build_eval_js` to handle async script return values using `.then()` instead of `async IIFE`.

**Architecture:** Extract callback-sending logic into a `_s` helper function inside the generated JS. If the script returns a Promise, call `_s` via `.then()`; otherwise call `_s` synchronously. This keeps the outer IIFE synchronous for C++ injection compatibility.

**Tech Stack:** Rust, JavaScript

---

## Files

- Modify: `src/browser/callback_backend.rs:433-447` - `build_eval_js` function

---

### Task 1: Modify build_eval_js to support async via .then()

**Files:**
- Modify: `src/browser/callback_backend.rs:433-447`

- [ ] **Step 1: Replace build_eval_js implementation**

Replace the entire `build_eval_js` function body (lines 433-447) with:

```rust
fn build_eval_js(source_url: &str, script: &str) -> String {
    let escaped_source_url = escape_js_single_quoted(source_url);
    let callback = EVAL_CALLBACK_NAME;
    let events_url = escape_js_single_quoted(&events_endpoint_url(source_url));

    format!(
        "(function(){{try{{\
        var v=(function(){{return {script}}})();\
        function _s(v){{\
        var t=(typeof v==='string')?v:JSON.stringify(v);\
        try{{callBackJsToCpp('{escaped_source_url}@_@'+window.location.href+'@_@{callback}@_@sgBrowserExcuteJsCodeByDomain@_@'+(t??''))}}catch(_){{}}\
        var j=JSON.stringify({{type:'callback',callback:'{callback}',request_url:'{escaped_source_url}',payload:{{value:(t??'')}}}});\
        try{{var r=new XMLHttpRequest();r.open('POST','{events_url}',true);r.setRequestHeader('Content-Type','application/json');r.send(j)}}catch(_){{}}\
        try{{navigator.sendBeacon('{events_url}',new Blob([j],{{type:'application/json'}}))}}catch(_){{}}\
        }}\
        if(v&&typeof v.then==='function'){{v.then(_s).catch(function(){{}});}}else{{_s(v);}}\
        }}catch(e){{}}}})()"
    )
}
```
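
To make the control flow easier to see, here is a stripped-down sketch of the same dispatch with the three transport calls collapsed into a single call. It is not the plan's implementation: `wrap_with_then` is a hypothetical name, and `send` is a placeholder JS function assumed to exist on the page purely for illustration.

```rust
/// Build a synchronous IIFE that forwards the script result through `_s`.
/// If the result is thenable, `_s` runs from `.then()`; otherwise it runs
/// immediately. `send` is a placeholder for the real callback plumbing.
pub fn wrap_with_then(script: &str) -> String {
    format!(
        "(function(){{try{{\
        var v=(function(){{return {script}}})();\
        function _s(v){{send(typeof v==='string'?v:JSON.stringify(v));}}\
        if(v&&typeof v.then==='function'){{v.then(_s).catch(function(){{}});}}else{{_s(v);}}\
        }}catch(e){{}}}})()"
    )
}
```

The outer function carries no `async` keyword, so the injected expression still evaluates synchronously; only the delivery of the result is deferred when the script returns a Promise.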

- [ ] **Step 2: Run tests**

Run: `cargo test browser_script_skill_tool --no-fail-fast`

Expected: All tests pass.

- [ ] **Step 3: Run full test suite**

Run: `cargo test`

Expected: All tests pass (except pre-existing `lineloss_period_resolver_prompts_for_missing_period` failure which is unrelated).

- [ ] **Step 4: Build**

Run: `cargo build`

Expected: Compiles with no errors.

- [ ] **Step 5: Commit**

```bash
git add src/browser/callback_backend.rs
git commit -m "fix: support async browser scripts via .then() in build_eval_js"
```
52
docs/superpowers/plans/2026-04-13-expected-domain-arg-fix.md
Normal file
@@ -0,0 +1,52 @@
# Expected Domain Arg Fix Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Fix browser_script_skill_tool to pass expected_domain to wrapped JS scripts.

**Architecture:** Insert the normalized expected_domain back into args HashMap after domain normalization, before script wrapping.

**Tech Stack:** Rust, serde_json

---

## Files

- Modify: `src/compat/browser_script_skill_tool.rs:210` - Insert expected_domain back into args

---

### Task 1: Insert expected_domain into args

**Files:**
- Modify: `src/compat/browser_script_skill_tool.rs:210`

- [ ] **Step 1: Add expected_domain to args after normalization**

Edit `src/compat/browser_script_skill_tool.rs`, insert after line 209 (`eprintln!("[execute_browser_script_impl] expected_domain: {}", expected_domain);`):

```rust
args.insert("expected_domain".to_string(), Value::String(expected_domain.clone()));
```

The context around line 209-211 should look like this after the edit:

```rust
eprintln!("[execute_browser_script_impl] expected_domain: {}", expected_domain);
args.insert("expected_domain".to_string(), Value::String(expected_domain.clone()));

for required_arg in tool.args.keys() {
```
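
The remove, normalize, reinsert pattern behind this fix can be sketched in isolation. The sketch makes several assumptions that are not confirmed by the plan: it uses `HashMap<String, String>` instead of `serde_json::Value`, it assumes the key is taken out of the map before normalization, and `normalize_domain` (trim plus lowercase) is a hypothetical stand-in for whatever normalization the real code performs. `prepare_args` is likewise a made-up name.

```rust
use std::collections::HashMap;

/// Hypothetical normalization: trim whitespace and lowercase ASCII.
fn normalize_domain(raw: &str) -> String {
    raw.trim().to_ascii_lowercase()
}

/// Take `expected_domain` out of the args, normalize it, and put the
/// normalized value back so the wrapped script still receives it.
pub fn prepare_args(mut args: HashMap<String, String>) -> Result<HashMap<String, String>, String> {
    let raw = args
        .remove("expected_domain")
        .ok_or_else(|| "expected_domain is required".to_string())?;
    let expected_domain = normalize_domain(&raw);
    // The step this plan adds: without it the script would see no expected_domain.
    args.insert("expected_domain".to_string(), expected_domain);
    Ok(args)
}
```

Leaving out the final `insert` reproduces the kind of bug the plan describes: the domain is validated on the Rust side but never reaches the script's `args`.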

- [ ] **Step 2: Run tests to verify the fix**

Run: `cargo test browser_script_skill_tool --no-fail-fast -- --nocapture`

Expected: All tests pass, including `execute_browser_script_tool_runs_packaged_script_with_expected_domain`

- [ ] **Step 3: Commit**

```bash
git add src/compat/browser_script_skill_tool.rs
git commit -m "fix: pass expected_domain to wrapped browser scripts"
```
163
docs/superpowers/plans/2026-04-13-lineloss-requesturl-fix.md
Normal file
@@ -0,0 +1,163 @@
|
||||
# Lineloss (台区线损) requesturl Quick Fix Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add a lineloss URL mapping to `derive_request_url_from_instruction` so that the `sgHideBrowerserOpenPage` command executes correctly.

**Architecture:** Append a hard-coded mapping for the lineloss scenario after the existing Zhihu URL mapping pattern.

**Tech Stack:** Rust

---

### Task 1: Add a Test Case

**Files:**
- Modify: `src/service/server.rs:828` (tests module)

- [ ] **Step 1: Add a lineloss URL mapping test to the tests module**

Add a new test after the `initial_request_url_falls_back_to_zhihu_origin_for_generated_article_publish_routes` test:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn initial_request_url_falls_back_to_lineloss_origin_for_lineloss_instructions() {
|
||||
let request = SubmitTaskRequest {
|
||||
instruction: "兰州公司 台区线损大数据 月累计线损率统计分析。。。".to_string(),
|
||||
..SubmitTaskRequest::default()
|
||||
};
|
||||
|
||||
assert_eq!(
|
||||
initial_request_url_for_submit_task(&request),
|
||||
"http://20.76.57.61:18080"
|
||||
);
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Run the test to verify it fails**

Run: `cargo test initial_request_url_falls_back_to_lineloss_origin_for_lineloss_instructions -- --nocapture`

Expected: FAIL. The test should fail because the mapping logic has not been implemented yet.

- [ ] **Step 3: Commit the test**
|
||||
|
||||
```bash
|
||||
git add src/service/server.rs
|
||||
git commit -m "test: add lineloss requesturl mapping test"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Implement the Lineloss URL Mapping

**Files:**
- Modify: `src/service/server.rs:354-382` (the `derive_request_url_from_instruction` function)

- [ ] **Step 1: Add the lineloss mapping to `derive_request_url_from_instruction`**

Add the following after the second Zhihu check block and before `None`:
|
||||
|
||||
```rust
    // Lineloss-related instructions
    // TODO: temporary workaround; the URL should later come from the skill config
    // or from the deterministic_submit parse result
    if instruction.contains("线损") || instruction.contains("lineloss") {
        return Some("http://20.76.57.61:18080".to_string());
    }

    None
```
|
||||
|
||||
The complete function should be:
|
||||
|
||||
```rust
fn derive_request_url_from_instruction(instruction: &str) -> Option<String> {
    if crate::compat::workflow_executor::detect_route(instruction, None, None)
        .is_some_and(|route| {
            matches!(
                route,
                crate::compat::workflow_executor::WorkflowRoute::ZhihuHotlistExportXlsx
                    | crate::compat::workflow_executor::WorkflowRoute::ZhihuHotlistScreen
                    | crate::compat::workflow_executor::WorkflowRoute::ZhihuArticleEntry
                    | crate::compat::workflow_executor::WorkflowRoute::ZhihuArticleAutoPublishGenerated
            )
        })
    {
        return Some("https://www.zhihu.com".to_string());
    }

    if crate::compat::workflow_executor::detect_route(instruction, None, None)
        .is_some_and(|route| {
            matches!(
                route,
                crate::compat::workflow_executor::WorkflowRoute::ZhihuArticleDraft
                    | crate::compat::workflow_executor::WorkflowRoute::ZhihuArticlePublish
            )
        })
    {
        return Some("https://zhuanlan.zhihu.com".to_string());
    }

    // Lineloss-related instructions
    // TODO: temporary workaround; the URL should later come from the skill config
    // or from the deterministic_submit parse result
    if instruction.contains("线损") || instruction.contains("lineloss") {
        return Some("http://20.76.57.61:18080".to_string());
    }

    None
}
```
|
||||
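The `TODO` in the function notes that the hard-coded check is temporary. One possible intermediate shape, sketched here purely as an illustration, is a small keyword-to-URL table, so that new scenarios become data rather than new `if` blocks. The names `KEYWORD_URLS` and `derive_url_from_keywords` are hypothetical and not part of the plan; the single entry reuses the keywords and URL from the code above.

```rust
/// Hypothetical keyword-to-URL table; in the plan the single lineloss entry
/// is hard-coded inside `derive_request_url_from_instruction`.
const KEYWORD_URLS: &[(&[&str], &str)] = &[
    (&["线损", "lineloss"], "http://20.76.57.61:18080"),
];

/// Returns the URL of the first table entry with a keyword that occurs in
/// the instruction, or `None` when no entry matches.
fn derive_url_from_keywords(instruction: &str) -> Option<String> {
    KEYWORD_URLS
        .iter()
        .find(|(keywords, _)| keywords.iter().any(|kw| instruction.contains(*kw)))
        .map(|(_, url)| url.to_string())
}
```

A table like this would still be a stopgap; reading the URL from the skill configuration, as the `TODO` suggests, removes the need for keyword matching altogether.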
|
||||
- [ ] **Step 2: Run the test to verify it passes**

Run: `cargo test initial_request_url_falls_back_to_lineloss_origin_for_lineloss_instructions -- --nocapture`

Expected: PASS

- [ ] **Step 3: Run all related tests**

Run: `cargo test initial_request_url -- --nocapture`

Expected: all tests pass.

- [ ] **Step 4: Build the project**

Run: `cargo build`

Expected: compiles successfully with no errors.

- [ ] **Step 5: Commit the implementation**
|
||||
|
||||
```bash
git add src/service/server.rs
git commit -m "feat: add lineloss URL mapping in derive_request_url_from_instruction

Temporary workaround: return the lineloss platform URL when the instruction contains '线损' or 'lineloss'

🤖 Generated with [Qoder][https://qoder.com]"
```
|
||||
|
||||
---
|
||||
|
||||
### Task 3: End-to-End Verification

**Files:**
- No file changes; verification only

- [ ] **Step 1: Stop any running sgclaw process**

Make sure no `sg_claw.exe` process is running.

- [ ] **Step 2: Start the sgclaw service**

Run: `target\debug\sg_claw.exe --config-path ..\sgclaw_config.json service`

- [ ] **Step 3: Send the test instruction from the service console**

Instruction: `兰州公司 台区线损大数据 月累计线损率统计分析。。。`

Expected: the log shows `bootstrap_url=http://20.76.57.61:18080` instead of `about:blank`.

- [ ] **Step 4: Verify that the helper page opens successfully**

Expected: the log shows `helper_loaded=true, ready=true` and no longer times out.
|
||||
76
docs/superpowers/plans/2026-04-13-lineloss-target-url-fix.md
Normal file
@@ -0,0 +1,76 @@
|
||||
# Lineloss (台区线损) Missing target_url Fix Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add the `target_url` parameter when `browser_script_skill_tool.rs` calls `Action::Eval`.

**Architecture:** Build the full URL from `expected_domain` (`http://{expected_domain}`) and add it to the params passed to `invoke`.

**Tech Stack:** Rust, serde_json

---

### Task 1: Add the target_url Parameter

**Files:**
- Modify: `src/compat/browser_script_skill_tool.rs:238-241` (the `invoke` call)

- [ ] **Step 1: Update the invoke call to pass target_url**

Change:
|
||||
```rust
|
||||
let result = match browser_tool.invoke(
|
||||
Action::Eval,
|
||||
json!({ "script": wrapped_script }),
|
||||
&expected_domain,
|
||||
) {
|
||||
```
|
||||
|
||||
To:
|
||||
```rust
|
||||
let target_url = format!("http://{}", expected_domain);
|
||||
let result = match browser_tool.invoke(
|
||||
Action::Eval,
|
||||
json!({
|
||||
"script": wrapped_script,
|
||||
"target_url": target_url,
|
||||
}),
|
||||
&expected_domain,
|
||||
) {
|
||||
```
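The `format!` call above always prepends `http://`. Whether `expected_domain` can ever arrive with a scheme already attached is not stated in this plan, so the following is only a defensive sketch under that assumption. The helper name `target_url_for_domain` is hypothetical.

```rust
/// Hypothetical helper: builds the `target_url` for `Action::Eval` from a domain.
/// If the value already carries a scheme it is returned unchanged; otherwise
/// `http://` is prepended, matching the `format!` call in the plan.
fn target_url_for_domain(expected_domain: &str) -> String {
    if expected_domain.starts_with("http://") || expected_domain.starts_with("https://") {
        expected_domain.to_string()
    } else {
        format!("http://{expected_domain}")
    }
}
```

For the lineloss host used in this plan, `target_url_for_domain("20.76.57.61:18080")` produces the same string as the inline `format!`.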
|
||||
|
||||
- [ ] **Step 2: Build the project**

Run: `cargo build`

Expected: compiles successfully with no errors.

- [ ] **Step 3: Commit the change**
|
||||
|
||||
```bash
|
||||
git add src/compat/browser_script_skill_tool.rs
|
||||
git commit -m "fix: add target_url param for Action::Eval in browser_script_skill_tool
|
||||
|
||||
🤖 Generated with [Qoder][https://qoder.com]"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 2: End-to-End Verification

**Files:**
- No file changes; verification only

- [ ] **Step 1: Stop any running sgclaw process**

Make sure no `sg_claw.exe` process is running.

- [ ] **Step 2: Start the sgclaw service**

Run: `target\debug\sg_claw.exe --config-path ..\sgclaw_config.json service`

- [ ] **Step 3: Send the test instruction from the service console**

Instruction: `兰州公司 台区线损大数据 月累计线损率统计分析。。。`

Expected: the log shows `invoke 成功` (invoke succeeded) and the `target_url is required for eval` error no longer appears.
|
||||
@@ -0,0 +1,912 @@
|
||||
# Rust-Side Lineloss XLSX Export Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Move XLSX export from browser JS (blocked by CORS) to Rust side, so `collect_lineloss.js` only collects data and Rust generates the `.xlsx` file locally.
|
||||
|
||||
**Architecture:** JS collects API data and returns a `report-artifact` JSON with `rows`, `column_defs`, and metadata. Rust parses the artifact, extracts rows + column definitions, and generates a standard `.xlsx` file using the `zip` crate + OpenXML XML strings (same pattern as `openxml_office_tool.rs`). Report log is deferred.
|
||||
|
||||
**Tech Stack:** Rust, `zip` 0.6.6, `serde_json`, OpenXML Spreadsheet ML, JavaScript (browser-injected)
|
||||
|
||||
**Spec:** `docs/superpowers/specs/2026-04-13-rust-side-lineloss-xlsx-export.md`
|
||||
|
||||
---
|
||||
|
||||
## File Structure
|
||||
|
||||
| File | Responsibility |
|
||||
|------|---------------|
|
||||
| `src/compat/lineloss_xlsx_export.rs` | **New.** Pure XLSX generation: takes column defs + row data, produces `.xlsx` file. No business logic. |
|
||||
| `src/compat/deterministic_submit.rs` | **Modify.** After receiving JS artifact, extract rows + column_defs, call XLSX export, attach path to outcome. |
|
||||
| `src/compat/mod.rs` | **Modify.** Register `lineloss_xlsx_export` module. |
|
||||
| `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js` | **Modify.** Remove `exportWorkbook`/`writeReportLog` calls. Add `column_defs` to artifact. |
|
||||
| `tests/lineloss_xlsx_export_test.rs` | **New.** Unit tests for XLSX generation. |
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Create `lineloss_xlsx_export.rs` with Tests
|
||||
|
||||
**Files:**
|
||||
- Create: `src/compat/lineloss_xlsx_export.rs`
|
||||
- Create: `tests/lineloss_xlsx_export_test.rs`
|
||||
- Modify: `src/compat/mod.rs`
|
||||
|
||||
- [ ] **Step 1: Register the new module in `src/compat/mod.rs`**
|
||||
|
||||
Add the module declaration in alphabetical order. In `src/compat/mod.rs`, insert after `pub mod event_bridge;`:
|
||||
|
||||
```rust
|
||||
pub mod lineloss_xlsx_export;
|
||||
```
|
||||
|
||||
The full file becomes:
|
||||
|
||||
```rust
|
||||
pub mod artifact_open;
|
||||
pub mod browser_script_skill_tool;
|
||||
pub mod browser_tool_adapter;
|
||||
pub mod config_adapter;
|
||||
pub mod cron_adapter;
|
||||
pub mod deterministic_submit;
|
||||
pub mod direct_skill_runtime;
|
||||
pub mod event_bridge;
|
||||
pub mod lineloss_xlsx_export;
|
||||
pub mod memory_adapter;
|
||||
pub mod openxml_office_tool;
|
||||
pub mod orchestration;
|
||||
pub mod runtime;
|
||||
pub mod screen_html_export_tool;
|
||||
pub mod tq_lineloss;
|
||||
pub mod workflow_executor;
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Write the failing test for XLSX generation**
|
||||
|
||||
Create `tests/lineloss_xlsx_export_test.rs`:
|
||||
|
||||
```rust
|
||||
use std::fs;
|
||||
use std::path::PathBuf;
|
||||
|
||||
use serde_json::json;
|
||||
use sgclaw::compat::lineloss_xlsx_export::{export_lineloss_xlsx, LinelossExportRequest};
|
||||
|
||||
fn temp_output_path(name: &str) -> PathBuf {
|
||||
let dir = std::env::temp_dir().join("sgclaw-test-xlsx");
|
||||
fs::create_dir_all(&dir).unwrap();
|
||||
dir.join(name)
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn export_month_lineloss_produces_valid_xlsx() {
|
||||
let output_path = temp_output_path("month-test.xlsx");
|
||||
if output_path.exists() {
|
||||
fs::remove_file(&output_path).unwrap();
|
||||
}
|
||||
|
||||
let request = LinelossExportRequest {
|
||||
sheet_name: "国网兰州供电公司月度线损分析报表(2026-03)".to_string(),
|
||||
column_defs: vec![
|
||||
("ORG_NAME".to_string(), "供电单位".to_string()),
|
||||
("YGDL".to_string(), "累计供电量".to_string()),
|
||||
("YYDL".to_string(), "累计售电量".to_string()),
|
||||
("YXSL".to_string(), "线损完成率(%)".to_string()),
|
||||
("RAT_SCOPE".to_string(), "线损率累计目标值".to_string()),
|
||||
("BLANK3".to_string(), "目标完成率".to_string()),
|
||||
("BLANK2".to_string(), "排行".to_string()),
|
||||
],
|
||||
rows: vec![
|
||||
serde_json::from_value(json!({
|
||||
"ORG_NAME": "城关供电",
|
||||
"YGDL": "12345.67",
|
||||
"YYDL": "11234.56",
|
||||
"YXSL": "9.00",
|
||||
"RAT_SCOPE": "9.50",
|
||||
"BLANK3": "94.74",
|
||||
"BLANK2": "1"
|
||||
}))
|
||||
.unwrap(),
|
||||
serde_json::from_value(json!({
|
||||
"ORG_NAME": "七里河供电",
|
||||
"YGDL": "9876.54",
|
||||
"YYDL": "8765.43",
|
||||
"YXSL": "11.24",
|
||||
"RAT_SCOPE": "10.00",
|
||||
"BLANK3": "112.40",
|
||||
"BLANK2": "2"
|
||||
}))
|
||||
.unwrap(),
|
||||
],
|
||||
output_path: output_path.clone(),
|
||||
};
|
||||
|
||||
let result_path = export_lineloss_xlsx(&request).unwrap();
|
||||
assert_eq!(result_path, output_path);
|
||||
assert!(output_path.exists());
|
||||
|
||||
// Verify it's a valid ZIP (xlsx is a zip archive)
|
||||
let file = fs::File::open(&output_path).unwrap();
|
||||
let mut archive = zip::ZipArchive::new(file).unwrap();
|
||||
|
||||
// Must contain the standard OpenXML entries
|
||||
let entry_names: Vec<String> = (0..archive.len())
|
||||
.map(|i| archive.by_index(i).unwrap().name().to_string())
|
||||
.collect();
|
||||
|
||||
assert!(entry_names.contains(&"[Content_Types].xml".to_string()));
|
||||
assert!(entry_names.contains(&"xl/worksheets/sheet1.xml".to_string()));
|
||||
assert!(entry_names.contains(&"xl/workbook.xml".to_string()));
|
||||
|
||||
// Read sheet1.xml and verify it contains our data
|
||||
let mut sheet = archive.by_name("xl/worksheets/sheet1.xml").unwrap();
|
||||
let mut xml = String::new();
|
||||
std::io::Read::read_to_string(&mut sheet, &mut xml).unwrap();
|
||||
|
||||
assert!(xml.contains("供电单位"), "header row should contain 供电单位");
|
||||
assert!(xml.contains("累计供电量"), "header row should contain 累计供电量");
|
||||
assert!(xml.contains("城关供电"), "data should contain 城关供电");
|
||||
assert!(xml.contains("12345.67"), "data should contain 12345.67");
|
||||
assert!(xml.contains("七里河供电"), "data should contain second row");
|
||||
|
||||
// Cleanup
|
||||
fs::remove_file(&output_path).unwrap();
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn export_empty_rows_returns_error() {
|
||||
let output_path = temp_output_path("empty-test.xlsx");
|
||||
|
||||
let request = LinelossExportRequest {
|
||||
sheet_name: "test".to_string(),
|
||||
column_defs: vec![("A".to_string(), "ColA".to_string())],
|
||||
rows: vec![],
|
||||
output_path: output_path.clone(),
|
||||
};
|
||||
|
||||
let result = export_lineloss_xlsx(&request);
|
||||
assert!(result.is_err());
|
||||
assert!(
|
||||
result.unwrap_err().to_string().contains("rows must not be empty"),
|
||||
"should reject empty rows"
|
||||
);
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Run the test to verify it fails**
|
||||
|
||||
Run: `cargo test --test lineloss_xlsx_export_test -- --nocapture`
|
||||
|
||||
Expected: compilation error — `lineloss_xlsx_export` module doesn't exist yet or `export_lineloss_xlsx` / `LinelossExportRequest` not defined.
|
||||
|
||||
- [ ] **Step 4: Implement `src/compat/lineloss_xlsx_export.rs`**
|
||||
|
||||
```rust
|
||||
use std::fs;
|
||||
use std::io::Write;
|
||||
use std::path::{Path, PathBuf};
|
||||
|
||||
use serde_json::{Map, Value};
|
||||
use zip::write::FileOptions;
|
||||
use zip::{CompressionMethod, ZipWriter};
|
||||
|
||||
pub struct LinelossExportRequest {
|
||||
pub sheet_name: String,
|
||||
pub column_defs: Vec<(String, String)>,
|
||||
pub rows: Vec<Map<String, Value>>,
|
||||
pub output_path: PathBuf,
|
||||
}
|
||||
|
||||
pub fn export_lineloss_xlsx(request: &LinelossExportRequest) -> anyhow::Result<PathBuf> {
|
||||
if request.rows.is_empty() {
|
||||
anyhow::bail!("rows must not be empty");
|
||||
}
|
||||
if request.column_defs.is_empty() {
|
||||
anyhow::bail!("column_defs must not be empty");
|
||||
}
|
||||
|
||||
let sheet_xml = build_worksheet_xml(&request.column_defs, &request.rows);
|
||||
|
||||
write_xlsx(
|
||||
&request.output_path,
|
||||
&request.sheet_name,
|
||||
&sheet_xml,
|
||||
)?;
|
||||
|
||||
Ok(request.output_path.clone())
|
||||
}
|
||||
|
||||
fn build_worksheet_xml(
|
||||
column_defs: &[(String, String)],
|
||||
rows: &[Map<String, Value>],
|
||||
) -> String {
|
||||
let mut xml_rows = Vec::with_capacity(rows.len() + 1);
|
||||
|
||||
// Header row (row 1)
|
||||
let header_cells: Vec<String> = column_defs
|
||||
.iter()
|
||||
.enumerate()
|
||||
.map(|(col_idx, (_key, label))| {
|
||||
let col_letter = column_letter(col_idx);
|
||||
format!(
|
||||
"<c r=\"{col_letter}1\" t=\"inlineStr\"><is><t>{}</t></is></c>",
|
||||
xml_escape(label)
|
||||
)
|
||||
})
|
||||
.collect();
|
||||
xml_rows.push(format!("<row r=\"1\">{}</row>", header_cells.join("")));
|
||||
|
||||
// Data rows (row 2+)
|
||||
for (row_idx, row) in rows.iter().enumerate() {
|
||||
let excel_row = row_idx + 2;
|
||||
let cells: Vec<String> = column_defs
|
||||
.iter()
|
||||
.enumerate()
|
||||
.map(|(col_idx, (key, _label))| {
|
||||
let col_letter = column_letter(col_idx);
|
||||
let value = row
|
||||
.get(key)
|
||||
.map(|v| value_to_string(v))
|
||||
.unwrap_or_default();
|
||||
format!(
|
||||
"<c r=\"{col_letter}{excel_row}\" t=\"inlineStr\"><is><t>{}</t></is></c>",
|
||||
xml_escape(&value)
|
||||
)
|
||||
})
|
||||
.collect();
|
||||
xml_rows.push(format!("<row r=\"{excel_row}\">{}</row>", cells.join("")));
|
||||
}
|
||||
|
||||
format!(
|
||||
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\
|
||||
<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">\
|
||||
<sheetData>{}</sheetData>\
|
||||
</worksheet>",
|
||||
xml_rows.join("")
|
||||
)
|
||||
}
|
||||
|
||||
fn column_letter(index: usize) -> String {
|
||||
let mut result = String::new();
|
||||
let mut n = index;
|
||||
loop {
|
||||
result.insert(0, (b'A' + (n % 26) as u8) as char);
|
||||
if n < 26 {
|
||||
break;
|
||||
}
|
||||
n = n / 26 - 1;
|
||||
}
|
||||
result
|
||||
}
|
||||
|
||||
fn value_to_string(value: &Value) -> String {
|
||||
match value {
|
||||
Value::String(text) => text.clone(),
|
||||
Value::Number(number) => number.to_string(),
|
||||
Value::Bool(flag) => flag.to_string(),
|
||||
Value::Null => String::new(),
|
||||
other => other.to_string(),
|
||||
}
|
||||
}
|
||||
|
||||
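fn xml_escape(value: &str) -> String {
    // Escape `&` first so the entities introduced below are not escaped again.
    value
        .replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        // The sheet name is written into a double-quoted XML attribute.
        .replace('"', "&quot;")
}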
|
||||
|
||||
fn write_xlsx(output_path: &Path, sheet_name: &str, sheet_xml: &str) -> anyhow::Result<()> {
|
||||
if let Some(parent) = output_path.parent() {
|
||||
fs::create_dir_all(parent)?;
|
||||
}
|
||||
if output_path.exists() {
|
||||
fs::remove_file(output_path)?;
|
||||
}
|
||||
|
||||
let file = fs::File::create(output_path)?;
|
||||
let mut zip = ZipWriter::new(file);
|
||||
let options = FileOptions::default().compression_method(CompressionMethod::Stored);
|
||||
|
||||
zip.start_file("[Content_Types].xml", options)?;
|
||||
zip.write_all(content_types_xml().as_bytes())?;
|
||||
|
||||
zip.start_file("_rels/.rels", options)?;
|
||||
zip.write_all(root_rels_xml().as_bytes())?;
|
||||
|
||||
zip.start_file("docProps/app.xml", options)?;
|
||||
zip.write_all(app_xml().as_bytes())?;
|
||||
|
||||
zip.start_file("docProps/core.xml", options)?;
|
||||
zip.write_all(core_xml().as_bytes())?;
|
||||
|
||||
zip.start_file("xl/workbook.xml", options)?;
|
||||
zip.write_all(workbook_xml(&xml_escape(sheet_name)).as_bytes())?;
|
||||
|
||||
zip.start_file("xl/_rels/workbook.xml.rels", options)?;
|
||||
zip.write_all(workbook_rels_xml().as_bytes())?;
|
||||
|
||||
zip.start_file("xl/worksheets/sheet1.xml", options)?;
|
||||
zip.write_all(sheet_xml.as_bytes())?;
|
||||
|
||||
zip.finish()?;
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn content_types_xml() -> &'static str {
|
||||
r#"<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
|
||||
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
|
||||
<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
|
||||
<Default Extension="xml" ContentType="application/xml"/>
|
||||
<Override PartName="/xl/workbook.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>
|
||||
<Override PartName="/xl/worksheets/sheet1.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml"/>
|
||||
<Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlformats-package.core-properties+xml"/>
|
||||
<Override PartName="/docProps/app.xml" ContentType="application/vnd.openxmlformats-officedocument.extended-properties+xml"/>
|
||||
</Types>"#
|
||||
}
|
||||
|
||||
fn root_rels_xml() -> &'static str {
|
||||
r#"<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
|
||||
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
|
||||
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="xl/workbook.xml"/>
|
||||
<Relationship Id="rId2" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties" Target="docProps/core.xml"/>
|
||||
<Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties" Target="docProps/app.xml"/>
|
||||
</Relationships>"#
|
||||
}
|
||||
|
||||
fn app_xml() -> &'static str {
|
||||
r#"<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
|
||||
<Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
|
||||
xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">
|
||||
<Application>sgClaw</Application>
|
||||
</Properties>"#
|
||||
}
|
||||
|
||||
fn core_xml() -> &'static str {
|
||||
r#"<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
|
||||
<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
|
||||
xmlns:dc="http://purl.org/dc/elements/1.1/"
|
||||
xmlns:dcterms="http://purl.org/dc/terms/"
|
||||
xmlns:dcmitype="http://purl.org/dc/dcmitype/"
|
||||
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
|
||||
<dc:title>台区线损报表</dc:title>
|
||||
</cp:coreProperties>"#
|
||||
}
|
||||
|
||||
fn workbook_xml(sheet_name: &str) -> String {
|
||||
format!(
|
||||
r#"<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
|
||||
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
|
||||
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
|
||||
<sheets>
|
||||
<sheet name="{sheet_name}" sheetId="1" r:id="rId1"/>
|
||||
</sheets>
|
||||
</workbook>"#
|
||||
)
|
||||
}
|
||||
|
||||
fn workbook_rels_xml() -> &'static str {
|
||||
r#"<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
|
||||
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
|
||||
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" Target="worksheets/sheet1.xml"/>
|
||||
</Relationships>"#
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::column_letter;
|
||||
|
||||
#[test]
|
||||
fn column_letter_maps_indices_correctly() {
|
||||
assert_eq!(column_letter(0), "A");
|
||||
assert_eq!(column_letter(1), "B");
|
||||
assert_eq!(column_letter(6), "G");
|
||||
assert_eq!(column_letter(25), "Z");
|
||||
assert_eq!(column_letter(26), "AA");
|
||||
}
|
||||
}
|
||||
```
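Because cell text and the sheet name end up inside XML, the escaping helper is worth checking on its own. The following is a minimal standalone sketch, assuming the five predefined XML entities are the intended replacements. The names `escape_xml_text` and `inline_string_cell` are hypothetical and only mirror the shape of the cell markup built in `build_worksheet_xml`.

```rust
/// Hypothetical standalone version of an XML escaping helper.
/// `&` must be replaced first so that the entities produced by the later
/// replacements are not escaped a second time.
fn escape_xml_text(value: &str) -> String {
    value
        .replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
        .replace('\'', "&apos;")
}

/// Builds one inline-string cell such as `<c r="A1" t="inlineStr">...</c>`,
/// escaping the text so that markup characters cannot break the worksheet XML.
fn inline_string_cell(cell_ref: &str, text: &str) -> String {
    format!(
        "<c r=\"{cell_ref}\" t=\"inlineStr\"><is><t>{}</t></is></c>",
        escape_xml_text(text)
    )
}
```

A test along these lines could sit next to `column_letter_maps_indices_correctly`, since an unescaped `&` or `<` in an organisation name would otherwise yield a file that spreadsheet applications refuse to open.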
|
||||
|
||||
- [ ] **Step 5: Run the tests to verify they pass**
|
||||
|
||||
Run: `cargo test --test lineloss_xlsx_export_test -- --nocapture`
|
||||
|
||||
Expected: both `export_month_lineloss_produces_valid_xlsx` and `export_empty_rows_returns_error` PASS.
|
||||
|
||||
Also run the internal unit test:
|
||||
|
||||
Run: `cargo test lineloss_xlsx_export -- --nocapture`
|
||||
|
||||
Expected: `column_letter_maps_indices_correctly` PASS.
|
||||
|
||||
- [ ] **Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git add src/compat/lineloss_xlsx_export.rs src/compat/mod.rs tests/lineloss_xlsx_export_test.rs
|
||||
git commit -m "feat(lineloss): add Rust-side XLSX generation for lineloss reports"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Integrate XLSX Export into `deterministic_submit.rs`
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/compat/deterministic_submit.rs`
|
||||
|
||||
- [ ] **Step 1: Add imports and helper function to extract export data from artifact**
|
||||
|
||||
At the top of `src/compat/deterministic_submit.rs`, add the import:
|
||||
|
||||
```rust
|
||||
use crate::compat::lineloss_xlsx_export::{export_lineloss_xlsx, LinelossExportRequest};
|
||||
```
|
||||
|
||||
Then add a new helper function after `summarize_lineloss_artifact`:
|
||||
|
||||
```rust
|
||||
struct LinelossArtifactExportData {
|
||||
sheet_name: String,
|
||||
column_defs: Vec<(String, String)>,
|
||||
rows: Vec<Map<String, Value>>,
|
||||
}
|
||||
|
||||
fn extract_export_data(output: &str) -> Option<LinelossArtifactExportData> {
|
||||
let payload: Value = serde_json::from_str(output).ok()?;
|
||||
let artifact = payload
|
||||
.as_object()
|
||||
.and_then(|object| object.get("text"))
|
||||
.unwrap_or(&payload);
|
||||
let artifact = artifact.as_object()?;
|
||||
|
||||
if artifact.get("type").and_then(Value::as_str) != Some("report-artifact") {
|
||||
return None;
|
||||
}
|
||||
|
||||
let status = artifact.get("status").and_then(Value::as_str).unwrap_or("");
|
||||
if !matches!(status, "ok" | "partial") {
|
||||
return None;
|
||||
}
|
||||
|
||||
let rows = artifact
|
||||
.get("rows")
|
||||
.and_then(Value::as_array)?;
|
||||
if rows.is_empty() {
|
||||
return None;
|
||||
}
|
||||
let rows: Vec<Map<String, Value>> = rows
|
||||
.iter()
|
||||
.filter_map(|row| row.as_object().cloned())
|
||||
.collect();
|
||||
if rows.is_empty() {
|
||||
return None;
|
||||
}
|
||||
|
||||
let column_defs: Vec<(String, String)> = artifact
|
||||
.get("column_defs")
|
||||
.and_then(Value::as_array)
|
||||
.map(|defs| {
|
||||
defs.iter()
|
||||
.filter_map(|def| {
|
||||
let arr = def.as_array()?;
|
||||
let key = arr.first()?.as_str()?.to_string();
|
||||
let label = arr.get(1)?.as_str()?.to_string();
|
||||
Some((key, label))
|
||||
})
|
||||
.collect()
|
||||
})
|
||||
.unwrap_or_default();
|
||||
|
||||
// Fallback: if column_defs not in artifact, try "columns" array as keys
|
||||
let column_defs = if column_defs.is_empty() {
|
||||
let columns = artifact
|
||||
.get("columns")
|
||||
.and_then(Value::as_array)?;
|
||||
columns
|
||||
.iter()
|
||||
.filter_map(|col| {
|
||||
let key = col.as_str()?.to_string();
|
||||
Some((key.clone(), key))
|
||||
})
|
||||
.collect()
|
||||
} else {
|
||||
column_defs
|
||||
};
|
||||
|
||||
if column_defs.is_empty() {
|
||||
return None;
|
||||
}
|
||||
|
||||
let org_label = artifact
|
||||
.get("org")
|
||||
.and_then(Value::as_object)
|
||||
.and_then(|org| org.get("label"))
|
||||
.and_then(Value::as_str)
|
||||
.unwrap_or("lineloss");
|
||||
let period_mode = artifact
|
||||
.get("period")
|
||||
.and_then(Value::as_object)
|
||||
.and_then(|p| p.get("mode"))
|
||||
.and_then(Value::as_str)
|
||||
.unwrap_or("month");
|
||||
let period_value = artifact
|
||||
.get("period")
|
||||
.and_then(Value::as_object)
|
||||
.and_then(|p| p.get("value"))
|
||||
.and_then(Value::as_str)
|
||||
.unwrap_or("");
|
||||
let mode_label = if period_mode == "week" { "周度" } else { "月度" };
|
||||
let sheet_name = format!("{org_label}{mode_label}线损分析报表({period_value})");
|
||||
|
||||
Some(LinelossArtifactExportData {
|
||||
sheet_name,
|
||||
column_defs,
|
||||
rows,
|
||||
})
|
||||
}
|
||||
```
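The sheet name above is assembled from artifact fields without validation. Excel limits worksheet names to 31 characters and rejects the characters `\ / ? * [ ] :`, so a long organisation label could produce a workbook that opens with a repair prompt. The sketch below is an assumption about how that could be guarded, not something the plan specifies. Both `sanitize_sheet_name` and `lineloss_sheet_name` are hypothetical names, and the length check counts Unicode scalar values.

```rust
/// Hypothetical sanitizer for worksheet names. Excel rejects names longer
/// than 31 characters and names containing any of `\ / ? * [ ] :`.
fn sanitize_sheet_name(raw: &str) -> String {
    let cleaned: String = raw
        .chars()
        // Replace each forbidden character with an underscore.
        .map(|c| if matches!(c, '\\' | '/' | '?' | '*' | '[' | ']' | ':') { '_' } else { c })
        // Keep at most 31 characters.
        .take(31)
        .collect();
    if cleaned.is_empty() {
        // Fall back to a default name when nothing usable remains.
        "Sheet1".to_string()
    } else {
        cleaned
    }
}

/// Mirrors the `format!` call in `extract_export_data`, then sanitizes the result.
fn lineloss_sheet_name(org_label: &str, period_mode: &str, period_value: &str) -> String {
    let mode_label = if period_mode == "week" { "周度" } else { "月度" };
    sanitize_sheet_name(&format!("{org_label}{mode_label}线损分析报表({period_value})"))
}
```

With the values used in the integration test, the name passes through unchanged, because it is 25 characters long and its full-width parentheses are permitted.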
|
||||
|
||||
- [ ] **Step 2: Add the export-after-collection function**
|
||||
|
||||
Add a new function that wraps the existing flow with XLSX export:
|
||||
|
||||
```rust
|
||||
fn try_export_lineloss_xlsx(
|
||||
output: &str,
|
||||
workspace_root: &Path,
|
||||
) -> Option<PathBuf> {
|
||||
let data = extract_export_data(output)?;
|
||||
let nanos = std::time::SystemTime::now()
|
||||
.duration_since(std::time::UNIX_EPOCH)
|
||||
.map(|d| d.as_nanos())
|
||||
.unwrap_or_default();
|
||||
let out_dir = workspace_root.join("out");
|
||||
let output_path = out_dir.join(format!("tq-lineloss-{nanos}.xlsx"));
|
||||
|
||||
let request = LinelossExportRequest {
|
||||
sheet_name: data.sheet_name,
|
||||
column_defs: data.column_defs,
|
||||
rows: data.rows,
|
||||
output_path,
|
||||
};
|
||||
|
||||
match export_lineloss_xlsx(&request) {
|
||||
Ok(path) => {
|
||||
eprintln!("[deterministic_submit] XLSX exported to: {}", path.display());
|
||||
Some(path)
|
||||
}
|
||||
Err(err) => {
|
||||
eprintln!("[deterministic_submit] XLSX export failed: {err}");
|
||||
None
|
||||
}
|
||||
}
|
||||
}
|
||||
```
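The output path logic in `try_export_lineloss_xlsx` can be exercised without a browser or an artifact if it is pulled into its own function. A minimal sketch follows, using only the standard library. The helper name `timestamped_export_path` is hypothetical, while the `out` directory and the `tq-lineloss-` prefix are taken from the code above.

```rust
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper extracted from `try_export_lineloss_xlsx`: builds
/// `<workspace_root>/out/tq-lineloss-<nanos>.xlsx`.
fn timestamped_export_path(workspace_root: &Path) -> PathBuf {
    // A system clock set before the Unix epoch yields an error; fall back to 0.
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or_default();
    workspace_root
        .join("out")
        .join(format!("tq-lineloss-{nanos}.xlsx"))
}
```

Nanosecond timestamps make collisions unlikely but not impossible. Since `write_xlsx` removes an existing file at the target path, a collision would overwrite the earlier export rather than fail.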
|
||||
|
||||
- [ ] **Step 3: Modify `execute_deterministic_submit_with_browser_backend` to call export**
|
||||
|
||||
Replace the body of `execute_deterministic_submit_with_browser_backend` (lines 119-136 of the original file):
|
||||
|
||||
```rust
|
||||
pub fn execute_deterministic_submit_with_browser_backend(
|
||||
browser_backend: Arc<dyn BrowserBackend>,
|
||||
plan: &DeterministicExecutionPlan,
|
||||
workspace_root: &Path,
|
||||
settings: &SgClawSettings,
|
||||
) -> Result<DirectSubmitOutcome, PipeError> {
|
||||
let args = deterministic_submit_args(plan);
|
||||
let output =
|
||||
crate::compat::direct_skill_runtime::execute_browser_script_skill_raw_output_with_browser_backend(
|
||||
browser_backend,
|
||||
&plan.tool_name,
|
||||
workspace_root,
|
||||
settings,
|
||||
args,
|
||||
)?;
|
||||
|
||||
let export_path = try_export_lineloss_xlsx(&output, workspace_root);
|
||||
Ok(summarize_lineloss_output_with_export(&output, export_path.as_deref()))
|
||||
}
|
||||
```
|
||||
|
||||
Apply the same change to `execute_deterministic_submit` (the non-backend variant, lines 101-117):
|
||||
|
||||
```rust
|
||||
pub fn execute_deterministic_submit<T: Transport + 'static>(
|
||||
browser_tool: BrowserPipeTool<T>,
|
||||
plan: &DeterministicExecutionPlan,
|
||||
workspace_root: &Path,
|
||||
settings: &SgClawSettings,
|
||||
) -> Result<DirectSubmitOutcome, PipeError> {
|
||||
let args = deterministic_submit_args(plan);
|
||||
let output = crate::compat::direct_skill_runtime::execute_browser_script_skill_raw_output(
|
||||
browser_tool,
|
||||
&plan.tool_name,
|
||||
workspace_root,
|
||||
settings,
|
||||
args,
|
||||
)?;
|
||||
|
||||
let export_path = try_export_lineloss_xlsx(&output, workspace_root);
|
||||
Ok(summarize_lineloss_output_with_export(&output, export_path.as_deref()))
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Add `summarize_lineloss_output_with_export` function**
|
||||
|
||||
Add this new function. It wraps the existing `summarize_lineloss_output` and appends the export path:
|
||||
|
||||
```rust
|
||||
fn summarize_lineloss_output_with_export(output: &str, export_path: Option<&Path>) -> DirectSubmitOutcome {
|
||||
let mut outcome = summarize_lineloss_output(output);
|
||||
|
||||
if let Some(path) = export_path {
|
||||
outcome.summary.push_str(&format!(" export_path={}", path.display()));
|
||||
}
|
||||
|
||||
outcome
|
||||
}
|
||||
```
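The wrapper's behaviour is easy to pin down with a unit test. The sketch below uses a hypothetical `Outcome` struct as a stand-in for `DirectSubmitOutcome`, modelling only the `summary` field because the real type's other fields are not shown in this plan. The function name `with_export_path` is likewise an assumption.

```rust
use std::path::Path;

/// Hypothetical stand-in for `DirectSubmitOutcome`; only `summary` is modelled.
struct Outcome {
    summary: String,
}

/// Appends ` export_path=<path>` to the summary when an export was produced,
/// and leaves the summary untouched otherwise.
fn with_export_path(mut outcome: Outcome, export_path: Option<&Path>) -> Outcome {
    if let Some(path) = export_path {
        outcome
            .summary
            .push_str(&format!(" export_path={}", path.display()));
    }
    outcome
}
```

Because a failed export returns `None` from `try_export_lineloss_xlsx`, the summary in that case is identical to the pre-change output, which is why the existing tests are expected to keep passing.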
|
||||
|
||||
- [ ] **Step 5: Run existing tests to ensure nothing breaks**
|
||||
|
||||
Run: `cargo test --test deterministic_submit_test -- --nocapture`
|
||||
|
||||
Expected: all existing tests PASS (the tests don't call `execute_deterministic_submit`, they test `decide_deterministic_submit` and parsing logic which is unchanged).
|
||||
|
||||
Run: `cargo test deterministic_submit -- --nocapture`
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git add src/compat/deterministic_submit.rs
|
||||
git commit -m "feat(lineloss): integrate Rust-side XLSX export into deterministic submit pipeline"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Modify `collect_lineloss.js` to Skip Browser-Side Export
|
||||
|
||||
**Files:**
|
||||
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js`
|
||||
|
||||
- [ ] **Step 1: Add `column_defs` to the artifact returned by `buildArtifact`**
|
||||
|
||||
In the `buildArtifact` function (around line 198), the `columns` field currently contains only column keys (e.g., `["ORG_NAME", "YGDL", ...]`). Add a `column_defs` field that carries the full key and label pairs, and change `buildArtifact` to accept and emit it.
|
||||
|
||||
Find this block in `buildArtifact` (lines 198-242):
|
||||
|
||||
```javascript
|
||||
function buildArtifact({
|
||||
status,
|
||||
blockedReason = '',
|
||||
fatalError = '',
|
||||
org_label = '',
|
||||
org_code = '',
|
||||
period_mode = '',
|
||||
period_mode_code = '',
|
||||
period_value = '',
|
||||
period_payload = {},
|
||||
columns = [],
|
||||
rows = [],
|
||||
export: exportState,
|
||||
reasons = []
|
||||
}) {
|
||||
```
|
||||
|
||||
Replace with:
|
||||
|
||||
```javascript
|
||||
function buildArtifact({
|
||||
status,
|
||||
blockedReason = '',
|
||||
fatalError = '',
|
||||
org_label = '',
|
||||
org_code = '',
|
||||
period_mode = '',
|
||||
period_mode_code = '',
|
||||
period_value = '',
|
||||
period_payload = {},
|
||||
columns = [],
|
||||
column_defs = [],
|
||||
rows = [],
|
||||
export: exportState,
|
||||
reasons = []
|
||||
}) {
|
||||
```
|
||||
|
||||
In the returned object (the `return { ... }` block inside `buildArtifact`), add `column_defs` after `columns`:
|
||||
|
||||
```javascript
|
||||
columns: [...columns],
|
||||
column_defs: [...column_defs],
|
||||
rows: [...rows],
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Pass `column_defs` from `buildBrowserEntrypointResult`**
|
||||
|
||||
In `buildBrowserEntrypointResult`, after the `columns` assignment (around line 452), add:
|
||||
|
||||
```javascript
|
||||
const columns = normalizedArgs.period_mode === 'week' ? WEEK_COLUMNS : MONTH_COLUMNS;
|
||||
const columnDefs = normalizedArgs.period_mode === 'week' ? WEEK_COLUMN_DEFS : MONTH_COLUMN_DEFS;
|
||||
```
|
||||
|
||||
Then in every call to `buildArtifact` inside `buildBrowserEntrypointResult` that follows the `columns` assignment, add `column_defs: columnDefs` alongside `columns`. There are 4 such calls:
|
||||
|
||||
**Call 1** (API error, around line 466):
|
||||
```javascript
|
||||
columns,
|
||||
column_defs: columnDefs,
|
||||
rows: [],
|
||||
```
|
||||
|
||||
**Call 2** (empty rows, around line 483):
|
||||
```javascript
|
||||
columns,
|
||||
column_defs: columnDefs,
|
||||
rows: []
|
||||
```
|
||||
|
||||
**Call 3** (normalization failure, around line 497):
|
||||
```javascript
|
||||
columns,
|
||||
column_defs: columnDefs,
|
||||
rows: [],
|
||||
```
|
||||
|
||||
**Call 4** (success, around line 558):
|
||||
```javascript
|
||||
columns,
|
||||
column_defs: columnDefs,
|
||||
rows,
|
||||
```
|
||||
|
||||
Note: the two `buildArtifact` calls that run before the `columns` variable is assigned (validation failure and page context failure, around lines 422 and 439) don't need `column_defs`. They carry no data, and the `column_defs = []` default from Step 1 covers them.
|
||||
|
||||
- [ ] **Step 3: Remove the `exportWorkbook` and `writeReportLog` calls from the success path**
|
||||
|
||||
In `buildBrowserEntrypointResult`, replace the entire export block (lines 518-556) with a simplified version:
|
||||
|
||||
Find:
|
||||
```javascript
|
||||
const exportState = {
|
||||
attempted: false,
|
||||
status: 'skipped',
|
||||
message: null
|
||||
};
|
||||
|
||||
if (typeof deps.exportWorkbook === 'function') {
|
||||
exportState.attempted = true;
|
||||
try {
|
||||
const exportPayload = buildExportPayload({
|
||||
mode: normalizedArgs.period_mode,
|
||||
orgLabel: normalizedArgs.org_label,
|
||||
periodValue: normalizedArgs.period_value,
|
||||
rows
|
||||
});
|
||||
const exportResult = await deps.exportWorkbook(exportPayload);
|
||||
const exportPath = pickFirstNonEmpty(exportResult?.path, exportResult?.data?.path, exportResult?.data?.data);
|
||||
if (!exportPath) {
|
||||
throw new Error('export_failed');
|
||||
}
|
||||
exportState.status = 'ok';
|
||||
exportState.message = exportPath;
|
||||
|
||||
if (typeof deps.writeReportLog === 'function') {
|
||||
try {
|
||||
const reportLog = await deps.writeReportLog(buildReportName(normalizedArgs), exportPath);
|
||||
if (reportLog?.success === false) {
|
||||
reasons.push('report_log_failed');
|
||||
}
|
||||
} catch (_error) {
|
||||
reasons.push('report_log_failed');
|
||||
}
|
||||
}
|
||||
} catch (error) {
|
||||
reasons.push('export_failed');
|
||||
exportState.status = 'failed';
|
||||
exportState.message = pickFirstNonEmpty(error?.message, 'export_failed');
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Replace with:
|
||||
```javascript
|
||||
// Export is handled by Rust side after receiving the artifact.
|
||||
// JS only provides rows + column_defs in the artifact.
|
||||
const exportState = {
|
||||
attempted: false,
|
||||
status: 'deferred_to_rust',
|
||||
message: null
|
||||
};
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Remove unused constants and functions**
|
||||
|
||||
Remove these constants (lines 5-6), since nothing in the JS file uses them any longer:
|
||||
|
||||
```javascript
|
||||
const EXPORT_SERVICE_URL = 'http://localhost:13313/SurfaceServices/personalBread/export/faultDetailsExportXLSX';
|
||||
const REPORT_LOG_URL = 'http://localhost:13313/ReportServices/Api/setReportLog';
|
||||
```
|
||||
|
||||
Remove the `postJson` function (lines 264-294) — it is no longer needed since no JS-side HTTP calls are made to localhost.
|
||||
|
||||
Remove these functions from `defaultBrowserDeps()`:
|
||||
- `exportWorkbook` (lines 350-373)
|
||||
- `writeReportLog` (lines 375-409)
|
||||
|
||||
Remove these now-unused functions:
|
||||
- `buildExportTitles` (lines 244-254)
|
||||
- `buildExportPayload` (lines 256-262)
|
||||
- `buildReportName` (lines 413-415)
|
||||
|
||||
- [ ] **Step 5: Update the module.exports to remove unused exports**
|
||||
|
||||
Update the `module.exports` block (lines 572-586). Drop any exports of the functions removed in Step 4, and keep `buildBrowserEntrypointResult` exported for test compatibility. The final exports block:
|
||||
|
||||
```javascript
|
||||
if (typeof module !== 'undefined' && module.exports) {
|
||||
module.exports = {
|
||||
MONTH_COLUMNS,
|
||||
WEEK_COLUMNS,
|
||||
MONTH_COLUMN_DEFS,
|
||||
WEEK_COLUMN_DEFS,
|
||||
validateArgs,
|
||||
buildMonthRequest,
|
||||
buildWeekRequest,
|
||||
normalizeRows,
|
||||
determineArtifactStatus,
|
||||
buildArtifact,
|
||||
buildBrowserEntrypointResult
|
||||
};
|
||||
} else {
|
||||
return buildBrowserEntrypointResult(args);
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 6: Verify the JS file has no syntax errors**
|
||||
|
||||
Run: `node -c "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js"`
|
||||
|
||||
Expected: no syntax errors. (Note: the file uses `return` at top level inside a wrapped IIFE when injected into the browser, so Node syntax check may warn — the important thing is no parse errors.)
|
||||
|
||||
Alternatively, check the test file still works:
|
||||
|
||||
Run: `node "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.test.js"`
|
||||
|
||||
Expected: tests pass (or at least no JS parse errors).
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
|
||||
git add "D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/tq-lineloss-report/scripts/collect_lineloss.js"
|
||||
git commit -m "feat(lineloss): remove browser-side export, defer to Rust-side XLSX generation"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 4: Full Build Verification
|
||||
|
||||
**Files:** None (verification only)
|
||||
|
||||
- [ ] **Step 1: Run full cargo build**
|
||||
|
||||
Run: `cargo build`
|
||||
|
||||
Expected: successful compilation with no errors.
|
||||
|
||||
- [ ] **Step 2: Run all tests**
|
||||
|
||||
Run: `cargo test -- --nocapture`
|
||||
|
||||
Expected: all tests pass, including:
|
||||
- `lineloss_xlsx_export_test::export_month_lineloss_produces_valid_xlsx`
|
||||
- `lineloss_xlsx_export_test::export_empty_rows_returns_error`
|
||||
- `lineloss_xlsx_export::tests::column_letter_maps_indices_correctly`
|
||||
- All existing `deterministic_submit_test` tests
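
The name `column_letter_maps_indices_correctly` suggests a helper that turns a column index into a spreadsheet column letter. The real `column_letter` is not shown in this section, so the following is only a sketch of the usual bijective base-26 mapping, under the assumption that the index is zero-based (0 is `A`, 25 is `Z`, 26 is `AA`).

```rust
/// Hypothetical sketch: map a zero-based column index to a spreadsheet
/// column name ("A".."Z", "AA", "AB", ...).
fn column_letter(mut index: usize) -> String {
    let mut letters = Vec::new();
    loop {
        // Least-significant letter first; the vector is reversed at the end.
        letters.push(b'A' + (index % 26) as u8);
        if index < 26 {
            break;
        }
        // Subtract one because there is no "zero" letter in this numbering.
        index = index / 26 - 1;
    }
    letters.reverse();
    String::from_utf8(letters).expect("ASCII letters are valid UTF-8")
}
```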
|
||||
|
||||
- [ ] **Step 3: Commit (if any fixups needed)**
|
||||
|
||||
Only if compilation or test fixes were required in this step.
|
||||
@@ -0,0 +1,117 @@
|
||||
# Helper Page Lifecycle Fix v2 — Same-Connection Close + Open
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Prevent orphaned helper pages across process restarts by closing existing ones before opening new ones, all on the same WebSocket connection.
|
||||
|
||||
**Architecture:** In `bootstrap_helper_page`, after registering with the browser WS, send `sgHideBrowerserClosePage` (best-effort, silently ignored if no page exists), then send `sgHideBrowerserOpenPage`. Change `use_hidden_domain` to `true`.
|
||||
|
||||
**Tech Stack:** Rust, tungstenite, SuperRPA browser WS protocol
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Add close-before-open in bootstrap_helper_page
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/browser/callback_host.rs:345-374` (bootstrap_helper_page function)
|
||||
|
||||
- [ ] **Step 1: Add close command before open command in bootstrap_helper_page**
|
||||
|
||||
Replace the current `bootstrap_helper_page` function. After `recv_bootstrap_prelude`, send the close command first, then the open command:
|
||||
|
||||
```rust
|
||||
fn bootstrap_helper_page(
|
||||
browser_ws_url: &str,
|
||||
request_url: &str,
|
||||
helper_url: &str,
|
||||
use_hidden_domain: bool,
|
||||
) -> Result<(), PipeError> {
|
||||
let (mut websocket, _) = connect(browser_ws_url)
|
||||
.map_err(|err| PipeError::Protocol(format!("browser websocket connect failed: {err}")))?;
|
||||
configure_bootstrap_socket(&mut websocket)?;
|
||||
websocket
|
||||
.send(Message::Text(
|
||||
r#"{"type":"register","role":"web"}"#.to_string().into(),
|
||||
))
|
||||
.map_err(|err| PipeError::Protocol(format!("browser websocket register failed: {err}")))?;
|
||||
let _ = recv_bootstrap_prelude(&mut websocket);
|
||||
|
||||
// Close any orphaned helper page from a previous process run.
|
||||
// Best-effort: if no page exists, the browser silently ignores this.
|
||||
let (open_action, close_action) = if use_hidden_domain {
|
||||
("sgHideBrowerserOpenPage", "sgHideBrowerserClosePage")
|
||||
} else {
|
||||
("sgBrowerserOpenPage", "sgBrowserClosePage")
|
||||
};
|
||||
let close_payload = json!([request_url, close_action, helper_url]).to_string();
|
||||
let _ = websocket.send(Message::Text(close_payload.into()));
|
||||
|
||||
let payload = json!([
|
||||
request_url,
|
||||
open_action,
|
||||
helper_url,
|
||||
])
|
||||
.to_string();
|
||||
websocket
|
||||
.send(Message::Text(payload.into()))
|
||||
.map_err(|err| PipeError::Protocol(format!("helper bootstrap send failed: {err}")))?;
|
||||
Ok(())
|
||||
}
|
||||
```
|
||||
|
||||
Key changes from current code:
|
||||
- After `recv_bootstrap_prelude`, add the close command (best-effort, ignore errors)
|
||||
- Compute both `open_action` and `close_action` from `use_hidden_domain` flag
|
||||
- Send close first, then open on the same WebSocket connection
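
The action selection and send order listed above can be checked without a WebSocket. This sketch uses two hypothetical helpers, `helper_page_actions` and `bootstrap_frames`. It formats the three-element JSON array by hand instead of using `json!`, so it assumes the URLs contain no characters that need JSON escaping. The action strings are copied verbatim from the plan, spelling included.

```rust
/// Pick the (open, close) action names for the chosen domain.
fn helper_page_actions(use_hidden_domain: bool) -> (&'static str, &'static str) {
    if use_hidden_domain {
        ("sgHideBrowerserOpenPage", "sgHideBrowerserClosePage")
    } else {
        ("sgBrowerserOpenPage", "sgBrowserClosePage")
    }
}

/// Build the frames in send order: close first (best-effort), then open.
/// Assumption: the URLs need no JSON escaping.
fn bootstrap_frames(request_url: &str, helper_url: &str, use_hidden_domain: bool) -> Vec<String> {
    let (open_action, close_action) = helper_page_actions(use_hidden_domain);
    [close_action, open_action]
        .iter()
        .map(|action| format!(r#"["{request_url}","{action}","{helper_url}"]"#))
        .collect()
}
```

A fake browser server in a test could compare the frames it receives against this list to assert both content and ordering.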
|
||||
|
||||
- [ ] **Step 2: Change `use_hidden_domain` to `true` in server.rs**
|
||||
|
||||
In `src/service/server.rs`, at the `start_with_browser_ws_url` call, change `false` to `true`:
|
||||
|
||||
```rust
|
||||
match LiveBrowserCallbackHost::start_with_browser_ws_url(
|
||||
browser_ws_url,
|
||||
&bootstrap_url,
|
||||
Duration::from_secs(15),
|
||||
BROWSER_RESPONSE_TIMEOUT,
|
||||
true, // use_hidden_domain: hidden domain for invisible helper
|
||||
) {
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Build**
|
||||
|
||||
Run: `cargo build 2>&1`
|
||||
Expected: 0 errors.
|
||||
|
||||
- [ ] **Step 4: Run callback_host tests**
|
||||
|
||||
Run: `cargo test --lib -- callback_host 2>&1`
|
||||
Expected: 12 tests pass (including `live_callback_host_sends_bootstrap_open_page_command` which still checks for `sgBrowerserOpenPage` because the test passes `false`, and `live_callback_host_hidden_domain_sends_hide_open_page_command` which passes `true`).
|
||||
|
||||
Note: The test passes `false` for `use_hidden_domain`, so the close command will use `sgBrowserClosePage`. The test's fake WebSocket server will receive both the close and open frames. The test only checks that `sgBrowerserOpenPage` is present, which is still true.
|
||||
|
||||
- [ ] **Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add src/browser/callback_host.rs src/service/server.rs
|
||||
git commit -m "fix(callback_host): close orphaned helper page before opening new one on same WS"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Full verification
|
||||
|
||||
**Files:** None (verification only)
|
||||
|
||||
- [ ] **Step 1: Full test suite**
|
||||
|
||||
Run: `cargo test 2>&1`
|
||||
Expected: All tests pass except pre-existing `lineloss_period_resolver_prompts_for_missing_period` failure.
|
||||
|
||||
- [ ] **Step 2: Verify key behavioral changes**
|
||||
|
||||
Manually confirm:
|
||||
1. `bootstrap_helper_page` sends close command before open command (both on same WS connection)
|
||||
2. `use_hidden_domain` is `true` in `server.rs` — helper page opens in hidden domain
|
||||
3. `Drop for LiveBrowserCallbackHost` remains simple (shutdown only, no close attempt)
|
||||
4. `cached_host` is still in `mod.rs` outer loop (process-internal deduplication)
|
||||
@@ -0,0 +1,475 @@
|
||||
# Helper Page Lifecycle Fix & Hidden Domain Support — Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Fix duplicate browser-helper.html pages caused by WebSocket reconnections, add cleanup on Drop, and introduce a config switch for hidden-domain page opening.
|
||||
|
||||
**Architecture:** Three changes: (1) lift `cached_host` from `serve_client()` to the outer `run()` loop so reconnections share one host, (2) enhance `Drop for LiveBrowserCallbackHost` to send a close-page command via browser WS, (3) add `use_hidden_domain: bool` parameter that selects between `sgBrowerserOpenPage`/`sgHideBrowerserOpenPage` and their corresponding close APIs.
|
||||
|
||||
**Tech Stack:** Rust, tungstenite WebSocket crate, SuperRPA browser WS protocol
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Add `use_hidden_domain` field and update `bootstrap_helper_page`
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/browser/callback_host.rs:28` (constant), `:44-51` (struct), `:215-252` (constructor), `:340-359` (bootstrap fn)
|
||||
|
||||
- [ ] **Step 1: Change `HELPER_BOOTSTRAP_ACTION` from constant to a function of `use_hidden_domain`**
|
||||
|
||||
Replace the constant and update `bootstrap_helper_page` to accept and use the flag:
|
||||
|
||||
```rust
|
||||
// DELETE this line:
|
||||
// const HELPER_BOOTSTRAP_ACTION: &str = "sgBrowerserOpenPage";
|
||||
|
||||
// REPLACE bootstrap_helper_page signature and body:
|
||||
fn bootstrap_helper_page(
|
||||
browser_ws_url: &str,
|
||||
request_url: &str,
|
||||
helper_url: &str,
|
||||
use_hidden_domain: bool,
|
||||
) -> Result<(), PipeError> {
|
||||
let (mut websocket, _) = connect(browser_ws_url)
|
||||
.map_err(|err| PipeError::Protocol(format!("browser websocket connect failed: {err}")))?;
|
||||
configure_bootstrap_socket(&mut websocket)?;
|
||||
websocket
|
||||
.send(Message::Text(
|
||||
r#"{"type":"register","role":"web"}"#.to_string().into(),
|
||||
))
|
||||
.map_err(|err| PipeError::Protocol(format!("browser websocket register failed: {err}")))?;
|
||||
let _ = recv_bootstrap_prelude(&mut websocket);
|
||||
let open_action = if use_hidden_domain {
|
||||
"sgHideBrowerserOpenPage"
|
||||
} else {
|
||||
"sgBrowerserOpenPage"
|
||||
};
|
||||
let payload = json!([
|
||||
request_url,
|
||||
open_action,
|
||||
helper_url,
|
||||
])
|
||||
.to_string();
|
||||
websocket
|
||||
.send(Message::Text(payload.into()))
|
||||
.map_err(|err| PipeError::Protocol(format!("helper bootstrap send failed: {err}")))?;
|
||||
Ok(())
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Add new fields to `LiveBrowserCallbackHost`**
|
||||
|
||||
```rust
|
||||
#[derive(Debug)]
|
||||
pub(crate) struct LiveBrowserCallbackHost {
|
||||
host: Arc<BrowserCallbackHost>,
|
||||
shutdown: Arc<AtomicBool>,
|
||||
server_thread: Mutex<Option<JoinHandle<()>>>,
|
||||
command_lock: Mutex<()>,
|
||||
result_timeout: Duration,
|
||||
browser_ws_url: String,
|
||||
use_hidden_domain: bool,
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Update `start_with_browser_ws_url` to accept and store the new parameter**
|
||||
|
||||
```rust
|
||||
impl LiveBrowserCallbackHost {
|
||||
pub(crate) fn start_with_browser_ws_url(
|
||||
browser_ws_url: &str,
|
||||
bootstrap_request_url: &str,
|
||||
ready_timeout: Duration,
|
||||
result_timeout: Duration,
|
||||
use_hidden_domain: bool,
|
||||
) -> Result<Self, PipeError> {
|
||||
let listener = TcpListener::bind("127.0.0.1:0").map_err(|err| {
|
||||
PipeError::Protocol(format!("failed to bind callback host listener: {err}"))
|
||||
})?;
|
||||
listener.set_nonblocking(true).map_err(|err| {
|
||||
PipeError::Protocol(format!("failed to configure callback host listener: {err}"))
|
||||
})?;
|
||||
let origin = format!(
|
||||
"http://{}",
|
||||
listener.local_addr().map_err(|err| {
|
||||
PipeError::Protocol(format!(
|
||||
"failed to resolve callback host listener address: {err}"
|
||||
))
|
||||
})?
|
||||
);
|
||||
let host = Arc::new(BrowserCallbackHost::with_urls(&origin, browser_ws_url));
|
||||
let shutdown = Arc::new(AtomicBool::new(false));
|
||||
let thread_host = host.clone();
|
||||
let thread_shutdown = shutdown.clone();
|
||||
let server_thread = thread::spawn(move || serve_loop(listener, thread_host, thread_shutdown));
|
||||
|
||||
bootstrap_helper_page(browser_ws_url, bootstrap_request_url, host.helper_url(), use_hidden_domain)?;
|
||||
wait_for_helper_ready(host.as_ref(), ready_timeout)?;
|
||||
|
||||
let live_host = Self {
|
||||
host,
|
||||
shutdown,
|
||||
server_thread: Mutex::new(Some(server_thread)),
|
||||
command_lock: Mutex::new(()),
|
||||
result_timeout,
|
||||
browser_ws_url: browser_ws_url.to_string(),
|
||||
use_hidden_domain,
|
||||
};
|
||||
Ok(live_host)
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Fix the inline test struct literal that constructs `LiveBrowserCallbackHost` directly**
|
||||
|
||||
In the `live_callback_host_treats_simulated_mouse_command_as_fire_and_forget` test (around line 1110), add the new fields:
|
||||
|
||||
```rust
|
||||
let host = LiveBrowserCallbackHost {
|
||||
host: Arc::new(BrowserCallbackHost::new()),
|
||||
shutdown: Arc::new(AtomicBool::new(false)),
|
||||
server_thread: Mutex::new(None),
|
||||
command_lock: Mutex::new(()),
|
||||
result_timeout: Duration::from_millis(10),
|
||||
browser_ws_url: "ws://127.0.0.1:12345".to_string(),
|
||||
use_hidden_domain: false,
|
||||
};
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Run build to verify compilation**
|
||||
|
||||
Run: `cargo build 2>&1`
|
||||
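Expected: 0 errors inside `callback_host.rs`; the `HELPER_BOOTSTRAP_ACTION` constant removal and the signature changes are internally consistent there. Note that the call site in `src/service/server.rs` still passes four arguments until Task 3 Step 2, so the crate-wide build will report an arity error at that call unless the `false` argument is added first.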
|
||||
|
||||
- [ ] **Step 6: Run tests to verify existing behavior is preserved**
|
||||
|
||||
Run: `cargo test -- callback_host 2>&1`
|
||||
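Expected: All existing callback_host tests pass. `live_callback_host_sends_bootstrap_open_page_command` still checks for `sgBrowerserOpenPage`, since no caller passes `true` yet. Because that test calls `start_with_browser_ws_url` with four arguments until Task 4 Step 1 adds the fifth, it will not compile at this point unless `false` is added to the call first.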
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
|
||||
git add src/browser/callback_host.rs
|
||||
git commit -m "feat(callback_host): add use_hidden_domain param to bootstrap_helper_page"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Enhance `Drop` to close the helper page
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/browser/callback_host.rs:321-328` (Drop impl)
|
||||
|
||||
- [ ] **Step 1: Add `close_helper_page` helper function**
|
||||
|
||||
Add this function near `bootstrap_helper_page` (after line ~360):
|
||||
|
||||
```rust
|
||||
/// Best-effort attempt to close the helper page tab via browser WebSocket.
|
||||
/// Silently ignores all errors — this runs during Drop and must not panic.
|
||||
fn close_helper_page(browser_ws_url: &str, helper_url: &str, use_hidden_domain: bool) {
|
||||
let close_action = if use_hidden_domain {
|
||||
"sgHideBrowerserClosePage"
|
||||
} else {
|
||||
"sgBrowserClosePage"
|
||||
};
|
||||
|
||||
let result: Result<(), Box<dyn std::error::Error>> = (|| {
|
||||
// Use a raw TcpStream with timeouts instead of tungstenite::connect
|
||||
// which does not expose a connection timeout.
|
||||
let addr = browser_ws_url
|
||||
.trim_start_matches("ws://")
|
||||
.trim_start_matches("wss://");
|
||||
let stream = TcpStream::connect_timeout(
|
||||
&addr.parse().map_err(|e| format!("addr parse: {e}"))?,
|
||||
Duration::from_millis(100),
|
||||
)?;
|
||||
stream.set_read_timeout(Some(Duration::from_millis(200)))?;
|
||||
stream.set_write_timeout(Some(Duration::from_millis(200)))?;
|
||||
let (mut websocket, _) = tungstenite::client(
|
||||
browser_ws_url,
|
||||
stream,
|
||||
)?;
|
||||
websocket.send(Message::Text(
|
||||
r#"{"type":"register","role":"web"}"#.to_string().into(),
|
||||
))?;
|
||||
// Drain the welcome prelude (best-effort, ignore timeout).
|
||||
let _ = websocket.read();
|
||||
let payload = json!([helper_url, close_action, helper_url]).to_string();
|
||||
websocket.send(Message::Text(payload.into()))?;
|
||||
Ok(())
|
||||
})();
|
||||
|
||||
if let Err(err) = result {
|
||||
eprintln!("close_helper_page best-effort failed (harmless): {err}");
|
||||
}
|
||||
}
|
||||
```
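
One caveat about the address handling above: `TcpStream::connect_timeout` takes a `SocketAddr`, and parsing a string into `SocketAddr` only succeeds for a bare `ip:port`. If `browser_ws_url` ever carries a path or a hostname, the parse fails and the close is skipped (harmlessly, since the function is best-effort). A sketch of a more tolerant extraction, using two hypothetical helpers built only on the standard library:

```rust
use std::net::{SocketAddr, ToSocketAddrs};

/// Extract the `host:port` authority from a `ws://` or `wss://` URL.
/// Returns `None` when the scheme is missing or the authority is empty.
fn ws_authority(url: &str) -> Option<&str> {
    let rest = url
        .strip_prefix("ws://")
        .or_else(|| url.strip_prefix("wss://"))?;
    // Drop any path, query, or fragment that follows the authority.
    let end = rest
        .find(|c: char| c == '/' || c == '?' || c == '#')
        .unwrap_or(rest.len());
    let authority = &rest[..end];
    if authority.is_empty() {
        None
    } else {
        Some(authority)
    }
}

/// Resolve the authority to its first socket address, if any.
/// `ToSocketAddrs` also accepts hostnames such as `localhost:9222`.
fn ws_socket_addr(url: &str) -> Option<SocketAddr> {
    ws_authority(url)?.to_socket_addrs().ok()?.next()
}
```

The result of `ws_socket_addr` could be passed to `TcpStream::connect_timeout` in place of the `addr.parse()` expression. Resolving a hostname may block on DNS, which matters inside `Drop`, so IP literals remain the safer input.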
|
||||
|
||||
- [ ] **Step 2: Update `Drop for LiveBrowserCallbackHost` to call `close_helper_page`**
|
||||
|
||||
```rust
|
||||
impl Drop for LiveBrowserCallbackHost {
|
||||
fn drop(&mut self) {
|
||||
// Best-effort: tell the browser to close the helper page tab.
|
||||
close_helper_page(
|
||||
&self.browser_ws_url,
|
||||
self.host.helper_url(),
|
||||
self.use_hidden_domain,
|
||||
);
|
||||
|
||||
self.shutdown.store(true, Ordering::Relaxed);
|
||||
if let Some(handle) = self.server_thread.lock().unwrap().take() {
|
||||
let _ = handle.join();
|
||||
}
|
||||
}
|
||||
}
|
||||
```
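
The `Drop` above calls `.lock().unwrap()`, which panics if the mutex was poisoned by an earlier panic on another thread. Whether that can happen here depends on code not shown in this plan. As a sketch of the same signal-then-join pattern with poisoning tolerated, using a hypothetical `Worker` type in place of `LiveBrowserCallbackHost`:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for a host that owns a background server thread.
struct Worker {
    shutdown: Arc<AtomicBool>,
    exited: Arc<AtomicBool>,
    thread: Mutex<Option<JoinHandle<()>>>,
}

impl Worker {
    fn start() -> Self {
        let shutdown = Arc::new(AtomicBool::new(false));
        let exited = Arc::new(AtomicBool::new(false));
        let (flag, done) = (shutdown.clone(), exited.clone());
        let handle = thread::spawn(move || {
            // Poll the shutdown flag, like a non-blocking accept loop would.
            while !flag.load(Ordering::Relaxed) {
                thread::sleep(Duration::from_millis(1));
            }
            done.store(true, Ordering::SeqCst);
        });
        Worker { shutdown, exited, thread: Mutex::new(Some(handle)) }
    }
}

impl Drop for Worker {
    fn drop(&mut self) {
        // Best-effort cleanup (such as closing the helper page) would go
        // here, before signalling shutdown; it must swallow its own errors.
        self.shutdown.store(true, Ordering::Relaxed);
        // Tolerate a poisoned mutex instead of panicking inside Drop.
        let handle = match self.thread.lock() {
            Ok(mut guard) => guard.take(),
            Err(poisoned) => poisoned.into_inner().take(),
        };
        if let Some(handle) = handle {
            let _ = handle.join();
        }
    }
}
```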
|
||||
|
||||
- [ ] **Step 3: Run build to verify compilation**
|
||||
|
||||
Run: `cargo build 2>&1`
|
||||
Expected: 0 errors. `close_helper_page` uses types already imported (`TcpStream`, `Duration`, `json!`, `Message`).
|
||||
|
||||
- [ ] **Step 4: Run tests**
|
||||
|
||||
Run: `cargo test -- callback_host 2>&1`
|
||||
Expected: All pass. The Drop enhancement is best-effort and the test helper constructs hosts with `server_thread: Mutex::new(None)`, so Drop completes cleanly.
|
||||
|
||||
- [ ] **Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add src/browser/callback_host.rs
|
||||
git commit -m "feat(callback_host): close helper page on Drop via browser WS"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Lift `cached_host` to outer loop and update `serve_client` signature
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/mod.rs:72-96` (run loop)
|
||||
- Modify: `src/service/server.rs:232-241` (serve_client signature and cached_host init)
|
||||
|
||||
- [ ] **Step 1: Change `serve_client` to accept `cached_host` as a parameter**
|
||||
|
||||
In `src/service/server.rs`, change the function signature and remove the local `cached_host` variable:
|
||||
|
||||
```rust
|
||||
pub fn serve_client(
|
||||
context: &AgentRuntimeContext,
|
||||
session: &ServiceSession,
|
||||
sink: Arc<ServiceEventSink>,
|
||||
browser_ws_url: &str,
|
||||
mac_policy: &MacPolicy,
|
||||
cached_host: &mut Option<Arc<LiveBrowserCallbackHost>>,
|
||||
) -> Result<(), PipeError> {
|
||||
// DELETE the line: let mut cached_host: Option<Arc<LiveBrowserCallbackHost>> = None;
|
||||
|
||||
loop {
|
||||
// ... rest of function body unchanged, `cached_host` is now the parameter
|
||||
```
|
||||
|
||||
The body references to `cached_host` remain identical — they just operate on the borrowed mutable reference instead of a local variable.
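
Why lifting the cache matters can be shown with a small model. In this sketch `Host`, `serve_client`, and `run` are simplified, hypothetical stand-ins for the real types and functions; the `starts` counter exists only to make the number of host creations observable.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `LiveBrowserCallbackHost`.
#[derive(Debug)]
struct Host {
    id: u32,
}

/// Stand-in for `serve_client`: reuses the cached host if present, otherwise
/// creates one and stores it in the caller-owned slot.
fn serve_client(cached_host: &mut Option<Arc<Host>>, starts: &mut u32) -> Arc<Host> {
    cached_host
        .get_or_insert_with(|| {
            *starts += 1;
            Arc::new(Host { id: *starts })
        })
        .clone()
}

/// Stand-in for the outer `run()` loop: the cache lives outside the loop, so
/// every reconnect sees the same host.
fn run(reconnects: usize) -> (u32, Vec<u32>) {
    let mut cached_host: Option<Arc<Host>> = None;
    let mut starts = 0;
    let mut ids = Vec::new();
    for _ in 0..reconnects {
        ids.push(serve_client(&mut cached_host, &mut starts).id);
    }
    (starts, ids)
}
```

With the cache declared inside `serve_client` instead, each reconnect would start from `None` and create a fresh host, which is the duplicate-helper-page behavior this task removes.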
|
||||
|
||||
- [ ] **Step 2: Update `start_with_browser_ws_url` call to pass `false` for `use_hidden_domain`**
|
||||
|
||||
In `src/service/server.rs`, at the `LiveBrowserCallbackHost::start_with_browser_ws_url` call (around line 288), add the `false` argument:
|
||||
|
||||
```rust
|
||||
match LiveBrowserCallbackHost::start_with_browser_ws_url(
|
||||
browser_ws_url,
|
||||
&bootstrap_url,
|
||||
Duration::from_secs(15),
|
||||
BROWSER_RESPONSE_TIMEOUT,
|
||||
false, // use_hidden_domain: visible tab for now
|
||||
) {
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Lift `cached_host` into `run()` in `mod.rs`**
|
||||
|
||||
In `src/service/mod.rs`, declare `cached_host` before the loop and pass it to `serve_client`:
|
||||
|
||||
```rust
|
||||
// Add this import at the top of the function or file:
|
||||
use crate::browser::callback_host::LiveBrowserCallbackHost;
|
||||
|
||||
// Before the loop (after line 64, after `let session = ...`):
|
||||
let mut cached_host: Option<Arc<LiveBrowserCallbackHost>> = None;
|
||||
|
||||
loop {
|
||||
let (stream, _) = listener.accept()?;
|
||||
let websocket = accept(stream)
|
||||
.map_err(|err| PipeError::Protocol(format!("service websocket accept failed: {err}")))?;
|
||||
let sink = Arc::new(ServiceEventSink::from_websocket(websocket));
|
||||
match session.try_attach_client() {
|
||||
Ok(()) => {
|
||||
let result = serve_client(
|
||||
&runtime_context,
|
||||
&session,
|
||||
sink.clone(),
|
||||
browser_ws_url,
|
||||
&mac_policy,
|
||||
&mut cached_host,
|
||||
);
|
||||
session.detach_client();
|
||||
match result {
|
||||
Ok(()) | Err(PipeError::PipeClosed) => {}
|
||||
Err(err) => return Err(err),
|
||||
}
|
||||
}
|
||||
Err(message) => {
|
||||
sink.send_service_message(message)?;
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Update the `pub use` export if needed**
|
||||
|
||||
Check `src/service/mod.rs:17`:
|
||||
```rust
|
||||
pub use server::{serve_client, ServiceEventSink, ServiceSession};
|
||||
```
|
||||
The `pub use` line itself needs no change: `serve_client` is still public. The added parameter is a source-breaking change, though, so any external caller will get a compile error that points at the missing argument, which is the desired behavior.
|
||||
|
||||
- [ ] **Step 5: Run build to verify compilation**
|
||||
|
||||
Run: `cargo build 2>&1`
|
||||
Expected: 0 errors. If there are external test files calling `serve_client`, they will fail here and need the new parameter added.
|
||||
|
||||
- [ ] **Step 6: Run full test suite**
|
||||
|
||||
Run: `cargo test 2>&1`
|
||||
Expected: All tests pass. External test files that call `serve_client` indirectly through the service protocol tests should still work because they use the WS protocol layer, not `serve_client` directly. (Verified: grep found 0 test files referencing `serve_client` or `LiveBrowserCallbackHost`.)
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
|
||||
git add src/service/mod.rs src/service/server.rs
|
||||
git commit -m "fix(service): lift cached_host to outer loop to prevent duplicate helper pages"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 4: Add tests for hidden domain bootstrap
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/browser/callback_host.rs` (inline tests module, around line 1071)
|
||||
|
||||
- [ ] **Step 1: Update existing `live_callback_host_sends_bootstrap_open_page_command` test**
|
||||
|
||||
The test currently calls `start_with_browser_ws_url` with 4 args. Add the 5th arg `false`:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn live_callback_host_sends_bootstrap_open_page_command() {
|
||||
let (ws_url, frames, handle) = start_fake_browser_status_server();
|
||||
|
||||
let result = LiveBrowserCallbackHost::start_with_browser_ws_url(
|
||||
&ws_url,
|
||||
"https://www.zhihu.com",
|
||||
Duration::from_millis(100),
|
||||
Duration::from_millis(50),
|
||||
false,
|
||||
);
|
||||
assert!(result.is_err(), "expected timeout because no real helper page loads");
|
||||
drop(result);
|
||||
handle.join().unwrap();
|
||||
|
||||
let sent = frames.lock().unwrap().clone();
|
||||
assert!(
|
||||
sent.iter().any(|frame| frame.contains("sgBrowerserOpenPage")),
|
||||
"bootstrap should send sgBrowerserOpenPage to the browser WS; sent frames: {sent:?}"
|
||||
);
|
||||
assert!(
|
||||
sent.iter().any(|frame| frame.contains("/sgclaw/browser-helper.html")),
|
||||
"bootstrap should include the helper page URL; sent frames: {sent:?}"
|
||||
);
|
||||
assert!(
|
||||
sent.iter().any(|frame| frame.contains("https://www.zhihu.com")),
|
||||
"bootstrap requestUrl should be the provided page URL; sent frames: {sent:?}"
|
||||
);
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Add new test for hidden domain bootstrap**
|
||||
|
||||
Add this test after the existing one:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn live_callback_host_hidden_domain_sends_hide_open_page_command() {
|
||||
let (ws_url, frames, handle) = start_fake_browser_status_server();
|
||||
|
||||
let result = LiveBrowserCallbackHost::start_with_browser_ws_url(
|
||||
&ws_url,
|
||||
"https://www.zhihu.com",
|
||||
Duration::from_millis(100),
|
||||
Duration::from_millis(50),
|
||||
true,
|
||||
);
|
||||
assert!(result.is_err(), "expected timeout because no real helper page loads");
|
||||
drop(result);
|
||||
handle.join().unwrap();
|
||||
|
||||
let sent = frames.lock().unwrap().clone();
|
||||
assert!(
|
||||
sent.iter().any(|frame| frame.contains("sgHideBrowerserOpenPage")),
|
||||
"hidden domain bootstrap should send sgHideBrowerserOpenPage; sent frames: {sent:?}"
|
||||
);
|
||||
assert!(
|
||||
!sent.iter().any(|frame| {
|
||||
frame.contains("\"sgBrowerserOpenPage\"")
|
||||
}),
|
||||
"hidden domain bootstrap should NOT send visible sgBrowerserOpenPage; sent frames: {sent:?}"
|
||||
);
|
||||
assert!(
|
||||
sent.iter().any(|frame| frame.contains("/sgclaw/browser-helper.html")),
|
||||
"bootstrap should include the helper page URL; sent frames: {sent:?}"
|
||||
);
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Run all callback_host tests**
|
||||
|
||||
Run: `cargo test -- callback_host 2>&1`
|
||||
Expected: All callback_host tests pass, including these three:
|
||||
- `live_callback_host_sends_bootstrap_open_page_command` — regression, visible domain
|
||||
- `live_callback_host_hidden_domain_sends_hide_open_page_command` — new, hidden domain
|
||||
- `live_callback_host_treats_simulated_mouse_command_as_fire_and_forget` — unchanged
|
||||
|
||||
- [ ] **Step 4: Run full test suite**
|
||||
|
||||
Run: `cargo test 2>&1`
|
||||
Expected: All tests pass.
|
||||
|
||||
- [ ] **Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add src/browser/callback_host.rs
|
||||
git commit -m "test(callback_host): add hidden domain bootstrap test"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 5: Full build verification
|
||||
|
||||
**Files:** None (verification only)
|
||||
|
||||
- [ ] **Step 1: Clean build**
|
||||
|
||||
Run: `cargo build 2>&1`
|
||||
Expected: 0 errors. Warnings about dead code in unrelated modules are acceptable.
|
||||
|
||||
- [ ] **Step 2: Full test suite**
|
||||
|
||||
Run: `cargo test 2>&1`
|
||||
Expected: All tests pass. The pre-existing `lineloss_period_resolver_prompts_for_missing_period` failure (from previous work) is known and unrelated.
|
||||
|
||||
- [ ] **Step 3: Verify the key behavioral changes in code**
|
||||
|
||||
Manually confirm:
|
||||
1. `src/service/mod.rs` — `cached_host` is declared BEFORE the `loop`, not inside `serve_client`
|
||||
2. `src/browser/callback_host.rs` — `Drop::drop` calls `close_helper_page` before shutdown
|
||||
3. `src/browser/callback_host.rs` — `bootstrap_helper_page` uses `"sgHideBrowerserOpenPage"` when `use_hidden_domain == true` and `"sgBrowerserOpenPage"` when `false`
|
||||
4. `src/service/server.rs` — `start_with_browser_ws_url` call passes `false` as `use_hidden_domain`
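
As a reference point for item 3, the command selection can be pictured as a single branch on the flag. This is a minimal sketch only: `open_page_command` is a hypothetical helper name, and the real `bootstrap_helper_page` may structure the choice differently. The two command strings are taken from the checklist above.

```rust
/// Hypothetical helper (not from the codebase): picks the bootstrap
/// command name based on the hidden-domain flag.
fn open_page_command(use_hidden_domain: bool) -> &'static str {
    if use_hidden_domain {
        // Hidden domain: the helper page is opened without a visible tab.
        "sgHideBrowerserOpenPage"
    } else {
        // Visible domain: the regression path used by the service today.
        "sgBrowerserOpenPage"
    }
}

fn main() {
    println!("hidden  -> {}", open_page_command(true));
    println!("visible -> {}", open_page_command(false));
}
```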
|
||||
418
docs/superpowers/plans/2026-04-14-request-url-resolution-plan.md
Normal file
@@ -0,0 +1,418 @@
|
||||
# Request URL Resolution Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Replace the temporary line-loss request URL hardcode in `src/service/server.rs` with a unified bootstrap-target resolver that prefers current page context, then deterministic submit plans, then skill metadata, and finally `about:blank`.
|
||||
|
||||
**Architecture:** Add a small service-owned resolver that returns a narrow `SubmitBootstrapTarget` result and centralizes precedence rules. Reuse `DeterministicExecutionPlan.target_url` as the authoritative source for deterministic line-loss scenes, then add minimal skill metadata fallback for configured direct browser-script skills, while keeping callback-host behavior unchanged.
|
||||
|
||||
**Tech Stack:** Rust, serde/serde_json, tungstenite, zeroclaw skill loader, staged `SKILL.toml` manifests, cargo test
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Add resolver-focused red tests for precedence
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs:422-467`
|
||||
- Test: `src/service/server.rs` (crate-local resolver tests)
|
||||
- Test: `tests/service_ws_session_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the failing page-context precedence test**
|
||||
|
||||
In a crate-local unit test inside `src/service/server.rs`, add a focused resolver test that exercises the request-url resolver with:
|
||||
- non-empty `page_url = "https://already-open.example.com/page"`
|
||||
- an instruction that would otherwise match deterministic line-loss logic
|
||||
- configured direct skill metadata present
|
||||
|
||||
Assert the resolved bootstrap target uses the explicit non-empty `page_url` and reports `PageContext` source.
|
||||
|
||||
- [ ] **Step 2: Run the test to verify it fails**
|
||||
|
||||
Run: `cargo test page_context_bootstrap_target_wins_over_deterministic_and_skill_fallback --lib -- --nocapture`
|
||||
Expected: FAIL because no unified resolver/source enum exists yet.
|
||||
|
||||
- [ ] **Step 3: Write the failing deterministic-precedence test**
|
||||
|
||||
In `src/service/server.rs` crate-local tests, add a focused test for a deterministic line-loss instruction with no `page_url`.
|
||||
|
||||
Use the same instruction shape already accepted by `decide_deterministic_submit(...)`, and assert:
|
||||
- resolver source is `DeterministicPlan`
|
||||
- resolved `request_url` equals `DeterministicExecutionPlan.target_url`
|
||||
- no raw `instruction.contains("线损")` fallback (a match on the Chinese term for "line loss") is needed
|
||||
|
||||
- [ ] **Step 4: Run the test to verify it fails**
|
||||
|
||||
Run: `cargo test deterministic_bootstrap_target_uses_plan_target_url --lib -- --nocapture`
|
||||
Expected: FAIL because service still uses `derive_request_url_from_instruction(...)`.
|
||||
|
||||
- [ ] **Step 5: Write the failing skill-fallback test**
|
||||
|
||||
In `src/service/server.rs` crate-local tests, add a focused test for:
|
||||
- no `page_url`
|
||||
- instruction not deterministic
|
||||
- configured direct-submit skill metadata provides `bootstrap_url`
|
||||
|
||||
Assert resolver source is `SkillConfig` and `request_url` matches metadata.
|
||||
|
||||
- [ ] **Step 6: Run the test to verify it fails**
|
||||
|
||||
Run: `cargo test skill_metadata_bootstrap_url_is_used_when_no_page_context_or_plan_exists --lib -- --nocapture`
|
||||
Expected: FAIL because skill metadata is not read today.
|
||||
|
||||
- [ ] **Step 7: Write the failing malformed-metadata fallback test**
|
||||
|
||||
In `src/service/server.rs` crate-local tests, add a focused test for malformed `bootstrap_url` metadata, with no page context and no deterministic plan.
|
||||
|
||||
Assert the resolver:
|
||||
- ignores malformed metadata
|
||||
- returns `Fallback`
|
||||
- resolves to `about:blank`
|
||||
|
||||
- [ ] **Step 8: Run the test to verify it fails**
|
||||
|
||||
Run: `cargo test malformed_skill_bootstrap_url_falls_back_to_about_blank --lib -- --nocapture`
|
||||
Expected: FAIL because malformed metadata is not handled by a resolver yet.
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Introduce the bootstrap-target resolver in service code
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs:280-467`
|
||||
- Modify: `src/service/mod.rs:17-22`
|
||||
- Test: `src/service/server.rs` (crate-local resolver tests)
|
||||
|
||||
- [ ] **Step 1: Add the narrow resolver types in service code**
|
||||
|
||||
In `src/service/server.rs`, add:
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone, PartialEq, Eq)]
|
||||
pub(crate) struct SubmitBootstrapTarget {
|
||||
pub request_url: String,
|
||||
pub expected_domain: Option<String>,
|
||||
pub source: BootstrapTargetSource,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
|
||||
pub(crate) enum BootstrapTargetSource {
|
||||
PageContext,
|
||||
DeterministicPlan,
|
||||
SkillConfig,
|
||||
Fallback,
|
||||
}
|
||||
```
|
||||
|
||||
Keep them scoped to service code. Do not create a generic cross-runtime planning object.
|
||||
|
||||
- [ ] **Step 2: Add a minimal resolver entry point**
|
||||
|
||||
Implement a service-owned function in `src/service/server.rs`, conceptually:
|
||||
|
||||
```rust
|
||||
pub(crate) fn resolve_submit_bootstrap_target(
|
||||
request: &crate::agent::SubmitTaskRequest,
|
||||
workspace_root: &Path,
|
||||
settings: &SgClawSettings,
|
||||
) -> SubmitBootstrapTarget
|
||||
```
|
||||
|
||||
Initial behavior for this step:
|
||||
- return `PageContext` only when `request.page_url` exists and is non-empty after trimming
|
||||
- add a crate-local regression that empty/whitespace `page_url` does not short-circuit later precedence tiers
|
||||
- otherwise fall through to existing behavior temporarily so the new tests can compile incrementally
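
The "non-empty after trimming" rule in the first two bullets can be sketched as a small standalone function. This is an illustration under assumptions: `non_empty_page_url` is a hypothetical name, and the real resolver reads `request.page_url` rather than taking a bare `Option<&str>`.

```rust
/// Hypothetical helper: returns the trimmed page URL only when it is
/// present and still non-empty after trimming whitespace.
fn non_empty_page_url(page_url: Option<&str>) -> Option<&str> {
    page_url.map(str::trim).filter(|url| !url.is_empty())
}

fn main() {
    // A whitespace-only value must not short-circuit later precedence tiers.
    for candidate in [Some("https://already-open.example.com/page"), Some("   "), None] {
        match non_empty_page_url(candidate) {
            Some(url) => println!("PageContext: {url}"),
            None => println!("fall through to the next tier"),
        }
    }
}
```

Returning `None` for empty or whitespace-only input keeps the later tiers reachable, which is what the regression in the second bullet is meant to pin down.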
|
||||
|
||||
- [ ] **Step 3: Update service startup to call the resolver**
|
||||
|
||||
At the callback-host startup call site in `serve_client(...)`, replace:
|
||||
|
||||
```rust
|
||||
let bootstrap_url = initial_request_url_for_submit_task(&request);
|
||||
```
|
||||
|
||||
with resolver usage:
|
||||
|
||||
```rust
|
||||
let bootstrap_target = resolve_submit_bootstrap_target(&request, context.workspace_root(), &settings);
|
||||
let bootstrap_url = bootstrap_target.request_url;
|
||||
```
|
||||
|
||||
Use the current settings-loading seam already used elsewhere in service code. Keep callback-host startup behavior otherwise unchanged.
|
||||
|
||||
- [ ] **Step 4: Keep resolver visibility crate-local**
|
||||
|
||||
Do not make the resolver types broadly public for integration tests. Keep the resolver and `BootstrapTargetSource` crate-local, and keep source-level assertions in `src/service/server.rs` unit tests.
|
||||
|
||||
Keep the existing `initial_request_url_for_submit_task(...)` re-export in `src/service/mod.rs` only if production callers still require that wiring; otherwise remove it.
|
||||
|
||||
- [ ] **Step 5: Run the first precedence test to verify it passes**
|
||||
|
||||
Run: `cargo test page_context_bootstrap_target_wins_over_deterministic_and_skill_fallback --lib -- --nocapture`
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git add src/service/server.rs src/service/mod.rs
|
||||
git commit -m "refactor(service): add submit bootstrap target resolver scaffold"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Make deterministic submit the authoritative source for line-loss bootstrap URLs
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs:422-467`
|
||||
- Modify: `src/compat/deterministic_submit.rs:13-101`
|
||||
- Test: `src/service/server.rs` (crate-local resolver tests)
|
||||
- Test: `tests/service_ws_session_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write a small service-side seam for deterministic resolution**
|
||||
|
||||
In `src/service/server.rs`, update the resolver so that when `page_url` is absent or empty after trimming, it calls:
|
||||
|
||||
```rust
|
||||
crate::compat::deterministic_submit::decide_deterministic_submit(
|
||||
&request.instruction,
|
||||
request.page_url.as_deref(),
|
||||
request.page_title.as_deref(),
|
||||
)
|
||||
```
|
||||
|
||||
Only `DeterministicSubmitDecision::Execute(plan)` should produce a deterministic bootstrap target.
|
||||
|
||||
Treat `NotDeterministic` and `Prompt { .. }` as “no deterministic bootstrap target” for service startup.
|
||||
|
||||
- [ ] **Step 2: Use `plan.target_url` directly**
|
||||
|
||||
Map `DeterministicSubmitDecision::Execute(plan)` to:
|
||||
- `request_url = plan.target_url.clone()`
|
||||
- `expected_domain = Some(plan.expected_domain.clone())`
|
||||
- `source = BootstrapTargetSource::DeterministicPlan`
|
||||
|
||||
Do not reconstruct the URL in `server.rs`.
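
The mapping in Steps 1 and 2 can be sketched as a single `match`. The types below are simplified stand-ins, not the real definitions: `Decision` and `Plan` are hypothetical placeholders for `DeterministicSubmitDecision` and `DeterministicExecutionPlan`, the `Prompt` payload is invented for illustration, and the URL and domain are example values.

```rust
#![allow(dead_code)]

// Stand-in for DeterministicExecutionPlan (only the fields used here).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Plan {
    target_url: String,
    expected_domain: String,
}

// Stand-in for DeterministicSubmitDecision; the Prompt payload is assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Decision {
    NotDeterministic,
    Prompt { message: String },
    Execute(Plan),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BootstrapTargetSource {
    PageContext,
    DeterministicPlan,
    SkillConfig,
    Fallback,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct SubmitBootstrapTarget {
    request_url: String,
    expected_domain: Option<String>,
    source: BootstrapTargetSource,
}

/// Only `Execute(plan)` yields a target; the URL is copied, never rebuilt.
fn deterministic_target(decision: &Decision) -> Option<SubmitBootstrapTarget> {
    match decision {
        Decision::Execute(plan) => Some(SubmitBootstrapTarget {
            request_url: plan.target_url.clone(),
            expected_domain: Some(plan.expected_domain.clone()),
            source: BootstrapTargetSource::DeterministicPlan,
        }),
        Decision::NotDeterministic | Decision::Prompt { .. } => None,
    }
}

fn main() {
    let plan = Plan {
        target_url: "https://plan.example.com/report".to_string(), // example value
        expected_domain: "plan.example.com".to_string(),           // example value
    };
    println!("{:?}", deterministic_target(&Decision::Execute(plan)));
    println!("{:?}", deterministic_target(&Decision::NotDeterministic));
}
```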
|
||||
|
||||
- [ ] **Step 3: Remove the temporary line-loss hardcode**
|
||||
|
||||
Delete this branch from `derive_request_url_from_instruction(...)` or remove the function entirely if it is no longer needed:
|
||||
|
||||
```rust
|
||||
if instruction.contains("线损") || instruction.contains("lineloss") {
|
||||
return Some("http://20.76.57.61:18080".to_string());
|
||||
}
|
||||
```
|
||||
|
||||
Keep any still-needed legacy Zhihu fallback only if the resolver still requires it after deterministic integration.
|
||||
|
||||
- [ ] **Step 4: Add/adjust a deterministic regression test**
|
||||
|
||||
In `src/service/server.rs` crate-local tests, add a focused assertion that line-loss bootstrap URL now comes from `DeterministicExecutionPlan.target_url`, not raw text matching.
|
||||
|
||||
A good assertion shape is:
|
||||
- call resolver with deterministic line-loss instruction
|
||||
- assert `request_url == "http://20.76.57.61:18080/gsllys/tqLinelossStatis/tqQualifyRateMonitor"`
|
||||
- assert `source == DeterministicPlan`
|
||||
|
||||
- [ ] **Step 5: Run deterministic tests to verify they pass**
|
||||
|
||||
Run: `cargo test deterministic_bootstrap_target_uses_plan_target_url --lib -- --nocapture`
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Run service websocket coverage for the same precedence**
|
||||
|
||||
Run: `cargo test callback_host --test service_ws_session_test -- --nocapture`
|
||||
Expected: PASS with no line-loss hardcode dependency.
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
|
||||
git add src/service/server.rs src/compat/deterministic_submit.rs tests/service_ws_session_test.rs
|
||||
git commit -m "refactor(service): derive line-loss bootstrap URL from deterministic plan"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 4: Add skill-metadata fallback for configured direct-submit skills
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/compat/direct_skill_runtime.rs:114-153`
|
||||
- Modify: `src/service/server.rs:422-467`
|
||||
- Optionally modify: `src/config/settings.rs` only if a tiny metadata pointer is required
|
||||
- Modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.toml`
|
||||
- Optionally modify: `D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/95598-weekly-monitor-report/SKILL.toml`
|
||||
- Test: `src/service/server.rs` (crate-local resolver tests)
|
||||
- Test: `tests/service_ws_session_test.rs`
|
||||
|
||||
- [ ] **Step 1: Define the minimal skill metadata shape**
|
||||
|
||||
Extend staged `SKILL.toml` parsing expectations to support a narrow metadata seam for browser-script direct skills.
|
||||
|
||||
The plan target fields are:
|
||||
- `bootstrap_url`
|
||||
- `expected_domain`
|
||||
|
||||
Keep the metadata minimal. Do not add a broad dispatch registry or scene-policy schema.
|
||||
|
||||
Recommended TOML shape in the skill manifest:
|
||||
|
||||
```toml
|
||||
[tools.metadata]
|
||||
bootstrap_url = "https://example.com/path"
|
||||
expected_domain = "example.com"
|
||||
```
|
||||
|
||||
If the actual skill loader only supports per-tool custom fields in another location, use that established seam instead. Do not invent a parallel config file.
|
||||
|
||||
- [ ] **Step 2: Add a helper that reads fallback metadata for the configured direct skill**
|
||||
|
||||
In `src/compat/direct_skill_runtime.rs`, add a helper like:
|
||||
|
||||
```rust
|
||||
pub(crate) fn resolve_direct_submit_bootstrap_metadata(
|
||||
configured_tool: &str,
|
||||
workspace_root: &Path,
|
||||
settings: &SgClawSettings,
|
||||
) -> Result<Option<DirectSubmitBootstrapMetadata>, PipeError>
|
||||
```
|
||||
|
||||
Recommended shape:
|
||||
|
||||
```rust
|
||||
pub(crate) struct DirectSubmitBootstrapMetadata {
|
||||
pub bootstrap_url: String,
|
||||
pub expected_domain: Option<String>,
|
||||
}
|
||||
```
|
||||
|
||||
Reuse the existing `resolve_browser_script_skill(...)` lookup path so the service resolver does not duplicate staged-skill discovery logic.
|
||||
|
||||
- [ ] **Step 3: Validate metadata conservatively**
|
||||
|
||||
When reading fallback metadata:
|
||||
- accept only non-empty `bootstrap_url`
|
||||
- require it to parse as a valid absolute URL
|
||||
- normalize or preserve `expected_domain` only if non-empty
|
||||
- on malformed metadata, return `Ok(None)` for resolver purposes instead of failing service startup
|
||||
|
||||
This keeps malformed fallback data from breaking submits and matches the approved spec.
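
A minimal sketch of the conservative validation described above, using only the standard library. Two assumptions to note: `looks_like_absolute_url` is a deliberately simplified stand-in for a real URL parser (the actual implementation would more likely use whatever URL parsing the crate already depends on), and `validate_metadata` is a hypothetical helper that returns `Option` where the real function would return `Ok(None)`.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
struct DirectSubmitBootstrapMetadata {
    bootstrap_url: String,
    expected_domain: Option<String>,
}

/// Simplified stand-in for a URL parser: requires `scheme://host...`
/// with a plausible scheme, a non-empty host, and no whitespace.
fn looks_like_absolute_url(raw: &str) -> bool {
    let Some((scheme, rest)) = raw.split_once("://") else {
        return false;
    };
    let scheme_ok = scheme.chars().next().map_or(false, |c| c.is_ascii_alphabetic())
        && scheme
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.'));
    let host = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");
    scheme_ok && !host.is_empty() && !raw.chars().any(char::is_whitespace)
}

/// Hypothetical helper: malformed or missing metadata becomes `None`
/// instead of an error, so service startup is never blocked by it.
fn validate_metadata(
    bootstrap_url: Option<&str>,
    expected_domain: Option<&str>,
) -> Option<DirectSubmitBootstrapMetadata> {
    let url = bootstrap_url.map(str::trim).filter(|u| !u.is_empty())?;
    if !looks_like_absolute_url(url) {
        return None;
    }
    let expected_domain = expected_domain
        .map(str::trim)
        .filter(|d| !d.is_empty())
        .map(str::to_string);
    Some(DirectSubmitBootstrapMetadata {
        bootstrap_url: url.to_string(),
        expected_domain,
    })
}

fn main() {
    println!("{:?}", validate_metadata(Some("https://example.com/path"), Some("example.com")));
    println!("{:?}", validate_metadata(Some("not a url"), None));
}
```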
|
||||
|
||||
- [ ] **Step 4: Wire skill metadata into the service resolver**
|
||||
|
||||
Update `resolve_submit_bootstrap_target(...)` to:
|
||||
- check skill metadata only after page context and deterministic parsing fail
|
||||
- use `SkillConfig` as the source when metadata resolves
|
||||
- fall through to `about:blank` when metadata is missing or malformed
|
||||
|
||||
- [ ] **Step 5: Add a staged-skill fixture update**
|
||||
|
||||
Update at least one configured direct skill fixture, likely `fault-details-report`, to include valid fallback metadata.
|
||||
|
||||
Use concrete values appropriate for that skill’s target page; do not reuse the line-loss URL.
|
||||
|
||||
- [ ] **Step 6: Run the skill-fallback test to verify it passes**
|
||||
|
||||
Run: `cargo test skill_metadata_bootstrap_url_is_used_when_no_page_context_or_plan_exists --lib -- --nocapture`
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 7: Run the malformed-metadata test to verify it passes**
|
||||
|
||||
Run: `cargo test malformed_skill_bootstrap_url_falls_back_to_about_blank --lib -- --nocapture`
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 8: Commit**
|
||||
|
||||
```bash
|
||||
git add src/compat/direct_skill_runtime.rs src/service/server.rs D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills/fault-details-report/SKILL.toml tests/service_ws_session_test.rs
|
||||
git commit -m "feat(service): add direct skill bootstrap URL fallback metadata"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 5: Remove obsolete request-url glue and lock the final precedence contract
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs:422-467`
|
||||
- Modify: `src/service/mod.rs:20-22`
|
||||
- Test: `src/service/server.rs` (crate-local resolver tests)
|
||||
- Test: `tests/service_ws_session_test.rs`
|
||||
|
||||
- [ ] **Step 1: Delete obsolete helper logic**
|
||||
|
||||
If `derive_request_url_from_instruction(...)` is no longer needed after resolver landing, delete it completely.
|
||||
|
||||
If a tiny legacy Zhihu-only seam still remains, keep it private behind the resolver and remove the old public shape from `service::browser_ws_client` if no longer needed.
|
||||
|
||||
- [ ] **Step 2: Lock the precedence contract with one final matrix test**
|
||||
|
||||
In `src/service/server.rs` crate-local tests, add one table-driven or clearly segmented test that verifies all four final outcomes:
|
||||
- non-empty page context wins
|
||||
- deterministic plan wins when page context is absent or empty
|
||||
- skill metadata wins when page context and deterministic plan are absent
|
||||
- fallback becomes `about:blank` when nothing resolves
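
One possible shape for the matrix test is sketched below. It runs against a simplified resolver that takes three optional URLs instead of the real request, workspace, and settings arguments. `resolve`, `non_empty`, and the example URLs are hypothetical; only the four source names and the `about:blank` fallback come from this plan.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Source {
    PageContext,
    DeterministicPlan,
    SkillConfig,
    Fallback,
}

/// Treats missing, empty, and whitespace-only values the same way.
fn non_empty(value: Option<&str>) -> Option<String> {
    value.map(str::trim).filter(|s| !s.is_empty()).map(str::to_string)
}

/// Simplified resolver: the first tier that yields a URL wins.
fn resolve(page_url: Option<&str>, plan_url: Option<&str>, skill_url: Option<&str>) -> (String, Source) {
    if let Some(url) = non_empty(page_url) {
        return (url, Source::PageContext);
    }
    if let Some(url) = non_empty(plan_url) {
        return (url, Source::DeterministicPlan);
    }
    if let Some(url) = non_empty(skill_url) {
        return (url, Source::SkillConfig);
    }
    ("about:blank".to_string(), Source::Fallback)
}

fn main() {
    let page = Some("https://page.example.com/");
    let plan = Some("https://plan.example.com/");
    let skill = Some("https://skill.example.com/");
    // One row per final outcome: (page, plan, skill, expected source).
    let cases = [
        (page, plan, skill, Source::PageContext),
        (Some("   "), plan, skill, Source::DeterministicPlan),
        (None, None, skill, Source::SkillConfig),
        (None, None, None, Source::Fallback),
    ];
    for (page_url, plan_url, skill_url, expected) in cases {
        let (url, source) = resolve(page_url, plan_url, skill_url);
        assert_eq!(source, expected, "unexpected source for {url}");
    }
    println!("all precedence cases hold");
}
```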
|
||||
|
||||
- [ ] **Step 3: Run the focused resolver suite**
|
||||
|
||||
Run: `cargo test bootstrap_target --lib -- --nocapture`
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 4: Run service websocket regression coverage**
|
||||
|
||||
Run: `cargo test callback_host --test service_ws_session_test -- --nocapture`
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add src/service/server.rs src/service/mod.rs tests/service_ws_session_test.rs
|
||||
git commit -m "refactor(service): finalize bootstrap target precedence"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 6: Full verification and implementation handoff check
|
||||
|
||||
**Files:** None (verification only)
|
||||
|
||||
- [ ] **Step 1: Run focused deterministic and direct-skill tests**
|
||||
|
||||
Run: `cargo test deterministic_submit -- --nocapture`
|
||||
Expected: PASS.
|
||||
|
||||
Run: `cargo test direct_submit -- --nocapture`
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 2: Run service submit regression coverage**
|
||||
|
||||
Run: `cargo test --test service_task_flow_test -- --nocapture`
|
||||
Expected: PASS.
|
||||
|
||||
Run: `cargo test --test service_ws_session_test -- --nocapture`
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 3: Run targeted config/settings coverage if touched**
|
||||
|
||||
Run: `cargo test service_protocol_update_config_test -- --nocapture`
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 4: Build the project**
|
||||
|
||||
Run: `cargo build --bin sg_claw`
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Manual behavior checklist**
|
||||
|
||||
Verify manually:
|
||||
1. Existing page-attached submits still bootstrap against the current page URL.
|
||||
2. A deterministic line-loss submit without page context bootstraps the helper page against the line-loss target page taken from `DeterministicExecutionPlan.target_url`.
|
||||
3. Non-deterministic configured direct skill without page context uses skill metadata bootstrap URL if present.
|
||||
4. Missing or malformed skill metadata does not crash startup and falls back to `about:blank`.
|
||||
5. No service code remains that hardcodes line-loss request URL by checking raw instruction text.
|
||||
|
||||
- [ ] **Step 6: Final commit (only if verification revealed required follow-up fixes)**
|
||||
|
||||
```bash
|
||||
git add -A
|
||||
git commit -m "test: lock request URL resolution precedence"
|
||||
```
|
||||
|
||||
Only create this commit if verification required an additional code or test fix.
|
||||
762
docs/superpowers/plans/2026-04-14-service-console-enhancement.md
Normal file
@@ -0,0 +1,762 @@
|
||||
# Service Console Enhancement Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Add auto-connect on page load and a settings panel to `sg_claw_service_console.html`, with config save via WebSocket to the sgClaw service.
|
||||
|
||||
**Architecture:** The HTML page auto-connects on load and provides a settings modal. When the user saves, the page sends an `update_config` WebSocket message. The Rust service receives it, merges it with the existing config, writes the result to `sgclaw_config.json`, and responds with a `ConfigUpdated` message.
|
||||
|
||||
**Tech Stack:** Rust (serde, tungstenite), vanilla JavaScript/HTML/CSS
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Add `UpdateConfig` and `ConfigUpdated` protocol types
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/protocol.rs`
|
||||
|
||||
- [ ] **Step 1: Add `ConfigUpdatePayload` struct and `UpdateConfig` variant to `ClientMessage`**
|
||||
|
||||
Add this struct above the `ClientMessage` enum, and add the `UpdateConfig` variant to the enum:
|
||||
|
||||
```rust
|
||||
use std::path::PathBuf;
|
||||
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
pub struct ConfigUpdatePayload {
|
||||
#[serde(rename = "apiKey", default)]
|
||||
pub api_key: Option<String>,
|
||||
#[serde(rename = "baseUrl", default)]
|
||||
pub base_url: Option<String>,
|
||||
#[serde(default)]
|
||||
pub model: Option<String>,
|
||||
#[serde(rename = "skillsDir", default)]
|
||||
pub skills_dir: Option<String>,
|
||||
#[serde(rename = "directSubmitSkill", default)]
|
||||
pub direct_submit_skill: Option<String>,
|
||||
#[serde(rename = "runtimeProfile", default)]
|
||||
pub runtime_profile: Option<String>,
|
||||
#[serde(rename = "browserBackend", default)]
|
||||
pub browser_backend: Option<String>,
|
||||
}
|
||||
```
|
||||
|
||||
Add `UpdateConfig` variant to `ClientMessage` enum (after `Ping`):
|
||||
|
||||
```rust
|
||||
UpdateConfig {
|
||||
config: ConfigUpdatePayload,
|
||||
},
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Add `ConfigUpdated` variant to `ServiceMessage`**
|
||||
|
||||
Add after `Pong`:
|
||||
|
||||
```rust
|
||||
ConfigUpdated {
|
||||
success: bool,
|
||||
message: String,
|
||||
},
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Update `into_submit_task_request` to handle `UpdateConfig`**
|
||||
|
||||
In the match arm, add `ClientMessage::UpdateConfig { .. }` to the list that returns `None`:
|
||||
|
||||
```rust
|
||||
ClientMessage::Connect
|
||||
| ClientMessage::Start
|
||||
| ClientMessage::Stop
|
||||
| ClientMessage::Ping
|
||||
| ClientMessage::UpdateConfig { .. } => None,
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Run tests to verify protocol compiles**
|
||||
|
||||
Run: `cargo test --lib service::protocol`
|
||||
Expected: PASS (no protocol-specific tests yet, but it should compile)
|
||||
|
||||
### Task 2: Add `config_path()` getter to `AgentRuntimeContext`
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/agent/task_runner.rs`
|
||||
|
||||
- [ ] **Step 1: Add public getter method**
|
||||
|
||||
In the `impl AgentRuntimeContext` block, add after `load_sgclaw_settings()`:
|
||||
|
||||
```rust
|
||||
pub fn config_path(&self) -> Option<&Path> {
|
||||
self.config_path.as_deref()
|
||||
}
|
||||
```
|
||||
|
||||
Add the import at the top of the file if not present:
|
||||
|
||||
```rust
|
||||
use std::path::Path;
|
||||
```
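
The getter relies on `Option::as_deref`, which turns a borrowed `Option<PathBuf>` into an `Option<&Path>` without cloning. A self-contained sketch follows; `RuntimeContext` is a hypothetical stand-in for `AgentRuntimeContext`, reduced to the one field that matters here.

```rust
use std::path::{Path, PathBuf};

// Stand-in for AgentRuntimeContext, reduced to the config path field.
struct RuntimeContext {
    config_path: Option<PathBuf>,
}

impl RuntimeContext {
    /// Borrows the stored path; `as_deref` maps `&PathBuf` to `&Path`.
    fn config_path(&self) -> Option<&Path> {
        self.config_path.as_deref()
    }
}

fn main() {
    let with_path = RuntimeContext {
        config_path: Some(PathBuf::from("sgclaw_config.json")),
    };
    let without_path = RuntimeContext { config_path: None };
    println!("{:?}", with_path.config_path());
    println!("{:?}", without_path.config_path());
}
```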
|
||||
|
||||
- [ ] **Step 2: Run tests to verify**
|
||||
|
||||
Run: `cargo test agent::task_runner`
|
||||
Expected: PASS
|
||||
|
||||
### Task 3: Add `save_to_path()` method to `SgClawSettings`
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/config/settings.rs`
|
||||
|
||||
- [ ] **Step 1: Add Serialize derive to SgClawSettings and related types**
|
||||
|
||||
The `RawSgClawSettings` struct uses `Deserialize` only. We need to add `Serialize` to `SgClawSettings` for writing. Add `use serde::Serialize;` at the top.
|
||||
|
||||
Add `Serialize` derive to `SgClawSettings`:
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize)]
|
||||
pub struct SgClawSettings {
|
||||
```
|
||||
|
||||
Note, however, that `SgClawSettings` has enum fields (`RuntimeProfile`, `SkillsPromptMode`, `PlannerMode`, `BrowserBackend`, `OfficeBackend`) that do not implement `Serialize`, so deriving `Serialize` on the struct alone would not compile unless those types derive it too.
|
||||
|
||||
The simpler approach, used in the steps below, is to leave the existing derives unchanged and write a `to_serializable()` method that converts `SgClawSettings` into a dedicated serializable struct, then serialize that struct.
|
||||
|
||||
- [ ] **Step 2: Create serializable raw config struct**
|
||||
|
||||
Add a new struct at the bottom of the file (before tests if any):
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Serialize)]
|
||||
struct SerializableRawSgClawSettings {
|
||||
#[serde(rename = "apiKey")]
|
||||
api_key: String,
|
||||
#[serde(rename = "baseUrl")]
|
||||
base_url: String,
|
||||
model: String,
|
||||
#[serde(rename = "skillsDir", skip_serializing_if = "Option::is_none")]
|
||||
skills_dir: Option<String>,
|
||||
#[serde(rename = "directSubmitSkill", skip_serializing_if = "Option::is_none")]
|
||||
direct_submit_skill: Option<String>,
|
||||
#[serde(rename = "skillsPromptMode", skip_serializing_if = "Option::is_none")]
|
||||
skills_prompt_mode: Option<String>,
|
||||
#[serde(rename = "runtimeProfile", skip_serializing_if = "Option::is_none")]
|
||||
runtime_profile: Option<String>,
|
||||
#[serde(rename = "plannerMode", skip_serializing_if = "Option::is_none")]
|
||||
planner_mode: Option<String>,
|
||||
#[serde(rename = "activeProvider", skip_serializing_if = "Option::is_none")]
|
||||
active_provider: Option<String>,
|
||||
#[serde(rename = "browserBackend", skip_serializing_if = "Option::is_none")]
|
||||
browser_backend: Option<String>,
|
||||
#[serde(rename = "officeBackend", skip_serializing_if = "Option::is_none")]
|
||||
office_backend: Option<String>,
|
||||
#[serde(rename = "browserWsUrl", skip_serializing_if = "Option::is_none")]
|
||||
browser_ws_url: Option<String>,
|
||||
#[serde(rename = "serviceWsListenAddr", skip_serializing_if = "Option::is_none")]
|
||||
service_ws_listen_addr: Option<String>,
|
||||
#[serde(default)]
|
||||
providers: Vec<SerializableProviderSettings>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Serialize)]
|
||||
struct SerializableProviderSettings {
|
||||
id: String,
|
||||
provider: Option<String>,
|
||||
#[serde(rename = "apiKey")]
|
||||
api_key: String,
|
||||
#[serde(rename = "baseUrl", skip_serializing_if = "Option::is_none")]
|
||||
base_url: Option<String>,
|
||||
model: String,
|
||||
#[serde(rename = "apiPath", skip_serializing_if = "Option::is_none")]
|
||||
api_path: Option<String>,
|
||||
#[serde(rename = "wireApi", skip_serializing_if = "Option::is_none")]
|
||||
wire_api: Option<String>,
|
||||
#[serde(rename = "requiresOpenaiAuth")]
|
||||
requires_openai_auth: bool,
|
||||
}
|
||||
```
|
||||
|
||||
Add `use serde::Serialize;` at the top of the file (combine with existing `use serde::Deserialize;`):
|
||||
|
||||
```rust
|
||||
use serde::{Deserialize, Serialize};
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Add `to_serializable()` method to `SgClawSettings`**
|
||||
|
||||
In the `impl SgClawSettings` block, add:
|
||||
|
||||
```rust
|
||||
fn to_serializable(&self) -> SerializableRawSgClawSettings {
|
||||
let format_enum_value = |s: &str| s.to_string();
|
||||
|
||||
SerializableRawSgClawSettings {
|
||||
api_key: self.provider_api_key.clone(),
|
||||
base_url: self.provider_base_url.clone(),
|
||||
model: self.provider_model.clone(),
|
||||
skills_dir: self.skills_dir.as_ref().map(|p| p.to_string_lossy().into_owned()),
|
||||
direct_submit_skill: self.direct_submit_skill.clone(),
|
||||
skills_prompt_mode: Some(format_enum_value(match self.skills_prompt_mode {
|
||||
SkillsPromptMode::Full => "full",
|
||||
SkillsPromptMode::Compact => "compact",
|
||||
})),
|
||||
runtime_profile: Some(format_enum_value(match self.runtime_profile {
|
||||
RuntimeProfile::BrowserAttached => "browser-attached",
|
||||
RuntimeProfile::BrowserHeavy => "browser-heavy",
|
||||
RuntimeProfile::GeneralAssistant => "general-assistant",
|
||||
})),
|
||||
planner_mode: Some(format_enum_value(match self.planner_mode {
|
||||
PlannerMode::ZeroclawPlanFirst => "zeroclaw-plan-first",
|
||||
PlannerMode::LegacyDeterministic => "legacy-deterministic",
|
||||
})),
|
||||
active_provider: Some(self.active_provider.clone()),
|
||||
browser_backend: Some(format_enum_value(match self.browser_backend {
|
||||
BrowserBackend::SuperRpa => "super-rpa",
|
||||
BrowserBackend::AgentBrowser => "agent-browser",
|
||||
BrowserBackend::RustNative => "rust-native",
|
||||
BrowserBackend::ComputerUse => "computer-use",
|
||||
BrowserBackend::Auto => "auto",
|
||||
})),
|
||||
office_backend: Some(format_enum_value(match self.office_backend {
|
||||
OfficeBackend::OpenXml => "openxml",
|
||||
OfficeBackend::Disabled => "disabled",
|
||||
})),
|
||||
browser_ws_url: self.browser_ws_url.clone(),
|
||||
service_ws_listen_addr: self.service_ws_listen_addr.clone(),
|
||||
providers: self
|
||||
.providers
|
||||
.iter()
|
||||
.map(|p| SerializableProviderSettings {
|
||||
id: p.id.clone(),
|
||||
provider: Some(p.provider.clone()),
|
||||
api_key: p.api_key.clone(),
|
||||
base_url: p.base_url.clone(),
|
||||
model: p.model.clone(),
|
||||
api_path: p.api_path.clone(),
|
||||
wire_api: p.wire_api.clone(),
|
||||
requires_openai_auth: p.requires_openai_auth,
|
||||
})
|
||||
.collect(),
|
||||
}
|
||||
}
|
||||
```
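
The string names written by `to_serializable()` need to stay in sync with the strings accepted when the config is parsed or updated. One way to keep the two directions together is a pair of methods on the enum, sketched below for `RuntimeProfile` only. The method names `as_config_str` and `from_config_str` and the `ALL` constant are hypothetical and not part of the existing code; the variant names and strings are the ones used in this plan.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RuntimeProfile {
    BrowserAttached,
    BrowserHeavy,
    GeneralAssistant,
}

impl RuntimeProfile {
    // Hypothetical list of all variants, used to derive the reverse mapping.
    const ALL: [RuntimeProfile; 3] = [
        RuntimeProfile::BrowserAttached,
        RuntimeProfile::BrowserHeavy,
        RuntimeProfile::GeneralAssistant,
    ];

    /// Name written to the config file.
    fn as_config_str(self) -> &'static str {
        match self {
            RuntimeProfile::BrowserAttached => "browser-attached",
            RuntimeProfile::BrowserHeavy => "browser-heavy",
            RuntimeProfile::GeneralAssistant => "general-assistant",
        }
    }

    /// Reverse lookup built from `as_config_str`, so the two cannot drift.
    fn from_config_str(value: &str) -> Option<Self> {
        Self::ALL.iter().copied().find(|p| p.as_config_str() == value)
    }
}

fn main() {
    for profile in RuntimeProfile::ALL {
        let name = profile.as_config_str();
        assert_eq!(RuntimeProfile::from_config_str(name), Some(profile));
        println!("{name} round-trips");
    }
}
```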
|
||||
|
||||
- [ ] **Step 4: Add `save_to_path()` method**
|
||||
|
||||
In the same `impl SgClawSettings` block, add:
|
||||
|
||||
```rust
|
||||
pub fn save_to_path(&self, path: &Path) -> Result<(), ConfigError> {
|
||||
let serializable = self.to_serializable();
|
||||
let json = serde_json::to_string_pretty(&serializable)
|
||||
.map_err(|err| ConfigError::ConfigParse(path.to_path_buf(), err.to_string()))?;
|
||||
std::fs::write(path, json)
|
||||
.map_err(|err| ConfigError::ConfigRead(path.to_path_buf(), err.to_string()))
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Run tests to verify compilation**
|
||||
|
||||
Run: `cargo test --lib config::settings`
|
||||
Expected: PASS
|
||||
|
||||
### Task 4: Handle `UpdateConfig` in the service server
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs`
|
||||
- Modify: `src/service/mod.rs` (if needed for imports)
|
||||
|
||||
- [ ] **Step 1: Add `UpdateConfig` match arm in `serve_client`**
|
||||
|
||||
In the `match message` block in `serve_client`, after the `SubmitTask` arm, add:
|
||||
|
||||
```rust
|
||||
ClientMessage::UpdateConfig { config } => {
|
||||
let Some(config_path) = context.config_path() else {
|
||||
sink.send_service_message(ServiceMessage::ConfigUpdated {
|
||||
success: false,
|
||||
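message: "未找到配置文件路径。请通过 --config-path 参数启动 sg_claw 后再使用此功能。".to_string(), // "Config file path not found. Start sg_claw with the --config-path argument before using this feature."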
|
||||
})?;
|
||||
continue;
|
||||
};
|
||||
|
||||
if !config_path.exists() {
|
||||
sink.send_service_message(ServiceMessage::ConfigUpdated {
|
||||
success: false,
|
||||
message: format!("配置文件不存在: {}", config_path.display()), // "Config file does not exist: {}"
|
||||
})?;
|
||||
continue;
|
||||
}
|
||||
|
||||
let result = update_config_file(config_path, config);
|
||||
match result {
|
||||
Ok(()) => {
|
||||
sink.send_service_message(ServiceMessage::ConfigUpdated {
|
||||
success: true,
|
||||
message: "配置已保存。重启 sg_claw 以应用新配置。".to_string(), // "Config saved. Restart sg_claw to apply the new config."
|
||||
})?;
|
||||
}
|
||||
Err(err) => {
|
||||
sink.send_service_message(ServiceMessage::ConfigUpdated {
|
||||
success: false,
|
||||
message: format!("保存配置失败: {}", err), // "Failed to save config: {}"
|
||||
})?;
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Add `update_config_file` helper function**
|
||||
|
||||
Add this function above `serve_client` in `server.rs`:
|
||||
|
||||
```rust
|
||||
use crate::config::settings::SgClawSettings; // ConfigError is not needed here: the helper returns String errors
|
||||
use crate::service::protocol::ConfigUpdatePayload;
|
||||
use std::path::Path;
|
||||
|
||||
fn update_config_file(config_path: &Path, config: ConfigUpdatePayload) -> Result<(), String> {
|
||||
let mut settings = SgClawSettings::load(Some(config_path))
|
||||
.map_err(|e| e.to_string())?
|
||||
.ok_or_else(|| "无法读取现有配置".to_string())?; // "Unable to read the existing config"
|
||||
|
||||
if let Some(v) = config.api_key {
|
||||
settings.provider_api_key = v;
|
||||
}
|
||||
if let Some(v) = config.base_url {
|
||||
settings.provider_base_url = v;
|
||||
}
|
||||
if let Some(v) = config.model {
|
||||
settings.provider_model = v;
|
||||
}
|
||||
if let Some(v) = config.skills_dir {
|
||||
settings.skills_dir = Some(PathBuf::from(&v));
|
||||
}
|
||||
if let Some(v) = config.direct_submit_skill {
|
||||
settings.direct_submit_skill = Some(v);
|
||||
}
|
||||
if let Some(v) = config.runtime_profile {
|
||||
settings.runtime_profile = match v.as_str() {
|
||||
"browser-attached" => crate::config::settings::RuntimeProfile::BrowserAttached,
|
||||
"browser-heavy" => crate::config::settings::RuntimeProfile::BrowserHeavy,
|
||||
"general-assistant" => crate::config::settings::RuntimeProfile::GeneralAssistant,
|
||||
_ => return Err(format!("无效的 runtimeProfile: {}", v)),
|
||||
};
|
||||
}
|
||||
if let Some(v) = config.browser_backend {
|
||||
settings.browser_backend = match v.as_str() {
|
||||
"super-rpa" => crate::config::settings::BrowserBackend::SuperRpa,
|
||||
"agent-browser" => crate::config::settings::BrowserBackend::AgentBrowser,
|
||||
"rust-native" => crate::config::settings::BrowserBackend::RustNative,
|
||||
"computer-use" => crate::config::settings::BrowserBackend::ComputerUse,
|
||||
"auto" => crate::config::settings::BrowserBackend::Auto,
|
||||
_ => return Err(format!("无效的 browserBackend: {}", v)),
|
||||
};
|
||||
}
|
||||
|
||||
settings
|
||||
.save_to_path(config_path)
|
||||
.map_err(|e| format!("写入配置文件失败: {}", e))
|
||||
}
|
||||
```
|
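The `if let Some(v)` chain in `update_config_file` applies only the fields the client actually sent. The merge rule can be sketched in isolation as follows. `Settings`, `Payload`, and `apply_update` are hypothetical, simplified stand-ins for illustration, not the real `SgClawSettings` or `ConfigUpdatePayload`.

```rust
/// Hypothetical, simplified stand-in for the persisted settings.
#[derive(Debug, Clone, PartialEq)]
struct Settings {
    api_key: String,
    model: String,
    skills_dir: Option<String>,
}

/// Hypothetical stand-in for the update payload: every field is optional.
#[derive(Debug, Default)]
struct Payload {
    api_key: Option<String>,
    model: Option<String>,
    skills_dir: Option<String>,
}

/// Overwrite only the fields present in the payload; absent fields keep
/// their current value.
fn apply_update(mut settings: Settings, payload: Payload) -> Settings {
    if let Some(v) = payload.api_key {
        settings.api_key = v;
    }
    if let Some(v) = payload.model {
        settings.model = v;
    }
    if let Some(v) = payload.skills_dir {
        settings.skills_dir = Some(v);
    }
    settings
}
```

One consequence of this shape is that `None` always means "keep the current value", so a payload of this form cannot clear an optional setting such as `skills_dir`.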
||||
|
||||
Add the import at the top of server.rs:
|
||||
|
||||
```rust
|
||||
use std::path::PathBuf;
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Build to verify compilation**
|
||||
|
||||
Run: `cargo build`
|
||||
Expected: SUCCESS
|
||||
|
||||
### Task 5: Add auto-connect and settings UI to the service console HTML
|
||||
|
||||
**Files:**
|
||||
- Modify: `frontend/service-console/sg_claw_service_console.html`
|
||||
|
||||
- [ ] **Step 1: Add auto-connect on page load**
|
||||
|
||||
At the very end of the `<script>` section, after the existing event listeners and `updateUiState()`, add:
|
||||
|
||||
```javascript
|
||||
// Auto-connect on page load
|
||||
window.addEventListener("DOMContentLoaded", () => {
|
||||
connectOrDisconnectService(true);
|
||||
});
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Add Settings button HTML**
|
||||
|
||||
In the sidebar section of the HTML, after the connect button and before the "Composer" section label, add:
|
||||
|
||||
```html
|
||||
<button id="settingsBtn" class="ghost-btn" style="margin-top: 8px;">⚙ 设置</button>
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Add Settings modal HTML**
|
||||
|
||||
Before the closing `</body>` tag, add the modal HTML:
|
||||
|
||||
```html
|
||||
<!-- Settings Modal -->
|
||||
<div id="settingsModal" style="display: none; position: fixed; top: 0; left: 0; width: 100%; height: 100%; background: rgba(0,0,0,0.5); z-index: 1000; align-items: center; justify-content: center;">
|
||||
<div style="background: var(--panel); border-radius: 20px; padding: 28px; width: min(520px, 90%); max-height: 85vh; overflow-y: auto; box-shadow: var(--shadow);">
|
||||
<h3 style="margin: 0 0 20px; font-size: 1.2rem;">sgClaw 配置</h3>
|
||||
|
||||
<div class="field">
|
||||
<label for="settingApiKey">API 密钥 *</label>
|
||||
<input id="settingApiKey" type="password" placeholder="输入模型 API 密钥" />
|
||||
</div>
|
||||
|
||||
<div class="field">
|
||||
<label for="settingBaseUrl">模型服务地址 *</label>
|
||||
<input id="settingBaseUrl" type="url" placeholder="例如:https://api.deepseek.com" />
|
||||
</div>
|
||||
|
||||
<div class="field">
|
||||
<label for="settingModel">模型名称 *</label>
|
||||
<input id="settingModel" type="text" placeholder="例如:deepseek-chat" />
|
||||
</div>
|
||||
|
||||
<div class="field">
|
||||
<label for="settingSkillsDir">Skills 目录路径</label>
|
||||
<input id="settingSkillsDir" type="text" placeholder="例如:D:/data/ideaSpace/rust/sgClaw/claw/claw/skills/skill_staging/skills" />
|
||||
</div>
|
||||
|
||||
<div class="field">
|
||||
<label for="settingDirectSubmitSkill">直接提交技能</label>
|
||||
<input id="settingDirectSubmitSkill" type="text" placeholder="例如:tq-lineloss-report.collect_lineloss" />
|
||||
</div>
|
||||
|
||||
<div class="field">
|
||||
<label for="settingRuntimeProfile">运行模式</label>
|
||||
<select id="settingRuntimeProfile" style="width: 100%; border: 1px solid var(--line); border-radius: 16px; padding: 14px 16px; background: rgba(255, 255, 255, 0.92); color: var(--text); font: inherit;">
|
||||
<option value="browser-attached">browser-attached</option>
|
||||
<option value="browser-heavy">browser-heavy</option>
|
||||
<option value="general-assistant">general-assistant</option>
|
||||
</select>
|
||||
</div>
|
||||
|
||||
<div class="field">
|
||||
<label for="settingBrowserBackend">浏览器后端</label>
|
||||
<select id="settingBrowserBackend" style="width: 100%; border: 1px solid var(--line); border-radius: 16px; padding: 14px 16px; background: rgba(255, 255, 255, 0.92); color: var(--text); font: inherit;">
|
||||
<option value="super-rpa">super-rpa</option>
|
||||
<option value="agent-browser">agent-browser</option>
|
||||
<option value="rust-native">rust-native</option>
|
||||
<option value="computer-use">computer-use</option>
|
||||
<option value="auto">auto</option>
|
||||
</select>
|
||||
</div>
|
||||
|
||||
<div id="settingsValidation" style="color: var(--error); font-size: 0.92rem; min-height: 1.4em; margin: 10px 0;"></div>
|
||||
|
||||
<div style="display: flex; gap: 12px; margin-top: 16px;">
|
||||
<button id="settingsSaveBtn" class="primary-btn" style="flex: 1;">保存</button>
|
||||
<button id="settingsCancelBtn" class="ghost-btn" style="flex: 1;">取消</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Add settings modal CSS**
|
||||
|
||||
Add these CSS rules inside the `<style>` block, before the `@media` query:
|
||||
|
||||
```css
|
||||
/* Settings modal elements */
|
||||
select {
|
||||
width: 100%;
|
||||
border: 1px solid var(--line);
|
||||
border-radius: 16px;
|
||||
padding: 14px 16px;
|
||||
background: rgba(255, 255, 255, 0.92);
|
||||
color: var(--text);
|
||||
font: inherit;
|
||||
outline: none;
|
||||
cursor: pointer;
|
||||
}
|
||||
|
||||
select:focus {
|
||||
border-color: rgba(15, 118, 110, 0.5);
|
||||
box-shadow: 0 0 0 4px rgba(15, 118, 110, 0.12);
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Add settings modal JavaScript logic**
|
||||
|
||||
Add this JavaScript at the end of the `<script>` section, before the closing `</script>` tag:
|
||||
|
||||
```javascript
|
||||
// Settings modal state
|
||||
const settingsElements = {
|
||||
modal: document.getElementById("settingsModal"),
|
||||
apiKey: document.getElementById("settingApiKey"),
|
||||
baseUrl: document.getElementById("settingBaseUrl"),
|
||||
model: document.getElementById("settingModel"),
|
||||
skillsDir: document.getElementById("settingSkillsDir"),
|
||||
directSubmitSkill: document.getElementById("settingDirectSubmitSkill"),
|
||||
runtimeProfile: document.getElementById("settingRuntimeProfile"),
|
||||
browserBackend: document.getElementById("settingBrowserBackend"),
|
||||
validation: document.getElementById("settingsValidation"),
|
||||
saveBtn: document.getElementById("settingsSaveBtn"),
|
||||
cancelBtn: document.getElementById("settingsCancelBtn"),
|
||||
};
|
||||
let settingsOpenBtn = null; // will be set below
|
||||
|
||||
function openSettingsModal() {
|
||||
// Reset every field to its default; current config values are not pre-filled
|
||||
settingsElements.apiKey.value = "";
|
||||
settingsElements.baseUrl.value = "";
|
||||
settingsElements.model.value = "";
|
||||
settingsElements.skillsDir.value = "";
|
||||
settingsElements.directSubmitSkill.value = "";
|
||||
settingsElements.runtimeProfile.value = "browser-attached";
|
||||
settingsElements.browserBackend.value = "super-rpa";
|
||||
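settingsElements.validation.textContent = "";
settingsElements.validation.style.color = "var(--error)"; // undo the success color left by a previous save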
|
||||
settingsElements.modal.style.display = "flex";
|
||||
}
|
||||
|
||||
function closeSettingsModal() {
|
||||
settingsElements.modal.style.display = "none";
|
||||
}
|
||||
|
||||
function validateSettings() {
|
||||
const apiKey = settingsElements.apiKey.value.trim();
|
||||
const baseUrl = settingsElements.baseUrl.value.trim();
|
||||
const model = settingsElements.model.value.trim();
|
||||
|
||||
if (!apiKey) {
|
||||
return "API 密钥不能为空";
|
||||
}
|
||||
if (!model) {
|
||||
return "模型名称不能为空";
|
||||
}
|
||||
if (!baseUrl) {
|
||||
return "模型服务地址不能为空";
|
||||
}
|
||||
try {
|
||||
new URL(baseUrl);
|
||||
} catch {
|
||||
return "模型服务地址格式无效,请输入有效的 URL";
|
||||
}
|
||||
return "";
|
||||
}
|
||||
|
||||
function saveSettings() {
|
||||
const error = validateSettings();
|
||||
if (error) {
|
||||
settingsElements.validation.textContent = error;
|
||||
return;
|
||||
}
|
||||
|
||||
if (!socket || socket.readyState !== WebSocket.OPEN) {
|
||||
settingsElements.validation.textContent = "请先连接服务";
|
||||
return;
|
||||
}
|
||||
|
||||
settingsElements.validation.textContent = "";
|
||||
settingsElements.saveBtn.disabled = true;
|
||||
settingsElements.saveBtn.textContent = "保存中...";
|
||||
|
||||
const config = {
|
||||
apiKey: settingsElements.apiKey.value.trim(),
|
||||
baseUrl: settingsElements.baseUrl.value.trim(),
|
||||
model: settingsElements.model.value.trim(),
|
||||
};
|
||||
|
||||
const skillsDir = settingsElements.skillsDir.value.trim();
|
||||
if (skillsDir) config.skillsDir = skillsDir;
|
||||
|
||||
const directSubmitSkill = settingsElements.directSubmitSkill.value.trim();
|
||||
if (directSubmitSkill) config.directSubmitSkill = directSubmitSkill;
|
||||
|
||||
config.runtimeProfile = settingsElements.runtimeProfile.value;
|
||||
config.browserBackend = settingsElements.browserBackend.value;
|
||||
|
||||
socket.send(JSON.stringify({
|
||||
type: "update_config",
|
||||
config,
|
||||
}));
|
||||
}
|
||||
|
||||
function handleConfigResponse(message) {
|
||||
settingsElements.saveBtn.disabled = false;
|
||||
settingsElements.saveBtn.textContent = "保存";
|
||||
|
||||
if (message.success) {
|
||||
settingsElements.validation.textContent = message.message;
|
||||
settingsElements.validation.style.color = "var(--success)";
|
||||
// Auto-close after 2 seconds on success
|
||||
setTimeout(closeSettingsModal, 2000);
|
||||
} else {
|
||||
settingsElements.validation.textContent = message.message;
|
||||
settingsElements.validation.style.color = "var(--error)";
|
||||
}
|
||||
}
|
||||
|
||||
// Event listeners for settings
|
||||
settingsOpenBtn = document.getElementById("settingsBtn");
|
||||
settingsOpenBtn.addEventListener("click", openSettingsModal);
|
||||
settingsElements.cancelBtn.addEventListener("click", closeSettingsModal);
|
||||
settingsElements.saveBtn.addEventListener("click", saveSettings);
|
||||
|
||||
// Close modal on background click
|
||||
settingsElements.modal.addEventListener("click", (e) => {
|
||||
if (e.target === settingsElements.modal) {
|
||||
closeSettingsModal();
|
||||
}
|
||||
});
|
||||
```
|
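The checks in `validateSettings` run only in the browser, so any other WebSocket client could still send an `update_config` message with empty fields. A server-side mirror of those checks might look like the sketch below. `validate_update` is a hypothetical helper, not part of the plan, and its URL check is deliberately crude (scheme prefix plus non-empty host) and stricter than the browser's `new URL()`.

```rust
/// Hypothetical server-side mirror of the browser's `validateSettings`.
/// Returns the first problem found, or `Ok(())` when the required fields
/// look usable.
fn validate_update(api_key: &str, base_url: &str, model: &str) -> Result<(), String> {
    if api_key.trim().is_empty() {
        return Err("api key must not be empty".to_string());
    }
    if model.trim().is_empty() {
        return Err("model must not be empty".to_string());
    }
    let url = base_url.trim();
    if url.is_empty() {
        return Err("base url must not be empty".to_string());
    }
    // Assumption: only http(s) endpoints are meaningful for a model service.
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))
        .ok_or_else(|| format!("base url must start with http:// or https://: {url}"))?;
    // Whatever precedes the first '/' is treated as the host part.
    if rest.split('/').next().unwrap_or("").is_empty() {
        return Err(format!("base url has no host: {url}"));
    }
    Ok(())
}
```

If something like this were adopted, the error string could be returned through the existing `ConfigUpdated { success: false, .. }` message, so the modal would display it the same way as a save failure.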
||||
|
||||
- [ ] **Step 6: Handle `config_updated` message in `handleMessage`**
|
||||
|
||||
In the existing `handleMessage` function, add a new case in the switch statement:
|
||||
|
||||
```javascript
|
||||
case "config_updated":
|
||||
handleConfigResponse(message);
|
||||
break;
|
||||
```
|
||||
|
||||
- [ ] **Step 7: Verify the settings UI in a browser**
|
||||
|
||||
Open the file in a browser and visually check that:
|
||||
- The settings button appears below the connect button
|
||||
- Clicking it opens the modal
|
||||
- The modal closes on Cancel or background click
|
||||
|
||||
### Task 6: Add protocol tests for new message types
|
||||
|
||||
**Files:**
|
||||
- Modify: `tests/service_console_html_test.rs`
|
||||
- Create: `tests/service_protocol_update_config_test.rs`
|
||||
|
||||
- [ ] **Step 1: Create protocol serialization test**
|
||||
|
||||
Create `tests/service_protocol_update_config_test.rs`:
|
||||
|
||||
```rust
|
||||
use sgclaw::service::protocol::{ClientMessage, ConfigUpdatePayload, ServiceMessage};
|
||||
|
||||
#[test]
|
||||
fn update_config_serializes_correctly() {
|
||||
let config = ConfigUpdatePayload {
|
||||
api_key: Some("test-key".to_string()),
|
||||
base_url: Some("https://api.example.com".to_string()),
|
||||
model: Some("test-model".to_string()),
|
||||
skills_dir: Some("/path/to/skills".to_string()),
|
||||
direct_submit_skill: Some("my-skill.my-tool".to_string()),
|
||||
runtime_profile: Some("browser-attached".to_string()),
|
||||
browser_backend: Some("super-rpa".to_string()),
|
||||
};
|
||||
|
||||
let msg = ClientMessage::UpdateConfig { config };
|
||||
let json = serde_json::to_string(&msg).unwrap();
|
||||
|
||||
assert!(json.contains("\"type\":\"update_config\""));
|
||||
assert!(json.contains("\"apiKey\":\"test-key\""));
|
||||
assert!(json.contains("\"baseUrl\":\"https://api.example.com\""));
|
||||
assert!(json.contains("\"model\":\"test-model\""));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn update_config_deserializes_correctly() {
|
||||
let json = r#"{
|
||||
"type": "update_config",
|
||||
"config": {
|
||||
"apiKey": "key123",
|
||||
"baseUrl": "https://api.test.com",
|
||||
"model": "gpt-4"
|
||||
}
|
||||
}"#;
|
||||
|
||||
let msg: ClientMessage = serde_json::from_str(json).unwrap();
|
||||
match msg {
|
||||
ClientMessage::UpdateConfig { config } => {
|
||||
assert_eq!(config.api_key, Some("key123".to_string()));
|
||||
assert_eq!(config.base_url, Some("https://api.test.com".to_string()));
|
||||
assert_eq!(config.model, Some("gpt-4".to_string()));
|
||||
assert!(config.skills_dir.is_none());
|
||||
}
|
||||
_ => panic!("expected UpdateConfig variant"),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn config_updated_serializes_correctly() {
|
||||
let msg = ServiceMessage::ConfigUpdated {
|
||||
success: true,
|
||||
message: "配置已保存".to_string(),
|
||||
};
|
||||
let json = serde_json::to_string(&msg).unwrap();
|
||||
|
||||
assert!(json.contains("\"type\":\"config_updated\""));
|
||||
assert!(json.contains("\"success\":true"));
|
||||
assert!(json.contains("配置已保存"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn config_updated_deserializes_correctly() {
|
||||
let json = r#"{"type":"config_updated","success":false,"message":"保存失败"}"#;
|
||||
let msg: ServiceMessage = serde_json::from_str(json).unwrap();
|
||||
|
||||
match msg {
|
||||
ServiceMessage::ConfigUpdated { success, message } => {
|
||||
assert!(!success);
|
||||
assert_eq!(message, "保存失败");
|
||||
}
|
||||
_ => panic!("expected ConfigUpdated variant"),
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Update service console HTML test**
|
||||
|
||||
Add to `tests/service_console_html_test.rs`, at the end of the existing test:
|
||||
|
||||
```rust
|
||||
// New enhancement assertions
|
||||
assert!(source.contains("DOMContentLoaded"));
|
||||
assert!(source.contains("settingsBtn"));
|
||||
assert!(source.contains("settingsModal"));
|
||||
assert!(source.contains("update_config"));
|
||||
assert!(source.contains("config_updated"));
|
||||
assert!(source.contains("settingApiKey"));
|
||||
assert!(source.contains("settingBaseUrl"));
|
||||
assert!(source.contains("settingModel"));
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Run all new tests**
|
||||
|
||||
Run: `cargo test --test service_protocol_update_config_test`
|
||||
Run: `cargo test --test service_console_html_test`
|
||||
Expected: All PASS
|
||||
|
||||
### Task 7: Full build and test verification
|
||||
|
||||
- [ ] **Step 1: Run full test suite**
|
||||
|
||||
Run: `cargo test 2>&1`
|
||||
Expected: All tests pass (except pre-existing `lineloss_period_resolver_prompts_for_missing_period` which was already failing before our changes)
|
||||
|
||||
- [ ] **Step 2: Build release binary**
|
||||
|
||||
Run: `cargo build --release 2>&1`
|
||||
Expected: SUCCESS
|
||||
|
||||
### Task 8: Manual smoke test instructions
|
||||
|
||||
After implementation, verify manually:
|
||||
|
||||
1. Start sg_claw with config path: `sg_claw.exe --config-path sgclaw_config.json`
|
||||
2. Open `sg_claw_service_console.html` in browser
|
||||
3. Verify: Page auto-connects (should show "已连接" within a few seconds)
|
||||
4. Click "设置" button
|
||||
5. Fill in API Key, Base URL, Model
|
||||
6. Click "保存"
|
||||
7. Verify: Modal shows "配置已保存。重启 sg_claw 以应用新配置。" and auto-closes after 2 seconds
|
||||
8. Verify: `sgclaw_config.json` file contains the new values
|
||||
9. Verify: Existing task submission still works (send a test instruction)
|
||||
@@ -0,0 +1,810 @@
|
||||
# Multi-Scene-Kind Generator Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
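**Goal:** Extend `sg_scene_generate` to support multiple scene kinds, letting users pick the scene kind manually in the Web UI (report collection or monitoring) instead of relying on meta tags in third-party scene directories.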
|
||||
|
||||
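**Architecture:** Relax the meta validation in analyzer.rs so the meta tags become optional; add a `--scene-kind` argument to the CLI; have generator.rs pick a template based on the scene kind; add a scene-kind dropdown to the Web UI.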
|
||||
|
||||
**Tech Stack:** Rust, Node.js, HTML/CSS/JS
|
||||
|
||||
---
|
||||
|
||||
## File Map
|
||||
|
||||
### Core Rust files (backend)
|
||||
|
||||
- **Modify:** `src/generated_scene/analyzer.rs`: relax meta validation, add `SceneKind::Monitoring`, and add a `scene_kind_hint` parameter to the function signature
|
||||
- **Modify:** `src/generated_scene/generator.rs`: multi-template support, routing to a template function based on `SceneKind`
|
||||
- **Modify:** `src/bin/sg_scene_generate.rs`: add the `--scene-kind` CLI argument
|
||||
|
||||
### Frontend files (Web UI)
|
||||
|
||||
- **Modify:** `frontend/scene-generator/sg_scene_generator.html`: add a scene-kind dropdown
|
||||
- **Modify:** `frontend/scene-generator/server.js`: pass the `sceneKind` parameter through the `/generate` endpoint
|
||||
- **Modify:** `frontend/scene-generator/generator-runner.js`: add a `sceneKind` parameter to `runGenerator`
|
||||
|
||||
### Test files
|
||||
|
||||
- **Modify:** `tests/scene_generator_test.rs`: add monitoring scene tests
|
||||
- **Create:** `tests/fixtures/generated_scene/monitoring/index.html`: monitoring fixture
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Extend the SceneKind enum and the analyzer function signature
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/generated_scene/analyzer.rs:1-127`
|
||||
- Test: `tests/scene_generator_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write failing tests: analyzer accepts a scene_kind_hint parameter**
|
||||
|
||||
Edit `tests/scene_generator_test.rs` and add these tests:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn analyzer_accepts_missing_meta_with_scene_kind_hint() {
|
||||
// The non_report fixture has no scene-kind meta tag
|
||||
let analysis = analyze_scene_source_with_hint(
|
||||
Path::new("tests/fixtures/generated_scene/non_report"),
|
||||
Some(SceneKind::ReportCollection),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
// Should succeed, taking the kind from the hint parameter
|
||||
assert_eq!(analysis.scene_kind, SceneKind::ReportCollection);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn analyzer_uses_hint_when_meta_missing() {
|
||||
let analysis = analyze_scene_source_with_hint(
|
||||
Path::new("tests/fixtures/generated_scene/non_report"),
|
||||
Some(SceneKind::Monitoring),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
assert_eq!(analysis.scene_kind, SceneKind::Monitoring);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn analyzer_uses_meta_when_present_and_no_hint() {
|
||||
// The report_collection fixture has the correct meta tags
|
||||
let analysis = analyze_scene_source_with_hint(
|
||||
Path::new("tests/fixtures/generated_scene/report_collection"),
|
||||
None,
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
assert_eq!(analysis.scene_kind, SceneKind::ReportCollection);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn analyzer_hint_overrides_meta() {
|
||||
// The user's choice takes precedence over the meta tag
|
||||
let analysis = analyze_scene_source_with_hint(
|
||||
Path::new("tests/fixtures/generated_scene/report_collection"),
|
||||
Some(SceneKind::Monitoring),
|
||||
)
|
||||
.unwrap();
|
||||
|
||||
assert_eq!(analysis.scene_kind, SceneKind::Monitoring);
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Run the tests and confirm they fail**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test scene_generator_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL, because the `analyze_scene_source_with_hint` function does not exist yet
|
||||
|
||||
- [ ] **Step 3: Add the SceneKind::Monitoring enum variant**
|
||||
|
||||
Edit `src/generated_scene/analyzer.rs` and extend the enum:
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone, PartialEq, Eq)]
|
||||
pub enum SceneKind {
|
||||
ReportCollection,
|
||||
Monitoring,
|
||||
}
|
||||
|
||||
impl SceneKind {
|
||||
pub fn from_str(s: &str) -> Option<Self> {
|
||||
match s {
|
||||
"report_collection" => Some(Self::ReportCollection),
|
||||
"monitoring" => Some(Self::Monitoring),
|
||||
_ => None,
|
||||
}
|
||||
}
|
||||
|
||||
pub fn as_str(&self) -> &'static str {
|
||||
match self {
|
||||
Self::ReportCollection => "report_collection",
|
||||
Self::Monitoring => "monitoring",
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
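An inherent method named `from_str` works, but Clippy's `should_implement_trait` lint may flag it because the name shadows the standard trait method. One alternative, sketched here under the assumption that a plain `String` error is acceptable (the plan does not specify an error type), is to implement `std::str::FromStr`, which also enables `"monitoring".parse::<SceneKind>()`:

```rust
use std::str::FromStr;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SceneKind {
    ReportCollection,
    Monitoring,
}

impl FromStr for SceneKind {
    // Assumption: a plain String error is enough for this sketch.
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "report_collection" => Ok(Self::ReportCollection),
            "monitoring" => Ok(Self::Monitoring),
            other => Err(format!("unknown scene kind: {other}")),
        }
    }
}
```

With this variant, call sites in the plan that expect an `Option` would need a trailing `.ok()`, for example `meta.parse::<SceneKind>().ok()`, so the inherent method remains the smaller change.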
||||
|
||||
- [ ] **Step 4: Implement the new function that takes a hint parameter**
|
||||
|
||||
Add the new function to `src/generated_scene/analyzer.rs`:
|
||||
|
||||
```rust
|
||||
pub fn analyze_scene_source_with_hint(
|
||||
source_dir: &Path,
|
||||
scene_kind_hint: Option<SceneKind>,
|
||||
) -> Result<SceneSourceAnalysis, AnalyzeSceneError> {
|
||||
let index_path = source_dir.join("index.html");
|
||||
let html = fs::read_to_string(&index_path).map_err(|err| {
|
||||
AnalyzeSceneError::new(format!(
|
||||
"failed to read scene source {}: {err}",
|
||||
index_path.display()
|
||||
))
|
||||
})?;
|
||||
|
||||
// Read the kind from the meta tag (optional)
|
||||
let meta_scene_kind = meta_content(&html, "sgclaw-scene-kind");
|
||||
let _meta_tool_kind = meta_content(&html, "sgclaw-tool-kind"); // read but unused: tool_kind is fixed below
|
||||
|
||||
// The user hint takes precedence over the meta tag; the default is ReportCollection
|
||||
let scene_kind = scene_kind_hint
|
||||
.or_else(|| meta_scene_kind.as_deref().and_then(SceneKind::from_str))
|
||||
.unwrap_or(SceneKind::ReportCollection);
|
||||
|
||||
// tool_kind is fixed to BrowserScript (the only kind V1 supports)
|
||||
let tool_kind = ToolKind::BrowserScript;
|
||||
|
||||
// Reject an unrecognized sgclaw-scene-kind meta value (if present), even when a hint was supplied
|
||||
if let Some(meta) = meta_scene_kind.as_deref() {
|
||||
if SceneKind::from_str(meta).is_none() {
|
||||
return Err(AnalyzeSceneError::new(format!(
|
||||
"unknown sgclaw-scene-kind: {}",
|
||||
meta
|
||||
)));
|
||||
}
|
||||
}
|
||||
|
||||
let target_url = meta_content(&html, "sgclaw-target-url");
|
||||
let expected_domain = meta_content(&html, "sgclaw-expected-domain");
|
||||
let entry_script = meta_content(&html, "sgclaw-entry-script");
|
||||
|
||||
// The report_collection kind requires target_url, expected_domain, and entry_script
|
||||
// For the monitoring kind these fields are optional (a simplified template is generated)
|
||||
if scene_kind == SceneKind::ReportCollection {
|
||||
if target_url.as_deref().unwrap_or_default().trim().is_empty()
|
||||
|| expected_domain
|
||||
.as_deref()
|
||||
.unwrap_or_default()
|
||||
.trim()
|
||||
.is_empty()
|
||||
|| entry_script
|
||||
.as_deref()
|
||||
.unwrap_or_default()
|
||||
.trim()
|
||||
.is_empty()
|
||||
{
|
||||
return Err(AnalyzeSceneError::new(
|
||||
"report_collection scene source must declare target url, expected domain, and entry script",
|
||||
));
|
||||
}
|
||||
}
|
||||
|
||||
Ok(SceneSourceAnalysis {
|
||||
scene_kind,
|
||||
tool_kind,
|
||||
bootstrap: BootstrapAnalysis {
|
||||
target_url,
|
||||
expected_domain,
|
||||
},
|
||||
collection_entry_script: entry_script,
|
||||
source_dir: source_dir.to_path_buf(),
|
||||
})
|
||||
}
|
||||
|
||||
// Keep the original function signature so existing callers still compile
|
||||
pub fn analyze_scene_source(source_dir: &Path) -> Result<SceneSourceAnalysis, AnalyzeSceneError> {
|
||||
analyze_scene_source_with_hint(source_dir, None)
|
||||
}
|
||||
```
|
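The precedence implemented in `analyze_scene_source_with_hint` is: user hint first, then the meta tag, then the `ReportCollection` default, with an unrecognized meta value rejected in every case. A stand-alone sketch of just that rule follows. `resolve_scene_kind` and `parse_kind` are hypothetical names used for illustration, and the enum is redeclared locally so the block compiles on its own.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SceneKind {
    ReportCollection,
    Monitoring,
}

/// Hypothetical equivalent of the plan's `SceneKind::from_str`.
fn parse_kind(s: &str) -> Option<SceneKind> {
    match s {
        "report_collection" => Some(SceneKind::ReportCollection),
        "monitoring" => Some(SceneKind::Monitoring),
        _ => None,
    }
}

/// Hint wins over meta, meta wins over the default, and an unknown meta
/// value is an error even when a hint is present.
fn resolve_scene_kind(
    hint: Option<SceneKind>,
    meta: Option<&str>,
) -> Result<SceneKind, String> {
    let from_meta = match meta {
        Some(raw) => Some(
            parse_kind(raw).ok_or_else(|| format!("unknown sgclaw-scene-kind: {raw}"))?,
        ),
        None => None,
    };
    Ok(hint.or(from_meta).unwrap_or(SceneKind::ReportCollection))
}
```

Keeping the rule in one small function like this would make the four analyzer tests from Step 1 expressible without touching the filesystem, though the plan itself tests through fixtures.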
||||
|
||||
- [ ] **Step 5: Run the tests and confirm they pass**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test scene_generator_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 6: Commit the analyzer changes**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git add src/generated_scene/analyzer.rs tests/scene_generator_test.rs
|
||||
git commit -m "feat: add SceneKind::Monitoring and scene_kind_hint param to analyzer"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Update the generator to support multiple templates
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/generated_scene/generator.rs:1-204`
|
||||
|
||||
- [ ] **Step 1: Write a failing test: generator emits the monitoring template**
|
||||
|
||||
Edit `tests/scene_generator_test.rs` and add this test:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn generator_emits_monitoring_template() {
|
||||
let output_root = temp_workspace("sgclaw-monitoring-generator");
|
||||
|
||||
generate_scene_package(GenerateSceneRequest {
|
||||
source_dir: PathBuf::from("tests/fixtures/generated_scene/monitoring"),
|
||||
scene_id: "sample-monitor-scene".to_string(),
|
||||
scene_name: "示例监测场景".to_string(),
|
||||
scene_kind: Some(SceneKind::Monitoring),
|
||||
output_root: output_root.clone(),
|
||||
lessons_path: PathBuf::from("docs/superpowers/references/tq-lineloss-lessons-learned.toml"),
|
||||
})
|
||||
.unwrap();
|
||||
|
||||
let skill_root = output_root.join("skills/sample-monitor-scene");
|
||||
assert!(skill_root.join("SKILL.toml").exists());
|
||||
assert!(skill_root.join("scene.toml").exists());
|
||||
|
||||
let generated_manifest = fs::read_to_string(skill_root.join("scene.toml")).unwrap();
|
||||
assert!(generated_manifest.contains("category = \"monitoring\""));
|
||||
// Monitoring scenes should not have an org/period resolver
|
||||
assert!(!generated_manifest.contains("resolver = \"dictionary_entity\""));
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Run the tests and confirm they fail**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test scene_generator_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL, because `GenerateSceneRequest` has no `scene_kind` field
|
||||
|
||||
- [ ] **Step 3: Add a scene_kind field to GenerateSceneRequest**
|
||||
|
||||
Edit `src/generated_scene/generator.rs`:
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct GenerateSceneRequest {
|
||||
pub source_dir: PathBuf,
|
||||
pub scene_id: String,
|
||||
pub scene_name: String,
|
||||
pub scene_kind: Option<SceneKind>, // new
|
||||
pub output_root: PathBuf,
|
||||
pub lessons_path: PathBuf,
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Make generate_scene_package use the new analyzer function**
|
||||
|
||||
Edit `src/generated_scene/generator.rs`:
|
||||
|
||||
```rust
|
||||
use crate::generated_scene::analyzer::{analyze_scene_source_with_hint, AnalyzeSceneError, SceneKind};
|
||||
|
||||
pub fn generate_scene_package(
|
||||
request: GenerateSceneRequest,
|
||||
) -> Result<PathBuf, GenerateSceneError> {
|
||||
let analysis = analyze_scene_source_with_hint(&request.source_dir, request.scene_kind.clone())?;
|
||||
// ... rest of the function
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Implement the monitoring template function**
|
||||
|
||||
Add to `src/generated_scene/generator.rs`:
|
||||
|
||||
```rust
|
||||
fn scene_toml_monitoring(
|
||||
request: &GenerateSceneRequest,
|
||||
analysis: &SceneSourceAnalysis,
|
||||
tool_name: &str,
|
||||
) -> String {
|
||||
let expected_domain = analysis.bootstrap.expected_domain.as_deref().unwrap_or("");
|
||||
let target_url = analysis.bootstrap.target_url.as_deref().unwrap_or("");
|
||||
|
||||
format!(
|
||||
"[scene]\nid = \"{}\"\nskill = \"{}\"\ntool = \"{}\"\nkind = \"browser_script\"\nversion = \"0.1.0\"\ncategory = \"monitoring\"\n\n[manifest]\nschema_version = \"1\"\n\n[bootstrap]\nexpected_domain = \"{}\"\ntarget_url = \"{}\"\nrequires_target_page = true\n\n[deterministic]\nsuffix = \"。。。\"\ninclude_keywords = [\"{}\"]\nexclude_keywords = []\n\n# 参数部分留空,用户手动编辑\n# [[params]]\n# name = \"xxx\"\n# resolver = \"literal_passthrough\"\n\n[artifact]\ntype = \"monitoring-status\"\nsuccess_status = [\"ok\", \"running\"]\nfailure_status = [\"error\", \"timeout\"]\n\n# 后处理留空,用户手动编辑\n",
|
||||
request.scene_id,
|
||||
request.scene_id,
|
||||
tool_name,
|
||||
expected_domain,
|
||||
target_url,
|
||||
request.scene_name
|
||||
)
|
||||
}
|
||||
```
|
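The template interpolates `scene_id`, `scene_name`, and the URLs directly into TOML basic strings, so a value containing `"` or `\` would yield a file that no longer parses. A minimal sketch of an escaping helper is shown below. `toml_escape` is a hypothetical function, not part of the plan, and it covers only the escapes TOML defines for basic strings.

```rust
/// Hypothetical helper: escape a value for use inside a TOML basic
/// string (the `"..."` form).
fn toml_escape(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if c.is_control() => out.push_str(&format!("\\u{:04X}", c as u32)),
            // Everything else, including CJK text, passes through unchanged.
            c => out.push(c),
        }
    }
    out
}
```

If adopted, each interpolated argument would be wrapped, for example `toml_escape(&request.scene_name)`, before being handed to `format!`.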
||||
|
||||
- [ ] **Step 6: Make scene_toml route to the matching template**
|
||||
|
||||
Edit the `scene_toml` function in `src/generated_scene/generator.rs`:
|
||||
|
||||
```rust
|
||||
fn scene_toml(
|
||||
request: &GenerateSceneRequest,
|
||||
analysis: &SceneSourceAnalysis,
|
||||
tool_name: &str,
|
||||
) -> String {
|
||||
match analysis.scene_kind {
|
||||
SceneKind::ReportCollection => scene_toml_report_collection(request, analysis, tool_name),
|
||||
SceneKind::Monitoring => scene_toml_monitoring(request, analysis, tool_name),
|
||||
}
|
||||
}
|
||||
|
||||
fn scene_toml_report_collection(
|
||||
request: &GenerateSceneRequest,
|
||||
analysis: &SceneSourceAnalysis,
|
||||
tool_name: &str,
|
||||
) -> String {
|
||||
let expected_domain = analysis.bootstrap.expected_domain.as_deref().unwrap_or("");
|
||||
let target_url = analysis.bootstrap.target_url.as_deref().unwrap_or("");
|
||||
|
||||
// Existing report_collection template code
|
||||
format!(
|
||||
"[scene]\nid = \"{}\"\nskill = \"{}\"\ntool = \"{}\"\nkind = \"browser_script\"\nversion = \"0.1.0\"\ncategory = \"report_collection\"\n\n[manifest]\nschema_version = \"1\"\n\n[bootstrap]\nexpected_domain = \"{}\"\ntarget_url = \"{}\"\npage_title_keywords = [\"报表\", \"线损\"]\nrequires_target_page = true\n\n[deterministic]\nsuffix = \"。。。\"\ninclude_keywords = [\"{}\", \"报表\", \"统计\"]\nexclude_keywords = [\"知乎\"]\n\n[[params]]\nname = \"org\"\nresolver = \"dictionary_entity\"\nrequired = true\nprompt_missing = \"已命中{},但缺少供电单位。\"\nprompt_ambiguous = \"已命中{},但供电单位存在歧义。\"\n\n[params.resolver_config]\ndictionary_ref = \"references/org-dictionary.json\"\noutput_label_field = \"org_label\"\noutput_code_field = \"org_code\"\n\n[[params]]\nname = \"period\"\nresolver = \"month_week_period\"\nrequired = true\nprompt_missing = \"已命中{},但缺少统计周期。\"\nprompt_ambiguous = \"已命中{},但统计周期存在歧义。\"\n\n[artifact]\ntype = \"report-artifact\"\nsuccess_status = [\"ok\", \"partial\", \"empty\"]\nfailure_status = [\"blocked\", \"error\"]\n\n[postprocess]\nexporter = \"xlsx_report\"\nauto_open = \"excel\"\n",
|
||||
request.scene_id,
|
||||
request.scene_id,
|
||||
tool_name,
|
||||
expected_domain,
|
||||
target_url,
|
||||
request.scene_name,
|
||||
request.scene_name,
|
||||
request.scene_name,
|
||||
request.scene_name,
|
||||
request.scene_name
|
||||
)
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 7: Create the monitoring fixture**
|
||||
|
||||
Create `tests/fixtures/generated_scene/monitoring/index.html`:
|
||||
|
||||
```html
|
||||
<!doctype html>
|
||||
<html>
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<title>设备监测状态</title>
|
||||
<!-- Note: no sgclaw-scene-kind meta tag here, so the hint parameter is exercised -->
|
||||
</head>
|
||||
<body>
|
||||
<main>
|
||||
<h1>设备监测状态</h1>
|
||||
<div id="monitor-status">running</div>
|
||||
</main>
|
||||
</body>
|
||||
</html>
|
||||
```
|
||||
|
||||
- [ ] **Step 8: Run the tests and confirm they pass**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test scene_generator_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 9: Commit the generator changes**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git add src/generated_scene/generator.rs tests/scene_generator_test.rs tests/fixtures/generated_scene/monitoring
|
||||
git commit -m "feat: add monitoring template support to generator"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Add the --scene-kind argument to the CLI
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/bin/sg_scene_generate.rs:1-82`
|
||||
|
||||
- [ ] **Step 1: Add a scene_kind field to the CliArgs struct**
|
||||
|
||||
Edit `src/bin/sg_scene_generate.rs`:
|
||||
|
||||
```rust
|
||||
use sgclaw::generated_scene::analyzer::SceneKind;
|
||||
|
||||
struct CliArgs {
|
||||
source_dir: PathBuf,
|
||||
scene_id: String,
|
||||
scene_name: String,
|
||||
scene_kind: Option<SceneKind>, // new
|
||||
output_root: PathBuf,
|
||||
lessons_path: PathBuf,
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Make parse_args parse the --scene-kind argument**
|
||||
|
||||
Edit `src/bin/sg_scene_generate.rs`:
|
||||
|
||||
```rust
|
||||
fn parse_args(args: impl Iterator<Item = String>) -> Result<CliArgs, String> {
|
||||
let mut source_dir = None;
|
||||
let mut scene_id = None;
|
||||
let mut scene_name = None;
|
||||
let mut scene_kind = None; // new
|
||||
let mut output_root = None;
|
||||
let mut lessons_path = None;
|
||||
let mut pending_flag: Option<String> = None;
|
||||
|
||||
for arg in args {
|
||||
if let Some(flag) = pending_flag.take() {
|
||||
match flag.as_str() {
|
||||
"--source-dir" => source_dir = Some(PathBuf::from(arg)),
|
||||
"--scene-id" => scene_id = Some(arg),
|
||||
"--scene-name" => scene_name = Some(arg),
|
||||
"--scene-kind" => {
|
||||
scene_kind = Some(SceneKind::from_str(&arg).ok_or_else(|| {
|
||||
format!("invalid scene-kind: {}, expected report_collection or monitoring", arg)
|
||||
})?);
|
||||
}
|
||||
"--output-root" => output_root = Some(PathBuf::from(arg)),
|
||||
"--lessons" => lessons_path = Some(PathBuf::from(arg)),
|
||||
_ => return Err(format!("unsupported argument {flag}")),
|
||||
}
|
||||
continue;
|
||||
}
|
||||
|
||||
match arg.as_str() {
|
||||
"--source-dir" | "--scene-id" | "--scene-name" | "--scene-kind" | "--output-root" | "--lessons" => {
|
||||
pending_flag = Some(arg);
|
||||
}
|
||||
"--help" | "-h" => return Err(usage()),
|
||||
_ => return Err(format!("unsupported argument {arg}\n{}", usage())),
|
||||
}
|
||||
}
|
||||
|
||||
if let Some(flag) = pending_flag {
|
||||
return Err(format!("missing value for {flag}"));
|
||||
}
|
||||
|
||||
Ok(CliArgs {
|
||||
source_dir: source_dir.ok_or_else(usage)?,
|
||||
scene_id: scene_id.ok_or_else(usage)?,
|
||||
scene_name: scene_name.ok_or_else(usage)?,
|
||||
scene_kind, // optional, defaults to None
|
||||
output_root: output_root.ok_or_else(usage)?,
|
||||
lessons_path: lessons_path.ok_or_else(usage)?,
|
||||
})
|
||||
}
|
||||
```
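The `--scene-kind` arm above relies on `SceneKind::from_str` returning an `Option`. The plan does not show that method, so the following is only a minimal sketch of what it might look like; the enum variants and the `parse_scene_kind` helper are assumptions made for illustration, not the project's actual definitions.

```rust
/// Hypothetical stand-in for `sgclaw::generated_scene::analyzer::SceneKind`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SceneKind {
    ReportCollection,
    Monitoring,
}

impl SceneKind {
    /// Returns `None` for unknown values so the caller can attach its own error text.
    pub fn from_str(value: &str) -> Option<Self> {
        match value {
            "report_collection" => Some(Self::ReportCollection),
            "monitoring" => Some(Self::Monitoring),
            _ => None,
        }
    }
}

/// Mirrors the `--scene-kind` arm of `parse_args` (helper name is an assumption).
pub fn parse_scene_kind(arg: &str) -> Result<SceneKind, String> {
    SceneKind::from_str(arg).ok_or_else(|| {
        format!("invalid scene-kind: {arg}, expected report_collection or monitoring")
    })
}
```

Returning `Option` keeps the enum free of CLI-specific wording; the error message stays in the binary where the flag name is known.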
|
||||
|
||||
- [ ] **Step 3: Pass `scene_kind` through in the `run` function**
|
||||
|
||||
Edit `src/bin/sg_scene_generate.rs`:
|
||||
|
||||
```rust
|
||||
fn run() -> Result<(), String> {
|
||||
let args = parse_args(env::args().skip(1))?;
|
||||
let skill_root = generate_scene_package(GenerateSceneRequest {
|
||||
source_dir: args.source_dir,
|
||||
scene_id: args.scene_id,
|
||||
scene_name: args.scene_name,
|
||||
scene_kind: args.scene_kind, // new
|
||||
output_root: args.output_root,
|
||||
lessons_path: args.lessons_path,
|
||||
})
|
||||
.map_err(|err| err.to_string())?;
|
||||
|
||||
println!("generated scene package: {}", skill_root.display());
|
||||
Ok(())
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Update the `usage` function**
|
||||
|
||||
Edit `src/bin/sg_scene_generate.rs`:
|
||||
|
||||
```rust
|
||||
fn usage() -> String {
|
||||
"usage: sg_scene_generate --source-dir <scenario-dir> --scene-id <scene-id> --scene-name <display-name> [--scene-kind <report_collection|monitoring>] --output-root <skill-staging-root> --lessons <lessons-toml>".to_string()
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Build to confirm the binary compiles**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo build --bin sg_scene_generate
|
||||
```
|
||||
|
||||
Expected: the build succeeds
|
||||
|
||||
- [ ] **Step 6: Test the CLI manually**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo run --bin sg_scene_generate -- --source-dir tests/fixtures/generated_scene/monitoring --scene-id test-monitor --scene-name "测试监测" --scene-kind monitoring --output-root ./tmp_test --lessons docs/superpowers/references/tq-lineloss-lessons-learned.toml
|
||||
```
|
||||
|
||||
Expected: generation succeeds and scene.toml contains `category = "monitoring"`
|
||||
|
||||
- [ ] **Step 7: Commit the CLI changes**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git add src/bin/sg_scene_generate.rs
|
||||
git commit -m "feat: add --scene-kind CLI param to sg_scene_generate"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 4: Pass `sceneKind` through the Node.js generator-runner
|
||||
|
||||
**Files:**
|
||||
- Modify: `frontend/scene-generator/generator-runner.js:1-175`
|
||||
|
||||
- [ ] **Step 1: Update the `runGenerator` signature and `args` array**
|
||||
|
||||
Edit `frontend/scene-generator/generator-runner.js`:
|
||||
|
||||
```javascript
|
||||
function runGenerator(params, sseWriter, projectRoot) {
|
||||
const { sourceDir, sceneId, sceneName, sceneKind, outputRoot, lessons } = params;
|
||||
|
||||
const normalize = (p) => p.replace(/\\/g, "/");
|
||||
|
||||
const args = [
|
||||
"run",
|
||||
"--bin",
|
||||
"sg_scene_generate",
|
||||
"--",
|
||||
"--source-dir",
|
||||
normalize(sourceDir),
|
||||
"--scene-id",
|
||||
sceneId,
|
||||
"--scene-name",
|
||||
sceneName,
|
||||
];
|
||||
|
||||
// Only add the flag when sceneKind is set explicitly (otherwise the default, report_collection, applies)
|
||||
if (sceneKind) {
|
||||
args.push("--scene-kind", sceneKind);
|
||||
}
|
||||
|
||||
args.push(
|
||||
"--output-root",
|
||||
normalize(outputRoot),
|
||||
"--lessons",
|
||||
normalize(lessons)
|
||||
);
|
||||
|
||||
// ... the rest of the function is unchanged
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Commit the generator-runner changes**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git add frontend/scene-generator/generator-runner.js
|
||||
git commit -m "feat: add sceneKind param to generator-runner"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 5: Pass `sceneKind` through the Node.js server
|
||||
|
||||
**Files:**
|
||||
- Modify: `frontend/scene-generator/server.js:119-154`
|
||||
|
||||
- [ ] **Step 1: Destructure `sceneKind` in `handleGenerate`**
|
||||
|
||||
Edit `frontend/scene-generator/server.js`:
|
||||
|
||||
```javascript
|
||||
async function handleGenerate(req, res) {
|
||||
let body;
|
||||
try {
|
||||
body = await parseBody(req);
|
||||
} catch {
|
||||
res.writeHead(400, { "Content-Type": "application/json" });
|
||||
res.end(JSON.stringify({ error: "Invalid JSON body" }));
|
||||
return;
|
||||
}
|
||||
|
||||
const { sourceDir, sceneId, sceneName, sceneKind, outputRoot, lessons } = body;
|
||||
if (!sourceDir || !sceneId || !sceneName || !outputRoot || !lessons) {
|
||||
res.writeHead(400, { "Content-Type": "application/json" });
|
||||
res.end(
|
||||
JSON.stringify({
|
||||
error:
|
||||
"All fields required: sourceDir, sceneId, sceneName, outputRoot, lessons",
|
||||
})
|
||||
);
|
||||
return;
|
||||
}
|
||||
|
||||
const sseWriter = initSSE(res);
|
||||
|
||||
try {
|
||||
await runGenerator(
|
||||
{ sourceDir, sceneId, sceneName, sceneKind, outputRoot, lessons }, // sceneKind added
|
||||
sseWriter,
|
||||
config.projectRoot
|
||||
);
|
||||
} catch (err) {
|
||||
writeSSE(sseWriter, "error", { message: `Server error: ${err.message}` });
|
||||
}
|
||||
|
||||
sseWriter.end();
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Commit the server changes**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git add frontend/scene-generator/server.js
|
||||
git commit -m "feat: pass sceneKind from /generate request to generator"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 6: Add a scene-kind dropdown to the Web UI
|
||||
|
||||
**Files:**
|
||||
- Modify: `frontend/scene-generator/sg_scene_generator.html`
|
||||
|
||||
- [ ] **Step 1: Add the scene-kind dropdown to the HTML**
|
||||
|
||||
In the form area of `sg_scene_generator.html`, add the following after the scene-name input:
|
||||
|
||||
```html
|
||||
<div class="form-group">
|
||||
<label for="sceneKind">场景类型</label>
|
||||
<select id="sceneKind">
|
||||
<option value="report_collection" selected>报表收集类</option>
|
||||
<option value="monitoring">监测类</option>
|
||||
</select>
|
||||
<span class="hint">报表类:查询数据导出 Excel;监测类:定时检查状态</span>
|
||||
</div>
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Read `sceneKind` in the `generate()` function**
|
||||
|
||||
Update the `generate()` function in `sg_scene_generator.html`:
|
||||
|
||||
```javascript
|
||||
async function generate() {
|
||||
const sourceDir = document.getElementById('sourceDir').value.trim();
|
||||
const sceneId = document.getElementById('sceneId').value.trim();
|
||||
const sceneName = document.getElementById('sceneName').value.trim();
|
||||
const sceneKind = document.getElementById('sceneKind').value; // new
|
||||
const outputRoot = document.getElementById('outputRoot').value.trim();
|
||||
const lessons = document.getElementById('lessons').value.trim();
|
||||
|
||||
// ... validation logic is unchanged
|
||||
|
||||
const response = await fetch('/generate', {
|
||||
method: 'POST',
|
||||
headers: { 'Content-Type': 'application/json' },
|
||||
body: JSON.stringify({
|
||||
sourceDir,
|
||||
sceneId,
|
||||
sceneName,
|
||||
sceneKind, // new
|
||||
outputRoot,
|
||||
lessons
|
||||
})
|
||||
});
|
||||
|
||||
// ... the rest of the function is unchanged
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Commit the HTML changes**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git add frontend/scene-generator/sg_scene_generator.html
|
||||
git commit -m "feat: add sceneKind dropdown to Web UI"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 7: End-to-end testing and final verification
|
||||
|
||||
**Files:**
|
||||
- Verify only
|
||||
|
||||
- [ ] **Step 1: Run all Rust tests**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cargo test --test scene_generator_test -- --nocapture
|
||||
cargo test --test scene_registry_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 2: Restart the Node.js server**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
cd frontend/scene-generator && node server.js
|
||||
```
|
||||
|
||||
Expected: the server starts successfully
|
||||
|
||||
- [ ] **Step 3: Manually test a report-collection scene in the Web UI**
|
||||
|
||||
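1. Open `http://127.0.0.1:3210/`
2. Enter the scene path `D:\desk\智能体资料\场景\营销2.0零度户报表数据生成`
3. Set the scene kind to "报表收集类" (report collection)
4. Click "分析" (Analyze) and wait for the LLM to extract scene-id/scene-name
5. Click "生成 Skill" (Generate Skill) and wait for generation to finish
6. Inspect the generated files in the output directory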
|
||||
|
||||
Expected: generation succeeds and scene.toml contains `category = "report_collection"`
|
||||
|
||||
- [ ] **Step 4: Commit the final verification**
|
||||
|
||||
Run:
|
||||
```bash
|
||||
git add -A
|
||||
git status
|
||||
```
|
||||
|
||||
Confirm that no uncommitted changes remain.
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
### Rust layer
|
||||
|
||||
```bash
|
||||
cargo test --test scene_generator_test -- --nocapture
|
||||
cargo build --bin sg_scene_generate
|
||||
```
|
||||
|
||||
Expected:
|
||||
- `analyze_scene_source_with_hint` accepts an optional `SceneKind` argument
- `GenerateSceneRequest` includes a `scene_kind` field
- The generator emits a different template for each scene kind
- The CLI supports the `--scene-kind` argument
|
||||
|
||||
### Node.js layer
|
||||
|
||||
```bash
|
||||
node frontend/scene-generator/server.js
|
||||
```
|
||||
|
||||
Expected:
|
||||
- The `/generate` endpoint accepts a `sceneKind` parameter
- `runGenerator` passes the parameter to the CLI correctly
|
||||
|
||||
### Web UI layer
|
||||
|
||||
Manual checks:
- The scene-kind dropdown renders correctly
- Selecting the report kind produces `category = "report_collection"`
- Selecting the monitoring kind produces `category = "monitoring"`
|
||||
|
||||
---
|
||||
|
||||
## Notes For The Engineer
|
||||
|
||||
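- The paired spec file is `docs/superpowers/specs/2026-04-16-multi-scene-kind-generator-design.md`
- The user's `scene_kind_hint` selection takes precedence over the meta tag
- The monitoring template is a simplified version; users need to edit the parameter section by hand
- V1 does not change the runtime validation logic in `registry.rs`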
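The precedence rule in the notes (an explicit hint wins over the page's meta tag, with `report_collection` as the default when neither is present) can be sketched as follows. This is an illustration under assumptions: `resolve_scene_kind` and the simplified `SceneKind` enum are hypothetical names, not the analyzer's real API.

```rust
/// Hypothetical stand-in for the analyzer's scene-kind enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SceneKind {
    ReportCollection,
    Monitoring,
}

/// Picks the scene kind: user hint first, then the meta tag, then the default.
pub fn resolve_scene_kind(hint: Option<SceneKind>, meta_tag: Option<SceneKind>) -> SceneKind {
    hint.or(meta_tag).unwrap_or(SceneKind::ReportCollection)
}
```

`Option::or` only falls back to the meta tag when the hint is `None`, which is the whole precedence rule in one expression.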
|
||||
1121
docs/superpowers/plans/2026-04-16-scene-skill-generator.md
Normal file
File diff suppressed because it is too large
@@ -0,0 +1,839 @@
|
||||
# Enhanced LLM Extraction Schema - Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Enhance the LLM extraction schema to support multi-mode business logic, enabling automatic generation of scripts like tq-lineloss-report that switch between month/week modes.
|
||||
|
||||
**Architecture:** Extend existing `SceneInfoJson` in Rust with new mode-related structs. Enhance LLM prompt in `llm-client.js` to detect multi-mode patterns. Add new template function `browser_script_with_modes()` for generating mode-aware JavaScript.
|
||||
|
||||
**Tech Stack:** Rust (serde_json), JavaScript (Node.js), LLM API
|
||||
|
||||
---
|
||||
|
||||
## File Structure
|
||||
|
||||
| File | Action | Purpose |
|
||||
|------|--------|---------|
|
||||
| `src/generated_scene/generator.rs` | Modify | Add mode-related schema structs and multi-mode template |
|
||||
| `frontend/scene-generator/llm-client.js` | Modify | Enhance DEEP_SYSTEM_PROMPT for mode detection |
|
||||
| `frontend/scene-generator/server.js` | Modify | Handle enhanced schema in deep analysis endpoint |
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Add Rust Schema Structs for Multi-Mode Support
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/generated_scene/generator.rs` (after line 21)
|
||||
|
||||
**Goal:** Add new Rust structs to parse the enhanced JSON schema with modes support.
|
||||
|
||||
- [ ] **Step 1: Add ModeConditionJson struct**
|
||||
|
||||
Add after `ApiEndpointJson` struct (line 21):
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
|
||||
pub struct ModeConditionJson {
|
||||
pub field: String,
|
||||
#[serde(default = "default_equals")]
|
||||
pub operator: String,
|
||||
pub value: serde_json::Value,
|
||||
}
|
||||
|
||||
fn default_equals() -> String {
|
||||
"equals".to_string()
|
||||
}
|
||||
```
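How a condition of this shape might be evaluated against the call arguments is sketched below. The plan only documents the `equals` operator, so treating every other operator as a non-match is an assumption, as are the simplified `ModeCondition` type (string values only) and the `condition_matches` helper.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical mirror of `ModeConditionJson` with a string value.
pub struct ModeCondition {
    pub field: String,
    pub operator: String,
    pub value: String,
}

/// Returns true when the named argument satisfies the condition.
pub fn condition_matches(cond: &ModeCondition, args: &HashMap<String, String>) -> bool {
    match cond.operator.as_str() {
        // The only operator the plan documents.
        "equals" => args.get(&cond.field).map_or(false, |v| v == &cond.value),
        // Assumption: unknown operators never match rather than raising an error.
        _ => false,
    }
}
```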
|
||||
|
||||
- [ ] **Step 2: Add NormalizeRulesJson struct**
|
||||
|
||||
Add after `ModeConditionJson`:
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct NormalizeRulesJson {
    #[serde(rename = "type", default = "default_validate_all")]
    pub rules_type: String,
    // camelCase keys match the LLM payload and what the generated JavaScript reads.
    #[serde(rename = "requiredFields", default)]
    pub required_fields: Vec<String>,
    #[serde(rename = "filterNull", default = "default_true")]
    pub filter_null: bool,
}
|
||||
|
||||
fn default_validate_all() -> String {
|
||||
"validate_all_columns".to_string()
|
||||
}
|
||||
|
||||
fn default_true() -> bool {
|
||||
true
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Add ModeConfigJson struct**
|
||||
|
||||
Add after `NormalizeRulesJson`:
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
|
||||
pub struct ModeConfigJson {
|
||||
pub name: String,
|
||||
#[serde(default)]
|
||||
pub label: Option<String>,
|
||||
pub condition: ModeConditionJson,
|
||||
#[serde(rename = "apiEndpoint")]
|
||||
pub api_endpoint: ApiEndpointEnhancedJson,
|
||||
#[serde(rename = "columnDefs", default)]
|
||||
pub column_defs: Vec<(String, String)>,
|
||||
#[serde(rename = "requestTemplate", default)]
|
||||
pub request_template: Option<serde_json::Value>,
|
||||
#[serde(rename = "normalizeRules", default)]
|
||||
pub normalize_rules: Option<NormalizeRulesJson>,
|
||||
#[serde(rename = "responsePath", default)]
|
||||
pub response_path: Option<String>,
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Add ApiEndpointEnhancedJson struct**
|
||||
|
||||
Add before `ModeConfigJson`:
|
||||
|
||||
```rust
|
||||
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
|
||||
pub struct ApiEndpointEnhancedJson {
|
||||
pub name: String,
|
||||
pub url: String,
|
||||
#[serde(default)]
|
||||
pub method: String,
|
||||
#[serde(rename = "contentType", default)]
|
||||
pub content_type: Option<String>,
|
||||
#[serde(default)]
|
||||
pub description: Option<String>,
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Enhance SceneInfoJson struct**
|
||||
|
||||
Modify `SceneInfoJson` to add mode fields (add after line 54, before the closing brace):
|
||||
|
||||
```rust
|
||||
// Multi-mode support (new fields)
|
||||
#[serde(default)]
|
||||
pub modes: Vec<ModeConfigJson>,
|
||||
#[serde(rename = "defaultMode", default)]
|
||||
pub default_mode: Option<String>,
|
||||
#[serde(rename = "modeSwitchField", default)]
|
||||
pub mode_switch_field: Option<String>,
|
||||
```
|
||||
|
||||
- [ ] **Step 6: Verify the changes**
|
||||
|
||||
Run `cargo check` to verify:
|
||||
|
||||
```bash
|
||||
cargo check
|
||||
```
|
||||
|
||||
Expected: No compilation errors.
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
|
||||
git add src/generated_scene/generator.rs
|
||||
git commit -m "feat(generator): add multi-mode schema structs for enhanced LLM extraction"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Enhance LLM Extraction Prompt
|
||||
|
||||
**Files:**
|
||||
- Modify: `frontend/scene-generator/llm-client.js` (lines 16-46)
|
||||
|
||||
**Goal:** Enhance `DEEP_SYSTEM_PROMPT` to instruct LLM to detect multi-mode business logic.
|
||||
|
||||
- [ ] **Step 1: Replace DEEP_SYSTEM_PROMPT with enhanced version**
|
||||
|
||||
Replace the entire `DEEP_SYSTEM_PROMPT` constant (lines 16-46):
|
||||
|
||||
```javascript
|
||||
const DEEP_SYSTEM_PROMPT = `你是一个场景代码分析专家。分析场景源码,提取关键业务信息。
|
||||
|
||||
## 分析目标
|
||||
|
||||
1. **多模式识别** (关键):
|
||||
- 查找条件分支逻辑 (if/switch) 中基于 period_mode、reportType 等字段的分支
|
||||
- 识别不同分支对应的 API 端点、列定义、请求格式
|
||||
- 如果发现多模式,使用 modes 数组格式输出
|
||||
|
||||
2. **API 端点**: 识别所有 HTTP 请求地址 (URL, method, contentType, 用途)
|
||||
- 从 $.ajax/fetch 调用中提取 contentType
|
||||
- 检测请求格式: application/json 或 application/x-www-form-urlencoded
|
||||
|
||||
3. **请求模板**: 识别请求参数结构
|
||||
- 提取硬编码的分页参数 (rows, page, sidx, sord)
|
||||
- 识别模板变量如 \${args.org_code}
|
||||
|
||||
4. **数据归一化**: 识别数据处理规则
|
||||
- 查找数据渲染/表格填充逻辑
|
||||
- 检测数据验证条件 (哪些字段不能为空)
|
||||
|
||||
5. **响应路径**: 识别数据在响应中的位置
|
||||
- 如 response.content 或 response.data
|
||||
|
||||
## 输出格式
|
||||
|
||||
### 单模式场景 (无 modes 数组):
|
||||
{
|
||||
"sceneId": "string",
|
||||
"sceneName": "string",
|
||||
"sceneKind": "report_collection | monitoring",
|
||||
"expectedDomain": "string",
|
||||
"targetUrl": "string",
|
||||
"apiEndpoints": [{"name": "", "url": "", "method": "POST"}],
|
||||
"staticParams": {"key": "value"},
|
||||
"columnDefs": [["fieldName", "中文列名"]]
|
||||
}
|
||||
|
||||
### 多模式场景 (有 modes 数组):
|
||||
{
|
||||
"sceneId": "tq-lineloss-report",
|
||||
"sceneName": "台区线损报表",
|
||||
"sceneKind": "report_collection",
|
||||
"modes": [
|
||||
{
|
||||
"name": "month",
|
||||
"label": "月度报表",
|
||||
"condition": {"field": "period_mode", "operator": "equals", "value": "month"},
|
||||
"apiEndpoint": {
|
||||
"name": "月度线损查询",
|
||||
"url": "http://...",
|
||||
"method": "POST",
|
||||
"contentType": "application/x-www-form-urlencoded"
|
||||
},
|
||||
"columnDefs": [["ORG_NAME", "供电单位"], ...],
|
||||
"requestTemplate": {"orgno": "\${args.org_code}", "rows": 1000, "page": 1},
|
||||
"normalizeRules": {"type": "validate_all_columns", "filterNull": true},
|
||||
"responsePath": "content"
|
||||
},
|
||||
{
|
||||
"name": "week",
|
||||
"label": "周报表",
|
||||
"condition": {"field": "period_mode", "operator": "equals", "value": "week"},
|
||||
"apiEndpoint": {...},
|
||||
"columnDefs": [...],
|
||||
...
|
||||
}
|
||||
],
|
||||
"defaultMode": "month",
|
||||
"modeSwitchField": "period_mode"
|
||||
}
|
||||
|
||||
**重要**: 如果发现代码中有基于 period_mode 的 if/switch 分支,必须使用多模式格式输出!`;
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Verify JavaScript syntax**
|
||||
|
||||
```bash
|
||||
node --check frontend/scene-generator/llm-client.js
|
||||
```
|
||||
|
||||
Expected: No syntax errors.
|
||||
|
||||
- [ ] **Step 3: Commit**
|
||||
|
||||
```bash
|
||||
git add frontend/scene-generator/llm-client.js
|
||||
git commit -m "feat(llm): enhance DEEP_SYSTEM_PROMPT for multi-mode detection"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Implement Multi-Mode Template in Rust
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/generated_scene/generator.rs` (add new function after `browser_script_with_business_logic`)
|
||||
|
||||
**Goal:** Add a new template function that generates mode-aware JavaScript.
|
||||
|
||||
- [ ] **Step 1: Add browser_script_with_modes function**
|
||||
|
||||
Add after `browser_script_with_business_logic` function (after line 476):
|
||||
|
||||
```rust
|
||||
fn browser_script_with_modes(scene_id: &str, scene_info: &SceneInfoJson) -> String {
|
||||
let modes_json = serde_json::to_string_pretty(&scene_info.modes).unwrap_or_else(|_| "[]".to_string());
|
||||
let default_mode = scene_info.default_mode.as_deref().unwrap_or("month");
|
||||
let mode_switch_field = scene_info.mode_switch_field.as_deref().unwrap_or("period_mode");
|
||||
|
||||
format!(r#"const REPORT_NAME = '{scene_id}';
|
||||
const MODES = {modes_json};
|
||||
const DEFAULT_MODE = '{default_mode}';
|
||||
const MODE_SWITCH_FIELD = '{mode_switch_field}';
|
||||
|
||||
function normalizePayload(payload) {{
|
||||
if (typeof payload === 'string') {{
|
||||
try {{ return JSON.parse(payload); }} catch (_) {{ return {{}}; }}
|
||||
}}
|
||||
return payload && typeof payload === 'object' ? payload : {{}};
|
||||
}}
|
||||
|
||||
function validateArgs(args) {{
|
||||
const errors = [];
|
||||
if (!args.org_code) errors.push('Missing org_code');
|
||||
if (!args.period_value) errors.push('Missing period_value');
|
||||
return {{ valid: errors.length === 0, errors }};
|
||||
}}
|
||||
|
||||
function detectMode(args) {{
|
||||
const modeValue = args[MODE_SWITCH_FIELD] || DEFAULT_MODE;
|
||||
return MODES.find(m => m.condition.value === modeValue) || MODES[0];
|
||||
}}
|
||||
|
||||
function buildModeRequest(args, mode) {{
|
||||
const endpoint = mode.apiEndpoint;
|
||||
const template = mode.requestTemplate || {{}};
|
||||
const contentType = endpoint.contentType || 'application/json';
|
||||
const url = endpoint.url;
|
||||
const method = endpoint.method || 'POST';
|
||||
|
||||
let body;
|
||||
if (contentType === 'application/x-www-form-urlencoded') {{
|
||||
body = {{ ...template }};
|
||||
for (const [key, value] of Object.entries(body)) {{
|
||||
if (typeof value === 'string' && value.startsWith('${{') && value.endsWith('}}')) {{
|
||||
// Resolve only the documented `args.<name>` placeholder shape instead of eval-ing LLM-provided text.
const match = /^args\.(\w+)$/.exec(value.slice(2, -1));
body[key] = match ? args[match[1]] : args.org_code;
|
||||
}}
|
||||
}}
|
||||
body.orgno = args.org_code;
|
||||
}} else {{
|
||||
body = JSON.stringify({{ ...template, ...args }});
|
||||
}}
|
||||
|
||||
return {{ url, method, headers: {{ 'Content-Type': contentType }}, body }};
|
||||
}}
|
||||
|
||||
function normalizeModeRows(data, mode) {{
|
||||
const rules = mode.normalizeRules || {{ type: 'validate_all_columns', filterNull: true }};
|
||||
const columns = mode.columnDefs.map(([key]) => key);
|
||||
|
||||
if (!Array.isArray(data)) return [];
|
||||
|
||||
return data.map(row => {{
|
||||
const result = {{}};
|
||||
for (const key of columns) {{
|
||||
const v = row[key];
|
||||
result[key] = (v === null || v === undefined || v === '') ? '' : String(v).trim();
|
||||
}}
|
||||
return result;
|
||||
}}).filter(row => {{
|
||||
if (!rules.filterNull) return true;
|
||||
if (rules.type === 'validate_required' && rules.requiredFields) {{
|
||||
return rules.requiredFields.every(f => row[f] !== '');
|
||||
}}
|
||||
return columns.every(k => row[k] !== '');
|
||||
}});
|
||||
}}
|
||||
|
||||
function determineArtifactStatus({{ blockedReason = '', fatalError = '', reasons = [], rows = [] }}) {{
|
||||
if (blockedReason) return 'blocked';
|
||||
if (fatalError) return 'error';
|
||||
if (reasons.length > 0) return 'partial';
|
||||
if (!rows.length) return 'empty';
|
||||
return 'ok';
|
||||
}}
|
||||
|
||||
function buildArtifact({{ status, blockedReason = '', fatalError = '', reasons = [], rows = [], args, columnDefs, columns }}) {{
|
||||
return {{
|
||||
type: 'report-artifact',
|
||||
report_name: REPORT_NAME,
|
||||
status: status || determineArtifactStatus({{ blockedReason, fatalError, reasons, rows }}),
|
||||
period: {{
|
||||
mode: args.period_mode,
|
||||
mode_code: args.period_mode_code,
|
||||
value: args.period_value,
|
||||
payload: normalizePayload(args.period_payload)
|
||||
}},
|
||||
org: {{ label: args.org_label, code: args.org_code }},
|
||||
column_defs: columnDefs || [],
|
||||
columns: columns || [],
|
||||
rows,
|
||||
counts: {{ detail_rows: rows.length }},
|
||||
partial_reasons: reasons.filter(r => r && !r.startsWith('api_') && !r.startsWith('validation_')),
|
||||
reasons: Array.from(new Set(reasons.filter(Boolean)))
|
||||
}};
|
||||
}}
|
||||
|
||||
const defaultDeps = {{
|
||||
validatePageContext(args) {{
|
||||
const host = (globalThis.location?.hostname || '').trim();
|
||||
const expected = (args.expected_domain || '').trim();
|
||||
if (!host) return {{ ok: false, reason: 'page_context_unavailable' }};
|
||||
if (host !== expected) return {{ ok: false, reason: 'page_context_mismatch' }};
|
||||
return {{ ok: true }};
|
||||
}},
|
||||
|
||||
async queryModeData(args, mode) {{
|
||||
const endpoint = mode.apiEndpoint;
|
||||
const request = buildModeRequest(args, mode);
|
||||
const contentType = endpoint.contentType || 'application/json';
|
||||
|
||||
// Prefer jQuery
|
||||
if (typeof $ !== 'undefined' && typeof $.ajax === 'function') {{
|
||||
return new Promise((resolve, reject) => {{
|
||||
$.ajax({{
|
||||
url: request.url,
|
||||
type: request.method,
|
||||
data: request.body,
|
||||
contentType: contentType,
|
||||
dataType: 'json',
|
||||
success: resolve,
|
||||
error: (xhr, status, err) => reject(new Error(
|
||||
`API failed (${{xhr.status}}): ${{err}} | body=${{(xhr.responseText || '').substring(0, 200)}}`
|
||||
))
|
||||
}});
|
||||
}});
|
||||
}}
|
||||
|
||||
// Fallback: fetch
|
||||
if (typeof fetch === 'function') {{
|
||||
const response = await fetch(request.url, {{
|
||||
method: request.method,
|
||||
headers: request.headers,
|
||||
body: request.method === 'GET' ? undefined : (typeof request.body === 'string' ? request.body : new URLSearchParams(request.body))
|
||||
}});
|
||||
if (!response.ok) {{
|
||||
const text = await response.text().catch(() => '');
|
||||
throw new Error(`HTTP ${{response.status}}: ${{text.substring(0, 200)}}`);
|
||||
}}
|
||||
return response.json();
|
||||
}}
|
||||
|
||||
throw new Error('No HTTP client available (need jQuery or fetch)');
|
||||
}}
|
||||
}};
|
||||
|
||||
async function buildBrowserEntrypointResult(args, deps = defaultDeps) {{
|
||||
// 1. Parameter validation
|
||||
const validation = validateArgs(args);
|
||||
if (!validation.valid) {{
|
||||
const mode = detectMode(args);
|
||||
return buildArtifact({{
|
||||
status: 'blocked',
|
||||
blockedReason: 'validation_failed',
|
||||
reasons: validation.errors,
|
||||
rows: [],
|
||||
args,
|
||||
columnDefs: mode.columnDefs,
|
||||
columns: mode.columnDefs.map(([key]) => key)
|
||||
}});
|
||||
}}
|
||||
|
||||
// 2. Page context validation
|
||||
const pageValidation = typeof deps.validatePageContext === 'function'
|
||||
? deps.validatePageContext(args)
|
||||
: {{ ok: true }};
|
||||
if (!pageValidation?.ok) {{
|
||||
const mode = detectMode(args);
|
||||
return buildArtifact({{
|
||||
status: 'blocked',
|
||||
blockedReason: pageValidation?.reason || 'page_context_mismatch',
|
||||
reasons: [pageValidation?.reason || 'page_context_mismatch'],
|
||||
rows: [],
|
||||
args,
|
||||
columnDefs: mode.columnDefs,
|
||||
columns: mode.columnDefs.map(([key]) => key)
|
||||
}});
|
||||
}}
|
||||
|
||||
// 3. Detect mode
|
||||
const mode = detectMode(args);
|
||||
|
||||
// 4. Data fetching
|
||||
const reasons = [];
|
||||
let rawData = null;
|
||||
try {{
|
||||
rawData = await (deps.queryModeData ? deps.queryModeData(args, mode) : Promise.resolve([]));
|
||||
}} catch (error) {{
|
||||
return buildArtifact({{
|
||||
status: 'error',
|
||||
fatalError: error.message,
|
||||
reasons: ['api_query_failed:' + error.message],
|
||||
rows: [],
|
||||
args,
|
||||
columnDefs: mode.columnDefs,
|
||||
columns: mode.columnDefs.map(([key]) => key)
|
||||
}});
|
||||
}}
|
||||
|
||||
// 5. Extract response data
|
||||
const responsePath = mode.responsePath || '';
|
||||
let data = rawData;
|
||||
if (responsePath && rawData) {{
|
||||
data = rawData[responsePath] || rawData;
|
||||
}}
|
||||
|
||||
// 6. Row normalization
|
||||
const rows = normalizeModeRows(data, mode);
|
||||
if (rows.length === 0 && Array.isArray(data) && data.length > 0) {{
|
||||
reasons.push('row_normalization_partial');
|
||||
}}
|
||||
|
||||
// 7. Build artifact
|
||||
return buildArtifact({{
|
||||
reasons,
|
||||
rows,
|
||||
args,
|
||||
columnDefs: mode.columnDefs,
|
||||
columns: mode.columnDefs.map(([key]) => key)
|
||||
}});
|
||||
}}
|
||||
|
||||
if (typeof module !== 'undefined') {{
|
||||
module.exports = {{ buildBrowserEntrypointResult, normalizePayload, validateArgs, detectMode, buildModeRequest, normalizeModeRows, buildArtifact, determineArtifactStatus, MODES, REPORT_NAME }};
|
||||
}}
|
||||
|
||||
if (typeof args !== 'undefined') {{
|
||||
return buildBrowserEntrypointResult(args);
|
||||
}}
|
||||
"#, scene_id = scene_id, modes_json = modes_json, default_mode = default_mode, mode_switch_field = mode_switch_field)
|
||||
}
|
||||
```
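The template above doubles every brace because `format!` treats `{` and `}` as placeholder delimiters: `{{` and `}}` emit literal braces, so `${{x}}` in the Rust source becomes `${x}` in the generated JavaScript. A minimal sketch of the same escaping, using a hypothetical `render_snippet` function that is not part of the plan:

```rust
/// Renders a tiny JavaScript snippet using the same brace escaping as the template.
pub fn render_snippet(scene_id: &str) -> String {
    format!(
        r#"const REPORT_NAME = '{scene_id}';
function describe(status) {{
  return `${{REPORT_NAME}}: ${{status}}`;
}}
"#,
        scene_id = scene_id
    )
}
```

Only `{scene_id}` is substituted; every doubled brace comes out as a single literal brace, which is why a missed `{{` in a template of this size fails at compile time rather than silently producing broken JavaScript.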
|
||||
|
||||
- [ ] **Step 2: Modify browser_script function to use multi-mode template**
|
||||
|
||||
Replace the `browser_script` function (lines 270-277):
|
||||
|
||||
```rust
|
||||
fn browser_script(scene_id: &str, analysis: &SceneSourceAnalysis, scene_info: Option<&SceneInfoJson>) -> String {
|
||||
match scene_info {
|
||||
Some(info) if !info.modes.is_empty() => {
|
||||
browser_script_with_modes(scene_id, info)
|
||||
}
|
||||
Some(info) if !info.api_endpoints.is_empty() || !info.column_defs.is_empty() => {
|
||||
browser_script_with_business_logic(scene_id, analysis, info)
|
||||
}
|
||||
_ => browser_script_skeleton(scene_id, analysis),
|
||||
}
|
||||
}
|
||||
```
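The order of the match arms matters: a scene that has both `modes` and top-level `apiEndpoints` takes the multi-mode template because that guard is checked first. The selection logic can be exercised in isolation with a sketch like the one below, where `Template`, `SceneInfoSummary`, and `select_template` are hypothetical simplifications of the real types.

```rust
/// Which template the generator would pick (hypothetical enum for illustration).
#[derive(Debug, PartialEq, Eq)]
pub enum Template {
    MultiMode,
    BusinessLogic,
    Skeleton,
}

/// Simplified, hypothetical view of the fields `browser_script` inspects.
pub struct SceneInfoSummary {
    pub mode_count: usize,
    pub endpoint_count: usize,
    pub column_count: usize,
}

/// Mirrors the guard order used by `browser_script`.
pub fn select_template(info: Option<&SceneInfoSummary>) -> Template {
    match info {
        Some(i) if i.mode_count > 0 => Template::MultiMode,
        Some(i) if i.endpoint_count > 0 || i.column_count > 0 => Template::BusinessLogic,
        _ => Template::Skeleton,
    }
}
```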
|
||||
|
||||
- [ ] **Step 3: Verify compilation**
|
||||
|
||||
```bash
|
||||
cargo check
|
||||
```
|
||||
|
||||
Expected: No errors.
|
||||
|
||||
- [ ] **Step 4: Commit**
|
||||
|
||||
```bash
|
||||
git add src/generated_scene/generator.rs
|
||||
git commit -m "feat(generator): add multi-mode template for mode-aware script generation"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 4: Add Unit Tests for Schema Parsing
|
||||
|
||||
**Files:**
|
||||
- Create: `src/generated_scene/generator_test.rs`
|
||||
|
||||
**Goal:** Add tests to verify the enhanced schema parses correctly.
|
||||
|
||||
- [ ] **Step 1: Create test file**
|
||||
|
||||
Create `src/generated_scene/generator_test.rs`:
|
||||
|
||||
```rust
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::super::*; // `super` is this file's own module; the structs live one level up in generator.rs
|
||||
|
||||
#[test]
|
||||
fn test_parse_mode_condition() {
|
||||
let json = r#"{"field": "period_mode", "operator": "equals", "value": "month"}"#;
|
||||
let condition: ModeConditionJson = serde_json::from_str(json).unwrap();
|
||||
assert_eq!(condition.field, "period_mode");
|
||||
assert_eq!(condition.operator, "equals");
|
||||
assert_eq!(condition.value.as_str().unwrap(), "month");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_parse_normalize_rules() {
|
||||
let json = r#"{"type": "validate_required", "requiredFields": ["ORG_NAME"], "filterNull": true}"#;
|
||||
let rules: NormalizeRulesJson = serde_json::from_str(json).unwrap();
|
||||
assert_eq!(rules.rules_type, "validate_required");
|
||||
        assert_eq!(rules.required_fields, vec!["ORG_NAME"]);
        assert!(rules.filter_null);
    }

    #[test]
    fn test_parse_mode_config() {
        let json = r#"{
            "name": "month",
            "label": "月度报表",
            "condition": {"field": "period_mode", "operator": "equals", "value": "month"},
            "apiEndpoint": {"name": "test", "url": "http://example.com", "method": "POST"},
            "columnDefs": [["ORG_NAME", "供电单位"]],
            "responsePath": "content"
        }"#;
        let mode: ModeConfigJson = serde_json::from_str(json).unwrap();
        assert_eq!(mode.name, "month");
        assert_eq!(mode.column_defs.len(), 1);
        assert_eq!(mode.response_path, Some("content".to_string()));
    }

    #[test]
    fn test_parse_scene_info_with_modes() {
        let json = r#"{
            "sceneId": "test-report",
            "sceneName": "测试报表",
            "sceneKind": "report_collection",
            "modes": [
                {"name": "month", "condition": {"field": "period_mode", "value": "month"}, "apiEndpoint": {"name": "m", "url": "http://a"}, "columnDefs": []},
                {"name": "week", "condition": {"field": "period_mode", "value": "week"}, "apiEndpoint": {"name": "w", "url": "http://b"}, "columnDefs": []}
            ],
            "defaultMode": "month",
            "modeSwitchField": "period_mode"
        }"#;
        let info: SceneInfoJson = serde_json::from_str(json).unwrap();
        assert_eq!(info.scene_id, "test-report");
        assert_eq!(info.modes.len(), 2);
        assert_eq!(info.default_mode, Some("month".to_string()));
    }

    #[test]
    fn test_parse_scene_info_backward_compatible() {
        // Old format without modes should still work
        let json = r#"{
            "sceneId": "old-report",
            "sceneName": "旧格式报表",
            "apiEndpoints": [{"name": "test", "url": "http://example.com"}],
            "columnDefs": [["col1", "列1"]]
        }"#;
        let info: SceneInfoJson = serde_json::from_str(json).unwrap();
        assert_eq!(info.scene_id, "old-report");
        assert!(info.modes.is_empty());
        assert_eq!(info.api_endpoints.len(), 1);
    }
}
```

- [ ] **Step 2: Add test module to generator.rs**

Add at the end of `generator.rs`:

```rust
#[cfg(test)]
mod generator_test;
```

- [ ] **Step 3: Run tests**

```bash
cargo test --lib generator
```

Expected: All tests pass.

- [ ] **Step 4: Commit**

```bash
git add src/generated_scene/generator_test.rs src/generated_scene/generator.rs
git commit -m "test(generator): add unit tests for multi-mode schema parsing"
```

---

### Task 5: Integration Test with tq-lineloss-report

**Files:**
- Test: Generate skill from tq-lineloss-report source

**Goal:** Verify the enhanced template can generate a multi-mode script.

- [ ] **Step 1: Build the project**

```bash
cargo build --release
```

Expected: Build succeeds.

- [ ] **Step 2: Create a test multi-mode scene-info JSON**

Create a test JSON file to simulate LLM output:

```json
{
  "sceneId": "tq-lineloss-test",
  "sceneName": "台区线损测试报表",
  "sceneKind": "report_collection",
  "modes": [
    {
      "name": "month",
      "label": "月度报表",
      "condition": {"field": "period_mode", "operator": "equals", "value": "month"},
      "apiEndpoint": {
        "name": "月度线损查询",
        "url": "http://20.76.57.61:18080/gsllys/fourVerEightHor/fourVerEightHorLinelossRateList",
        "method": "POST",
        "contentType": "application/x-www-form-urlencoded"
      },
      "columnDefs": [["ORG_NAME", "供电单位"], ["YGDL", "供电量"], ["YYDL", "售电量"]],
      "requestTemplate": {"orgno": "${args.org_code}", "rows": 1000, "page": 1, "sidx": "ORG_NO", "sord": "asc"},
      "normalizeRules": {"type": "validate_all_columns", "filterNull": true},
      "responsePath": "content"
    },
    {
      "name": "week",
      "label": "周报表",
      "condition": {"field": "period_mode", "operator": "equals", "value": "week"},
      "apiEndpoint": {
        "name": "周线损查询",
        "url": "http://20.76.57.61:18080/gsllys/tqLinelossStatis/getYearMonWeekLinelossAnalysisList",
        "method": "POST",
        "contentType": "application/x-www-form-urlencoded"
      },
      "columnDefs": [["ORG_NAME", "供电单位"], ["LINE_LOSS_RATE", "线损率"]],
      "requestTemplate": {"orgno": "${args.org_code}", "tjzq": "week", "rows": 1000},
      "normalizeRules": {"type": "validate_required", "requiredFields": ["ORG_NAME", "LINE_LOSS_RATE"], "filterNull": true},
      "responsePath": "content"
    }
  ],
  "defaultMode": "month",
  "modeSwitchField": "period_mode"
}
```

Save to `tmp_multi_mode_test.json`.

- [ ] **Step 3: Run generator with the multi-mode JSON**

```bash
cargo run --bin sg_scene_generate -- --source-dir "examples/test-scene" --scene-id "tq-lineloss-test" --scene-name "台区线损测试" --output-root "tmp_multi_test" --scene-info-json "$(cat tmp_multi_mode_test.json)"
```

Expected: Skill package generated without errors.

- [ ] **Step 4: Verify generated script syntax**

```bash
node --check tmp_multi_test/skills/tq-lineloss-test/scripts/collect_tq_lineloss_test.js
```

Expected: No syntax errors.

- [ ] **Step 5: Verify generated script has multi-mode logic**

Check that the generated script contains:
- `detectMode()` function
- `MODES` constant with mode configurations
- `buildModeRequest()` function
- `normalizeModeRows()` function
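
The check in Step 5 could be automated with a few lines of Node. The sketch below is an illustration only: `missingSymbols` and `REQUIRED_SYMBOLS` are hypothetical names, and the match is a plain word-boundary search on the source text rather than a real parse.

```javascript
// Hypothetical helper: report which required identifiers are absent from a
// generated script's source text. This is a text search, not a JS parser.
const REQUIRED_SYMBOLS = ['detectMode', 'MODES', 'buildModeRequest', 'normalizeModeRows'];

function missingSymbols(source, names = REQUIRED_SYMBOLS) {
  return names.filter((name) => !new RegExp(`\\b${name}\\b`).test(source));
}

// Inline stand-in for the generated file contents (illustrative only).
const sample = `
const MODES = [];
function detectMode(args) {}
function buildModeRequest(mode, args) {}
`;

// normalizeModeRows is deliberately missing from the sample above.
console.log(missingSymbols(sample));
```

In practice the source string would come from reading the script path used in Step 4 (for example with `fs.readFileSync`), and an empty result would mean all four names are present.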

- [ ] **Step 6: Commit**

```bash
git add -A
git commit -m "test: verify multi-mode template generates valid JavaScript"
```

---

### Task 6: Update Web UI to Display Mode Information

**Files:**
- Modify: `frontend/scene-generator/sg_scene_generator.html`

**Goal:** Add mode information display in the extraction preview panel.

- [ ] **Step 1: Add mode display section**

Add after the column defs display in the preview panel:

```html
<div id="modes-preview" class="preview-section" style="display: none;">
  <h4>业务模式</h4>
  <div id="modes-list" class="preview-list"></div>
</div>
```

- [ ] **Step 2: Add JavaScript to populate mode info**

In the `showExtractionPreview` function, add:

```javascript
// Show modes if present
const modesSection = document.getElementById('modes-preview');
const modesList = document.getElementById('modes-list');

if (data.modes && data.modes.length > 0) {
  modesSection.style.display = 'block';
  modesList.innerHTML = data.modes.map(mode => {
    const name = escapeHtml(mode.name || 'unknown');
    const label = escapeHtml(mode.label || '');
    const api = escapeHtml(mode.apiEndpoint?.url || '');
    return `<div class="preview-list-item">
      <strong>${name}</strong>${label ? ` (${label})` : ''}: ${api}
    </div>`;
  }).join('');
} else {
  modesSection.style.display = 'none';
}
```

- [ ] **Step 3: Verify changes**

`node --check` parses only JavaScript, so it cannot syntax-check `frontend/scene-generator/sg_scene_generator.html`. Instead, verify that the scene-generator server starts and the page loads.

- [ ] **Step 4: Commit**

```bash
git add frontend/scene-generator/sg_scene_generator.html
git commit -m "feat(ui): add mode information display in extraction preview"
```

---

## Self-Review Checklist

**1. Spec Coverage:**
- [x] Multi-mode schema structs → Task 1
- [x] Enhanced LLM prompt → Task 2
- [x] Multi-mode template → Task 3
- [x] Unit tests → Task 4
- [x] Integration test → Task 5
- [x] UI update → Task 6

**2. Placeholder Scan:**
- No TBD, TODO, or placeholder text found
- All code snippets are complete
- All commands have expected output

**3. Type Consistency:**
- `ModeConditionJson` field names match JSON schema
- `ModeConfigJson` uses `apiEndpoint` (camelCase) matching JSON
- `NormalizeRulesJson` uses `rules_type` with serde rename

---

## Execution Handoff

Plan complete and saved to `docs/superpowers/plans/2026-04-17-enhanced-llm-extraction-schema-plan.md`. Two execution options:

**1. Subagent-Driven (recommended)** - I dispatch a fresh subagent per task, review between tasks, fast iteration

**2. Inline Execution** - Execute tasks in this session using executing-plans, batch execution with checkpoints

Which approach?
@@ -0,0 +1,441 @@
# Generated Scene Rectification Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Rectify the generated-scene pipeline so it stops emitting false-positive runnable skills for complex internal scenes, specifically by fixing `sceneId` degeneration, bootstrap pollution, incomplete workflow reconstruction, and readiness fail-open behavior.

**Architecture:** Keep the current `Scene IR` pipeline, but add four hard control chains around it: naming validation, bootstrap evidence stratification, workflow evidence reconstruction, and readiness gating. Generation must fail closed whenever these chains are incomplete.

**Tech Stack:** Rust, Node.js, HTML/CSS/JavaScript, serde_json, OpenAI-compatible LLM API

---

## Scope Check

This plan implements the design in:

- `docs/superpowers/specs/2026-04-17-generated-scene-rectification-design.md`

This plan builds on the existing generated-scene foundation already described in:

- `docs/superpowers/specs/2026-04-17-scene-skill-compiler-design.md`
- `docs/superpowers/specs/2026-04-17-llm-driven-skill-generation-design.md`
- `docs/superpowers/specs/2026-04-17-enhanced-llm-extraction-schema-design.md`

This plan does not attempt to solve:

- login or authentication recovery
- Chromium host integration or browser embedding changes
- full runtime resolver expansion beyond what this rectification needs
- arbitrary historical scene compatibility outside the reference regression cases

---

## File Map

### Frontend scene generator

| File | Action | Purpose |
|------|--------|---------|
| `frontend/scene-generator/generator-runner.js` | Modify | Implement naming fallback control, URL evidence stratification, workflow evidence cleanup, and pre-generation gate inputs |
| `frontend/scene-generator/llm-client.js` | Modify | Tighten sceneId semantic constraints and reject low-entropy LLM naming output |
| `frontend/scene-generator/server.js` | Modify | Aggregate readiness gates, block unsafe generation, and return rectification diagnostics |
| `frontend/scene-generator/sg_scene_generator.html` | Modify | Show invalid `sceneId`, bootstrap role breakdown, workflow evidence completeness, and generation block reasons |

### Rust generated-scene pipeline

| File | Action | Purpose |
|------|--------|---------|
| `src/generated_scene/analyzer.rs` | Modify | Add endpoint denoising, evidence role typing, and stricter archetype preconditions |
| `src/generated_scene/ir.rs` | Modify | Extend IR to carry candidate roles, gate states, and workflow evidence completeness |
| `src/generated_scene/generator.rs` | Modify | Prevent compiler routing when gates fail and surface fail-closed diagnostics |

### Tests and fixtures

| File | Action | Purpose |
|------|--------|---------|
| `tests/scene_generator_test.rs` | Modify | Cover naming, bootstrap, workflow, and readiness regression cases |
| `tests/scene_generator_html_test.rs` | Modify | Cover HTML/UI risk and blocking output |
| `tests/fixtures/generated_scene/paginated_enrichment/*` | Modify | Preserve marketing-like reference coverage |
| `tests/fixtures/generated_scene/multi_mode/*` | Modify | Preserve tq-like multi-mode coverage |
| Additional fixture files as needed | Create | Add low-entropy naming and localhost-pollution regression inputs |

---

## Scope Guardrails

- Do not broaden this work into a generic scene-generator redesign.
- Do not remove the existing `Scene IR` structure; extend and constrain it.
- Do not let `localhost` or helper/export endpoints participate in bootstrap selection.
- Do not silently coerce invalid `sceneId` values into accepted ids.
- Do not route into `paginated_enrichment` unless its minimum workflow evidence is complete.
- Do not emit a default runnable skill when any rectification gate fails.

---

### Task 1: Rectify Naming Chain

**Files:**
- Modify: `frontend/scene-generator/generator-runner.js`
- Modify: `frontend/scene-generator/llm-client.js`
- Modify: `frontend/scene-generator/server.js`
- Modify: `src/generated_scene/ir.rs`

**Goal:** Stop Chinese-source scenes from degrading into low-information ids such as `2-0`, and turn `sceneId` into a validated business identifier instead of a raw slug fallback.

- [ ] **Step 1: Classify sceneId candidate sources**

Define explicit candidate tiers for `sceneId`:

1. LLM semantic business id
2. deterministic keyword-derived id
3. controlled alias/transliteration fallback
4. invalid fallback candidate

Expected result: the pipeline can explain where the chosen id came from.

- [ ] **Step 2: Add low-entropy sceneId validation**

Implement shared validation rules that reject ids which are:

- numeric-only or numeric-dominant
- too short to be business-readable
- generic placeholders such as `scene` or `report`
- semantically detached from the extracted `sceneName`

Expected result: ids like `2-0`, `1-0`, `scene`, `report` are blocked.
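
A minimal sketch of the first three rules in Step 2, assuming they live in shared JavaScript. The function name `validateSceneId`, the reason strings, the `MIN_LENGTH` threshold, and the "more digits than letters" test are all assumptions made for illustration; the fourth rule (semantic detachment from `sceneName`) is not covered here.

```javascript
// Assumed values for illustration; the real thresholds are not specified in the plan.
const GENERIC_IDS = new Set(['scene', 'report']);
const MIN_LENGTH = 6; // assumed minimum for a business-readable id

function validateSceneId(sceneId) {
  const id = String(sceneId || '').trim().toLowerCase();
  const letters = (id.match(/[a-z]/g) || []).length;
  const digits = (id.match(/[0-9]/g) || []).length;

  if (id.length === 0) return { valid: false, reason: 'empty' };
  // Numeric-only, or more digits than letters, counts as numeric-dominant here.
  if (letters === 0 || digits > letters) return { valid: false, reason: 'numeric_dominant' };
  if (GENERIC_IDS.has(id)) return { valid: false, reason: 'generic_placeholder' };
  if (id.length < MIN_LENGTH) return { valid: false, reason: 'too_short' };
  return { valid: true, reason: null };
}

for (const id of ['2-0', '1-0', 'scene', 'report', 'tq-lineloss-report']) {
  console.log(id, validateSceneId(id));
}
```

Returning a reason alongside the verdict keeps the result usable for the diagnostics that Step 4 asks the server and UI to display.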

- [ ] **Step 3: Fail closed on invalid sceneId**

Update generation flow so invalid `sceneId` produces:

- `invalid_scene_id` gate failure
- readiness downgrade
- analysis/report output only unless explicitly overridden later by a separate approved flow

Expected result: invalid ids never create a formal generated skill directory by default.

- [ ] **Step 4: Surface naming diagnostics in server/UI**

Return and display:

- chosen `sceneId`
- candidate source
- validation result
- invalidation reason if blocked

- [ ] **Step 5: Add regression tests**

Cover at least:

- Chinese source name that previously degraded to `2-0`
- valid semantic id chosen over slug fallback
- invalid low-entropy id blocked from generation

- [ ] **Step 6: Commit**

```bash
git add frontend/scene-generator/generator-runner.js frontend/scene-generator/llm-client.js frontend/scene-generator/server.js src/generated_scene/ir.rs tests/scene_generator_test.rs
git commit -m "fix(generator): block degenerate generated scene ids"
```

---

### Task 2: Rectify Bootstrap Chain

**Files:**
- Modify: `frontend/scene-generator/generator-runner.js`
- Modify: `frontend/scene-generator/server.js`
- Modify: `src/generated_scene/analyzer.rs`
- Modify: `src/generated_scene/ir.rs`

**Goal:** Separate business bootstrap candidates from localhost/export/helper URLs so internal-network entry domains resolve correctly.

- [ ] **Step 1: Add URL evidence role stratification**

Classify URL candidates into:

- `business_entry`
- `business_api`
- `gateway_api`
- `export_service`
- `local_helper`
- `static_asset`
- `template_noise`

Expected result: every URL candidate is typed before bootstrap selection.

- [ ] **Step 2: Add deterministic localhost and noise rejection**

Ensure that:

- `localhost`
- `127.0.0.1`
- `SurfaceServices`
- `ReportServices`
- `.js` / `.css` assets
- template placeholders and format strings

are routed away from bootstrap candidates.

Expected result: helper/export/static/template strings can remain as evidence but can never win bootstrap.
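
One way Steps 1 and 2 could look in JavaScript is sketched below. The role names come from the list above; everything else is an assumption for illustration, including the function name `classifyUrlRole`, the order of the checks, and the heuristics that separate `business_entry`, `business_api`, and `gateway_api`.

```javascript
// Sketch only: matching heuristics are assumptions, not the project's rules.
function classifyUrlRole(candidate) {
  const text = String(candidate || '').trim();

  // Template placeholders and format strings are noise, not real URLs.
  if (/\$\{|\{\d*\}|%s/.test(text)) return 'template_noise';
  // Script and stylesheet assets, with or without a query string.
  if (/\.(js|css)(\?.*)?$/i.test(text)) return 'static_asset';

  let url;
  try {
    url = new URL(text);
  } catch {
    return 'template_noise'; // not parseable as an absolute URL
  }

  if (/SurfaceServices|ReportServices/i.test(url.pathname)) return 'export_service';
  if (url.hostname === 'localhost' || url.hostname === '127.0.0.1') return 'local_helper';

  // Assumed heuristics for the three bootstrap-eligible roles.
  if (/gateway/i.test(url.pathname)) return 'gateway_api';
  if (url.pathname === '/' || /\.(html?|jsp)$/i.test(url.pathname)) return 'business_entry';
  return 'business_api';
}

console.log(classifyUrlRole('http://localhost:8080/export'));
console.log(classifyUrlRole('http://yx.gs.sgcc.com.cn/'));
```

Because every candidate receives exactly one role, later selection logic can filter on the role alone and never needs to re-inspect the raw string.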

- [ ] **Step 3: Redefine bootstrap resolution order**

Bootstrap selection may only consume:

1. `business_entry`
2. `business_api`
3. `gateway_api`

When only helper/noise roles exist, set bootstrap to unresolved and downgrade readiness.
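
The resolution order in Step 3 can be sketched as a priority lookup. This is a hedged illustration: `resolveBootstrap` and the `{ url, role }` candidate shape are hypothetical, and candidates are assumed to have been typed in an earlier step.

```javascript
// Roles allowed to win bootstrap, in priority order (taken from the plan).
const BOOTSTRAP_ROLE_ORDER = ['business_entry', 'business_api', 'gateway_api'];

// `candidates` is assumed to be a list of { url, role } objects.
function resolveBootstrap(candidates) {
  for (const role of BOOTSTRAP_ROLE_ORDER) {
    const match = candidates.find((c) => c.role === role);
    if (match) return { resolved: true, url: match.url, role };
  }
  // Only helper/noise roles exist: leave bootstrap unresolved.
  return { resolved: false, url: null, role: null };
}

const mixed = [
  { url: 'http://localhost:8080/export', role: 'local_helper' },
  { url: 'http://yx.gs.sgcc.com.cn/', role: 'business_entry' },
];
const noiseOnly = [{ url: 'http://localhost:8080/export', role: 'local_helper' }];

console.log(resolveBootstrap(mixed));
console.log(resolveBootstrap(noiseOnly));
```

An unresolved result carries no URL at all, so a caller cannot accidentally fall back to a helper endpoint.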

- [ ] **Step 4: Preserve export/helper evidence separately**

Retain localhost/export endpoints as downstream evidence for workflow/reporting, but isolate them from `expectedDomain` and `targetUrl`.

- [ ] **Step 5: Add regression tests**

Cover at least:

- marketing-like source choosing `yx.gs.sgcc.com.cn` over `localhost`
- mixed business + gateway scene preserving business target page
- scene with only localhost/noise ending in unresolved bootstrap

- [ ] **Step 6: Commit**

```bash
git add frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js src/generated_scene/analyzer.rs src/generated_scene/ir.rs tests/scene_generator_test.rs
git commit -m "fix(generator): stratify bootstrap evidence and exclude localhost"
```

---

### Task 3: Rectify Workflow Chain

**Files:**
- Modify: `frontend/scene-generator/generator-runner.js`
- Modify: `frontend/scene-generator/server.js`
- Modify: `src/generated_scene/analyzer.rs`
- Modify: `src/generated_scene/ir.rs`
- Modify: `src/generated_scene/generator.rs`

**Goal:** Reconstruct workflow from request-chain evidence instead of generic field names, so `paginated_enrichment` is only emitted when its true workflow exists.

- [ ] **Step 1: Split workflow evidence into typed layers**

Represent workflow evidence as:

- request evidence
- pagination evidence
- secondary request evidence
- post-process evidence

Expected result: archetype decisions operate on structured workflow signals instead of a flat endpoint list.

- [ ] **Step 2: Denoise endpoint and method evidence**

Normalize and filter out:

- `${apiUrl}`
- template placeholders
- exception strings
- log text fragments
- localhost export endpoints

Expected result: workflow reconstruction only consumes business-relevant requests.
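
The filter in Step 2 might be expressed as a list of rejection patterns. The sketch below is an assumption-heavy illustration: `denoiseEndpoints` and `NOISE_PATTERNS` are hypothetical names, and the regular expressions are guesses at what "exception strings" and "log text fragments" look like in extracted evidence.

```javascript
// Assumed patterns mirroring the list above; real evidence may need others.
const NOISE_PATTERNS = [
  /\$\{[^}]*\}/,                                      // template placeholders such as ${apiUrl}
  /\w*(Error|Exception):/,                            // exception strings
  /^\s*\[(INFO|WARN|DEBUG|ERROR)\]/i,                 // log text fragments
  /^https?:\/\/(localhost|127\.0\.0\.1)(:\d+)?\//i,   // localhost export endpoints
];

function denoiseEndpoints(endpoints) {
  return endpoints.filter((entry) => !NOISE_PATTERNS.some((p) => p.test(entry)));
}

const rawEndpoints = [
  '${apiUrl}/list',
  'http://localhost:8080/export',
  'TypeError: x is undefined',
  '[WARN] request retried',
  'http://20.76.57.61:18080/gsllys/api/test',
];

console.log(denoiseEndpoints(rawEndpoints));
```

Entries rejected here could still be retained elsewhere as evidence; this function only decides what workflow reconstruction is allowed to consume.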

- [ ] **Step 3: Tighten archetype routing rules**

Require `paginated_enrichment` to have at minimum:

1. one main list request
2. one pagination variable set
3. one secondary request or explicit per-item enrichment function
4. one post-process action among `filter`, `transform`, `export`

If only part of this exists, preserve it as candidate evidence but do not route into the compiler.
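
The four minimum requirements in Step 3 can be sketched as a completeness check. The `evidence` object shape and the function names below are hypothetical; only the four requirements and the three post-process action names come from the plan.

```javascript
const POST_PROCESS_ACTIONS = ['filter', 'transform', 'export'];

// `evidence` shape is assumed for illustration.
function missingPaginatedEnrichmentEvidence(evidence) {
  const missing = [];
  if (!evidence.mainListRequest) missing.push('main_list_request');
  if (!evidence.paginationVariables || evidence.paginationVariables.length === 0) {
    missing.push('pagination');
  }
  // Either a secondary request or an explicit per-item enrichment function qualifies.
  if (!evidence.secondaryRequest && !evidence.enrichmentFunction) {
    missing.push('secondary_request');
  }
  const actions = evidence.postProcessActions || [];
  if (!actions.some((a) => POST_PROCESS_ACTIONS.includes(a))) missing.push('post_process');
  return missing;
}

function canRoutePaginatedEnrichment(evidence) {
  return missingPaginatedEnrichmentEvidence(evidence).length === 0;
}

const partial = { mainListRequest: '/list', paginationVariables: ['page', 'rows'] };
console.log(missingPaginatedEnrichmentEvidence(partial));
console.log(canRoutePaginatedEnrichment(partial));
```

Returning the list of missing pieces, rather than a bare boolean, lets partial evidence be preserved and reported while routing stays blocked.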

- [ ] **Step 4: Narrow multi_mode detection**

Allow `multi_mode_request` only when mode switching materially changes at least one of:

- request body
- endpoint shape
- response path
- column definition

Expected result: generic `type/tab/mode/status` fields alone no longer misclassify marketing-like scenes.
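
A sketch of the divergence test in Step 4, assuming modes are described with the `requestTemplate`, `apiEndpoint`, `responsePath`, and `columnDefs` properties used by the multi-mode scene-info JSON. The function names are hypothetical, and comparing via `JSON.stringify` is a simplification that is sensitive to key order.

```javascript
// Maps each material aspect named in the plan to the property that carries it.
const MATERIAL_FIELDS = {
  request_body: (m) => m.requestTemplate,
  endpoint_shape: (m) => m.apiEndpoint && m.apiEndpoint.url,
  response_path: (m) => m.responsePath,
  column_definition: (m) => m.columnDefs,
};

function materialDifferences(modeA, modeB) {
  return Object.keys(MATERIAL_FIELDS).filter((name) => {
    const read = MATERIAL_FIELDS[name];
    return JSON.stringify(read(modeA)) !== JSON.stringify(read(modeB));
  });
}

function isGenuineMultiMode(modes) {
  if (!Array.isArray(modes) || modes.length < 2) return false;
  // Compare every mode against the first; any material difference qualifies.
  return modes.slice(1).some((m) => materialDifferences(modes[0], m).length > 0);
}

const labelOnly = [
  { name: 'a', apiEndpoint: { url: 'http://a' }, columnDefs: [] },
  { name: 'b', apiEndpoint: { url: 'http://a' }, columnDefs: [] },
];
const divergent = [
  { name: 'month', apiEndpoint: { url: 'http://a' }, columnDefs: [] },
  { name: 'week', apiEndpoint: { url: 'http://b' }, columnDefs: [] },
];

console.log(isGenuineMultiMode(labelOnly), isGenuineMultiMode(divergent));
```

Modes that differ only in `name` or `condition` produce no material differences, which is the case the plan wants to stop misclassifying.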

- [ ] **Step 5: Block compiler routing on incomplete workflow**

Update generator-side routing so incomplete evidence cannot produce a formal `paginated_enrichment` skill package.

- [ ] **Step 6: Add regression tests**

Cover at least:

- marketing-like scene must expose `paginate` + `secondary_request` + post-process evidence
- generic mode fields without real mode divergence must not force `multi_mode_request`
- noisy endpoint lists must still reconstruct the correct business request chain

- [ ] **Step 7: Commit**

```bash
git add frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js src/generated_scene/analyzer.rs src/generated_scene/ir.rs src/generated_scene/generator.rs tests/scene_generator_test.rs
git commit -m "fix(generator): require complete workflow evidence before archetype routing"
```

---

### Task 4: Rectify Readiness Chain

**Files:**
- Modify: `frontend/scene-generator/server.js`
- Modify: `frontend/scene-generator/sg_scene_generator.html`
- Modify: `src/generated_scene/ir.rs`
- Modify: `src/generated_scene/generator.rs`
- Modify: `tests/scene_generator_html_test.rs`

**Goal:** Turn readiness into a hard gate that distinguishes analysis output from runnable skill output.

- [ ] **Step 1: Add explicit rectification gates**

Track at minimum:

- `scene_id_valid`
- `bootstrap_resolved`
- `workflow_complete_for_archetype`
- `runtime_contract_compatible`

Expected result: readiness is derived from named gates rather than a loose score only.

- [ ] **Step 2: Enforce fail-closed readiness rules**

Require:

- all core gates pass for readiness `A` or `B`
- any core gate failure forces readiness `C`
- generation endpoint blocks runnable output on gate failure
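
The rules in Step 2 could be sketched as follows. The gate names come from Step 1; `deriveReadiness`, the result shape, and the `candidateGrade` parameter (standing in for however `A` and `B` are otherwise distinguished) are assumptions for illustration.

```javascript
const CORE_GATES = [
  'scene_id_valid',
  'bootstrap_resolved',
  'workflow_complete_for_archetype',
  'runtime_contract_compatible',
];

// `gates` maps gate name -> boolean. A gate that is missing or not strictly
// true is treated as failed, so the function fails closed.
function deriveReadiness(gates, candidateGrade) {
  const failedGates = CORE_GATES.filter((name) => gates[name] !== true);
  if (failedGates.length > 0) {
    return { readiness: 'C', runnable: false, failedGates };
  }
  return { readiness: candidateGrade === 'A' ? 'A' : 'B', runnable: true, failedGates };
}

const allPass = {
  scene_id_valid: true,
  bootstrap_resolved: true,
  workflow_complete_for_archetype: true,
  runtime_contract_compatible: true,
};

console.log(deriveReadiness(allPass, 'A'));
console.log(deriveReadiness({ ...allPass, bootstrap_resolved: false }, 'A'));
```

A generation endpoint could refuse to emit a skill whenever `runnable` is false and return `failedGates` as the block reasons.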

- [ ] **Step 3: Separate analysis result from generation result**

When gates fail, allow:

- analysis preview
- evidence report
- block reasons

But do not default to:

- full skill emission
- compiler success messaging

- [ ] **Step 4: Expose readiness breakdown in UI**

Display:

- gate names
- pass/fail state
- missing workflow pieces
- bootstrap resolution reason
- invalid sceneId reason

- [ ] **Step 5: Add regression tests**

Cover at least:

- invalid `sceneId` forcing readiness `C`
- unresolved bootstrap forcing readiness `C`
- incomplete paginated workflow forcing readiness `C`
- fully valid reference fixture remaining eligible for generation

- [ ] **Step 6: Commit**

```bash
git add frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html src/generated_scene/ir.rs src/generated_scene/generator.rs tests/scene_generator_html_test.rs tests/scene_generator_test.rs
git commit -m "fix(generator): enforce readiness fail-closed gating"
```

---

### Task 5: Reference Regression Verification

**Files:**
- Modify: `tests/scene_generator_test.rs`
- Modify: `tests/scene_generator_html_test.rs`
- Modify/Create: relevant fixtures under `tests/fixtures/generated_scene/`

**Goal:** Lock the rectification against the two reference scene families and ensure future changes do not reintroduce the same false positives.

- [ ] **Step 1: Regress marketing-like fixture**

Verify the marketing reference path now satisfies:

- non-degenerate `sceneId`
- bootstrap rooted in `yx.gs.sgcc.com.cn` family
- workflow includes `paginate`
- workflow includes `secondary_request`
- readiness does not pass if any of the above are missing

- [ ] **Step 2: Regress tq-like fixture**

Verify the tq reference path still satisfies:

- stable semantic `sceneId`
- valid non-localhost bootstrap
- genuine `multi_mode_request` detection
- no downgrade caused by the stricter marketing rectification rules

- [ ] **Step 3: Run verification commands**

Run:

```bash
cargo check
cargo test --test scene_generator_test -- --nocapture
cargo test --test scene_generator_html_test -- --nocapture
node --check frontend/scene-generator/llm-client.js
node --check frontend/scene-generator/generator-runner.js
node --check frontend/scene-generator/server.js
```

Expected result: rectification passes both Rust and Node validation plus regression coverage.

- [ ] **Step 4: Record outcomes in generated reports if needed**

If the implementation emits readiness or analysis JSON reports, ensure the test fixtures assert the key blocked/passed states directly.

- [ ] **Step 5: Commit**

```bash
git add tests/scene_generator_test.rs tests/scene_generator_html_test.rs tests/fixtures/generated_scene frontend/scene-generator/llm-client.js frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js src/generated_scene/analyzer.rs src/generated_scene/ir.rs src/generated_scene/generator.rs
git commit -m "test(generator): lock generated scene rectification regressions"
```

---

## Acceptance Criteria

This plan is complete when all of the following are true:

1. Chinese-source scene names no longer degrade into low-entropy ids like `2-0`.
2. `localhost`, `127.0.0.1`, export services, and helper URLs no longer compete for bootstrap resolution.
3. `paginated_enrichment` routing only occurs when pagination, secondary request, and post-process evidence are all present.
4. Incomplete evidence paths fail closed with explicit readiness gate failures instead of generating false-positive runnable skills.
5. The marketing-like and tq-like reference scenes both remain covered by automated regression tests.

## Rollback Strategy

If this rectification causes unacceptable regressions:

1. Revert the latest rectification task commit only, not unrelated generated-scene work.
2. Keep the previous `Scene IR` and compiler structure intact.
3. Preserve newly added fixtures and tests where possible, then relax only the specific gate or classifier that caused the regression.

## Notes For Executors

- Implement this plan strictly in order: naming, bootstrap, workflow, readiness, verification.
- Do not skip ahead to UI polish before the gating logic is in place.
- Do not add speculative resolver or login work under this plan.
- Any need for user override or forced draft generation must be handled as a separate follow-up spec, not smuggled into this rectification plan.
@@ -0,0 +1,627 @@
# Progressive Browser Script Template Enhancement Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Enhance the `browser_script_with_business_logic` template in Rust to generate complete, runnable browser scripts with proper HTTP handling, status determination, and error handling.

**Architecture:** Modify `src/generated_scene/generator.rs` to replace the current incomplete JavaScript template with an enhanced version that includes: direct URL usage (fixing the URL construction bug), jQuery + fetch dual HTTP client support, complete status determination (blocked/error/partial/empty/ok), and enhanced entrypoint with page context validation.

**Tech Stack:** Rust, JavaScript (browser script), serde_json

---

## File Structure

| File | Action | Purpose |
|------|--------|---------|
| `src/generated_scene/generator.rs` | Modify | Replace `browser_script_with_business_logic` function with enhanced template |

---

### Task 1: Fix URL Building in buildRequest()

**Files:**
- Modify: `src/generated_scene/generator.rs:308-321` (current `buildRequest` function in template)

**Current bug:** The template builds the request URL with `new URL(endpoint.url, window.location.origin)`. An absolute `endpoint.url` ignores the base argument, but a relative one is resolved against the current page's origin, and every parameter is then appended to the query string with no request body, even for `POST` endpoints.

**Goal:** Replace the buggy URL construction with direct URL usage.

- [ ] **Step 1: Write the failing test**

Create a test file to verify URL construction behavior:

```javascript
// Test that URL is used directly without window.location.origin
const assert = require('assert');

// Mock a complete URL in endpoint
const endpoint = { url: 'http://20.76.57.61:18080/gsllys/api/test', method: 'POST' };

// Expected: buildRequest should return the URL directly
// NOT: new URL(endpoint.url, 'http://different-origin.com')
```

- [ ] **Step 2: Implement the fix**

Replace the `buildRequest` function in `browser_script_with_business_logic` (lines 308-321 in the generated template):

**Current (buggy):**
```javascript
function buildRequest(args, endpoint) {
  const url = new URL(endpoint.url, window.location.origin);
  const params = { ...STATIC_PARAMS, ...args };
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined && value !== null) {
      url.searchParams.set(key, String(value));
    }
  }
  return {
    url: url.toString(),
    method: endpoint.method || 'GET',
    headers: { 'Content-Type': 'application/json' }
  };
}
```

**Fixed:**
```javascript
function buildRequest(args, endpoint) {
  // Use endpoint.url directly - it's already a complete URL
  const url = endpoint.url;
  const method = endpoint.method || 'POST';
  const headers = { 'Content-Type': 'application/json' };
  const body = JSON.stringify({ ...STATIC_PARAMS, ...args });
  return { url, method, headers, body };
}
```

Locate this in `src/generated_scene/generator.rs` within the `browser_script_with_business_logic` function (around line 308 in the format! string). Replace the entire `buildRequest` function definition.

- [ ] **Step 3: Verify the change**

Run `cargo build` to verify the Rust code compiles:

```bash
cargo build
```

Expected: Build succeeds without errors.

- [ ] **Step 4: Commit**

```bash
git add src/generated_scene/generator.rs
git commit -m "fix(generator): use endpoint.url directly in buildRequest to fix URL construction bug"
```

---

### Task 2: Add jQuery + fetch Dual HTTP Client Support

**Files:**
- Modify: `src/generated_scene/generator.rs:355-368` (current `defaultDeps` object in template)

**Goal:** Add jQuery `$.ajax` as primary HTTP client with fetch as fallback for environments without jQuery.

- [ ] **Step 1: Replace defaultDeps with enhanced version**

Replace the current `defaultDeps` object in the template with enhanced jQuery + fetch support:

**Current:**
```javascript
const defaultDeps = {
  validatePageContext: async () => true,
  queryData: async (args) => {
    const endpoint = API_ENDPOINTS[0];
    if (!endpoint) throw new Error('No API endpoint configured');
    const request = buildRequest(args, endpoint);
    const response = await fetch(request.url, {
      method: request.method,
      headers: request.headers
    });
    if (!response.ok) throw new Error(`HTTP ${response.status}: ${response.statusText}`);
    return response.json();
  }
};
```

**Enhanced:**
```javascript
const defaultDeps = {
  validatePageContext(args) {
    const host = (globalThis.location?.hostname || '').trim();
    const expected = (args.expected_domain || '').trim();
    if (!host) return { ok: false, reason: 'page_context_unavailable' };
    if (host !== expected) return { ok: false, reason: 'page_context_mismatch' };
    return { ok: true };
  },

  async queryData(args) {
    const endpoint = API_ENDPOINTS[0];
    if (!endpoint) throw new Error('No API endpoint configured');
    const request = buildRequest(args, endpoint);

    // Prefer jQuery (internal pages typically have it)
    if (typeof $ !== 'undefined' && typeof $.ajax === 'function') {
      return new Promise((resolve, reject) => {
        $.ajax({
          url: request.url,
          type: request.method,
          data: request.body,
          contentType: 'application/json',
          dataType: 'json',
          success: resolve,
          error: (xhr, status, err) => reject(new Error(
            `API failed (${xhr.status}): ${err} | body=${(xhr.responseText || '').substring(0, 200)}`
          ))
        });
      });
    }

    // Fallback: fetch API
    if (typeof fetch === 'function') {
      const response = await fetch(request.url, {
        method: request.method,
        headers: request.headers,
        body: request.method !== 'GET' ? request.body : undefined
      });
      if (!response.ok) {
        const text = await response.text().catch(() => '');
        throw new Error(`HTTP ${response.status}: ${text.substring(0, 200)}`);
      }
      return response.json();
    }

    throw new Error('No HTTP client available (need jQuery or fetch)');
  }
};
```
|
||||
|
||||
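This code goes into the `format!` string in the `browser_script_with_business_logic` function in `src/generated_scene/generator.rs`. Because `format!` treats braces as placeholder delimiters, every literal `{` or `}` in the JavaScript, including those inside `${...}` template literals, must be written as `{{` or `}}`; only real placeholders such as `{scene_id}` keep single braces.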
|
||||
|
||||
- [ ] **Step 2: Verify the change**
|
||||
|
||||
Run `cargo build` to verify:
|
||||
|
||||
```bash
|
||||
cargo build
|
||||
```
|
||||
|
||||
Expected: Build succeeds.
|
||||
|
||||
- [ ] **Step 3: Commit**
|
||||
|
||||
```bash
|
||||
git add src/generated_scene/generator.rs
|
||||
git commit -m "feat(generator): add jQuery + fetch dual HTTP client support in template"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Add determineArtifactStatus Function
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/generated_scene/generator.rs` (add new function to template before `buildArtifact`)
|
||||
|
||||
**Goal:** Add complete status determination logic supporting blocked/error/partial/empty/ok statuses.
|
||||
|
||||
- [ ] **Step 1: Add determineArtifactStatus function to template**
|
||||
|
||||
Insert the following function into the template, before the `buildArtifact` function:
|
||||
|
||||
```javascript
|
||||
function determineArtifactStatus({ blockedReason = '', fatalError = '', reasons = [], rows = [] }) {
|
||||
if (blockedReason) return 'blocked';
|
||||
if (fatalError) return 'error';
|
||||
if (reasons.length > 0) return 'partial';
|
||||
if (!rows.length) return 'empty';
|
||||
return 'ok';
|
||||
}
|
||||
```
|
||||
|
||||
This should be placed in the template string between `normalizeRows` and `buildArtifact` functions.
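
The precedence of the five statuses is easy to misread, so here is a minimal sketch that repeats the function from the plan and runs it over a few hypothetical inputs (the sample objects are illustrative assumptions, not data from the project). Note that a non-empty `reasons` list yields `partial` even when `rows` is empty.

```javascript
// Copy of the plan's function so that the sketch runs on its own.
function determineArtifactStatus({ blockedReason = '', fatalError = '', reasons = [], rows = [] }) {
  if (blockedReason) return 'blocked';
  if (fatalError) return 'error';
  if (reasons.length > 0) return 'partial';
  if (!rows.length) return 'empty';
  return 'ok';
}

// Hypothetical inputs, ordered from highest to lowest precedence.
const samples = [
  { blockedReason: 'page_context_mismatch', fatalError: 'boom', rows: [{}] },
  { fatalError: 'boom', reasons: ['x'], rows: [{}] },
  { reasons: ['row_normalization_partial'], rows: [] },
  { rows: [] },
  { rows: [{ col1: 1 }] },
];

const statuses = samples.map(determineArtifactStatus);
console.log(statuses); // [ 'blocked', 'error', 'partial', 'empty', 'ok' ]
```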
|
||||
|
||||
- [ ] **Step 2: Verify the change**
|
||||
|
||||
```bash
|
||||
cargo build
|
||||
```
|
||||
|
||||
Expected: Build succeeds.
|
||||
|
||||
- [ ] **Step 3: Commit**
|
||||
|
||||
```bash
|
||||
git add src/generated_scene/generator.rs
|
||||
git commit -m "feat(generator): add determineArtifactStatus for complete status determination"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 4: Enhance buildArtifact Function
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/generated_scene/generator.rs:334-353` (current `buildArtifact` function in template)
|
||||
|
||||
**Goal:** Enhance `buildArtifact` to use `determineArtifactStatus` and accept additional parameters.
|
||||
|
||||
- [ ] **Step 1: Replace buildArtifact function**
|
||||
|
||||
Replace the current `buildArtifact` function with enhanced version:
|
||||
|
||||
**Current:**
|
||||
```javascript
|
||||
function buildArtifact(args, rows) {
|
||||
return {
|
||||
type: 'report-artifact',
|
||||
report_name: '{scene_id}',
|
||||
status: rows.length > 0 ? 'ok' : 'empty',
|
||||
period: {
|
||||
mode: args.period_mode,
|
||||
mode_code: args.period_mode_code,
|
||||
value: args.period_value,
|
||||
payload: normalizePayload(args.period_payload)
|
||||
},
|
||||
org: { label: args.org_label, code: args.org_code },
|
||||
column_defs: COLUMN_DEFS,
|
||||
columns: {columns_json},
|
||||
rows,
|
||||
counts: { detail_rows: rows.length },
|
||||
partial_reasons: [],
|
||||
reasons: []
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
**Enhanced:**
|
||||
```javascript
|
||||
function buildArtifact({ status, blockedReason = '', fatalError = '', reasons = [], rows = [], args }) {
|
||||
return {
|
||||
type: 'report-artifact',
|
||||
report_name: REPORT_NAME,
|
||||
status: status || determineArtifactStatus({ blockedReason, fatalError, reasons, rows }),
|
||||
period: {
|
||||
mode: args.period_mode,
|
||||
mode_code: args.period_mode_code,
|
||||
value: args.period_value,
|
||||
payload: normalizePayload(args.period_payload)
|
||||
},
|
||||
org: { label: args.org_label, code: args.org_code },
|
||||
column_defs: COLUMN_DEFS,
|
||||
columns: COLUMNS,
|
||||
rows,
|
||||
counts: { detail_rows: rows.length },
|
||||
partial_reasons: reasons.filter(r => r && !r.startsWith('api_') && !r.startsWith('validation_')),
|
||||
reasons: Array.from(new Set(reasons.filter(Boolean)))
|
||||
};
|
||||
}
|
||||
```
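
To make the difference between `partial_reasons` and `reasons` concrete, the following sketch isolates that part of `buildArtifact` in a hypothetical helper named `splitReasons` (the helper and its `typeof` guard are assumptions added for illustration; the plan's version filters on truthiness only). As in the plan, `reasons` is deduplicated while `partial_reasons` is not.

```javascript
// Hypothetical helper mirroring the two reason fields of the enhanced buildArtifact.
function splitReasons(reasons) {
  // Guard against non-string entries so startsWith cannot throw (an added assumption).
  const clean = reasons.filter((r) => typeof r === 'string' && r !== '');
  return {
    // Reasons that are neither API failures nor validation failures.
    partial_reasons: clean.filter((r) => !r.startsWith('api_') && !r.startsWith('validation_')),
    // All reasons, deduplicated while preserving first-seen order.
    reasons: Array.from(new Set(clean)),
  };
}

const out = splitReasons([
  'api_query_failed:timeout',
  'row_normalization_partial',
  '',
  'row_normalization_partial',
  null,
]);
console.log(out.partial_reasons); // [ 'row_normalization_partial', 'row_normalization_partial' ]
console.log(out.reasons);         // [ 'api_query_failed:timeout', 'row_normalization_partial' ]
```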
|
||||
|
||||
- [ ] **Step 2: Verify the change**
|
||||
|
||||
```bash
|
||||
cargo build
|
||||
```
|
||||
|
||||
Expected: Build succeeds.
|
||||
|
||||
- [ ] **Step 3: Commit**
|
||||
|
||||
```bash
|
||||
git add src/generated_scene/generator.rs
|
||||
git commit -m "feat(generator): enhance buildArtifact with determineArtifactStatus integration"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 5: Enhance buildBrowserEntrypointResult Function
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/generated_scene/generator.rs:370-405` (current `buildBrowserEntrypointResult` function in template)
|
||||
|
||||
**Goal:** Complete rewrite of entrypoint with proper validation flow, page context check, and error handling.
|
||||
|
||||
- [ ] **Step 1: Replace buildBrowserEntrypointResult function**
|
||||
|
||||
Replace the entire `buildBrowserEntrypointResult` function:
|
||||
|
||||
**Current:**
|
||||
```javascript
|
||||
async function buildBrowserEntrypointResult(args, deps = defaultDeps) {
|
||||
const validation = validateArgs(args);
|
||||
if (!validation.valid) {
|
||||
return {
|
||||
type: 'report-artifact',
|
||||
report_name: '{scene_id}',
|
||||
status: 'error',
|
||||
error: 'Validation failed: ' + validation.errors.join(', '),
|
||||
column_defs: COLUMN_DEFS,
|
||||
columns: {columns_json},
|
||||
rows: [],
|
||||
counts: { detail_rows: 0 },
|
||||
partial_reasons: [],
|
||||
reasons: validation.errors
|
||||
};
|
||||
}
|
||||
|
||||
try {
|
||||
const rawData = await (deps.queryData ? deps.queryData(args) : Promise.resolve([]));
|
||||
const rows = normalizeRows(rawData);
|
||||
return buildArtifact(args, rows);
|
||||
} catch (error) {
|
||||
return {
|
||||
type: 'report-artifact',
|
||||
report_name: '{scene_id}',
|
||||
status: 'error',
|
||||
error: error.message,
|
||||
column_defs: COLUMN_DEFS,
|
||||
columns: {columns_json},
|
||||
rows: [],
|
||||
counts: { detail_rows: 0 },
|
||||
partial_reasons: [],
|
||||
reasons: [error.message]
|
||||
};
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Enhanced:**
|
||||
```javascript
|
||||
async function buildBrowserEntrypointResult(args, deps = defaultDeps) {
|
||||
// 1. Parameter validation
|
||||
const validation = validateArgs(args);
|
||||
if (!validation.valid) {
|
||||
return buildArtifact({
|
||||
status: 'blocked',
|
||||
blockedReason: 'validation_failed',
|
||||
reasons: validation.errors,
|
||||
rows: [],
|
||||
args
|
||||
});
|
||||
}
|
||||
|
||||
// 2. Page context validation
|
||||
const pageValidation = typeof deps.validatePageContext === 'function'
|
||||
? await deps.validatePageContext(args)
|
||||
: { ok: true };
|
||||
if (!pageValidation?.ok) {
|
||||
return buildArtifact({
|
||||
status: 'blocked',
|
||||
blockedReason: pageValidation?.reason || 'page_context_mismatch',
|
||||
reasons: [pageValidation?.reason || 'page_context_mismatch'],
|
||||
rows: [],
|
||||
args
|
||||
});
|
||||
}
|
||||
|
||||
// 3. Data fetching
|
||||
const reasons = [];
|
||||
let rawData = null;
|
||||
try {
|
||||
rawData = await (deps.queryData ? deps.queryData(args) : Promise.resolve([]));
|
||||
} catch (error) {
|
||||
return buildArtifact({
|
||||
status: 'error',
|
||||
fatalError: error.message,
|
||||
reasons: ['api_query_failed:' + error.message],
|
||||
rows: [],
|
||||
args
|
||||
});
|
||||
}
|
||||
|
||||
// 4. Row normalization
|
||||
const rows = normalizeRows(rawData);
|
||||
if (rows.length === 0 && Array.isArray(rawData) && rawData.length > 0) {
|
||||
reasons.push('row_normalization_partial');
|
||||
}
|
||||
|
||||
// 5. Build artifact
|
||||
return buildArtifact({ reasons, rows, args });
|
||||
}
|
||||
```
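
The gate order above (argument validation, page context, fetch, normalization) can be exercised without a browser by injecting stub dependencies. The sketch below reduces the flow to the returned status only; `validateArgs`, `normalizeRows`, and the stub `deps` are hypothetical stand-ins, not the template's real helpers. It awaits `validatePageContext` so that both synchronous and asynchronous implementations work.

```javascript
// Hypothetical stand-ins for the template's helpers; only the control flow matters here.
const validateArgs = (args) => ({
  valid: Boolean(args.org_code),
  errors: args.org_code ? [] : ['validation_missing_org_code'],
});
const normalizeRows = (raw) => (Array.isArray(raw) ? raw.filter(Boolean) : []);

async function entrypointStatus(args, deps) {
  // 1. Parameter validation
  if (!validateArgs(args).valid) return 'blocked';

  // 2. Page context validation (await handles sync and async implementations alike)
  const page = typeof deps.validatePageContext === 'function'
    ? await deps.validatePageContext(args)
    : { ok: true };
  if (!page || !page.ok) return 'blocked';

  // 3. Data fetching
  let raw;
  try {
    raw = await deps.queryData(args);
  } catch (error) {
    return 'error';
  }

  // 4. Row normalization
  const rows = normalizeRows(raw);
  if (rows.length === 0 && Array.isArray(raw) && raw.length > 0) return 'partial';
  return rows.length > 0 ? 'ok' : 'empty';
}

// Stub dependencies: an async page check and a canned API response.
const deps = {
  validatePageContext: async () => ({ ok: true }),
  queryData: async () => [{ col1: 'a' }],
};
entrypointStatus({ org_code: '01' }, deps).then((status) => console.log(status)); // logs "ok"
```

Swapping `queryData` for a function that throws, or `validatePageContext` for one that returns `{ ok: false }`, exercises the `error` and `blocked` branches in the same way.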
|
||||
|
||||
- [ ] **Step 2: Verify the change**
|
||||
|
||||
```bash
|
||||
cargo build
|
||||
```
|
||||
|
||||
Expected: Build succeeds.
|
||||
|
||||
- [ ] **Step 3: Commit**
|
||||
|
||||
```bash
|
||||
git add src/generated_scene/generator.rs
|
||||
git commit -m "feat(generator): enhance buildBrowserEntrypointResult with validation flow"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 6: Add Helper Functions and Constants
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/generated_scene/generator.rs` (add helper functions to template)
|
||||
|
||||
**Goal:** Add utility functions used by the enhanced template.
|
||||
|
||||
- [ ] **Step 1: Add helper functions after COLUMN_DEFS constant**
|
||||
|
||||
Add these utility functions to the template after the constant definitions:
|
||||
|
||||
```javascript
|
||||
const REPORT_NAME = '{scene_id}';
|
||||
const COLUMNS = {columns_json};
|
||||
|
||||
function pickFirstNonEmpty(...values) {
|
||||
for (const value of values) {
|
||||
if (typeof value === 'string' && value.trim() !== '') {
|
||||
return value.trim();
|
||||
}
|
||||
}
|
||||
return '';
|
||||
}
|
||||
|
||||
function isNonEmptyString(value) {
|
||||
return typeof value === 'string' && value.trim() !== '';
|
||||
}
|
||||
```
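
A short usage sketch for the two helpers, assuming they are used to resolve an argument that may arrive under several names (the `args` fields `org_code` and `orgCode` are hypothetical examples):

```javascript
// Copies of the plan's helpers so that the sketch runs on its own.
function pickFirstNonEmpty(...values) {
  for (const value of values) {
    if (typeof value === 'string' && value.trim() !== '') {
      return value.trim();
    }
  }
  return '';
}

function isNonEmptyString(value) {
  return typeof value === 'string' && value.trim() !== '';
}

// Hypothetical arguments: the first candidate is blank, the second is padded.
const args = { org_code: '   ', orgCode: ' org-01 ' };
const orgCode = pickFirstNonEmpty(args.org_code, args.orgCode, args.missing);

console.log(orgCode);                   // "org-01"
console.log(isNonEmptyString(orgCode)); // true
console.log(pickFirstNonEmpty(42, null)); // "" (non-strings are skipped)
```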
|
||||
|
||||
- [ ] **Step 2: Verify the change**
|
||||
|
||||
```bash
|
||||
cargo build
|
||||
```
|
||||
|
||||
Expected: Build succeeds.
|
||||
|
||||
- [ ] **Step 3: Commit**
|
||||
|
||||
```bash
|
||||
git add src/generated_scene/generator.rs
|
||||
git commit -m "feat(generator): add helper functions for enhanced template"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 7: Update Module Exports
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/generated_scene/generator.rs:407-409` (current module.exports in template)
|
||||
|
||||
**Goal:** Update module exports to include new functions.
|
||||
|
||||
- [ ] **Step 1: Update module.exports**
|
||||
|
||||
Replace the current export block:
|
||||
|
||||
**Current:**
|
||||
```javascript
|
||||
if (typeof module !== 'undefined') {
|
||||
module.exports = { buildBrowserEntrypointResult, normalizePayload, validateArgs, buildRequest, normalizeRows, buildArtifact, API_ENDPOINTS, STATIC_PARAMS, COLUMN_DEFS };
|
||||
}
|
||||
```
|
||||
|
||||
**Enhanced:**
|
||||
```javascript
|
||||
if (typeof module !== 'undefined') {
|
||||
module.exports = {
|
||||
buildBrowserEntrypointResult,
|
||||
normalizePayload,
|
||||
validateArgs,
|
||||
buildRequest,
|
||||
normalizeRows,
|
||||
determineArtifactStatus,
|
||||
buildArtifact,
|
||||
API_ENDPOINTS,
|
||||
STATIC_PARAMS,
|
||||
COLUMN_DEFS,
|
||||
COLUMNS,
|
||||
REPORT_NAME
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Verify the change**
|
||||
|
||||
```bash
|
||||
cargo build
|
||||
```
|
||||
|
||||
Expected: Build succeeds.
|
||||
|
||||
- [ ] **Step 3: Commit**
|
||||
|
||||
```bash
|
||||
git add src/generated_scene/generator.rs
|
||||
git commit -m "feat(generator): update module exports for enhanced template"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 8: Integration Test - Generate and Verify Script
|
||||
|
||||
**Files:**
|
||||
- Test: Generate a skill package and verify the output
|
||||
|
||||
**Goal:** Verify the enhanced template generates valid JavaScript.
|
||||
|
||||
- [ ] **Step 1: Build the project**
|
||||
|
||||
```bash
|
||||
cargo build --release
|
||||
```
|
||||
|
||||
Expected: Build succeeds.
|
||||
|
||||
- [ ] **Step 2: Generate a test skill package**
|
||||
|
||||
Use the scene generator to create a test skill:
|
||||
|
||||
```bash
|
||||
# Assuming you have a test scene directory
|
||||
cargo run --bin sg_scene_generate -- --source-dir "examples/test-scene" --scene-id "test-enhanced" --scene-name "Test Enhanced" --output-root "tmp_test_enhanced" --scene-info-json '{"sceneId":"test-enhanced","sceneName":"Test Enhanced","apiEndpoints":[{"name":"testApi","url":"http://example.com/api/test","method":"POST"}],"staticParams":{},"columnDefs":[["col1","Column 1"]]}'
|
||||
```
|
||||
|
||||
Expected: Skill package generated without errors.
|
||||
|
||||
- [ ] **Step 3: Verify generated script syntax**
|
||||
|
||||
Check the generated JavaScript for syntax errors:
|
||||
|
||||
```bash
|
||||
node --check tmp_test_enhanced/skills/test-enhanced/scripts/collect_test_enhanced.js
|
||||
```
|
||||
|
||||
Expected: No syntax errors.
|
||||
|
||||
- [ ] **Step 4: Run the generated test**
|
||||
|
||||
```bash
|
||||
node tmp_test_enhanced/skills/test-enhanced/scripts/collect_test_enhanced.test.js
|
||||
```
|
||||
|
||||
Expected: The test passes. If the live API call fails, the run should still produce a structurally valid artifact with status `error`.
|
||||
|
||||
- [ ] **Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add -A
|
||||
git commit -m "test: verify enhanced template generates valid JavaScript"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Self-Review Checklist
|
||||
|
||||
**1. Spec Coverage:**
|
||||
- [x] URL construction bug fix → Task 1
|
||||
- [x] jQuery + fetch dual support → Task 2
|
||||
- [x] determineArtifactStatus function → Task 3
|
||||
- [x] Enhanced buildArtifact → Task 4
|
||||
- [x] Enhanced buildBrowserEntrypointResult → Task 5
|
||||
- [x] Helper functions → Task 6
|
||||
- [x] Module exports → Task 7
|
||||
- [x] Integration testing → Task 8
|
||||
|
||||
**2. Placeholder Scan:**
|
||||
- No TBD, TODO, or placeholder text found
|
||||
- All code snippets are complete
|
||||
- All commands have expected output
|
||||
|
||||
**3. Type Consistency:**
|
||||
- `buildArtifact` parameter signature consistent across all call sites
|
||||
- `args` object properties consistently named
|
||||
- Status values: blocked/error/partial/empty/ok consistently used
|
||||
|
||||
---
|
||||
|
||||
## Execution Handoff
|
||||
|
||||
Plan complete and saved to `docs/superpowers/plans/2026-04-17-progressive-template-enhancement-plan.md`. Two execution options:
|
||||
|
||||
**1. Subagent-Driven (recommended)** - I dispatch a fresh subagent per task, review between tasks, fast iteration
|
||||
|
||||
**2. Inline Execution** - Execute tasks in this session using executing-plans, batch execution with checkpoints
|
||||
|
||||
Which approach?
|
||||
@@ -0,0 +1,482 @@
|
||||
# sgClaw Scene Generator Quality Improvement: Implementation Plan

> Design document: `docs/superpowers/specs/2026-04-17-scene-generator-quality-improvement-design.md`

## Overview

Three phases, eight tasks. Each task lists the files to change, the concrete steps, how to verify the result, and the commit message.

---

## Phase 1: Fix the Foundations

### Task 1: Unify the Generation Path (Deprecate browser_script_with_business_logic)

**Files**: `src/generated_scene/generator.rs`

**Current state** (lines 728-735):
|
||||
```rust
|
||||
fn compile_scene(scene_ir: &SceneIr, analysis: &SceneSourceAnalysis, tool_name: &str) -> CompiledScene {
|
||||
let scene_toml = render_scene_toml(scene_ir, analysis, tool_name);
|
||||
let browser_script = match scene_ir.workflow_archetype() {
|
||||
WorkflowArchetype::SingleRequestTable => compile_single_request_table(scene_ir),
|
||||
WorkflowArchetype::MultiModeRequest => compile_multi_mode_request(scene_ir),
|
||||
WorkflowArchetype::PaginatedEnrichment => compile_paginated_enrichment(scene_ir),
|
||||
WorkflowArchetype::PageStateEval => compile_page_state_eval(scene_ir),
|
||||
};
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
**Steps**:

1. **Change the routing logic in `compile_scene`** (lines 730-735):
   - `SingleRequestTable` no longer calls `compile_simple_request_script` (the function underlying `compile_single_request_table`). Instead, the single-mode scene is wrapped in one mode and routed through `compile_multi_mode_request`.
   - Add a helper `ensure_modes_populated(scene_ir: &SceneIr) -> SceneIr`:
   - If `scene_ir.modes` is empty but `scene_ir.api_endpoints` is not, generate a default mode.
   - Change the `workflow_archetype` of `SingleRequestTable` and `PageStateEval` scenes to `MultiModeRequest`, because everything now goes through the modes path.
   - Update the match arms:
|
||||
```rust
|
||||
let browser_script = match scene_ir.workflow_archetype() {
|
||||
WorkflowArchetype::MultiModeRequest => compile_multi_mode_request(scene_ir),
|
||||
WorkflowArchetype::PaginatedEnrichment => compile_paginated_enrichment(scene_ir),
|
||||
_ => {
|
||||
// SingleRequestTable, PageStateEval — fallback to multi-mode with default mode
|
||||
let adapted = ensure_modes_populated(scene_ir);
|
||||
compile_multi_mode_request(&adapted)
|
||||
}
|
||||
};
|
||||
```
|
||||
|
||||
2. **Implement `ensure_modes_populated`**:
   - Takes `&SceneIr` and returns a `SceneIr` (a clone).
   - If `modes` is already non-empty, return the clone unchanged.
   - If `modes` is empty but `api_endpoints` is non-empty:
     - Build a default mode from the first endpoint.
     - Set `name: "default"`, `label: Some("default")`.
     - `condition`: `{ field: "period_mode", operator: "equals", value: "default" }`
     - `apiEndpoint`: copy of the first endpoint
     - `requestTemplate`: taken from `scene_ir.request_template`
     - `responsePath`: taken from `scene_ir.response_path`
     - `normalizeRules`: taken from `scene_ir.normalize_rules`, or the default
     - `columnDefs`: taken from `scene_ir.column_defs`
   - Also set `default_mode = Some("default")` and `mode_switch_field = Some("period_mode")`.

3. **Mark `browser_script_with_business_logic` as deprecated** (if it still exists in the code):
   - In the current code this function no longer exists (it has been replaced by `compile_simple_request_script`). Annotate the remaining path with the comment "legacy path, superseded by multi-mode unified path".
|
||||
|
||||
**Verification**:
- `cargo check` reports no compile errors.
- The JS script generated for a single-mode scene contains `const MODES =` and the `detectMode` logic.

**Commit message**:
|
||||
```
|
||||
feat(generator): unify all scene types through multi-mode path
|
||||
|
||||
Single-mode and page-state-eval scenes now get auto-wrapped into a
|
||||
default mode and compiled through compile_multi_mode_request. This
|
||||
eliminates the old browser_script_with_business_logic code path and
|
||||
ensures all scenes get responsePath extraction, requestTemplate, and
|
||||
contentType support.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
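### Task 2: Fix the jQuery processData Option

**Files**: `src/generated_scene/generator.rs` (function `compile_multi_mode_request`, lines 1069-1253)

**Current state**: In the template, `buildModeRequest` (lines 1098-1118) serializes the body according to `contentType` (form-urlencoded uses `Object.entries().join('&')`, JSON uses `JSON.stringify`), but the jQuery ajax call (lines 1185-1196) does **not** set `processData`.

The concern is double encoding of form-urlencoded bodies. Per the jQuery documentation, the default `processData: true` converts any non-string `data` into a query string and passes strings through unchanged, so the risk arises when the body reaches `$.ajax` as an object rather than as the pre-built string. Setting the option explicitly removes that ambiguity.

**Steps**:

1. Update the jQuery ajax call template in `compile_multi_mode_request` (around lines 1185-1196):
   - Add a `processData` option to `$.ajax({...})`: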
|
||||
```javascript
|
||||
$.ajax({
|
||||
url: request.url,
|
||||
type: request.method,
|
||||
data: request.body,
|
||||
contentType: request.headers['Content-Type'],
|
||||
processData: request.headers['Content-Type'] !== 'application/x-www-form-urlencoded',
|
||||
dataType: 'json',
|
||||
success: resolve,
|
||||
error: (xhr, status, err) => reject(new Error(`API failed (${xhr.status}): ${err}`))
|
||||
});
|
||||
```
|
||||
   - The content type must be accessible inside the Promise callback, so read it from the `request` object.

2. Apply the same `processData` logic to the jQuery ajax call in `compile_simple_request_script` (around lines 994-1004).
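
The two steps can be sketched outside the template as follows. This is a minimal illustration under stated assumptions: `encodeFormBody` and `buildAjaxOptions` are hypothetical helper names, and the `request` object only imitates the shape the plan describes (`url`, `method`, `headers`, `body`). The sketch builds the options object that would be handed to `$.ajax` without calling jQuery, so it runs in plain Node.

```javascript
// Hypothetical helper: serialize a flat object as application/x-www-form-urlencoded.
function encodeFormBody(params) {
  return Object.entries(params)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`)
    .join('&');
}

// Hypothetical helper: derive $.ajax options from a request shaped like the plan's `request`.
function buildAjaxOptions(request) {
  const contentType = request.headers['Content-Type'];
  return {
    url: request.url,
    type: request.method,
    data: request.body,
    contentType,
    // The body is already a string for form-urlencoded, so jQuery must not touch it.
    processData: contentType !== 'application/x-www-form-urlencoded',
    dataType: 'json',
  };
}

const request = {
  url: 'http://example.com/api/test',
  method: 'POST',
  headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
  body: encodeFormBody({ orgCode: '01', name: 'a b&c' }),
};

const options = buildAjaxOptions(request);
console.log(options.data);        // orgCode=01&name=a%20b%26c
console.log(options.processData); // false
```

Reading the content type from `request.headers` keeps the decision next to the data it describes, which avoids relying on a separate `contentType` variable being in scope inside the Promise callback.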
|
||||
|
||||
**Verification**:
- The `$.ajax` call in the generated JS includes the `processData` option.
- form-urlencoded requests are not double-encoded.

**Commit message**:
|
||||
```
|
||||
fix(generator): add processData to jQuery ajax for form-urlencoded requests
|
||||
|
||||
jQuery's default processData:true serializes any non-string body into a
query string. Set processData:false when contentType is
application/x-www-form-urlencoded so the pre-built body is sent as-is.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Auto-Wrap Single-Mode Scenes into a Mode Configuration

**Files**: `frontend/scene-generator/llm-client.js`

**Current state**: `analyzeSceneDeep` (lines 729-769) calls the LLM and returns the result of `normalizeSceneIr` directly. If the LLM outputs `modes: []` together with `apiEndpoints`, nothing wraps them automatically.

**Steps**:

1. In `analyzeSceneDeep`, after `normalizeSceneIr(...)` and before returning, add the auto-wrap logic:
|
||||
|
||||
```javascript
|
||||
async function analyzeSceneDeep(sourceDir, dirContents, config) {
|
||||
const content = await requestChatCompletionWithRetry(...);
|
||||
const normalized = normalizeSceneIr(await extractJsonFromResponseWithRepair(content, config));
|
||||
|
||||
// ... existing sceneId validation ...
|
||||
|
||||
// AUTO-WRAP: single-mode scenes → modes array
|
||||
if (normalized.modes.length === 0 && normalized.apiEndpoints.length > 0) {
|
||||
normalized.modes.push({
|
||||
name: "default",
|
||||
label: "default",
|
||||
condition: { field: "period_mode", operator: "equals", value: "default" },
|
||||
apiEndpoint: normalized.apiEndpoints[0],
|
||||
columnDefs: normalized.columnDefs || [],
|
||||
requestTemplate: normalized.requestTemplate || {},
|
||||
normalizeRules: normalized.normalizeRules || { type: "validate_required", requiredFields: [], filterNull: true },
|
||||
responsePath: normalized.responsePath || "",
|
||||
});
|
||||
normalized.defaultMode = "default";
|
||||
normalized.modeSwitchField = "period_mode";
|
||||
// Upgrade archetype if it was single_request_table
|
||||
if (normalized.workflowArchetype === "single_request_table") {
|
||||
normalized.workflowArchetype = "multi_mode_request";
|
||||
}
|
||||
}
|
||||
|
||||
return normalized;
|
||||
}
|
||||
```
|
||||
|
||||
2. Also make sure `normalizeSceneIr` assigns correct defaults for `defaultMode` and `modeSwitchField` (already handled at lines 477-478).
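
For unit testing, the auto-wrap step can be pulled out of `analyzeSceneDeep` into a pure function. The sketch below is one possible shape, not the project's code: the name `autoWrapSingleMode` is hypothetical, and unlike the snippet in step 1 it returns a new object instead of mutating its input.

```javascript
// Hypothetical pure variant of the auto-wrap logic from step 1.
function autoWrapSingleMode(sceneIr) {
  const modes = sceneIr.modes || [];
  const endpoints = sceneIr.apiEndpoints || [];
  // Nothing to do when modes already exist or there is no endpoint to wrap.
  if (modes.length > 0 || endpoints.length === 0) return sceneIr;

  return {
    ...sceneIr,
    modes: [{
      name: 'default',
      label: 'default',
      condition: { field: 'period_mode', operator: 'equals', value: 'default' },
      apiEndpoint: endpoints[0],
      columnDefs: sceneIr.columnDefs || [],
      requestTemplate: sceneIr.requestTemplate || {},
      normalizeRules: sceneIr.normalizeRules
        || { type: 'validate_required', requiredFields: [], filterNull: true },
      responsePath: sceneIr.responsePath || '',
    }],
    defaultMode: 'default',
    modeSwitchField: 'period_mode',
    // Upgrade only the single-request archetype; leave others untouched.
    workflowArchetype: sceneIr.workflowArchetype === 'single_request_table'
      ? 'multi_mode_request'
      : sceneIr.workflowArchetype,
  };
}

const wrapped = autoWrapSingleMode({
  workflowArchetype: 'single_request_table',
  modes: [],
  apiEndpoints: [{ name: 'testApi', url: 'http://example.com/api/test', method: 'POST' }],
});
console.log(wrapped.modes.length);     // 1
console.log(wrapped.workflowArchetype); // multi_mode_request
```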
|
||||
|
||||
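**Verification**:
- Run generation on a single-mode scene (for example `用户日电量监测`, the daily user electricity monitoring scene) and confirm that the `modes` array contains one default mode.
- Confirm that `workflowArchetype` is upgraded to `multi_mode_request`.

**Commit message**: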
|
||||
```
|
||||
feat(llm-client): auto-wrap single-mode scenes into modes array
|
||||
|
||||
When the LLM returns an empty modes array but has apiEndpoints,
|
||||
automatically create a default mode with the first endpoint,
|
||||
requestTemplate, responsePath, and normalizeRules. This ensures all
|
||||
scenes compile through the multi-mode path.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Improve Extraction

### Task 4: Add Mandatory-Field Constraints to the LLM Prompt

**Files**: `frontend/scene-generator/llm-client.js` (`DEEP_SYSTEM_PROMPT`, lines 19-82)

**Current state**: The prompt lists the schema but does not stress which fields **must** be filled. The LLM often omits `contentType`, `responsePath`, and `requestTemplate`.

**Steps**:

1. After the schema definition in `DEEP_SYSTEM_PROMPT`, add a **mandatory field constraints** section:
|
||||
|
||||
```
|
||||
MANDATORY FIELDS (never leave empty):
|
||||
- apiEndpoints[].contentType: detect from source code.
|
||||
* For $.ajax({}): look for 'contentType' property. Default 'application/json' if absent.
|
||||
* For $http.sendByAxios(): contentType is 'application/json' (axios default).
|
||||
* For XMLHttpRequest: look for setRequestHeader('Content-Type', ...).
|
||||
* For form submissions: 'application/x-www-form-urlencoded'.
|
||||
- modes[].responsePath: the JSON path from raw API response to the data array.
|
||||
* Common patterns: 'data.list', 'data.rcvblAcctSumAll.rcvblAcctVOS', 'content', 'data.records'
|
||||
* If response is the array itself, use empty string "".
|
||||
- modes[].requestTemplate: the static request body shape from the source code.
|
||||
* Extract ALL keys that appear in the request body object.
|
||||
* Mark dynamic values as "${args.fieldName}" and static values as literals.
|
||||
- apiEndpoints[].url: the full API URL as seen in the source code.
|
||||
|
||||
RULES:
|
||||
- If you cannot determine contentType, default to 'application/json'.
|
||||
- If you cannot determine responsePath, default to '' (empty string).
|
||||
- If you cannot determine requestTemplate, use {} (empty object).
|
||||
- NEVER leave these fields as null or undefined.
|
||||
```
|
||||
|
||||
2. Insert this text into `DEEP_SYSTEM_PROMPT` after the schema definition and before `Instructions`.
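
The `responsePath` convention in the prompt (a dotted path into the raw response, with an empty string meaning "the response is the array itself") can be illustrated with a small resolver. The plan mentions a `safeGet` call in the generated template but does not show its body, so the implementation below is an assumption made for illustration only.

```javascript
// Assumed implementation of a dotted-path lookup; the real template's safeGet may differ.
function safeGet(raw, path) {
  // An empty path means the response itself is the data array.
  if (!path) return raw;
  return path
    .split('.')
    .reduce((node, key) => (node == null ? undefined : node[key]), raw);
}

const raw = { data: { list: [{ id: 1 }, { id: 2 }] } };

console.log(safeGet(raw, 'data.list').length); // 2
console.log(safeGet([1, 2], ''));              // [ 1, 2 ]
console.log(safeGet(raw, 'data.records'));     // undefined
console.log(safeGet(raw, 'content.items'));    // undefined (missing parent does not throw)
```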
|
||||
|
||||
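**Verification**:
- Run generation on the `营销2.0零度户报表数据生成` scene (the Marketing 2.0 zero-usage customer report) and confirm that `contentType` and `responsePath` in the LLM output are no longer empty.
- Confirm that `requestTemplate` contains the fields the business logic requires.

**Commit message**: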
|
||||
```
|
||||
feat(llm-client): add mandatory field constraints to DEEP_SYSTEM_PROMPT
|
||||
|
||||
Explicitly require LLM to fill contentType, responsePath, and
|
||||
requestTemplate with detected values or defaults. Reduces empty-field
|
||||
rate from ~60% to target ~10%.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 5: Extract Business JS Files

**Files**:
- `frontend/scene-generator/server.js`
- `frontend/scene-generator/generator-runner.js`
- `frontend/scene-generator/llm-client.js` (step 2)

**Current state**: `readDirectory` in `generator-runner.js` already reads every file into `dirContents`, but `buildDeepAnalyzePrompt` (`llm-client.js`, lines 125-157) mainly pushes fragments of `index.html`. The contents of business JS files (such as `js/mca.js` and `js/sgApi.js`) are not extracted and pushed separately.

**Steps**:

1. **Identify business JS files in `generator-runner.js`**:
   - In `buildAnalysisContext`, add a `businessJsFragments` array.
   - Pick up the `.js` files under `js/`, excluding third-party libraries such as `vue.js` and `element-ui`.
   - For each business JS file, extract the first 600 characters of key fragments (function definitions, API calls, configuration objects).
   - Store the result in `analysisContext.businessJsFragments`.

2. **Push the business JS fragments in `buildDeepAnalyzePrompt` (`llm-client.js`)**:
   - After the existing `pushFragments` calls, add:
|
||||
```javascript
|
||||
pushFragments(parts, "business JS files", context.businessJsFragments, 4);
|
||||
```
|
||||
   - Keep the total prompt size within `MAX_DEEP_PROMPT_CHARS` (60000).

3. **Make sure `server.js` reads the business JS files**:
   - Check whether the `readDirectory` call in the `/handle-analyze-deep` endpoint already reads the files under `js/`.
   - If it does not, add logic to read `js/*.js`.
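
The file selection in step 1 might look like the sketch below. Several things here are assumptions rather than facts about the codebase: that `dirContents` is a map from relative path to file text, the helper name `pickBusinessJsFragments`, the exact third-party exclusion patterns, and the simplification of "key fragments" to the first 600 characters of each file.

```javascript
// Assumed exclusion list; the plan only names vue.js and element-ui as examples.
const THIRD_PARTY_PATTERNS = [/vue(\.min)?\.js$/i, /element-ui/i, /jquery/i, /\.min\.js$/i];

// Hypothetical helper: choose business JS files and cap each snippet.
function pickBusinessJsFragments(dirContents, { maxFiles = 4, maxChars = 600 } = {}) {
  return Object.entries(dirContents)
    // Only .js files directly under a js/ directory.
    .filter(([path]) => /(^|\/)js\/[^/]+\.js$/.test(path))
    // Drop known third-party libraries.
    .filter(([path]) => !THIRD_PARTY_PATTERNS.some((re) => re.test(path)))
    .slice(0, maxFiles)
    .map(([path, content]) => ({ path, snippet: String(content).slice(0, maxChars) }));
}

const dirContents = {
  'index.html': '<html></html>',
  'js/vue.js': '/* library */',
  'js/mca.js': 'x'.repeat(1000),
  'js/sgApi.js': 'function query() {}',
  'js/element-ui/index.js': '/* library */',
};

const fragments = pickBusinessJsFragments(dirContents);
console.log(fragments.map((f) => f.path)); // [ 'js/mca.js', 'js/sgApi.js' ]
console.log(fragments[0].snippet.length);  // 600
```

With the caps of 4 files and 600 characters, the business JS section adds at most 2400 characters of snippet text to the prompt, which is small relative to the 60000-character budget.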
|
||||
|
||||
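**Verification**:
- Run on the `台区线损大数据` scene (transformer-area line-loss big data) and confirm that the contents of `js/mca.js` or a similar business file are pushed to the LLM.
- Confirm that the total prompt size stays within 60000 characters.

**Commit message**: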
|
||||
```
|
||||
feat(scene-generator): extract business JS files for LLM analysis
|
||||
|
||||
Identify and push js/ directory business logic files (mca.js, sgApi.js,
|
||||
etc.) to the LLM prompt. Exclude third-party libraries. Capped at 4
|
||||
fragments to stay within MAX_DEEP_PROMPT_CHARS budget.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 6: Post-Extraction Validation and One Follow-Up Query

**Files**: `frontend/scene-generator/llm-client.js`

**Current state**: `analyzeSceneDeep` passes the LLM response through `normalizeSceneIr` and returns it without checking whether key fields are missing.

**Steps**:

1. Add a `validateExtractedSceneInfo(sceneIr)` function:
|
||||
```javascript
|
||||
function validateExtractedSceneInfo(sceneIr) {
|
||||
const issues = [];
|
||||
|
||||
// Check: at least one apiEndpoint has contentType
|
||||
const endpointsWithCt = (sceneIr.apiEndpoints || []).filter(
|
||||
ep => ep && ep.contentType
|
||||
);
|
||||
if ((sceneIr.apiEndpoints || []).length > 0 && endpointsWithCt.length === 0) {
|
||||
issues.push("missing_contentType_on_endpoints");
|
||||
}
|
||||
|
||||
// Check: at least one mode has responsePath (if modes exist)
|
||||
if ((sceneIr.modes || []).length > 0) {
|
||||
const modesWithPath = sceneIr.modes.filter(m => m.responsePath !== undefined && m.responsePath !== null);
|
||||
if (modesWithPath.length === 0) {
|
||||
issues.push("missing_responsePath_on_modes");
|
||||
}
|
||||
}
|
||||
|
||||
// Check: workflowArchetype is set
|
||||
if (!sceneIr.workflowArchetype) {
|
||||
issues.push("missing_workflowArchetype");
|
||||
}
|
||||
|
||||
return issues;
|
||||
}
|
||||
```
|
||||
|
||||
2. In `analyzeSceneDeep`, call the validator after `normalizeSceneIr`:
|
||||
```javascript
|
||||
const issues = validateExtractedSceneInfo(normalized);
|
||||
if (issues.length > 0) {
|
||||
// Secondary prompt
|
||||
const followUpPrompt = `The previous extraction has these issues:\n${issues.join('\n')}\nPlease re-analyze the source snippets and fill in the missing fields. Use defaults if truly unavailable.`;
|
||||
|
||||
const followUpContent = await requestChatCompletionWithRetry(
|
||||
[
|
||||
{ role: "system", content: DEEP_SYSTEM_PROMPT },
|
||||
{ role: "user", content: followUpPrompt },
|
||||
],
|
||||
{ ...config, maxTokens: 2400, timeoutMs: DEEP_REQUEST_TIMEOUT_MS, retryAttempts: 1 }
|
||||
);
|
||||
|
||||
const repaired = normalizeSceneIr(await extractJsonFromResponseWithRepair(followUpContent, config));
|
||||
// Merge repaired fields into normalized (only fill empty fields)
|
||||
Object.assign(normalized, mergeSceneIrFields(repaired, normalized));
|
||||
}
|
||||
```
|
||||
|
||||
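3. Add a `mergeSceneIrFields(repaired, original)` helper:
   - Use a value from `repaired` only when the corresponding field in `original` is empty or still at its default.
   - This avoids losing valid information from the first extraction.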
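
One way to write the fill-only-empty merge described in step 3 is sketched here. The plan specifies the behaviour but not the code, so both `isEmptyValue` and this body of `mergeSceneIrFields` are assumptions. The function returns just the fields to fill, which matches the `Object.assign(normalized, mergeSceneIrFields(repaired, normalized))` call in step 2.

```javascript
// Assumed definition of "empty or default": undefined, null, '', [] or {}.
function isEmptyValue(value) {
  if (value === undefined || value === null || value === '') return true;
  if (Array.isArray(value)) return value.length === 0;
  if (typeof value === 'object') return Object.keys(value).length === 0;
  return false;
}

// Returns only the fields that are empty in `original` and filled in `repaired`.
function mergeSceneIrFields(repaired, original) {
  const fills = {};
  for (const [key, value] of Object.entries(repaired)) {
    if (isEmptyValue(original[key]) && !isEmptyValue(value)) {
      fills[key] = value;
    }
  }
  return fills;
}

const original = { sceneId: 'demo', workflowArchetype: '', modes: [], apiEndpoints: [{ url: '/a' }] };
const repaired = {
  sceneId: 'other',
  workflowArchetype: 'multi_mode_request',
  modes: [{ name: 'default' }],
  apiEndpoints: [],
};

Object.assign(original, mergeSceneIrFields(repaired, original));
console.log(original.sceneId);           // demo (valid first-pass value kept)
console.log(original.workflowArchetype); // multi_mode_request (gap filled)
console.log(original.modes.length);      // 1
```

This sketch merges top-level fields only. Gaps inside nested items, such as a missing `contentType` on an existing entry of `apiEndpoints`, would need a per-item merge that is not shown here.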
|
||||
|
||||
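**Verification**:
- Simulate an LLM response that lacks `contentType` and confirm that the follow-up query is triggered.
- Confirm that at most one follow-up is sent, so there is no infinite loop.

**Commit message**: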
|
||||
```
|
||||
feat(llm-client): add post-extraction validation with one-shot retry
|
||||
|
||||
After LLM returns scene IR, validate that critical fields (contentType,
|
||||
responsePath, workflowArchetype) are present. If missing, send one
|
||||
follow-up prompt to fill gaps. Merges repaired fields without overwriting
|
||||
valid data from the first extraction.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: Testing and Verification

### Task 7: Unit Tests

**Files**: `tests/scene_generator_modes_test.rs` (new)

**Steps**:

1. Create the test file `tests/scene_generator_modes_test.rs`.

2. Write five test cases:
|
||||
|
||||
```rust
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*; // adjust imports as needed
|
||||
use crate::generated_scene::generator::*;
|
||||
use crate::generated_scene::ir::*;
|
||||
use serde_json::json;
|
||||
|
||||
#[test]
|
||||
fn test_single_mode_generates_modes_array() {
|
||||
// Create a SingleRequestTable scene with one endpoint
|
||||
let scene_ir = make_test_scene_ir();
|
||||
// ... assertions: generated JS contains "const MODES ="
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_multi_mode_generates_mode_routing() {
|
||||
// Create a MultiModeRequest scene with two modes
|
||||
// ... assertions: generated JS contains "detectMode"
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_snake_camel_consistency() {
|
||||
// Verify field name serialization is consistent
|
||||
// between Rust (snake_case) and JS (camelCase)
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_form_urlencoded_request_body() {
|
||||
// Create a mode with contentType = "application/x-www-form-urlencoded"
|
||||
// ... assertions: body is Object.entries().join('&'), not JSON.stringify
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_response_path_extraction_in_template() {
|
||||
// Create a mode with responsePath = "data.list"
|
||||
// ... assertions: generated JS contains "safeGet(raw, mode.responsePath"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
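3. Each test builds a `SceneIr` instance, calls `compile_multi_mode_request`, and checks that the generated string contains the expected code fragments. Files under `tests/` compile as separate crates, so import the library by its crate name rather than through `crate::` or `super::`, and make sure `compile_multi_mode_request` is reachable from there (otherwise keep the tests in a `#[cfg(test)]` module inside `generator.rs`).

**Verification**:
- `cargo test --test scene_generator_modes_test` passes in full.

**Commit message**: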
|
||||
```
|
||||
test: add unit tests for multi-mode generation path
|
||||
|
||||
Covers: single-mode auto-wrap, multi-mode routing, snake/camel
|
||||
consistency, form-urlencoded body format, and responsePath extraction.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 8: Integration Tests

**Steps**:

1. **Run full generation on two representative scenes**:
   - Simple scene: `用户日电量监测` (pattern C, direct AJAX)
   - Complex scene: `台区线损大数据-月_周累计线损率统计分析` (pattern A, dual mode)

2. **Compare the generated output against tq-lineloss-report**:
   - Compare the structure of `SKILL.toml`.
   - Compare the key functions in `scripts/*.js` (`buildModeRequest`, `detectMode`, `normalizeRows`).
   - Compare the bootstrap and params configuration in `scene.toml`.

3. **Produce an integration test report**:
   - File: `docs/superpowers/reports/2026-04-17-integration-test-report.md`
   - Contents: gap list, quality score, open issues

4. **Record the gap list**:
   - Which fields are still not extracted correctly
   - Which logic still needs manual correction
   - Which scenes are still unsuitable for automation

**Verification**:
- The integration test report has been written.
- At least one scene reaches 80% or more of the quality of tq-lineloss-report.

**Commit message**:
|
||||
```
docs: add integration test report for scene generator quality

Generated skills for user-daily-power and tq-lineloss scenes. Compared
against manually-authored tq-lineloss-report. Quality assessment and
gap analysis documented.
```

---

## Execution Order

```
Task 1 → Task 2 → Task 3 → Task 4 → Task 5 → Task 6 → Task 7 → Task 8
├── Phase 1: fix foundations ──┤ ├── Phase 2: enhance extraction ──┤ ├─ Phase 3 ─┤
```

The three Phase 1 tasks have dependencies: Task 1 must finish first, after which Task 2 and Task 3 can run in parallel.
The three Phase 2 tasks can run in parallel (Tasks 4, 5, and 6 modify different files).
Phase 3 depends on the completion of all of Phase 1 and Phase 2.

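## Risks and Mitigations

| Risk | Impact | Mitigation |
|------|--------|------------|
| LLM follow-up queries increase generation time | Degraded user experience | Limit follow-ups to 1, with a 120 s timeout |
| After path unification, JS generated for SingleRequestTable scenes contains unnecessary mode logic | Larger script size | The default-mode conditional is simple; the performance impact is negligible |
| Too many business JS files push the prompt over its limit | The LLM cannot process the input | Limit to 4 files, 600 characters each |
| Changes to `processData` affect scenes that currently work | Regressions | Set it to false only for form-urlencoded; JSON is unaffected |
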
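The prompt-size mitigation in the table (4 files, 600 characters each) could be enforced with a helper like the one sketched here. The function name, the `(file name, content)` tuple shape, and the choice to keep the first files in input order are assumptions; the plan only states the two limits.

```rust
/// Limits taken from the risk table: at most 4 files, 600 characters each.
const MAX_FILES: usize = 4;
const MAX_CHARS_PER_FILE: usize = 600;

/// Keeps the first `MAX_FILES` sources and clips each body to
/// `MAX_CHARS_PER_FILE` characters. Counting `char`s rather than bytes
/// avoids cutting multi-byte text (for example Chinese comments) in half.
/// Hypothetical helper: which files get priority is not specified in the plan.
fn clip_business_js(sources: &[(String, String)]) -> Vec<(String, String)> {
    sources
        .iter()
        .take(MAX_FILES)
        .map(|(name, body)| {
            let clipped: String = body.chars().take(MAX_CHARS_PER_FILE).collect();
            (name.clone(), clipped)
        })
        .collect()
}
```
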
@@ -0,0 +1,382 @@
# sgClaw Scene Skill 60-to-90 Roadmap Plan

> **Status:** Draft
> **Date:** 2026-04-17
> **Author:** Codex
> **Upstream Spec:** [2026-04-17-scene-skill-60-to-90-roadmap-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-17-scene-skill-60-to-90-roadmap-design.md)

## Plan Intent

This plan breaks the design for "raising automatic scene skill generation from a score of 60 to 90" into executable delivery phases, task boundaries, acceptance conditions, and an implementation order. It strictly follows the upstream `spec`: it does not widen the problem space, and it does not introduce implementation goals that the `spec` has not confirmed.

The core goals covered by this plan are limited to:

1. Build an adjudicable semantic evidence layer
2. Build the minimal compilable business contracts
3. Freeze the canonical answers for the P0 samples
4. Following the P0-to-P1 route, move automatic scene skill conversion from structure recognition up to business-semantics recovery

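## Success Criteria Baseline

This plan adopts by default the success criteria already settled in the upstream `spec`. Phase success is no longer judged solely by how closely the generated output matches the structure of some reference skill. The primary criterion is whether a skill generated for a general scene can run directly on the internal network, retrieve the correct data, and produce the correct report.

Implementation acceptance therefore checks all three of the following closed loops by default:

1. Execution loop: the generated skill runs to completion on the internal network, hosted by the in-house browser
2. Data loop: the data obtained through query, pagination, and extraction is correct and complete
3. Artifact loop: the generated Excel file or other report conforms to the business rules
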
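## Scope Guardrails

The following boundaries stay fixed while this plan is executed:

1. Covering all 102 scenes in one pass is not a goal
2. Automatic recovery of the unified-platform login, or of the target business system's back-end login, is not implemented in this plan
3. The full BrowserAction chain is not abstracted in one go
4. Scenes involving complex document rendering, template upload, or attachment parsing are not included in P0
5. "Do more prompt tuning first" is not a substitute for building the evidence layer, the contract layer, and the canonical answers
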
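## Scene Family Baseline

This plan inherits by default the upstream `spec`'s family grouping of the `102` scenes:

1. `G1` general single-page report group: `68`
2. `G2` multi-mode report group: `11`
3. `G3` paginated detail enrichment group: `10`
4. `G4` tool-detection prerequisite group: `8`
5. `G5` low-priority noise group: `5`

The mainline implementation scope centers on `G1 + G2 + G3`, which together account for `89` scenes, roughly `87%` of all samples. `G4` is held back as a prerequisite for later detection-type extensions, and `G5` is deprioritized by default and stays out of the first mainline round.
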
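The coverage figures above follow directly from the per-group counts. A small sketch of the arithmetic, with the constant and function names chosen here for illustration:

```rust
/// Scene counts per family, as listed in the baseline above.
const FAMILY_COUNTS: [(&str, u32); 5] =
    [("G1", 68), ("G2", 11), ("G3", 10), ("G4", 8), ("G5", 5)];

/// Returns (mainline scene count, total scene count, mainline share in percent),
/// where the mainline is G1 + G2 + G3.
fn mainline_share() -> (u32, u32, f64) {
    let total: u32 = FAMILY_COUNTS.iter().map(|(_, n)| n).sum();
    let mainline: u32 = FAMILY_COUNTS
        .iter()
        .filter(|(name, _)| matches!(*name, "G1" | "G2" | "G3"))
        .map(|(_, n)| n)
        .sum();
    (mainline, total, 100.0 * f64::from(mainline) / f64::from(total))
}
```

With these counts the mainline is 68 + 11 + 10 = 89 of 102 scenes, a little over 87%.
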
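## Workstreams

This plan is split into four main workstreams:

1. `WS1` Building the semantic evidence layer
2. `WS2` Building the minimal compilable business contracts
3. `WS3` Building the P0 canonical answers and calibration baseline
4. `WS4` Landing and validating the P0/P1 sample route

The dependencies among the four workstreams are:

`WS1 + WS2 + WS3 -> WS4`

## Phase Overview

The plan proceeds in five phases:

1. Phase 0: Freeze boundaries and samples
2. Phase 1: Build the semantic evidence layer
3. Phase 2: Build the minimal compilable business contracts
4. Phase 3: Freeze the P0 canonical answers
5. Phase 4: Validate the 60-to-90 capability improvement step by step along the P0/P1 route

Phase 4 proceeds by scene family rather than by business department, in this fixed order:

1. First tackle the `G2` multi-mode report group, to validate the upper bound of semantic recovery
2. Then the `G1` general single-page report group, to validate migration at scale
3. Then the `G3` paginated detail enrichment group, to validate complex workflows and fail-closed behavior
4. `G4` is reserved for later detection-type extensions
5. `G5` is deprioritized by default
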
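## Phase 0: Freeze Boundaries and Samples

### Objective

Before entering the build phases, freeze the problem boundaries, the P0 samples, the P1 families, and the benchmark baseline, so that they do not drift repeatedly during implementation.

### Tasks

1. Lock down the P0 sample list
2. Lock down the P1 family list
3. Lock down the canonical mapping `台区线损大数据-月_周累计线损率统计分析 -> tq-lineloss-report`
4. Lock down the host-browser execution context and the semantic classification of `localhost:*`
5. Lock down the layering constraints across the business semantics layer, the host-browser capability layer, and the login and local bridging layer
6. Lock down the five groups of the `102` scenes and the grouping criteria
7. Lock down the mapping from each group to archetype, phase, and acceptance focus

### Deliverables

1. Frozen sample list
2. Sample-to-archetype table
3. Description of the host/business layering constraints
4. Description of the canonical benchmark mapping
5. List of the five scene groups
6. Description of the group-to-implementation mapping

### Exit Criteria

1. The P0 / P1 samples no longer change
2. `tq-lineloss-report` is explicitly designated the canonical benchmark for P0-1
3. `localhost:*` is explicitly defined as host-bridging evidence rather than a default business domain
4. The five groups of the `102` scenes and the per-group implementation criteria no longer drift
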
## Phase 1: Build the Semantic Evidence Layer

### Objective

Upgrade the generation path from "source code summarized directly into Scene IR" to "source code first yields adjudicable semantic evidence, which is then reduced to Scene IR".

### Tasks

1. Define a unified evidence object schema
2. Define the layering of evidence sources
3. Define the rules for merging evidence and resolving conflicts
4. Define the mapping boundary from evidence to `Scene IR`
5. Establish the core set of evidence types

### Required Evidence Types

The minimal set of evidence types for the first version is fixed as:

1. `bootstrap_candidate`
2. `endpoint_candidate`
3. `mode_candidate`
4. `request_template_candidate`
5. `response_path_candidate`
6. `column_defs_candidate`
7. `normalize_rules_candidate`
8. `workflow_candidate`
9. `localhost_dependency_candidate`
10. `browser_action_candidate`
11. `export_candidate`

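One way to keep this set closed is an enum whose string names match the list exactly. The sketch below is illustrative only: the type name `EvidenceKind` and its methods are assumptions, and a real implementation would likely derive serde traits rather than hand-write the mapping.

```rust
/// The eleven first-version evidence kinds; string names mirror the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EvidenceKind {
    BootstrapCandidate,
    EndpointCandidate,
    ModeCandidate,
    RequestTemplateCandidate,
    ResponsePathCandidate,
    ColumnDefsCandidate,
    NormalizeRulesCandidate,
    WorkflowCandidate,
    LocalhostDependencyCandidate,
    BrowserActionCandidate,
    ExportCandidate,
}

impl EvidenceKind {
    const ALL: [EvidenceKind; 11] = [
        EvidenceKind::BootstrapCandidate,
        EvidenceKind::EndpointCandidate,
        EvidenceKind::ModeCandidate,
        EvidenceKind::RequestTemplateCandidate,
        EvidenceKind::ResponsePathCandidate,
        EvidenceKind::ColumnDefsCandidate,
        EvidenceKind::NormalizeRulesCandidate,
        EvidenceKind::WorkflowCandidate,
        EvidenceKind::LocalhostDependencyCandidate,
        EvidenceKind::BrowserActionCandidate,
        EvidenceKind::ExportCandidate,
    ];

    fn as_str(self) -> &'static str {
        match self {
            EvidenceKind::BootstrapCandidate => "bootstrap_candidate",
            EvidenceKind::EndpointCandidate => "endpoint_candidate",
            EvidenceKind::ModeCandidate => "mode_candidate",
            EvidenceKind::RequestTemplateCandidate => "request_template_candidate",
            EvidenceKind::ResponsePathCandidate => "response_path_candidate",
            EvidenceKind::ColumnDefsCandidate => "column_defs_candidate",
            EvidenceKind::NormalizeRulesCandidate => "normalize_rules_candidate",
            EvidenceKind::WorkflowCandidate => "workflow_candidate",
            EvidenceKind::LocalhostDependencyCandidate => "localhost_dependency_candidate",
            EvidenceKind::BrowserActionCandidate => "browser_action_candidate",
            EvidenceKind::ExportCandidate => "export_candidate",
        }
    }

    /// Unknown names are rejected instead of being mapped to a default kind.
    fn parse(name: &str) -> Option<EvidenceKind> {
        Self::ALL.iter().copied().find(|k| k.as_str() == name)
    }
}
```

Rejecting unknown names in `parse` keeps extraction output from silently introducing a twelfth kind.
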
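### Deliverables

1. Evidence object schema document
2. Evidence type dictionary
3. Evidence merge rules document
4. Evidence-to-`Scene IR` mapping rules document
5. Sample evidence extraction results for the P0 samples

### Acceptance Criteria

1. Every P0 sample can output a structured evidence set
2. `localhost:*`, host JS injection, and hidden-field behavior each go into a dedicated evidence slot
3. Every core field of `Scene IR` can be traced back to its evidence source
4. When evidence conflicts, there is an explicit adjudication path, rather than the conflict being swallowed by the final summary
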
## Phase 2: Build the Minimal Compilable Business Contracts

### Objective

Upgrade archetype determination from "keyword hit" to "does the minimal business contract hold", so that the compiler accepts only input whose evidence is closed.

### Tasks

1. Define the minimal compilable contract for each archetype
2. Define the unified gate list
3. Define the blocking rules applied when a gate fails
4. Define the minimal output contract for each archetype
5. Establish fail-closed-first readiness criteria

### Required Gates

The unified gate names include at least:

1. `bootstrap_resolved`
2. `request_contract_complete`
3. `response_contract_complete`
4. `workflow_contract_complete`
5. `runtime_contract_compatible`

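A fail-closed gate check over these five names might be sketched as follows. The struct name `GateResults`, the boolean representation, and the method names are assumptions made for this example; the plan itself only fixes the gate names and the fail-closed principle.

```rust
/// One boolean per required gate. `Default` yields all-false, so a scene
/// with no recorded evidence is blocked rather than waved through.
#[derive(Debug, Clone, Copy, Default)]
struct GateResults {
    bootstrap_resolved: bool,
    request_contract_complete: bool,
    response_contract_complete: bool,
    workflow_contract_complete: bool,
    runtime_contract_compatible: bool,
}

impl GateResults {
    /// Names of every gate that did not pass, in declaration order.
    fn failed(&self) -> Vec<&'static str> {
        let checks = [
            ("bootstrap_resolved", self.bootstrap_resolved),
            ("request_contract_complete", self.request_contract_complete),
            ("response_contract_complete", self.response_contract_complete),
            ("workflow_contract_complete", self.workflow_contract_complete),
            ("runtime_contract_compatible", self.runtime_contract_compatible),
        ];
        checks
            .iter()
            .filter(|(_, passed)| !passed)
            .map(|(name, _)| *name)
            .collect()
    }

    /// Fail-closed: the scene is compilable only when no gate failed.
    fn is_compilable(&self) -> bool {
        self.failed().is_empty()
    }
}
```

Returning the failed gate names, rather than a bare boolean, gives a blocker report something concrete to show.
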
### Deliverables

1. Archetype minimal contract table
2. Gate decision table
3. Blocker / readiness rules table
4. Sample archetype output contracts

### Acceptance Criteria

1. `multi_mode_request`, `single_request_table`, and `paginated_enrichment` each have an explicit minimal contract
2. A scene that has not passed the gates can no longer pass itself off as a runnable skill
3. Readiness results distinguish "insufficient business evidence" from "unmet host runtime dependency"
4. The compiler's input boundary is clear; it can no longer ingest unclosed IR

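## Phase 3: Freeze P0 Canonical Answers

### Objective

Establish stable canonical answers, key evidence lists, and acceptance baselines for the three main P0 samples, to serve as the single calibration source for later regression and migration.

### Tasks

1. Lock down the canonical `Scene IR` for the three P0 samples
2. Lock down the key evidence lists for the three P0 samples
3. Lock down the acceptance criteria for the three P0 samples
4. Lock down the failure taxonomy for the three P0 samples
5. Establish how canonical answers are compared against actual generated output

### P0 Canonical Targets

1. `台区线损大数据-月_周累计线损率统计分析`
   Reference: `tq-lineloss-report`
2. `用户日电量监测`
   Benchmark: the single-request volume-production sample
3. `95598工单明细表`
   Benchmark: the paginated-enrichment recognition and blocking sample

### Deliverables

1. Canonical `Scene IR` for the three P0 samples
2. Key semantic evidence baselines for the three P0 samples
3. Acceptance tables for the three P0 samples
4. Failure type tables for the three P0 samples

### Acceptance Criteria

1. P0-1 explicitly treats `tq-lineloss-report` as a high-quality reference sample, not as the only hard canonical answer
2. Each of the three P0 samples has a defined way to align generated output against the canonical answer
3. Every later capability upgrade can be regression-tested for deviation from the P0 canonical answers
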
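## Phase 4: Validate the 60-to-90 Improvement Step by Step Along the P0/P1 Route

### Objective

Following the priorities defined in the `spec`, with P0 as the core and P1 as the extension, validate step by step the automatic converter's path from structure recognition to business-semantics recovery.

This phase proceeds by scene family rather than by business department. The first-round goal is not to cover all `102` scenes but to get the mainstream report-type scenes working end to end first, then expand gradually.

### Track A: P0-1 `tq` Main Sample

#### Goal

Get the main-sample capability for `multi_mode_request.month_week_table` working end to end, and bring the result to the same level as `tq-lineloss-report` in key business semantics, internal-network executability, and report correctness.

#### Tasks

1. Recover the complete `month / week` mode matrix
2. Recover the request contract and response contract for each mode
3. Recover column definitions, normalization rules, and export semantics
4. Validate the bootstrap and the target-system context constraints
5. Establish a key-semantics comparison between the automatic output and `tq-lineloss-report`

#### Acceptance Criteria

1. The `mode matrix` is recovered reliably
2. The key request / response contracts are recovered reliably
3. The generated output reaches the high-quality reference level in key business semantics and in internal-network report results
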
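### Track B: P0-2 Single-Request Volume-Production Sample

#### Goal

Demonstrate that the single-request report family can yield a general conversion template with a high pass rate.

#### Tasks

1. Recover the request / response / normalize trio
2. Shrink the pseudo-generic fallback main path
3. Validate migration to other samples in the same family

#### Acceptance Criteria

1. The `single_request_table` sample passes reliably
2. Samples in the same family are reusable
3. Result judgment no longer relies excessively on whole-text summaries

### Track C: P0-3 Paginated Enrichment Sample

#### Goal

Correctly identify the problem space of complex paginated-enrichment scenes, and block reliably when evidence is insufficient.

#### Tasks

1. Separate the main request chain, the enrichment chain, and the export chain
2. Establish the minimal compilable evidence set for `paginated_enrichment`
3. Distinguish business workflow from host bridging behavior
4. Implement the fail-closed decision

#### Acceptance Criteria

1. The paginated-enrichment workflow is decomposed correctly
2. Generation reliably fails closed when evidence is insufficient
3. The host chain or `localhost:*` is no longer mistaken for the main business chain
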
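### Track D: P1 Family Expansion

#### Goal

Once the P0 samples are stable, migrate the capability to the defined P1 families and validate that the route can be replicated at scale.

#### Tasks

1. Migrate the line-loss / power-consumption multi-mode family
2. Migrate the single-request report family
3. Migrate the paginated-enrichment family
4. Record the reuse success rate and failure types for each family

#### Acceptance Criteria

1. Each P1 family completes at least one round of migration validation on a representative scene
2. P1 validation relies mainly on the evidence, contract, and canonical-answer system already established in P0
3. If a scene exceeds the current archetype or contract capability boundary, the result explicitly fails closed

### Track E: Scene Family Expansion Policy

#### Goal

Using the five scene groups as the unit, make explicit which families enter the mainline and which are only reserved or deprioritized.

#### Tasks

1. Establish a volume-migration cadence for the `G1` general single-page report group
2. Establish an in-depth sample cadence for the `G2` multi-mode report group
3. Establish a complex-chain recognition cadence for the `G3` paginated detail enrichment group
4. For the `G4` tool-detection prerequisite group, keep only the architectural entry point and the criteria for later extension
5. Establish default deprioritization criteria for the `G5` low-priority noise group

#### Acceptance Criteria

1. `G1 + G2 + G3` form the first-round mainline scope
2. `G4` does not take resources from the current mainline but retains an entry point for later detection-type extensions
3. `G5` does not pollute the mainline archetypes or acceptance criteria
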
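## Milestone Order

The prerequisite milestones occur in this fixed order:

1. First complete the semantic evidence layer
2. Then complete the minimal compilable business contracts
3. Then freeze the P0 canonical answers

Large-scale family expansion does not begin until these three milestones are complete.

## File-Level Planning Targets

This plan requires that subsequent implementation cover at least the following asset types:

1. Upstream design documents in `docs/superpowers/specs/`
2. Phase plans and progress plans in `docs/superpowers/plans/`
3. Implementations in the scene generation chain related to the evidence layer, the contract layer, and readiness / blockers
4. Fixtures, golden IR, acceptance baselines, or equivalent calibration assets for the P0 samples
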
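## Completion Criteria

This plan is complete when:

1. The `tq` main sample reliably recovers the core business semantics and reaches the high-quality reference level in internal-network execution and report results
2. The single-request main sample yields a replicable template with a high pass rate that covers mainstream general report scenes
3. The paginated-enrichment main sample reliably identifies complex workflows and fails closed when evidence is insufficient
4. An adjudicable evidence layer exists ahead of `Scene IR`
5. Explicit contract gates exist ahead of the archetype
6. The P0 canonical answers have become the unified calibration baseline for later migration and regression
7. The implementation mainline is clearly focused on `G1 + G2 + G3` and is no longer pulled off course by edge-case scenes

## Risks and Control Points

1. If the evidence layer is too thin from the start, the later contracts and canonical answers lose their support
2. If the contract gates are defined too loosely, the system will keep fabricating runnable skills
3. If the P0 canonical answers are not frozen, later optimization loses its alignment baseline
4. Entering P1 expansion too early risks reintroducing host-noise pollution before the layering is complete

## Out of Plan

The following items are explicitly outside this plan's direct delivery scope:

1. Automatic recovery of the unified-platform login flow
2. Implementation details of the target business system's back-end login
3. Full abstraction of the browser host capabilities
4. A one-shot guarantee that every scene runs end to end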
663
docs/superpowers/plans/2026-04-17-scene-skill-compiler-plan.md
Normal file
@@ -0,0 +1,663 @@
# Scene Skill Compiler Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Upgrade `sg_scene_generate` from a scene metadata extractor plus template filler into a reusable scene skill compiler that can understand workflow semantics, classify scene archetypes, and generate runnable skills for both `tq-lineloss-report`-style and `marketing-zero-consumer-report`-style internal scenes.

**Architecture:** Introduce a unified `Scene IR`, switch extraction to a hybrid deterministic-plus-LLM pipeline, route generation by `workflowArchetype`, align runtime resolver contracts, and add readiness gates so users can tell whether a generated skill is safe to trial on the internal network.

**Tech Stack:** Rust, Node.js, HTML/CSS/JavaScript, serde_json, OpenAI-compatible LLM API

---

## Scope Check

This plan implements the design in:

- `docs/superpowers/specs/2026-04-17-scene-skill-compiler-design.md`

This plan builds on the existing generator work already described in:

- `docs/superpowers/specs/2026-04-17-llm-driven-skill-generation-design.md`
- `docs/superpowers/specs/2026-04-17-enhanced-llm-extraction-schema-design.md`
- `docs/superpowers/specs/2026-04-17-progressive-template-enhancement-design.md`
- `docs/superpowers/specs/2026-04-16-multi-scene-kind-generator-design.md`

This plan does not attempt to solve:

- full login and authentication reconstruction
- all historical scene patterns in one pass
- 100% no-touch generation without human review

---

## File Map

### Core generator pipeline

| File | Action | Purpose |
|------|--------|---------|
| `frontend/scene-generator/llm-client.js` | Modify | Replace truncation-only extraction with chunked workflow-aware extraction and `Scene IR` schema output |
| `frontend/scene-generator/generator-runner.js` | Modify | Add deterministic scene scanning, key-fragment selection, and IR support |
| `frontend/scene-generator/server.js` | Modify | Expose analysis, preview, readiness, and generation endpoints for `Scene IR` |
| `frontend/scene-generator/sg_scene_generator.html` | Modify | Show extraction preview, archetype classification, bootstrap, risks, and readiness |

### Rust backend

| File | Action | Purpose |
|------|--------|---------|
| `src/generated_scene/analyzer.rs` | Modify | Add deterministic extraction helpers and archetype support |
| `src/generated_scene/generator.rs` | Modify | Route generation by archetype and compile from `Scene IR` instead of ad hoc fields |
| `src/generated_scene/ir.rs` | Create | Define unified `Scene IR` structs and serde contracts |
| `src/bin/sg_scene_generate.rs` | Modify | Accept `Scene IR` JSON or file input and pass it into generator |
| `src/compat/scene_platform/resolvers.rs` | Modify | Align runtime parameter resolution with generated contracts |

### Tests and fixtures

| File | Action | Purpose |
|------|--------|---------|
| `tests/scene_generator_test.rs` | Modify | Cover new analysis, archetype classification, and generation routing |
| `tests/generated_scene_*` or related fixtures | Modify/Create | Add representative fixtures for single-request, multi-mode, and paginated-enrichment scenes |

---

## Scope Guardrails

- Do not break existing `--scene-id`, `--scene-name`, or `--scene-kind` compatibility.
- Do not require all scenes to provide complete metadata in HTML meta tags.
- Do not force the runtime to support new resolver contracts unless generation is updated to gate incompatible output.
- Do not assume all report scenes share `org + period` params.
- Do not silently generate low-confidence skills as if they were runnable.

---

### Task 1: Fix Current Hard Failures Before Compiler Refactor

**Files:**
- Modify: `frontend/scene-generator/llm-client.js`
- Modify: `frontend/scene-generator/generator-runner.js`
- Modify: `frontend/scene-generator/server.js`
- Modify: `frontend/scene-generator/sg_scene_generator.html`
- Modify: `src/generated_scene/generator.rs`

**Goal:** Stop the most obvious wrong outputs that currently make generated skills fail on the internal network even before the full compiler architecture lands.

- [ ] **Step 1: Remove report-scene hardcoded parameter assumptions**

Audit `scene.toml` generation in `src/generated_scene/generator.rs` and remove default injection of generic report params such as:

- fixed `org`
- fixed `period`
- default dictionary entity for a specific city
- generic page title keywords like `["报表", "线损"]`

Expected result: generated params come from extracted scene semantics or are omitted when not confidently known.

- [ ] **Step 2: Rework bootstrap source priority**

Change bootstrap derivation so `expected_domain` and `target_url` are resolved using this order:

1. explicit deep extraction result
2. deterministic extraction from business entry points
3. HTML meta tags if trustworthy
4. fallback empty with warning

Explicitly prevent script-host URLs such as static JS includes from becoming the business domain by mistake.

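The priority order in Step 2 can be sketched as a chain of optional candidates. Everything named below (`BootstrapCandidates`, `BootstrapSource`, `looks_like_script_host`, the `.js`/`.css` heuristic, and modeling "fallback empty with warning" as an `Err`) is an assumption for illustration, not the generator's actual interface.

```rust
#[derive(Debug, Clone, PartialEq)]
enum BootstrapSource {
    DeepExtraction,
    Deterministic,
    HtmlMeta,
}

/// Hypothetical container for the candidates named in the priority list.
#[derive(Debug, Default)]
struct BootstrapCandidates {
    deep_extraction: Option<String>,
    deterministic: Option<String>,
    html_meta: Option<String>,
    html_meta_trustworthy: bool,
}

/// Assumed heuristic: a URL that points at a static asset is a script host,
/// not a business entry point.
fn looks_like_script_host(url: &str) -> bool {
    let path = url.split('?').next().unwrap_or(url);
    path.ends_with(".js") || path.ends_with(".css")
}

/// Resolves `expected_domain` by priority. `Err` carries the warning for the
/// "fallback empty" case so callers cannot ignore it.
fn resolve_expected_domain(
    c: &BootstrapCandidates,
) -> Result<(String, BootstrapSource), String> {
    let usable = |v: &Option<String>| {
        v.as_ref()
            .filter(|u| !u.is_empty() && !looks_like_script_host(u.as_str()))
            .cloned()
    };
    if let Some(u) = usable(&c.deep_extraction) {
        return Ok((u, BootstrapSource::DeepExtraction));
    }
    if let Some(u) = usable(&c.deterministic) {
        return Ok((u, BootstrapSource::Deterministic));
    }
    if c.html_meta_trustworthy {
        if let Some(u) = usable(&c.html_meta) {
            return Ok((u, BootstrapSource::HtmlMeta));
        }
    }
    Err("bootstrap unresolved: no trustworthy expected_domain candidate".to_string())
}
```

Returning the source alongside the value lets the preview show where the bootstrap came from.
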
- [ ] **Step 3: Replace naive truncation with chunked extraction input**

Update `frontend/scene-generator/llm-client.js` and `frontend/scene-generator/generator-runner.js` so they no longer send only the first `15000/3000` characters. Replace with:

1. directory tree summary
2. `index.html` chunking
3. URL-bearing fragments
4. request-construction fragments
5. branching logic fragments
6. export-related fragments

- [ ] **Step 4: Add analysis preview and risk banner in Web UI**

Update `frontend/scene-generator/sg_scene_generator.html` and `frontend/scene-generator/server.js` to preview:

- detected archetype
- bootstrap
- key endpoints
- extracted params
- workflow steps
- confidence and risk notes

- [ ] **Step 5: Verify with marketing and tq reference scenes**

Run local analysis against the two reference scenes and confirm:

- `marketing-zero-consumer-report` no longer resolves the wrong domain
- `tq-lineloss-report` still identifies mode-related structures
- generated preview no longer shows generic hardcoded report params

- [ ] **Step 6: Commit**

```bash
git add frontend/scene-generator/llm-client.js frontend/scene-generator/generator-runner.js frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html src/generated_scene/generator.rs
git commit -m "fix(generator): remove hardcoded report defaults and improve bootstrap extraction"
```

---

### Task 2: Introduce Unified Scene IR

**Files:**
- Create: `src/generated_scene/ir.rs`
- Modify: `src/generated_scene/generator.rs`
- Modify: `src/bin/sg_scene_generate.rs`
- Modify: `frontend/scene-generator/server.js`
- Modify: `frontend/scene-generator/llm-client.js`

**Goal:** Introduce a single intermediate representation that all extraction and compilation stages use.

- [ ] **Step 1: Add Rust `Scene IR` structs**

Create `src/generated_scene/ir.rs` with serde-enabled structs for:

- `SceneIr`
- `BootstrapIr`
- `ParamIr`
- `ModeIr`
- `WorkflowStepIr`
- `ArtifactContractIr`
- `NormalizeRulesIr`
- `ReadinessIr`
- `EvidenceIr`

Minimum top-level fields:

```json
{
  "sceneId": "",
  "sceneName": "",
  "sceneKind": "",
  "workflowArchetype": "",
  "bootstrap": {},
  "params": [],
  "modes": [],
  "workflowSteps": [],
  "requestTemplate": {},
  "responsePath": "",
  "normalizeRules": {},
  "artifactContract": {},
  "validationHints": {},
  "evidence": []
}
```

- [ ] **Step 2: Wire `Scene IR` into generator entrypoints**

Update `src/bin/sg_scene_generate.rs` to accept either:

- `--scene-info-json` upgraded to the new IR contract, or
- a new `--scene-ir-json` / `--scene-ir-file` parameter

Keep backward compatibility by translating old scene info into partial IR where needed.

- [ ] **Step 3: Refactor generator to compile from IR**

Update `src/generated_scene/generator.rs` so its internal interfaces no longer directly depend on loosely grouped fields like `expectedDomain`, `staticParams`, and `columnDefs` alone. It should compile from unified `SceneIr`.

- [ ] **Step 4: Update Node server to pass IR through generation**

Modify `frontend/scene-generator/server.js` so analyze endpoints return IR-shaped JSON and generate endpoints pass the same structure into Rust without flattening.

- [ ] **Step 5: Verify serde and CLI compatibility**

Run:

```bash
cargo check
node --check frontend/scene-generator/server.js
node --check frontend/scene-generator/llm-client.js
```

Expected: Rust and Node compile cleanly with the new IR contract.

- [ ] **Step 6: Commit**

```bash
git add src/generated_scene/ir.rs src/generated_scene/generator.rs src/bin/sg_scene_generate.rs frontend/scene-generator/server.js frontend/scene-generator/llm-client.js
git commit -m "feat(generator): introduce unified scene ir for analysis and compilation"
```

---

### Task 3: Build Hybrid Extraction Pipeline

**Files:**
- Modify: `src/generated_scene/analyzer.rs`
- Modify: `frontend/scene-generator/generator-runner.js`
- Modify: `frontend/scene-generator/llm-client.js`
- Modify: `frontend/scene-generator/server.js`

**Goal:** Split extraction into deterministic signal collection plus LLM semantic completion.

- [ ] **Step 1: Implement deterministic extraction helpers**

Add helper logic in `src/generated_scene/analyzer.rs` or adjacent extraction code to detect:

- URLs and request methods
- `contentType`
- request payload builders
- pagination variables such as `page`, `rows`, `pageSize`
- branch variables such as `period_mode`, `reportType`
- entry methods
- export methods
- obvious filter expressions such as `charge !== 0`

- [ ] **Step 2: Create key-fragment selection in Node runner**

Update `frontend/scene-generator/generator-runner.js` to extract and package:

- directory summary
- URL fragments
- branch fragments
- request-body fragments
- response normalization fragments
- export fragments

for LLM analysis.

- [ ] **Step 3: Redesign LLM prompt for workflow understanding**

Update `frontend/scene-generator/llm-client.js` so the prompt explicitly asks for:

- `workflowArchetype`
- `bootstrap`
- `params`
- `modes`
- `workflowSteps`
- `requestTemplate`
- `responsePath`
- `normalizeRules`
- `artifactContract`
- `confidence`
- `uncertainties`

- [ ] **Step 4: Merge deterministic and LLM results**

Implement merge logic in `frontend/scene-generator/server.js` or a dedicated helper:

- deterministic extraction wins for hard facts
- LLM fills missing semantics
- conflicts are surfaced in preview as warnings

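The three merge rules in Step 4 can be illustrated with a small sketch. The plan places this logic in `server.js`; the version here is written in Rust purely to show the rules, and it assumes a flat string-to-string field map and treats every deterministic value as a hard fact. `MergeOutcome` and `merge_extraction` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Result of merging: the final fields plus any conflicts to show in the preview.
#[derive(Debug, Default)]
struct MergeOutcome {
    fields: BTreeMap<String, String>,
    warnings: Vec<String>,
}

/// Deterministic values win, LLM values fill gaps, and disagreements are
/// recorded as warnings instead of being silently dropped.
fn merge_extraction(
    deterministic: &BTreeMap<String, String>,
    llm: &BTreeMap<String, String>,
) -> MergeOutcome {
    let mut out = MergeOutcome {
        fields: deterministic.clone(),
        warnings: Vec::new(),
    };
    for (key, llm_value) in llm {
        match deterministic.get(key) {
            // No deterministic value: the LLM fills the missing semantics.
            None => {
                out.fields.insert(key.clone(), llm_value.clone());
            }
            // Conflict: keep the deterministic value and surface a warning.
            Some(det_value) if det_value != llm_value => {
                out.warnings.push(format!(
                    "conflict on `{}`: kept deterministic `{}`, LLM suggested `{}`",
                    key, det_value, llm_value
                ));
            }
            // Both sources agree: nothing to do.
            Some(_) => {}
        }
    }
    out
}
```
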
- [ ] **Step 5: Verify against reference workflows**

Check that:

- `marketing-zero-consumer-report` emits workflow steps including `paginate`, `secondary_request`, `filter`, and `export`
- `tq-lineloss-report` emits `modes`, `defaultMode`, and `modeSwitchField`

- [ ] **Step 6: Commit**

```bash
git add src/generated_scene/analyzer.rs frontend/scene-generator/generator-runner.js frontend/scene-generator/llm-client.js frontend/scene-generator/server.js
git commit -m "feat(generator): add hybrid deterministic and llm workflow extraction"
```

---

### Task 4: Add Workflow Archetype Classification

**Files:**
- Modify: `src/generated_scene/analyzer.rs`
- Modify: `src/generated_scene/ir.rs`
- Modify: `frontend/scene-generator/server.js`
- Modify: `frontend/scene-generator/sg_scene_generator.html`

**Goal:** Reliably classify scenes so the correct compiler path is chosen.

- [ ] **Step 1: Add archetype enum support**

Define and support these initial archetypes:

- `single_request_table`
- `multi_mode_request`
- `paginated_enrichment`
- `page_state_eval`

- [ ] **Step 2: Implement classification rules**

Classification logic should prefer:

1. `multi_mode_request` when explicit mode-switch branching exists
2. `paginated_enrichment` when paginated list fetch plus secondary requests are detected
3. `page_state_eval` when page-state judgment dominates
4. `single_request_table` as fallback with lower confidence

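The preference order in Step 2 maps naturally onto an ordered chain of checks. In the sketch below, the `Signals` struct, its field names, and the confidence numbers are placeholders chosen for illustration; only the four archetypes and their priority come from the plan.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WorkflowArchetype {
    SingleRequestTable,
    MultiModeRequest,
    PaginatedEnrichment,
    PageStateEval,
}

/// Hypothetical summary of what deterministic scanning found in a scene.
#[derive(Debug, Default, Clone, Copy)]
struct Signals {
    has_mode_switch_branch: bool,
    has_paginated_list: bool,
    has_secondary_requests: bool,
    page_state_dominates: bool,
}

/// Applies the rules in priority order and returns the archetype with a
/// confidence value. The numbers are placeholders, not calibrated scores.
fn classify(s: Signals) -> (WorkflowArchetype, f32) {
    if s.has_mode_switch_branch {
        (WorkflowArchetype::MultiModeRequest, 0.9)
    } else if s.has_paginated_list && s.has_secondary_requests {
        (WorkflowArchetype::PaginatedEnrichment, 0.85)
    } else if s.page_state_dominates {
        (WorkflowArchetype::PageStateEval, 0.8)
    } else {
        // Fallback: nothing more specific matched, so confidence is lower.
        (WorkflowArchetype::SingleRequestTable, 0.5)
    }
}
```

Because the checks run in order, a scene with both mode switching and pagination is classified as `multi_mode_request`, matching rule 1.
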
- [ ] **Step 3: Expose classification confidence**

Add confidence and evidence fields to the preview payload so UI can show why a scene was classified into an archetype.

- [ ] **Step 4: Add manual override support in UI**

Allow users to override archetype in `frontend/scene-generator/sg_scene_generator.html` before final generation, but preserve the original detected result and confidence.

- [ ] **Step 5: Verify reference classifications**

Expected:

- `marketing-zero-consumer-report` => `paginated_enrichment`
- `tq-lineloss-report` => `multi_mode_request`

- [ ] **Step 6: Commit**

```bash
git add src/generated_scene/analyzer.rs src/generated_scene/ir.rs frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html
git commit -m "feat(generator): classify scenes by workflow archetype with confidence"
```

---

### Task 5: Split Generator Into Archetype Compilers

**Files:**
- Modify: `src/generated_scene/generator.rs`
- Optionally create: `src/generated_scene/compiler_single_request.rs`
- Optionally create: `src/generated_scene/compiler_multi_mode.rs`
- Optionally create: `src/generated_scene/compiler_paginated_enrichment.rs`
- Optionally create: `src/generated_scene/compiler_page_state.rs`

**Goal:** Replace the single generic report template with explicit compiler paths.

- [ ] **Step 1: Add compiler routing by archetype**

Update `src/generated_scene/generator.rs` so generation dispatches on `workflowArchetype`.

- [ ] **Step 2: Implement `single_request_table` compiler**

Generate:

- minimal `scene.toml`
- direct request browser script
- artifact output for simple table/list data

- [ ] **Step 3: Implement `multi_mode_request` compiler**

Generate:

- mode detection
- mode-specific request builders
- mode-specific column definitions
- mode-specific response extraction
- unified artifact output

Reference target: `tq-lineloss-report`

- [ ] **Step 4: Implement `paginated_enrichment` compiler**

Generate:

- paginated list loop
- per-item or batched secondary requests
- aggregation and transform steps
- business filters
- final artifact or export output

Reference target: `marketing-zero-consumer-report`

- [ ] **Step 5: Implement `page_state_eval` compiler**

Generate:

- state-check script skeleton
- light artifact semantics for monitoring or status checks

- [ ] **Step 6: Verify generated outputs by archetype**

Validate that generated scripts no longer:

- define multiple API endpoints but use only the first
- collapse mode-aware scenes into one request body
- flatten paginated enrichment scenes into one-step normalization

- [ ] **Step 7: Commit**

```bash
git add src/generated_scene/generator.rs src/generated_scene/compiler_*.rs
git commit -m "feat(generator): split scene generation into workflow archetype compilers"
```

---

### Task 6: Align Runtime Resolver Contracts

**Files:**
- Modify: `src/compat/scene_platform/resolvers.rs`
- Modify: `src/generated_scene/generator.rs`
- Modify: `src/generated_scene/ir.rs`

**Goal:** Ensure generated parameter contracts are either executable by the runtime or explicitly flagged as unsupported.

- [ ] **Step 1: Audit current resolver coverage**

Document which current contracts are already supported, including:

- `dictionary_entity`
- `month_week_period`
- `fixed_enum`
- `literal_passthrough`

- [ ] **Step 2: Add missing resolver types or gate them**

Choose one of these paths per parameter type:

1. implement new runtime resolver support
2. downgrade generation to an existing supported resolver
3. block generation with explicit readiness warning

Recommended additions:

- `mode_enum`
- `date_range`
- `org_tree`
- `page_size`
- `hidden_static`
- `derived_param`

- [ ] **Step 3: Reflect runtime compatibility in generated metadata**

Generated output should clearly indicate:

- supported params
- unresolved params
- manual-completion requirements

- [ ] **Step 4: Add tests for resolver alignment**

Extend tests to ensure a generated skill cannot claim runnable readiness when its params require unsupported resolver behavior.

- [ ] **Step 5: Commit**

```bash
git add src/compat/scene_platform/resolvers.rs src/generated_scene/generator.rs src/generated_scene/ir.rs tests/scene_generator_test.rs
git commit -m "feat(runtime): align generated scene contracts with resolver support"
```

---

### Task 7: Add Readiness Gates And Generation Report

**Files:**
- Modify: `frontend/scene-generator/server.js`
- Modify: `frontend/scene-generator/sg_scene_generator.html`
- Modify: `src/generated_scene/ir.rs`
- Modify: `src/generated_scene/generator.rs`

**Goal:** Make generation output self-describing so users know whether a skill is ready for internal-network trial.

- [ ] **Step 1: Add static readiness checks**

Implement checks for:

- entrypoint detection
- request-chain completeness
- bootstrap plausibility
- param/runtime compatibility
- archetype compiler completeness

- [ ] **Step 2: Add readiness levels**

Define:

- `A` = ready for direct internal-network trial
- `B` = structurally correct, human review recommended
- `C` = draft only, manual completion required

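One possible mapping from the static checks in Step 1 to the levels in Step 2 is sketched below. The plan does not say how checks and confidence combine into a grade, so the rule used here (any failed check gives `C`; all checks passing with confidence of at least 0.8 gives `A`; otherwise `B`) and all names are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReadinessLevel {
    A, // ready for direct internal-network trial
    B, // structurally correct, human review recommended
    C, // draft only, manual completion required
}

/// One flag per static readiness check from Step 1.
#[derive(Debug, Default, Clone, Copy)]
struct StaticChecks {
    entrypoint_detected: bool,
    request_chain_complete: bool,
    bootstrap_plausible: bool,
    params_runtime_compatible: bool,
    archetype_compiler_complete: bool,
}

/// Assumed grading rule: fail closed to `C` on any failed check, and require
/// high classification confidence before granting `A`.
fn grade(checks: StaticChecks, confidence: f32) -> ReadinessLevel {
    let all_pass = checks.entrypoint_detected
        && checks.request_chain_complete
        && checks.bootstrap_plausible
        && checks.params_runtime_compatible
        && checks.archetype_compiler_complete;
    if !all_pass {
        ReadinessLevel::C
    } else if confidence >= 0.8 {
        ReadinessLevel::A
    } else {
        ReadinessLevel::B
    }
}
```
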
- [ ] **Step 3: Generate human-readable report**
|
||||
|
||||
Each analysis or generation result should include:
|
||||
|
||||
- archetype
|
||||
- confidence
|
||||
- key evidence
|
||||
- detected risks
|
||||
- missing pieces
|
||||
- readiness level
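
The checks, levels, and report fields from Steps 1 to 3 could be modeled roughly as follows. This is only a sketch: the type names, field names, the `0.8` confidence threshold, and the rule that structural gaps force `C` are all assumptions made for illustration, not the contents of `src/generated_scene/ir.rs`.

```rust
/// Readiness grade, mirroring the A/B/C levels defined in Step 2.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Readiness {
    A, // ready for direct internal-network trial
    B, // structurally correct, human review recommended
    C, // draft only, manual completion required
}

/// Outcome of the five static checks from Step 1 (field names are hypothetical).
#[derive(Debug, Clone, Copy, Default)]
pub struct StaticChecks {
    pub entrypoint_detected: bool,
    pub request_chain_complete: bool,
    pub bootstrap_plausible: bool,
    pub params_runtime_compatible: bool,
    pub archetype_compiler_complete: bool,
}

/// Hypothetical report shape covering the fields listed in Step 3.
#[derive(Debug, Clone)]
pub struct GenerationReport {
    pub archetype: String,
    pub confidence: f32,
    pub key_evidence: Vec<String>,
    pub detected_risks: Vec<String>,
    pub missing_pieces: Vec<String>,
    pub readiness: Readiness,
}

/// Assumed policy: a structural gap forces `C`; any other gap or a
/// confidence below 0.8 yields `B`; otherwise the grade is `A`.
pub fn grade(checks: &StaticChecks, confidence: f32) -> Readiness {
    let structural = checks.entrypoint_detected
        && checks.request_chain_complete
        && checks.archetype_compiler_complete;
    if !structural {
        return Readiness::C;
    }
    if checks.bootstrap_plausible && checks.params_runtime_compatible && confidence >= 0.8 {
        Readiness::A
    } else {
        Readiness::B
    }
}

/// Builds a report whose `missing_pieces` names every failed check.
pub fn build_report(archetype: &str, confidence: f32, checks: &StaticChecks) -> GenerationReport {
    let named = [
        (checks.entrypoint_detected, "entrypoint detection"),
        (checks.request_chain_complete, "request-chain completeness"),
        (checks.bootstrap_plausible, "bootstrap plausibility"),
        (checks.params_runtime_compatible, "param/runtime compatibility"),
        (checks.archetype_compiler_complete, "archetype compiler completeness"),
    ];
    let missing_pieces = named
        .iter()
        .filter(|(ok, _)| !*ok)
        .map(|(_, name)| name.to_string())
        .collect();
    GenerationReport {
        archetype: archetype.to_string(),
        confidence,
        key_evidence: Vec::new(),
        detected_risks: Vec::new(),
        missing_pieces,
        readiness: grade(checks, confidence),
    }
}
```

Deriving the grade from the same check results that populate `missing_pieces` keeps the grade and its explanation from drifting apart.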

- [ ] **Step 4: Display readiness in Web UI**

Show the readiness grade before generation and after generation, with explicit warnings for internal-network execution risk.

- [ ] **Step 5: Verify readiness outcomes**

Expected baseline:

- `tq-lineloss-report` should reach `A` or high-confidence `B`
- `marketing-zero-consumer-report` should not be labeled runnable unless pagination and secondary-request logic are correctly represented

- [ ] **Step 6: Commit**

```bash
git add frontend/scene-generator/server.js frontend/scene-generator/sg_scene_generator.html src/generated_scene/ir.rs src/generated_scene/generator.rs
git commit -m "feat(generator): add readiness grading and generation risk reporting"
```

---

### Task 8: Add Regression Coverage For Reference Scenes

**Files:**

- Modify: `tests/scene_generator_test.rs`
- Create/Modify: scene generator fixtures as needed

**Goal:** Lock in the two reference scenes as ongoing regression cases.

- [ ] **Step 1: Add marketing classification fixture coverage**

Test that the marketing source scene is classified as `paginated_enrichment` and contains evidence for:

- paginated list request
- secondary request
- filter rule
- export step

- [ ] **Step 2: Add tq classification fixture coverage**

Test that the tq source scene is classified as `multi_mode_request` and contains evidence for:

- month mode
- week mode
- distinct request templates
- distinct column definitions

- [ ] **Step 3: Add generation-shape assertions**

Assert that generated outputs differ by archetype and do not collapse to a single generic template shape.

- [ ] **Step 4: Run verification**

```bash
cargo test --test scene_generator_test -- --nocapture
```

Expected: both reference cases pass and guard against regression.

- [ ] **Step 5: Commit**

```bash
git add tests/scene_generator_test.rs tests/fixtures
git commit -m "test(generator): add regression coverage for marketing and tq reference scenes"
```

---

## Delivery Sequence

Recommended implementation order:

1. Task 1: hard failure fixes
2. Task 2: `Scene IR`
3. Task 3: hybrid extraction
4. Task 4: archetype classification
5. Task 5: compiler split
6. Task 6: resolver alignment
7. Task 7: readiness gates
8. Task 8: regression coverage

Rationale:

- Task 1 stops current bad outputs early.
- Tasks 2 to 5 establish the new compiler backbone.
- Tasks 6 and 7 prevent false claims of runnability.
- Task 8 locks the new architecture against regression.

---

## Verification Strategy

### Static Verification

- `cargo check`
- `cargo test --test scene_generator_test -- --nocapture`
- `node --check frontend/scene-generator/llm-client.js`
- `node --check frontend/scene-generator/generator-runner.js`
- `node --check frontend/scene-generator/server.js`

### Functional Verification

For `marketing-zero-consumer-report`:

- detected as `paginated_enrichment`
- bootstrap resolves to business domain, not static script host
- generated workflow includes pagination and secondary requests
- generation is not labeled runnable if those steps are missing

For `tq-lineloss-report`:

- detected as `multi_mode_request`
- month and week logic remain distinct
- request templates and column definitions are mode-specific

### UI Verification

Confirm the scene generator UI now shows:

- detected archetype
- confidence
- bootstrap
- key params
- readiness grade
- risk notes

---

## Acceptance Criteria

This plan is complete when all of the following are true:

1. `sg_scene_generate` consumes a unified `Scene IR`.
2. The analysis pipeline can distinguish at least `single_request_table`, `multi_mode_request`, `paginated_enrichment`, and `page_state_eval`.
3. `tq-lineloss-report` is generated through the multi-mode compiler path.
4. `marketing-zero-consumer-report` is generated through the paginated-enrichment compiler path.
5. Generated `scene.toml` no longer injects unrelated default org/period assumptions.
6. Bootstrap resolution no longer mistakes external script hosts for business target domains.
7. Runtime resolver compatibility is explicit, not implicit.
8. Generation results include readiness grading and risk reporting before internal-network trial.

@@ -0,0 +1,193 @@
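
# G1 Boundary Tightening and Family Reassignment Implementation Plan

> Date: 2026-04-18
> Status: Draft
> Source:
> - `docs/superpowers/specs/2026-04-17-scene-skill-60-to-90-roadmap-design.md`
> - `examples/g1_batch_round1/`

## 1. Plan Intent

This plan addresses the overly broad boundary of the `G1` general single-page report group.

Measurement and structural analysis of the following four boundary samples confirm that the current `G1` classification admits scenes that do not belong in it:

1. `高低压新增报装容量月度统计表` (monthly statistics of newly installed high/low-voltage capacity)
2. `电能表现场检验完成率指标报表` (on-site electricity meter inspection completion-rate indicator report)
3. `计量资产库存统计` (metering asset inventory statistics)
4. `95598供电服务月报` (95598 power supply service monthly report)

The conclusion is not "keep observing" but "rectification is required":

1. The definition of `G1` must be tightened.
2. These four samples must be reassigned.
3. Subsequent implementation must follow the new boundary; the four samples can no longer be lumped into one class.
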
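## 2. Rectification Objective

The objectives for this round are fixed as follows:

1. Tighten the `G1` definition so that it stops polluting `single_request_table`.
2. Reassign the four boundary samples to the correct families.
3. Give later implementation an explicit order, so boundary samples are no longer handled as "generic reports".

## 3. Final Reassignment Decision

For the execution of this plan, the formal classification of the four samples is fixed as follows:

1. `高低压新增报装容量月度统计表`
   - Stays in `G1`
   - Subtype label: `G1-E` (light-enrichment summary type, 轻量补查汇总型)
2. `电能表现场检验完成率指标报表`
   - Split out of `G1`
   - New family label: `G6` (host-bridged multi-step query type, 宿主桥接多步查询型)
3. `计量资产库存统计`
   - Split out of `G1`
   - New family label: `G7` (multi-endpoint inventory summary type, 多接口盘点汇总型)
4. `95598供电服务月报`
   - Split out of `G1`
   - New family label: `G8` (scrape, persist, analyze, and document-output type, 抓取落库分析出文档型)
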
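## 4. Scope Guardrails

The boundaries of this plan are fixed as follows:

1. Do not modify the line-loss family `G2`.
2. Do not extend to a simultaneous reassignment of all `102` scenes.
3. Handle only the `G1` boundary definition and these four boundary samples.
4. Do not implement the full capabilities of `G6/G7/G8` within this plan.
5. Prioritize producing "boundary tightening + family reassignment + implementation order".
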
## 5. Phase Overview

The execution order is fixed:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3`

### Phase 0: Freeze the Rectification Baseline

Goals:

1. Freeze the revised `G1` definition.
2. Freeze the formal reassignment decision for the four samples.

Exit criteria:

1. The four samples are no longer discussed together as `G1` candidates.

### Phase 1: Tighten the G1 Boundary

Goals:

1. Explicitly narrow `G1` to "general single-page reports".
2. List the structural features that do not belong to `G1` as explicit exclusion conditions.

Required outputs:

1. Revised `G1` definition
2. `G1` entry conditions
3. `G1` exclusion conditions
4. A description of `G1-E` as the upper-boundary subtype

Exit criteria:

1. `single_request_table` no longer absorbs host-bridged, inventory, or persist-and-analyze scenes.

### Phase 2: Sample Reassignment and Family Registration

Goals:

1. Formally move the four samples to their families.
2. Establish minimal definitions for `G6/G7/G8`.

Required outputs:

1. Sample reassignment table
2. Minimal definition of `G6`
3. Minimal definition of `G7`
4. Minimal definition of `G8`

Exit criteria:

1. None of the four samples remains an "ambiguous G1 candidate".

### Phase 3: Fix the Follow-up Implementation Order

Goals:

1. Determine the order of subsequent development.
2. Avoid spreading work across multiple families concurrently.

Fixed order:

1. First, continue with `高低压新增报装容量月度统计表`
   - as `G1-E`
2. Then open `G6` on its own
   - `电能表现场检验完成率指标报表`
3. Then evaluate `G7`
   - `计量资产库存统计`
4. Finally, evaluate `G8`
   - `95598供电服务月报`

Exit criteria:

1. The order of follow-up tasks is explicit.
2. `G1` stops absorbing new boundary samples.

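## 6. Family-Level Rectification Rules

### 6.1 G1 Revision Rules

`G1` retains only scenes with the following characteristics:

1. Hosted by a single system and a single page
2. A relatively clear main request chain exists
3. Request templates and response paths can be recovered directly
4. The final result is a single table or a single statistical summary
5. No dependence on complex host bridging
6. No dependence on local persistence or SQL analysis

### 6.2 G1 Exclusion Rules

A scene with any one of the following features is no longer classified as `G1`:

1. `BrowserAction / sgBrowserExcuteJsCode` drives the business requests
2. An obvious workflow chained through multiple rounds of callbacks
3. Multiple business endpoints within one scene, scanned by category
4. Local persistence, secondary analysis, or SQL aggregation is required before the report
5. The output is primarily a Word document pipeline rather than a direct table result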
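
Because any single exclusion feature is enough to leave `G1`, the 6.2 rules reduce to an "any of" check. The sketch below illustrates that shape under stated assumptions: the `SceneSignals` struct and its field names are hypothetical, and how each signal would actually be detected in source scenes is out of scope here.

```rust
/// Structural signals observed in a scene. Field names are hypothetical
/// stand-ins for the five exclusion features in section 6.2.
#[derive(Debug, Clone, Copy, Default)]
pub struct SceneSignals {
    pub browser_action_drives_requests: bool, // BrowserAction / sgBrowserExcuteJsCode
    pub multi_round_callback_workflow: bool,
    pub multiple_business_endpoints: bool,
    pub local_persistence_or_sql: bool,
    pub word_document_pipeline: bool,
}

/// Returns one reason per exclusion rule that fired; empty means no rule fired.
pub fn g1_exclusion_reasons(s: &SceneSignals) -> Vec<&'static str> {
    let rules = [
        (s.browser_action_drives_requests, "BrowserAction drives business requests"),
        (s.multi_round_callback_workflow, "multi-round callback workflow"),
        (s.multiple_business_endpoints, "multiple business endpoints scanned by category"),
        (s.local_persistence_or_sql, "local persistence or SQL aggregation before the report"),
        (s.word_document_pipeline, "Word document pipeline output"),
    ];
    rules
        .iter()
        .filter(|(hit, _)| *hit)
        .map(|(_, reason)| *reason)
        .collect()
}

/// A scene stays a G1 candidate only if no exclusion rule fired.
/// The positive criteria from section 6.1 would still need to be checked separately.
pub fn is_g1_candidate(s: &SceneSignals) -> bool {
    g1_exclusion_reasons(s).is_empty()
}
```

Returning the reasons rather than a bare boolean lets a reassignment table record why a sample left `G1`.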

## 7. Implementation Priority

Priorities are fixed as follows:

1. `P0`
   - `高低压新增报装容量月度统计表`
   - Goal: verify whether `G1-E` holds up as a stable upper boundary of `G1`
2. `P1`
   - `电能表现场检验完成率指标报表`
   - Goal: verify the minimal workflow definition of `G6`
3. `P2`
   - `计量资产库存统计`
   - Goal: verify the multi-endpoint aggregation boundary of `G7`
4. `P3`
   - `95598供电服务月报`
   - Goal: verify the boundary of the `G8` scrape, persist, and analyze chain

## 8. Deliverables

On completion, this plan produces at least:

1. Revised `G1` boundary text
2. Reassignment table for the four boundary samples
3. Minimal family definitions for `G6/G7/G8`
4. Follow-up implementation priority list

## 9. Completion Criteria

This plan is complete when:

1. The `G1` definition has been formally tightened.
2. The four boundary samples have been formally reassigned.
3. `高低压新增报装容量月度统计表` is confirmed as `G1-E`.
4. `电能表现场检验完成率指标报表`, `计量资产库存统计`, and `95598供电服务月报` are no longer used as `G1` samples.
5. The follow-up development order is fixed and boundary ownership is no longer re-debated.

@@ -0,0 +1,212 @@

# G1-E Light Enrichment Report Plan

> Date: 2026-04-18
> Status: Draft
> Source:
> - `docs/superpowers/specs/2026-04-18-g1-e-light-enrichment-report-design.md`
> - `docs/superpowers/specs/2026-04-17-scene-skill-60-to-90-roadmap-design.md`
> - `docs/superpowers/reports/2026-04-18-g1-boundary-reassignment-report.md`

## 1. Plan Intent

This plan moves `G1-E` (light-enrichment summary type, 轻量补查汇总型) from a conceptual boundary to an implementable state.

This round solves exactly one problem:

1. Enable the generator to recover compilable three-part business semantics for scenes of the form "single main request + a few enrichment requests + one summarized output".

This plan does not handle `G6/G7/G8`, nor does it extend to other families.

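## 2. Scope

Only three kinds of work are in scope:

1. Completing the `G1-E` evidence layer
2. Landing the three-part `Scene IR` and compiler gate for `G1-E`
3. P0 reference validation with `高低压新增报装容量月度统计表` (monthly statistics of newly installed high/low-voltage capacity)

Explicitly excluded:

1. `G6` (host-bridged multi-step query type, 宿主桥接多步查询型)
2. `G7` (multi-endpoint inventory summary type, 多接口盘点汇总型)
3. `G8` (scrape, persist, analyze, and document-output type, 抓取落库分析出文档型)
4. Large-scale family expansion across the `102` scenes

## 3. Fixed Sample

The sole P0 reference sample for this plan is fixed as:

1. `高低压新增报装容量月度统计表`

The frozen targets for this sample are:

1. Main request: `getWkorderAll`
2. Enrichment requests:
   - `queryElectCustInfo`
   - `queryBusAcpt`
   - `getBatchPerCust97`
3. Final recovery into a three-part structure: main request, enrichment requests, and merge rules

No second `G1-E` reference sample is added until this plan is complete.
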
## 4. Phase Overview

The execution order is fixed:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3`

### Phase 0: Freeze Contract

Goals:

1. Freeze the minimal definition of `G1-E`.
2. Freeze the target baseline for the P0 sample's main chain, enrichment chain, and merge chain.

Required outputs:

1. `G1-E` spec
2. Description of the P0 sample's target structure
3. Failure classification criteria

Exit criteria:

1. Later implementation no longer falls back to plain `G1 single_request_table`.

### Phase 1: Evidence Layer Completion

Goals:

1. The extraction pipeline can explicitly produce `main_request` evidence.
2. The extraction pipeline can explicitly produce `enrichment_request` evidence.
3. The extraction pipeline can explicitly produce `merge_plan` evidence.

Required outputs:

1. `main_request` evidence schema
2. `enrichment_request` evidence schema
3. `merge_plan` evidence schema
4. The corresponding out-of-scope detection signals

Exit criteria:

1. The P0 sample no longer lands only in `page_state_eval`.
2. Candidates for the main request, enrichment requests, and merge rules are visible in the extraction result.

### Phase 2: Scene IR And Compiler Gates

Goals:

1. Carry the three-part structure in `Scene IR`.
2. Add a `G1-E`-specific gate to the compiler.
3. Prevent results that lack the enrichment contract from being misjudged as a plain `G1` success.

Required outputs:

1. `main_request`
2. `enrichment_requests[]`
3. `merge_plan`
4. `main_request_resolved`
5. `enrichment_requests_resolved`
6. `merge_plan_resolved`
7. `g1e_scope_compatible`

Exit criteria:

1. `G1-E` can be determined independently of `single_request_table`.
2. Out-of-scope samples are blocked rather than reported as a false success.
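
One way to picture the Phase 2 objects is a three-part IR plus four boolean gates evaluated from it. The following is a minimal sketch under assumptions: only the seven names listed above come from the plan, while the struct layouts, the `join_keys` and `out_of_scope_signals` fields, and the resolution rules are hypothetical.

```rust
/// Hypothetical request shape; only the endpoint name is modeled here.
#[derive(Debug, Clone, Default)]
pub struct RequestSpec {
    pub endpoint: String,
}

/// Hypothetical merge plan: the keys used to merge enrichment rows back.
#[derive(Debug, Clone, Default)]
pub struct MergePlan {
    pub join_keys: Vec<String>,
}

/// Three-part IR slice for G1-E: main request, enrichment requests, merge plan.
#[derive(Debug, Clone, Default)]
pub struct G1eSceneIr {
    pub main_request: Option<RequestSpec>,
    pub enrichment_requests: Vec<RequestSpec>,
    pub merge_plan: Option<MergePlan>,
    /// Assumed field: signals that point toward G6/G7/G8 instead of G1-E.
    pub out_of_scope_signals: Vec<String>,
}

/// The four gates named in the plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct G1eGates {
    pub main_request_resolved: bool,
    pub enrichment_requests_resolved: bool,
    pub merge_plan_resolved: bool,
    pub g1e_scope_compatible: bool,
}

fn resolved(r: &RequestSpec) -> bool {
    !r.endpoint.trim().is_empty()
}

impl G1eGates {
    /// Assumed rule set: every part must be present and non-empty.
    pub fn evaluate(ir: &G1eSceneIr) -> Self {
        G1eGates {
            main_request_resolved: ir.main_request.as_ref().map_or(false, resolved),
            enrichment_requests_resolved: !ir.enrichment_requests.is_empty()
                && ir.enrichment_requests.iter().all(resolved),
            merge_plan_resolved: ir
                .merge_plan
                .as_ref()
                .map_or(false, |m| !m.join_keys.is_empty()),
            g1e_scope_compatible: ir.out_of_scope_signals.is_empty(),
        }
    }

    /// Fail closed: the scene compiles as G1-E only if every gate holds.
    pub fn passed(&self) -> bool {
        self.blockers().is_empty()
    }

    /// Names of the gates that failed, for the blocking explanation.
    pub fn blockers(&self) -> Vec<&'static str> {
        let gates = [
            (self.main_request_resolved, "main_request_resolved"),
            (self.enrichment_requests_resolved, "enrichment_requests_resolved"),
            (self.merge_plan_resolved, "merge_plan_resolved"),
            (self.g1e_scope_compatible, "g1e_scope_compatible"),
        ];
        gates.iter().filter(|(ok, _)| !*ok).map(|(_, n)| *n).collect()
    }
}
```

With this shape, an empty-shell IR fails three gates at once instead of passing as a plain `G1` success.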

### Phase 3: P0 Validation

Goals:

1. Validate the minimal `G1-E` closed loop with `高低压新增报装容量月度统计表`.
2. Freeze the first acceptance baseline.

Required outputs:

1. P0 sample generation result
2. P0 sample validation record
3. P0 sample failure attribution record

Exit criteria:

1. The main request, enrichment requests, and merge rules are all recovered reliably.
2. The result is no longer an empty shell of `params=[] / requestEntries=[] / columnDefs=[]`.
3. The system fails closed when evidence is missing.

## 5. Work Breakdown

### Task Group A: G1-E Evidence Modeling

Task goals:

1. Define the main request evidence object.
2. Define the enrichment request evidence object.
3. Define the merge rule evidence object.
4. Specify the signals that indicate a scene crosses into `G6/G7/G8`.

Done when:

1. `G1-E` no longer depends on the fuzzy judgment of "does the whole text look like a report".

### Task Group B: G1-E IR / Compiler Integration

Task goals:

1. Build the three-part `Scene IR` for `G1-E`.
2. Add the `G1-E` gate.
3. Close the path where "enrichment is missing but the result still succeeds as plain G1".

Done when:

1. The success conditions of `G1-E` and `G1` are formally separated.

### Task Group C: P0 Sample Verification

Task goals:

1. Regenerate `高低压新增报装容量月度统计表`.
2. Check that the main request, enrichment requests, and merge rules are complete.
3. Produce a validation report.

Done when:

1. `高低压新增报装容量月度统计表` becomes the first standard reference sample for `G1-E`.

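## 6. Deliverables

On completion, this plan produces at least:

1. `G1-E` evidence layer implementation
2. `G1-E` three-part `Scene IR`
3. `G1-E` compiler gate
4. P0 generation and validation results for `高低压新增报装容量月度统计表`
5. The corresponding rectification or validation report

## 7. Acceptance Criteria

This plan is complete when:

1. `G1-E` has moved from a documented definition to an implementable, verifiable state.
2. `高低压新增报装容量月度统计表` is no longer mis-generated as an empty-shell plain `G1` skill.
3. The generator can explicitly recover:
   - the main request
   - the enrichment requests
   - the merge rules
4. When evidence is insufficient or the structure is out of scope, the system blocks and explains why.

## 8. Execution Guardrails

The following boundaries must be respected during execution:

1. Do not mix `G6/G7/G8` capabilities into `G1-E` ahead of time.
2. Do not add a second `G1-E` reference sample.
3. Do not loosen the gate just to "get a skill generated first".
4. Do not let `G1-E` degrade back into plain `single_request_table`.

## 9. Next Plan

After this plan is complete, the follow-up order is fixed:

1. If `G1-E` P0 validation passes, decide whether to add a second `G1-E` reference sample.
2. Then move on to the standalone spec / plan for `G6`.
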
304
docs/superpowers/plans/2026-04-18-g2-family-expansion-plan.md
Normal file
@@ -0,0 +1,304 @@
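
# G2 Family Expansion Remediation Plan

> **Status:** Draft
> **Date:** 2026-04-18
> **Author:** Codex
> **Upstream Inputs:**
> [2026-04-18-g2-remediation-plan.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/plans/2026-04-18-g2-remediation-plan.md)
> [2026-04-18-g2-second-round-remediation-report.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/reports/2026-04-18-g2-second-round-remediation-report.md)

## Plan Intent

This plan builds on the previous round's remediation of the `G2` primary sample and moves the goal from "get the `tq` primary sample working" to "extend coverage to the `G2` family variants".

The previous round established:

1. `台区线损大数据-月_周累计线损率统计分析` (transformer-district line-loss big data: monthly/weekly cumulative line-loss rate analysis) can enter the candidate validation list.
2. The `G2` primary sample chain is compilable.

The two remaining real samples still consistently `fail-close`:

1. `白银线损周报` (Baiyin line-loss weekly report)
2. `线损同期差异报表` (line-loss period difference report)

The core goal of this plan is therefore not to redo the previous round's primary-sample remediation, but to fill in recognition and contract recovery for the two remaining variant classes within the `G2` family.
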
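## Success Baseline

The minimum success bar after this plan completes is fixed as:

1. `白银线损周报` is no longer blocked outright because the `G2` contract is missing.
2. `线损同期差异报表` is no longer forced into the `tq` primary report template.
3. The generator can clearly distinguish at least two newly added `G2` family subtypes.
4. Each new subtype has its own minimal explainable contract.
5. Samples with insufficient evidence continue to `fail-close`.
6. Readiness stays consistent with whether a sample has reached the candidate validation list.
7. A third-round `G2` family expansion regression report is produced.
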
## Scope Guardrails

The following boundaries remain unchanged during execution:

1. Do not switch to `G1`.
2. Do not switch to `G3`.
3. Do not open up unified login, hidden-field login, or host transport refactoring.
4. Do not extend to all 102 scenes.
5. Do not let this plan grow into a rewrite of the general scene skill platform.
6. Do not overturn the baseline on which the previous round's `tq` primary sample converged.

## Target Samples

This plan works only with the following three real `G2` family samples:

1. `台区线损大数据-月_周累计线损率统计分析`
2. `白银线损周报`
3. `线损同期差异报表`

Their roles are:

1. `台区线损大数据-月_周累计线损率统计分析`
   Role: `G2-A` primary sample baseline; regression is not allowed
2. `白银线损周报`
   Role: `G2-B` weekly-report single-sided mode variant
3. `线损同期差异报表`
   Role: `G2-C` mixed linkage variant

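## Family Expansion Hypothesis

Based on the previous round's report, this plan first narrows the `G2` family to three classes:

1. `G2-A`
   Definition: the `tq` primary report type, with a stable `month/week + cols1/cols2 + mode-specific request/response`
   Current status: already on the candidate validation list
2. `G2-B`
   Definition: a weekly-report variant leaning toward a single mode; `week/tjzq` and the line-loss main endpoint are present, but a dual-mode column contract on par with the primary sample is missing
   Current representative: `白银线损周报`
3. `G2-C`
   Definition: a mixed variant in which the line-loss main chain is linked with an external system, so line-loss endpoints and linkage endpoints coexist
   Current representative: `线损同期差异报表`

The remediation principles of this plan are:

1. Do not force `G2-B/G2-C` into the shape of `G2-A`.
2. First establish the boundaries between the three subtypes.
3. Then give each subtype its own minimal contract.
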
## Workstreams

This plan is split into five workstreams:

1. `WS1` G2 subtype layering and classification tightening
2. `WS2` G2-B weekly-report variant contract completion
3. `WS3` G2-C mixed linkage variant isolation
4. `WS4` G2 family readiness grading rework
5. `WS5` Third-round real-sample regression and report

## Phase Overview

This plan proceeds in four phases:

1. Phase 0: Freeze the family expansion goals
2. Phase 1: Establish the `G2-A/G2-B/G2-C` subtype boundaries
3. Phase 2: Complete the minimal contracts for `G2-B/G2-C` separately
4. Phase 3: Regress the three real samples and produce the expansion report

The execution order is fixed:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3`

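## Phase 0: Freeze the Expansion Goals

### Objective

Freeze the family facts that the previous round converged on, so the problem is not again described as "the primary sample is not fixed".

### Tasks

1. Lock in the baseline under which `G2-A` already passes.
2. Lock in the direct blockers for `G2-B` and `G2-C`.
3. Lock in that this plan only adds family expansion and does not roll back the primary sample chain.

### Exit Criteria

1. The `tq` primary sample is treated as the baseline and is no longer a remediation target.
2. The family expansion problem is explicitly stated as "missing variant support".
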
## Phase 1: Establish Subtype Boundaries

### Objective

Let the system distinguish `G2-A/G2-B/G2-C` instead of sending every `G2` scene down the same path.

### WS1: G2 Subtype Layering and Classification Tightening

#### Task 1

Audit the signal differences among the current real `G2` samples and clarify the following boundaries:

1. Which signals belong to `G2-A`
2. Which signals belong to `G2-B`
3. Which signals belong to `G2-C`

#### Task 2

Add subtype classification rules for `G2` that can distinguish at least:

1. Dual-mode primary report type
2. Weekly-report single-sided mode type
3. Mixed linkage type
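
The three-way split in Task 2 might be expressed as a small decision function over extracted signals. This is a sketch only: the signal struct and its field names are hypothetical, and the rule order (linkage checked before dual-mode, dual-mode before week-only) is an assumption derived from the subtype definitions in this plan rather than a confirmed design.

```rust
/// The three G2 subtypes named in this plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum G2Subtype {
    A, // dual-mode primary report type
    B, // weekly-report single-sided mode type
    C, // mixed linkage type
}

/// Hypothetical summary of the signals extracted from one G2 scene.
#[derive(Debug, Clone, Copy, Default)]
pub struct G2Signals {
    pub has_month_mode: bool,
    pub has_week_mode: bool,
    pub has_dual_column_contract: bool, // e.g. cols1/cols2
    pub has_lineloss_endpoint: bool,
    pub has_external_linkage_endpoint: bool,
}

/// Returns `None` (fail-close) when the evidence fits no subtype.
pub fn classify_g2(s: &G2Signals) -> Option<G2Subtype> {
    if !s.has_lineloss_endpoint {
        // Without a line-loss endpoint there is no G2 main chain at all.
        return None;
    }
    if s.has_external_linkage_endpoint {
        // Linkage endpoints coexisting with line-loss endpoints mark G2-C.
        return Some(G2Subtype::C);
    }
    if s.has_month_mode && s.has_week_mode && s.has_dual_column_contract {
        return Some(G2Subtype::A);
    }
    if s.has_week_mode {
        // Week-side signals without the full dual-mode contract mark G2-B.
        return Some(G2Subtype::B);
    }
    None
}
```

Returning `Option` keeps the fail-close path explicit: a scene that matches no rule is not silently assigned to `G2-A`.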

#### Task 3

Add fixtures and regression tests that show:

1. `G2-A` does not regress.
2. `G2-B` is no longer mistakenly fitted to `G2-A`.
3. `G2-C` is no longer mistakenly fitted to `G2-A`.

### Phase 1 Exit Criteria

1. The `G2` family can be layered internally.
2. The generation path no longer assumes every `G2` scene is the `tq` primary report.

## Phase 2: Complete Minimal Variant Contracts

### Objective

Establish a "small enough but explainable" contract for `G2-B` and for `G2-C`.

### WS2: G2-B Weekly-Report Variant Contract Completion

#### Task 4

Define the minimal contract for `G2-B`, including at least:

1. Primary mode or primary period field
2. The corresponding request template
3. The corresponding response path
4. The corresponding column/required fields

#### Task 5

Modify the analyzer / generator / scene ir assembly logic so that `白银线损周报` outputs a non-empty contract instead of continuing to be blocked outright for a missing contract.

#### Task 6

Add or update tests that show `G2-B` can stand on its own without depending on the complete `month/week` dual-mode structure.

### WS3: G2-C Mixed Linkage Variant Isolation

#### Task 7

Audit the following in `线损同期差异报表`:

1. The line-loss main chain
2. The linkage chain to the 同期 (period) system
3. Which part belongs to the primary report contract

#### Task 8

Establish isolation rules for `G2-C` so that mixed linkage endpoints do not pollute primary report generation.

#### Task 9

Define the minimal compilable contract for `G2-C`, allowing:

1. The main chain to enter candidate validation
2. The linkage chain to be retained as risk or extension evidence

rather than mixing everything together and then failing outright.

#### Task 10

Add or update tests that show `G2-C` can at least reliably output a structured "main chain + linkage risk" result.

### WS4: G2 Family Readiness Grading Rework

#### Task 11

Add subtype-level readiness gates for `G2-A/G2-B/G2-C`.

#### Task 12

Adjust the readiness grading logic to guarantee:

1. `G2-A` can reach `A` when it satisfies the complete dual-mode contract.
2. `G2-B` can reach a candidate validation grade when it satisfies its minimal contract.
3. `G2-C` can receive an explainable grade even if only the main chain closes.
4. Insufficient evidence continues to `fail-close`.

#### Task 13

Add tests that show readiness no longer misjudges every `G2` subtype by the `G2-A` standard.

### Phase 2 Exit Criteria

1. `G2-B` has a minimal contract.
2. `G2-C` has a minimal contract after isolation.
3. Readiness is consistent with the subtype definitions.

## Phase 3: Third-Round Real-Sample Regression

### Objective

With the expanded family capabilities, regress the three real samples again and produce a formal conclusion.

### WS5: Third-Round Real-Sample Regression and Report

#### Task 14

Regenerate the following three real samples:

1. `台区线损大数据-月_周累计线损率统计分析`
2. `白银线损周报`
3. `线损同期差异报表`

#### Task 15

Compare under a uniform baseline:

1. Subtype classification
2. bootstrap
3. request contract
4. response / column / normalize contract
5. readiness
6. Whether the sample enters the candidate validation list

#### Task 16

Produce the third-round `G2` family expansion remediation report, stating at least:

1. Whether `G2-A` remains stable
2. Whether `G2-B` enters the candidate validation list
3. Whether `G2-C` enters the candidate validation list or still needs to fail-close
4. Whether the remaining blockers have shifted from "primary sample cannot be generated" to "a few variants awaiting extension"

### Deliverables

1. `G2` family expansion regression tests
2. Fixtures for `G2-B/G2-C`
3. Third-round real-sample generation results
4. Third-round `G2` family expansion remediation report

### Acceptance Criteria

1. `G2-A` does not regress.
2. `G2-B` reaches at least an explainable contract or a candidate validation grade.
3. `G2-C` at least achieves successful main-chain isolation and is no longer polluted by whole-package noise.
4. The three samples are no longer forced through a single `G2-A` model.

## File-Level Targets

Execution of this plan touches at least the following asset types:

1. analyzer / generator / readiness implementations under `src/generated_scene/`
2. `G2-B/G2-C` fixtures under `tests/fixtures/generated_scene/`
3. Tests under `tests/` related to scene generator / readiness / family regression
4. The third-round family expansion report under `docs/superpowers/reports/`

## Completion Criteria

This plan is complete when:

1. `G2` has advanced from "one primary sample working" to "at least three distinguishable subtypes".
2. `白银线损周报` and `线损同期差异报表` are no longer merely passive fail-close cases.
3. The decision on whether to extend to more line-loss variants can be based on the third-round family expansion report.

331
docs/superpowers/plans/2026-04-18-g2-remediation-plan.md
Normal file
@@ -0,0 +1,331 @@

# G2 Family Remediation Plan

> **Status:** Draft
> **Date:** 2026-04-18
> **Author:** Codex
> **Upstream Spec:** [2026-04-18-g2-remediation-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-18-g2-remediation-design.md)

## Plan Intent

This plan breaks the `G2` family remediation design into executable tasks. The goal is to move the line-loss multi-mode report family from "signals are captured but main-chain reconstruction fails" to "at least the first sample reaches the candidate validation bar".

This plan is strictly limited to `G2` family remediation and does not extend to:

1. `G1`
2. `G3`
3. Migration of a wider range of real scenes
4. Login recovery, host protocol refactoring, or runtime transport changes

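## Success Baseline

The minimum success bar for the remediation phase is fixed as:

1. `台区线损大数据-月_周累计线损率统计分析` (transformer-district line-loss big data: monthly/weekly cumulative line-loss rate analysis) no longer collapses into `paginated_enrichment`.
2. At least this sample can generate a `multi_mode_request` structure.
3. `bootstrap` lands on the main line-loss business surface.
4. `modes` recovers at least `month` and `week`.
5. The mode-specific `request/response/column/normalize` contract is no longer empty.
6. Readiness no longer reports an inflated `A` when core contracts are missing.
7. The sample result reaches the "eligible for candidate validation" bar.
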
## Scope Guardrails

The following boundaries remain unchanged during execution:

1. Do not switch to executing `G1/G3` samples.
2. Do not add more `G2` observation samples of the same kind.
3. Do not carry out manual internal-network validation within this plan.
4. Do not handle unified login or hidden-field login recovery within this plan.
5. Do not drift into a general refactor of the scene skill platform.

## Target Samples

Remediation and regression in this plan work only with the following three `G2` samples:

1. `台区线损大数据-月_周累计线损率统计分析`
2. `白银线损周报` (Baiyin line-loss weekly report)
3. `线损同期差异报表` (line-loss period difference report)

Corresponding artifact paths:

1. `examples/real_scene_batch_round1/skills/real-tq-lineloss-report-r1`
2. `examples/real_scene_batch_round1/skills/real-baiyin-lineloss-weekly-r1`
3. `examples/real_scene_batch_round1/skills/real-lineloss-period-diff-r1`

## Workstreams

This plan is split into five workstreams that map one-to-one to the upstream `spec`:

1. `WS1` G2 archetype rectification
2. `WS2` bootstrap rectification
3. `WS3` mode contract reconstruction
4. `WS4` endpoint purification
5. `WS5` readiness tightening

## Phase Overview

This plan proceeds in four phases:

1. Phase 0: Freeze the remediation baseline
2. Phase 1: Fix recognition and selection
3. Phase 2: Rebuild the `G2` contract
4. Phase 3: Regress the real samples and produce the remediation report

The execution order is fixed:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3`

`Phase 1` comes before `Phase 2` to avoid piling more template logic on top of a wrong archetype and a wrong bootstrap.

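## Phase 0: Freeze the Remediation Baseline

### Objective

Freeze the current first-round blockers, the comparison baseline, and the acceptance bar for the `G2` family, so the boundary does not drift during remediation.

### Tasks

1. Lock in the current failure profile of the three `G2` samples.
2. Lock in `tq-lineloss-report` as the primary anchor reference for `G2`.
3. Lock in the `G2` candidate validation bar.
4. Lock in that the remediation phase works only within `G2`.

### Deliverables

1. This plan
2. The existing `G2` blocker summary
3. The existing first-round migration and candidate validation report

### Exit Criteria

1. No further `G2` observation samples of the same kind are added during execution.
2. "Let's just try it on the internal network first" is no longer a substitute for closing the remediation loop.
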
## Phase 1: Fix Recognition and Selection

### Objective

First correct the `G2` main-chain determination, resolving the three upstream problems: archetype, bootstrap, and endpoint pollution.

### WS1: G2 Archetype Rectification

#### Task 1

Audit the sources of the current `G2` archetype misclassification and confirm:

1. Which pagination signals are taking over the decision
2. Which mode signals are not reaching the main decision
3. Where the priority conflict between `multi_mode_request` and `paginated_enrichment` currently lies

#### Task 2

Modify the `G2` archetype decision logic so that the following signals carry higher weight in `G2`:

1. `month/week`
2. `mode`
3. `tjzq`
4. Multiple groups of line-loss endpoints within one scene
5. Mode-switch branch fields
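
"Higher weight" for mode signals could take the form of a weighted score compared against the pagination score. The sketch below shows that idea under explicit assumptions: the weights (`3` versus `2`), the tie-break in favor of `multi_mode_request`, and every field name are invented for illustration and would need tuning against the real fixtures.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Archetype {
    MultiModeRequest,
    PaginatedEnrichment,
}

/// Hypothetical signal summary for one scene.
#[derive(Debug, Clone, Copy, Default)]
pub struct ArchetypeSignals {
    pub month_week_tokens: bool,
    pub mode_field: bool,
    pub tjzq_field: bool,
    pub lineloss_endpoint_groups: u32,
    pub mode_switch_branch: bool,
    pub pagination_fields: bool,
    pub secondary_request: bool,
}

const MODE_WEIGHT: u32 = 3; // assumed: mode evidence outweighs pagination evidence
const PAGINATION_WEIGHT: u32 = 2;

pub fn multi_mode_score(s: &ArchetypeSignals) -> u32 {
    let flags = [s.month_week_tokens, s.mode_field, s.tjzq_field, s.mode_switch_branch];
    let mut score = flags.iter().filter(|f| **f).count() as u32 * MODE_WEIGHT;
    if s.lineloss_endpoint_groups >= 2 {
        score += MODE_WEIGHT;
    }
    score
}

pub fn paginated_score(s: &ArchetypeSignals) -> u32 {
    let flags = [s.pagination_fields, s.secondary_request];
    flags.iter().filter(|f| **f).count() as u32 * PAGINATION_WEIGHT
}

/// Returns `None` when there is no evidence either way, so the caller can fail closed.
pub fn classify(s: &ArchetypeSignals) -> Option<Archetype> {
    let (mm, pg) = (multi_mode_score(s), paginated_score(s));
    if mm == 0 && pg == 0 {
        None
    } else if mm >= pg {
        Some(Archetype::MultiModeRequest)
    } else {
        Some(Archetype::PaginatedEnrichment)
    }
}
```

A scene with only pagination and secondary-request evidence still classifies as `paginated_enrichment`, which is the behavior the `G3` fixture regression in Task 3 is meant to protect.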

#### Task 3

Add or update regression tests that show:

1. The current `G2` fixture is no longer classified as `paginated_enrichment`.
2. The `G2` fixes do not break the existing `G3` fixture.

### WS2: Bootstrap Rectification

#### Task 4

Audit the current bootstrap selection logic and confirm why all three samples consistently land on `20.77.115.36:31051`.

#### Task 5

Introduce stricter bootstrap selection constraints for `G2`:

1. Prefer the page that actually hosts the line-loss business
2. Exclude page-shell entry points and wrong primary domains
3. Continue to exclude `localhost:*`, third-party library URLs, and static resource URLs
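
The third constraint, which continues to exclude `localhost:*`, third-party library URLs, and static resource URLs, is mechanical enough to sketch. The code below is an illustration under assumptions: the extension list and the CDN host list are examples chosen here, not the project's actual lists, and detecting "page-shell entry points and wrong primary domains" needs business knowledge that a generic filter like this does not have.

```rust
/// Illustrative lists only; the real filter would be configured elsewhere.
const STATIC_EXTENSIONS: [&str; 8] = [".js", ".css", ".png", ".jpg", ".gif", ".svg", ".ico", ".woff"];
const THIRD_PARTY_HOSTS: [&str; 3] = ["cdn.jsdelivr.net", "unpkg.com", "cdnjs.cloudflare.com"];

/// True when a URL must not be chosen as the bootstrap target.
pub fn is_excluded_bootstrap(url: &str) -> bool {
    let lower = url.to_ascii_lowercase();
    // Drop the scheme, then split authority from path.
    let rest = match lower.find("://") {
        Some(i) => &lower[i + 3..],
        None => lower.as_str(),
    };
    let (authority, path) = match rest.find('/') {
        Some(i) => (&rest[..i], &rest[i..]),
        None => (rest, ""),
    };
    let host = authority.split(':').next().unwrap_or("");
    // Ignore query string and fragment when checking the file extension.
    let path = path.split(|c: char| c == '?' || c == '#').next().unwrap_or("");

    let is_local = host == "localhost" || host == "127.0.0.1";
    let is_third_party = THIRD_PARTY_HOSTS.iter().any(|h| host == *h);
    let is_static = STATIC_EXTENSIONS.iter().any(|ext| path.ends_with(*ext));
    is_local || is_third_party || is_static
}

/// Picks the first candidate that survives the exclusion rules.
/// Candidates are assumed to be pre-sorted by business relevance.
pub fn pick_bootstrap<'a>(candidates: &[&'a str]) -> Option<&'a str> {
    candidates.iter().copied().find(|u| !is_excluded_bootstrap(u))
}
```

Excluded `localhost:*` URLs are only skipped as bootstrap targets here; keeping them as host-dependency evidence, as Task 6 requires, would happen in a separate evidence slot.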
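
#### Task 6

Add or update tests that show:

1. The `G2` primary sample's bootstrap no longer lands on the wrong entry point.
2. `localhost:*` is still retained only as host-dependency evidence.

### WS4: Endpoint Purification

#### Task 7

Audit the sources of pollution in current endpoint extraction and clarify how the following categories are wrongly admitted as business candidates:

1. Third-party dependency libraries
2. External documentation links
3. Static resource URLs
4. Leftover endpoints from other business systems

#### Task 8

Tighten the endpoint candidate filtering and ranking rules so that, in `G2` samples:

1. Line-loss main business endpoints rank near the top
2. External links and dependency library URLs no longer enter the main business candidates
3. Endpoints from other business systems no longer easily take over the main chain

#### Task 9

Add tests that show:

1. The ranking of `G2` primary endpoints improves noticeably.
2. Noise endpoints no longer pollute the generated main script.

### Phase 1 Exit Criteria

1. `G2` fixture archetype determination is corrected.
2. `G2` bootstrap selection is corrected.
3. `G2` endpoint candidate ranking is corrected.
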
## Phase 2: Rebuild the G2 Contract

### Objective

Once the main-chain determination is correct, restore the mode-specific contract that `G2` requires, along with stricter readiness.

### WS3: Mode Contract Reconstruction

#### Task 10

Define the minimal mode contract for `G2`, including at least:

1. `modes[]`
2. `defaultMode`
3. `modeSwitchField`
4. per-mode `requestTemplate`
5. per-mode `responsePath`
6. per-mode `columnDefs`
7. per-mode `normalizeRules`
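
The seven contract fields from Task 10 map naturally onto a pair of structs. The sketch below is an assumption about shape only: it uses plain strings for templates, paths, columns, and rules, which the real `Scene IR` may well model with richer types, and the `missing_pieces` helper is a hypothetical addition for reporting.

```rust
/// Per-mode part of the contract (Task 10, items 4 to 7).
#[derive(Debug, Clone, Default)]
pub struct ModeContract {
    pub name: String, // e.g. "month" or "week"
    pub request_template: Option<String>,
    pub response_path: Option<String>,
    pub column_defs: Vec<String>,
    pub normalize_rules: Vec<String>,
}

/// Scene-level part of the contract (Task 10, items 1 to 3).
#[derive(Debug, Clone, Default)]
pub struct G2ModeContract {
    pub modes: Vec<ModeContract>,
    pub default_mode: Option<String>,
    pub mode_switch_field: Option<String>,
}

impl G2ModeContract {
    /// Lists every contract field that is still empty, using the
    /// field names from the plan so the output can feed a report.
    pub fn missing_pieces(&self) -> Vec<String> {
        let mut missing = Vec::new();
        if self.modes.is_empty() {
            missing.push("modes[]".to_string());
        }
        if self.default_mode.is_none() {
            missing.push("defaultMode".to_string());
        }
        if self.mode_switch_field.is_none() {
            missing.push("modeSwitchField".to_string());
        }
        for m in &self.modes {
            if m.request_template.is_none() {
                missing.push(format!("{}.requestTemplate", m.name));
            }
            if m.response_path.is_none() {
                missing.push(format!("{}.responsePath", m.name));
            }
            if m.column_defs.is_empty() {
                missing.push(format!("{}.columnDefs", m.name));
            }
            if m.normalize_rules.is_empty() {
                missing.push(format!("{}.normalizeRules", m.name));
            }
        }
        missing
    }
}
```

An empty result from `missing_pieces` would correspond to a contract that is complete in shape; whether the recovered values are correct is a separate question for the regression tests.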

#### Task 11

Modify the `Scene IR` assembly or generation logic so that, when evidence is sufficient, `G2` samples actually output `modes[]` instead of keeping only empty-shell default fields.

#### Task 12

Modify the `G2` generation script template or compile path so that it no longer degrades into the generic:

- `paginate -> secondary_request -> filter`

The generated result must reflect:

1. `month` mode
2. `week` mode
3. Request differences between modes
4. Column differences between modes

#### Task 13

Add or update tests that show:

1. `台区线损大数据-月_周累计线损率统计分析` can output non-empty `modes`.
2. At least one `G2` fixture recovers a mode-specific contract.

### WS5: Readiness Tightening

#### Task 14

Add or tighten gates for `G2`, covering at least:

1. `g2_archetype_resolved`
2. `g2_bootstrap_resolved`
3. `g2_modes_present`
4. `g2_request_contract_complete`
5. `g2_response_contract_complete`

#### Task 15

Adjust the readiness grading logic so that the following cases no longer receive a high grade:

1. `modes = []`
2. `requestTemplate = null`
3. `columnDefs = []`
4. archetype misclassification
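
Tasks 14 and 15 together amount to capping the grade by the gates. A minimal sketch follows, assuming a four-step grade scale with an explicit `Blocked` level and one particular mapping from gate failures to grades; the gate names come from Task 14, while the scale and the mapping are illustrative assumptions.

```rust
/// Assumed grade scale, ordered from worst to best so grades can be compared.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Grade {
    Blocked,
    C,
    B,
    A,
}

/// The five gates named in Task 14.
#[derive(Debug, Clone, Copy, Default)]
pub struct G2Gates {
    pub g2_archetype_resolved: bool,
    pub g2_bootstrap_resolved: bool,
    pub g2_modes_present: bool,
    pub g2_request_contract_complete: bool,
    pub g2_response_contract_complete: bool,
}

/// Assumed mapping: a wrong archetype or empty `modes` blocks outright;
/// an incomplete request or response contract caps the grade at `C`;
/// an unresolved bootstrap caps it at `B`.
pub fn g2_grade(g: &G2Gates) -> Grade {
    if !g.g2_archetype_resolved || !g.g2_modes_present {
        return Grade::Blocked;
    }
    let contracts = g.g2_request_contract_complete && g.g2_response_contract_complete;
    match (contracts, g.g2_bootstrap_resolved) {
        (true, true) => Grade::A,
        (true, false) => Grade::B,
        _ => Grade::C,
    }
}
```

Under this mapping none of the four cases in Task 15 can reach `A`, since each one fails at least one gate that either blocks or caps the grade.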

#### Task 16

Add or update tests that show:

1. `G2` samples that do not close are downgraded or blocked.
2. Readiness is consistent with the candidate validation bar.

### Phase 2 Exit Criteria

1. At least the `G2` primary sample has an explainable mode contract.
2. Readiness is no longer inflated.
3. The `G2` generation result is structurally capable of reaching the candidate bar.

## Phase 3: Regress Real Samples and Produce the Remediation Report

### Objective

After remediation, regenerate the three real `G2` samples and produce the formal second-round conclusion.

### Tasks

#### Task 17

Regenerate the following three `G2` samples:

1. `台区线损大数据-月_周累计线损率统计分析`
2. `白银线损周报`
3. `线损同期差异报表`

#### Task 18

Using exactly the same baseline as the first round, compare:

1. archetype
2. bootstrap
3. modes
4. request contract
5. response / column / normalize contract
6. readiness

#### Task 19

Produce the post-remediation second-round report, containing at least:

1. Which blockers were fixed
2. Which blockers remain
3. Which samples enter the candidate validation list
4. Which samples still need to fail closed

### Deliverables

1. Second-round `G2` real-sample generation results
2. Second-round `G2` remediation regression report
3. Updated candidate validation list

### Acceptance Criteria

1. `台区线损大数据-月_周累计线损率统计分析` at least enters the candidate validation list.
2. The three samples no longer uniformly collapse into `paginated_enrichment`.
3. Readiness is broadly consistent with how far the real business flow actually closes.

## File-Level Targets

Execution of this plan touches at least the following asset types:

1. analyzer / generator / readiness implementations under `src/generated_scene/`
2. `G2` fixtures or canonical assets under `tests/fixtures/generated_scene/`
3. Regression tests under `tests/` related to scene generator / canonical / readiness
4. The second-round remediation report under `docs/superpowers/reports/`

## Completion Criteria

This plan is complete when:

1. The `G2` primary sample reaches the candidate validation bar.
2. `G2` family blockers move from "consistently reproduced" to "partially fixed and explainable".
3. The decision on whether to switch to `G1/G3` can be based on the post-remediation second-round report rather than on the first-round failure profile.

@@ -0,0 +1,458 @@

# G3 Paginated Enrichment Plan

> **Status:** Draft
> **Date:** 2026-04-18
> **Author:** Codex
> **Upstream Spec:** [2026-04-18-g3-paginated-enrichment-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-18-g3-paginated-enrichment-design.md)

## Plan Intent

This plan breaks the `G3` paginated enrichment family design into executable tasks. The goal is to move the current `paginated_enrichment` from "a broad label for complex workflows" to a formal mainline archetype that has an evidence layer, a minimal contract, a canonical baseline, and fail-closed determination.

This plan is strictly limited to landing `G3 / P0-3` and does not extend to:

1. `G6/G7/G8`
2. Concurrent remediation of the whole `95598` family
3. Login recovery or host transport refactoring
4. Large-scale rollout across the 102 scenes

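## Success Baseline

The minimum success bar after this plan completes is fixed as:

1. `95598工单明细表` (95598 work order detail table) is no longer just a vague "complex work order" sample.
2. The generation chain can explicitly recover:
   - `main request`
   - `pagination plan`
   - `enrichment requests`
   - `export plan`
3. `localhost:*`, host injection, and BrowserAction are no longer misjudged as the business main chain.
4. `G3` has a minimal compilable contract and its own gate.
5. Results reliably `fail-closed` when evidence is insufficient.
6. `95598、12398、流程超期风险工单明细` (95598, 12398, and process-overdue-risk work order details) can enter reuse validation as the first extension sample.

## Scope Guardrails

The following boundaries remain unchanged during execution:

1. Do not let `G3` fall back to plain paginated-table recognition.
2. Do not mix host-bridging capabilities into the `G3` contract ahead of time.
3. Do not loosen the gate just to get a skill generated first.
4. Do not open up `G6/G7/G8` in parallel.
5. Do not carry out real manual internal-network validation within this plan.
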
## Target Samples

Remediation and regression in this plan work only with the following two reference samples:

1. `95598工单明细表`
2. `95598、12398、流程超期风险工单明细`

Their roles are fixed as:

1. `95598工单明细表`
   - Role: `P0-3` primary reference sample
   - Goal: freeze the `G3 canonical`
2. `95598、12398、流程超期风险工单明细`
   - Role: first extension sample
   - Goal: verify that the `G3` contract and evidence layer are reusable

## Workstreams

This plan is split into five workstreams:

1. `WS1` G3 boundary freeze and sample registration
2. `WS2` G3 evidence layer modeling
3. `WS3` G3 Scene IR / compiler gate / readiness construction
4. `WS4` G3 P0 canonical and failure taxonomy freeze
5. `WS5` G3 real-sample regression and report

## Phase Overview

This plan proceeds in five phases:

1. Phase 0: Freeze the `G3` boundary and samples
2. Phase 1: Build the `G3` evidence layer
3. Phase 2: Build the `G3` minimal contract and gate
4. Phase 3: Freeze the `P0-3 canonical`
5. Phase 4: Regress the real samples and produce the first-round report

The execution order is fixed:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3 -> Phase 4`

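## Phase 0: Freeze the G3 Boundary and Samples

### Objective

Fix the `G3` problem boundary, primary sample, and extension sample first, so that work-order, host-bridged, and export-analysis scenes are not mixed together again during development.

### WS1: G3 Boundary Freeze and Sample Registration

#### Task 1

Freeze the formal definition of `G3`:

1. Not a plain paginated table
2. Not a host-bridged type
3. Rather, a complex workflow report in which a "main query chain + pagination chain + enrichment chain + export chain" coexist

#### Task 2

Freeze `95598工单明细表` as the sole `P0-3` primary reference sample.

#### Task 3

Freeze `95598、12398、流程超期风险工单明细` as the first extension sample.

#### Task 4

Lock in the `G3` entry conditions:

1. A main query chain candidate exists
2. Pagination control evidence exists
3. An enrichment or associated-detail chain exists
4. The final result depends on fetching all pages, enrichment, export, or aggregation

#### Task 5

Lock in the `G3` exclusion conditions:

1. Plain reports that a single request can complete
2. Scenes advanced only by BrowserAction, with no stable business main chain
3. Scenes whose main body is local persist-and-analyze or document output
4. Scenes where `localhost:*` or host dependencies clearly outweigh business evidence

### Deliverables

1. `G3` family definition
2. `G3` sample list
3. `G3` entry and exclusion conditions
4. Description of the boundary between `G3` and other families

### Exit Criteria

1. `95598工单明细表` is no longer discussed as a vague work-order sample.
2. `G3` is no longer confused with host-bridged or document-output scenes.
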
## Phase 1: Establish the G3 Evidence Layer

### Objective

Replace the path that compresses source code directly into `Scene IR` with a two-step path: first form adjudicable `G3` evidence, then reduce it to `Scene IR`.

### WS2: G3 Evidence-Layer Modeling

#### Task 6

Define `main_request_candidate`, carrying:

1. The main-query endpoint
2. The query parameter template
3. The time range or primary filter condition

#### Task 7

Define `pagination_candidate`, carrying:

1. The page-number field
2. The pageSize field
3. The pagination stop condition
4. The rolling-window or interval-advance rule

#### Task 8

Define `enrichment_request_candidate`, carrying:

1. Detail lookups
2. Secondary endpoints
3. Associated enrichment

#### Task 9

Define `join_key_candidate`, carrying:

1. Work-order number
2. Process number
3. User number
4. Device number
5. Other keys that link the main chain to the enrichment chain

#### Task 10

Define `export_candidate`, carrying:

1. The export endpoint
2. Export parameters
3. Actions required before export
4. The artifact type

#### Task 11

Define `workflow_step_candidate`, carrying the ordering relationship among the following steps:

1. Main query
2. Pagination
3. Enrichment lookup
4. Aggregation
5. Export

#### Task 12

Define `dedupe_or_merge_rule_candidate`, carrying:

1. Deduplication rules
2. Rules for merging detail rows back into main rows
3. Cross-page accumulation rules

#### Task 13

Define `host_bridge_candidate` and `localhost_dependency_candidate`, so that the host chain is retained only as separate evidence.

#### Task 14

Establish evidence-merging and conflict-adjudication rules that make explicit:

1. Which evidence belongs to the business main chain
2. Which evidence belongs to the host bridge
3. Which evidence belongs to the result export chain

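
One way to picture the evidence layer from Tasks 6 to 14 is a flat list of typed evidence items that is sorted into slots before anything is reduced to `Scene IR`. The sketch below is a minimal illustration under assumptions: the `Evidence` class, the `slot_evidence` function, and the rule that a localhost URL demotes a business-chain candidate are all hypothetical; only the evidence kind names come from the plan.

```python
from collections import defaultdict
from dataclasses import dataclass
from urllib.parse import urlparse

# Evidence kinds named in Tasks 6-13.
KINDS = {
    "main_request_candidate", "pagination_candidate",
    "enrichment_request_candidate", "join_key_candidate",
    "export_candidate", "workflow_step_candidate",
    "dedupe_or_merge_rule_candidate", "host_bridge_candidate",
    "localhost_dependency_candidate",
}
# Assumption: these three kinds make up the "business main chain".
BUSINESS_KINDS = {
    "main_request_candidate", "pagination_candidate",
    "enrichment_request_candidate",
}
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


@dataclass(frozen=True)
class Evidence:
    kind: str
    url: str = ""     # endpoint the evidence refers to, if any
    detail: str = ""  # free-form note from the extractor


def slot_evidence(items):
    """Group evidence by kind, keeping localhost endpoints out of business slots."""
    slots = defaultdict(list)
    for ev in items:
        if ev.kind not in KINDS:
            raise ValueError(f"unknown evidence kind: {ev.kind}")
        host = urlparse(ev.url).hostname or ""
        kind = ev.kind
        # A localhost endpoint proposed as business evidence is re-slotted
        # as a localhost dependency instead of being dropped.
        if kind in BUSINESS_KINDS and host in LOCAL_HOSTS:
            kind = "localhost_dependency_candidate"
        slots[kind].append(ev)
    return dict(slots)
```

Re-slotting rather than discarding keeps the host chain visible as independent evidence, which is what Task 13 asks for.
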
### Deliverables

1. `G3` evidence schema
2. `G3` evidence type dictionary
3. Evidence-merging rules
4. First-version evidence sample for `95598工单明细表`

### Exit Criteria

1. The main, pagination, enrichment, export, and host chains can each be presented in their own slot
2. `localhost:*` no longer leaks into the business main chain

## Phase 2: Establish the G3 Minimal Contract and Gates

### Objective

Upgrade the `G3` decision criterion from "looks like a pagination-plus-enrichment scene" to "does the minimal business contract hold".

### WS3: G3 Scene IR / Compiler Gate / Readiness Build-Out

#### Task 15

Define the `G3` minimal contract, including at least:

1. `main_request`
2. `pagination_plan`
3. `enrichment_requests[]`
4. `join_keys[]`
5. `export_plan`
6. `merge_or_dedupe_rules`

#### Task 16

Carry `G3`-specific structures in `Scene IR`, rather than degrading into the empty-shell fields of an ordinary `paginated_enrichment`.

#### Task 17

Add `G3` gates, including at least:

1. `g3_main_request_resolved`
2. `g3_pagination_contract_complete`
3. `g3_enrichment_contract_complete`
4. `g3_join_key_resolved`
5. `g3_export_path_identified`
6. `g3_runtime_scope_compatible`

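
The minimal contract in Task 15 and the gates in Task 17 line up almost one to one. The sketch below shows one possible shape, assuming hypothetical class names (`G3Contract`, `PaginationPlan`), hypothetical field layouts, and a hypothetical `requires_host_runtime` flag standing in for the runtime-scope check; only the contract slot names and gate names come from the plan.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PaginationPlan:
    page_field: str
    page_size_field: str
    stop_condition: Optional[str] = None  # how the loop knows it has all pages


@dataclass
class G3Contract:
    main_request: Optional[str] = None
    pagination_plan: Optional[PaginationPlan] = None
    enrichment_requests: list = field(default_factory=list)
    join_keys: list = field(default_factory=list)
    export_plan: Optional[str] = None
    merge_or_dedupe_rules: list = field(default_factory=list)
    requires_host_runtime: bool = False  # assumption: stand-in for runtime scope


def evaluate_gates(c: G3Contract) -> dict:
    """Return one boolean per gate named in Task 17."""
    p = c.pagination_plan
    return {
        "g3_main_request_resolved": c.main_request is not None,
        "g3_pagination_contract_complete": p is not None and bool(
            p.page_field and p.page_size_field and p.stop_condition),
        "g3_enrichment_contract_complete": bool(c.enrichment_requests),
        "g3_join_key_resolved": bool(c.join_keys),
        "g3_export_path_identified": c.export_plan is not None,
        "g3_runtime_scope_compatible": not c.requires_host_runtime,
    }
```

Returning every gate, rather than stopping at the first failure, makes it easier for a readiness report to show which parts of the contract are still open.
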
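#### Task 18

Define the blocker / readiness criteria, which must be able to distinguish:

1. Insufficient business evidence
2. Pagination contract not closed
3. Enrichment contract not closed
4. Export chain depends on the host
5. Runtime dependencies not satisfied

#### Task 19

Implement the `fail-closed` rules:

1. Main request chain missing: block
2. Pagination chain present but stop condition unclear: block
3. Enrichment chain present but join key unclear: block
4. Only an export action with no business main chain: block
5. Host-bridge evidence clearly outnumbers business evidence: block
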
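
The five fail-closed rules in Task 19 can be sketched as a function that returns blocker codes, where an empty list is the only result that allows generation to continue. The `facts` dictionary and its keys are hypothetical; the blocker codes reuse names from the failure taxonomy in Task 24, and "clearly outnumbers" is again approximated by a strict count comparison.

```python
def fail_closed_blockers(facts: dict) -> list:
    """Return blocker codes for a G3 scene; an empty list means no rule fired.

    `facts` is a hypothetical summary of recovered evidence.
    """
    blockers = []
    if not facts.get("main_request"):
        blockers.append("main_chain_missing")
    if facts.get("pagination") and not facts.get("pagination_stop_condition"):
        blockers.append("pagination_incomplete")
    if facts.get("enrichment") and not facts.get("join_keys"):
        blockers.append("join_key_missing")
    if facts.get("export") and not facts.get("main_request"):
        blockers.append("export_only_without_business_chain")
    # Assumption: "clearly outnumbers" is modeled as a strict count comparison.
    if facts.get("host_evidence", 0) > facts.get("business_evidence", 0):
        blockers.append("host_bridge_pollution")
    return blockers


def can_emit_runnable_skill(facts: dict) -> bool:
    """Fail closed: any blocker at all stops runnable output."""
    return not fail_closed_blockers(facts)
```
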
#### Task 20

Add tests proving that an unclosed `G3` sample cannot pass itself off as a runnable skill.

### Deliverables

1. `G3` minimal contract table
2. `G3` gate table
3. `G3` blocker / readiness table
4. `G3` Scene IR example

### Exit Criteria

1. `G3` has its own independent gates
2. Unclosed results are blocked accurately
3. The `compiler` no longer accepts an unclosed `G3 IR`

## Phase 3: Freeze the P0-3 Canonical

### Objective

Make `95598工单明细表` the first-version reference answer, key-evidence baseline, and failure-taxonomy baseline for `G3`.

### WS4: G3 P0 Canonical and Failure Taxonomy Freeze

#### Task 21

Freeze the canonical `Scene IR` for `95598工单明细表`.

#### Task 22

Freeze the key-evidence list, including at least:

1. Main request chain
2. Pagination chain
3. Enrichment chain
4. Join key
5. Export chain
6. Host dependencies

#### Task 23

Freeze the acceptance checklist, checking at least:

1. Whether the main chain is recovered
2. Whether the pagination chain is recovered
3. Whether the enrichment chain is recovered
4. Whether the join key is recovered
5. Whether the export chain is recovered
6. Whether the host chain is isolated
7. Whether readiness matches the actual degree of closure

#### Task 24

Freeze the failure taxonomy, including at least:

1. `main_chain_missing`
2. `pagination_incomplete`
3. `enrichment_incomplete`
4. `join_key_missing`
5. `export_only_without_business_chain`
6. `host_bridge_pollution`
7. `runtime_dependency_unresolved`

#### Task 25

Establish how generated results are aligned against the canonical.

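
Task 25 leaves the alignment method open. One simple option, shown here purely as an assumption, is a slot-by-slot comparison of the generated `Scene IR` against the frozen canonical, reporting each differing slot as either missing or mismatched. The function name and the dictionary representation of the IR are hypothetical; the slot names are the contract fields from Task 15.

```python
# Contract slots from Task 15, compared in a fixed order.
SLOTS = (
    "main_request", "pagination_plan", "enrichment_requests",
    "join_keys", "export_plan", "merge_or_dedupe_rules",
)


def diff_against_canonical(generated: dict, canonical: dict) -> dict:
    """Map each differing slot to 'missing' or 'mismatch'; {} means aligned."""
    report = {}
    for slot in SLOTS:
        want, got = canonical.get(slot), generated.get(slot)
        if got == want:
            continue
        # An empty or absent value is reported as missing, anything else as a mismatch.
        report[slot] = "missing" if got in (None, [], {}, "") else "mismatch"
    return report
```

A report in this shape could then be translated into taxonomy codes such as `join_key_missing`, though that mapping is not defined in this plan.
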
### Deliverables

1. `G3` P0 canonical `Scene IR`
2. `G3` P0 evidence baseline
3. `G3` acceptance checklist
4. `G3` failure taxonomy table

### Exit Criteria

1. `95598工单明细表` becomes the first-version unified calibration source for `G3`
2. All subsequent `G3` regressions can be checked against a fixed taxonomy

## Phase 4: Real-Sample Regression and First-Round Report

### Objective

First close the loop with the `P0` primary sample, then use one extension sample to verify whether the `G3` contract is reusable.

### WS5: G3 Real-Sample Regression and Reporting

#### Task 26

Regenerate `95598工单明细表`.

#### Task 27

Check the result against a uniform set of criteria:

1. archetype
2. bootstrap
3. main request
4. pagination plan
5. enrichment requests
6. join keys
7. export plan
8. localhost / host bridge separation
9. readiness / blocker

#### Task 28

Produce the `G3 P0 validation report`. Only the following three conclusions are allowed:

1. `Pass`
2. `Fail-closed with accurate reasons`
3. `Misjudged, remediation required`

#### Task 29

Regenerate `95598、12398、流程超期风险工单明细`.

#### Task 30

Compare it with the `P0` sample to determine:

1. Which contracts are reusable
2. Which blockers are common to the family
3. Which complexities are specific to the extension sample

#### Task 31

Produce the `G3 first-round family expansion report`.

### Deliverables

1. `G3` P0 sample generation result
2. `G3` P0 validation report
3. `G3` extension sample generation result
4. `G3` first-round family expansion report

### Acceptance Criteria

1. `95598工单明细表` reaches at least "structure fully recovered" or "fail-closed with accurate reasons"
2. The extension sample is no longer crudely flattened into an ordinary paginated table
3. `G3` failure results are explainable
4. `G3` has at least a first-version family reuse baseline

## File-Level Targets

Executing this plan touches at least the following asset types:

1. `docs/superpowers/specs/`
2. `docs/superpowers/plans/`
3. `docs/superpowers/reports/`
4. Implementations under `src/generated_scene/` related to the evidence layer, contract layer, and readiness
5. `tests/fixtures/generated_scene/`
6. `tests/`

## Completion Criteria

This plan is complete when:

1. `G3` has a formal boundary definition
2. `G3` has a minimal evidence layer and a minimal contract
3. `G3` has independent gates and fail-closed criteria
4. `95598工单明细表` has become the `P0-3 canonical`
5. The first round of `G3` real-sample regression has produced a formal conclusion

## Next Step

After this plan completes, the subsequent order is fixed as:

1. If the `G3` `P0` and first-round extension samples are stable, decide whether to add a second `G1-E` sample
2. Then decide whether to proceed to independent design and planning for `G6`

@@ -0,0 +1,77 @@

# G6 Host Bridge Workflow Plan

> Date: 2026-04-18
> Status: Initial implementation slice

## Plan Intent

Start the `G6` line after `G1-E` second-sample reuse has been validated.

This plan implements the first safe slice only: classification, evidence separation, readiness gates, and fail-closed behavior.

## Phase 0: Boundary Freeze

Tasks:

1. keep `电能表现场检验完成率指标报表` as the P0 boundary sample
2. define the repo-local representative fixture
3. keep `G6` separate from `G1`, `G1-E`, `G3`, `G7`, and `G8`

Deliverables:

1. `G6` design doc
2. `G6` plan doc
3. repo-local representative fixture

Acceptance criteria:

1. `G6` is no longer discussed as a `G1` candidate
2. `G6` is not treated as a generic localhost-pollution case

## Phase 1: Analyzer Classification

Tasks:

1. add `host_bridge_workflow` as a workflow archetype
2. detect explicit host bridge actions
3. keep `localhost:*` as supporting host-runtime evidence
4. ensure explicit host bridge signals outrank `G1-E`
5. ensure ordinary localhost export noise does not become `G6`

Acceptance criteria:

1. `g6_host_bridge_workflow` fixture classifies as `host_bridge_workflow`
2. `bootstrap_localhost_pollution` remains a non-G6 business scene

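
The Phase 1 classification rules reduce to a simple priority: an explicit host bridge action decides the archetype, while `localhost:*` URLs alone never do. The sketch below illustrates that ordering only; the `facts` dictionary, its keys, and the `classify_archetype` name are hypothetical and do not describe the analyzer's real data model.

```python
def classify_archetype(facts: dict) -> str:
    """Pick a workflow archetype from hypothetical analyzer facts.

    facts["host_bridge_actions"]: explicit host bridge calls found in the scene
    facts["localhost_urls"]:      localhost:* URLs, kept as supporting evidence only
    facts["business_archetype"]:  what the scene would be without host signals
    """
    # Explicit host bridge signals outrank any business archetype, including G1-E.
    if facts.get("host_bridge_actions"):
        return "host_bridge_workflow"
    # localhost:* alone is supporting evidence and must not promote a scene to G6.
    return facts.get("business_archetype", "unknown")
```
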
## Phase 2: Fail-Closed Gate

Tasks:

1. add readiness risks for missing or unsupported G6 contract
2. add `g6_host_bridge_detected`
3. add `g6_fail_closed`
4. block generation before runnable output

Acceptance criteria:

1. `G6` generation returns a controlled error
2. error message includes `host_bridge_workflow`
3. no pseudo-runnable skill is produced

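
The Phase 2 acceptance criteria describe a controlled error whose message names the archetype, raised before any runnable output exists. A minimal sketch of that behavior follows; `GenerationBlocked` and `generate_skill` are hypothetical names, and the returned dictionary stands in for whatever the real generator emits. Only the archetype and gate names come from the plan.

```python
class GenerationBlocked(Exception):
    """Controlled error raised instead of emitting a pseudo-runnable skill."""

    def __init__(self, archetype: str, gates: dict):
        self.archetype = archetype
        self.gates = gates
        super().__init__(f"generation blocked: unsupported archetype {archetype}")


def generate_skill(analysis: dict) -> dict:
    """Hypothetical generator entry point that fails closed for G6 scenes."""
    if analysis.get("archetype") == "host_bridge_workflow":
        # Block before any runnable artifact is produced.
        raise GenerationBlocked(
            "host_bridge_workflow",
            {"g6_host_bridge_detected": True, "g6_fail_closed": True},
        )
    return {"runnable": True, "archetype": analysis.get("archetype")}
```

A regression test can then assert both that the exception is raised and that its message contains `host_bridge_workflow`.
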
## Phase 3: Regression

Tasks:

1. run scene generator regression
2. run family regression
3. run family policy regression
4. run canonical regression

Acceptance criteria:

1. all target regressions pass
2. no `G1-E/G3/G2` behavior regresses

## Next Step

After this safe G6 slice, continue to `G7 多接口盘点汇总型` boundary assessment unless G6 runtime implementation becomes the selected priority.

@@ -0,0 +1,68 @@

# G7 Multi Endpoint Inventory Plan

> Date: 2026-04-18
> Status: Initial implementation slice

## Plan Intent

Start `G7` after the safe `G6` classification slice.

This plan only establishes boundary classification and fail-closed behavior. It does not implement runnable multi-endpoint inventory aggregation.

## Phase 0: Boundary Freeze

Tasks:

1. use `计量资产库存统计` as the P0 boundary sample
2. define a repo-local representative fixture
3. keep `G7` separate from `G1`, `G1-E`, `G6`, and `G8`

Acceptance criteria:

1. `G7` is no longer a `G1` candidate
2. `G7` is not confused with host bridge workflow

## Phase 1: Analyzer Classification

Tasks:

1. add `multi_endpoint_inventory` as a workflow archetype
2. detect inventory endpoint families
3. classify scenes with three or more inventory endpoints as `G7`

Acceptance criteria:

1. `g7_multi_endpoint_inventory` fixture classifies as `multi_endpoint_inventory`
2. inventory endpoint names include `assetStatsQueryMeter` and `assetStatsQueryJlGnModule`

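
The "three or more inventory endpoints" rule can be sketched as a count over detected endpoint names. The prefix used below is an assumption inferred from the two names in the acceptance criteria (`assetStatsQueryMeter`, `assetStatsQueryJlGnModule`); the real detector may recognize inventory endpoint families differently, and both function names are hypothetical.

```python
# Assumption: inventory endpoints share this prefix, inferred from the two
# endpoint names listed in the acceptance criteria.
INVENTORY_PREFIX = "assetStatsQuery"
G7_MIN_ENDPOINTS = 3  # "three or more inventory endpoints"


def inventory_endpoints(endpoint_names):
    """Return the distinct inventory endpoint names, sorted for stable output."""
    return sorted({n for n in endpoint_names if n.startswith(INVENTORY_PREFIX)})


def is_multi_endpoint_inventory(endpoint_names) -> bool:
    """True when the scene has enough inventory endpoints to classify as G7."""
    return len(inventory_endpoints(endpoint_names)) >= G7_MIN_ENDPOINTS
```

Counting distinct names rather than raw occurrences avoids promoting a scene that merely calls one inventory endpoint several times.
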
## Phase 2: Fail-Closed Gate

Tasks:

1. add `g7_inventory_endpoints_detected`
2. add `g7_fail_closed`
3. block generation before runnable output

Acceptance criteria:

1. generation returns a controlled error
2. error message includes `multi_endpoint_inventory`
3. no pseudo-runnable skill is produced

## Phase 3: Regression

Tasks:

1. run scene generator regression
2. run family regression
3. run family policy regression
4. run canonical regression

Acceptance criteria:

1. all target regressions pass
2. no existing family baseline regresses

## Next Step

After this safe G7 slice, continue to `G8 抓取落库分析出文档型` boundary assessment.

@@ -0,0 +1,70 @@

# G8 Local Document Pipeline Plan

> Date: 2026-04-18
> Status: Initial implementation slice

## Plan Intent

Start `G8` after the safe `G7` classification slice.

This plan only establishes boundary classification and fail-closed behavior. It does not implement runnable local storage, SQL, or document generation orchestration.

## Phase 0: Boundary Freeze

Tasks:

1. use `95598供电服务月报` as the P0 boundary sample
2. define a repo-local representative fixture
3. keep `G8` separate from `G1`, `G1-E`, `G6`, `G7`, and `G3`

Acceptance criteria:

1. `G8` is no longer a `G1` candidate
2. `G8` is not collapsed into generic host bridge workflow

## Phase 1: Analyzer Classification

Tasks:

1. add `local_doc_pipeline` as a workflow archetype
2. detect `definedSqlQuery`
3. detect `docExport`
4. detect `selectData` / local config service persistence
5. prioritize `G8` over `G6` when both signals exist

Acceptance criteria:

1. `g8_local_doc_pipeline` fixture classifies as `local_doc_pipeline`
2. local pipeline actions are visible in deterministic facts

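
The priority rule in task 5 (prefer `G8` over `G6` when both signals exist) can be sketched as an ordered check. This is an illustration under assumptions: the function name is hypothetical, the scene is represented as a plain list of action names, and treating any single one of the three signals as sufficient is a guess, since the plan does not say how many are required.

```python
# Action names taken from the Phase 1 task list.
G8_SIGNALS = ("definedSqlQuery", "docExport", "selectData")


def classify_boundary(actions, host_bridge: bool):
    """Return 'local_doc_pipeline', 'host_bridge_workflow', or None.

    Assumption: one G8 signal is enough; the real threshold is not specified.
    """
    local_pipeline_actions = [a for a in actions if a in G8_SIGNALS]
    # G8 is checked first so it wins when both G8 and G6 signals are present.
    if local_pipeline_actions:
        return "local_doc_pipeline"
    if host_bridge:
        return "host_bridge_workflow"
    return None
```
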
## Phase 2: Fail-Closed Gate

Tasks:

1. add `g8_local_doc_pipeline_detected`
2. add `g8_fail_closed`
3. block generation before runnable output

Acceptance criteria:

1. generation returns a controlled error
2. error message includes `local_doc_pipeline`
3. no pseudo-runnable skill is produced

## Phase 3: Regression

Tasks:

1. run scene generator regression
2. run family regression
3. run family policy regression
4. run canonical regression

Acceptance criteria:

1. all target regressions pass
2. no existing family baseline regresses

## Next Step

After this safe G8 slice, the boundary-reassignment sequence has a code-backed fail-closed guard for `G1-E`, `G6`, `G7`, and `G8`.

@@ -0,0 +1,215 @@

# Line-Loss Family Variant Expansion Plan

> **Status:** Draft
> **Date:** 2026-04-18
> **Author:** Codex
> **Upstream Inputs:**
> [2026-04-18-g2-family-expansion-plan.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/plans/2026-04-18-g2-family-expansion-plan.md)
> [2026-04-18-g2-family-expansion-third-round-report.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/reports/2026-04-18-g2-family-expansion-third-round-report.md)

## Plan Intent

This plan builds on the three line-loss subtypes that have already converged, `G2-A/G2-B/G2-C`, and moves the goal from "getting three representative samples to work" to "expanding to more line-loss variants in a repeatable way".

What has been demonstrated so far:

1. `G2-A` (dual-mode main report) can be generated
2. `G2-B` (weekly report, single-side mode) can be generated
3. `G2-C` (hybrid linkage) can be generated

The next stage therefore stops fine-tuning these three samples and instead builds the replication chain "line-loss scene -> subtype -> minimal contract -> candidate validation".

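## Success Baseline

The minimum success bar once this plan completes is fixed as:

1. A new batch of real line-loss scenes can be assigned to an existing subtype or a new subtype
2. Every new subtype has a minimal contract standard
3. Each new subtype has at least 2 to 3 real samples that have completed migration validation
4. Samples that cannot be classified, or whose contract is insufficient, continue to `fail-closed`
5. A "line-loss family implementation mapping table" is produced
6. A new round of the line-loss family expansion report is published

## Scope Guardrails

The following boundaries stay fixed during execution:

1. Do not expand to non-line-loss report families
2. Do not handle unified login, hidden-field login, or host transport refactoring
3. Do not perform real intranet manual validation in this plan
4. Do not let this plan sprawl into a one-shot rollout across all 102 scenes
5. Do not go back and overturn the converged `G2-A/G2-B/G2-C` results
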
## Phase Overview

This plan proceeds in five phases:

1. Phase 0: Freeze the line-loss expansion baseline
2. Phase 1: Build the line-loss variant grouping list
3. Phase 2: Establish minimal contract standards for new variants
4. Phase 3: Extend fixtures / classification / generation chain group by group
5. Phase 4: Regress real samples and publish the expansion report

Execution order is fixed as:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3 -> Phase 4`

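## Phase 0: Freeze the Expansion Baseline

### Objective

Freeze the line-loss family baseline achieved so far, as the starting point for subsequent horizontal replication.

### Tasks

1. Freeze the current definitions of `G2-A/G2-B/G2-C`
2. Freeze the minimal-contract facts for these three subtypes
3. Freeze the decision that this plan does not return to the "get the primary sample working" stage

### Exit Criteria

1. `G2-A/G2-B/G2-C` are treated as the established family baseline
2. Expansion work is explicitly framed as "replicating to more line-loss variants"
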
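## Phase 1: Build the Line-Loss Variant Grouping List

### Objective

First group the line-loss scenes that still need to be covered, rather than adding samples piecemeal.

### Tasks

1. From the existing line-loss scenes, select the candidate samples closest to the current family
2. Group by structure rather than by name, into at least:
   - Dual-mode main report
   - Weekly/daily report, single-side mode
   - Ranking/detail main chain
   - Line-loss main chain plus external-system linkage
   - Anomaly diagnosis / detail drill-down
3. Pick 2 to 3 representative samples per group to start with

### Deliverables

1. Line-loss variant grouping list
2. Representative sample list for each group

### Exit Criteria

1. Work no longer proceeds piecemeal, scene by scene
2. Subsequent remediation proceeds group by group
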
## Phase 2: Establish Minimal Contract Standards for New Variants

### Objective

For each new line-loss variant group, define what "minimally usable" means before touching the generator.

### Tasks

1. Define a minimal contract for each group, specifying at least:
   - Main endpoint
   - Request template
   - Response path
   - Key fields or column defs
   - Normalize / required fields
2. Specify which chains belong to the main contract
3. Specify which chains count as extension evidence or risk evidence

### Deliverables

1. Line-loss variant minimal contract table
2. Candidate-validation threshold for each group

### Exit Criteria

1. Every group has a uniform decision standard
2. Subsequent development no longer relies on ad hoc, per-sample judgment calls

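
The minimal contract fields listed in the Phase 2 tasks lend themselves to a simple completeness check that a candidate-validation threshold could build on. The sketch below is illustrative: the field keys and the `missing_contract_fields` function are hypothetical names chosen to mirror the list, not identifiers from the code base.

```python
# Hypothetical keys mirroring the minimal contract list in the Phase 2 tasks.
MINIMAL_CONTRACT_FIELDS = (
    "main_endpoint",
    "request_template",
    "response_path",
    "column_defs",      # key fields or column defs
    "required_fields",  # normalize / required fields
)


def missing_contract_fields(contract: dict) -> list:
    """Return the minimal-contract fields that are absent or empty."""
    return [name for name in MINIMAL_CONTRACT_FIELDS if not contract.get(name)]
```

A group whose representative samples all return an empty list would meet the minimal contract; anything else stays fail-closed.
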
## Phase 3: Extend the Generation Chain Group by Group

### Objective

Integrate the new variant groups into the analyzer / generator / readiness one group at a time.

### Tasks

1. For each group, start by adding fixtures
2. Next, add tests for the group
3. Then add subtype classification
4. Then add minimal-contract recovery
5. Then add readiness grading

### Rules

1. Every group must have fixtures before any logic changes
2. Multiple groups must not sprawl in parallel without boundaries
3. Finish one group before starting the next

### Deliverables

1. New line-loss variant fixtures
2. New family regression tests
3. Corresponding analyzer / generator / readiness extensions

### Exit Criteria

1. At least 1 to 2 new classes of line-loss variants can be generated
2. The existing `G2-A/G2-B/G2-C` do not regress

## Phase 4: Real-Sample Regression and Expansion Report

### Objective

Validate the expanded line-loss family capability against real samples instead of stopping at the fixture level.

### Tasks

1. Regenerate the representative samples of each group
2. Compare:
   - Subtype classification
   - bootstrap
   - request contract
   - response / column / normalize contract
   - readiness
   - Whether the sample enters the candidate-validation list
3. Publish the line-loss family expansion regression report

### Deliverables

1. Real-sample generation results
2. Line-loss family expansion regression report
3. Updated candidate-validation list

### Acceptance Criteria

1. At least two new line-loss variant groups enter the candidate-validation stage
2. Scenes that cannot be classified continue to `fail-closed`
3. The original three subtypes `G2-A/G2-B/G2-C` do not regress

## Workstream Breakdown

This plan is best delivered through the following workstreams:

1. `WS1` Line-loss variant inventory and grouping
2. `WS2` Minimal contract design for new variants
3. `WS3` Fixture / regression test expansion
4. `WS4` analyzer / generator / readiness expansion
5. `WS5` Real-sample regression and reporting

## File-Level Targets

Executing this plan is expected to touch the following asset types:

1. `docs/superpowers/plans/`
2. `docs/superpowers/reports/`
3. `tests/fixtures/generated_scene/`
4. `tests/`
5. `src/generated_scene/`

## Completion Criteria

This plan is complete when:

1. The line-loss family is explainable beyond the three representative samples
2. An expansion method of "replicating by group" has been established, replacing sample-by-sample patching
3. Any decision to expand to a broader set of scenes can build on the results of this line-loss family expansion

@@ -0,0 +1,237 @@

# Scene Generator Ops Console Plan

> **Status:** Draft
> **Date:** 2026-04-18
> **Author:** Codex
> **Upstream Spec:** [2026-04-18-scene-generator-ops-console-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-18-scene-generator-ops-console-design.md)

## Plan Intent

This plan narrows the scene generator page from a "developer debugging console" into an "operations-facing workbench for generating scene Skills", and breaks the information architecture, Chinese localization, visibility layering, and interaction flow already settled in the upstream `spec` into executable implementation steps.

This plan covers only the front-end page layer and page-interaction layer. It does not extend to the scene generator's back-end analysis logic or to changes in the generation protocol.

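## Scope Guardrails

The following boundaries stay fixed while this plan is executed:

1. Do not modify the scene generator back-end interface protocol
2. Do not rewrite the analysis algorithm or the Skill generation logic
3. Do not delete existing debug information; only adjust its default visibility and display hierarchy
4. Do not expand this plan into building a new front-end design system

## Primary Outcome

The direct goal of this plan is to let operations staff complete the following without having to understand low-level terms such as `Scene IR`, `workflowArchetype`, and `requestTemplate`:

1. Select a scene directory
2. Start analysis
3. Judge whether generation is possible
4. Start generation
5. View the result directory or the failure reason
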
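## Workstreams

This plan is split into four workstreams:

1. `WS1` Information architecture and page layering
2. `WS2` Chinese localization and business-state mapping
3. `WS3` Log, result, and risk-summary consolidation
4. `WS4` Debug-information folding and two-tier experience wrap-up

## Phase Overview

The plan proceeds in five phases:

1. Phase 0: Freeze page goals and conventions
2. Phase 1: Complete the information architecture restructuring
3. Phase 2: Complete Chinese localization and business-state mapping
4. Phase 3: Complete log and result area consolidation
5. Phase 4: Complete debug-layer folding and overall acceptance
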
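## Phase 0: Freeze Page Goals and Conventions

### Objective

First freeze the page's target audience, default usage mode, primary status presentation, and the boundaries between first-, second-, and third-level information, so that layout and positioning are not both changing during implementation.

### Tasks

1. Freeze the page role definition: operations executors first, developers / debuggers second
2. Freeze the page positioning: an operations workbench, not a developer debugging console
3. Freeze the default mode: operations mode by default, technical details folded
4. Freeze the first-/second-/third-level information boundaries
5. Freeze the conventions for status presentation, scene-type mapping, and executability mapping

### Deliverables

1. Page role statement
2. Information-level boundary statement
3. Status and scene-type mapping table
4. Visibility strategy statement

### Exit Criteria

1. The page's default audience no longer wavers
2. The boundary between first-level information and technical details no longer wavers
3. The Chinese status and type mapping conventions are frozen
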
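## Phase 1: Complete the Information Architecture Restructuring

### Objective

Restructure the current page, which mixes a configuration area, an analysis area, the generation log, and technical fields, into a workbench structure that operations staff can understand.

### Tasks

1. Restructure the top overview area
2. Restructure the left main-action area
3. Restructure the right result-summary area
4. Restructure the bottom execution-process area
5. Reserve a technical-details area, folded by default

### Required Sections

The first-screen structure is fixed as:

1. Top overview area
2. Left main-action area
3. Right result-summary area
4. Bottom execution-process area
5. Technical-details area

### Deliverables

1. Page block structure implementation
2. Block titles and block order implementation
3. Primary visual path for the first-level flow

### Acceptance Criteria

1. The first screen no longer exposes a large amount of technical detail at once
2. The default operations flow can be completed as "select directory -> analyze -> generate -> view results"
3. The page structure shifts from a "debug panel" to a "workbench"
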
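## Phase 2: Complete Chinese Localization and Business-State Mapping

### Objective

Replace the many English titles, buttons, and technical terms on the current page with operations-facing Chinese wording, and map low-level technical states to business-readable states.

### Tasks

1. Replace the page title, subtitle, and block titles
2. Replace button labels and input placeholder text
3. Replace log label text
4. Build the `Readiness` Chinese mapping
5. Build the archetype Chinese mapping

### Required Mappings

The minimum mapping set includes the following (English glosses of the Chinese UI labels are given in parentheses):

1. `Readiness A/B/C -> 可直接生成 / 可生成但需确认 / 暂不建议生成` (ready to generate / can generate but needs confirmation / generation not recommended for now)
2. `single_request_table -> 单页报表` (single-page report)
3. `multi_mode_request -> 多模式报表` (multi-mode report)
4. `paginated_enrichment -> 分页明细` (paginated detail)
5. `page_state_eval -> 页面检测` (page check)
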
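
The required mappings amount to two lookup tables plus a fallback for values that have no label yet. The sketch below is written in Python for illustration only, since the page itself is front-end code; the function name and the two fallback labels are assumptions, while the keys and Chinese labels are the ones listed above.

```python
# Readiness grade -> operations-facing label (from the required mappings).
READINESS_LABELS = {
    "A": "可直接生成",
    "B": "可生成但需确认",
    "C": "暂不建议生成",
}

# Workflow archetype -> scene-type label (from the required mappings).
ARCHETYPE_LABELS = {
    "single_request_table": "单页报表",
    "multi_mode_request": "多模式报表",
    "paginated_enrichment": "分页明细",
    "page_state_eval": "页面检测",
}

# Assumption: fallback labels for unmapped values are not defined by the plan.
FALLBACK_READINESS = "未知状态"
FALLBACK_ARCHETYPE = "其他类型"


def display_labels(readiness: str, archetype: str):
    """Return (readiness label, scene-type label) for the first screen."""
    return (
        READINESS_LABELS.get(readiness, FALLBACK_READINESS),
        ARCHETYPE_LABELS.get(archetype, FALLBACK_ARCHETYPE),
    )
```

Keeping both tables in one place is one way to address the plan's risk that inconsistent status mapping leaves technical terms showing under Chinese titles.
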
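### Deliverables

1. Chinese titles and buttons
2. Chinese status mapping
3. Chinese scene-type mapping
4. Chinese risk and result copy

### Acceptance Criteria

1. The first screen no longer shows large areas of untranslated English
2. Operations staff can directly understand the main statuses and scene types
3. Technical terms no longer serve as the primary copy on the home screen
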
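## Phase 3: Complete Log and Result Area Consolidation

### Objective

Make the page's log and result areas serve "execution and troubleshooting" first, rather than raw streaming debug output.

### Tasks

1. Rename `Generation Log` to `执行过程` (Execution Process)
2. Localize the `status / log / complete / error` labels into Chinese
3. Condense the raw stream log into a Chinese summary log as the default view
4. Improve the success/failure status display in the `生成结果` (Generation Result) area
5. Make the output directory and result-file entry points more prominent

### Deliverables

1. Chinese summary log
2. Generation result card
3. Failure reason summary
4. Output directory entry point

### Acceptance Criteria

1. Operations staff can understand the execution process without reading low-level SSE technical messages
2. On success, the result directory can be found quickly
3. On failure, a Chinese failure reason is visible quickly
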
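## Phase 4: Complete Debug-Layer Folding and Overall Acceptance

### Objective

Retain development and troubleshooting capability, but sink it by default into a debug layer so it does not interfere with the operations first screen.

### Tasks

1. Move `Scene IR`, `requestTemplate`, `evidence`, `workflow steps`, and similar items into the technical-details area
2. Move `scene-id`, `scene-kind`, `targetUrl override`, and `workflow archetype override` into advanced settings
3. Verify the default visibility logic
4. Verify the experience boundary between operations mode and debug mode
5. Complete the final page acceptance

### Deliverables

1. Foldable advanced-settings area
2. Foldable technical-details area
3. Final page visibility strategy implementation

### Acceptance Criteria

1. The operations home screen carries only the status summary, actions, and results
2. Developers can still view the full technical information through the folded areas
3. The experience of "the default first screen is a technical debug panel" no longer occurs
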
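## File-Level Planning Targets

Subsequent implementation of this plan covers at least the following assets:

1. [sg_scene_generator.html](D:/data/ideaSpace/rust/sgClaw/claw-new/frontend/scene-generator/sg_scene_generator.html)
2. Front-end scripts related to page copy and visibility logic
3. Front-end styles and rendering logic related to page titles, block structure, and status mapping

## Completion Criteria

This plan is complete when:

1. The page's default form has changed from a "developer debugging console" to an "operations workbench"
2. The first screen has completed Chinese localization and business-state mapping
3. The default operations flow can be completed on the first screen without relying on the technical-details area
4. Debug information is retained but no longer floods the home screen by default
5. Failure reasons, risk hints, and the result directory are directly understandable to operations staff

## Risks and Control Points

1. If only the copy changes and the information architecture does not, the page will remain bloated
2. If fields are merely hidden without redoing the result summary, operations staff still cannot quickly judge whether generation is possible
3. If too much technical information is removed, development and troubleshooting efficiency will suffer
4. If the status mapping is inconsistent, the page will feel disjointed, with low-level technical semantics still showing under Chinese titles

## Out of Plan

The following items are outside the direct delivery scope of this plan:

1. Refactoring the scene generator back-end analysis logic
2. Changes to the Skill generation protocol
3. New server-side interfaces for the page
4. Operations permissions, account systems, or multi-role access control
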
@@ -0,0 +1,277 @@

# sgClaw Scene Skill Post-Roadmap Execution Plan

> **Status:** Draft
> **Date:** 2026-04-18
> **Author:** Codex
> **Upstream Spec:** [2026-04-18-scene-skill-post-roadmap-execution-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-18-scene-skill-post-roadmap-execution-design.md)

## Plan Intent

This plan starts after the closure of the current `60-to-90 roadmap`.

Its purpose is not to reopen `G1/G2/G3` implementation, but to:

1. unify current execution state
2. start real-sample validation
3. plan the next bounded roadmap

## Scope Guardrails

1. Do not reopen completed `G1/G2/G3` repo-local baseline implementation.
2. Do not keep expanding fixtures as the primary mode of progress.
3. Do not silently pull `G4/G5` into implementation.
4. Do not directly implement unified login recovery in this plan.
5. Do not treat the old roadmap as still open-ended.
6. Phase 1 execution-board work must stay minimal and exist only to support Phase 2 real-sample validation.
7. Once `G2`, `G1-E`, and `G3` each have at least one mappable real sample, execution must move immediately into Phase 2.
8. Any new asset that does not directly support real-sample validation is deferred to Phase 3 or Phase 4.

## Workstreams

1. `WS1` Current Execution Board Unification
2. `WS2` Real Sample Validation
3. `WS3` Boundary and Runtime Gap Planning
4. `WS4` Next Roadmap Definition

## Phase Overview

1. Phase 0: Freeze Handover Boundary
2. Phase 1: Build Current Execution Board
3. Phase 2: Start Real Sample Validation
4. Phase 3: Define Boundary and Runtime Entry Rules
5. Phase 4: Publish the Next Roadmap

Execution order is fixed as:

`Phase 0 -> Phase 1 -> Phase 2 -> Phase 3 -> Phase 4`

## Phase 0: Freeze Handover Boundary

### Objective

Freeze the boundary between the completed roadmap and the next-stage work.

### Tasks

1. Freeze current roadmap completion status.
2. Freeze current mainline family status for `G2`, `G1-E`, and `G3`.
3. Freeze current boundary family status for `G6/G7/G8`.
4. Freeze current deferred status for `G4/G5`.

### Deliverables

1. roadmap handover snapshot
2. next-stage scope statement
3. current family-state matrix

### Acceptance Criteria

1. old and new roadmap boundaries are explicit
2. next-stage work is no longer mixed into the old roadmap

## Phase 1: Build Current Execution Board

### Objective

Create the minimum authoritative execution board required to start real-sample validation for the current `102-scene` status.

### WS1

#### Task 1

Build one `102-scene current execution board`.

#### Task 2

Define the stable scene status vocabulary:

1. `promoted-baseline`
2. `promoted-expansion`
3. `boundary-family`
4. `deferred`
5. `degraded`
6. `unvalidated`

#### Task 3

Map current `G2/G1-E/G3` scene promotions into the board.

#### Task 4

Generate a snapshot-vs-current diff asset.

#### Task 5

Stop Phase 1 immediately after `G2`, `G1-E`, and `G3` each have at least one mappable real sample entry in the board.

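
The status vocabulary from Task 2 and the stop rule from Task 5 can both be checked mechanically against the board. The sketch below assumes a hypothetical board representation, a list of rows with `scene`, `family`, `status`, and `real_sample` keys; the function names are also illustrative. Only the status values and family names come from the plan.

```python
# Stable scene status vocabulary from Task 2.
STATUSES = {
    "promoted-baseline", "promoted-expansion", "boundary-family",
    "deferred", "degraded", "unvalidated",
}
MAINLINE_FAMILIES = ("G2", "G1-E", "G3")


def unlabeled_scenes(board):
    """Return scenes whose status is missing or outside the vocabulary."""
    return [row["scene"] for row in board if row.get("status") not in STATUSES]


def phase1_can_stop(board) -> bool:
    """True once every mainline family has at least one mappable real sample."""
    covered = {row["family"] for row in board if row.get("real_sample")}
    return all(family in covered for family in MAINLINE_FAMILIES)
```

Running `phase1_can_stop` after each board update gives an explicit signal for moving into Phase 2 instead of continuing to polish the board.
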
### Deliverables

1. `102-scene current execution board`
2. snapshot-vs-current diff report
3. scene-to-family status mapping

### Acceptance Criteria

1. every scene has one current-state label
2. promoted states are visible without reading multiple assets
3. board status matches current family assets
4. the board is limited to the minimum fields needed by Phase 2 validation records
5. no Phase 1 asset is added unless it directly supports real-sample validation

## Phase 2: Start Real Sample Validation

### Objective

Create the next quality layer above fixture success.

### WS2

#### Task 5

Choose the first real-sample validation set for:

1. `G2`
2. `G1-E`
3. `G3`

#### Task 6

Freeze validation criteria:

1. compile success
2. readiness correctness
3. data correctness
4. output correctness
5. fail-closed correctness

#### Task 7

Create a real-sample validation record template.

#### Task 8

Record first-round real-sample results.

#### Task 9

Write mismatches back into the execution board.

#### Task 10

Reject requests for new board-only assets that do not unblock current validation execution.

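
The five validation criteria from Task 6 suggest an obvious shape for the record template in Task 7: one boolean per criterion plus the scene and family being validated. The sketch below is one possible template, not the project's actual one; the class name, field names, and helper methods are hypothetical.

```python
from dataclasses import dataclass

# One field per validation criterion frozen in Task 6 (field names are illustrative).
CRITERIA = (
    "compile_success",
    "readiness_correctness",
    "data_correctness",
    "output_correctness",
    "fail_closed_correctness",
)


@dataclass
class ValidationRecord:
    scene: str
    family: str
    compile_success: bool
    readiness_correctness: bool
    data_correctness: bool
    output_correctness: bool
    fail_closed_correctness: bool

    def mismatches(self) -> list:
        """Names of the criteria this real sample failed."""
        return [name for name in CRITERIA if not getattr(self, name)]

    def passed(self) -> bool:
        return not self.mismatches()
```

The list returned by `mismatches()` is the kind of explicit, reusable reason that Task 9 would write back into the execution board.
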
### Deliverables

1. real-sample validation plan
2. real-sample record template
3. first-round validation records
4. mismatch taxonomy

### Acceptance Criteria

1. each mainline family has at least one real-sample record
2. real-sample status is separated from fixture status
3. mismatch reasons are explicit and reusable
4. Phase 2 begins as soon as `G2`, `G1-E`, and `G3` each have one mappable real sample

## Phase 3: Define Boundary and Runtime Entry Rules

### Objective

Prepare the next bounded execution scope instead of drifting into it.

### WS3

#### Task 11

Assess `G6/G7/G8` boundary-family readiness for future expansion.

#### Task 12

Define formal entry criteria for `G4/G5`.

#### Task 13

Build a runtime-gap matrix for:

1. login recovery
2. host-runtime integration
3. transport/runtime gaps
4. local document and attachment workflows

#### Task 14

Separate:

1. archetype-family gaps
2. runtime-platform gaps

### Deliverables

1. boundary readiness note
2. deferred family entry criteria
3. runtime gap matrix
4. prioritization note

### Acceptance Criteria

1. `G4/G5` do not enter the next build round without documented criteria
2. runtime gaps are tracked separately from family expansion
3. next implementation scope has an explicit reason

## Phase 4: Publish the Next Roadmap

### Objective

Replace open-ended continuation with a new bounded roadmap.

### WS4

#### Task 15

Write the next-stage design.

#### Task 16

Write the next-stage plan.

#### Task 17

Define milestone ordering.

#### Task 18

Define next-stage completion criteria.

### Deliverables

1. post-roadmap design
2. post-roadmap plan
3. milestone table
4. completion criteria

### Acceptance Criteria

1. new implementation work has a new roadmap
2. the old roadmap is no longer implicitly extended
3. next-stage completion can be judged independently

## Milestone Order

1. Freeze the handover boundary
2. Unify the execution board
3. Start real-sample validation
4. Freeze boundary/runtime entry rules
5. Publish the next roadmap

No new implementation round should begin before milestones 1 to 4 are complete.

No Phase 1 expansion should continue after the minimum board needed for milestone 3 is available.

## Completion Criteria

This plan is complete when:

1. the current roadmap is explicitly closed
2. the execution board is unified
3. real-sample validation is formally underway
4. a new bounded roadmap exists for post-roadmap work

@@ -0,0 +1,128 @@
|
||||
# sgClaw Scene Skill Real Sample Validation Roadmap Plan
|
||||
|
||||
> **Status:** Draft
|
||||
> **Date:** 2026-04-18
|
||||
> **Author:** Codex
|
||||
> **Upstream Spec:** [2026-04-18-scene-skill-real-sample-validation-roadmap-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-18-scene-skill-real-sample-validation-roadmap-design.md)
|
||||
|
||||
## Plan Intent
|
||||
|
||||
This plan starts after the post-roadmap execution board and first-round validation layer are in place.
|
||||
|
||||
Its purpose is to:
|
||||
|
||||
1. execute selected real samples for `G2`, `G1-E`, and `G3`
|
||||
2. use validation outcomes to decide the next bounded implementation scope
|
||||
3. avoid drifting back into fixture-first or asset-first work
|
||||
|
||||
## Scope Guardrails
|
||||
|
||||
1. Do not reopen completed repo-local baseline implementation for `G1/G2/G3`.
|
||||
2. Do not create new board-only assets unless they unblock current validation execution.
|
||||
3. Do not open `G4/G5` implementation before formal entry decisions are documented.
|
||||
4. Do not pull `G6/G7/G8` into the next build round without explicit validation pressure.
|
||||
|
||||
## Workstreams
|
||||
|
||||
1. `WS1` Mainline Real Sample Execution
|
||||
2. `WS2` Validation Result Triage
|
||||
3. `WS3` Boundary Runtime Entry Decision
|
||||
4. `WS4` Deferred Family Entry Decision
|
||||
|
||||
## Phase 0: Execute Mainline Real Samples
|
||||
|
||||
### Objective
|
||||
|
||||
Convert selected `G2`, `G1-E`, and `G3` anchors into executed real-sample records.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. Execute `G2` anchor validation updates from the current mismatch baseline.
|
||||
2. Keep `G1-E` real pass anchor as the current positive baseline.
|
||||
3. Execute the pending `G3` real sample.
|
||||
4. Write all outcomes into the validation record layer.
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. updated real-sample validation records
|
||||
2. updated mismatch taxonomy usage
|
||||
3. updated execution-board validation statuses
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. `G2`, `G1-E`, and `G3` each have executed real-sample records
|
||||
2. no current mainline anchor remains in `selected-not-yet-run`
|
||||
|
||||
## Phase 1: Triage Results Into Scope Decisions
|
||||
|
||||
### Objective
|
||||
|
||||
Use validation results, not fixture status, to choose the next bounded scope.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. classify each mainline family result as `stable`, `mismatch-driven`, or `blocked-by-runtime`
|
||||
2. identify which problems are compiler-family gaps and which are runtime gaps
|
||||
3. define the next recommended scope from validation evidence
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. validation triage report
|
||||
2. next-scope recommendation
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. the next scope is justified by executed validation evidence
|
||||
2. repo-local success no longer acts as the sole decision signal
|
||||
|
||||
## Phase 2: Boundary Runtime Entry Decision
|
||||
|
||||
### Objective
|
||||
|
||||
Decide whether `G6/G7/G8` should stay boundary-only or enter a runtime-focused roadmap.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. compare boundary-family runtime gaps against executed validation pressure
|
||||
2. decide whether any boundary family should enter the next roadmap
|
||||
3. document non-entry decisions explicitly when scope stays closed
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. boundary runtime decision note
|
||||
2. next-roadmap inclusion or exclusion list
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. `G6/G7/G8` entry decisions are explicit
|
||||
2. no boundary family enters by drift
|
||||
|
||||
## Phase 3: Deferred Family Entry Decision
|
||||
|
||||
### Objective
|
||||
|
||||
Decide whether `G4/G5` should remain closed or enter a later roadmap.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. compare deferred-family criteria against current validation pressure
|
||||
2. confirm whether `G4/G5` remain deferred or degraded
|
||||
3. record the decision before any new implementation starts
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. deferred family decision note
|
||||
2. updated next-roadmap scope boundary
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. `G4/G5` entry decisions are explicit
|
||||
2. deferred families do not enter implementation implicitly
|
||||
|
||||
## Completion Criteria
|
||||
|
||||
This plan is complete when:
|
||||
|
||||
1. all selected mainline anchors have executed real-sample records
|
||||
2. the next implementation scope is selected from validation outcomes
|
||||
3. boundary and deferred family entry decisions are documented
|
||||
@@ -0,0 +1,51 @@
|
||||
# 102 Final Coverage Status Rollup Plan
|
||||
|
||||
> Date: 2026-04-19
|
||||
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||
> Parent Layer: `Layer E`
|
||||
> Status: Active
|
||||
|
||||
## Plan Intent
|
||||
|
||||
Create the final 102-scene coverage rollup after residual 13 closure. This plan publishes a candidate/status view only.
|
||||
|
||||
## Fixed Inputs
|
||||
|
||||
1. `tests/fixtures/generated_scene/full_coverage_reconciliation_candidates_2026-04-19.json`
|
||||
2. `tests/fixtures/generated_scene/residual_13_reconciliation_candidates_2026-04-19.json`
|
||||
3. `tests/fixtures/generated_scene/boundary_residual_hold_decision_2026-04-19.json`
|
||||
4. `tests/fixtures/generated_scene/bootstrap_target_residual_isolation_2026-04-19.json`
|
||||
5. `tests/fixtures/generated_scene/promotion_board_reconciliation_policy_2026-04-19.json`
|
||||
|
||||
## Allowed Files
|
||||
|
||||
1. `tests/fixtures/generated_scene/final_coverage_status_rollup_2026-04-19.json`
|
||||
2. `docs/superpowers/reports/2026-04-19-102-final-coverage-status-rollup-report.md`
|
||||
|
||||
## Forbidden Files
|
||||
|
||||
1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||
2. `src/generated_scene/analyzer.rs`
|
||||
3. `src/generated_scene/generator.rs`
|
||||
|
||||
## Tasks
|
||||
|
||||
1. Load the 102-scene full coverage reconciliation candidate view.
|
||||
2. Load the residual 13 reconciliation candidate view.
|
||||
3. Replace matching residual scenes in the 102 view with residual follow-up candidate statuses.
|
||||
4. Attach boundary/bootstrap overlay decisions where present.
|
||||
5. Produce final coverage summary.
|
||||
6. Publish the rollup JSON.
|
||||
7. Publish the rollup report.
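
Tasks 1 to 5 amount to an overlay merge followed by a count. The sketch below is a minimal illustration only: it assumes each candidate view has already been reduced to a mapping from scene name to status string. The real layout of the fixture JSON files is not shown in this plan, so `build_rollup` and its parameter shapes are hypothetical.

```python
from collections import Counter


def build_rollup(base_view, residual_view, overlays=None):
    """Overlay residual follow-up statuses onto the full candidate view.

    base_view / residual_view: dict of scene name -> candidate status (assumed shape).
    overlays: optional dict of scene name -> boundary/bootstrap decision (assumed shape).
    """
    overlays = overlays or {}
    unknown = set(residual_view) - set(base_view)
    if unknown:
        # A residual scene missing from the full view would silently change the scene count.
        raise ValueError(f"residual scenes missing from full view: {sorted(unknown)}")

    scenes = {}
    for name, status in base_view.items():
        # The residual follow-up status replaces the base status when both exist.
        record = {"status": residual_view.get(name, status)}
        if name in overlays:
            record["overlay"] = overlays[name]
        scenes[name] = record

    summary = Counter(record["status"] for record in scenes.values())
    return {"scenes": scenes, "summary": dict(summary), "total": len(scenes)}
```

Because the merge only replaces statuses of scenes already present, the total stays equal to the size of the base view, which is what completion criterion 1 relies on.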
|
||||
|
||||
## Completion Criteria
|
||||
|
||||
1. Final rollup contains `102` scenes.
|
||||
2. Final summary has `95` framework auto-pass candidates and `7` structured fail-closed candidates.
|
||||
3. There are `0` source-unreadable, unsupported-family, missing-source, and misclassified-unresolved records.
|
||||
4. Official execution board is not modified.
|
||||
5. Report names the next bounded step.
|
||||
|
||||
## Stop Statement
|
||||
|
||||
Stop after the final coverage rollup JSON and report are published. Do not update the official execution board under this plan.
|
||||
@@ -0,0 +1,42 @@
|
||||
# 102 Framework Closure Rollup Plan
|
||||
|
||||
> Date: 2026-04-19
|
||||
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||
> Parent Sequence: `2026-04-19-final-2-residual-child-plan-sequence-plan.md`
|
||||
> Status: Draft
|
||||
|
||||
## Plan Intent
|
||||
|
||||
Publish the final 102-scene framework closure rollup after the final-2 residual roadmaps and board refresh are complete.
|
||||
|
||||
## Fixed Inputs
|
||||
|
||||
1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||
2. optional `tests/fixtures/generated_scene/final_2_official_board_reconciliation_refresh_2026-04-19.json`
|
||||
|
||||
## Allowed Files
|
||||
|
||||
1. `tests/fixtures/generated_scene/scene_skill_102_framework_closure_rollup_2026-04-19.json`
|
||||
2. `docs/superpowers/reports/2026-04-19-scene-skill-102-framework-closure-rollup-report.md`
|
||||
|
||||
## Forbidden Files
|
||||
|
||||
1. `src/generated_scene/analyzer.rs`
|
||||
2. `src/generated_scene/generator.rs`
|
||||
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||
|
||||
## Tasks
|
||||
|
||||
1. Load official board.
|
||||
2. Count framework statuses.
|
||||
3. List any remaining structured fail-closed scenes and their named next actions.
|
||||
4. Verify unresolved count is zero.
|
||||
5. Publish closure rollup JSON and report.
|
||||
|
||||
## Expected Delta
|
||||
|
||||
No implementation delta. This is the final reporting layer.
|
||||
|
||||
## Stop Statement
|
||||
|
||||
Stop after publishing the 102 framework closure rollup. Do not start another runtime roadmap under this plan.
|
||||
@@ -0,0 +1,62 @@
|
||||
# 102 Full Coverage Follow-Up Sweep And Reconciliation Plan
|
||||
|
||||
> Date: 2026-04-19
|
||||
> Status: Draft
|
||||
> Parent Framework Plan: `docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
|
||||
> Parent Layer: `Layer E`
|
||||
> Upstream Design: `docs/superpowers/specs/2026-04-19-102-full-coverage-followup-sweep-and-reconciliation-design.md`
|
||||
|
||||
## Plan Intent
|
||||
|
||||
Run one fixed full 102-scene follow-up sweep after Route 2 through Route 6 have closed, then publish a policy-governed reconciliation candidate view.
|
||||
|
||||
## Fixed Inputs
|
||||
|
||||
1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||
2. `tests/fixtures/generated_scene/g3_enrichment_request_closure_followup_2026-04-19.json`
|
||||
3. `tests/fixtures/generated_scene/g3_export_plan_closure_followup_2026-04-19.json`
|
||||
4. `tests/fixtures/generated_scene/g3_residual_contract_closure_2026-04-19.json`
|
||||
5. `tests/fixtures/generated_scene/g2_remaining_fail_closed_closure_followup_2026-04-19.json`
|
||||
6. `tests/fixtures/generated_scene/g1e_remaining_fail_closed_closure_followup_2026-04-19.json`
|
||||
7. `tests/fixtures/generated_scene/boundary_fail_closed_decision_2026-04-19.json`
|
||||
8. `tests/fixtures/generated_scene/promotion_board_reconciliation_policy_2026-04-19.json`
|
||||
|
||||
## Allowed Files
|
||||
|
||||
1. follow-up sweep JSON asset
|
||||
2. reconciliation candidate JSON asset
|
||||
3. follow-up sweep report
|
||||
4. reconciliation candidate report
|
||||
|
||||
## Forbidden Files
|
||||
|
||||
1. `src/generated_scene/analyzer.rs`
|
||||
2. `src/generated_scene/generator.rs`
|
||||
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||
4. family implementation assets
|
||||
|
||||
## Tasks
|
||||
|
||||
1. run fixed 102-scene follow-up sweep
|
||||
2. classify raw sweep result
|
||||
3. apply Route 5 decisions where applicable
|
||||
4. apply Route 6 promotion policy to build reconciliation candidate view
|
||||
5. publish coverage delta and remaining-gap report
|
||||
|
||||
## Expected Coverage Delta
|
||||
|
||||
The plan should quantify cumulative delta after Routes 2, 3, and 4.
|
||||
|
||||
## Completion Criteria
|
||||
|
||||
1. total scene count is 102
|
||||
2. every scene has one raw sweep status
|
||||
3. every scene has one reconciliation candidate status
|
||||
4. coverage delta is reported
|
||||
5. official execution board is not modified
|
||||
|
||||
## Stop Statement
|
||||
|
||||
Stop after publishing the follow-up sweep and reconciliation candidate reports.
|
||||
|
||||
Do not start a new implementation route under this plan.
|
||||
197
docs/superpowers/plans/2026-04-19-102-full-sweep-dry-run-plan.md
Normal file
@@ -0,0 +1,197 @@
|
||||
# 102 Full Sweep Dry-Run Plan
|
||||
|
||||
> Date: 2026-04-19
|
||||
> Status: Draft
|
||||
> Upstream Spec: [2026-04-19-102-full-sweep-dry-run-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-102-full-sweep-dry-run-design.md)
|
||||
|
||||
## Plan Intent
|
||||
|
||||
Run one bounded, read-only full sweep over the `102` scene ledger to measure actual generic `scene -> skill` coverage.
|
||||
|
||||
The plan answers:
|
||||
|
||||
`how many of the 102 scenes can the current generic analyzer/generator handle today?`
|
||||
|
||||
## Scope Guardrails
|
||||
|
||||
1. do not change analyzer logic
|
||||
2. do not change generator logic
|
||||
3. do not promote scenes into `scene_execution_board_2026-04-18.json`
|
||||
4. do not add new family baselines
|
||||
5. do not create new family implementation plans
|
||||
6. do not fix failures during this dry-run
|
||||
7. do not run outside the fixed `102` scene set
|
||||
|
||||
## Fixed Inputs
|
||||
|
||||
1. execution board: `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||
2. scene root: `D:/desk/智能体资料/全量业务场景/一平台场景`
|
||||
3. generator command: `cargo run --bin sg_scene_generate`
|
||||
|
||||
## Fixed Outputs
|
||||
|
||||
1. dry-run result: `tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json`
|
||||
2. dry-run output root: `examples/full_sweep_dry_run_2026-04-19`
|
||||
3. report: `docs/superpowers/reports/2026-04-19-102-full-sweep-dry-run-report.md`
|
||||
|
||||
## Workstreams
|
||||
|
||||
1. `WS1` Build Scene Inventory
|
||||
2. `WS2` Run Analyzer/Generator Dry-Run
|
||||
3. `WS3` Classify Results
|
||||
4. `WS4` Publish Coverage Report
|
||||
|
||||
## Phase 0: Freeze Dry-Run Boundary
|
||||
|
||||
### Objective
|
||||
|
||||
Make the dry-run a measurement exercise only.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. freeze the execution board input
|
||||
2. freeze the local scene root
|
||||
3. freeze the dry-run output paths
|
||||
4. explicitly mark the run as read-only with respect to generator behavior and board status
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. fixed input statement
|
||||
2. fixed output statement
|
||||
3. dry-run no-promotion statement
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. no analyzer/generator implementation file is edited for this dry-run
|
||||
2. `scene_execution_board_2026-04-18.json` is not modified by dry-run results
|
||||
3. failures are recorded, not fixed
|
||||
|
||||
## Phase 1: Build Scene Inventory
|
||||
|
||||
### Objective
|
||||
|
||||
Construct a deterministic inventory of all `102` scene names and expected source directories.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. read `scene_execution_board_2026-04-18.json`
|
||||
2. extract all scene entries
|
||||
3. map each scene name to `D:/desk/智能体资料/全量业务场景/一平台场景/<sceneName>`
|
||||
4. check whether each source directory exists
|
||||
5. assign initial inventory status:
|
||||
- `source-present`
|
||||
- `missing-source`
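
The inventory steps above can be sketched as follows. This is a minimal illustration, assuming the scene names have already been extracted from the board (the board's JSON schema is not shown here). The function name and the `sourcePath` / `inventoryStatus` keys are hypothetical.

```python
from pathlib import Path


def build_inventory(scene_names, scene_root):
    """Map each scene name to <scene_root>/<sceneName> and record whether it exists."""
    root = Path(scene_root)
    inventory = []
    for name in sorted(scene_names):  # sorted so repeated runs give the same order
        source = root / name
        status = "source-present" if source.is_dir() else "missing-source"
        inventory.append(
            {"sceneName": name, "sourcePath": str(source), "inventoryStatus": status}
        )
    return inventory
```

A missing directory only changes the status field; the scene still gets a source path and stays in the inventory, so the count check against `102` is unaffected.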
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. inventory section inside `full_sweep_dry_run_2026-04-19.json`
|
||||
2. missing-source list
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. inventory count equals `102`
|
||||
2. every scene has a source path
|
||||
3. missing source does not stop the sweep
|
||||
|
||||
## Phase 2: Run Analyzer/Generator Dry-Run
|
||||
|
||||
### Objective
|
||||
|
||||
Attempt current generic generation for every source-present scene without fixing failures.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. generate a stable safe scene id for each scene
|
||||
2. invoke `sg_scene_generate` for each source-present scene
|
||||
3. write outputs under `examples/full_sweep_dry_run_2026-04-19`
|
||||
4. for successful generation, read `references/generation-report.json`
|
||||
5. for failed generation, capture stderr/stdout and exit code
|
||||
6. continue until all `102` scenes are processed
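
A sweep loop of this kind has two properties worth making explicit: scene ids must be stable across runs, and no single failure may abort the sweep. The sketch below illustrates both under stated assumptions. How `sg_scene_generate` receives its source and output paths is not specified in this plan, so the full command is passed in as an argument list rather than constructed here. The function names and result keys are hypothetical. The 30-second default mirrors the `generator timeout after 30s` reason quoted in the triage plan.

```python
import hashlib
import subprocess


def safe_scene_id(scene_name):
    """Derive a stable, ASCII-only id from a scene name (which may be non-ASCII)."""
    digest = hashlib.sha1(scene_name.encode("utf-8")).hexdigest()
    return f"scene_{digest[:12]}"


def run_one(command, timeout_s=30):
    """Run one generator command and capture the outcome without raising on failure."""
    try:
        proc = subprocess.run(
            command,
            capture_output=True,
            text=True,
            errors="replace",  # tolerate output that is not valid in the local encoding
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return {"exitCode": None, "timedOut": True, "stdout": "", "stderr": ""}
    except OSError as exc:
        # For example the binary could not be started; record it and keep sweeping.
        return {"exitCode": None, "timedOut": False, "stdout": "", "stderr": str(exc)}
    return {
        "exitCode": proc.returncode,
        "timedOut": False,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


def run_sweep(commands_by_scene, timeout_s=30):
    """Process every scene; failures and timeouts are recorded, never re-raised."""
    return {scene: run_one(cmd, timeout_s) for scene, cmd in commands_by_scene.items()}
```

Hashing the name keeps output directory names filesystem-safe even when scene names contain characters that are awkward in paths.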
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. per-scene dry-run execution record
|
||||
2. generated output root for successful scenes
|
||||
3. captured error messages for failed scenes
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. every source-present scene has a generator result
|
||||
2. no failure aborts the full sweep
|
||||
3. generator results are isolated under the dry-run output root
|
||||
|
||||
## Phase 3: Classify Results
|
||||
|
||||
### Objective
|
||||
|
||||
Turn raw dry-run output into actionable coverage categories.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. classify generated `A/B` readiness with no blocker as `auto-pass`
|
||||
2. classify generation blocked by a known gate or contract reason as `fail-closed-known`
|
||||
3. classify obvious family mismatch as `misclassified`
|
||||
4. classify evidence outside current families as `unsupported-family`
|
||||
5. classify absent directories as `missing-source`
|
||||
6. classify read/analyze failures as `source-unreadable`
|
||||
7. compute top blockers by frequency
|
||||
8. compute counts by inferred archetype
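
The classification tasks above can be read as a single decision function plus two frequency counts. The sketch below is illustrative only. The record field names (`sourceExists`, `analyzeFailed`, `outsideCurrentFamilies`, `readiness`, `blockers`, `expectedArchetype`, `inferredArchetype`) and the order in which the checks are applied are assumptions, not taken from the generator's actual report format.

```python
from collections import Counter


def classify(record):
    """Return exactly one dry-run status for a per-scene record (field names assumed)."""
    if not record.get("sourceExists", False):
        return "missing-source"
    if record.get("analyzeFailed", False):
        return "source-unreadable"
    if record.get("outsideCurrentFamilies", False):
        return "unsupported-family"
    expected = record.get("expectedArchetype")
    inferred = record.get("inferredArchetype")
    if expected and inferred and expected != inferred:
        return "misclassified"
    if record.get("readiness") in ("A", "B") and not record.get("blockers"):
        return "auto-pass"
    # Assumption: anything left was blocked by a gate or contract reason.
    return "fail-closed-known"


def summarize(records, top_n=3):
    """Compute status counts, by-archetype counts, and the most frequent blockers."""
    statuses = Counter(classify(r) for r in records)
    blockers = Counter(b for r in records for b in r.get("blockers", []))
    by_archetype = Counter(r.get("inferredArchetype", "unknown") for r in records)
    return {
        "statusCounts": dict(statuses),
        "byArchetype": dict(by_archetype),
        "topBlockers": blockers.most_common(top_n),
    }
```

Because `classify` returns on the first matching rule, each scene receives exactly one status, and the status counts always sum to the number of records.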
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. final dry-run status per scene
|
||||
2. summary counts
|
||||
3. by-archetype counts
|
||||
4. top-blocker list
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. every scene has exactly one final status
|
||||
2. total classified count equals `102`
|
||||
3. every non-pass scene has a reason
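
These three acceptance criteria are mechanical enough to check in code. A minimal sketch, assuming each record carries `sceneName`, `dryRunStatus`, and `reason` fields (`dryRunStatus` and `reason` appear in the triage plan; `sceneName` and the function name are assumptions):

```python
ALLOWED_STATUSES = {
    "auto-pass",
    "fail-closed-known",
    "misclassified",
    "unsupported-family",
    "missing-source",
    "source-unreadable",
}


def check_classification(records, expected_total=102):
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    if len(records) != expected_total:
        problems.append(f"expected {expected_total} records, found {len(records)}")

    names = [r.get("sceneName") for r in records]
    duplicates = sorted({n for n in names if names.count(n) > 1}, key=str)
    if duplicates:
        problems.append(f"scenes with more than one record: {duplicates}")

    for r in records:
        status = r.get("dryRunStatus")
        if status not in ALLOWED_STATUSES:
            problems.append(f"{r.get('sceneName')}: unknown status {status!r}")
        elif status != "auto-pass" and not r.get("reason"):
            problems.append(f"{r.get('sceneName')}: non-pass status without a reason")
    return problems
```

Returning a list of problems rather than raising on the first one fits the measurement-only intent: the check reports every violation in one pass.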
|
||||
|
||||
## Phase 4: Publish Report
|
||||
|
||||
### Objective
|
||||
|
||||
Answer the coverage question without changing project state.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. write `full_sweep_dry_run_2026-04-19.json`
|
||||
2. write `2026-04-19-102-full-sweep-dry-run-report.md`
|
||||
3. report these four headline numbers:
|
||||
- `real-sample executed pass`
|
||||
- `code-backed ledger coverage`
|
||||
- `dry-run auto-pass`
|
||||
- `dry-run actionable coverage`
|
||||
4. list next recommended blocker, but do not start implementation
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. dry-run JSON
|
||||
2. dry-run report
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. report states the actual generic coverage over the `102` scenes
|
||||
2. report separates proven coverage from predicted/dry-run coverage
|
||||
3. report does not promote scene status
|
||||
|
||||
## Completion Criteria
|
||||
|
||||
This plan is complete when:
|
||||
|
||||
1. all `102` scenes are included in the dry-run result
|
||||
2. the dry-run result has stable summary counts
|
||||
3. the report explains the gap between `5/102`, `23/102`, and dry-run coverage
|
||||
4. no generator logic or execution board status is modified
|
||||
|
||||
## Non-Negotiable Stop Rule
|
||||
|
||||
After this dry-run starts:
|
||||
|
||||
1. do not fix generator failures inside the sweep
|
||||
2. do not create new family implementation plans from a single failure
|
||||
3. do not update the execution board automatically
|
||||
4. stop after publishing the dry-run result and report
|
||||
@@ -0,0 +1,240 @@
|
||||
# 102 Full Sweep Dry-Run Triage Plan
|
||||
|
||||
> Date: 2026-04-19
|
||||
> Status: Draft
|
||||
> Upstream Spec: `docs/superpowers/specs/2026-04-19-102-full-sweep-dry-run-triage-design.md`
|
||||
|
||||
## Plan Intent
|
||||
|
||||
Turn the `62` non-pass records from the full sweep into concrete triage buckets while staying measurement-only.
|
||||
|
||||
The plan must not fix generator failures. It only explains them.
|
||||
|
||||
## Fixed Inputs
|
||||
|
||||
1. dry-run result: `tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json`
|
||||
2. dry-run output root: `examples/full_sweep_dry_run_2026-04-19`
|
||||
3. execution board: `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
|
||||
4. scene root: `D:/desk/智能体资料/全量业务场景/一平台场景`
|
||||
|
||||
## Fixed Outputs
|
||||
|
||||
1. triage result: `tests/fixtures/generated_scene/full_sweep_dry_run_triage_2026-04-19.json`
|
||||
2. triage report: `docs/superpowers/reports/2026-04-19-102-full-sweep-dry-run-triage-report.md`
|
||||
|
||||
## Non-Negotiable Scope Guardrails
|
||||
|
||||
1. do not edit analyzer implementation
|
||||
2. do not edit generator implementation
|
||||
3. do not update `scene_execution_board_2026-04-18.json`
|
||||
4. do not promote any scene
|
||||
5. do not add new family baselines
|
||||
6. do not start implementation correction during triage
|
||||
7. do not expand beyond the fixed `102` scene set
|
||||
|
||||
## Workstreams
|
||||
|
||||
1. `WS1` Timeout Triage
|
||||
2. `WS2` Misclassification Triage
|
||||
3. `WS3` No-Report Failure Triage
|
||||
4. `WS4` Publish Triage Result
|
||||
|
||||
## Phase 0: Freeze Triage Boundary
|
||||
|
||||
### Objective
|
||||
|
||||
Make the triage a classification exercise only.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. read the upstream dry-run result
|
||||
2. verify the upstream result has `102` scenes
|
||||
3. verify non-pass buckets are:
|
||||
- `31` timeout records
|
||||
- `5` misclassified records
|
||||
- `25` no-report records
|
||||
- `1` bootstrap-target record
|
||||
4. freeze the triage order:
|
||||
- timeout first
|
||||
- misclassification second
|
||||
- no-report third
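
The bucket verification in task 3 can be sketched with the selectors that Phases 1 to 3 of this plan define (`dryRunStatus` plus `reason`). The plan does not state a selector for the bootstrap-target record, so this sketch leaves it in a generic `other` bucket. Function names are hypothetical.

```python
from collections import Counter

TIMEOUT_REASON = "generator timeout after 30s"
NO_REPORT_REASON = "generator failed without generation report"
FROZEN_COUNTS = {"timeout": 31, "misclassified": 5, "no-report": 25}


def bucket_of(record):
    """Assign a dry-run record to one triage bucket using the plan's selectors."""
    status, reason = record.get("dryRunStatus"), record.get("reason")
    if status == "auto-pass":
        return "auto-pass"
    if status == "source-unreadable" and reason == TIMEOUT_REASON:
        return "timeout"
    if status == "misclassified":
        return "misclassified"
    if status == "fail-closed-known" and reason == NO_REPORT_REASON:
        return "no-report"
    return "other"  # includes the bootstrap-target record; its selector is not given


def verify_frozen_counts(records):
    """Return (observed counts, mismatches) where mismatches maps bucket -> (found, expected)."""
    counts = Counter(bucket_of(r) for r in records)
    mismatches = {
        bucket: (counts.get(bucket, 0), expected)
        for bucket, expected in FROZEN_COUNTS.items()
        if counts.get(bucket, 0) != expected
    }
    return counts, mismatches
```

An empty mismatch mapping means the upstream result still matches the frozen bucket counts, so triage can proceed in the stated order.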
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. frozen triage input statement
|
||||
2. frozen non-pass bucket counts
|
||||
3. frozen triage order
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. triage input count is stable
|
||||
2. no code is changed
|
||||
3. no board status is updated
|
||||
|
||||
## Phase 1: Timeout Triage
|
||||
|
||||
### Objective
|
||||
|
||||
Split the `31` timeout records into second-level reasons.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. select records where `dryRunStatus = source-unreadable`
|
||||
2. verify reason is `generator timeout after 30s`
|
||||
3. collect source directory metadata:
|
||||
- source directory exists
|
||||
- file count
|
||||
- total source bytes
|
||||
- largest file path
|
||||
- largest file bytes
|
||||
4. collect dry-run artifact metadata:
|
||||
- generated skill directory exists
|
||||
- references directory exists
|
||||
- generation report exists
|
||||
5. preserve board context:
|
||||
- current group
|
||||
- current status
|
||||
- current source asset
|
||||
- real sample record id
|
||||
6. optionally run one diagnostic longer-timeout attempt for classification only
|
||||
7. assign one timeout label:
|
||||
- `timeout-known-family-sample`
|
||||
- `timeout-unvalidated-source`
|
||||
- `timeout-large-source`
|
||||
- `timeout-command-hang`
|
||||
- `timeout-generator-slow-but-progressing`
|
||||
- `timeout-undetermined`
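
The source directory metadata in task 3 can be collected with a single directory walk. A minimal sketch with hypothetical key names. It gathers the measurements only and does not assign a timeout label, since the labeling rules are not defined in this plan.

```python
import os


def source_metadata(source_dir):
    """Collect existence, file count, total bytes, and the largest file for one scene."""
    meta = {
        "exists": os.path.isdir(source_dir),
        "fileCount": 0,
        "totalBytes": 0,
        "largestFilePath": None,
        "largestFileBytes": 0,
    }
    if not meta["exists"]:
        return meta

    for dirpath, _dirnames, filenames in os.walk(source_dir):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable entry: skip it rather than abort triage
            meta["fileCount"] += 1
            meta["totalBytes"] += size
            if meta["largestFilePath"] is None or size > meta["largestFileBytes"]:
                meta["largestFilePath"] = path
                meta["largestFileBytes"] = size
    return meta
```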
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. `timeoutTriage[]` records in the triage JSON
|
||||
2. timeout label summary
|
||||
3. timeout size/source metadata summary
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. all `31` timeout records have a second-level label
|
||||
2. no timeout is treated as unsupported family by default
|
||||
3. no long-timeout rerun result promotes a scene
|
||||
|
||||
## Phase 2: Misclassification Triage
|
||||
|
||||
### Objective
|
||||
|
||||
Explain the `5` board-vs-archetype conflicts.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. select records where `dryRunStatus = misclassified`
|
||||
2. preserve:
|
||||
- board expected group
|
||||
- expected archetype
|
||||
- inferred archetype
|
||||
- current source asset
|
||||
- real sample layer status
|
||||
3. inspect existing dry-run report path when present
|
||||
4. collect route-conflict evidence:
|
||||
- whether host bridge evidence dominates
|
||||
- whether G3 or G1-E evidence is still present
|
||||
- whether current board expectation came from baseline or expansion
|
||||
5. assign one routing triage label:
|
||||
- `route-overprefer-host-bridge`
|
||||
- `board-expectation-stale`
|
||||
- `mixed-workflow-host-bridge-valid`
|
||||
- `scene-family-split-needed`
|
||||
- `misclassification-undetermined`
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. `misclassificationTriage[]` records in the triage JSON
|
||||
2. routing conflict summary
|
||||
3. high-priority routing risk list
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. all `5` misclassified records have a routing label
|
||||
2. no routing code is changed
|
||||
3. the report identifies whether implementation correction is justified later
|
||||
|
||||
## Phase 3: No-Report Failure Triage
|
||||
|
||||
### Objective
|
||||
|
||||
Split the `25` generic no-report failures into concrete failure stages.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. select records where:
|
||||
- `dryRunStatus = fail-closed-known`
|
||||
- `reason = generator failed without generation report`
|
||||
2. collect command artifacts:
|
||||
- exit code
|
||||
- stdout tail
|
||||
- stderr tail
|
||||
3. inspect output artifacts:
|
||||
- skill directory exists
|
||||
- references directory exists
|
||||
- any report file exists
|
||||
4. infer one failure stage:
|
||||
- `source-scan`
|
||||
- `analyzer`
|
||||
- `ir-assembly`
|
||||
- `readiness-before-report`
|
||||
- `compiler-package-write`
|
||||
- `panic-or-process-error`
|
||||
- `unknown-no-report`
|
||||
5. keep `bootstrap_target` failure separate
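
Tasks 2 and 3 gather evidence only; task 4 infers the stage from that evidence. The evidence-gathering part can be sketched as below. The `references` directory name follows the `references/generation-report.json` path used in the dry-run plan. The glob pattern for "any report file", the 20-line tail length, and the function and key names are assumptions.

```python
from pathlib import Path


def tail(text, max_lines=20):
    """Keep only the last max_lines lines of captured output (max_lines must be >= 1)."""
    lines = text.splitlines()
    return "\n".join(lines[-max_lines:])


def inspect_output(skill_dir):
    """Record which output artifacts exist for one scene, without modifying anything."""
    skill = Path(skill_dir)
    references = skill / "references"
    report_files = []
    if references.is_dir():
        # Assumption: report files can be recognized by "report" in the file name.
        report_files = sorted(p.name for p in references.glob("*report*"))
    return {
        "skillDirExists": skill.is_dir(),
        "referencesDirExists": references.is_dir(),
        "reportFiles": report_files,
    }
```

Keeping only output tails bounds the size of the triage JSON while preserving the final error lines, which are usually the most informative part of a failed run.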
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. `noReportFailureTriage[]` records in the triage JSON
|
||||
2. `bootstrapTargetFailures[]` records in the triage JSON
|
||||
3. failure-stage summary
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. all `25` no-report failures have an inferred failure stage
|
||||
2. the `bootstrap_target` case is not hidden in the no-report bucket
|
||||
3. every non-pass record remains explainable without implementation changes
|
||||
|
||||
## Phase 4: Publish Triage Result
|
||||
|
||||
### Objective
|
||||
|
||||
Publish a bounded triage result and stop.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. write `full_sweep_dry_run_triage_2026-04-19.json`
|
||||
2. write `2026-04-19-102-full-sweep-dry-run-triage-report.md`
|
||||
3. include:
|
||||
- timeout triage summary
|
||||
- misclassification triage summary
|
||||
- no-report triage summary
|
||||
- recommended next blocker
|
||||
4. explicitly state that the triage does not promote scenes or start fixes
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. triage JSON
|
||||
2. triage report
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. all `62` non-pass records are covered
|
||||
2. every non-pass record has a second-level explanation
|
||||
3. the report identifies the next blocker without implementing it
|
||||
4. no generator/analyzer file is modified
|
||||
5. `scene_execution_board_2026-04-18.json` is not modified
|
||||
|
||||
## Completion Criteria
|
||||
|
||||
This plan is complete when:
|
||||
|
||||
1. `31` timeout records have timeout labels
|
||||
2. `5` misclassified records have routing labels
|
||||
3. `25` no-report failures have failure stages
|
||||
4. `1` bootstrap-target failure is separately tracked
|
||||
5. the triage JSON and report are published
|
||||
6. execution stops without implementation work
|
||||
|
||||
@@ -0,0 +1,305 @@
|
||||
# 102 Full Sweep Improvement Roadmap Plan
|
||||
|
||||
> Date: 2026-04-19
|
||||
> Status: Draft
|
||||
> Upstream Spec: `docs/superpowers/specs/2026-04-19-102-full-sweep-improvement-roadmap-design.md`
|
||||
> Upstream Dry-Run Result: `tests/fixtures/generated_scene/full_sweep_dry_run_2026-04-19.json`
|
||||
> Upstream Triage Result: `tests/fixtures/generated_scene/full_sweep_dry_run_triage_2026-04-19.json`
|
||||
|
||||
## Plan Intent
|
||||
|
||||
Turn the `102` scene dry-run and triage findings into a governed improvement roadmap.
|
||||
|
||||
This plan is intentionally broad, like the earlier `60-to-90` roadmap. It coordinates multiple bounded implementation tracks instead of starting isolated fixes from individual failures.
|
||||
|
||||
## Baseline
|
||||
|
||||
Current measured baseline:
|
||||
|
||||
| Metric | Count |
| --- | ---: |
| Real-sample executed pass | 5 / 102 |
| Code-backed ledger coverage | 23 / 102 |
| Dry-run auto-pass | 40 / 102 |
| Dry-run actionable coverage | 66 / 102 |
|
||||
|
||||
Current triage baseline:
|
||||
|
||||
| Bucket | Count | Triage conclusion |
| --- | ---: | --- |
| Timeout | 31 | `19 timeout-unvalidated-source`, `8 timeout-large-source`, `4 timeout-known-family-sample` |
| Misclassified | 5 | all `route-overprefer-host-bridge` |
| No-report failure | 25 | all `readiness-before-report` |
| Bootstrap target | 1 | separate `bootstrap_target` |
|
||||
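
The baseline figures can be cross-checked arithmetically: the timeout sub-labels should sum to the timeout bucket, and the dry-run auto-pass count plus the non-pass buckets should cover all `102` scenes. A small sketch using only the numbers stated in this baseline (the constant and function names are illustrative):

```python
TOTAL_SCENES = 102

HEADLINE = {
    "real-sample executed pass": 5,
    "code-backed ledger coverage": 23,
    "dry-run auto-pass": 40,
    "dry-run actionable coverage": 66,
}
TIMEOUT_LABELS = {
    "timeout-unvalidated-source": 19,
    "timeout-large-source": 8,
    "timeout-known-family-sample": 4,
}
NON_PASS_BUCKETS = {"timeout": 31, "misclassified": 5, "no-report": 25, "bootstrap-target": 1}


def percent(count, total=TOTAL_SCENES):
    """Express a count as a percentage of the scene ledger, to one decimal place."""
    return round(100 * count / total, 1)


def baseline_is_consistent():
    """Check that sub-labels and buckets add up to the stated totals."""
    labels_match = sum(TIMEOUT_LABELS.values()) == NON_PASS_BUCKETS["timeout"]
    covers_ledger = (
        HEADLINE["dry-run auto-pass"] + sum(NON_PASS_BUCKETS.values()) == TOTAL_SCENES
    )
    return labels_match and covers_ledger
```

With these inputs the non-pass buckets total `62`, matching the `62` non-pass records named in the triage plan, and the four headline metrics correspond to roughly 4.9%, 22.5%, 39.2%, and 64.7% of the ledger.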
|
||||
## Scope Guardrails
|
||||
|
||||
1. do not add new scene families
|
||||
2. do not update `scene_execution_board_2026-04-18.json` inside this roadmap
|
||||
3. do not promote scenes directly from diagnostic or dry-run results
|
||||
4. do not reopen completed real-sample passes except as regression checks
|
||||
5. do not start `G4/G5`
|
||||
6. do not implement full login recovery
|
||||
7. do not implement full host runtime transport
|
||||
8. do not implement local document attachment runtime
|
||||
9. do not create unbounded micro-plans from a single failure
|
||||
|
||||
## Workstreams
|
||||
|
||||
1. `WS1` Timeout Diagnostics and Scan Budget
|
||||
2. `WS2` Routing Boundary Correction
|
||||
3. `WS3` Structured Fail-Closed Reporting
|
||||
4. `WS4` Follow-Up Sweep and Coverage Delta
|
||||
|
||||
## Phase 0: Freeze Improvement Baseline
|
||||
|
||||
### Objective
|
||||
|
||||
Freeze the dry-run and triage outputs as the only accepted inputs to this roadmap.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. freeze `full_sweep_dry_run_2026-04-19.json`
|
||||
2. freeze `full_sweep_dry_run_triage_2026-04-19.json`
|
||||
3. freeze the four headline metrics:
|
||||
- `5/102` real-sample pass
|
||||
- `23/102` code-backed ledger coverage
|
||||
- `40/102` dry-run auto-pass
|
||||
- `66/102` dry-run actionable coverage
|
||||
4. freeze the problem buckets:
|
||||
- `4` known-family timeouts
|
||||
- `8` large-source timeouts
|
||||
- `19` unvalidated-source timeouts
|
||||
- `5` host-bridge over-preference cases
|
||||
- `25` readiness-before-report failures
|
||||
- `1` bootstrap-target failure
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. baseline statement
|
||||
2. frozen blocker inventory
|
||||
3. roadmap entry criteria
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. no additional scene is added to scope
|
||||
2. no implementation starts before the baseline is frozen
|
||||
3. dry-run and triage assets are treated as immutable inputs
|
||||
|
||||
## Phase 1: Known-Family Timeout Diagnostics
|
||||
|
||||
### Objective
|
||||
|
||||
Resolve the highest-priority ambiguity: known-family scenes that timed out in the full sweep.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. select only records labeled `timeout-known-family-sample`
|
||||
2. capture source scale metrics and previous family context
|
||||
3. run bounded diagnostic attempts if needed
|
||||
4. classify each record as:
|
||||
- `known-family-rerun-pass`
|
||||
- `known-family-source-scale-timeout`
|
||||
- `known-family-generator-hotspot`
|
||||
- `known-family-contract-blocked-after-long-run`
|
||||
- `known-family-timeout-unresolved`
|
||||
5. publish diagnostic result
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. known-family timeout diagnostic JSON
|
||||
2. known-family timeout diagnostic report
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. all `4` known-family timeout records are classified
|
||||
2. no scene is promoted from diagnostic success
|
||||
3. no generator logic is changed in the diagnostic step
|
||||
|
||||
## Phase 2: Source-Scale and Scan-Budget Improvement
|
||||
|
||||
### Objective
|
||||
|
||||
Reduce timeout noise caused by oversized source directories and obvious vendor/library files.
|
||||
|
||||
### Tasks
|
||||
|
||||
1. analyze `timeout-large-source` and `timeout-unvalidated-source`
|
||||
2. define source scan budget policy
|
||||
3. define vendor/library ignore policy
|
||||
4. implement only bounded source scanning or timeout reporting changes
|
||||
5. verify no canonical or real-sample regression is introduced
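
One possible shape for a scan budget with a vendor ignore policy is sketched below. Everything specific in it is a placeholder: the ignored directory names, the file and byte limits, and the function name are assumptions for illustration, since the actual policy is a deliverable of this phase and is not defined here.

```python
import os

# Placeholder policy values; the real ones are to be defined by this phase.
DEFAULT_IGNORED_DIRS = frozenset({"node_modules", "vendor"})


def scan_with_budget(source_dir, ignored_dirs=DEFAULT_IGNORED_DIRS,
                     max_files=2000, max_bytes=50_000_000):
    """Walk source_dir, skip ignored directories, and stop when a budget is reached."""
    selected, total_bytes = [], 0
    for dirpath, dirnames, filenames in os.walk(source_dir):
        # Prune in place so os.walk never descends into ignored directories.
        dirnames[:] = sorted(d for d in dirnames if d not in ignored_dirs)
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue
            if len(selected) >= max_files or total_bytes + size > max_bytes:
                # Report exhaustion explicitly instead of running until a timeout.
                return {"files": selected, "totalBytes": total_bytes, "budgetExhausted": True}
            selected.append(path)
            total_bytes += size
    return {"files": selected, "totalBytes": total_bytes, "budgetExhausted": False}
```

Returning a `budgetExhausted` flag lets a sweep report an oversized source as a structured result rather than as an opaque timeout, and sorting directory and file names keeps the selected set deterministic between runs.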
|
||||
|
||||
### Deliverables
|
||||
|
||||
1. source scan budget policy
|
||||
2. bounded scan implementation if approved by Phase 1 evidence
|
||||
3. timeout reporting regression tests
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
1. large source directories no longer dominate the full sweep by accidental vendor-file scanning
|
||||
2. known-family samples are not made worse
|
||||
3. archetype semantics are unchanged
|
||||
|
||||
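To make "scan budget" and "vendor/library ignore policy" concrete, here is a minimal sketch of a bounded directory walk. It is not the project's scanner: the budget limits, the ignored directory names, and the result keys (`budgetExhausted` and so on) are all placeholder assumptions that the real policy would have to define.

```python
import os
from dataclasses import dataclass


@dataclass
class ScanBudget:
    """Hypothetical policy values; the real limits belong in the budget policy."""
    max_files: int = 500
    max_bytes: int = 5_000_000
    ignore_dirs: frozenset = frozenset({"node_modules", "vendor", "lib", "dist"})


def bounded_scan(root, budget):
    """Walk `root`, skipping ignored directories, and stop once the budget is spent."""
    files, total_bytes = [], 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into vendor trees.
        dirnames[:] = sorted(d for d in dirnames if d not in budget.ignore_dirs)
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if len(files) >= budget.max_files or total_bytes + size > budget.max_bytes:
                # Report exhaustion explicitly instead of timing out silently.
                return {"files": files, "bytes": total_bytes, "budgetExhausted": True}
            files.append(path)
            total_bytes += size
    return {"files": files, "bytes": total_bytes, "budgetExhausted": False}
```

Returning an explicit exhaustion flag, rather than raising or truncating silently, fits the "timeout reporting" half of task 4: a sweep can record that a source was cut off by policy.
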
## Phase 3: Host-Bridge Route Over-Preference Correction

### Objective

Correct or formally adjudicate the five cases where `host_bridge_workflow` over-absorbed `G3` or `G1-E` expected scenes.

### Tasks

1. select the `5` `route-overprefer-host-bridge` records
2. compare business-chain evidence against host-bridge evidence
3. define routing precedence rules for:
   - `G3` vs `G6`
   - `G1-E` vs `G6`
4. implement bounded routing correction only if evidence supports it
5. preserve regressions for:
   - `G3` real-sample pass
   - `G1-E` real-sample pass
   - `G6` real-sample pass
6. classify each case as:
   - `route-corrected-to-g3`
   - `route-corrected-to-g1e`
   - `board-expectation-reclassified`
   - `valid-host-bridge-workflow`
   - `route-conflict-unresolved`

### Deliverables

1. route over-preference correction report
2. routing regression tests
3. updated dry-run classification for the five fixed records

### Acceptance Criteria

1. all `5` route conflicts are adjudicated
2. `host_bridge_workflow` no longer wins solely because host evidence exists
3. existing `G6` pass remains stable
4. no broad routing rewrite is introduced

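Acceptance criterion 2 can be read as a precedence rule. The sketch below shows one possible shape for such a rule, purely as an illustration: this plan does not define the rule itself, and the signal counts, the `min_chain_signals` threshold, and the function name are all invented here.

```python
def choose_route(business_chain, host_evidence, min_chain_signals=2):
    """Pick a route from evidence counts (hypothetical scoring).

    business_chain: dict mapping a business family ("G3", "G1-E") to a count
                    of business-chain signals found in the source.
    host_evidence:  count of host-bridge signals.
    """
    best_family, best_count = max(
        business_chain.items(), key=lambda kv: kv[1], default=(None, 0)
    )
    # Strong business-chain evidence wins even when host evidence is present,
    # so host evidence alone can no longer absorb a G3 or G1-E scene.
    if best_count >= min_chain_signals:
        return best_family
    if host_evidence > 0:
        return "G6"
    if best_count > 0:
        return best_family
    return "route-conflict-unresolved"
```

Whatever rule is adopted, the three real-sample regressions in task 5 are what decide whether it is acceptable; a rule like this one would still need to leave the existing `G6` pass stable.
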
## Phase 4: Structured Fail-Closed Reporting

### Objective

Convert `readiness-before-report` failures into structured failure reports instead of process-level no-report failures.

### Tasks

1. select the `25` `readiness-before-report` records
2. identify where generation exits before report emission
3. define a minimal failure-report schema for pre-package fail-closed
4. emit structured failure records with:
   - inferred archetype
   - failed gate
   - blocker reason
   - missing contract pieces
   - stderr summary if any
5. keep scenes failing unless their contracts are actually complete

### Deliverables

1. pre-report fail-closed schema
2. implementation of structured failure report emission
3. regression covering at least one `paginated_enrichment`, one `local_doc_pipeline`, one `multi_mode_request`, and one `single_request_enrichment` pre-report failure

### Acceptance Criteria

1. no-report failures are reduced or eliminated as a category
2. failing scenes still fail closed
3. failure reasons become machine-readable
4. auto-pass count is not inflated by looser gates

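The five fields in task 4 suggest a small record type. The following is a sketch of what a minimal pre-report failure record might look like; the camelCase key names, the `status` field, and the helper `emit_failure` are assumptions made for illustration and are not the schema this phase will deliver.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class PreReportFailure:
    """One structured fail-closed record (field names are hypothetical)."""
    sceneId: str
    inferredArchetype: str
    failedGate: str
    blockerReason: str
    missingContractPieces: List[str] = field(default_factory=list)
    stderrSummary: Optional[str] = None  # "stderr summary if any"
    status: str = "fail-closed"          # the scene still fails; only the report is new


def emit_failure(record):
    """Serialize the record as machine-readable JSON."""
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)
```

Keeping `status` fixed at a failing value reflects acceptance criteria 2 and 4: the record makes the failure readable without loosening any gate.
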
## Phase 5: Bootstrap Target Isolation

### Objective

Keep the single `bootstrap_target` failure isolated and decide whether it belongs to later bootstrap normalization work.

### Tasks

1. preserve `用户停电频次分析监测` as a separate bootstrap failure
2. inspect whether the failure is caused by missing target URL, domain mismatch, or unsupported bootstrap pattern
3. produce a bootstrap isolation note
4. do not implement login or bootstrap auto-recovery

### Deliverables

1. bootstrap target isolation note
2. decision whether the case enters a later bootstrap-normalization roadmap

### Acceptance Criteria

1. the bootstrap case does not pollute readiness-before-report work
2. no login recovery implementation is started

## Phase 6: Follow-Up Full Sweep and Coverage Delta

### Objective

Measure whether the bounded improvements improved generic coverage.

### Tasks

1. rerun the fixed `102` scene full sweep with the same scene set
2. produce a new dry-run result
3. compare against the baseline:
   - auto-pass delta
   - actionable coverage delta
   - timeout delta
   - misclassification delta
   - no-report delta
4. publish coverage delta report
5. decide whether to move to execution-board status sync or another bounded improvement cycle

### Deliverables

1. follow-up full sweep JSON
2. coverage delta report
3. remaining blocker decision board

### Acceptance Criteria

1. scene set remains exactly `102`
2. baseline and follow-up are comparable
3. improvements are quantified, not assumed
4. no execution board status is changed automatically

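The comparison in task 3 amounts to a per-metric subtraction guarded by the comparability check. A minimal sketch follows, assuming each sweep summary is a flat dict; the key names (`sceneCount`, `autoPass`, and so on) are hypothetical and would need to match the real sweep JSON.

```python
METRICS = ("autoPass", "actionableCoverage", "timeout", "misclassified", "noReport")


def coverage_delta(baseline, followup, expected_scenes=102):
    """Return follow-up minus baseline for each metric.

    Refuses to compare when either sweep did not cover the fixed scene set,
    since the two runs would not be comparable.
    """
    for name, sweep in (("baseline", baseline), ("follow-up", followup)):
        if sweep["sceneCount"] != expected_scenes:
            raise ValueError(
                f"{name} covers {sweep['sceneCount']} scenes, expected {expected_scenes}"
            )
    return {metric: followup[metric] - baseline[metric] for metric in METRICS}
```

The function only reports numbers. Deciding what to do with them, including any execution-board change, stays a manual step as required by acceptance criterion 4.
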
## Milestone Order

The order is fixed:

1. Phase 0: freeze baseline
2. Phase 1: known-family timeout diagnostics
3. Phase 2: source-scale and scan-budget improvement
4. Phase 3: host-bridge route over-preference correction
5. Phase 4: structured fail-closed reporting
6. Phase 5: bootstrap target isolation
7. Phase 6: follow-up full sweep and coverage delta

Do not start Phase 3 before Phase 1 is completed. Known-family timeout ambiguity affects the interpretation of current coverage.

Do not start Phase 6 before Phases 2-5 have either completed or been explicitly deferred with reasons.

## Completion Criteria

This roadmap is complete when:

1. known-family timeouts are no longer mixed with generic timeout noise
2. host-bridge over-preference cases are adjudicated
3. readiness-before-report failures become structured fail-closed records
4. the bootstrap target case is isolated
5. a follow-up full sweep quantifies coverage delta
6. no new family is introduced as a shortcut around current blockers

## Out of Plan

1. new family implementation
2. `G4/G5` implementation
3. browser host runtime transport
4. login recovery
5. attachment/local document runtime
6. automatic execution board promotion

@@ -0,0 +1,140 @@
# 102 Sweep Status Reconciliation Plan

> Date: 2026-04-19
> Status: Draft
> Upstream Spec: `docs/superpowers/specs/2026-04-19-102-sweep-status-reconciliation-design.md`

## Plan Intent

Reconcile the follow-up `102` sweep result with the final route-conflict decisions so the next roadmap uses a trustworthy status baseline.

This plan is a status reconciliation plan, not an implementation plan.

## Scope Guardrails

1. do not modify `src/generated_scene/analyzer.rs`
2. do not modify `src/generated_scene/generator.rs`
3. do not modify `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
4. do not promote any scene
5. do not add or modify family baselines
6. do not rerun the `102` sweep
7. do not implement fixes for fail-closed or timeout records

## Phase 0: Freeze Inputs

### Objective

Freeze the exact reconciliation inputs.

### Tasks

1. read `full_sweep_improvement_followup_2026-04-19.json`
2. read `remaining_route_conflict_decisions_2026-04-19.json`
3. verify follow-up sweep scene count is `102`
4. verify route-decision conflict count is `4`

### Deliverables

1. input validation summary

### Acceptance Criteria

1. reconciliation does not proceed if follow-up scene count is not `102`
2. reconciliation does not proceed if route-decision count is not `4`

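The two gating checks in this phase are simple enough to sketch. The example assumes both inputs have already been parsed from JSON and that they expose their records under top-level `scenes` and `decisions` keys; that layout is a guess, since the structure of the two fixture files is not shown here.

```python
EXPECTED_SCENES = 102
EXPECTED_DECISIONS = 4


def validate_inputs(followup, decisions):
    """Return a list of blocking problems; reconciliation proceeds only if it is empty.

    `followup["scenes"]` and `decisions["decisions"]` are hypothetical keys.
    """
    problems = []
    scene_count = len(followup.get("scenes", []))
    decision_count = len(decisions.get("decisions", []))
    if scene_count != EXPECTED_SCENES:
        problems.append(f"follow-up scene count is {scene_count}, expected {EXPECTED_SCENES}")
    if decision_count != EXPECTED_DECISIONS:
        problems.append(
            f"route-decision count is {decision_count}, expected {EXPECTED_DECISIONS}"
        )
    return problems
```

The returned list can double as the body of the input validation summary deliverable.
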
## Phase 1: Merge Route Decisions

### Objective

Apply route-conflict decisions as a reconciliation overlay without changing raw sweep status.

### Tasks

1. match route decisions by `sceneId`
2. for each matching scene, keep `dryRunStatus = misclassified`
3. add `routeDecision = valid-host-bridge-workflow`
4. set `reconciledStatus = adjudicated-valid-host-bridge`
5. preserve decision reason and evidence summary

### Deliverables

1. route-decision overlay records

### Acceptance Criteria

1. all `4` route decisions match a follow-up scene
2. all `4` are reconciled to `adjudicated-valid-host-bridge`
3. no broad status rewrite is performed

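The overlay described in these tasks can be sketched as a copy-then-annotate step. `sceneId`, `dryRunStatus`, `routeDecision`, and `reconciledStatus` are the field names used in this phase; the keys `reason`, `evidenceSummary`, and `decisionReason` are assumptions about how the decision asset stores its explanation.

```python
import copy

ROUTE_DECISION = "valid-host-bridge-workflow"
RECONCILED_STATUS = "adjudicated-valid-host-bridge"


def apply_route_overlay(scenes, decisions):
    """Return annotated copies of `scenes`; the raw sweep records are not mutated."""
    reconciled = copy.deepcopy(scenes)
    by_id = {scene["sceneId"]: scene for scene in reconciled}
    for decision in decisions:
        scene = by_id.get(decision["sceneId"])
        if scene is None:
            raise KeyError(f"route decision {decision['sceneId']} matches no follow-up scene")
        # dryRunStatus is deliberately left as-is; the overlay only adds fields.
        scene["routeDecision"] = ROUTE_DECISION
        scene["reconciledStatus"] = RECONCILED_STATUS
        scene["decisionReason"] = decision.get("reason")
        scene["evidenceSummary"] = decision.get("evidenceSummary")
    return reconciled
```

Because only scenes named in a decision are touched, the function cannot perform a broad status rewrite, and an unmatched decision fails loudly instead of being dropped.
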
## Phase 2: Build Reconciled Status Counts

### Objective

Build the reconciled status summary for all `102` scenes.

### Tasks

1. copy all follow-up scene records into a new reconciliation asset
2. assign `reconciledStatus` for every scene
3. count statuses:
   - `auto-pass`
   - `fail-closed-known`
   - `adjudicated-valid-host-bridge`
   - `source-unreadable`
   - `missing-source`
   - `unsupported-family`
   - `misclassified-unresolved`
4. summarize fail-closed records by archetype and reason
5. preserve remaining timeout records as unresolved timeout inputs

### Deliverables

1. `tests/fixtures/generated_scene/full_sweep_status_reconciliation_2026-04-19.json`

### Acceptance Criteria

1. total scene count is `102`
2. reconciled status count total is `102`
3. unresolved misclassification count is `0`
4. timeout count remains `2`

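The counting and the first three acceptance checks could look roughly like the sketch below. The seven status names are the ones listed in task 3. How timeout records are labeled is not stated in this plan, so the sketch simply keeps any unlisted status visible in the summary instead of assuming a name for it.

```python
from collections import Counter

LISTED_STATUSES = (
    "auto-pass",
    "fail-closed-known",
    "adjudicated-valid-host-bridge",
    "source-unreadable",
    "missing-source",
    "unsupported-family",
    "misclassified-unresolved",
)


def summarize_reconciled(scenes, expected_total=102):
    """Count `reconciledStatus` values and enforce the total and zero-unresolved checks."""
    counts = Counter(scene["reconciledStatus"] for scene in scenes)
    # Listed statuses always appear, even at zero, so the summary shape is stable.
    summary = {status: counts.get(status, 0) for status in LISTED_STATUSES}
    # Any status outside the list (for example a timeout label) stays visible.
    for status, n in counts.items():
        summary.setdefault(status, n)
    total = sum(summary.values())
    if total != expected_total:
        raise ValueError(f"reconciled status total is {total}, expected {expected_total}")
    if summary["misclassified-unresolved"] != 0:
        raise ValueError("unresolved misclassifications remain")
    return summary
```

A scene without a `reconciledStatus` raises a `KeyError` here, which matches task 2: every scene must be assigned a status before counting.
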
## Phase 3: Publish Reconciliation Report

### Objective

Make the reconciled state readable and actionable.

### Tasks

1. summarize raw follow-up counts
2. summarize reconciled counts
3. list `4` valid-host-bridge adjudications
4. list `2` remaining timeout inputs
5. summarize `48` fail-closed-known records as the next implementation-analysis candidate
6. state explicitly that the execution board was not changed

### Deliverables

1. `docs/superpowers/reports/2026-04-19-102-sweep-status-reconciliation-report.md`

### Acceptance Criteria

1. report explains why raw `misclassified = 4` no longer means unresolved route bugs
2. report identifies the next likely roadmap input without starting it
3. report confirms no code or execution-board changes

## Completion Criteria

This plan is complete when:

1. reconciliation JSON exists
2. reconciliation report exists
3. all `4` route conflicts are represented as adjudicated valid host-bridge workflows
4. no unresolved misclassification remains
5. `2` timeouts and `48` fail-closed records remain visible as separate future inputs

## Stop Statement

Stop after publishing the reconciliation JSON and report.

Do not start the next roadmap in this plan.
@@ -0,0 +1,44 @@
# Bootstrap Target Normalization Roadmap Plan

> Date: 2026-04-19
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
> Parent Sequence: `2026-04-19-final-2-residual-child-plan-sequence-plan.md`
> Fixed Scene: `sweep-091-scene`
> Status: Draft

## Plan Intent

Run a bounded bootstrap target normalization slice for the single remaining `page_state_eval` residual.

## Fixed Input Bucket

1. `sweep-091-scene`

## Allowed Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `tests/scene_generator_test.rs`
4. `tests/fixtures/generated_scene/bootstrap_target_normalization_followup_2026-04-19.json`
5. `tests/fixtures/generated_scene/bootstrap_target_normalization_reconciliation_candidates_2026-04-19.json`
6. `docs/superpowers/reports/2026-04-19-bootstrap-target-normalization-roadmap-report.md`

## Forbidden Files

1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`

## Tasks

1. Freeze the current `sweep-091-scene` generation report.
2. Identify whether the failure is a missing target URL, target-domain ambiguity, or policy-held navigation dependency.
3. Implement at most one bounded bootstrap target normalization slice if the target can be recovered from deterministic source evidence.
4. Rerun only `sweep-091-scene`.
5. Publish follow-up and reconciliation candidate assets.

## Expected Delta

Target delta is `+1 framework-auto-pass-candidate` if deterministic bootstrap target recovery is possible. Otherwise the delta is `0`, with a narrower named hold.

## Stop Statement

Stop after the single-scene follow-up and reconciliation candidates are published. Do not update the official board under this plan.
@@ -0,0 +1,38 @@
# Bootstrap Target Residual Isolation Plan

> Date: 2026-04-19
> Status: Draft
> Parent Plan: `docs/superpowers/plans/2026-04-19-structured-fail-closed-residual-13-closure-plan.md`
> Parent Route: `Residual Route D`
> Parent Layer: `Layer D`

## Plan Intent

Isolate the remaining page-state/bootstrap-target residual without starting login recovery or runtime navigation implementation.

## Fixed Input Bucket

1. `sweep-091-scene` / `用户停电频次分析监测`

## Allowed Files

1. isolation JSON asset
2. isolation report

## Forbidden Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. login/runtime implementation files
4. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`

## Tasks

1. preserve the residual as bootstrap-target isolated;
2. publish isolation report;
3. do not implement login recovery.

## Stop Statement

Stop after isolation assets are published.

@@ -0,0 +1,55 @@
# Boundary Fail-Closed Decision Plan

> Date: 2026-04-19
> Status: Draft
> Parent Framework Plan: `docs/superpowers/plans/2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
> Parent Route: `Route 5: boundary-family fail-closed`
> Parent Layer: `Layer C + Layer D`
> Upstream Design: `docs/superpowers/specs/2026-04-19-boundary-fail-closed-decision-design.md`

## Plan Intent

Publish a decision for the remaining boundary-family fail-closed buckets after mainline routes are complete or deferred.

## Fixed Input Bucket

1. `local_doc_pipeline = 5`
2. `host_bridge_workflow = 1`
3. `page_state_eval/bootstrap_target = 1`

## Allowed Files

1. boundary decision JSON assets
2. boundary decision report assets
3. optional next bounded boundary plan docs

## Forbidden Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`

## Tasks

1. freeze the Route 5 bucket state
2. inspect each boundary subgroup
3. decide defer/hold/open-slice
4. publish Route 5 decision report

## Expected Coverage Delta

Decision-only delta:

1. unresolved boundary ambiguity should go to zero

## Completion Criteria

1. every Route 5 subgroup has a named decision
2. any follow-up bounded plan is explicit and optional

## Stop Statement

Stop after the Route 5 decision report is published.

Do not begin boundary implementation under this plan.

@@ -0,0 +1,139 @@
# Boundary Family Real-Sample Entry Roadmap Plan

> Date: 2026-04-19
> Status: Draft
> Upstream Spec: [2026-04-19-boundary-family-real-sample-entry-roadmap-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-boundary-family-real-sample-entry-roadmap-design.md)

## Plan Intent

This roadmap determines the next bounded step after `G1-E / G2 / G3` have all closed as executed real-sample passes.

Its only purpose is:

`decide whether one boundary family may enter real-sample execution scope next`

## Scope Guardrails

1. do not reopen `G1-E / G2 / G3`
2. do not implement runtime-platform prerequisites under this roadmap
3. do not execute real samples for more than one boundary family
4. do not open `G4 / G5`
5. do not turn this work into a new family-asset expansion program

## Candidate Boundary Families

The only candidates under this roadmap are:

1. `G6`
2. `G7`
3. `G8`

## Workstreams

1. `WS1` Freeze the Post-Mainline Starting State
2. `WS2` Evaluate Boundary-Family Entry Readiness
3. `WS3` Select One Next Candidate or Hold All
4. `WS4` Publish the Next Bounded Execution Slice

## Phase 0: Freeze the Starting State

### Objective

Lock the roadmap start point so the decision does not drift back into old mainline work.

### Tasks

1. freeze `G1-E / G2 / G3` as closed executed passes
2. freeze `G6 / G7 / G8` as held boundary families
3. freeze `G4 / G5` as out of scope

### Deliverables

1. starting-state note
2. fixed candidate list

### Acceptance Criteria

1. no mainline or deferred family work is reopened under this roadmap

## Phase 1: Evaluate Boundary-Family Entry Readiness

### Objective

Compare `G6 / G7 / G8` against explicit entry criteria instead of intuition.

### Tasks

1. restate the current entry condition for each boundary family
2. compare the required runtime gap for each family
3. estimate which family needs the smallest new capability to enter real-sample scope

### Deliverables

1. boundary-family comparison matrix
2. smallest-entry-cost summary

### Acceptance Criteria

1. the next candidate family can be justified with explicit criteria
2. the rejected families have explicit hold reasons

## Phase 2: Select One Next Candidate or Hold All

### Objective

Reduce the next-step ambiguity to a single bounded decision.

### Tasks

1. select exactly one family as the next real-sample entry candidate
2. or explicitly conclude that all boundary families remain held
3. record why the non-selected families remain out of scope

### Deliverables

1. boundary-family entry decision
2. hold reasons for non-selected families

### Acceptance Criteria

1. no more than one next family is opened
2. the decision is bounded and defensible

## Phase 3: Publish the Next Bounded Execution Slice

### Objective

Turn the decision into the next actionable bounded plan.

### Tasks

1. if one family is selected, write a bounded `design + plan` for its minimum real-sample entry slice
2. if none is selected, write a bounded prerequisites plan instead
3. update the decision report layer

### Deliverables

1. next-family bounded `design`
2. next-family bounded `plan`
3. roadmap closure report

### Acceptance Criteria

1. the next step is ready to execute without reopening roadmap scope
2. only one bounded direction is emitted

## Completion Criteria

This roadmap is complete when:

1. the post-mainline next step is reduced to one bounded direction
2. `G6 / G7 / G8` no longer compete ambiguously for priority
3. a single follow-up `design + plan` exists for the selected direction

## Next Step

After this roadmap completes:

1. execute the selected family-entry slice if one family is admitted
2. otherwise execute the bounded prerequisites slice before any boundary family enters real-sample scope
@@ -0,0 +1,38 @@
# Boundary Residual Hold Decision Plan

> Date: 2026-04-19
> Status: Draft
> Parent Plan: `docs/superpowers/plans/2026-04-19-structured-fail-closed-residual-13-closure-plan.md`
> Parent Route: `Residual Route C`
> Parent Layer: `Layer D`

## Plan Intent

Decide whether the remaining `local_doc_pipeline` and `host_bridge_workflow` residual records should remain held or enter a future runtime roadmap.

## Fixed Input Bucket

1. five `local_doc_pipeline` residual records
2. one `host_bridge_workflow` residual record

## Allowed Files

1. decision JSON asset
2. decision report

## Forbidden Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`
3. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`

## Tasks

1. classify each boundary residual as hold/defer/runtime-roadmap-input;
2. do not implement runtime support;
3. publish decision report.

## Stop Statement

Stop after decision assets are published.

@@ -0,0 +1,123 @@
# Boundary Runtime Prerequisites Roadmap Plan

> Date: 2026-04-19
> Status: Draft
> Upstream Spec: [2026-04-19-boundary-runtime-prerequisites-roadmap-design.md](D:/data/ideaSpace/rust/sgClaw/claw-new/docs/superpowers/specs/2026-04-19-boundary-runtime-prerequisites-roadmap-design.md)

## Plan Intent

This roadmap determines the next bounded prerequisites slice after the post-`G7` boundary decision concludes that direct `G6` or `G8` execution should not start yet.

Its only purpose is:

`select one bounded prerequisite direction before the next boundary-family real-sample attempt`

## Scope Guardrails

1. do not execute `G6` or `G8`
2. do not reopen `G7`
3. do not reopen `G1-E / G2 / G3`
4. do not implement host-runtime or local-doc runtime under this roadmap
5. do not open `G4 / G5`

## Candidate Prerequisite Directions

The only candidates under this roadmap are:

1. `G6 host-bridge prerequisites`
2. `G8 local-doc prerequisites`

## Workstreams

1. `WS1` Freeze the Post-G7 Boundary Hold State
2. `WS2` Compare G6 and G8 Prerequisite Burden
3. `WS3` Select One Prerequisite Direction
4. `WS4` Publish the Next Bounded Prerequisites Slice

## Phase 0: Freeze the Starting State

### Objective

Lock the roadmap start point so no closed family work is reopened.

### Tasks

1. freeze `G7` as closed
2. freeze `G6` and `G8` as held pending prerequisites
3. freeze `G1-E / G2 / G3` as closed
4. freeze `G4 / G5` as out of scope

### Deliverables

1. starting-state note
2. fixed prerequisite candidate list

### Acceptance Criteria

1. no family execution begins under this roadmap

## Phase 1: Compare Prerequisite Burden

### Objective

Compare `G6` and `G8` at the prerequisite level instead of at the execution level.

### Tasks

1. restate the smallest blocked capability for `G6`
2. restate the smallest blocked capability for `G8`
3. compare which prerequisite can be isolated more cleanly

### Deliverables

1. prerequisite comparison matrix
2. smallest-prerequisite summary

### Acceptance Criteria

1. the selected prerequisite direction is justified explicitly

## Phase 2: Select One Prerequisite Direction

### Objective

Reduce the post-`G7` prerequisite ambiguity to one bounded decision.

### Tasks

1. select exactly one direction:
   - `G6 host-bridge prerequisites`
   - or `G8 local-doc prerequisites`
2. record why the other direction remains held

### Deliverables

1. prerequisite direction decision
2. hold reason for the non-selected direction

### Acceptance Criteria

1. only one next direction is opened
2. the decision is bounded and defensible

## Phase 3: Publish the Next Bounded Slice

### Objective

Turn the decision into the next executable bounded artifact.

### Tasks

1. write one bounded follow-up design and plan for the selected prerequisite direction
2. publish a roadmap closure report

### Deliverables

1. next bounded `design`
2. next bounded `plan`
3. roadmap closure report

### Acceptance Criteria

1. the next step is ready without extending this roadmap
2. only one bounded direction is emitted
@@ -0,0 +1,54 @@
# Final 2 Official Board Reconciliation Refresh Plan

> Date: 2026-04-19
> Parent Framework: `2026-04-19-scene-skill-102-full-coverage-framework-plan.md`
> Parent Sequence: `2026-04-19-final-2-residual-child-plan-sequence-plan.md`
> Status: Draft

## Plan Intent

Refresh official board framework fields after one or both final-2 residual roadmaps publish reconciliation candidates.

## Fixed Inputs

At least one of:

1. `tests/fixtures/generated_scene/bootstrap_target_normalization_reconciliation_candidates_2026-04-19.json`
2. `tests/fixtures/generated_scene/host_bridge_runtime_reconciliation_candidates_2026-04-19.json`

Also required:

1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
2. `tests/fixtures/generated_scene/promotion_board_reconciliation_policy_2026-04-19.json`

## Allowed Files

1. `tests/fixtures/generated_scene/scene_execution_board_2026-04-18.json`
2. `tests/fixtures/generated_scene/final_2_official_board_reconciliation_refresh_2026-04-19.json`
3. `docs/superpowers/reports/2026-04-19-final-2-official-board-reconciliation-refresh-report.md`

## Forbidden Files

1. `src/generated_scene/analyzer.rs`
2. `src/generated_scene/generator.rs`

## Tasks

1. Load candidate assets that exist.
2. Verify each candidate belongs to `sweep-085-scene` or `sweep-091-scene`.
3. Match board rows by `sceneId`.
4. Update only framework-layer fields.
5. Recompute board framework summary.
6. Publish reconciliation refresh JSON and report.

## Expected Delta

Delta depends on candidate assets:

1. one closed residual: `framework-auto-pass +1`, `framework-structured-fail-closed -1`
2. both closed residuals: `framework-auto-pass +2`, `framework-structured-fail-closed -2`
3. held residuals: no count delta, but narrower next action / hold reason

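The three expected-delta cases follow from one count: how many of the two allowed residuals closed. A sketch under stated assumptions: the candidate field `outcome` and its value `"closed"` are invented for illustration, while the two scene ids and the two counter names come from this plan.

```python
ALLOWED_SCENES = {"sweep-085-scene", "sweep-091-scene"}


def expected_delta(candidates):
    """Compute the framework count delta from reconciliation candidates.

    Each candidate is assumed to be a dict with `sceneId` and a hypothetical
    `outcome` field that is "closed" when the residual was resolved.
    """
    closed = 0
    for candidate in candidates:
        if candidate["sceneId"] not in ALLOWED_SCENES:
            raise ValueError(f"{candidate['sceneId']} is not a final-2 residual")
        if candidate.get("outcome") == "closed":
            closed += 1
    # Every closed residual moves one scene from fail-closed to auto-pass.
    return {
        "framework-auto-pass": closed,
        "framework-structured-fail-closed": -closed,
    }
```

Held residuals yield a zero delta in both counters, which corresponds to case 3; their narrower hold reason would be carried in the report, not in the counts.
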
## Stop Statement

Stop after the final-2 board reconciliation refresh JSON and report are published.

Some files were not shown because too many files have changed in this diff.