zyl d2c9902966 feat: add browser script skill execution

2026-03-30 08:29:44 +08:00

Browser Script Skill Capability Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Let zeroclaw-backed sgclaw skills call deterministic browser-context scripts through existing SuperRPA page-execution capabilities.

Architecture: Reuse SuperRPA's existing page script execution path instead of inventing a parallel workflow executor. Extend sgclaw's browser pipe and zeroclaw skill-tool mapping so a skill can declare a browser script tool, execute a packaged script file with parameters, and receive structured JSON back in the current page context.

Tech Stack: Rust (sgclaw, vendored zeroclaw), Chromium/SuperRPA C++, existing sgclaw browser pipe protocol, skill package SKILL.toml / scripts/.

Task 1: Define the browser-script contract

Files:

Modify: src/compat/browser_tool_adapter.rs
Modify: src/pipe/protocol.rs
Modify: third_party/zeroclaw/src/skills/mod.rs
Modify: third_party/zeroclaw/src/tools/mod.rs
Create: third_party/zeroclaw/src/tools/browser_script_skill.rs
Test: tests/compat_browser_tool_test.rs

Step 1: Write the failing test

Add an sgclaw-side test proving that the browser tool schema accepts a script execution action carrying the required browser-script fields and rejects a request whose script payload is missing.

Step 2: Run test to verify it fails

Run: cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_browser_tool_test
Expected: FAIL because the browser tool still exposes only click/type/navigate/getText.

Step 3: Write minimal implementation

Extend the browser tool adapter and protocol-facing action parsing to support a deterministic script execution action carrying:

action
expected_domain
script
optional args
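The four fields above can be sketched as a small validation layer. This is a minimal illustration, not the actual zeroclaw or sgclaw types: the action name `executeScript`, the struct names, and the choice of string-valued `args` are all assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical action name; the plan does not fix the real identifier.
pub const SCRIPT_ACTION: &str = "executeScript";

/// Unvalidated request as it might arrive from the skill tool (assumed shape).
#[derive(Debug, Default)]
pub struct RawAction {
    pub action: Option<String>,
    pub expected_domain: Option<String>,
    pub script: Option<String>,
    pub args: Option<BTreeMap<String, String>>,
}

/// Validated script-execution request; `args` defaults to empty when omitted.
#[derive(Debug, Clone, PartialEq)]
pub struct BrowserScriptAction {
    pub expected_domain: String,
    pub script: String,
    pub args: BTreeMap<String, String>,
}

#[derive(Debug, PartialEq)]
pub enum ContractError {
    UnsupportedAction(String),
    MissingField(&'static str),
}

/// Treats an absent or blank value as a missing required field.
fn required(value: Option<String>, name: &'static str) -> Result<String, ContractError> {
    match value {
        Some(v) if !v.trim().is_empty() => Ok(v),
        _ => Err(ContractError::MissingField(name)),
    }
}

/// Accepts only the script action and enforces the required fields.
pub fn validate(raw: RawAction) -> Result<BrowserScriptAction, ContractError> {
    let action = required(raw.action, "action")?;
    if action != SCRIPT_ACTION {
        return Err(ContractError::UnsupportedAction(action));
    }
    Ok(BrowserScriptAction {
        expected_domain: required(raw.expected_domain, "expected_domain")?,
        script: required(raw.script, "script")?,
        args: raw.args.unwrap_or_default(),
    })
}
```

A request without `script` yields `ContractError::MissingField("script")`, which is the rejection case the Step 1 test is meant to cover.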

Step 4: Run test to verify it passes

Run: cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_browser_tool_test
Expected: PASS for the new browser script action coverage.

Step 5: Commit

git add tests/compat_browser_tool_test.rs src/compat/browser_tool_adapter.rs src/pipe/protocol.rs third_party/zeroclaw/src/skills/mod.rs third_party/zeroclaw/src/tools/mod.rs third_party/zeroclaw/src/tools/browser_script_skill.rs
git commit -m "feat: add browser script skill tool support"

Task 2: Wire browser script execution through the sgclaw host bridge

Files:

Modify: src/pipe/protocol.rs
Modify: /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol.cc
Modify: /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.cc
Test: /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol_mainline_unittest.cc
Test: /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_process_host_mainline_unittest.cc

Step 1: Write the failing tests

Add host-side tests proving:

sgclaw protocol accepts the new browser script action
handshake advertises the action
the security gate allows it only through the sgclaw host allowlist
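The three properties above amount to one allowlist that drives both the handshake and the gate. The host side is C++, so the following Rust sketch only models the logic; `HostGate`, its methods, and the action name `executeScript` are hypothetical and do not correspond to real SuperRPA symbols.

```rust
use std::collections::BTreeSet;

/// Hypothetical model of the host allowlist: one set backs both the
/// handshake advertisement and the per-request security check.
pub struct HostGate {
    allowlist: BTreeSet<String>,
}

impl HostGate {
    pub fn new(actions: &[&str]) -> Self {
        HostGate {
            allowlist: actions.iter().map(|a| a.to_string()).collect(),
        }
    }

    /// Actions the host would advertise during the handshake.
    pub fn handshake_actions(&self) -> Vec<String> {
        self.allowlist.iter().cloned().collect()
    }

    /// Rejects any action that is not on the allowlist.
    pub fn check(&self, action: &str) -> Result<(), String> {
        if self.allowlist.contains(action) {
            Ok(())
        } else {
            Err(format!("action not allowlisted: {}", action))
        }
    }
}
```

Keeping a single source of truth means a test can assert that an action is advertised if and only if the gate accepts it.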

Step 2: Run tests to verify they fail

Run the SuperRPA unit tests in sgclaw_pipe_protocol_mainline_unittest.cc and sgclaw_process_host_mainline_unittest.cc.
Expected: FAIL because the host protocol and allowlist do not yet include browser script execution.

Step 3: Write minimal implementation

Reuse existing SuperRPA router support by bridging the new sgclaw action to the host's page script execution capability instead of inventing a second execution path.

Step 4: Run tests to verify they pass

Run the same SuperRPA unit tests.
Expected: PASS with the new action accepted end-to-end.

Step 5: Commit

git add src/pipe/protocol.rs /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol.cc /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.cc /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol_mainline_unittest.cc /home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_process_host_mainline_unittest.cc
git commit -m "feat: expose browser script execution to sgclaw"

Task 3: Convert Zhihu hotlist collection into a script-backed skill path

Files:

Create: /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.toml
Create: /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/scripts/extract_hotlist.js
Modify: /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md
Modify: tests/compat_runtime_test.rs

Step 1: Write the failing test

Add runtime coverage proving the Zhihu hotlist export flow uses the skill-backed browser script tool instead of repeated getText probing.
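One way to phrase that assertion is as a check over the ordered list of tool calls recorded during the export flow. The sketch below assumes the runtime test can obtain such a list; the tool name `zhihu_hotlist.extract` and the helper itself are hypothetical.

```rust
/// Hypothetical name of the script-backed skill tool (assumption for this sketch).
pub const SCRIPT_TOOL: &str = "zhihu_hotlist.extract";
/// Generic text-probing action named in the plan.
pub const GET_TEXT: &str = "getText";

/// Returns true when the recorded flow calls the script-backed tool at least
/// once and never falls back to generic getText probing.
pub fn uses_script_backed_path(tool_calls: &[&str]) -> bool {
    let script_calls = tool_calls.iter().filter(|c| **c == SCRIPT_TOOL).count();
    let get_text_calls = tool_calls.iter().filter(|c| **c == GET_TEXT).count();
    script_calls >= 1 && get_text_calls == 0
}
```

Against the current flow, which relies on read_skill plus repeated getText, this check returns false, so a test built on it fails first as Step 2 requires.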

Step 2: Run test to verify it fails

Run: cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_runtime_test
Expected: FAIL because the current flow still uses read_skill plus generic getText.

Step 3: Write minimal implementation

Package the hotlist extractor as a deterministic browser script tool returning structured rows and update the skill description to require that tool before export.

Step 4: Run test to verify it passes

Run: cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_runtime_test
Expected: PASS with the script-backed tool visible in logs and the generic repeated getText fallback removed from the primary path.

Step 5: Commit

git add /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.toml /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/scripts/extract_hotlist.js /home/zyl/projects/sgClaw/skill_lib/skills/zhihu-hotlist/SKILL.md tests/compat_runtime_test.rs
git commit -m "feat: make zhihu hotlist extraction script-backed"

Task 4: Fix plan visibility in the sgclaw frontend

Files:

Modify: /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts
Modify: /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts
Test: frontend or host tests covering [sgclaw/plan]

Step 1: Write the failing test

Add coverage proving [sgclaw/plan] logs are treated as planner logs and preserve multiline task-step rendering.

Step 2: Run test to verify it fails

Run the relevant frontend or browser-side tests.
Expected: FAIL because the current frontend only recognizes [browser/plan] or [plan].

Step 3: Write minimal implementation

Accept [sgclaw/plan] as a planner log source and keep the plan text rendered as multiple visible lines.
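The real change lands in the TypeScript frontend, but the recognition rule itself is small enough to sketch in Rust for consistency with the rest of this plan. The function name and the returned shape are assumptions; only the three prefixes come from the plan.

```rust
/// Prefixes treated as planner log sources; [sgclaw/plan] is the new one.
const PLAN_PREFIXES: [&str; 3] = ["[sgclaw/plan]", "[browser/plan]", "[plan]"];

/// Returns the plan's visible lines when `message` is a planner log,
/// or None for any other log source. Blank lines are dropped.
pub fn parse_plan_log(message: &str) -> Option<Vec<String>> {
    let trimmed = message.trim_start();
    for prefix in PLAN_PREFIXES.iter() {
        if trimmed.starts_with(*prefix) {
            let body = &trimmed[prefix.len()..];
            let steps: Vec<String> = body
                .lines()
                .map(|line| line.trim())
                .filter(|line| !line.is_empty())
                .map(|line| line.to_string())
                .collect();
            return Some(steps);
        }
    }
    None
}
```

Returning one entry per line, rather than a single joined string, is what lets the UI render each task step on its own row.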

Step 4: Run test to verify it passes

Run the same tests.
Expected: PASS with sgclaw planner logs rendered correctly.

Step 5: Commit

git add /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_state.ts /home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts
git commit -m "fix: render sgclaw planner logs in chat ui"

Task 5: Full verification

Files:

Verify only

Step 1: Run sgclaw Rust tests

Run: cargo test --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml
Expected: PASS.

Step 2: Run relevant SuperRPA unit tests

Run the updated sgclaw host and frontend tests.
Expected: PASS.

Step 3: Run a manual browser acceptance flow

Verify that:

startup logs include runtime and skill versions
[sgclaw/plan] is shown as a real multi-line plan
Zhihu hotlist export invokes the script-backed skill tool
the flow returns a real .xlsx path without repeated self-referential output

Step 4: Commit verification artifacts if needed

Keep verification commits separate from behavior changes.
