skill-lib/docs/superpowers/plans/2026-03-25-superrpa-sgclaw-browser-control.md
2026-03-25 02:17:55 +00:00

SuperRPA sgClaw Browser Control Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Deliver a two-phase integration where sgclaw first drives the existing SuperRPA browser through a minimal fixed-intent demo, then upgrades to a real Agent loop backed by deepseek-chat.

Architecture: Keep the browser side thin and reuse-first. Rust owns task understanding, the pipe protocol, and sequencing; SuperRPA owns process hosting, secondary security checks, and delegation into the existing CommandRouter. Phase 1 uses a rule-based planner; Phase 2 swaps in an Agent runtime without changing how browser commands are executed.

Tech Stack: Rust, JSON Line over STDIO, HMAC-SHA256, SuperRPA Chromium C++, existing CommandRouter, existing rules services, FunctionsUI bridge, DeepSeek OpenAI-compatible API (deepseek-chat).


File Structure

sgClaw Repository

  • Create: src/agent/mod.rs
  • Create: src/agent/runtime.rs
  • Create: src/agent/planner.rs
  • Create: src/llm/mod.rs
  • Create: src/llm/provider.rs
  • Create: src/llm/deepseek.rs
  • Create: src/config/mod.rs
  • Create: src/config/settings.rs
  • Modify: src/lib.rs
  • Modify: src/main.rs
  • Modify: src/pipe/protocol.rs
  • Modify: src/pipe/browser_tool.rs
  • Modify: src/security/hmac.rs
  • Modify: resources/rules.json
  • Create: tests/task_protocol_test.rs
  • Create: tests/planner_test.rs
  • Create: tests/runtime_task_flow_test.rs
  • Create: tests/deepseek_provider_test.rs
  • Create: tests/agent_runtime_test.rs

SuperRPA Repository

  • Modify: src/chrome/browser/superrpa/BUILD.gn
  • Modify: src/chrome/browser/superrpa/router/command_router.h
  • Modify: src/chrome/browser/superrpa/router/command_router.cc
  • Modify: src/chrome/browser/superrpa/sgclaw/sgclaw_command_dispatcher.cc
  • Modify: src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.h
  • Modify: src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.cc
  • Create or modify: src/chrome/browser/superrpa/sgclaw/sgclaw_process_host.h
  • Create or modify: src/chrome/browser/superrpa/sgclaw/sgclaw_process_host.cc
  • Create or modify: src/chrome/browser/superrpa/sgclaw/pipe_listener.h
  • Create or modify: src/chrome/browser/superrpa/sgclaw/pipe_listener.cc
  • Modify: src/chrome/browser/resources/superrpa/devtools/functions/functions.ts
  • Modify: src/chrome/browser/resources/superrpa/devtools/functions/functions_manifest.ts
  • Modify: src/chrome/browser/superrpa/rules/rpa_rules_service_factory.cc
  • Test: test("superrpa_unittests")

Task 1: Align Pipe Contract and Security Baseline

Files:

  • Modify: src/pipe/protocol.rs

  • Modify: src/security/hmac.rs

  • Modify: resources/rules.json

  • Create: tests/task_protocol_test.rs

  • Step 1: Write failing protocol tests for task-level messages

Add tests covering submit_task, task_complete, and exact HMAC canonical string expectations.

  • Step 2: Run protocol-focused tests

Run: cargo test --test task_protocol_test --test pipe_protocol_test -q

Expected: FAIL because the task-level messages and canonical signing are missing.

  • Step 3: Extend protocol types

Add task-scope message variants in src/pipe/protocol.rs for:

  • browser -> sgclaw submit_task

  • sgclaw -> browser task_complete

  • optional log_entry
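The three variants above could be modelled as a single enum. The following is a minimal sketch, not the project's actual types: the field names (`instruction`, `success`, `summary`, `level`, `message`) and the `type` discriminator are assumptions, and the hand-rolled serializer only stands in for whatever JSON library the crate already uses.

```rust
/// Task-scope pipe messages. Field names are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum TaskMessage {
    /// browser -> sgclaw
    SubmitTask { instruction: String },
    /// sgclaw -> browser
    TaskComplete { success: bool, summary: String },
    /// optional, sgclaw -> browser
    LogEntry { level: String, message: String },
}

/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl TaskMessage {
    /// Value carried in the (assumed) "type" field on the wire.
    pub fn type_name(&self) -> &'static str {
        match self {
            TaskMessage::SubmitTask { .. } => "submit_task",
            TaskMessage::TaskComplete { .. } => "task_complete",
            TaskMessage::LogEntry { .. } => "log_entry",
        }
    }

    /// Serialize as one JSON Line, without the trailing newline.
    pub fn to_json_line(&self) -> String {
        match self {
            TaskMessage::SubmitTask { instruction } => format!(
                "{{\"type\":\"submit_task\",\"instruction\":\"{}\"}}",
                json_escape(instruction)
            ),
            TaskMessage::TaskComplete { success, summary } => format!(
                "{{\"type\":\"task_complete\",\"success\":{},\"summary\":\"{}\"}}",
                success,
                json_escape(summary)
            ),
            TaskMessage::LogEntry { level, message } => format!(
                "{{\"type\":\"log_entry\",\"level\":\"{}\",\"message\":\"{}\"}}",
                json_escape(level),
                json_escape(message)
            ),
        }
    }
}
```

Keeping the direction of each message in the doc comment makes it easier to write the Step 1 tests against the right side of the pipe.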

  • Step 4: Fix HMAC canonical string

Change src/security/hmac.rs to sign:

<seq>\n<action>\n<stable_json(params)>\n<expected_domain>
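A sketch of how that canonical string might be assembled, assuming `params` is a flat map of string keys to string values (the real parameter type may be richer). "Stable" is interpreted here as "keys emitted in sorted order", which a `BTreeMap` gives for free. The HMAC-SHA256 step itself is left out because it needs a crypto crate; the resulting string is what would be fed to it.

```rust
use std::collections::BTreeMap;

/// Minimal escaping (backslashes and double quotes only); a JSON library
/// would replace this in real code.
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Render params as compact JSON with keys in sorted order, so both ends of
/// the pipe hash byte-identical input.
/// Assumption: params are flat string key/value pairs.
fn stable_json(params: &BTreeMap<String, String>) -> String {
    let body: Vec<String> = params
        .iter() // BTreeMap iterates in ascending key order
        .map(|(k, v)| format!("\"{}\":\"{}\"", escape(k), escape(v)))
        .collect();
    format!("{{{}}}", body.join(","))
}

/// Build `<seq>\n<action>\n<stable_json(params)>\n<expected_domain>`.
pub fn canonical_string(
    seq: u64,
    action: &str,
    params: &BTreeMap<String, String>,
    expected_domain: &str,
) -> String {
    format!(
        "{}\n{}\n{}\n{}",
        seq,
        action,
        stable_json(params),
        expected_domain
    )
}
```

The "exact HMAC canonical string expectations" from Step 1 can then be plain string comparisons against this function, independent of the key material.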
  • Step 5: Add demo rules isolation

Add a clearly marked demo allow entry for Baidu in resources/rules.json, with comments in docs explaining it is demo-only.

  • Step 6: Re-run protocol tests

Run: cargo test --test task_protocol_test --test pipe_protocol_test -q

Expected: PASS.

Task 2: Build Phase 1 Rust Task Flow

Files:

  • Create: src/agent/mod.rs

  • Create: src/agent/planner.rs

  • Modify: src/lib.rs

  • Modify: src/main.rs

  • Create: tests/planner_test.rs

  • Create: tests/runtime_task_flow_test.rs

  • Step 1: Write failing planner tests

Add tests for parsing:

  • 打开百度搜索天气 ("open Baidu and search for weather")
  • 打开百度搜索电网调度 ("open Baidu and search for power grid dispatch")

Expected output is an ordered action plan: navigate, type, click.

  • Step 2: Run planner tests

Run: cargo test --test planner_test -q

Expected: FAIL because no planner exists.

  • Step 3: Implement rule-based planner

Create src/agent/planner.rs with a minimal parser that only accepts the Baidu-search intent family and rejects everything else clearly.
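One way such a planner could look is sketched below. The type names, the error variants, and the CSS selectors `#kw` and `#su` are assumptions made for illustration; the real selectors should come from whatever the demo page actually exposes.

```rust
/// One step of the ordered action plan.
#[derive(Debug, Clone, PartialEq)]
pub enum BrowserAction {
    Navigate { url: String },
    Type { selector: String, text: String },
    Click { selector: String },
}

/// Why an instruction was rejected.
#[derive(Debug, PartialEq)]
pub enum PlanError {
    /// The instruction is outside the Baidu-search intent family.
    UnsupportedIntent(String),
    /// The prefix matched but no search term followed it.
    EmptyQuery,
}

/// The only accepted intent prefix ("open Baidu and search for ...").
const BAIDU_SEARCH_PREFIX: &str = "打开百度搜索";

/// Turn a Baidu-search instruction into navigate -> type -> click.
pub fn plan(instruction: &str) -> Result<Vec<BrowserAction>, PlanError> {
    let trimmed = instruction.trim();
    let query = trimmed
        .strip_prefix(BAIDU_SEARCH_PREFIX)
        .ok_or_else(|| PlanError::UnsupportedIntent(trimmed.to_string()))?
        .trim();
    if query.is_empty() {
        return Err(PlanError::EmptyQuery);
    }
    Ok(vec![
        BrowserAction::Navigate {
            url: "https://www.baidu.com".to_string(),
        },
        BrowserAction::Type {
            selector: "#kw".to_string(), // assumed search box selector
            text: query.to_string(),
        },
        BrowserAction::Click {
            selector: "#su".to_string(), // assumed search button selector
        },
    ])
}
```

Returning a typed error rather than an empty plan keeps the "rejects everything else clearly" requirement testable: the caller can report the unsupported instruction back in `task_complete`.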

  • Step 4: Wire submit_task handling into runtime entry

Update src/lib.rs and src/main.rs so the Rust process can receive a task message, execute the planner, call BrowserPipeTool, and emit task_complete.

  • Step 5: Add end-to-end runtime test

Use a mock transport to validate:

  • receive submit_task

  • send three browser commands

  • consume three responses

  • emit task_complete
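The four checks above might be driven by an in-memory transport like the one sketched here. Everything in it is hypothetical scaffolding: the `Transport` trait, the fixed three-command stand-in for the planner, and the substring matching used in place of real JSON parsing.

```rust
use std::collections::VecDeque;

/// Hypothetical line-oriented transport abstraction over the STDIO pipe.
pub trait Transport {
    fn send(&mut self, line: &str);
    fn recv(&mut self) -> Option<String>;
}

/// In-memory transport: scripted inbound lines, recorded outbound lines.
#[derive(Default)]
pub struct MockTransport {
    pub inbound: VecDeque<String>,
    pub sent: Vec<String>,
}

impl Transport for MockTransport {
    fn send(&mut self, line: &str) {
        self.sent.push(line.to_string());
    }
    fn recv(&mut self) -> Option<String> {
        self.inbound.pop_front()
    }
}

/// Stand-in for the planner: always the three demo actions.
fn plan_commands(_task: &str) -> Vec<&'static str> {
    vec!["navigate", "type", "click"]
}

/// Receive one submit_task, run its commands in order, emit task_complete.
/// Substring checks replace JSON parsing to keep the sketch dependency-free.
pub fn handle_one_task<T: Transport>(t: &mut T) -> Result<(), String> {
    let task = t.recv().ok_or("no submit_task received")?;
    if !task.contains("submit_task") {
        return Err(format!("unexpected message: {}", task));
    }
    for (i, action) in plan_commands(&task).into_iter().enumerate() {
        let seq = i + 1;
        t.send(&format!("{{\"seq\":{},\"action\":\"{}\"}}", seq, action));
        let resp = t
            .recv()
            .ok_or_else(|| format!("no response for seq {}", seq))?;
        if !resp.contains("\"ok\":true") {
            t.send("{\"type\":\"task_complete\",\"success\":false}");
            return Err(format!("command {} failed", seq));
        }
    }
    t.send("{\"type\":\"task_complete\",\"success\":true}");
    Ok(())
}
```

A test would queue one `submit_task` line and three success responses, call `handle_one_task`, and then assert that exactly four lines were sent, the last being `task_complete`.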

  • Step 6: Re-run Rust tests

Run: cargo test -q

Expected: PASS for planner and runtime task flow.

Task 3: Reuse Existing SuperRPA Browser Execution Path

Files:

  • Modify: src/chrome/browser/superrpa/sgclaw/sgclaw_process_host.h

  • Modify: src/chrome/browser/superrpa/sgclaw/sgclaw_process_host.cc

  • Modify: src/chrome/browser/superrpa/sgclaw/pipe_listener.h

  • Modify: src/chrome/browser/superrpa/sgclaw/pipe_listener.cc

  • Modify: src/chrome/browser/superrpa/BUILD.gn

  • Step 1: Add failing browser-side host/listener tests

Cover:

  • process start

  • init handshake timeout

  • JSON Line split and dispatch

  • listener rejection of invalid payloads

  • Step 2: Implement process host skeleton

Add lifecycle states and Start/Stop/SendLine using the existing sgclaw area, not a parallel subsystem.

  • Step 3: Implement listener

Read stdout, split lines, reject empty/oversized/invalid JSON, and forward valid messages to sgclaw dispatch code.
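The split-and-reject logic can be sketched as follows. It is written in Rust only to stay consistent with the other sketches in this plan; the browser-side listener itself would be C++. The 64 KiB limit is an assumed value, and the brace check is a placeholder shape test, not JSON validation, which a real parser would perform.

```rust
/// Why a line was dropped before dispatch.
#[derive(Debug, PartialEq)]
pub enum Reject {
    Empty,
    Oversized,
    NotJsonObject,
}

/// Assumed upper bound for a single JSON Line.
pub const MAX_LINE_BYTES: usize = 64 * 1024;

/// Accumulates stdout chunks and yields complete lines.
#[derive(Default)]
pub struct LineSplitter {
    pending: Vec<u8>,
}

impl LineSplitter {
    /// Append a chunk and return every line completed by it.
    /// Bytes after the last '\n' stay buffered for the next call.
    pub fn push(&mut self, chunk: &[u8]) -> Vec<Result<String, Reject>> {
        self.pending.extend_from_slice(chunk);
        let mut out = Vec::new();
        while let Some(pos) = self.pending.iter().position(|&b| b == b'\n') {
            // Take the line including its newline, then classify without it.
            let line: Vec<u8> = self.pending.drain(..=pos).collect();
            out.push(classify(&line[..line.len() - 1]));
        }
        out
    }
}

/// Apply the rejection rules to one line (newline already removed).
fn classify(raw: &[u8]) -> Result<String, Reject> {
    if raw.len() > MAX_LINE_BYTES {
        return Err(Reject::Oversized);
    }
    let text = String::from_utf8_lossy(raw);
    let text = text.trim();
    if text.is_empty() {
        return Err(Reject::Empty);
    }
    // Placeholder shape check only; a real listener parses the JSON here.
    if !(text.starts_with('{') && text.ends_with('}')) {
        return Err(Reject::NotJsonObject);
    }
    Ok(text.to_string())
}
```

The sketch does not bound `pending` when no newline ever arrives; a production listener would also cap the buffered tail so a misbehaving child process cannot grow it without limit.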

  • Step 4: Hook build targets

Update src/chrome/browser/superrpa/BUILD.gn to compile the sgclaw host/listener path inside existing targets.

  • Step 5: Run browser unit tests

Run the relevant superrpa_unittests target for the added cases. Expected: PASS.

Task 4: Reuse CommandRouter and Security Gates

Files:

  • Modify: src/chrome/browser/superrpa/router/command_router.h

  • Modify: src/chrome/browser/superrpa/router/command_router.cc

  • Modify: src/chrome/browser/superrpa/sgclaw/sgclaw_command_dispatcher.cc

  • Modify: src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.h

  • Modify: src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.cc

  • Modify: src/chrome/browser/superrpa/rules/rpa_rules_service_factory.cc

  • Step 1: Write failing dispatch/security tests

Cover:

  • allowed Baidu demo task

  • blocked non-whitelisted domain

  • blocked unsupported action

  • HMAC mismatch rejection

  • Step 2: Reuse command entrypoints

Map sgclaw commands into existing methods:

  • ExecuteNavigate

  • ExecuteType

  • ExecuteClick

  • ExecuteGetText

  • Step 3: Reuse security layers

Ensure the sgclaw path reads the existing rules services and runs sgclaw_security_gate as a secondary check before every dispatch.
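The order of checks behind the four test cases in Step 1 could look like the sketch below. It is written in Rust for consistency with the other sketches, while the actual gate lives in the C++ sgclaw_security_gate. The `Verdict` names, the check order, and the rule that subdomains of an allowed domain are accepted are all assumptions; `hmac_ok` stands for the result of a signature verification done elsewhere.

```rust
/// Outcome of the secondary security check.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Allow,
    DenyHmacMismatch,
    DenyUnsupportedAction,
    DenyDomain,
}

/// Allow lists, assumed to be loaded from the existing rules services.
pub struct SecurityGate {
    pub allowed_actions: Vec<String>,
    pub allowed_domains: Vec<String>,
}

impl SecurityGate {
    /// True if `host` equals an allowed domain or is a subdomain of one.
    /// The leading dot prevents "evilbaidu.com" from matching "baidu.com".
    fn domain_allowed(&self, host: &str) -> bool {
        let host = host.to_ascii_lowercase();
        self.allowed_domains.iter().any(|d| {
            let d = d.to_ascii_lowercase();
            host == d || host.ends_with(format!(".{}", d).as_str())
        })
    }

    /// Check authenticity first, then the action, then the target domain.
    pub fn check(&self, hmac_ok: bool, action: &str, host: &str) -> Verdict {
        if !hmac_ok {
            return Verdict::DenyHmacMismatch;
        }
        if !self.allowed_actions.iter().any(|a| a.as_str() == action) {
            return Verdict::DenyUnsupportedAction;
        }
        if !self.domain_allowed(host) {
            return Verdict::DenyDomain;
        }
        Verdict::Allow
    }
}
```

Checking the signature before anything else means an unauthenticated message never reaches the rule lookups, and each denial reason maps to exactly one of the planned test cases.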

  • Step 4: Add demo rules source

If needed, gate Baidu allow rules behind profile/demo config rather than broad permanent defaults.

  • Step 5: Re-run browser tests

Run the focused security/dispatch unit tests. Expected: PASS.

Task 5: Wire FunctionsUI Submission and Result Flow

Files:

  • Modify: src/chrome/browser/resources/superrpa/devtools/functions/functions.ts

  • Modify: src/chrome/browser/resources/superrpa/devtools/functions/functions_manifest.ts

  • Modify: browser-side bridge code that receives window.__SUPER_RPA_BRIDGE__ calls

  • Step 1: Write failing UI bridge test or manual harness case

Cover:

  • sgclaw_start

  • sgclaw_stop

  • sgclaw_submit_task

  • result/event propagation

  • Step 2: Add bridge entry points

Expose minimal callable actions from FunctionsUI to the browser-side sgclaw host.

  • Step 3: Surface task lifecycle events

Push state, logs, and final result back to FunctionsUI without introducing a new parallel UI subsystem.

  • Step 4: Validate manual smoke path

Manual test:

  1. Open FunctionsUI
  2. Start sgclaw
  3. Submit 打开百度搜索天气
  4. Observe logs and completion summary
  • Step 5: Document the bridge contract

Add a short browser-side note describing the exact payloads for start/stop/submit/result.

Task 6: Add Phase 2 Agent Runtime with DeepSeek

Files:

  • Create: src/agent/runtime.rs

  • Create: src/llm/mod.rs

  • Create: src/llm/provider.rs

  • Create: src/llm/deepseek.rs

  • Create: src/config/mod.rs

  • Create: src/config/settings.rs

  • Modify: src/pipe/browser_tool.rs

  • Modify: src/lib.rs

  • Create: tests/deepseek_provider_test.rs

  • Create: tests/agent_runtime_test.rs

  • Step 1: Write failing provider tests

Cover:

  • config loading from env

  • request shape for the DeepSeek OpenAI-compatible chat API

  • model default = deepseek-chat

  • Step 2: Implement provider abstraction

Add a minimal provider trait and DeepSeek implementation using:

  • base_url = https://api.deepseek.com

  • model = deepseek-chat

  • API key from environment or config file, never hardcoded
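A possible shape for the settings and the provider trait is sketched below. The environment variable names (`DEEPSEEK_API_KEY`, `DEEPSEEK_BASE_URL`, `DEEPSEEK_MODEL`), the trait name, and the `ChatMessage` type are assumptions. The HTTP call is omitted because it needs an HTTP client crate; only the URL and an OpenAI-style request body are built here.

```rust
/// Connection settings; the API key is never given a default.
#[derive(Debug)]
pub struct DeepSeekSettings {
    pub base_url: String,
    pub model: String,
    pub api_key: String,
}

impl DeepSeekSettings {
    /// Build settings from any name -> value lookup, so tests do not have
    /// to mutate the process environment. Variable names are assumptions.
    pub fn from_lookup<F>(lookup: F) -> Result<Self, String>
    where
        F: Fn(&str) -> Option<String>,
    {
        let api_key = lookup("DEEPSEEK_API_KEY")
            .filter(|k| !k.trim().is_empty())
            .ok_or_else(|| "DEEPSEEK_API_KEY is not set".to_string())?;
        Ok(Self {
            base_url: lookup("DEEPSEEK_BASE_URL")
                .unwrap_or_else(|| "https://api.deepseek.com".to_string()),
            model: lookup("DEEPSEEK_MODEL").unwrap_or_else(|| "deepseek-chat".to_string()),
            api_key,
        })
    }

    /// Same as `from_lookup`, reading the real process environment.
    pub fn from_env() -> Result<Self, String> {
        Self::from_lookup(|name| std::env::var(name).ok())
    }

    /// Chat completions endpoint under the configured base URL.
    pub fn chat_url(&self) -> String {
        format!("{}/chat/completions", self.base_url.trim_end_matches('/'))
    }
}

pub struct ChatMessage {
    pub role: String,
    pub content: String,
}

/// Minimal provider abstraction; errors are plain strings in this sketch.
pub trait LlmProvider {
    fn complete(&self, messages: &[ChatMessage]) -> Result<String, String>;
}

/// Escapes backslashes, quotes and newlines only; use a JSON library in real code.
fn esc(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"").replace('\n', "\\n")
}

/// Build the request body: a model name plus a list of role/content messages.
pub fn request_body(model: &str, messages: &[ChatMessage]) -> String {
    let msgs: Vec<String> = messages
        .iter()
        .map(|m| {
            format!(
                "{{\"role\":\"{}\",\"content\":\"{}\"}}",
                esc(&m.role),
                esc(&m.content)
            )
        })
        .collect();
    format!(
        "{{\"model\":\"{}\",\"messages\":[{}]}}",
        esc(model),
        msgs.join(",")
    )
}
```

Passing the lookup as a closure keeps the "config loading from env" test deterministic, and a missing key fails at construction time rather than at the first request.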

  • Step 3: Write failing runtime tests

Cover:

  • tool registration for browser_action

  • one think-act-observe cycle

  • final summary generation after successful browser actions

  • Step 4: Implement Agent runtime

Create a minimal AgentRuntime that can:

  • receive task text

  • call provider

  • parse tool call

  • invoke BrowserPipeTool

  • emit task_complete
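The loop described by these five responsibilities might be sketched as below. All names (`Step`, `Provider`, `BrowserTool`, `run_agent`, the step budget) are hypothetical, and the scripted provider and recording tool are test doubles standing in for the DeepSeek provider and BrowserPipeTool.

```rust
use std::collections::VecDeque;

/// What the model asked for on one turn.
#[derive(Debug, Clone, PartialEq)]
pub enum Step {
    ToolCall { action: String, args: String },
    Final(String),
}

/// Result reported back to the browser as task_complete.
#[derive(Debug, PartialEq)]
pub struct TaskComplete {
    pub success: bool,
    pub summary: String,
}

pub trait Provider {
    fn next_step(&mut self, transcript: &[String]) -> Step;
}

pub trait BrowserTool {
    fn invoke(&mut self, action: &str, args: &str) -> Result<String, String>;
}

/// Think-act-observe loop with a hard step budget.
pub fn run_agent<P: Provider, B: BrowserTool>(
    task: &str,
    provider: &mut P,
    tool: &mut B,
    max_steps: usize,
) -> TaskComplete {
    let mut transcript = vec![format!("task: {}", task)];
    for _ in 0..max_steps {
        match provider.next_step(&transcript) {
            Step::Final(summary) => return TaskComplete { success: true, summary },
            Step::ToolCall { action, args } => match tool.invoke(&action, &args) {
                // Feed the observation back so the next turn can see it.
                Ok(obs) => transcript.push(format!("observation({}): {}", action, obs)),
                Err(e) => {
                    return TaskComplete {
                        success: false,
                        summary: format!("{} failed: {}", action, e),
                    }
                }
            },
        }
    }
    TaskComplete {
        success: false,
        summary: "step budget exhausted".to_string(),
    }
}

/// Test double: replays a fixed list of steps.
pub struct ScriptedProvider {
    steps: VecDeque<Step>,
}

impl ScriptedProvider {
    pub fn new(steps: Vec<Step>) -> Self {
        Self { steps: VecDeque::from(steps) }
    }
}

impl Provider for ScriptedProvider {
    fn next_step(&mut self, _transcript: &[String]) -> Step {
        self.steps
            .pop_front()
            .unwrap_or_else(|| Step::Final("script exhausted".to_string()))
    }
}

/// Test double: records every action and always succeeds.
#[derive(Default)]
pub struct RecordingTool {
    pub calls: Vec<String>,
}

impl BrowserTool for RecordingTool {
    fn invoke(&mut self, action: &str, _args: &str) -> Result<String, String> {
        self.calls.push(action.to_string());
        Ok("ok".to_string())
    }
}
```

The step budget is a design choice for this sketch: it guarantees the runtime always emits a `task_complete`, even if the model never produces a final answer.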

  • Step 5: Keep Phase 1 fallback

Retain the rule-based planner as a fallback path for offline/demo use and for controlled debugging.

  • Step 6: Re-run Rust tests

Run: cargo test -q

Expected: PASS, including the provider and runtime suites.

Task 7: Final Cross-Repo Acceptance and Low-Context Docs

Files:

  • Modify: README.md

  • Create: docs/superpowers/acceptance/2026-03-25-superrpa-sgclaw-browser-control.md

  • Modify: docs/浏览器对接标准.md

  • Modify: docs/sgclaw_project_team_kickoff.md

  • Step 1: Write acceptance checklist

Cover:

  • handshake

  • submit_task

  • Baidu search success

  • HMAC mismatch failure

  • non-whitelisted domain rejection

  • Step 2: Create low-context handoff docs

Write one short acceptance doc that links only the required files and commands for each phase.

  • Step 3: Run final smoke tests

Rust repo: cargo test -q

Browser repo: run focused superrpa_unittests

Manual: submit 打开百度搜索天气

  • Step 4: Update top-level docs

Update README and browser contract docs so the next contributor can find:

  • Phase 1 demo loop

  • Phase 2 Agent loop

  • exact integration points

  • Step 5: Commit in small slices

Suggested commit order:

  1. feat: align sgclaw pipe contract for task flow
  2. feat: add phase1 baidu demo planner
  3. feat: wire superrpa sgclaw process host and dispatcher
  4. feat: add functionsui sgclaw task bridge
  5. feat: add deepseek-backed agent runtime
  6. docs: add acceptance and integration notes