# SuperRPA sgClaw Browser Control Design

## Goal

Build `sgclaw` in two phases so it can control the existing SuperRPA browser with minimal new surface area.

- Phase 1: deliver a demo-safe closed loop for a fixed instruction like `打开百度搜索天气` ("open Baidu and search for weather").
- Phase 2: upgrade that loop into a real Agent flow backed by `deepseek-chat`.

The design must maximize reuse of existing SuperRPA browser interfaces and minimize working context for future contributors.

## Scope

### In Scope

- Reuse SuperRPA `CommandRouter` as the browser execution entry.
- Reuse existing browser rule and security infrastructure where possible.
- Keep the Rust side responsible for task understanding, sequencing, and pipe protocol.
- Keep the browser side responsible for process hosting, security re-check, and command dispatch.
- Use layered docs so contributors only read the smallest necessary document.

### Out of Scope

- New browser automation APIs parallel to `CommandRouter`
- Full SkillLoader / Memory / MCP work in Phase 1
- Broad action-set expansion beyond `click`, `type`, `navigate`, `getText`

## Existing Integration Points

### sgClaw Repository

- Pipe and security baseline already exist in [`src/pipe/protocol.rs`](/home/zyl/projects/sgClaw/src/pipe/protocol.rs), [`src/pipe/handshake.rs`](/home/zyl/projects/sgClaw/src/pipe/handshake.rs), [`src/pipe/browser_tool.rs`](/home/zyl/projects/sgClaw/src/pipe/browser_tool.rs), and [`src/security/mac_policy.rs`](/home/zyl/projects/sgClaw/src/security/mac_policy.rs).
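On the sgClaw side, the action set that the Scope section limits to `click`, `type`, `navigate`, and `getText` could be modeled as a closed enum, so that any other action name is rejected before it reaches the pipe. The following is a minimal sketch under that assumption: the `BrowserAction` type and its methods are hypothetical names for illustration and are not taken from `src/pipe/protocol.rs`.

```rust
use std::str::FromStr;

/// Hypothetical closed set of browser actions allowed in Phase 1.
/// The wire names mirror the four actions listed in the Scope section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BrowserAction {
    Click,
    Type,
    Navigate,
    GetText,
}

impl BrowserAction {
    /// Wire name of the action (assumed to match the browser-side names).
    pub fn as_str(self) -> &'static str {
        match self {
            BrowserAction::Click => "click",
            BrowserAction::Type => "type",
            BrowserAction::Navigate => "navigate",
            BrowserAction::GetText => "getText",
        }
    }
}

impl FromStr for BrowserAction {
    type Err = String;

    /// Parse a wire name; anything outside the four actions is an error.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "click" => Ok(BrowserAction::Click),
            "type" => Ok(BrowserAction::Type),
            "navigate" => Ok(BrowserAction::Navigate),
            "getText" => Ok(BrowserAction::GetText),
            other => Err(format!("unsupported action: {other}")),
        }
    }
}
```

With a type like this, `"getText".parse::<BrowserAction>()` succeeds while `"scroll".parse::<BrowserAction>()` returns an error, which keeps action-set expansion an explicit code change rather than a side effect of accepting arbitrary strings.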
### SuperRPA Repository

- Browser command entry: `src/chrome/browser/superrpa/router/command_router.h/.cc`
- Existing sgclaw dispatch/security area: `src/chrome/browser/superrpa/sgclaw/sgclaw_command_dispatcher.cc`, `src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.h/.cc`
- FunctionsUI front-end entry: `src/chrome/browser/resources/superrpa/devtools/functions/functions.ts`
- Rules and whitelist sources: `src/chrome/browser/superrpa/rules/*`, `src/chrome/browser/superrpa/zombie/resource_controller.*`

## Recommended Architecture

Use a thin-adapter design.

1. Rust owns `submit_task`, planning, pipe messages, response correlation, and final task completion.
2. SuperRPA owns `sgclaw` process lifecycle, JSON Lines I/O, secondary security validation, and delegation into existing `CommandRouter`.
3. Phase 1 uses a rule-based planner for one narrow intent family: `打开百度搜索X` ("open Baidu and search for X").
4. Phase 2 replaces that planner with a real Agent runtime using `deepseek-chat`, but keeps the same `BrowserPipeTool` contract so browser-side code stays thin.

This preserves the browser's existing abstractions and avoids duplicating action logic.

## Phase Design

### Phase 1: Minimal Demo Loop

- Add task-level messages on top of the existing pipe.
- Accept a `submit_task` instruction from the browser bridge.
- Parse only one pattern family: open Baidu, enter query, click search.
- Return `task_complete` with summary and step log.
- Allow Baidu only in demo rules, not as a permanent broad whitelist expansion.

### Phase 2: Real Agent Loop

- Add `agent/runtime.rs` and provider abstraction.
- Register `BrowserPipeTool` as `browser_action`.
- Default provider is DeepSeek with `base_url=https://api.deepseek.com` and model `deepseek-chat`.
- Keep provider config externalized through environment variables and settings files.

## Security

- HMAC must be aligned to the browser contract exactly: `\n\n\n`.
- Rust validates before send; browser validates again before dispatch.
- `rules.json` remains the source for domain/action allow rules.
- Demo-only domains like `baidu.com` must be clearly isolated in a demo profile or demo rules file.

## Context Control Strategy

Use four small docs instead of one large narrative:

1. This design doc: goals, boundaries, architecture.
2. Browser contract doc: exact message shapes and file paths.
3. Plan doc: execution order and concrete files.
4. Acceptance doc: smoke tests and failure matrix.

Each implementation task should point only to the doc section it needs.

## Testing Strategy

- Rust unit tests for protocol, planner, HMAC, and runtime message handling
- Rust integration tests for `submit_task -> command -> response -> task_complete`
- SuperRPA unit tests for process host, listener, security gate, and dispatch mapping
- Cross-repo smoke test for `打开百度搜索天气`

## Acceptance Criteria

### Phase 1

- Start `sgclaw` from SuperRPA
- Send `submit_task`
- Navigate to Baidu and search for a keyword through existing browser actions
- Surface logs and final result back to FunctionsUI

### Phase 2

- Execute the same flow through `deepseek-chat`
- Keep the same browser contract and command mapping
- Expose provider/model config without code changes
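As an illustration of the Phase 1 flow that these criteria exercise, the rule-based planner for the `打开百度搜索X` pattern might look roughly like the sketch below. It is a minimal sketch, not the project's implementation: the `Step` enum, the `plan_baidu_search` function, the Baidu URL, and the `#kw` / `#su` selectors are all assumptions made for illustration.

```rust
/// Hypothetical planned step; variants use three of the Phase 1 actions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Step {
    Navigate { url: String },
    Type { selector: String, text: String },
    Click { selector: String },
}

/// The only intent family Phase 1 accepts.
const BAIDU_SEARCH_PREFIX: &str = "打开百度搜索";

/// Turn an instruction such as `打开百度搜索天气` into three browser steps:
/// open Baidu, enter the query, click search.
/// Returns `None` for any instruction outside the supported pattern.
pub fn plan_baidu_search(instruction: &str) -> Option<Vec<Step>> {
    // Everything after the fixed prefix is treated as the search query.
    let query = instruction.trim().strip_prefix(BAIDU_SEARCH_PREFIX)?.trim();
    if query.is_empty() {
        return None;
    }
    Some(vec![
        // Assumed entry URL for the demo rules.
        Step::Navigate { url: "https://www.baidu.com".to_string() },
        // `#kw` is assumed to be the search input selector.
        Step::Type { selector: "#kw".to_string(), text: query.to_string() },
        // `#su` is assumed to be the search button selector.
        Step::Click { selector: "#su".to_string() },
    ])
}
```

Because the planner returns plain data, the Phase 2 Agent runtime could replace it without changing how steps are sent through `BrowserPipeTool`, and the planner itself stays easy to cover with Rust unit tests.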