chore: seed sgclaw rust baseline
@@ -0,0 +1,107 @@
# SuperRPA sgClaw Browser Control Design

## Goal

Build `sgclaw` in two phases so it can control the existing SuperRPA browser with minimal new surface area.

- Phase 1: deliver a demo-safe closed loop for a fixed instruction such as `打开百度搜索天气` ("open Baidu and search for weather").
- Phase 2: upgrade that loop into a real Agent flow backed by `deepseek-chat`.

The design must maximize reuse of existing SuperRPA browser interfaces and keep the context that future contributors must load as small as possible.

## Scope

### In Scope

- Reuse SuperRPA `CommandRouter` as the browser execution entry point.
- Reuse existing browser rule and security infrastructure where possible.
- Keep the Rust side responsible for task understanding, sequencing, and the pipe protocol.
- Keep the browser side responsible for process hosting, security re-check, and command dispatch.
- Use layered docs so contributors read only the smallest document they need.

### Out of Scope

- New browser automation APIs parallel to `CommandRouter`
- Full SkillLoader / Memory / MCP work in Phase 1
- Broad action-set expansion beyond `click`, `type`, `navigate`, `getText`

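
One way to hold the Rust side to the four-action limit is a closed enum, so any other action name is rejected before it reaches the pipe. This is a minimal sketch: the `Action` type and its error message are hypothetical, and only the four wire names (`click`, `type`, `navigate`, `getText`) come from this document.

```rust
use std::str::FromStr;

/// Hypothetical closed set of browser actions allowed by this design.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Click,
    Type,
    Navigate,
    GetText,
}

impl Action {
    /// Wire name as written in this document's action set.
    pub fn as_str(self) -> &'static str {
        match self {
            Action::Click => "click",
            Action::Type => "type",
            Action::Navigate => "navigate",
            Action::GetText => "getText",
        }
    }
}

impl FromStr for Action {
    type Err = String;

    /// Parsing is case-sensitive; unknown names are rejected rather than passed through.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "click" => Ok(Action::Click),
            "type" => Ok(Action::Type),
            "navigate" => Ok(Action::Navigate),
            "getText" => Ok(Action::GetText),
            other => Err(format!("unsupported action: {}", other)),
        }
    }
}
```

With this shape, adding a fifth action is a deliberate code change rather than a new string that slips through.
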
## Existing Integration Points

### sgClaw Repository

- Pipe and security baseline already exist in [`src/pipe/protocol.rs`](/home/zyl/projects/sgClaw/src/pipe/protocol.rs), [`src/pipe/handshake.rs`](/home/zyl/projects/sgClaw/src/pipe/handshake.rs), [`src/pipe/browser_tool.rs`](/home/zyl/projects/sgClaw/src/pipe/browser_tool.rs), and [`src/security/mac_policy.rs`](/home/zyl/projects/sgClaw/src/security/mac_policy.rs).

### SuperRPA Repository

- Browser command entry: `src/chrome/browser/superrpa/router/command_router.h/.cc`
- Existing sgclaw dispatch/security area: `src/chrome/browser/superrpa/sgclaw/sgclaw_command_dispatcher.cc`, `src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.h/.cc`
- FunctionsUI front-end entry: `src/chrome/browser/resources/superrpa/devtools/functions/functions.ts`
- Rules and whitelist sources: `src/chrome/browser/superrpa/rules/*`, `src/chrome/browser/superrpa/zombie/resource_controller.*`

## Recommended Architecture

Use a thin-adapter design.

1. Rust owns `submit_task`, planning, pipe messages, response correlation, and final task completion.
2. SuperRPA owns the `sgclaw` process lifecycle, JSON Lines I/O, secondary security validation, and delegation into the existing `CommandRouter`.
3. Phase 1 uses a rule-based planner for one narrow intent family: `打开百度搜索X` ("open Baidu and search for X").
4. Phase 2 replaces that planner with a real Agent runtime using `deepseek-chat`, but keeps the same `BrowserPipeTool` contract so browser-side code stays thin.

This preserves the browser's existing abstractions and avoids duplicating action logic.

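
The "response correlation" responsibility assigned to Rust can be pictured as a small table of in-flight commands keyed by sequence number. The sketch below is illustrative only: `PendingCommands`, `Correlation`, and the choice to start sequence numbers at 1 are assumptions, not part of the existing pipe code.

```rust
use std::collections::HashMap;

/// Outcome of matching a browser response against in-flight commands.
#[derive(Debug, PartialEq, Eq)]
pub enum Correlation {
    /// The response belongs to a command we sent; carries that command's action name.
    Matched { action: String },
    /// No command with this sequence number is waiting (duplicate or stray response).
    Unexpected,
}

/// Hypothetical tracker for commands sent over the pipe but not yet answered.
#[derive(Default)]
pub struct PendingCommands {
    next_seq: u64,
    in_flight: HashMap<u64, String>,
}

impl PendingCommands {
    /// Allocates the next sequence number and remembers the action sent under it.
    pub fn register(&mut self, action: &str) -> u64 {
        self.next_seq += 1;
        self.in_flight.insert(self.next_seq, action.to_string());
        self.next_seq
    }

    /// Consumes the pending entry for `seq`, so a repeated response is reported as unexpected.
    pub fn resolve(&mut self, seq: u64) -> Correlation {
        match self.in_flight.remove(&seq) {
            Some(action) => Correlation::Matched { action },
            None => Correlation::Unexpected,
        }
    }

    /// True when every sent command has been answered.
    pub fn is_idle(&self) -> bool {
        self.in_flight.is_empty()
    }
}
```

Keeping this state on the Rust side is what lets the browser side remain a thin dispatcher: it only echoes the sequence number back.
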
## Phase Design

### Phase 1: Minimal Demo Loop

- Add task-level messages on top of the existing pipe.
- Accept a `submit_task` instruction from the browser bridge.
- Parse only one pattern family: open Baidu, enter query, click search.
- Return `task_complete` with summary and step log.
- Allow Baidu only in demo rules, not as a permanent broad whitelist expansion.

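
A rule-based planner for this single pattern family can be very small. The following is a sketch under stated assumptions: the `Step` enum, the Baidu URL, and the CSS selectors `#kw` and `#su` are placeholders chosen for illustration, and the real values belong in the browser contract doc.

```rust
/// One planned browser step; variants mirror three of the actions named in this design.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Step {
    Navigate { url: String },
    Type { selector: String, text: String },
    Click { selector: String },
}

/// The only instruction prefix Phase 1 understands.
const PREFIX: &str = "打开百度搜索";

/// Returns a fixed three-step plan for `打开百度搜索X`, or `None` for any other instruction.
pub fn plan(instruction: &str) -> Option<Vec<Step>> {
    let query = instruction.trim().strip_prefix(PREFIX)?.trim();
    if query.is_empty() {
        // A bare prefix with no query is not a valid task.
        return None;
    }
    Some(vec![
        // Assumed entry URL for the demo.
        Step::Navigate { url: "https://www.baidu.com".to_string() },
        // Assumed selectors for the search box and the search button.
        Step::Type { selector: "#kw".to_string(), text: query.to_string() },
        Step::Click { selector: "#su".to_string() },
    ])
}
```

Returning `None` for everything outside the pattern keeps the demo loop closed: unsupported instructions fail at planning time instead of producing partial browser activity.
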
### Phase 2: Real Agent Loop

- Add `agent/runtime.rs` and a provider abstraction.
- Register `BrowserPipeTool` as `browser_action`.
- The default provider is DeepSeek, with `base_url=https://api.deepseek.com` and model `deepseek-chat`.
- Keep provider config externalized through environment variables and settings files.

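
Externalized provider config with the DeepSeek defaults could look roughly like this. The environment variable names (`SGCLAW_LLM_BASE_URL`, `SGCLAW_LLM_MODEL`, `SGCLAW_LLM_API_KEY`) and the `ProviderConfig` type are hypothetical; only the default base URL and model name come from this design. Settings-file loading is left out of the sketch.

```rust
use std::env;

const DEFAULT_BASE_URL: &str = "https://api.deepseek.com";
const DEFAULT_MODEL: &str = "deepseek-chat";

/// Hypothetical provider settings resolved at startup.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ProviderConfig {
    pub base_url: String,
    pub model: String,
    pub api_key: Option<String>,
}

impl ProviderConfig {
    /// Builds a config from any key lookup, falling back to the DeepSeek defaults.
    /// Taking the lookup as a parameter keeps tests independent of the process environment.
    pub fn from_lookup<F>(lookup: F) -> Self
    where
        F: Fn(&str) -> Option<String>,
    {
        // Treat unset and blank values the same way.
        let non_empty = |key: &str| lookup(key).filter(|v| !v.trim().is_empty());
        ProviderConfig {
            base_url: non_empty("SGCLAW_LLM_BASE_URL")
                .unwrap_or_else(|| DEFAULT_BASE_URL.to_string()),
            model: non_empty("SGCLAW_LLM_MODEL").unwrap_or_else(|| DEFAULT_MODEL.to_string()),
            api_key: non_empty("SGCLAW_LLM_API_KEY"),
        }
    }

    /// Reads the same keys from the real process environment.
    pub fn from_env() -> Self {
        Self::from_lookup(|key| env::var(key).ok())
    }
}
```

Because the defaults live in one place, switching provider or model is a matter of setting variables rather than editing code.
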
## Security

- The HMAC input must match the browser contract exactly: `<seq>\n<action>\n<stable_json(params)>\n<expected_domain>`.
- Rust validates before send; the browser validates again before dispatch.
- `rules.json` remains the source for domain/action allow rules.
- Demo-only domains like `baidu.com` must be clearly isolated in a demo profile or demo rules file.

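
To make the HMAC input layout concrete, here is a sketch that assembles the four newline-separated fields. It assumes `stable_json` means compact JSON with keys in sorted order and, for brevity, supports only string-valued params; the browser contract is the authority on both points. The MAC computation itself is omitted, since it depends on whichever HMAC implementation the repository already uses.

```rust
use std::collections::BTreeMap;

/// Escapes a string as a JSON string literal.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Assumed reading of `stable_json`: keys sorted, no whitespace.
/// `BTreeMap` iterates in key order, which gives the sorting for free.
pub fn stable_json(params: &BTreeMap<String, String>) -> String {
    let fields: Vec<String> = params
        .iter()
        .map(|(k, v)| format!("{}:{}", json_string(k), json_string(v)))
        .collect();
    format!("{{{}}}", fields.join(","))
}

/// Builds `<seq>\n<action>\n<stable_json(params)>\n<expected_domain>`.
pub fn signing_payload(
    seq: u64,
    action: &str,
    params: &BTreeMap<String, String>,
    expected_domain: &str,
) -> String {
    format!("{}\n{}\n{}\n{}", seq, action, stable_json(params), expected_domain)
}
```

Both sides must produce byte-identical payloads, so any difference in key order, whitespace, or escaping between the Rust and browser serializers will cause valid commands to fail verification.
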
## Context Control Strategy

Use four small docs instead of one large narrative:

1. This design doc: goals, boundaries, architecture.
2. Browser contract doc: exact message shapes and file paths.
3. Plan doc: execution order and concrete files.
4. Acceptance doc: smoke tests and failure matrix.

Each implementation task should point only to the doc section it needs.

## Testing Strategy

- Rust unit tests for protocol, planner, HMAC, and runtime message handling
- Rust integration tests for `submit_task -> command -> response -> task_complete`
- SuperRPA unit tests for process host, listener, security gate, and dispatch mapping
- Cross-repo smoke test for `打开百度搜索天气` ("open Baidu and search for weather")

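
The `submit_task -> command -> response -> task_complete` integration path can be exercised without a real browser by putting a test double behind the pipe. The sketch below is one possible shape, not the existing API: `BrowserPipe`, `FakeBrowser`, `run_task`, and the fields of `TaskComplete` are all hypothetical names.

```rust
/// Hypothetical stand-in for the browser end of the pipe.
pub trait BrowserPipe {
    /// Sends one command and returns the browser's response detail or a failure reason.
    fn execute(&mut self, action: &str, target: &str) -> Result<String, String>;
}

/// Hypothetical final message carrying a summary and a step log.
#[derive(Debug, PartialEq, Eq)]
pub struct TaskComplete {
    pub success: bool,
    pub summary: String,
    pub step_log: Vec<String>,
}

/// Runs commands in order and stops at the first failure.
pub fn run_task(pipe: &mut dyn BrowserPipe, commands: &[(&str, &str)]) -> TaskComplete {
    let mut step_log = Vec::new();
    for (index, &(action, target)) in commands.iter().enumerate() {
        match pipe.execute(action, target) {
            Ok(detail) => step_log.push(format!("{} {} ok: {}", index + 1, action, detail)),
            Err(reason) => {
                step_log.push(format!("{} {} failed: {}", index + 1, action, reason));
                return TaskComplete {
                    success: false,
                    summary: format!("stopped at step {}", index + 1),
                    step_log,
                };
            }
        }
    }
    TaskComplete {
        success: true,
        summary: format!("completed {} steps", commands.len()),
        step_log,
    }
}

/// Test double that records every command and fails on one configured action.
pub struct FakeBrowser {
    pub fail_on: Option<&'static str>,
    pub seen: Vec<String>,
}

impl BrowserPipe for FakeBrowser {
    fn execute(&mut self, action: &str, target: &str) -> Result<String, String> {
        self.seen.push(format!("{} {}", action, target));
        if self.fail_on.map_or(false, |blocked| blocked == action) {
            Err("denied".to_string())
        } else {
            Ok("done".to_string())
        }
    }
}
```

A test built on this double can assert both the happy path and that no further commands are sent after a denied step, which is the behavior the failure matrix in the acceptance doc would need to pin down.
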
## Acceptance Criteria

### Phase 1

- Start `sgclaw` from SuperRPA
- Send `submit_task`
- Navigate to Baidu and search for a keyword through existing browser actions
- Surface logs and the final result back to FunctionsUI

### Phase 2

- Execute the same flow through `deepseek-chat`
- Keep the same browser contract and command mapping
- Expose provider/model config without code changes