chore: seed sgclaw rust baseline
396
docs/browser_team_kickoff.md
Normal file
@@ -0,0 +1,396 @@
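# sgClaw Browser Team Development Kickoff Document

**Audience**: Chromium / C++ browser development team (P2)
**Goal**: With this document in hand, the browser team can start development independently and, after one cycle, complete Pipe integration testing with the sgClaw project team.
**Protocol version**: `1.0`
**Freeze date**: `2026-03-24`

---
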
## 1. Development Goals

This cycle, the browser team is responsible only for browser-side Pipe integration. LLM, Skill, Memory, and Agent reasoning are out of scope.

By the end of this cycle, the browser side must be able to:

1. Launch the `sgclaw` Rust child process from the browser main process.
2. Communicate bidirectionally with `sgclaw` over `stdin/stdout` using `JSON Line`.
3. Parse `command` messages sent by sgClaw and route them to the existing `CommandRouter`.
4. Execute the minimal set of actions needed for integration: `click`, `type`, `navigate`, `getText`.
5. Return structured `response` messages.
6. Enforce domain and action whitelist checks on the browser side.

Not in this cycle:

1. No changes to the core interface of the existing `CommandRouter`.
2. No new browser automation API.
3. No switch to HTTP, WebSocket, or Named Pipe.
4. No Rust-side logic.

---

## 2. Architecture Boundaries

The browser is the parent process and `sgclaw` is the child process. The browser side adds three modules:

1. `SgClawProcessHost`
   Starts and stops the child process, manages its state, and handles abnormal exits.
2. `PipeListener`
   Asynchronously reads `sgclaw stdout`, parses JSON line by line, and dispatches messages.
3. `MacWhitelistCheck`
   Performs a secondary browser-side security check so that unauthorized actions never reach `CommandRouter`.

The browser-side data flow is fixed as follows:

`Side Panel / UI -> SgClawProcessHost -> STDIO Pipe -> sgclaw`

`sgclaw -> STDIO Pipe -> PipeListener -> MacWhitelistCheck -> CommandRouter -> response`

---

## 3. Browser Team Deliverables

This cycle delivers the following files or equivalent modules:

1. `sgclaw_process_host.h`
2. `sgclaw_process_host.cc`
3. `pipe_listener.h`
4. `pipe_listener.cc`
5. `mac_whitelist_check.h`
6. `mac_whitelist_check.cc`
7. `rules.json`
8. The corresponding unit tests in `sgclaw_unittests`

Suggested directory layout:

```text
chrome/browser/superrpa/sgclaw/
  sgclaw_process_host.h
  sgclaw_process_host.cc
  pipe_listener.h
  pipe_listener.cc
  mac_whitelist_check.h
  mac_whitelist_check.cc
  test/
    sgclaw_process_host_unittest.cc
    pipe_listener_unittest.cc
    mac_whitelist_check_unittest.cc
  resources/
    rules.json
```

---

## 4. Frozen Interfaces

### 4.1 Transport Protocol

1. The transport layer is fixed as `STDIO Pipe`.
2. The encoding is fixed as `UTF-8`.
3. Message framing is fixed as `JSON Line`: one complete JSON document per line.
4. A single message is at most `1 MB`.
5. `stdout` carries protocol messages only; logs must go to `stderr`.

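The framing rules above can be sketched as follows. This is a minimal illustration in Rust (the language of the `sgclaw` side); the browser-side `PipeListener` would apply the same checks in C++. The names `read_frame` and `FrameError` are hypothetical, and the sketch assumes the `1 MB` limit is measured in bytes without the trailing newline, which the frozen protocol does not spell out.

```rust
use std::io::{self, BufRead};

/// Maximum size of one message (1 MB), assumed to exclude the trailing newline.
const MAX_MESSAGE_BYTES: usize = 1024 * 1024;

/// Hypothetical error type for this sketch; not part of the frozen protocol.
#[derive(Debug, PartialEq)]
enum FrameError {
    Empty,
    TooLarge,
    InvalidUtf8,
}

/// Reads one newline-terminated frame from `reader`.
/// Returns `Ok(None)` at end of stream, otherwise the frame or the reason it was rejected.
/// Note: `read_until` buffers the whole line before the size check; a production
/// reader should stop reading as soon as the limit is exceeded.
fn read_frame<R: BufRead>(reader: &mut R) -> io::Result<Option<Result<String, FrameError>>> {
    let mut buf = Vec::new();
    let n = reader.read_until(b'\n', &mut buf)?;
    if n == 0 {
        return Ok(None); // pipe closed
    }
    // Strip the line terminator ("\n" or "\r\n").
    if buf.last() == Some(&b'\n') {
        buf.pop();
    }
    if buf.last() == Some(&b'\r') {
        buf.pop();
    }
    if buf.is_empty() {
        return Ok(Some(Err(FrameError::Empty)));
    }
    if buf.len() > MAX_MESSAGE_BYTES {
        return Ok(Some(Err(FrameError::TooLarge)));
    }
    match String::from_utf8(buf) {
        Ok(line) => Ok(Some(Ok(line))),
        Err(_) => Ok(Some(Err(FrameError::InvalidUtf8))),
    }
}
```

JSON parsing is deliberately left out: the accepted line would be handed to whatever JSON library the implementation uses, and a parse failure would be reported as a separate error.
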
### 4.2 Handshake Protocol

The browser sends:

```json
{"type":"init","version":"1.0","hmac_seed":"0123456789abcdef","capabilities":["browser_action"]}
```

sgClaw replies:

```json
{"type":"init_ack","version":"1.0","agent_id":"uuid-v4","supported_actions":["click","type","navigate","getText","getHtml","waitForSelector","pageScreenshot","select","scrollTo","getAomSnapshot","storageSet","storageGet","zombieSpawn","zombieKill"]}
```

Constraints:

1. The browser must send `init` within `5s` of starting the child process.
2. If no `init_ack` arrives within `5s`, startup is considered failed.
3. If `version` does not match, the session must be terminated immediately.

### 4.3 command Message Format

```json
{
  "type":"command",
  "seq":12,
  "action":"click",
  "params":{"selector":"#submit","wait_after":300},
  "security":{
    "expected_domain":"oa.example.com",
    "hmac":"<hex>"
  }
}
```

Field requirements:

1. `seq` is a positive integer and must be unique.
2. `action` must be on the whitelist.
3. `params` must be an object.
4. `security.expected_domain` and `security.hmac` must be present.

### 4.4 response Message Format

Success:

```json
{
  "type":"response",
  "seq":12,
  "success":true,
  "data":{"text":"提交成功"},
  "aom_snapshot":[],
  "timing":{"queue_ms":2,"exec_ms":38}
}
```

Failure:

```json
{
  "type":"response",
  "seq":12,
  "success":false,
  "data":{
    "error":{
      "code":"CMD_SELECTOR_NOT_FOUND",
      "message":"selector '#submit' not found"
    }
  },
  "aom_snapshot":[],
  "timing":{"queue_ms":1,"exec_ms":10}
}
```

Constraints:

1. Each `command.seq` maps to exactly one `response.seq`.
2. Failures must return a structured error; returning a bare string is not allowed.
3. `timing` must always be included.

---

## 5. Minimal Action Set for This Cycle

Only four actions are mandatory for the integration cycle:

1. `click`
2. `type`
3. `navigate`
4. `getText`

Action semantics:

1. `click`
   Invokes the existing click capability; supports an optional `wait_after`.
2. `type`
   Types text into the target input field; supports `clear_first`.
3. `navigate`
   Navigates to the target URL.
4. `getText`
   Returns the text of the target node.

The remaining actions may keep their interfaces but are not part of this cycle's mandatory acceptance.

---

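## 6. Browser-Side Implementation Requirements

### 6.1 SgClawProcessHost

Must implement:

1. A singleton, so that multiple `sgclaw` child processes are never created.
2. `Start()` creates the anonymous pipes and launches the child process.
3. `Stop()` shuts down gracefully and force-terminates after a timeout.
4. `OnProcessCrash()` logs the error and updates the state.
5. A state machine that includes at least `Idle -> Starting -> Running -> Stopped / Crashed`.

Suggested interface:

```cpp
#include <string>

// Owns the lifecycle of the single `sgclaw` child process.
class SgClawProcessHost {
 public:
  bool Start();                          // Create the pipes and launch the child.
  void Stop();                           // Graceful shutdown, forced after a timeout.
  bool IsRunning() const;
  bool SendLine(std::string json_line);  // Write one JSON Line to the child's stdin.
};
```
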
### 6.2 PipeListener

Must implement:

1. Continuous reading of `stdout`.
2. Splitting `JSON Line` messages on newline characters.
3. Rejection of empty lines, non-JSON input, and messages larger than 1 MB.
4. Tracking of the full lifecycle of a request by `seq`.
5. Notification of `SgClawProcessHost` when the pipe disconnects.

### 6.3 CommandRouter Integration

Must implement:

1. A mapping table from `command.action` to existing browser commands.
2. Reuse of the existing `CommandRouter` wherever possible.
3. No new page-control logic written directly in the Pipe layer.
4. Responses built from the actual execution result; faking success is not allowed.

Suggested mapping:

1. `click -> CommandRouter.click`
2. `type -> CommandRouter.type`
3. `navigate -> CommandRouter.navigate`
4. `getText -> CommandRouter.getText`

### 6.4 MacWhitelistCheck

Must implement:

1. Action whitelist validation.
2. Comparison of `expected_domain` against the domain of the current page.
3. Deny by default when `rules.json` fails to load.
4. A uniform error code on every rejection.

Suggested error codes:

1. `MAC_ACTION_NOT_ALLOWED`
2. `MAC_DOMAIN_NOT_ALLOWED`
3. `MAC_RULES_LOAD_FAILED`
4. `PIPE_INVALID_JSON`
5. `PIPE_MESSAGE_TOO_LARGE`

---

## 7. Browser Team Development Sequence

### Day 1-2

1. Complete the `SgClawProcessHost` skeleton.
2. Verify start and exit with a dummy child process.
3. Get the `stdin/stdout` read/write channel working.

Acceptance:

1. Can launch `echo` or a test process.
2. Can send one line of text and receive it echoed back.

### Day 3-4

1. Complete `PipeListener`.
2. Complete the `init -> init_ack` handshake.
3. Set up the parsing structures for `command` / `response`.

Acceptance:

1. Can exchange JSON Line messages with the Rust side.
2. Can handle `seq` correlation.

### Day 5-6

1. Hook up `CommandRouter`.
2. Complete the 4 minimal actions.
3. Complete `MacWhitelistCheck`.

Acceptance:

1. When Rust issues `click/type/navigate/getText`, the browser actually executes them.
2. Non-whitelisted domains are rejected.

### Day 7

1. Complete the browser-side unit tests.
2. Provide the integration branch and run instructions.
3. Reserve half a day for integration testing with the project team.

---

## 8. Browser Team Self-Test Checklist

- [ ] `Start()` successfully launches the real `sgclaw` binary.
- [ ] Repeated calls to `Start()` do not launch multiple instances.
- [ ] `Stop()` shuts the process down cleanly.
- [ ] `init -> init_ack` succeeds.
- [ ] JSON messages larger than 1 MB are rejected.
- [ ] Non-JSON lines are rejected.
- [ ] `click/type/navigate/getText` return successfully.
- [ ] A domain mismatch returns `MAC_DOMAIN_NOT_ALLOWED`.
- [ ] A missing `rules.json` results in deny by default.
- [ ] Requests and responses can be found in the logs by `seq`.

---

## 9. Integration Input/Output Samples

### 9.1 Manual Handshake

The browser sends:

```json
{"type":"init","version":"1.0","hmac_seed":"00112233445566778899aabbccddeeff","capabilities":["browser_action"]}
```

Expected reply from Rust:

```json
{"type":"init_ack","version":"1.0","agent_id":"00000000-0000-0000-0000-000000000000","supported_actions":["click","type","navigate","getText","getHtml","waitForSelector","pageScreenshot","select","scrollTo","getAomSnapshot","storageSet","storageGet","zombieSpawn","zombieKill"]}
```

### 9.2 Minimal click Integration

Rust sends:

```json
{"type":"command","seq":1,"action":"click","params":{"selector":"#login-btn"},"security":{"expected_domain":"oa.example.com","hmac":"<hex>"}}
```

The browser replies:

```json
{"type":"response","seq":1,"success":true,"data":{},"aom_snapshot":[],"timing":{"queue_ms":1,"exec_ms":35}}
```

---

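## 10. Required Items for Integration Day

Before integration testing, the browser team must prepare:

1. A runnable browser branch.
2. The launch entry point for the `sgclaw` child process.
3. A default test configuration for `rules.json`.
4. A minimal test page containing at least one input field, one button, and one text node.
5. A mapping table from actions to `CommandRouter`.
6. An error code table.

---
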
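## 11. End-of-Cycle Acceptance Criteria

The browser team's cycle is complete when all of the following hold:

1. `sgclaw` can be started and stopped reliably from the browser.
2. The `init -> init_ack` success rate is 100%.
3. `click/type/navigate/getText` pass integration testing.
4. Every failure scenario returns a structured error.
5. The domain and action whitelists are enforced.
6. One end-to-end demo has been completed with the project team on the same test page.

---
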
## 12. Dependencies and Collaboration

The browser team depends only on the following frozen inputs:

1. Pipe protocol version: `1.0`
2. Message structures: `init / init_ack / command / response`
3. Minimal actions: `click/type/navigate/getText`
4. Security fields: `expected_domain`, `hmac`

Beyond these four items, no other detail should block browser-side development during this cycle.

390
docs/sgclaw_project_team_kickoff.md
Normal file
@@ -0,0 +1,390 @@
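# sgClaw Project Team Development Kickoff Document

**Audience**: sgClaw Rust / Agent project development team (P1a, P1b)
**Goal**: With this document in hand, the project team can start Rust-side development independently and, after one cycle, complete Pipe integration testing with the browser team.
**Protocol version**: `1.0`
**Freeze date**: `2026-03-24`

---
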
## 1. Development Goals

This cycle, the project team is responsible only for the sgClaw Rust side, not for Chromium internals.

By the end of this cycle, the Rust side must be able to:

1. Run as a child process of the browser.
2. Communicate bidirectionally over `stdin/stdout` using `JSON Line`.
3. Complete the `init -> init_ack` handshake.
4. Provide `BrowserPipeTool`, which can send `click/type/navigate/getText`.
5. Wait for and parse the browser-side `response`.
6. Perform a preliminary local `MAC Policy` check.

Not in this cycle:

1. No fallback to the HTTP/TCP demo channel.
2. No dependency on UI that the browser team has not finished.
3. No mixing of pipe protocol work with business skill work.
4. No requirement to complete the full set of 15 actions.

---

## 2. Rust Team Deliverables

This cycle delivers the following files or equivalent modules:

1. `src/main.rs`
2. `src/pipe/protocol.rs`
3. `src/pipe/handshake.rs`
4. `src/pipe/browser_tool.rs`
5. `src/pipe/mod.rs`
6. `src/security/mac_policy.rs`
7. `src/security/hmac.rs`
8. `tests/pipe_protocol_test.rs`
9. `tests/pipe_handshake_test.rs`
10. `tests/browser_tool_test.rs`
11. `tests/integration/handshake_flow_test.rs`

Suggested directory layout for this cycle:

```text
src/
  main.rs
  lib.rs
  pipe/
    mod.rs
    protocol.rs
    handshake.rs
    browser_tool.rs
  security/
    mod.rs
    hmac.rs
    mac_policy.rs
tests/
  pipe_protocol_test.rs
  pipe_handshake_test.rs
  browser_tool_test.rs
  integration/
    handshake_flow_test.rs
resources/
  rules.json
```

---

## 3. Frozen Boundaries

### 3.1 Team Boundaries for This Cycle

P1a is responsible for:

1. Pipe protocol structs.
2. The handshake.
3. `BrowserPipeTool`.
4. HMAC computation.
5. Response correlation and timeout handling.

P1b is responsible for:

1. Registering `BrowserPipeTool` as a tool in the upcoming `AgentRuntime`.
2. This cycle's integration, however, is not blocked on a complete ReAct loop.

The minimal success criteria for this cycle's integration are:

1. Rust can send commands.
2. The browser can execute them and reply.
3. Rust receives the correct response for each `seq`.

### 3.2 Process and Logging Constraints

1. `stdin` is read for protocol messages only.
2. `stdout` is written with protocol messages only.
3. All logs must be written to `stderr`.
4. On a protocol error, return a structured error or exit; writing debug logs to `stdout` is not allowed.

---

## 4. Frozen Protocol

### 4.1 Browser -> sgClaw

`init`

```json
{"type":"init","version":"1.0","hmac_seed":"0123456789abcdef","capabilities":["browser_action"]}
```

`response`

```json
{"type":"response","seq":1,"success":true,"data":{},"aom_snapshot":[],"timing":{"queue_ms":1,"exec_ms":20}}
```

### 4.2 sgClaw -> Browser

`init_ack`

```json
{"type":"init_ack","version":"1.0","agent_id":"uuid-v4","supported_actions":["click","type","navigate","getText","getHtml","waitForSelector","pageScreenshot","select","scrollTo","getAomSnapshot","storageSet","storageGet","zombieSpawn","zombieKill"]}
```

`command`

```json
{
  "type":"command",
  "seq":1,
  "action":"click",
  "params":{"selector":"#submit"},
  "security":{
    "expected_domain":"oa.example.com",
    "hmac":"<hex>"
  }
}
```

### 4.3 Mandatory Protocol Rules

1. Encoding is UTF-8.
2. One complete JSON document per line.
3. A single message is at most `1 MB`.
4. `seq` starts at `1` and increments.
5. Each `command.seq` maps to exactly one `response.seq`.
6. `version` is fixed at `1.0`.

---

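## 5. Rust-Side Implementation Requirements

### 5.1 main.rs

The job of `main` this cycle is simple:

1. Initialize logging to `stderr`.
2. Perform the handshake over `stdin/stdout`.
3. Initialize the objects that `BrowserPipeTool` needs.
4. Keep the process alive, waiting for command results and subsequent tasks.

If the current code still contains the demo HTTP entry point, this cycle must restore the pipe entry point as the primary path.
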
### 5.2 handshake.rs

Must implement:

1. Read the first message, `init`, from `stdin`.
2. Validate `version`.
3. Derive the session-level HMAC key from `hmac_seed`.
4. Generate `agent_id`.
5. Write `init_ack` back to `stdout`.

Failure conditions:

1. The first message is not `init`.
2. `version` does not match.
3. `hmac_seed` is invalid.

### 5.3 protocol.rs

Must define:

1. `BrowserMessage`
2. `AgentMessage`
3. `SecurityFields`
4. `Timing`
5. `Action`

This cycle's minimal `Action` must cover:

1. `click`
2. `type`
3. `navigate`
4. `getText`

It is recommended to keep enum variants for the remaining actions, leaving room for later extension.

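As a rough illustration of the minimal `Action` type, the sketch below maps the four wire names to an enum and back using only the standard library. The method name `as_wire` and the error type `UnknownAction` are assumptions made for this example; the real `protocol.rs` may well derive this mapping with a serialization library instead.

```rust
use std::str::FromStr;

/// The four actions that are mandatory this cycle.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    Click,
    Type,
    Navigate,
    GetText,
}

/// Hypothetical error for an action name outside the minimal set.
#[derive(Debug, PartialEq, Eq)]
struct UnknownAction(String);

impl Action {
    /// Returns the name used in the `action` field on the wire.
    fn as_wire(self) -> &'static str {
        match self {
            Action::Click => "click",
            Action::Type => "type",
            Action::Navigate => "navigate",
            Action::GetText => "getText",
        }
    }
}

impl FromStr for Action {
    type Err = UnknownAction;

    /// Parses a wire name; matching is case-sensitive (`getText`, not `gettext`).
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "click" => Ok(Action::Click),
            "type" => Ok(Action::Type),
            "navigate" => Ok(Action::Navigate),
            "getText" => Ok(Action::GetText),
            other => Err(UnknownAction(other.to_string())),
        }
    }
}
```

Later actions can be added as new variants without touching callers that only match on the four minimal ones, provided those callers include a fallback arm.
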
### 5.4 browser_tool.rs

Must implement:

1. Deserialize the input parameters into an `Action`.
2. Call the local `MAC Policy` for a pre-flight check.
3. Allocate an incrementing `seq`.
4. Compute `security.hmac`.
5. Write the `command` to `stdout`.
6. Wait for the `response` with the same `seq`.
7. Return an error on timeout.

Suggested timeouts:

1. Handshake timeout: `5s`
2. Per-action response timeout: `30s`

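The `seq` allocation and response matching steps above could look roughly like this. It is a sketch under stated assumptions: the type `PendingCommands` and its methods are hypothetical names, and timeout handling (the `30s` limit) is left to the caller rather than modeled here.

```rust
use std::collections::HashMap;

/// Tracks commands that have been sent but not yet answered, keyed by `seq`.
struct PendingCommands {
    next_seq: u64,
    /// seq -> action name, kept so logs can show what a response answers.
    in_flight: HashMap<u64, String>,
}

impl PendingCommands {
    fn new() -> Self {
        // `seq` starts at 1, as required by the protocol rules.
        Self { next_seq: 1, in_flight: HashMap::new() }
    }

    /// Allocates the next `seq` and records the command as pending.
    fn register(&mut self, action: &str) -> u64 {
        let seq = self.next_seq;
        self.next_seq += 1;
        self.in_flight.insert(seq, action.to_string());
        seq
    }

    /// Matches a response to its command. Returns the action it answers,
    /// or `None` if the `seq` is unknown or was already answered.
    fn resolve(&mut self, seq: u64) -> Option<String> {
        self.in_flight.remove(&seq)
    }
}
```

Removing the entry on the first match is what enforces the one-response-per-`seq` rule: a duplicate or stray `response.seq` resolves to `None` and can be logged and dropped instead of being associated with the wrong command.
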
### 5.5 mac_policy.rs

Minimal checks for this cycle:

1. Action whitelist.
2. Domain whitelist.
3. Storage key prefix constraints may be deferred.
4. The circuit breaker may be deferred, but its interface must be reserved.

Suggested `rules.json` format:

```json
{
  "version": "1.0",
  "domains": {
    "allowed": ["oa.example.com", "erp.example.com", "hr.example.com"]
  },
  "pipe_actions": {
    "allowed": ["click", "type", "navigate", "getText"],
    "blocked": ["eval", "executeJsInPage"]
  }
}
```

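A minimal sketch of the two mandatory checks, assuming the rule values have already been loaded from `rules.json` (JSON loading is omitted to keep the example dependency-free). The names `MacPolicy`, `MacError`, and `demo` are hypothetical, exact string matching on the domain is an assumption, and the error code strings reuse the `MAC_*` codes suggested in the browser team document.

```rust
use std::collections::HashSet;

/// Rejection reasons for the pre-flight check.
#[derive(Debug, PartialEq, Eq)]
enum MacError {
    ActionNotAllowed,
    DomainNotAllowed,
}

impl MacError {
    /// Error code string reported to the caller.
    fn code(&self) -> &'static str {
        match self {
            MacError::ActionNotAllowed => "MAC_ACTION_NOT_ALLOWED",
            MacError::DomainNotAllowed => "MAC_DOMAIN_NOT_ALLOWED",
        }
    }
}

struct MacPolicy {
    allowed_domains: HashSet<String>,
    allowed_actions: HashSet<String>,
    blocked_actions: HashSet<String>,
}

fn to_set(items: &[&str]) -> HashSet<String> {
    items.iter().map(|s| s.to_string()).collect()
}

impl MacPolicy {
    /// Builds a policy with the values from the suggested `rules.json`.
    fn demo() -> Self {
        MacPolicy {
            allowed_domains: to_set(&["oa.example.com", "erp.example.com", "hr.example.com"]),
            allowed_actions: to_set(&["click", "type", "navigate", "getText"]),
            blocked_actions: to_set(&["eval", "executeJsInPage"]),
        }
    }

    /// Checks the action first, then the domain. The block list wins over the allow list.
    fn check(&self, action: &str, domain: &str) -> Result<(), MacError> {
        if self.blocked_actions.contains(action) || !self.allowed_actions.contains(action) {
            return Err(MacError::ActionNotAllowed);
        }
        if !self.allowed_domains.contains(domain) {
            return Err(MacError::DomainNotAllowed);
        }
        Ok(())
    }
}
```

Treating anything absent from `allowed` as denied keeps the policy deny-by-default, so the `blocked` list only matters if an entry ever appears in both lists.
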
---

## 6. Development Sequence for This Cycle

### Day 1-2

1. Freeze the protocol structs.
2. Write `pipe_protocol_test`.
3. Write `pipe_handshake_test`.
4. Restore the `stdin/stdout` entry point.

Acceptance:

1. The process can be run standalone and fed one `init` message by hand.
2. It outputs a correct `init_ack`.

### Day 3-4

1. Complete `BrowserPipeTool`.
2. Complete HMAC computation.
3. Complete `seq`-based response matching.
4. Connect to a local mock browser process.

Acceptance:

1. Rust can emit all four command types: `click/type/navigate/getText`.
2. Mock responses are received correctly.

### Day 5-6

1. Hook up the minimal `MAC Policy`.
2. Complete the integration test.
3. Prepare integration scripts and sample JSON.

Acceptance:

1. Non-whitelisted actions are rejected up front on the Rust side.
2. An invalid domain fails immediately.

### Day 7

1. Finalize tests.
2. Publish the integration notes.
3. Run integration testing with the browser team.

---

## 7. Rust Team Self-Test Checklist

- [ ] `protocol.rs` serialization/deserialization tests pass.
- [ ] `init -> init_ack` tests pass.
- [ ] The handshake fails when `version` does not match.
- [ ] The handshake fails when `hmac_seed` is invalid.
- [ ] `click/type/navigate/getText` commands are all encoded correctly.
- [ ] A mismatched `response.seq` is never associated with the wrong command.
- [ ] A per-action timeout returns an error.
- [ ] Non-whitelisted actions are rejected by `MAC Policy`.
- [ ] Logs appear only on `stderr`.

---

## 8. Integration Input/Output Samples

### 8.1 Running the Handshake Manually

Input:

```json
{"type":"init","version":"1.0","hmac_seed":"00112233445566778899aabbccddeeff","capabilities":["browser_action"]}
```

Expected output:

```json
{"type":"init_ack","version":"1.0","agent_id":"00000000-0000-0000-0000-000000000000","supported_actions":["click","type","navigate","getText","getHtml","waitForSelector","pageScreenshot","select","scrollTo","getAomSnapshot","storageSet","storageGet","zombieSpawn","zombieKill"]}
```

### 8.2 Minimal Command Sample

Output to the browser:

```json
{"type":"command","seq":1,"action":"navigate","params":{"url":"https://oa.example.com/login"},"security":{"expected_domain":"oa.example.com","hmac":"<hex>"}}
```

The browser replies:

```json
{"type":"response","seq":1,"success":true,"data":{},"aom_snapshot":[],"timing":{"queue_ms":1,"exec_ms":50}}
```

---

## 9. Required Items Before Integration

Before integration testing, the project team must prepare:

1. A runnable `sgclaw` executable or a debug launch method.
2. Protocol sample files.
3. A default test configuration for `rules.json`.
4. Parameter samples for the four minimal actions.
5. An error code table.
6. A description of the key fields in the `stderr` logs.

---

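## 10. End-of-Cycle Acceptance Criteria

The Rust team's cycle is complete when all of the following hold:

1. `sgclaw` can be launched by the browser as a child process.
2. The `init -> init_ack` success rate is 100%.
3. The four command types `click/type/navigate/getText` can be sent reliably.
4. Responses can be received and parsed reliably by `seq`.
5. The Rust-side pre-flight `MAC Policy` is enforced.
6. Integration testing with the browser team succeeds on the same test page.

---
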
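## 11. Integration Day Execution Order

On integration day, follow only the order below, so that the two sides do not change the protocol concurrently:

1. First verify `init -> init_ack`.
2. Then verify `navigate`.
3. Then verify `type`.
4. Then verify `click`.
5. Finally verify `getText`.
6. Then cover the failure scenarios: domain rejection, illegal action, timeout.

Any question about protocol fields is settled by `protocol.rs` and this document; no ad hoc changes are made during the integration session.

---
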
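## 12. Dependencies on the Browser Team

This cycle, the Rust team depends only on the following frozen inputs from the browser team:

1. The browser can launch the child process.
2. The browser can send and receive JSON Line messages.
3. The browser supports the 4 minimal actions.
4. The browser returns a structured `response`.

How the browser maps commands onto `CommandRouter` internally is not a blocker for the Rust team.
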
@@ -0,0 +1,345 @@
# SuperRPA sgClaw Browser Control Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Deliver a two-phase integration where `sgclaw` first drives the existing SuperRPA browser through a minimal fixed-intent demo, then upgrades to a real Agent loop backed by `deepseek-chat`.

**Architecture:** Keep the browser side thin and reuse-first. Rust owns task understanding, pipe protocol, and sequencing; SuperRPA owns process hosting, secondary security checks, and delegation into existing `CommandRouter`. Phase 1 uses a rule-based planner; Phase 2 swaps in an Agent runtime without changing browser command execution.

**Tech Stack:** Rust, JSON Line over STDIO, HMAC-SHA256, SuperRPA Chromium C++, existing `CommandRouter`, existing rules services, FunctionsUI bridge, DeepSeek OpenAI-compatible API (`deepseek-chat`).

---

## File Structure

### sgClaw Repository

- Create: `src/agent/mod.rs`
- Create: `src/agent/runtime.rs`
- Create: `src/agent/planner.rs`
- Create: `src/llm/mod.rs`
- Create: `src/llm/provider.rs`
- Create: `src/llm/deepseek.rs`
- Create: `src/config/mod.rs`
- Create: `src/config/settings.rs`
- Modify: `src/lib.rs`
- Modify: `src/main.rs`
- Modify: `src/pipe/protocol.rs`
- Modify: `src/pipe/browser_tool.rs`
- Modify: `src/security/hmac.rs`
- Modify: `resources/rules.json`
- Create: `tests/task_protocol_test.rs`
- Create: `tests/planner_test.rs`
- Create: `tests/runtime_task_flow_test.rs`

### SuperRPA Repository

- Modify: `src/chrome/browser/superrpa/BUILD.gn`
- Modify: `src/chrome/browser/superrpa/router/command_router.h`
- Modify: `src/chrome/browser/superrpa/router/command_router.cc`
- Modify: `src/chrome/browser/superrpa/sgclaw/sgclaw_command_dispatcher.cc`
- Modify: `src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.h`
- Modify: `src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.cc`
- Create or modify: `src/chrome/browser/superrpa/sgclaw/sgclaw_process_host.h`
- Create or modify: `src/chrome/browser/superrpa/sgclaw/sgclaw_process_host.cc`
- Create or modify: `src/chrome/browser/superrpa/sgclaw/pipe_listener.h`
- Create or modify: `src/chrome/browser/superrpa/sgclaw/pipe_listener.cc`
- Modify: `src/chrome/browser/resources/superrpa/devtools/functions/functions.ts`
- Modify: `src/chrome/browser/resources/superrpa/devtools/functions/functions_manifest.ts`
- Modify: `src/chrome/browser/superrpa/rules/rpa_rules_service_factory.cc`
- Test: `test("superrpa_unittests")`

## Task 1: Align Pipe Contract and Security Baseline

**Files:**
- Modify: `src/pipe/protocol.rs`
- Modify: `src/security/hmac.rs`
- Modify: `resources/rules.json`
- Create: `tests/task_protocol_test.rs`

- [ ] **Step 1: Write failing protocol tests for task-level messages**

Add tests covering `submit_task`, `task_complete`, and exact HMAC canonical string expectations.

- [ ] **Step 2: Run protocol-focused tests**

Run: `cargo test --test task_protocol_test --test pipe_protocol_test -q`
Expected: FAIL because the task-level messages and canonical signing are missing.

- [ ] **Step 3: Extend protocol types**

Add task-scope message variants in `src/pipe/protocol.rs` for:
- browser -> sgclaw `submit_task`
- sgclaw -> browser `task_complete`
- optional `log_entry`

- [ ] **Step 4: Fix HMAC canonical string**

Change `src/security/hmac.rs` to sign:

```text
<seq>\n<action>\n<stable_json(params)>\n<expected_domain>
```

- [ ] **Step 5: Add demo rules isolation**

Add a clearly marked demo allow entry for Baidu in `resources/rules.json`, with comments in docs explaining it is demo-only.

- [ ] **Step 6: Re-run protocol tests**

Run: `cargo test --test task_protocol_test --test pipe_protocol_test -q`
Expected: PASS.

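To make the canonical string in Step 4 concrete, here is a small sketch of how the four fields might be joined before signing. The function name `canonical_string` is hypothetical, and the sketch assumes `params` has already been serialized in a stable form (for instance compact JSON with sorted keys); the plan does not define `stable_json`, and the HMAC-SHA256 step itself is omitted because it needs a crypto library.

```rust
/// Builds the string to be signed:
/// `<seq>\n<action>\n<stable_json(params)>\n<expected_domain>`.
/// `stable_params_json` is assumed to be pre-serialized in a stable form.
fn canonical_string(
    seq: u64,
    action: &str,
    stable_params_json: &str,
    expected_domain: &str,
) -> String {
    format!("{}\n{}\n{}\n{}", seq, action, stable_params_json, expected_domain)
}
```

Both sides must produce byte-identical strings, so any difference in key order or whitespace inside the serialized `params` would show up as an HMAC mismatch.
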
## Task 2: Build Phase 1 Rust Task Flow

**Files:**
- Create: `src/agent/mod.rs`
- Create: `src/agent/planner.rs`
- Modify: `src/lib.rs`
- Modify: `src/main.rs`
- Create: `tests/planner_test.rs`
- Create: `tests/runtime_task_flow_test.rs`

- [ ] **Step 1: Write failing planner tests**

Add tests for parsing:
- `打开百度搜索天气`
- `打开百度搜索电网调度`

Expected output is an ordered action plan: `navigate`, `type`, `click`.

- [ ] **Step 2: Run planner tests**

Run: `cargo test --test planner_test -q`
Expected: FAIL because no planner exists.

- [ ] **Step 3: Implement rule-based planner**

Create `src/agent/planner.rs` with a minimal parser that only accepts the Baidu-search intent family and rejects everything else clearly.

- [ ] **Step 4: Wire `submit_task` handling into runtime entry**

Update `src/lib.rs` and `src/main.rs` so the Rust process can receive a task message, execute the planner, call `BrowserPipeTool`, and emit `task_complete`.

- [ ] **Step 5: Add end-to-end runtime test**

Use a mock transport to validate:
- receive `submit_task`
- send three browser commands
- consume three responses
- emit `task_complete`

- [ ] **Step 6: Re-run Rust tests**

Run: `cargo test -q`
Expected: PASS for planner and runtime task flow.

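The rule-based planner from Step 3 could start out as small as the sketch below. The `Step` enum, the `plan` function, the URL, and the selectors `#kw` and `#su` are all assumptions made for illustration; the real selectors and plan representation would come from the test page and `protocol.rs`.

```rust
/// One planned browser action, in the order it should be sent.
#[derive(Debug, PartialEq, Eq)]
enum Step {
    Navigate { url: String },
    Type { selector: String, text: String },
    Click { selector: String },
}

/// Prefix that identifies the Baidu-search intent family.
const BAIDU_SEARCH_PREFIX: &str = "打开百度搜索";

/// Turns a task like `打开百度搜索天气` into navigate -> type -> click.
/// Anything outside the Baidu-search family, or an empty query, is rejected.
fn plan(task: &str) -> Result<Vec<Step>, String> {
    let query = task
        .trim()
        .strip_prefix(BAIDU_SEARCH_PREFIX)
        .map(str::trim)
        .filter(|q| !q.is_empty())
        .ok_or_else(|| format!("unsupported task: {}", task))?;

    Ok(vec![
        // Hypothetical URL and selectors, used only for this sketch.
        Step::Navigate { url: "https://www.baidu.com".to_string() },
        Step::Type { selector: "#kw".to_string(), text: query.to_string() },
        Step::Click { selector: "#su".to_string() },
    ])
}
```

Rejecting everything outside the one supported prefix keeps Phase 1 predictable: an unsupported instruction fails with a clear message instead of producing a partial plan.
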
## Task 3: Reuse Existing SuperRPA Browser Execution Path

**Files:**
- Modify: `src/chrome/browser/superrpa/sgclaw/sgclaw_process_host.h`
- Modify: `src/chrome/browser/superrpa/sgclaw/sgclaw_process_host.cc`
- Modify: `src/chrome/browser/superrpa/sgclaw/pipe_listener.h`
- Modify: `src/chrome/browser/superrpa/sgclaw/pipe_listener.cc`
- Modify: `src/chrome/browser/superrpa/BUILD.gn`

- [ ] **Step 1: Add failing browser-side host/listener tests**

Cover:
- process start
- init handshake timeout
- JSON Line split and dispatch
- listener rejection of invalid payloads

- [ ] **Step 2: Implement process host skeleton**

Add lifecycle states and `Start/Stop/SendLine` using the existing sgclaw area, not a parallel subsystem.

- [ ] **Step 3: Implement listener**

Read `stdout`, split lines, reject empty/oversized/invalid JSON, and forward valid messages to sgclaw dispatch code.

- [ ] **Step 4: Hook build targets**

Update `src/chrome/browser/superrpa/BUILD.gn` to compile the sgclaw host/listener path inside existing targets.

- [ ] **Step 5: Run browser unit tests**

Run the relevant `superrpa_unittests` target for the added cases.
Expected: PASS.

## Task 4: Reuse CommandRouter and Security Gates

**Files:**
- Modify: `src/chrome/browser/superrpa/router/command_router.h`
- Modify: `src/chrome/browser/superrpa/router/command_router.cc`
- Modify: `src/chrome/browser/superrpa/sgclaw/sgclaw_command_dispatcher.cc`
- Modify: `src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.h`
- Modify: `src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.cc`
- Modify: `src/chrome/browser/superrpa/rules/rpa_rules_service_factory.cc`

- [ ] **Step 1: Write failing dispatch/security tests**

Cover:
- allowed Baidu demo task
- blocked non-whitelisted domain
- blocked unsupported action
- HMAC mismatch rejection

- [ ] **Step 2: Reuse command entrypoints**

Map sgclaw commands into existing methods:
- `ExecuteNavigate`
- `ExecuteType`
- `ExecuteClick`
- `ExecuteGetText`

- [ ] **Step 3: Reuse security layers**

Ensure sgclaw path reads existing rules services and uses `sgclaw_security_gate` for secondary checks before dispatch.

- [ ] **Step 4: Add demo rules source**

If needed, gate Baidu allow rules behind profile/demo config rather than broad permanent defaults.

- [ ] **Step 5: Re-run browser tests**

Run the focused security/dispatch unit tests.
Expected: PASS.

## Task 5: Wire FunctionsUI Submission and Result Flow

**Files:**
- Modify: `src/chrome/browser/resources/superrpa/devtools/functions/functions.ts`
- Modify: `src/chrome/browser/resources/superrpa/devtools/functions/functions_manifest.ts`
- Modify: browser-side bridge code that receives `window.__SUPER_RPA_BRIDGE__` calls

- [ ] **Step 1: Write failing UI bridge test or manual harness case**

Cover:
- `sgclaw_start`
- `sgclaw_stop`
- `sgclaw_submit_task`
- result/event propagation

- [ ] **Step 2: Add bridge entry points**

Expose minimal callable actions from FunctionsUI to the browser-side sgclaw host.

- [ ] **Step 3: Surface task lifecycle events**

Push state, logs, and final result back to FunctionsUI without introducing a new parallel UI subsystem.

- [ ] **Step 4: Validate manual smoke path**

Manual test:
1. Open FunctionsUI
2. Start sgclaw
3. Submit `打开百度搜索天气`
4. Observe logs and completion summary

- [ ] **Step 5: Document the bridge contract**

Add a short browser-side note describing the exact payloads for start/stop/submit/result.

## Task 6: Add Phase 2 Agent Runtime with DeepSeek

**Files:**
- Create: `src/agent/runtime.rs`
- Create: `src/llm/mod.rs`
- Create: `src/llm/provider.rs`
- Create: `src/llm/deepseek.rs`
- Create: `src/config/mod.rs`
- Create: `src/config/settings.rs`
- Modify: `src/pipe/browser_tool.rs`
- Modify: `src/lib.rs`
- Create: `tests/deepseek_provider_test.rs`
- Create: `tests/agent_runtime_test.rs`

- [ ] **Step 1: Write failing provider tests**

Cover:
- config loading from env
- request shape for DeepSeek compatible chat API
- model default = `deepseek-chat`

- [ ] **Step 2: Implement provider abstraction**

Add a minimal provider trait and DeepSeek implementation using:
- `base_url=https://api.deepseek.com`
- model `deepseek-chat`
- API key from environment or config file, never hardcoded

- [ ] **Step 3: Write failing runtime tests**

Cover:
- tool registration for `browser_action`
- one think-act-observe cycle
- final summary generation after successful browser actions

- [ ] **Step 4: Implement Agent runtime**

Create a minimal `AgentRuntime` that can:
- receive task text
- call provider
- parse tool call
- invoke `BrowserPipeTool`
- emit `task_complete`

- [ ] **Step 5: Keep Phase 1 fallback**

Retain the rule-based planner as a fallback path for offline/demo use and for controlled debugging.

- [ ] **Step 6: Re-run Rust tests**

Run: `cargo test -q`
Expected: PASS including provider and runtime suites.

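For the configuration loading in Steps 1 and 2, one possible shape is sketched below. The struct `Settings`, the function names, and the environment variable names `DEEPSEEK_API_KEY`, `DEEPSEEK_BASE_URL`, and `DEEPSEEK_MODEL` are assumptions made for this example and are not fixed by the plan. Only the defaults (`https://api.deepseek.com`, `deepseek-chat`) come from the plan itself.

```rust
/// Provider settings resolved at startup.
#[derive(Debug)]
struct Settings {
    base_url: String,
    model: String,
    api_key: String,
}

impl Settings {
    /// Builds settings from a lookup function. Abstracting the lookup lets
    /// tests supply values without touching the process environment.
    fn from_lookup<F: Fn(&str) -> Option<String>>(lookup: F) -> Result<Self, String> {
        // The API key has no default: it must never be hardcoded.
        let api_key = lookup("DEEPSEEK_API_KEY")
            .filter(|k| !k.is_empty())
            .ok_or("DEEPSEEK_API_KEY is not set")?;

        Ok(Settings {
            base_url: lookup("DEEPSEEK_BASE_URL")
                .unwrap_or_else(|| "https://api.deepseek.com".to_string()),
            model: lookup("DEEPSEEK_MODEL").unwrap_or_else(|| "deepseek-chat".to_string()),
            api_key,
        })
    }

    /// Reads the same keys from the real process environment.
    fn from_env() -> Result<Self, String> {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

Passing the lookup as a closure keeps the "config loading from env" test deterministic, since tests that mutate real environment variables can interfere with each other when run in parallel.
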
## Task 7: Final Cross-Repo Acceptance and Low-Context Docs

**Files:**
- Modify: `README.md`
- Create: `docs/superpowers/acceptance/2026-03-25-superrpa-sgclaw-browser-control.md`
- Modify: `docs/浏览器对接标准.md`
- Modify: `docs/sgclaw_project_team_kickoff.md`

- [ ] **Step 1: Write acceptance checklist**

Cover:
- handshake
- `submit_task`
- Baidu search success
- HMAC mismatch failure
- non-whitelisted domain rejection

- [ ] **Step 2: Create low-context handoff docs**

Write one short acceptance doc that links only the required files and commands for each phase.

- [ ] **Step 3: Run final smoke tests**

Rust repo:
`cargo test -q`

Browser repo:
run focused `superrpa_unittests`

Manual:
submit `打开百度搜索天气`

- [ ] **Step 4: Update top-level docs**

Update README and browser contract docs so the next contributor can find:
- Phase 1 demo loop
- Phase 2 Agent loop
- exact integration points

- [ ] **Step 5: Commit in small slices**

Suggested commit order:
1. `feat: align sgclaw pipe contract for task flow`
2. `feat: add phase1 baidu demo planner`
3. `feat: wire superrpa sgclaw process host and dispatcher`
4. `feat: add functionsui sgclaw task bridge`
5. `feat: add deepseek-backed agent runtime`
6. `docs: add acceptance and integration notes`

@@ -0,0 +1,107 @@
|
||||
# SuperRPA sgClaw Browser Control Design
|
||||
|
||||
## Goal
|
||||
|
||||
Build `sgclaw` in two phases so it can control the existing SuperRPA browser with minimal new surface area.
|
||||
|
||||
- Phase 1: deliver a demo-safe closed loop for a fixed instruction such as `打开百度搜索天气` ("open Baidu and search for weather").
|
||||
- Phase 2: upgrade that loop into a real Agent flow backed by `deepseek-chat`.
|
||||
|
||||
The design must maximize reuse of existing SuperRPA browser interfaces and minimize working context for future contributors.
|
||||
|
||||
## Scope
|
||||
|
||||
### In Scope
|
||||
|
||||
- Reuse SuperRPA `CommandRouter` as the browser execution entry.
|
||||
- Reuse existing browser rule and security infrastructure where possible.
|
||||
- Keep the Rust side responsible for task understanding, sequencing, and pipe protocol.
|
||||
- Keep the browser side responsible for process hosting, security re-check, and command dispatch.
|
||||
- Use layered docs so contributors only read the smallest necessary document.
|
||||
|
||||
### Out of Scope
|
||||
|
||||
- New browser automation APIs parallel to `CommandRouter`
|
||||
- Full SkillLoader / Memory / MCP work in Phase 1
|
||||
- Broad action-set expansion beyond `click`, `type`, `navigate`, `getText`
|
||||
|
||||
## Existing Integration Points
|
||||
|
||||
### sgClaw Repository
|
||||
|
||||
- The pipe and security baseline already exists in `src/pipe/protocol.rs`, `src/pipe/handshake.rs`, `src/pipe/browser_tool.rs`, and `src/security/mac_policy.rs`.
|
||||
|
||||
### SuperRPA Repository
|
||||
|
||||
- Browser command entry: `src/chrome/browser/superrpa/router/command_router.h/.cc`
|
||||
- Existing sgclaw dispatch/security area: `src/chrome/browser/superrpa/sgclaw/sgclaw_command_dispatcher.cc`, `src/chrome/browser/superrpa/sgclaw/sgclaw_security_gate.h/.cc`
|
||||
- FunctionsUI front-end entry: `src/chrome/browser/resources/superrpa/devtools/functions/functions.ts`
|
||||
- Rules and whitelist sources: `src/chrome/browser/superrpa/rules/*`, `src/chrome/browser/superrpa/zombie/resource_controller.*`
|
||||
|
||||
## Recommended Architecture
|
||||
|
||||
Use a thin-adapter design.
|
||||
|
||||
1. Rust owns `submit_task`, planning, pipe messages, response correlation, and final task completion.
|
||||
2. SuperRPA owns `sgclaw` process lifecycle, JSON Line I/O, secondary security validation, and delegation into existing `CommandRouter`.
|
||||
3. Phase 1 uses a rule-based planner for one narrow intent family: `打开百度搜索X`.
|
||||
4. Phase 2 replaces that planner with a real Agent runtime using `deepseek-chat`, but keeps the same `BrowserPipeTool` contract so browser-side code stays thin.
|
||||
|
||||
This preserves the browser’s existing abstractions and avoids duplicating action logic.
|
||||
|
||||
## Phase Design
|
||||
|
||||
### Phase 1: Minimal Demo Loop
|
||||
|
||||
- Add task-level messages on top of the existing pipe.
|
||||
- Accept a `submit_task` instruction from the browser bridge.
|
||||
- Parse only one pattern family: open Baidu, enter query, click search.
|
||||
- Return `task_complete` with summary and step log.
|
||||
- Allow Baidu only in demo rules, not as a permanent broad whitelist expansion.
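
The Phase 1 planner described above can be sketched as a single prefix match that expands into three browser steps. This is only an illustration under stated assumptions: the `Step` enum, the function name `plan_baidu_search`, the Baidu URL, and the selectors `#kw` and `#su` are hypothetical and not taken from the repository.

```rust
/// Hypothetical step type; the real command shape lives in the pipe protocol.
#[derive(Debug, PartialEq)]
enum Step {
    Navigate { url: String },
    TypeText { selector: String, text: String },
    Click { selector: String },
}

/// The only instruction family Phase 1 accepts: "打开百度搜索X".
const PREFIX: &str = "打开百度搜索";

/// Returns the fixed three-step plan, or None when the instruction does not
/// match the pattern or carries no query.
fn plan_baidu_search(instruction: &str) -> Option<Vec<Step>> {
    let query = instruction.trim().strip_prefix(PREFIX)?.trim();
    if query.is_empty() {
        return None;
    }
    Some(vec![
        // Assumed entry URL and selectors, for illustration only.
        Step::Navigate { url: "https://www.baidu.com".to_string() },
        Step::TypeText { selector: "#kw".to_string(), text: query.to_string() },
        Step::Click { selector: "#su".to_string() },
    ])
}
```

Returning `None` for anything outside the pattern keeps the demo planner narrow, which matches the intent of not widening the action set in Phase 1.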
|
||||
|
||||
### Phase 2: Real Agent Loop
|
||||
|
||||
- Add `agent/runtime.rs` and provider abstraction.
|
||||
- Register `BrowserPipeTool` as `browser_action`.
|
||||
- Default provider is DeepSeek with `base_url=https://api.deepseek.com` and model `deepseek-chat`.
|
||||
- Keep provider config externalized through environment variables and settings files.
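
One way to keep provider settings external while preserving the DeepSeek defaults is sketched below. The struct `ProviderConfig` and the variable names `SGCLAW_PROVIDER_BASE_URL`, `SGCLAW_PROVIDER_MODEL`, and `SGCLAW_PROVIDER_API_KEY` are assumptions made for this example; only the default base URL and model name come from this document.

```rust
/// Hypothetical provider settings; field and variable names are assumptions.
#[derive(Debug, PartialEq)]
struct ProviderConfig {
    base_url: String,
    model: String,
    api_key: Option<String>,
}

impl ProviderConfig {
    /// Builds the config from any key/value source, falling back to the
    /// documented DeepSeek defaults when a key is absent.
    fn from_lookup<F>(lookup: F) -> Self
    where
        F: Fn(&str) -> Option<String>,
    {
        ProviderConfig {
            base_url: lookup("SGCLAW_PROVIDER_BASE_URL")
                .unwrap_or_else(|| "https://api.deepseek.com".to_string()),
            model: lookup("SGCLAW_PROVIDER_MODEL")
                .unwrap_or_else(|| "deepseek-chat".to_string()),
            api_key: lookup("SGCLAW_PROVIDER_API_KEY"),
        }
    }

    /// Reads the same keys from the process environment.
    fn from_env() -> Self {
        Self::from_lookup(|name| std::env::var(name).ok())
    }
}
```

Routing everything through `from_lookup` means a settings file can reuse the same defaults, and tests do not need to mutate the process environment.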
|
||||
|
||||
## Security
|
||||
|
||||
- The HMAC input must match the browser contract exactly: `<seq>\n<action>\n<stable_json(params)>\n<expected_domain>`.
|
||||
- Rust validates before send; browser validates again before dispatch.
|
||||
- `rules.json` remains the source for domain/action allow rules.
|
||||
- Demo-only domains like `baidu.com` must be clearly isolated in a demo profile or demo rules file.
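
The signing input named in the first bullet can be assembled as in the sketch below. It covers only the string that would be fed to the HMAC, not the HMAC computation itself. It assumes that `stable_json` means "keys sorted, no whitespace" and that params are flat string pairs; both are assumptions, and the browser contract doc remains the authority. The helper names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Escapes a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Assumed reading of `stable_json`: sorted keys, no whitespace.
/// BTreeMap iterates in key order, which gives the stable ordering.
fn stable_json(params: &BTreeMap<String, String>) -> String {
    let fields: Vec<String> = params
        .iter()
        .map(|(k, v)| format!("\"{}\":\"{}\"", json_escape(k), json_escape(v)))
        .collect();
    format!("{{{}}}", fields.join(","))
}

/// Joins the four contract fields with single newlines, in contract order.
fn signing_payload(
    seq: u64,
    action: &str,
    params: &BTreeMap<String, String>,
    expected_domain: &str,
) -> String {
    format!("{}\n{}\n{}\n{}", seq, action, stable_json(params), expected_domain)
}
```

Because both sides recompute this string independently, any difference in key order or whitespace would surface as an HMAC mismatch, which is why the serialization has to be deterministic.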
|
||||
|
||||
## Context Control Strategy
|
||||
|
||||
Use four small docs instead of one large narrative:
|
||||
|
||||
1. This design doc: goals, boundaries, architecture.
|
||||
2. Browser contract doc: exact message shapes and file paths.
|
||||
3. Plan doc: execution order and concrete files.
|
||||
4. Acceptance doc: smoke tests and failure matrix.
|
||||
|
||||
Each implementation task should point only to the doc section it needs.
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
- Rust unit tests for protocol, planner, HMAC, and runtime message handling
|
||||
- Rust integration tests for `submit_task -> command -> response -> task_complete`
|
||||
- SuperRPA unit tests for process host, listener, security gate, and dispatch mapping
|
||||
- Cross-repo smoke test for `打开百度搜索天气`
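
The `submit_task -> command -> response -> task_complete` ordering that the integration tests target can be checked with a small transcript validator. This is a sketch under assumptions: the `Msg` enum and its fields are hypothetical stand-ins for the real pipe messages, and the rule "one outstanding command at a time, matched by `seq`" is an assumed reading of response correlation.

```rust
/// Hypothetical message shapes; the real ones are defined by the pipe protocol.
#[derive(Debug)]
enum Msg {
    SubmitTask { instruction: String },
    Command { seq: u64, action: String },
    Response { seq: u64, success: bool },
    TaskComplete { summary: String },
}

/// Checks that a transcript starts with submit_task, pairs each command with
/// a response carrying the same seq, and ends with task_complete.
/// Returns the number of commands on success.
fn check_transcript(msgs: &[Msg]) -> Result<usize, String> {
    let (first, rest) = msgs.split_first().ok_or("empty transcript")?;
    if !matches!(first, Msg::SubmitTask { .. }) {
        return Err("transcript must start with submit_task".to_string());
    }
    let mut pending: Option<u64> = None; // seq of the command awaiting a response
    let mut commands = 0;
    let mut done = false;
    for msg in rest {
        if done {
            return Err("message after task_complete".to_string());
        }
        match (msg, pending) {
            (Msg::Command { seq, .. }, None) => {
                pending = Some(*seq);
                commands += 1;
            }
            (Msg::Response { seq, .. }, Some(want)) if *seq == want => pending = None,
            (Msg::TaskComplete { .. }, None) => done = true,
            (other, _) => return Err(format!("unexpected message: {:?}", other)),
        }
    }
    if done {
        Ok(commands)
    } else {
        Err("missing task_complete".to_string())
    }
}
```

A validator like this could run against a recorded pipe log from either repository, so the same ordering check serves the Rust integration tests and the cross-repo smoke test.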
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
### Phase 1
|
||||
|
||||
- Start `sgclaw` from SuperRPA
|
||||
- Send `submit_task`
|
||||
- Navigate to Baidu and search a keyword through existing browser actions
|
||||
- Surface logs and final result back to FunctionsUI
|
||||
|
||||
### Phase 2
|
||||
|
||||
- Execute the same flow through `deepseek-chat`
|
||||
- Keep the same browser contract and command mapping
|
||||
- Expose provider/model config without code changes
|
||||