feat: align browser callback runtime and export flows
Consolidate the browser task runtime around the callback path, add safer artifact opening for Zhihu exports, and cover the new service/browser flows with focused tests and supporting docs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
425
docs/superpowers/plans/2026-04-01-claw-ws-execution-cards.md
Normal file
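@@ -0,0 +1,425 @@
# Claw-WS Development Execution Order Cards

> Companion plan: [`2026-04-01-claw-ws-parallel-transport.md`](./2026-04-01-claw-ws-parallel-transport.md)
>
> How to use: execute the cards strictly in order. After finishing a card, run that card's tests before moving on to the next one. Do not skip cards, do not wire things up early, and do not write the service/client first and then go back to abstract the lower layers.

---
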
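## Card 0: Pre-Execution Constraints

**Goal**
Lock down the boundaries first so that the implementation does not break pipe mode along the way.

**Must follow**
- The existing pipe mode must remain usable
- What is being added is a parallel `claw-ws` mode, not a replacement for pipe
- v1 supports only a single client running tasks serially
- The external names `browser_action` / `superrpa_browser` stay stable
- If WS `Eval` is incomplete, disable exposure of the related browser-script skills first
- Do not build multi-client support, task queues, or admin endpoints ahead of time

**Done when**
- Developers are clear that every subsequent change revolves around "abstract for reuse + add in parallel"

---
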
## Card 1: Extract the Shared SubmitTask Runner

**Goal**
Pull the main execution logic of the current `BrowserMessage::SubmitTask` handling out of the pipe entrypoint and turn it into a shared runner.

**Do first**
1. Add `tests/task_runner_test.rs`
2. Write failing cases first:
   - empty instruction
   - no LLM configuration
   - log order is still `LogEntry` -> `TaskComplete`

**Files to change**
- `src/agent/mod.rs`
- `src/lib.rs`
- `src/agent/task_runner.rs`
- `tests/task_runner_test.rs`

**Implementation steps**
1. Create `SubmitTaskRequest`
2. Create `AgentEventSink`
3. Create `run_submit_task(...)`
4. Reduce the pipe entrypoint to:
   - unpack `BrowserMessage::SubmitTask`
   - convert it into a `SubmitTaskRequest`
   - call the shared runner

**Never do**
- Do not introduce the ws backend in this card
- Do not change the tool adapter
- Do not touch service/client

**Test command for this card**

```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```

**Pass criteria**
- The old `runtime_task_flow_test` stays green
- The new `task_runner_test` passes
- Pipe behavior is unchanged

**Commit when done**

```bash
git commit -m "refactor: extract shared submit-task runner"
```

---

## Card 2: Extract the BrowserBackend Abstraction

**Goal**
Decouple the upper-layer runtime / orchestration / tool adapter from `BrowserPipeTool<T>` so that they all depend on a single browser backend interface.

**Do first**
1. Add `tests/browser_backend_capability_test.rs`
2. Write failing cases first:
   - pipe backend metadata is unchanged
   - pipe backend supports `Eval`
   - browser-script tools are not exposed when `supports_eval() == false`

**Files to change**
- `src/browser/mod.rs`
- `src/browser/backend.rs`
- `src/browser/pipe_backend.rs`
- `src/compat/browser_tool_adapter.rs`
- `src/compat/browser_script_skill_tool.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
- `src/lib.rs`
- `tests/browser_backend_capability_test.rs`

**Implementation steps**
1. Define `BrowserBackend`
2. Write `pipe_backend` to wrap the current `BrowserPipeTool`
3. Change the upper-layer signatures to `Arc<dyn BrowserBackend>`
4. Keep the tool names unchanged:
   - `browser_action`
   - `superrpa_browser`
5. Add `supports_eval()` gating

**Never do**
- Do not hook up the browser ws protocol in this card
- Do not build the service
- Do not add the client protocol

**Test command for this card**

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```

**Pass criteria**
- Existing browser tool tests do not regress
- The new capability test passes
- The upper-layer logic is no longer hard-coupled to `BrowserPipeTool<T>`

**Commit when done**

```bash
git commit -m "refactor: abstract browser backend from pipe transport"
```

---

## Card 3: Hard-Code the Browser WS Protocol Codec

**Goal**
Build the encode/decode layer for the browser's fixed WebSocket protocol on its own, so that protocol details do not leak into the backend and the service.

**Do first**
1. Add `tests/browser_ws_protocol_test.rs`
2. Write failing cases first:
   - exact outbound frame encoding
   - callback payload parsing
   - rejection of malformed formats
   - complete coverage of v1 actions

**Files to change**
- `src/browser/ws_protocol.rs`
- `tests/browser_ws_protocol_test.rs`

**Implementation steps**
1. Encode array messages according to the browser documentation
2. Support only the actions v1 requires:
   - `Navigate`
   - `GetText`
   - `Click`
   - `Type`
   - `Eval`
3. Define callback parsing and correlation rules
4. Fail early on unsupported / malformed input

**Never do**
- Do not connect to a real browser in this card
- Do not write the service protocol
- Do not put network connection logic into the codec

**Test command for this card**

```bash
cargo test --test browser_ws_protocol_test
```

**Pass criteria**
- All codec unit tests are green
- No network dependency
- Ready to serve as the pure protocol layer underneath the backend

**Commit when done**

```bash
git commit -m "test: codify fixed browser websocket protocol"
```

---

## Card 4: Implement the Browser WS Backend

**Goal**
On top of the codec, provide a blocking `invoke(...)` capability similar to the pipe backend.

**Do first**
1. Add `tests/browser_ws_backend_test.rs`
2. Write failing cases first:
   - `0 + no callback` succeeds
   - non-`0` fails
   - `0 + callback` succeeds
   - callback timeout
   - socket drop

**Files to change**
- `src/browser/mod.rs`
- `src/browser/ws_backend.rs`
- `tests/browser_ws_backend_test.rs`

**Implementation steps**
1. Build a long-lived connection manager
2. Start with serialized, single-flight requests
3. Check `MacPolicy` before sending
4. Unify immediate returns and callback returns
5. Produce a unified `CommandOutput`

**Never do**
- Do not hook up the service layer in this card
- Do not support concurrent browser requests
- Do not scatter browser ws code directly into the runtime

**Test command for this card**

```bash
cargo test --test browser_ws_backend_test
```

**Pass criteria**
- The backend passes reliably against mocks/fakes
- `invoke` semantics are close to the pipe backend
- The upper-layer runtime can use it as a drop-in replacement

**Commit when done**

```bash
git commit -m "feat: add browser websocket backend"
```

---

## Card 5: Implement the sg_claw Service

**Goal**
Add a long-running local server that accepts client requests and reuses the shared task runner.

**Do first**
1. Add `tests/service_ws_session_test.rs`
2. Write failing cases first:
   - the first client attaches successfully
   - a second client receives busy
   - state is released after disconnect
   - task re-entry is rejected

**Files to change**
- `src/service/mod.rs`
- `src/service/protocol.rs`
- `src/service/server.rs`
- `src/bin/sg_claw.rs`
- `src/lib.rs`
- `Cargo.toml`
- `tests/service_ws_session_test.rs`

**Implementation steps**
1. Define the client/service protocol
2. Implement the service-side event sink
3. Build a single-active-session state machine:
   - `Idle`
   - `ClientAttached`
   - `TaskRunning`
4. Route `SubmitTask` to the shared runner
5. Keep the pipe entrypoint unchanged

**Never do**
- Do not polish the client interaction experience in this card
- Do not add a task queue
- Do not support concurrent multi-client access

**Test command for this card**

```bash
cargo test --test service_ws_session_test
```

**Pass criteria**
- The server-side session lock works
- The shared runner can be reused by the service
- The pipe mode entrypoint is unaffected

**Commit when done**

```bash
git commit -m "feat: add claw-ws service entrypoint"
```

---

## Card 6: Implement sg_claw_client

**Goal**
Add a thin client that offers an interactive command-line experience similar to `claude/codex`.

**Do first**
1. Add `tests/service_task_flow_test.rs`
2. Write failing cases first:
   - submit-task reaches the service
   - logs stream back in order
   - completion arrives exactly once
   - disconnect after completion is handled cleanly

**Files to change**
- `src/bin/sg_claw_client.rs`
- `Cargo.toml`
- `tests/service_task_flow_test.rs`

**Implementation steps**
1. Open a ws connection to the local service
2. Read user input
3. Send `SubmitTask`
4. Print logs in real time
5. End the turn on receiving `TaskComplete`

**Never do**
- Do not copy the runtime, skills, or browser backend into the client
- Do not let the client connect directly to the browser
- Do not let the client carry business logic

**Test commands for this card**

```bash
cargo test --test service_task_flow_test
cargo build --bin sg_claw --bin sg_claw_client
```

**Pass criteria**
- The client is a thin shell
- The task flow works
- Both new binaries compile

**Commit when done**

```bash
git commit -m "feat: add interactive claw-ws client"
```

---

## Card 7: Final Wiring and Regression Verification

**Goal**
Wire up the ws path while confirming zero regressions on the pipe path.

**Do first**
1. Add only the minimum config keys:
   - `browser_ws_url`
   - `service_ws_listen_addr`
2. Check that external tool names remain stable

**Files to change**
- `Cargo.toml`
- `src/lib.rs`
- `src/config/settings.rs`
- `src/runtime/engine.rs` (only if truly necessary)

**Implementation steps**
1. Wire in the minimal config surface
2. Make sure tool names are identical under pipe / ws
3. Run the old pipe regressions
4. Run the new ws tests
5. Run the full Rust test suite
6. Build all binaries
7. Do one real local smoke test

**Pipe regression command for this card**

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test --test pipe_handshake_test --test pipe_protocol_test --test task_protocol_test
```

**WS test command for this card**

```bash
cargo test --test task_runner_test --test browser_ws_protocol_test --test browser_ws_backend_test --test browser_backend_capability_test --test service_ws_session_test --test service_task_flow_test
```

**Full-run commands for this card**

```bash
cargo test --tests
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```

**Manual verification**
1. Start the browser and confirm `ws://127.0.0.1:12345` is reachable
2. `cargo run --bin sg_claw`
3. In a new terminal, run `cargo run --bin sg_claw_client`
4. Send a simple browser task
5. Confirm the log stream and a single completion
6. Confirm the old `cargo run` pipe entrypoint still starts

**Pass criteria**
- Zero regressions in pipe mode
- ws mode works on its own
- The two modes coexist side by side

**Commit when done**

```bash
git commit -m "feat: wire parallel claw-ws transport"
```

---

## Execution Order at a Glance

Follow this order strictly:

1. shared runner
2. browser backend abstraction
3. ws protocol codec
4. ws backend
5. service
6. client
7. config wiring + regression

If the order gets mixed up, the most likely problems are:
- duplicated upper-layer implementations
- pipe mode broken by accident
- ws protocol details spreading across the whole project
- service/client written too early and then forced into a refactor

687
docs/superpowers/plans/2026-04-01-claw-ws-parallel-transport.md
Normal file
@@ -0,0 +1,687 @@
# Claw-WS Parallel Transport Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add a parallel `claw-ws` transport path that keeps the current pipe mode intact while introducing a long-lived `sg_claw` local service, an interactive `sg_claw_client`, and a browser WebSocket backend at `ws://127.0.0.1:12345`.

**Architecture:** First extract a transport-agnostic submit-task runner and browser backend abstraction from the current pipe-coupled flow. Keep the existing pipe path as one adapter/backend, then add a fixed-protocol browser WebSocket backend plus a small service/session layer and an interactive CLI client that reuse the same runtime, orchestration, and browser-facing tool adapters.

**Tech Stack:** Rust 2021, current sgclaw compat runtime, zeroclaw runtime engine, `serde`/`serde_json`, existing `MacPolicy`, and a blocking WebSocket crate for v1 (`tungstenite` preferred over a broad async rewrite).

---

## Scope Guardrails

- Keep the current pipe mode entrypoint and behavior working.
- Do **not** replace the existing browser pipe path.
- Add a **parallel** WebSocket path only.
- v1 supports **one active client session** only.
- Reuse existing tool names and runtime behavior whenever possible.
- If WS `Eval` support is incomplete, disable eval-dependent browser-script skill exposure in WS mode rather than shipping partial behavior.
- Do not broaden v1 with task queues, multi-client support, or admin endpoints.

---

## File Structure

### Existing files to reuse

- Modify: `src/lib.rs` - current pipe bootstrap and receive loop; keep as the legacy pipe entrypoint.
- Modify: `src/agent/mod.rs` - current `BrowserMessage::SubmitTask` entrypoint and config-loading flow.
- Modify: `src/compat/runtime.rs` - compat runtime and tool assembly.
- Modify: `src/compat/orchestration.rs` - direct workflow vs compat runtime routing.
- Modify: `src/compat/browser_tool_adapter.rs` - exposes `browser_action` and `superrpa_browser`.
- Modify: `src/compat/browser_script_skill_tool.rs` - browser-script skill execution.
- Modify: `src/compat/workflow_executor.rs` - direct browser workflows such as Zhihu flows.
- Reuse: `src/pipe/browser_tool.rs` - current browser command executor; retain as the pipe backend implementation.
- Reuse: `src/pipe/protocol.rs` - `BrowserMessage`, `AgentMessage`, `Action`, `ExecutionSurfaceMetadata`.
- Reuse: `src/security/mac_policy.rs` - local action/domain guardrails.
- Modify: `src/config/settings.rs` - minimal new config surface for WS mode.
- Optional modify: `src/runtime/engine.rs` - only if backend capability wiring requires it.

### New files to create

- Create: `src/agent/task_runner.rs` - shared submit-task execution entrypoint.
- Create: `src/browser/mod.rs` - browser backend exports.
- Create: `src/browser/backend.rs` - `BrowserBackend` trait and helpers.
- Create: `src/browser/pipe_backend.rs` - wrapper around existing `BrowserPipeTool`.
- Create: `src/browser/ws_protocol.rs` - fixed browser WS request/response codec.
- Create: `src/browser/ws_backend.rs` - browser WS backend with blocking invoke semantics.
- Create: `src/service/mod.rs` - service exports.
- Create: `src/service/protocol.rs` - client/service WS message types.
- Create: `src/service/server.rs` - single-session `sg_claw` server.
- Create: `src/bin/sg_claw.rs` - service binary.
- Create: `src/bin/sg_claw_client.rs` - interactive CLI client.
- Create: `tests/task_runner_test.rs` - shared submit-task runner regressions.
- Create: `tests/browser_backend_capability_test.rs` - backend capability/tool exposure tests.
- Create: `tests/browser_ws_protocol_test.rs` - browser WS protocol tests.
- Create: `tests/browser_ws_backend_test.rs` - browser WS backend tests.
- Create: `tests/service_ws_session_test.rs` - single-session server tests.
- Create: `tests/service_task_flow_test.rs` - client/service task flow tests.

---

## Task 1: Extract a shared submit-task runner

**Files:**
- Create: `src/agent/task_runner.rs`
- Modify: `src/agent/mod.rs`
- Modify: `src/lib.rs`
- Test: `tests/task_runner_test.rs`
- Reuse: `src/compat/runtime.rs`, `src/compat/orchestration.rs`

- [ ] **Step 1: Write a failing runner regression test**

Create `tests/task_runner_test.rs` covering:
- empty instruction returns the same `TaskComplete` failure summary
- missing LLM config still returns the same summary shape
- the pipe adapter still emits `LogEntry` before `TaskComplete`

- [ ] **Step 2: Run the targeted regression tests first**

Run:

```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```

Expected: `task_runner_test` fails because the shared runner does not exist yet.

- [ ] **Step 3: Define the transport-neutral request model**

Create `src/agent/task_runner.rs` with a request struct that mirrors the current pipe payload:

```rust
pub struct SubmitTaskRequest {
    pub instruction: String,
    pub conversation_id: Option<String>,
    pub messages: Vec<ConversationMessage>,
    pub page_url: Option<String>,
    pub page_title: Option<String>,
}
```

Normalize empty strings to `None` at the adapter boundary.

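The normalization rule above could be sketched as a small helper. The name `normalize_optional` is hypothetical, and treating whitespace-only input as empty is an assumption the plan does not state; adjust it if the pipe payload is expected to preserve surrounding whitespace.

```rust
/// Hypothetical helper (not named in the plan): maps an empty string to
/// `None` so the shared runner never has to treat "" as a real value.
/// Assumption: whitespace-only input also counts as empty.
pub fn normalize_optional(raw: &str) -> Option<String> {
    let trimmed = raw.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed.to_string())
    }
}
```

An adapter would call this once per optional field (`conversation_id`, `page_url`, `page_title`) while building the `SubmitTaskRequest`, which keeps the runner itself free of transport-specific cleanup.
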
- [ ] **Step 4: Define an event sink abstraction**

Add a small trait that can emit the current agent events without depending on a specific transport:

```rust
pub trait AgentEventSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError>;
}
```

The existing pipe transport should implement this first.

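One benefit of the sink trait is that regression tests can assert event order without any transport. Below is a minimal sketch of an in-memory sink. The `AgentMessage` and `PipeError` definitions are stand-ins whose fields are assumed (the real types live in `src/pipe/protocol.rs`), and `RecordingSink` is a hypothetical test double, not part of the plan.

```rust
use std::sync::Mutex;

// Stand-ins for the real types; their actual shapes are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum AgentMessage {
    LogEntry { level: String, message: String },
    TaskComplete { success: bool, summary: String },
}

#[derive(Debug)]
pub struct PipeError(pub String);

pub trait AgentEventSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError>;
}

/// Hypothetical test double that records every event in arrival order.
#[derive(Default)]
pub struct RecordingSink {
    events: Mutex<Vec<AgentMessage>>,
}

impl RecordingSink {
    /// Returns a snapshot of everything sent so far.
    pub fn events(&self) -> Vec<AgentMessage> {
        self.events.lock().unwrap().clone()
    }
}

impl AgentEventSink for RecordingSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError> {
        self.events
            .lock()
            .map_err(|_| PipeError("sink lock poisoned".to_string()))?
            .push(message.clone());
        Ok(())
    }
}
```

A test such as "`LogEntry` arrives before `TaskComplete`" can then inspect `events()` after the runner returns.
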
- [ ] **Step 5: Move submit-task execution into a shared function**

Extract the body currently inside `BrowserMessage::SubmitTask` handling from `src/agent/mod.rs` into a shared function such as:

```rust
pub fn run_submit_task(
    sink: &dyn AgentEventSink,
    browser_backend: Arc<dyn BrowserBackend>,
    context: &AgentRuntimeContext,
    request: SubmitTaskRequest,
) -> Result<(), PipeError>
```

This function must still:
- validate empty instruction
- load sgclaw settings
- log runtime/config info
- choose orchestration vs compat runtime
- emit `AgentMessage::TaskComplete`

- [ ] **Step 6: Keep pipe mode as a thin adapter**

Refactor `handle_browser_message_with_context(...)` in `src/agent/mod.rs` so it only:
- pattern matches `BrowserMessage`
- converts `SubmitTask` into `SubmitTaskRequest`
- forwards into `run_submit_task(...)`

- [ ] **Step 7: Re-run the runner regressions**

Run:

```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```

Expected: both tests pass and pipe behavior remains unchanged.

- [ ] **Step 8: Commit**

```bash
git add src/agent/mod.rs src/agent/task_runner.rs src/lib.rs tests/task_runner_test.rs
git commit -m "refactor: extract shared submit-task runner"
```

---

## Task 2: Introduce a browser backend abstraction and wrap the current pipe implementation

**Files:**
- Create: `src/browser/mod.rs`
- Create: `src/browser/backend.rs`
- Create: `src/browser/pipe_backend.rs`
- Modify: `src/lib.rs`
- Modify: `src/compat/browser_tool_adapter.rs`
- Modify: `src/compat/browser_script_skill_tool.rs`
- Modify: `src/compat/runtime.rs`
- Modify: `src/compat/orchestration.rs`
- Modify: `src/compat/workflow_executor.rs`
- Test: `tests/browser_backend_capability_test.rs`
- Reuse: `src/pipe/browser_tool.rs`, `src/security/mac_policy.rs`

- [ ] **Step 1: Add a failing backend capability test**

Create `tests/browser_backend_capability_test.rs` to verify:
- pipe backend still exposes privileged surface metadata
- pipe backend still supports `Eval`
- browser-script tool exposure is disabled when `supports_eval()` is false

- [ ] **Step 2: Run the current browser adapter tests first**

Run:

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```

Expected: new capability test fails because the backend abstraction does not exist yet.

- [ ] **Step 3: Define the shared browser interface**

Create `src/browser/backend.rs`:

```rust
pub trait BrowserBackend: Send + Sync {
    fn invoke(
        &self,
        action: Action,
        params: Value,
        expected_domain: &str,
    ) -> Result<CommandOutput, PipeError>;

    fn surface_metadata(&self) -> ExecutionSurfaceMetadata;

    fn supports_eval(&self) -> bool {
        true
    }
}
```

- [ ] **Step 4: Implement the pipe backend as a wrapper**

Create `src/browser/pipe_backend.rs` that stores the current `BrowserPipeTool<T>` and forwards `invoke(...)` and `surface_metadata()` unchanged.

Pipe mode must continue using:
- `perform_handshake(...)`
- `MacPolicy::load_from_path(...)`
- `BrowserPipeTool::new(...).with_response_timeout(...)`

- [ ] **Step 5: Refactor runtime and tool adapters to depend on `Arc<dyn BrowserBackend>`**

Update:
- `src/compat/browser_tool_adapter.rs`
- `src/compat/browser_script_skill_tool.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`

Preserve external tool names:
- `browser_action`
- `superrpa_browser`

- [ ] **Step 6: Add capability gating for eval-dependent script tools**

If `supports_eval()` is false, do **not** expose browser-script skill tools from `build_browser_script_skill_tools(...)` in that backend mode.

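The gating rule can be illustrated with a reduced version of the trait. This is a sketch under assumptions: the trait below keeps only `supports_eval()`, and `exposed_tool_names`, `PipeLikeBackend`, and `NoEvalBackend` are hypothetical names rather than the real `build_browser_script_skill_tools(...)` signature.

```rust
// Reduced stand-in for `BrowserBackend`; only the capability flag is kept.
pub trait BrowserBackend {
    fn supports_eval(&self) -> bool {
        true
    }
}

/// Hypothetical backend that inherits the default (eval supported).
pub struct PipeLikeBackend;
impl BrowserBackend for PipeLikeBackend {}

/// Hypothetical backend that cannot run `Eval` yet.
pub struct NoEvalBackend;
impl BrowserBackend for NoEvalBackend {
    fn supports_eval(&self) -> bool {
        false
    }
}

/// Hypothetical tool assembly: the two stable browser tools are always
/// listed; script skills are appended only when the backend can run `Eval`.
pub fn exposed_tool_names(backend: &dyn BrowserBackend, script_skills: &[&str]) -> Vec<String> {
    let mut names = vec![
        "browser_action".to_string(),
        "superrpa_browser".to_string(),
    ];
    if backend.supports_eval() {
        names.extend(script_skills.iter().map(|s| s.to_string()));
    }
    names
}
```

Keeping the check at tool-assembly time means a backend without `Eval` never advertises a tool it cannot serve, instead of failing later at call time.
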
- [ ] **Step 7: Re-run browser adapter tests**

Run:

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```

Expected: all three pass.

- [ ] **Step 8: Commit**

```bash
git add src/browser src/lib.rs src/compat/browser_tool_adapter.rs src/compat/browser_script_skill_tool.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/browser_backend_capability_test.rs
git commit -m "refactor: abstract browser backend from pipe transport"
```

---

## Task 3: Implement the fixed browser WebSocket protocol codec in isolation

**Files:**
- Create: `src/browser/ws_protocol.rs`
- Test: `tests/browser_ws_protocol_test.rs`
- Reuse: `docs/_tmp_sgbrowser_ws_api_doc.txt`

- [ ] **Step 1: Write failing protocol codec tests**

Create `tests/browser_ws_protocol_test.rs` covering:
- exact outbound frame encoding
- callback payload decoding
- unknown callback format rejection
- mapping coverage for every supported v1 action

- [ ] **Step 2: Run the protocol tests first**

Run:

```bash
cargo test --test browser_ws_protocol_test
```

Expected: fail because the WS protocol codec does not exist yet.

- [ ] **Step 3: Encode the exact browser frame shapes**

Create `src/browser/ws_protocol.rs` so it can build exact array-form payloads such as:

```rust
[requesturl, "sgBrowserExcuteJsCodeByArea", target_url, js_code, area]
```

Serialize to the JSON string format required by the browser service.

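As a rough illustration of the array-form frame above, the sketch below builds the JSON text by hand using only the standard library. This is an assumption-heavy example: the real codec would more likely use `serde_json` from the plan's tech stack, the function name `encode_eval_frame` is hypothetical, and treating all five elements (including `area`) as strings is a guess that should be checked against the browser documentation.

```rust
/// Escapes a string as a JSON string literal, quotes included.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Hypothetical encoder for the eval frame; assumes every element is a string.
pub fn encode_eval_frame(request_url: &str, target_url: &str, js_code: &str, area: &str) -> String {
    let fields = [
        request_url,
        "sgBrowserExcuteJsCodeByArea",
        target_url,
        js_code,
        area,
    ];
    let parts: Vec<String> = fields.iter().map(|f| json_string(f)).collect();
    format!("[{}]", parts.join(","))
}
```

An exact-encoding test can then compare the returned string byte for byte against a fixture, which is the property Step 1 asks for.
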
- [ ] **Step 4: Define the v1 action mapping table**

Support only the actions already needed by current sgclaw flows:
- `Navigate`
- `GetText`
- `Click`
- `Type`
- `Eval`

Document which browser functions each one maps to and what assumptions they rely on.

- [ ] **Step 5: Define callback parsing and correlation rules**

Represent callback-bearing operations explicitly, including the callback function naming or request-correlation strategy the backend will depend on.

- [ ] **Step 6: Reject unsupported or malformed shapes early**

Fail fast for:
- unsupported actions
- malformed callback payloads
- missing request correlation metadata

- [ ] **Step 7: Re-run the protocol tests**

Run:

```bash
cargo test --test browser_ws_protocol_test
```

Expected: pass with no network dependency.

- [ ] **Step 8: Commit**

```bash
git add src/browser/ws_protocol.rs tests/browser_ws_protocol_test.rs
git commit -m "test: codify fixed browser websocket protocol"
```

---

## Task 4: Build the browser WS backend with synchronous invoke semantics

**Files:**
- Create: `src/browser/ws_backend.rs`
- Modify: `src/browser/mod.rs`
- Test: `tests/browser_ws_backend_test.rs`
- Reuse: `CommandOutput`, `PipeError`, `ExecutionSurfaceMetadata`, `MacPolicy`

- [ ] **Step 1: Write failing backend behavior tests**

Create `tests/browser_ws_backend_test.rs` covering:
- zero return + no callback => success
- non-zero return => failure
- zero return + callback => success with normalized `CommandOutput`
- callback timeout => timeout error
- dropped socket => clear failure

- [ ] **Step 2: Run backend tests first**

Run:

```bash
cargo test --test browser_ws_backend_test
```

Expected: fail because the WS backend does not exist yet.

- [ ] **Step 3: Build a long-lived browser connection manager**

Implement `src/browser/ws_backend.rs` to connect to `ws://127.0.0.1:12345` and expose blocking `invoke(...)` calls.

Use a dedicated connection loop plus request/response coordination instead of scattering raw socket calls through the runtime.

- [ ] **Step 4: Preserve local guardrails before send**

Validate `MacPolicy` before translating an action into the browser WS protocol, matching current pipe backend behavior.

- [ ] **Step 5: Normalize immediate status returns and delayed callbacks**

For each `invoke(...)` call:
- fail immediately on non-zero return codes
- succeed immediately for operations with no data callback
- wait for the matching callback for result-bearing operations
- convert the final outcome into `CommandOutput`

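The four rules above can be sketched with a standard-library channel standing in for the connection loop. Everything here is illustrative: `resolve_invoke` and `InvokeError` are hypothetical names, `CommandOutput` is a reduced stand-in for the real type, and the real backend is expected to report failures through `PipeError`.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
pub enum InvokeError {
    BrowserStatus(i64),
    CallbackTimeout,
    SocketDropped,
}

/// Reduced stand-in for the real `CommandOutput`.
#[derive(Debug, PartialEq)]
pub struct CommandOutput {
    pub data: Option<String>,
}

/// Turns an immediate status code plus an optional delayed callback into
/// one result. `callbacks` is assumed to be fed by the connection loop.
pub fn resolve_invoke(
    status: i64,
    expects_callback: bool,
    callbacks: &Receiver<String>,
    timeout: Duration,
) -> Result<CommandOutput, InvokeError> {
    // Rule 1: a non-zero return code fails immediately.
    if status != 0 {
        return Err(InvokeError::BrowserStatus(status));
    }
    // Rule 2: operations without a data callback succeed immediately.
    if !expects_callback {
        return Ok(CommandOutput { data: None });
    }
    // Rules 3 and 4: wait for the callback, then normalize the outcome.
    match callbacks.recv_timeout(timeout) {
        Ok(payload) => Ok(CommandOutput { data: Some(payload) }),
        Err(RecvTimeoutError::Timeout) => Err(InvokeError::CallbackTimeout),
        Err(RecvTimeoutError::Disconnected) => Err(InvokeError::SocketDropped),
    }
}
```

Because the channel carries no request id, this shape is only safe with one request in flight at a time, which matches the serialized v1 concurrency model.
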
- [ ] **Step 6: Keep v1 concurrency intentionally serialized**

Allow only one in-flight browser request at a time unless the browser callback protocol proves a stable request-id guarantee.

- [ ] **Step 7: Re-run backend tests**

Run:

```bash
cargo test --test browser_ws_backend_test
```

Expected: pass using mocks/fakes, not the real browser.

- [ ] **Step 8: Commit**

```bash
git add src/browser/mod.rs src/browser/ws_backend.rs tests/browser_ws_backend_test.rs
git commit -m "feat: add browser websocket backend"
```

---

## Task 5: Add the `sg_claw` service protocol and single-session server

**Files:**
- Create: `src/service/mod.rs`
- Create: `src/service/protocol.rs`
- Create: `src/service/server.rs`
- Create: `src/bin/sg_claw.rs`
- Modify: `src/lib.rs`
- Modify: `Cargo.toml`
- Test: `tests/service_ws_session_test.rs`
- Reuse: `AgentMessage::LogEntry`, `AgentMessage::TaskComplete`, `SubmitTaskRequest`, `run_submit_task(...)`

- [ ] **Step 1: Write failing service session tests**

Create `tests/service_ws_session_test.rs` to verify:
- first client attaches
- second client gets `Busy`
- disconnect resets session state
- overlapping task submission is rejected clearly

- [ ] **Step 2: Run the session tests first**

Run:

```bash
cargo test --test service_ws_session_test
```

Expected: fail because the service layer does not exist yet.

- [ ] **Step 3: Define a thin client/service WS protocol**

In `src/service/protocol.rs`, reuse existing task/event shapes as much as possible:

```rust
ClientMessage::SubmitTask { instruction, conversation_id, messages, page_url, page_title }
ClientMessage::Ping
ServiceMessage::LogEntry { level, message }
ServiceMessage::TaskComplete { success, summary }
ServiceMessage::Busy { message }
```

- [ ] **Step 4: Add the service event sink adapter**

Implement `AgentEventSink` for the service session writer so the shared task runner can stream `LogEntry` and `TaskComplete` over the service WebSocket.

- [ ] **Step 5: Implement single-active-client session state**

Model explicit states such as:
- `Idle`
- `ClientAttached`
- `TaskRunning`

Reject a second client with `ServiceMessage::Busy` and close the socket. Reject overlapping tasks instead of queueing them.

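The three states and the two rejection rules can be modeled as a small state machine. The sketch below is one possible shape, not the plan's implementation: the `Session` type, its method names, and the `SessionError` variants are all hypothetical, and the networking side (sending `ServiceMessage::Busy`, closing the socket) is left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    Idle,
    ClientAttached,
    TaskRunning,
}

/// Hypothetical rejection reasons for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum SessionError {
    Busy,
    NoClient,
    TaskAlreadyRunning,
    NoTaskRunning,
}

pub struct Session {
    state: SessionState,
}

impl Session {
    pub fn new() -> Self {
        Session { state: SessionState::Idle }
    }

    pub fn state(&self) -> SessionState {
        self.state
    }

    /// A client may attach only while the service is idle.
    pub fn attach(&mut self) -> Result<(), SessionError> {
        match self.state {
            SessionState::Idle => {
                self.state = SessionState::ClientAttached;
                Ok(())
            }
            _ => Err(SessionError::Busy),
        }
    }

    /// Overlapping tasks are rejected rather than queued.
    pub fn start_task(&mut self) -> Result<(), SessionError> {
        match self.state {
            SessionState::ClientAttached => {
                self.state = SessionState::TaskRunning;
                Ok(())
            }
            SessionState::TaskRunning => Err(SessionError::TaskAlreadyRunning),
            SessionState::Idle => Err(SessionError::NoClient),
        }
    }

    /// Task completion returns the session to the attached state.
    pub fn finish_task(&mut self) -> Result<(), SessionError> {
        match self.state {
            SessionState::TaskRunning => {
                self.state = SessionState::ClientAttached;
                Ok(())
            }
            _ => Err(SessionError::NoTaskRunning),
        }
    }

    /// Disconnect always releases the session, whatever the state.
    pub fn detach(&mut self) {
        self.state = SessionState::Idle;
    }
}
```

Keeping the transitions in one type makes the four session tests from Step 1 expressible without opening a socket.
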
- [ ] **Step 6: Add the service binary**

Create `src/bin/sg_claw.rs` that:
- loads config
- creates the browser WS backend
- listens for local client connections
- routes `SubmitTask` into `run_submit_task(...)`

Keep `src/main.rs` and the existing `sgclaw::run()` pipe path unchanged.

- [ ] **Step 7: Re-run the session tests**

Run:

```bash
cargo test --test service_ws_session_test
```

Expected: pass without the real browser.

- [ ] **Step 8: Commit**

```bash
git add src/service src/bin/sg_claw.rs src/lib.rs Cargo.toml tests/service_ws_session_test.rs
git commit -m "feat: add claw-ws service entrypoint"
```

---

## Task 6: Add the `sg_claw_client` interactive CLI

**Files:**
- Create: `src/bin/sg_claw_client.rs`
- Modify: `Cargo.toml`
- Test: `tests/service_task_flow_test.rs`
- Reuse: `src/service/protocol.rs`

- [ ] **Step 1: Write failing client/service task flow tests**

Create `tests/service_task_flow_test.rs` to verify:
- the submit-task request reaches the service
- log entries stream in order
- the final summary arrives exactly once
- disconnect after task completion is handled cleanly

- [ ] **Step 2: Run the flow tests first**

Run:

```bash
cargo test --test service_task_flow_test
```

Expected: fail because the client does not exist yet.

- [ ] **Step 3: Implement a thin interactive client loop**

Create `src/bin/sg_claw_client.rs` that:
- connects to the local `sg_claw` service
- reads a line of user input
- sends `ClientMessage::SubmitTask`
- prints streamed `LogEntry` events as they arrive
- ends the turn on `TaskComplete`

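The receive half of that loop can be sketched independently of the socket by treating incoming events as an iterator. This is a simplified illustration: `run_turn` is a hypothetical helper, the `ServiceMessage` enum below is a stand-in whose field types are assumed, and the log line format is arbitrary.

```rust
use std::io::Write;

// Stand-in for the service protocol; field types are assumptions.
pub enum ServiceMessage {
    LogEntry { level: String, message: String },
    TaskComplete { success: bool, summary: String },
    Busy { message: String },
}

/// Hypothetical helper: prints log entries as they arrive and ends the turn
/// on the first `TaskComplete`, returning its (success, summary) pair.
/// Returns `None` if the service was busy or the stream ended early.
pub fn run_turn<I, W>(events: I, out: &mut W) -> std::io::Result<Option<(bool, String)>>
where
    I: IntoIterator<Item = ServiceMessage>,
    W: Write,
{
    for event in events {
        match event {
            ServiceMessage::LogEntry { level, message } => {
                writeln!(out, "[{}] {}", level, message)?;
            }
            ServiceMessage::TaskComplete { success, summary } => {
                return Ok(Some((success, summary)));
            }
            ServiceMessage::Busy { message } => {
                writeln!(out, "busy: {}", message)?;
                return Ok(None);
            }
        }
    }
    Ok(None)
}
```

Returning on the first `TaskComplete` means anything the service sends afterwards is ignored for that turn, which lines up with the "final summary arrives exactly once" test.
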
- [ ] **Step 4: Keep the client intentionally dumb**
|
||||
|
||||
Do **not** duplicate runtime logic in the client. Browser access, skills, orchestration, and task execution remain entirely inside the service.
|
||||
|
||||
- [ ] **Step 5: Re-run the flow tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test service_task_flow_test
|
||||
```
|
||||
|
||||
Expected: pass without the real browser.
|
||||
|
||||
- [ ] **Step 6: Build the new binaries explicitly**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo build --bin sg_claw --bin sg_claw_client
|
||||
```
|
||||
|
||||
Expected: both binaries compile successfully.
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
|
||||
git add src/bin/sg_claw_client.rs Cargo.toml tests/service_task_flow_test.rs
|
||||
git commit -m "feat: add interactive claw-ws client"
|
||||
```
|
||||
|
||||
---

## Task 7: Finish wiring, preserve pipe mode, and verify end-to-end behavior

**Files:**
- Modify: `Cargo.toml`
- Modify: `src/lib.rs`
- Modify: `src/config/settings.rs`
- Optional modify: `src/runtime/engine.rs`
- Reuse: `tests/browser_tool_test.rs`, `tests/runtime_task_flow_test.rs`, `tests/compat_runtime_test.rs`

- [ ] **Step 1: Add only the minimum config surface for v1**

Add settings such as:
- `browser_ws_url` defaulting to `ws://127.0.0.1:12345`
- `service_ws_listen_addr` defaulting to local loopback

Do **not** change the meaning of existing browser backend/profile settings just to represent service mode.
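
One way to keep that surface small is a dedicated struct with explicit defaults, separate from the existing backend/profile settings. The sketch below is an assumption-laden illustration: the `WsSettings` struct, the `listens_on_loopback` helper, and the listen port `42321` are all hypothetical; only the two field names and the browser URL default come from the plan.

```rust
use std::net::SocketAddr;

/// Hypothetical v1 settings surface. Field names follow the plan; the
/// struct itself and the listen port are assumptions for illustration.
#[derive(Debug, Clone, PartialEq)]
struct WsSettings {
    browser_ws_url: String,
    service_ws_listen_addr: String,
}

impl Default for WsSettings {
    fn default() -> Self {
        Self {
            browser_ws_url: "ws://127.0.0.1:12345".to_string(),
            // The plan only says "local loopback"; this port is a placeholder.
            service_ws_listen_addr: "127.0.0.1:42321".to_string(),
        }
    }
}

impl WsSettings {
    /// True when the listen address parses and points at a loopback IP.
    fn listens_on_loopback(&self) -> bool {
        self.service_ws_listen_addr
            .parse::<SocketAddr>()
            .map(|addr| addr.ip().is_loopback())
            .unwrap_or(false)
    }
}
```

A check like `listens_on_loopback` could back a startup warning if someone configures a non-local listen address, which fits the single-client v1 scope.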

- [ ] **Step 2: Keep external browser tool naming stable**

Verify that the runtime still exposes:
- `superrpa_browser`
- `browser_action`

under both pipe and WS modes where the backend supports them.

- [ ] **Step 3: Re-run the current pipe regression suite**

Run:

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test --test pipe_handshake_test --test pipe_protocol_test --test task_protocol_test
```

Expected: all existing pipe-oriented tests still pass unchanged.

- [ ] **Step 4: Run the new WS-focused suite**

Run:

```bash
cargo test --test task_runner_test --test browser_ws_protocol_test --test browser_ws_backend_test --test browser_backend_capability_test --test service_ws_session_test --test service_task_flow_test
```

Expected: all new tests pass without launching the real browser.

- [ ] **Step 5: Run a full Rust test sweep**

Run:

```bash
cargo test --tests
```

Expected: all Rust tests pass.

- [ ] **Step 6: Build all three binaries**

Run:

```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```

Expected: all three binaries compile.

- [ ] **Step 7: Perform a manual local smoke test**

Manual test:
1. Start the browser app so `ws://127.0.0.1:12345` is available.
2. Run `cargo run --bin sg_claw`.
3. In another terminal, run `cargo run --bin sg_claw_client`.
4. Submit a simple browser task such as opening a page or fetching visible text.
5. Confirm the client prints streaming logs and exactly one final completion summary.
6. Confirm the old pipe-mode entry still starts via `cargo run`.

Expected: both modes work side-by-side.

- [ ] **Step 8: Commit**

```bash
git add Cargo.toml src/lib.rs src/config/settings.rs src/runtime/engine.rs
git commit -m "feat: wire parallel claw-ws transport"
```

---

## Verification Checklist

### Fast regression checks

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test
```

Expected: current pipe/browser runtime behavior remains green.

### Full Rust test sweep

```bash
cargo test --tests
```

Expected: all Rust tests pass.

### Binary build verification

```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```

Expected: all three binaries compile.

### Manual end-to-end verification

- Browser app listening on `ws://127.0.0.1:12345`
- `cargo run --bin sg_claw`
- `cargo run --bin sg_claw_client`
- submit one browser task
- verify streaming logs, final completion, and single-client lock behavior
- verify `cargo run` still preserves old pipe bootstrap

---

## Notes for Implementation

- Keep the current pipe bootstrap in `src/lib.rs` intact until the shared runner and pipe backend wrapper are both green.
- Prefer small commits at each task boundary.
- Keep the new WS path additive and isolated.
- Do not ship partial browser capabilities under stable tool names.
- Treat `docs/_tmp_sgbrowser_ws_api_doc.txt` as the browser WS protocol source of truth while implementing `src/browser/ws_protocol.rs`.
@@ -0,0 +1,607 @@
# WS Browser Backend Auth Replacement Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Replace the ws service path’s empty-session-key `BrowserPipeTool` dependency with a ws-native browser backend path so real browser websocket calls work, while preserving legacy pipe behavior exactly.

**Architecture:** Keep the existing pipe entry untouched and add a ws-only parallel execution seam. The ws service path will construct a `ServiceBrowserWsClient` plus `WsBrowserBackend`, pass that backend through a new ws-only submit-task entry, and let the existing compat/runtime stack consume `Arc<dyn BrowserBackend>` instead of requiring `BrowserPipeTool` on the ws path.

**Tech Stack:** Rust 2021, current sgclaw agent/task runner, compat runtime/orchestration stack, `tungstenite`, `serde_json`, existing `MacPolicy`, existing `BrowserBackend`/`WsBrowserBackend`, and the current Rust test suite.

---

## Scope Guardrails

- Only change the ws service path.
- Do **not** change `src/lib.rs` pipe runtime behavior.
- Do **not** change pipe handshake semantics.
- Do **not** introduce fake session keys, fake HMAC seeds, or auth bypasses.
- Keep legacy `run_submit_task(...)` available for the pipe entry.
- If a shared layer must change, add a parallel ws-only entry instead of weakening the pipe path.
- Keep the current single-client, single-task service model.
- Do not broaden this slice into browser process launch, queueing, multi-client support, or protocol extensions.

---

## File Structure

### Existing files to modify

- Modify: `src/agent/task_runner.rs` - keep the current pipe-oriented submit path and add the ws-only backend-based submit path.
- Modify: `src/compat/runtime.rs` - add a backend-driven execution entry that accepts `Arc<dyn BrowserBackend>` directly, while keeping the current pipe-oriented public functions behaviorally unchanged.
- Modify: `src/compat/orchestration.rs` - add a matching backend-driven execution entry for orchestration/direct-route flows, while keeping the current pipe-oriented public functions behaviorally unchanged.
- Modify: `src/compat/workflow_executor.rs` - add backend-driven sibling APIs for direct-route/fallback execution, while keeping the current pipe-oriented public functions behaviorally unchanged.
- Modify: `src/service/server.rs` - replace the ws service’s `BrowserPipeTool::new(..., vec![])` path with a ws-native `WsClient` + `WsBrowserBackend` path.
- Modify: `src/service/mod.rs` - only if minimal re-export or call-signature changes are needed around the new ws-only submit path.
- Modify: `src/browser/mod.rs` - only if export cleanup is truly needed for the service wiring.
- Reuse: `src/agent/mod.rs` - keep the current pipe routing unchanged unless a tiny internal refactor is strictly needed to reuse shared code.
- Reuse: `src/browser/backend.rs` - existing shared browser backend trait.
- Reuse: `src/browser/ws_backend.rs` - existing ws-native browser backend implementation.
- Reuse: `src/browser/ws_protocol.rs` - existing browser websocket protocol codec.
- Reuse: `src/compat/browser_tool_adapter.rs` - should already speak `BrowserBackend`; only touch if a narrow ws regression forces it.
- Reuse: `src/compat/browser_script_skill_tool.rs` - eval-capability gating already exists; only touch if a narrow ws regression forces it.
- Reuse: `src/lib.rs` - pipe entrypoint must remain behaviorally unchanged; verify only.

### Existing tests to extend

- Modify: `tests/browser_ws_backend_test.rs` - keep existing ws backend coverage green after the service adapter wiring lands.
- Modify: `tests/browser_script_skill_tool_test.rs` - re-verify eval-gating and browser-script behavior after the shared compat/runtime seam changes.
- Modify: `tests/service_ws_session_test.rs` - update service-side unit/session tests to exercise the ws-only submit path.
- Modify: `tests/service_task_flow_test.rs` - add client→service chain coverage proving the ws path reaches a browser websocket and no longer emits `invalid hmac seed`.
- Modify: `src/service/server.rs` under `#[cfg(test)]` if the private service-side ws adapter cannot be exercised from an integration test crate without changing production visibility.

### New files to create

- Create: `tests/browser_ws_service_adapter_test.rs` if the adapter can be exercised through a public seam; otherwise keep the deterministic adapter tests as unit tests in `src/service/server.rs` so no production visibility changes are required.

---

## Task 1: Lock the ws-only behavior with deterministic failing tests

**Files:**
- Create: `tests/browser_ws_service_adapter_test.rs`
- Modify: `tests/service_task_flow_test.rs`
- Reuse: `tests/browser_ws_backend_test.rs`, `src/browser/ws_backend.rs`, `src/service/server.rs`

- [ ] **Step 1: Write the first failing backend/adapter test**

Create `tests/browser_ws_service_adapter_test.rs` with one focused test that directly exercises the ws-service adapter layer, without `sg_claw_client`, without LLM planning, and without natural-language tasks.

Start with the smallest real behavior from the spec:
- fake browser websocket server accepts one connection
- the ws-service adapter builds the same kind of client the service will use
- `WsBrowserBackend.invoke(Action::Navigate, ...)` succeeds on status `0`
- the fake server receives one text frame that decodes as a ws `Navigate` call
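
The fake-server half of such a test usually follows a "bind an ephemeral port, accept once on a thread, report what arrived over a channel" pattern. The sketch below shows only that pattern using the standard library: it speaks plain newline-delimited TCP and deliberately omits the WebSocket handshake, which a real test would add (for example by wrapping the accepted stream with `tungstenite`). The helper names `spawn_fake_server` and `send_line` are hypothetical.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Starts a one-shot fake server on an ephemeral loopback port.
/// Returns the bound address and a receiver yielding the first line received.
/// NOTE: plain TCP only; the WebSocket upgrade is intentionally left out.
fn spawn_fake_server() -> std::io::Result<(SocketAddr, Receiver<String>)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Accept exactly one connection, mirroring "accepts one connection".
        if let Ok((stream, _peer)) = listener.accept() {
            let mut line = String::new();
            if BufReader::new(stream).read_line(&mut line).is_ok() {
                let _ = tx.send(line.trim_end().to_string());
            }
        }
    });
    Ok((addr, rx))
}

/// Client-side helper standing in for the adapter under test:
/// connects and sends one newline-terminated text "frame".
fn send_line(addr: SocketAddr, text: &str) -> std::io::Result<()> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(text.as_bytes())?;
    stream.write_all(b"\n")
}
```

Binding to port `0` lets the OS pick a free port, so parallel test runs do not collide; the test then passes the returned address to the client under test and asserts on what the receiver yields.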

- [ ] **Step 2: Run that single new test and watch it fail**

Run:

```bash
cargo test --test browser_ws_service_adapter_test ws_service_backend_navigate_reaches_browser_websocket -- --nocapture
```

Expected: FAIL because the service-side ws client/adapter does not exist yet.

- [ ] **Step 3: Add the second failing deterministic test**

In the same file, add a test for the forced-close path:
- fake browser websocket server accepts a request, then closes or resets the socket before returning a status frame
- observe the error at the `WsBrowserBackend.invoke(...)` call site
- assert the outward error is exactly `PipeError::PipeClosed`

- [ ] **Step 4: Run only the forced-close test and watch it fail**

Run:

```bash
cargo test --test browser_ws_service_adapter_test ws_service_backend_maps_browser_disconnect_to_pipe_closed -- --nocapture
```

Expected: FAIL because the service-side ws client/adapter still does not exist.

- [ ] **Step 5: Add the third failing deterministic test**

In the same file, add a callback-timeout test:
- fake browser websocket server returns status `0`
- it never returns the callback frame
- assert the outward error at `invoke(...)` is exactly `PipeError::Timeout`

Use a tiny response timeout in the backend under test.
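
The timeout and forced-close expectations can be illustrated without any websocket at all. The sketch below uses a standard-library channel as a stand-in for the browser connection and maps its two failure modes onto the error variants this plan names. The `PipeError` enum here is a reduced, hypothetical mirror of the crate's type, and `recv_frame` is an invented helper.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical mirror of the crate's error type, reduced to the two
/// variants these tests assert on.
#[derive(Debug, PartialEq)]
enum PipeError {
    PipeClosed,
    Timeout,
}

/// Waits for the next frame, mapping "nothing arrived in time" to
/// `Timeout` and "the other side went away" to `PipeClosed`.
fn recv_frame(frames: &Receiver<String>, timeout: Duration) -> Result<String, PipeError> {
    frames.recv_timeout(timeout).map_err(|err| match err {
        RecvTimeoutError::Timeout => PipeError::Timeout,
        RecvTimeoutError::Disconnected => PipeError::PipeClosed,
    })
}
```

With a timeout of a few milliseconds, a test in this style stays fast: the status frame is received, the missing callback yields `Timeout`, and dropping the sender yields `PipeClosed`.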

- [ ] **Step 6: Run only the callback-timeout test and watch it fail**

Run:

```bash
cargo test --test browser_ws_service_adapter_test ws_service_backend_times_out_waiting_for_callback -- --nocapture
```

Expected: FAIL because the service-side ws client/adapter still does not exist.

- [ ] **Step 7: Add the end-to-end failing regression for the auth bug**

Extend `tests/service_task_flow_test.rs` with one client→service integration test that:
- starts a fake browser websocket server
- starts the real `sg_claw` service binary with a temp config pointing `browserWsUrl` to that fake server
- starts the real `sg_claw_client`
- submits the fixed instruction `打开知乎热榜并读取页面主区域文本` ("open the Zhihu hot list and read the text of the page's main area")
- captures service/client output
- asserts the fake browser server received at least one text frame
- asserts output does **not** contain `invalid hmac seed: session key must not be empty`

Do not assert planner details here. This test only proves the service path no longer goes through the empty-session-key auth failure.

- [ ] **Step 8: Run the integration regression and watch it fail**

Run:

```bash
cargo test --test service_task_flow_test ws_service_submit_task_no_longer_hits_invalid_hmac_seed -- --nocapture
```

Expected: FAIL on the current code because the ws service still constructs `BrowserPipeTool::new(..., vec![])`.

- [ ] **Step 9: Commit the red tests only after they are all in place**

Do not commit yet if any required red test was skipped. The next task will make them pass.

---

## Task 2: Add a ws-only browser-backend execution seam without changing the pipe path

**Files:**
- Modify: `src/agent/task_runner.rs`
- Modify: `src/compat/runtime.rs`
- Modify: `src/compat/orchestration.rs`
- Modify: `src/compat/workflow_executor.rs`
- Reuse: `src/agent/mod.rs`, `src/browser/backend.rs`
- Test: `tests/task_runner_test.rs`, `tests/browser_script_skill_tool_test.rs`

- [ ] **Step 1: Write the smallest failing runner-level ws entry test**

Extend `tests/task_runner_test.rs` with a focused test that proves there is a ws-only submit entry accepting `Arc<dyn BrowserBackend>` and an arbitrary event sink, while the old `run_submit_task(...)` signature still exists for pipe mode.

The test can stay on the missing-LLM-config path so it does not need a real browser call. It should compile only once the new ws-only function exists.

- [ ] **Step 2: Run the targeted runner test and watch it fail**

Run:

```bash
cargo test --test task_runner_test ws_only_submit_task_entry_accepts_browser_backend -- --nocapture
```

Expected: FAIL to compile or FAIL to link because the ws-only entry does not exist yet.

- [ ] **Step 3: Add the new ws-only submit-task entry in `src/agent/task_runner.rs`**

Keep the current pipe function intact:

```rust
pub fn run_submit_task<T: Transport + 'static>(... browser_tool: &BrowserPipeTool<T>, ...)
```

Add a parallel entry for the service path, for example:

```rust
pub fn run_submit_task_with_browser_backend(
    sink: &dyn AgentEventSink,
    browser_backend: Arc<dyn BrowserBackend>,
    context: &AgentRuntimeContext,
    request: SubmitTaskRequest,
) -> Result<(), PipeError>
```

Rules:
- share as much internal logic as possible with the pipe path
- do not change `run_submit_task(...)` behavior
- do not change `src/agent/mod.rs` pipe wiring except, at most, small internal refactoring to reuse common code
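
To make the "parallel entry, shared internals" idea concrete, here is a heavily simplified sketch. Every type in it is a hypothetical stand-in (`PipeTool` for `BrowserPipeTool<T>`, a one-method `BrowserBackend`, string results instead of `PipeError`), and the two-argument signatures are not the real ones; it only illustrates that both entries can reuse one private helper while the pipe entry keeps its own shape.

```rust
use std::sync::Arc;

/// Hypothetical one-method version of the shared backend trait.
trait BrowserBackend {
    fn invoke(&self, action: &str) -> Result<String, String>;
}

/// Stand-in for `BrowserPipeTool<T>`; not the real type.
struct PipeTool;

impl PipeTool {
    fn call(&self, action: &str) -> Result<String, String> {
        Ok(format!("pipe:{}", action))
    }
}

/// Stand-in ws backend used only to exercise the ws entry.
struct FakeWsBackend;

impl BrowserBackend for FakeWsBackend {
    fn invoke(&self, action: &str) -> Result<String, String> {
        Ok(format!("ws:{}", action))
    }
}

/// Logic shared by both entries, so the two paths cannot drift apart.
fn validate_instruction(instruction: &str) -> Result<&str, String> {
    let trimmed = instruction.trim();
    if trimmed.is_empty() {
        Err("empty instruction".to_string())
    } else {
        Ok(trimmed)
    }
}

/// Existing pipe entry: its signature and behavior stay as they were.
fn run_submit_task(tool: &PipeTool, instruction: &str) -> Result<String, String> {
    let instruction = validate_instruction(instruction)?;
    tool.call(instruction)
}

/// New ws-only sibling: same shared helper, different browser surface.
fn run_submit_task_with_browser_backend(
    backend: Arc<dyn BrowserBackend>,
    instruction: &str,
) -> Result<String, String> {
    let instruction = validate_instruction(instruction)?;
    backend.invoke(instruction)
}
```

The design choice this illustrates is additive: nothing that calls the pipe entry needs to change, and the ws path gets its own function rather than a new parameter on the old one.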

- [ ] **Step 4: Add a backend-driven compat runtime entry**

In `src/compat/runtime.rs`, add a parallel entry that accepts `Arc<dyn BrowserBackend>` directly instead of `BrowserPipeTool<T>`.

Keep the existing pipe-oriented public function in place.

The backend-driven entry must preserve:
- existing log emission order
- tool names (`superrpa_browser`, `browser_action`)
- existing browser-script tool gating behavior
- existing office/screen tool attachment logic
- existing conversation seeding and provider setup

- [ ] **Step 5: Add backend-driven orchestration and workflow-executor entries**

In `src/compat/orchestration.rs`, add the matching backend-driven entry so direct-route flows and fallback flows can run with `Arc<dyn BrowserBackend>` on the ws path.

In `src/compat/workflow_executor.rs`, add backend-driven sibling APIs for any direct-route/fallback execution that is currently hard-wired to `BrowserPipeTool<T>`.

Keep the existing pipe-oriented orchestration and workflow-executor public functions in place.

- [ ] **Step 6: Route the new ws-only submit entry through the backend-driven compat/orchestration/workflow-executor path**

Inside `src/agent/task_runner.rs`, make the new ws-only submit entry call the new backend-based compat/orchestration functions, while the old pipe entry keeps calling the old pipe-based functions.

This is the core compatibility seam, and it must cover both normal compat-runtime execution and direct-route/fallback workflow execution.

- [ ] **Step 7: Re-run the new runner test**

Run:

```bash
cargo test --test task_runner_test ws_only_submit_task_entry_accepts_browser_backend -- --nocapture
```

Expected: PASS.

- [ ] **Step 8: Re-run the full runner, workflow, and browser-script regressions**

Run:

```bash
cargo test --test task_runner_test --test browser_script_skill_tool_test -- --nocapture
```

Then run the workflow executor unit coverage that protects direct-route behavior:

```bash
cargo test compat::workflow_executor::tests -- --nocapture
```

Expected: all existing runner, workflow, and browser-script tests still pass, proving the pipe-facing path, direct-route behavior, and eval-gating stayed stable.

- [ ] **Step 9: Commit**

```bash
git add src/agent/task_runner.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/task_runner_test.rs tests/browser_script_skill_tool_test.rs
git commit -m "refactor: add ws-only browser backend submit path"
```

---

## Task 3: Replace the ws service’s empty-session-key browser tool with a ws-native backend

**Files:**
- Modify: `src/service/server.rs`
- Modify: `src/service/mod.rs` only if minimal re-export or signature cleanup is required
- Modify: `src/browser/mod.rs` only if export cleanup is needed
- Test: `tests/browser_ws_service_adapter_test.rs`
- Reuse: `src/browser/ws_backend.rs`, `src/browser/ws_protocol.rs`

- [ ] **Step 1: Write the smallest failing service-side adapter compile target**

Add a compile-level or construction-level assertion in `tests/browser_ws_service_adapter_test.rs` that the service path can construct the new service-side ws client type used by `serve_client(...)`.

This should fail until the type exists in `src/service/server.rs`.

- [ ] **Step 2: Run the adapter test group and watch the constructor test fail**

Run:

```bash
cargo test --test browser_ws_service_adapter_test -- --nocapture
```

Expected: FAIL because the service-side ws client type does not exist yet.

- [ ] **Step 3: Introduce `ServiceBrowserWsClient` in `src/service/server.rs`**

Create a narrow client type that owns the real websocket connection to `browser_ws_url` and implements `WsClient`.

Required responsibilities only:
- lazily connect on first use
- send raw text frames
- receive raw text frames with timeout
- map close/reset to exactly `PipeError::PipeClosed`
- map connect failure to exactly `PipeError::Protocol("browser websocket connect failed: ...")`
- map timeouts to exactly `PipeError::Timeout`

Do **not** duplicate `WsBrowserBackend` responsibilities here.
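
The lazy-connect and error-mapping responsibilities can be sketched independently of any websocket library. In the example below, `PipeError` is a reduced, hypothetical mirror of the crate's type, `LazyConn` and `map_io_error` are invented names, and the connection is a generic `C` produced by a closure, so this shows the shape of the logic rather than the real `ServiceBrowserWsClient`. Which `std::io::ErrorKind` values a real websocket library surfaces is also an assumption.

```rust
use std::io::{self, ErrorKind};

/// Hypothetical mirror of the crate's `PipeError`, reduced to the
/// variants this plan names.
#[derive(Debug, PartialEq)]
enum PipeError {
    PipeClosed,
    Timeout,
    Protocol(String),
}

/// Maps an I/O error from an established connection to the outward error.
fn map_io_error(err: &io::Error) -> PipeError {
    match err.kind() {
        ErrorKind::ConnectionReset
        | ErrorKind::ConnectionAborted
        | ErrorKind::BrokenPipe
        | ErrorKind::UnexpectedEof => PipeError::PipeClosed,
        // Socket read timeouts surface as `WouldBlock` on Unix and `TimedOut` on Windows.
        ErrorKind::TimedOut | ErrorKind::WouldBlock => PipeError::Timeout,
        _ => PipeError::Protocol(err.to_string()),
    }
}

/// Holds a connection that is only opened on first use.
struct LazyConn<C, F: Fn() -> io::Result<C>> {
    connect: F,
    conn: Option<C>,
}

impl<C, F: Fn() -> io::Result<C>> LazyConn<C, F> {
    fn new(connect: F) -> Self {
        Self { connect, conn: None }
    }

    /// Returns the live connection, connecting first if needed.
    /// A failed connect is reported with the prefix the plan requires.
    fn get(&mut self) -> Result<&mut C, PipeError> {
        if self.conn.is_none() {
            let conn = (self.connect)().map_err(|err| {
                PipeError::Protocol(format!("browser websocket connect failed: {}", err))
            })?;
            self.conn = Some(conn);
        }
        Ok(self.conn.as_mut().expect("connection was just set"))
    }
}
```

Injecting the connect step as a closure keeps the mapping testable with a counter or a forced failure, without opening a socket.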

- [ ] **Step 4: Remove ws-path use of `BrowserPipeTool::new(..., vec![])`**

In `serve_client(...)`, replace this shape:

```rust
let transport = Arc::new(ServiceBrowserTransport::new(...));
let browser_tool = BrowserPipeTool::new(transport.clone(), mac_policy.clone(), vec![]);
```

with the ws-native shape:

```rust
let ws_client = Arc::new(ServiceBrowserWsClient::new(...));
let browser_backend: Arc<dyn BrowserBackend> = Arc::new(
    WsBrowserBackend::new(ws_client, mac_policy.clone(), initial_request_url(...))
        .with_response_timeout(BROWSER_RESPONSE_TIMEOUT)
);
```

Then route the task through the new ws-only submit entry from Task 2.

- [ ] **Step 5: Delete or narrow old ws-path transport code that duplicated protocol handling**

Remove the service-only callback polling / response queue logic that existed solely to feed `BrowserPipeTool`.

Keep only what is still needed for:
- service client websocket I/O (`sg_claw_client` ↔ `sg_claw`)
- browser websocket I/O (`sg_claw` ↔ `browser_ws_url`)

Do not leave two competing ws protocol implementations in `src/service/server.rs`.

- [ ] **Step 6: Re-run deterministic adapter/backend tests**

Run:

```bash
cargo test --test browser_ws_service_adapter_test -- --nocapture
```

Expected: PASS, including:
- navigate success
- disconnect => `PipeError::PipeClosed`
- callback timeout => `PipeError::Timeout`

- [ ] **Step 7: Re-run existing ws backend tests**

Run:

```bash
cargo test --test browser_ws_backend_test -- --nocapture
```

Expected: PASS, confirming the service adapter change did not break the existing ws backend semantics.

- [ ] **Step 8: Commit**

```bash
git add src/service/server.rs src/service/mod.rs src/browser/mod.rs tests/browser_ws_service_adapter_test.rs
git commit -m "feat: switch ws service to ws-native browser backend"
```

---

## Task 4: Prove the auth bug is gone and pipe mode is unchanged

**Files:**
- Modify: `tests/service_ws_session_test.rs`
- Modify: `tests/service_task_flow_test.rs`
- Reuse: `src/lib.rs`, `src/service/mod.rs`, `src/compat/workflow_executor.rs`

- [ ] **Step 1: Update service session tests for the new ws-only call path**

Adjust any service session tests that still call `handle_client_message(...)` through the old ws-path `BrowserPipeTool` assumption.

Prefer one of these narrow approaches:
- add a backend-based sibling of `handle_client_message(...)` (Rust has no function overloading, so it needs its own name) used only in ws tests, or
- keep `handle_client_message(...)` pipe-oriented and test the ws path through `serve_client(...)` and the real service binary instead

Choose the option that changes the fewest existing tests and leaves the pipe path simplest.

- [ ] **Step 2: Run the focused service session file**

Run:

```bash
cargo test --test service_ws_session_test -- --nocapture
```

Expected: PASS.

- [ ] **Step 3: Make the auth-regression integration test pass**

Re-run the exact end-to-end regression from Task 1:

```bash
cargo test --test service_task_flow_test ws_service_submit_task_no_longer_hits_invalid_hmac_seed -- --nocapture
```

Expected: PASS, with evidence that:
- the fake browser websocket server received at least one frame
- output no longer contains `invalid hmac seed: session key must not be empty`

- [ ] **Step 4: Add one explicit mandatory assertion for browser websocket connect failures**

Add one focused assertion that a browser websocket connect failure surfaces outward as:

```rust
PipeError::Protocol("browser websocket connect failed: ...")
```

Do not leave this semantic implied.
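
Because the text after the prefix depends on the underlying OS error, an assertion on the variant plus the fixed prefix tends to be more stable than comparing the whole message. A minimal sketch, assuming a reduced hypothetical `PipeError` and an invented `is_connect_failure` helper:

```rust
/// Hypothetical mirror of the crate's `PipeError`; only `Protocol`
/// matters for this assertion.
#[derive(Debug)]
#[allow(dead_code)]
enum PipeError {
    Protocol(String),
    Timeout,
    PipeClosed,
}

/// Fixed prefix taken from the plan; the detail after it is OS-specific.
const CONNECT_FAILED_PREFIX: &str = "browser websocket connect failed: ";

/// True only for a `Protocol` error carrying the connect-failure prefix.
fn is_connect_failure(err: &PipeError) -> bool {
    matches!(err, PipeError::Protocol(msg) if msg.starts_with(CONNECT_FAILED_PREFIX))
}
```

In a test this would typically appear as `assert!(is_connect_failure(&err), "unexpected error: {:?}", err)`, so a failure prints the actual error.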

- [ ] **Step 5: Add one explicit ws direct-route regression**

Add one focused regression that proves a ws-backed browser backend can traverse a direct-route/fallback path that currently flows through `src/compat/workflow_executor.rs`.

Keep it deterministic and narrow. Prefer a fake backend plus direct function invocation over a planner-dependent natural-language end-to-end test.

- [ ] **Step 6: Run the ws-focused regression suite**

Run:

```bash
cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```

Then run the workflow-executor direct-route coverage:

```bash
cargo test compat::workflow_executor::tests -- --nocapture
```

Expected: all ws-focused and direct-route workflow tests pass.

- [ ] **Step 7: Run the required pipe and browser-script regression suite**

Run:

```bash
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```

Expected: all required pipe and browser-script regressions pass unchanged.

- [ ] **Step 8: Run the full relevant verification sweep**

Run:

```bash
cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```

Then run:

```bash
cargo test compat::workflow_executor::tests -- --nocapture
```

Expected: full mixed ws+pipe verification passes in fresh runs.

- [ ] **Step 9: Build the affected binaries**

Run:

```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```

Expected: all three binaries compile.

- [ ] **Step 10: Commit**

```bash
git add tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/browser_ws_service_adapter_test.rs src/compat/workflow_executor.rs
git commit -m "test: verify ws auth replacement and pipe regressions"
```

---

## Task 5: Manual smoke verification against the real browser

**Files:**
- Reuse only: no code changes unless a verified bug is found during smoke work

- [ ] **Step 1: Start the real browser websocket target**

Confirm the real sgBrowser endpoint is reachable at the configured `browserWsUrl`.

- [ ] **Step 2: Start the real ws service**

Run:

```bash
cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```

Expected: service prints the resolved listen address and browser websocket URL.

- [ ] **Step 3: Run the minimal browser task through the real client**

Run from a separate terminal with UTF-8-safe input:

```bash
cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```

Submit (in English: "open the Zhihu hot list and read the text of the page's main area"):

```text
打开知乎热榜并读取页面主区域文本
```

Expected:
- browser actions start executing
- no `invalid hmac seed: session key must not be empty`
- one final completion is returned

- [ ] **Step 4: Run the old Zhihu skill smoke**

Submit (in English: "read the Zhihu hot-list data and export an Excel file"):

```text
读取知乎热榜数据,并导出 excel 文件
```

Expected: the task enters the real browser action path instead of dying at auth initialization.

- [ ] **Step 5: Re-check the legacy pipe entry without modifying it**

Run:

```bash
cargo run
```

Only verify startup behavior appropriate for the current pipe environment. Do not change pipe code during this smoke step.

- [ ] **Step 6: If a smoke failure appears, stop and debug before editing**

Any failure found here must be handled with:
- a fresh reproducer
- a failing automated test if feasible
- the smallest scoped fix

Do not fold speculative smoke fixes into this slice.

---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
### Deterministic ws-only tests
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: ws-native backend and service adapter semantics are green without LLM/planner dependencies.
|
||||
|
||||
### Client→service ws chain tests
|
||||
|
||||
```bash
|
||||
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: the ws service path reaches the browser websocket and no longer emits the empty-session-key auth failure.
|
||||
|
||||
### Required pipe and browser-script regressions
|
||||
|
||||
```bash
|
||||
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: legacy pipe behavior and browser-script eval-gating remain unchanged.
|
||||
|
||||
### Binary build verification
|
||||
|
||||
```bash
|
||||
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
|
||||
```
|
||||
|
||||
Expected: all affected binaries compile.
|
||||
|
||||
### Manual end-to-end verification
|
||||
|
||||
- real sgBrowser running at configured `browserWsUrl`
|
||||
- `cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"`
|
||||
- `cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"`
|
||||
- run the Zhihu minimal task
|
||||
- run the old Zhihu export task
|
||||
- verify no `invalid hmac seed` appears
|
||||
- verify pipe startup still behaves as before
|
||||
|
||||
---
|
||||
|
||||
## Notes for Implementation
|
||||
|
||||
- Keep the current pipe bootstrap in `src/lib.rs` untouched.
|
||||
- Prefer adding ws-only functions over changing existing pipe signatures.
|
||||
- Reuse `WsBrowserBackend` for protocol semantics; do not re-implement callback handling inside the service.
|
||||
- Keep `ServiceBrowserWsClient` narrow: connection lifecycle + raw websocket I/O only.
|
||||
- Preserve exact outward error semantics from the spec:
|
||||
- connect failure => `PipeError::Protocol("browser websocket connect failed: ...")`
|
||||
- non-zero status => `PipeError::Protocol("browser returned non-zero status: ...")`
|
||||
- callback timeout => `PipeError::Timeout`
|
||||
- close/reset => `PipeError::PipeClosed`
|
||||
- Do not claim success until the mixed ws+pipe verification commands have been run fresh.
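The outward error mapping listed in these notes can be sketched as a single conversion function. This is a minimal illustration, not the repository's code: `PipeError` below is a local stand-in that models only the three variants named above, `WsFailure` and `to_pipe_error` are hypothetical names, and formatting the status detail as an integer is an assumption (the spec elides the detail after the colon).

```rust
/// Local stand-in for the crate's `PipeError`; only the variants named in the
/// notes are modeled here.
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Protocol(String),
    Timeout,
    PipeClosed,
}

/// Hypothetical classification of what the ws client observed.
#[derive(Debug)]
pub enum WsFailure {
    ConnectFailed(String),
    NonZeroStatus(i64),
    CallbackTimeout,
    ClosedOrReset,
}

/// Maps each observed failure onto the outward error semantics from the spec.
pub fn to_pipe_error(failure: WsFailure) -> PipeError {
    match failure {
        WsFailure::ConnectFailed(detail) => {
            PipeError::Protocol(format!("browser websocket connect failed: {detail}"))
        }
        // Assumption: the status detail is rendered as a plain integer.
        WsFailure::NonZeroStatus(status) => {
            PipeError::Protocol(format!("browser returned non-zero status: {status}"))
        }
        WsFailure::CallbackTimeout => PipeError::Timeout,
        WsFailure::ClosedOrReset => PipeError::PipeClosed,
    }
}
```

Keeping the mapping in one place makes it easy to assert the exact message prefixes in a deterministic test.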
|
||||
482
docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md
Normal file
@@ -0,0 +1,482 @@
|
||||
# WS Browser Bridge Path Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Replace the raw-ws-direct browser execution assumption with a bridge-backed browser integration path that matches the validated FunctionsUI / BrowserAction / CommandRouter model while preserving existing pipe behavior.
|
||||
|
||||
**Architecture:** Keep the current Rust-side browser orchestration flow centered on `Arc<dyn BrowserBackend>`, but stop treating `WsBrowserBackend` as the real production browser surface. Model the validated bridge as two explicit layers: Layer 1 session/lifecycle calls (`sgclawConnect`, `sgclawStart`, `sgclawStop`, `sgclawSubmitTask`) and Layer 2 browser-action execution (`window.sgFunctionsUI(...)`, `window.BrowserAction(...)`, `CommandRouter`). The new backend targets Layer 2 only through a narrow repo-local `BridgeActionTransport` seam, while lifecycle/session concerns stay separate from per-action browser execution.
|
||||
|
||||
**Tech Stack:** Rust 2021, existing `BrowserBackend` abstraction, compat/runtime/orchestration stack, current service/task runner integration, existing bridge-oriented design docs, existing Rust unit/integration test suite.
|
||||
|
||||
---
|
||||
|
||||
## Scope Guardrails
|
||||
|
||||
- Do **not** continue extending raw external sgBrowser websocket business-frame handling as the mainline path.
|
||||
- Do **not** modify `src/lib.rs`, pipe handshake behavior, or the working `BrowserPipeTool` path.
|
||||
- Do **not** invent a parallel browser-command contract unrelated to the documented bridge surface.
|
||||
- Do **not** rewrite the whole compat/runtime stack when a narrow adapter will do.
|
||||
- Do **not** assume access to the full SuperRPA browser-host codebase from this repository; encode the validated contract at the nearest seam available here.
|
||||
|
||||
---
|
||||
|
||||
## File Structure
|
||||
|
||||
### Existing files to modify
|
||||
|
||||
- Modify: `src/browser/mod.rs`
|
||||
- export the new bridge contract/transport/backend modules explicitly
|
||||
- Modify: `src/browser/backend.rs`
|
||||
- only if a tiny shared helper or trait documentation update is needed for the new bridge-backed backend
|
||||
- Modify: `src/compat/browser_tool_adapter.rs`
|
||||
- ensure existing browser action mapping remains reusable with the new backend implementation
|
||||
- Modify: `src/compat/runtime.rs`
|
||||
- wire the bridge-backed browser backend into the ws service/browser execution path without changing the pipe path
|
||||
- Modify: `src/compat/orchestration.rs`
|
||||
- only where browser backend wiring requires the bridge-backed path to flow through orchestration
|
||||
- Modify: `src/compat/workflow_executor.rs`
|
||||
- preserve direct-route/fallback use of `BrowserBackend` when the backend is bridge-backed instead of websocket-backed
|
||||
- Modify: `src/service/server.rs`
|
||||
- replace the current real-browser execution assumption with bridge-backend construction plus a repo-local bridge transport provider seam for the relevant service path
|
||||
- Modify: `tests/compat_browser_tool_test.rs`
|
||||
- extend browser tool mapping coverage if needed for bridge-backed execution
|
||||
- Modify: `tests/service_task_flow_test.rs`
|
||||
- replace raw-ws-direct expectations with bridge-path expectations where appropriate
|
||||
- Modify: `tests/service_ws_session_test.rs`
|
||||
- update service-side tests if they currently assume the real browser path is raw websocket driven
|
||||
|
||||
### New files to create
|
||||
|
||||
- Create: `src/browser/bridge_contract.rs`
|
||||
- narrow, explicit contract types that keep lifecycle/session bridge calls separate from browser-action execution requests/replies
|
||||
- Create: `src/browser/bridge_transport.rs`
|
||||
- repo-local `BridgeActionTransport` seam used by the backend and injected by service/runtime wiring
|
||||
- Create: `src/browser/bridge_backend.rs`
|
||||
- new `BrowserBackend` implementation that maps browser actions onto the Layer-2 bridge action contract through `BridgeActionTransport`
|
||||
- Create: `tests/browser_bridge_backend_test.rs`
|
||||
- deterministic unit tests for action-to-bridge mapping and reply/error normalization using a fake bridge transport
|
||||
- Create: `tests/browser_bridge_contract_test.rs`
|
||||
- narrow tests proving the two bridge layers stay explicit and browser-action requests remain semantic rather than raw-websocket-shaped
|
||||
|
||||
### Evidence files to consult during implementation
|
||||
|
||||
- Read: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
|
||||
- Read: `frontend/archive/sgClaw验证-已归档/testRunner.js`
|
||||
- Read: `docs/superpowers/specs/2026-03-25-superrpa-sgclaw-browser-control-design.md`
|
||||
- Read: `docs/archive/项目管理与排期/协作时间表.md`
|
||||
- Read: `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`
|
||||
|
||||
---
|
||||
|
||||
## Task 1: Lock the bridge contract in deterministic tests before adding the backend
|
||||
|
||||
**Files:**
|
||||
- Create: `src/browser/bridge_contract.rs`
|
||||
- Create: `tests/browser_bridge_contract_test.rs`
|
||||
- Reuse as design evidence:
|
||||
- `frontend/archive/sgClaw验证-已归档/testRunner.js`
|
||||
- `docs/archive/项目管理与排期/协作时间表.md`
|
||||
- `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`
|
||||
|
||||
- [ ] **Step 1: Write the first failing contract test for named bridge calls**
|
||||
|
||||
Create `tests/browser_bridge_contract_test.rs` with one focused test that encodes the bridge naming expectations already evidenced in the repo.
|
||||
|
||||
Start with a test shape like:
|
||||
|
||||
```rust
#[test]
fn bridge_contract_names_match_documented_bridge_surface() {
    // assert the contract contains the exact bridge action names
}
```
|
||||
|
||||
Required expectations:
|
||||
- `sgclawConnect`
|
||||
- `sgclawStart`
|
||||
- `sgclawStop`
|
||||
- `sgclawSubmitTask`
|
||||
- these names live in an explicit lifecycle/session contract type, not in the browser-action request type
|
||||
|
||||
Do **not** invent additional action names in this first test.
|
||||
|
||||
- [ ] **Step 2: Run the single contract test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test browser_bridge_contract_test bridge_contract_names_match_documented_bridge_surface -- --nocapture
```
|
||||
|
||||
Expected: FAIL because `src/browser/bridge_contract.rs` does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Add the second failing contract test for browser-action request shaping**
|
||||
|
||||
In the same file, add one focused test proving the bridge contract can represent a browser action request without leaking raw websocket business-frame semantics.
|
||||
|
||||
Test shape:
|
||||
|
||||
```rust
#[test]
fn bridge_contract_represents_browser_action_requests_without_ws_business_frames() {
    // create a click/navigate/getText style action request and assert shape
}
```
|
||||
|
||||
Required assertions:
|
||||
- request shape identifies the intended browser action semantically
|
||||
- request shape is distinct from the lifecycle/session bridge call type
|
||||
- request shape does **not** embed `sgBrowerserOpenPage`, `callBackJsToCpp`, or other raw websocket business-frame names
|
||||
|
||||
- [ ] **Step 4: Run the second contract test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test browser_bridge_contract_test bridge_contract_represents_browser_action_requests_without_ws_business_frames -- --nocapture
```
|
||||
|
||||
Expected: FAIL because the bridge contract does not exist yet.
|
||||
|
||||
- [ ] **Step 5: Implement the minimal bridge contract module**
|
||||
|
||||
Create `src/browser/bridge_contract.rs` with only the types needed by the tests.
|
||||
|
||||
Recommended shape:
|
||||
|
||||
```rust
pub enum BridgeLifecycleCall {
    Connect,
    Start,
    Stop,
    SubmitTask,
}

impl BridgeLifecycleCall {
    pub fn bridge_name(&self) -> &'static str {
        match self {
            Self::Connect => "sgclawConnect",
            Self::Start => "sgclawStart",
            Self::Stop => "sgclawStop",
            Self::SubmitTask => "sgclawSubmitTask",
        }
    }
}

pub struct BridgeBrowserActionRequest {
    pub action: String,
    pub params: serde_json::Value,
    pub expected_domain: String,
}
```
|
||||
|
||||
Rules:
|
||||
- model the documented bridge/lifecycle naming explicitly
|
||||
- keep the browser action request semantic, not websocket-frame-shaped
|
||||
- keep the module small and repository-local
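One way the second contract test could check the "no raw websocket frame names" rule is with a small helper, sketched below. Everything here is illustrative: the struct is a std-only stand-in that stores `params` as JSON text instead of `serde_json::Value`, and `leaks_raw_ws_frame_name`, `navigate_request`, and the `"navigate"` action string are hypothetical names not taken from the repository.

```rust
/// Raw websocket business-frame names called out in Step 3 (not exhaustive).
const RAW_WS_FRAME_NAMES: [&str; 2] = ["sgBrowerserOpenPage", "callBackJsToCpp"];

/// Std-only stand-in for the recommended request type; `params` holds JSON
/// text here so the sketch needs no `serde_json` dependency.
pub struct BridgeBrowserActionRequest {
    pub action: String,
    pub params: String,
    pub expected_domain: String,
}

/// Hypothetical helper: true if any raw frame name appears in the request.
pub fn leaks_raw_ws_frame_name(request: &BridgeBrowserActionRequest) -> bool {
    RAW_WS_FRAME_NAMES
        .iter()
        .any(|name| request.action.contains(*name) || request.params.contains(*name))
}

/// Hypothetical constructor for a semantic navigate request.
pub fn navigate_request(url: &str, expected_domain: &str) -> BridgeBrowserActionRequest {
    BridgeBrowserActionRequest {
        action: "navigate".to_string(),
        // Builds the JSON object {"url":"<url>"} by hand; no escaping is done,
        // which is acceptable only for this illustration.
        params: format!("{{\"url\":\"{url}\"}}"),
        expected_domain: expected_domain.to_string(),
    }
}
```

A contract test can then build a request with `navigate_request(...)` and assert that `leaks_raw_ws_frame_name(&request)` is false.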
|
||||
|
||||
- [ ] **Step 6: Re-run the contract tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test browser_bridge_contract_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
git add src/browser/bridge_contract.rs tests/browser_bridge_contract_test.rs
git commit -m "test: define sgClaw bridge contract surface"
```
|
||||
|
||||
---
|
||||
|
||||
## Task 2: Add the repo-local transport seam and bridge-backed `BrowserBackend`
|
||||
|
||||
**Files:**
|
||||
- Create: `src/browser/bridge_transport.rs`
|
||||
- Create: `src/browser/bridge_backend.rs`
|
||||
- Create: `tests/browser_bridge_backend_test.rs`
|
||||
- Reuse: `src/browser/backend.rs`
|
||||
- Reuse: `src/browser/bridge_contract.rs`
|
||||
- Reuse: `src/compat/browser_tool_adapter.rs`
|
||||
|
||||
- [ ] **Step 1: Write the first failing backend test for action mapping**
|
||||
|
||||
Create `tests/browser_bridge_backend_test.rs` with one focused test proving a `BrowserBackend` action is translated into the bridge contract request shape.
|
||||
|
||||
Start with a narrow action such as `Action::Navigate`.
|
||||
|
||||
Required assertions:
|
||||
- `Action::Navigate` becomes one semantic bridge browser-action request
|
||||
- the request preserves action parameters and expected domain
|
||||
- the test does **not** assert any raw websocket payload strings
|
||||
|
||||
- [ ] **Step 2: Run the first backend test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test browser_bridge_backend_test bridge_backend_maps_navigate_to_bridge_action_request -- --nocapture
```
|
||||
|
||||
Expected: FAIL because `src/browser/bridge_backend.rs` does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Add the second failing backend test for reply normalization**
|
||||
|
||||
Add one focused test proving the backend can normalize a successful bridge reply into the existing `CommandOutput` shape expected by `BrowserBackend` callers.
|
||||
|
||||
- [ ] **Step 4: Run the second backend test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test browser_bridge_backend_test bridge_backend_normalizes_successful_bridge_reply -- --nocapture
```
|
||||
|
||||
Expected: FAIL because the backend does not exist yet.
|
||||
|
||||
- [ ] **Step 5: Add the third failing backend test for bridge-side errors**
|
||||
|
||||
Add one focused test proving a bridge-side error normalizes into the correct outward `PipeError` semantics for backend callers.
|
||||
|
||||
- [ ] **Step 6: Run the error-path test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test browser_bridge_backend_test bridge_backend_maps_bridge_failure_to_pipe_error -- --nocapture
```
|
||||
|
||||
Expected: FAIL because the backend does not exist yet.
|
||||
|
||||
- [ ] **Step 7: Implement the minimal transport seam and bridge backend**
|
||||
|
||||
Create `src/browser/bridge_transport.rs` and `src/browser/bridge_backend.rs`.
|
||||
|
||||
The transport seam must:
|
||||
- define the repo-local `BridgeActionTransport` contract used for Layer-2 browser-action execution only
|
||||
- accept semantic `BridgeBrowserActionRequest` values and return semantic success/error replies
|
||||
- remain small, explicit, and easy to fake in tests
|
||||
|
||||
The backend must:
|
||||
- implement the existing `BrowserBackend` trait
|
||||
- translate supported actions into `BridgeBrowserActionRequest`
|
||||
- depend on `BridgeActionTransport` instead of raw websocket payload building
|
||||
- normalize success/error replies into existing backend-facing result types
|
||||
|
||||
Rules:
|
||||
- do not embed raw websocket business-frame names
|
||||
- do not change `BrowserBackend` semantics for existing callers
|
||||
- do not pull lifecycle/session bridge calls into this backend layer
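The seam and backend described in this step could look roughly like the sketch below. It is written under several assumptions: `Action`, `CommandOutput`, and `PipeError` are simplified local stand-ins for the crate's existing types, `BridgeActionReply`, `BridgeBackend::invoke`, and `FakeTransport` are hypothetical names, `params` is plain JSON text instead of `serde_json::Value`, and mapping a bridge failure to `PipeError::Protocol` is one plausible choice that the plan does not fix.

```rust
use std::sync::Mutex;

/// Semantic Layer-2 request (stand-in; `params` is JSON text here).
#[derive(Debug, Clone, PartialEq)]
pub struct BridgeBrowserActionRequest {
    pub action: String,
    pub params: String,
    pub expected_domain: String,
}

/// Hypothetical semantic reply returned by the bridge.
pub enum BridgeActionReply {
    Success { data: String },
    Failure { message: String },
}

/// Repo-local seam: executes one browser action, nothing lifecycle-related.
pub trait BridgeActionTransport {
    fn execute(&self, request: &BridgeBrowserActionRequest) -> BridgeActionReply;
}

// Simplified stand-ins for the crate's existing backend-facing types.
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Protocol(String),
}

#[derive(Debug, PartialEq)]
pub struct CommandOutput {
    pub data: String,
}

pub enum Action {
    Navigate { url: String },
}

/// Backend that depends only on the transport seam, never on raw frames.
pub struct BridgeBackend<'a> {
    transport: &'a dyn BridgeActionTransport,
}

impl<'a> BridgeBackend<'a> {
    pub fn new(transport: &'a dyn BridgeActionTransport) -> Self {
        Self { transport }
    }

    pub fn invoke(&self, action: Action, expected_domain: &str) -> Result<CommandOutput, PipeError> {
        // Translate the backend action into one semantic bridge request.
        let request = match action {
            Action::Navigate { url } => BridgeBrowserActionRequest {
                action: "navigate".to_string(),
                params: format!("{{\"url\":\"{url}\"}}"),
                expected_domain: expected_domain.to_string(),
            },
        };
        // Normalize the semantic reply into the backend-facing result type.
        match self.transport.execute(&request) {
            BridgeActionReply::Success { data } => Ok(CommandOutput { data }),
            BridgeActionReply::Failure { message } => Err(PipeError::Protocol(message)),
        }
    }
}

/// Fake transport for tests: records requests and returns a canned reply.
pub struct FakeTransport {
    pub seen: Mutex<Vec<BridgeBrowserActionRequest>>,
    pub fail_with: Option<String>,
}

impl BridgeActionTransport for FakeTransport {
    fn execute(&self, request: &BridgeBrowserActionRequest) -> BridgeActionReply {
        self.seen.lock().unwrap().push(request.clone());
        match &self.fail_with {
            Some(message) => BridgeActionReply::Failure { message: message.clone() },
            None => BridgeActionReply::Success { data: "ok".to_string() },
        }
    }
}
```

Because the backend only sees the trait, the three backend tests in this task can run against `FakeTransport` without a browser or a socket.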
|
||||
|
||||
- [ ] **Step 8: Re-run the bridge backend tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test browser_bridge_backend_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 9: Re-run browser tool adapter coverage**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test compat_browser_tool_test -- --nocapture
```
|
||||
|
||||
Expected: PASS, proving the existing browser action mapping remains reusable with the new backend.
|
||||
|
||||
- [ ] **Step 10: Commit**
|
||||
|
||||
```bash
git add src/browser/bridge_transport.rs src/browser/bridge_backend.rs tests/browser_bridge_backend_test.rs src/compat/browser_tool_adapter.rs src/browser/mod.rs
git commit -m "feat: add bridge-backed browser backend"
```
|
||||
|
||||
---
|
||||
|
||||
## Task 3: Wire the bridge-backed backend into the real-browser service path
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs`
|
||||
- Modify: `src/compat/runtime.rs`
|
||||
- Modify: `src/compat/orchestration.rs`
|
||||
- Modify: `src/compat/workflow_executor.rs`
|
||||
- Modify: `tests/service_task_flow_test.rs`
|
||||
- Modify: `tests/service_ws_session_test.rs`
|
||||
- Reuse: `src/browser/bridge_backend.rs`
|
||||
- Reuse: `src/browser/bridge_contract.rs`
|
||||
- Reuse: `src/browser/bridge_transport.rs`
|
||||
|
||||
- [ ] **Step 1: Write the first failing service-path test for bridge backend construction**
|
||||
|
||||
Add or update one focused service test proving the real-browser execution path constructs and uses the bridge-backed backend instead of the raw websocket backend assumption.
|
||||
|
||||
The test should observe backend selection at the nearest possible seam.
|
||||
|
||||
- [ ] **Step 2: Run the focused service test and verify it fails**
|
||||
|
||||
Run the narrowest affected service test command.
|
||||
|
||||
Expected: FAIL because the service path is not wired to the bridge backend yet.
|
||||
|
||||
- [ ] **Step 3: Add the minimal service/runtime wiring**
|
||||
|
||||
Change the relevant service/browser execution path so it constructs the new bridge-backed backend, injects the repo-local bridge transport provider at the nearest seam, and passes the backend through the existing runtime/orchestration flow.
|
||||
|
||||
Rules:
|
||||
- keep the pipe path unchanged
|
||||
- keep changes localized
|
||||
- keep lifecycle/session bridge handling separate from per-action browser execution
|
||||
- preserve existing runtime log and task flow behavior where possible
|
||||
|
||||
- [ ] **Step 4: Add one direct-route/fallback regression**
|
||||
|
||||
Add one focused regression proving a bridge-backed backend still works through the direct-route or fallback path exercised by `src/compat/workflow_executor.rs`.
|
||||
|
||||
- [ ] **Step 5: Run the bridge-focused service tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Re-run workflow/runtime regressions**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test compat::workflow_executor::tests -- --nocapture
cargo test --test compat_browser_tool_test --test browser_script_skill_tool_test --test task_runner_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
git add src/service/server.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/service_ws_session_test.rs tests/service_task_flow_test.rs
git commit -m "refactor: route real browser path through bridge backend"
```
|
||||
|
||||
---
|
||||
|
||||
## Task 4: Verify bridge-path behavior without pipe regression
|
||||
|
||||
**Files:**
|
||||
- Reuse only unless a failing test proves a minimal fix is still needed
|
||||
|
||||
- [ ] **Step 1: Run bridge/backend unit coverage**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test browser_bridge_contract_test --test browser_bridge_backend_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 2: Run service/runtime bridge-path regressions**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 3: Run required pipe regressions**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 4: Build the affected binaries**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Stop if any regression points back to raw websocket assumptions**
|
||||
|
||||
If any test still encodes raw external websocket business-frame assumptions as the real-browser path, update that test to the bridge-backed design rather than patching production code to satisfy the old assumption.
|
||||
|
||||
- [ ] **Step 6: Commit**
|
||||
|
||||
```bash
git add tests/browser_bridge_contract_test.rs tests/browser_bridge_backend_test.rs tests/service_ws_session_test.rs tests/service_task_flow_test.rs
git commit -m "test: verify bridge path and preserve pipe behavior"
```
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
### Bridge contract tests
|
||||
|
||||
```bash
cargo test --test browser_bridge_contract_test -- --nocapture
```
|
||||
|
||||
Expected: documented bridge names and semantic browser-action request shaping are locked.
|
||||
|
||||
### Bridge backend tests
|
||||
|
||||
```bash
cargo test --test browser_bridge_backend_test -- --nocapture
```
|
||||
|
||||
Expected: backend action mapping and reply/error normalization are green.
|
||||
|
||||
### Service/runtime integration tests
|
||||
|
||||
```bash
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
cargo test compat::workflow_executor::tests -- --nocapture
```
|
||||
|
||||
Expected: real-browser path uses the bridge-backed backend and direct-route/fallback behavior remains intact.
|
||||
|
||||
### Pipe regressions
|
||||
|
||||
```bash
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```
|
||||
|
||||
Expected: pipe path remains unchanged.
|
||||
|
||||
### Binary build verification
|
||||
|
||||
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
|
||||
|
||||
Expected: affected binaries compile.
|
||||
|
||||
---
|
||||
|
||||
## Notes for Implementation
|
||||
|
||||
- The websocket probe work stays in the repository as diagnostic tooling; do not repurpose it into the bridge adapter.
|
||||
- `docs/_tmp_sgbrowser_ws_probe_transcript.md` is evidence that rejected the raw-ws-direct assumption, not a contract to keep satisfying.
|
||||
- Favor one narrow bridge-backed backend over broad runtime rewrites.
|
||||
- If the nearest repo-local seam is still slightly abstract because the external SuperRPA host code is outside this repository, make that abstraction explicit and test it rather than guessing hidden behavior.
|
||||
@@ -0,0 +1,566 @@
|
||||
# WS Browser Integration Surface Correction Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Replace the unvalidated raw-ws-direct assumption with an evidence-backed decision: either prove a minimal sgBrowser bootstrap sequence for raw websocket control, or pivot to the real browser bridge surface.
|
||||
|
||||
**Architecture:** Treat the existing ws-native backend as a protocol/testing asset, not as a validated production integration surface. First build a narrow probe/validation harness that can run candidate bootstrap sequences and capture exact live transcripts from the real endpoint. Then branch decisively: if a reproducible bootstrap sequence yields real status/callback frames, implement that bootstrap path; otherwise stop raw-ws speculation and write the bridge-first implementation slice.
|
||||
|
||||
**Tech Stack:** Rust 2021, existing `src/browser/ws_protocol.rs` / `src/browser/ws_backend.rs`, service websocket infrastructure, `tungstenite`, `serde_json`, current Rust test suite, local sgBrowser websocket documentation.
|
||||
|
||||
---
|
||||
|
||||
## Scope Guardrails
|
||||
|
||||
- Do **not** add more speculative production fixes to `src/service/server.rs` just to “try one more thing.”
|
||||
- Do **not** claim raw websocket is the supported path unless the live probe transcript proves it.
|
||||
- Do **not** modify `src/lib.rs`, pipe handshake behavior, or the pipe browser-tool path.
|
||||
- Do **not** implement both the bootstrap architecture and the bridge architecture in the same branch.
|
||||
- Keep the ws-native code unless and until the bridge decision makes specific pieces obsolete.
|
||||
- Prefer a dedicated probe surface over embedding validation logic into production request handling.
|
||||
|
||||
---
|
||||
|
||||
## File Structure
|
||||
|
||||
### Existing files to modify
|
||||
|
||||
- Modify: `src/browser/mod.rs`
|
||||
- export the new `ws_probe` module so both tests and the probe binary use the same library surface
|
||||
- Modify: `src/browser/ws_protocol.rs`
|
||||
- only if a tiny helper extraction is required for test/probe readability
|
||||
- do not change existing protocol semantics in this slice
|
||||
- Modify: `tests/browser_ws_protocol_test.rs`
|
||||
- add deterministic coverage for any extracted helper used by the probe harness
|
||||
|
||||
### New files to create
|
||||
|
||||
- Create: `src/bin/sgbrowser_ws_probe.rs`
|
||||
- standalone diagnostic binary for ordered frame-script probing against a live sgBrowser websocket endpoint
|
||||
- Create: `src/browser/ws_probe.rs`
|
||||
- small reusable probe/transcript module, if needed, to keep the binary and tests focused
|
||||
- Create: `tests/browser_ws_probe_test.rs`
|
||||
- deterministic fake-server tests for transcript capture, timeout reporting, and scripted sequence execution
|
||||
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md` **only if Option B wins after probing**
|
||||
- follow-up bridge design, not part of the initial coding slice
|
||||
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md` **only if Option B wins after probing**
|
||||
- follow-up bridge implementation plan, not part of the initial coding slice
|
||||
- Create: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
|
||||
- temporary evidence artifact capturing the real endpoint probe matrix and outcomes
|
||||
|
||||
### Files deliberately not changed in the initial slice
|
||||
|
||||
- `src/lib.rs`
|
||||
- `src/agent/task_runner.rs`
|
||||
- `src/compat/runtime.rs`
|
||||
- `src/compat/orchestration.rs`
|
||||
- `src/compat/workflow_executor.rs`
|
||||
- `src/browser/ws_backend.rs`
|
||||
|
||||
Unless the probe results prove a real bootstrap contract, these files stay untouched.
|
||||
|
||||
---
|
||||
|
||||
## Task 1: Build a deterministic websocket probe harness before touching production behavior
|
||||
|
||||
**Files:**
|
||||
- Create: `src/browser/ws_probe.rs`
|
||||
- Create: `tests/browser_ws_probe_test.rs`
|
||||
- Reuse: `src/browser/ws_protocol.rs`
|
||||
|
||||
- [ ] **Step 1: Write the first failing transcript test**
|
||||
|
||||
Create `tests/browser_ws_probe_test.rs` with one focused fake-server test that executes a scripted sequence of outgoing text frames and records all received text frames in order.
|
||||
|
||||
Start with this shape:
|
||||
|
||||
```rust
#[test]
fn probe_records_welcome_then_silence_transcript() {
    // fake server sends one welcome frame and then stays silent
    // probe result should preserve that exact transcript and mark timeout/silence explicitly
}
```
|
||||
|
||||
Required assertions:
|
||||
- the probe can connect to the fake websocket server
|
||||
- it can send a scripted first frame
|
||||
- it records the first inbound text frame exactly
|
||||
- it returns a transcript/result object that distinguishes timeout from protocol parse failure
|
||||
|
||||
- [ ] **Step 2: Run the single new test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test browser_ws_probe_test probe_records_welcome_then_silence_transcript -- --nocapture
```
|
||||
|
||||
Expected: FAIL because the probe harness does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Add the second failing probe test for ordered multi-step scripts**
|
||||
|
||||
In the same file, add a test proving the harness can run multiple outgoing frames in a fixed order and keep the transcript segmented by step.
|
||||
|
||||
Test shape:
|
||||
|
||||
```rust
#[test]
fn probe_runs_ordered_frame_script_and_records_per_step_results() {
    // send bootstrap frame 1, bootstrap frame 2, then minimal action
    // fake server replies differently at each step
    // probe result preserves exact order and outcomes
}
```
|
||||
|
||||
Required assertions:
|
||||
- outgoing frames are sent in the configured order
|
||||
- inbound frames are attached to the correct step
|
||||
- the probe can stop the sequence on timeout/close if configured
|
||||
|
||||
- [ ] **Step 4: Run the ordered-script test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test browser_ws_probe_test probe_runs_ordered_frame_script_and_records_per_step_results -- --nocapture
```
|
||||
|
||||
Expected: FAIL because the probe harness does not exist yet.
|
||||
|
||||
- [ ] **Step 5: Add the third failing probe test for close/reset visibility**
|
||||
|
||||
Add one focused fake-server test that closes the connection after a script step and asserts the transcript reports close/reset rather than generic timeout.
|
||||
|
||||
- [ ] **Step 6: Run the close/reset test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test browser_ws_probe_test probe_reports_socket_close_separately_from_timeout -- --nocapture
```
|
||||
|
||||
Expected: FAIL because the probe harness does not exist yet.
|
||||
|
||||
- [ ] **Step 7: Implement the minimal probe module**
|
||||
|
||||
Create `src/browser/ws_probe.rs` with only the types and behavior needed by the tests.
|
||||
|
||||
Recommended shape:
|
||||
|
||||
```rust
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
    ConnectFailed(String),
}

pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

pub fn run_probe_script(/* ws url, timeout, steps */) -> Result<Vec<ProbeStepResult>, ProbeError> {
    // connect, send ordered frames, collect exact transcript
}
```
|
||||
|
||||
Rules:
|
||||
- do not parse business meaning yet
|
||||
- do not mix this into normal task execution
|
||||
- preserve exact raw text frames in transcript results
|
||||
- keep the module small and diagnostic-oriented
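The ordered-script loop behind these rules can be sketched without a real socket by putting a small seam in front of the connection. This is an illustration under assumptions, not the planned implementation: `FrameChannel`, `FrameEvent`, `ScriptedChannel`, and `run_script` are hypothetical names, the real module would wrap a `tungstenite` connection behind such a seam, and the choices "close always stops the script" and "a timeout stops it only when `expect_reply` is set" are one reading of the "stop on timeout/close if configured" requirement.

```rust
use std::collections::VecDeque;

pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

#[derive(Debug, PartialEq)]
pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
    ConnectFailed(String),
}

#[derive(Debug, PartialEq)]
pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

/// Hypothetical event reported by the socket seam after a send.
pub enum FrameEvent {
    Text(String),
    Timeout,
    Closed,
}

/// Hypothetical seam over the websocket connection.
pub trait FrameChannel {
    fn send_text(&mut self, payload: &str);
    fn next_event(&mut self) -> FrameEvent;
}

/// Sends each step in order and attaches inbound raw frames to that step.
pub fn run_script(channel: &mut dyn FrameChannel, steps: &[ProbeStep]) -> Vec<ProbeStepResult> {
    let mut results = Vec::new();
    for step in steps {
        channel.send_text(&step.payload);
        let mut frames = Vec::new();
        // Collect raw frames until the channel goes quiet or closes.
        let outcome = loop {
            match channel.next_event() {
                FrameEvent::Text(frame) => frames.push(frame),
                FrameEvent::Closed => break ProbeOutcome::Closed,
                FrameEvent::Timeout if frames.is_empty() => break ProbeOutcome::TimedOut,
                FrameEvent::Timeout => break ProbeOutcome::Received(frames),
            }
        };
        let stop = match outcome {
            ProbeOutcome::Closed | ProbeOutcome::ConnectFailed(_) => true,
            ProbeOutcome::TimedOut => step.expect_reply,
            ProbeOutcome::Received(_) => false,
        };
        results.push(ProbeStepResult {
            label: step.label.clone(),
            sent: step.payload.clone(),
            outcome,
        });
        if stop {
            break;
        }
    }
    results
}

/// In-memory fake: replays queued events, then reports silence.
pub struct ScriptedChannel {
    pub sent: Vec<String>,
    pub events: VecDeque<FrameEvent>,
}

impl FrameChannel for ScriptedChannel {
    fn send_text(&mut self, payload: &str) {
        self.sent.push(payload.to_string());
    }
    fn next_event(&mut self) -> FrameEvent {
        self.events.pop_front().unwrap_or(FrameEvent::Timeout)
    }
}
```

The loop never interprets frame contents, which keeps the harness purely diagnostic as the rules require.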
|
||||
|
||||
- [ ] **Step 8: Re-run the new probe tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test browser_ws_probe_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 9: Commit**
|
||||
|
||||
```bash
git add src/browser/ws_probe.rs tests/browser_ws_probe_test.rs
git commit -m "test: add sgBrowser websocket probe harness"
```
|
||||
|
||||
---
|
||||
|
||||
## Task 2: Add a standalone probe binary for live sgBrowser evidence collection
|
||||
|
||||
**Files:**
|
||||
- Create: `src/bin/sgbrowser_ws_probe.rs`
|
||||
- Create: `src/browser/ws_probe.rs`
|
||||
- Modify: `src/browser/mod.rs`
|
||||
- Create: `tests/browser_ws_probe_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the failing helper parser test**
|
||||
|
||||
In `tests/browser_ws_probe_test.rs`, add one focused test for a new helper function in `src/browser/ws_probe.rs`:
|
||||
|
||||
```rust
#[test]
fn parse_probe_args_accepts_ws_url_timeout_and_ordered_steps() {
    // parse a fixed argv-style slice into a ProbeCliConfig
}
```
|
||||
|
||||
Create and use this exact helper shape:
|
||||
|
||||
```rust
pub struct ProbeCliConfig {
    pub ws_url: String,
    pub timeout_ms: u64,
    pub steps: Vec<ProbeStep>,
}

pub fn parse_probe_args(args: &[String]) -> Result<ProbeCliConfig, ProbeError>
```
|
||||
|
||||
The test must assert that these exact arguments parse successfully and preserve step order:
|
||||
|
||||
```text
--ws-url ws://127.0.0.1:12345
--timeout-ms 1500
--step open-agent::["about:blank","sgOpenAgent"]
--step open-hot::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
```
|
||||
|
||||
- [ ] **Step 2: Run the parser test and verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
cargo test --test browser_ws_probe_test parse_probe_args_accepts_ws_url_timeout_and_ordered_steps -- --nocapture
```
|
||||
|
||||
Expected: FAIL because `parse_probe_args(...)` and `ProbeCliConfig` do not exist yet.
|
||||
|
||||
- [ ] **Step 3: Implement the helper and binary together**
|
||||
|
||||
In `src/browser/ws_probe.rs`, add `ProbeCliConfig` and `parse_probe_args(...)`.
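A minimal sketch of how `parse_probe_args(...)` might handle the flag format used in the parser test is shown here. The struct shapes follow the plan, but the details are assumptions: `ProbeError` is modeled as a simple string wrapper, each flag is assumed to take exactly one value, a step is split at the first `::`, and `expect_reply` defaults to `true` because the CLI format carries no such field.

```rust
#[derive(Debug, PartialEq)]
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

#[derive(Debug, PartialEq)]
pub struct ProbeCliConfig {
    pub ws_url: String,
    pub timeout_ms: u64,
    pub steps: Vec<ProbeStep>,
}

/// Stand-in error type; the real `ProbeError` shape is not specified here.
#[derive(Debug, PartialEq)]
pub struct ProbeError(pub String);

pub fn parse_probe_args(args: &[String]) -> Result<ProbeCliConfig, ProbeError> {
    let mut ws_url = None;
    let mut timeout_ms = None;
    let mut steps = Vec::new();
    let mut iter = args.iter();
    // Every flag is followed by exactly one value; steps keep argv order.
    while let Some(flag) = iter.next() {
        let value = iter
            .next()
            .ok_or_else(|| ProbeError(format!("missing value for {flag}")))?;
        match flag.as_str() {
            "--ws-url" => ws_url = Some(value.clone()),
            "--timeout-ms" => {
                let parsed = value
                    .parse::<u64>()
                    .map_err(|e| ProbeError(format!("invalid --timeout-ms: {e}")))?;
                timeout_ms = Some(parsed);
            }
            "--step" => {
                // Split at the first "::" so the payload may contain single colons.
                let (label, payload) = value
                    .split_once("::")
                    .ok_or_else(|| ProbeError(format!("step must be label::payload, got {value}")))?;
                steps.push(ProbeStep {
                    label: label.to_string(),
                    payload: payload.to_string(),
                    expect_reply: true, // assumption: not expressible on the CLI
                });
            }
            other => return Err(ProbeError(format!("unknown flag {other}"))),
        }
    }
    Ok(ProbeCliConfig {
        ws_url: ws_url.ok_or_else(|| ProbeError("missing --ws-url".to_string()))?,
        timeout_ms: timeout_ms.ok_or_else(|| ProbeError("missing --timeout-ms".to_string()))?,
        steps,
    })
}
```

In the binary, the argument slice would typically come from `std::env::args().skip(1).collect::<Vec<String>>()`.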
|
||||
|
||||
In `src/browser/mod.rs`, add the module export:
|
||||
|
||||
```rust
pub mod ws_probe;
```
|
||||
|
||||
In `src/bin/sgbrowser_ws_probe.rs`, implement the binary using only `std::env::args()` plus `parse_probe_args(...)`.
|
||||
|
||||
Required behavior:
|
||||
- accepts a websocket URL
|
||||
- accepts a timeout in milliseconds
|
||||
- accepts repeated ordered steps
|
||||
- runs the probe harness
|
||||
- prints a markdown-friendly transcript including:
|
||||
- step label
|
||||
- exact sent payload
|
||||
- exact received frames, if any
|
||||
- timeout/close outcome
|
||||
|
||||
Output shape can be simple, for example:
|
||||
|
||||
```text
|
||||
STEP 1 bootstrap-open-agent
|
||||
SEND: ["about:blank","sgOpenAgent"]
|
||||
RECV: Welcome! You are client #1
|
||||
OUTCOME: timeout
|
||||
```
|
||||
|
||||
Rules:
|
||||
- no production/browser-runtime integration
|
||||
- no hidden fallback logic
|
||||
- no “best effort” guessing of next steps
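
The transcript shape shown above could be produced by a small formatter like the one below. It is a sketch under assumptions: `StepOutcome`, its variants, and the function name `format_step_transcript` are hypothetical, and the `closed` outcome label is a guess, since the example only shows `timeout`.

```rust
/// How a probe step ended. Variant names are assumptions for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StepOutcome {
    Timeout,
    Closed,
}

/// Renders one step literally: the exact sent payload, every received frame
/// in arrival order, and an explicit outcome line. Nothing is interpreted.
pub fn format_step_transcript(
    index: usize,
    label: &str,
    sent: &str,
    received: &[String],
    outcome: StepOutcome,
) -> String {
    let mut out = String::new();
    out.push_str(&format!("STEP {} {}\n", index, label));
    out.push_str(&format!("SEND: {}\n", sent));
    for frame in received {
        out.push_str(&format!("RECV: {}\n", frame));
    }
    let outcome_text = match outcome {
        StepOutcome::Timeout => "timeout",
        StepOutcome::Closed => "closed",
    };
    out.push_str(&format!("OUTCOME: {}\n", outcome_text));
    out
}
```

If no frames arrive, the `RECV:` lines are simply absent, which keeps "nothing received" distinguishable from "an empty frame was received".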
|
||||
|
||||
- [ ] **Step 4: Re-run the parser/helper test**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_probe_test parse_probe_args_accepts_ws_url_timeout_and_ordered_steps -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Build the probe binary**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo build --bin sgbrowser_ws_probe
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git add src/bin/sgbrowser_ws_probe.rs src/browser/ws_probe.rs src/browser/mod.rs tests/browser_ws_probe_test.rs
|
||||
git commit -m "feat: add live sgBrowser websocket probe binary"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 3: Run the real endpoint probe matrix and write the evidence transcript
|
||||
|
||||
**Files:**
|
||||
- Create: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
|
||||
- Reuse only: `src/bin/sgbrowser_ws_probe.rs`, `docs/_tmp_sgbrowser_ws_api_doc.txt`
|
||||
|
||||
- [ ] **Step 1: Run the no-bootstrap baseline probe**
|
||||
|
||||
Run exactly:
|
||||
|
||||
```bash
|
||||
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baseline-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
```
|
||||
|
||||
Append the exact output under a `## baseline-open` heading in `docs/_tmp_sgbrowser_ws_probe_transcript.md`.
|
||||
|
||||
- [ ] **Step 2: Run the documented `sgOpenAgent` candidate**
|
||||
|
||||
Run exactly:
|
||||
|
||||
```bash
|
||||
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "post-open-agent-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
```
|
||||
|
||||
Append the exact output under a `## open-agent` heading.
|
||||
|
||||
- [ ] **Step 3: Run the documented `sgSetAuthInfo` candidate**
|
||||
|
||||
Run exactly:
|
||||
|
||||
```bash
|
||||
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "set-auth::[\"about:blank\",\"sgSetAuthInfo\",\"probe-user\",\"probe-token\"]" --step "post-set-auth-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
```
|
||||
|
||||
Append the exact output under a `## set-auth` heading.
|
||||
|
||||
- [ ] **Step 4: Run the documented `sgBrowserLogin` candidate**
|
||||
|
||||
Run the following command, but only after substituting the placeholder payload as described below:
|
||||
|
||||
```bash
|
||||
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "browser-login::{\"request\":\"use-json-helper\"}"
|
||||
```
|
||||
|
||||
Before running, replace the placeholder payload with the exact JSON-array frame produced by the helper for:
|
||||
|
||||
```json
|
||||
["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]
|
||||
```
|
||||
|
||||
Then add a second step in the same command:
|
||||
|
||||
```json
|
||||
["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
|
||||
```
|
||||
|
||||
Append the exact output under a `## browser-login` heading.
|
||||
|
||||
- [ ] **Step 5: Run the documented `sgBrowerserActiveTab` candidate**
|
||||
|
||||
Run exactly:
|
||||
|
||||
```bash
|
||||
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "post-active-tab-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
```
|
||||
|
||||
Append the exact output under a `## active-tab` heading.
|
||||
|
||||
- [ ] **Step 6: Run one combined bootstrap candidate**
|
||||
|
||||
Run exactly:
|
||||
|
||||
```bash
|
||||
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "combined-open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "combined-active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "combined-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
```
|
||||
|
||||
Append the exact output under a `## combined-bootstrap` heading.
|
||||
|
||||
- [ ] **Step 7: Run `requesturl` variants for the minimal action**
|
||||
|
||||
Run exactly these two additional commands:
|
||||
|
||||
```bash
|
||||
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "target-as-requesturl::[\"https://www.zhihu.com/hot\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
```
|
||||
|
||||
```bash
|
||||
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baidu-requesturl::[\"https://www.baidu.com\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
|
||||
```
|
||||
|
||||
Append the exact outputs under `## requesturl-variants`.
|
||||
|
||||
- [ ] **Step 8: Summarize the matrix in the transcript file**
|
||||
|
||||
At the end of `docs/_tmp_sgbrowser_ws_probe_transcript.md`, add this exact table template and fill it in:
|
||||
|
||||
```markdown
|
||||
| Sequence | Sent frames | First reply | Final outcome | Decision signal |
|
||||
| --- | --- | --- | --- | --- |
|
||||
```
|
||||
|
||||
- [ ] **Step 9: Determine which architecture option wins**
|
||||
|
||||
Decision rule:
|
||||
- if at least one sequence reproducibly yields real numeric status and/or callback frames for a real business action, Option A (bootstrap-validated raw websocket) wins
|
||||
- otherwise, Option B (bridge-first) wins
|
||||
|
||||
Do not weaken this decision rule.
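
One way to keep the rule from drifting is to write it down as a pure function over the matrix rows. The sketch below is illustrative only: `SequenceEvidence`, its field names, and `decide_architecture` are hypothetical and do not exist in the repository.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ArchitectureOption {
    /// Option A: bootstrap-validated raw websocket.
    BootstrapValidatedRawWebsocket,
    /// Option B: bridge-first.
    BridgeFirst,
}

/// One row of the probe matrix, reduced to the signals the rule cares about.
#[derive(Debug, Clone, Copy)]
pub struct SequenceEvidence {
    pub reproducible: bool,
    pub real_business_action: bool,
    pub numeric_status_seen: bool,
    pub callback_frame_seen: bool,
}

/// Option A wins only if at least one sequence reproducibly yields a numeric
/// status and/or a callback frame for a real business action.
/// Everything else, including an empty matrix, resolves to Option B.
pub fn decide_architecture(rows: &[SequenceEvidence]) -> ArchitectureOption {
    let option_a = rows.iter().any(|row| {
        row.reproducible
            && row.real_business_action
            && (row.numeric_status_seen || row.callback_frame_seen)
    });
    if option_a {
        ArchitectureOption::BootstrapValidatedRawWebsocket
    } else {
        ArchitectureOption::BridgeFirst
    }
}
```

A row that only ever saw the welcome banner has both signal flags set to `false`, so it can never tip the decision toward Option A.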
|
||||
|
||||
- [ ] **Step 10: Commit the evidence artifact**
|
||||
|
||||
```bash
|
||||
git add docs/_tmp_sgbrowser_ws_probe_transcript.md
|
||||
git commit -m "docs: capture sgBrowser websocket probe evidence"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 4A: If Option A wins, write the narrow bootstrap implementation slice
|
||||
|
||||
**Files:**
|
||||
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md`
|
||||
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md`
|
||||
- Reuse as evidence input:
|
||||
- `docs/_tmp_sgbrowser_ws_probe_transcript.md`
|
||||
- `docs/_tmp_sgbrowser_ws_api_doc.txt`
|
||||
- `src/browser/ws_backend.rs`
|
||||
- `src/browser/ws_protocol.rs`
|
||||
|
||||
- [ ] **Step 1: Write one new design doc capturing the proven bootstrap contract**
|
||||
|
||||
Create:
|
||||
|
||||
```text
|
||||
docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md
|
||||
```
|
||||
|
||||
Include:
|
||||
- exact validated sequence
|
||||
- exact required state (`requesturl`, active tab, agent page, auth payload)
|
||||
- exact failure semantics
|
||||
- why this is sufficient evidence to keep raw websocket as the product surface
|
||||
|
||||
- [ ] **Step 2: Write one new implementation plan for the bootstrap path**
|
||||
|
||||
Create:
|
||||
|
||||
```text
|
||||
docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md
|
||||
```
|
||||
|
||||
Plan only the minimal production changes required to embed the validated bootstrap sequence into the service/browser path.
|
||||
|
||||
- [ ] **Step 3: Commit the bootstrap decision docs**
|
||||
|
||||
```bash
|
||||
git add docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md
|
||||
git commit -m "docs: capture ws browser bootstrap contract"
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Stop after writing the bootstrap plan**
|
||||
|
||||
Do not begin production implementation in the same slice unless the user explicitly asks for execution.
|
||||
|
||||
---
|
||||
|
||||
## Task 4B: If Option B wins, write the bridge-first implementation slice
|
||||
|
||||
**Files:**
|
||||
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md`
|
||||
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md`
|
||||
- Reuse as evidence input:
|
||||
- `docs/_tmp_sgbrowser_ws_probe_transcript.md`
|
||||
- `frontend/archive/sgClaw验证-已归档/testRunner.js`
|
||||
- `docs/superpowers/specs/2026-03-25-superrpa-sgclaw-browser-control-design.md`
|
||||
- `docs/archive/项目管理与排期/协作时间表.md`
|
||||
- `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`
|
||||
|
||||
- [ ] **Step 1: Write the bridge-path design doc**
|
||||
|
||||
Create `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md`.
|
||||
|
||||
The design must specify:
|
||||
- why raw websocket is considered non-validated for external control
|
||||
- which bridge surface becomes authoritative
|
||||
- where sgClaw should integrate (`FunctionsUI`, host bridge, `BrowserAction`, `CommandRouter`, or the nearest validated seam in this repo)
|
||||
- how to preserve pipe behavior and existing abstractions where practical
|
||||
|
||||
- [ ] **Step 2: Write the bridge-path implementation plan**
|
||||
|
||||
Create `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md`.
|
||||
|
||||
The plan must:
|
||||
- identify exact files to touch
|
||||
- describe the narrowest adapter implementation
|
||||
- keep TDD/task granularity as in this document
|
||||
- avoid speculative work outside the bridge slice
|
||||
|
||||
- [ ] **Step 3: Commit the bridge decision docs**
|
||||
|
||||
```bash
|
||||
git add docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md
|
||||
git commit -m "docs: define bridge-first sgBrowser integration"
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Stop after writing the bridge plan**
|
||||
|
||||
Do not start the bridge implementation in the same slice unless the user explicitly asks for execution.
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
### Deterministic probe harness tests
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_probe_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: transcript capture, ordered scripts, timeout reporting, and close/reset reporting all pass.
|
||||
|
||||
### Probe binary build
|
||||
|
||||
```bash
|
||||
cargo build --bin sgbrowser_ws_probe
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
### Live evidence collection
|
||||
|
||||
- run the probe matrix against the real configured endpoint
|
||||
- save exact transcripts to `docs/_tmp_sgbrowser_ws_probe_transcript.md`
|
||||
- make the architecture decision using the documented rule
|
||||
|
||||
### Follow-up branch condition
|
||||
|
||||
- if Option A wins, the repository contains a bootstrap-contract design + plan
|
||||
- if Option B wins, the repository contains a bridge-path design + plan
|
||||
- no production runtime changes are made until that decision is written down
|
||||
|
||||
---
|
||||
|
||||
## Notes for Implementation
|
||||
|
||||
- The existing `WsBrowserBackend` fix that remembers the last navigated URL remains valid; do not revert it.
|
||||
- The previous auth-replacement work also remains valid; it removed a real bug but did not prove the raw websocket architecture.
|
||||
- Keep the probe tool brutally literal: exact sent frames, exact received frames, explicit timeout/close outcomes.
|
||||
- Resist the temptation to make the probe “smart.” Smart probes hide evidence.
|
||||
- If the real endpoint still replies only with the welcome banner and then silence across the matrix, treat that as a decision, not as an excuse for more guessing.
|
||||
@@ -0,0 +1,362 @@
|
||||
# WS Browser Welcome Frame Compatibility Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Make the ws service path tolerate the real sgBrowser welcome banner (`Welcome! You are client #...`) without weakening general ws protocol validation or changing pipe behavior.
|
||||
|
||||
**Architecture:** Keep the shared `WsBrowserBackend` strict and implement the compatibility shim only in `ServiceBrowserWsClient`, which is already the real-browser adapter for the ws service path. Add one positive red test for the known welcome frame and one negative red test proving non-matching first text frames still fail as protocol errors, then make the minimal stateful change in `src/service/server.rs` and verify ws + pipe regressions.
|
||||
|
||||
**Tech Stack:** Rust 2021, tungstenite websocket client/server, existing `WsBrowserBackend`, existing `ServiceBrowserWsClient`, existing Rust unit/integration test suite.
|
||||
|
||||
---
|
||||
|
||||
## File Structure
|
||||
|
||||
### Existing files to modify
|
||||
|
||||
- Modify: `src/service/server.rs`
|
||||
- Add the one-time per-connection welcome-skip state to `ServiceBrowserWsClient`
|
||||
- Add the minimal helper(s) for detecting and discarding the first known welcome frame
|
||||
- Add focused service-adapter unit tests in the existing `#[cfg(test)]` module
|
||||
- Reuse: `src/browser/ws_backend.rs`
|
||||
- Do not change protocol parsing rules; only verify behavior remains strict for all non-service callers
|
||||
- Reuse: `tests/service_task_flow_test.rs`
|
||||
- Re-run to confirm the ws service path still reaches the browser websocket after the service-side shim
|
||||
- Reuse: `tests/browser_ws_backend_test.rs`
|
||||
- Re-run to prove the shared backend semantics remain unchanged
|
||||
|
||||
### Files deliberately not changed
|
||||
|
||||
- `src/browser/ws_backend.rs`
|
||||
- `src/browser/ws_protocol.rs`
|
||||
- `src/agent/task_runner.rs`
|
||||
- `src/compat/runtime.rs`
|
||||
- `src/compat/orchestration.rs`
|
||||
- `src/compat/workflow_executor.rs`
|
||||
- `src/lib.rs`
|
||||
|
||||
The design explicitly keeps the welcome-banner workaround out of the shared backend and out of the pipe path.
|
||||
|
||||
---
|
||||
|
||||
## Task 1: Reproduce the real welcome-frame failure with focused unit tests
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs`
|
||||
|
||||
- [ ] **Step 1: Add the positive failing test for the known welcome frame**
|
||||
|
||||
In the existing `#[cfg(test)] mod tests` inside `src/service/server.rs`, add one focused test next to the current ws adapter tests.
|
||||
|
||||
Test shape:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn future_server_side_ws_native_adapter_skips_initial_known_welcome_frame() {
|
||||
// fake server sends:
|
||||
// 1. "Welcome! You are client #1"
|
||||
// 2. "0"
|
||||
// backend.invoke(Action::Navigate, ...) should succeed
|
||||
}
|
||||
```
|
||||
|
||||
Required assertions:
|
||||
- the fake websocket server accepts one connection
|
||||
- it sends the welcome banner first, then the numeric success status
|
||||
- `WsBrowserBackend.invoke(Action::Navigate, ...)` returns `Ok(CommandOutput { success: true, .. })`
|
||||
|
||||
- [ ] **Step 2: Run only the positive new test and watch it fail**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test service::server::tests::future_server_side_ws_native_adapter_skips_initial_known_welcome_frame -- --nocapture
|
||||
```
|
||||
|
||||
Expected: FAIL with a protocol error containing `invalid browser status frame: Welcome! You are client #1`.
|
||||
|
||||
- [ ] **Step 3: Add the negative failing test for arbitrary first text**
|
||||
|
||||
In the same `#[cfg(test)]` module, add one negative test proving we do **not** silently skip arbitrary first text frames.
|
||||
|
||||
Test shape:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn future_server_side_ws_native_adapter_does_not_skip_unknown_first_text_frame() {
|
||||
// fake server sends:
|
||||
// 1. "Hello from server"
|
||||
// assert invoke(...) fails as PipeError::Protocol(...)
|
||||
}
|
||||
```
|
||||
|
||||
Required assertions:
|
||||
- the first frame is a non-matching text frame such as `Hello from server`
|
||||
- `invoke(...)` fails
|
||||
- the failure remains a protocol error rather than success or timeout
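
For orientation, the strictness this negative test protects can be pictured with a stand-in parser. The sketch is not the real `WsBrowserBackend` code: the local `PipeError` enum only mirrors the variant names mentioned in this plan, `parse_status_frame` is a hypothetical name, and treating a status frame as a bare integer is an assumption based on the `"0"` success frame used in the positive test.

```rust
/// Local stand-in mirroring the error variants named in this plan.
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Protocol(String),
    Timeout,
    PipeClosed,
}

/// Accepts only a numeric status frame; any other text is a protocol error.
/// The message prefix matches the failure text quoted earlier in this plan.
pub fn parse_status_frame(frame: &str) -> Result<i64, PipeError> {
    frame.trim().parse::<i64>().map_err(|_| {
        PipeError::Protocol(format!("invalid browser status frame: {}", frame))
    })
}
```

With a parser of this shape, an unknown first frame can only surface as `PipeError::Protocol(..)`, never as success or a timeout, which is exactly what the negative test asserts.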
|
||||
|
||||
- [ ] **Step 4: Run only the negative new test and verify the current behavior is already strict**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test service::server::tests::future_server_side_ws_native_adapter_does_not_skip_unknown_first_text_frame -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS, proving the current implementation already treats arbitrary first text as a protocol error. Keep that assertion in place before any production change.
|
||||
|
||||
- [ ] **Step 5: Confirm the TDD gate before implementation**
|
||||
|
||||
Do not implement production code before both tests exist and the positive test has failed on current behavior.
|
||||
|
||||
---
|
||||
|
||||
## Task 2: Add the minimal per-connection welcome-skip state in the service adapter
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs`
|
||||
|
||||
- [ ] **Step 1: Add one-time per-connection state to `ServiceBrowserWsClient`**
|
||||
|
||||
Extend `ServiceBrowserWsClient` with one extra state field that tracks whether the initial welcome candidate has already been consumed for the current websocket connection.
|
||||
|
||||
Allowed shape:
|
||||
|
||||
```rust
|
||||
struct ServiceBrowserWsClient {
|
||||
browser_ws_url: String,
|
||||
browser_socket: Mutex<Option<WebSocket<MaybeTlsStream<TcpStream>>>>,
|
||||
initial_text_frame_checked: Mutex<bool>,
|
||||
}
|
||||
```
|
||||
|
||||
or an equally small equivalent.
|
||||
|
||||
Rules:
|
||||
- state is per connection, not per request
|
||||
- state must survive multiple `invoke(...)` calls while reusing the same socket
|
||||
- do not add broader protocol state machines
|
||||
|
||||
- [ ] **Step 2: Add a narrow welcome-frame matcher**
|
||||
|
||||
In `src/service/server.rs`, add one small helper that recognizes only the known banner prefix:
|
||||
|
||||
```rust
|
||||
fn is_known_welcome_frame(frame: &str) -> bool {
|
||||
frame.starts_with("Welcome! You are client #")
|
||||
}
|
||||
```
|
||||
|
||||
Rules:
|
||||
- no regex needed
|
||||
- no generic “ignore arbitrary text” logic
|
||||
- keep the matcher local to `src/service/server.rs`
|
||||
|
||||
- [ ] **Step 3: Update `recv_text_timeout(...)` to skip at most one initial known banner**
|
||||
|
||||
Modify `impl WsClient for ServiceBrowserWsClient` so that the first text frame received after connection establishment is handled like this:
|
||||
|
||||
1. read the next text frame
|
||||
2. if the initial-frame state is still false:
|
||||
- mark the first-frame check as consumed
|
||||
- if the frame matches `is_known_welcome_frame(...)`, read the next frame and return that next frame instead
|
||||
3. in every other case (the check was already consumed, or the first frame is not the known banner), return the frame unchanged
|
||||
|
||||
Rules:
|
||||
- skip only once per connection
|
||||
- do not loop indefinitely over multiple text frames
|
||||
- do not swallow unknown first text frames
|
||||
- do not change timeout / close / reset / connect-failure behavior
|
||||
|
||||
- [ ] **Step 4: Reset the one-time state when a fresh socket is created**
|
||||
|
||||
When `with_socket(...)` establishes a brand-new websocket connection, ensure the one-time banner-check state is reset so a new connection can tolerate its own first welcome frame.
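
Steps 1 through 4 can be sketched without any websocket dependency by letting a closure stand in for "read the next text frame from the socket". The names `WelcomeSkipState`, `recv_text`, and `reset_for_new_connection` are hypothetical; in the real adapter the flag would live behind the `Mutex<bool>` shown earlier and the read would go through the existing socket.

```rust
/// Recognizes only the known sgBrowser banner prefix.
pub fn is_known_welcome_frame(frame: &str) -> bool {
    frame.starts_with("Welcome! You are client #")
}

/// Per-connection state: has the first text frame been inspected yet?
#[derive(Debug, Default)]
pub struct WelcomeSkipState {
    initial_text_frame_checked: bool,
}

impl WelcomeSkipState {
    /// Call when a brand-new socket is established (Step 4).
    pub fn reset_for_new_connection(&mut self) {
        self.initial_text_frame_checked = false;
    }

    /// Reads one frame via `read_frame`. On the first successful read of a
    /// connection, a known welcome banner is discarded and the next frame is
    /// returned instead. There is no loop: at most one frame is ever skipped.
    /// Read errors (timeout, close) pass through untouched and do not
    /// consume the one-time check.
    pub fn recv_text<E>(
        &mut self,
        mut read_frame: impl FnMut() -> Result<String, E>,
    ) -> Result<String, E> {
        let frame = read_frame()?;
        if !self.initial_text_frame_checked {
            self.initial_text_frame_checked = true;
            if is_known_welcome_frame(&frame) {
                return read_frame();
            }
        }
        Ok(frame)
    }
}
```

Because the flag flips on the first frame regardless of its content, an unknown first frame is returned unchanged and still reaches the strict backend, and a second banner on the same connection is not swallowed.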
|
||||
|
||||
- [ ] **Step 5: Add one reconnect regression in the service adapter tests**
|
||||
|
||||
Add one focused test proving the welcome skip resets on a fresh connection after socket close/reset.
|
||||
|
||||
Test shape:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn future_server_side_ws_native_adapter_skips_welcome_again_after_reconnect() {
|
||||
// first connection closes after use
|
||||
// second fresh connection sends the same welcome banner again
|
||||
// both invocations succeed
|
||||
}
|
||||
```
|
||||
|
||||
Required assertion:
|
||||
- the one-time skip is per connection, not global for the client instance
|
||||
|
||||
- [ ] **Step 6: Run the positive new test**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test service::server::tests::future_server_side_ws_native_adapter_skips_initial_known_welcome_frame -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 7: Run the negative new test**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test service::server::tests::future_server_side_ws_native_adapter_does_not_skip_unknown_first_text_frame -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS, proving unknown first text is still treated as a protocol error.
|
||||
|
||||
- [ ] **Step 8: Run the reconnect regression**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test service::server::tests::future_server_side_ws_native_adapter_skips_welcome_again_after_reconnect -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 9: Run the full service adapter unit group**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test service::server::tests -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS, including the existing tests for:
|
||||
- status `0` success
|
||||
- connect failure => `PipeError::Protocol("browser websocket connect failed: ...")`
|
||||
- disconnect/reset => `PipeError::PipeClosed`
|
||||
- callback timeout => `PipeError::Timeout`
|
||||
- new known-welcome success path
|
||||
- new unknown-first-frame strictness path
|
||||
- new reconnect reset behavior
|
||||
|
||||
---
|
||||
|
||||
## Task 3: Verify the shared backend stayed strict and the ws service path still works
|
||||
|
||||
**Files:**
|
||||
- Reuse: `tests/browser_ws_backend_test.rs`
|
||||
- Reuse: `tests/service_task_flow_test.rs`
|
||||
- Reuse: `src/browser/ws_backend.rs`
|
||||
|
||||
- [ ] **Step 1: Re-run the shared ws backend tests unchanged**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_backend_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS. This proves `WsBrowserBackend` semantics remain unchanged for its existing deterministic callers.
|
||||
|
||||
- [ ] **Step 2: Re-run the service task-flow regression**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test service_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS, including the auth-regression test that proves the ws service path reaches the browser websocket and no longer emits `invalid hmac seed: session key must not be empty`.
|
||||
|
||||
- [ ] **Step 3: Re-run the ws-focused mixed verification**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
---
|
||||
|
||||
## Task 4: Re-run the real manual smoke that originally failed
|
||||
|
||||
**Files:**
|
||||
- Reuse only: no code changes unless a fresh reproducer proves another bug
|
||||
|
||||
- [ ] **Step 1: Confirm real browser websocket reachability**
|
||||
|
||||
Run a reachability check for `ws://127.0.0.1:12345` (or the configured `browserWsUrl`) before starting the smoke test.
|
||||
|
||||
Expected: reachable.
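
The plan does not prescribe a tool for this check. One minimal option, sketched below with the standard library only, is a plain TCP connect with a timeout. The helper names `ws_url_to_host_port` and `is_tcp_reachable` are hypothetical, only `ws://` URLs with an explicit port are handled, and a successful connect shows that something is listening, not that the websocket handshake will succeed.

```rust
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Strips the `ws://` scheme and any path, leaving `host:port`.
pub fn ws_url_to_host_port(ws_url: &str) -> Option<&str> {
    let rest = ws_url.strip_prefix("ws://")?;
    let host_port = rest.split('/').next()?;
    if host_port.is_empty() {
        None
    } else {
        Some(host_port)
    }
}

/// Returns true if a TCP connection to the endpoint succeeds within `timeout`.
/// `timeout` must be non-zero, as required by `TcpStream::connect_timeout`.
pub fn is_tcp_reachable(ws_url: &str, timeout: Duration) -> bool {
    let host_port = match ws_url_to_host_port(ws_url) {
        Some(value) => value,
        None => return false,
    };
    let addrs = match host_port.to_socket_addrs() {
        Ok(addrs) => addrs,
        Err(_) => return false,
    };
    for addr in addrs {
        if TcpStream::connect_timeout(&addr, timeout).is_ok() {
            return true;
        }
    }
    false
}
```

For example, `is_tcp_reachable("ws://127.0.0.1:12345", Duration::from_millis(500))` is the call that corresponds to the endpoint used throughout this plan.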
|
||||
|
||||
- [ ] **Step 2: Start the real ws service**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
|
||||
```
|
||||
|
||||
Expected: the service prints:
|
||||
- `sg_claw ready: ...`
|
||||
- the resolved `service_ws_listen_addr`
|
||||
- the configured `browser_ws_url`
|
||||
|
||||
- [ ] **Step 3: Re-run the original failing manual smoke**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
printf '打开知乎热榜并读取页面主区域文本\n' | cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
|
||||
```
|
||||
|
||||
Expected:
|
||||
- no `invalid browser status frame: Welcome! You are client #1`
|
||||
- browser actions proceed past the first status frame
|
||||
- if the browser later fails for another reason, capture that new reason exactly
|
||||
|
||||
- [ ] **Step 4: Re-run the old Zhihu export task smoke**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
printf '读取知乎热榜数据,并导出 excel 文件\n' | cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
|
||||
```
|
||||
|
||||
Expected:
|
||||
- no `invalid browser status frame: Welcome! You are client #1`
|
||||
- the task reaches the real browser action path beyond connection banner handling
|
||||
|
||||
- [ ] **Step 5: Stop and debug if a new real-browser issue appears**
|
||||
|
||||
If the smoke test now fails for a different reason, do not piggyback a second fix into this slice without:
|
||||
- capturing the exact new output
|
||||
- writing a new focused spec/plan if the issue is materially different
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
### Service adapter unit tests
|
||||
|
||||
```bash
|
||||
cargo test service::server::tests -- --nocapture
|
||||
```
|
||||
|
||||
Expected: all service-side ws adapter tests pass, including the new welcome-frame positive/negative cases and reconnect reset case.
|
||||
|
||||
### Shared ws backend + ws service regressions
|
||||
|
||||
```bash
|
||||
cargo test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture
|
||||
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
### Real smoke verification
|
||||
|
||||
- `browserWsUrl` reachable
|
||||
- `sg_claw` starts with real config
|
||||
- `sg_claw_client` no longer fails on `Welcome! You are client #...`
|
||||
- Zhihu minimal read task gets past the first status frame
|
||||
- Zhihu export task gets past the first status frame
|
||||
@@ -0,0 +1,564 @@
|
||||
# Zhihu Release WS Function-Callback Migration Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Move only the Zhihu direct-execution path to the new Release browser websocket interaction style while keeping the existing pipe protocol and non-Zhihu submit behavior unchanged.
|
||||
|
||||
**Architecture:** Keep `ClientMessage` / `ServiceMessage`, `run_submit_task_with_browser_backend(...)`, and the high-level Zhihu workflow steps unchanged. First prove the exact Release browser interaction contract with transcript-backed probes. Then implement the smallest Zhihu-scoped backend path that follows that proven contract. Do not globally rewire the submit path unless the probe evidence proves there is no narrower safe seam.
|
||||
|
||||
**Tech Stack:** Rust, tungstenite, existing sgclaw service/client pipe protocol, `docs/_tmp_sgbrowser_ws_api_doc.txt`, Release browser websocket at `ws://127.0.0.1:12345`, current Zhihu direct-execution workflow.
|
||||
|
||||
---
|
||||
|
||||
## Context
|
||||
|
||||
The user has now made the target behavior explicit:
|
||||
|
||||
- the browser has changed and the working reference behavior is the user-provided HTML page that connects to `ws://127.0.0.1:12345`
|
||||
- that page sends a bootstrap registration frame: `{"type":"register","role":"web"}`
|
||||
- browser requests are still JSON arrays such as `[window.location.href, "sgBrowserSetTheme", "1"]` and `[window.location.href, "sgBrowerserGetUrls", "showUrls"]`
|
||||
- callback-bearing browser behavior is now centered on page-defined JS callback functions like `showUrls`, not on Rust directly reading a websocket callback frame as the final business result
|
||||
- the existing sgclaw pipe protocol must remain unchanged
|
||||
|
||||
The current sgclaw drift that must be corrected is visible in:
|
||||
|
||||
- `src/browser/ws_protocol.rs`
|
||||
- `Action::Navigate` currently emits `sgHideBrowserCallAfterLoaded` with an inline `callBackJsToCpp(...)` string
|
||||
- `src/browser/ws_backend.rs`
|
||||
- Rust currently waits for a browser websocket callback frame and treats that as the action result
|
||||
- `tests/service_ws_session_test.rs:498-605`
|
||||
- `tests/service_task_flow_test.rs:499-635`
|
||||
- existing **generic submit-flow** regressions still lock in the old direct raw-websocket callback-frame assumption
|
||||
- these are useful as non-regression guardrails, but they are not themselves Zhihu-specific regressions
|
||||
|
||||
Zhihu-specific verification must therefore be added explicitly instead of assuming those Baidu-path tests already cover Zhihu.
|
||||
|
||||
The new browser style proves these facts and only these facts so far:
|
||||
|
||||
1. sgclaw must handle a register-first websocket handshake
|
||||
2. browser requests are still `[requesturl, action, ...args]`
|
||||
3. some browser capabilities now return through page-defined callback functions like `showUrls`
|
||||
4. the current direct raw-websocket callback expectation in the Zhihu path is no longer a safe assumption
|
||||
|
||||
The production seam is **not** pre-decided here. Task 1 must determine whether Zhihu can be integrated by:
|
||||
- a direct Zhihu-scoped backend with no helper page, or
|
||||
- a helper page / relay design because named page callbacks are the only reliable result path
|
||||
|
||||
Until Task 1 evidence is captured, both remain hypotheses.
|
||||
|
||||
## Evidence to preserve in the implementation
|
||||
|
||||
### Browser websocket API doc
|
||||
From `docs/_tmp_sgbrowser_ws_api_doc.txt`:
|
||||
- `ws://localhost:12345` is the browser websocket endpoint
|
||||
- request frames are array payloads with `requesturl`
|
||||
- `sgBrowerserGetUrls(callback)` uses a callback **function name**: `[requesturl,"sgBrowerserGetUrls", callback]`
|
||||
- `sgBrowserCallAfterLoaded(targetUrl, callback)` and `sgHideBrowserCallAfterLoaded(targetUrl, callback)` use callback strings with parentheses
|
||||
- `callBackJsToCpp(param)` uses `sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt`
|
||||
- `sgBrowserRegJsFun(targeturl, funContent)` and `sgBrowserExcuteJsFun(targeturl, funName)` exist and may be useful when the helper page needs durable callback helpers
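
The `callBackJsToCpp(param)` layout listed above splits cleanly on the `@_@` separator. The sketch below assumes exactly five fields in the documented order and that only the last field (`responseTxt`) may itself contain the separator; `CallbackParam` and `parse_callback_param` are hypothetical names.

```rust
/// The five documented fields of a `callBackJsToCpp` parameter string.
#[derive(Debug, PartialEq)]
pub struct CallbackParam<'a> {
    pub source_url: &'a str,
    pub target_url: &'a str,
    pub callback: &'a str,
    pub action_url: &'a str,
    pub response_txt: &'a str,
}

/// Splits `sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt`.
/// `splitn(5, ..)` stops after four separators, so any `@_@` inside the
/// response text is left intact. Returns `None` if fields are missing.
pub fn parse_callback_param(param: &str) -> Option<CallbackParam<'_>> {
    let mut parts = param.splitn(5, "@_@");
    Some(CallbackParam {
        source_url: parts.next()?,
        target_url: parts.next()?,
        callback: parts.next()?,
        action_url: parts.next()?,
        response_txt: parts.next()?,
    })
}
```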
|
||||
|
||||
### Current working HTML pattern from the user
|
||||
The now-working reference interaction is:
|
||||
|
||||
```javascript
|
||||
const socket = new WebSocket('ws://127.0.0.1:12345');
|
||||
socket.onopen = () => {
|
||||
socket.send(JSON.stringify({type: 'register', role: 'web'}));
|
||||
};
|
||||
socket.send(JSON.stringify([window.location.href,"sgBrowerserGetUrls","showUrls"]));
|
||||
function showUrls(urls) {
|
||||
// browser invokes this page-defined callback
|
||||
}
|
||||
```
|
||||
|
||||
That is the browser behavior sgclaw now needs to follow.
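
On the Rust side, the two frame shapes in that pattern could be built as plain strings, roughly as sketched here. The function names are hypothetical, the hand-written escaper exists only to keep the example dependency-free (a JSON library would normally do this), and only string arguments are covered, so object payloads such as the `sgBrowserLogin` body are out of scope.

```rust
/// Minimal JSON string encoder: quotes, backslashes, and control characters.
fn json_string(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// The register-first bootstrap frame sent by the reference page.
pub fn register_frame() -> String {
    r#"{"type":"register","role":"web"}"#.to_string()
}

/// Builds a `[requesturl, action, ...args]` request frame.
pub fn request_frame(request_url: &str, action: &str, args: &[&str]) -> String {
    let mut parts = vec![json_string(request_url), json_string(action)];
    parts.extend(args.iter().map(|arg| json_string(arg)));
    format!("[{}]", parts.join(","))
}
```

Note that the last argument of a call like `sgBrowerserGetUrls` is a callback function name resolved by the page, so building the frame is only half of the contract; where the result comes back is what Task 1 has to establish.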
|
||||
|
||||
---
|
||||
|
||||
## Critical files
|
||||
|
||||
### Production files to modify
|
||||
- `src/browser/ws_protocol.rs`
|
||||
- `src/compat/workflow_executor.rs` (only if a narrow Zhihu-specific correction is required after backend swap)
|
||||
- `src/service/server.rs` (only if the chosen Zhihu-scoped integration seam must be wired here)
|
||||
- `src/service/mod.rs` (only if startup plumbing changes are truly required)
|
||||
- `src/browser/mod.rs`
|
||||
|
||||
### New production files likely needed
|
||||
- `src/browser/zhihu_release_backend.rs`
|
||||
- a Zhihu-scoped `BrowserBackend` adapter that follows the proven Release browser interaction style without changing non-Zhihu routes
|
||||
- `src/service/browser_callback_host.rs` **only if the probe proves a service-controlled helper page is actually required**
|
||||
- service-local helper-page lifecycle and callback relay, if evidence shows the browser cannot be driven safely without it
|
||||
|
||||
### Existing files to preserve
|
||||
- `src/agent/task_runner.rs`
|
||||
- `src/service/protocol.rs`
|
||||
- `src/compat/orchestration.rs`
|
||||
- `src/compat/runtime.rs`
|
||||
- `src/pipe/*`
|
||||
|
||||
### Existing direct-ws files to review explicitly
|
||||
- `src/browser/ws_backend.rs`
|
||||
- `tests/browser_ws_backend_test.rs`
|
||||
|
||||
These files currently encode the old direct raw-websocket callback expectation. The implementation must either:
|
||||
- leave them untouched as legacy/direct-contract coverage with no Zhihu production callers, or
|
||||
- update/remove the Zhihu-specific assumptions they currently lock in.
|
||||
|
||||
### Primary test files
|
||||
- `tests/browser_ws_probe_test.rs`
|
||||
- `tests/browser_ws_protocol_test.rs`
|
||||
- `tests/service_ws_session_test.rs`
|
||||
- `tests/service_task_flow_test.rs`
|
||||
- `tests/task_runner_test.rs`
|
||||
- `tests/browser_ws_backend_test.rs`
|
||||
|
||||
---
|
||||
|
||||
## File structure decisions
|
||||
|
||||
### `src/browser/zhihu_release_backend.rs`
|
||||
Prefer a Zhihu-scoped backend first.
|
||||
|
||||
Responsibilities:
|
||||
- keep the same `BrowserBackend` trait surface
|
||||
- implement only the behavior needed by the current Zhihu direct-execution route
|
||||
- translate `Action::Navigate`, `Action::GetText`, and `Action::Eval` into the proven Release-browser interaction style
|
||||
- normalize results back into `CommandOutput`
|
||||
- avoid affecting non-Zhihu callers
|
||||
|
||||
This is the preferred seam because the user asked to change the current Zhihu flow, not to redesign the whole submit pipeline.
|
||||
|
||||
### `src/service/browser_callback_host.rs` (conditional)
|
||||
Create this file only if Task 1 probe evidence proves that sgclaw must host or control a page in order to receive named callback-function results.
|
||||
|
||||
If it is needed, the plan must keep the design minimal and specific:
|
||||
- one concrete transport only (choose websocket or HTTP, not “websocket or HTTP”)
|
||||
- explicit readiness handshake
|
||||
- explicit request correlation by `request_id`
|
||||
- explicit cleanup when the submit task ends
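
If the callback host turns out to be needed, the readiness, correlation, and cleanup requirements above could look roughly like the following. This is a minimal sketch under stated assumptions: `CallbackRelay`, `HostError`, and all method names are hypothetical, the transport is left out entirely, and `request_id` is modeled as a simple counter.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum HostError {
    /// The helper page has not completed the readiness handshake yet.
    NotReady,
    /// A reply arrived for a request that is not (or no longer) pending.
    UnknownRequest(u64),
}

#[derive(Default)]
struct CallbackRelay {
    ready: bool,
    next_id: u64,
    /// request_id -> action label awaiting a callback result.
    pending: HashMap<u64, String>,
}

impl CallbackRelay {
    /// Called once the helper page reports that it is loaded and connected.
    fn mark_ready(&mut self) {
        self.ready = true;
    }

    /// Register an outbound action and return its correlation id.
    fn begin(&mut self, action: &str) -> Result<u64, HostError> {
        if !self.ready {
            return Err(HostError::NotReady);
        }
        self.next_id += 1;
        self.pending.insert(self.next_id, action.to_string());
        Ok(self.next_id)
    }

    /// Match a callback payload to its pending request; each id resolves once.
    fn complete(&mut self, request_id: u64, payload: &str) -> Result<(String, String), HostError> {
        let action = self
            .pending
            .remove(&request_id)
            .ok_or(HostError::UnknownRequest(request_id))?;
        Ok((action, payload.to_string()))
    }

    /// Cleanup when the submit task ends: drop readiness and report abandoned ids.
    fn shutdown(&mut self) -> Vec<u64> {
        self.ready = false;
        let mut abandoned: Vec<u64> = self.pending.drain().map(|(id, _)| id).collect();
        abandoned.sort_unstable();
        abandoned
    }
}
```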
|
||||
|
||||
If Task 1 shows a simpler seam, do not create this file.
|
||||
|
||||
### `src/browser/ws_protocol.rs`
|
||||
Do not let this file keep only the old direct-callback assumption.
|
||||
|
||||
It should become the shared place for doc-native request builders such as:
|
||||
- browser bootstrap frames proven by the transcript
|
||||
- `sgBrowserCallAfterLoaded` / `sgHideBrowserCallAfterLoaded`
|
||||
- `sgBrowserExcuteJsCodeByArea`
|
||||
- optional `sgBrowserRegJsFun` / `sgBrowserExcuteJsFun`
|
||||
|
||||
But do **not** let `ws_protocol.rs` absorb service-host lifecycle logic.
|
||||
|
||||
### `src/browser/ws_backend.rs` and `tests/browser_ws_backend_test.rs`
|
||||
Handle these explicitly in the implementation:
|
||||
- if they still describe a valid direct browser contract, keep them as isolated legacy/direct-ws coverage only
|
||||
- if their current navigate/callback assumptions conflict with the proven Release Zhihu path, update or narrow those tests so they no longer describe the active Zhihu integration path
|
||||
|
||||
Do not leave the old direct-callback assumptions ambiguously “reviewed”; the implementation must make their status explicit.
|
||||
|
||||
---
|
||||
|
||||
## Task 1: Capture the new Release browser contract in a reproducible probe transcript
|
||||
|
||||
**Files:**
|
||||
- Review/modify: `src/browser/ws_probe.rs`
|
||||
- Review/modify: `src/bin/sgbrowser_ws_probe.rs`
|
||||
- Review/modify: `tests/browser_ws_probe_test.rs`
|
||||
- Create: `docs/_tmp_release_ws_callback_host_transcript.md`
|
||||
|
||||
- [ ] **Step 1: Verify current probe coverage against the Release-browser questions**
|
||||
|
||||
Read the existing probe module and tests and check whether they already prove all of the following:
|
||||
- a register-first websocket script can be expressed
|
||||
- a later array action frame can be expressed in the same script
|
||||
- per-step inbound frames/outcomes are preserved separately
|
||||
- timeout/close remain distinguishable in the transcript
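
One way to picture these properties is a transcript model in which every step owns its outcome. The sketch below is hypothetical (`StepOutcome`, `ProbeStep`, and `render_transcript` are illustrative names, not the existing `ws_probe.rs` API); it only shows that received frames, timeouts, and closes can stay distinguishable per step.

```rust
#[derive(Debug, Clone, PartialEq)]
enum StepOutcome {
    /// Zero or more inbound text frames observed for this step.
    Received(Vec<String>),
    /// No frame arrived before the step deadline.
    Timeout,
    /// The peer closed the socket during this step.
    Closed,
}

struct ProbeStep {
    sent: String,
    outcome: StepOutcome,
}

/// Render one line per sent frame and per outcome, keyed by step number.
fn render_transcript(steps: &[ProbeStep]) -> String {
    let mut out = String::new();
    for (index, step) in steps.iter().enumerate() {
        let number = index + 1;
        out.push_str(&format!("step {} sent: {}\n", number, step.sent));
        match &step.outcome {
            StepOutcome::Received(frames) => {
                for frame in frames {
                    out.push_str(&format!("step {} received: {}\n", number, frame));
                }
            }
            StepOutcome::Timeout => out.push_str(&format!("step {} outcome: timeout\n", number)),
            StepOutcome::Closed => out.push_str(&format!("step {} outcome: closed\n", number)),
        }
    }
    out
}
```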
|
||||
|
||||
Required result:
|
||||
- identify the exact existing tests that already prove these behaviors
|
||||
- identify the smallest missing Release-specific coverage, if any
|
||||
|
||||
- [ ] **Step 2: Add only the missing regression coverage**
|
||||
|
||||
If current tests do **not** already prove the Release-browser bootstrap shape, add the narrowest failing regression in `tests/browser_ws_probe_test.rs`.
|
||||
|
||||
Preferred shape if coverage is missing:
|
||||
|
||||
```rust
#[test]
fn probe_supports_register_then_array_action_script() {
    // fake server expects:
    // 1. {"type":"register","role":"web"}
    // 2. ["http://127.0.0.1/helper.html","sgBrowerserGetUrls","showUrls"]
}
```
|
||||
|
||||
And, if still missing, add one regression proving per-step transcript separation for the register reply and later action reply.
|
||||
|
||||
If those behaviors are already covered, skip new test creation and record the exact test names to rely on.
|
||||
|
||||
- [ ] **Step 3: Run the relevant probe tests**
|
||||
|
||||
Run the narrowest exact tests that prove the Release bootstrap behavior, or the full file if multiple areas changed:
|
||||
|
||||
```bash
cargo test --test browser_ws_probe_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 4: Make the probe binary ergonomic for the Release transcript if needed**
|
||||
|
||||
Only if the current CLI cannot conveniently express the real Release-browser script, make the smallest change needed in `src/bin/sgbrowser_ws_probe.rs` / `src/browser/ws_probe.rs` so it can capture:
|
||||
- register frame behavior
|
||||
- minimal `sgBrowserSetTheme`
|
||||
- minimal `sgBrowerserGetUrls`
|
||||
- exact inbound websocket text per step
|
||||
|
||||
Do not redesign the probe if it already supports this.
|
||||
|
||||
- [ ] **Step 5: Run the live probe against the Release browser and record the real bootstrap**
|
||||
|
||||
Use the probe binary against the real endpoint to capture at minimum:
|
||||
- register frame behavior
|
||||
- minimal `sgBrowserSetTheme`
|
||||
- minimal `sgBrowerserGetUrls`
|
||||
- whether replies come back as websocket text, page-function invocation only, or both
|
||||
|
||||
Save the exact transcript in `docs/_tmp_release_ws_callback_host_transcript.md`.
|
||||
|
||||
Required output in that temp doc:
|
||||
- exact sent frames
|
||||
- exact received websocket frames
|
||||
- the observed rule for when named callback functions are invoked
|
||||
- whether Option A or Option B is supported by evidence
|
||||
|
||||
- [ ] **Step 6: Commit the probe-only slice if code changed**
|
||||
|
||||
If probe code/tests changed:
|
||||
|
||||
```bash
git add src/browser/ws_probe.rs src/bin/sgbrowser_ws_probe.rs tests/browser_ws_probe_test.rs docs/_tmp_release_ws_callback_host_transcript.md
git commit -m "test: capture release browser ws bootstrap contract"
```
|
||||
|
||||
If only the transcript doc changed, stage only that file and use a docs/test-appropriate commit message.
|
||||
|
||||
---
|
||||
|
||||
## Task 2: Choose the narrowest Zhihu-only production seam from the probe evidence
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs` (only if required)
|
||||
- Modify: `src/browser/mod.rs`
|
||||
- Modify: `src/compat/workflow_executor.rs` (only if required)
|
||||
- Create: `src/browser/zhihu_release_backend.rs`
|
||||
- Create: `src/service/browser_callback_host.rs` **only if required**
|
||||
- Test: `tests/service_ws_session_test.rs`
|
||||
- Test: `tests/service_task_flow_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write down the seam decision in the plan notes before coding**
|
||||
|
||||
Based on the transcript from Task 1, record which one of these is supported by evidence:
|
||||
- Option A: a Zhihu-scoped backend can talk to the Release browser directly with no service-hosted helper page
|
||||
- Option B: a Zhihu-scoped backend needs a service-controlled helper page because named page callbacks are the only reliable way to get business results
|
||||
|
||||
Do not proceed until one option is chosen explicitly from evidence.
|
||||
|
||||
- [ ] **Step 2: Add a failing service/task-flow regression that proves only the Zhihu path changes**
|
||||
|
||||
Update or add focused tests so that:
|
||||
- Zhihu submit flow uses the new Release-browser interaction seam
|
||||
- non-Zhihu behavior is unchanged
|
||||
- pipe messages remain unchanged
|
||||
|
||||
Required assertions:
|
||||
- the new path is activated only for Zhihu route detection
|
||||
- `ClientMessage` / `ServiceMessage` stay identical
|
||||
- existing non-Zhihu submit behavior is not accidentally rerouted
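
The "only for Zhihu route detection" assertion can be illustrated with a small host-based gate. This is a sketch under assumptions: the real route detection may key off the submit request rather than a URL, and `BackendKind`, `host_of`, and `select_backend` are hypothetical names introduced here.

```rust
#[derive(Debug, PartialEq)]
enum BackendKind {
    /// The new Zhihu-scoped Release-browser seam.
    ZhihuRelease,
    /// Whatever backend non-Zhihu callers already use.
    Default,
}

/// Extract the host from an http(s) URL without pulling in a URL crate.
fn host_of(url: &str) -> Option<&str> {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest.split(|c: char| c == '/' || c == '?' || c == '#').next()?;
    // Drop optional userinfo, then the optional port.
    let host = authority.rsplit('@').next()?;
    Some(host.split(':').next().unwrap_or(host))
}

/// Route to the Zhihu backend only when the host is zhihu.com or a subdomain.
fn select_backend(target_url: &str) -> BackendKind {
    match host_of(target_url).map(|h| h.to_ascii_lowercase()) {
        Some(host) if host == "zhihu.com" || host.ends_with(".zhihu.com") => {
            BackendKind::ZhihuRelease
        }
        _ => BackendKind::Default,
    }
}
```

Matching on the parsed host rather than a substring avoids rerouting URLs that merely mention Zhihu in a query string.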
|
||||
|
||||
- [ ] **Step 3: Run the new focused regression and confirm failure first**
|
||||
|
||||
Run the narrowest exact test names you added in:
|
||||
```bash
cargo test --test service_ws_session_test <new_test_name> -- --nocapture
cargo test --test service_task_flow_test <new_test_name> -- --nocapture
```
|
||||
|
||||
Expected: FAIL because the Zhihu-specific seam does not exist yet.
|
||||
|
||||
- [ ] **Step 4: Implement the chosen seam with the smallest blast radius**
|
||||
|
||||
If Option A won:
|
||||
- add `src/browser/zhihu_release_backend.rs`
|
||||
- wire it only where the Zhihu direct-execution route is selected
|
||||
- leave global submit-path wiring alone
|
||||
|
||||
If Option B won:
|
||||
- add `src/service/browser_callback_host.rs` with one specific transport and one explicit readiness/correlation model
|
||||
- add `src/browser/zhihu_release_backend.rs` to talk to that helper path
|
||||
- wire it only for the Zhihu route
|
||||
|
||||
In both cases:
|
||||
- do not change non-Zhihu callers
|
||||
- do not redesign `run_submit_task_with_browser_backend(...)`
|
||||
- do not change the pipe protocol
|
||||
|
||||
- [ ] **Step 5: Make the status of old direct-ws code explicit**
|
||||
|
||||
Update `src/browser/ws_backend.rs` / `tests/browser_ws_backend_test.rs` only as needed so they no longer ambiguously describe the active Zhihu path.
|
||||
|
||||
Allowed outcomes:
|
||||
- keep them untouched as legacy/direct-ws coverage with no Zhihu production caller
|
||||
- narrow/update the tests so they no longer claim the active Zhihu integration path
|
||||
|
||||
Not allowed:
|
||||
- leaving the plan and code in a state where both old and new paths appear to be the active Zhihu contract
|
||||
|
||||
- [ ] **Step 6: Run focused integration tests**
|
||||
|
||||
Run:
|
||||
```bash
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 7: Commit the seam-selection slice**
|
||||
|
||||
Adjust staged files to match the option actually implemented, for example:
|
||||
|
||||
```bash
git add src/browser/zhihu_release_backend.rs src/browser/mod.rs src/service/server.rs src/service/browser_callback_host.rs tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/browser_ws_backend_test.rs
git commit -m "feat: route zhihu flow through release browser ws contract"
```
|
||||
|
||||
Only stage files that were truly changed.
|
||||
|
||||
---
|
||||
|
||||
## Task 3: Implement Zhihu action mapping on the chosen Release-browser seam
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/browser/ws_protocol.rs`
|
||||
- Modify: `src/browser/zhihu_release_backend.rs`
|
||||
- Test: `tests/browser_ws_protocol_test.rs`
|
||||
- Create: `tests/browser_zhihu_release_backend_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the first failing backend test for Zhihu navigate mapping**
|
||||
|
||||
Create `tests/browser_zhihu_release_backend_test.rs` with a fake transport/relay and assert that `Action::Navigate` for the Zhihu path becomes the exact browser request shape proven by Task 1.
|
||||
|
||||
Start with this shape:
|
||||
|
||||
```rust
#[test]
fn zhihu_release_backend_maps_navigate_to_proven_release_frame() {
    // invoke Action::Navigate
    // assert exact outbound frame/opcode chosen from transcript evidence
}
```
|
||||
|
||||
Required assertions:
|
||||
- the call site still uses `BrowserBackend::invoke(...)`
|
||||
- the exact outbound frame matches the recorded Release-browser evidence
|
||||
- request correlation stays deterministic
|
||||
|
||||
- [ ] **Step 2: Run the single new backend test and verify it fails**
|
||||
|
||||
Run:
|
||||
```bash
cargo test --test browser_zhihu_release_backend_test zhihu_release_backend_maps_navigate_to_proven_release_frame -- --nocapture
```
|
||||
|
||||
Expected: FAIL because the backend does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Implement minimal `Navigate` support**
|
||||
|
||||
In `src/browser/zhihu_release_backend.rs`:
|
||||
- implement `BrowserBackend`
|
||||
- support `Action::Navigate` first
|
||||
- use `ws_protocol.rs` helpers for exact browser-frame construction
|
||||
- do not hardcode speculative opcodes; follow the transcript from Task 1
|
||||
|
||||
- [ ] **Step 4: Add failing tests for `GetText` and `Eval`**
|
||||
|
||||
Add tests proving:
|
||||
- `Action::GetText` returns `CommandOutput.data == {"text": "..."}`
|
||||
- `Action::Eval` returns `CommandOutput.data == {"text": "..."}`
|
||||
- callback or relay failures become `PipeError::Protocol(...)`
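
The result normalization these tests describe might be sketched as follows. The types here are local stand-ins, not the crate's real `CommandOutput` or `PipeError` definitions: `data` is modeled as JSON text rather than a JSON value, and `RelayReply` and `normalize_reply` are hypothetical names.

```rust
/// Local stand-in for the crate's error type; only the variant named in the plan.
#[derive(Debug, PartialEq)]
enum PipeError {
    Protocol(String),
}

/// Local stand-in for the crate's output type; `data` holds JSON text here.
#[derive(Debug, PartialEq)]
struct CommandOutput {
    data: String,
}

/// Hypothetical shape of what the callback or relay hands back.
enum RelayReply {
    Text(String),
    Failed(String),
}

/// Quote and escape a string as a JSON string literal (minimal escaping only).
fn json_string(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Map a reply to `{"text": ...}` on success and `PipeError::Protocol` on failure.
fn normalize_reply(reply: RelayReply) -> Result<CommandOutput, PipeError> {
    match reply {
        RelayReply::Text(text) => Ok(CommandOutput {
            data: format!("{{\"text\":{}}}", json_string(&text)),
        }),
        RelayReply::Failed(reason) => Err(PipeError::Protocol(reason)),
    }
}
```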
|
||||
|
||||
- [ ] **Step 5: Implement `GetText` and `Eval` on the chosen seam**
|
||||
|
||||
Use the smallest proven mechanism:
|
||||
- if the transcript proves page-defined callback functions are required, route through them
|
||||
- if `callBackJsToCpp(...)` to a page context is still part of the proven path, use it deliberately
|
||||
- if `sgBrowserRegJsFun` / `sgBrowserExcuteJsFun` becomes necessary, add it only with test coverage and only for the Zhihu path
|
||||
|
||||
- [ ] **Step 6: Run focused backend/protocol tests**
|
||||
|
||||
Run:
|
||||
```bash
cargo test --test browser_zhihu_release_backend_test -- --nocapture
cargo test --test browser_ws_protocol_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 7: Commit the Zhihu backend slice**
|
||||
|
||||
```bash
git add src/browser/ws_protocol.rs src/browser/zhihu_release_backend.rs src/browser/mod.rs tests/browser_ws_protocol_test.rs tests/browser_zhihu_release_backend_test.rs
git commit -m "feat: add zhihu release ws backend"
```
|
||||
|
||||
---
|
||||
|
||||
## Task 4: Keep the Zhihu workflow logic stable and patch only proven mismatches
|
||||
|
||||
**Files:**
|
||||
- Review: `src/compat/workflow_executor.rs`
|
||||
- Test: `tests/service_task_flow_test.rs`
|
||||
- Test: `tests/compat_runtime_test.rs` (only if a focused direct-execution regression is needed)
|
||||
|
||||
- [ ] **Step 1: Write a failing Zhihu-specific regression only if the chosen seam changes route assumptions**
|
||||
|
||||
If the new Zhihu backend changes request-url or target-url handling enough to break hotlist flow, add one focused failing regression for that exact behavior.
|
||||
|
||||
Candidate assertions:
|
||||
- hotlist navigate still logs `navigate https://www.zhihu.com/hot`
|
||||
- follow-up `GetText body` still targets the Zhihu page, not any helper page
|
||||
- extractor `Eval` still runs against Zhihu, not any helper page
|
||||
|
||||
- [ ] **Step 2: Keep the current high-level Zhihu action sequence unless a test proves otherwise**
|
||||
|
||||
`src/compat/workflow_executor.rs` currently does the right high-level work:
|
||||
- navigate to Zhihu hotlist
|
||||
- poll body text until ready
|
||||
- run the extractor script
|
||||
|
||||
Prefer to keep this file unchanged. Only patch it if the new backend needs a narrow explicit `target_url` fix or similar evidence-backed adjustment.
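
For orientation, the "poll body text until ready" step of that sequence can be sketched as a bounded retry loop. This is not the code in `workflow_executor.rs`; `TextSource`, `ScriptedPage`, and `wait_until_ready` are hypothetical names, and the readiness predicate is supplied by the caller.

```rust
use std::time::Duration;

/// Anything that can return the current page body text (a backend stand-in).
trait TextSource {
    fn body_text(&mut self) -> Result<String, String>;
}

/// Poll until `is_ready` accepts the text, sleeping between attempts.
fn wait_until_ready<S: TextSource>(
    source: &mut S,
    is_ready: impl Fn(&str) -> bool,
    max_attempts: usize,
    delay: Duration,
) -> Result<String, String> {
    for attempt in 1..=max_attempts {
        let text = source.body_text()?;
        if is_ready(&text) {
            return Ok(text);
        }
        // Do not sleep after the final attempt.
        if attempt < max_attempts {
            std::thread::sleep(delay);
        }
    }
    Err(format!("page not ready after {} attempts", max_attempts))
}

/// Test double that replays a fixed list of body texts.
struct ScriptedPage {
    replies: Vec<String>,
    calls: usize,
}

impl TextSource for ScriptedPage {
    fn body_text(&mut self) -> Result<String, String> {
        let reply = self
            .replies
            .get(self.calls)
            .cloned()
            .ok_or_else(|| "no more scripted replies".to_string());
        self.calls += 1;
        reply
    }
}
```

A fake such as `ScriptedPage` lets a regression assert how many polls happened without touching a real browser.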
|
||||
|
||||
- [ ] **Step 3: Run the smallest Zhihu-focused verification sweep**
|
||||
|
||||
Run:
|
||||
```bash
cargo test --test service_task_flow_test -- --nocapture
cargo test --test compat_runtime_test zhihu -- --nocapture
```
|
||||
|
||||
If the `compat_runtime_test zhihu` filter is too broad or unstable, run the exact focused Zhihu cases that cover hotlist extraction.
|
||||
|
||||
- [ ] **Step 4: Commit only if a Zhihu-specific code change was actually required**
|
||||
|
||||
```bash
git add src/compat/workflow_executor.rs tests/service_task_flow_test.rs tests/compat_runtime_test.rs
git commit -m "fix: keep zhihu workflow aligned with release ws backend"
```
|
||||
|
||||
Skip this commit if no production change in `workflow_executor.rs` was needed.
|
||||
|
||||
---
|
||||
|
||||
## Task 5: Prove that pipe behavior and non-Zhihu behavior stayed unchanged
|
||||
|
||||
**Files:**
|
||||
- Test: `tests/service_ws_session_test.rs`
|
||||
- Test: `tests/service_task_flow_test.rs`
|
||||
- Test: `tests/task_runner_test.rs`
|
||||
|
||||
- [ ] **Step 1: Add or update one regression that proves pipe messages are unchanged**
|
||||
|
||||
Use the smallest existing test seam to assert that `ClientMessage` / `ServiceMessage` payloads remain unchanged while the Zhihu route uses the new browser integration path internally.
|
||||
|
||||
- [ ] **Step 2: Add or update one regression that proves non-Zhihu behavior is unchanged**
|
||||
|
||||
Use a non-Zhihu submit or service-session case and assert it does not take the new Zhihu-specific backend path.
|
||||
|
||||
- [ ] **Step 3: Preserve current runtime regression guards**
|
||||
|
||||
The end-to-end tests must continue asserting that output does **not** contain:
|
||||
- `invalid hmac seed: session key must not be empty`
|
||||
- `Cannot drop a runtime in a context where blocking is not allowed`
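
A shared helper is one way to keep these guards consistent across test files. The sketch below is an assumption about how such a helper could look (`FORBIDDEN_MARKERS` and `find_forbidden_markers` are hypothetical names); the two marker strings are the ones listed above.

```rust
/// Error strings that must never appear in end-to-end output.
const FORBIDDEN_MARKERS: [&str; 2] = [
    "invalid hmac seed: session key must not be empty",
    "Cannot drop a runtime in a context where blocking is not allowed",
];

/// Return every forbidden marker found in `output`, in declaration order.
fn find_forbidden_markers(output: &str) -> Vec<&'static str> {
    FORBIDDEN_MARKERS
        .iter()
        .copied()
        .filter(|marker| output.contains(*marker))
        .collect()
}
```

A test can then assert that `find_forbidden_markers(&captured_output)` is empty and print the returned list on failure.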
|
||||
|
||||
- [ ] **Step 4: Run the final focused verification sweep**
|
||||
|
||||
Run:
|
||||
```bash
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```
|
||||
|
||||
Expected: PASS.
|
||||
|
||||
- [ ] **Step 5: Commit the verification sweep**
|
||||
|
||||
```bash
git add tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/task_runner_test.rs tests/browser_ws_backend_test.rs
git commit -m "test: constrain zhihu release ws migration scope"
```
|
||||
|
||||
Only stage files that were truly changed.
|
||||
|
||||
---
|
||||
|
||||
## Out of scope
|
||||
|
||||
Do **not** do these in this slice:
|
||||
- change the pipe protocol
|
||||
- change `ClientMessage` / `ServiceMessage`
|
||||
- redesign `run_submit_task_with_browser_backend(...)`
|
||||
- reintroduce any browser bridge surface
|
||||
- keep adding speculative direct-raw-websocket callback patches to `ws_backend.rs`
|
||||
- redesign non-Zhihu workflows unless the new backend abstraction forces a shared fix
|
||||
- create a long-lived external dependency or third-party server just to host the helper page
|
||||
|
||||
---
|
||||
|
||||
## Verification checklist
|
||||
|
||||
Run at minimum:
|
||||
|
||||
```bash
cargo test --test browser_ws_probe_test -- --nocapture
cargo test --test browser_zhihu_release_backend_test -- --nocapture
cargo test --test browser_ws_protocol_test -- --nocapture
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```
|
||||
|
||||
If Task 2 chose the helper-page / relay design, also run the helper-page-specific backend tests you added for that path.
|
||||
|
||||
Manual verification after code changes:
|
||||
|
||||
1. start the real Release browser/runtime that exposes `ws://127.0.0.1:12345`
|
||||
2. start `sg_claw` with real config
|
||||
3. start `sg_claw_client`
|
||||
4. submit:
|
||||
- `打开知乎热榜,获取前10条数据,并导出 Excel` (in English: "Open the Zhihu hot list, fetch the top 10 entries, and export them to Excel")
|
||||
5. confirm the Zhihu path uses the exact Release-browser interaction seam proven by Task 1
|
||||
6. if Task 2 chose Option B, confirm the helper page / relay path is used only for the Zhihu integration seam
|
||||
7. confirm non-Zhihu behavior is unchanged
|
||||
8. confirm the task completes without:
|
||||
- `timeout while waiting for browser message`
|
||||
- `invalid browser status frame: Welcome! You are client #1`
|
||||
- `invalid hmac seed: session key must not be empty`
|
||||
- `Cannot drop a runtime in a context where blocking is not allowed`
|
||||
|
||||
---
|
||||
|
||||
## Expected outcome
|
||||
|
||||
After this slice:
|
||||
- sgclaw still exposes the same pipe/service contract
|
||||
- Zhihu hotlist execution uses the Release-browser websocket contract proven by Task 1
|
||||
- non-Zhihu behavior remains unchanged
|
||||
- old direct-ws Zhihu assumptions are no longer ambiguous in production/tests
|
||||
- if Option A won, Zhihu uses a direct Release-browser backend
|
||||
- if Option B won, Zhihu uses the minimal helper-page / relay seam justified by the probe evidence
|
||||
322
docs/superpowers/plans/2026-04-04-zhihu-ws-submit-realignment.md
Normal file
@@ -0,0 +1,322 @@
|
||||
# Zhihu WS Submit Realignment Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Realign Zhihu submit routes to the documented websocket callback model, removing helper-page bootstrap from the mainline while keeping the existing pipe/service contract unchanged.
|
||||
|
||||
**Architecture:** The change stays inside the existing submit-path backend selection and websocket protocol flow. Zhihu routes stop choosing `BrowserCallbackBackend` and instead use `WsBrowserBackend` when a real browser websocket is configured, preserving the existing pipe fallback in direct runtime when no websocket URL is available.
|
||||
|
||||
**Tech Stack:** Rust, tungstenite websocket client/server, serde_json, cargo test
|
||||
|
||||
---
|
||||
|
||||
## File Map
|
||||
|
||||
- Modify: `src/service/server.rs`
|
||||
- Change only the Zhihu route-gated submit-path backend selection
|
||||
- Remove Zhihu submit mainline use of `LiveBrowserCallbackHost` / `BrowserCallbackBackend`
|
||||
- Keep service submit path on `WsBrowserBackend`
|
||||
- Preserve initial request URL derivation for Zhihu routes
|
||||
- Modify: `src/agent/mod.rs`
|
||||
- Change only the Zhihu route-gated submit-path backend selection
|
||||
- Remove Zhihu submit mainline use of `LiveBrowserCallbackHost` / `BrowserCallbackBackend`
|
||||
- Keep direct runtime pipe fallback when browser websocket URL is absent
|
||||
- Modify: `tests/agent_runtime_test.rs`
|
||||
- Replace helper-page bootstrap regression with direct websocket submit regression
|
||||
- Assert no `/sgclaw/browser-helper.html` bootstrap frames are emitted
|
||||
- Assert real-page request ownership on follow-up Zhihu actions
|
||||
- Modify: `src/browser/callback_host.rs`
|
||||
- Remove or rewrite the now-wrong red test that preserves Option-B callback-host startup behavior
|
||||
- Verify: `tests/browser_ws_backend_test.rs`
|
||||
- Reuse existing websocket request-url behavior coverage; extend only if the new regression proves insufficient
|
||||
- Reference: `docs/superpowers/specs/2026-04-04-zhihu-ws-submit-realignment-design.md`
|
||||
|
||||
### Task 1: Rewrite the stale submit regression around the real websocket mainline
|
||||
|
||||
**Files:**
|
||||
- Modify: `tests/agent_runtime_test.rs:507-660`
|
||||
- Test: `tests/agent_runtime_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the failing test**
|
||||
|
||||
Rename and rewrite the existing helper-page regression so it asserts the new behavior:
|
||||
|
||||
```rust
#[test]
fn production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap() {
    // arrange runtime context and fake browser ws server
    // submit Zhihu hotlist request
    // assert ws frames never contain "/sgclaw/browser-helper.html"
    // assert first action is navigate to https://www.zhihu.com/hot
    // assert follow-up action uses real-page requesturl instead of helper page
}
```
|
||||
|
||||
Use the existing fake ws helpers in the file where possible. Do not add localhost callback-host HTTP plumbing to this rewritten test.
|
||||
|
||||
- [ ] **Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
|
||||
|
||||
Expected: FAIL because current production code still routes Zhihu submit into `BrowserCallbackBackend` and emits helper-page bootstrap frames.
|
||||
|
||||
- [ ] **Step 3: Keep the regression focused**
|
||||
|
||||
Before touching production code, confirm the rewritten test checks only these behaviors:
|
||||
|
||||
```text
- no callback-host bootstrap frame
- no helper-page URL
- navigate frame still targets https://www.zhihu.com/hot
- follow-up websocket action uses real-page request ownership
```
|
||||
|
||||
Do not assert unrelated workflow details beyond what is needed to prove the route correction.
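
The first three checks could be expressed as one small helper over the frames recorded by the fake websocket server. This is a sketch only: `check_submit_frames` is a hypothetical name, the exact navigate frame shape is not assumed (the check is a substring match), and request-ownership checking is left out because it depends on the real frame layout.

```rust
const HELPER_PAGE_PATH: &str = "/sgclaw/browser-helper.html";
const ZHIHU_HOT_URL: &str = "https://www.zhihu.com/hot";

/// Validate recorded outbound frames: no helper-page traffic, hotlist first.
fn check_submit_frames(frames: &[String]) -> Result<(), String> {
    if let Some(frame) = frames.iter().find(|f| f.contains(HELPER_PAGE_PATH)) {
        return Err(format!("helper-page frame emitted: {}", frame));
    }
    match frames.first() {
        Some(first) if first.contains(ZHIHU_HOT_URL) => Ok(()),
        Some(first) => Err(format!("first frame does not target the hotlist: {}", first)),
        None => Err("no frames recorded".to_string()),
    }
}
```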
|
||||
|
||||
- [ ] **Step 4: Commit the red test**
|
||||
|
||||
```bash
git add tests/agent_runtime_test.rs
git commit -m "test: rewrite zhihu submit ws routing regression"
```
|
||||
|
||||
### Task 2: Switch service Zhihu submit routes off the callback-host backend
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/service/server.rs:287-328`
|
||||
- Test: `tests/agent_runtime_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the minimal production change**
|
||||
|
||||
Replace only the Zhihu-route callback-host branch with direct websocket backend selection.
|
||||
|
||||
Minimal target shape:
|
||||
|
||||
```rust
fn browser_backend_for_submit(
    browser_ws_url: &str,
    mac_policy: &MacPolicy,
    request: &SubmitTaskRequest,
) -> Result<Arc<dyn BrowserBackend>, PipeError> {
    // Zhihu route: previously selected the callback-host backend;
    // it now selects the same direct websocket backend as every other route.
    if should_use_callback_host_backend(request) {
        return Ok(Arc::new(WsBrowserBackend::new(
            Arc::new(ServiceWsClient::connect(browser_ws_url)?),
            mac_policy.clone(),
            initial_request_url_for_submit_task(request),
        )));
    }

    // Non-Zhihu routes: unchanged.
    Ok(Arc::new(WsBrowserBackend::new(
        Arc::new(ServiceWsClient::connect(browser_ws_url)?),
        mac_policy.clone(),
        initial_request_url_for_submit_task(request),
    )))
}
```
|
||||
|
||||
After the route-gated branch is removed, simplify further only if the branch becomes redundant without changing non-Zhihu behavior.
|
||||
|
||||
- [ ] **Step 2: Run the rewritten regression**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
|
||||
|
||||
Expected: still FAIL or advance to a later assertion until the direct-runtime path is corrected too.
|
||||
|
||||
- [ ] **Step 3: Add or update a service-specific regression if needed**
|
||||
|
||||
If the rewritten `agent_runtime_test` does not exercise the service submit path directly, add one narrow service regression before continuing.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
#[test]
fn service_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap() {
    // fake browser ws
    // submit Zhihu route through service path
    // assert no helper bootstrap frame
}
```
|
||||
|
||||
Run the exact test you end up using:
|
||||
|
||||
`cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test <exact test file> -- --nocapture`
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 4: Commit the service-path fix**
|
||||
|
||||
```bash
git add src/service/server.rs tests/agent_runtime_test.rs
git commit -m "fix: route zhihu submit through ws backend"
```
|
||||
|
||||
### Task 3: Switch direct runtime Zhihu submit routes off the callback-host backend while keeping pipe fallback
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/agent/mod.rs:49-100`
|
||||
- Test: `tests/agent_runtime_test.rs`
|
||||
|
||||
- [ ] **Step 1: Write the minimal production change**
|
||||
|
||||
Remove callback-host backend selection from `browser_backend_for_submit(...)`.
|
||||
|
||||
Minimal target behavior:
|
||||
|
||||
```rust
if let Some(browser_ws_url) = configured_browser_ws_url(context) {
    return Ok(Arc::new(WsBrowserBackend::new(
        Arc::new(ServiceWsClient::connect(&browser_ws_url)?),
        browser_tool.mac_policy().clone(),
        initial_request_url_for_submit_task(request),
    ).with_response_timeout(browser_tool.response_timeout())));
}

Ok(Arc::new(PipeBrowserBackend::from_inner(browser_tool.clone())))
```
|
||||
|
||||
If `ServiceWsClient` is not reusable from `src/service/server.rs`, extract the smallest shared websocket client helper into the browser module instead of inventing a new abstraction.
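
The fallback rule itself (websocket when a URL is configured, pipe otherwise) can be modeled in isolation. The sketch below is hypothetical: `SubmitBackend`, `configured_ws_url`, and `select_submit_backend` are illustrative names, and the precedence of the environment value over the config value, as well as treating blank values as unset, are assumptions to verify against `configured_browser_ws_url(...)`.

```rust
#[derive(Debug, PartialEq)]
enum SubmitBackend {
    /// Direct browser websocket backend at the given URL.
    Ws { url: String },
    /// Existing pipe backend, used when no websocket URL is available.
    Pipe,
}

/// Pick the first non-blank candidate: environment value, then config value.
fn configured_ws_url(env_value: Option<&str>, config_value: Option<&str>) -> Option<String> {
    env_value
        .into_iter()
        .chain(config_value)
        .map(str::trim)
        .find(|value| !value.is_empty())
        .map(String::from)
}

/// Websocket when a usable URL exists, pipe fallback otherwise.
fn select_submit_backend(env_value: Option<&str>, config_value: Option<&str>) -> SubmitBackend {
    match configured_ws_url(env_value, config_value) {
        Some(url) => SubmitBackend::Ws { url },
        None => SubmitBackend::Pipe,
    }
}
```

Taking the environment and config values as parameters, instead of reading them inside the function, keeps the rule testable without mutating process state.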
|
||||
|
||||
- [ ] **Step 2: Add a focused fallback assertion only if needed**
|
||||
|
||||
If the rewritten regression does not cover the direct-runtime no-websocket case, add one small test:
|
||||
|
||||
```rust
#[test]
fn production_submit_task_keeps_pipe_fallback_when_browser_ws_url_is_unset() {
    // no SGCLAW_BROWSER_WS_URL
    // blank/no ws config
    // assert no websocket bootstrap attempt occurs
}
```
|
||||
|
||||
Only add this test if current coverage is insufficient.
|
||||
|
||||
- [ ] **Step 3: Run tests to verify green**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
|
||||
|
||||
Expected: PASS
|
||||
|
||||
If a fallback test was added, run it immediately after and expect PASS.
|
||||
|
||||
- [ ] **Step 4: Commit the direct-runtime fix**
|
||||
|
||||
```bash
git add src/agent/mod.rs tests/agent_runtime_test.rs
git commit -m "fix: align runtime zhihu submit with ws contract"
```
|
||||
|
||||
### Task 4: Reassess stale callback-host regression coverage only if it blocks the approved slice
|
||||
|
||||
**Files:**
|
||||
- Maybe modify: `src/browser/callback_host.rs:793-810`
|
||||
- Test: `src/browser/callback_host.rs`
|
||||
|
||||
- [ ] **Step 1: Check whether the callback-host red test still blocks the approved Option A slice**
|
||||
|
||||
Inspect whether this test still preserves rejected Option-B behavior and whether it fails or becomes misleading after Tasks 1-3:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn live_callback_host_starts_without_bootstrapping_external_helper_page() {
|
||||
// inspect before editing
|
||||
}
|
||||
```
|
||||
|
||||
If the test is unrelated to the approved Zhihu mainline or remains harmless, leave it unchanged in this slice.
|
||||
|
||||
- [ ] **Step 2: Remove or rewrite only if required by the changed submit-path behavior**
|
||||
|
||||
If the test blocks the approved slice, make the smallest change needed:
|
||||
|
||||
- delete it if it exists only to preserve rejected Option B behavior, or
|
||||
- rewrite it so it no longer asserts callback-host startup as the accepted Zhihu mainline
|
||||
|
||||
- [ ] **Step 3: Run focused callback-host tests only if Step 2 changed code**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_host --lib -- --nocapture`
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 4: Commit only if Step 2 changed code**
|
||||
|
||||
```bash
|
||||
git add src/browser/callback_host.rs
|
||||
git commit -m "test: clean up stale callback host regression"
|
||||
```
|
||||
|
||||
### Task 5: Run the focused verification sweep
|
||||
|
||||
**Files:**
|
||||
- Verify: `tests/agent_runtime_test.rs`
|
||||
- Verify: `tests/compat_runtime_test.rs`
|
||||
- Verify: any directly affected service/browser websocket tests
|
||||
|
||||
- [ ] **Step 1: Run submit-path regression coverage**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 2: Run websocket backend request-url coverage**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" ws_backend_reuses_last_navigated_url_for_followup_requests --test browser_ws_backend_test -- --nocapture`
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 3: Run Zhihu compat runtime coverage**
|
||||
|
||||
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" zhihu --test compat_runtime_test -- --nocapture`
|
||||
|
||||
Expected: PASS for the changed submit-path surface. If anything fails, the failures must be clearly and directly related to this slice.
|
||||
|
||||
- [ ] **Step 4: Run affected service submit regression coverage**
|
||||
|
||||
Run the exact service-specific regression from Task 2 if you added one.
|
||||
|
||||
Otherwise, run the narrowest existing service submit test that covers `ClientMessage::SubmitTask` handling for browser routes.
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 5: Commit the verified slice**
|
||||
|
||||
```bash
|
||||
git add src/service/server.rs src/agent/mod.rs tests/agent_runtime_test.rs src/browser/callback_host.rs
|
||||
git commit -m "fix: realign zhihu submit with browser ws callbacks"
|
||||
```
|
||||
|
||||
### Task 6: Run stronger real-browser validation
|
||||
|
||||
**Files:**
|
||||
- Verify live behavior through existing binaries and real config only
|
||||
|
||||
- [ ] **Step 1: Start the service with the real config**
|
||||
|
||||
Run: `cargo run --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"`
|
||||
|
||||
Expected: the service starts without hitting the callback-host readiness timeout.
|
||||
|
||||
- [ ] **Step 2: Run the client against the started service**
|
||||
|
||||
Run: `cargo run --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sg_claw_client`
|
||||
|
||||
Expected: for `打开知乎热榜,获取前10条数据,并导出 Excel`, the browser proceeds into real Zhihu page work instead of stalling before page open.
|
||||
|
||||
- [ ] **Step 3: Capture the narrow acceptance evidence**
|
||||
|
||||
Verify all of the following from logs/frames/observed behavior:
|
||||
|
||||
```text
|
||||
- no callback-host readiness timeout
|
||||
- no helper-page bootstrap frame
|
||||
- at least one real-page follow-up browser action after navigate
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Commit only if live verification required code changes**
|
||||
|
||||
```bash
|
||||
git add <only files changed during live-fix follow-up>
|
||||
git commit -m "fix: tighten zhihu ws submit live validation follow-up"
|
||||
```
|
||||
|
||||
If no further code changes were needed, do not create an extra commit.
|
||||
@@ -0,0 +1,406 @@
|
||||
# Service Chat Web Console Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Add a standalone local HTML console that connects to the existing service websocket, submits natural-language tasks with the current `submit_task` payload, and leaves the browser-helper/runtime path untouched.
|
||||
|
||||
**Architecture:** The change stays fully at the presentation edge. A new self-contained HTML file under `frontend/service-console/` reuses the current websocket protocol from `src/service/protocol.rs`, while one narrow Rust integration test guards the page's protocol shape and forbids any reference to `browser-helper.html`, callback-host endpoints, or the browser websocket. No Rust runtime logic changes are part of this slice.
|
||||
|
||||
**Tech Stack:** HTML, CSS, vanilla JavaScript, Rust integration tests, std::fs, Cargo test
|
||||
|
||||
---
|
||||
|
||||
## File Map
|
||||
|
||||
- Create: `frontend/service-console/sg_claw_service_console.html`
|
||||
- Standalone local page with inline CSS and JavaScript
|
||||
- Connects to the existing service websocket at `ws://127.0.0.1:42321` by default
|
||||
- Sends existing `ClientMessage::SubmitTask` JSON
|
||||
- Renders inbound `ServiceMessage` rows only
|
||||
- Create: `tests/service_console_html_test.rs`
|
||||
- Source guard for the standalone page
|
||||
- Verifies file location, allowed protocol usage, and forbidden helper/callback references
|
||||
- Reference: `src/service/protocol.rs`
|
||||
- Existing websocket message shape to mirror exactly
|
||||
- Reference: `src/bin/sg_claw_client.rs`
|
||||
- Existing terminal client behavior to mirror for `submit_task`
|
||||
- Reference: `docs/superpowers/specs/2026-04-06-service-chat-web-console-design.md`
|
||||
|
||||
## Scope Guardrails
|
||||
|
||||
- Do not modify `src/service/server.rs`.
|
||||
- Do not modify `src/browser/callback_host.rs`.
|
||||
- Do not modify `src/browser/callback_backend.rs`.
|
||||
- Do not modify `src/bin/sg_claw_client.rs`.
|
||||
- Do not add an HTTP server.
|
||||
- Do not connect the new page to `ws://127.0.0.1:12345`.
|
||||
- Do not reference `/sgclaw/browser-helper.html` or `/sgclaw/callback/*` anywhere in the new page.
|
||||
|
||||
### Task 1: Add a failing source-guard test for the standalone page
|
||||
|
||||
**Files:**
|
||||
- Create: `tests/service_console_html_test.rs`
|
||||
- Reference: `docs/superpowers/specs/2026-04-06-service-chat-web-console-design.md`
|
||||
|
||||
- [ ] **Step 1: Write the failing test**
|
||||
|
||||
Create a focused integration test that resolves the HTML path from `CARGO_MANIFEST_DIR` and asserts the file contract.
|
||||
|
||||
```rust
|
||||
use std::fs;
|
||||
use std::path::PathBuf;
|
||||
|
||||
#[test]
|
||||
fn service_console_html_stays_on_service_ws_boundary() {
|
||||
let manifest_dir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
|
||||
let html_path = manifest_dir
|
||||
.join("frontend")
|
||||
.join("service-console")
|
||||
.join("sg_claw_service_console.html");
|
||||
let source = fs::read_to_string(&html_path)
|
||||
.expect("service console html should exist");
|
||||
|
||||
assert!(source.contains("ws://127.0.0.1:42321"));
|
||||
assert!(source.contains("submit_task"));
|
||||
assert!(!source.contains("/sgclaw/browser-helper.html"));
|
||||
assert!(!source.contains("/sgclaw/callback/ready"));
|
||||
assert!(!source.contains("/sgclaw/callback/events"));
|
||||
assert!(!source.contains("/sgclaw/callback/commands/next"));
|
||||
assert!(!source.contains("/sgclaw/callback/commands/ack"));
|
||||
assert!(!source.contains("ws://127.0.0.1:12345"));
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Run test to verify it fails**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_console_html_stays_on_service_ws_boundary --test service_console_html_test -- --exact
|
||||
```
|
||||
|
||||
Expected: FAIL because the HTML file does not exist yet.
|
||||
|
||||
- [ ] **Step 3: Keep the test narrow**
|
||||
|
||||
Before writing production code, confirm the test guards only the approved boundary:
|
||||
|
||||
```text
|
||||
- file exists at frontend/service-console/sg_claw_service_console.html
|
||||
- service websocket default is present
|
||||
- submit_task payload marker is present
|
||||
- no helper-page path
|
||||
- no callback-host endpoints
|
||||
- no browser websocket URL
|
||||
```
|
||||
|
||||
Do not turn this into an end-to-end browser test.
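
If the chain of `assert!` calls proves hard to debug, the same boundary can be expressed as one function that lists every violation at once. This is an optional sketch: `boundary_violations` and the two constant names are hypothetical, while the marker strings are copied from the test in Step 1.

```rust
/// Markers the page must contain (copied from the guard test).
const REQUIRED_MARKERS: &[&str] = &["ws://127.0.0.1:42321", "submit_task"];

/// Markers the page must never contain (copied from the guard test).
const FORBIDDEN_MARKERS: &[&str] = &[
    "/sgclaw/browser-helper.html",
    "/sgclaw/callback/ready",
    "/sgclaw/callback/events",
    "/sgclaw/callback/commands/next",
    "/sgclaw/callback/commands/ack",
    "ws://127.0.0.1:12345",
];

/// Hypothetical helper: collects readable violations instead of
/// panicking on the first failed assertion.
fn boundary_violations(source: &str) -> Vec<String> {
    let missing = REQUIRED_MARKERS
        .iter()
        .filter(|marker| !source.contains(**marker))
        .map(|marker| format!("missing required marker: {}", marker));
    let forbidden = FORBIDDEN_MARKERS
        .iter()
        .filter(|marker| source.contains(**marker))
        .map(|marker| format!("forbidden marker present: {}", marker));
    missing.chain(forbidden).collect()
}
```

A test could then assert that `boundary_violations(&source)` is empty and print the returned list on failure. It remains a source-level string check, so it stays inside the narrow scope described above.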
|
||||
|
||||
- [ ] **Step 4: Commit the red test**
|
||||
|
||||
```bash
|
||||
git add tests/service_console_html_test.rs
|
||||
git commit -m "test: add service console html boundary guard"
|
||||
```
|
||||
|
||||
### Task 2: Implement the standalone HTML console with the approved boundary
|
||||
|
||||
**Files:**
|
||||
- Create: `frontend/service-console/sg_claw_service_console.html`
|
||||
- Reference: `src/service/protocol.rs:6`
|
||||
- Reference: `src/bin/sg_claw_client.rs:16`
|
||||
- Test: `tests/service_console_html_test.rs`
|
||||
|
||||
- [ ] **Step 1: Create the HTML file with the minimal structure**
|
||||
|
||||
Write one self-contained page with:
|
||||
|
||||
```html
|
||||
<!doctype html>
|
||||
<html lang="zh-CN">
|
||||
<head>
|
||||
<meta charset="utf-8" />
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
||||
<title>sgClaw Service Console</title>
|
||||
</head>
|
||||
<body>
|
||||
<div id="app">
|
||||
<input id="wsUrl" value="ws://127.0.0.1:42321" />
|
||||
<button id="connectBtn">连接</button>
|
||||
<div id="connectionState">未连接</div>
|
||||
<div id="messageStream"></div>
|
||||
<textarea id="instructionInput"></textarea>
|
||||
<div id="validationText"></div>
|
||||
<button id="sendBtn" disabled>发送任务</button>
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
```
|
||||
|
||||
Keep all CSS and JavaScript inline. Do not add external assets or a build step.
|
||||
|
||||
- [ ] **Step 2: Implement websocket connect/disconnect behavior**
|
||||
|
||||
Add the smallest possible JS behavior, including explicit disconnect on the same button so the UI
|
||||
matches the approved connect/disconnect contract:
|
||||
|
||||
```javascript
|
||||
let socket = null;
|
||||
|
||||
function appendRow(kind, text) {
|
||||
// append a visible row to #messageStream
|
||||
}
|
||||
|
||||
function updateUiState() {
|
||||
const connected = socket && socket.readyState === WebSocket.OPEN;
|
||||
document.getElementById('connectBtn').textContent = connected ? '断开' : '连接';
|
||||
document.getElementById('sendBtn').disabled = !connected;
|
||||
document.getElementById('connectionState').textContent = connected ? '已连接' : '未连接';
|
||||
}
|
||||
|
||||
function connectOrDisconnectService() {
|
||||
if (socket && socket.readyState === WebSocket.OPEN) {
|
||||
socket.close();
|
||||
return;
|
||||
}
|
||||
|
||||
const url = document.getElementById('wsUrl').value.trim() || 'ws://127.0.0.1:42321';
|
||||
socket = new WebSocket(url);
|
||||
updateUiState();
|
||||
socket.addEventListener('open', () => {
|
||||
appendRow('status', 'service websocket connected');
|
||||
updateUiState();
|
||||
});
|
||||
socket.addEventListener('close', () => {
|
||||
appendRow('status', 'service websocket disconnected');
|
||||
updateUiState();
|
||||
});
|
||||
socket.addEventListener('error', () => appendRow('error', 'service websocket error'));
|
||||
socket.addEventListener('message', handleMessage);
|
||||
}
|
||||
```
|
||||
|
||||
Do not add retry loops or background reconnect logic.
|
||||
|
||||
- [ ] **Step 3: Implement submit_task sending with the current message shape**
|
||||
|
||||
Mirror the terminal client payload shape exactly and show inline validation for empty input:
|
||||
|
||||
```javascript
|
||||
function setValidation(message) {
|
||||
document.getElementById('validationText').textContent = message;
|
||||
}
|
||||
|
||||
function sendTask() {
|
||||
const instruction = document.getElementById('instructionInput').value.trim();
|
||||
if (!socket || socket.readyState !== WebSocket.OPEN) {
|
||||
return;
|
||||
}
|
||||
if (!instruction) {
|
||||
setValidation('请输入任务内容。');
|
||||
return;
|
||||
}
|
||||
|
||||
setValidation('');
|
||||
socket.send(JSON.stringify({
|
||||
type: 'submit_task',
|
||||
instruction,
|
||||
conversation_id: '',
|
||||
messages: [],
|
||||
page_url: '',
|
||||
page_title: ''
|
||||
}));
|
||||
}
|
||||
```
|
||||
|
||||
Do not add new fields. Do not add conversation replay logic in this slice.
|
||||
|
||||
- [ ] **Step 4: Render existing inbound service messages only**
|
||||
|
||||
Handle the current `ServiceMessage` variants with a minimal dispatcher:
|
||||
|
||||
```javascript
|
||||
function handleMessage(event) {
|
||||
const message = JSON.parse(event.data);
|
||||
switch (message.type) {
|
||||
case 'status_changed':
|
||||
appendRow('status', message.state);
|
||||
break;
|
||||
case 'log_entry':
|
||||
appendRow('log', message.message);
|
||||
break;
|
||||
case 'task_complete':
|
||||
appendRow(message.success ? 'complete' : 'error', message.summary);
|
||||
break;
|
||||
case 'busy':
|
||||
appendRow('error', message.message);
|
||||
break;
|
||||
default:
|
||||
appendRow('error', 'unknown service message: ' + event.data);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Keep the composer enabled during in-flight work so repeated submits surface the existing `busy` response instead of inventing a frontend queue.
|
||||
|
||||
- [ ] **Step 5: Keep the helper boundary explicit in the source**
|
||||
|
||||
Before running tests, inspect the HTML source and confirm:
|
||||
|
||||
```text
|
||||
- no /sgclaw/browser-helper.html
|
||||
- no /sgclaw/callback/*
|
||||
- no ws://127.0.0.1:12345
|
||||
- no browser websocket register frame logic
|
||||
```
|
||||
|
||||
If any such string appears, remove it before testing.
|
||||
|
||||
- [ ] **Step 6: Run the source-guard test to verify green**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_console_html_stays_on_service_ws_boundary --test service_console_html_test -- --exact
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 7: Commit the standalone page**
|
||||
|
||||
```bash
|
||||
git add frontend/service-console/sg_claw_service_console.html tests/service_console_html_test.rs
|
||||
git commit -m "feat: add standalone service chat console"
|
||||
```
|
||||
|
||||
### Task 3: Run the focused verification sweep
|
||||
|
||||
**Files:**
|
||||
- Verify: `tests/service_console_html_test.rs`
|
||||
- Reference: `src/service/protocol.rs`
|
||||
- Reference: `src/bin/sg_claw_client.rs`
|
||||
|
||||
- [ ] **Step 1: Re-run the source-guard test**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_console_html_stays_on_service_ws_boundary --test service_console_html_test -- --exact
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 2: Manually inspect disconnected-send and validation markers in the HTML source**
|
||||
|
||||
Before broader verification, confirm the page source clearly contains all three UI-local rules:
|
||||
|
||||
```text
|
||||
- connect button can disconnect an open websocket
|
||||
- send button starts disabled while disconnected
|
||||
- empty instruction shows inline validation text
|
||||
```
|
||||
|
||||
This inspection stays source-level; do not add extra backend tests for it in this slice.
|
||||
|
||||
- [ ] **Step 3: Run an existing service protocol regression for safety**
|
||||
|
||||
Run the narrow existing protocol coverage to prove the page did not require backend changes:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" submit_task_client_message_converts_into_shared_runner_request --test service_ws_session_test -- --exact
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 4: Run an existing terminal-client regression for safety**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" client_submits_first_user_line_to_service --test service_task_flow_test -- --exact
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 5: Commit only if verification required any code change**
|
||||
|
||||
```bash
|
||||
git add frontend/service-console/sg_claw_service_console.html tests/service_console_html_test.rs
|
||||
git commit -m "test: tighten service console verification"
|
||||
```
|
||||
|
||||
If verification required no code changes, do not create an extra commit.
|
||||
|
||||
### Task 4: Perform the manual smoke check
|
||||
|
||||
**Files:**
|
||||
- Verify live behavior only; no new code required
|
||||
|
||||
- [ ] **Step 1: Start the existing service binary**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo run --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
|
||||
```
|
||||
|
||||
Expected: service starts and prints its ready line with the service websocket listen address.
|
||||
|
||||
- [ ] **Step 2: Open the standalone page directly**
|
||||
|
||||
Open:
|
||||
|
||||
```text
|
||||
D:/data/ideaSpace/rust/sgClaw/claw-new/frontend/service-console/sg_claw_service_console.html
|
||||
```
|
||||
|
||||
Expected: the page loads through the browser as a local file and shows the default websocket URL `ws://127.0.0.1:42321`.
|
||||
|
||||
- [ ] **Step 3: Connect, disconnect, and reconnect once**
|
||||
|
||||
Expected:
|
||||
|
||||
```text
|
||||
- message stream shows websocket connected
|
||||
- clicking the same button disconnects the websocket cleanly
|
||||
- message stream shows websocket disconnected
|
||||
- send button is disabled again while disconnected
|
||||
- reconnect succeeds without reloading the page
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Submit one natural-language task**
|
||||
|
||||
Use a small harmless instruction such as:
|
||||
|
||||
```text
|
||||
打开百度
|
||||
```
|
||||
|
||||
Expected:
|
||||
|
||||
```text
|
||||
- empty textarea send attempt first shows inline validation without sending a websocket frame
|
||||
- page sends one submit_task payload after valid input
|
||||
- page receives and renders status/log/task_complete or busy rows
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Confirm the helper boundary stayed untouched**
|
||||
|
||||
Verify from the page source and observed behavior:
|
||||
|
||||
```text
|
||||
- the page never loads /sgclaw/browser-helper.html
|
||||
- the page never calls /sgclaw/callback/*
|
||||
- the page never connects to ws://127.0.0.1:12345
|
||||
```
|
||||
|
||||
If the task itself triggers browser automation, that remains owned by the existing Rust runtime rather than by the page.
|
||||
|
||||
- [ ] **Step 6: Commit only if the manual pass required code changes**
|
||||
|
||||
```bash
|
||||
git add frontend/service-console/sg_claw_service_console.html tests/service_console_html_test.rs
|
||||
git commit -m "fix: tighten standalone service console smoke flow"
|
||||
```
|
||||
|
||||
If the manual pass required no code changes, do not create an extra commit.
|
||||
@@ -0,0 +1,637 @@
|
||||
# Zhihu Hotlist Post-Export Auto-Open Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Extend the existing Zhihu hotlist Excel and dashboard routes so each route can auto-open its own generated artifact after export, while preserving the current callback-host-backed browser boundary and route exclusivity.
|
||||
|
||||
**Architecture:** Keep orchestration in `src/compat/workflow_executor.rs`, but move post-export side effects into a new `src/compat/artifact_open.rs` helper so workflow routing stays readable. Excel auto-open is a local OS-launch side effect; dashboard auto-open reuses `screen_html_export`'s existing `presentation.url` and sends one narrow, marker-based `Action::Navigate` request through `BrowserCallbackBackend`, with a matching special-case validator in `MacPolicy` so arbitrary `file://` navigation remains blocked.
|
||||
|
||||
**Tech Stack:** Rust, serde_json, std::process::Command, std::path, Cargo tests
|
||||
|
||||
---
|
||||
|
||||
## File Map
|
||||
|
||||
- Create: `src/compat/artifact_open.rs`
|
||||
- Define the narrow post-export helper surface for this slice only
|
||||
- Parse and validate generated artifact payload fields passed in by the workflow layer
|
||||
- Open generated `.xlsx` files with the local default app
|
||||
- Build the exact approved local-dashboard navigate payload
|
||||
- Keep one testable internal seam, `open_exported_xlsx_with(output_path, opener)`, so unit tests can prove the generated `.xlsx` path is handed to the launcher without starting a real spreadsheet app
|
||||
- Include unit tests in the same file for exact Excel path handoff and launcher-failure reporting
|
||||
- Modify: `src/compat/mod.rs`
|
||||
- Export the new `artifact_open` module
|
||||
- Modify: `src/compat/workflow_executor.rs`
|
||||
- Keep route detection and artifact generation where they are now
|
||||
- Change `export_xlsx(...)` and `export_screen(...)` so they parse tool payloads, call the route-specific opener, and produce the new success/failure summaries
|
||||
- Modify: `src/browser/callback_backend.rs`
|
||||
- Recognize only the approved local-dashboard navigate request shape at `Action::Navigate`
|
||||
- Keep normal remote navigate behavior unchanged
|
||||
- Continue emitting `sgBrowerserOpenPage` for the approved local-dashboard case so the helper page stays alive and the dashboard opens in a new visible tab
|
||||
- Add focused callback-backend unit tests in the existing test module for approved and malformed local-dashboard requests
|
||||
- Modify: `src/security/mac_policy.rs`
|
||||
- Add a narrow validator for the approved local-dashboard presentation case
|
||||
- Keep `validate(...)` unchanged for ordinary remote-domain flow
|
||||
- Reject malformed marker payloads, non-HTML local paths, and mismatched `file://` / output-path combinations
|
||||
- Modify: `tests/compat_runtime_test.rs`
|
||||
- Keep the concrete hotlist workflow regressions in this existing integration test file
|
||||
- Extend existing Zhihu hotlist export/screen regressions to assert the new summaries and the dashboard marker payload
|
||||
- Keep the Excel route workflow assertion limited to summary plus “no dashboard navigate marker,” because exact launcher handoff is covered in `src/compat/artifact_open.rs` unit tests
|
||||
- Modify: `tests/browser_tool_test.rs`
|
||||
- Add `MacPolicy` coverage for approved local-dashboard presentation, rejected malformed presentation, and unchanged normal-domain validation in one exact file
|
||||
- Extend the existing `default_rules_allow_zhihu_navigation` area with the new local-dashboard validation tests rather than creating a second policy test location
|
||||
- Reference only if summary wording ripples outward: `tests/agent_runtime_test.rs:173-258`
|
||||
- Existing direct-runtime user-visible summary assertion for Zhihu Excel export
|
||||
- Reference only if summary wording ripples outward: `tests/service_task_flow_test.rs:704-839`
|
||||
- Existing CLI-to-service user-visible summary assertion for Zhihu Excel export
|
||||
- Reference only if summary wording ripples outward: `tests/service_ws_session_test.rs:755-869`
|
||||
- Existing service-binary user-visible summary assertion for Zhihu Excel export
|
||||
- Reference: `tests/compat_screen_html_export_tool_test.rs`
|
||||
- Reuse the exact test seam `screen_html_export_tool_renders_dashboard_html_with_presentation_contract`
|
||||
- Existing proof that `screen_html_export` already returns `presentation.url`
|
||||
- Reference: `docs/superpowers/specs/2026-04-06-zhihu-hotlist-post-export-auto-open-design.md`
|
||||
|
||||
## Scope Guardrails
|
||||
|
||||
- Do not modify `frontend/service-console/sg_claw_service_console.html`.
|
||||
- Do not modify `src/service/protocol.rs`.
|
||||
- Do not modify `browser-helper.html`.
|
||||
- Do not modify `/sgclaw/callback/*` endpoint contracts.
|
||||
- Do not modify websocket protocol framing or `src/browser/ws_protocol.rs`.
|
||||
- Do not turn Excel-open and dashboard-open into a combined mode.
|
||||
- Do not add a general-purpose local file browser or generic `file://` allowlist.
|
||||
- Do not move post-export decisions into the frontend service console.
|
||||
- Do not require websocket-backend parity in this slice.
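
To make the "no generic `file://` allowlist" guardrail concrete, the narrow validator described in the File Map could look roughly like the sketch below. It is an illustration under assumptions: `LocalDashboardRequest` and `validate_local_dashboard` are hypothetical names, the real code reads these fields from JSON params, and the output-path cross-check is omitted here.

```rust
const LOCAL_DASHBOARD_EXPECTED_DOMAIN: &str = "__sgclaw_local_dashboard__";
const LOCAL_DASHBOARD_SOURCE: &str = "compat.workflow_executor";
const LOCAL_DASHBOARD_KIND: &str = "zhihu_hotlist_screen";

/// Hypothetical flattened view of the navigate request.
struct LocalDashboardRequest<'a> {
    expected_domain: &'a str,
    url: &'a str,
    source: &'a str,
    kind: &'a str,
    presentation_url: &'a str,
}

/// Accepts only the approved marker shape; everything else is rejected,
/// so ordinary `file://` navigation stays blocked.
fn validate_local_dashboard(request: &LocalDashboardRequest) -> Result<(), String> {
    if request.expected_domain != LOCAL_DASHBOARD_EXPECTED_DOMAIN {
        return Err("expected_domain is not the local-dashboard marker".to_string());
    }
    if request.source != LOCAL_DASHBOARD_SOURCE || request.kind != LOCAL_DASHBOARD_KIND {
        return Err("marker source/kind does not match the approved contract".to_string());
    }
    if !request.url.starts_with("file://") {
        return Err("local dashboard url must use file://".to_string());
    }
    if !request.url.to_ascii_lowercase().ends_with(".html") {
        return Err("local dashboard url must point at an .html file".to_string());
    }
    if request.presentation_url != request.url {
        return Err("presentation_url must equal the navigate url".to_string());
    }
    Ok(())
}
```

Every check fails closed, so a payload that is merely similar to the approved shape is rejected rather than passed on to the browser.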
|
||||
|
||||
### Task 1: Add failing workflow tests for route-specific post-export actions
|
||||
|
||||
**Files:**
|
||||
- Modify: `tests/compat_runtime_test.rs:2154-2304`
|
||||
- Reference: `src/compat/workflow_executor.rs:375-446`
|
||||
- Reference: `docs/superpowers/specs/2026-04-06-zhihu-hotlist-post-export-auto-open-design.md`
|
||||
|
||||
- [ ] **Step 1: Rewrite the Excel hotlist assertion as a red test for the new summary only**
|
||||
|
||||
Keep the current flow setup, but tighten the expectation so it proves the workflow route now reports post-export open success while staying exclusive from the dashboard path.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open() {
|
||||
// existing setup
|
||||
assert!(summary.contains("已导出并打开知乎热榜 Excel"));
|
||||
assert!(generated.exists());
|
||||
assert!(!sent.iter().any(|message| {
|
||||
matches!(
|
||||
message,
|
||||
AgentMessage::Command { action, params, .. }
|
||||
if action == &Action::Navigate
|
||||
&& params.get("sgclaw_local_dashboard_open").is_some()
|
||||
)
|
||||
}));
|
||||
}
|
||||
```
|
||||
|
||||
Do not try to prove real OS launching in this workflow test. The exact `.xlsx` path handoff to the launcher belongs in `src/compat/artifact_open.rs` unit tests from Task 2.
|
||||
|
||||
- [ ] **Step 2: Rewrite the dashboard hotlist assertion as a red test for browser auto-open**
|
||||
|
||||
Tighten the existing dashboard test so it proves the workflow consumes `presentation.url` and emits the approved compat marker payload.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open() {
|
||||
// existing setup
|
||||
assert!(summary.contains("已在浏览器中打开知乎热榜大屏"));
|
||||
let navigate = sent.iter().find_map(|message| match message {
|
||||
AgentMessage::Command { action, params, security, .. }
|
||||
if action == &Action::Navigate
|
||||
&& security.expected_domain == "__sgclaw_local_dashboard__" => Some((params, security)),
|
||||
_ => None,
|
||||
}).expect("dashboard route should emit local-dashboard navigate request");
|
||||
|
||||
assert!(navigate.0["url"].as_str().unwrap().starts_with("file://"));
|
||||
assert_eq!(navigate.0["sgclaw_local_dashboard_open"]["source"], json!("compat.workflow_executor"));
|
||||
assert_eq!(navigate.0["sgclaw_local_dashboard_open"]["kind"], json!("zhihu_hotlist_screen"));
|
||||
assert_eq!(navigate.0["sgclaw_local_dashboard_open"]["presentation_url"], navigate.0["url"]);
|
||||
}
|
||||
```
|
||||
|
||||
Also assert that this route still logs `call screen_html_export` and does not invoke the Excel opener path.
|
||||
|
||||
- [ ] **Step 3: Add a missing-`presentation.url` regression in the workflow test module if none exists**
|
||||
|
||||
Put this close to the existing hotlist tests and keep it narrow:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing() {
|
||||
// mock screen_html_export success payload with output_path but no presentation.url
|
||||
// assert summary contains 已生成知乎热榜大屏 <path>,但浏览器自动打开失败:
|
||||
}
|
||||
```
|
||||
|
||||
Use the existing summary/path helpers in the file instead of inventing new parsing helpers.
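
For orientation, the summaries these red tests expect could be produced by a mapping like the one below. This is a sketch under assumptions: `dashboard_summary` is a hypothetical name, the local `PostExportOpen` enum mirrors the one introduced in Task 2, and appending the path to the success summary is a guess. Only the quoted Chinese fragments come from the assertions in this task.

```rust
use std::path::Path;

/// Local mirror of the Task 2 result type, repeated so the sketch compiles alone.
enum PostExportOpen {
    Opened,
    Failed(String),
}

/// Hypothetical summary builder for the dashboard route.
fn dashboard_summary(output_path: &Path, open: &PostExportOpen) -> String {
    match open {
        // Success: the artifact exists and the browser navigate was accepted.
        PostExportOpen::Opened => format!(
            "已在浏览器中打开知乎热榜大屏 {}",
            output_path.display()
        ),
        // Failure: the artifact still exists, so the path is reported with the reason.
        PostExportOpen::Failed(reason) => format!(
            "已生成知乎热榜大屏 {},但浏览器自动打开失败:{}",
            output_path.display(),
            reason
        ),
    }
}
```

The failure branch keeps the generated path in the summary, which is what lets the missing-`presentation.url` regression assert on both the path and the failure text.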
|
||||
|
||||
- [ ] **Step 4: Run the focused compat runtime tests to verify they fail**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open --test compat_runtime_test -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing --test compat_runtime_test -- --exact
|
||||
```
|
||||
|
||||
Expected: FAIL because the workflow still returns artifact-only summaries and has no post-export open handling.
|
||||
|
||||
- [ ] **Step 5: Commit the red workflow tests**
|
||||
|
||||
```bash
|
||||
git add tests/compat_runtime_test.rs
|
||||
git commit -m "test: add hotlist post-export auto-open regressions"
|
||||
```
|
||||
|
||||
### Task 2: Implement the compat post-export opener and update workflow summaries
|
||||
|
||||
**Files:**
|
||||
- Create: `src/compat/artifact_open.rs`
|
||||
- Modify: `src/compat/mod.rs`
|
||||
- Modify: `src/compat/workflow_executor.rs:375-446`
|
||||
- Test: `src/compat/artifact_open.rs`
|
||||
- Test: `tests/compat_runtime_test.rs`
|
||||
|
||||
- [ ] **Step 1: Add the red unit tests in `src/compat/artifact_open.rs` before writing production code**
|
||||
|
||||
Create the new module with a `#[cfg(test)]` block first so the Excel opener has an exact, non-UI verification seam.
|
||||
|
||||
Target tests:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn open_exported_xlsx_with_passes_generated_path_to_launcher() {
|
||||
let mut seen = None;
|
||||
let result = open_exported_xlsx_with(Path::new("C:/tmp/zhihu-hotlist.xlsx"), |path| {
|
||||
seen = Some(path.to_path_buf());
|
||||
Ok(())
|
||||
});
|
||||
assert!(matches!(result, PostExportOpen::Opened));
|
||||
assert_eq!(seen.unwrap(), PathBuf::from("C:/tmp/zhihu-hotlist.xlsx"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn open_exported_xlsx_with_reports_launcher_failure() {
|
||||
let result = open_exported_xlsx_with(Path::new("C:/tmp/zhihu-hotlist.xlsx"), |_path| {
|
||||
Err("launcher failed".to_string())
|
||||
});
|
||||
assert!(matches!(result, PostExportOpen::Failed(reason) if reason.contains("launcher failed")));
|
||||
}
|
||||
```
|
||||
|
||||
Add one matching dashboard payload test in the same file:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn open_local_dashboard_uses_exact_approved_marker_payload() {
|
||||
// FakeBrowserBackend records invoke(action, params, expected_domain)
|
||||
// assert expected_domain == "__sgclaw_local_dashboard__"
|
||||
// assert params.url == params.sgclaw_local_dashboard_open.presentation_url
|
||||
// assert source/kind/output_path all match the approved contract
|
||||
}
|
||||
```
|
||||
|
||||
This step is mandatory so the Excel route is proven to hand the generated path to the opener without launching a real application.
|
||||
|
||||
- [ ] **Step 2: Run the new unit tests to verify they fail**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_passes_generated_path_to_launcher --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_reports_launcher_failure --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_local_dashboard_uses_exact_approved_marker_payload --lib -- --exact
|
||||
```
|
||||
|
||||
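Expected: FAIL because `src/compat/artifact_open.rs` has no production code yet; the tests reference `open_exported_xlsx_with`, `PostExportOpen`, and `open_local_dashboard` before they exist.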
|
||||
|
||||
- [ ] **Step 3: Create the small compat opener module**
|
||||
|
||||
Add one focused helper module rather than embedding side effects directly into `workflow_executor.rs`.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
|
||||
pub const LOCAL_DASHBOARD_EXPECTED_DOMAIN: &str = "__sgclaw_local_dashboard__";
|
||||
pub const LOCAL_DASHBOARD_SOURCE: &str = "compat.workflow_executor";
|
||||
pub const LOCAL_DASHBOARD_KIND_ZHIHU_HOTLIST_SCREEN: &str = "zhihu_hotlist_screen";
|
||||
|
||||
pub enum PostExportOpen {
|
||||
Opened,
|
||||
Failed(String),
|
||||
}
|
||||
|
||||
pub fn open_exported_xlsx(output_path: &Path) -> PostExportOpen {
|
||||
open_exported_xlsx_with(output_path, launch_with_default_xlsx_app)
|
||||
}
|
||||
|
||||
fn open_exported_xlsx_with<F>(output_path: &Path, opener: F) -> PostExportOpen
|
||||
where
|
||||
F: FnOnce(&Path) -> Result<(), String>,
|
||||
{ /* test seam */ }
|
||||
|
||||
pub fn open_local_dashboard(
|
||||
browser_backend: &dyn BrowserBackend,
|
||||
output_path: &Path,
|
||||
presentation_url: &str,
|
||||
) -> PostExportOpen { /* invoke Action::Navigate with exact marker payload */ }
|
||||
```
|
||||
|
||||
Keep the module tiny. The only dedicated test seam in this file should be `open_exported_xlsx_with(...)`; do not introduce a general launcher trait.
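
A minimal sketch of how the `/* test seam */` body could be filled in, assuming the seam only needs to map the launcher's `Result` onto `PostExportOpen`. The `derive` attributes are an addition for easier assertions and are not part of the target shape.

```rust
use std::path::Path;

/// Outcome of the post-export open attempt (same variants as the target shape).
#[derive(Debug, PartialEq)]
pub enum PostExportOpen {
    Opened,
    Failed(String),
}

/// Test seam: the caller injects the launcher, so tests never start a real app.
fn open_exported_xlsx_with<F>(output_path: &Path, opener: F) -> PostExportOpen
where
    F: FnOnce(&Path) -> Result<(), String>,
{
    match opener(output_path) {
        // The launcher accepted the path.
        Ok(()) => PostExportOpen::Opened,
        // Pass the launcher's reason through unchanged so summaries can show it.
        Err(reason) => PostExportOpen::Failed(reason),
    }
}
```

Because the launcher is a plain `FnOnce` argument, the two red tests above can pass a recording closure or a failing closure without any launcher trait.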
|
||||
|
||||
- [ ] **Step 4: Implement the Windows-first `.xlsx` opener minimally**
|
||||
|
||||
Use a focused local launcher that targets the current environment first.
|
||||
|
||||
Preferred target shape:
|
||||
|
||||
```rust
|
||||
Command::new("cmd")
|
||||
.args(["/C", "start", "", output_path_as_windows_string])
|
||||
```
|
||||
|
||||
Requirements:
|
||||
|
||||
```text
|
||||
- fail if the path does not exist
|
||||
- do not swallow command-spawn errors
|
||||
- do not open arbitrary user-selected files from outside this workflow
|
||||
- keep cross-platform behavior minimal; only add a fallback branch if required to keep tests/build portable
|
||||
```
|
||||
|
||||
If you need a non-Windows fallback for compilation, keep it obviously minimal and out of the hot path.
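
One way the launcher could satisfy the requirements above, as a hedged sketch. `spawn_default_app` and the error strings are hypothetical names, and the conversion to a Windows-style path string (`output_path_as_windows_string`) is left out, so the path is passed as-is.

```rust
use std::path::Path;

/// Hypothetical launcher body; the plan only fixes the `cmd /C start` shape.
fn launch_with_default_xlsx_app(output_path: &Path) -> Result<(), String> {
    // Requirement: fail if the path does not exist.
    if !output_path.exists() {
        return Err(format!("exported file not found: {}", output_path.display()));
    }
    spawn_default_app(output_path)
}

#[cfg(windows)]
fn spawn_default_app(output_path: &Path) -> Result<(), String> {
    use std::process::Command;
    // The empty string is the window-title argument `start` expects before the path.
    Command::new("cmd")
        .args(["/C", "start", ""])
        .arg(output_path)
        .spawn()
        .map(|_child| ())
        // Requirement: surface spawn errors instead of swallowing them.
        .map_err(|err| format!("failed to launch default xlsx app: {err}"))
}

#[cfg(not(windows))]
fn spawn_default_app(_output_path: &Path) -> Result<(), String> {
    // Minimal fallback so non-Windows builds compile; it is not a real opener.
    Err("auto-open is only implemented for Windows".to_string())
}
```

Keeping the existence check outside the `cfg` branches means the missing-file failure behaves the same on every platform.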
|
||||
|
||||
- [ ] **Step 5: Parse payloads in `workflow_executor.rs` and call the new helper**
|
||||
|
||||
Refactor `export_xlsx(...)` and `export_screen(...)` just enough to separate:
|
||||
|
||||
```text
|
||||
- tool execution
|
||||
- payload parsing
|
||||
- route-specific post-export open
|
||||
- summary formatting
|
||||
```
|
||||
|
||||
Minimal target behavior:
|
||||
|
||||
```rust
|
||||
match open_exported_xlsx(&output_path) {
|
||||
PostExportOpen::Opened => format!("已导出并打开知乎热榜 Excel {output_path}"),
|
||||
PostExportOpen::Failed(reason) => format!("已导出知乎热榜 Excel {output_path},但自动打开失败:{reason}"),
|
||||
}
|
||||
```
|
||||
|
||||
```rust
|
||||
match open_local_dashboard(browser_backend, &output_path, &presentation_url) {
|
||||
PostExportOpen::Opened => format!("已在浏览器中打开知乎热榜大屏 {output_path}"),
|
||||
PostExportOpen::Failed(reason) => format!("已生成知乎热榜大屏 {output_path},但浏览器自动打开失败:{reason}"),
|
||||
}
|
||||
```
|
||||
|
||||
Change signatures only as much as needed to pass `browser_backend` into the dashboard route. Do not broaden unrelated call chains.
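
The summary-formatting piece can be isolated as a small pure function, sketched below. `xlsx_summary` is a hypothetical name, and the sketch assumes `output_path` is a `Path`, which does not implement `Display`, so it formats through `.display()`.

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
pub enum PostExportOpen {
    Opened,
    Failed(String),
}

/// Hypothetical helper: turns the xlsx open outcome into the task summary text.
fn xlsx_summary(output_path: &Path, outcome: PostExportOpen) -> String {
    // `Path` is not `Display`; `.display()` gives a printable wrapper.
    let shown = output_path.display();
    match outcome {
        PostExportOpen::Opened => format!("已导出并打开知乎热榜 Excel {shown}"),
        PostExportOpen::Failed(reason) => {
            format!("已导出知乎热榜 Excel {shown},但自动打开失败:{reason}")
        }
    }
}
```

A pure formatter like this keeps the export-success wording intact even when the open step fails, which is what the workflow regressions assert on.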
|
||||
|
||||
- [ ] **Step 6: Export the helper module**
|
||||
|
||||
Update `src/compat/mod.rs`:
|
||||
|
||||
```rust
|
||||
pub mod artifact_open;
|
||||
```
|
||||
|
||||
Do not reorder unrelated module exports unless rustfmt does it.
|
||||
|
||||
- [ ] **Step 7: Run the focused library and workflow regressions to verify green**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_passes_generated_path_to_launcher --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_reports_launcher_failure --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_local_dashboard_uses_exact_approved_marker_payload --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open --test compat_runtime_test -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing --test compat_runtime_test -- --exact
|
||||
```
|
||||
|
||||
Expected: PASS for the new library tests and the xlsx workflow regression. The dashboard-open regressions may still fail at backend/policy validation; that is acceptable here and is re-checked in Task 4, Step 5.
|
||||
|
||||
- [ ] **Step 8: Commit the compat opener and workflow changes**
|
||||
|
||||
```bash
|
||||
git add src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs tests/compat_runtime_test.rs
|
||||
git commit -m "feat: auto-open zhihu hotlist export artifacts"
|
||||
```
|
||||
|
||||
### Task 3: Add failing backend and security tests for the narrow local-dashboard allowance
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/browser/callback_backend.rs:536-840`
|
||||
- Modify: `tests/browser_tool_test.rs` (`default_rules_allow_zhihu_navigation` section plus new local-dashboard validation tests)
|
||||
- Reference: `src/security/mac_policy.rs:56-132`
|
||||
|
||||
- [ ] **Step 1: Add a red callback-backend acceptance test for the approved local-dashboard request shape**
|
||||
|
||||
Extend the existing `src/browser/callback_backend.rs` test module with one focused navigate test.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn callback_backend_accepts_approved_local_dashboard_navigate_request() {
|
||||
let host = Arc::new(FakeCallbackHost::new(vec![success_reply(json!({ "navigated": true }))]));
|
||||
let backend = BrowserCallbackBackend::new(
|
||||
host.clone(),
|
||||
test_policy(),
|
||||
"http://127.0.0.1:17888/sgclaw/browser-helper.html",
|
||||
);
|
||||
|
||||
let output = backend.invoke(
|
||||
Action::Navigate,
|
||||
json!({
|
||||
"url": "file:///C:/tmp/zhihu-hotlist-screen.html",
|
||||
"sgclaw_local_dashboard_open": {
|
||||
"source": "compat.workflow_executor",
|
||||
"kind": "zhihu_hotlist_screen",
|
||||
"output_path": "C:/tmp/zhihu-hotlist-screen.html",
|
||||
"presentation_url": "file:///C:/tmp/zhihu-hotlist-screen.html"
|
||||
}
|
||||
}),
|
||||
"__sgclaw_local_dashboard__",
|
||||
);
|
||||
|
||||
assert!(output.unwrap().success);
|
||||
assert_eq!(host.requests()[0].command, json!([
|
||||
"http://127.0.0.1:17888/sgclaw/browser-helper.html",
|
||||
"sgBrowerserOpenPage",
|
||||
"file:///C:/tmp/zhihu-hotlist-screen.html"
|
||||
]));
|
||||
}
|
||||
```
|
||||
|
||||
Do not weaken any existing normal-domain tests.
|
||||
|
||||
- [ ] **Step 2: Add red rejection tests in exact files**
|
||||
|
||||
Put malformed-request rejection in `src/browser/callback_backend.rs` next to the acceptance test:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields() {}
|
||||
```
|
||||
|
||||
Put policy-only validation in `tests/browser_tool_test.rs` so all public `MacPolicy` assertions stay in one place:
|
||||
|
||||
```rust
|
||||
#[test]
|
||||
fn mac_policy_rejects_non_html_local_dashboard_presentation() {}
|
||||
|
||||
#[test]
|
||||
fn default_rules_allow_zhihu_navigation() {
|
||||
let policy = MacPolicy::load_from_path(...).unwrap();
|
||||
policy.validate(&Action::Navigate, "www.zhihu.com").unwrap();
|
||||
}
|
||||
```
|
||||
|
||||
Do not create a second `MacPolicy` regression location.
|
||||
|
||||
- [ ] **Step 3: Run the focused backend/policy tests to verify red**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_accepts_approved_local_dashboard_navigate_request --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" mac_policy_rejects_non_html_local_dashboard_presentation --test browser_tool_test -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" default_rules_allow_zhihu_navigation --test browser_tool_test -- --exact
|
||||
```
|
||||
|
||||
Expected: the new local-dashboard tests FAIL; `default_rules_allow_zhihu_navigation` should still PASS.
|
||||
|
||||
- [ ] **Step 4: Commit the red backend/security tests**
|
||||
|
||||
```bash
|
||||
git add src/browser/callback_backend.rs tests/browser_tool_test.rs
|
||||
git commit -m "test: lock local dashboard navigate boundary"
|
||||
```
|
||||
|
||||
### Task 4: Implement the narrow callback-backend and MacPolicy allowance
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/browser/callback_backend.rs:300-351`
|
||||
- Modify: `src/security/mac_policy.rs:56-132`
|
||||
- Maybe modify: `src/security/mod.rs:9-27`
|
||||
- Test: `src/browser/callback_backend.rs:536-840`
|
||||
- Test: `tests/browser_tool_test.rs` (`default_rules_allow_zhihu_navigation` section plus new local-dashboard validation tests)
|
||||
|
||||
- [ ] **Step 1: Add a narrow local-dashboard validation helper in `MacPolicy`**
|
||||
|
||||
Keep `validate(...)` unchanged for ordinary domain flow. Add one small explicit helper instead.
|
||||
|
||||
Target shape:
|
||||
|
||||
```rust
|
||||
pub fn validate_local_dashboard_presentation(
|
||||
&self,
|
||||
action: &Action,
|
||||
expected_domain: &str,
|
||||
presentation_url: &str,
|
||||
output_path: &str,
|
||||
) -> Result<(), SecurityError> {
|
||||
// require Action::Navigate
|
||||
// require expected_domain == "__sgclaw_local_dashboard__"
|
||||
// require file:// URL
|
||||
// require .html path
|
||||
// require normalized file URL path matches output_path
|
||||
}
|
||||
```
|
||||
|
||||
If you need a new `SecurityError` variant for malformed local-dashboard input, add the smallest one that keeps error text clear.
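
The five checks listed in the target shape could look roughly like the sketch below. It is written as a free function with stand-in `Action` and `SecurityError` types (the real ones live in the crate), the `InvalidLocalDashboard` variant is hypothetical, and the path comparison only normalizes separators and the leading slash; it does not decode percent-encoding.

```rust
/// Stand-ins for the crate's real `Action` and `SecurityError` types.
#[derive(Debug, PartialEq)]
pub enum Action {
    Navigate,
    Click,
}

#[derive(Debug, PartialEq)]
pub enum SecurityError {
    InvalidLocalDashboard(String),
}

const LOCAL_DASHBOARD_EXPECTED_DOMAIN: &str = "__sgclaw_local_dashboard__";

fn reject(reason: &str) -> Result<(), SecurityError> {
    Err(SecurityError::InvalidLocalDashboard(reason.to_string()))
}

/// Makes `C:\tmp\a.html` and `/C:/tmp/a.html` compare equal.
fn normalize(path: &str) -> String {
    path.replace('\\', "/").trim_start_matches('/').to_string()
}

pub fn validate_local_dashboard_presentation(
    action: &Action,
    expected_domain: &str,
    presentation_url: &str,
    output_path: &str,
) -> Result<(), SecurityError> {
    if *action != Action::Navigate {
        return reject("only Action::Navigate may open a local dashboard");
    }
    if expected_domain != LOCAL_DASHBOARD_EXPECTED_DOMAIN {
        return reject("unexpected domain marker");
    }
    let url_path = match presentation_url.strip_prefix("file://") {
        Some(rest) => rest,
        None => return reject("presentation_url must be a file:// URL"),
    };
    if !url_path.to_ascii_lowercase().ends_with(".html") {
        return reject("presentation_url must point at an .html file");
    }
    if normalize(url_path) != normalize(output_path) {
        return reject("presentation_url does not match output_path");
    }
    Ok(())
}
```

A production version would probably parse the URL with a proper URL parser instead of `strip_prefix`; the string handling here only illustrates the order of the checks.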
|
||||
|
||||
- [ ] **Step 2: Recognize only the exact approved request shape in `BrowserCallbackBackend::invoke(...)`**
|
||||
|
||||
Before the normal `self.mac_policy.validate(&action, expected_domain)?` path runs, detect the one approved special case.
|
||||
|
||||
Minimal target behavior:
|
||||
|
||||
```rust
|
||||
if let Some(local_dashboard) = approved_local_dashboard_request(&action, &params, expected_domain) {
|
||||
self.mac_policy.validate_local_dashboard_presentation(
|
||||
&action,
|
||||
expected_domain,
|
||||
&local_dashboard.presentation_url,
|
||||
&local_dashboard.output_path,
|
||||
)?;
|
||||
} else {
|
||||
self.mac_policy.validate(&action, expected_domain)?;
|
||||
}
|
||||
```
|
||||
|
||||
The helper should require all of these fields exactly:
|
||||
|
||||
```text
|
||||
- action == Action::Navigate
|
||||
- expected_domain == "__sgclaw_local_dashboard__"
|
||||
- params.url exists
|
||||
- params.sgclaw_local_dashboard_open.source == "compat.workflow_executor"
|
||||
- params.sgclaw_local_dashboard_open.kind == "zhihu_hotlist_screen"
|
||||
- params.sgclaw_local_dashboard_open.output_path exists
|
||||
- params.sgclaw_local_dashboard_open.presentation_url exists and equals params.url
|
||||
```
|
||||
|
||||
Anything else must continue down the normal rejection path.
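
The field checklist above can be sketched as a single recognizer function. To stay dependency-free, this sketch models the params with plain structs (`NavigateParams` and `LocalDashboardMarker` are hypothetical) rather than the JSON value the real backend receives; "exists" is modelled as `Some(..)` or a non-empty string.

```rust
/// Hypothetical typed view of the marker object inside the navigate params.
#[derive(Debug, Clone, PartialEq)]
pub struct LocalDashboardMarker {
    pub source: String,
    pub kind: String,
    pub output_path: String,
    pub presentation_url: String,
}

/// Hypothetical typed view of the navigate params.
#[derive(Debug, Clone, PartialEq)]
pub struct NavigateParams {
    pub url: Option<String>,
    pub sgclaw_local_dashboard_open: Option<LocalDashboardMarker>,
}

/// Stand-in for the crate's real `Action` type.
#[derive(Debug, PartialEq)]
pub enum Action {
    Navigate,
    Click,
}

/// Returns the marker only when every approved field matches exactly.
pub fn approved_local_dashboard_request<'a>(
    action: &Action,
    params: &'a NavigateParams,
    expected_domain: &str,
) -> Option<&'a LocalDashboardMarker> {
    if *action != Action::Navigate || expected_domain != "__sgclaw_local_dashboard__" {
        return None;
    }
    let url = params.url.as_deref()?;
    let marker = params.sgclaw_local_dashboard_open.as_ref()?;
    let approved = marker.source == "compat.workflow_executor"
        && marker.kind == "zhihu_hotlist_screen"
        && !marker.output_path.is_empty()
        && marker.presentation_url == url;
    if approved {
        Some(marker)
    } else {
        None
    }
}
```

Returning `None` for every mismatch sends a malformed request into the ordinary `validate(...)` branch, where the marker domain is rejected like any other unknown domain.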
|
||||
|
||||
- [ ] **Step 3: Keep `build_command(Action::Navigate, ...)` simple**
|
||||
|
||||
Do not add a second browser opcode or change the callback-host runtime contract. The approved local-dashboard case should still flow into the existing navigate command builder so the emitted command stays:
|
||||
|
||||
```rust
|
||||
json!([
|
||||
self.helper_page_url,
|
||||
"sgBrowerserOpenPage",
|
||||
target_url,
|
||||
])
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Run the focused backend/security tests to verify green**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_accepts_approved_local_dashboard_navigate_request --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" mac_policy_rejects_non_html_local_dashboard_presentation --test browser_tool_test -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" default_rules_allow_zhihu_navigation --test browser_tool_test -- --exact
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 5: Re-run the dashboard workflow regression after backend validation lands**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 6: Commit the backend/security implementation**
|
||||
|
||||
```bash
|
||||
git add src/browser/callback_backend.rs src/security/mac_policy.rs src/security/mod.rs tests/browser_tool_test.rs tests/compat_runtime_test.rs
|
||||
git commit -m "fix: allow approved local dashboard auto-open"
|
||||
```
|
||||
|
||||
If `src/security/mod.rs` did not change, omit it from the commit.
|
||||
|
||||
### Task 5: Run the focused verification sweep
|
||||
|
||||
**Files:**
|
||||
- Verify: `src/compat/artifact_open.rs`
|
||||
- Verify: `tests/compat_runtime_test.rs`
|
||||
- Verify: `tests/compat_screen_html_export_tool_test.rs`
|
||||
- Verify: `tests/browser_tool_test.rs`
|
||||
- Verify: `src/browser/callback_backend.rs` test module
|
||||
- Reference only if summary wording ripples outward: `tests/agent_runtime_test.rs:173-258`
|
||||
- Reference only if summary wording ripples outward: `tests/service_task_flow_test.rs:704-839`
|
||||
- Reference only if summary wording ripples outward: `tests/service_ws_session_test.rs:755-869`
|
||||
|
||||
- [ ] **Step 1: Re-run the library and workflow regressions**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_passes_generated_path_to_launcher --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_reports_launcher_failure --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_local_dashboard_uses_exact_approved_marker_payload --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open --test compat_runtime_test -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing --test compat_runtime_test -- --exact
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 2: Re-run the tool contract regression that the dashboard route depends on**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" screen_html_export_tool_renders_dashboard_html_with_presentation_contract --test compat_screen_html_export_tool_test -- --exact
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 3: Re-run the callback-backend and policy boundary tests**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_accepts_approved_local_dashboard_navigate_request --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields --lib -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" mac_policy_rejects_non_html_local_dashboard_presentation --test browser_tool_test -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" default_rules_allow_zhihu_navigation --test browser_tool_test -- --exact
|
||||
```
|
||||
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 4: Re-run outward-facing summary regressions only if needed**
|
||||
|
||||
Only if the updated summary text breaks existing assertions, run exactly these existing regressions and adjust only the affected expectation text:
|
||||
|
||||
```bash
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" client_to_service_regression_routes_zhihu_without_helper_bootstrap_or_invalid_hmac_seed_output --test service_task_flow_test -- --exact
|
||||
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_binary_submit_flow_routes_zhihu_without_helper_bootstrap --test service_ws_session_test -- --exact
|
||||
```
|
||||
|
||||
Expected: PASS for any test you had to touch. Skip this step entirely if those files needed no edits.
|
||||
|
||||
- [ ] **Step 5: Inspect scope before finishing with exact git commands**
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
git diff --name-only -- src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs src/browser/callback_backend.rs src/security/mac_policy.rs src/security/mod.rs tests/compat_runtime_test.rs tests/browser_tool_test.rs tests/agent_runtime_test.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs
|
||||
git diff --stat -- src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs src/browser/callback_backend.rs src/security/mac_policy.rs src/security/mod.rs tests/compat_runtime_test.rs tests/browser_tool_test.rs tests/agent_runtime_test.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs
|
||||
```
|
||||
|
||||
Confirm the diff only touches:
|
||||
|
||||
```text
|
||||
- compat workflow/orchestration
|
||||
- compat post-export helper module
|
||||
- callback backend narrow local-dashboard acceptance
|
||||
- MacPolicy narrow local-dashboard validation
|
||||
- focused related tests
|
||||
```
|
||||
|
||||
Confirm it does **not** touch:
|
||||
|
||||
```text
|
||||
- frontend/service-console/
|
||||
- src/service/protocol.rs
|
||||
- browser-helper.html
|
||||
- callback-host endpoint contracts
|
||||
- websocket transport/protocol files
|
||||
```
|
||||
|
||||
- [ ] **Step 6: Commit only if verification required additional code changes**
|
||||
|
||||
```bash
|
||||
git add src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs src/browser/callback_backend.rs src/security/mac_policy.rs tests/compat_runtime_test.rs tests/browser_tool_test.rs tests/agent_runtime_test.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs
|
||||
git commit -m "test: tighten hotlist post-export auto-open verification"
|
||||
```
|
||||
|
||||
If verification required no further code changes, do not create an extra commit.
|
||||