feat: align browser callback runtime and export flows

Consolidate the browser task runtime around the callback path, add safer artifact opening for Zhihu exports, and cover the new service/browser flows with focused tests and supporting docs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
木炎
2026-04-06 21:44:53 +08:00
parent 0dd655712c
commit bdf8e12246
55 changed files with 14440 additions and 1053 deletions

@@ -0,0 +1,425 @@
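# Claw-WS Development Execution Order Cards
> Companion plan: [`2026-04-01-claw-ws-parallel-transport.md`](./2026-04-01-claw-ws-parallel-transport.md)
>
> How to use: Work through the cards strictly in order. After finishing each card, run that card's tests before moving to the next one. Do not skip cards, do not wire things up early, and do not write the service/client first and then go back to abstract the lower layers.
---
## Card 0: Pre-execution constraints
**Goal**
Lock down the boundaries first so that pipe mode does not get broken during implementation.
**Must follow**
- The existing pipe mode must remain usable
- What is being added is a parallel `claw-ws` mode, not a replacement for pipe
- v1 supports only a single client running a single task serially
- The external names `browser_action` / `superrpa_browser` stay stable
- If WS `Eval` is incomplete, disable exposure of the related browser-script skills first
- Do not build multi-client support, task queues, or admin endpoints ahead of time
**Completion criteria**
- The developer understands that every subsequent change revolves around "abstract for reuse + add in parallel"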
---
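## Card 1: Extract the shared SubmitTask runner
**Goal**
Pull the main execution logic of the current `BrowserMessage::SubmitTask` out of the pipe entrypoint and turn it into a shared executor.
**Do first**
1. Add `tests/task_runner_test.rs`
2. Write the failing cases first:
   - empty instruction
   - no LLM config
   - log order is still `LogEntry` -> `TaskComplete`
**Files to change**
- `src/agent/mod.rs`
- `src/lib.rs`
- `src/agent/task_runner.rs`
- `tests/task_runner_test.rs`
**Implementation steps**
1. Define `SubmitTaskRequest`
2. Define `AgentEventSink`
3. Extract `run_submit_task(...)`
4. Reduce the pipe entrypoint to only:
   - unpack `BrowserMessage::SubmitTask`
   - convert it into `SubmitTaskRequest`
   - call the shared runner
**Never do**
- Do not introduce the ws backend in this card
- Do not change the tool adapter
- Do not touch the service/client
**Test command for this card**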
```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```
**Pass criteria**
- The existing `runtime_task_flow_test` stays green
- The new `task_runner_test` passes
- No change in pipe behavior
**Commit when done**
```bash
git commit -m "refactor: extract shared submit-task runner"
```
---
## Card 2: Extract the BrowserBackend abstraction
**Goal**
Decouple the upper-level runtime / orchestration / tool adapter from `BrowserPipeTool<T>` so that they all depend on a single browser backend interface.
**Do first**
1. Add `tests/browser_backend_capability_test.rs`
2. Write the failing cases first:
   - pipe backend metadata is unchanged
   - pipe backend supports `Eval`
   - browser-script tools are not exposed when `supports_eval() == false`
**Files to change**
- `src/browser/mod.rs`
- `src/browser/backend.rs`
- `src/browser/pipe_backend.rs`
- `src/compat/browser_tool_adapter.rs`
- `src/compat/browser_script_skill_tool.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
- `src/lib.rs`
- `tests/browser_backend_capability_test.rs`
**Implementation steps**
1. Define `BrowserBackend`
2. Wrap the current `BrowserPipeTool` in `pipe_backend`
3. Change the upper-level signatures to `Arc<dyn BrowserBackend>`
4. Keep the tool names unchanged:
   - `browser_action`
   - `superrpa_browser`
5. Add `supports_eval()` gating
**Never do**
- Do not hook up the browser ws protocol in this card
- Do not build the service
- Do not add the client protocol
**Test command for this card**
```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```
**Pass criteria**
- Existing browser tool tests do not regress
- The new capability test passes
- Upper-level logic is no longer hard-coupled to `BrowserPipeTool<T>`
**Commit when done**
```bash
git commit -m "refactor: abstract browser backend from pipe transport"
```
---
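## Card 3: Pin down the fixed browser WS protocol codec
**Goal**
Build the codec layer for the browser's fixed WebSocket protocol on its own, so that protocol details do not leak into the backend and service.
**Do first**
1. Add `tests/browser_ws_protocol_test.rs`
2. Write the failing cases first:
   - exact outbound frame encoding
   - callback payload parsing
   - rejection of malformed input
   - full coverage of the v1 actions
**Files to change**
- `src/browser/ws_protocol.rs`
- `tests/browser_ws_protocol_test.rs`
**Implementation steps**
1. Encode array messages as specified in the browser documentation
2. Support only the actions v1 requires:
   - `Navigate`
   - `GetText`
   - `Click`
   - `Type`
   - `Eval`
3. Define the callback parsing and correlation rules
4. Fail early on unsupported / malformed input
**Never do**
- Do not connect to a real browser in this card
- Do not write the service protocol
- Do not put network connection logic into the codec
**Test command for this card**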
```bash
cargo test --test browser_ws_protocol_test
```
**Pass criteria**
- All codec unit tests are green
- No network dependency
- Usable as the pure protocol layer underneath the backend
**Commit when done**
```bash
git commit -m "test: codify fixed browser websocket protocol"
```
---
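## Card 4: Implement the browser WS backend
**Goal**
On top of the codec, provide a blocking `invoke(...)` capability similar to the pipe backend's.
**Do first**
1. Add `tests/browser_ws_backend_test.rs`
2. Write the failing cases first:
   - `0` + no callback succeeds
   - non-`0` fails
   - `0` + callback succeeds
   - callback timeout
   - socket drop
**Files to change**
- `src/browser/mod.rs`
- `src/browser/ws_backend.rs`
- `tests/browser_ws_backend_test.rs`
**Implementation steps**
1. Build a long-lived connection manager
2. Start with serialized, single-flight requests
3. Run `MacPolicy` before sending
4. Unify immediate returns and callback returns
5. Emit a uniform `CommandOutput`
**Never do**
- Do not connect the service layer in this card
- Do not support multiple concurrent browser requests
- Do not scatter browser ws code directly into the runtime
**Test command for this card**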
```bash
cargo test --test browser_ws_backend_test
```
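**Pass criteria**
- The backend passes reliably under mocks/fakes
- `invoke` semantics are close to the pipe backend's
- The upper-level runtime can use it as a drop-in replacement
**Commit when done**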
```bash
git commit -m "feat: add browser websocket backend"
```
---
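## Card 5: Implement the sg_claw service
**Goal**
Add a long-running local server that accepts client requests and reuses the shared task runner.
**Do first**
1. Add `tests/service_ws_session_test.rs`
2. Write the failing cases first:
   - the first client attaches successfully
   - the second client receives busy
   - state is released after disconnect
   - task re-entry is rejected
**Files to change**
- `src/service/mod.rs`
- `src/service/protocol.rs`
- `src/service/server.rs`
- `src/bin/sg_claw.rs`
- `src/lib.rs`
- `Cargo.toml`
- `tests/service_ws_session_test.rs`
**Implementation steps**
1. Define the client/service protocol
2. Implement the service-side event sink
3. Build the single-active-session state machine:
   - `Idle`
   - `ClientAttached`
   - `TaskRunning`
4. Route `SubmitTask` to the shared runner
5. Keep the pipe entrypoint unchanged
**Never do**
- Do not polish the client interaction experience in this card
- Do not add a task queue
- Do not support concurrent clients
**Test command for this card**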
```bash
cargo test --test service_ws_session_test
```
**Pass criteria**
- The server-side session lock works
- The service can reuse the shared runner
- The pipe-mode entrypoint is unaffected
**Commit when done**
```bash
git commit -m "feat: add claw-ws service entrypoint"
```
---
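## Card 6: Implement sg_claw_client
**Goal**
Add a thin client that offers an interactive command-line experience similar to `claude/codex`.
**Do first**
1. Add `tests/service_task_flow_test.rs`
2. Write the failing cases first:
   - submit-task reaches the service
   - logs stream back in order
   - completion arrives exactly once
   - disconnect after completion is handled cleanly
**Files to change**
- `src/bin/sg_claw_client.rs`
- `Cargo.toml`
- `tests/service_task_flow_test.rs`
**Implementation steps**
1. Open a ws connection to the local service
2. Read user input
3. Send `SubmitTask`
4. Print logs in real time
5. End the turn on receiving `TaskComplete`
**Never do**
- Do not copy the runtime, skills, or browser backend into the client
- Do not let the client connect to the browser directly
- Do not let the client carry business logic
**Test command for this card**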
```bash
cargo test --test service_task_flow_test
cargo build --bin sg_claw --bin sg_claw_client
```
**Pass criteria**
- The client is a thin shell
- The task flow works
- Both new binaries compile
**Commit when done**
```bash
git commit -m "feat: add interactive claw-ws client"
```
---
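## Card 7: Final wiring and regression verification
**Goal**
Connect the ws path end to end while confirming zero regressions on the pipe path.
**Do first**
1. Add only the minimum config options:
   - `browser_ws_url`
   - `service_ws_listen_addr`
2. Check that external tool names stay stable
**Files to change**
- `Cargo.toml`
- `src/lib.rs`
- `src/config/settings.rs`
- `src/runtime/engine.rs` (only if truly necessary)
**Implementation steps**
1. Hook up the minimal config surface
2. Make sure tool names are identical under pipe / ws
3. Run the old pipe regressions
4. Run the new ws tests
5. Run the full Rust test suite
6. Build all binaries
7. Do one real local smoke test
**Pipe regression command for this card**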
```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test --test pipe_handshake_test --test pipe_protocol_test --test task_protocol_test
```
**WS test command for this card**
```bash
cargo test --test task_runner_test --test browser_ws_protocol_test --test browser_ws_backend_test --test browser_backend_capability_test --test service_ws_session_test --test service_task_flow_test
```
**Full-suite commands for this card**
```bash
cargo test --tests
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
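**Manual verification**
1. Start the browser and confirm `ws://127.0.0.1:12345` is reachable
2. `cargo run --bin sg_claw`
3. In a new terminal, run `cargo run --bin sg_claw_client`
4. Submit a simple browser task
5. Confirm the log stream and a single completion
6. Confirm the old `cargo run` pipe entrypoint still starts
**Pass criteria**
- Zero regressions in pipe mode
- ws mode works on its own
- Both modes coexist
**Commit when done**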
```bash
git commit -m "feat: wire parallel claw-ws transport"
```
---
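## Execution order at a glance
Follow this order strictly:
1. Shared runner
2. Browser backend abstraction
3. WS protocol codec
4. WS backend
5. Service
6. Client
7. Config wiring + regression
If the order gets mixed up, the most likely problems are:
- Duplicated implementations in the upper layers
- Accidental damage to pipe mode
- WS protocol details spreading across the whole project
- Service/client written early and then forced into a refactor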

@@ -0,0 +1,687 @@
# Claw-WS Parallel Transport Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add a parallel `claw-ws` transport path that keeps the current pipe mode intact while introducing a long-lived `sg_claw` local service, an interactive `sg_claw_client`, and a browser WebSocket backend at `ws://127.0.0.1:12345`.
**Architecture:** First extract a transport-agnostic submit-task runner and browser backend abstraction from the current pipe-coupled flow. Keep the existing pipe path as one adapter/backend, then add a fixed-protocol browser WebSocket backend plus a small service/session layer and an interactive CLI client that reuse the same runtime, orchestration, and browser-facing tool adapters.
**Tech Stack:** Rust 2021, current sgclaw compat runtime, zeroclaw runtime engine, `serde`/`serde_json`, existing `MacPolicy`, and a blocking WebSocket crate for v1 (`tungstenite` preferred over a broad async rewrite).
---
## Scope Guardrails
- Keep the current pipe mode entrypoint and behavior working.
- Do **not** replace the existing browser pipe path.
- Add a **parallel** WebSocket path only.
- v1 supports **one active client session** only.
- Reuse existing tool names and runtime behavior whenever possible.
- If WS `Eval` support is incomplete, disable eval-dependent browser-script skill exposure in WS mode rather than shipping partial behavior.
- Do not broaden v1 with task queues, multi-client support, or admin endpoints.
---
## File Structure
### Existing files to reuse
- Modify: `src/lib.rs` — current pipe bootstrap and receive loop; keep as the legacy pipe entrypoint.
- Modify: `src/agent/mod.rs` — current `BrowserMessage::SubmitTask` entrypoint and config-loading flow.
- Modify: `src/compat/runtime.rs` — compat runtime and tool assembly.
- Modify: `src/compat/orchestration.rs` — direct workflow vs compat runtime routing.
- Modify: `src/compat/browser_tool_adapter.rs` — exposes `browser_action` and `superrpa_browser`.
- Modify: `src/compat/browser_script_skill_tool.rs` — browser-script skill execution.
- Modify: `src/compat/workflow_executor.rs` — direct browser workflows such as Zhihu flows.
- Reuse: `src/pipe/browser_tool.rs` — current browser command executor; retain as the pipe backend implementation.
- Reuse: `src/pipe/protocol.rs` for `BrowserMessage`, `AgentMessage`, `Action`, `ExecutionSurfaceMetadata`.
- Reuse: `src/security/mac_policy.rs` — local action/domain guardrails.
- Modify: `src/config/settings.rs` — minimal new config surface for WS mode.
- Optional modify: `src/runtime/engine.rs` — only if backend capability wiring requires it.
### New files to create
- Create: `src/agent/task_runner.rs` — shared submit-task execution entrypoint.
- Create: `src/browser/mod.rs` — browser backend exports.
- Create: `src/browser/backend.rs` for the `BrowserBackend` trait and helpers.
- Create: `src/browser/pipe_backend.rs` — wrapper around existing `BrowserPipeTool`.
- Create: `src/browser/ws_protocol.rs` — fixed browser WS request/response codec.
- Create: `src/browser/ws_backend.rs` — browser WS backend with blocking invoke semantics.
- Create: `src/service/mod.rs` — service exports.
- Create: `src/service/protocol.rs` — client/service WS message types.
- Create: `src/service/server.rs` — single-session `sg_claw` server.
- Create: `src/bin/sg_claw.rs` — service binary.
- Create: `src/bin/sg_claw_client.rs` — interactive CLI client.
- Create: `tests/task_runner_test.rs` — shared submit-task runner regressions.
- Create: `tests/browser_backend_capability_test.rs` — backend capability/tool exposure tests.
- Create: `tests/browser_ws_protocol_test.rs` — browser WS protocol tests.
- Create: `tests/browser_ws_backend_test.rs` — browser WS backend tests.
- Create: `tests/service_ws_session_test.rs` — single-session server tests.
- Create: `tests/service_task_flow_test.rs` — client/service task flow tests.
---
## Task 1: Extract a shared submit-task runner
**Files:**
- Create: `src/agent/task_runner.rs`
- Modify: `src/agent/mod.rs`
- Modify: `src/lib.rs`
- Test: `tests/task_runner_test.rs`
- Reuse: `src/compat/runtime.rs`, `src/compat/orchestration.rs`
- [ ] **Step 1: Write a failing runner regression test**
Create `tests/task_runner_test.rs` covering:
- empty instruction returns the same `TaskComplete` failure summary
- missing LLM config still returns the same summary shape
- the pipe adapter still emits `LogEntry` before `TaskComplete`
- [ ] **Step 2: Run the targeted regression tests first**
Run:
```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```
Expected: `task_runner_test` fails because the shared runner does not exist yet.
- [ ] **Step 3: Define the transport-neutral request model**
Create `src/agent/task_runner.rs` with a request struct that mirrors the current pipe payload:
```rust
pub struct SubmitTaskRequest {
    pub instruction: String,
    pub conversation_id: Option<String>,
    pub messages: Vec<ConversationMessage>,
    pub page_url: Option<String>,
    pub page_title: Option<String>,
}
```
Normalize empty strings to `None` at the adapter boundary.
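A minimal sketch of that normalization, assuming a hypothetical helper named `none_if_blank` (the plan does not name one) and assuming that whitespace-only strings should also count as empty:

```rust
/// Hypothetical adapter-boundary helper: maps empty (or whitespace-only)
/// strings to `None` so the shared runner never sees `Some("")`.
/// Treating whitespace-only input as empty is an assumption of this sketch.
pub fn none_if_blank(value: String) -> Option<String> {
    if value.trim().is_empty() {
        None
    } else {
        Some(value)
    }
}
```

The pipe adapter could apply this to `conversation_id`, `page_url`, and `page_title` while building `SubmitTaskRequest`, which keeps the runner itself free of transport-specific quirks.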
- [ ] **Step 4: Define an event sink abstraction**
Add a small trait that can emit the current agent events without depending on a specific transport:
```rust
pub trait AgentEventSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError>;
}
```
The existing pipe transport should implement this first.
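For the ordering regression in Step 1, an in-memory sink may be enough. The sketch below is illustrative only: `AgentMessage` and `PipeError` are simplified stand-ins for the crate's real types (the field names are assumed), and `RecordingSink` and `completes_exactly_once_at_end` are hypothetical test helpers.

```rust
use std::sync::Mutex;

/// Stand-in for the crate's real `AgentMessage`; field names are assumed.
#[derive(Debug, Clone, PartialEq)]
pub enum AgentMessage {
    LogEntry { level: String, message: String },
    TaskComplete { success: bool, summary: String },
}

/// Stand-in for the crate's real `PipeError`.
#[derive(Debug)]
pub struct PipeError(pub String);

pub trait AgentEventSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError>;
}

/// Hypothetical test sink that records every message in arrival order.
#[derive(Default)]
pub struct RecordingSink {
    messages: Mutex<Vec<AgentMessage>>,
}

impl RecordingSink {
    /// Returns a snapshot of everything sent so far.
    pub fn recorded(&self) -> Vec<AgentMessage> {
        self.messages.lock().unwrap().clone()
    }
}

impl AgentEventSink for RecordingSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError> {
        self.messages
            .lock()
            .map_err(|_| PipeError("sink mutex poisoned".to_string()))?
            .push(message.clone());
        Ok(())
    }
}

/// True when there is exactly one `TaskComplete` and it is the last message.
pub fn completes_exactly_once_at_end(messages: &[AgentMessage]) -> bool {
    let completions = messages
        .iter()
        .filter(|m| matches!(m, AgentMessage::TaskComplete { .. }))
        .count();
    completions == 1 && matches!(messages.last(), Some(AgentMessage::TaskComplete { .. }))
}
```

A test can hand a `RecordingSink` to the runner and then assert on `recorded()`, which avoids any dependency on the pipe or a socket.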
- [ ] **Step 5: Move submit-task execution into a shared function**
Extract the body currently inside `BrowserMessage::SubmitTask` handling from `src/agent/mod.rs` into a shared function such as:
```rust
pub fn run_submit_task(
    sink: &dyn AgentEventSink,
    browser_backend: Arc<dyn BrowserBackend>,
    context: &AgentRuntimeContext,
    request: SubmitTaskRequest,
) -> Result<(), PipeError>
This function must still:
- validate empty instruction
- load sgclaw settings
- log runtime/config info
- choose orchestration vs compat runtime
- emit `AgentMessage::TaskComplete`
- [ ] **Step 6: Keep pipe mode as a thin adapter**
Refactor `handle_browser_message_with_context(...)` in `src/agent/mod.rs` so it only:
- pattern matches `BrowserMessage`
- converts `SubmitTask` into `SubmitTaskRequest`
- forwards into `run_submit_task(...)`
- [ ] **Step 7: Re-run the runner regressions**
Run:
```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```
Expected: both tests pass and pipe behavior remains unchanged.
- [ ] **Step 8: Commit**
```bash
git add src/agent/mod.rs src/agent/task_runner.rs src/lib.rs tests/task_runner_test.rs
git commit -m "refactor: extract shared submit-task runner"
```
---
## Task 2: Introduce a browser backend abstraction and wrap the current pipe implementation
**Files:**
- Create: `src/browser/mod.rs`
- Create: `src/browser/backend.rs`
- Create: `src/browser/pipe_backend.rs`
- Modify: `src/lib.rs`
- Modify: `src/compat/browser_tool_adapter.rs`
- Modify: `src/compat/browser_script_skill_tool.rs`
- Modify: `src/compat/runtime.rs`
- Modify: `src/compat/orchestration.rs`
- Modify: `src/compat/workflow_executor.rs`
- Test: `tests/browser_backend_capability_test.rs`
- Reuse: `src/pipe/browser_tool.rs`, `src/security/mac_policy.rs`
- [ ] **Step 1: Add a failing backend capability test**
Create `tests/browser_backend_capability_test.rs` to verify:
- pipe backend still exposes privileged surface metadata
- pipe backend still supports `Eval`
- browser-script tool exposure is disabled when `supports_eval()` is false
- [ ] **Step 2: Run the current browser adapter tests first**
Run:
```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```
Expected: new capability test fails because the backend abstraction does not exist yet.
- [ ] **Step 3: Define the shared browser interface**
Create `src/browser/backend.rs`:
```rust
pub trait BrowserBackend: Send + Sync {
    fn invoke(
        &self,
        action: Action,
        params: Value,
        expected_domain: &str,
    ) -> Result<CommandOutput, PipeError>;
    fn surface_metadata(&self) -> ExecutionSurfaceMetadata;
    fn supports_eval(&self) -> bool {
        true
    }
}
```
- [ ] **Step 4: Implement the pipe backend as a wrapper**
Create `src/browser/pipe_backend.rs` that stores the current `BrowserPipeTool<T>` and forwards `invoke(...)` and `surface_metadata()` unchanged.
Pipe mode must continue using:
- `perform_handshake(...)`
- `MacPolicy::load_from_path(...)`
- `BrowserPipeTool::new(...).with_response_timeout(...)`
- [ ] **Step 5: Refactor runtime and tool adapters to depend on `Arc<dyn BrowserBackend>`**
Update:
- `src/compat/browser_tool_adapter.rs`
- `src/compat/browser_script_skill_tool.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
Preserve external tool names:
- `browser_action`
- `superrpa_browser`
- [ ] **Step 6: Add capability gating for eval-dependent script tools**
If `supports_eval()` is false, do **not** expose browser-script skill tools from `build_browser_script_skill_tools(...)` in that backend mode.
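The gating rule can be sketched roughly as follows. This is not the real `build_browser_script_skill_tools(...)`: the trait is trimmed down to the capability flag, and `PipeLike`, `WsWithoutEval`, and `exposed_tool_names` are hypothetical names used only for illustration.

```rust
/// Trimmed-down stand-in for the plan's `BrowserBackend`; only the
/// capability flag matters for this sketch.
pub trait BrowserBackend: Send + Sync {
    fn supports_eval(&self) -> bool {
        true
    }
}

/// Hypothetical backend that keeps the default (eval supported).
pub struct PipeLike;
impl BrowserBackend for PipeLike {}

/// Hypothetical backend whose `Eval` support is incomplete.
pub struct WsWithoutEval;
impl BrowserBackend for WsWithoutEval {
    fn supports_eval(&self) -> bool {
        false
    }
}

/// Hypothetical tool-list assembly: the two stable tool names are always
/// present, while script skills are added only when the backend can eval.
pub fn exposed_tool_names(backend: &dyn BrowserBackend, script_skills: &[&str]) -> Vec<String> {
    let mut names = vec!["browser_action".to_string(), "superrpa_browser".to_string()];
    if backend.supports_eval() {
        names.extend(script_skills.iter().map(|s| s.to_string()));
    }
    names
}
```

Keeping the check in one place means a backend with partial `Eval` support simply never advertises script skills, rather than failing at call time.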
- [ ] **Step 7: Re-run browser adapter tests**
Run:
```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```
Expected: all three pass.
- [ ] **Step 8: Commit**
```bash
git add src/browser src/lib.rs src/compat/browser_tool_adapter.rs src/compat/browser_script_skill_tool.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/browser_backend_capability_test.rs
git commit -m "refactor: abstract browser backend from pipe transport"
```
---
## Task 3: Implement the fixed browser WebSocket protocol codec in isolation
**Files:**
- Create: `src/browser/ws_protocol.rs`
- Test: `tests/browser_ws_protocol_test.rs`
- Reuse: `docs/_tmp_sgbrowser_ws_api_doc.txt`
- [ ] **Step 1: Write failing protocol codec tests**
Create `tests/browser_ws_protocol_test.rs` covering:
- exact outbound frame encoding
- callback payload decoding
- unknown callback format rejection
- mapping coverage for every supported v1 action
- [ ] **Step 2: Run the protocol tests first**
Run:
```bash
cargo test --test browser_ws_protocol_test
```
Expected: fail because the WS protocol codec does not exist yet.
- [ ] **Step 3: Encode the exact browser frame shapes**
Create `src/browser/ws_protocol.rs` so it can build exact array-form payloads such as:
```rust
[requesturl, "sgBrowserExcuteJsCodeByArea", target_url, js_code, area]
```
Serialize to the JSON string format required by the browser service.
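As a rough illustration of array-form encoding, the std-only sketch below serializes a list of string elements into a compact JSON array. It assumes every element is a JSON string and that the browser accepts compact output with no whitespace; the real element types and exact shape come from the browser WS API document, and the actual codec would more likely use `serde_json`. `encode_array_frame` is a hypothetical name.

```rust
/// Escapes `value` as a JSON string literal, quotes included.
fn json_string(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Hypothetical encoder: joins the elements into one compact JSON array.
/// Assumes all elements are strings, which may not hold for every call.
pub fn encode_array_frame(elements: &[&str]) -> String {
    let parts: Vec<String> = elements.iter().map(|e| json_string(e)).collect();
    format!("[{}]", parts.join(","))
}
```

Because the tests in Step 1 compare exact frames, keeping the encoder deterministic (fixed element order, no optional whitespace) makes those comparisons straightforward.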
- [ ] **Step 4: Define the v1 action mapping table**
Support only the actions already needed by current sgclaw flows:
- `Navigate`
- `GetText`
- `Click`
- `Type`
- `Eval`
Document which browser functions each one maps to and what assumptions they rely on.
- [ ] **Step 5: Define callback parsing and correlation rules**
Represent callback-bearing operations explicitly, including the callback function naming or request-correlation strategy the backend will depend on.
- [ ] **Step 6: Reject unsupported or malformed shapes early**
Fail fast for:
- unsupported actions
- malformed callback payloads
- missing request correlation metadata
- [ ] **Step 7: Re-run the protocol tests**
Run:
```bash
cargo test --test browser_ws_protocol_test
```
Expected: pass with no network dependency.
- [ ] **Step 8: Commit**
```bash
git add src/browser/ws_protocol.rs tests/browser_ws_protocol_test.rs
git commit -m "test: codify fixed browser websocket protocol"
```
---
## Task 4: Build the browser WS backend with synchronous invoke semantics
**Files:**
- Create: `src/browser/ws_backend.rs`
- Modify: `src/browser/mod.rs`
- Test: `tests/browser_ws_backend_test.rs`
- Reuse: `CommandOutput`, `PipeError`, `ExecutionSurfaceMetadata`, `MacPolicy`
- [ ] **Step 1: Write failing backend behavior tests**
Create `tests/browser_ws_backend_test.rs` covering:
- zero return + no callback => success
- non-zero return => failure
- zero return + callback => success with normalized `CommandOutput`
- callback timeout => timeout error
- dropped socket => clear failure
- [ ] **Step 2: Run backend tests first**
Run:
```bash
cargo test --test browser_ws_backend_test
```
Expected: fail because the WS backend does not exist yet.
- [ ] **Step 3: Build a long-lived browser connection manager**
Implement `src/browser/ws_backend.rs` to connect to `ws://127.0.0.1:12345` and expose blocking `invoke(...)` calls.
Use a dedicated connection loop plus request/response coordination instead of scattering raw socket calls through the runtime.
- [ ] **Step 4: Preserve local guardrails before send**
Validate `MacPolicy` before translating an action into the browser WS protocol, matching current pipe backend behavior.
- [ ] **Step 5: Normalize immediate status returns and delayed callbacks**
For each `invoke(...)` call:
- fail immediately on non-zero return codes
- succeed immediately for operations with no data callback
- wait for the matching callback for result-bearing operations
- convert the final outcome into `CommandOutput`
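The four rules above can be sketched with a standard-library channel standing in for the connection loop's callback delivery. Everything here is an assumption made for illustration: `Output` stands in for `CommandOutput`, `InvokeError` for the relevant `PipeError` cases, and `resolve_invoke` is a hypothetical helper.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical error cases; the real backend would map these to `PipeError`.
#[derive(Debug, PartialEq)]
pub enum InvokeError {
    BrowserStatus(i64),
    CallbackTimeout,
    SocketClosed,
}

/// Hypothetical normalized result; stands in for `CommandOutput`.
#[derive(Debug, PartialEq)]
pub struct Output {
    pub data: Option<String>,
}

/// Resolves one invoke from its immediate status and, if needed, its callback.
/// `callbacks` is assumed to be fed by the connection loop.
pub fn resolve_invoke(
    status: i64,
    expects_callback: bool,
    callbacks: &Receiver<String>,
    timeout: Duration,
) -> Result<Output, InvokeError> {
    // Non-zero status fails immediately, whether or not a callback was expected.
    if status != 0 {
        return Err(InvokeError::BrowserStatus(status));
    }
    // Operations without a data callback succeed as soon as status is zero.
    if !expects_callback {
        return Ok(Output { data: None });
    }
    // Result-bearing operations block until the callback arrives or time runs out.
    match callbacks.recv_timeout(timeout) {
        Ok(payload) => Ok(Output { data: Some(payload) }),
        Err(RecvTimeoutError::Timeout) => Err(InvokeError::CallbackTimeout),
        Err(RecvTimeoutError::Disconnected) => Err(InvokeError::SocketClosed),
    }
}
```

With only one request in flight at a time, a single channel is sufficient; matching callbacks to requests by id would only become necessary if that restriction were lifted.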
- [ ] **Step 6: Keep v1 concurrency intentionally serialized**
Allow only one in-flight browser request at a time unless the browser callback protocol proves a stable request-id guarantee.
- [ ] **Step 7: Re-run backend tests**
Run:
```bash
cargo test --test browser_ws_backend_test
```
Expected: pass using mocks/fakes, not the real browser.
- [ ] **Step 8: Commit**
```bash
git add src/browser/mod.rs src/browser/ws_backend.rs tests/browser_ws_backend_test.rs
git commit -m "feat: add browser websocket backend"
```
---
## Task 5: Add the `sg_claw` service protocol and single-session server
**Files:**
- Create: `src/service/mod.rs`
- Create: `src/service/protocol.rs`
- Create: `src/service/server.rs`
- Create: `src/bin/sg_claw.rs`
- Modify: `src/lib.rs`
- Modify: `Cargo.toml`
- Test: `tests/service_ws_session_test.rs`
- Reuse: `AgentMessage::LogEntry`, `AgentMessage::TaskComplete`, `SubmitTaskRequest`, `run_submit_task(...)`
- [ ] **Step 1: Write failing service session tests**
Create `tests/service_ws_session_test.rs` to verify:
- first client attaches
- second client gets `Busy`
- disconnect resets session state
- overlapping task submission is rejected clearly
- [ ] **Step 2: Run the session tests first**
Run:
```bash
cargo test --test service_ws_session_test
```
Expected: fail because the service layer does not exist yet.
- [ ] **Step 3: Define a thin client/service WS protocol**
In `src/service/protocol.rs`, reuse existing task/event shapes as much as possible:
```rust
ClientMessage::SubmitTask { instruction, conversation_id, messages, page_url, page_title }
ClientMessage::Ping
ServiceMessage::LogEntry { level, message }
ServiceMessage::TaskComplete { success, summary }
ServiceMessage::Busy { message }
```
- [ ] **Step 4: Add the service event sink adapter**
Implement `AgentEventSink` for the service session writer so the shared task runner can stream `LogEntry` and `TaskComplete` over the service WebSocket.
- [ ] **Step 5: Implement single-active-client session state**
Model explicit states such as:
- `Idle`
- `ClientAttached`
- `TaskRunning`
Reject a second client with `ServiceMessage::Busy` and close the socket. Reject overlapping tasks instead of queueing them.
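One possible shape for that state machine is sketched below. The state names come from the plan; `Session`, `SessionError`, and the method names are hypothetical, and the sketch leaves out the socket handling that would send `ServiceMessage::Busy`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    Idle,
    ClientAttached,
    TaskRunning,
}

/// Hypothetical rejection reasons for the single-session model.
#[derive(Debug, PartialEq, Eq)]
pub enum SessionError {
    Busy,
    NoClient,
    TaskAlreadyRunning,
}

pub struct Session {
    state: SessionState,
}

impl Session {
    pub fn new() -> Self {
        Self { state: SessionState::Idle }
    }

    pub fn state(&self) -> SessionState {
        self.state
    }

    /// Only an idle service accepts a client; anyone else gets `Busy`.
    pub fn attach(&mut self) -> Result<(), SessionError> {
        match self.state {
            SessionState::Idle => {
                self.state = SessionState::ClientAttached;
                Ok(())
            }
            _ => Err(SessionError::Busy),
        }
    }

    /// Overlapping tasks are rejected instead of queued.
    pub fn start_task(&mut self) -> Result<(), SessionError> {
        match self.state {
            SessionState::Idle => Err(SessionError::NoClient),
            SessionState::ClientAttached => {
                self.state = SessionState::TaskRunning;
                Ok(())
            }
            SessionState::TaskRunning => Err(SessionError::TaskAlreadyRunning),
        }
    }

    /// Returns to `ClientAttached` once the running task completes.
    pub fn finish_task(&mut self) {
        if self.state == SessionState::TaskRunning {
            self.state = SessionState::ClientAttached;
        }
    }

    /// A disconnect always releases the session.
    pub fn detach(&mut self) {
        self.state = SessionState::Idle;
    }
}
```

Making every transition an explicit method keeps the four session tests from Step 1 close to one assertion each.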
- [ ] **Step 6: Add the service binary**
Create `src/bin/sg_claw.rs` that:
- loads config
- creates the browser WS backend
- listens for local client connections
- routes `SubmitTask` into `run_submit_task(...)`
Keep `src/main.rs` and the existing `sgclaw::run()` pipe path unchanged.
- [ ] **Step 7: Re-run the session tests**
Run:
```bash
cargo test --test service_ws_session_test
```
Expected: pass without the real browser.
- [ ] **Step 8: Commit**
```bash
git add src/service src/bin/sg_claw.rs src/lib.rs Cargo.toml tests/service_ws_session_test.rs
git commit -m "feat: add claw-ws service entrypoint"
```
---
## Task 6: Add the `sg_claw_client` interactive CLI
**Files:**
- Create: `src/bin/sg_claw_client.rs`
- Modify: `Cargo.toml`
- Test: `tests/service_task_flow_test.rs`
- Reuse: `src/service/protocol.rs`
- [ ] **Step 1: Write failing client/service task flow tests**
Create `tests/service_task_flow_test.rs` to verify:
- the submit-task request reaches the service
- log entries stream in order
- the final summary arrives exactly once
- disconnect after task completion is handled cleanly
- [ ] **Step 2: Run the flow tests first**
Run:
```bash
cargo test --test service_task_flow_test
```
Expected: fail because the client does not exist yet.
- [ ] **Step 3: Implement a thin interactive client loop**
Create `src/bin/sg_claw_client.rs` that:
- connects to the local `sg_claw` service
- reads a line of user input
- sends `ClientMessage::SubmitTask`
- prints streamed `LogEntry` events as they arrive
- ends the turn on `TaskComplete`
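The per-turn part of that loop can be sketched without any socket code by treating incoming service messages as an iterator. `ServiceMessage` below is a stand-in mirroring the shapes from Task 5, while `TurnOutcome`, `run_turn`, and the `[level] message` print format are hypothetical.

```rust
use std::io::{self, Write};

/// Stand-in mirroring the service message shapes from Task 5.
#[derive(Debug, Clone, PartialEq)]
pub enum ServiceMessage {
    LogEntry { level: String, message: String },
    TaskComplete { success: bool, summary: String },
    Busy { message: String },
}

/// Hypothetical summary of how one turn ended.
#[derive(Debug, PartialEq)]
pub enum TurnOutcome {
    Completed { success: bool, summary: String },
    Busy(String),
    Disconnected,
}

/// Prints log entries as they arrive and stops at the first terminal message.
/// Anything after `TaskComplete` is left unread for the next turn.
pub fn run_turn<I, W>(events: I, out: &mut W) -> io::Result<TurnOutcome>
where
    I: IntoIterator<Item = ServiceMessage>,
    W: Write,
{
    for event in events {
        match event {
            ServiceMessage::LogEntry { level, message } => {
                writeln!(out, "[{level}] {message}")?;
            }
            ServiceMessage::TaskComplete { success, summary } => {
                writeln!(out, "{summary}")?;
                return Ok(TurnOutcome::Completed { success, summary });
            }
            ServiceMessage::Busy { message } => return Ok(TurnOutcome::Busy(message)),
        }
    }
    // The stream ended without a completion: treat it as a dropped connection.
    Ok(TurnOutcome::Disconnected)
}
```

In the real client the iterator would be backed by decoded WebSocket frames and `out` by stdout; the function itself holds no runtime or browser logic, in line with Step 4.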
- [ ] **Step 4: Keep the client intentionally dumb**
Do **not** duplicate runtime logic in the client. Browser access, skills, orchestration, and task execution remain entirely inside the service.
- [ ] **Step 5: Re-run the flow tests**
Run:
```bash
cargo test --test service_task_flow_test
```
Expected: pass without the real browser.
- [ ] **Step 6: Build the new binaries explicitly**
Run:
```bash
cargo build --bin sg_claw --bin sg_claw_client
```
Expected: both binaries compile successfully.
- [ ] **Step 7: Commit**
```bash
git add src/bin/sg_claw_client.rs Cargo.toml tests/service_task_flow_test.rs
git commit -m "feat: add interactive claw-ws client"
```
---
## Task 7: Finish wiring, preserve pipe mode, and verify end-to-end behavior
**Files:**
- Modify: `Cargo.toml`
- Modify: `src/lib.rs`
- Modify: `src/config/settings.rs`
- Optional modify: `src/runtime/engine.rs`
- Reuse: `tests/browser_tool_test.rs`, `tests/runtime_task_flow_test.rs`, `tests/compat_runtime_test.rs`
- [ ] **Step 1: Add only the minimum config surface for v1**
Add settings such as:
- `browser_ws_url` defaulting to `ws://127.0.0.1:12345`
- `service_ws_listen_addr` defaulting to local loopback
Do **not** change the meaning of existing browser backend/profile settings just to represent service mode.
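A sketch of what that minimal surface might look like, assuming a standalone struct named `WsSettings` (the plan does not say how the fields attach to the existing settings type). The listen port `42321` is a placeholder chosen for this example; the plan only specifies "local loopback".

```rust
use std::net::SocketAddr;

/// Hypothetical container for the two new v1 settings.
#[derive(Debug, Clone, PartialEq)]
pub struct WsSettings {
    pub browser_ws_url: String,
    pub service_ws_listen_addr: String,
}

impl Default for WsSettings {
    fn default() -> Self {
        Self {
            browser_ws_url: "ws://127.0.0.1:12345".to_string(),
            // Placeholder port: the plan only requires a loopback default.
            service_ws_listen_addr: "127.0.0.1:42321".to_string(),
        }
    }
}

/// Returns true when the listen address parses and is bound to loopback.
pub fn listens_on_loopback(settings: &WsSettings) -> bool {
    settings
        .service_ws_listen_addr
        .parse::<SocketAddr>()
        .map(|addr| addr.ip().is_loopback())
        .unwrap_or(false)
}
```

A check like `listens_on_loopback` is optional, but it gives the service a cheap way to refuse accidentally exposing the single-session endpoint on a public interface.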
- [ ] **Step 2: Keep external browser tool naming stable**
Verify that the runtime still exposes:
- `superrpa_browser`
- `browser_action`
under both pipe and WS modes where the backend supports them.
- [ ] **Step 3: Re-run the current pipe regression suite**
Run:
```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test --test pipe_handshake_test --test pipe_protocol_test --test task_protocol_test
```
Expected: all existing pipe-oriented tests still pass unchanged.
- [ ] **Step 4: Run the new WS-focused suite**
Run:
```bash
cargo test --test task_runner_test --test browser_ws_protocol_test --test browser_ws_backend_test --test browser_backend_capability_test --test service_ws_session_test --test service_task_flow_test
```
Expected: all new tests pass without launching the real browser.
- [ ] **Step 5: Run a full Rust test sweep**
Run:
```bash
cargo test --tests
```
Expected: all Rust tests pass.
- [ ] **Step 6: Build all three binaries**
Run:
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: all three binaries compile.
- [ ] **Step 7: Perform a manual local smoke test**
Manual test:
1. Start the browser app so `ws://127.0.0.1:12345` is available.
2. Run `cargo run --bin sg_claw`.
3. In another terminal, run `cargo run --bin sg_claw_client`.
4. Submit a simple browser task such as opening a page or fetching visible text.
5. Confirm the client prints streaming logs and exactly one final completion summary.
6. Confirm the old pipe-mode entry still starts via `cargo run`.
Expected: both modes work side-by-side.
- [ ] **Step 8: Commit**
```bash
git add Cargo.toml src/lib.rs src/config/settings.rs src/runtime/engine.rs
git commit -m "feat: wire parallel claw-ws transport"
```
---
## Verification Checklist
### Fast regression checks
```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test
```
Expected: current pipe/browser runtime behavior remains green.
### Full Rust test sweep
```bash
cargo test --tests
```
Expected: all Rust tests pass.
### Binary build verification
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: all three binaries compile.
### Manual end-to-end verification
- Browser app listening on `ws://127.0.0.1:12345`
- `cargo run --bin sg_claw`
- `cargo run --bin sg_claw_client`
- submit one browser task
- verify streaming logs, final completion, and single-client lock behavior
- verify `cargo run` still preserves old pipe bootstrap
---
## Notes for Implementation
- Keep the current pipe bootstrap in `src/lib.rs` intact until the shared runner and pipe backend wrapper are both green.
- Prefer small commits at each task boundary.
- Keep the new WS path additive and isolated.
- Do not ship partial browser capabilities under stable tool names.
- Treat `docs/_tmp_sgbrowser_ws_api_doc.txt` as the browser WS protocol source of truth while implementing `src/browser/ws_protocol.rs`.

@@ -0,0 +1,607 @@
# WS Browser Backend Auth Replacement Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Replace the ws service path's empty-session-key `BrowserPipeTool` dependency with a ws-native browser backend path so real browser websocket calls work, while preserving legacy pipe behavior exactly.
**Architecture:** Keep the existing pipe entry untouched and add a ws-only parallel execution seam. The ws service path will construct a `ServiceBrowserWsClient` plus `WsBrowserBackend`, pass that backend through a new ws-only submit-task entry, and let the existing compat/runtime stack consume `Arc<dyn BrowserBackend>` instead of requiring `BrowserPipeTool` on the ws path.
**Tech Stack:** Rust 2021, current sgclaw agent/task runner, compat runtime/orchestration stack, `tungstenite`, `serde_json`, existing `MacPolicy`, existing `BrowserBackend`/`WsBrowserBackend`, and the current Rust test suite.
---
## Scope Guardrails
- Only change the ws service path.
- Do **not** change `src/lib.rs` pipe runtime behavior.
- Do **not** change pipe handshake semantics.
- Do **not** introduce fake session keys, fake HMAC seeds, or auth bypasses.
- Keep legacy `run_submit_task(...)` available for the pipe entry.
- If a shared layer must change, add a parallel ws-only entry instead of weakening the pipe path.
- Keep the current single-client, single-task service model.
- Do not broaden this slice into browser process launch, queueing, multi-client support, or protocol extensions.
---
## File Structure
### Existing files to modify
- Modify: `src/agent/task_runner.rs` — keep the current pipe-oriented submit path and add the ws-only backend-based submit path.
- Modify: `src/compat/runtime.rs` — add a backend-driven execution entry that accepts `Arc<dyn BrowserBackend>` directly, while keeping the current pipe-oriented public functions behaviorally unchanged.
- Modify: `src/compat/orchestration.rs` — add a matching backend-driven execution entry for orchestration/direct-route flows, while keeping the current pipe-oriented public functions behaviorally unchanged.
- Modify: `src/compat/workflow_executor.rs` — add backend-driven sibling APIs for direct-route/fallback execution, while keeping the current pipe-oriented public functions behaviorally unchanged.
- Modify: `src/service/server.rs` — replace the ws service's `BrowserPipeTool::new(..., vec![])` path with a ws-native `WsClient` + `WsBrowserBackend` path.
- Modify: `src/service/mod.rs` — only if minimal re-export or call-signature changes are needed around the new ws-only submit path.
- Modify: `src/browser/mod.rs` — only if export cleanup is truly needed for the service wiring.
- Reuse: `src/agent/mod.rs` — keep the current pipe routing unchanged unless a tiny internal refactor is strictly needed to reuse shared code.
- Reuse: `src/browser/backend.rs` — existing shared browser backend trait.
- Reuse: `src/browser/ws_backend.rs` — existing ws-native browser backend implementation.
- Reuse: `src/browser/ws_protocol.rs` — existing browser websocket protocol codec.
- Reuse: `src/compat/browser_tool_adapter.rs` — should already speak `BrowserBackend`; only touch if a narrow ws regression forces it.
- Reuse: `src/compat/browser_script_skill_tool.rs` — eval-capability gating already exists; only touch if a narrow ws regression forces it.
- Reuse: `src/lib.rs` — pipe entrypoint must remain behaviorally unchanged; verify only.
### Existing tests to extend
- Modify: `tests/browser_ws_backend_test.rs` — keep existing ws backend coverage green after the service adapter wiring lands.
- Modify: `tests/browser_script_skill_tool_test.rs` — re-verify eval-gating and browser-script behavior after the shared compat/runtime seam changes.
- Modify: `tests/service_ws_session_test.rs` — update service-side unit/session tests to exercise the ws-only submit path.
- Modify: `tests/service_task_flow_test.rs` — add client→service chain coverage proving the ws path reaches a browser websocket and no longer emits `invalid hmac seed`.
- Modify: `src/service/server.rs` under `#[cfg(test)]` if the private service-side ws adapter cannot be exercised from an integration test crate without changing production visibility.
### New files to create
- Create: `tests/browser_ws_service_adapter_test.rs` if the adapter can be exercised through a public seam; otherwise keep the deterministic adapter tests as unit tests in `src/service/server.rs` so no production visibility changes are required.
---
## Task 1: Lock the ws-only behavior with deterministic failing tests
**Files:**
- Create: `tests/browser_ws_service_adapter_test.rs`
- Modify: `tests/service_task_flow_test.rs`
- Reuse: `tests/browser_ws_backend_test.rs`, `src/browser/ws_backend.rs`, `src/service/server.rs`
- [ ] **Step 1: Write the first failing backend/adapter test**
Create `tests/browser_ws_service_adapter_test.rs` with one focused test that directly exercises the ws-service adapter layer, without `sg_claw_client`, without LLM planning, and without natural-language tasks.
Start with the smallest real behavior from the spec:
- fake browser websocket server accepts one connection
- the ws-service adapter builds the same kind of client the service will use
- `WsBrowserBackend.invoke(Action::Navigate, ...)` succeeds on status `0`
- the fake server receives one text frame that decodes as a ws `Navigate` call
- [ ] **Step 2: Run that single new test and watch it fail**
Run:
```bash
cargo test --test browser_ws_service_adapter_test ws_service_backend_navigate_reaches_browser_websocket -- --nocapture
```
Expected: FAIL because the service-side ws client/adapter does not exist yet.
- [ ] **Step 3: Add the second failing deterministic test**
In the same file, add a test for the forced-close path:
- fake browser websocket server accepts a request, then closes or resets the socket before returning a status frame
- observe the error at the `WsBrowserBackend.invoke(...)` call site
- assert the outward error is exactly `PipeError::PipeClosed`
- [ ] **Step 4: Run only the forced-close test and watch it fail**
Run:
```bash
cargo test --test browser_ws_service_adapter_test ws_service_backend_maps_browser_disconnect_to_pipe_closed -- --nocapture
```
Expected: FAIL because the service-side ws client/adapter still does not exist.
- [ ] **Step 5: Add the third failing deterministic test**
In the same file, add a callback-timeout test:
- fake browser websocket server returns status `0`
- it never returns the callback frame
- assert the outward error at `invoke(...)` is exactly `PipeError::Timeout`
Use a tiny response timeout in the backend under test.
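The callback-timeout semantics can be sketched in isolation before the real backend exists. The block below is a minimal illustration, not the project's implementation: `PipeError` is a local stand-in with only the variants named in this plan, and `wait_for_callback` plus the channel-fed reader are hypothetical names introduced for the example.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Local stand-in for the crate's `PipeError` (assumption: only the
/// two variants this example needs are modelled).
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Timeout,
    PipeClosed,
}

/// Waits for the callback frame that should follow a status-0 reply.
/// `callbacks` is a hypothetical channel fed by the websocket reader.
pub fn wait_for_callback(
    callbacks: &Receiver<String>,
    response_timeout: Duration,
) -> Result<String, PipeError> {
    match callbacks.recv_timeout(response_timeout) {
        Ok(frame) => Ok(frame),
        // The browser acknowledged the request but never sent the callback.
        Err(RecvTimeoutError::Timeout) => Err(PipeError::Timeout),
        // The reader side went away, which is treated as a closed connection.
        Err(RecvTimeoutError::Disconnected) => Err(PipeError::PipeClosed),
    }
}
```

A test built on this shape can pass a timeout of a few milliseconds and simply never send on the channel, which keeps the timeout case fast and deterministic.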
- [ ] **Step 6: Run only the callback-timeout test and watch it fail**
Run:
```bash
cargo test --test browser_ws_service_adapter_test ws_service_backend_times_out_waiting_for_callback -- --nocapture
```
Expected: FAIL because the service-side ws client/adapter still does not exist.
- [ ] **Step 7: Add the end-to-end failing regression for the auth bug**
Extend `tests/service_task_flow_test.rs` with one client→service integration test that:
- starts a fake browser websocket server
- starts the real `sg_claw` service binary with a temp config pointing `browserWsUrl` to that fake server
- starts the real `sg_claw_client`
- submits the fixed instruction `打开知乎热榜并读取页面主区域文本` ("open the Zhihu hot list and read the text of the page's main area")
- captures service/client output
- asserts the fake browser server received at least one text frame
- asserts output does **not** contain `invalid hmac seed: session key must not be empty`
Do not assert planner details here. This test only proves the service path no longer goes through the empty-session-key auth failure.
- [ ] **Step 8: Run the integration regression and watch it fail**
Run:
```bash
cargo test --test service_task_flow_test ws_service_submit_task_no_longer_hits_invalid_hmac_seed -- --nocapture
```
Expected: FAIL on the current code because the ws service still constructs `BrowserPipeTool::new(..., vec![])`.
- [ ] **Step 9: Commit the red tests only after they are all in place**
Do not commit yet if any required red test was skipped. The next task will make them pass.
---
## Task 2: Add a ws-only browser-backend execution seam without changing the pipe path
**Files:**
- Modify: `src/agent/task_runner.rs`
- Modify: `src/compat/runtime.rs`
- Modify: `src/compat/orchestration.rs`
- Modify: `src/compat/workflow_executor.rs`
- Reuse: `src/agent/mod.rs`, `src/browser/backend.rs`
- Test: `tests/task_runner_test.rs`, `tests/browser_script_skill_tool_test.rs`
- [ ] **Step 1: Write the smallest failing runner-level ws entry test**
Extend `tests/task_runner_test.rs` with a focused test that proves there is a ws-only submit entry accepting `Arc<dyn BrowserBackend>` and an arbitrary event sink, while the old `run_submit_task(...)` signature still exists for pipe mode.
The test can stay on the missing-LLM-config path so it does not need a real browser call. It should compile only once the new ws-only function exists.
- [ ] **Step 2: Run the targeted runner test and watch it fail**
Run:
```bash
cargo test --test task_runner_test ws_only_submit_task_entry_accepts_browser_backend -- --nocapture
```
Expected: FAIL to compile or FAIL to link because the ws-only entry does not exist yet.
- [ ] **Step 3: Add the new ws-only submit-task entry in `src/agent/task_runner.rs`**
Keep the current pipe function intact:
```rust
pub fn run_submit_task<T: Transport + 'static>(... browser_tool: &BrowserPipeTool<T>, ...)
```
Add a parallel entry for the service path, for example:
```rust
pub fn run_submit_task_with_browser_backend(
    sink: &dyn AgentEventSink,
    browser_backend: Arc<dyn BrowserBackend>,
    context: &AgentRuntimeContext,
    request: SubmitTaskRequest,
) -> Result<(), PipeError>
```
Rules:
- share as much internal logic as possible with the pipe path
- do not change `run_submit_task(...)` behavior
- do not change `src/agent/mod.rs` pipe wiring except, at most, small internal refactoring to reuse common code
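One possible way to satisfy these rules is a private shared core that both public entries delegate to. The sketch below is an assumption-heavy illustration: every type here (`PipeTool`, the one-method `BrowserBackend`, the `log`-only `AgentEventSink`, `RecordingSink`, `FakeWsBackend`) is a simplified stand-in, and the real signatures also take a context and a `SubmitTaskRequest`.

```rust
use std::sync::{Arc, Mutex};

/// Local stand-in for the crate's `PipeError`.
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Protocol(String),
}

/// Simplified stand-in for the shared browser backend trait.
pub trait BrowserBackend {
    fn invoke(&self, action: &str) -> Result<String, PipeError>;
}

/// Simplified stand-in for the event sink.
pub trait AgentEventSink {
    fn log(&self, line: &str);
}

/// Stand-in for `BrowserPipeTool<T>`.
pub struct PipeTool;

impl PipeTool {
    pub fn call(&self, action: &str) -> Result<String, PipeError> {
        Ok(format!("pipe:{}", action))
    }
}

/// Private adapter so the pipe entry can reuse the backend-driven core.
struct PipeToolBackend<'a>(&'a PipeTool);

impl<'a> BrowserBackend for PipeToolBackend<'a> {
    fn invoke(&self, action: &str) -> Result<String, PipeError> {
        self.0.call(action)
    }
}

/// Shared core: both entries emit the same log sequence.
fn run_shared(
    sink: &dyn AgentEventSink,
    backend: &dyn BrowserBackend,
    instruction: &str,
) -> Result<(), PipeError> {
    if instruction.trim().is_empty() {
        return Err(PipeError::Protocol("empty instruction".to_string()));
    }
    sink.log("task started");
    let output = backend.invoke(instruction)?;
    sink.log(&output);
    sink.log("task complete");
    Ok(())
}

/// Existing pipe entry: callers still pass the pipe tool.
pub fn run_submit_task(
    sink: &dyn AgentEventSink,
    pipe_tool: &PipeTool,
    instruction: &str,
) -> Result<(), PipeError> {
    run_shared(sink, &PipeToolBackend(pipe_tool), instruction)
}

/// New ws-only entry: callers pass any backend behind `Arc<dyn BrowserBackend>`.
pub fn run_submit_task_with_browser_backend(
    sink: &dyn AgentEventSink,
    browser_backend: Arc<dyn BrowserBackend>,
    instruction: &str,
) -> Result<(), PipeError> {
    run_shared(sink, &*browser_backend, instruction)
}

/// Test helper: records every log line in order.
#[derive(Default)]
pub struct RecordingSink(pub Mutex<Vec<String>>);

impl AgentEventSink for RecordingSink {
    fn log(&self, line: &str) {
        self.0.lock().unwrap().push(line.to_string());
    }
}

/// Test helper: fake ws-backed backend.
pub struct FakeWsBackend;

impl BrowserBackend for FakeWsBackend {
    fn invoke(&self, action: &str) -> Result<String, PipeError> {
        Ok(format!("ws:{}", action))
    }
}
```

Whether the real pipe entry should route through such an adapter or keep its own call chain is a judgment call; the plan only requires that its observable behavior stays unchanged.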
- [ ] **Step 4: Add a backend-driven compat runtime entry**
In `src/compat/runtime.rs`, add a parallel entry that accepts `Arc<dyn BrowserBackend>` directly instead of `BrowserPipeTool<T>`.
Keep the existing pipe-oriented public function in place.
The backend-driven entry must preserve:
- existing log emission order
- tool names (`superrpa_browser`, `browser_action`)
- existing browser-script tool gating behavior
- existing office/screen tool attachment logic
- existing conversation seeding and provider setup
- [ ] **Step 5: Add backend-driven orchestration and workflow-executor entries**
In `src/compat/orchestration.rs`, add the matching backend-driven entry so direct-route flows and fallback flows can run with `Arc<dyn BrowserBackend>` on the ws path.
In `src/compat/workflow_executor.rs`, add backend-driven sibling APIs for any direct-route/fallback execution that is currently hard-wired to `BrowserPipeTool<T>`.
Keep the existing pipe-oriented orchestration and workflow-executor public functions in place.
- [ ] **Step 6: Route the new ws-only submit entry through the backend-driven compat/orchestration/workflow-executor path**
Inside `src/agent/task_runner.rs`, make the new ws-only submit entry call the new backend-based compat/orchestration functions, while the old pipe entry keeps calling the old pipe-based functions.
This is the core compatibility seam, and it must cover both normal compat-runtime execution and direct-route/fallback workflow execution.
- [ ] **Step 7: Re-run the new runner test**
Run:
```bash
cargo test --test task_runner_test ws_only_submit_task_entry_accepts_browser_backend -- --nocapture
```
Expected: PASS.
- [ ] **Step 8: Re-run the full runner, workflow, and browser-script regressions**
Run:
```bash
cargo test --test task_runner_test --test browser_script_skill_tool_test -- --nocapture
```
Then run the workflow executor unit coverage that protects direct-route behavior:
```bash
cargo test compat::workflow_executor::tests -- --nocapture
```
Expected: all existing runner, workflow, and browser-script tests still pass, proving the pipe-facing path, direct-route behavior, and eval-gating stayed stable.
- [ ] **Step 9: Commit**
```bash
git add src/agent/task_runner.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/task_runner_test.rs tests/browser_script_skill_tool_test.rs
git commit -m "refactor: add ws-only browser backend submit path"
```
---
## Task 3: Replace the ws service's empty-session-key browser tool with a ws-native backend
**Files:**
- Modify: `src/service/server.rs`
- Modify: `src/service/mod.rs` only if minimal re-export or signature cleanup is required
- Modify: `src/browser/mod.rs` only if export cleanup is needed
- Test: `tests/browser_ws_service_adapter_test.rs`
- Reuse: `src/browser/ws_backend.rs`, `src/browser/ws_protocol.rs`
- [ ] **Step 1: Write the smallest failing service-side adapter compile target**
Add a compile-level or construction-level assertion in `tests/browser_ws_service_adapter_test.rs` that the service path can construct the new service-side ws client type used by `serve_client(...)`.
This should fail until the type exists in `src/service/server.rs`.
- [ ] **Step 2: Run the adapter test group and watch the constructor test fail**
Run:
```bash
cargo test --test browser_ws_service_adapter_test -- --nocapture
```
Expected: FAIL because the service-side ws client type does not exist yet.
- [ ] **Step 3: Introduce `ServiceBrowserWsClient` in `src/service/server.rs`**
Create a narrow client type that owns the real websocket connection to `browser_ws_url` and implements `WsClient`.
Required responsibilities only:
- lazily connect on first use
- send raw text frames
- receive raw text frames with timeout
- map close/reset to exactly `PipeError::PipeClosed`
- map connect failure to exactly `PipeError::Protocol("browser websocket connect failed: ...")`
- map timeouts to exactly `PipeError::Timeout`
Do **not** duplicate `WsBrowserBackend` responsibilities here.
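The three error mappings above can be kept in two small pure functions so they are testable without a socket. This is a sketch under assumptions: `PipeError` is a local stand-in, the helper names are hypothetical, and the real client may receive errors from a websocket library rather than directly as `std::io::Error`, in which case the same idea applies to that library's error type. The fallback message for unclassified errors is also invented for illustration.

```rust
use std::io;

/// Local stand-in for the crate's `PipeError`.
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Timeout,
    PipeClosed,
    Protocol(String),
}

/// Connect-time failures always surface with the prefix required by the plan.
pub fn map_connect_error(err: &io::Error) -> PipeError {
    PipeError::Protocol(format!("browser websocket connect failed: {}", err))
}

/// Failures after the connection is established.
pub fn map_io_error(err: &io::Error) -> PipeError {
    match err.kind() {
        // Socket read timeouts are reported as `WouldBlock` on Unix
        // and `TimedOut` on Windows, so both map to `Timeout`.
        io::ErrorKind::TimedOut | io::ErrorKind::WouldBlock => PipeError::Timeout,
        // Close or reset by the peer.
        io::ErrorKind::ConnectionReset
        | io::ErrorKind::ConnectionAborted
        | io::ErrorKind::BrokenPipe
        | io::ErrorKind::UnexpectedEof => PipeError::PipeClosed,
        // Hypothetical fallback; the plan does not specify this case.
        _ => PipeError::Protocol(format!("browser websocket io error: {}", err)),
    }
}
```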
- [ ] **Step 4: Remove ws-path use of `BrowserPipeTool::new(..., vec![])`**
In `serve_client(...)`, replace this shape:
```rust
let transport = Arc::new(ServiceBrowserTransport::new(...));
let browser_tool = BrowserPipeTool::new(transport.clone(), mac_policy.clone(), vec![])
```
with the ws-native shape:
```rust
let ws_client = Arc::new(ServiceBrowserWsClient::new(...));
let browser_backend: Arc<dyn BrowserBackend> = Arc::new(
    WsBrowserBackend::new(ws_client, mac_policy.clone(), initial_request_url(...))
        .with_response_timeout(BROWSER_RESPONSE_TIMEOUT)
);
```
Then route the task through the new ws-only submit entry from Task 2.
- [ ] **Step 5: Delete or narrow old ws-path transport code that duplicated protocol handling**
Remove the service-only callback polling / response queue logic that existed solely to feed `BrowserPipeTool`.
Keep only what is still needed for:
- service client websocket I/O (`sg_claw_client` ↔ `sg_claw`)
- browser websocket I/O (`sg_claw` ↔ `browser_ws_url`)
Do not leave two competing ws protocol implementations in `src/service/server.rs`.
- [ ] **Step 6: Re-run deterministic adapter/backend tests**
Run:
```bash
cargo test --test browser_ws_service_adapter_test -- --nocapture
```
Expected: PASS, including:
- navigate success
- disconnect => `PipeError::PipeClosed`
- callback timeout => `PipeError::Timeout`
- [ ] **Step 7: Re-run existing ws backend tests**
Run:
```bash
cargo test --test browser_ws_backend_test -- --nocapture
```
Expected: PASS, confirming the service adapter change did not break the existing ws backend semantics.
- [ ] **Step 8: Commit**
```bash
git add src/service/server.rs src/service/mod.rs src/browser/mod.rs tests/browser_ws_service_adapter_test.rs
git commit -m "feat: switch ws service to ws-native browser backend"
```
---
## Task 4: Prove the auth bug is gone and pipe mode is unchanged
**Files:**
- Modify: `tests/service_ws_session_test.rs`
- Modify: `tests/service_task_flow_test.rs`
- Reuse: `src/lib.rs`, `src/service/mod.rs`, `src/compat/workflow_executor.rs`
- [ ] **Step 1: Update service session tests for the new ws-only call path**
Adjust any service session tests that still call `handle_client_message(...)` through the old ws-path `BrowserPipeTool` assumption.
Prefer one of these narrow approaches:
- overload `handle_client_message(...)` with a backend-based service entry used only in ws tests, or
- keep `handle_client_message(...)` pipe-oriented and test the ws path through `serve_client(...)` and the real service binary instead
Choose the option that changes the fewest existing tests and leaves the pipe path simplest.
- [ ] **Step 2: Run the focused service session file**
Run:
```bash
cargo test --test service_ws_session_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 3: Make the auth-regression integration test pass**
Re-run the exact end-to-end regression from Task 1:
```bash
cargo test --test service_task_flow_test ws_service_submit_task_no_longer_hits_invalid_hmac_seed -- --nocapture
```
Expected: PASS, with evidence that:
- the fake browser websocket server received at least one frame
- output no longer contains `invalid hmac seed: session key must not be empty`
- [ ] **Step 4: Add one explicit mandatory assertion for browser websocket connect failures**
Add one focused assertion that a browser websocket connect failure surfaces outward as:
```rust
PipeError::Protocol("browser websocket connect failed: ...")
```
Do not leave this semantic implied.
- [ ] **Step 5: Add one explicit ws direct-route regression**
Add one focused regression that proves a ws-backed browser backend can traverse a direct-route/fallback path that currently flows through `src/compat/workflow_executor.rs`.
Keep it deterministic and narrow. Prefer a fake backend plus direct function invocation over a planner-dependent natural-language end-to-end test.
- [ ] **Step 6: Run the ws-focused regression suite**
Run:
```bash
cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
Then run the workflow-executor direct-route coverage:
```bash
cargo test compat::workflow_executor::tests -- --nocapture
```
Expected: all ws-focused and direct-route workflow tests pass.
- [ ] **Step 7: Run the required pipe and browser-script regression suite**
Run:
```bash
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```
Expected: all required pipe and browser-script regressions pass unchanged.
- [ ] **Step 8: Run the full relevant verification sweep**
Run:
```bash
cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```
Then run:
```bash
cargo test compat::workflow_executor::tests -- --nocapture
```
Expected: full mixed ws+pipe verification passes in fresh runs.
- [ ] **Step 9: Build the affected binaries**
Run:
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: all three binaries compile.
- [ ] **Step 10: Commit**
```bash
git add tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/browser_ws_service_adapter_test.rs src/compat/workflow_executor.rs
git commit -m "test: verify ws auth replacement and pipe regressions"
```
---
## Task 5: Manual smoke verification against the real browser
**Files:**
- Reuse only: no code changes unless a verified bug is found during smoke work
- [ ] **Step 1: Start the real browser websocket target**
Confirm the real sgBrowser endpoint is reachable at the configured `browserWsUrl`.
- [ ] **Step 2: Start the real ws service**
Run:
```bash
cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```
Expected: service prints the resolved listen address and browser websocket URL.
- [ ] **Step 3: Run the minimal browser task through the real client**
Run from a separate terminal with UTF-8-safe input:
```bash
cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```
Submit the following instruction ("open the Zhihu hot list and read the text of the page's main area"):
```text
打开知乎热榜并读取页面主区域文本
```
Expected:
- browser actions start executing
- no `invalid hmac seed: session key must not be empty`
- one final completion is returned
- [ ] **Step 4: Run the old Zhihu skill smoke**
Submit the following instruction ("read the Zhihu hot-list data and export an Excel file"):
```text
读取知乎热榜数据,并导出 excel 文件
```
Expected: the task enters the real browser action path instead of dying at auth initialization.
- [ ] **Step 5: Re-check the legacy pipe entry without modifying it**
Run:
```bash
cargo run
```
Only verify startup behavior appropriate for the current pipe environment. Do not change pipe code during this smoke step.
- [ ] **Step 6: If a smoke failure appears, stop and debug before editing**
Any failure found here must be handled with:
- a fresh reproducer
- a failing automated test if feasible
- the smallest scoped fix
Do not fold speculative smoke fixes into this slice.
---
## Verification Checklist
### Deterministic ws-only tests
```bash
cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test -- --nocapture
```
Expected: ws-native backend and service adapter semantics are green without LLM/planner dependencies.
### Client→service ws chain tests
```bash
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
Expected: the ws service path reaches the browser websocket and no longer emits the empty-session-key auth failure.
### Required pipe and browser-script regressions
```bash
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```
Expected: legacy pipe behavior and browser-script eval-gating remain unchanged.
### Binary build verification
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: all affected binaries compile.
### Manual end-to-end verification
- real sgBrowser running at configured `browserWsUrl`
- `cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"`
- `cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"`
- run the Zhihu minimal task
- run the old Zhihu export task
- verify no `invalid hmac seed` appears
- verify pipe startup still behaves as before
---
## Notes for Implementation
- Keep the current pipe bootstrap in `src/lib.rs` untouched.
- Prefer adding ws-only functions over changing existing pipe signatures.
- Reuse `WsBrowserBackend` for protocol semantics; do not re-implement callback handling inside the service.
- Keep `ServiceBrowserWsClient` narrow: connection lifecycle + raw websocket I/O only.
- Preserve exact outward error semantics from the spec:
  - connect failure => `PipeError::Protocol("browser websocket connect failed: ...")`
  - non-zero status => `PipeError::Protocol("browser returned non-zero status: ...")`
  - callback timeout => `PipeError::Timeout`
  - close/reset => `PipeError::PipeClosed`
- Do not claim success until the mixed ws+pipe verification commands have been run fresh.

@@ -0,0 +1,482 @@
# WS Browser Bridge Path Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Replace the raw-ws-direct browser execution assumption with a bridge-backed browser integration path that matches the validated FunctionsUI / BrowserAction / CommandRouter model while preserving existing pipe behavior.
**Architecture:** Keep the current Rust-side browser orchestration flow centered on `Arc<dyn BrowserBackend>`, but stop treating `WsBrowserBackend` as the real production browser surface. Model the validated bridge as two explicit layers: Layer 1 session/lifecycle calls (`sgclawConnect`, `sgclawStart`, `sgclawStop`, `sgclawSubmitTask`) and Layer 2 browser-action execution (`window.sgFunctionsUI(...)`, `window.BrowserAction(...)`, `CommandRouter`). The new backend targets Layer 2 only through a narrow repo-local `BridgeActionTransport` seam, while lifecycle/session concerns stay separate from per-action browser execution.
**Tech Stack:** Rust 2021, existing `BrowserBackend` abstraction, compat/runtime/orchestration stack, current service/task runner integration, existing bridge-oriented design docs, existing Rust unit/integration test suite.
---
## Scope Guardrails
- Do **not** continue extending raw external sgBrowser websocket business-frame handling as the mainline path.
- Do **not** modify `src/lib.rs`, pipe handshake behavior, or the working `BrowserPipeTool` path.
- Do **not** invent a parallel browser-command contract unrelated to the documented bridge surface.
- Do **not** rewrite the whole compat/runtime stack when a narrow adapter will do.
- Do **not** assume access to the full SuperRPA browser-host codebase from this repository; encode the validated contract at the nearest seam available here.
---
## File Structure
### Existing files to modify
- Modify: `src/browser/mod.rs`
  - export the new bridge contract/transport/backend modules explicitly
- Modify: `src/browser/backend.rs`
  - only if a tiny shared helper or trait documentation update is needed for the new bridge-backed backend
- Modify: `src/compat/browser_tool_adapter.rs`
  - ensure existing browser action mapping remains reusable with the new backend implementation
- Modify: `src/compat/runtime.rs`
  - wire the bridge-backed browser backend into the ws service/browser execution path without changing the pipe path
- Modify: `src/compat/orchestration.rs`
  - only where browser backend wiring requires the bridge-backed path to flow through orchestration
- Modify: `src/compat/workflow_executor.rs`
  - preserve direct-route/fallback use of `BrowserBackend` when the backend is bridge-backed instead of websocket-backed
- Modify: `src/service/server.rs`
  - replace the current real-browser execution assumption with bridge-backend construction plus a repo-local bridge transport provider seam for the relevant service path
- Modify: `tests/compat_browser_tool_test.rs`
  - extend browser tool mapping coverage if needed for bridge-backed execution
- Modify: `tests/service_task_flow_test.rs`
  - replace raw-ws-direct expectations with bridge-path expectations where appropriate
- Modify: `tests/service_ws_session_test.rs`
  - update service-side tests if they currently assume the real browser path is raw websocket driven
### New files to create
- Create: `src/browser/bridge_contract.rs`
  - narrow, explicit contract types that keep lifecycle/session bridge calls separate from browser-action execution requests/replies
- Create: `src/browser/bridge_transport.rs`
  - repo-local `BridgeActionTransport` seam used by the backend and injected by service/runtime wiring
- Create: `src/browser/bridge_backend.rs`
  - new `BrowserBackend` implementation that maps browser actions onto the Layer-2 bridge action contract through `BridgeActionTransport`
- Create: `tests/browser_bridge_backend_test.rs`
  - deterministic unit tests for action-to-bridge mapping and reply/error normalization using a fake bridge transport
- Create: `tests/browser_bridge_contract_test.rs`
  - narrow tests proving the two bridge layers stay explicit and browser-action requests remain semantic rather than raw-websocket-shaped
### Evidence files to consult during implementation
- Read: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
- Read: `frontend/archive/sgClaw验证-已归档/testRunner.js`
- Read: `docs/superpowers/specs/2026-03-25-superrpa-sgclaw-browser-control-design.md`
- Read: `docs/archive/项目管理与排期/协作时间表.md`
- Read: `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`
---
## Task 1: Lock the bridge contract in deterministic tests before adding the backend
**Files:**
- Create: `src/browser/bridge_contract.rs`
- Create: `tests/browser_bridge_contract_test.rs`
- Reuse as design evidence:
  - `frontend/archive/sgClaw验证-已归档/testRunner.js`
  - `docs/archive/项目管理与排期/协作时间表.md`
  - `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`
- [ ] **Step 1: Write the first failing contract test for named bridge calls**
Create `tests/browser_bridge_contract_test.rs` with one focused test that encodes the bridge naming expectations already evidenced in the repo.
Start with a test shape like:
```rust
#[test]
fn bridge_contract_names_match_documented_bridge_surface() {
    // assert the contract contains the exact bridge action names
}
```
Required expectations:
- `sgclawConnect`
- `sgclawStart`
- `sgclawStop`
- `sgclawSubmitTask`
- these names live in an explicit lifecycle/session contract type, not in the browser-action request type
Do **not** invent additional action names in this first test.
- [ ] **Step 2: Run the single contract test and verify it fails**
Run:
```bash
cargo test --test browser_bridge_contract_test bridge_contract_names_match_documented_bridge_surface -- --nocapture
```
Expected: FAIL because `src/browser/bridge_contract.rs` does not exist yet.
- [ ] **Step 3: Add the second failing contract test for browser-action request shaping**
In the same file, add one focused test proving the bridge contract can represent a browser action request without leaking raw websocket business-frame semantics.
Test shape:
```rust
#[test]
fn bridge_contract_represents_browser_action_requests_without_ws_business_frames() {
    // create a click/navigate/getText style action request and assert shape
}
```
Required assertions:
- request shape identifies the intended browser action semantically
- request shape is distinct from the lifecycle/session bridge call type
- request shape does **not** embed `sgBrowerserOpenPage`, `callBackJsToCpp`, or other raw websocket business-frame names
- [ ] **Step 4: Run the second contract test and verify it fails**
Run:
```bash
cargo test --test browser_bridge_contract_test bridge_contract_represents_browser_action_requests_without_ws_business_frames -- --nocapture
```
Expected: FAIL because the bridge contract does not exist yet.
- [ ] **Step 5: Implement the minimal bridge contract module**
Create `src/browser/bridge_contract.rs` with only the types needed by the tests.
Recommended shape:
```rust
pub enum BridgeLifecycleCall {
    Connect,
    Start,
    Stop,
    SubmitTask,
}

impl BridgeLifecycleCall {
    pub fn bridge_name(&self) -> &'static str {
        match self {
            Self::Connect => "sgclawConnect",
            Self::Start => "sgclawStart",
            Self::Stop => "sgclawStop",
            Self::SubmitTask => "sgclawSubmitTask",
        }
    }
}

pub struct BridgeBrowserActionRequest {
    pub action: String,
    pub params: serde_json::Value,
    pub expected_domain: String,
}
```
Rules:
- model the documented bridge/lifecycle naming explicitly
- keep the browser action request semantic, not websocket-frame-shaped
- keep the module small and repository-local
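The two contract tests from Steps 1 and 3 could be backed by checks like the following. This is a self-contained sketch rather than the final module: `params` is modelled as a plain key/value list instead of `serde_json::Value` so the example needs no external crate, and `ALL`, `RAW_WS_FRAME_NAMES`, and `leaks_raw_ws_frame_name` are hypothetical helpers added for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BridgeLifecycleCall {
    Connect,
    Start,
    Stop,
    SubmitTask,
}

impl BridgeLifecycleCall {
    /// Hypothetical convenience list so a test can iterate every call.
    pub const ALL: [BridgeLifecycleCall; 4] = [
        BridgeLifecycleCall::Connect,
        BridgeLifecycleCall::Start,
        BridgeLifecycleCall::Stop,
        BridgeLifecycleCall::SubmitTask,
    ];

    pub fn bridge_name(&self) -> &'static str {
        match self {
            Self::Connect => "sgclawConnect",
            Self::Start => "sgclawStart",
            Self::Stop => "sgclawStop",
            Self::SubmitTask => "sgclawSubmitTask",
        }
    }
}

/// Semantic browser-action request; `params` is a simplified stand-in
/// for the `serde_json::Value` used in the recommended shape.
pub struct BridgeBrowserActionRequest {
    pub action: String,
    pub params: Vec<(String, String)>,
    pub expected_domain: String,
}

/// Raw websocket business-frame names that must not appear in a request.
const RAW_WS_FRAME_NAMES: [&str; 2] = ["sgBrowerserOpenPage", "callBackJsToCpp"];

/// Returns true if the action name or any parameter mentions a raw frame name.
pub fn leaks_raw_ws_frame_name(request: &BridgeBrowserActionRequest) -> bool {
    RAW_WS_FRAME_NAMES.iter().any(|raw| {
        request.action.contains(*raw)
            || request
                .params
                .iter()
                .any(|(key, value)| key.contains(*raw) || value.contains(*raw))
    })
}
```

Because lifecycle names live on `BridgeLifecycleCall` and action data lives on `BridgeBrowserActionRequest`, the type system already enforces the layer separation that the first test asserts.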
- [ ] **Step 6: Re-run the contract tests**
Run:
```bash
cargo test --test browser_bridge_contract_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 7: Commit**
```bash
git add src/browser/bridge_contract.rs tests/browser_bridge_contract_test.rs
git commit -m "test: define sgClaw bridge contract surface"
```
---
## Task 2: Add the repo-local transport seam and bridge-backed `BrowserBackend`
**Files:**
- Create: `src/browser/bridge_transport.rs`
- Create: `src/browser/bridge_backend.rs`
- Create: `tests/browser_bridge_backend_test.rs`
- Reuse: `src/browser/backend.rs`
- Reuse: `src/browser/bridge_contract.rs`
- Reuse: `src/compat/browser_tool_adapter.rs`
- [ ] **Step 1: Write the first failing backend test for action mapping**
Create `tests/browser_bridge_backend_test.rs` with one focused test proving a `BrowserBackend` action is translated into the bridge contract request shape.
Start with a narrow action such as `Action::Navigate`.
Required assertions:
- `Action::Navigate` becomes one semantic bridge browser-action request
- the request preserves action parameters and expected domain
- the test does **not** assert any raw websocket payload strings
- [ ] **Step 2: Run the first backend test and verify it fails**
Run:
```bash
cargo test --test browser_bridge_backend_test bridge_backend_maps_navigate_to_bridge_action_request -- --nocapture
```
Expected: FAIL because `src/browser/bridge_backend.rs` does not exist yet.
- [ ] **Step 3: Add the second failing backend test for reply normalization**
Add one focused test proving the backend can normalize a successful bridge reply into the existing `CommandOutput` shape expected by `BrowserBackend` callers.
- [ ] **Step 4: Run the second backend test and verify it fails**
Run:
```bash
cargo test --test browser_bridge_backend_test bridge_backend_normalizes_successful_bridge_reply -- --nocapture
```
Expected: FAIL because the backend does not exist yet.
- [ ] **Step 5: Add the third failing backend test for bridge-side errors**
Add one focused test proving a bridge-side error normalizes into the correct outward `PipeError` semantics for backend callers.
- [ ] **Step 6: Run the error-path test and verify it fails**
Run:
```bash
cargo test --test browser_bridge_backend_test bridge_backend_maps_bridge_failure_to_pipe_error -- --nocapture
```
Expected: FAIL because the backend does not exist yet.
- [ ] **Step 7: Implement the minimal transport seam and bridge backend**
Create `src/browser/bridge_transport.rs` and `src/browser/bridge_backend.rs`.
The transport seam must:
- define the repo-local `BridgeActionTransport` contract used for Layer-2 browser-action execution only
- accept semantic `BridgeBrowserActionRequest` values and return semantic success/error replies
- remain small, explicit, and easy to fake in tests
The backend must:
- implement the existing `BrowserBackend` trait
- translate supported actions into `BridgeBrowserActionRequest`
- depend on `BridgeActionTransport` instead of raw websocket payload building
- normalize success/error replies into existing backend-facing result types
Rules:
- do not embed raw websocket business-frame names
- do not change `BrowserBackend` semantics for existing callers
- do not pull lifecycle/session bridge calls into this backend layer
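A minimal sketch of the seam and backend might look like the block below. It is illustrative only: `Action` is reduced to two variants, the backend returns a plain `String` where the real code returns `CommandOutput`, `invoke` is an inherent method rather than an implementation of the real `BrowserBackend` trait, and `BridgeActionReply`, `FakeBridgeTransport`, and the `"bridge action failed: ..."` message are assumptions.

```rust
use std::sync::{Arc, Mutex};

/// Local stand-in for the crate's `PipeError`.
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Protocol(String),
}

/// Semantic Layer-2 request (params simplified to key/value pairs).
#[derive(Debug, Clone, PartialEq)]
pub struct BridgeBrowserActionRequest {
    pub action: String,
    pub params: Vec<(String, String)>,
    pub expected_domain: String,
}

/// Hypothetical semantic reply type returned by the transport.
#[derive(Debug, Clone, PartialEq)]
pub enum BridgeActionReply {
    Success(String),
    Error(String),
}

/// Repo-local seam: Layer-2 browser-action execution only.
pub trait BridgeActionTransport: Send + Sync {
    fn execute(&self, request: BridgeBrowserActionRequest) -> Result<BridgeActionReply, PipeError>;
}

/// Reduced stand-in for the crate's `Action` type.
pub enum Action {
    Navigate { url: String },
    GetText { selector: String },
}

pub struct BridgeBrowserBackend {
    transport: Arc<dyn BridgeActionTransport>,
}

impl BridgeBrowserBackend {
    pub fn new(transport: Arc<dyn BridgeActionTransport>) -> Self {
        Self { transport }
    }

    /// Maps an action to a semantic request and normalizes the reply.
    pub fn invoke(&self, action: Action, expected_domain: &str) -> Result<String, PipeError> {
        let (name, params) = match action {
            Action::Navigate { url } => ("navigate", vec![("url".to_string(), url)]),
            Action::GetText { selector } => ("getText", vec![("selector".to_string(), selector)]),
        };
        let request = BridgeBrowserActionRequest {
            action: name.to_string(),
            params,
            expected_domain: expected_domain.to_string(),
        };
        match self.transport.execute(request)? {
            BridgeActionReply::Success(data) => Ok(data),
            BridgeActionReply::Error(message) => {
                Err(PipeError::Protocol(format!("bridge action failed: {}", message)))
            }
        }
    }
}

/// Fake transport for tests: records requests and returns a canned reply.
pub struct FakeBridgeTransport {
    pub seen: Mutex<Vec<BridgeBrowserActionRequest>>,
    pub reply: BridgeActionReply,
}

impl BridgeActionTransport for FakeBridgeTransport {
    fn execute(&self, request: BridgeBrowserActionRequest) -> Result<BridgeActionReply, PipeError> {
        self.seen.lock().unwrap().push(request);
        Ok(self.reply.clone())
    }
}
```

Keeping the fake in the test crate and the trait in `src/browser/bridge_transport.rs` would let the three backend tests above run without any browser host.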
- [ ] **Step 8: Re-run the bridge backend tests**
Run:
```bash
cargo test --test browser_bridge_backend_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 9: Re-run browser tool adapter coverage**
Run:
```bash
cargo test --test compat_browser_tool_test -- --nocapture
```
Expected: PASS, proving the existing browser action mapping remains reusable with the new backend.
- [ ] **Step 10: Commit**
```bash
git add src/browser/bridge_transport.rs src/browser/bridge_backend.rs tests/browser_bridge_backend_test.rs src/compat/browser_tool_adapter.rs src/browser/mod.rs
git commit -m "feat: add bridge-backed browser backend"
```
---
## Task 3: Wire the bridge-backed backend into the real-browser service path
**Files:**
- Modify: `src/service/server.rs`
- Modify: `src/compat/runtime.rs`
- Modify: `src/compat/orchestration.rs`
- Modify: `src/compat/workflow_executor.rs`
- Modify: `tests/service_task_flow_test.rs`
- Modify: `tests/service_ws_session_test.rs`
- Reuse: `src/browser/bridge_backend.rs`
- Reuse: `src/browser/bridge_contract.rs`
- Reuse: `src/browser/bridge_transport.rs`
- [ ] **Step 1: Write the first failing service-path test for bridge backend construction**
Add or update one focused service test proving the real-browser execution path constructs and uses the bridge-backed backend instead of the raw websocket backend assumption.
The test should observe backend selection at the nearest possible seam.
- [ ] **Step 2: Run the focused service test and verify it fails**
Run the narrowest affected service test command.
Expected: FAIL because the service path is not wired to the bridge backend yet.
- [ ] **Step 3: Add the minimal service/runtime wiring**
Change the relevant service/browser execution path so it constructs the new bridge-backed backend, injects the repo-local bridge transport provider at the nearest seam, and passes the backend through the existing runtime/orchestration flow.
Rules:
- keep the pipe path unchanged
- keep changes localized
- keep lifecycle/session bridge handling separate from per-action browser execution
- preserve existing runtime log and task flow behavior where possible
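One way to make backend selection observable "at the nearest possible seam" is a small constructor function that the service path calls and a test can call directly. The sketch below is only an illustration under assumed names: the `BrowserBackend` trait here is a stand-in with a single hypothetical `kind` method, and `ExecutionPath` and `build_backend` do not come from the repository.

```rust
/// Stand-in for the existing trait; the real one has different methods.
pub trait BrowserBackend {
    fn kind(&self) -> &'static str;
}

pub struct PipeBackend;
pub struct BridgeBackend;

impl BrowserBackend for PipeBackend {
    fn kind(&self) -> &'static str {
        "pipe"
    }
}

impl BrowserBackend for BridgeBackend {
    fn kind(&self) -> &'static str {
        "bridge"
    }
}

/// Hypothetical selector for which execution path a task runs on.
#[derive(Clone, Copy)]
pub enum ExecutionPath {
    Pipe,
    RealBrowser,
}

/// The single place where the backend is chosen; the pipe arm is untouched.
pub fn build_backend(path: ExecutionPath) -> Box<dyn BrowserBackend> {
    match path {
        ExecutionPath::Pipe => Box::new(PipeBackend),
        ExecutionPath::RealBrowser => Box::new(BridgeBackend),
    }
}
```

Keeping the choice in one function means the failing test from Step 1 can assert on the constructed backend without starting a websocket server.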
- [ ] **Step 4: Add one direct-route/fallback regression**
Add one focused regression proving a bridge-backed backend still works through the direct-route or fallback path exercised by `src/compat/workflow_executor.rs`.
- [ ] **Step 5: Run the bridge-focused service tests**
Run:
```bash
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 6: Re-run workflow/runtime regressions**
Run:
```bash
cargo test compat::workflow_executor::tests -- --nocapture
cargo test --test compat_browser_tool_test --test browser_script_skill_tool_test --test task_runner_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 7: Commit**
```bash
git add src/service/server.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/service_ws_session_test.rs tests/service_task_flow_test.rs
git commit -m "refactor: route real browser path through bridge backend"
```
---
## Task 4: Verify bridge-path behavior without pipe regression
**Files:**
- Reuse only unless a failing test proves a minimal fix is still needed
- [ ] **Step 1: Run bridge/backend unit coverage**
Run:
```bash
cargo test --test browser_bridge_contract_test --test browser_bridge_backend_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 2: Run service/runtime bridge-path regressions**
Run:
```bash
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 3: Run required pipe regressions**
Run:
```bash
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 4: Build the affected binaries**
Run:
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: PASS.
- [ ] **Step 5: Stop if any regression points back to raw websocket assumptions**
If any test still encodes raw external websocket business-frame assumptions as the real-browser path, update that test to the bridge-backed design rather than patching production code to satisfy the old assumption.
- [ ] **Step 6: Commit**
```bash
git add tests/browser_bridge_contract_test.rs tests/browser_bridge_backend_test.rs tests/service_ws_session_test.rs tests/service_task_flow_test.rs
git commit -m "test: verify bridge path and preserve pipe behavior"
```
---
## Verification Checklist
### Bridge contract tests
```bash
cargo test --test browser_bridge_contract_test -- --nocapture
```
Expected: documented bridge names and semantic browser-action request shaping are locked.
### Bridge backend tests
```bash
cargo test --test browser_bridge_backend_test -- --nocapture
```
Expected: backend action mapping and reply/error normalization are green.
### Service/runtime integration tests
```bash
cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture
cargo test compat::workflow_executor::tests -- --nocapture
```
Expected: real-browser path uses the bridge-backed backend and direct-route/fallback behavior remains intact.
### Pipe regressions
```bash
cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture
```
Expected: pipe path remains unchanged.
### Binary build verification
```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```
Expected: affected binaries compile.
---
## Notes for Implementation
- The websocket probe work stays in the repository as diagnostic tooling; do not repurpose it into the bridge adapter.
- `docs/_tmp_sgbrowser_ws_probe_transcript.md` is evidence that rejected the raw-ws-direct assumption, not a contract to keep satisfying.
- Favor one narrow bridge-backed backend over broad runtime rewrites.
- If the nearest repo-local seam is still slightly abstract because the external SuperRPA host code is outside this repository, make that abstraction explicit and test it rather than guessing hidden behavior.

@@ -0,0 +1,566 @@
# WS Browser Integration Surface Correction Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Replace the unvalidated raw-ws-direct assumption with an evidence-backed decision: either prove a minimal sgBrowser bootstrap sequence for raw websocket control, or pivot to the real browser bridge surface.
**Architecture:** Treat the existing ws-native backend as a protocol/testing asset, not as a validated production integration surface. First build a narrow probe/validation harness that can run candidate bootstrap sequences and capture exact live transcripts from the real endpoint. Then branch decisively: if a reproducible bootstrap sequence yields real status/callback frames, implement that bootstrap path; otherwise stop raw-ws speculation and write the bridge-first implementation slice.
**Tech Stack:** Rust 2021, existing `src/browser/ws_protocol.rs` / `src/browser/ws_backend.rs`, service websocket infrastructure, `tungstenite`, `serde_json`, current Rust test suite, local sgBrowser websocket documentation.
---
## Scope Guardrails
- Do **not** add more speculative production fixes to `src/service/server.rs` just to “try one more thing.”
- Do **not** claim raw websocket is the supported path unless the live probe transcript proves it.
- Do **not** modify `src/lib.rs`, pipe handshake behavior, or the pipe browser-tool path.
- Do **not** implement both the bootstrap architecture and the bridge architecture in the same branch.
- Keep the ws-native code unless and until the bridge decision makes specific pieces obsolete.
- Prefer a dedicated probe surface over embedding validation logic into production request handling.
---
## File Structure
### Existing files to modify
- Modify: `src/browser/mod.rs`
- export the new `ws_probe` module so both tests and the probe binary use the same library surface
- Modify: `src/browser/ws_protocol.rs`
- only if a tiny helper extraction is required for test/probe readability
- do not change existing protocol semantics in this slice
- Modify: `tests/browser_ws_protocol_test.rs`
- add deterministic coverage for any extracted helper used by the probe harness
### New files to create
- Create: `src/bin/sgbrowser_ws_probe.rs`
- standalone diagnostic binary for ordered frame-script probing against a live sgBrowser websocket endpoint
- Create: `src/browser/ws_probe.rs`
- small reusable probe/transcript module, if needed, to keep the binary and tests focused
- Create: `tests/browser_ws_probe_test.rs`
- deterministic fake-server tests for transcript capture, timeout reporting, and scripted sequence execution
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md` **only if Option B wins after probing**
- follow-up bridge design, not part of the initial coding slice
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md` **only if Option B wins after probing**
- follow-up bridge implementation plan, not part of the initial coding slice
- Create: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
- temporary evidence artifact capturing the real endpoint probe matrix and outcomes
### Files deliberately not changed in the initial slice
- `src/lib.rs`
- `src/agent/task_runner.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
- `src/browser/ws_backend.rs`
Unless the probe results prove a real bootstrap contract, these files stay untouched.
---
## Task 1: Build a deterministic websocket probe harness before touching production behavior
**Files:**
- Create: `src/browser/ws_probe.rs`
- Create: `tests/browser_ws_probe_test.rs`
- Reuse: `src/browser/ws_protocol.rs`
- [ ] **Step 1: Write the first failing transcript test**
Create `tests/browser_ws_probe_test.rs` with one focused fake-server test that executes a scripted sequence of outgoing text frames and records all received text frames in order.
Start with this shape:
```rust
#[test]
fn probe_records_welcome_then_silence_transcript() {
    // fake server sends one welcome frame and then stays silent
    // probe result should preserve that exact transcript and mark timeout/silence explicitly
}
```
Required assertions:
- the probe can connect to the fake websocket server
- it can send a scripted first frame
- it records the first inbound text frame exactly
- it returns a transcript/result object that distinguishes timeout from protocol parse failure
- [ ] **Step 2: Run the single new test and verify it fails**
Run:
```bash
cargo test --test browser_ws_probe_test probe_records_welcome_then_silence_transcript -- --nocapture
```
Expected: FAIL because the probe harness does not exist yet.
- [ ] **Step 3: Add the second failing probe test for ordered multi-step scripts**
In the same file, add a test proving the harness can run multiple outgoing frames in a fixed order and keep the transcript segmented by step.
Test shape:
```rust
#[test]
fn probe_runs_ordered_frame_script_and_records_per_step_results() {
    // send bootstrap frame 1, bootstrap frame 2, then minimal action
    // fake server replies differently at each step
    // probe result preserves exact order and outcomes
}
```
Required assertions:
- outgoing frames are sent in the configured order
- inbound frames are attached to the correct step
- the probe can stop the sequence on timeout/close if configured
- [ ] **Step 4: Run the ordered-script test and verify it fails**
Run:
```bash
cargo test --test browser_ws_probe_test probe_runs_ordered_frame_script_and_records_per_step_results -- --nocapture
```
Expected: FAIL because the probe harness does not exist yet.
- [ ] **Step 5: Add the third failing probe test for close/reset visibility**
Add one focused fake-server test, `probe_reports_socket_close_separately_from_timeout`, that closes the connection after a script step and asserts the transcript reports close/reset rather than a generic timeout.
- [ ] **Step 6: Run the close/reset test and verify it fails**
Run:
```bash
cargo test --test browser_ws_probe_test probe_reports_socket_close_separately_from_timeout -- --nocapture
```
Expected: FAIL because the probe harness does not exist yet.
- [ ] **Step 7: Implement the minimal probe module**
Create `src/browser/ws_probe.rs` with only the types and behavior needed by the tests.
Recommended shape:
```rust
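/// One scripted outgoing frame.
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

/// What happened after one step's frame was sent.
pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
    ConnectFailed(String),
}

/// Per-step transcript entry: the exact frame sent plus its outcome.
pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

// `ProbeError` is the module's own error type; its exact shape is left to the implementation.
pub fn run_probe_script(
    ws_url: &str,
    timeout: std::time::Duration,
    steps: &[ProbeStep],
) -> Result<Vec<ProbeStepResult>, ProbeError> {
    todo!("connect, send ordered frames, collect exact transcript")
}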
```
Rules:
- do not parse business meaning yet
- do not mix this into normal task execution
- preserve exact raw text frames in transcript results
- keep the module small and diagnostic-oriented
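The core loop of such a harness can be sketched without a real websocket by putting the socket behind a tiny trait. Everything below is an assumption-laden illustration: `ProbeSocket`, `RecvEvent`, `run_steps`, and `FakeSocket` are hypothetical names, `ConnectFailed` is omitted because no connection is made here, and frames collected before a close are dropped in this simplified version (a real harness would probably keep them).

```rust
use std::collections::VecDeque;
use std::time::Duration;

pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

#[derive(Debug, PartialEq)]
pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
}

#[derive(Debug, PartialEq)]
pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

/// Result of one read attempt on the (hypothetical) socket seam.
pub enum RecvEvent {
    Text(String),
    TimedOut,
    Closed,
}

/// Hypothetical seam over the real websocket so the loop is testable.
pub trait ProbeSocket {
    fn send_text(&mut self, payload: &str);
    fn recv_text(&mut self, timeout: Duration) -> RecvEvent;
}

/// Send each step in order and attach raw inbound frames to that step.
pub fn run_steps(
    socket: &mut dyn ProbeSocket,
    timeout: Duration,
    steps: &[ProbeStep],
) -> Vec<ProbeStepResult> {
    let mut results = Vec::new();
    for step in steps {
        socket.send_text(&step.payload);
        let mut frames = Vec::new();
        let outcome = if !step.expect_reply {
            // Nothing is read for fire-and-forget steps.
            ProbeOutcome::Received(frames)
        } else {
            loop {
                match socket.recv_text(timeout) {
                    RecvEvent::Text(frame) => frames.push(frame),
                    RecvEvent::TimedOut if frames.is_empty() => break ProbeOutcome::TimedOut,
                    RecvEvent::TimedOut => break ProbeOutcome::Received(frames),
                    RecvEvent::Closed => break ProbeOutcome::Closed,
                }
            }
        };
        let closed = outcome == ProbeOutcome::Closed;
        results.push(ProbeStepResult {
            label: step.label.clone(),
            sent: step.payload.clone(),
            outcome,
        });
        if closed {
            break; // no point sending further frames on a closed socket
        }
    }
    results
}

/// Scripted fake: replays queued events, then reports silence.
pub struct FakeSocket {
    pub sent: Vec<String>,
    pub inbound: VecDeque<RecvEvent>,
}

impl ProbeSocket for FakeSocket {
    fn send_text(&mut self, payload: &str) {
        self.sent.push(payload.to_string());
    }
    fn recv_text(&mut self, _timeout: Duration) -> RecvEvent {
        self.inbound.pop_front().unwrap_or(RecvEvent::TimedOut)
    }
}
```

The sketch never interprets a frame's content; it only distinguishes text, silence, and close, which matches the rule that the probe should not parse business meaning.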
- [ ] **Step 8: Re-run the new probe tests**
Run:
```bash
cargo test --test browser_ws_probe_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 9: Commit**
```bash
git add src/browser/ws_probe.rs tests/browser_ws_probe_test.rs
git commit -m "test: add sgBrowser websocket probe harness"
```
---
## Task 2: Add a standalone probe binary for live sgBrowser evidence collection
**Files:**
- Create: `src/bin/sgbrowser_ws_probe.rs`
- Modify: `src/browser/ws_probe.rs`
- Modify: `src/browser/mod.rs`
- Modify: `tests/browser_ws_probe_test.rs`
- [ ] **Step 1: Write the failing helper parser test**
In `tests/browser_ws_probe_test.rs`, add one focused test for a new helper function in `src/browser/ws_probe.rs`:
```rust
#[test]
fn parse_probe_args_accepts_ws_url_timeout_and_ordered_steps() {
    // parse a fixed argv-style slice into a ProbeCliConfig
}
```
Create and use this exact helper shape:
```rust
pub struct ProbeCliConfig {
    pub ws_url: String,
    pub timeout_ms: u64,
    pub steps: Vec<ProbeStep>,
}
pub fn parse_probe_args(args: &[String]) -> Result<ProbeCliConfig, ProbeError>
```
The test must assert that these exact arguments parse successfully and preserve step order:
```text
--ws-url ws://127.0.0.1:12345
--timeout-ms 1500
--step open-agent::["about:blank","sgOpenAgent"]
--step open-hot::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
```
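A possible parser for these arguments is sketched below. It assumes each flag is followed by exactly one value and that a step is written as `label::payload`, split at the first `::`. The `ProbeError` tuple struct and the choice to default `expect_reply` to `true` are assumptions for the example, not decisions taken from this plan.

```rust
#[derive(Debug, PartialEq)]
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

#[derive(Debug, PartialEq)]
pub struct ProbeCliConfig {
    pub ws_url: String,
    pub timeout_ms: u64,
    pub steps: Vec<ProbeStep>,
}

/// Placeholder error type (assumed shape).
#[derive(Debug, PartialEq)]
pub struct ProbeError(pub String);

pub fn parse_probe_args(args: &[String]) -> Result<ProbeCliConfig, ProbeError> {
    let mut ws_url = None;
    let mut timeout_ms = None;
    let mut steps = Vec::new();

    let mut iter = args.iter();
    while let Some(flag) = iter.next() {
        // Every supported flag takes exactly one value.
        let value = iter
            .next()
            .ok_or_else(|| ProbeError(format!("missing value for {}", flag)))?;
        match flag.as_str() {
            "--ws-url" => ws_url = Some(value.clone()),
            "--timeout-ms" => {
                let parsed = value
                    .parse::<u64>()
                    .map_err(|e| ProbeError(format!("invalid --timeout-ms: {}", e)))?;
                timeout_ms = Some(parsed);
            }
            "--step" => {
                // Split at the first "::" so the payload is kept verbatim.
                let (label, payload) = value.split_once("::").ok_or_else(|| {
                    ProbeError(format!("step must be label::payload, got {}", value))
                })?;
                steps.push(ProbeStep {
                    label: label.to_string(),
                    payload: payload.to_string(),
                    expect_reply: true, // assumed default
                });
            }
            other => return Err(ProbeError(format!("unknown argument {}", other))),
        }
    }

    Ok(ProbeCliConfig {
        ws_url: ws_url.ok_or_else(|| ProbeError("--ws-url is required".to_string()))?,
        timeout_ms: timeout_ms
            .ok_or_else(|| ProbeError("--timeout-ms is required".to_string()))?,
        steps,
    })
}
```

Steps are pushed in the order they appear on the command line, so step order is preserved without any sorting or keyed lookup.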
- [ ] **Step 2: Run the parser test and verify it fails**
Run:
```bash
cargo test --test browser_ws_probe_test parse_probe_args_accepts_ws_url_timeout_and_ordered_steps -- --nocapture
```
Expected: FAIL because `parse_probe_args(...)` and `ProbeCliConfig` do not exist yet.
- [ ] **Step 3: Implement the helper and binary together**
In `src/browser/ws_probe.rs`, add `ProbeCliConfig` and `parse_probe_args(...)`.
In `src/browser/mod.rs`, add the module export:
```rust
pub mod ws_probe;
```
In `src/bin/sgbrowser_ws_probe.rs`, implement the binary using only `std::env::args()` plus `parse_probe_args(...)`.
Required behavior:
- accepts a websocket URL
- accepts a timeout in milliseconds
- accepts repeated ordered steps
- runs the probe harness
- prints a markdown-friendly transcript including:
- step label
- exact sent payload
- exact received frames, if any
- timeout/close outcome
Output shape can be simple, for example:
```text
STEP 1 bootstrap-open-agent
SEND: ["about:blank","sgOpenAgent"]
RECV: Welcome! You are client #1
OUTCOME: timeout
```
Rules:
- no production/browser-runtime integration
- no hidden fallback logic
- no “best effort” guessing of next steps
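Rendering the transcript can stay a pure function over the step results, which keeps it trivially testable. The sketch below reuses the recommended `ProbeOutcome` shape; `render_transcript` and the outcome labels other than `timeout` are assumed names. Note that with this enum a step is either `Received` or `TimedOut`, so the sketch prints `received` where the sample output above shows a received frame followed by `timeout`.

```rust
pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
    ConnectFailed(String),
}

pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

/// Render step results as plain lines that paste cleanly into markdown.
pub fn render_transcript(results: &[ProbeStepResult]) -> String {
    let mut out = String::new();
    for (index, result) in results.iter().enumerate() {
        out.push_str(&format!("STEP {} {}\n", index + 1, result.label));
        out.push_str(&format!("SEND: {}\n", result.sent));
        let outcome = match &result.outcome {
            ProbeOutcome::Received(frames) => {
                // Frames are printed verbatim, one per line, in arrival order.
                for frame in frames {
                    out.push_str(&format!("RECV: {}\n", frame));
                }
                "received".to_string()
            }
            ProbeOutcome::TimedOut => "timeout".to_string(),
            ProbeOutcome::Closed => "closed".to_string(),
            ProbeOutcome::ConnectFailed(reason) => format!("connect-failed: {}", reason),
        };
        out.push_str(&format!("OUTCOME: {}\n", outcome));
    }
    out
}
```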
- [ ] **Step 4: Re-run the parser/helper test**
Run:
```bash
cargo test --test browser_ws_probe_test parse_probe_args_accepts_ws_url_timeout_and_ordered_steps -- --nocapture
```
Expected: PASS.
- [ ] **Step 5: Build the probe binary**
Run:
```bash
cargo build --bin sgbrowser_ws_probe
```
Expected: PASS.
- [ ] **Step 6: Commit**
```bash
git add src/bin/sgbrowser_ws_probe.rs src/browser/ws_probe.rs src/browser/mod.rs tests/browser_ws_probe_test.rs
git commit -m "feat: add live sgBrowser websocket probe binary"
```
---
## Task 3: Run the real endpoint probe matrix and write the evidence transcript
**Files:**
- Create: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
- Reuse only: `src/bin/sgbrowser_ws_probe.rs`, `docs/_tmp_sgbrowser_ws_api_doc.txt`
- [ ] **Step 1: Run the no-bootstrap baseline probe**
Run exactly:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baseline-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
Append the exact output under a `## baseline-open` heading in `docs/_tmp_sgbrowser_ws_probe_transcript.md`.
- [ ] **Step 2: Run the documented `sgOpenAgent` candidate**
Run exactly:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "post-open-agent-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
Append the exact output under a `## open-agent` heading.
- [ ] **Step 3: Run the documented `sgSetAuthInfo` candidate**
Run exactly:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "set-auth::[\"about:blank\",\"sgSetAuthInfo\",\"probe-user\",\"probe-token\"]" --step "post-set-auth-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
Append the exact output under a `## set-auth` heading.
- [ ] **Step 4: Run the documented `sgBrowserLogin` candidate**
Run the command below, but only after replacing its placeholder payload as described underneath:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "browser-login::{\"request\":\"use-json-helper\"}"
```
Before running, replace the placeholder payload with the exact JSON-array frame produced by the helper for:
```json
["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]
```
Then add a second step in the same command:
```json
["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
```
Append the exact output under a `## browser-login` heading.
- [ ] **Step 5: Run the documented `sgBrowerserActiveTab` candidate**
Run exactly:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "post-active-tab-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
Append the exact output under a `## active-tab` heading.
- [ ] **Step 6: Run one combined bootstrap candidate**
Run exactly:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "combined-open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "combined-active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "combined-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
Append the exact output under a `## combined-bootstrap` heading.
- [ ] **Step 7: Run `requesturl` variants for the minimal action**
Run exactly these two additional commands:
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "target-as-requesturl::[\"https://www.zhihu.com/hot\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baidu-requesturl::[\"https://www.baidu.com\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```
Append the exact outputs under `## requesturl-variants`.
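Hand-escaping these frames in a shell is error-prone, so a small builder for the `--step` argument may help when preparing the matrix. This is a sketch only: `build_frame` and `step_arg` are hypothetical helpers, and the frame layout (request URL first, then action name, then string arguments) is inferred from the commands in this task rather than from a confirmed protocol definition.

```rust
/// Minimal JSON string encoder for the characters that matter here.
fn json_string(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Build a JSON-array text frame: [request_url, action, args...].
pub fn build_frame(request_url: &str, action: &str, args: &[&str]) -> String {
    let mut parts = vec![json_string(request_url), json_string(action)];
    parts.extend(args.iter().map(|arg| json_string(arg)));
    format!("[{}]", parts.join(","))
}

/// Compose the value passed to `--step` as `label::payload`.
pub fn step_arg(label: &str, frame: &str) -> String {
    format!("{}::{}", label, frame)
}
```

Only the first element changes between the two `requesturl` variants, so both commands can be generated from the same call with a different `request_url`.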
- [ ] **Step 8: Summarize the matrix in the transcript file**
At the end of `docs/_tmp_sgbrowser_ws_probe_transcript.md`, add this exact table template and fill it in:
```markdown
| Sequence | Sent frames | First reply | Final outcome | Decision signal |
| --- | --- | --- | --- | --- |
```
- [ ] **Step 9: Determine which architecture option wins**
Decision rule:
- if at least one sequence reproducibly yields real numeric status and/or callback frames for a real business action, Option A (bootstrap-validated raw websocket) wins
- otherwise, Option B (bridge-first) wins
Do not weaken this decision rule.
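The decision rule can be written down as a small pure function so that it is applied the same way to every row of the matrix. The sketch below is illustrative: `SequenceEvidence`, `ArchitectureOption`, and `decide` are hypothetical names, a "numeric status" is assumed to be a frame that parses as an integer, and callback detection is left to a caller-supplied predicate because this plan does not define the callback frame format.

```rust
#[derive(Debug, PartialEq)]
pub enum ArchitectureOption {
    /// Option A: bootstrap-validated raw websocket.
    BootstrapValidatedRawWebsocket,
    /// Option B: bridge-first.
    BridgeFirst,
}

/// Evidence for one probed sequence (hypothetical record).
pub struct SequenceEvidence {
    pub name: String,
    /// Raw frames received in reply to the real business action.
    pub business_action_frames: Vec<String>,
    /// Whether the same result was observed on repeated runs.
    pub reproduced: bool,
}

/// Assumption: a status frame is text that parses as an integer, e.g. "0".
fn is_numeric_status(frame: &str) -> bool {
    frame.trim().parse::<i64>().is_ok()
}

/// Option A wins only if some reproducible sequence produced a status or
/// callback frame; a welcome banner alone never counts.
pub fn decide<F>(evidence: &[SequenceEvidence], is_callback: F) -> ArchitectureOption
where
    F: Fn(&str) -> bool,
{
    let option_a = evidence.iter().any(|seq| {
        seq.reproduced
            && seq
                .business_action_frames
                .iter()
                .any(|frame| is_numeric_status(frame.as_str()) || is_callback(frame.as_str()))
    });
    if option_a {
        ArchitectureOption::BootstrapValidatedRawWebsocket
    } else {
        ArchitectureOption::BridgeFirst
    }
}
```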
- [ ] **Step 10: Commit the evidence artifact**
```bash
git add docs/_tmp_sgbrowser_ws_probe_transcript.md
git commit -m "docs: capture sgBrowser websocket probe evidence"
```
---
## Task 4A: If Option A wins, write the narrow bootstrap implementation slice
**Files:**
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md`
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md`
- Reuse as evidence input:
- `docs/_tmp_sgbrowser_ws_probe_transcript.md`
- `docs/_tmp_sgbrowser_ws_api_doc.txt`
- `src/browser/ws_backend.rs`
- `src/browser/ws_protocol.rs`
- [ ] **Step 1: Write one new design doc capturing the proven bootstrap contract**
Create:
```text
docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md
```
Include:
- exact validated sequence
- exact required state (`requesturl`, active tab, agent page, auth payload)
- exact failure semantics
- why this is sufficient evidence to keep raw websocket as the product surface
- [ ] **Step 2: Write one new implementation plan for the bootstrap path**
Create:
```text
docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md
```
Plan only the minimal production changes required to embed the validated bootstrap sequence into the service/browser path.
- [ ] **Step 3: Commit the bootstrap decision docs**
```bash
git add docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md
git commit -m "docs: capture ws browser bootstrap contract"
```
- [ ] **Step 4: Stop after writing the bootstrap plan**
Do not begin production implementation in the same slice unless the user explicitly asks for execution.
---
## Task 4B: If Option B wins, write the bridge-first implementation slice
**Files:**
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md`
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md`
- Reuse as evidence input:
- `docs/_tmp_sgbrowser_ws_probe_transcript.md`
- `frontend/archive/sgClaw验证-已归档/testRunner.js`
- `docs/superpowers/specs/2026-03-25-superrpa-sgclaw-browser-control-design.md`
- `docs/archive/项目管理与排期/协作时间表.md`
- `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`
- [ ] **Step 1: Write the bridge-path design doc**
Create `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md`.
The design must specify:
- why raw websocket is considered non-validated for external control
- which bridge surface becomes authoritative
- where sgClaw should integrate (`FunctionsUI`, host bridge, `BrowserAction`, `CommandRouter`, or the nearest validated seam in this repo)
- how to preserve pipe behavior and existing abstractions where practical
- [ ] **Step 2: Write the bridge-path implementation plan**
Create `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md`.
The plan must:
- identify exact files to touch
- describe the narrowest adapter implementation
- keep TDD/task granularity as in this document
- avoid speculative work outside the bridge slice
- [ ] **Step 3: Commit the bridge decision docs**
```bash
git add docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md
git commit -m "docs: define bridge-first sgBrowser integration"
```
- [ ] **Step 4: Stop after writing the bridge plan**
Do not start the bridge implementation in the same slice unless the user explicitly asks for execution.
---
## Verification Checklist
### Deterministic probe harness tests
```bash
cargo test --test browser_ws_probe_test -- --nocapture
```
Expected: transcript capture, ordered scripts, timeout reporting, and close/reset reporting all pass.
### Probe binary build
```bash
cargo build --bin sgbrowser_ws_probe
```
Expected: PASS.
### Live evidence collection
- run the probe matrix against the real configured endpoint
- save exact transcripts to `docs/_tmp_sgbrowser_ws_probe_transcript.md`
- make the architecture decision using the documented rule
### Follow-up branch condition
- if Option A wins, repository contains a bootstrap-contract design + plan
- if Option B wins, repository contains a bridge-path design + plan
- no production runtime changes are made until that decision is written down
---
## Notes for Implementation
- The existing `WsBrowserBackend` fix that remembers the last navigated URL remains valid; do not revert it.
- The previous auth-replacement work also remains valid; it removed a real bug but did not prove the raw websocket architecture.
- Keep the probe tool brutally literal: exact sent frames, exact received frames, explicit timeout/close outcomes.
- Resist the temptation to make the probe “smart.” Smart probes hide evidence.
- If the real endpoint still replies only with the welcome banner and then silence across the matrix, treat that as a decision, not as an excuse for more guessing.

@@ -0,0 +1,362 @@
# WS Browser Welcome Frame Compatibility Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Make the ws service path tolerate the real sgBrowser welcome banner (`Welcome! You are client #...`) without weakening general ws protocol validation or changing pipe behavior.
**Architecture:** Keep the shared `WsBrowserBackend` strict and implement the compatibility shim only in `ServiceBrowserWsClient`, which is already the real-browser adapter for the ws service path. Add one positive red test for the known welcome frame and one negative guard test proving non-matching first text frames still fail as protocol errors (this guard is expected to pass before any production change), then make the minimal stateful change in `src/service/server.rs` and verify ws + pipe regressions.
**Tech Stack:** Rust 2021, tungstenite websocket client/server, existing `WsBrowserBackend`, existing `ServiceBrowserWsClient`, existing Rust unit/integration test suite.
---
## File Structure
### Existing files to modify
- Modify: `src/service/server.rs`
- Add the one-time per-connection welcome-skip state to `ServiceBrowserWsClient`
- Add the minimal helper(s) for detecting and discarding the first known welcome frame
- Add focused service-adapter unit tests in the existing `#[cfg(test)]` module
- Reuse: `src/browser/ws_backend.rs`
- Do not change protocol parsing rules; only verify behavior remains strict for all non-service callers
- Reuse: `tests/service_task_flow_test.rs`
- Re-run to confirm the ws service path still reaches the browser websocket after the service-side shim
- Reuse: `tests/browser_ws_backend_test.rs`
- Re-run to prove the shared backend semantics remain unchanged
### Files deliberately not changed
- `src/browser/ws_backend.rs`
- `src/browser/ws_protocol.rs`
- `src/agent/task_runner.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
- `src/lib.rs`
The design explicitly keeps the welcome-banner workaround out of the shared backend and out of the pipe path.
---
## Task 1: Reproduce the real welcome-frame failure with focused unit tests
**Files:**
- Modify: `src/service/server.rs`
- [ ] **Step 1: Add the positive failing test for the known welcome frame**
In the existing `#[cfg(test)] mod tests` inside `src/service/server.rs`, add one focused test next to the current ws adapter tests.
Test shape:
```rust
#[test]
fn future_server_side_ws_native_adapter_skips_initial_known_welcome_frame() {
    // fake server sends:
    // 1. "Welcome! You are client #1"
    // 2. "0"
    // backend.invoke(Action::Navigate, ...) should succeed
}
```
Required assertions:
- the fake websocket server accepts one connection
- it sends the welcome banner first, then the numeric success status
- `WsBrowserBackend.invoke(Action::Navigate, ...)` returns `Ok(CommandOutput { success: true, .. })`
- [ ] **Step 2: Run only the positive new test and watch it fail**
Run:
```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_skips_initial_known_welcome_frame -- --nocapture
```
Expected: FAIL with a protocol error containing `invalid browser status frame: Welcome! You are client #1`.
- [ ] **Step 3: Add the negative guard test for arbitrary first text**
In the same `#[cfg(test)]` module, add one negative test proving we do **not** silently skip arbitrary first text frames.
Test shape:
```rust
#[test]
fn future_server_side_ws_native_adapter_does_not_skip_unknown_first_text_frame() {
    // fake server sends:
    // 1. "Hello from server"
    // assert invoke(...) fails as PipeError::Protocol(...)
}
```
Required assertions:
- the first frame is a non-matching text frame such as `Hello from server`
- `invoke(...)` fails
- the failure remains a protocol error rather than success or timeout
- [ ] **Step 4: Run only the negative new test and verify the current behavior is already strict**
Run:
```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_does_not_skip_unknown_first_text_frame -- --nocapture
```
Expected: PASS, proving the current implementation already treats arbitrary first text as a protocol error. Keep that assertion in place before any production change.
- [ ] **Step 5: Confirm the TDD gate before implementation**
Do not implement production code before both tests exist and the positive test has failed on current behavior.
---
## Task 2: Add the minimal per-connection welcome-skip state in the service adapter
**Files:**
- Modify: `src/service/server.rs`
- [ ] **Step 1: Add one-time per-connection state to `ServiceBrowserWsClient`**
Extend `ServiceBrowserWsClient` with one extra state field that tracks whether the initial welcome candidate has already been consumed for the current websocket connection.
Allowed shape:
```rust
struct ServiceBrowserWsClient {
    browser_ws_url: String,
    browser_socket: Mutex<Option<WebSocket<MaybeTlsStream<TcpStream>>>>,
    initial_text_frame_checked: Mutex<bool>,
}
```
or an equally small equivalent.
Rules:
- state is per connection, not per request
- state must survive multiple `invoke(...)` calls while reusing the same socket
- do not add broader protocol state machines
- [ ] **Step 2: Add a narrow welcome-frame matcher**
In `src/service/server.rs`, add one small helper that recognizes only the known banner prefix:
```rust
fn is_known_welcome_frame(frame: &str) -> bool {
    frame.starts_with("Welcome! You are client #")
}
```
Rules:
- no regex needed
- no generic “ignore arbitrary text” logic
- keep the matcher local to `src/service/server.rs`
- [ ] **Step 3: Update `recv_text_timeout(...)` to skip at most one initial known banner**
Modify `impl WsClient for ServiceBrowserWsClient` so that the first text frame received after connection establishment is handled like this:
1. read the next text frame
2. if the initial-frame state is still false:
- mark the first-frame check as consumed
- if the frame matches `is_known_welcome_frame(...)`, read the next frame and return that next frame instead
3. otherwise, return the frame unchanged
Rules:
- skip only once per connection
- do not loop indefinitely over multiple text frames
- do not swallow unknown first text frames
- do not change timeout / close / reset / connect-failure behavior
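The skip-once behaviour can be sketched independently of the real socket by reading frames through a small trait. This is an illustration under assumptions: `TextFrameSource`, `RecvError`, `WelcomeSkippingReader`, `replace_connection`, and `ScriptedSource` are hypothetical names, and the real change would live inside `ServiceBrowserWsClient` rather than in a separate wrapper type.

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq)]
pub enum RecvError {
    Timeout,
    Closed,
}

/// Hypothetical seam over "read the next text frame from the socket".
pub trait TextFrameSource {
    fn next_text(&mut self) -> Result<String, RecvError>;
}

fn is_known_welcome_frame(frame: &str) -> bool {
    frame.starts_with("Welcome! You are client #")
}

pub struct WelcomeSkippingReader<S: TextFrameSource> {
    source: S,
    initial_text_frame_checked: bool,
}

impl<S: TextFrameSource> WelcomeSkippingReader<S> {
    pub fn new(source: S) -> Self {
        Self { source, initial_text_frame_checked: false }
    }

    /// A fresh connection gets its own one-time welcome check.
    pub fn replace_connection(&mut self, source: S) {
        self.source = source;
        self.initial_text_frame_checked = false;
    }

    pub fn recv_text(&mut self) -> Result<String, RecvError> {
        let frame = self.source.next_text()?; // errors pass through unchanged
        if !self.initial_text_frame_checked {
            self.initial_text_frame_checked = true;
            if is_known_welcome_frame(&frame) {
                // Skip exactly one banner; whatever follows is returned as-is.
                return self.source.next_text();
            }
        }
        Ok(frame)
    }
}

/// Scripted fake: replays queued results, then reports a timeout.
pub struct ScriptedSource(pub VecDeque<Result<String, RecvError>>);

impl TextFrameSource for ScriptedSource {
    fn next_text(&mut self) -> Result<String, RecvError> {
        self.0.pop_front().unwrap_or(Err(RecvError::Timeout))
    }
}
```

There is no loop in `recv_text`, so a second banner-like frame on the same connection is handed to the caller unchanged and would still be rejected by the strict backend.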
- [ ] **Step 4: Reset the one-time state when a fresh socket is created**
When `with_socket(...)` establishes a brand-new websocket connection, ensure the one-time banner-check state is reset so a new connection can tolerate its own first welcome frame.
- [ ] **Step 5: Add one reconnect regression in the service adapter tests**
Add one focused test proving the welcome skip resets on a fresh connection after socket close/reset.
Test shape:
```rust
#[test]
fn future_server_side_ws_native_adapter_skips_welcome_again_after_reconnect() {
    // first connection closes after use
    // second fresh connection sends the same welcome banner again
    // both invocations succeed
}
```
Required assertion:
- the one-time skip is per connection, not global for the client instance
- [ ] **Step 6: Run the positive new test**
Run:
```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_skips_initial_known_welcome_frame -- --nocapture
```
Expected: PASS.
- [ ] **Step 7: Run the negative new test**
Run:
```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_does_not_skip_unknown_first_text_frame -- --nocapture
```
Expected: PASS, proving unknown first text is still treated as a protocol error.
- [ ] **Step 8: Run the reconnect regression**
Run:
```bash
cargo test service::server::tests::future_server_side_ws_native_adapter_skips_welcome_again_after_reconnect -- --nocapture
```
Expected: PASS.
- [ ] **Step 9: Run the full service adapter unit group**
Run:
```bash
cargo test service::server::tests -- --nocapture
```
Expected: PASS, including the existing tests for:
- status `0` success
- connect failure => `PipeError::Protocol("browser websocket connect failed: ...")`
- disconnect/reset => `PipeError::PipeClosed`
- callback timeout => `PipeError::Timeout`
- new known-welcome success path
- new unknown-first-frame strictness path
- new reconnect reset behavior
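The error expectations in this list amount to a small mapping from what the websocket layer observed to a `PipeError` variant. A sketch of that mapping follows; the `PipeError` enum here is a local stand-in that only mirrors the variants named above, and `WsFailure` is a hypothetical classification introduced for illustration.

```rust
/// Local stand-in mirroring the `PipeError` variants named in the list above.
#[derive(Debug, PartialEq)]
enum PipeError {
    Protocol(String),
    PipeClosed,
    Timeout,
}

/// Hypothetical classification of what the websocket layer observed.
#[derive(Debug)]
enum WsFailure {
    ConnectFailed(String),
    Disconnected,
    CallbackTimedOut,
}

/// Maps each failure to the error the existing tests expect.
fn map_ws_failure(failure: WsFailure) -> PipeError {
    match failure {
        WsFailure::ConnectFailed(reason) => {
            PipeError::Protocol(format!("browser websocket connect failed: {reason}"))
        }
        WsFailure::Disconnected => PipeError::PipeClosed,
        WsFailure::CallbackTimedOut => PipeError::Timeout,
    }
}
```

The welcome-frame change should leave this mapping untouched, which is what the existing tests in the group guard.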
---
## Task 3: Verify the shared backend stayed strict and the ws service path still works
**Files:**
- Reuse: `tests/browser_ws_backend_test.rs`
- Reuse: `tests/service_task_flow_test.rs`
- Reuse: `src/browser/ws_backend.rs`
- [ ] **Step 1: Re-run the shared ws backend tests unchanged**
Run:
```bash
cargo test --test browser_ws_backend_test -- --nocapture
```
Expected: PASS. This proves `WsBrowserBackend` semantics remain unchanged for its existing deterministic callers.
- [ ] **Step 2: Re-run the service task-flow regression**
Run:
```bash
cargo test --test service_task_flow_test -- --nocapture
```
Expected: PASS, including the auth-regression test that proves the ws service path reaches the browser websocket and no longer emits `invalid hmac seed: session key must not be empty`.
- [ ] **Step 3: Re-run the ws-focused mixed verification**
Run:
```bash
cargo test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
Expected: PASS.
---
## Task 4: Re-run the real manual smoke that originally failed
**Files:**
- Reuse only: no code changes unless a fresh reproducer proves another bug
- [ ] **Step 1: Confirm real browser websocket reachability**
Run a reachability check for `ws://127.0.0.1:12345` (or the configured `browserWsUrl`) before starting the smoke run.
Expected: reachable.
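One way to script the reachability check is a plain TCP connect with a timeout. The sketch below assumes the address is given as `host:port` with a literal IP (for example `127.0.0.1:12345`); `is_tcp_reachable` is a hypothetical helper, and it only shows that the port accepts connections, not that the websocket handshake succeeds.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` can be opened within `timeout`.
/// Addresses that do not parse as `ip:port` are reported as unreachable.
fn is_tcp_reachable(addr: &str, timeout: Duration) -> bool {
    match addr.parse::<SocketAddr>() {
        Ok(socket_addr) => TcpStream::connect_timeout(&socket_addr, timeout).is_ok(),
        Err(_) => false,
    }
}
```

A call such as `is_tcp_reachable("127.0.0.1:12345", Duration::from_secs(2))` would cover the default endpoint used in this plan.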
- [ ] **Step 2: Start the real ws service**
Run:
```bash
cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```
Expected: the service prints:
- `sg_claw ready: ...`
- the resolved `service_ws_listen_addr`
- the configured `browser_ws_url`
- [ ] **Step 3: Re-run the original failing manual smoke**
Run:
```bash
# Instruction text: "Open the Zhihu hot list and read the main-area text of the page"
printf '打开知乎热榜并读取页面主区域文本\n' | cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```
Expected:
- no `invalid browser status frame: Welcome! You are client #1`
- browser actions proceed past the first status frame
- if the browser later fails for another reason, capture that new reason exactly
- [ ] **Step 4: Re-run the old Zhihu export task smoke**
Run:
```bash
# Instruction text: "Read the Zhihu hot-list data and export an Excel file"
printf '读取知乎热榜数据,并导出 excel 文件\n' | cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```
Expected:
- no `invalid browser status frame: Welcome! You are client #1`
- the task reaches the real browser action path beyond connection banner handling
- [ ] **Step 5: Stop and debug if a new real-browser issue appears**
If smoke now fails for a different reason, do not piggyback a second fix into this slice without:
- capturing the exact new output
- writing a new focused spec/plan if the issue is materially different
---
## Verification Checklist
### Service adapter unit tests
```bash
cargo test service::server::tests -- --nocapture
```
Expected: all service-side ws adapter tests pass, including the new welcome-frame positive/negative cases and reconnect reset case.
### Shared ws backend + ws service regressions
```bash
cargo test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture
```
Expected: PASS.
### Real smoke verification
- `browserWsUrl` reachable
- `sg_claw` starts with real config
- `sg_claw_client` no longer fails on `Welcome! You are client #...`
- Zhihu minimal read task gets past the first status frame
- Zhihu export task gets past the first status frame


@@ -0,0 +1,564 @@
# Zhihu Release WS Function-Callback Migration Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Move only the Zhihu direct-execution path to the new Release browser websocket interaction style while keeping the existing pipe protocol and non-Zhihu submit behavior unchanged.
**Architecture:** Keep `ClientMessage` / `ServiceMessage`, `run_submit_task_with_browser_backend(...)`, and the high-level Zhihu workflow steps unchanged. First prove the exact Release browser interaction contract with transcript-backed probes. Then implement the smallest Zhihu-scoped backend path that follows that proven contract. Do not globally rewire the submit path unless the probe evidence proves there is no narrower safe seam.
**Tech Stack:** Rust, tungstenite, existing sgclaw service/client pipe protocol, `docs/_tmp_sgbrowser_ws_api_doc.txt`, Release browser websocket at `ws://127.0.0.1:12345`, current Zhihu direct-execution workflow.
---
## Context
The user has now made the target behavior explicit:
- the browser has changed and the working reference behavior is the user-provided HTML page that connects to `ws://127.0.0.1:12345`
- that page sends a bootstrap registration frame: `{"type":"register","role":"web"}`
- browser requests are still JSON arrays such as `[window.location.href, "sgBrowserSetTheme", "1"]` and `[window.location.href, "sgBrowerserGetUrls", "showUrls"]`
- callback-bearing browser behavior is now centered on page-defined JS callback functions like `showUrls`, not on Rust directly reading a websocket callback frame as the final business result
- the existing sgclaw pipe protocol must remain unchanged
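The two frame shapes described above (the register object and the `[requesturl, action, ...args]` array) can be sketched as plain string builders. This is an illustration only: `json_string`, `register_frame`, and `action_frame` are hypothetical helpers, all arguments are assumed to be strings, and the hand-written escaper exists only to keep the sketch dependency-free where real code would more likely use `serde_json`.

```rust
/// Minimal JSON string escaper for this sketch only.
fn json_string(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// The bootstrap registration frame quoted in the context above.
fn register_frame() -> String {
    r#"{"type":"register","role":"web"}"#.to_string()
}

/// Builds `[requesturl, action, ...args]` as a JSON array of strings.
fn action_frame(request_url: &str, action: &str, args: &[&str]) -> String {
    let mut parts = vec![json_string(request_url), json_string(action)];
    parts.extend(args.iter().map(|arg| json_string(arg)));
    format!("[{}]", parts.join(","))
}
```

For example, `action_frame(url, "sgBrowerserGetUrls", &["showUrls"])` yields the same array shape the reference page sends, with the callback passed as a function name.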
The current sgclaw drift that must be corrected is visible in:
- `src/browser/ws_protocol.rs`
  - `Action::Navigate` currently emits `sgHideBrowserCallAfterLoaded` with an inline `callBackJsToCpp(...)` string
- `src/browser/ws_backend.rs`
  - Rust currently waits for a browser websocket callback frame and treats that as the action result
- `tests/service_ws_session_test.rs:498-605`
- `tests/service_task_flow_test.rs:499-635`
  - existing **generic submit-flow** regressions still lock in the old direct raw-websocket callback-frame assumption
  - these are useful as non-regression guardrails, but they are not themselves Zhihu-specific regressions
Zhihu-specific verification must therefore be added explicitly instead of assuming those Baidu-path tests already cover Zhihu.
The new browser style proves these facts and only these facts so far:
1. sgclaw must handle a register-first websocket handshake
2. browser requests are still `[requesturl, action, ...args]`
3. some browser capabilities now return through page-defined callback functions like `showUrls`
4. the current direct raw-websocket callback expectation in the Zhihu path is no longer a safe assumption
The production seam is **not** pre-decided here. Task 1 must determine whether Zhihu can be integrated by:
- a direct Zhihu-scoped backend with no helper page, or
- a helper page / relay design because named page callbacks are the only reliable result path
Until Task 1 evidence is captured, both remain hypotheses.
## Evidence to preserve in the implementation
### Browser websocket API doc
From `docs/_tmp_sgbrowser_ws_api_doc.txt`:
- `ws://localhost:12345` is the browser websocket endpoint
- request frames are array payloads with `requesturl`
- `sgBrowerserGetUrls(callback)` uses a callback **function name**: `[requesturl,"sgBrowerserGetUrls", callback]`
- `sgBrowserCallAfterLoaded(targetUrl, callback)` and `sgHideBrowserCallAfterLoaded(targetUrl, callback)` use callback strings with parentheses
- `callBackJsToCpp(param)` uses `sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt`
- `sgBrowserRegJsFun(targeturl, funContent)` and `sgBrowserExcuteJsFun(targeturl, funName)` exist and may be useful when the helper page needs durable callback helpers
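The `@_@`-delimited `callBackJsToCpp` parameter listed above can be split into its five fields as sketched here. `CallbackParam` and `parse_callback_param` are hypothetical names, and treating everything after the fourth delimiter as response text is an assumption that the probe transcript should confirm.

```rust
/// Fields of a `callBackJsToCpp` parameter, in the order the API doc lists them.
#[derive(Debug, PartialEq)]
struct CallbackParam<'a> {
    source_url: &'a str,
    target_url: &'a str,
    callback: &'a str,
    action_url: &'a str,
    response_text: &'a str,
}

/// Returns `None` when fewer than five fields are present.
fn parse_callback_param(param: &str) -> Option<CallbackParam<'_>> {
    // splitn(5, ..) keeps any later "@_@" inside the response text intact
    // (an assumption about how the last field should be treated).
    let mut fields = param.splitn(5, "@_@");
    Some(CallbackParam {
        source_url: fields.next()?,
        target_url: fields.next()?,
        callback: fields.next()?,
        action_url: fields.next()?,
        response_text: fields.next()?,
    })
}
```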
### Current working HTML pattern from the user
The now-working reference interaction is:
```html
<script>
  const socket = new WebSocket('ws://127.0.0.1:12345');
  socket.onopen = () => {
    // the register frame is the first frame sent on the connection
    socket.send(JSON.stringify({type: 'register', role: 'web'}));
  };
  // excerpt: this request must only run once the socket is open
  socket.send(JSON.stringify([window.location.href, "sgBrowerserGetUrls", "showUrls"]));
  function showUrls(urls) {
    // browser invokes this page-defined callback
  }
</script>
```
That is the browser behavior sgclaw now needs to follow.
---
## Critical files
### Production files to modify
- `src/browser/ws_protocol.rs`
- `src/compat/workflow_executor.rs` (only if a narrow Zhihu-specific correction is required after backend swap)
- `src/service/server.rs` (only if the chosen Zhihu-scoped integration seam must be wired here)
- `src/service/mod.rs` (only if startup plumbing changes are truly required)
- `src/browser/mod.rs`
### New production files likely needed
- `src/browser/zhihu_release_backend.rs`
  - a Zhihu-scoped `BrowserBackend` adapter that follows the proven Release browser interaction style without changing non-Zhihu routes
- `src/service/browser_callback_host.rs` **only if the probe proves a service-controlled helper page is actually required**
  - service-local helper-page lifecycle and callback relay, if evidence shows the browser cannot be driven safely without it
### Existing files to preserve
- `src/agent/task_runner.rs`
- `src/service/protocol.rs`
- `src/compat/orchestration.rs`
- `src/compat/runtime.rs`
- `src/pipe/*`
### Existing direct-ws files to review explicitly
- `src/browser/ws_backend.rs`
- `tests/browser_ws_backend_test.rs`
These files currently encode the old direct raw-websocket callback expectation. The implementation must either:
- leave them untouched as legacy/direct-contract coverage with no Zhihu production callers, or
- update/remove the Zhihu-specific assumptions they currently lock in.
### Primary test files
- `tests/browser_ws_probe_test.rs`
- `tests/browser_ws_protocol_test.rs`
- `tests/service_ws_session_test.rs`
- `tests/service_task_flow_test.rs`
- `tests/task_runner_test.rs`
- `tests/browser_ws_backend_test.rs`
---
## File structure decisions
### `src/browser/zhihu_release_backend.rs`
Prefer a Zhihu-scoped backend first.
Responsibilities:
- keep the same `BrowserBackend` trait surface
- implement only the behavior needed by the current Zhihu direct-execution route
- translate `Action::Navigate`, `Action::GetText`, and `Action::Eval` into the proven Release-browser interaction style
- normalize results back into `CommandOutput`
- avoid affecting non-Zhihu callers
This is the preferred seam because the user asked to change the current Zhihu flow, not to redesign the whole submit pipeline.
### `src/service/browser_callback_host.rs` (conditional)
Create this file only if Task 1 probe evidence proves that sgclaw must host or control a page in order to receive named callback-function results.
If it is needed, the plan must keep the design minimal and specific:
- one concrete transport only (choose websocket or HTTP, not “websocket or HTTP”)
- explicit readiness handshake
- explicit request correlation by `request_id`
- explicit cleanup when the submit task ends
If Task 1 shows a simpler seam, do not create this file.
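If the helper host does turn out to be needed, the `request_id` correlation and cleanup requirements could look roughly like the table below. This is a sketch under stated assumptions: `PendingCallbacks` and its methods are hypothetical, payloads are plain strings, and the transport that feeds `deliver` is deliberately left out.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical table of callbacks still awaited, keyed by `request_id`.
#[derive(Default)]
struct PendingCallbacks {
    waiting: HashMap<String, Sender<String>>,
}

impl PendingCallbacks {
    /// Registers a request and returns the receiver its result arrives on.
    fn register(&mut self, request_id: &str) -> Receiver<String> {
        let (tx, rx) = channel();
        self.waiting.insert(request_id.to_string(), tx);
        rx
    }

    /// Delivers a callback payload; returns false for unknown or already
    /// answered ids, so stray callbacks cannot reach the wrong waiter.
    fn deliver(&mut self, request_id: &str, payload: String) -> bool {
        match self.waiting.remove(request_id) {
            Some(tx) => tx.send(payload).is_ok(),
            None => false,
        }
    }

    /// Drops every outstanding entry when the submit task ends.
    fn clear(&mut self) {
        self.waiting.clear();
    }
}
```

A waiter would typically call `recv_timeout` on the returned receiver, which keeps timeout handling on the caller's side; after `clear`, outstanding receivers observe a disconnected channel.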
### `src/browser/ws_protocol.rs`
Do not let this file keep only the old direct-callback assumption.
It should become the shared place for doc-native request builders such as:
- browser bootstrap frames proven by the transcript
- `sgBrowserCallAfterLoaded` / `sgHideBrowserCallAfterLoaded`
- `sgBrowserExcuteJsCodeByArea`
- optional `sgBrowserRegJsFun` / `sgBrowserExcuteJsFun`
But do **not** let `ws_protocol.rs` absorb service-host lifecycle logic.
### `src/browser/ws_backend.rs` and `tests/browser_ws_backend_test.rs`
Handle these explicitly in the implementation:
- if they still describe a valid direct browser contract, keep them as isolated legacy/direct-ws coverage only
- if their current navigate/callback assumptions conflict with the proven Release Zhihu path, update or narrow those tests so they no longer describe the active Zhihu integration path
Do not leave the old direct-callback assumptions ambiguously “reviewed”; the implementation must make their status explicit.
---
## Task 1: Capture the new Release browser contract in a reproducible probe transcript
**Files:**
- Review/modify: `src/browser/ws_probe.rs`
- Review/modify: `src/bin/sgbrowser_ws_probe.rs`
- Review/modify: `tests/browser_ws_probe_test.rs`
- Create: `docs/_tmp_release_ws_callback_host_transcript.md`
- [ ] **Step 1: Verify current probe coverage against the Release-browser questions**
Read the existing probe module and tests and check whether they already prove all of the following:
- a register-first websocket script can be expressed
- a later array action frame can be expressed in the same script
- per-step inbound frames/outcomes are preserved separately
- timeout/close remain distinguishable in the transcript
Required result:
- identify the exact existing tests that already prove these behaviors
- identify the smallest missing Release-specific coverage, if any
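The four properties being checked can be pictured with a small transcript model. The types below are hypothetical and are not the existing `ws_probe` API; they only illustrate what "per-step separation" and "timeout/close remain distinguishable" mean for the recorded output.

```rust
/// Outcome of one probe step; timeout and close stay distinct variants.
#[derive(Debug, Clone, PartialEq)]
enum StepOutcome {
    Received(Vec<String>),
    Timeout,
    Closed,
}

/// One sent frame plus what came back before the next step started.
#[derive(Debug, Clone, PartialEq)]
struct ProbeStep {
    sent: String,
    outcome: StepOutcome,
}

/// Renders a transcript with one numbered section per step.
fn render_transcript(steps: &[ProbeStep]) -> String {
    let mut out = String::new();
    for (index, step) in steps.iter().enumerate() {
        out.push_str(&format!("step {}\n  sent: {}\n", index + 1, step.sent));
        match &step.outcome {
            StepOutcome::Received(frames) => {
                for frame in frames {
                    out.push_str(&format!("  recv: {frame}\n"));
                }
            }
            StepOutcome::Timeout => out.push_str("  outcome: timeout\n"),
            StepOutcome::Closed => out.push_str("  outcome: closed\n"),
        }
    }
    out
}
```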
- [ ] **Step 2: Add only the missing regression coverage**
If current tests do **not** already prove the Release-browser bootstrap shape, add the narrowest failing regression in `tests/browser_ws_probe_test.rs`.
Preferred shape if coverage is missing:
```rust
#[test]
fn probe_supports_register_then_array_action_script() {
    // fake server expects:
    // 1. {"type":"register","role":"web"}
    // 2. ["http://127.0.0.1/helper.html","sgBrowerserGetUrls","showUrls"]
}
```
And, if still missing, add one regression proving per-step transcript separation for the register reply and later action reply.
If those behaviors are already covered, skip new test creation and record the exact test names to rely on.
- [ ] **Step 3: Run the relevant probe tests**
Run the narrowest exact tests that prove the Release bootstrap behavior, or the full file if multiple areas changed:
```bash
cargo test --test browser_ws_probe_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 4: Make the probe binary ergonomic for the Release transcript if needed**
Only if the current CLI cannot conveniently express the real Release-browser script, make the smallest change needed in `src/bin/sgbrowser_ws_probe.rs` / `src/browser/ws_probe.rs` so it can capture:
- register frame behavior
- minimal `sgBrowserSetTheme`
- minimal `sgBrowerserGetUrls`
- exact inbound websocket text per step
Do not redesign the probe if it already supports this.
- [ ] **Step 5: Run the live probe against the Release browser and record the real bootstrap**
Use the probe binary against the real endpoint to capture at minimum:
- register frame behavior
- minimal `sgBrowserSetTheme`
- minimal `sgBrowerserGetUrls`
- whether replies come back as websocket text, page-function invocation only, or both
Save the exact transcript in `docs/_tmp_release_ws_callback_host_transcript.md`.
Required output in that temp doc:
- exact sent frames
- exact received websocket frames
- the observed rule for when named callback functions are invoked
- whether Option A (a direct Zhihu-scoped backend with no helper page) or Option B (a helper page / relay design) is supported by evidence
- [ ] **Step 6: Commit the probe-only slice if code changed**
If probe code/tests changed:
```bash
git add src/browser/ws_probe.rs src/bin/sgbrowser_ws_probe.rs tests/browser_ws_probe_test.rs docs/_tmp_release_ws_callback_host_transcript.md
git commit -m "test: capture release browser ws bootstrap contract"
```
If only the transcript doc changed, stage only that file and use a docs/test-appropriate commit message.
---
## Task 2: Choose the narrowest Zhihu-only production seam from the probe evidence
**Files:**
- Modify: `src/service/server.rs` (only if required)
- Modify: `src/browser/mod.rs`
- Modify: `src/compat/workflow_executor.rs` (only if required)
- Create: `src/browser/zhihu_release_backend.rs`
- Create: `src/service/browser_callback_host.rs` **only if required**
- Test: `tests/service_ws_session_test.rs`
- Test: `tests/service_task_flow_test.rs`
- [ ] **Step 1: Write down the seam decision in the plan notes before coding**
Based on the transcript from Task 1, record which one of these is supported by evidence:
- Option A: a Zhihu-scoped backend can talk to the Release browser directly with no service-hosted helper page
- Option B: a Zhihu-scoped backend needs a service-controlled helper page because named page callbacks are the only reliable way to get business results
Do not proceed until one option is chosen explicitly from evidence.
- [ ] **Step 2: Add a failing service/task-flow regression that proves only the Zhihu path changes**
Update or add focused tests so that:
- Zhihu submit flow uses the new Release-browser interaction seam
- non-Zhihu behavior is unchanged
- pipe messages remain unchanged
Required assertions:
- the new path is activated only for Zhihu route detection
- `ClientMessage` / `ServiceMessage` stay identical
- existing non-Zhihu submit behavior is not accidentally rerouted
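The route-gating property these assertions describe can be sketched as a pure selection function that a test can exercise without a browser. Everything here is illustrative: `BackendChoice`, `looks_like_zhihu_route`, and `choose_backend` are hypothetical names, and the keyword match is only a placeholder for whatever route detection the codebase already uses.

```rust
/// Which backend a submit request is routed to (names are illustrative).
#[derive(Debug, PartialEq)]
enum BackendChoice {
    ExistingDefault,
    ZhihuRelease,
}

/// Placeholder route check for this sketch; the project's own Zhihu route
/// detection should be reused instead of this keyword match.
fn looks_like_zhihu_route(instruction: &str) -> bool {
    let lowered = instruction.to_lowercase();
    lowered.contains("zhihu") || instruction.contains("知乎")
}

/// Only Zhihu routes are sent to the new backend; everything else keeps
/// the existing selection.
fn choose_backend(instruction: &str) -> BackendChoice {
    if looks_like_zhihu_route(instruction) {
        BackendChoice::ZhihuRelease
    } else {
        BackendChoice::ExistingDefault
    }
}
```

Keeping the decision in one small function makes the "non-Zhihu behavior is unchanged" assertion cheap to state directly.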
- [ ] **Step 3: Run the new focused regression and confirm failure first**
Run only the exact tests you added, by name:
```bash
cargo test --test service_ws_session_test <new_test_name> -- --nocapture
cargo test --test service_task_flow_test <new_test_name> -- --nocapture
```
Expected: FAIL because the Zhihu-specific seam does not exist yet.
- [ ] **Step 4: Implement the chosen seam with the smallest blast radius**
If Option A won:
- add `src/browser/zhihu_release_backend.rs`
- wire it only where the Zhihu direct-execution route is selected
- leave global submit-path wiring alone
If Option B won:
- add `src/service/browser_callback_host.rs` with one specific transport and one explicit readiness/correlation model
- add `src/browser/zhihu_release_backend.rs` to talk to that helper path
- wire it only for the Zhihu route
In both cases:
- do not change non-Zhihu callers
- do not redesign `run_submit_task_with_browser_backend(...)`
- do not change the pipe protocol
- [ ] **Step 5: Make the status of old direct-ws code explicit**
Update `src/browser/ws_backend.rs` / `tests/browser_ws_backend_test.rs` only as needed so they no longer ambiguously describe the active Zhihu path.
Allowed outcomes:
- keep them untouched as legacy/direct-ws coverage with no Zhihu production caller
- narrow/update the tests so they no longer claim the active Zhihu integration path
Not allowed:
- leaving the plan and code in a state where both old and new paths appear to be the active Zhihu contract
- [ ] **Step 6: Run focused integration tests**
Run:
```bash
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 7: Commit the seam-selection slice**
Adjust staged files to match the option actually implemented, for example:
```bash
git add src/browser/zhihu_release_backend.rs src/browser/mod.rs src/service/server.rs src/service/browser_callback_host.rs tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/browser_ws_backend_test.rs
git commit -m "feat: route zhihu flow through release browser ws contract"
```
Only stage files that were truly changed.
---
## Task 3: Implement Zhihu action mapping on the chosen Release-browser seam
**Files:**
- Modify: `src/browser/ws_protocol.rs`
- Modify: `src/browser/zhihu_release_backend.rs`
- Test: `tests/browser_ws_protocol_test.rs`
- Create: `tests/browser_zhihu_release_backend_test.rs`
- [ ] **Step 1: Write the first failing backend test for Zhihu navigate mapping**
Create `tests/browser_zhihu_release_backend_test.rs` with a fake transport/relay and assert that `Action::Navigate` for the Zhihu path becomes the exact browser request shape proven by Task 1.
Start with this shape:
```rust
#[test]
fn zhihu_release_backend_maps_navigate_to_proven_release_frame() {
    // invoke Action::Navigate
    // assert exact outbound frame/opcode chosen from transcript evidence
}
```
Required assertions:
- the call site still uses `BrowserBackend::invoke(...)`
- the exact outbound frame matches the recorded Release-browser evidence
- request correlation stays deterministic
- [ ] **Step 2: Run the single new backend test and verify it fails**
Run:
```bash
cargo test --test browser_zhihu_release_backend_test zhihu_release_backend_maps_navigate_to_proven_release_frame -- --nocapture
```
Expected: FAIL because the backend does not exist yet.
- [ ] **Step 3: Implement minimal `Navigate` support**
In `src/browser/zhihu_release_backend.rs`:
- implement `BrowserBackend`
- support `Action::Navigate` first
- use `ws_protocol.rs` helpers for exact browser-frame construction
- do not hardcode speculative opcodes; follow the transcript from Task 1
- [ ] **Step 4: Add failing tests for `GetText` and `Eval`**
Add tests proving:
- `Action::GetText` returns `CommandOutput.data == {"text": "..."}`
- `Action::Eval` returns `CommandOutput.data == {"text": "..."}`
- callback or relay failures become `PipeError::Protocol(...)`
- [ ] **Step 5: Implement `GetText` and `Eval` on the chosen seam**
Use the smallest proven mechanism:
- if the transcript proves page-defined callback functions are required, route through them
- if `callBackJsToCpp(...)` to a page context is still part of the proven path, use it deliberately
- if `sgBrowserRegJsFun` / `sgBrowserExcuteJsFun` becomes necessary, add it only with test coverage and only for the Zhihu path
- [ ] **Step 6: Run focused backend/protocol tests**
Run:
```bash
cargo test --test browser_zhihu_release_backend_test -- --nocapture
cargo test --test browser_ws_protocol_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 7: Commit the Zhihu backend slice**
```bash
git add src/browser/ws_protocol.rs src/browser/zhihu_release_backend.rs src/browser/mod.rs tests/browser_ws_protocol_test.rs tests/browser_zhihu_release_backend_test.rs
git commit -m "feat: add zhihu release ws backend"
```
---
## Task 4: Keep the Zhihu workflow logic stable and patch only proven mismatches
**Files:**
- Review: `src/compat/workflow_executor.rs`
- Test: `tests/service_task_flow_test.rs`
- Test: `tests/compat_runtime_test.rs` (only if a focused direct-execution regression is needed)
- [ ] **Step 1: Write a failing Zhihu-specific regression only if the chosen seam changes route assumptions**
If the new Zhihu backend changes request-url or target-url handling enough to break hotlist flow, add one focused failing regression for that exact behavior.
Candidate assertions:
- hotlist navigate still logs `navigate https://www.zhihu.com/hot`
- follow-up `GetText body` still targets the Zhihu page, not any helper page
- extractor `Eval` still runs against Zhihu, not any helper page
- [ ] **Step 2: Keep the current high-level Zhihu action sequence unless a test proves otherwise**
`src/compat/workflow_executor.rs` currently does the right high-level work:
- navigate to Zhihu hotlist
- poll body text until ready
- run the extractor script
Prefer to keep this file unchanged. Only patch it if the new backend needs a narrow explicit `target_url` fix or similar evidence-backed adjustment.
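For orientation, the "poll body text until ready" step has roughly the shape below. This is not the code in `workflow_executor.rs`: `poll_until_ready` is a hypothetical helper, and the reader, readiness predicate, and pause are passed in as closures so the sketch stays independent of any backend.

```rust
/// Polls `read_body` up to `max_attempts` times and returns the first text
/// for which `is_ready` holds, or `None` if it never becomes ready.
/// `pause` runs between attempts so callers can sleep in production and
/// do nothing in tests. Read errors stop the loop immediately.
fn poll_until_ready<E>(
    max_attempts: usize,
    mut read_body: impl FnMut() -> Result<String, E>,
    is_ready: impl Fn(&str) -> bool,
    mut pause: impl FnMut(),
) -> Result<Option<String>, E> {
    for attempt in 0..max_attempts {
        let text = read_body()?;
        if is_ready(&text) {
            return Ok(Some(text));
        }
        if attempt + 1 < max_attempts {
            pause();
        }
    }
    Ok(None)
}
```

With the new backend, only the closure behind `read_body` would change; the loop itself has no reason to.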
- [ ] **Step 3: Run the smallest Zhihu-focused verification sweep**
Run:
```bash
cargo test --test service_task_flow_test -- --nocapture
cargo test --test compat_runtime_test zhihu -- --nocapture
```
If the `compat_runtime_test zhihu` filter is too broad or unstable, run the exact focused Zhihu cases that cover hotlist extraction.
- [ ] **Step 4: Commit only if a Zhihu-specific code change was actually required**
```bash
git add src/compat/workflow_executor.rs tests/service_task_flow_test.rs tests/compat_runtime_test.rs
git commit -m "fix: keep zhihu workflow aligned with release ws backend"
```
Skip this commit if no production change in `workflow_executor.rs` was needed.
---
## Task 5: Prove that pipe behavior and non-Zhihu behavior stayed unchanged
**Files:**
- Test: `tests/service_ws_session_test.rs`
- Test: `tests/service_task_flow_test.rs`
- Test: `tests/task_runner_test.rs`
- [ ] **Step 1: Add or update one regression that proves pipe messages are unchanged**
Use the smallest existing test seam to assert that `ClientMessage` / `ServiceMessage` payloads remain unchanged while the Zhihu route uses the new browser integration path internally.
- [ ] **Step 2: Add or update one regression that proves non-Zhihu behavior is unchanged**
Use a non-Zhihu submit or service-session case and assert it does not take the new Zhihu-specific backend path.
- [ ] **Step 3: Preserve current runtime regression guards**
The end-to-end tests must continue asserting that output does **not** contain:
- `invalid hmac seed: session key must not be empty`
- `Cannot drop a runtime in a context where blocking is not allowed`
- [ ] **Step 4: Run the final focused verification sweep**
Run:
```bash
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```
Expected: PASS.
- [ ] **Step 5: Commit the verification sweep**
```bash
git add tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/task_runner_test.rs tests/browser_ws_backend_test.rs
git commit -m "test: constrain zhihu release ws migration scope"
```
Only stage files that were truly changed.
---
## Out of scope
Do **not** do these in this slice:
- change the pipe protocol
- change `ClientMessage` / `ServiceMessage`
- redesign `run_submit_task_with_browser_backend(...)`
- reintroduce any browser bridge surface
- keep adding speculative direct-raw-websocket callback patches to `ws_backend.rs`
- redesign non-Zhihu workflows unless the new backend abstraction forces a shared fix
- create a long-lived external dependency or third-party server just to host the helper page
---
## Verification checklist
Run at minimum:
```bash
cargo test --test browser_ws_probe_test -- --nocapture
cargo test --test browser_zhihu_release_backend_test -- --nocapture
cargo test --test browser_ws_protocol_test -- --nocapture
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```
If Task 2 chose the helper-page / relay design, also run the helper-page-specific backend tests you added for that path.
Manual verification after code changes:
1. start the real Release browser/runtime that exposes `ws://127.0.0.1:12345`
2. start `sg_claw` with real config
3. start `sg_claw_client`
4. submit:
   - `打开知乎热榜获取前10条数据并导出 Excel` ("Open the Zhihu hot list, fetch the top 10 entries, and export to Excel")
5. confirm the Zhihu path uses the exact Release-browser interaction seam proven by Task 1
6. if Task 2 chose Option B, confirm the helper page / relay path is used only for the Zhihu integration seam
7. confirm non-Zhihu behavior is unchanged
8. confirm the task completes without:
   - `timeout while waiting for browser message`
   - `invalid browser status frame: Welcome! You are client #1`
   - `invalid hmac seed: session key must not be empty`
   - `Cannot drop a runtime in a context where blocking is not allowed`
---
## Expected outcome
After this slice:
- sgclaw still exposes the same pipe/service contract
- Zhihu hotlist execution uses the Release-browser websocket contract proven by Task 1
- non-Zhihu behavior remains unchanged
- old direct-ws Zhihu assumptions are no longer ambiguous in production/tests
- if Option A won, Zhihu uses a direct Release-browser backend
- if Option B won, Zhihu uses the minimal helper-page / relay seam justified by the probe evidence


@@ -0,0 +1,322 @@
# Zhihu WS Submit Realignment Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Realign Zhihu submit routes to the documented websocket callback model, removing helper-page bootstrap from the mainline while keeping the existing pipe/service contract unchanged.
**Architecture:** The change stays inside the existing submit-path backend selection and websocket protocol flow. Zhihu routes stop choosing `BrowserCallbackBackend` and instead use `WsBrowserBackend` when a real browser websocket is configured, preserving the existing pipe fallback in direct runtime when no websocket URL is available.
**Tech Stack:** Rust, tungstenite websocket client/server, serde_json, cargo test
---
## File Map
- Modify: `src/service/server.rs`
  - Change only the Zhihu route-gated submit-path backend selection
  - Remove Zhihu submit mainline use of `LiveBrowserCallbackHost` / `BrowserCallbackBackend`
  - Keep service submit path on `WsBrowserBackend`
  - Preserve initial request URL derivation for Zhihu routes
- Modify: `src/agent/mod.rs`
  - Change only the Zhihu route-gated submit-path backend selection
  - Remove Zhihu submit mainline use of `LiveBrowserCallbackHost` / `BrowserCallbackBackend`
  - Keep direct runtime pipe fallback when browser websocket URL is absent
- Modify: `tests/agent_runtime_test.rs`
  - Replace helper-page bootstrap regression with direct websocket submit regression
  - Assert no `/sgclaw/browser-helper.html` bootstrap frames are emitted
  - Assert real-page request ownership on follow-up Zhihu actions
- Modify: `src/browser/callback_host.rs`
  - Remove or rewrite the now-wrong red test that preserves Option-B callback-host startup behavior
- Verify: `tests/browser_ws_backend_test.rs`
  - Reuse existing websocket request-url behavior coverage; extend only if the new regression proves insufficient
- Reference: `docs/superpowers/specs/2026-04-04-zhihu-ws-submit-realignment-design.md`
### Task 1: Rewrite the stale submit regression around the real websocket mainline
**Files:**
- Modify: `tests/agent_runtime_test.rs:507-660`
- Test: `tests/agent_runtime_test.rs`
- [ ] **Step 1: Write the failing test**
Rename and rewrite the existing helper-page regression so it asserts the new behavior:
```rust
#[test]
fn production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap() {
    // arrange runtime context and fake browser ws server
    // submit Zhihu hotlist request
    // assert ws frames never contain "/sgclaw/browser-helper.html"
    // assert first action is navigate to https://www.zhihu.com/hot
    // assert follow-up action uses real-page requesturl instead of helper page
}
```
Use the existing fake ws helpers in the file where possible. Do not add localhost callback-host HTTP plumbing to this rewritten test.
- [ ] **Step 2: Run test to verify it fails**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
Expected: FAIL because current production code still routes Zhihu submit into `BrowserCallbackBackend` and emits helper-page bootstrap frames.
- [ ] **Step 3: Keep the regression focused**
Before touching production code, confirm the rewritten test checks only these behaviors:
```text
- no callback-host bootstrap frame
- no helper-page URL
- navigate frame still targets https://www.zhihu.com/hot
- follow-up websocket action uses real-page request ownership
```
Do not assert unrelated workflow details beyond what is needed to prove the route correction.
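The focused checks could be expressed as one helper over the frames recorded by the fake browser websocket server. This is a sketch only: `first_violation` is a hypothetical name, frames are assumed to be captured as raw JSON text, and the follow-up check assumes the request URL is the first array element serialized without leading whitespace.

```rust
const HELPER_PAGE_PATH: &str = "/sgclaw/browser-helper.html";
const ZHIHU_HOT_URL: &str = "https://www.zhihu.com/hot";

/// Returns a description of the first violated expectation, or `None`
/// when the recorded outbound frames satisfy all checks.
fn first_violation(frames: &[String]) -> Option<String> {
    // No frame may mention the helper page at all.
    if let Some(frame) = frames.iter().find(|f| f.contains(HELPER_PAGE_PATH)) {
        return Some(format!("helper page referenced: {frame}"));
    }
    // The first action must navigate to the Zhihu hot list.
    match frames.first() {
        Some(first) if first.contains(ZHIHU_HOT_URL) => {}
        _ => return Some("first frame does not navigate to the Zhihu hot list".to_string()),
    }
    // Follow-up frames must be owned by the real Zhihu page (assumed prefix).
    for frame in frames.iter().skip(1) {
        if !frame.starts_with("[\"https://www.zhihu.com/") {
            return Some(format!("follow-up frame not owned by the real page: {frame}"));
        }
    }
    None
}
```

In the rewritten regression, a single `assert_eq!(first_violation(&frames), None)` would then report the offending frame on failure.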
- [ ] **Step 4: Commit the red test**
```bash
git add tests/agent_runtime_test.rs
git commit -m "test: rewrite zhihu submit ws routing regression"
```
### Task 2: Switch service Zhihu submit routes off the callback-host backend
**Files:**
- Modify: `src/service/server.rs:287-328`
- Test: `tests/agent_runtime_test.rs`
- [ ] **Step 1: Write the minimal production change**
Replace only the Zhihu-route callback-host branch with direct websocket backend selection.
Minimal target shape:
```rust
fn browser_backend_for_submit(
    browser_ws_url: &str,
    mac_policy: &MacPolicy,
    request: &SubmitTaskRequest,
) -> Result<Arc<dyn BrowserBackend>, PipeError> {
    // Zhihu routes used to select BrowserCallbackBackend in this branch; they
    // now get the same websocket backend as the fall-through path below.
    if should_use_callback_host_backend(request) {
        return Ok(Arc::new(WsBrowserBackend::new(
            Arc::new(ServiceWsClient::connect(browser_ws_url)?),
            mac_policy.clone(),
            initial_request_url_for_submit_task(request),
        )));
    }
    Ok(Arc::new(WsBrowserBackend::new(
        Arc::new(ServiceWsClient::connect(browser_ws_url)?),
        mac_policy.clone(),
        initial_request_url_for_submit_task(request),
    )))
}
```
After the route-gated branch is removed, simplify further only if the branch becomes redundant without changing non-Zhihu behavior.
- [ ] **Step 2: Run the rewritten regression**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
Expected: still FAIL or advance to a later assertion until the direct-runtime path is corrected too.
- [ ] **Step 3: Add or update a service-specific regression if needed**
If the rewritten `agent_runtime_test` does not exercise the service submit path directly, add one narrow service regression before continuing.
Target shape:
```rust
#[test]
fn service_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap() {
    // fake browser ws
    // submit Zhihu route through service path
    // assert no helper bootstrap frame
}
```
Run the exact test you end up using:
`cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test <exact test file> -- --nocapture`
Expected: PASS
- [ ] **Step 4: Commit the service-path fix**
```bash
git add src/service/server.rs tests/agent_runtime_test.rs
git commit -m "fix: route zhihu submit through ws backend"
```
### Task 3: Switch direct runtime Zhihu submit routes off the callback-host backend while keeping pipe fallback
**Files:**
- Modify: `src/agent/mod.rs:49-100`
- Test: `tests/agent_runtime_test.rs`
- [ ] **Step 1: Write the minimal production change**
Remove callback-host backend selection from `browser_backend_for_submit(...)`.
Minimal target behavior:
```rust
if let Some(browser_ws_url) = configured_browser_ws_url(context) {
    return Ok(Arc::new(WsBrowserBackend::new(
        Arc::new(ServiceWsClient::connect(&browser_ws_url)?),
        browser_tool.mac_policy().clone(),
        initial_request_url_for_submit_task(request),
    ).with_response_timeout(browser_tool.response_timeout())));
}
Ok(Arc::new(PipeBrowserBackend::from_inner(browser_tool.clone())))
```
If `ServiceWsClient` is not reusable from `src/service/server.rs`, extract the smallest shared websocket client helper into the browser module instead of inventing a new abstraction.
- [ ] **Step 2: Add a focused fallback assertion only if needed**
If the rewritten regression does not cover the direct-runtime no-websocket case, add one small test:
```rust
#[test]
fn production_submit_task_keeps_pipe_fallback_when_browser_ws_url_is_unset() {
    // no SGCLAW_BROWSER_WS_URL
    // blank/no ws config
    // assert no websocket bootstrap attempt occurs
}
```
Only add this test if current coverage is insufficient.
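The fallback rule the test protects is "a missing or blank websocket URL means pipe". A minimal sketch of that decision, assuming a stand-in `BackendChoice` enum in place of the real `Arc<dyn BrowserBackend>` and hypothetical helper names (`normalize_ws_url`, `choose_backend`) that are not part of the codebase:

```rust
/// Stand-in for the backend decision; the real code returns
/// `Arc<dyn BrowserBackend>` instead of this enum.
#[derive(Debug, PartialEq)]
enum BackendChoice {
    Ws(String),
    Pipe,
}

/// Hypothetical normalizer: a missing or whitespace-only URL counts as
/// "not configured".
fn normalize_ws_url(raw: Option<&str>) -> Option<String> {
    raw.map(str::trim)
        .filter(|url| !url.is_empty())
        .map(str::to_string)
}

/// Picks the websocket backend only when a usable URL is configured,
/// otherwise keeps the pipe fallback.
fn choose_backend(raw_ws_url: Option<&str>) -> BackendChoice {
    match normalize_ws_url(raw_ws_url) {
        Some(url) => BackendChoice::Ws(url),
        None => BackendChoice::Pipe,
    }
}
```

Whether `configured_browser_ws_url(context)` already trims blank values is not stated in this plan; check it before relying on the same behavior.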
- [ ] **Step 3: Run tests to verify green**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
Expected: PASS
If a fallback test was added, run it immediately after and expect PASS.
- [ ] **Step 4: Commit the direct-runtime fix**
```bash
git add src/agent/mod.rs tests/agent_runtime_test.rs
git commit -m "fix: align runtime zhihu submit with ws contract"
```
### Task 4: Reassess stale callback-host regression coverage only if it blocks the approved slice
**Files:**
- Maybe modify: `src/browser/callback_host.rs:793-810`
- Test: `src/browser/callback_host.rs`
- [ ] **Step 1: Check whether the callback-host red test still blocks the approved Option A slice**
Inspect whether this test still preserves rejected Option-B behavior and whether it fails or becomes misleading after Tasks 1-3:
```rust
#[test]
fn live_callback_host_starts_without_bootstrapping_external_helper_page() {
    // inspect before editing
}
```
If the test is unrelated to the approved Zhihu mainline or remains harmless, leave it unchanged in this slice.
- [ ] **Step 2: Remove or rewrite only if required by the changed submit-path behavior**
If the test blocks the approved slice, make the smallest change needed:
- delete it if it exists only to preserve rejected Option B behavior, or
- rewrite it so it no longer asserts callback-host startup as the accepted Zhihu mainline
- [ ] **Step 3: Run focused callback-host tests only if Step 2 changed code**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_host --lib -- --nocapture`
Expected: PASS
- [ ] **Step 4: Commit only if Step 2 changed code**
```bash
git add src/browser/callback_host.rs
git commit -m "test: clean up stale callback host regression"
```
### Task 5: Run the focused verification sweep
**Files:**
- Verify: `tests/agent_runtime_test.rs`
- Verify: `tests/compat_runtime_test.rs`
- Verify: any directly affected service/browser websocket tests
- [ ] **Step 1: Run submit-path regression coverage**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --nocapture`
Expected: PASS
- [ ] **Step 2: Run websocket backend request-url coverage**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" ws_backend_reuses_last_navigated_url_for_followup_requests --test browser_ws_backend_test -- --nocapture`
Expected: PASS
- [ ] **Step 3: Run Zhihu compat runtime coverage**
Run: `cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" zhihu --test compat_runtime_test -- --nocapture`
Expected: PASS for the changed submit-path surface or clear, directly related failures only.
- [ ] **Step 4: Run affected service submit regression coverage**
Run the exact service-specific regression from Task 2 if you added one.
Otherwise, run the narrowest existing service submit test that covers `ClientMessage::SubmitTask` handling for browser routes.
Expected: PASS
- [ ] **Step 5: Commit the verified slice**
```bash
git add src/service/server.rs src/agent/mod.rs tests/agent_runtime_test.rs src/browser/callback_host.rs
git commit -m "fix: realign zhihu submit with browser ws callbacks"
```
### Task 6: Run stronger real-browser validation
**Files:**
- Verify live behavior through existing binaries and real config only
- [ ] **Step 1: Start the service with the real config**
Run: `cargo run --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"`
Expected: service starts without failing at callback-host readiness timeout.
- [ ] **Step 2: Run the client against the started service**
Run: `cargo run --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sg_claw_client`
Expected: for `打开知乎热榜获取前10条数据并导出 Excel` (open the Zhihu hotlist, fetch the top 10 entries, and export them to Excel), the browser proceeds into real Zhihu page work instead of stalling before page open.
- [ ] **Step 3: Capture the narrow acceptance evidence**
Verify all of the following from logs/frames/observed behavior:
```text
- no callback-host readiness timeout
- no helper-page bootstrap frame
- at least one real-page follow-up browser action after navigate
```
- [ ] **Step 4: Commit only if live verification required code changes**
```bash
git add <only files changed during live-fix follow-up>
git commit -m "fix: tighten zhihu ws submit live validation follow-up"
```
If no further code changes were needed, do not create an extra commit.


@@ -0,0 +1,406 @@
# Service Chat Web Console Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add a standalone local HTML console that connects to the existing service websocket, submits natural-language tasks with the current `submit_task` payload, and leaves the browser-helper/runtime path untouched.
**Architecture:** The change stays fully at the presentation edge. A new self-contained HTML file under `frontend/service-console/` reuses the current websocket protocol from `src/service/protocol.rs`, while one narrow Rust integration test guards the page's protocol shape and forbids any reference to `browser-helper.html`, callback-host endpoints, or the browser websocket. No Rust runtime logic changes are part of this slice.
**Tech Stack:** HTML, CSS, vanilla JavaScript, Rust integration tests, std::fs, Cargo test
---
## File Map
- Create: `frontend/service-console/sg_claw_service_console.html`
- Standalone local page with inline CSS and JavaScript
- Connects to the existing service websocket at `ws://127.0.0.1:42321` by default
- Sends existing `ClientMessage::SubmitTask` JSON
- Renders inbound `ServiceMessage` rows only
- Create: `tests/service_console_html_test.rs`
- Source guard for the standalone page
- Verifies file location, allowed protocol usage, and forbidden helper/callback references
- Reference: `src/service/protocol.rs`
- Existing websocket message shape to mirror exactly
- Reference: `src/bin/sg_claw_client.rs`
- Existing terminal client behavior to mirror for `submit_task`
- Reference: `docs/superpowers/specs/2026-04-06-service-chat-web-console-design.md`
## Scope Guardrails
- Do not modify `src/service/server.rs`.
- Do not modify `src/browser/callback_host.rs`.
- Do not modify `src/browser/callback_backend.rs`.
- Do not modify `src/bin/sg_claw_client.rs`.
- Do not add an HTTP server.
- Do not connect the new page to `ws://127.0.0.1:12345`.
- Do not reference `/sgclaw/browser-helper.html` or `/sgclaw/callback/*` anywhere in the new page.
### Task 1: Add a failing source-guard test for the standalone page
**Files:**
- Create: `tests/service_console_html_test.rs`
- Reference: `docs/superpowers/specs/2026-04-06-service-chat-web-console-design.md`
- [ ] **Step 1: Write the failing test**
Create a focused integration test that resolves the HTML path from `CARGO_MANIFEST_DIR` and asserts the file contract.
```rust
use std::fs;
use std::path::PathBuf;

#[test]
fn service_console_html_stays_on_service_ws_boundary() {
    let manifest_dir = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
    let html_path = manifest_dir
        .join("frontend")
        .join("service-console")
        .join("sg_claw_service_console.html");
    let source = fs::read_to_string(&html_path)
        .expect("service console html should exist");
    assert!(source.contains("ws://127.0.0.1:42321"));
    assert!(source.contains("submit_task"));
    assert!(!source.contains("/sgclaw/browser-helper.html"));
    assert!(!source.contains("/sgclaw/callback/ready"));
    assert!(!source.contains("/sgclaw/callback/events"));
    assert!(!source.contains("/sgclaw/callback/commands/next"));
    assert!(!source.contains("/sgclaw/callback/commands/ack"));
    assert!(!source.contains("ws://127.0.0.1:12345"));
}
```
- [ ] **Step 2: Run test to verify it fails**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_console_html_stays_on_service_ws_boundary --test service_console_html_test -- --exact
```
Expected: FAIL because the HTML file does not exist yet.
- [ ] **Step 3: Keep the test narrow**
Before writing production code, confirm the test guards only the approved boundary:
```text
- file exists at frontend/service-console/sg_claw_service_console.html
- service websocket default is present
- submit_task payload marker is present
- no helper-page path
- no callback-host endpoints
- no browser websocket URL
```
Do not turn this into an end-to-end browser test.
- [ ] **Step 4: Commit the red test**
```bash
git add tests/service_console_html_test.rs
git commit -m "test: add service console html boundary guard"
```
### Task 2: Implement the standalone HTML console with the approved boundary
**Files:**
- Create: `frontend/service-console/sg_claw_service_console.html`
- Reference: `src/service/protocol.rs:6`
- Reference: `src/bin/sg_claw_client.rs:16`
- Test: `tests/service_console_html_test.rs`
- [ ] **Step 1: Create the HTML file with the minimal structure**
Write one self-contained page with:
```html
<!doctype html>
<html lang="zh-CN">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1" />
  <title>sgClaw Service Console</title>
</head>
<body>
  <div id="app">
    <input id="wsUrl" value="ws://127.0.0.1:42321" />
    <button id="connectBtn">连接</button>
    <div id="connectionState">未连接</div>
    <div id="messageStream"></div>
    <textarea id="instructionInput"></textarea>
    <div id="validationText"></div>
    <button id="sendBtn" disabled>发送任务</button>
  </div>
</body>
</html>
```
Keep all CSS and JavaScript inline. Do not add external assets or a build step.
- [ ] **Step 2: Implement websocket connect/disconnect behavior**
Add the smallest possible JS behavior, including explicit disconnect on the same button so the UI matches the approved connect/disconnect contract:
```javascript
let socket = null;

function appendRow(kind, text) {
  // append a visible row to #messageStream
}

function updateUiState() {
  const connected = socket && socket.readyState === WebSocket.OPEN;
  document.getElementById('connectBtn').textContent = connected ? '断开' : '连接';
  document.getElementById('sendBtn').disabled = !connected;
  document.getElementById('connectionState').textContent = connected ? '已连接' : '未连接';
}

function connectOrDisconnectService() {
  if (socket && socket.readyState === WebSocket.OPEN) {
    socket.close();
    return;
  }
  const url = document.getElementById('wsUrl').value.trim() || 'ws://127.0.0.1:42321';
  socket = new WebSocket(url);
  updateUiState();
  socket.addEventListener('open', () => {
    appendRow('status', 'service websocket connected');
    updateUiState();
  });
  socket.addEventListener('close', () => {
    appendRow('status', 'service websocket disconnected');
    updateUiState();
  });
  socket.addEventListener('error', () => appendRow('error', 'service websocket error'));
  socket.addEventListener('message', handleMessage);
}
```
Do not add retry loops or background reconnect logic.
- [ ] **Step 3: Implement submit_task sending with the current message shape**
Mirror the terminal client payload shape exactly and show inline validation for empty input:
```javascript
function setValidation(message) {
  document.getElementById('validationText').textContent = message;
}

function sendTask() {
  const instruction = document.getElementById('instructionInput').value.trim();
  if (!socket || socket.readyState !== WebSocket.OPEN) {
    return;
  }
  if (!instruction) {
    setValidation('请输入任务内容。');
    return;
  }
  setValidation('');
  socket.send(JSON.stringify({
    type: 'submit_task',
    instruction,
    conversation_id: '',
    messages: [],
    page_url: '',
    page_title: ''
  }));
}
```
Do not add new fields. Do not add conversation replay logic in this slice.
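For cross-checking the page against the Rust side, the same frame can be written out by hand. This is a dependency-free sketch, not the crate's serializer: `escape_json` and `submit_task_frame` are hypothetical names, the field order simply follows the JavaScript above, and the real code would presumably rely on `serde_json` instead of manual escaping.

```rust
/// Minimal JSON string escaping for this sketch: quotes, backslashes, and
/// control characters only.
fn escape_json(text: &str) -> String {
    let mut escaped = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '"' => escaped.push_str("\\\""),
            '\\' => escaped.push_str("\\\\"),
            '\n' => escaped.push_str("\\n"),
            '\r' => escaped.push_str("\\r"),
            '\t' => escaped.push_str("\\t"),
            c if (c as u32) < 0x20 => escaped.push_str(&format!("\\u{:04x}", c as u32)),
            c => escaped.push(c),
        }
    }
    escaped
}

/// Hypothetical builder: returns `None` for blank input, mirroring the page's
/// inline validation, and otherwise the `submit_task` frame with empty
/// conversation and page fields.
fn submit_task_frame(instruction: &str) -> Option<String> {
    let instruction = instruction.trim();
    if instruction.is_empty() {
        return None;
    }
    Some(format!(
        r#"{{"type":"submit_task","instruction":"{}","conversation_id":"","messages":[],"page_url":"","page_title":""}}"#,
        escape_json(instruction)
    ))
}
```

Comparing this string with what the page sends is a quick way to notice an accidental extra field.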
- [ ] **Step 4: Render existing inbound service messages only**
Handle the current `ServiceMessage` variants with a minimal dispatcher:
```javascript
function handleMessage(event) {
  const message = JSON.parse(event.data);
  switch (message.type) {
    case 'status_changed':
      appendRow('status', message.state);
      break;
    case 'log_entry':
      appendRow('log', message.message);
      break;
    case 'task_complete':
      appendRow(message.success ? 'complete' : 'error', message.summary);
      break;
    case 'busy':
      appendRow('error', message.message);
      break;
    default:
      appendRow('error', 'unknown service message: ' + event.data);
  }
}
```
Keep the composer enabled during in-flight work so repeated submits surface the existing `busy` response instead of inventing a frontend queue.
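The dispatcher's mapping from message variant to row kind can also be stated in Rust terms, which makes it easier to compare against `src/service/protocol.rs`. The enum below is a stand-in: its variant and field names are inferred from the JavaScript dispatcher above and are assumptions about the real `ServiceMessage`, and `row_for` is a hypothetical helper.

```rust
/// Stand-in for the service protocol enum; names follow the JavaScript
/// dispatcher and are assumptions about the real definition.
enum ServiceMessage {
    StatusChanged { state: String },
    LogEntry { message: String },
    TaskComplete { success: bool, summary: String },
    Busy { message: String },
}

/// Maps a message to the (row kind, row text) pair the page would append.
fn row_for(message: &ServiceMessage) -> (&'static str, &str) {
    match message {
        ServiceMessage::StatusChanged { state } => ("status", state.as_str()),
        ServiceMessage::LogEntry { message } => ("log", message.as_str()),
        // A failed task is rendered as an error row, not as a completion.
        ServiceMessage::TaskComplete { success, summary } => {
            (if *success { "complete" } else { "error" }, summary.as_str())
        }
        // `busy` is surfaced as an error row; no frontend queue is involved.
        ServiceMessage::Busy { message } => ("error", message.as_str()),
    }
}
```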
- [ ] **Step 5: Keep the helper boundary explicit in the source**
Before running tests, inspect the HTML source and confirm:
```text
- no /sgclaw/browser-helper.html
- no /sgclaw/callback/*
- no ws://127.0.0.1:12345
- no browser websocket register frame logic
```
If any such string appears, remove it before testing.
- [ ] **Step 6: Run the source-guard test to verify green**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_console_html_stays_on_service_ws_boundary --test service_console_html_test -- --exact
```
Expected: PASS
- [ ] **Step 7: Commit the standalone page**
```bash
git add frontend/service-console/sg_claw_service_console.html tests/service_console_html_test.rs
git commit -m "feat: add standalone service chat console"
```
### Task 3: Run the focused verification sweep
**Files:**
- Verify: `tests/service_console_html_test.rs`
- Reference: `src/service/protocol.rs`
- Reference: `src/bin/sg_claw_client.rs`
- [ ] **Step 1: Re-run the source-guard test**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_console_html_stays_on_service_ws_boundary --test service_console_html_test -- --exact
```
Expected: PASS
- [ ] **Step 2: Manually inspect disconnected-send and validation markers in the HTML source**
Before broader verification, confirm the page source clearly contains all three UI-local rules:
```text
- connect button can disconnect an open websocket
- send button starts disabled while disconnected
- empty instruction shows inline validation text
```
This inspection stays source-level; do not add extra backend tests for it in this slice.
- [ ] **Step 3: Run an existing service protocol regression for safety**
Run the narrow existing protocol coverage to prove the page did not require backend changes:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" submit_task_client_message_converts_into_shared_runner_request --test service_ws_session_test -- --exact
```
Expected: PASS
- [ ] **Step 4: Run an existing terminal-client regression for safety**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" client_submits_first_user_line_to_service --test service_task_flow_test -- --exact
```
Expected: PASS
- [ ] **Step 5: Commit only if verification required any code change**
```bash
git add frontend/service-console/sg_claw_service_console.html tests/service_console_html_test.rs
git commit -m "test: tighten service console verification"
```
If verification required no code changes, do not create an extra commit.
### Task 4: Perform the manual smoke check
**Files:**
- Verify live behavior only; no new code required
- [ ] **Step 1: Start the existing service binary**
Run:
```bash
cargo run --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
```
Expected: service starts and prints its ready line with the service websocket listen address.
- [ ] **Step 2: Open the standalone page directly**
Open:
```text
D:/data/ideaSpace/rust/sgClaw/claw-new/frontend/service-console/sg_claw_service_console.html
```
Expected: the page loads through the browser as a local file and shows the default websocket URL `ws://127.0.0.1:42321`.
- [ ] **Step 3: Connect, disconnect, and reconnect once**
Expected:
```text
- message stream shows websocket connected
- clicking the same button disconnects the websocket cleanly
- message stream shows websocket disconnected
- send button is disabled again while disconnected
- reconnect succeeds without reloading the page
```
- [ ] **Step 4: Submit one natural-language task**
Use a small harmless instruction such as the following ("open Baidu"):
```text
打开百度
```
Expected:
```text
- empty textarea send attempt first shows inline validation without sending a websocket frame
- page sends one submit_task payload after valid input
- page receives and renders status/log/task_complete or busy rows
```
- [ ] **Step 5: Confirm the helper boundary stayed untouched**
Verify from the page source and observed behavior:
```text
- the page never loads /sgclaw/browser-helper.html
- the page never calls /sgclaw/callback/*
- the page never connects to ws://127.0.0.1:12345
```
If the task itself triggers browser automation, that remains owned by the existing Rust runtime rather than by the page.
- [ ] **Step 6: Commit only if the manual pass required code changes**
```bash
git add frontend/service-console/sg_claw_service_console.html tests/service_console_html_test.rs
git commit -m "fix: tighten standalone service console smoke flow"
```
If the manual pass required no code changes, do not create an extra commit.


@@ -0,0 +1,637 @@
# Zhihu Hotlist Post-Export Auto-Open Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Extend the existing Zhihu hotlist Excel and dashboard routes so each route can auto-open its own generated artifact after export, while preserving the current callback-host-backed browser boundary and route exclusivity.
**Architecture:** Keep orchestration in `src/compat/workflow_executor.rs`, but move post-export side effects into a new `src/compat/artifact_open.rs` helper so workflow routing stays readable. Excel auto-open is a local OS-launch side effect; dashboard auto-open reuses `screen_html_export`'s existing `presentation.url` and sends one narrow, marker-based `Action::Navigate` request through `BrowserCallbackBackend`, with a matching special-case validator in `MacPolicy` so arbitrary `file://` navigation remains blocked.
**Tech Stack:** Rust, serde_json, std::process::Command, std::path, Cargo tests
---
## File Map
- Create: `src/compat/artifact_open.rs`
- Define the narrow post-export helper surface for this slice only
- Parse and validate generated artifact payload fields passed in by the workflow layer
- Open generated `.xlsx` files with the local default app
- Build the exact approved local-dashboard navigate payload
- Keep one testable internal seam, `open_exported_xlsx_with(output_path, opener)`, so unit tests can prove the generated `.xlsx` path is handed to the launcher without starting a real spreadsheet app
- Include unit tests in the same file for exact Excel path handoff and launcher-failure reporting
- Modify: `src/compat/mod.rs`
- Export the new `artifact_open` module
- Modify: `src/compat/workflow_executor.rs`
- Keep route detection and artifact generation where they are now
- Change `export_xlsx(...)` and `export_screen(...)` so they parse tool payloads, call the route-specific opener, and produce the new success/failure summaries
- Modify: `src/browser/callback_backend.rs`
- Recognize only the approved local-dashboard navigate request shape at `Action::Navigate`
- Keep normal remote navigate behavior unchanged
- Continue emitting `sgBrowerserOpenPage` for the approved local-dashboard case so the helper page stays alive and the dashboard opens in a new visible tab
- Add focused callback-backend unit tests in the existing test module for approved and malformed local-dashboard requests
- Modify: `src/security/mac_policy.rs`
- Add a narrow validator for the approved local-dashboard presentation case
- Keep `validate(...)` unchanged for ordinary remote-domain flow
- Reject malformed marker payloads, non-HTML local paths, and mismatched `file://` / output-path combinations
- Modify: `tests/compat_runtime_test.rs`
- Keep the concrete hotlist workflow regressions in this existing integration test file
- Extend existing Zhihu hotlist export/screen regressions to assert the new summaries and the dashboard marker payload
- Keep the Excel route workflow assertion limited to summary plus “no dashboard navigate marker,” because exact launcher handoff is covered in `src/compat/artifact_open.rs` unit tests
- Modify: `tests/browser_tool_test.rs`
- Add `MacPolicy` coverage for approved local-dashboard presentation, rejected malformed presentation, and unchanged normal-domain validation in one exact file
- Extend the existing `default_rules_allow_zhihu_navigation` area with the new local-dashboard validation tests rather than creating a second policy test location
- Reference only if summary wording ripples outward: `tests/agent_runtime_test.rs:173-258`
- Existing direct-runtime user-visible summary assertion for Zhihu Excel export
- Reference only if summary wording ripples outward: `tests/service_task_flow_test.rs:704-839`
- Existing CLI-to-service user-visible summary assertion for Zhihu Excel export
- Reference only if summary wording ripples outward: `tests/service_ws_session_test.rs:755-869`
- Existing service-binary user-visible summary assertion for Zhihu Excel export
- Reference: `tests/compat_screen_html_export_tool_test.rs`
- Reuse the exact test seam `screen_html_export_tool_renders_dashboard_html_with_presentation_contract`
- Existing proof that `screen_html_export` already returns `presentation.url`
- Reference: `docs/superpowers/specs/2026-04-06-zhihu-hotlist-post-export-auto-open-design.md`
## Scope Guardrails
- Do not modify `frontend/service-console/sg_claw_service_console.html`.
- Do not modify `src/service/protocol.rs`.
- Do not modify `browser-helper.html`.
- Do not modify `/sgclaw/callback/*` endpoint contracts.
- Do not modify websocket protocol framing or `src/browser/ws_protocol.rs`.
- Do not turn Excel-open and dashboard-open into a combined mode.
- Do not add a general-purpose local file browser or generic `file://` allowlist.
- Do not move post-export decisions into the frontend service console.
- Do not require websocket-backend parity in this slice.
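To make the "no generic `file://` allowlist" guardrail concrete, the narrow `MacPolicy` check described in the File Map might look roughly like the sketch below. Everything here is illustrative: `LocalDashboardRequest` is a hypothetical flattened view of the marker payload, `validate_local_dashboard` is not an existing function, and the path comparison assumes the presentation URL is the output path with forward slashes and no percent-encoding.

```rust
/// Hypothetical, flattened view of the local-dashboard navigate request.
#[derive(Clone, Copy)]
struct LocalDashboardRequest<'a> {
    url: &'a str,
    source: &'a str,
    kind: &'a str,
    presentation_url: &'a str,
    output_path: &'a str,
}

/// Accepts only the one approved request shape; anything else is rejected so
/// arbitrary `file://` navigation stays blocked.
fn validate_local_dashboard(req: &LocalDashboardRequest<'_>) -> Result<(), String> {
    if req.source != "compat.workflow_executor" || req.kind != "zhihu_hotlist_screen" {
        return Err("unexpected marker source or kind".to_string());
    }
    if req.url != req.presentation_url {
        return Err("url and presentation_url differ".to_string());
    }
    if !req.url.starts_with("file://") {
        return Err("not a file:// url".to_string());
    }
    if !req.output_path.to_ascii_lowercase().ends_with(".html") {
        return Err("output path is not an HTML file".to_string());
    }
    // Assumption: the URL ends with the output path once backslashes are
    // normalized to forward slashes.
    let normalized = req.output_path.replace('\\', "/");
    if !req.url.ends_with(normalized.as_str()) {
        return Err("url does not point at the output path".to_string());
    }
    Ok(())
}
```

Ordinary remote-domain validation stays on the existing `validate(...)` path; this check only ever applies to the marker-carrying request.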
### Task 1: Add failing workflow tests for route-specific post-export actions
**Files:**
- Modify: `tests/compat_runtime_test.rs:2154-2304`
- Reference: `src/compat/workflow_executor.rs:375-446`
- Reference: `docs/superpowers/specs/2026-04-06-zhihu-hotlist-post-export-auto-open-design.md`
- [ ] **Step 1: Rewrite the Excel hotlist assertion as a red test for the new summary only**
Keep the current flow setup, but tighten the expectation so it proves the workflow route now reports post-export open success while staying exclusive from the dashboard path.
Target shape:
```rust
#[test]
fn handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open() {
    // existing setup
    assert!(summary.contains("已导出并打开知乎热榜 Excel"));
    assert!(generated.exists());
    assert!(!sent.iter().any(|message| {
        matches!(
            message,
            AgentMessage::Command { action, params, .. }
                if action == &Action::Navigate
                    && params.get("sgclaw_local_dashboard_open").is_some()
        )
    }));
}
```
Do not try to prove real OS launching in this workflow test. The exact `.xlsx` path handoff to the launcher belongs in `src/compat/artifact_open.rs` unit tests from Task 2.
- [ ] **Step 2: Rewrite the dashboard hotlist assertion as a red test for browser auto-open**
Tighten the existing dashboard test so it proves the workflow consumes `presentation.url` and emits the approved compat marker payload.
Target shape:
```rust
#[test]
fn handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open() {
    // existing setup
    assert!(summary.contains("已在浏览器中打开知乎热榜大屏"));
    let navigate = sent.iter().find_map(|message| match message {
        AgentMessage::Command { action, params, security, .. }
            if action == &Action::Navigate
                && security.expected_domain == "__sgclaw_local_dashboard__" => Some((params, security)),
        _ => None,
    }).expect("dashboard route should emit local-dashboard navigate request");
    assert!(navigate.0["url"].as_str().unwrap().starts_with("file://"));
    assert_eq!(navigate.0["sgclaw_local_dashboard_open"]["source"], json!("compat.workflow_executor"));
    assert_eq!(navigate.0["sgclaw_local_dashboard_open"]["kind"], json!("zhihu_hotlist_screen"));
    assert_eq!(navigate.0["sgclaw_local_dashboard_open"]["presentation_url"], navigate.0["url"]);
}
```
Also assert that this route still logs `call screen_html_export` and does not invoke the Excel opener path.
- [ ] **Step 3: Add a missing-`presentation.url` regression in the workflow test module if none exists**
Put this close to the existing hotlist tests and keep it narrow:
```rust
#[test]
fn handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing() {
    // mock screen_html_export success payload with output_path but no presentation.url
    // assert summary contains 已生成知乎热榜大屏 <path>,但浏览器自动打开失败:
}
```
Use the existing summary/path helpers in the file instead of inventing new parsing helpers.
- [ ] **Step 4: Run the focused compat runtime tests to verify they fail**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing --test compat_runtime_test -- --exact
```
Expected: FAIL because the workflow still returns artifact-only summaries and has no post-export open handling.
- [ ] **Step 5: Commit the red workflow tests**
```bash
git add tests/compat_runtime_test.rs
git commit -m "test: add hotlist post-export auto-open regressions"
```
### Task 2: Implement the compat post-export opener and update workflow summaries
**Files:**
- Create: `src/compat/artifact_open.rs`
- Modify: `src/compat/mod.rs`
- Modify: `src/compat/workflow_executor.rs:375-446`
- Test: `src/compat/artifact_open.rs`
- Test: `tests/compat_runtime_test.rs`
- [ ] **Step 1: Add the red unit tests in `src/compat/artifact_open.rs` before writing production code**
Create the new module with a `#[cfg(test)]` block first so the Excel opener has an exact, non-UI verification seam.
Target tests:
```rust
#[test]
fn open_exported_xlsx_with_passes_generated_path_to_launcher() {
    let mut seen = None;
    let result = open_exported_xlsx_with(Path::new("C:/tmp/zhihu-hotlist.xlsx"), |path| {
        seen = Some(path.to_path_buf());
        Ok(())
    });
    assert!(matches!(result, PostExportOpen::Opened));
    assert_eq!(seen.unwrap(), PathBuf::from("C:/tmp/zhihu-hotlist.xlsx"));
}

#[test]
fn open_exported_xlsx_with_reports_launcher_failure() {
    let result = open_exported_xlsx_with(Path::new("C:/tmp/zhihu-hotlist.xlsx"), |_path| {
        Err("launcher failed".to_string())
    });
    assert!(matches!(result, PostExportOpen::Failed(reason) if reason.contains("launcher failed")));
}
```
Add one matching dashboard payload test in the same file:
```rust
#[test]
fn open_local_dashboard_uses_exact_approved_marker_payload() {
    // FakeBrowserBackend records invoke(action, params, expected_domain)
    // assert expected_domain == "__sgclaw_local_dashboard__"
    // assert params.url == params.sgclaw_local_dashboard_open.presentation_url
    // assert source/kind/output_path all match the approved contract
}
```
This step is mandatory so the Excel route is proven to hand the generated path to the opener without launching a real application.
- [ ] **Step 2: Run the new unit tests to verify they fail**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_passes_generated_path_to_launcher --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_reports_launcher_failure --lib
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_local_dashboard_uses_exact_approved_marker_payload --lib
```
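Expected: FAIL with a compile error, because `src/compat/artifact_open.rs` so far contains only the tests from Step 1 and none of the production items they call. When filtering unit tests with `--lib`, do not pass `-- --exact` with a bare function name: unit-test names include their module path, so the filter would match zero tests and the run would pass vacuously.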
- [ ] **Step 3: Create the small compat opener module**
Add one focused helper module rather than embedding side effects directly into `workflow_executor.rs`.
Target shape:
```rust
pub const LOCAL_DASHBOARD_EXPECTED_DOMAIN: &str = "__sgclaw_local_dashboard__";
pub const LOCAL_DASHBOARD_SOURCE: &str = "compat.workflow_executor";
pub const LOCAL_DASHBOARD_KIND_ZHIHU_HOTLIST_SCREEN: &str = "zhihu_hotlist_screen";

pub enum PostExportOpen {
    Opened,
    Failed(String),
}

pub fn open_exported_xlsx(output_path: &Path) -> PostExportOpen {
    open_exported_xlsx_with(output_path, launch_with_default_xlsx_app)
}

fn open_exported_xlsx_with<F>(output_path: &Path, opener: F) -> PostExportOpen
where
    F: FnOnce(&Path) -> Result<(), String>,
{ /* test seam */ }

pub fn open_local_dashboard(
    browser_backend: &dyn BrowserBackend,
    output_path: &Path,
    presentation_url: &str,
) -> PostExportOpen { /* invoke Action::Navigate with exact marker payload */ }
```
Keep the module tiny. The only dedicated test seam in this file should be `open_exported_xlsx_with(...)`; do not introduce a general launcher trait.
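The test seam in the target shape can be filled in with very little code. The following is a minimal sketch, assuming the seam does nothing more than map the launcher's `Result` onto `PostExportOpen`; the `Debug`/`PartialEq` derives and the `pub` visibility on the seam are additions that let the snippet stand alone, not part of the approved shape.

```rust
use std::path::Path;

/// Outcome of the post-export open attempt (same variants as the target shape).
#[derive(Debug, PartialEq)]
pub enum PostExportOpen {
    Opened,
    Failed(String),
}

/// Test seam: the caller supplies the launcher, so unit tests never start a real application.
/// Marked `pub` here only for the standalone sketch; the plan keeps it private.
pub fn open_exported_xlsx_with<F>(output_path: &Path, opener: F) -> PostExportOpen
where
    F: FnOnce(&Path) -> Result<(), String>,
{
    match opener(output_path) {
        // The launcher accepted the path: report success.
        Ok(()) => PostExportOpen::Opened,
        // Keep the launcher's reason so the summary can surface it.
        Err(reason) => PostExportOpen::Failed(reason),
    }
}
```

Because the launcher is an `FnOnce` argument, the tests can pass a recording closure while `open_exported_xlsx(...)` passes the real launcher, which is why no launcher trait is needed.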
- [ ] **Step 4: Implement the Windows-first `.xlsx` opener minimally**
Use a focused local launcher that targets Windows, the current development environment, first.
Preferred target shape:
```rust
// `start` treats its first quoted argument as the window title, so pass an empty one.
Command::new("cmd")
    .args(["/C", "start", "", output_path_as_windows_string])
```
Requirements:
```text
- fail if the path does not exist
- do not swallow command-spawn errors
- do not open arbitrary user-selected files from outside this workflow
- keep cross-platform behavior minimal; only add a fallback branch if required to keep tests/build portable
```
If you need a non-Windows fallback for compilation, keep it obviously minimal and out of the hot path.
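The requirements above could be met by a launcher along these lines. This is a sketch under stated assumptions, not the final implementation: the error wording, the private `launch_platform` helper, and the choice of `.status()` over `.spawn()` are all illustrative.

```rust
use std::path::Path;

/// Hypothetical body for `launch_with_default_xlsx_app`.
pub fn launch_with_default_xlsx_app(output_path: &Path) -> Result<(), String> {
    // Requirement: fail if the path does not exist.
    if !output_path.exists() {
        return Err(format!("export file not found: {}", output_path.display()));
    }
    launch_platform(output_path)
}

#[cfg(windows)]
fn launch_platform(output_path: &Path) -> Result<(), String> {
    use std::process::Command;

    // `start` reads its first quoted argument as a window title, hence the empty string.
    let status = Command::new("cmd")
        .args(["/C", "start", ""])
        .arg(output_path)
        .status()
        // Requirement: do not swallow command-spawn errors.
        .map_err(|err| format!("failed to run cmd: {err}"))?;
    if status.success() {
        Ok(())
    } else {
        Err(format!("cmd /C start exited with {status}"))
    }
}

#[cfg(not(windows))]
fn launch_platform(_output_path: &Path) -> Result<(), String> {
    // Minimal fallback that only keeps non-Windows builds compiling.
    Err("auto-open is only implemented for Windows".to_string())
}
```

Keeping the existence check in the launcher rather than in `open_exported_xlsx_with(...)` is a deliberate assumption: the Step 1 unit tests pass a path that need not exist on disk, so the check has to sit behind the seam.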
- [ ] **Step 5: Parse payloads in `workflow_executor.rs` and call the new helper**
Refactor `export_xlsx(...)` and `export_screen(...)` just enough to separate:
```text
- tool execution
- payload parsing
- route-specific post-export open
- summary formatting
```
Minimal target behavior:
```rust
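// Success text: "Exported and opened the Zhihu hotlist Excel <path>".
// Failure text: "Exported the Zhihu hotlist Excel <path>, but auto-open failed: <reason>".
// `Path`/`PathBuf` do not implement `Display`, so format them through `.display()`.
match open_exported_xlsx(&output_path) {
    PostExportOpen::Opened => format!("已导出并打开知乎热榜 Excel {}", output_path.display()),
    PostExportOpen::Failed(reason) => format!(
        "已导出知乎热榜 Excel {},但自动打开失败:{reason}",
        output_path.display()
    ),
}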
```
```rust
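// Success text: "Opened the Zhihu hotlist dashboard <path> in the browser".
// Failure text: "Generated the Zhihu hotlist dashboard <path>, but the browser auto-open failed: <reason>".
// `Path`/`PathBuf` do not implement `Display`, so format them through `.display()`.
match open_local_dashboard(browser_backend, &output_path, &presentation_url) {
    PostExportOpen::Opened => format!("已在浏览器中打开知乎热榜大屏 {}", output_path.display()),
    PostExportOpen::Failed(reason) => format!(
        "已生成知乎热榜大屏 {},但浏览器自动打开失败:{reason}",
        output_path.display()
    ),
}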
```
Change signatures only as much as needed to pass `browser_backend` into the dashboard route. Do not broaden unrelated call chains.
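One way to keep summary formatting separate from the open side effect is a pure function that takes the already-computed outcome. The sketch below assumes a hypothetical helper name, `xlsx_export_summary`, and redeclares `PostExportOpen` so it compiles on its own; the message strings are the ones from this step.

```rust
use std::path::Path;

/// Redeclared here so the sketch is self-contained.
pub enum PostExportOpen {
    Opened,
    Failed(String),
}

/// Hypothetical helper: builds the user-facing summary without touching the filesystem.
pub fn xlsx_export_summary(output_path: &Path, outcome: &PostExportOpen) -> String {
    // `Path` has no `Display` impl, so go through `.display()`.
    let shown = output_path.display();
    match outcome {
        // "Exported and opened the Zhihu hotlist Excel <path>"
        PostExportOpen::Opened => format!("已导出并打开知乎热榜 Excel {shown}"),
        // "Exported the Zhihu hotlist Excel <path>, but auto-open failed: <reason>"
        PostExportOpen::Failed(reason) => {
            format!("已导出知乎热榜 Excel {shown},但自动打开失败:{reason}")
        }
    }
}
```

With this split, a failed auto-open still yields a summary that names the exported file, so the export itself is never reported as lost.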
- [ ] **Step 6: Export the helper module**
Update `src/compat/mod.rs`:
```rust
pub mod artifact_open;
```
Do not reorder unrelated module exports unless rustfmt does it.
- [ ] **Step 7: Run the focused library and workflow regressions to verify green**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_passes_generated_path_to_launcher --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_reports_launcher_failure --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_local_dashboard_uses_exact_approved_marker_payload --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing --test compat_runtime_test -- --exact
```
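Expected: PASS for the new library tests and the workflow regressions. The one permitted exception is the dashboard auto-open path: `handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open` may stay red until the backend/policy validation in Task 4 lands, where it is re-run.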
- [ ] **Step 8: Commit the compat opener and workflow changes**
```bash
git add src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs tests/compat_runtime_test.rs
git commit -m "feat: auto-open zhihu hotlist export artifacts"
```
### Task 3: Add failing backend and security tests for the narrow local-dashboard allowance
**Files:**
- Modify: `src/browser/callback_backend.rs:536-840`
- Modify: `tests/browser_tool_test.rs` (`default_rules_allow_zhihu_navigation` section plus new local-dashboard validation tests)
- Reference: `src/security/mac_policy.rs:56-132`
- [ ] **Step 1: Add a red callback-backend acceptance test for the approved local-dashboard request shape**
Extend the existing `src/browser/callback_backend.rs` test module with one focused navigate test.
Target shape:
```rust
#[test]
fn callback_backend_accepts_approved_local_dashboard_navigate_request() {
    let host = Arc::new(FakeCallbackHost::new(vec![success_reply(json!({ "navigated": true }))]));
    let backend = BrowserCallbackBackend::new(
        host.clone(),
        test_policy(),
        "http://127.0.0.1:17888/sgclaw/browser-helper.html",
    );

    let output = backend.invoke(
        Action::Navigate,
        json!({
            "url": "file:///C:/tmp/zhihu-hotlist-screen.html",
            "sgclaw_local_dashboard_open": {
                "source": "compat.workflow_executor",
                "kind": "zhihu_hotlist_screen",
                "output_path": "C:/tmp/zhihu-hotlist-screen.html",
                "presentation_url": "file:///C:/tmp/zhihu-hotlist-screen.html"
            }
        }),
        "__sgclaw_local_dashboard__",
    );

    assert!(output.unwrap().success);
    assert_eq!(host.requests()[0].command, json!([
        "http://127.0.0.1:17888/sgclaw/browser-helper.html",
        "sgBrowerserOpenPage",
        "file:///C:/tmp/zhihu-hotlist-screen.html"
    ]));
}
```
Do not weaken any existing normal-domain tests.
- [ ] **Step 2: Add red rejection tests in exact files**
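Put malformed-request rejection in `src/browser/callback_backend.rs` next to the acceptance test. The empty bodies shown in this step only fix the test names; an empty test passes, so write the real assertions before running Step 3: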
```rust
#[test]
fn callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields() {}
```
Put policy-only validation in `tests/browser_tool_test.rs` so all public `MacPolicy` assertions stay in one place:
```rust
#[test]
fn mac_policy_rejects_non_html_local_dashboard_presentation() {}

#[test]
fn default_rules_allow_zhihu_navigation() {
    let policy = MacPolicy::load_from_path(...).unwrap();
    policy.validate(&Action::Navigate, "www.zhihu.com").unwrap();
}
```
Do not create a second `MacPolicy` regression location.
- [ ] **Step 3: Run the focused backend/policy tests to verify red**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_accepts_approved_local_dashboard_navigate_request --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" mac_policy_rejects_non_html_local_dashboard_presentation --test browser_tool_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" default_rules_allow_zhihu_navigation --test browser_tool_test -- --exact
```
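Expected: the new local-dashboard tests FAIL; `default_rules_allow_zhihu_navigation` should still PASS, provided `tests/browser_tool_test.rs` still compiles. A red test that calls a `MacPolicy` method that does not exist yet breaks compilation of the whole test binary, in which case every test in that file fails together until Task 4.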
- [ ] **Step 4: Commit the red backend/security tests**
```bash
git add src/browser/callback_backend.rs tests/browser_tool_test.rs
git commit -m "test: lock local dashboard navigate boundary"
```
### Task 4: Implement the narrow callback-backend and MacPolicy allowance
**Files:**
- Modify: `src/browser/callback_backend.rs:300-351`
- Modify: `src/security/mac_policy.rs:56-132`
- Maybe modify: `src/security/mod.rs:9-27`
- Test: `src/browser/callback_backend.rs:536-840`
- Test: `tests/browser_tool_test.rs` (`default_rules_allow_zhihu_navigation` section plus new local-dashboard validation tests)
- [ ] **Step 1: Add a narrow local-dashboard validation helper in `MacPolicy`**
Keep `validate(...)` unchanged for ordinary domain flow. Add one small explicit helper instead.
Target shape:
```rust
pub fn validate_local_dashboard_presentation(
    &self,
    action: &Action,
    expected_domain: &str,
    presentation_url: &str,
    output_path: &str,
) -> Result<(), SecurityError> {
    // require Action::Navigate
    // require expected_domain == "__sgclaw_local_dashboard__"
    // require file:// URL
    // require .html path
    // require normalized file URL path matches output_path
}
```
If you need a new `SecurityError` variant for malformed local-dashboard input, add the smallest one that keeps error text clear.
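The five checks in the target shape might look like the following when written out. This is a standalone sketch: `Action`, `SecurityError::InvalidLocalDashboard`, `reject`, and `normalize` are hypothetical stand-ins, the function is free-standing rather than a `MacPolicy` method, and the normalization deliberately ignores percent-encoding.

```rust
/// Hypothetical stand-ins for the crate's real `Action` and `SecurityError` types.
#[derive(Debug, PartialEq)]
pub enum Action {
    Navigate,
    Click,
}

#[derive(Debug, PartialEq)]
pub enum SecurityError {
    InvalidLocalDashboard(String),
}

const LOCAL_DASHBOARD_EXPECTED_DOMAIN: &str = "__sgclaw_local_dashboard__";

fn reject(why: &str) -> Result<(), SecurityError> {
    Err(SecurityError::InvalidLocalDashboard(why.to_string()))
}

/// Forward slashes only, and no slash in front of a Windows drive letter.
fn normalize(path: &str) -> String {
    let unified = path.replace('\\', "/");
    let bytes = unified.as_bytes();
    let has_slash_before_drive =
        bytes.len() >= 3 && bytes[0] == b'/' && bytes[1].is_ascii_alphabetic() && bytes[2] == b':';
    if has_slash_before_drive {
        unified[1..].to_string()
    } else {
        unified
    }
}

pub fn validate_local_dashboard_presentation(
    action: &Action,
    expected_domain: &str,
    presentation_url: &str,
    output_path: &str,
) -> Result<(), SecurityError> {
    if *action != Action::Navigate {
        return reject("action must be Navigate");
    }
    if expected_domain != LOCAL_DASHBOARD_EXPECTED_DOMAIN {
        return reject("expected_domain is not the local-dashboard marker");
    }
    // Everything after the scheme is treated as the file path.
    let url_path = match presentation_url.strip_prefix("file://") {
        Some(rest) => rest,
        None => return reject("presentation_url must be a file:// URL"),
    };
    if !url_path.to_ascii_lowercase().ends_with(".html") {
        return reject("presentation_url must point at an .html file");
    }
    if normalize(url_path) != normalize(output_path) {
        return reject("presentation_url does not match output_path");
    }
    Ok(())
}
```

Every check returns early with its own reason, which keeps the error text specific without adding more than one new error variant.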
- [ ] **Step 2: Recognize only the exact approved request shape in `BrowserCallbackBackend::invoke(...)`**
Before the normal `self.mac_policy.validate(&action, expected_domain)?` path runs, detect the one approved special case.
Minimal target behavior:
```rust
if let Some(local_dashboard) = approved_local_dashboard_request(&action, &params, expected_domain) {
    self.mac_policy.validate_local_dashboard_presentation(
        &action,
        expected_domain,
        &local_dashboard.presentation_url,
        &local_dashboard.output_path,
    )?;
} else {
    self.mac_policy.validate(&action, expected_domain)?;
}
```
The helper should require all of these fields exactly:
```text
- action == Action::Navigate
- expected_domain == "__sgclaw_local_dashboard__"
- params.url exists
- params.sgclaw_local_dashboard_open.source == "compat.workflow_executor"
- params.sgclaw_local_dashboard_open.kind == "zhihu_hotlist_screen"
- params.sgclaw_local_dashboard_open.output_path exists
- params.sgclaw_local_dashboard_open.presentation_url exists and equals params.url
```
Anything else must continue down the normal rejection path.
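The field-by-field requirement list can be expressed as a recognizer that returns `None` for anything short of an exact match. The sketch below is illustrative only: it models the JSON params with plain structs (`NavigateParams`, `DashboardMarker`) so it needs no JSON crate, and those types plus `LocalDashboardRequest` are hypothetical names rather than the crate's real ones.

```rust
/// Hypothetical stand-in for the crate's `Action` type.
#[derive(Debug, PartialEq)]
pub enum Action {
    Navigate,
    Click,
}

/// Stand-in for the `sgclaw_local_dashboard_open` JSON object.
#[derive(Debug, Clone, Default)]
pub struct DashboardMarker {
    pub source: Option<String>,
    pub kind: Option<String>,
    pub output_path: Option<String>,
    pub presentation_url: Option<String>,
}

/// Stand-in for the navigate params JSON.
#[derive(Debug, Clone, Default)]
pub struct NavigateParams {
    pub url: Option<String>,
    pub sgclaw_local_dashboard_open: Option<DashboardMarker>,
}

#[derive(Debug, PartialEq)]
pub struct LocalDashboardRequest {
    pub presentation_url: String,
    pub output_path: String,
}

/// Returns `Some` only when every required field is present and matches exactly.
pub fn approved_local_dashboard_request(
    action: &Action,
    params: &NavigateParams,
    expected_domain: &str,
) -> Option<LocalDashboardRequest> {
    if *action != Action::Navigate || expected_domain != "__sgclaw_local_dashboard__" {
        return None;
    }
    let url = params.url.as_deref()?;
    let marker = params.sgclaw_local_dashboard_open.as_ref()?;
    if marker.source.as_deref() != Some("compat.workflow_executor") {
        return None;
    }
    if marker.kind.as_deref() != Some("zhihu_hotlist_screen") {
        return None;
    }
    let output_path = marker.output_path.as_deref()?;
    let presentation_url = marker.presentation_url.as_deref()?;
    // The marker's presentation_url must equal the URL that will actually be opened.
    if presentation_url != url {
        return None;
    }
    Some(LocalDashboardRequest {
        presentation_url: presentation_url.to_string(),
        output_path: output_path.to_string(),
    })
}
```

Returning `None` instead of an error is what sends malformed requests down the ordinary `validate(...)` path, where the marker domain is not an allowed domain and the request is rejected as usual.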
- [ ] **Step 3: Keep `build_command(Action::Navigate, ...)` simple**
Do not add a second browser opcode or change the callback-host runtime contract. The approved local-dashboard case should still flow into the existing navigate command builder so the emitted command stays:
```rust
json!([
    self.helper_page_url,
    "sgBrowerserOpenPage",
    target_url,
])
```
- [ ] **Step 4: Run the focused backend/security tests to verify green**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_accepts_approved_local_dashboard_navigate_request --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" mac_policy_rejects_non_html_local_dashboard_presentation --test browser_tool_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" default_rules_allow_zhihu_navigation --test browser_tool_test -- --exact
```
Expected: PASS
- [ ] **Step 5: Re-run the dashboard workflow regression after backend validation lands**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
```
Expected: PASS
- [ ] **Step 6: Commit the backend/security implementation**
```bash
git add src/browser/callback_backend.rs src/security/mac_policy.rs src/security/mod.rs tests/browser_tool_test.rs tests/compat_runtime_test.rs
git commit -m "fix: allow approved local dashboard auto-open"
```
If `src/security/mod.rs` did not change, omit it from the commit.
### Task 5: Run the focused verification sweep
**Files:**
- Verify: `src/compat/artifact_open.rs`
- Verify: `tests/compat_runtime_test.rs`
- Verify: `tests/compat_screen_html_export_tool_test.rs`
- Verify: `tests/browser_tool_test.rs`
- Verify: `src/browser/callback_backend.rs` test module
- Reference only if summary wording ripples outward: `tests/agent_runtime_test.rs:173-258`
- Reference only if summary wording ripples outward: `tests/service_task_flow_test.rs:704-839`
- Reference only if summary wording ripples outward: `tests/service_ws_session_test.rs:755-869`
- [ ] **Step 1: Re-run the library and workflow regressions**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_passes_generated_path_to_launcher --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_exported_xlsx_with_reports_launcher_failure --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" open_local_dashboard_uses_exact_approved_marker_payload --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_xlsx_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_chains_hotlist_skill_into_screen_export_and_auto_open --test compat_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" handle_browser_message_reports_dashboard_auto_open_protocol_error_when_presentation_url_is_missing --test compat_runtime_test -- --exact
```
Expected: PASS
- [ ] **Step 2: Re-run the tool contract regression that the dashboard route depends on**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" screen_html_export_tool_renders_dashboard_html_with_presentation_contract --test compat_screen_html_export_tool_test -- --exact
```
Expected: PASS
- [ ] **Step 3: Re-run the callback-backend and policy boundary tests**
Run:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_accepts_approved_local_dashboard_navigate_request --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" callback_backend_rejects_local_dashboard_navigate_without_required_marker_fields --lib -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" mac_policy_rejects_non_html_local_dashboard_presentation --test browser_tool_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" default_rules_allow_zhihu_navigation --test browser_tool_test -- --exact
```
Expected: PASS
- [ ] **Step 4: Re-run outward-facing summary regressions only if needed**
Only if the updated summary text breaks existing assertions, run exactly these existing regressions and adjust only the affected expectation text:
```bash
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" production_submit_task_routes_zhihu_through_ws_backend_without_helper_bootstrap --test agent_runtime_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" client_to_service_regression_routes_zhihu_without_helper_bootstrap_or_invalid_hmac_seed_output --test service_task_flow_test -- --exact
cargo test --manifest-path "D:/data/ideaSpace/rust/sgClaw/claw-new/Cargo.toml" service_binary_submit_flow_routes_zhihu_without_helper_bootstrap --test service_ws_session_test -- --exact
```
Expected: PASS for any test you had to touch. Skip this step entirely if those files needed no edits.
- [ ] **Step 5: Inspect scope before finishing with exact git commands**
Run:
```bash
git diff --name-only -- src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs src/browser/callback_backend.rs src/security/mac_policy.rs src/security/mod.rs tests/compat_runtime_test.rs tests/browser_tool_test.rs tests/agent_runtime_test.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs
git diff --stat -- src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs src/browser/callback_backend.rs src/security/mac_policy.rs src/security/mod.rs tests/compat_runtime_test.rs tests/browser_tool_test.rs tests/agent_runtime_test.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs
```
Confirm the diff only touches:
```text
- compat workflow/orchestration
- compat post-export helper module
- callback backend narrow local-dashboard acceptance
- MacPolicy narrow local-dashboard validation
- focused related tests
```
Confirm it does **not** touch:
```text
- frontend/service-console/
- src/service/protocol.rs
- browser-helper.html
- callback-host endpoint contracts
- websocket transport/protocol files
```
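The "does not touch" check can also be done mechanically on the output of `git diff --name-only`. The helper below is a hypothetical sketch: `out_of_scope_paths` and `FORBIDDEN_MARKERS` are illustrative names, and only the three list entries that name a concrete path are covered; the callback-host and websocket entries still need a manual look.

```rust
/// Entries from the "does not touch" list that name a concrete path or file.
const FORBIDDEN_MARKERS: [&str; 3] = [
    "frontend/service-console/",
    "src/service/protocol.rs",
    "browser-helper.html",
];

/// Returns every changed path that contains one of the forbidden markers.
pub fn out_of_scope_paths<'a>(changed: &[&'a str]) -> Vec<&'a str> {
    changed
        .iter()
        .copied()
        .filter(|path| FORBIDDEN_MARKERS.iter().any(|marker| path.contains(*marker)))
        .collect()
}
```

An empty result means none of the listed paths changed; it says nothing about the two entries that have no path in the list.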
- [ ] **Step 6: Commit only if verification required additional code changes**
```bash
git add src/compat/artifact_open.rs src/compat/mod.rs src/compat/workflow_executor.rs src/browser/callback_backend.rs src/security/mac_policy.rs tests/compat_runtime_test.rs tests/browser_tool_test.rs tests/agent_runtime_test.rs tests/service_task_flow_test.rs tests/service_ws_session_test.rs
git commit -m "test: tighten hotlist post-export auto-open verification"
```
If verification required no further code changes, do not create an extra commit.