claw/docs/superpowers/plans/2026-04-01-claw-ws-parallel-transport.md

# Claw-WS Parallel Transport Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add a parallel `claw-ws` transport path that keeps the current pipe mode intact while introducing a long-lived `sg_claw` local service, an interactive `sg_claw_client`, and a browser WebSocket backend at `ws://127.0.0.1:12345`.

**Architecture:** First extract a transport-agnostic submit-task runner and browser backend abstraction from the current pipe-coupled flow. Keep the existing pipe path as one adapter/backend, then add a fixed-protocol browser WebSocket backend plus a small service/session layer and an interactive CLI client that reuse the same runtime, orchestration, and browser-facing tool adapters.

**Tech Stack:** Rust 2021, current sgclaw compat runtime, zeroclaw runtime engine, `serde`/`serde_json`, existing `MacPolicy`, and a blocking WebSocket crate for v1 (`tungstenite` preferred over a broad async rewrite).

---

## Scope Guardrails

- Keep the current pipe mode entrypoint and behavior working.
- Do **not** replace the existing browser pipe path.
- Add a **parallel** WebSocket path only.
- v1 supports **one active client session** only.
- Reuse existing tool names and runtime behavior whenever possible.
- If WS `Eval` support is incomplete, disable eval-dependent browser-script skill exposure in WS mode rather than shipping partial behavior.
- Do not broaden v1 with task queues, multi-client support, or admin endpoints.

---

## File Structure

### Existing files to reuse

- Modify: `src/lib.rs` — current pipe bootstrap and receive loop; keep as the legacy pipe entrypoint.
- Modify: `src/agent/mod.rs` — current `BrowserMessage::SubmitTask` entrypoint and config-loading flow.
- Modify: `src/compat/runtime.rs` — compat runtime and tool assembly.
- Modify: `src/compat/orchestration.rs` — direct workflow vs compat runtime routing.
- Modify: `src/compat/browser_tool_adapter.rs` — exposes `browser_action` and `superrpa_browser`.
- Modify: `src/compat/browser_script_skill_tool.rs` — browser-script skill execution.
- Modify: `src/compat/workflow_executor.rs` — direct browser workflows such as Zhihu flows.
- Reuse: `src/pipe/browser_tool.rs` — current browser command executor; retain as the pipe backend implementation.
- Reuse: `src/pipe/protocol.rs` — `BrowserMessage`, `AgentMessage`, `Action`, `ExecutionSurfaceMetadata`.
- Reuse: `src/security/mac_policy.rs` — local action/domain guardrails.
- Modify: `src/config/settings.rs` — minimal new config surface for WS mode.
- Optional modify: `src/runtime/engine.rs` — only if backend capability wiring requires it.

### New files to create

- Create: `src/agent/task_runner.rs` — shared submit-task execution entrypoint.
- Create: `src/browser/mod.rs` — browser backend exports.
- Create: `src/browser/backend.rs` — `BrowserBackend` trait and helpers.
- Create: `src/browser/pipe_backend.rs` — wrapper around existing `BrowserPipeTool`.
- Create: `src/browser/ws_protocol.rs` — fixed browser WS request/response codec.
- Create: `src/browser/ws_backend.rs` — browser WS backend with blocking invoke semantics.
- Create: `src/service/mod.rs` — service exports.
- Create: `src/service/protocol.rs` — client/service WS message types.
- Create: `src/service/server.rs` — single-session `sg_claw` server.
- Create: `src/bin/sg_claw.rs` — service binary.
- Create: `src/bin/sg_claw_client.rs` — interactive CLI client.
- Create: `tests/task_runner_test.rs` — shared submit-task runner regressions.
- Create: `tests/browser_backend_capability_test.rs` — backend capability/tool exposure tests.
- Create: `tests/browser_ws_protocol_test.rs` — browser WS protocol tests.
- Create: `tests/browser_ws_backend_test.rs` — browser WS backend tests.
- Create: `tests/service_ws_session_test.rs` — single-session server tests.
- Create: `tests/service_task_flow_test.rs` — client/service task flow tests.

---

## Task 1: Extract a shared submit-task runner

**Files:**
- Create: `src/agent/task_runner.rs`
- Modify: `src/agent/mod.rs`
- Modify: `src/lib.rs`
- Test: `tests/task_runner_test.rs`
- Reuse: `src/compat/runtime.rs`, `src/compat/orchestration.rs`

- [ ] **Step 1: Write a failing runner regression test**

Create `tests/task_runner_test.rs` covering:
- an empty instruction returns the same `TaskComplete` failure summary as the current pipe path
- a missing LLM config returns the same summary shape as the current pipe path
- the pipe adapter still emits `LogEntry` before `TaskComplete`

- [ ] **Step 2: Run the targeted regression tests first**

Run:

```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```

Expected: `task_runner_test` fails because the shared runner does not exist yet.

- [ ] **Step 3: Define the transport-neutral request model**

Create `src/agent/task_runner.rs` with a request struct that mirrors the current pipe payload:

```rust
pub struct SubmitTaskRequest {
    pub instruction: String,
    pub conversation_id: Option<String>,
    pub messages: Vec<ConversationMessage>,
    pub page_url: Option<String>,
    pub page_title: Option<String>,
}
```

Normalize empty strings to `None` at the adapter boundary.
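
A minimal sketch of that normalization, assuming a hypothetical helper named `normalize_optional` (not part of the current codebase). Treating whitespace-only strings as empty is an extra assumption; drop the `trim()` if the pipe payload should be compared byte-for-byte.

```rust
/// Hypothetical adapter-boundary helper: an empty (or whitespace-only)
/// string is treated as "not provided". The original string is returned
/// unmodified when it carries content.
pub fn normalize_optional(value: Option<String>) -> Option<String> {
    value.filter(|s| !s.trim().is_empty())
}
```

Each transport adapter would call this once per optional field (`conversation_id`, `page_url`, `page_title`) when building `SubmitTaskRequest`, so the shared runner never sees `Some("")`.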

- [ ] **Step 4: Define an event sink abstraction**

Add a small trait that can emit the current agent events without depending on a specific transport:

```rust
pub trait AgentEventSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError>;
}
```

The existing pipe transport should implement this first.
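
For the regression tests, an in-memory sink is often enough. The sketch below is one possible shape; `RecordingSink` is a hypothetical name, and the `AgentMessage` and `PipeError` definitions are simplified stand-ins for the real types in `src/pipe`, whose actual fields may differ.

```rust
use std::sync::Mutex;

// Stand-ins for the real `AgentMessage` and `PipeError`; shapes are assumed.
#[derive(Debug, Clone, PartialEq)]
pub enum AgentMessage {
    LogEntry { level: String, message: String },
    TaskComplete { success: bool, summary: String },
}

#[derive(Debug)]
pub struct PipeError(pub String);

pub trait AgentEventSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError>;
}

/// Hypothetical test double that records every event in arrival order.
#[derive(Default)]
pub struct RecordingSink {
    events: Mutex<Vec<AgentMessage>>,
}

impl RecordingSink {
    /// Returns a snapshot of everything sent so far.
    pub fn events(&self) -> Vec<AgentMessage> {
        self.events.lock().unwrap().clone()
    }
}

impl AgentEventSink for RecordingSink {
    fn send(&self, message: &AgentMessage) -> Result<(), PipeError> {
        self.events
            .lock()
            .map_err(|_| PipeError("sink lock poisoned".to_string()))?
            .push(message.clone());
        Ok(())
    }
}
```

Because `send` takes `&self`, the sink needs interior mutability; a `Mutex` keeps it usable from a service thread as well as from single-threaded tests.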

- [ ] **Step 5: Move submit-task execution into a shared function**

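Extract the body currently inside `BrowserMessage::SubmitTask` handling from `src/agent/mod.rs` into a shared function such as the one below. Note that the `BrowserBackend` trait is only introduced in Task 2; until it lands, keep this parameter typed as the existing pipe browser tool.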

```rust
pub fn run_submit_task(
    sink: &dyn AgentEventSink,
    browser_backend: Arc<dyn BrowserBackend>,
    context: &AgentRuntimeContext,
    request: SubmitTaskRequest,
) -> Result<(), PipeError>
```

This function must still:
- validate empty instruction
- load sgclaw settings
- log runtime/config info
- choose orchestration vs compat runtime
- emit `AgentMessage::TaskComplete`
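
The ordering of those responsibilities can be sketched as follows. This is not the real function: the types are simplified stand-ins, `execute` is a hypothetical closure standing in for settings loading plus the orchestration/compat-runtime choice, and the summary text for an empty instruction is a placeholder that must be replaced with whatever the current pipe path emits.

```rust
use std::cell::RefCell;

// Stand-in types; the real ones live in `src/pipe/protocol.rs` and may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum AgentMessage {
    LogEntry { level: String, message: String },
    TaskComplete { success: bool, summary: String },
}

pub trait AgentEventSink {
    fn send(&self, message: &AgentMessage) -> Result<(), String>;
}

/// Simple single-threaded sink used to observe the emitted events.
#[derive(Default)]
pub struct VecSink(pub RefCell<Vec<AgentMessage>>);

impl AgentEventSink for VecSink {
    fn send(&self, message: &AgentMessage) -> Result<(), String> {
        self.0.borrow_mut().push(message.clone());
        Ok(())
    }
}

pub struct SubmitTaskRequest {
    pub instruction: String,
}

/// Validate, log, execute, then finish with exactly one `TaskComplete`.
pub fn run_submit_task_sketch(
    sink: &dyn AgentEventSink,
    request: SubmitTaskRequest,
    execute: impl FnOnce(&str) -> Result<String, String>,
) -> Result<(), String> {
    let instruction = request.instruction.trim();
    if instruction.is_empty() {
        // Placeholder summary; the real text must match current pipe behavior.
        return sink.send(&AgentMessage::TaskComplete {
            success: false,
            summary: "instruction is empty".to_string(),
        });
    }
    sink.send(&AgentMessage::LogEntry {
        level: "info".to_string(),
        message: format!("running task: {}", instruction),
    })?;
    let (success, summary) = match execute(instruction) {
        Ok(summary) => (true, summary),
        Err(error) => (false, error),
    };
    sink.send(&AgentMessage::TaskComplete { success, summary })
}
```

Execution failures are reported through `TaskComplete` rather than through the function's `Err` path, which is reserved here for transport (sink) failures.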

- [ ] **Step 6: Keep pipe mode as a thin adapter**

Refactor `handle_browser_message_with_context(...)` in `src/agent/mod.rs` so it only:
- pattern matches `BrowserMessage`
- converts `SubmitTask` into `SubmitTaskRequest`
- forwards into `run_submit_task(...)`

- [ ] **Step 7: Re-run the runner regressions**

Run:

```bash
cargo test --test runtime_task_flow_test --test task_runner_test
```

Expected: both test targets pass and pipe behavior remains unchanged.

- [ ] **Step 8: Commit**

```bash
git add src/agent/mod.rs src/agent/task_runner.rs src/lib.rs tests/task_runner_test.rs
git commit -m "refactor: extract shared submit-task runner"
```

---

## Task 2: Introduce a browser backend abstraction and wrap the current pipe implementation

**Files:**
- Create: `src/browser/mod.rs`
- Create: `src/browser/backend.rs`
- Create: `src/browser/pipe_backend.rs`
- Modify: `src/lib.rs`
- Modify: `src/compat/browser_tool_adapter.rs`
- Modify: `src/compat/browser_script_skill_tool.rs`
- Modify: `src/compat/runtime.rs`
- Modify: `src/compat/orchestration.rs`
- Modify: `src/compat/workflow_executor.rs`
- Test: `tests/browser_backend_capability_test.rs`
- Reuse: `src/pipe/browser_tool.rs`, `src/security/mac_policy.rs`

- [ ] **Step 1: Add a failing backend capability test**

Create `tests/browser_backend_capability_test.rs` to verify:
- pipe backend still exposes privileged surface metadata
- pipe backend still supports `Eval`
- browser-script tool exposure is disabled when `supports_eval()` is false

- [ ] **Step 2: Run the current browser adapter tests first**

Run:

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```

Expected: new capability test fails because the backend abstraction does not exist yet.

- [ ] **Step 3: Define the shared browser interface**

Create `src/browser/backend.rs`:

```rust
pub trait BrowserBackend: Send + Sync {
    fn invoke(
        &self,
        action: Action,
        params: Value,
        expected_domain: &str,
    ) -> Result<CommandOutput, PipeError>;

    fn surface_metadata(&self) -> ExecutionSurfaceMetadata;

    fn supports_eval(&self) -> bool {
        true
    }
}
```

- [ ] **Step 4: Implement the pipe backend as a wrapper**

Create `src/browser/pipe_backend.rs` that stores the current `BrowserPipeTool<T>` and forwards `invoke(...)` and `surface_metadata()` unchanged.

Pipe mode must continue using:
- `perform_handshake(...)`
- `MacPolicy::load_from_path(...)`
- `BrowserPipeTool::new(...).with_response_timeout(...)`

- [ ] **Step 5: Refactor runtime and tool adapters to depend on `Arc<dyn BrowserBackend>`**

Update:
- `src/compat/browser_tool_adapter.rs`
- `src/compat/browser_script_skill_tool.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`

Preserve external tool names:
- `browser_action`
- `superrpa_browser`

- [ ] **Step 6: Add capability gating for eval-dependent script tools**

If `supports_eval()` is false, do **not** expose browser-script skill tools from `build_browser_script_skill_tools(...)` in that backend mode.
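
A minimal sketch of the gate, assuming a simplified signature for `build_browser_script_skill_tools(...)`. The real builder takes different arguments and returns runtime tool objects; `ScriptSkillTool`, `PipeLike`, and `NoEvalWs` are hypothetical names used only for illustration.

```rust
use std::sync::Arc;

// Minimal stand-in for `BrowserBackend`; only the capability probe is modeled.
pub trait BrowserBackend: Send + Sync {
    fn supports_eval(&self) -> bool {
        true
    }
}

/// Hypothetical tool descriptor standing in for the real tool type.
#[derive(Debug, PartialEq)]
pub struct ScriptSkillTool {
    pub name: String,
}

/// Exposes no script tools at all when the backend cannot run `Eval`.
pub fn build_browser_script_skill_tools(
    backend: &Arc<dyn BrowserBackend>,
    skill_names: &[&str],
) -> Vec<ScriptSkillTool> {
    if !backend.supports_eval() {
        return Vec::new();
    }
    skill_names
        .iter()
        .map(|name| ScriptSkillTool { name: name.to_string() })
        .collect()
}

/// Backend that keeps the default `supports_eval() == true`.
pub struct PipeLike;
impl BrowserBackend for PipeLike {}

/// Backend that opts out of eval-dependent tools.
pub struct NoEvalWs;
impl BrowserBackend for NoEvalWs {
    fn supports_eval(&self) -> bool {
        false
    }
}
```

Gating at the builder keeps the decision in one place, so the model never sees a script tool that would fail at call time.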

- [ ] **Step 7: Re-run browser adapter tests**

Run:

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test browser_backend_capability_test
```

Expected: all three pass.

- [ ] **Step 8: Commit**

```bash
git add src/browser src/lib.rs src/compat/browser_tool_adapter.rs src/compat/browser_script_skill_tool.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/browser_backend_capability_test.rs
git commit -m "refactor: abstract browser backend from pipe transport"
```

---

## Task 3: Implement the fixed browser WebSocket protocol codec in isolation

**Files:**
- Create: `src/browser/ws_protocol.rs`
- Test: `tests/browser_ws_protocol_test.rs`
- Reuse: `docs/_tmp_sgbrowser_ws_api_doc.txt`

- [ ] **Step 1: Write failing protocol codec tests**

Create `tests/browser_ws_protocol_test.rs` covering:
- exact outbound frame encoding
- callback payload decoding
- unknown callback format rejection
- mapping coverage for every supported v1 action

- [ ] **Step 2: Run the protocol tests first**

Run:

```bash
cargo test --test browser_ws_protocol_test
```

Expected: fail because the WS protocol codec does not exist yet.

- [ ] **Step 3: Encode the exact browser frame shapes**

Create `src/browser/ws_protocol.rs` so it can build exact array-form payloads such as:

```rust
// Array-form frame; the lowercase identifiers are caller-supplied values.
serde_json::json!([requesturl, "sgBrowserExcuteJsCodeByArea", target_url, js_code, area])
```

Serialize to the JSON string format required by the browser service.
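
To make the frame shape concrete without pulling in a dependency, here is a std-only sketch. The real codec would most likely use `serde_json`; the hand-written escaper below is a swapped-in stand-in, `encode_frame` is a hypothetical name, and treating every positional argument as a string is an assumption to check against the browser API doc.

```rust
/// Escapes one value as a JSON string literal (std-only stand-in for `serde_json`).
fn json_string(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Builds the array-form frame: [request_url, function_name, args...].
pub fn encode_frame(request_url: &str, function_name: &str, args: &[&str]) -> String {
    let mut parts = vec![json_string(request_url), json_string(function_name)];
    parts.extend(args.iter().map(|arg| json_string(arg)));
    format!("[{}]", parts.join(","))
}
```

Because the output is a plain string, the codec tests can assert on exact bytes with no socket involved.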

- [ ] **Step 4: Define the v1 action mapping table**

Support only the actions already needed by current sgclaw flows:
- `Navigate`
- `GetText`
- `Click`
- `Type`
- `Eval`

Document which browser function each action maps to and which assumptions each mapping relies on.

- [ ] **Step 5: Define callback parsing and correlation rules**

Model callback-bearing operations as explicit types, and document the callback-function naming or request-correlation strategy the backend relies on to match each callback to its request.

- [ ] **Step 6: Reject unsupported or malformed shapes early**

Fail fast for:
- unsupported actions
- malformed callback payloads
- missing request correlation metadata
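
The three rejection cases could be sketched like this. Everything here is illustrative: `Action` is a stand-in for the real enum, `Screenshot` is a hypothetical non-v1 variant included only to show rejection, and the `name|payload` callback shape is invented for the example. The real callback format must come from the browser API doc.

```rust
// Stand-in for the real `Action` enum; `Screenshot` is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    Navigate,
    GetText,
    Click,
    Type,
    Eval,
    Screenshot,
}

#[derive(Debug, PartialEq)]
pub enum WsProtocolError {
    UnsupportedAction(Action),
    MalformedCallback(String),
    MissingCorrelation,
}

/// Accepts only the five v1 actions; everything else fails before encoding.
pub fn ensure_supported(action: Action) -> Result<Action, WsProtocolError> {
    match action {
        Action::Navigate | Action::GetText | Action::Click | Action::Type | Action::Eval => {
            Ok(action)
        }
        other => Err(WsProtocolError::UnsupportedAction(other)),
    }
}

/// Parses an invented "<callback_name>|<payload>" shape into its two parts.
pub fn parse_callback(raw: &str) -> Result<(String, String), WsProtocolError> {
    let (name, payload) = raw
        .split_once('|')
        .ok_or_else(|| WsProtocolError::MalformedCallback(raw.to_string()))?;
    if name.trim().is_empty() {
        // Without a callback name there is nothing to correlate against.
        return Err(WsProtocolError::MissingCorrelation);
    }
    Ok((name.to_string(), payload.to_string()))
}
```

Distinct error variants let the backend tests assert on the exact failure reason instead of matching on message text.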

- [ ] **Step 7: Re-run the protocol tests**

Run:

```bash
cargo test --test browser_ws_protocol_test
```

Expected: pass with no network dependency.

- [ ] **Step 8: Commit**

```bash
git add src/browser/ws_protocol.rs tests/browser_ws_protocol_test.rs
git commit -m "test: codify fixed browser websocket protocol"
```

---

## Task 4: Build the browser WS backend with synchronous invoke semantics

**Files:**
- Create: `src/browser/ws_backend.rs`
- Modify: `src/browser/mod.rs`
- Test: `tests/browser_ws_backend_test.rs`
- Reuse: `CommandOutput`, `PipeError`, `ExecutionSurfaceMetadata`, `MacPolicy`

- [ ] **Step 1: Write failing backend behavior tests**

Create `tests/browser_ws_backend_test.rs` covering:
- zero return + no callback => success
- non-zero return => failure
- zero return + callback => success with normalized `CommandOutput`
- callback timeout => timeout error
- dropped socket => clear failure

- [ ] **Step 2: Run backend tests first**

Run:

```bash
cargo test --test browser_ws_backend_test
```

Expected: fail because the WS backend does not exist yet.

- [ ] **Step 3: Build a long-lived browser connection manager**

Implement `src/browser/ws_backend.rs` to connect to `ws://127.0.0.1:12345` and expose blocking `invoke(...)` calls.

Use a dedicated connection loop plus request/response coordination instead of scattering raw socket calls through the runtime.

- [ ] **Step 4: Preserve local guardrails before send**

Validate each action against `MacPolicy` before translating it into the browser WS protocol, matching current pipe backend behavior.

- [ ] **Step 5: Normalize immediate status returns and delayed callbacks**

For each `invoke(...)` call:
- fail immediately on non-zero return codes
- succeed immediately for operations with no data callback
- wait for the matching callback for result-bearing operations
- convert the final outcome into `CommandOutput`
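
The decision logic above can be sketched with a standard-library channel standing in for the connection loop's callback delivery. `resolve_invoke`, `WsInvokeError`, and `Output` are hypothetical names (`Output` is a simplified stand-in for `CommandOutput`), and the sketch assumes requests are serialized so the next callback received is the matching one.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

#[derive(Debug, PartialEq)]
pub enum WsInvokeError {
    BrowserStatus(i32),
    CallbackTimeout,
    Disconnected,
}

/// Simplified stand-in for `CommandOutput`.
#[derive(Debug, PartialEq)]
pub struct Output {
    pub success: bool,
    pub data: Option<String>,
}

/// Turns an immediate status code plus an optional delayed callback
/// into a single blocking result.
pub fn resolve_invoke(
    status: i32,
    expects_callback: bool,
    callbacks: &Receiver<String>,
    timeout: Duration,
) -> Result<Output, WsInvokeError> {
    if status != 0 {
        return Err(WsInvokeError::BrowserStatus(status));
    }
    if !expects_callback {
        return Ok(Output { success: true, data: None });
    }
    match callbacks.recv_timeout(timeout) {
        Ok(payload) => Ok(Output { success: true, data: Some(payload) }),
        Err(RecvTimeoutError::Timeout) => Err(WsInvokeError::CallbackTimeout),
        // The connection loop dropped its sender, i.e. the socket went away.
        Err(RecvTimeoutError::Disconnected) => Err(WsInvokeError::Disconnected),
    }
}
```

The five outcomes map one-to-one onto the backend behavior tests listed in Step 1, which makes this function a convenient unit to test with a fake channel.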

- [ ] **Step 6: Keep v1 concurrency intentionally serialized**

Allow only one in-flight browser request at a time unless the browser callback protocol is shown to provide a stable request-id guarantee.

- [ ] **Step 7: Re-run backend tests**

Run:

```bash
cargo test --test browser_ws_backend_test
```

Expected: pass using mocks/fakes, not the real browser.

- [ ] **Step 8: Commit**

```bash
git add src/browser/mod.rs src/browser/ws_backend.rs tests/browser_ws_backend_test.rs
git commit -m "feat: add browser websocket backend"
```

---

## Task 5: Add the `sg_claw` service protocol and single-session server

**Files:**
- Create: `src/service/mod.rs`
- Create: `src/service/protocol.rs`
- Create: `src/service/server.rs`
- Create: `src/bin/sg_claw.rs`
- Modify: `src/lib.rs`
- Modify: `Cargo.toml`
- Test: `tests/service_ws_session_test.rs`
- Reuse: `AgentMessage::LogEntry`, `AgentMessage::TaskComplete`, `SubmitTaskRequest`, `run_submit_task(...)`

- [ ] **Step 1: Write failing service session tests**

Create `tests/service_ws_session_test.rs` to verify:
- first client attaches
- second client gets `Busy`
- disconnect resets session state
- overlapping task submission is rejected clearly

- [ ] **Step 2: Run the session tests first**

Run:

```bash
cargo test --test service_ws_session_test
```

Expected: fail because the service layer does not exist yet.

- [ ] **Step 3: Define a thin client/service WS protocol**

In `src/service/protocol.rs`, reuse existing task/event shapes as much as possible:

```rust
ClientMessage::SubmitTask { instruction, conversation_id, messages, page_url, page_title }
ClientMessage::Ping
ServiceMessage::LogEntry { level, message }
ServiceMessage::TaskComplete { success, summary }
ServiceMessage::Busy { message }
```

- [ ] **Step 4: Add the service event sink adapter**

Implement `AgentEventSink` for the service session writer so the shared task runner can stream `LogEntry` and `TaskComplete` over the service WebSocket.

- [ ] **Step 5: Implement single-active-client session state**

Model explicit states such as:
- `Idle`
- `ClientAttached`
- `TaskRunning`

Reject a second client with `ServiceMessage::Busy` and close the socket. Reject overlapping tasks instead of queueing them.
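
One way to keep those transitions explicit is a small state machine. The sketch below uses the three state names from this step; `Session`, `SessionError`, and the method names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SessionState {
    Idle,
    ClientAttached,
    TaskRunning,
}

#[derive(Debug, PartialEq)]
pub enum SessionError {
    Busy,
    NoClient,
    TaskAlreadyRunning,
}

#[derive(Debug)]
pub struct Session {
    state: SessionState,
}

impl Session {
    pub fn new() -> Self {
        Session { state: SessionState::Idle }
    }

    pub fn state(&self) -> SessionState {
        self.state
    }

    /// Only an idle service accepts a client; anyone else gets `Busy`.
    pub fn attach(&mut self) -> Result<(), SessionError> {
        match self.state {
            SessionState::Idle => {
                self.state = SessionState::ClientAttached;
                Ok(())
            }
            _ => Err(SessionError::Busy),
        }
    }

    /// Overlapping tasks are rejected rather than queued.
    pub fn start_task(&mut self) -> Result<(), SessionError> {
        match self.state {
            SessionState::ClientAttached => {
                self.state = SessionState::TaskRunning;
                Ok(())
            }
            SessionState::TaskRunning => Err(SessionError::TaskAlreadyRunning),
            SessionState::Idle => Err(SessionError::NoClient),
        }
    }

    /// Returns to `ClientAttached` once the running task completes.
    pub fn finish_task(&mut self) {
        if self.state == SessionState::TaskRunning {
            self.state = SessionState::ClientAttached;
        }
    }

    /// A disconnect always resets the session, even mid-task.
    pub fn detach(&mut self) {
        self.state = SessionState::Idle;
    }
}
```

The server would map `SessionError::Busy` to `ServiceMessage::Busy` before closing the second socket. Whether a mid-task disconnect should also cancel the running task is left open here.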

- [ ] **Step 6: Add the service binary**

Create `src/bin/sg_claw.rs` that:
- loads config
- creates the browser WS backend
- listens for local client connections
- routes `SubmitTask` into `run_submit_task(...)`

Keep `src/main.rs` and the existing `sgclaw::run()` pipe path unchanged.

- [ ] **Step 7: Re-run the session tests**

Run:

```bash
cargo test --test service_ws_session_test
```

Expected: pass without the real browser.

- [ ] **Step 8: Commit**

```bash
git add src/service src/bin/sg_claw.rs src/lib.rs Cargo.toml tests/service_ws_session_test.rs
git commit -m "feat: add claw-ws service entrypoint"
```

---

## Task 6: Add the `sg_claw_client` interactive CLI

**Files:**
- Create: `src/bin/sg_claw_client.rs`
- Modify: `Cargo.toml`
- Test: `tests/service_task_flow_test.rs`
- Reuse: `src/service/protocol.rs`

- [ ] **Step 1: Write failing client/service task flow tests**

Create `tests/service_task_flow_test.rs` to verify:
- the submit-task request reaches the service
- log entries stream in order
- the final summary arrives exactly once
- disconnect after task completion is handled cleanly

- [ ] **Step 2: Run the flow tests first**

Run:

```bash
cargo test --test service_task_flow_test
```

Expected: fail because the client does not exist yet.

- [ ] **Step 3: Implement a thin interactive client loop**

Create `src/bin/sg_claw_client.rs` that:
- connects to the local `sg_claw` service
- reads a line of user input
- sends `ClientMessage::SubmitTask`
- prints streamed `LogEntry` events as they arrive
- ends the turn on `TaskComplete`
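
The per-turn part of that loop can be sketched independently of the socket. Here the incoming events are any iterator and the output is any writer, so the logic is testable without a running service. `run_turn` is a hypothetical name, `ServiceMessage` is a simplified stand-in for the type in `src/service/protocol.rs`, and the printed line formats are placeholders.

```rust
use std::io::{self, Write};

// Stand-in for the service message type; field types are assumed.
#[derive(Debug, Clone, PartialEq)]
pub enum ServiceMessage {
    LogEntry { level: String, message: String },
    TaskComplete { success: bool, summary: String },
    Busy { message: String },
}

/// Prints streamed events until the turn ends. Returns `Some(success)` on
/// `TaskComplete`, or `None` if the service was busy or the stream ended early.
pub fn run_turn<I, W>(events: I, out: &mut W) -> io::Result<Option<bool>>
where
    I: IntoIterator<Item = ServiceMessage>,
    W: Write,
{
    for event in events {
        match event {
            ServiceMessage::LogEntry { level, message } => {
                writeln!(out, "[{}] {}", level, message)?;
            }
            ServiceMessage::TaskComplete { success, summary } => {
                writeln!(out, "{}", summary)?;
                return Ok(Some(success));
            }
            ServiceMessage::Busy { message } => {
                writeln!(out, "service busy: {}", message)?;
                return Ok(None);
            }
        }
    }
    Ok(None)
}
```

In the binary, `events` would be fed from the service WebSocket and `out` would be standard output. Returning on the first `TaskComplete` is what guarantees the summary is printed exactly once per turn.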

- [ ] **Step 4: Keep the client intentionally dumb**

Do **not** duplicate runtime logic in the client. Browser access, skills, orchestration, and task execution remain entirely inside the service.

- [ ] **Step 5: Re-run the flow tests**

Run:

```bash
cargo test --test service_task_flow_test
```

Expected: pass without the real browser.

- [ ] **Step 6: Build the new binaries explicitly**

Run:

```bash
cargo build --bin sg_claw --bin sg_claw_client
```

Expected: both binaries compile successfully.

- [ ] **Step 7: Commit**

```bash
git add src/bin/sg_claw_client.rs Cargo.toml tests/service_task_flow_test.rs
git commit -m "feat: add interactive claw-ws client"
```

---

## Task 7: Finish wiring, preserve pipe mode, and verify end-to-end behavior

**Files:**
- Modify: `Cargo.toml`
- Modify: `src/lib.rs`
- Modify: `src/config/settings.rs`
- Optional modify: `src/runtime/engine.rs`
- Reuse: `tests/browser_tool_test.rs`, `tests/runtime_task_flow_test.rs`, `tests/compat_runtime_test.rs`

- [ ] **Step 1: Add only the minimum config surface for v1**

Add settings such as:
- `browser_ws_url` defaulting to `ws://127.0.0.1:12345`
- `service_ws_listen_addr` defaulting to local loopback

Do **not** change the meaning of existing browser backend/profile settings just to represent service mode.
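
A possible shape for the two settings, kept separate from the existing config fields. `WsSettings` and `listens_on_loopback` are hypothetical names, the service port `42321` is a placeholder (the plan only specifies local loopback), and the real `src/config/settings.rs` presumably derives `serde` traits, which this std-only sketch omits.

```rust
use std::net::SocketAddr;

/// Hypothetical settings fragment for WS mode.
#[derive(Debug, Clone, PartialEq)]
pub struct WsSettings {
    pub browser_ws_url: String,
    pub service_ws_listen_addr: String,
}

impl Default for WsSettings {
    fn default() -> Self {
        WsSettings {
            browser_ws_url: "ws://127.0.0.1:12345".to_string(),
            // Placeholder port; only the loopback host is given by the plan.
            service_ws_listen_addr: "127.0.0.1:42321".to_string(),
        }
    }
}

impl WsSettings {
    /// True when the listen address parses and binds to loopback only.
    pub fn listens_on_loopback(&self) -> bool {
        self.service_ws_listen_addr
            .parse::<SocketAddr>()
            .map(|addr| addr.ip().is_loopback())
            .unwrap_or(false)
    }
}
```

A loopback check like this could run at service startup so a misconfigured address does not silently expose the single-session server on the network.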

- [ ] **Step 2: Keep external browser tool naming stable**

Verify that the runtime still exposes:
- `superrpa_browser`
- `browser_action`

under both pipe and WS modes where the backend supports them.

- [ ] **Step 3: Re-run the current pipe regression suite**

Run:

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test --test pipe_handshake_test --test pipe_protocol_test --test task_protocol_test
```

Expected: all existing pipe-oriented tests still pass unchanged.

- [ ] **Step 4: Run the new WS-focused suite**

Run:

```bash
cargo test --test task_runner_test --test browser_ws_protocol_test --test browser_ws_backend_test --test browser_backend_capability_test --test service_ws_session_test --test service_task_flow_test
```

Expected: all new tests pass without launching the real browser.

- [ ] **Step 5: Run a full Rust test sweep**

Run:

```bash
cargo test --tests
```

Expected: all Rust tests pass.

- [ ] **Step 6: Build all three binaries**

Run:

```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```

Expected: all three binaries compile.

- [ ] **Step 7: Perform a manual local smoke test**

Manual test:
1. Start the browser app so `ws://127.0.0.1:12345` is available.
2. Run `cargo run --bin sg_claw`.
3. In another terminal, run `cargo run --bin sg_claw_client`.
4. Submit a simple browser task such as opening a page or fetching visible text.
5. Confirm the client prints streaming logs and exactly one final completion summary.
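6. Confirm the old pipe-mode entry still starts via `cargo run --bin sgclaw` (plain `cargo run` is ambiguous once the crate has more than one binary, unless `default-run` is set in `Cargo.toml`).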

Expected: both modes work side-by-side.

- [ ] **Step 8: Commit**

```bash
git add Cargo.toml src/lib.rs src/config/settings.rs src/runtime/engine.rs
git commit -m "feat: wire parallel claw-ws transport"
```

---

## Verification Checklist

### Fast regression checks

```bash
cargo test --test browser_tool_test --test compat_browser_tool_test --test runtime_task_flow_test
```

Expected: current pipe/browser runtime behavior remains green.

### Full Rust test sweep

```bash
cargo test --tests
```

Expected: all Rust tests pass.

### Binary build verification

```bash
cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client
```

Expected: all three binaries compile.

### Manual end-to-end verification

- Browser app listening on `ws://127.0.0.1:12345`
- `cargo run --bin sg_claw`
- `cargo run --bin sg_claw_client`
- submit one browser task
- verify streaming logs, final completion, and single-client lock behavior
- verify `cargo run --bin sgclaw` still preserves the old pipe bootstrap

---

## Notes for Implementation

- Keep the current pipe bootstrap in `src/lib.rs` intact until the shared runner and pipe backend wrapper are both green.
- Prefer small commits at each task boundary.
- Keep the new WS path additive and isolated.
- Do not ship partial browser capabilities under stable tool names.
- Treat `docs/_tmp_sgbrowser_ws_api_doc.txt` as the browser WS protocol source of truth while implementing `src/browser/ws_protocol.rs`.