# WS Browser Integration Surface Correction Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Replace the unvalidated raw-ws-direct assumption with an evidence-backed decision: either prove a minimal sgBrowser bootstrap sequence for raw websocket control, or pivot to the real browser bridge surface.

**Architecture:** Treat the existing ws-native backend as a protocol/testing asset, not as a validated production integration surface. First build a narrow probe/validation harness that can run candidate bootstrap sequences and capture exact live transcripts from the real endpoint. Then branch decisively: if a reproducible bootstrap sequence yields real status/callback frames, implement that bootstrap path; otherwise stop raw-ws speculation and write the bridge-first implementation slice.

**Tech Stack:** Rust 2021, existing `src/browser/ws_protocol.rs` / `src/browser/ws_backend.rs`, service websocket infrastructure, `tungstenite`, `serde_json`, current Rust test suite, local sgBrowser websocket documentation.

---

## Scope Guardrails

- Do **not** add more speculative production fixes to `src/service/server.rs` just to “try one more thing.”
- Do **not** claim raw websocket is the supported path unless the live probe transcript proves it.
- Do **not** modify `src/lib.rs`, pipe handshake behavior, or the pipe browser-tool path.
- Do **not** implement both the bootstrap architecture and the bridge architecture in the same branch.
- Keep the ws-native code unless and until the bridge decision makes specific pieces obsolete.
- Prefer a dedicated probe surface over embedding validation logic into production request handling.

---

## File Structure

### Existing files to modify

- Modify: `src/browser/mod.rs`
  - export the new `ws_probe` module so both tests and the probe binary use the same library surface
- Modify: `src/browser/ws_protocol.rs`
  - only if a tiny helper extraction is required for test/probe readability
  - do not change existing protocol semantics in this slice
- Modify: `tests/browser_ws_protocol_test.rs`
  - add deterministic coverage for any extracted helper used by the probe harness

### New files to create

- Create: `src/bin/sgbrowser_ws_probe.rs`
  - standalone diagnostic binary for ordered frame-script probing against a live sgBrowser websocket endpoint
- Create: `src/browser/ws_probe.rs`
  - small reusable probe/transcript module, if needed, to keep the binary and tests focused
- Create: `tests/browser_ws_probe_test.rs`
  - deterministic fake-server tests for transcript capture, timeout reporting, and scripted sequence execution
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md` **only if Option B wins after probing**
  - follow-up bridge design, not part of the initial coding slice
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md` **only if Option B wins after probing**
  - follow-up bridge implementation plan, not part of the initial coding slice
- Create: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
  - temporary evidence artifact capturing the real endpoint probe matrix and outcomes

### Files deliberately not changed in the initial slice

- `src/lib.rs`
- `src/agent/task_runner.rs`
- `src/compat/runtime.rs`
- `src/compat/orchestration.rs`
- `src/compat/workflow_executor.rs`
- `src/browser/ws_backend.rs`

Unless the probe results prove a real bootstrap contract, these files stay untouched.

---

## Task 1: Build a deterministic websocket probe harness before touching production behavior

**Files:**
- Create: `src/browser/ws_probe.rs`
- Create: `tests/browser_ws_probe_test.rs`
- Reuse: `src/browser/ws_protocol.rs`

- [ ] **Step 1: Write the first failing transcript test**

Create `tests/browser_ws_probe_test.rs` with one focused fake-server test that executes a scripted sequence of outgoing text frames and records all received text frames in order.

Start with this shape:

```rust
#[test]
fn probe_records_welcome_then_silence_transcript() {
    // fake server sends one welcome frame and then stays silent
    // probe result should preserve that exact transcript and mark timeout/silence explicitly
}
```

Required assertions:
- the probe can connect to the fake websocket server
- it can send a scripted first frame
- it records the first inbound text frame exactly
- it returns a transcript/result object that distinguishes timeout from protocol parse failure

- [ ] **Step 2: Run the single new test and verify it fails**

Run:

```bash
cargo test --test browser_ws_probe_test probe_records_welcome_then_silence_transcript -- --nocapture
```

Expected: FAIL because the probe harness does not exist yet.

- [ ] **Step 3: Add the second failing probe test for ordered multi-step scripts**

In the same file, add a test proving the harness can run multiple outgoing frames in a fixed order and keep the transcript segmented by step.

Test shape:

```rust
#[test]
fn probe_runs_ordered_frame_script_and_records_per_step_results() {
    // send bootstrap frame 1, bootstrap frame 2, then minimal action
    // fake server replies differently at each step
    // probe result preserves exact order and outcomes
}
```

Required assertions:
- outgoing frames are sent in the configured order
- inbound frames are attached to the correct step
- the probe can stop the sequence on timeout/close if configured

- [ ] **Step 4: Run the ordered-script test and verify it fails**

Run:

```bash
cargo test --test browser_ws_probe_test probe_runs_ordered_frame_script_and_records_per_step_results -- --nocapture
```

Expected: FAIL because the probe harness does not exist yet.

- [ ] **Step 5: Add the third failing probe test for close/reset visibility**

Add one focused fake-server test, `probe_reports_socket_close_separately_from_timeout`, that closes the connection after a script step and asserts that the transcript reports close/reset rather than a generic timeout.

- [ ] **Step 6: Run the close/reset test and verify it fails**

Run:

```bash
cargo test --test browser_ws_probe_test probe_reports_socket_close_separately_from_timeout -- --nocapture
```

Expected: FAIL because the probe harness does not exist yet.

- [ ] **Step 7: Implement the minimal probe module**

Create `src/browser/ws_probe.rs` with only the types and behavior needed by the tests.

Recommended shape:

```rust
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
    ConnectFailed(String),
}

pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

pub fn run_probe_script(/* ws url, timeout, steps */) -> Result<Vec<ProbeStepResult>, ProbeError> {
    // connect, send ordered frames, collect exact transcript
}
```

Rules:
- do not parse business meaning yet
- do not mix this into normal task execution
- preserve exact raw text frames in transcript results
- keep the module small and diagnostic-oriented

- [ ] **Step 8: Re-run the new probe tests**

Run:

```bash
cargo test --test browser_ws_probe_test -- --nocapture
```

Expected: PASS.

- [ ] **Step 9: Commit**

```bash
git add src/browser/ws_probe.rs tests/browser_ws_probe_test.rs
git commit -m "test: add sgBrowser websocket probe harness"
```

---

## Task 2: Add a standalone probe binary for live sgBrowser evidence collection

**Files:**
- Create: `src/bin/sgbrowser_ws_probe.rs`
- Modify: `src/browser/ws_probe.rs`
- Modify: `src/browser/mod.rs`
- Modify: `tests/browser_ws_probe_test.rs`

- [ ] **Step 1: Write the failing helper parser test**

In `tests/browser_ws_probe_test.rs`, add one focused test for a new helper function in `src/browser/ws_probe.rs`:

```rust
#[test]
fn parse_probe_args_accepts_ws_url_timeout_and_ordered_steps() {
    // parse a fixed argv-style slice into a ProbeCliConfig
}
```

Create and use this exact helper shape:

```rust
pub struct ProbeCliConfig {
    pub ws_url: String,
    pub timeout_ms: u64,
    pub steps: Vec<ProbeStep>,
}

pub fn parse_probe_args(args: &[String]) -> Result<ProbeCliConfig, ProbeError>
```

The test must assert that these exact arguments parse successfully and preserve step order:

```text
--ws-url ws://127.0.0.1:12345
--timeout-ms 1500
--step open-agent::["about:blank","sgOpenAgent"]
--step open-hot::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
```

- [ ] **Step 2: Run the parser test and verify
it fails**

Run:

```bash
cargo test --test browser_ws_probe_test parse_probe_args_accepts_ws_url_timeout_and_ordered_steps -- --nocapture
```

Expected: FAIL because `parse_probe_args(...)` and `ProbeCliConfig` do not exist yet.

- [ ] **Step 3: Implement the helper and binary together**

In `src/browser/ws_probe.rs`, add `ProbeCliConfig` and `parse_probe_args(...)`.

In `src/browser/mod.rs`, add the module export:

```rust
pub mod ws_probe;
```

In `src/bin/sgbrowser_ws_probe.rs`, implement the binary using only `std::env::args()` plus `parse_probe_args(...)`.

Required behavior:
- accepts a websocket URL
- accepts a timeout in milliseconds
- accepts repeated ordered steps
- runs the probe harness
- prints a markdown-friendly transcript including:
  - step label
  - exact sent payload
  - exact received frames, if any
  - timeout/close outcome

Output shape can be simple, for example:

```text
STEP 1 bootstrap-open-agent
SEND: ["about:blank","sgOpenAgent"]
RECV: Welcome! You are client #1
OUTCOME: timeout
```

Rules:
- no production/browser-runtime integration
- no hidden fallback logic
- no “best effort” guessing of next steps

- [ ] **Step 4: Re-run the parser/helper test**

Run:

```bash
cargo test --test browser_ws_probe_test parse_probe_args_accepts_ws_url_timeout_and_ordered_steps -- --nocapture
```

Expected: PASS.

- [ ] **Step 5: Build the probe binary**

Run:

```bash
cargo build --bin sgbrowser_ws_probe
```

Expected: PASS.
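
The `--step label::payload` argument format used in Task 2 could be parsed roughly as follows. This is a minimal sketch under stated assumptions, not the required implementation: it returns a plain `String` error for brevity, it splits each step on the first `::` only, and it sets `expect_reply` to `true` for every step because the CLI format shown in this plan carries no flag for it.

```rust
// Hypothetical sketch of the Task 2 argument helper.
// Assumptions: String errors, and every CLI step expects a reply.
#[derive(Debug, PartialEq)]
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

#[derive(Debug, PartialEq)]
pub struct ProbeCliConfig {
    pub ws_url: String,
    pub timeout_ms: u64,
    pub steps: Vec<ProbeStep>,
}

pub fn parse_probe_args(args: &[String]) -> Result<ProbeCliConfig, String> {
    let mut ws_url = None;
    let mut timeout_ms = None;
    let mut steps = Vec::new();

    // Every flag takes exactly one value, so consume the arguments in pairs.
    let mut iter = args.iter();
    while let Some(flag) = iter.next() {
        let value = iter
            .next()
            .ok_or_else(|| format!("missing value for {flag}"))?;
        match flag.as_str() {
            "--ws-url" => ws_url = Some(value.clone()),
            "--timeout-ms" => {
                let ms = value
                    .parse::<u64>()
                    .map_err(|e| format!("invalid --timeout-ms {value:?}: {e}"))?;
                timeout_ms = Some(ms);
            }
            "--step" => {
                // Split on the first "::" only; the payload is kept verbatim.
                let (label, payload) = value
                    .split_once("::")
                    .ok_or_else(|| format!("step {value:?} is not label::payload"))?;
                steps.push(ProbeStep {
                    label: label.to_string(),
                    payload: payload.to_string(),
                    expect_reply: true, // assumption: not configurable from the CLI
                });
            }
            other => return Err(format!("unknown flag {other}")),
        }
    }

    Ok(ProbeCliConfig {
        ws_url: ws_url.ok_or("missing --ws-url")?,
        timeout_ms: timeout_ms.ok_or("missing --timeout-ms")?,
        steps, // pushed in argv order, so step order is preserved
    })
}
```

Because steps are pushed in the order they appear, the parser test can assert ordering by comparing `steps[0].label` and `steps[1].label` directly. The payload is never parsed as JSON here, which keeps the helper consistent with the rule that the probe does not interpret business meaning.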

- [ ] **Step 6: Commit**

```bash
git add src/bin/sgbrowser_ws_probe.rs src/browser/ws_probe.rs src/browser/mod.rs tests/browser_ws_probe_test.rs
git commit -m "feat: add live sgBrowser websocket probe binary"
```

---

## Task 3: Run the real endpoint probe matrix and write the evidence transcript

**Files:**
- Create: `docs/_tmp_sgbrowser_ws_probe_transcript.md`
- Reuse only: `src/bin/sgbrowser_ws_probe.rs`, `docs/_tmp_sgbrowser_ws_api_doc.txt`

- [ ] **Step 1: Run the no-bootstrap baseline probe**

Run exactly:

```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baseline-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```

Append the exact output under a `## baseline-open` heading in `docs/_tmp_sgbrowser_ws_probe_transcript.md`.

- [ ] **Step 2: Run the documented `sgOpenAgent` candidate**

Run exactly:

```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "post-open-agent-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```

Append the exact output under a `## open-agent` heading.

- [ ] **Step 3: Run the documented `sgSetAuthInfo` candidate**

Run exactly:

```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "set-auth::[\"about:blank\",\"sgSetAuthInfo\",\"probe-user\",\"probe-token\"]" --step "post-set-auth-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```

Append the exact output under a `## set-auth` heading.

- [ ] **Step 4: Run the documented `sgBrowserLogin` candidate**

Run exactly:

```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "browser-login::{\"request\":\"use-json-helper\"}"
```

Before running, replace the placeholder payload with the exact JSON-array frame produced by the helper for:

```json
["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]
```

Then add a second step in the same command:

```json
["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
```

Append the exact output under a `## browser-login` heading.

- [ ] **Step 5: Run the documented `sgBrowerserActiveTab` candidate**

Run exactly:

```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "post-active-tab-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```

Append the exact output under a `## active-tab` heading.

- [ ] **Step 6: Run one combined bootstrap candidate**

Run exactly:

```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "combined-open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "combined-active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "combined-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```

Append the exact output under a `## combined-bootstrap` heading.

- [ ] **Step 7: Run `requesturl` variants for the minimal action**

Run exactly these two additional commands:

```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "target-as-requesturl::[\"https://www.zhihu.com/hot\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```

```bash
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baidu-requesturl::[\"https://www.baidu.com\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
```

Append the exact outputs under `## requesturl-variants`.

- [ ] **Step 8: Summarize the matrix in the transcript file**

At the end of `docs/_tmp_sgbrowser_ws_probe_transcript.md`, add this exact table template and fill it in:

```markdown
| Sequence | Sent frames | First reply | Final outcome | Decision signal |
| --- | --- | --- | --- | --- |
```

- [ ] **Step 9: Determine which architecture option wins**

Decision rule:
- if at least one sequence reproducibly yields real numeric status and/or callback frames for a real business action, Option A (bootstrap-validated raw websocket) wins
- otherwise, Option B (bridge-first) wins

Do not weaken this decision rule.
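
The decision rule above can be written down as a small pure function, which makes it harder to bend after the fact. The sketch below is illustrative only: `SequenceRun`, `SequenceEvidence`, `ArchitectureDecision`, and `decide` are hypothetical names, the `saw_status_or_callback` flag is filled in by whoever reads the transcript, and treating "reproducibly" as "every one of at least `min_runs` runs showed the signal" is an assumption this plan does not spell out.

```rust
/// One recorded run of one probe sequence (hypothetical summary type).
pub struct SequenceRun {
    /// Set by the reviewer after reading the transcript: did a real business
    /// action yield a numeric status and/or callback frame in this run?
    pub saw_status_or_callback: bool,
}

/// All recorded runs for one row of the probe matrix.
pub struct SequenceEvidence {
    pub name: String,
    pub runs: Vec<SequenceRun>,
}

#[derive(Debug, PartialEq)]
pub enum ArchitectureDecision {
    /// Option A: bootstrap-validated raw websocket.
    BootstrapValidatedRawWebsocket,
    /// Option B: bridge-first.
    BridgeFirst,
}

/// Assumption: a sequence is "reproducible" only if it was run at least
/// `min_runs` times (never fewer than once) and every run showed the signal.
pub fn is_reproducible(evidence: &SequenceEvidence, min_runs: usize) -> bool {
    let required = min_runs.max(1);
    evidence.runs.len() >= required
        && evidence.runs.iter().all(|run| run.saw_status_or_callback)
}

/// Option A wins if at least one sequence is reproducible; otherwise Option B.
pub fn decide(matrix: &[SequenceEvidence], min_runs: usize) -> ArchitectureDecision {
    if matrix.iter().any(|evidence| is_reproducible(evidence, min_runs)) {
        ArchitectureDecision::BootstrapValidatedRawWebsocket
    } else {
        ArchitectureDecision::BridgeFirst
    }
}
```

Under these assumptions, a matrix in which every sequence ends in welcome-then-silence, or in which a signal appears in only one of two runs, resolves to `BridgeFirst`. An empty matrix also resolves to `BridgeFirst`, so missing evidence can never select Option A.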

- [ ] **Step 10: Commit the evidence artifact**

```bash
git add docs/_tmp_sgbrowser_ws_probe_transcript.md
git commit -m "docs: capture sgBrowser websocket probe evidence"
```

---

## Task 4A: If Option A wins, write the narrow bootstrap implementation slice

**Files:**
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md`
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md`
- Reuse as evidence input:
  - `docs/_tmp_sgbrowser_ws_probe_transcript.md`
  - `docs/_tmp_sgbrowser_ws_api_doc.txt`
  - `src/browser/ws_backend.rs`
  - `src/browser/ws_protocol.rs`

- [ ] **Step 1: Write one new design doc capturing the proven bootstrap contract**

Create:

```text
docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md
```

Include:
- exact validated sequence
- exact required state (`requesturl`, active tab, agent page, auth payload)
- exact failure semantics
- why this is sufficient evidence to keep raw websocket as the product surface

- [ ] **Step 2: Write one new implementation plan for the bootstrap path**

Create:

```text
docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md
```

Plan only the minimal production changes required to embed the validated bootstrap sequence into the service/browser path.

- [ ] **Step 3: Commit the bootstrap decision docs**

```bash
git add docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md
git commit -m "docs: capture ws browser bootstrap contract"
```

- [ ] **Step 4: Stop after writing the bootstrap plan**

Do not begin production implementation in the same slice unless the user explicitly asks for execution.

---

## Task 4B: If Option B wins, write the bridge-first implementation slice

**Files:**
- Create: `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md`
- Create: `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md`
- Reuse as evidence input:
  - `docs/_tmp_sgbrowser_ws_probe_transcript.md`
  - `frontend/archive/sgClaw验证-已归档/testRunner.js`
  - `docs/superpowers/specs/2026-03-25-superrpa-sgclaw-browser-control-design.md`
  - `docs/archive/项目管理与排期/协作时间表.md`
  - `docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md`

- [ ] **Step 1: Write the bridge-path design doc**

Create `docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md`.

The design must specify:
- why raw websocket is considered non-validated for external control
- which bridge surface becomes authoritative
- where sgClaw should integrate (`FunctionsUI`, host bridge, `BrowserAction`, `CommandRouter`, or the nearest validated seam in this repo)
- how to preserve pipe behavior and existing abstractions where practical

- [ ] **Step 2: Write the bridge-path implementation plan**

Create `docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md`.

The plan must:
- identify exact files to touch
- describe the narrowest adapter implementation
- keep TDD/task granularity as in this document
- avoid speculative work outside the bridge slice

- [ ] **Step 3: Commit the bridge decision docs**

```bash
git add docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md
git commit -m "docs: define bridge-first sgBrowser integration"
```

- [ ] **Step 4: Stop after writing the bridge plan**

Do not start the bridge implementation in the same slice unless the user explicitly asks for execution.

---

## Verification Checklist

### Deterministic probe harness tests

```bash
cargo test --test browser_ws_probe_test -- --nocapture
```

Expected: transcript capture, ordered scripts, timeout reporting, and close/reset reporting all pass.

### Probe binary build

```bash
cargo build --bin sgbrowser_ws_probe
```

Expected: PASS.

### Live evidence collection

- run the probe matrix against the real configured endpoint
- save exact transcripts to `docs/_tmp_sgbrowser_ws_probe_transcript.md`
- make the architecture decision using the documented rule

### Follow-up branch condition

- if Option A wins, repository contains a bootstrap-contract design + plan
- if Option B wins, repository contains a bridge-path design + plan
- no production runtime changes are made until that decision is written down

---

## Notes for Implementation

- The existing `WsBrowserBackend` fix that remembers the last navigated URL remains valid; do not revert it.
- The previous auth-replacement work also remains valid; it removed a real bug but did not prove the raw websocket architecture.
- Keep the probe tool brutally literal: exact sent frames, exact received frames, explicit timeout/close outcomes.
- Resist the temptation to make the probe “smart.” Smart probes hide evidence.
- If the real endpoint still replies only with the welcome banner and then silence across the matrix, treat that as a decision, not as an excuse for more guessing.
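
As one way to keep the probe output literal, the transcript printer can be a plain formatting function over the step results with no interpretation of frame contents. The following is a minimal sketch: `render_transcript` is a hypothetical name, and the outcome labels other than `timeout` (`received`, `closed`, `connect-failed`) are assumptions. Because the recommended `ProbeOutcome` enum holds either received frames or a terminal outcome, this sketch prints `RECV:` lines only for the `Received` variant.

```rust
// Types mirror the recommended shape from Task 1.
pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
    ConnectFailed(String),
}

pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

/// Renders step results as STEP / SEND / RECV / OUTCOME lines.
/// Sent payloads and received frames are copied verbatim, never re-encoded.
pub fn render_transcript(results: &[ProbeStepResult]) -> String {
    let mut out = String::new();
    for (index, step) in results.iter().enumerate() {
        // Steps are numbered from 1 in the order they were executed.
        out.push_str(&format!("STEP {} {}\n", index + 1, step.label));
        out.push_str(&format!("SEND: {}\n", step.sent));
        match &step.outcome {
            ProbeOutcome::Received(frames) => {
                for frame in frames {
                    out.push_str(&format!("RECV: {}\n", frame));
                }
                out.push_str("OUTCOME: received\n"); // assumed label
            }
            ProbeOutcome::TimedOut => out.push_str("OUTCOME: timeout\n"),
            ProbeOutcome::Closed => out.push_str("OUTCOME: closed\n"), // assumed label
            ProbeOutcome::ConnectFailed(reason) => {
                // assumed label; the reason string is passed through unchanged
                out.push_str(&format!("OUTCOME: connect-failed ({})\n", reason));
            }
        }
    }
    out
}
```

Keeping rendering separate from the socket code means the deterministic tests can assert on the exact transcript text without a live endpoint, and the binary only has to print the returned string.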