WS Browser Integration Surface Correction Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Replace the unvalidated raw-ws-direct assumption with an evidence-backed decision: either prove a minimal sgBrowser bootstrap sequence for raw websocket control, or pivot to the real browser bridge surface.
Architecture: Treat the existing ws-native backend as a protocol/testing asset, not as a validated production integration surface. First build a narrow probe/validation harness that can run candidate bootstrap sequences and capture exact live transcripts from the real endpoint. Then branch decisively: if a reproducible bootstrap sequence yields real status/callback frames, implement that bootstrap path; otherwise stop raw-ws speculation and write the bridge-first implementation slice.
Tech Stack: Rust 2021, existing src/browser/ws_protocol.rs / src/browser/ws_backend.rs, service websocket infrastructure, tungstenite, serde_json, current Rust test suite, local sgBrowser websocket documentation.
Scope Guardrails
- Do not add more speculative production fixes to src/service/server.rs just to “try one more thing.”
- Do not claim raw websocket is the supported path unless the live probe transcript proves it.
- Do not modify src/lib.rs, pipe handshake behavior, or the pipe browser-tool path.
- Do not implement both the bootstrap architecture and the bridge architecture in the same branch.
- Keep the ws-native code unless and until the bridge decision makes specific pieces obsolete.
- Prefer a dedicated probe surface over embedding validation logic into production request handling.
File Structure
Existing files to modify
- Modify: src/browser/mod.rs
  - export the new ws_probe module so both tests and the probe binary use the same library surface
- Modify: src/browser/ws_protocol.rs
  - only if a tiny helper extraction is required for test/probe readability
  - do not change existing protocol semantics in this slice
- Modify: tests/browser_ws_protocol_test.rs
  - add deterministic coverage for any extracted helper used by the probe harness
New files to create
- Create: src/bin/sgbrowser_ws_probe.rs
  - standalone diagnostic binary for ordered frame-script probing against a live sgBrowser websocket endpoint
- Create: src/browser/ws_probe.rs
  - small reusable probe/transcript module, if needed, to keep the binary and tests focused
- Create: tests/browser_ws_probe_test.rs
  - deterministic fake-server tests for transcript capture, timeout reporting, and scripted sequence execution
- Create: docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md only if Option B wins after probing
  - follow-up bridge design, not part of the initial coding slice
- Create: docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md only if Option B wins after probing
  - follow-up bridge implementation plan, not part of the initial coding slice
- Create: docs/_tmp_sgbrowser_ws_probe_transcript.md
  - temporary evidence artifact capturing the real endpoint probe matrix and outcomes
Files deliberately not changed in the initial slice
- src/lib.rs
- src/agent/task_runner.rs
- src/compat/runtime.rs
- src/compat/orchestration.rs
- src/compat/workflow_executor.rs
- src/browser/ws_backend.rs
Unless the probe results prove a real bootstrap contract, these files stay untouched.
Task 1: Build a deterministic websocket probe harness before touching production behavior
Files:
- Create: src/browser/ws_probe.rs
- Create: tests/browser_ws_probe_test.rs
- Reuse: src/browser/ws_protocol.rs
- Step 1: Write the first failing transcript test
Create tests/browser_ws_probe_test.rs with one focused fake-server test that executes a scripted sequence of outgoing text frames and records all received text frames in order.
Start with this shape:
#[test]
fn probe_records_welcome_then_silence_transcript() {
    // fake server sends one welcome frame and then stays silent
    // probe result should preserve that exact transcript and mark timeout/silence explicitly
}
Required assertions:
- the probe can connect to the fake websocket server
- it can send a scripted first frame
- it records the first inbound text frame exactly
- it returns a transcript/result object that distinguishes timeout from protocol parse failure
- Step 2: Run the single new test and verify it fails
Run:
cargo test --test browser_ws_probe_test probe_records_welcome_then_silence_transcript -- --nocapture
Expected: FAIL because the probe harness does not exist yet.
- Step 3: Add the second failing probe test for ordered multi-step scripts
In the same file, add a test proving the harness can run multiple outgoing frames in a fixed order and keep the transcript segmented by step.
Test shape:
#[test]
fn probe_runs_ordered_frame_script_and_records_per_step_results() {
    // send bootstrap frame 1, bootstrap frame 2, then minimal action
    // fake server replies differently at each step
    // probe result preserves exact order and outcomes
}
Required assertions:
- outgoing frames are sent in the configured order
- inbound frames are attached to the correct step
- the probe can stop the sequence on timeout/close if configured
- Step 4: Run the ordered-script test and verify it fails
Run:
cargo test --test browser_ws_probe_test probe_runs_ordered_frame_script_and_records_per_step_results -- --nocapture
Expected: FAIL because the probe harness does not exist yet.
- Step 5: Add the third failing probe test for close/reset visibility
Add one focused fake-server test, probe_reports_socket_close_separately_from_timeout, that closes the connection after a script step and asserts that the transcript reports close/reset rather than a generic timeout.
- Step 6: Run the close/reset test and verify it fails
Run:
cargo test --test browser_ws_probe_test probe_reports_socket_close_separately_from_timeout -- --nocapture
Expected: FAIL because the probe harness does not exist yet.
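The close-versus-timeout distinction usually comes down to how read errors are classified. The following is a minimal sketch under stated assumptions: `ReadFailure` and `classify_read_error` are hypothetical names that do not appear in the plan, and the sketch works on `std::io::ErrorKind` only. How tungstenite surfaces a closed connection (for example, as a wrapped I/O error or as a dedicated connection-closed error) should be checked against the version pinned in this repository.

```rust
use std::io::ErrorKind;

/// How a failed socket read should be reported in the transcript.
/// Hypothetical type for this sketch; the plan only requires that
/// close/reset is never folded into a generic timeout.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReadFailure {
    Timeout,
    Closed,
    Other,
}

/// Map an I/O error kind onto a transcript outcome.
pub fn classify_read_error(kind: ErrorKind) -> ReadFailure {
    match kind {
        // A blocking socket with a read timeout reports WouldBlock on Unix
        // and TimedOut on Windows when no data arrives in time.
        ErrorKind::WouldBlock | ErrorKind::TimedOut => ReadFailure::Timeout,
        // The peer went away: report it as close/reset, never as silence.
        ErrorKind::ConnectionReset
        | ErrorKind::ConnectionAborted
        | ErrorKind::BrokenPipe
        | ErrorKind::UnexpectedEof => ReadFailure::Closed,
        // Anything else stays visible as its own category.
        _ => ReadFailure::Other,
    }
}
```

Keeping an explicit `Other` bucket means an unexpected error kind shows up in the transcript instead of being silently counted as a timeout.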
- Step 7: Implement the minimal probe module
Create src/browser/ws_probe.rs with only the types and behavior needed by the tests.
Recommended shape:
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
    ConnectFailed(String),
}

pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

pub fn run_probe_script(/* ws url, timeout, steps */) -> Result<Vec<ProbeStepResult>, ProbeError> {
    // connect, send ordered frames, collect exact transcript
}
Rules:
- do not parse business meaning yet
- do not mix this into normal task execution
- preserve exact raw text frames in transcript results
- keep the module small and diagnostic-oriented
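The ordered script loop can be sketched independently of the websocket library. This is an illustration only, not the module itself: `FrameTransport`, `RecvEvent`, `ScriptedTransport`, and `run_script` are hypothetical names introduced here, `ConnectFailed` is left out because connecting happens before the loop, and the sketch ignores `expect_reply` and always reads until silence or close.

```rust
use std::collections::VecDeque;

/// One read attempt on the socket. Hypothetical: a real harness would map
/// its websocket library's read results onto these three cases.
#[derive(Debug, Clone, PartialEq)]
pub enum RecvEvent {
    Text(String),
    Timeout,
    Closed,
}

/// Hypothetical seam that lets the script loop run against a fake in tests.
pub trait FrameTransport {
    fn send_text(&mut self, payload: &str) -> Result<(), String>;
    fn recv_text(&mut self) -> RecvEvent;
}

#[derive(Debug, Clone, PartialEq)]
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool, // not consulted by this sketch
}

#[derive(Debug, Clone, PartialEq)]
pub enum ProbeOutcome {
    Received(Vec<String>),
    TimedOut,
    Closed,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ProbeStepResult {
    pub label: String,
    pub sent: String,
    pub outcome: ProbeOutcome,
}

/// Send each step in order and attach the raw inbound frames to that step.
/// Stops at the first close so later steps are never reported as silent.
pub fn run_script<T: FrameTransport>(transport: &mut T, steps: &[ProbeStep]) -> Vec<ProbeStepResult> {
    let mut results = Vec::new();
    for step in steps {
        let mut frames = Vec::new();
        // A failed send is treated as a closed connection.
        let mut closed = transport.send_text(&step.payload).is_err();
        while !closed {
            match transport.recv_text() {
                RecvEvent::Text(text) => frames.push(text), // kept verbatim
                RecvEvent::Timeout => break,
                RecvEvent::Closed => closed = true,
            }
        }
        let outcome = if closed {
            ProbeOutcome::Closed
        } else if frames.is_empty() {
            ProbeOutcome::TimedOut
        } else {
            ProbeOutcome::Received(frames)
        };
        results.push(ProbeStepResult {
            label: step.label.clone(),
            sent: step.payload.clone(),
            outcome,
        });
        if closed {
            break;
        }
    }
    results
}

/// Test double that replays a fixed list of read events.
pub struct ScriptedTransport {
    pub sent: Vec<String>,
    pub events: VecDeque<RecvEvent>,
}

impl FrameTransport for ScriptedTransport {
    fn send_text(&mut self, payload: &str) -> Result<(), String> {
        self.sent.push(payload.to_string());
        Ok(())
    }

    fn recv_text(&mut self) -> RecvEvent {
        self.events.pop_front().unwrap_or(RecvEvent::Timeout)
    }
}
```

One limitation is worth noting: a step that receives frames and then sees a close is reported as `Closed` here, which drops those frames. A real harness would probably want an outcome shape that carries both.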
- Step 8: Re-run the new probe tests
Run:
cargo test --test browser_ws_probe_test -- --nocapture
Expected: PASS.
- Step 9: Commit
git add src/browser/ws_probe.rs tests/browser_ws_probe_test.rs
git commit -m "test: add sgBrowser websocket probe harness"
Task 2: Add a standalone probe binary for live sgBrowser evidence collection
Files:
- Create: src/bin/sgbrowser_ws_probe.rs
- Modify: src/browser/ws_probe.rs (created in Task 1)
- Modify: src/browser/mod.rs
- Modify: tests/browser_ws_probe_test.rs (created in Task 1)
- Step 1: Write the failing helper parser test
In tests/browser_ws_probe_test.rs, add one focused test for a new helper function in src/browser/ws_probe.rs:
#[test]
fn parse_probe_args_accepts_ws_url_timeout_and_ordered_steps() {
// parse a fixed argv-style slice into a ProbeCliConfig
}
Create and use this exact helper shape:
pub struct ProbeCliConfig {
    pub ws_url: String,
    pub timeout_ms: u64,
    pub steps: Vec<ProbeStep>,
}

pub fn parse_probe_args(args: &[String]) -> Result<ProbeCliConfig, ProbeError>
The test must assert that these exact arguments parse successfully and preserve step order:
--ws-url ws://127.0.0.1:12345
--timeout-ms 1500
--step open-agent::["about:blank","sgOpenAgent"]
--step open-hot::["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
- Step 2: Run the parser test and verify it fails
Run:
cargo test --test browser_ws_probe_test parse_probe_args_accepts_ws_url_timeout_and_ordered_steps -- --nocapture
Expected: FAIL because parse_probe_args(...) and ProbeCliConfig do not exist yet.
- Step 3: Implement the helper and binary together
In src/browser/ws_probe.rs, add ProbeCliConfig and parse_probe_args(...).
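A minimal sketch of how the parser might look, under stated assumptions: the `ProbeError` variants, the default timeout of 1500 ms, and setting `expect_reply` to true for every step are all choices made for this illustration, and the slice is assumed to exclude the program name (for example `std::env::args().skip(1)`). Splitting on the first `::` is safe for the documented payloads because `about:blank` and `https://` contain single colons only.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct ProbeStep {
    pub label: String,
    pub payload: String,
    pub expect_reply: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ProbeCliConfig {
    pub ws_url: String,
    pub timeout_ms: u64,
    pub steps: Vec<ProbeStep>,
}

/// Hypothetical error shape; the plan does not define ProbeError's variants.
#[derive(Debug, Clone, PartialEq)]
pub enum ProbeError {
    MissingValue(String),
    MissingWsUrl,
    InvalidTimeout(String),
    InvalidStep(String),
    UnknownFlag(String),
}

pub fn parse_probe_args(args: &[String]) -> Result<ProbeCliConfig, ProbeError> {
    let mut ws_url = None;
    let mut timeout_ms: u64 = 1500; // assumed default
    let mut steps = Vec::new();

    let mut iter = args.iter();
    while let Some(flag) = iter.next() {
        // Every supported flag takes exactly one value.
        let value = iter
            .next()
            .ok_or_else(|| ProbeError::MissingValue(flag.clone()))?;
        match flag.as_str() {
            "--ws-url" => ws_url = Some(value.clone()),
            "--timeout-ms" => {
                timeout_ms = value
                    .parse()
                    .map_err(|_| ProbeError::InvalidTimeout(value.clone()))?;
            }
            "--step" => {
                // "label::payload": split on the first "::" only, so the
                // payload is preserved byte for byte.
                let (label, payload) = value
                    .split_once("::")
                    .ok_or_else(|| ProbeError::InvalidStep(value.clone()))?;
                if label.is_empty() || payload.is_empty() {
                    return Err(ProbeError::InvalidStep(value.clone()));
                }
                steps.push(ProbeStep {
                    label: label.to_string(),
                    payload: payload.to_string(),
                    expect_reply: true, // assumption: every step waits for a reply
                });
            }
            other => return Err(ProbeError::UnknownFlag(other.to_string())),
        }
    }

    Ok(ProbeCliConfig {
        ws_url: ws_url.ok_or(ProbeError::MissingWsUrl)?,
        timeout_ms,
        steps,
    })
}
```

Steps are pushed in the order they appear on the command line, which is what the parser test needs to assert.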
In src/browser/mod.rs, add the module export:
pub mod ws_probe;
In src/bin/sgbrowser_ws_probe.rs, implement the binary using only std::env::args() plus parse_probe_args(...).
Required behavior:
- accepts a websocket URL
- accepts a timeout in milliseconds
- accepts repeated ordered steps
- runs the probe harness
- prints a markdown-friendly transcript including:
- step label
- exact sent payload
- exact received frames, if any
- timeout/close outcome
Output shape can be simple, for example:
STEP 1 bootstrap-open-agent
SEND: ["about:blank","sgOpenAgent"]
RECV: Welcome! You are client #1
OUTCOME: timeout
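The sample output pairs a received frame with a timeout outcome, whereas the ProbeOutcome shape from Task 1 makes a step either Received or TimedOut. The sketch below therefore keeps received frames separate from the terminal outcome. That split, along with the names `StepTranscript`, `FinalOutcome`, and `render_transcript`, is an assumption made for illustration and not part of the plan.

```rust
/// Terminal state of one step. Hypothetical type for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum FinalOutcome {
    Timeout,
    Closed,
    ConnectFailed(String),
}

/// Everything the transcript needs to print for one step (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct StepTranscript {
    pub label: String,
    pub sent: String,
    pub received: Vec<String>,
    pub outcome: FinalOutcome,
}

/// Render steps in the STEP / SEND / RECV / OUTCOME layout, one line each.
pub fn render_transcript(steps: &[StepTranscript]) -> String {
    let mut out = String::new();
    for (index, step) in steps.iter().enumerate() {
        out.push_str(&format!("STEP {} {}\n", index + 1, step.label));
        // Payloads and frames are printed verbatim, with no re-serialization.
        out.push_str(&format!("SEND: {}\n", step.sent));
        for frame in &step.received {
            out.push_str(&format!("RECV: {}\n", frame));
        }
        let outcome = match &step.outcome {
            FinalOutcome::Timeout => "timeout".to_string(),
            FinalOutcome::Closed => "closed".to_string(),
            FinalOutcome::ConnectFailed(reason) => format!("connect-failed: {}", reason),
        };
        out.push_str(&format!("OUTCOME: {}\n", outcome));
    }
    out
}
```

Returning a `String` instead of printing directly keeps the renderer testable; the binary can pass the result to `print!`.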
Rules:
- no production/browser-runtime integration
- no hidden fallback logic
- no “best effort” guessing of next steps
- Step 4: Re-run the parser/helper test
Run:
cargo test --test browser_ws_probe_test parse_probe_args_accepts_ws_url_timeout_and_ordered_steps -- --nocapture
Expected: PASS.
- Step 5: Build the probe binary
Run:
cargo build --bin sgbrowser_ws_probe
Expected: PASS.
- Step 6: Commit
git add src/bin/sgbrowser_ws_probe.rs src/browser/ws_probe.rs src/browser/mod.rs tests/browser_ws_probe_test.rs
git commit -m "feat: add live sgBrowser websocket probe binary"
Task 3: Run the real endpoint probe matrix and write the evidence transcript
Files:
- Create: docs/_tmp_sgbrowser_ws_probe_transcript.md
- Reuse only: src/bin/sgbrowser_ws_probe.rs, docs/_tmp_sgbrowser_ws_api_doc.txt
- Step 1: Run the no-bootstrap baseline probe
Run exactly:
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baseline-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
Append the exact output under a ## baseline-open heading in docs/_tmp_sgbrowser_ws_probe_transcript.md.
- Step 2: Run the documented sgOpenAgent candidate
Run exactly:
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "post-open-agent-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
Append the exact output under a ## open-agent heading.
- Step 3: Run the documented sgSetAuthInfo candidate
Run exactly:
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "set-auth::[\"about:blank\",\"sgSetAuthInfo\",\"probe-user\",\"probe-token\"]" --step "post-set-auth-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
Append the exact output under a ## set-auth heading.
- Step 4: Run the documented sgBrowserLogin candidate
Run the following command, but only after replacing its placeholder payload as described below it:
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "browser-login::{\"request\":\"use-json-helper\"}"
Before running, replace the placeholder payload with the exact JSON-array frame produced by the helper for:
["about:blank","sgBrowserLogin",{"appName":"probe","userName":"probe","orgName":"probe","menus":[{"name":"probe","normalImg":"x","activeImg":"x","url":"https://www.zhihu.com/hot"}]}]
Then add a second step in the same command:
["about:blank","sgBrowerserOpenPage","https://www.zhihu.com/hot"]
Append the exact output under a ## browser-login heading.
- Step 5: Run the documented sgBrowerserActiveTab candidate
Run exactly:
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "post-active-tab-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
Append the exact output under a ## active-tab heading.
- Step 6: Run one combined bootstrap candidate
Run exactly:
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "combined-open-agent::[\"about:blank\",\"sgOpenAgent\"]" --step "combined-active-tab::[\"about:blank\",\"sgBrowerserActiveTab\",\"https://www.zhihu.com/hot\",\"probeCallback\"]" --step "combined-open::[\"about:blank\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
Append the exact output under a ## combined-bootstrap heading.
- Step 7: Run requesturl variants for the minimal action
Run exactly these two additional commands:
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "target-as-requesturl::[\"https://www.zhihu.com/hot\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
cargo run --bin sgbrowser_ws_probe -- --ws-url "ws://127.0.0.1:12345" --timeout-ms 1500 --step "baidu-requesturl::[\"https://www.baidu.com\",\"sgBrowerserOpenPage\",\"https://www.zhihu.com/hot\"]"
Append the exact outputs under ## requesturl-variants.
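Most frames in this matrix share one shape: a JSON array of `[requesturl, action, args...]` with string elements. A small builder avoids hand-escaping mistakes when composing `--step` payloads. This is a sketch under stated assumptions: `build_frame` and `json_string` are hypothetical helpers, only string arguments are covered (the sgBrowserLogin frame carries an object and is out of scope here), and in the actual codebase serde_json would likely be the natural choice; the hand-rolled escaping is used only to keep the example dependency-free.

```rust
/// Escape a string as a JSON string literal, quotes included.
fn json_string(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Build a compact `[requesturl, action, args...]` text frame.
/// Hypothetical helper: string arguments only.
pub fn build_frame(request_url: &str, action: &str, args: &[&str]) -> String {
    let mut parts = vec![json_string(request_url), json_string(action)];
    parts.extend(args.iter().map(|arg| json_string(arg)));
    format!("[{}]", parts.join(","))
}
```

With this helper, the two requesturl variants differ only in the first argument, which makes it harder to change the action or target URL by accident between runs.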
- Step 8: Summarize the matrix in the transcript file
At the end of docs/_tmp_sgbrowser_ws_probe_transcript.md, add this exact table template and fill it in:
| Sequence | Sent frames | First reply | Final outcome | Decision signal |
| --- | --- | --- | --- | --- |
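If the rows are generated instead of typed by hand, a small formatter can keep them aligned with the template's five columns. The following is a sketch only: `MatrixRow`, `cell`, and `render_row` are hypothetical names, and joining multiple sent frames with " ; " and printing "(none)" for a missing reply are arbitrary presentation choices.

```rust
/// One row of the summary table (hypothetical shape mirroring the template).
pub struct MatrixRow {
    pub sequence: String,
    pub sent_frames: Vec<String>,
    pub first_reply: Option<String>,
    pub final_outcome: String,
    pub decision_signal: String,
}

/// Make text safe for a single table cell: a literal pipe would split the
/// cell and a newline would end the row.
fn cell(text: &str) -> String {
    text.replace('|', "\\|").replace('\n', " ")
}

/// Format one row in the column order of the template.
pub fn render_row(row: &MatrixRow) -> String {
    let sent = row.sent_frames.join(" ; ");
    let first = row.first_reply.as_deref().unwrap_or("(none)");
    format!(
        "| {} | {} | {} | {} | {} |",
        cell(&row.sequence),
        cell(&sent),
        cell(first),
        cell(&row.final_outcome),
        cell(&row.decision_signal)
    )
}
```

The cell contents must still come from the recorded transcripts; the formatter only handles layout.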
- Step 9: Determine which architecture option wins
Decision rule:
- if at least one sequence reproducibly yields real numeric status and/or callback frames for a real business action, Option A (bootstrap-validated raw websocket) wins
- otherwise, Option B (bridge-first) wins
Do not weaken this decision rule.
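To make the rule harder to bend after the fact, it can be written down as a pure function before the matrix is run. The sketch below is one possible reading, not the authoritative rule: it assumes "reproducibly" means at least two runs of the same sequence that all show a signal, and it assumes a numeric status frame is a frame whose whole text parses as an integer. Callback detection is left to a caller-supplied predicate because the plan does not specify the callback frame format. All type and function names are hypothetical.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArchitectureOption {
    BootstrapValidatedRawWebsocket, // Option A
    BridgeFirst,                    // Option B
}

/// Frames received in response to the business action in one run of one sequence.
pub struct SequenceRun {
    pub sequence: String,
    pub action_frames: Vec<String>,
}

/// Assumption: a numeric status frame is a frame whose whole text is an integer.
fn is_numeric_status(frame: &str) -> bool {
    frame.trim().parse::<i64>().is_ok()
}

/// Apply the decision rule to the collected runs.
pub fn decide<F>(runs: &[SequenceRun], is_callback: F) -> ArchitectureOption
where
    F: Fn(&str) -> bool,
{
    // sequence label -> (total runs, runs that produced a signal)
    let mut tally: BTreeMap<&str, (usize, usize)> = BTreeMap::new();
    for run in runs {
        let has_signal = run
            .action_frames
            .iter()
            .any(|frame| is_numeric_status(frame) || is_callback(frame));
        let entry = tally.entry(run.sequence.as_str()).or_insert((0, 0));
        entry.0 += 1;
        if has_signal {
            entry.1 += 1;
        }
    }
    // Assumption: reproducible = run at least twice, signal present every time.
    let proven = tally
        .values()
        .any(|&(total, signalled)| total >= 2 && signalled == total);
    if proven {
        ArchitectureOption::BootstrapValidatedRawWebsocket
    } else {
        ArchitectureOption::BridgeFirst
    }
}
```

A welcome banner followed by silence never counts as a signal under this reading, so a matrix made up entirely of such runs resolves to Option B.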
- Step 10: Commit the evidence artifact
git add docs/_tmp_sgbrowser_ws_probe_transcript.md
git commit -m "docs: capture sgBrowser websocket probe evidence"
Task 4A: If Option A wins, write the narrow bootstrap implementation slice
Files:
- Create: docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md
- Create: docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md
- Reuse as evidence input:
  - docs/_tmp_sgbrowser_ws_probe_transcript.md
  - docs/_tmp_sgbrowser_ws_api_doc.txt
  - src/browser/ws_backend.rs
  - src/browser/ws_protocol.rs
- Step 1: Write one new design doc capturing the proven bootstrap contract
Create:
docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md
Include:
- exact validated sequence
- exact required state (requesturl, active tab, agent page, auth payload)
- exact failure semantics
- why this is sufficient evidence to keep raw websocket as the product surface
- Step 2: Write one new implementation plan for the bootstrap path
Create:
docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md
Plan only the minimal production changes required to embed the validated bootstrap sequence into the service/browser path.
- Step 3: Commit the bootstrap decision docs
git add docs/superpowers/specs/2026-04-03-ws-browser-bootstrap-contract-design.md docs/superpowers/plans/2026-04-03-ws-browser-bootstrap-contract-plan.md
git commit -m "docs: capture ws browser bootstrap contract"
- Step 4: Stop after writing the bootstrap plan
Do not begin production implementation in the same slice unless the user explicitly asks for execution.
Task 4B: If Option B wins, write the bridge-first implementation slice
Files:
- Create: docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md
- Create: docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md
- Reuse as evidence input:
  - docs/_tmp_sgbrowser_ws_probe_transcript.md
  - frontend/archive/sgClaw验证-已归档/testRunner.js
  - docs/superpowers/specs/2026-03-25-superrpa-sgclaw-browser-control-design.md
  - docs/archive/项目管理与排期/协作时间表.md
  - docs/plans/2026-03-27-sgclaw-floating-chat-frontend-design.md
- Step 1: Write the bridge-path design doc
Create docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md.
The design must specify:
- why raw websocket is considered non-validated for external control
- which bridge surface becomes authoritative
- where sgClaw should integrate (FunctionsUI, host bridge, BrowserAction, CommandRouter, or the nearest validated seam in this repo)
- how to preserve pipe behavior and existing abstractions where practical
- Step 2: Write the bridge-path implementation plan
Create docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md.
The plan must:
- identify exact files to touch
- describe the narrowest adapter implementation
- keep TDD/task granularity as in this document
- avoid speculative work outside the bridge slice
- Step 3: Commit the bridge decision docs
git add docs/superpowers/specs/2026-04-03-ws-browser-bridge-path-design.md docs/superpowers/plans/2026-04-03-ws-browser-bridge-path-plan.md
git commit -m "docs: define bridge-first sgBrowser integration"
- Step 4: Stop after writing the bridge plan
Do not start the bridge implementation in the same slice unless the user explicitly asks for execution.
Verification Checklist
Deterministic probe harness tests
cargo test --test browser_ws_probe_test -- --nocapture
Expected: transcript capture, ordered scripts, timeout reporting, and close/reset reporting all pass.
Probe binary build
cargo build --bin sgbrowser_ws_probe
Expected: PASS.
Live evidence collection
- run the probe matrix against the real configured endpoint
- save exact transcripts to docs/_tmp_sgbrowser_ws_probe_transcript.md
- make the architecture decision using the documented rule
Follow-up branch condition
- if Option A wins, repository contains a bootstrap-contract design + plan
- if Option B wins, repository contains a bridge-path design + plan
- no production runtime changes are made until that decision is written down
Notes for Implementation
- The existing WsBrowserBackend fix that remembers the last navigated URL remains valid; do not revert it.
- The previous auth-replacement work also remains valid; it removed a real bug but did not prove the raw websocket architecture.
- Keep the probe tool brutally literal: exact sent frames, exact received frames, explicit timeout/close outcomes.
- Resist the temptation to make the probe “smart.” Smart probes hide evidence.
- If the real endpoint still replies only with the welcome banner and then silence across the matrix, treat that as a decision, not as an excuse for more guessing.