Files
claw/docs/superpowers/plans/2026-04-02-ws-browser-backend-auth-plan.md
木炎 bdf8e12246 feat: align browser callback runtime and export flows
Consolidate the browser task runtime around the callback path, add safer artifact opening for Zhihu exports, and cover the new service/browser flows with focused tests and supporting docs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 21:44:53 +08:00

23 KiB
Raw Permalink Blame History

WS Browser Backend Auth Replacement Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Replace the ws service paths empty-session-key BrowserPipeTool dependency with a ws-native browser backend path so real browser websocket calls work, while preserving legacy pipe behavior exactly.

Architecture: Keep the existing pipe entry untouched and add a ws-only parallel execution seam. The ws service path will construct a ServiceBrowserWsClient plus WsBrowserBackend, pass that backend through a new ws-only submit-task entry, and let the existing compat/runtime stack consume Arc<dyn BrowserBackend> instead of requiring BrowserPipeTool on the ws path.

Tech Stack: Rust 2021, current sgclaw agent/task runner, compat runtime/orchestration stack, tungstenite, serde_json, existing MacPolicy, existing BrowserBackend/WsBrowserBackend, and the current Rust test suite.


Scope Guardrails

  • Only change the ws service path.
  • Do not change src/lib.rs pipe runtime behavior.
  • Do not change pipe handshake semantics.
  • Do not introduce fake session keys, fake HMAC seeds, or auth bypasses.
  • Keep legacy run_submit_task(...) available for the pipe entry.
  • If a shared layer must change, add a parallel ws-only entry instead of weakening the pipe path.
  • Keep the current single-client, single-task service model.
  • Do not broaden this slice into browser process launch, queueing, multi-client support, or protocol extensions.

File Structure

Existing files to modify

  • Modify: src/agent/task_runner.rs — keep the current pipe-oriented submit path and add the ws-only backend-based submit path.
  • Modify: src/compat/runtime.rs — add a backend-driven execution entry that accepts Arc<dyn BrowserBackend> directly, while keeping the current pipe-oriented public functions behaviorally unchanged.
  • Modify: src/compat/orchestration.rs — add a matching backend-driven execution entry for orchestration/direct-route flows, while keeping the current pipe-oriented public functions behaviorally unchanged.
  • Modify: src/compat/workflow_executor.rs — add backend-driven sibling APIs for direct-route/fallback execution, while keeping the current pipe-oriented public functions behaviorally unchanged.
  • Modify: src/service/server.rs — replace the ws services BrowserPipeTool::new(..., vec![]) path with a ws-native WsClient + WsBrowserBackend path.
  • Modify: src/service/mod.rs — only if minimal re-export or call-signature changes are needed around the new ws-only submit path.
  • Modify: src/browser/mod.rs — only if export cleanup is truly needed for the service wiring.
  • Reuse: src/agent/mod.rs — keep the current pipe routing unchanged unless a tiny internal refactor is strictly needed to reuse shared code.
  • Reuse: src/browser/backend.rs — existing shared browser backend trait.
  • Reuse: src/browser/ws_backend.rs — existing ws-native browser backend implementation.
  • Reuse: src/browser/ws_protocol.rs — existing browser websocket protocol codec.
  • Reuse: src/compat/browser_tool_adapter.rs — should already speak BrowserBackend; only touch if a narrow ws regression forces it.
  • Reuse: src/compat/browser_script_skill_tool.rs — eval-capability gating already exists; only touch if a narrow ws regression forces it.
  • Reuse: src/lib.rs — pipe entrypoint must remain behaviorally unchanged; verify only.

Existing tests to extend

  • Modify: tests/browser_ws_backend_test.rs — keep existing ws backend coverage green after the service adapter wiring lands.
  • Modify: tests/browser_script_skill_tool_test.rs — re-verify eval-gating and browser-script behavior after the shared compat/runtime seam changes.
  • Modify: tests/service_ws_session_test.rs — update service-side unit/session tests to exercise the ws-only submit path.
  • Modify: tests/service_task_flow_test.rs — add client→service chain coverage proving the ws path reaches a browser websocket and no longer emits invalid hmac seed.
  • Modify: src/service/server.rs under #[cfg(test)] if the private service-side ws adapter cannot be exercised from an integration test crate without changing production visibility.

New files to create

  • Create: tests/browser_ws_service_adapter_test.rs if the adapter can be exercised through a public seam; otherwise keep the deterministic adapter tests as unit tests in src/service/server.rs so no production visibility changes are required.

Task 1: Lock the ws-only behavior with deterministic failing tests

Files:

  • Create: tests/browser_ws_service_adapter_test.rs

  • Modify: tests/service_task_flow_test.rs

  • Reuse: tests/browser_ws_backend_test.rs, src/browser/ws_backend.rs, src/service/server.rs

  • Step 1: Write the first failing backend/adapter test

Create tests/browser_ws_service_adapter_test.rs with one focused test that directly exercises the ws-service adapter layer, without sg_claw_client, without LLM planning, and without natural-language tasks.

Start with the smallest real behavior from the spec:

  • fake browser websocket server accepts one connection

  • the ws-service adapter builds the same kind of client the service will use

  • WsBrowserBackend.invoke(Action::Navigate, ...) succeeds on status 0

  • the fake server receives one text frame that decodes as a ws Navigate call

  • Step 2: Run that single new test and watch it fail

Run:

cargo test --test browser_ws_service_adapter_test ws_service_backend_navigate_reaches_browser_websocket -- --nocapture

Expected: FAIL because the service-side ws client/adapter does not exist yet.

  • Step 3: Add the second failing deterministic test

In the same file, add a test for the forced-close path:

  • fake browser websocket server accepts a request, then closes/reset the socket before returning a status frame

  • observe the error at the WsBrowserBackend.invoke(...) call site

  • assert the outward error is exactly PipeError::PipeClosed

  • Step 4: Run only the forced-close test and watch it fail

Run:

cargo test --test browser_ws_service_adapter_test ws_service_backend_maps_browser_disconnect_to_pipe_closed -- --nocapture

Expected: FAIL because the service-side ws client/adapter still does not exist.

  • Step 5: Add the third failing deterministic test

In the same file, add a callback-timeout test:

  • fake browser websocket server returns status 0
  • it never returns the callback frame
  • assert the outward error at invoke(...) is exactly PipeError::Timeout

Use a tiny response timeout in the backend under test.

  • Step 6: Run only the callback-timeout test and watch it fail

Run:

cargo test --test browser_ws_service_adapter_test ws_service_backend_times_out_waiting_for_callback -- --nocapture

Expected: FAIL because the service-side ws client/adapter still does not exist.

  • Step 7: Add the end-to-end failing regression for the auth bug

Extend tests/service_task_flow_test.rs with one client→service integration test that:

  • starts a fake browser websocket server
  • starts the real sg_claw service binary with a temp config pointing browserWsUrl to that fake server
  • starts the real sg_claw_client
  • submits the fixed instruction 打开知乎热榜并读取页面主区域文本
  • captures service/client output
  • asserts the fake browser server received at least one text frame
  • asserts output does not contain invalid hmac seed: session key must not be empty

Do not assert planner details here. This test only proves the service path no longer goes through the empty-session-key auth failure.

  • Step 8: Run the integration regression and watch it fail

Run:

cargo test --test service_task_flow_test ws_service_submit_task_no_longer_hits_invalid_hmac_seed -- --nocapture

Expected: FAIL on the current code because the ws service still constructs BrowserPipeTool::new(..., vec![]).

  • Step 9: Commit the red tests only after they are all in place

Do not commit yet if any required red test was skipped. The next task will make them pass.


Task 2: Add a ws-only browser-backend execution seam without changing the pipe path

Files:

  • Modify: src/agent/task_runner.rs

  • Modify: src/compat/runtime.rs

  • Modify: src/compat/orchestration.rs

  • Modify: src/compat/workflow_executor.rs

  • Reuse: src/agent/mod.rs, src/browser/backend.rs

  • Test: tests/task_runner_test.rs, tests/browser_script_skill_tool_test.rs

  • Step 1: Write the smallest failing runner-level ws entry test

Extend tests/task_runner_test.rs with a focused test that proves there is a ws-only submit entry accepting Arc<dyn BrowserBackend> and an arbitrary event sink, while the old run_submit_task(...) signature still exists for pipe mode.

The test can stay on the missing-LLM-config path so it does not need a real browser call. It should compile only once the new ws-only function exists.

  • Step 2: Run the targeted runner test and watch it fail

Run:

cargo test --test task_runner_test ws_only_submit_task_entry_accepts_browser_backend -- --nocapture

Expected: FAIL to compile or FAIL to link because the ws-only entry does not exist yet.

  • Step 3: Add the new ws-only submit-task entry in src/agent/task_runner.rs

Keep the current pipe function intact:

pub fn run_submit_task<T: Transport + 'static>(... browser_tool: &BrowserPipeTool<T>, ...)

Add a parallel entry for the service path, for example:

pub fn run_submit_task_with_browser_backend(
    sink: &dyn AgentEventSink,
    browser_backend: Arc<dyn BrowserBackend>,
    context: &AgentRuntimeContext,
    request: SubmitTaskRequest,
) -> Result<(), PipeError>

Rules:

  • share as much internal logic as possible with the pipe path

  • do not change run_submit_task(...) behavior

  • do not change src/agent/mod.rs pipe wiring except, at most, small internal refactoring to reuse common code

  • Step 4: Add a backend-driven compat runtime entry

In src/compat/runtime.rs, add a parallel entry that accepts Arc<dyn BrowserBackend> directly instead of BrowserPipeTool<T>.

Keep the existing pipe-oriented public function in place.

The backend-driven entry must preserve:

  • existing log emission order

  • tool names (superrpa_browser, browser_action)

  • existing browser-script tool gating behavior

  • existing office/screen tool attachment logic

  • existing conversation seeding and provider setup

  • Step 5: Add backend-driven orchestration and workflow-executor entries

In src/compat/orchestration.rs, add the matching backend-driven entry so direct-route flows and fallback flows can run with Arc<dyn BrowserBackend> on the ws path.

In src/compat/workflow_executor.rs, add backend-driven sibling APIs for any direct-route/fallback execution that is currently hard-wired to BrowserPipeTool<T>.

Keep the existing pipe-oriented orchestration and workflow-executor public functions in place.

  • Step 6: Route the new ws-only submit entry through the backend-driven compat/orchestration/workflow-executor path

Inside src/agent/task_runner.rs, make the new ws-only submit entry call the new backend-based compat/orchestration functions, while the old pipe entry keeps calling the old pipe-based functions.

This is the core compatibility seam, and it must cover both normal compat-runtime execution and direct-route/fallback workflow execution.

  • Step 7: Re-run the new runner test

Run:

cargo test --test task_runner_test ws_only_submit_task_entry_accepts_browser_backend -- --nocapture

Expected: PASS.

  • Step 8: Re-run the full runner, workflow, and browser-script regressions

Run:

cargo test --test task_runner_test --test browser_script_skill_tool_test -- --nocapture

Then run the workflow executor unit coverage that protects direct-route behavior:

cargo test compat::workflow_executor::tests -- --nocapture

Expected: all existing runner, workflow, and browser-script tests still pass, proving the pipe-facing path, direct-route behavior, and eval-gating stayed stable.

  • Step 9: Commit
git add src/agent/task_runner.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/task_runner_test.rs tests/browser_script_skill_tool_test.rs
git commit -m "refactor: add ws-only browser backend submit path"

Task 3: Replace the ws services empty-session-key browser tool with a ws-native backend

Files:

  • Modify: src/service/server.rs

  • Modify: src/service/mod.rs only if minimal re-export or signature cleanup is required

  • Modify: src/browser/mod.rs only if export cleanup is needed

  • Test: tests/browser_ws_service_adapter_test.rs

  • Reuse: src/browser/ws_backend.rs, src/browser/ws_protocol.rs

  • Step 1: Write the smallest failing service-side adapter compile target

Add a compile-level or construction-level assertion in tests/browser_ws_service_adapter_test.rs that the service path can construct the new service-side ws client type used by serve_client(...).

This should fail until the type exists in src/service/server.rs.

  • Step 2: Run the adapter test group and watch the constructor test fail

Run:

cargo test --test browser_ws_service_adapter_test -- --nocapture

Expected: FAIL because the service-side ws client type does not exist yet.

  • Step 3: Introduce ServiceBrowserWsClient in src/service/server.rs

Create a narrow client type that owns the real websocket connection to browser_ws_url and implements WsClient:

Required responsibilities only:

  • lazily connect on first use
  • send raw text frames
  • receive raw text frames with timeout
  • map close/reset to exactly PipeError::PipeClosed
  • map connect failure to exactly PipeError::Protocol("browser websocket connect failed: ...")
  • map timeouts to exactly PipeError::Timeout

Do not duplicate WsBrowserBackend responsibilities here.

  • Step 4: Remove ws-path use of BrowserPipeTool::new(..., vec![])

In serve_client(...), replace this shape:

let transport = Arc::new(ServiceBrowserTransport::new(...));
let browser_tool = BrowserPipeTool::new(transport.clone(), mac_policy.clone(), vec![])

with the ws-native shape:

let ws_client = Arc::new(ServiceBrowserWsClient::new(...));
let browser_backend: Arc<dyn BrowserBackend> = Arc::new(
    WsBrowserBackend::new(ws_client, mac_policy.clone(), initial_request_url(...))
        .with_response_timeout(BROWSER_RESPONSE_TIMEOUT)
);

Then route the task through the new ws-only submit entry from Task 2.

  • Step 5: Delete or narrow old ws-path transport code that duplicated protocol handling

Remove the service-only callback polling / response queue logic that existed solely to feed BrowserPipeTool.

Keep only what is still needed for:

  • service client websocket I/O (sg_claw_clientsg_claw)
  • browser websocket I/O (sg_clawbrowser_ws_url)

Do not leave two competing ws protocol implementations in src/service/server.rs.

  • Step 6: Re-run deterministic adapter/backend tests

Run:

cargo test --test browser_ws_service_adapter_test -- --nocapture

Expected: PASS, including:

  • navigate success

  • disconnect => PipeError::PipeClosed

  • callback timeout => PipeError::Timeout

  • Step 7: Re-run existing ws backend tests

Run:

cargo test --test browser_ws_backend_test -- --nocapture

Expected: PASS, confirming the service adapter change did not break the existing ws backend semantics.

  • Step 8: Commit
git add src/service/server.rs src/service/mod.rs src/browser/mod.rs tests/browser_ws_service_adapter_test.rs
git commit -m "feat: switch ws service to ws-native browser backend"

Task 4: Prove the auth bug is gone and pipe mode is unchanged

Files:

  • Modify: tests/service_ws_session_test.rs

  • Modify: tests/service_task_flow_test.rs

  • Reuse: src/lib.rs, src/service/mod.rs, src/compat/workflow_executor.rs

  • Step 1: Update service session tests for the new ws-only call path

Adjust any service session tests that still call handle_client_message(...) through the old ws-path BrowserPipeTool assumption.

Prefer one of these narrow approaches:

  • overload handle_client_message(...) with a backend-based service entry used only in ws tests, or
  • keep handle_client_message(...) pipe-oriented and test the ws path through serve_client(...) and the real service binary instead

Choose the option that changes the fewest existing tests and leaves the pipe path simplest.

  • Step 2: Run the focused service session file

Run:

cargo test --test service_ws_session_test -- --nocapture

Expected: PASS.

  • Step 3: Make the auth-regression integration test pass

Re-run the exact end-to-end regression from Task 1:

cargo test --test service_task_flow_test ws_service_submit_task_no_longer_hits_invalid_hmac_seed -- --nocapture

Expected: PASS, with evidence that:

  • the fake browser websocket server received at least one frame

  • output no longer contains invalid hmac seed: session key must not be empty

  • Step 4: Add one explicit mandatory assertion for browser websocket connect failures

Add one focused assertion that a browser websocket connect failure surfaces outward as:

PipeError::Protocol("browser websocket connect failed: ...")

Do not leave this semantic implied.

  • Step 5: Add one explicit ws direct-route regression

Add one focused regression that proves a ws-backed browser backend can traverse a direct-route/fallback path that currently flows through src/compat/workflow_executor.rs.

Keep it deterministic and narrow. Prefer a fake backend plus direct function invocation over a planner-dependent natural-language end-to-end test.

  • Step 6: Run the ws-focused regression suite

Run:

cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture

Then run the workflow-executor direct-route coverage:

cargo test compat::workflow_executor::tests -- --nocapture

Expected: all ws-focused and direct-route workflow tests pass.

  • Step 7: Run the required pipe and browser-script regression suite

Run:

cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture

Expected: all required pipe and browser-script regressions pass unchanged.

  • Step 8: Run the full relevant verification sweep

Run:

cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture

Then run:

cargo test compat::workflow_executor::tests -- --nocapture

Expected: full mixed ws+pipe verification passes in fresh runs.

  • Step 9: Build the affected binaries

Run:

cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client

Expected: all three binaries compile.

  • Step 10: Commit
git add tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/browser_ws_service_adapter_test.rs src/compat/workflow_executor.rs
git commit -m "test: verify ws auth replacement and pipe regressions"

Task 5: Manual smoke verification against the real browser

Files:

  • Reuse only: no code changes unless a verified bug is found during smoke work

  • Step 1: Start the real browser websocket target

Confirm the real sgBrowser endpoint is reachable at the configured browserWsUrl.

  • Step 2: Start the real ws service

Run:

cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"

Expected: service prints the resolved listen address and browser websocket URL.

  • Step 3: Run the minimal browser task through the real client

Run from a separate terminal with UTF-8-safe input:

cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"

Submit:

打开知乎热榜并读取页面主区域文本

Expected:

  • browser actions start executing

  • no invalid hmac seed: session key must not be empty

  • one final completion is returned

  • Step 4: Run the old Zhihu skill smoke

Submit:

读取知乎热榜数据,并导出 excel 文件

Expected: the task enters the real browser action path instead of dying at auth initialization.

  • Step 5: Re-check the legacy pipe entry without modifying it

Run:

cargo run

Only verify startup behavior appropriate for the current pipe environment. Do not change pipe code during this smoke step.

  • Step 6: If a smoke failure appears, stop and debug before editing

Any failure found here must be handled with:

  • a fresh reproducer
  • a failing automated test if feasible
  • the smallest scoped fix

Do not fold speculative smoke fixes into this slice.


Verification Checklist

Deterministic ws-only tests

cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test -- --nocapture

Expected: ws-native backend and service adapter semantics are green without LLM/planner dependencies.

Client→service ws chain tests

cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture

Expected: the ws service path reaches the browser websocket and no longer emits the empty-session-key auth failure.

Required pipe and browser-script regressions

cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture

Expected: legacy pipe behavior and browser-script eval-gating remain unchanged.

Binary build verification

cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client

Expected: all affected binaries compile.

Manual end-to-end verification

  • real sgBrowser running at configured browserWsUrl
  • cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
  • cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
  • run the Zhihu minimal task
  • run the old Zhihu export task
  • verify no invalid hmac seed appears
  • verify pipe startup still behaves as before

Notes for Implementation

  • Keep the current pipe bootstrap in src/lib.rs untouched.
  • Prefer adding ws-only functions over changing existing pipe signatures.
  • Reuse WsBrowserBackend for protocol semantics; do not re-implement callback handling inside the service.
  • Keep ServiceBrowserWsClient narrow: connection lifecycle + raw websocket I/O only.
  • Preserve exact outward error semantics from the spec:
    • connect failure => PipeError::Protocol("browser websocket connect failed: ...")
    • non-zero status => PipeError::Protocol("browser returned non-zero status: ...")
    • callback timeout => PipeError::Timeout
    • close/reset => PipeError::PipeClosed
  • Do not claim success until the mixed ws+pipe verification commands have been run fresh.