木炎 bdf8e12246 feat: align browser callback runtime and export flows

Consolidate the browser task runtime around the callback path, add safer artifact opening for Zhihu exports, and cover the new service/browser flows with focused tests and supporting docs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-06 21:44:53 +08:00

WS Browser Backend Auth Replacement Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Replace the ws service path’s empty-session-key BrowserPipeTool dependency with a ws-native browser backend path so real browser websocket calls work, while preserving legacy pipe behavior exactly.

Architecture: Keep the existing pipe entry untouched and add a ws-only parallel execution seam. The ws service path will construct a ServiceBrowserWsClient plus WsBrowserBackend, pass that backend through a new ws-only submit-task entry, and let the existing compat/runtime stack consume Arc<dyn BrowserBackend> instead of requiring BrowserPipeTool on the ws path.
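The seam described above can be pictured with a small sketch. The trait shape, `Action` enum, `FakeWsBackend`, and `run_compat_with_backend` below are hypothetical stand-ins, not the real definitions in src/browser/backend.rs; the only point is that a shared layer taking `Arc<dyn BrowserBackend>` never needs a `BrowserPipeTool` or a session key.

```rust
use std::sync::Arc;

// Hypothetical stand-ins; the real trait and action types live in
// src/browser/backend.rs and may look quite different.
#[derive(Debug, PartialEq)]
pub enum Action {
    Navigate,
}

pub trait BrowserBackend: Send + Sync {
    fn invoke(&self, action: Action, target: &str) -> Result<String, String>;
}

/// Stand-in for the ws-native backend used on the service path.
pub struct FakeWsBackend;

impl BrowserBackend for FakeWsBackend {
    fn invoke(&self, action: Action, target: &str) -> Result<String, String> {
        Ok(format!("ws:{:?}:{}", action, target))
    }
}

/// Hypothetical shared entry: it sees only the trait object, so the
/// ws path can reach it without constructing any pipe tool.
pub fn run_compat_with_backend(
    backend: Arc<dyn BrowserBackend>,
    url: &str,
) -> Result<String, String> {
    backend.invoke(Action::Navigate, url)
}
```

A pipe-backed implementation of the same trait could sit beside `FakeWsBackend` without either path knowing about the other.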

Tech Stack: Rust 2021, current sgclaw agent/task runner, compat runtime/orchestration stack, tungstenite, serde_json, existing MacPolicy, existing BrowserBackend/WsBrowserBackend, and the current Rust test suite.

Scope Guardrails

Only change the ws service path.
Do not change src/lib.rs pipe runtime behavior.
Do not change pipe handshake semantics.
Do not introduce fake session keys, fake HMAC seeds, or auth bypasses.
Keep legacy run_submit_task(...) available for the pipe entry.
If a shared layer must change, add a parallel ws-only entry instead of weakening the pipe path.
Keep the current single-client, single-task service model.
Do not broaden this slice into browser process launch, queueing, multi-client support, or protocol extensions.

File Structure

Existing files to modify

Modify: src/agent/task_runner.rs — keep the current pipe-oriented submit path and add the ws-only backend-based submit path.
Modify: src/compat/runtime.rs — add a backend-driven execution entry that accepts Arc<dyn BrowserBackend> directly, while keeping the current pipe-oriented public functions behaviorally unchanged.
Modify: src/compat/orchestration.rs — add a matching backend-driven execution entry for orchestration/direct-route flows, while keeping the current pipe-oriented public functions behaviorally unchanged.
Modify: src/compat/workflow_executor.rs — add backend-driven sibling APIs for direct-route/fallback execution, while keeping the current pipe-oriented public functions behaviorally unchanged.
Modify: src/service/server.rs — replace the ws service’s BrowserPipeTool::new(..., vec![]) path with a ws-native WsClient + WsBrowserBackend path.
Modify: src/service/mod.rs — only if minimal re-export or call-signature changes are needed around the new ws-only submit path.
Modify: src/browser/mod.rs — only if export cleanup is truly needed for the service wiring.
Reuse: src/agent/mod.rs — keep the current pipe routing unchanged unless a tiny internal refactor is strictly needed to reuse shared code.
Reuse: src/browser/backend.rs — existing shared browser backend trait.
Reuse: src/browser/ws_backend.rs — existing ws-native browser backend implementation.
Reuse: src/browser/ws_protocol.rs — existing browser websocket protocol codec.
Reuse: src/compat/browser_tool_adapter.rs — should already speak BrowserBackend; only touch if a narrow ws regression forces it.
Reuse: src/compat/browser_script_skill_tool.rs — eval-capability gating already exists; only touch if a narrow ws regression forces it.
Reuse: src/lib.rs — pipe entrypoint must remain behaviorally unchanged; verify only.

Existing tests to extend

Modify: tests/browser_ws_backend_test.rs — keep existing ws backend coverage green after the service adapter wiring lands.
Modify: tests/browser_script_skill_tool_test.rs — re-verify eval-gating and browser-script behavior after the shared compat/runtime seam changes.
Modify: tests/service_ws_session_test.rs — update service-side unit/session tests to exercise the ws-only submit path.
Modify: tests/service_task_flow_test.rs — add client→service chain coverage proving the ws path reaches a browser websocket and no longer emits invalid hmac seed.
Modify: src/service/server.rs under #[cfg(test)] if the private service-side ws adapter cannot be exercised from an integration test crate without changing production visibility.

New files to create

Create: tests/browser_ws_service_adapter_test.rs if the adapter can be exercised through a public seam; otherwise keep the deterministic adapter tests as unit tests in src/service/server.rs so no production visibility changes are required.

Task 1: Lock the ws-only behavior with deterministic failing tests

Files:

Create: tests/browser_ws_service_adapter_test.rs
Modify: tests/service_task_flow_test.rs
Reuse: tests/browser_ws_backend_test.rs, src/browser/ws_backend.rs, src/service/server.rs

Step 1: Write the first failing backend/adapter test

Create tests/browser_ws_service_adapter_test.rs with one focused test that directly exercises the ws-service adapter layer, without sg_claw_client, without LLM planning, and without natural-language tasks.

Start with the smallest real behavior from the spec:

fake browser websocket server accepts one connection
the ws-service adapter builds the same kind of client the service will use
WsBrowserBackend.invoke(Action::Navigate, ...) succeeds on status 0
the fake server receives one text frame that decodes as a ws Navigate call

Step 2: Run that single new test and watch it fail

Run:

cargo test --test browser_ws_service_adapter_test ws_service_backend_navigate_reaches_browser_websocket -- --nocapture

Expected: FAIL because the service-side ws client/adapter does not exist yet.
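As a rough illustration of the Step 1 assertions, here is an in-memory stand-in for the fake browser server. It is not a websocket server: `WsClient` is shown with an assumed two-method shape, `ScriptedClient` and `navigate` are hypothetical, the plain-text frame format is invented for the sketch (the real codec lives in src/browser/ws_protocol.rs), and `PipeError` is a local copy of the three variants this plan names.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

// Local stand-in carrying only the variants named in this plan.
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Protocol(String),
    Timeout,
    PipeClosed,
}

// Assumed trait shape; the real WsClient may differ.
pub trait WsClient {
    fn send_text(&self, frame: &str) -> Result<(), PipeError>;
    fn recv_text(&self) -> Result<String, PipeError>;
}

/// Scripted fake: records every sent frame and replays queued replies.
pub struct ScriptedClient {
    pub sent: Mutex<Vec<String>>,
    replies: Mutex<VecDeque<String>>,
}

impl ScriptedClient {
    pub fn new(replies: &[&str]) -> Self {
        Self {
            sent: Mutex::new(Vec::new()),
            replies: Mutex::new(replies.iter().map(|r| r.to_string()).collect()),
        }
    }
}

impl WsClient for ScriptedClient {
    fn send_text(&self, frame: &str) -> Result<(), PipeError> {
        self.sent.lock().unwrap().push(frame.to_string());
        Ok(())
    }

    fn recv_text(&self) -> Result<String, PipeError> {
        // An exhausted script models the peer closing the socket.
        self.replies.lock().unwrap().pop_front().ok_or(PipeError::PipeClosed)
    }
}

/// Toy navigate call: one request frame out, one status frame back.
pub fn navigate(client: &dyn WsClient, url: &str) -> Result<(), PipeError> {
    client.send_text(&format!("navigate {}", url))?;
    match client.recv_text()?.as_str() {
        "0" => Ok(()),
        other => Err(PipeError::Protocol(format!(
            "browser returned non-zero status: {}",
            other
        ))),
    }
}
```

With a script of `["0"]` the call succeeds and exactly one frame is recorded; with an empty script the same call surfaces `PipeError::PipeClosed`. The real tests should still run against an actual fake websocket server, as the step requires.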

Step 3: Add the second failing deterministic test

In the same file, add a test for the forced-close path:

fake browser websocket server accepts a request, then closes or resets the socket before returning a status frame
observe the error at the WsBrowserBackend.invoke(...) call site
assert the outward error is exactly PipeError::PipeClosed

Step 4: Run only the forced-close test and watch it fail

Run:

cargo test --test browser_ws_service_adapter_test ws_service_backend_maps_browser_disconnect_to_pipe_closed -- --nocapture

Expected: FAIL because the service-side ws client/adapter still does not exist.

Step 5: Add the third failing deterministic test

In the same file, add a callback-timeout test:

fake browser websocket server returns status 0
it never returns the callback frame
assert the outward error at invoke(...) is exactly PipeError::Timeout

Use a tiny response timeout in the backend under test.
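The timeout expectation can be sketched with a standard-library channel standing in for the browser socket. `wait_for_callback` is a hypothetical helper, not part of `WsBrowserBackend`, and `PipeError` is again a local copy of the variants named in this plan.

```rust
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::time::Duration;

// Local stand-in carrying only the variants named in this plan.
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Protocol(String),
    Timeout,
    PipeClosed,
}

/// Waits for one callback frame. The channel is a stand-in for the
/// browser websocket: no frame in time maps to Timeout, a dropped
/// sender maps to PipeClosed.
pub fn wait_for_callback(
    frames: &Receiver<String>,
    timeout: Duration,
) -> Result<String, PipeError> {
    match frames.recv_timeout(timeout) {
        Ok(frame) => Ok(frame),
        Err(RecvTimeoutError::Timeout) => Err(PipeError::Timeout),
        Err(RecvTimeoutError::Disconnected) => Err(PipeError::PipeClosed),
    }
}

/// Convenience constructor so a test can hold both channel ends.
pub fn callback_channel() -> (mpsc::Sender<String>, Receiver<String>) {
    mpsc::channel()
}
```

Keeping the timeout in the tens of milliseconds lets the test fail fast without making it flaky on a loaded machine.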

Step 6: Run only the callback-timeout test and watch it fail

Run:

cargo test --test browser_ws_service_adapter_test ws_service_backend_times_out_waiting_for_callback -- --nocapture

Expected: FAIL because the service-side ws client/adapter still does not exist.

Step 7: Add the end-to-end failing regression for the auth bug

Extend tests/service_task_flow_test.rs with one client→service integration test that:

starts a fake browser websocket server
starts the real sg_claw service binary with a temp config pointing browserWsUrl to that fake server
starts the real sg_claw_client
submits the fixed instruction 打开知乎热榜并读取页面主区域文本 ("Open the Zhihu hot list and read the page's main-area text")
captures service/client output
asserts the fake browser server received at least one text frame
asserts output does not contain invalid hmac seed: session key must not be empty

Do not assert planner details here. This test only proves the service path no longer goes through the empty-session-key auth failure.
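The two assertions that matter here can be kept in one small predicate so the test body stays readable. `auth_regression_cleared` is a hypothetical helper name; the error text is the one quoted in this plan.

```rust
/// Error text emitted by the empty-session-key path, as quoted in this plan.
pub const EMPTY_KEY_ERROR: &str = "invalid hmac seed: session key must not be empty";

/// True when the fake browser saw at least one text frame and the
/// captured service/client output is free of the auth failure.
pub fn auth_regression_cleared(frames_received: usize, output: &str) -> bool {
    frames_received >= 1 && !output.contains(EMPTY_KEY_ERROR)
}
```

In the integration test, `frames_received` would come from the fake server's counter and `output` from the combined stdout and stderr of both binaries.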

Step 8: Run the integration regression and watch it fail

Run:

cargo test --test service_task_flow_test ws_service_submit_task_no_longer_hits_invalid_hmac_seed -- --nocapture

Expected: FAIL on the current code because the ws service still constructs BrowserPipeTool::new(..., vec![]).

Step 9: Commit the red tests only after they are all in place

Do not commit yet if any required red test was skipped. The next task will make them pass.

Task 2: Add a ws-only browser-backend execution seam without changing the pipe path

Files:

Modify: src/agent/task_runner.rs
Modify: src/compat/runtime.rs
Modify: src/compat/orchestration.rs
Modify: src/compat/workflow_executor.rs
Reuse: src/agent/mod.rs, src/browser/backend.rs
Test: tests/task_runner_test.rs, tests/browser_script_skill_tool_test.rs

Step 1: Write the smallest failing runner-level ws entry test

Extend tests/task_runner_test.rs with a focused test that proves there is a ws-only submit entry accepting Arc<dyn BrowserBackend> and an arbitrary event sink, while the old run_submit_task(...) signature still exists for pipe mode.

The test can stay on the missing-LLM-config path so it does not need a real browser call. It should compile only once the new ws-only function exists.

Step 2: Run the targeted runner test and watch it fail

Run:

cargo test --test task_runner_test ws_only_submit_task_entry_accepts_browser_backend -- --nocapture

Expected: FAIL to compile (unresolved name) because the ws-only entry does not exist yet.

Step 3: Add the new ws-only submit-task entry in src/agent/task_runner.rs

Keep the current pipe function intact:

pub fn run_submit_task<T: Transport + 'static>(... browser_tool: &BrowserPipeTool<T>, ...)

Add a parallel entry for the service path, for example:

pub fn run_submit_task_with_browser_backend(
    sink: &dyn AgentEventSink,
    browser_backend: Arc<dyn BrowserBackend>,
    context: &AgentRuntimeContext,
    request: SubmitTaskRequest,
) -> Result<(), PipeError>
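One way to make the Step 1 test "compile only once the function exists" is to pin the signature with a function-pointer alias. Everything below is a self-contained sketch: the traits and structs are empty placeholders for the real sgclaw types, the function body is a placeholder, and `WsSubmitEntry`, `pinned_entry`, `NullSink`, and `NullBackend` are hypothetical names.

```rust
use std::sync::Arc;

// Empty placeholders standing in for the real sgclaw types.
pub trait BrowserBackend: Send + Sync {}
pub trait AgentEventSink {}
pub struct AgentRuntimeContext;
pub struct SubmitTaskRequest;

#[derive(Debug, PartialEq)]
pub enum PipeError {
    Protocol(String),
    Timeout,
    PipeClosed,
}

// Placeholder body; the real entry would run the backend-driven stack.
pub fn run_submit_task_with_browser_backend(
    _sink: &dyn AgentEventSink,
    _browser_backend: Arc<dyn BrowserBackend>,
    _context: &AgentRuntimeContext,
    _request: SubmitTaskRequest,
) -> Result<(), PipeError> {
    Ok(())
}

/// The signature the ws-only entry is expected to keep.
pub type WsSubmitEntry = fn(
    &dyn AgentEventSink,
    Arc<dyn BrowserBackend>,
    &AgentRuntimeContext,
    SubmitTaskRequest,
) -> Result<(), PipeError>;

/// Compiles only while the entry exists with exactly this signature.
pub fn pinned_entry() -> WsSubmitEntry {
    run_submit_task_with_browser_backend
}

// Minimal fakes for calling the entry in a test.
pub struct NullSink;
impl AgentEventSink for NullSink {}
pub struct NullBackend;
impl BrowserBackend for NullBackend {}
```

If the parameter order or a type changes later, the coercion in `pinned_entry` stops compiling, which is the early warning the runner test is after.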

Rules:

share as much internal logic as possible with the pipe path
do not change run_submit_task(...) behavior
do not change src/agent/mod.rs pipe wiring except, at most, small internal refactoring to reuse common code

Step 4: Add a backend-driven compat runtime entry

In src/compat/runtime.rs, add a parallel entry that accepts Arc<dyn BrowserBackend> directly instead of BrowserPipeTool<T>.

Keep the existing pipe-oriented public function in place.

The backend-driven entry must preserve:

existing log emission order
tool names (superrpa_browser, browser_action)
existing browser-script tool gating behavior
existing office/screen tool attachment logic
existing conversation seeding and provider setup

Step 5: Add backend-driven orchestration and workflow-executor entries

In src/compat/orchestration.rs, add the matching backend-driven entry so direct-route flows and fallback flows can run with Arc<dyn BrowserBackend> on the ws path.

In src/compat/workflow_executor.rs, add backend-driven sibling APIs for any direct-route/fallback execution that is currently hard-wired to BrowserPipeTool<T>.

Keep the existing pipe-oriented orchestration and workflow-executor public functions in place.

Step 6: Route the new ws-only submit entry through the backend-driven compat/orchestration/workflow-executor path

Inside src/agent/task_runner.rs, make the new ws-only submit entry call the new backend-based compat/orchestration functions, while the old pipe entry keeps calling the old pipe-based functions.

This is the core compatibility seam, and it must cover both normal compat-runtime execution and direct-route/fallback workflow execution.

Step 7: Re-run the new runner test

Run:

cargo test --test task_runner_test ws_only_submit_task_entry_accepts_browser_backend -- --nocapture

Expected: PASS.

Step 8: Re-run the full runner, workflow, and browser-script regressions

Run:

cargo test --test task_runner_test --test browser_script_skill_tool_test -- --nocapture

Then run the workflow executor unit coverage that protects direct-route behavior:

cargo test compat::workflow_executor::tests -- --nocapture

Expected: all existing runner, workflow, and browser-script tests still pass, proving the pipe-facing path, direct-route behavior, and eval-gating stayed stable.

Step 9: Commit

git add src/agent/task_runner.rs src/compat/runtime.rs src/compat/orchestration.rs src/compat/workflow_executor.rs tests/task_runner_test.rs tests/browser_script_skill_tool_test.rs
git commit -m "refactor: add ws-only browser backend submit path"

Task 3: Replace the ws service’s empty-session-key browser tool with a ws-native backend

Files:

Modify: src/service/server.rs
Modify: src/service/mod.rs only if minimal re-export or signature cleanup is required
Modify: src/browser/mod.rs only if export cleanup is needed
Test: tests/browser_ws_service_adapter_test.rs
Reuse: src/browser/ws_backend.rs, src/browser/ws_protocol.rs

Step 1: Write the smallest failing service-side adapter compile target

Add a compile-level or construction-level assertion in tests/browser_ws_service_adapter_test.rs that the service path can construct the new service-side ws client type used by serve_client(...).

This should fail until the type exists in src/service/server.rs.

Step 2: Run the adapter test group and watch the constructor test fail

Run:

cargo test --test browser_ws_service_adapter_test -- --nocapture

Expected: FAIL because the service-side ws client type does not exist yet.

Step 3: Introduce ServiceBrowserWsClient in src/service/server.rs

Create a narrow client type that owns the real websocket connection to browser_ws_url and implements WsClient:

Required responsibilities only:

lazily connect on first use
send raw text frames
receive raw text frames with timeout
map close/reset to exactly PipeError::PipeClosed
map connect failure to exactly PipeError::Protocol("browser websocket connect failed: ...")
map timeouts to exactly PipeError::Timeout

Do not duplicate WsBrowserBackend responsibilities here.
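The lazy-connect and error-mapping responsibilities can be sketched without any websocket library. `LazyConn` and `map_io_error` are hypothetical names, the connector closure stands in for the real tungstenite connect call, and the fallback message for unclassified I/O errors is an assumption that the spec does not define.

```rust
use std::io;

// Local stand-in carrying only the variants named in this plan.
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Protocol(String),
    Timeout,
    PipeClosed,
}

/// Maps raw socket I/O errors onto the outward errors listed above.
pub fn map_io_error(err: &io::Error) -> PipeError {
    match err.kind() {
        io::ErrorKind::TimedOut | io::ErrorKind::WouldBlock => PipeError::Timeout,
        io::ErrorKind::ConnectionReset
        | io::ErrorKind::ConnectionAborted
        | io::ErrorKind::BrokenPipe
        | io::ErrorKind::UnexpectedEof => PipeError::PipeClosed,
        // Assumption: anything else is reported as a protocol error.
        _ => PipeError::Protocol(format!("browser websocket io error: {}", err)),
    }
}

/// Holds a connection that is opened on first use only.
pub struct LazyConn<C, F: Fn() -> io::Result<C>> {
    connect: F,
    conn: Option<C>,
}

impl<C, F: Fn() -> io::Result<C>> LazyConn<C, F> {
    pub fn new(connect: F) -> Self {
        Self { connect, conn: None }
    }

    /// Returns the live connection, connecting first if needed.
    pub fn get(&mut self) -> Result<&mut C, PipeError> {
        if self.conn.is_none() {
            let conn = (self.connect)().map_err(|e| {
                PipeError::Protocol(format!("browser websocket connect failed: {}", e))
            })?;
            self.conn = Some(conn);
        }
        Ok(self.conn.as_mut().expect("connection was just set"))
    }
}
```

Keeping the connector injectable also makes the connect-failure case testable without a real socket.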

Step 4: Remove ws-path use of BrowserPipeTool::new(..., vec![])

In serve_client(...), replace this shape:

let transport = Arc::new(ServiceBrowserTransport::new(...));
let browser_tool = BrowserPipeTool::new(transport.clone(), mac_policy.clone(), vec![]);

with the ws-native shape:

let ws_client = Arc::new(ServiceBrowserWsClient::new(...));
let browser_backend: Arc<dyn BrowserBackend> = Arc::new(
    WsBrowserBackend::new(ws_client, mac_policy.clone(), initial_request_url(...))
        .with_response_timeout(BROWSER_RESPONSE_TIMEOUT)
);

Then route the task through the new ws-only submit entry from Task 2.

Step 5: Delete or narrow old ws-path transport code that duplicated protocol handling

Remove the service-only callback polling / response queue logic that existed solely to feed BrowserPipeTool.

Keep only what is still needed for:

service client websocket I/O (sg_claw_client ↔ sg_claw)
browser websocket I/O (sg_claw ↔ browser_ws_url)

Do not leave two competing ws protocol implementations in src/service/server.rs.

Step 6: Re-run deterministic adapter/backend tests

Run:

cargo test --test browser_ws_service_adapter_test -- --nocapture

Expected: PASS, including:

navigate success
disconnect => PipeError::PipeClosed
callback timeout => PipeError::Timeout

Step 7: Re-run existing ws backend tests

Run:

cargo test --test browser_ws_backend_test -- --nocapture

Expected: PASS, confirming the service adapter change did not break the existing ws backend semantics.

Step 8: Commit

git add src/service/server.rs src/service/mod.rs src/browser/mod.rs tests/browser_ws_service_adapter_test.rs
git commit -m "feat: switch ws service to ws-native browser backend"

Task 4: Prove the auth bug is gone and pipe mode is unchanged

Files:

Modify: tests/service_ws_session_test.rs
Modify: tests/service_task_flow_test.rs
Reuse: src/lib.rs, src/service/mod.rs, src/compat/workflow_executor.rs

Step 1: Update service session tests for the new ws-only call path

Adjust any service session tests that still call handle_client_message(...) through the old ws-path BrowserPipeTool assumption.

Prefer one of these narrow approaches:

add a separately named backend-based sibling of handle_client_message(...) used only in ws tests (Rust has no function overloading), or
keep handle_client_message(...) pipe-oriented and test the ws path through serve_client(...) and the real service binary instead

Choose the option that changes the fewest existing tests and leaves the pipe path simplest.

Step 2: Run the focused service session file

Run:

cargo test --test service_ws_session_test -- --nocapture

Expected: PASS.

Step 3: Make the auth-regression integration test pass

Re-run the exact end-to-end regression from Task 1:

cargo test --test service_task_flow_test ws_service_submit_task_no_longer_hits_invalid_hmac_seed -- --nocapture

Expected: PASS, with evidence that:

the fake browser websocket server received at least one frame
output no longer contains invalid hmac seed: session key must not be empty

Step 4: Add one explicit mandatory assertion for browser websocket connect failures

Add one focused assertion that a browser websocket connect failure surfaces outward as:

PipeError::Protocol("browser websocket connect failed: ...")

Do not leave this semantic implied.

Step 5: Add one explicit ws direct-route regression

Add one focused regression that proves a ws-backed browser backend can traverse a direct-route/fallback path that currently flows through src/compat/workflow_executor.rs.

Keep it deterministic and narrow. Prefer a fake backend plus direct function invocation over a planner-dependent natural-language end-to-end test.
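A fake-backend regression of this kind might look like the sketch below. The string-based trait shape, `RecordingBackend`, `direct_route`, and the action names are all hypothetical; the real direct-route functions in src/compat/workflow_executor.rs will have their own signatures.

```rust
use std::sync::{Arc, Mutex};

// Assumed, simplified trait shape for illustration only.
pub trait BrowserBackend: Send + Sync {
    fn invoke(&self, action: &str, arg: &str) -> Result<String, String>;
}

/// Fake backend that records every call instead of touching a browser.
#[derive(Default)]
pub struct RecordingBackend {
    pub calls: Mutex<Vec<String>>,
}

impl BrowserBackend for RecordingBackend {
    fn invoke(&self, action: &str, arg: &str) -> Result<String, String> {
        self.calls.lock().unwrap().push(format!("{} {}", action, arg));
        Ok("ok".to_string())
    }
}

/// Hypothetical direct route: a fixed action sequence with no planner.
pub fn direct_route(backend: Arc<dyn BrowserBackend>, url: &str) -> Result<String, String> {
    backend.invoke("navigate", url)?;
    backend.invoke("get_text", "main")
}
```

Asserting on the recorded call list keeps the regression independent of any LLM or natural-language planning.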

Step 6: Run the ws-focused regression suite

Run:

cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test -- --nocapture

Then run the workflow-executor direct-route coverage:

cargo test compat::workflow_executor::tests -- --nocapture

Expected: all ws-focused and direct-route workflow tests pass.

Step 7: Run the required pipe and browser-script regression suite

Run:

cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture

Expected: all required pipe and browser-script regressions pass unchanged.

Step 8: Run the full relevant verification sweep

Run:

cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test --test service_ws_session_test --test service_task_flow_test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture

Then run:

cargo test compat::workflow_executor::tests -- --nocapture

Expected: full mixed ws+pipe verification passes in fresh runs.

Step 9: Build the affected binaries

Run:

cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client

Expected: all three binaries compile.

Step 10: Commit

git add tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/browser_ws_service_adapter_test.rs src/compat/workflow_executor.rs
git commit -m "test: verify ws auth replacement and pipe regressions"

Task 5: Manual smoke verification against the real browser

Files:

Reuse only: no code changes unless a verified bug is found during smoke work

Step 1: Start the real browser websocket target

Confirm the real sgBrowser endpoint is reachable at the configured browserWsUrl.

Step 2: Start the real ws service

Run:

cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"

Expected: service prints the resolved listen address and browser websocket URL.

Step 3: Run the minimal browser task through the real client

Run from a separate terminal with UTF-8-safe input:

cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"

Submit (in English: "Open the Zhihu hot list and read the page's main-area text"):

打开知乎热榜并读取页面主区域文本

Expected:

browser actions start executing
no invalid hmac seed: session key must not be empty
one final completion is returned

Step 4: Run the old Zhihu skill smoke

Submit (in English: "Read the Zhihu hot-list data and export it to an Excel file"):

读取知乎热榜数据，并导出 excel 文件

Expected: the task enters the real browser action path instead of dying at auth initialization.

Step 5: Re-check the legacy pipe entry without modifying it

Run:

cargo run

Only verify startup behavior appropriate for the current pipe environment. Do not change pipe code during this smoke step.

Step 6: If a smoke failure appears, stop and debug before editing

Any failure found here must be handled with:

a fresh reproducer
a failing automated test if feasible
the smallest scoped fix

Do not fold speculative smoke fixes into this slice.

Verification Checklist

Deterministic ws-only tests

cargo test --test browser_ws_service_adapter_test --test browser_ws_backend_test --test browser_ws_protocol_test -- --nocapture

Expected: ws-native backend and service adapter semantics are green without LLM/planner dependencies.

Client→service ws chain tests

cargo test --test service_ws_session_test --test service_task_flow_test -- --nocapture

Expected: the ws service path reaches the browser websocket and no longer emits the empty-session-key auth failure.

Required pipe and browser-script regressions

cargo test --test pipe_handshake_test --test browser_tool_test --test compat_browser_tool_test --test browser_script_skill_tool_test --test runtime_task_flow_test -- --nocapture

Expected: legacy pipe behavior and browser-script eval-gating remain unchanged.

Binary build verification

cargo build --bin sgclaw --bin sg_claw --bin sg_claw_client

Expected: all affected binaries compile.

Manual end-to-end verification

real sgBrowser running at configured browserWsUrl
cargo run --bin sg_claw -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
cargo run --bin sg_claw_client -- --config-path "D:/data/ideaSpace/rust/sgClaw/sgclaw_config.json"
run the Zhihu minimal task
run the old Zhihu export task
verify no invalid hmac seed appears
verify pipe startup still behaves as before

Notes for Implementation

Keep the current pipe bootstrap in src/lib.rs untouched.
Prefer adding ws-only functions over changing existing pipe signatures.
Reuse WsBrowserBackend for protocol semantics; do not re-implement callback handling inside the service.
Keep ServiceBrowserWsClient narrow: connection lifecycle + raw websocket I/O only.
Preserve exact outward error semantics from the spec:
- connect failure => PipeError::Protocol("browser websocket connect failed: ...")
- non-zero status => PipeError::Protocol("browser returned non-zero status: ...")
- callback timeout => PipeError::Timeout
- close/reset => PipeError::PipeClosed
Do not claim success until the mixed ws+pipe verification commands have been run fresh.
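The outward error semantics listed above fit in a single mapping function, which can be handy as a checklist while reviewing. `BrowserFailure` and `outward_error` are hypothetical names for this sketch, and `PipeError` is a local copy of the three variants named in the spec.

```rust
// Local stand-in carrying only the variants named in this plan.
#[derive(Debug, PartialEq)]
pub enum PipeError {
    Protocol(String),
    Timeout,
    PipeClosed,
}

/// Hypothetical internal classification of ws-path failures.
pub enum BrowserFailure {
    Connect(String),
    NonZeroStatus(i64),
    CallbackTimeout,
    Closed,
}

/// One arm per rule in the list above.
pub fn outward_error(failure: BrowserFailure) -> PipeError {
    match failure {
        BrowserFailure::Connect(detail) => {
            PipeError::Protocol(format!("browser websocket connect failed: {}", detail))
        }
        BrowserFailure::NonZeroStatus(code) => {
            PipeError::Protocol(format!("browser returned non-zero status: {}", code))
        }
        BrowserFailure::CallbackTimeout => PipeError::Timeout,
        BrowserFailure::Closed => PipeError::PipeClosed,
    }
}
```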
