# Zhihu Release WS Function-Callback Migration Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Move only the Zhihu direct-execution path to the new Release browser websocket interaction style while keeping the existing pipe protocol and non-Zhihu submit behavior unchanged.

**Architecture:** Keep `ClientMessage` / `ServiceMessage`, `run_submit_task_with_browser_backend(...)`, and the high-level Zhihu workflow steps unchanged. First prove the exact Release browser interaction contract with transcript-backed probes. Then implement the smallest Zhihu-scoped backend path that follows that proven contract. Do not globally rewire the submit path unless the probe evidence proves there is no narrower safe seam.

**Tech Stack:** Rust, tungstenite, existing sgclaw service/client pipe protocol, `docs/_tmp_sgbrowser_ws_api_doc.txt`, Release browser websocket at `ws://127.0.0.1:12345`, current Zhihu direct-execution workflow.
---

## Context

The user has now made the target behavior explicit:

- the browser has changed and the working reference behavior is the user-provided HTML page that connects to `ws://127.0.0.1:12345`
- that page sends a bootstrap registration frame: `{"type":"register","role":"web"}`
- browser requests are still JSON arrays such as `[window.location.href, "sgBrowserSetTheme", "1"]` and `[window.location.href, "sgBrowerserGetUrls", "showUrls"]`
- callback-bearing browser behavior is now centered on page-defined JS callback functions like `showUrls`, not on Rust directly reading a websocket callback frame as the final business result
- the existing sgclaw pipe protocol must remain unchanged

The current sgclaw drift that must be corrected is visible in:

- `src/browser/ws_protocol.rs`
  - `Action::Navigate` currently emits `sgHideBrowserCallAfterLoaded` with an inline `callBackJsToCpp(...)` string
- `src/browser/ws_backend.rs`
  - Rust currently waits for a browser websocket callback frame and treats that as the action result
- `tests/service_ws_session_test.rs:498-605`
- `tests/service_task_flow_test.rs:499-635`
  - existing **generic submit-flow** regressions still lock in the old direct raw-websocket callback-frame assumption
  - these are useful as non-regression guardrails, but they are not themselves Zhihu-specific regressions

Zhihu-specific verification must therefore be added explicitly instead of assuming those Baidu-path tests already cover Zhihu.

The new browser style proves these facts and only these facts so far:

1. sgclaw must handle a register-first websocket handshake
2. browser requests are still `[requesturl, action, ...args]`
3. some browser capabilities now return through page-defined callback functions like `showUrls`
4. the current direct raw-websocket callback expectation in Zhihu path is no longer a safe assumption

The production seam is **not** pre-decided here.
Task 1 must determine whether Zhihu can be integrated by:

- a direct Zhihu-scoped backend with no helper page, or
- a helper page / relay design because named page callbacks are the only reliable result path

Until Task 1 evidence is captured, both remain hypotheses.

## Evidence to preserve in the implementation

### Browser websocket API doc

From `docs/_tmp_sgbrowser_ws_api_doc.txt`:

- `ws://localhost:12345` is the browser websocket endpoint
- request frames are array payloads with `requesturl`
- `sgBrowerserGetUrls(callback)` uses a callback **function name**: `[requesturl,"sgBrowerserGetUrls", callback]`
- `sgBrowserCallAfterLoaded(targetUrl, callback)` and `sgHideBrowserCallAfterLoaded(targetUrl, callback)` use callback strings with parentheses
- `callBackJsToCpp(param)` uses `sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt`
- `sgBrowserRegJsFun(targeturl, funContent)` and `sgBrowserExcuteJsFun(targeturl, funName)` exist and may be useful when the helper page needs durable callback helpers

### Current working HTML pattern from the user

The now-working reference interaction is:

```html
const socket = new WebSocket('ws://127.0.0.1:12345');
socket.onopen = () => {
  socket.send(JSON.stringify({type: 'register', role: 'web'}));
};
socket.send(JSON.stringify([window.location.href,"sgBrowerserGetUrls","showUrls"]));
function showUrls(urls) {
  // browser invokes this page-defined callback
}
```

That is the browser behavior sgclaw now needs to follow.
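To make the frame shapes listed above concrete, here is a minimal std-only sketch of the register frame, the `[requesturl, action, ...args]` request frame, and the `@_@`-separated `callBackJsToCpp` payload. The function and type names (`register_frame`, `request_frame`, `CallbackPayload`, `parse_callback_payload`) are hypothetical and not taken from `ws_protocol.rs`. Keeping later `@_@` separators inside the response text is an assumption that the Task 1 transcript should confirm or correct.

```rust
/// Escape a string as a JSON string literal (std-only; covers quotes,
/// backslashes, and control characters).
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Bootstrap frame the reference page sends right after the socket opens.
fn register_frame() -> String {
    r#"{"type":"register","role":"web"}"#.to_string()
}

/// Array request frame: `[requesturl, action, ...args]`.
fn request_frame(request_url: &str, action: &str, args: &[&str]) -> String {
    let mut parts = vec![json_string(request_url), json_string(action)];
    parts.extend(args.iter().map(|a| json_string(a)));
    format!("[{}]", parts.join(","))
}

/// Fields of a `callBackJsToCpp(param)` payload (hypothetical struct).
#[derive(Debug, PartialEq)]
struct CallbackPayload<'a> {
    source_url: &'a str,
    target_url: &'a str,
    callback: &'a str,
    action_url: &'a str,
    response_txt: &'a str,
}

/// Split `sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt`.
/// ASSUMPTION: `splitn(5, ..)` leaves any later `@_@` inside the response text.
fn parse_callback_payload(param: &str) -> Option<CallbackPayload<'_>> {
    let mut it = param.splitn(5, "@_@");
    Some(CallbackPayload {
        source_url: it.next()?,
        target_url: it.next()?,
        callback: it.next()?,
        action_url: it.next()?,
        response_txt: it.next()?,
    })
}
```

Calling `request_frame("http://127.0.0.1/helper.html", "sgBrowerserGetUrls", &["showUrls"])` yields the same compact array that `JSON.stringify` produces in the reference page, which makes exact-string assertions in protocol tests straightforward.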
---

## Critical files

### Production files to modify

- `src/browser/ws_protocol.rs`
- `src/compat/workflow_executor.rs` (only if a narrow Zhihu-specific correction is required after backend swap)
- `src/service/server.rs` (only if the chosen Zhihu-scoped integration seam must be wired here)
- `src/service/mod.rs` (only if startup plumbing changes are truly required)
- `src/browser/mod.rs`

### New production files likely needed

- `src/browser/zhihu_release_backend.rs`
  - a Zhihu-scoped `BrowserBackend` adapter that follows the proven Release browser interaction style without changing non-Zhihu routes
- `src/service/browser_callback_host.rs` **only if the probe proves a service-controlled helper page is actually required**
  - service-local helper-page lifecycle and callback relay, if evidence shows the browser cannot be driven safely without it

### Existing files to preserve

- `src/agent/task_runner.rs`
- `src/service/protocol.rs`
- `src/compat/orchestration.rs`
- `src/compat/runtime.rs`
- `src/pipe/*`

### Existing direct-ws files to review explicitly

- `src/browser/ws_backend.rs`
- `tests/browser_ws_backend_test.rs`

These files currently encode the old direct raw-websocket callback expectation. The implementation must either:

- leave them untouched as legacy/direct-contract coverage with no Zhihu production callers, or
- update/remove the Zhihu-specific assumptions they currently lock in.

### Primary test files

- `tests/browser_ws_probe_test.rs`
- `tests/browser_ws_protocol_test.rs`
- `tests/service_ws_session_test.rs`
- `tests/service_task_flow_test.rs`
- `tests/task_runner_test.rs`
- `tests/browser_ws_backend_test.rs`

---

## File structure decisions

### `src/browser/zhihu_release_backend.rs`

Prefer a Zhihu-scoped backend first.
Responsibilities:

- keep the same `BrowserBackend` trait surface
- implement only the behavior needed by the current Zhihu direct-execution route
- translate `Action::Navigate`, `Action::GetText`, and `Action::Eval` into the proven Release-browser interaction style
- normalize results back into `CommandOutput`
- avoid affecting non-Zhihu callers

This is the preferred seam because the user asked to change the current Zhihu flow, not to redesign the whole submit pipeline.

### `src/service/browser_callback_host.rs` (conditional)

Create this file only if Task 1 probe evidence proves that sgclaw must host or control a page in order to receive named callback-function results.

If it is needed, the plan must keep the design minimal and specific:

- one concrete transport only (choose websocket or HTTP, not “websocket or HTTP”)
- explicit readiness handshake
- explicit request correlation by `request_id`
- explicit cleanup when the submit task ends

If Task 1 shows a simpler seam, do not create this file.

### `src/browser/ws_protocol.rs`

Do not let this file keep only the old direct-callback assumption. It should become the shared place for doc-native request builders such as:

- browser bootstrap frames proven by the transcript
- `sgBrowserCallAfterLoaded` / `sgHideBrowserCallAfterLoaded`
- `sgBrowserExcuteJsCodeByArea`
- optional `sgBrowserRegJsFun` / `sgBrowserExcuteJsFun`

But do **not** let `ws_protocol.rs` absorb service-host lifecycle logic.

### `src/browser/ws_backend.rs` and `tests/browser_ws_backend_test.rs`

Handle these explicitly in the implementation:

- if they still describe a valid direct browser contract, keep them as isolated legacy/direct-ws coverage only
- if their current navigate/callback assumptions conflict with the proven Release Zhihu path, update or narrow those tests so they no longer describe the active Zhihu integration path

Do not leave the old direct-callback assumptions ambiguously “reviewed”; the implementation must make their status explicit.
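If the conditional `browser_callback_host.rs` turns out to be needed, its "explicit request correlation by `request_id`" and "explicit cleanup" requirements could look roughly like the sketch below. `PendingRequests` and its methods are hypothetical names, and the sketch assumes a simple monotonically increasing numeric id; the real transport and id format are left open until Task 1 evidence exists.

```rust
use std::collections::HashMap;

/// Hypothetical bookkeeping for in-flight helper-page requests.
#[derive(Default)]
struct PendingRequests {
    next_id: u64,
    /// request_id -> action name that is still waiting for a callback
    waiting: HashMap<u64, String>,
}

impl PendingRequests {
    /// Allocate a fresh `request_id` and remember which action it belongs to.
    fn register(&mut self, action: &str) -> u64 {
        self.next_id += 1;
        self.waiting.insert(self.next_id, action.to_string());
        self.next_id
    }

    /// Match a callback to its request; unknown or duplicate ids are errors.
    fn resolve(&mut self, request_id: u64) -> Result<String, String> {
        self.waiting
            .remove(&request_id)
            .ok_or_else(|| format!("unexpected callback for request_id {}", request_id))
    }

    /// Cleanup when the submit task ends: return the ids that never completed.
    fn drain_unfinished(&mut self) -> Vec<u64> {
        let mut ids: Vec<u64> = self.waiting.drain().map(|(id, _)| id).collect();
        ids.sort_unstable();
        ids
    }
}
```

Treating a second callback for the same id as an error keeps correlation deterministic, and draining at task end gives the cleanup step something concrete to log or assert on.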
---

## Task 1: Capture the new Release browser contract in a reproducible probe transcript

**Files:**

- Review/modify: `src/browser/ws_probe.rs`
- Review/modify: `src/bin/sgbrowser_ws_probe.rs`
- Review/modify: `tests/browser_ws_probe_test.rs`
- Create: `docs/_tmp_release_ws_callback_host_transcript.md`

- [ ] **Step 1: Verify current probe coverage against the Release-browser questions**

Read the existing probe module and tests and check whether they already prove all of the following:

- a register-first websocket script can be expressed
- a later array action frame can be expressed in the same script
- per-step inbound frames/outcomes are preserved separately
- timeout/close remain distinguishable in the transcript

Required result:

- identify the exact existing tests that already prove these behaviors
- identify the smallest missing Release-specific coverage, if any

- [ ] **Step 2: Add only the missing regression coverage**

If current tests do **not** already prove the Release-browser bootstrap shape, add the narrowest failing regression in `tests/browser_ws_probe_test.rs`.

Preferred shape if coverage is missing:

```rust
#[test]
fn probe_supports_register_then_array_action_script() {
    // fake server expects:
    // 1. {"type":"register","role":"web"}
    // 2. ["http://127.0.0.1/helper.html","sgBrowerserGetUrls","showUrls"]
}
```

And, if still missing, add one regression proving per-step transcript separation for the register reply and later action reply.

If those behaviors are already covered, skip new test creation and record the exact test names to rely on.

- [ ] **Step 3: Run the relevant probe tests**

Run the narrowest exact tests that prove the Release bootstrap behavior, or the full file if multiple areas changed:

```bash
cargo test --test browser_ws_probe_test -- --nocapture
```

Expected: PASS.
- [ ] **Step 4: Make the probe binary ergonomic for the Release transcript if needed**

Only if the current CLI cannot conveniently express the real Release-browser script, make the smallest change needed in `src/bin/sgbrowser_ws_probe.rs` / `src/browser/ws_probe.rs` so it can capture:

- register frame behavior
- minimal `sgBrowserSetTheme`
- minimal `sgBrowerserGetUrls`
- exact inbound websocket text per step

Do not redesign the probe if it already supports this.

- [ ] **Step 5: Run the live probe against the Release browser and record the real bootstrap**

Use the probe binary against the real endpoint to capture at minimum:

- register frame behavior
- minimal `sgBrowserSetTheme`
- minimal `sgBrowerserGetUrls`
- whether replies come back as websocket text, page-function invocation only, or both

Save the exact transcript in `docs/_tmp_release_ws_callback_host_transcript.md`.

Required output in that temp doc:

- exact sent frames
- exact received websocket frames
- the observed rule for when named callback functions are invoked
- whether Option A or Option B is supported by evidence

- [ ] **Step 6: Commit the probe-only slice if code changed**

If probe code/tests changed:

```bash
git add src/browser/ws_probe.rs src/bin/sgbrowser_ws_probe.rs tests/browser_ws_probe_test.rs docs/_tmp_release_ws_callback_host_transcript.md
git commit -m "test: capture release browser ws bootstrap contract"
```

If only the transcript doc changed, stage only that file and use a docs/test-appropriate commit message.
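The transcript requirements in Task 1 (per-step separation, exact sent and received frames, timeout distinguishable from close) can be illustrated with a small sketch. The type and function names here (`StepOutcome`, `ProbeStep`, `render_transcript`) are hypothetical and are not the existing `ws_probe.rs` API; the markdown layout is only one possible shape for the temp doc.

```rust
/// What happened after one outbound frame (hypothetical type names).
#[derive(Debug, PartialEq)]
enum StepOutcome {
    /// Exact inbound websocket text frames, in arrival order.
    Received(Vec<String>),
    /// No frame arrived before the step deadline.
    Timeout,
    /// The peer closed the socket during this step.
    Closed,
}

struct ProbeStep {
    sent: String,
    outcome: StepOutcome,
}

/// Render steps as a markdown transcript, one section per step, so that the
/// register reply and later action replies never share a section.
fn render_transcript(steps: &[ProbeStep]) -> String {
    let mut out = String::new();
    for (i, step) in steps.iter().enumerate() {
        out.push_str(&format!("### Step {}\n\nsent: `{}`\n\n", i + 1, step.sent));
        match &step.outcome {
            StepOutcome::Received(frames) => {
                for frame in frames {
                    out.push_str(&format!("received: `{}`\n", frame));
                }
            }
            StepOutcome::Timeout => out.push_str("outcome: timeout\n"),
            StepOutcome::Closed => out.push_str("outcome: closed\n"),
        }
        out.push('\n');
    }
    out
}
```

Modelling timeout and close as separate enum variants, rather than as an empty frame list, is what keeps the two cases distinguishable when the transcript is read back later for the Option A / Option B decision.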
---

## Task 2: Choose the narrowest Zhihu-only production seam from the probe evidence

**Files:**

- Modify: `src/service/server.rs` (only if required)
- Modify: `src/browser/mod.rs`
- Modify: `src/compat/workflow_executor.rs` (only if required)
- Create: `src/browser/zhihu_release_backend.rs`
- Create: `src/service/browser_callback_host.rs` **only if required**
- Test: `tests/service_ws_session_test.rs`
- Test: `tests/service_task_flow_test.rs`

- [ ] **Step 1: Write down the seam decision in the plan notes before coding**

Based on the transcript from Task 1, record which one of these is supported by evidence:

- Option A: a Zhihu-scoped backend can talk to the Release browser directly with no service-hosted helper page
- Option B: a Zhihu-scoped backend needs a service-controlled helper page because named page callbacks are the only reliable way to get business results

Do not proceed until one option is chosen explicitly from evidence.

- [ ] **Step 2: Add a failing service/task-flow regression that proves only the Zhihu path changes**

Update or add focused tests so that:

- Zhihu submit flow uses the new Release-browser interaction seam
- non-Zhihu behavior is unchanged
- pipe messages remain unchanged

Required assertions:

- the new path is activated only for Zhihu route detection
- `ClientMessage` / `ServiceMessage` stay identical
- existing non-Zhihu submit behavior is not accidentally rerouted

- [ ] **Step 3: Run the new focused regression and confirm failure first**

Run the narrowest exact test names you added in:

```bash
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
```

Expected: FAIL because the Zhihu-specific seam does not exist yet.
- [ ] **Step 4: Implement the chosen seam with the smallest blast radius**

If Option A won:

- add `src/browser/zhihu_release_backend.rs`
- wire it only where the Zhihu direct-execution route is selected
- leave global submit-path wiring alone

If Option B won:

- add `src/service/browser_callback_host.rs` with one specific transport and one explicit readiness/correlation model
- add `src/browser/zhihu_release_backend.rs` to talk to that helper path
- wire it only for the Zhihu route

In both cases:

- do not change non-Zhihu callers
- do not redesign `run_submit_task_with_browser_backend(...)`
- do not change the pipe protocol

- [ ] **Step 5: Make the status of old direct-ws code explicit**

Update `src/browser/ws_backend.rs` / `tests/browser_ws_backend_test.rs` only as needed so they no longer ambiguously describe the active Zhihu path.

Allowed outcomes:

- keep them untouched as legacy/direct-ws coverage with no Zhihu production caller
- narrow/update the tests so they no longer claim the active Zhihu integration path

Not allowed:

- leaving the plan and code in a state where both old and new paths appear to be the active Zhihu contract

- [ ] **Step 6: Run focused integration tests**

Run:

```bash
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```

Expected: PASS.

- [ ] **Step 7: Commit the seam-selection slice**

Adjust staged files to match the option actually implemented, for example:

```bash
git add src/browser/zhihu_release_backend.rs src/browser/mod.rs src/service/server.rs src/service/browser_callback_host.rs tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/browser_ws_backend_test.rs
git commit -m "feat: route zhihu flow through release browser ws contract"
```

Only stage files that were truly changed.
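The "wire it only for the Zhihu route" constraint from Task 2 amounts to a single narrow selection point. The sketch below assumes, purely for illustration, that route detection can be expressed as a host check on a target URL; the existing sgclaw route detection may work differently (for example from the task text), and `BackendKind`, `host_of`, and `select_backend` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum BackendKind {
    /// Existing direct-ws backend; every non-Zhihu caller stays here.
    LegacyWs,
    /// New Zhihu-scoped Release-browser backend.
    ZhihuRelease,
}

/// Extract the host from an http(s) URL without external crates.
fn host_of(url: &str) -> Option<&str> {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    let host = authority.rsplit('@').next()?; // drop any userinfo
    let host = host.split(':').next()?; // drop any port
    if host.is_empty() {
        None
    } else {
        Some(host)
    }
}

/// ASSUMPTION: a Zhihu route is any URL whose host is `zhihu.com` or one of
/// its subdomains; anything else keeps the legacy path.
fn select_backend(target_url: &str) -> BackendKind {
    match host_of(target_url) {
        Some(h) if h == "zhihu.com" || h.ends_with(".zhihu.com") => BackendKind::ZhihuRelease,
        _ => BackendKind::LegacyWs,
    }
}
```

Defaulting every unrecognized input to the legacy variant is the design choice that matters here: a detection bug then degrades to the old behavior instead of silently rerouting a non-Zhihu submit.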
---

## Task 3: Implement Zhihu action mapping on the chosen Release-browser seam

**Files:**

- Modify: `src/browser/ws_protocol.rs`
- Modify: `src/browser/zhihu_release_backend.rs`
- Test: `tests/browser_ws_protocol_test.rs`
- Create: `tests/browser_zhihu_release_backend_test.rs`

- [ ] **Step 1: Write the first failing backend test for Zhihu navigate mapping**

Create `tests/browser_zhihu_release_backend_test.rs` with a fake transport/relay and assert that `Action::Navigate` for the Zhihu path becomes the exact browser request shape proven by Task 1.

Start with this shape:

```rust
#[test]
fn zhihu_release_backend_maps_navigate_to_proven_release_frame() {
    // invoke Action::Navigate
    // assert exact outbound frame/opcode chosen from transcript evidence
}
```

Required assertions:

- the call site still uses `BrowserBackend::invoke(...)`
- the exact outbound frame matches the recorded Release-browser evidence
- request correlation stays deterministic

- [ ] **Step 2: Run the single new backend test and verify it fails**

Run:

```bash
cargo test --test browser_zhihu_release_backend_test zhihu_release_backend_maps_navigate_to_proven_release_frame -- --nocapture
```

Expected: FAIL because the backend does not exist yet.
- [ ] **Step 3: Implement minimal `Navigate` support**

In `src/browser/zhihu_release_backend.rs`:

- implement `BrowserBackend`
- support `Action::Navigate` first
- use `ws_protocol.rs` helpers for exact browser-frame construction
- do not hardcode speculative opcodes; follow the transcript from Task 1

- [ ] **Step 4: Add failing tests for `GetText` and `Eval`**

Add tests proving:

- `Action::GetText` returns `CommandOutput.data == {"text": "..."}`
- `Action::Eval` returns `CommandOutput.data == {"text": "..."}`
- callback or relay failures become `PipeError::Protocol(...)`

- [ ] **Step 5: Implement `GetText` and `Eval` on the chosen seam**

Use the smallest proven mechanism:

- if the transcript proves page-defined callback functions are required, route through them
- if `callBackJsToCpp(...)` to a page context is still part of the proven path, use it deliberately
- if `sgBrowserRegJsFun` / `sgBrowserExcuteJsFun` becomes necessary, add it only with test coverage and only for the Zhihu path

- [ ] **Step 6: Run focused backend/protocol tests**

Run:

```bash
cargo test --test browser_zhihu_release_backend_test -- --nocapture
cargo test --test browser_ws_protocol_test -- --nocapture
```

Expected: PASS.
- [ ] **Step 7: Commit the Zhihu backend slice**

```bash
git add src/browser/ws_protocol.rs src/browser/zhihu_release_backend.rs src/browser/mod.rs tests/browser_ws_protocol_test.rs tests/browser_zhihu_release_backend_test.rs
git commit -m "feat: add zhihu release ws backend"
```

---

## Task 4: Keep the Zhihu workflow logic stable and patch only proven mismatches

**Files:**

- Review: `src/compat/workflow_executor.rs`
- Test: `tests/service_task_flow_test.rs`
- Test: `tests/compat_runtime_test.rs` (only if a focused direct-execution regression is needed)

- [ ] **Step 1: Write a failing Zhihu-specific regression only if the chosen seam changes route assumptions**

If the new Zhihu backend changes request-url or target-url handling enough to break hotlist flow, add one focused failing regression for that exact behavior.

Candidate assertions:

- hotlist navigate still logs `navigate https://www.zhihu.com/hot`
- follow-up `GetText body` still targets the Zhihu page, not any helper page
- extractor `Eval` still runs against Zhihu, not any helper page

- [ ] **Step 2: Keep the current high-level Zhihu action sequence unless a test proves otherwise**

`src/compat/workflow_executor.rs` currently does the right high-level work:

- navigate to Zhihu hotlist
- poll body text until ready
- run the extractor script

Prefer to keep this file unchanged. Only patch it if the new backend needs a narrow explicit `target_url` fix or similar evidence-backed adjustment.

- [ ] **Step 3: Run the smallest Zhihu-focused verification sweep**

Run:

```bash
cargo test --test service_task_flow_test -- --nocapture
cargo test --test compat_runtime_test zhihu -- --nocapture
```

If the `compat_runtime_test zhihu` filter is too broad or unstable, run the exact focused Zhihu cases that cover hotlist extraction.
- [ ] **Step 4: Commit only if a Zhihu-specific code change was actually required**

```bash
git add src/compat/workflow_executor.rs tests/service_task_flow_test.rs tests/compat_runtime_test.rs
git commit -m "fix: keep zhihu workflow aligned with release ws backend"
```

Skip this commit if no production change in `workflow_executor.rs` was needed.

---

## Task 5: Prove that pipe behavior and non-Zhihu behavior stayed unchanged

**Files:**

- Test: `tests/service_ws_session_test.rs`
- Test: `tests/service_task_flow_test.rs`
- Test: `tests/task_runner_test.rs`

- [ ] **Step 1: Add or update one regression that proves pipe messages are unchanged**

Use the smallest existing test seam to assert that `ClientMessage` / `ServiceMessage` payloads remain unchanged while the Zhihu route uses the new browser integration path internally.

- [ ] **Step 2: Add or update one regression that proves non-Zhihu behavior is unchanged**

Use a non-Zhihu submit or service-session case and assert it does not take the new Zhihu-specific backend path.

- [ ] **Step 3: Preserve current runtime regression guards**

The end-to-end tests must continue asserting that output does **not** contain:

- `invalid hmac seed: session key must not be empty`
- `Cannot drop a runtime in a context where blocking is not allowed`

- [ ] **Step 4: Run the final focused verification sweep**

Run:

```bash
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```

Expected: PASS.

- [ ] **Step 5: Commit the verification sweep**

```bash
git add tests/service_ws_session_test.rs tests/service_task_flow_test.rs tests/task_runner_test.rs tests/browser_ws_backend_test.rs
git commit -m "test: constrain zhihu release ws migration scope"
```

Only stage files that were truly changed.
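The runtime regression guards in Task 5 Step 3 are simple substring checks, and a shared helper keeps the two forbidden strings in one place. This is only a sketch: `FORBIDDEN_MARKERS` and `forbidden_markers_in` are hypothetical names, and the existing end-to-end tests may already express these assertions inline.

```rust
/// Error strings the end-to-end output must never contain (copied from the plan).
const FORBIDDEN_MARKERS: [&str; 2] = [
    "invalid hmac seed: session key must not be empty",
    "Cannot drop a runtime in a context where blocking is not allowed",
];

/// Return every forbidden marker found in `output`, in declaration order.
fn forbidden_markers_in(output: &str) -> Vec<&'static str> {
    FORBIDDEN_MARKERS
        .iter()
        .copied()
        .filter(|marker| output.contains(*marker))
        .collect()
}
```

A test would then assert that `forbidden_markers_in(&captured_output)` is empty, where `captured_output` stands for whatever combined output the test harness already collects; on failure the returned list names exactly which guard tripped.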
---

## Out of scope

Do **not** do these in this slice:

- change the pipe protocol
- change `ClientMessage` / `ServiceMessage`
- redesign `run_submit_task_with_browser_backend(...)`
- reintroduce any browser bridge surface
- keep adding speculative direct-raw-websocket callback patches to `ws_backend.rs`
- redesign non-Zhihu workflows unless the new backend abstraction forces a shared fix
- create a long-lived external dependency or third-party server just to host the helper page

---

## Verification checklist

Run at minimum:

```bash
cargo test --test browser_ws_probe_test -- --nocapture
cargo test --test browser_zhihu_release_backend_test -- --nocapture
cargo test --test browser_ws_protocol_test -- --nocapture
cargo test --test service_ws_session_test -- --nocapture
cargo test --test service_task_flow_test -- --nocapture
cargo test --test task_runner_test -- --nocapture
```

If Task 2 chose the helper-page / relay design, also run the helper-page-specific backend tests you added for that path.

Manual verification after code changes:

1. start the real Release browser/runtime that exposes `ws://127.0.0.1:12345`
2. start `sg_claw` with real config
3. start `sg_claw_client`
4. submit:
   - `打开知乎热榜,获取前10条数据,并导出 Excel` (in English: "Open the Zhihu hot list, fetch the top 10 entries, and export them to Excel")
5. confirm the Zhihu path uses the exact Release-browser interaction seam proven by Task 1
6. if Task 2 chose Option B, confirm the helper page / relay path is used only for the Zhihu integration seam
7. confirm non-Zhihu behavior is unchanged
8. confirm the task completes without:
   - `timeout while waiting for browser message`
   - `invalid browser status frame: Welcome! You are client #1`
   - `invalid hmac seed: session key must not be empty`
   - `Cannot drop a runtime in a context where blocking is not allowed`

---

## Expected outcome

After this slice:

- sgclaw still exposes the same pipe/service contract
- Zhihu hotlist execution uses the Release-browser websocket contract proven by Task 1
- non-Zhihu behavior remains unchanged
- old direct-ws Zhihu assumptions are no longer ambiguous in production/tests
- if Option A won, Zhihu uses a direct Release-browser backend
- if Option B won, Zhihu uses the minimal helper-page / relay seam justified by the probe evidence