Zhihu WS Submit Realignment Design
Background
The current Zhihu submit path drifted away from the documented browser websocket contract.
The authoritative contract for this repository is docs/_tmp_sgbrowser_ws_api_doc.txt.
For this slice, the spec anchors to these documented invariants only:
- connect to ws://127.0.0.1:12345
- send {"type":"register","role":"web"}
- send browser actions as JSON arrays [requesturl, action, ...args]
- let browser results come back through documented callback semantics such as callBackJsToCpp(...)
- keep the current page URL as the request owner instead of inventing an external helper page
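As a rough illustration of the frame shapes listed above, the following sketch builds the registration frame and an action frame as JSON text using only the standard library. The helper names (json_string, register_frame, action_frame) are hypothetical and do not come from the repository, and the sketch assumes every action argument is a string; the documented contract may allow other JSON values.

```rust
/// Encode `s` as a JSON string literal (hypothetical helper, minimal escaping).
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Registration frame sent once after connecting to the browser websocket.
fn register_frame() -> String {
    r#"{"type":"register","role":"web"}"#.to_string()
}

/// Action frame in the `[requesturl, action, ...args]` shape.
/// Assumption: all args are strings.
fn action_frame(requesturl: &str, action: &str, args: &[&str]) -> String {
    let mut parts = vec![json_string(requesturl), json_string(action)];
    parts.extend(args.iter().map(|a| json_string(a)));
    format!("[{}]", parts.join(","))
}
```

The point of the sketch is that requesturl is always the first array element, so whichever URL the backend holds as the request owner is visible in every frame it sends.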
The current production path does not follow that shape for Zhihu routes.
Instead, the submit path selects BrowserCallbackBackend, which starts LiveBrowserCallbackHost and attempts to bootstrap a local helper page at /sgclaw/browser-helper.html. That helper-page bootstrap is not part of the user's confirmed production model, and live evidence already shows it is the wrong assumption for the Release browser.
Problem Statement
Zhihu submit currently fails before real work begins because the service path depends on a helper-page callback host bootstrap that the Release browser does not use.
That drift shows up in three ways:
- Zhihu submit routes select the callback-host backend instead of the direct websocket backend.
- The mainline request URL becomes the local helper page URL instead of the real browser page URL.
- The submit path waits for helper-page readiness rather than proceeding through the documented websocket callback model.
This causes the observable failure:
- timeout while waiting for browser message
- no real Zhihu page open/action in the browser
Goal
Realign the Zhihu submit path to the documented websocket callback model without changing the existing pipe/service contract.
Concretely, the target behavior is:
- Zhihu submit routes use the websocket browser backend directly
- browser messages keep the real page URL as requesturl
- browser actions continue to use documented websocket opcodes
- callback-bearing results continue to use the documented callback payload model
- the browser no longer depends on opening a local helper page before Zhihu work starts
Non-goals
This slice does not include:
- changing ClientMessage or ServiceMessage
- changing run_submit_task_with_browser_backend(...)
- rewriting the Zhihu workflow itself
- adding a new browser bridge abstraction
- redesigning the pipe path
- deleting callback-host code that is outside the Zhihu submit mainline
- speculative protocol expansion beyond the documented websocket contract
Chosen Approach
Choose Option A: withdraw Zhihu submit from the helper-page callback-host path and return it to the documented websocket callback model.
Rejected alternatives:
- Keep callback host but remove helper bootstrap: still preserves the wrong abstraction in the mainline.
- Build a new orchestration layer: exceeds the requested scope.
Mainline Architecture After Realignment
sg_claw_client
-> sg_claw service / runtime submit path
-> existing BrowserBackend seam
-> WsBrowserBackend
-> ws://127.0.0.1:12345
-> documented browser opcodes and callback semantics
For Zhihu submit routes, the callback-host helper page is no longer part of the mainline execution chain.
Required Production Changes
1. Route selection
Update submit-route backend selection so these routes no longer instantiate BrowserCallbackBackend:
- WorkflowRoute::ZhihuHotlistExportXlsx
- WorkflowRoute::ZhihuHotlistScreen
- WorkflowRoute::ZhihuArticleEntry
- WorkflowRoute::ZhihuArticleDraft
- WorkflowRoute::ZhihuArticlePublish
The change applies in both:
- service submit path in src/service/server.rs
- direct runtime submit path in src/agent/mod.rs
Direct runtime fallback behavior stays unchanged when no browser websocket URL is configured:
- if a real browser websocket URL is configured, use WsBrowserBackend for the listed Zhihu routes
- if no browser websocket URL is configured, keep the existing pipe fallback instead of failing fast
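The selection rule for the direct runtime path can be sketched as a small pure function. This is only a model under stated assumptions: the WorkflowRoute variants come from the list above, while the Other variant, the BackendChoice enum, and the is_zhihu_submit_route and choose_backend functions are hypothetical names introduced for illustration. Treating a blank URL as "not configured" is likewise an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WorkflowRoute {
    ZhihuHotlistExportXlsx,
    ZhihuHotlistScreen,
    ZhihuArticleEntry,
    ZhihuArticleDraft,
    ZhihuArticlePublish,
    /// Hypothetical stand-in for every route outside this slice.
    Other,
}

/// Hypothetical outcome of backend selection; not a type from the repository.
#[derive(Debug, PartialEq, Eq)]
enum BackendChoice {
    /// Use the websocket backend against the configured browser URL.
    Ws { url: String },
    /// Keep the existing pipe fallback.
    PipeFallback,
    /// Route is outside this slice; existing selection logic applies.
    Unchanged,
}

fn is_zhihu_submit_route(route: WorkflowRoute) -> bool {
    matches!(
        route,
        WorkflowRoute::ZhihuHotlistExportXlsx
            | WorkflowRoute::ZhihuHotlistScreen
            | WorkflowRoute::ZhihuArticleEntry
            | WorkflowRoute::ZhihuArticleDraft
            | WorkflowRoute::ZhihuArticlePublish
    )
}

/// Model of the direct runtime rule: websocket when a URL is configured,
/// pipe fallback otherwise. Assumption: a blank URL counts as unconfigured.
fn choose_backend(route: WorkflowRoute, ws_url: Option<&str>) -> BackendChoice {
    if !is_zhihu_submit_route(route) {
        return BackendChoice::Unchanged;
    }
    match ws_url.map(str::trim).filter(|u| !u.is_empty()) {
        Some(url) => BackendChoice::Ws { url: url.to_string() },
        None => BackendChoice::PipeFallback,
    }
}
```

Note that the sketch has no branch that yields a callback-host backend for a Zhihu route, which is the property this section asks for.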
2. Request URL ownership
Keep requesturl aligned with the real browser page instead of the helper page.
Expected behavior:
- initial request URL comes from the existing submit-path request context
- after a successful navigate call, the websocket backend continues to update its request URL to the navigated target page
- later getText and eval calls run against the real Zhihu page URL
This preserves the documented page-owned websocket model.
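One way to picture the ownership rule is a tiny state holder that only changes its request URL when a navigate call succeeds. RequestUrlTracker and its methods are hypothetical names for this sketch; the real WsBrowserBackend may keep this state differently.

```rust
/// Hypothetical holder for the URL that owns outgoing browser requests.
struct RequestUrlTracker {
    current: String,
}

impl RequestUrlTracker {
    /// Start from the URL supplied by the submit-path request context.
    fn new(initial: &str) -> Self {
        Self { current: initial.to_string() }
    }

    /// URL to place in the `requesturl` slot of the next action frame.
    fn request_url(&self) -> &str {
        &self.current
    }

    /// Record the outcome of a navigate call. Only a successful navigate
    /// moves ownership to the target page; a failure keeps the old URL.
    fn on_navigate_result(&mut self, target: &str, succeeded: bool) {
        if succeeded {
            self.current = target.to_string();
        }
    }
}
```

With this shape, a later getText or eval call reads request_url() and therefore targets the navigated Zhihu page, and there is no code path that substitutes a helper-page URL.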
3. Callback semantics
Keep callback-bearing actions on the existing websocket protocol path, using the documented callback payload shape.
Required invariants:
- action frames remain [requesturl, action, ...args]
- navigate uses the documented opcode sgHideBrowserCallAfterLoaded
- getText and eval continue to emit callBackJsToCpp(...) payloads in the documented sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt form
- callback decoding remains on the websocket path instead of moving through localhost helper-page HTTP endpoints
4. Callback-host removal from the Zhihu mainline
For this slice, callback-host code is removed from the Zhihu submit mainline, not redesigned.
Practical meaning:
- Zhihu submit must not start LiveBrowserCallbackHost
- Zhihu submit must not emit sgBrowerserOpenPage for /sgclaw/browser-helper.html
- Zhihu submit must not block on /sgclaw/callback/ready
Code outside the Zhihu submit mainline can remain unchanged unless tests require cleanup.
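A regression can enforce the "no helper-page frames" rule by scanning the frames a test double recorded during a Zhihu submit. The sketch below assumes frames are captured as raw JSON text and that the helper path appears unescaped in them; HELPER_PAGE_PATH and helper_page_violations are hypothetical names, not repository code.

```rust
/// Path of the helper page that must not appear in Zhihu submit frames.
const HELPER_PAGE_PATH: &str = "/sgclaw/browser-helper.html";

/// Return every recorded frame that still references the helper page.
/// Assumption: frames are raw JSON strings and "/" is not escaped as "\/".
fn helper_page_violations<'a>(frames: &[&'a str]) -> Vec<&'a str> {
    frames
        .iter()
        .copied()
        .filter(|frame| frame.contains(HELPER_PAGE_PATH))
        .collect()
}
```

A test would then assert that the returned list is empty for the whole recorded session, which fails loudly and shows the offending frame if the helper-page path creeps back into the mainline.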
Test Strategy
This slice follows TDD and replaces the stale helper-page assumptions with direct websocket submit-path assertions.
Red tests to add or rewrite
- Rewrite the current submit regression that asserts helper-page bootstrap.
  - old behavior under test: Zhihu submit bootstraps callback host
  - new behavior under test: Zhihu submit does not bootstrap callback host and does not emit helper-page frames
- Add or update a focused submit-path regression proving request ownership stays on the real page.
  - after navigate, subsequent Zhihu browser actions must use the real target page URL rather than /sgclaw/browser-helper.html
- Remove or rewrite any newly added red test whose only purpose was to preserve callback-host-without-helper behavior.
  - that test belongs to the rejected Option B path (keep callback host but remove helper bootstrap), not the chosen Option A path
Green verification
After the minimal code change, run focused verification in this order:
- agent_runtime_test coverage for the submit path
- relevant Zhihu compat_runtime_test coverage
- submit/service websocket regressions impacted by route selection
- stronger real-browser validation after focused tests pass
Scope Guardrails
The implementation plan for this spec must obey all of the following:
- Do not modify the pipe contract.
- Do not add a new browser abstraction.
- Do not broaden the change beyond the Zhihu submit path and its directly affected websocket protocol tests.
- Do not keep the helper-page path as a second competing Zhihu mainline.
- If live validation still reveals a callback-payload mismatch, only adjust the websocket protocol encoding/decoding at the exact mismatch point.
Acceptance Criteria
The slice is complete when all of the following are true:
- Zhihu submit routes no longer select the helper-page callback-host backend.
- No Zhihu submit regression expects or observes /sgclaw/browser-helper.html bootstrap.
- The websocket backend sends Zhihu follow-up actions with the real page URL as requesturl.
- Focused automated tests covering the changed submit path pass.
- Real-browser validation no longer fails at callback-host readiness timeout, emits no helper-page bootstrap frames, and emits at least one real-page follow-up browser action after navigate.