# Zhihu WS Submit Realignment Design

## Background

The current Zhihu submit path drifted away from the documented browser websocket contract.

The authoritative contract for this repository is `docs/_tmp_sgbrowser_ws_api_doc.txt`.

For this slice, the spec anchors to these documented invariants only:

- connect to `ws://127.0.0.1:12345`
- send `{"type":"register","role":"web"}`
- send browser actions as JSON arrays `[requesturl, action, ...args]`
- let browser results come back through documented callback semantics such as `callBackJsToCpp(...)`
- keep the current page URL as the request owner instead of inventing an external helper page

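
The wire shapes listed above can be sketched with the standard library alone. This is a minimal illustration, not the repository's encoder: `json_string`, `register_frame`, and `action_frame` are hypothetical names, and the sketch assumes every action argument is a string, which the contract document may not require.

```rust
/// Escape a string as a JSON string literal (quotes, backslashes, control characters).
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Registration message from the documented invariants.
fn register_frame() -> String {
    format!(
        "{{\"type\":{},\"role\":{}}}",
        json_string("register"),
        json_string("web")
    )
}

/// Action frame in the documented `[requesturl, action, ...args]` shape.
/// Assumption: every argument is a string.
fn action_frame(request_url: &str, action: &str, args: &[&str]) -> String {
    let mut parts = vec![json_string(request_url), json_string(action)];
    parts.extend(args.iter().map(|a| json_string(a)));
    format!("[{}]", parts.join(","))
}
```

Calling `register_frame()` yields the registration object verbatim, and `action_frame` always places the owning page URL first, so the request owner is visible in every frame.
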

The current production path does not follow that shape for Zhihu routes.

Instead, the submit path selects `BrowserCallbackBackend`, which starts `LiveBrowserCallbackHost` and attempts to bootstrap a local helper page at `/sgclaw/browser-helper.html`. That helper-page bootstrap is not part of the user's confirmed production model, and live evidence already shows it is the wrong assumption for the Release browser.

## Problem Statement

Zhihu submit currently fails before real work begins because the service path depends on a helper-page callback host bootstrap that the Release browser does not use.

That drift shows up in three ways:

1. Zhihu submit routes select the callback-host backend instead of the direct websocket backend.
2. The mainline request URL becomes the local helper page URL instead of the real browser page URL.
3. The submit path waits for helper-page readiness rather than proceeding through the documented websocket callback model.

This causes the observable failure:

- `timeout while waiting for browser message`
- no real Zhihu page open/action in the browser

## Goal

Realign the Zhihu submit path to the documented websocket callback model without changing the existing pipe/service contract.

Concretely, the target behavior is:

- Zhihu submit routes use the websocket browser backend directly
- browser messages keep the real page URL as `requesturl`
- browser actions continue to use documented websocket opcodes
- callback-bearing results continue to use the documented callback payload model
- the browser no longer depends on opening a local helper page before Zhihu work starts

## Non-goals

This slice does not include:

- changing `ClientMessage` or `ServiceMessage`
- changing `run_submit_task_with_browser_backend(...)`
- rewriting the Zhihu workflow itself
- adding a new browser bridge abstraction
- redesigning the pipe path
- deleting callback-host code that is outside the Zhihu submit mainline
- speculative protocol expansion beyond the documented websocket contract

## Chosen Approach

Choose **Option A**: withdraw Zhihu submit from the helper-page callback-host path and return it to the documented websocket callback model.

Rejected alternatives:

- **Option B**, keep the callback host but remove the helper bootstrap: this still preserves the wrong abstraction in the mainline.
- Build a new orchestration layer: this exceeds the requested scope.

## Mainline Architecture After Realignment

```text
sg_claw_client
  -> sg_claw service / runtime submit path
    -> existing BrowserBackend seam
      -> WsBrowserBackend
        -> ws://127.0.0.1:12345
          -> documented browser opcodes and callback semantics
```

For Zhihu submit routes, the callback-host helper page is no longer part of the mainline execution chain.

## Required Production Changes

### 1. Route selection

Update submit-route backend selection so these routes no longer instantiate `BrowserCallbackBackend`:

- `WorkflowRoute::ZhihuHotlistExportXlsx`
- `WorkflowRoute::ZhihuHotlistScreen`
- `WorkflowRoute::ZhihuArticleEntry`
- `WorkflowRoute::ZhihuArticleDraft`
- `WorkflowRoute::ZhihuArticlePublish`

The change applies in both:

- service submit path in `src/service/server.rs`
- direct runtime submit path in `src/agent/mod.rs`

On the direct runtime path, the backend depends on whether a browser websocket URL is configured, and the existing fallback stays unchanged:

- if a real browser websocket URL is configured, use `WsBrowserBackend` for the listed Zhihu routes
- if no browser websocket URL is configured, keep the existing pipe fallback instead of failing fast

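
The selection rule for the direct runtime path can be sketched as a pure function. The five Zhihu variants come from the list above; the `Other` variant, the `BackendChoice` enum, and `select_backend` are hypothetical stand-ins for illustration, and the real `WorkflowRoute` and backend types in `src/agent/mod.rs` will differ.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WorkflowRoute {
    ZhihuHotlistExportXlsx,
    ZhihuHotlistScreen,
    ZhihuArticleEntry,
    ZhihuArticleDraft,
    ZhihuArticlePublish,
    /// Hypothetical stand-in for every non-Zhihu route.
    Other,
}

/// Hypothetical outcome of backend selection, used only in this sketch.
#[derive(Debug, PartialEq, Eq)]
enum BackendChoice {
    /// Use `WsBrowserBackend` against the configured websocket URL.
    Ws(String),
    /// Keep the existing pipe fallback.
    PipeFallback,
    /// Route is outside this slice; selection is left as it is today.
    Unchanged,
}

fn is_zhihu_submit_route(route: WorkflowRoute) -> bool {
    matches!(
        route,
        WorkflowRoute::ZhihuHotlistExportXlsx
            | WorkflowRoute::ZhihuHotlistScreen
            | WorkflowRoute::ZhihuArticleEntry
            | WorkflowRoute::ZhihuArticleDraft
            | WorkflowRoute::ZhihuArticlePublish
    )
}

/// Zhihu routes never pick the callback-host backend: they use the
/// websocket backend when a URL is configured and the pipe fallback otherwise.
fn select_backend(route: WorkflowRoute, ws_url: Option<&str>) -> BackendChoice {
    if !is_zhihu_submit_route(route) {
        return BackendChoice::Unchanged;
    }
    match ws_url {
        Some(url) => BackendChoice::Ws(url.to_string()),
        None => BackendChoice::PipeFallback,
    }
}
```

Keeping the rule in one function makes it easy to apply the same decision in both submit paths and to assert on it directly in a regression test.
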
### 2. Request URL ownership

Keep `requesturl` aligned with the real browser page instead of the helper page.

Expected behavior:

- initial request URL comes from the existing submit-path request context
- after a successful navigate call, the websocket backend continues to update its request URL to the navigated target page
- later `getText` and `eval` calls run against the real Zhihu page URL

This preserves the documented page-owned websocket model.

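
The ownership rule above amounts to a small piece of state. The sketch below is an assumption about shape only: `RequestUrlState` and its methods are hypothetical names, not the fields of the actual websocket backend.

```rust
/// Tracks which page URL owns outgoing browser actions.
struct RequestUrlState {
    current: String,
}

impl RequestUrlState {
    /// Start from the URL supplied by the submit-path request context.
    fn new(initial: &str) -> Self {
        Self { current: initial.to_string() }
    }

    /// After a navigate call, adopt the target page only if the call succeeded.
    fn record_navigate(&mut self, target: &str, succeeded: bool) {
        if succeeded {
            self.current = target.to_string();
        }
    }

    /// URL to place in the `requesturl` slot of the next action frame.
    fn request_url(&self) -> &str {
        &self.current
    }
}
```

A failed navigate leaves the previous owner in place, so follow-up actions are never attributed to a page the browser did not reach.
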
### 3. Callback semantics

Keep callback-bearing actions on the existing websocket protocol path, using the documented callback payload shape.

Required invariants:

- action frames remain `[requesturl, action, ...args]`
- navigate uses the documented opcode `sgHideBrowserCallAfterLoaded`
- `getText` and `eval` continue to emit `callBackJsToCpp(...)` payloads in the documented `sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt` form
- callback decoding remains on the websocket path instead of moving through localhost helper-page HTTP endpoints

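
Decoding the `@_@`-delimited payload can be sketched as follows. This operates on the payload string only and assumes that `responseTxt` is the final field and may itself contain the delimiter; `CallbackPayload` and `parse_callback_payload` are hypothetical names, and the repository's decoder may handle edge cases differently.

```rust
/// Fields of the documented callback payload, borrowed from the raw string.
#[derive(Debug, PartialEq, Eq)]
struct CallbackPayload<'a> {
    source_url: &'a str,
    target_url: &'a str,
    callback: &'a str,
    action_url: &'a str,
    response_txt: &'a str,
}

const FIELD_SEPARATOR: &str = "@_@";

/// Split into exactly five fields; everything after the fourth separator
/// is kept intact as `response_txt`. Returns `None` if fields are missing.
fn parse_callback_payload(raw: &str) -> Option<CallbackPayload<'_>> {
    let mut fields = raw.splitn(5, FIELD_SEPARATOR);
    Some(CallbackPayload {
        source_url: fields.next()?,
        target_url: fields.next()?,
        callback: fields.next()?,
        action_url: fields.next()?,
        response_txt: fields.next()?,
    })
}
```

Using `splitn(5, ...)` rather than `split` keeps page text that happens to contain `@_@` from being truncated.
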
### 4. Callback-host removal from the Zhihu mainline

For this slice, callback-host code is removed from the Zhihu submit mainline, not redesigned.

Practical meaning:

- Zhihu submit must not start `LiveBrowserCallbackHost`
- Zhihu submit must not emit `sgBrowerserOpenPage` for `/sgclaw/browser-helper.html`
- Zhihu submit must not block on `/sgclaw/callback/ready`

Code outside the Zhihu submit mainline can remain unchanged unless tests require cleanup.

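
One coarse way to check the prohibitions above is to scan a log of outgoing frames for helper-page markers. The sketch assumes frames are available as raw strings; `helper_page_violations` is a hypothetical helper, and a substring match is only an approximation of real frame inspection.

```rust
const HELPER_PAGE_PATH: &str = "/sgclaw/browser-helper.html";
const CALLBACK_READY_PATH: &str = "/sgclaw/callback/ready";

/// Return the indices of frames that mention the helper page or the
/// callback-host readiness endpoint. An empty result means no violation.
fn helper_page_violations(frames: &[&str]) -> Vec<usize> {
    frames
        .iter()
        .enumerate()
        .filter(|(_, frame)| {
            frame.contains(HELPER_PAGE_PATH) || frame.contains(CALLBACK_READY_PATH)
        })
        .map(|(index, _)| index)
        .collect()
}
```

Returning indices rather than a boolean lets a failing test point at the exact offending frame.
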
## Test Strategy

This slice follows TDD and replaces the stale helper-page assumptions with direct websocket submit-path assertions.

### Red tests to add or rewrite

1. Rewrite the current submit regression that asserts helper-page bootstrap.
   - old behavior under test: Zhihu submit bootstraps callback host
   - new behavior under test: Zhihu submit does **not** bootstrap callback host and does **not** emit helper-page frames

2. Add or update a focused submit-path regression proving request ownership stays on the real page.
   - after navigate, subsequent Zhihu browser actions must use the real target page URL rather than `/sgclaw/browser-helper.html`

3. Remove or rewrite any newly added red test whose only purpose was to preserve callback-host-without-helper behavior.
   - that test belongs to the rejected Option B path, not the chosen Option A path

### Green verification

After the minimal code change, run focused verification in this order:

1. `agent_runtime_test` coverage for the submit path
2. relevant Zhihu `compat_runtime_test` coverage
3. submit/service websocket regressions impacted by route selection
4. stronger real-browser validation after focused tests pass

## Scope Guardrails

The implementation plan for this spec must obey all of the following:

1. Do not modify the pipe contract.
2. Do not add a new browser abstraction.
3. Do not broaden the change beyond the Zhihu submit path and its directly affected websocket protocol tests.
4. Do not keep the helper-page path as a second competing Zhihu mainline.
5. If live validation still reveals a callback-payload mismatch, only adjust the websocket protocol encoding/decoding at the exact mismatch point.

## Acceptance Criteria

The slice is complete when all of the following are true:

1. Zhihu submit routes no longer select the helper-page callback-host backend.
2. No Zhihu submit regression expects or observes `/sgclaw/browser-helper.html` bootstrap.
3. The websocket backend sends Zhihu follow-up actions with the real page URL as `requesturl`.
4. Focused automated tests covering the changed submit path pass.
5. Real-browser validation no longer fails at callback-host readiness timeout, emits no helper-page bootstrap frames, and emits at least one real-page follow-up browser action after navigate.
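
The ordering part of criterion 5 (a real-page action must follow navigate) can be expressed as a check over a recorded trace. The sketch assumes the trace is already reduced to request URL and opcode per frame; `SentFrame` and `has_real_page_follow_up` are hypothetical names, and only the navigate opcode is taken from this spec.

```rust
/// One outgoing action frame, reduced to the two fields this check needs.
struct SentFrame<'a> {
    request_url: &'a str,
    action: &'a str,
}

/// Documented navigate opcode from the callback-semantics section.
const NAVIGATE_OPCODE: &str = "sgHideBrowserCallAfterLoaded";

/// True when the first navigate frame is followed by at least one frame
/// whose request URL is the real target page.
fn has_real_page_follow_up(frames: &[SentFrame<'_>], real_page_url: &str) -> bool {
    match frames.iter().position(|f| f.action == NAVIGATE_OPCODE) {
        Some(nav) => frames[nav + 1..]
            .iter()
            .any(|f| f.request_url == real_page_url),
        None => false,
    }
}
```

A trace that contains only the navigate frame fails the check, which matches the requirement that real work must follow navigation.
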