feat: align browser callback runtime and export flows
Consolidate the browser task runtime around the callback path, add safer artifact opening for Zhihu exports, and cover the new service/browser flows with focused tests and supporting docs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -0,0 +1,182 @@
# Zhihu WS Submit Realignment Design

## Background

The current Zhihu submit path drifted away from the documented browser websocket contract.

The authoritative contract for this repository is `docs/_tmp_sgbrowser_ws_api_doc.txt`.

For this slice, the spec anchors to these documented invariants only:

- connect to `ws://127.0.0.1:12345`
- send `{"type":"register","role":"web"}`
- send browser actions as JSON arrays `[requesturl, action, ...args]`
- let browser results come back through documented callback semantics such as `callBackJsToCpp(...)`
- keep the current page URL as the request owner instead of inventing an external helper page
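
The two frame shapes named in the invariants above can be sketched as plain string builders. This is a minimal illustration, not the repository's encoder: the function names `json_string`, `register_frame`, and `action_frame` are hypothetical, and the sketch assumes every action argument is sent as a JSON string. The contract file remains the authority on real argument types.

```rust
/// Escape a string so it can be embedded as a JSON string literal.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// The registration frame quoted in the invariants.
fn register_frame() -> &'static str {
    r#"{"type":"register","role":"web"}"#
}

/// Build an action frame of the form `[requesturl, action, ...args]`.
/// Assumption: every argument is encoded as a JSON string.
fn action_frame(request_url: &str, action: &str, args: &[&str]) -> String {
    let mut parts = vec![json_string(request_url), json_string(action)];
    parts.extend(args.iter().map(|a| json_string(a)));
    format!("[{}]", parts.join(","))
}
```

Calling `action_frame(page_url, action, &[arg])` with the current page URL as the first element keeps the page as the request owner, which is the last invariant in the list.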

The current production path does not follow that shape for Zhihu routes.

Instead, the submit path selects `BrowserCallbackBackend`, which starts `LiveBrowserCallbackHost` and attempts to bootstrap a local helper page at `/sgclaw/browser-helper.html`. That helper-page bootstrap is not part of the user's confirmed production model, and live evidence already shows it is the wrong assumption for the Release browser.

## Problem Statement

Zhihu submit currently fails before real work begins because the service path depends on a helper-page callback host bootstrap that the Release browser does not use.

That drift shows up in three ways:

1. Zhihu submit routes select the callback-host backend instead of the direct websocket backend.
2. The mainline request URL becomes the local helper page URL instead of the real browser page URL.
3. The submit path waits for helper-page readiness rather than proceeding through the documented websocket callback model.

This causes the observable failure:

- `timeout while waiting for browser message`
- no real Zhihu page open/action in the browser

## Goal

Realign the Zhihu submit path to the documented websocket callback model without changing the existing pipe/service contract.

Concretely, the target behavior is:

- Zhihu submit routes use the websocket browser backend directly
- browser messages keep the real page URL as `requesturl`
- browser actions continue to use documented websocket opcodes
- callback-bearing results continue to use the documented callback payload model
- the browser no longer depends on opening a local helper page before Zhihu work starts

## Non-goals

This slice does not include:

- changing `ClientMessage` or `ServiceMessage`
- changing `run_submit_task_with_browser_backend(...)`
- rewriting the Zhihu workflow itself
- adding a new browser bridge abstraction
- redesigning the pipe path
- deleting callback-host code that is outside the Zhihu submit mainline
- speculative protocol expansion beyond the documented websocket contract

## Chosen Approach

Choose **Option A**: withdraw Zhihu submit from the helper-page callback-host path and return it to the documented websocket callback model.

Rejected alternatives:

- Keep callback host but remove helper bootstrap: still preserves the wrong abstraction in the mainline.
- Build a new orchestration layer: exceeds the requested scope.

## Mainline Architecture After Realignment

```text
sg_claw_client
  -> sg_claw service / runtime submit path
    -> existing BrowserBackend seam
      -> WsBrowserBackend
        -> ws://127.0.0.1:12345
          -> documented browser opcodes and callback semantics
```

For Zhihu submit routes, the callback-host helper page is no longer part of the mainline execution chain.

## Required Production Changes

### 1. Route selection

Update submit-route backend selection so these routes no longer instantiate `BrowserCallbackBackend`:

- `WorkflowRoute::ZhihuHotlistExportXlsx`
- `WorkflowRoute::ZhihuHotlistScreen`
- `WorkflowRoute::ZhihuArticleEntry`
- `WorkflowRoute::ZhihuArticleDraft`
- `WorkflowRoute::ZhihuArticlePublish`

The change applies in both:

- service submit path in `src/service/server.rs`
- direct runtime submit path in `src/agent/mod.rs`

In the direct runtime path, backend selection depends on whether a browser websocket URL is configured, and the existing fallback stays unchanged:

- if a real browser websocket URL is configured, use `WsBrowserBackend` for the listed Zhihu routes
- if no browser websocket URL is configured, keep the existing pipe fallback instead of failing fast
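
The selection rule for the direct runtime path can be sketched as a small pure function. This is an illustration under stated assumptions, not the code in `src/agent/mod.rs`: `BackendChoice`, `is_zhihu_submit_route`, `select_backend`, and the `Other` variant are hypothetical names, and treating an empty or whitespace-only URL as "not configured" is an assumption of this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WorkflowRoute {
    ZhihuHotlistExportXlsx,
    ZhihuHotlistScreen,
    ZhihuArticleEntry,
    ZhihuArticleDraft,
    ZhihuArticlePublish,
    /// Hypothetical stand-in for every route this slice does not touch.
    Other,
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum BackendChoice {
    /// Use the direct websocket backend at the configured URL.
    Ws { url: String },
    /// Keep the existing pipe fallback.
    PipeFallback,
    /// Route is outside this slice; existing selection logic applies.
    Unchanged,
}

fn is_zhihu_submit_route(route: WorkflowRoute) -> bool {
    matches!(
        route,
        WorkflowRoute::ZhihuHotlistExportXlsx
            | WorkflowRoute::ZhihuHotlistScreen
            | WorkflowRoute::ZhihuArticleEntry
            | WorkflowRoute::ZhihuArticleDraft
            | WorkflowRoute::ZhihuArticlePublish
    )
}

fn select_backend(route: WorkflowRoute, browser_ws_url: Option<&str>) -> BackendChoice {
    if !is_zhihu_submit_route(route) {
        return BackendChoice::Unchanged;
    }
    // Assumption: a blank URL counts as "not configured".
    match browser_ws_url.map(str::trim).filter(|u| !u.is_empty()) {
        Some(url) => BackendChoice::Ws { url: url.to_string() },
        None => BackendChoice::PipeFallback,
    }
}
```

Keeping the rule in one function makes the "never the callback-host backend for these five routes" property easy to assert in a unit test, since no branch can return a callback-host choice.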

### 2. Request URL ownership

Keep `requesturl` aligned with the real browser page instead of the helper page.

Expected behavior:

- initial request URL comes from the existing submit-path request context
- after a successful navigate call, the websocket backend continues to update its request URL to the navigated target page
- later `getText` and `eval` calls run against the real Zhihu page URL

This preserves the documented page-owned websocket model.
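
The ownership rule above amounts to a small piece of state that changes only on a successful navigate. The sketch below is illustrative: `RequestUrlTracker` and its methods are hypothetical names, and how the real backend learns that a navigate succeeded is not modeled here.

```rust
/// Tracks the URL that should be sent as `requesturl` in the next frame.
struct RequestUrlTracker {
    current: String,
}

impl RequestUrlTracker {
    /// Start from the URL carried by the submit-path request context.
    fn new(initial_request_url: &str) -> Self {
        Self { current: initial_request_url.to_string() }
    }

    /// URL to use as `requesturl` for the next browser action.
    fn current(&self) -> &str {
        &self.current
    }

    /// Record the outcome of a navigate call. Only a successful navigate
    /// moves ownership to the target page; a failed one leaves it untouched.
    fn on_navigate_result(&mut self, target_url: &str, succeeded: bool) {
        if succeeded {
            self.current = target_url.to_string();
        }
    }
}
```

With this shape, a later `getText` or `eval` frame simply reads `tracker.current()`, so there is no code path through which a helper-page URL could become the request owner.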

### 3. Callback semantics

Keep callback-bearing actions on the existing websocket protocol path, using the documented callback payload shape.

Required invariants:

- action frames remain `[requesturl, action, ...args]`
- navigate uses the documented opcode `sgHideBrowserCallAfterLoaded`
- `getText` and `eval` continue to emit `callBackJsToCpp(...)` payloads in the documented `sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt` form
- callback decoding remains on the websocket path instead of moving through localhost helper-page HTTP endpoints
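
Decoding the five-field payload form quoted above can be sketched with a bounded split. This is a hedged illustration rather than the repository's decoder: `CallbackPayload` and `parse_callback_payload` are hypothetical names, the sketch handles only the payload string (not how it is wrapped in `callBackJsToCpp(...)`), and it assumes that only the final `responseTxt` field may itself contain the `@_@` separator.

```rust
#[derive(Debug, PartialEq, Eq)]
struct CallbackPayload<'a> {
    source_url: &'a str,
    target_url: &'a str,
    callback: &'a str,
    action_url: &'a str,
    response_txt: &'a str,
}

const SEPARATOR: &str = "@_@";

/// Split `sourceUrl@_@targetUrl@_@callback@_@actionUrl@_@responseTxt`.
/// Returns `None` when fewer than five fields are present.
fn parse_callback_payload(raw: &str) -> Option<CallbackPayload<'_>> {
    // `splitn(5, ..)` stops after four separators, so any later `@_@`
    // stays inside the response text instead of producing extra fields.
    let mut fields = raw.splitn(5, SEPARATOR);
    Some(CallbackPayload {
        source_url: fields.next()?,
        target_url: fields.next()?,
        callback: fields.next()?,
        action_url: fields.next()?,
        response_txt: fields.next()?,
    })
}
```

The bounded split is a deliberate choice: page text returned by `getText` is arbitrary, so an unbounded split could misread a response that happens to contain the separator.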

### 4. Callback-host removal from the Zhihu mainline

For this slice, callback-host code is removed from the Zhihu submit mainline, not redesigned.

Practical meaning:

- Zhihu submit must not start `LiveBrowserCallbackHost`
- Zhihu submit must not emit `sgBrowerserOpenPage` for `/sgclaw/browser-helper.html`
- Zhihu submit must not block on `/sgclaw/callback/ready`

Code outside the Zhihu submit mainline can remain unchanged unless tests require cleanup.

## Test Strategy

This slice follows TDD and replaces the stale helper-page assumptions with direct websocket submit-path assertions.

### Red tests to add or rewrite

1. Rewrite the current submit regression that asserts helper-page bootstrap.
   - old behavior under test: Zhihu submit bootstraps callback host
   - new behavior under test: Zhihu submit does **not** bootstrap callback host and does **not** emit helper-page frames

2. Add or update a focused submit-path regression proving request ownership stays on the real page.
   - after navigate, subsequent Zhihu browser actions must use the real target page URL rather than `/sgclaw/browser-helper.html`

3. Remove or rewrite any newly added red test whose only purpose was to preserve callback-host-without-helper behavior.
   - that test belongs to the rejected Option B path (keep callback host but remove helper bootstrap), not the chosen Option A path

### Green verification

After the minimal code change, run focused verification in this order:

1. `agent_runtime_test` coverage for the submit path
2. relevant Zhihu `compat_runtime_test` coverage
3. submit/service websocket regressions impacted by route selection
4. stronger real-browser validation after focused tests pass

## Scope Guardrails

The implementation plan for this spec must obey all of the following:

1. Do not modify the pipe contract.
2. Do not add a new browser abstraction.
3. Do not broaden the change beyond the Zhihu submit path and its directly affected websocket protocol tests.
4. Do not keep the helper-page path as a second competing Zhihu mainline.
5. If live validation still reveals a callback-payload mismatch, only adjust the websocket protocol encoding/decoding at the exact mismatch point.

## Acceptance Criteria

The slice is complete when all of the following are true:

1. Zhihu submit routes no longer select the helper-page callback-host backend.
2. No Zhihu submit regression expects or observes `/sgclaw/browser-helper.html` bootstrap.
3. The websocket backend sends Zhihu follow-up actions with the real page URL as `requesturl`.
4. Focused automated tests covering the changed submit path pass.
5. Real-browser validation no longer fails at callback-host readiness timeout, emits no helper-page bootstrap frames, and emits at least one real-page follow-up browser action after navigate.
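
The frame-level parts of criteria 2, 3, and 5 can be checked mechanically against a recorded list of outgoing frames. The sketch below is one possible shape for such a check, not an existing test helper: `Frame`, `TraceVerdict`, and `check_trace` are hypothetical names, and the sketch assumes frames have already been decoded into request URL, action, and string arguments.

```rust
#[derive(Debug, PartialEq, Eq)]
enum TraceVerdict {
    Pass,
    HelperBootstrapSeen,
    NoNavigate,
    NoRealPageFollowUp,
}

/// One decoded `[requesturl, action, ...args]` frame.
struct Frame<'a> {
    request_url: &'a str,
    action: &'a str,
    args: Vec<&'a str>,
}

const NAVIGATE_OPCODE: &str = "sgHideBrowserCallAfterLoaded";
const HELPER_PAGE_PATH: &str = "/sgclaw/browser-helper.html";

fn check_trace(frames: &[Frame<'_>], target_url: &str) -> TraceVerdict {
    // No frame may mention the helper page, as owner or as argument.
    let mentions_helper = |f: &Frame<'_>| {
        f.request_url.contains(HELPER_PAGE_PATH)
            || f.args.iter().any(|a| a.contains(HELPER_PAGE_PATH))
    };
    if frames.iter().any(mentions_helper) {
        return TraceVerdict::HelperBootstrapSeen;
    }
    // Find the first navigate frame.
    let nav_index = match frames.iter().position(|f| f.action == NAVIGATE_OPCODE) {
        Some(i) => i,
        None => return TraceVerdict::NoNavigate,
    };
    // At least one later non-navigate action must be owned by the real page.
    let has_follow_up = frames[nav_index + 1..]
        .iter()
        .any(|f| f.action != NAVIGATE_OPCODE && f.request_url == target_url);
    if has_follow_up {
        TraceVerdict::Pass
    } else {
        TraceVerdict::NoRealPageFollowUp
    }
}
```

A distinct verdict per failure mode is meant to make a failing real-browser run point directly at which criterion was missed. The readiness-timeout part of criterion 5 is not covered by this check and still needs to be observed in the live run.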