Service Chat Web Console Design
Background
The current natural-language entrypoint is the terminal client in src/bin/sg_claw_client.rs.
That client already talks to the existing service websocket, sends ClientMessage, and prints
ServiceMessage responses.
The repository also contains a separate browser callback helper at
http://127.0.0.1:61058/sgclaw/browser-helper.html. That page is part of the browser backend
execution path and must remain untouched.
For this slice, the authoritative boundary is:
- the new page may talk to the existing service websocket only
- the page must not talk to the browser websocket directly
- the page must not reuse or replace browser-helper.html
- the page must not change the service protocol or browser execution logic
Problem Statement
Running cargo run --bin sg_claw_client and typing into stdin works, but it is inconvenient for
routine usage. The user wants a simple local HTML page with a websocket connection field, a
natural-language input box, and a send button.
The risk is scope drift: if the new page reaches into the browser-helper flow or changes backend logic, it could damage the working Zhihu/browser path.
Goal
Add a standalone local HTML console that connects to the existing service websocket and submits
natural-language tasks using the current submit_task message shape.
The page should be usable without changing sg_claw, sg_claw_client, browser-helper.html, or
any existing service/browser runtime behavior.
Non-goals
This slice does not include:
- serving the page from the Rust service
- changing ClientMessage or ServiceMessage
- changing src/service/server.rs
- changing src/browser/callback_host.rs
- changing src/browser/callback_backend.rs
- changing the helper-page bootstrap flow
- adding authentication, persistence, or multi-session orchestration
- replacing the terminal client
Chosen Approach
Choose Option A: add one standalone HTML file that opens in a normal browser and talks to the
existing service websocket at ws://127.0.0.1:42321 by default.
Why this option:
- it is the narrowest possible change
- it reuses the already-working service protocol
- it does not alter the browser-helper path
- it keeps all runtime ownership in the existing Rust service
Rejected alternatives:
- extend browser-helper.html into a chat UI: wrong boundary; that page belongs to browser callback orchestration, not user task entry
- add a new HTTP server inside sg_claw: unnecessary for the requested scope
- replace the terminal client binary: not required; both clients can coexist
File Placement
Create the page outside frontend/runtime-host/.
Chosen location:
frontend/service-console/sg_claw_service_console.html
Reason:
- frontend/runtime-host/ is reserved for SuperRPA runtime-host bundles
- the new page is a standalone local tool, not a Chromium-hosted bundle
- keeping it in its own directory makes the isolation explicit
Page Architecture
The page is a single self-contained HTML file with inline CSS and inline JavaScript. No build step and no frontend framework are required.
The page has three UI regions:
- Connection bar
  - websocket URL input
  - connect/disconnect button
  - current connection state label
- Message stream
  - appends service logs in arrival order
  - distinguishes connection info, task logs, errors, and final completion
  - keeps the current session visible until the page is refreshed
- Task composer
  - one textarea for natural-language input
  - one send button
  - send disabled while the websocket is disconnected
  - while a task is in flight, keep the composer enabled and let repeated submits surface the existing service-side busy response rather than adding a new frontend queue
Protocol Contract
The page must reuse the existing service protocol exactly.
Outbound message
When the user clicks send, the page sends:
{
  "type": "submit_task",
  "instruction": "<user input>",
  "conversation_id": "",
  "messages": [],
  "page_url": "",
  "page_title": ""
}
This matches the current terminal client shape in src/bin/sg_claw_client.rs.
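As a minimal sketch of how the inline script could build that payload: the helper names `buildSubmitTask` and `encodeSubmitTask` are hypothetical and not taken from the repository. Only `instruction` varies; the other fields are sent empty, as in the shape above.

```javascript
// Hypothetical helper: builds the outbound payload from the textarea value.
// Field names and empty defaults follow the submit_task shape in this design.
function buildSubmitTask(instruction) {
  return {
    type: "submit_task",
    instruction: instruction,
    conversation_id: "",
    messages: [],
    page_url: "",
    page_title: "",
  };
}

// Serializes the payload for WebSocket.send(). JSON.stringify takes care of
// escaping quotes and newlines that appear in the user's instruction.
function encodeSubmitTask(instruction) {
  return JSON.stringify(buildSubmitTask(instruction));
}
```

Building the object in one place keeps the payload easy to compare against the terminal client if the shape ever needs to be re-checked.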
Inbound messages
The page displays these existing ServiceMessage variants:
- status_changed -> render as a compact connection/runtime status row
- log_entry -> append as a chronological task log row
- task_complete -> append as the terminal result row for that submission
- busy -> append as a visible refusal/error row without automatic retry
No new message type is introduced.
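The mapping above can be sketched as a small classifier. This assumes inbound frames are JSON objects tagged with a `type` field, mirroring the outbound shape; the actual ServiceMessage wire format is not spelled out in this document, so treat that as an assumption. The row kind names and `classifyServiceMessage` are hypothetical.

```javascript
// Row kinds used by the message stream (hypothetical names).
const ROW_KIND_BY_TYPE = {
  status_changed: "status",
  log_entry: "log",
  task_complete: "result",
  busy: "refusal",
};

// Assumption: each frame is a JSON object carrying a `type` tag.
// All other fields are passed through untouched for the renderer.
function classifyServiceMessage(rawFrame) {
  let message;
  try {
    message = JSON.parse(rawFrame);
  } catch (err) {
    // Not JSON: surface the raw text instead of dropping it.
    return { kind: "unparsed", message: null, raw: rawFrame };
  }
  const type =
    message !== null && typeof message === "object" ? message.type : undefined;
  // Own-property lookup so tags like "toString" do not match inherited keys.
  const kind = Object.prototype.hasOwnProperty.call(ROW_KIND_BY_TYPE, type)
    ? ROW_KIND_BY_TYPE[type]
    : "unknown";
  return { kind, message, raw: rawFrame };
}
```

Showing unparsed or unknown frames as raw rows is a choice of this sketch; it keeps the page passive and adds no protocol handling of its own.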
Interaction Flow
- User opens the local HTML file with a normal browser, typically via file://.
- User connects to the service websocket.
- The page shows websocket connection status locally.
- User enters a natural-language instruction and clicks send.
- The page sends one submit_task payload over the service websocket.
- The service continues to execute tasks exactly as it already does.
- Incoming service messages are appended to the message stream.
- After task_complete, the websocket remains open so the user can send another task.
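The flow above can be sketched as one wiring function. It takes any WebSocket-like object, so the same code works with the browser's `WebSocket` and with a fake in a quick check. `attachConsole` and `appendRow` are hypothetical names, and the row labels are placeholders.

```javascript
// Wires a WebSocket-like object to a message-stream sink.
// `socket` needs `send` plus assignable onopen/onmessage/onerror/onclose
// handlers, which the browser WebSocket provides.
// `appendRow(kind, text)` is a hypothetical rendering callback.
function attachConsole(socket, appendRow) {
  const state = { connected: false };

  socket.onopen = () => {
    state.connected = true;
    appendRow("connection", "connected");
  };
  socket.onmessage = (event) => {
    // Frames are appended in arrival order; nothing closes the socket here,
    // so it stays open after task_complete.
    appendRow("service", String(event.data));
  };
  socket.onerror = () => {
    appendRow("error", "websocket error");
  };
  socket.onclose = () => {
    state.connected = false;
    appendRow("connection", "disconnected");
  };

  return {
    isConnected: () => state.connected,
    // Sends one submit_task frame; returns false when the socket is not open.
    submit(instruction) {
      if (!state.connected) return false;
      socket.send(JSON.stringify({
        type: "submit_task",
        instruction,
        conversation_id: "",
        messages: [],
        page_url: "",
        page_title: "",
      }));
      return true;
    },
  };
}
```

In the page this would be called as `attachConsole(new WebSocket(url), appendRow)`, with `url` read from the connection bar input.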
Error Handling
The page handles only UI-local failures:
- websocket connect failure -> show connection error and keep send disabled
- websocket disconnect mid-session -> mark disconnected and require reconnect
- empty instruction -> block send and show inline validation
- busy response -> show as a visible service-side refusal without retry logic
The page does not add retries, protocol fallbacks, or browser-runtime recovery logic.
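A minimal sketch of the send-time checks follows. `checkSend` and its reason codes are hypothetical, and treating a whitespace-only instruction as empty is an assumption of this sketch rather than something the design states.

```javascript
// Decides whether a send attempt may proceed. Kept pure so it can back both
// the send button's disabled state and the inline validation message.
function checkSend(connected, instruction) {
  if (!connected) {
    return { ok: false, reason: "disconnected" };
  }
  // Assumption: whitespace-only input counts as empty.
  if (instruction.trim() === "") {
    return { ok: false, reason: "empty_instruction" };
  }
  // An in-flight task is deliberately not checked here: a repeated submit is
  // sent as usual and the service answers with its existing busy response.
  return { ok: true, reason: null };
}
```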
Isolation From browser-helper.html
This is the critical constraint.
The new page must never:
- reference /sgclaw/browser-helper.html
- reference /sgclaw/callback/ready
- reference /sgclaw/callback/events
- reference /sgclaw/callback/commands/next
- reference /sgclaw/callback/commands/ack
- connect to ws://127.0.0.1:12345
The only network target owned by the page is the service websocket, defaulting to
ws://127.0.0.1:42321.
Because of that boundary, the page does not interfere with the helper-page bootstrap path.
Test Strategy
This slice stays minimal, so the automated guard is also minimal.
Automated regression
Add one focused integration test in tests/service_console_html_test.rs that reads the standalone
HTML source and asserts:
- the file exists at the agreed path and is resolved from CARGO_MANIFEST_DIR so the test is stable across working directories
- it contains the service websocket default URL
- it contains submit_task payload construction
- it does not contain helper-page URLs or callback-host endpoints
- it does not contain the browser websocket URL
This test is a scope guard, not a browser-E2E suite.
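The content assertions can be sketched as a pure function over the HTML source text. The sketch is in JavaScript only to stay consistent with the page's inline script; the guard itself is specified as a Rust integration test, and `checkConsoleSource` is a hypothetical name. The string lists are taken from this design, and the file-existence check is left out because it belongs to the test harness.

```javascript
// Strings the page source must contain.
const REQUIRED = ["ws://127.0.0.1:42321", "submit_task"];

// Strings the page source must never contain. "browser-helper.html" also
// covers the full /sgclaw/browser-helper.html path as a substring.
const FORBIDDEN = [
  "browser-helper.html",
  "/sgclaw/callback/ready",
  "/sgclaw/callback/events",
  "/sgclaw/callback/commands/next",
  "/sgclaw/callback/commands/ack",
  "ws://127.0.0.1:12345",
];

// Returns which required strings are missing and which forbidden ones leaked.
// Both arrays are empty for a source that respects the boundary.
function checkConsoleSource(html) {
  return {
    missing: REQUIRED.filter((needle) => !html.includes(needle)),
    leaked: FORBIDDEN.filter((needle) => html.includes(needle)),
  };
}
```

A plain substring scan is intentionally crude: it also fails on a forbidden URL that appears only in a comment, which is the conservative outcome for a scope guard.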
Manual smoke verification
With the existing service binary running:
- open the HTML file in a browser
- connect to the service websocket
- confirm local websocket open/close events and service status_changed messages both appear in the message stream
- submit a natural-language task
- confirm logs and completion render in the page
- confirm the helper-page path remains unchanged because the page never references it
Acceptance Criteria
The slice is complete when all of the following are true:
- frontend/service-console/sg_claw_service_console.html exists.
- The page connects to the existing service websocket without backend changes.
- The page sends the existing submit_task shape and receives existing ServiceMessage events.
- The page does not reference browser-helper.html, callback-host endpoints, or the browser websocket URL.
- Existing browser-helper logic remains untouched.
- The automated source guard passes.
- Manual smoke verification confirms a task can be submitted from the HTML page.