claw/docs/superpowers/specs/2026-04-06-service-chat-web-console-design.md
木炎 bdf8e12246 feat: align browser callback runtime and export flows
Consolidate the browser task runtime around the callback path, add safer artifact opening for Zhihu exports, and cover the new service/browser flows with focused tests and supporting docs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 21:44:53 +08:00

Service Chat Web Console Design

Background

The current natural-language entrypoint is the terminal client in src/bin/sg_claw_client.rs. That client already talks to the existing service websocket, sends ClientMessage, and prints ServiceMessage responses.

The repository also contains a separate browser callback helper at http://127.0.0.1:61058/sgclaw/browser-helper.html. That page is part of the browser backend execution path and must remain untouched.

For this slice, the authoritative boundary is:

  • the new page may talk to the existing service websocket only
  • the page must not talk to the browser websocket directly
  • the page must not reuse or replace browser-helper.html
  • the page must not change the service protocol or browser execution logic

Problem Statement

Running cargo run --bin sg_claw_client and typing into stdin works, but it is inconvenient for routine usage. The user wants a simple local HTML page with a websocket connection field, a natural-language input box, and a send button.

The risk is scope drift: if the new page reaches into the browser-helper flow or changes backend logic, it could damage the working Zhihu/browser path.

Goal

Add a standalone local HTML console that connects to the existing service websocket and submits natural-language tasks using the current submit_task message shape.

The page should be usable without changing sg_claw, sg_claw_client, browser-helper.html, or any existing service/browser runtime behavior.

Non-goals

This slice does not include:

  • serving the page from the Rust service
  • changing ClientMessage or ServiceMessage
  • changing src/service/server.rs
  • changing src/browser/callback_host.rs
  • changing src/browser/callback_backend.rs
  • changing the helper-page bootstrap flow
  • adding authentication, persistence, or multi-session orchestration
  • replacing the terminal client

Chosen Approach

Choose Option A: add one standalone HTML file that opens in a normal browser and talks to the existing service websocket at ws://127.0.0.1:42321 by default.

Why this option:

  • it is the narrowest possible change
  • it reuses the already-working service protocol
  • it does not alter the browser-helper path
  • it keeps all runtime ownership in the existing Rust service

Rejected alternatives:

  • extend browser-helper.html into a chat UI: wrong boundary; that page belongs to browser callback orchestration, not user task entry
  • add a new HTTP server inside sg_claw: unnecessary for the requested scope
  • replace the terminal client binary: not required; both clients can coexist

File Placement

Create the page outside frontend/runtime-host/.

Chosen location:

  • frontend/service-console/sg_claw_service_console.html

Reason:

  • frontend/runtime-host/ is reserved for SuperRPA runtime-host bundles
  • the new page is a standalone local tool, not a Chromium-hosted bundle
  • keeping it in its own directory makes the isolation explicit

Page Architecture

The page is a single self-contained HTML file with inline CSS and inline JavaScript. No build step and no frontend framework are required.

The page has three UI regions:

  1. Connection bar

    • websocket URL input
    • connect/disconnect button
    • current connection state label
  2. Message stream

    • appends service logs in arrival order
    • distinguishes connection info, task logs, errors, and final completion
    • keeps the current session visible until the page is refreshed
  3. Task composer

    • one textarea for natural-language input
    • one send button
    • send disabled while the websocket is disconnected
    • while a task is in flight, keep the composer enabled and let repeated submits surface the existing service-side busy response rather than adding a new frontend queue
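The enable/disable rules for the three regions can be derived from the connection state alone. The following is a minimal sketch, not the page's actual code: the state names, the button labels, and the deriveUiState helper are hypothetical and do not come from the existing codebase.

```javascript
// Hypothetical connection states; the spec only requires a visible state label.
const ConnectionState = Object.freeze({
  DISCONNECTED: "disconnected",
  CONNECTING: "connecting",
  CONNECTED: "connected",
});

// Derive the connection bar and composer state from the connection state.
// Whether a task is in flight is deliberately not an input: the composer
// stays enabled and the service-side busy response handles repeated submits.
function deriveUiState(connectionState) {
  const connected = connectionState === ConnectionState.CONNECTED;
  return {
    stateLabel: connectionState,
    connectButtonLabel: connected ? "Disconnect" : "Connect",
    sendDisabled: !connected,
    composerDisabled: false,
  };
}
```

Keeping this as a pure function means the inline script only has to copy the returned values onto the DOM elements whenever the websocket opens or closes.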

Protocol Contract

The page must reuse the existing service protocol exactly.

Outbound message

When the user clicks send, the page sends:

{
  "type": "submit_task",
  "instruction": "<user input>",
  "conversation_id": "",
  "messages": [],
  "page_url": "",
  "page_title": ""
}

This matches the current terminal client shape in src/bin/sg_claw_client.rs.
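A minimal sketch of how the inline script might build that payload. The function name buildSubmitTask is hypothetical; the field names and empty defaults are copied from the shape above.

```javascript
// Build the submit_task frame as a JSON string ready for WebSocket.send().
// Only `instruction` varies; the other fields are sent empty, as in the
// payload shape shown above.
function buildSubmitTask(instruction) {
  return JSON.stringify({
    type: "submit_task",
    instruction: instruction,
    conversation_id: "",
    messages: [],
    page_url: "",
    page_title: "",
  });
}
```

Going through JSON.stringify rather than string concatenation ensures that quotes and newlines in the user's input are escaped correctly.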

Inbound messages

The page displays these existing ServiceMessage variants:

  • status_changed -> render as a compact connection/runtime status row
  • log_entry -> append as a chronological task log row
  • task_complete -> append as the final result row for that submission
  • busy -> append as a visible refusal/error row without automatic retry

No new message type is introduced.
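The mapping above can be expressed as a small lookup. This sketch assumes that inbound frames are JSON objects tagged with a type field, mirroring the outbound submit_task shape; that assumption should be checked against ServiceMessage before use. The row kind names and the classifyServiceMessage helper are hypothetical.

```javascript
// Row kinds (hypothetical names) for the four existing ServiceMessage variants.
const ROW_KIND_BY_TYPE = {
  status_changed: "status",
  log_entry: "log",
  task_complete: "complete",
  busy: "error",
};

// Parse one inbound frame and decide how the message stream should render it.
// Assumes the frame is JSON with a `type` tag; anything else is "unknown".
function classifyServiceMessage(rawText) {
  let message;
  try {
    message = JSON.parse(rawText);
  } catch {
    return { kind: "unknown", message: null };
  }
  const isObject = message !== null && typeof message === "object";
  const known =
    isObject &&
    typeof message.type === "string" &&
    Object.prototype.hasOwnProperty.call(ROW_KIND_BY_TYPE, message.type);
  return { kind: known ? ROW_KIND_BY_TYPE[message.type] : "unknown", message: isObject ? message : null };
}
```

Unknown or malformed frames are still returned rather than dropped, so the page can display them verbatim instead of hiding service output.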

Interaction Flow

  1. User opens the local HTML file with a normal browser, typically via file://.
  2. User connects to the service websocket.
  3. The page shows websocket connection status locally.
  4. User enters a natural-language instruction and clicks send.
  5. The page sends one submit_task payload over the service websocket.
  6. The service continues to execute tasks exactly as it already does.
  7. Incoming service messages are appended to the message stream.
  8. After task_complete, the websocket remains open so the user can send another task.
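Steps 2 through 8 can be sketched with the standard browser WebSocket API. The openConsoleSession helper, the appendRow callback, and the row kind strings are hypothetical. The constructor is passed in as a parameter only so that the sketch can be exercised without a live service.

```javascript
// Open one session against the service websocket and forward events to the
// message stream via appendRow(kind, text). WebSocketImpl defaults to the
// browser's WebSocket; a fake can be injected for testing.
function openConsoleSession(url, appendRow, WebSocketImpl = globalThis.WebSocket) {
  const socket = new WebSocketImpl(url);

  // Step 3: local connection status.
  socket.onopen = () => appendRow("info", `connected to ${url}`);
  socket.onclose = () => appendRow("info", "disconnected");
  socket.onerror = () => appendRow("error", "websocket error");

  // Step 7: every service frame is appended in arrival order.
  // Step 8: nothing here closes the socket after task_complete.
  socket.onmessage = (event) => appendRow("service", String(event.data));

  return {
    // Step 5: one submit_task payload per click on send.
    submit(instruction) {
      socket.send(
        JSON.stringify({
          type: "submit_task",
          instruction: instruction,
          conversation_id: "",
          messages: [],
          page_url: "",
          page_title: "",
        })
      );
    },
    close() {
      socket.close();
    },
  };
}
```

In the page itself, appendRow would create a DOM node in the message stream, and the url argument would come from the connection bar input, defaulting to ws://127.0.0.1:42321.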

Error Handling

The page handles only UI-local failures:

  • websocket connect failure -> show connection error and keep send disabled
  • websocket disconnect mid-session -> mark disconnected and require reconnect
  • empty instruction -> block send and show inline validation
  • busy response -> show as a visible service-side refusal without retry logic

The page does not add retries, protocol fallbacks, or browser-runtime recovery logic.
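The two send-time checks (disconnected socket, empty instruction) can be combined into one guard that runs before anything is written to the websocket. This is a sketch under assumptions: the checkSend name and the message strings are hypothetical, and treating whitespace-only input as empty is an interpretation of "empty instruction", not something the spec states.

```javascript
// Decide whether a click on send may proceed. Returns a reason string for
// inline display when it may not. No retry or fallback is attempted.
function checkSend(isConnected, instruction) {
  if (!isConnected) {
    return { ok: false, reason: "Not connected. Reconnect before sending." };
  }
  // Assumption: whitespace-only input counts as empty.
  if (instruction.trim() === "") {
    return { ok: false, reason: "Instruction must not be empty." };
  }
  return { ok: true, reason: "" };
}
```

A busy response is not handled here because it arrives from the service after a send has already happened; it is simply rendered as a refusal row.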

Isolation From browser-helper.html

This is the critical constraint.

The new page must never:

  • reference /sgclaw/browser-helper.html
  • reference /sgclaw/callback/ready
  • reference /sgclaw/callback/events
  • reference /sgclaw/callback/commands/next
  • reference /sgclaw/callback/commands/ack
  • connect to ws://127.0.0.1:12345

The only network target owned by the page is the service websocket, defaulting to ws://127.0.0.1:42321.

Because of that boundary, the page does not interfere with the helper-page bootstrap path.

Test Strategy

This slice stays minimal, so the automated guard is also minimal.

Automated regression

Add one focused integration test in tests/service_console_html_test.rs that reads the standalone HTML source and asserts:

  • the file exists at the agreed path and is resolved from CARGO_MANIFEST_DIR so the test is stable across working directories
  • it contains the service websocket default URL
  • it contains submit_task payload construction
  • it does not contain helper-page URLs or callback-host endpoints
  • it does not contain the browser websocket URL

This test is a scope guard, not a browser-E2E suite.
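The assertions above amount to plain substring checks over the HTML source. They are sketched here in JavaScript to match the page's own language; the spec places the real guard in a Rust integration test, and the auditConsoleSource helper below is hypothetical. The forbidden strings are the ones listed under "Isolation From browser-helper.html".

```javascript
// Strings the page source must contain.
const REQUIRED_SNIPPETS = ["ws://127.0.0.1:42321", "submit_task"];

// Strings the page source must never contain.
const FORBIDDEN_SNIPPETS = [
  "/sgclaw/browser-helper.html",
  "/sgclaw/callback/ready",
  "/sgclaw/callback/events",
  "/sgclaw/callback/commands/next",
  "/sgclaw/callback/commands/ack",
  "ws://127.0.0.1:12345",
];

// Return which required snippets are missing and which forbidden ones appear.
// Both lists are empty when the source stays within scope.
function auditConsoleSource(source) {
  return {
    missing: REQUIRED_SNIPPETS.filter((snippet) => !source.includes(snippet)),
    forbidden: FORBIDDEN_SNIPPETS.filter((snippet) => source.includes(snippet)),
  };
}
```

A substring check is intentionally blunt: it fails even if a forbidden URL appears only in a comment, which keeps the page source free of any mention of the helper path.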

Manual smoke verification

With the existing service binary running:

  1. open the HTML file in a browser
  2. connect to the service websocket
  3. confirm local websocket open/close events and service status_changed messages both appear in the message stream
  4. submit a natural-language task
  5. confirm logs and completion render in the page
  6. confirm the helper-page path remains unchanged because the page never references it

Acceptance Criteria

The slice is complete when all of the following are true:

  1. frontend/service-console/sg_claw_service_console.html exists.
  2. The page connects to the existing service websocket without backend changes.
  3. The page sends the existing submit_task shape and receives existing ServiceMessage events.
  4. The page does not reference browser-helper.html, callback-host endpoints, or the browser websocket URL.
  5. Existing browser-helper logic remains untouched.
  6. The automated source guard passes.
  7. Manual smoke verification confirms a task can be submitted from the HTML page.