From bae0e452a5cfae50a8a8b35caf3ade4eaaca2ba2 Mon Sep 17 00:00:00 2001
From: zyl
Date: Fri, 27 Mar 2026 00:50:47 +0800
Subject: [PATCH] docs: add sgclaw floating chat plan

---
 .../2026-03-27-sgclaw-floating-chat-plan.md | 363 ++++++++++++++++++
 1 file changed, 363 insertions(+)
 create mode 100644 docs/plans/2026-03-27-sgclaw-floating-chat-plan.md

diff --git a/docs/plans/2026-03-27-sgclaw-floating-chat-plan.md b/docs/plans/2026-03-27-sgclaw-floating-chat-plan.md
new file mode 100644
index 0000000..c9e7dbc
--- /dev/null
+++ b/docs/plans/2026-03-27-sgclaw-floating-chat-plan.md
@@ -0,0 +1,363 @@
+# sgClaw Floating Chat Implementation Plan
+
+> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
+
+**Goal:** Replace the current debug-style `sgclaw-chat` page as the primary UX with a floating page button + popup chat window, add real multi-turn conversation support, and harden the DeepSeek/browser tool protocol so browser automation is stable.
+
+**Architecture:** Keep `chrome://superrpa-functions/sgclaw-chat` and `chrome://superrpa-functions/sgclaw-config` as debug/config pages, but make the user-facing entry a floating page launcher injected into allowed HTTP/HTTPS pages via existing SuperRPA page-injection capabilities. Reuse the browser-side persistent `SgClawSessionService` as the session owner, extend it from “logs + final result” to “conversation + runtime state”, and extend the sgClaw pipe path so each submit can carry conversation context instead of behaving like a fresh one-shot task. Fix protocol bugs in parallel: strict action-schema validation, better browser/sgClaw error attribution, and DeepSeek tool-call history compatibility.
+
+**Tech Stack:** Chromium WebUI + Lit, existing SuperRPA page injection (`sg_compat.js` / hook injection), browser-side `FunctionsUI`/`SgClawSessionService`, Rust `sgClaw`, ZeroClaw compatibility runtime, DeepSeek OpenAI-compatible chat API.
+
+### Task 1: Freeze Current Baseline And Add Pure UI State Tests
+
+**Files:**
+- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state.ts`
+- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state_mainline_unittest.ts`
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/BUILD.gn`
+
+**Step 1: Write the failing test**
+
+Write a pure state test that describes the floating UX:
+
+```ts
+import {
+  collapseFloatingWindow,
+  createFloatingViewState,
+  openFloatingWindow,
+  toggleSettingsPanel,
+} from './sgclaw-floating_state.js';
+
+test('opens from fab and collapses back on blur', () => {
+  let state = createFloatingViewState();
+  state = openFloatingWindow(state);
+  expect(state.windowOpen).toBe(true);
+  state = collapseFloatingWindow(state);
+  expect(state.windowOpen).toBe(false);
+  expect(state.fabVisible).toBe(true);
+});
+```
+
+**Step 2: Run test to verify it fails**
+
+Run: `autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease sgclaw-chat_build_ts`
+
+Expected: build/test target fails because the new state module and test do not exist yet.
+
+**Step 3: Write minimal implementation**
+
+Create a small pure state module with:
+- `fabVisible`
+- `windowOpen`
+- `settingsOpen`
+- `statusBadge`
+- `unreadCount`
+
+Keep it logic-only; no DOM code here.
+
+**Step 4: Run test to verify it passes**
+
+Run the same `autoninja` target or the relevant TS unit target once wired.
+
+Expected: the new state test compiles and passes.
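Note on Task 1, Step 3: a minimal TypeScript sketch of what the pure state module could look like, assuming only the five fields listed in Step 3 and the four function names imported by the Step 1 test. The `StatusBadge` values, the default values, and the rule that opening the popup clears `unreadCount` are assumptions made for illustration, not decisions taken in this plan.

```typescript
// Hypothetical shape: field names come from the plan, types and defaults are assumed.
type StatusBadge = 'idle' | 'running' | 'error';

interface FloatingViewState {
  readonly fabVisible: boolean;
  readonly windowOpen: boolean;
  readonly settingsOpen: boolean;
  readonly statusBadge: StatusBadge;
  readonly unreadCount: number;
}

// Initial state: only the floating button is visible.
function createFloatingViewState(): FloatingViewState {
  return {
    fabVisible: true,
    windowOpen: false,
    settingsOpen: false,
    statusBadge: 'idle',
    unreadCount: 0,
  };
}

// Opening the popup also clears the unread counter (assumed behaviour).
function openFloatingWindow(state: FloatingViewState): FloatingViewState {
  return {...state, windowOpen: true, unreadCount: 0};
}

// Collapsing closes the popup and the settings panel and keeps the button visible.
function collapseFloatingWindow(state: FloatingViewState): FloatingViewState {
  return {...state, windowOpen: false, settingsOpen: false, fabVisible: true};
}

// Settings can only be toggled while the popup is open (assumed behaviour).
function toggleSettingsPanel(state: FloatingViewState): FloatingViewState {
  if (!state.windowOpen) {
    return state;
  }
  return {...state, settingsOpen: !state.settingsOpen};
}
```

In the real `sgclaw-floating_state.ts` these functions would be exported so the Step 1 test can import them; `export` is left out here only so the sketch runs standalone. Every transition returns a new object, which keeps the module free of DOM code and easy to unit test.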
+
+**Step 5: Commit**
+
+```bash
+git -C /home/zyl/projects/superRpa/src add \
+  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state.ts \
+  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-floating_state_mainline_unittest.ts \
+  chrome/browser/resources/superrpa/devtools/BUILD.gn
+git -C /home/zyl/projects/superRpa/src commit -m "test: add sgclaw floating UI state"
+```
+
+### Task 2: Build The Floating Page Entry Using Existing SuperRPA Overlay Capabilities
+
+**Files:**
+- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sgclaw_overlay.js`
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/sg_compat.js`
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/hooks/hook_injector.cc`
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/hooks/hook_injector.h`
+- Test: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs`
+
+**Step 1: Write the failing smoke expectation**
+
+Update the browser smoke so it expects:
+- a floating button exists on a normal page
+- clicking it opens the sgClaw popup
+- clicking outside collapses the popup back to the button
+
+Use an assertion like:
+
+```js
+await waitFor(() => page.evaluate(() =>
+    !!document.querySelector('#superrpa-sgclaw-fab')));
+```
+
+**Step 2: Run smoke to verify it fails**
+
+Run: `node /home/zyl/projects/sgClaw/claw/tools/browser_smoke/run_deepseek_browser_smoke.mjs`
+
+Expected: smoke fails because the floating entry does not exist.
+
+**Step 3: Write minimal implementation**
+
+Implement the launcher inside injected page JS, not a side panel:
+- floating circular button in bottom-right
+- popup window anchored to the button
+- button actions: open chat, stop/start runtime, open settings
+- blur/outside-click collapses popup back to button
+
+Prefer reusing the existing SuperRPA overlay/dialog/message primitives in `sg_compat.js` instead of inventing a second overlay stack.
+
+**Step 4: Run smoke to verify it passes**
+
+Run the same smoke command.
+
+Expected: smoke reaches the popup, submits a task, and collapses correctly after blur.
+
+**Step 5: Commit**
+
+```bash
+git -C /home/zyl/projects/superRpa/src add \
+  chrome/browser/resources/superrpa/sgclaw_overlay.js \
+  chrome/browser/resources/superrpa/sg_compat.js \
+  chrome/browser/superrpa/hooks/hook_injector.cc \
+  chrome/browser/superrpa/hooks/hook_injector.h \
+  chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs
+git -C /home/zyl/projects/superRpa/src commit -m "feat: add sgclaw floating launcher"
+```
+
+### Task 3: Upgrade Browser Session State From “Result Page” To “Real Conversation”
+
+**Files:**
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.h`
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui.h`
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui.cc`
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts`
+- Create: `/home/zyl/projects/superRpa/src/chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts`
+- Test: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/functions_ui_mainline_unittest.cc`
+
+**Step 1: Write the failing browser-side tests**
+
+Add tests for:
+- conversation messages are returned by `sgclawConnect`
+- reopening the chat keeps prior user/assistant turns
+- `sgclawSubmitTask` appends a user turn immediately and an assistant turn when complete
+
+Example expectation:
+
+```cc
+EXPECT_EQ("user", FindStringValue(*message, "role"));
+EXPECT_EQ("打开百度搜索天气", FindStringValue(*message, "content"));  // "Open Baidu and search for the weather"
+```
+
+**Step 2: Run test to verify it fails**
+
+Run:
+
+```bash
+autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease \
+  functions_ui_mainline_unittests
+/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests
+```
+
+Expected: tests fail because runtime state only has logs/final result.
+
+**Step 3: Write minimal implementation**
+
+Extend `SgClawSessionService` to store:
+- conversation id
+- ordered messages
+- pending assistant reply state
+- runtime status/logs
+
+Keep the debug page and popup both consuming the same runtime shape.
+
+**Step 4: Run test to verify it passes**
+
+Run the same test command.
+
+Expected: connect/reopen behavior passes and the conversation persists while the browser stays open.
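Note on Task 3, Step 3: the real owner of this state is the C++ `SgClawSessionService`, so the following TypeScript is only a sketch of the runtime shape that the debug page and the popup could both consume. The names `ChatMessage`, `ConversationState`, `submitTask`, and `completeTask` are hypothetical, as are the `status` values; the plan only fixes the four kinds of data to store and the rule that a submit appends the user turn immediately and the assistant turn on completion.

```typescript
// Hypothetical runtime shape shared by the debug page and the popup.
interface ChatMessage {
  role: 'user' | 'assistant';
  content: string;
}

interface ConversationState {
  conversationId: string;
  messages: ChatMessage[];         // ordered, oldest first
  pendingAssistantReply: boolean;  // true between submit and completion
  status: 'idle' | 'running';      // assumed status values
  logs: string[];
}

function createConversation(conversationId: string): ConversationState {
  return {
    conversationId,
    messages: [],
    pendingAssistantReply: false,
    status: 'idle',
    logs: [],
  };
}

// Submitting appends the user turn immediately and marks a reply as pending.
function submitTask(state: ConversationState, instruction: string): ConversationState {
  return {
    ...state,
    messages: [...state.messages, {role: 'user', content: instruction}],
    pendingAssistantReply: true,
    status: 'running',
  };
}

// Completion appends the assistant turn; it is ignored if nothing is pending.
function completeTask(state: ConversationState, reply: string): ConversationState {
  if (!state.pendingAssistantReply) {
    return state;
  }
  return {
    ...state,
    messages: [...state.messages, {role: 'assistant', content: reply}],
    pendingAssistantReply: false,
    status: 'idle',
  };
}
```

Because the state lives in one object, reopening the popup only needs to read `messages` again, which is the behaviour the Step 1 tests describe for `sgclawConnect`.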
+
+**Step 5: Commit**
+
+```bash
+git -C /home/zyl/projects/superRpa/src add \
+  chrome/browser/ui/webui/superrpa/sgclaw_session_service.h \
+  chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc \
+  chrome/browser/ui/webui/superrpa/functions_ui.h \
+  chrome/browser/ui/webui/superrpa/functions_ui.cc \
+  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat.ts \
+  chrome/browser/resources/superrpa/devtools/functions/sgclaw-chat/sgclaw-chat_messages.ts \
+  chrome/browser/ui/webui/superrpa/functions_ui_mainline_unittest.cc
+git -C /home/zyl/projects/superRpa/src commit -m "feat: persist sgclaw conversation state"
+```
+
+### Task 4: Extend sgClaw Submit Protocol For Multi-Turn Context
+
+**Files:**
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol.h`
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol.cc`
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_process_host.cc`
+- Modify: `/home/zyl/projects/superRpa/src/chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/pipe/protocol.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/agent/mod.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
+- Test: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
+- Test: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_process_host_mainline_unittest.cc`
+
+**Step 1: Write the failing protocol tests**
+
+Add tests that `submit_task` can carry:
+- current user input
+- prior user/assistant turns
+- active page URL / title hints if needed
+
+For Rust, add a test that two consecutive submits produce a provider request containing prior turns.
+
+**Step 2: Run tests to verify they fail**
+
+Run:
+
+```bash
+python3 /home/zyl/projects/superRpa/src/tools/crates/run_cargo.py test \
+  --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --test compat_runtime_test
+autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease \
+  sgclaw_process_host_mainline_unittests
+/home/zyl/projects/superRpa/src/out/KylinRelease/sgclaw_process_host_mainline_unittests \
+  --gtest_filter='SgClawProcessHostMainlineTest.*'
+```
+
+Expected: tests fail because submit currently only sends a raw instruction string.
+
+**Step 3: Write minimal implementation**
+
+Change the pipe payload from a one-shot instruction to:
+
+```json
+{
+  "type": "submit_task",
+  "instruction": "...",
+  "messages": [
+    {"role": "user", "content": "..."},
+    {"role": "assistant", "content": "..."}
+  ]
+}
+```
+
+On the Rust side, feed this history into the ZeroClaw turn so the next submit is a continuation, not a new session.
+
+**Step 4: Run tests to verify they pass**
+
+Run the same Rust + browser unit commands.
+
+Expected: previous-turn context reaches the provider path.
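Note on Task 4, Step 3: a small TypeScript sketch of building and decoding the extended `submit_task` payload, to make the field roles concrete. It assumes that `instruction` carries the current user input while `messages` carries only prior turns, and that a payload without `messages` is treated as an empty history so the old one-shot form still decodes; the plan does not settle either point. `Turn`, `SubmitTaskPayload`, `buildSubmitTask`, and `parseSubmitTask` are hypothetical names, and pipe framing is deliberately left out.

```typescript
interface Turn {
  role: 'user' | 'assistant';
  content: string;
}

interface SubmitTaskPayload {
  type: 'submit_task';
  instruction: string;  // current user input
  messages: Turn[];     // prior turns only (assumption)
}

// Builds the payload from stored history plus the new instruction.
function buildSubmitTask(history: readonly Turn[], instruction: string): SubmitTaskPayload {
  return {
    type: 'submit_task',
    instruction,
    messages: history.map(turn => ({role: turn.role, content: turn.content})),
  };
}

function isTurn(value: unknown): value is Turn {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const turn = value as Record<string, unknown>;
  return (turn['role'] === 'user' || turn['role'] === 'assistant') &&
      typeof turn['content'] === 'string';
}

// Decodes a JSON payload; returns null for anything malformed.
function parseSubmitTask(raw: string): SubmitTaskPayload | null {
  let decoded: unknown;
  try {
    decoded = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof decoded !== 'object' || decoded === null) {
    return null;
  }
  const obj = decoded as Record<string, unknown>;
  const instruction = obj['instruction'];
  if (obj['type'] !== 'submit_task' || typeof instruction !== 'string') {
    return null;
  }
  // A missing `messages` field is read as an empty history (assumption).
  const messages = obj['messages'] === undefined ? [] : obj['messages'];
  if (!Array.isArray(messages) || !messages.every(isTurn)) {
    return null;
  }
  return {type: 'submit_task', instruction, messages};
}
```

Rejecting malformed turns at decode time keeps bad history from reaching the provider request, which is the same concern Task 5 raises for tool arguments.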
+
+**Step 5: Commit**
+
+```bash
+git -C /home/zyl/projects/sgClaw/claw add \
+  src/pipe/protocol.rs src/agent/mod.rs src/compat/runtime.rs tests/compat_runtime_test.rs
+git -C /home/zyl/projects/sgClaw/claw commit -m "feat: carry conversation history through sgclaw pipe"
+
+git -C /home/zyl/projects/superRpa/src add \
+  chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol.h \
+  chrome/browser/superrpa/sgclaw/sgclaw_pipe_protocol.cc \
+  chrome/browser/superrpa/sgclaw/sgclaw_process_host.cc \
+  chrome/browser/ui/webui/superrpa/sgclaw_session_service.cc \
+  chrome/browser/superrpa/sgclaw/sgclaw_process_host_mainline_unittest.cc
+git -C /home/zyl/projects/superRpa/src commit -m "feat: send sgclaw conversation context"
+```
+
+### Task 5: Harden Tool Schema And DeepSeek Compatibility
+
+**Files:**
+- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/browser_tool_adapter.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/src/compat/runtime.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_browser_tool_test.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/tests/compat_runtime_test.rs`
+- Modify: `/home/zyl/projects/sgClaw/claw/tools/browser_smoke/run_deepseek_browser_smoke.mjs`
+
+**Step 1: Write the failing tests**
+
+Cover:
+- `getText` without `selector` is rejected before it hits the browser
+- `click` without `selector` is rejected
+- `navigate` without `url` is rejected
+- DeepSeek multi-round tool-call history no longer triggers the `role=tool` 400 error
+- non-task greeting behavior is explicit: either reject or answer in chat-only mode, but not silently pretend to be a browser task
+
+**Step 2: Run tests to verify they fail**
+
+Run:
+
+```bash
+python3 /home/zyl/projects/superRpa/src/tools/crates/run_cargo.py test \
+  --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --lib --tests
+node /home/zyl/projects/sgClaw/claw/tools/browser_smoke/run_deepseek_browser_smoke.mjs
+```
+
+Expected: current code allows incomplete tool args and still has DeepSeek history edge cases.
+
+**Step 3: Write minimal implementation**
+
+Implement:
+- action-specific required param validation in `browser_tool_adapter.rs`
+- better tool-result/history formatting if needed for DeepSeek compatibility
+- explicit user-facing handling for chat input that is not a browser task
+
+**Step 4: Run tests to verify they pass**
+
+Run the same Rust tests and browser smoke.
+
+Expected: no malformed tool actions, no DeepSeek `role=tool` 400 error in smoke.
+
+**Step 5: Commit**
+
+```bash
+git -C /home/zyl/projects/sgClaw/claw add \
+  src/compat/browser_tool_adapter.rs \
+  src/compat/runtime.rs \
+  tests/compat_browser_tool_test.rs \
+  tests/compat_runtime_test.rs \
+  tools/browser_smoke/run_deepseek_browser_smoke.mjs
+git -C /home/zyl/projects/sgClaw/claw commit -m "fix: harden sgclaw tool protocol for DeepSeek"
+```
+
+### Task 6: Final Verification And Manual Smoke Checklist
+
+**Files:**
+- Modify if needed: `/home/zyl/projects/superRpa/src/chrome/browser/superrpa/sgclaw/sgclaw_chat_smoke.mjs`
+- Document manual steps in PR/summary, not code
+
+**Step 1: Run automated verification**
+
+```bash
+python3 /home/zyl/projects/superRpa/src/tools/crates/run_cargo.py test \
+  --manifest-path /home/zyl/projects/sgClaw/claw/Cargo.toml --lib --tests
+autoninja -C /home/zyl/projects/superRpa/src/out/KylinRelease \
+  functions_ui_mainline_unittests \
+  sgclaw_process_host_mainline_unittests
+/home/zyl/projects/superRpa/src/out/KylinRelease/functions_ui_mainline_unittests
+/home/zyl/projects/superRpa/src/out/KylinRelease/sgclaw_process_host_mainline_unittests \
+  --gtest_filter='SgClawProcessHostMainlineTest.*'
+node /home/zyl/projects/sgClaw/claw/tools/browser_smoke/run_deepseek_browser_smoke.mjs
+```
+
+Expected: all pass.
+
+**Step 2: Manual smoke**
+
+1. Open a normal HTTP/HTTPS page.
+2. Verify the floating button appears.
+3. Click it to open the popup.
+4. Start sgClaw from the popup.
+5. Submit one browser task and one follow-up task.
+6. Click outside the popup and verify it collapses to the button.
+7. Reopen the popup and verify conversation history is still present.
+8. Open settings from the launcher, update model/base URL, return to the popup, submit again, and verify the new settings take effect without a restart.
+
+**Step 3: Final commit if verification requires touch-ups**
+
+Use focused commit messages only for actual fixes found during verification.
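
Note on Task 5, Step 3: the action-specific validation is planned for the Rust file `browser_tool_adapter.rs`; the TypeScript below is only a sketch of the rule, kept in the same language as the plan's other UI examples. The three required-parameter pairs come from the Task 5 test list. Rejecting actions that are not in the table, the `BrowserAction` type, and the `validateAction` name are assumptions made for illustration.

```typescript
// Hypothetical tool-call shape: an action name plus arbitrary parameters.
type BrowserAction = {action: string; [param: string]: unknown};

// Required parameters per action, taken from the Task 5 test list.
const REQUIRED_PARAMS = new Map<string, readonly string[]>([
  ['getText', ['selector']],
  ['click', ['selector']],
  ['navigate', ['url']],
]);

// Returns a list of problems; an empty list means the call may be forwarded.
function validateAction(call: BrowserAction): string[] {
  const required = REQUIRED_PARAMS.get(call.action);
  if (required === undefined) {
    // Assumption: strict validation rejects actions missing from the table.
    return [`unknown action: ${call.action}`];
  }
  return required
      .filter(name => {
        const value = call[name];
        return typeof value !== 'string' || value.trim() === '';
      })
      .map(name => `${call.action} requires a non-empty "${name}"`);
}
```

For example, `validateAction({action: 'getText'})` reports the missing `selector`, so the call can be rejected before it reaches the browser, while `validateAction({action: 'navigate', url: 'https://example.com'})` returns an empty list.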